CN107122179A

CN107122179A - Voice function control method and device

Info

Publication number: CN107122179A
Application number: CN201710210831.5A
Authority: CN
Inventors: 潘葚
Original assignee: Alibaba Group Holding Ltd
Current assignee: Advanced New Technologies Co Ltd; Advantageous New Technologies Co Ltd
Priority date: 2017-03-31
Filing date: 2017-03-31
Publication date: 2017-09-01
Also published as: EP3575957A4; US20190304461A1; KR20190089032A; KR102228964B1; MY194401A; US20200219510A1; WO2018177233A1; US10643615B2; US10991371B2; EP3575957A1; EP3575957B1; JP6869354B2; TWI665656B; JP2020510852A; PH12019501488A1; TW201837896A

Abstract

The present application provides a voice function control method, applied to a voice assistant of a terminal, including: determining an associated application according to a recognition result of the voice input by the user, where the associated application is used to implement the function the user wants to use; and passing the user's input voice to the associated application, so that the associated application recognizes the user's input voice and implements the function according to its recognition result. With the technical solution of the present application, the function the user needs can be completed more accurately and quickly, and the efficiency of voice function control is improved.

Description

Voice function control method and device

Technical field

The present application relates to the field of network communication technology, and in particular to a voice function control method and device.

Background

A voice assistant is software that runs on a terminal, communicates with the user by voice, and helps the user carry out the functions the user specifies, such as information search and terminal operation. Since Apple launched its voice assistant Siri, user attention to and use of voice assistant software have increased greatly, which has in turn driven the development of voice assistants.

Voice assistants can now be combined with applications installed on the terminal. The user instructs the voice assistant to perform a task, and the voice assistant calls the corresponding application to complete it, which greatly expands the set of functions the user can complete through the voice assistant as a single entry point.

In the prior art, taking Siri as an example, Siri can cooperate with six types of applications (ride hailing, communication, photo search, payment, Internet telephony, and fitness). When Siri receives the user's voice input, it judges the user's intent and decides whether to handle the request itself or to call an application. If it decides to call an application, Siri extracts the relevant information from its own recognition result of the user's voice and hands that information to the application. The application then performs the function specified by the information Siri provides.

The information provided by Siri is therefore the basis on which the user's task is, or is not, completed correctly. When Siri's recognition of the input voice is not accurate enough, voice-based function control can hardly reach a satisfactory level.

Summary of the invention

In view of this, the present application provides a voice function control method, applied to a voice assistant of a terminal, including:

determining an associated application according to a recognition result of the voice input by the user, where the associated application is used to implement the function the user wants to use;

passing the user's input voice to the associated application, so that the associated application recognizes the user's input voice and implements the function according to its recognition result.

The present application provides a voice function control method, applied to a terminal application used to implement functions other than those of a voice assistant, including:

receiving the user's input voice from a voice assistant;

recognizing the user's input voice, and implementing the function the user wants to use according to the recognition result.

The present application further provides a voice function control device, applied to a voice assistant of a terminal, including:

an associated application unit, configured to determine an associated application according to a recognition result of the voice input by the user, where the associated application is used to implement the function the user wants to use;

an input voice passing unit, configured to pass the user's input voice to the associated application, so that the associated application recognizes the user's input voice and implements the function according to its recognition result.

The present application provides a voice function control device, applied to a terminal application used to implement functions other than those of a voice assistant, including:

an input voice receiving unit, configured to receive the user's input voice from a voice assistant;

a function implementation unit, configured to recognize the user's input voice and implement the function the user wants to use according to the recognition result.

As the above technical solutions show, in the embodiments of the present application, after the voice assistant determines the associated application from its recognition result of the user's input voice, it passes the input voice to the associated application, which recognizes the input voice itself and then executes the user's command. Because each associated application has its own dedicated domain (AutoNavi Maps, for example, belongs to the map and navigation domain), its speech recognition accuracy for the function types in that domain is, in the vast majority of cases, higher than that of a voice assistant that serves all function types. The function the user needs can therefore be completed more accurately and quickly, which improves the efficiency of voice function control.

Brief description of the drawings

FIG. 1 is a flowchart of a voice function control method applied to a voice assistant of a terminal in an embodiment of the present application;

FIG. 2 is a flowchart of a voice function control method applied to an application of a terminal in an embodiment of the present application;

FIG. 3 is a schematic diagram of the working principle of an application example of the present application;

FIG. 4 is a hardware structure diagram of a terminal;

FIG. 5 is a logical structure diagram of a voice function control device applied to a voice assistant of a terminal in an embodiment of the present application;

FIG. 6 is a logical structure diagram of a voice function control device applied to an application of a terminal in an embodiment of the present application.

Detailed description

In the prior art, many non-voice-assistant applications installed on a terminal can themselves receive the user's voice input, recognize and execute the user's voice commands, and implement the function the user wants to use. For brevity, voice-assistant applications are referred to below as voice assistants, and applications that are not voice assistants and that implement functions other than those of a voice assistant are referred to as applications.

A voice assistant is designed to be the unified entry point for voice interaction with the user. Besides recognizing commands that may involve every type of function, it must also chat with the user. The vocabulary of a voice assistant's lexicon is therefore extremely broad, and the optimization of its speech recognition algorithm must balance the recognition rate across all types of vocabulary. An application, by contrast, usually concentrates on a few main functions: a map application concentrates on address lookup, positioning, and navigation, a shopping application concentrates on goods and transactions, and so on. When a user turns to such an application, the intent is almost certainly to use the functions it concentrates on. The vocabulary of the lexicon an application uses for speech recognition is therefore likewise concentrated in its functional domain, and so is the optimization of its speech recognition algorithm.

The time the user waits between issuing a voice command and receiving the terminal's response strongly affects the user experience, so the time available for speech recognition is very limited. Within that limited time, in any particular functional domain, a voice assistant can hardly match the accuracy of an application belonging to that domain when recognizing input voice in which the user intends to use a function of that domain.

For example, for place names, the lexicon of AutoNavi Maps is more comprehensive and accurate than Siri's. In addition, because AutoNavi Maps has accumulated location searches over a long period, its recognition algorithm is also more accurate than Siri's at recognizing place names. In the navigation scenario, Siri has no comparably reliable place-name and location data to use as a reference for optimizing its recognition algorithm.

Thus, the prior-art practice of having the voice assistant pass its recognition result to the application in effect has the party that is not good at a job do the job and then hand an inaccurate result to the party that is good at it. The latter must perform its task on the basis of the poorer result, and a good outcome is naturally hard to achieve.

Based on this reasoning, the embodiments of the present application propose a new voice function control method. The voice assistant recognizes the user's input voice to determine the associated application that fulfills the user's intent, and then passes the input voice to the associated application. The associated application recognizes the voice itself and then implements the function the user wants to use. Because the application executes the user's command on the basis of the original input voice, it can rely on its own recognition result rather than on the voice assistant's less satisfactory one. The function the user needs is therefore completed more accurately and quickly, which solves the problem in the prior art.

In the embodiments of the present application, both the voice assistant and the applications run on the user's terminal. The voice assistant may run at the operating system level of the terminal or as an application on top of the operating system; this is not limited. The user's terminal may be any device with voice input, computing, and storage capabilities, such as a mobile phone, tablet computer, PC (personal computer), notebook, or server; this is likewise not limited.

In the embodiments of the present application, the flow of the voice function control method as applied in the voice assistant is shown in FIG. 1, and the flow as applied in the application is shown in FIG. 2.

On the voice assistant, in step 110, the associated application is determined according to the recognition result of the voice input by the user. The associated application is used to implement the function the user wants to use.

After receiving the user's voice input, the voice assistant recognizes the user's voice. If the user's command does not involve using any function, or if the function the user wants to use is performed by the voice assistant itself, the voice assistant replies to the user's input or executes the user's command according to the recognition result. If the voice assistant's recognition result is that the user wants to use a function performed by an application, the voice assistant determines the associated application that implements that function.

The user may name the application to be used in the input voice. In this case, the voice assistant can extract the application name from its recognition result of the user's input voice and take that application (that is, the application the user specified in the input voice) as the associated application. For example, if the user says to the voice assistant, "Use Didi to call me a car home", the voice assistant recognizes the application name "Didi" and takes the Didi application as the associated application.
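The name-based case can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method claimed by the application: the function name `extract_app_name`, the tuple `INSTALLED_APPS`, and the case-insensitive substring match are all hypothetical choices made for the example.

```python
from typing import Optional, Sequence

# Hypothetical names of applications installed on the terminal.
INSTALLED_APPS = ("Didi", "Alipay", "AutoNavi Maps")


def extract_app_name(recognized_text: str,
                     installed_apps: Sequence[str] = INSTALLED_APPS) -> Optional[str]:
    """Return the installed application named in the recognized text, if any."""
    lowered = recognized_text.lower()
    for app in installed_apps:
        # Assumption: a case-insensitive substring match is enough to spot the name.
        if app.lower() in lowered:
            return app
    return None
```

When the function returns `None`, the assistant would fall back to determining the function the user wants and choosing an application from it.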

If the user does not name an application in the input voice, the voice assistant can determine the function the user wants to use from its recognition result of the input voice, and then determine the associated application, among the applications installed on the terminal, according to that function. The voice assistant can determine the function from the recognition result using any of various existing techniques. For example, several keywords can be preset for each function; if the recognition result of the user's voice hits a keyword of some function, the voice assistant knows which function the user wants to use.
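The preset-keyword approach mentioned above might look like the following Python sketch. The keyword table `FUNCTION_KEYWORDS`, its entries, and the function name `detect_function` are assumptions made for illustration; the application does not prescribe a particular data structure or matching rule.

```python
from typing import Dict, Optional, Sequence

# Hypothetical preset keywords for each function.
FUNCTION_KEYWORDS: Dict[str, Sequence[str]] = {
    "navigation": ("navigate", "directions"),
    "transfer": ("transfer", "send money"),
}


def detect_function(recognized_text: str,
                    keywords: Dict[str, Sequence[str]] = FUNCTION_KEYWORDS) -> Optional[str]:
    """Return the function whose keyword appears in the recognized text, if any."""
    lowered = recognized_text.lower()
    for function, words in keywords.items():
        # A hit on any preset keyword identifies the function.
        if any(word in lowered for word in words):
            return function
    return None
```

A result of `None` corresponds to input that involves no application function, which the assistant would handle itself.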

How the associated application is determined from the function the user wants to use can be decided according to the needs of the actual application scenario. Two implementations are described below as examples.

In the first implementation, after recognizing the function the user wants to use, the voice assistant can take, from the applications installed on the terminal, one or more applications that can implement that function and that support voice input as candidate applications, and display the candidates' names for the user to choose from. After receiving the user's choice, the voice assistant takes the application the user selected as the associated application.
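A minimal Python sketch of this first implementation follows. The `InstalledApp` record, its fields, and the `ask_user` callback (which stands in for displaying the names and collecting the user's choice) are hypothetical; how a terminal actually knows which applications implement a function and support voice input is not specified here.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional, Sequence


@dataclass(frozen=True)
class InstalledApp:
    name: str
    functions: FrozenSet[str]      # functions the application can implement
    supports_voice_input: bool


def candidate_apps(function: str, installed: Sequence[InstalledApp]) -> List[str]:
    """Names of installed apps that implement the function and accept voice input."""
    return [app.name for app in installed
            if function in app.functions and app.supports_voice_input]


def choose_associated_app(function: str,
                          installed: Sequence[InstalledApp],
                          ask_user: Callable[[List[str]], str]) -> Optional[str]:
    """Show the candidates to the user and return the selected application."""
    candidates = candidate_apps(function, installed)
    if not candidates:
        return None  # nothing on the terminal can serve this function by voice
    return ask_user(candidates)
```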

In the second implementation, a mapping between functions to be used and applications can be maintained on the terminal. After recognizing the function the user wants to use, the voice assistant can take the application mapped to that function as the associated application. In the earlier example, where function keywords reflect the function the user wants to use, a table mapping function keywords to applications can be stored on the terminal. Once the voice assistant extracts a function keyword from its recognition result of the user's input voice, it can take the application mapped to that keyword as the associated application.

In the second implementation, the mapping between functions to be used and applications can be set and/or modified by the user, generated by the voice assistant or the operating system, or produced by a combination of these approaches; the embodiments of the present application do not limit this. In one example, the user can set, in a settings item provided by the voice assistant or the operating system, the mapping used for voice input between functions and associated applications (one or more functions correspond to one application). In another example, the application the user most frequently uses to implement a function can be taken as the application mapped to that function. Specifically, if only one application installed on the terminal implements the function, that application is mapped to the function; if there is more than one, the application with the highest usage frequency, according to the operating system's statistics on how often the user uses each application that implements the function, is mapped to the function. In a third example, the user can, within an application that implements a function, set that application as the one mapped to the function for voice input. After receiving the user's setting instruction, the application submits the mapping between itself and the function to the voice assistant. For instance, if the user sets in AutoNavi Maps that it is mapped to the function keyword "navigation" for voice input, AutoNavi Maps submits this setting to Siri according to the user's operation, and Siri saves the mapping. Later, when the user says "navigate to [a place]", Siri uses AutoNavi Maps as the associated application according to the mapping.
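The frequency-based rule in the second example can be sketched as below. The function name `default_app_by_usage` and the `usage_counts` dictionary are assumptions for illustration; in particular, how the operating system collects usage statistics is not described, so the counts are simply passed in.

```python
from typing import Dict, Optional, Sequence


def default_app_by_usage(apps_for_function: Sequence[str],
                         usage_counts: Dict[str, int]) -> Optional[str]:
    """Pick the application to map to a function from usage statistics.

    apps_for_function: installed apps that implement the function.
    usage_counts: hypothetical per-app usage counts reported by the OS.
    """
    if not apps_for_function:
        return None
    if len(apps_for_function) == 1:
        # Only one app implements the function, so it is mapped directly.
        return apps_for_function[0]
    # Otherwise take the most frequently used one (first wins on a tie).
    return max(apps_for_function, key=lambda app: usage_counts.get(app, 0))
```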

Note that the two ways of determining the associated application described above can also be combined. For example, after determining the function the user wants to use, the voice assistant queries the saved mapping between functions and applications. If it finds an application mapped to that function, it determines the associated application according to the mapping. If it does not, it presents the applications on the terminal that can implement the function and that support voice input for the user to choose from, and determines the associated application according to the user's choice. After the user chooses, the voice assistant can ask the user to set a default associated application for the function; if the user does so, the mapping between the function and the application the user set is saved. If the user does not, the voice assistant can still save the mapping between an application and a function once both the number of times and the frequency with which the user has chosen that application for the function exceed set thresholds. For example, the mapping table maintained by Siri contains no mapping for the "navigation" function. On each of the 5 occasions when the user gives the voice command "navigate to [a place]", Siri displays the names of the AutoNavi Maps, Baidu Maps, and Sogou Maps applications installed on the terminal, and the user chooses which one to navigate with. If the user chooses AutoNavi Maps on 4 of those occasions, Siri saves the mapping between "navigation" and AutoNavi Maps in the mapping table. Afterwards, when the user gives a navigation voice command, Siri uses AutoNavi Maps directly as the associated application.
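The combined strategy can be sketched as a small Python class. Everything named here is hypothetical: `AssociatedAppResolver`, the `ask_user` callback, and the threshold values (a count of 4 and a ratio of 0.8, picked to mirror the "4 out of 5" example). The sketch covers only the automatic learning path and leaves out the optional step of asking the user to set a default explicitly.

```python
from collections import Counter
from typing import Callable, Dict, List


class AssociatedAppResolver:
    """Look up a saved mapping first; otherwise ask the user and learn from choices."""

    def __init__(self, min_count: int = 4, min_ratio: float = 0.8) -> None:
        # Assumed thresholds; the source only says "set thresholds".
        self.min_count = min_count
        self.min_ratio = min_ratio
        self.mapping: Dict[str, str] = {}     # function -> associated application
        self._prompts: Counter = Counter()    # function -> times the user was asked
        self._choices: Counter = Counter()    # (function, app) -> times chosen

    def resolve(self, function: str, candidates: List[str],
                ask_user: Callable[[List[str]], str]) -> str:
        if function in self.mapping:
            return self.mapping[function]     # saved mapping wins, no prompt
        chosen = ask_user(candidates)
        self._prompts[function] += 1
        self._choices[(function, chosen)] += 1
        count = self._choices[(function, chosen)]
        ratio = count / self._prompts[function]
        # Save the mapping once both count and frequency reach the thresholds.
        if count >= self.min_count and ratio >= self.min_ratio:
            self.mapping[function] = chosen
        return chosen
```

With these defaults, four selections of the same application out of five prompts save the mapping, and later requests for that function skip the prompt.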

On the voice assistant, in step 120, the user's input voice is passed to the associated application, so that the associated application recognizes the user's input voice and implements the function according to its recognition result.

On the associated application, in step 210, the user's input voice is received from the voice assistant.

After determining the associated application that will perform the function the user wants to use, the voice assistant opens the associated application (which includes launching it, bringing it to the foreground, and so on) and passes the user's input voice to it.

On the associated application, in step 220, the user's input voice is recognized, and the function the user wants to use is implemented according to the recognition result.

The associated application itself recognizes the user's input voice received from the voice assistant and, according to the recognition result, runs its own business logic to implement the function the user wants to use. The associated application can perform speech recognition and implement the function according to the prior art, which is not described further here.

In one example, the voice assistant can pass its own recognition result of the user's input voice to the associated application together with the input voice. The associated application recognizes the user's input voice itself and implements the function the user wants to use according to both its own recognition result and the one from the voice assistant. The voice assistant's recognition result can serve as a reference for the associated application during speech recognition, further increasing recognition accuracy.
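Steps 110, 120, 210, and 220, including the optional forwarding of the assistant's own recognition result, can be sketched end to end as follows. This is an illustration under assumptions, not an actual assistant or operating system API: the classes `VoiceRequest`, `AssociatedApp`, and `VoiceAssistant`, the `send_hint` flag, and the injected recognizer callables are all hypothetical, and real speech recognition is replaced by whatever functions the caller supplies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class VoiceRequest:
    audio: bytes                          # the user's original input voice
    assistant_text: Optional[str] = None  # optional: the assistant's own recognition result


class AssociatedApp:
    def __init__(self, name: str,
                 recognize: Callable[[bytes, Optional[str]], str]) -> None:
        self.name = name
        self._recognize = recognize  # the application's own recognizer (supplied by caller)

    def handle(self, request: VoiceRequest) -> str:
        # Step 210: receive the input voice. Step 220: recognize it and act on the result.
        text = self._recognize(request.audio, request.assistant_text)
        return f"{self.name}: {text}"


class VoiceAssistant:
    def __init__(self, recognize: Callable[[bytes], str],
                 resolve_app: Callable[[str], Optional[str]],
                 apps: Dict[str, AssociatedApp]) -> None:
        self._recognize = recognize
        self._resolve_app = resolve_app
        self._apps = apps

    def on_voice(self, audio: bytes, send_hint: bool = False) -> str:
        text = self._recognize(audio)
        app_name = self._resolve_app(text)   # step 110: determine the associated app
        if app_name is None:
            return f"assistant: {text}"      # the assistant handles the request itself
        hint = text if send_hint else None
        # Step 120: pass the raw audio (not just the text) to the associated app.
        return self._apps[app_name].handle(VoiceRequest(audio, hint))
```

The key design point is that `VoiceRequest` always carries the original audio; the assistant's text is at most an optional hint, so the application is never forced to work from the assistant's recognition alone.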

Thus, in the embodiments of the present application, the voice assistant recognizes the user's input voice to determine the associated application that implements the function the user wants to use, and passes the input voice to the associated application, which recognizes the input voice itself and then executes the user's command. This makes use of the application's more accurate speech recognition within its own function type, so the function the user needs is completed more accurately and quickly and the efficiency of voice function control is improved.

In an application example of the present application, the voice assistant Siri and several applications capable of performing various types of functions are installed on the user's Apple terminal. Siri stores a table mapping function keywords to applications. An example of such a mapping table is shown in Table 1:

Table 1

The working principle of this application example is shown in FIG. 3. After receiving the user's input voice, Siri recognizes it. Assuming that the function the user wants to use must be implemented by another application, Siri extracts from the recognition result the function keyword that describes the function, and looks the keyword up in the mapping table. If an application corresponding to the function keyword is found, that application is taken as the associated application. If the function keyword is not found in the table, Siri displays the names of all applications installed on the terminal that can implement the function and that support voice input, and asks the user to choose which one to use. Siri takes the application the user selects as the associated application.

Siri brings the associated application to the foreground and transmits the user's input voice to it through the operating system. The associated application recognizes the user's input voice and completes the task the user commanded according to its own recognition result and business process.

For example, the user says to Siri, "Transfer 2000 to Zhang San". Siri recognizes the function keyword "transfer" and finds from Table 1 that the associated application is Alipay. Siri opens Alipay and passes the user's input voice to it. Alipay recognizes the input voice, starts the transfer business process, and displays content such as "Payee: Zhang San" and "Transfer amount: 2000" to the user. The transfer is completed after the user enters a password or verifies a fingerprint.
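To make the application-side step concrete, the following Python sketch turns a recognized transfer command into the payee and amount that would be displayed for confirmation. It is purely illustrative and says nothing about how Alipay actually works: the `TransferOrder` type, the `parse_transfer` function, and the assumed command shape "transfer <amount> to <payee>" are all hypothetical.

```python
import re
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class TransferOrder:
    payee: str
    amount: Decimal


# Assumed command shape: "transfer <amount> to <payee>".
_TRANSFER = re.compile(r"^\s*transfer\s+(\d+(?:\.\d+)?)\s+to\s+(.+?)\s*$", re.IGNORECASE)


def parse_transfer(recognized_text: str) -> Optional[TransferOrder]:
    """Extract payee and amount from the app's own recognition result."""
    match = _TRANSFER.match(recognized_text)
    if match is None:
        return None
    return TransferOrder(payee=match.group(2), amount=Decimal(match.group(1)))
```

In this sketch the transfer itself would still wait for the user's password or fingerprint; parsing only prepares what is shown for confirmation.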

Corresponding to the above procedures, the embodiments of the present application further provide a voice function control device applied to a voice assistant of a terminal and a voice function control device applied to an application of a terminal. Both devices can be implemented in software, in hardware, or in a combination of the two. Taking software implementation as an example, the device in the logical sense is formed when the CPU (Central Processing Unit) of the terminal reads the corresponding computer program instructions into memory and runs them. At the hardware level, in addition to the CPU, memory, and non-volatile storage shown in FIG. 4, the terminal in which the voice function control device resides usually also includes other hardware such as a chip for sending and receiving wireless signals, and/or other hardware such as a board for network communication.

图5所示为本申请实施例提供的一种语音的功能控制装置，应用在终端的语音助手上，包括关联应用程序单元和输入语音传递单元，其中：关联应用程序单元用于根据对用户输入语音的识别结果，确定关联应用程序；所述关联应用程序用来实现用户要使用的功能；输入语音传递单元用于将用户的输入语音传递给所述关联应用程序，供所述关联应用程序对用户的输入语音进行识别，并根据识别结果进行所述功能的实现。Figure 5 shows a voice function control device provided by the embodiment of the present application, which is applied to the voice assistant of the terminal, and includes an associated application program unit and an input voice transfer unit, wherein: the associated application program unit is used to The voice recognition result determines the associated application program; the associated application program is used to realize the function that the user wants to use; the input voice transfer unit is used to transmit the user's input voice to the associated application program for the associated application program to The user's input voice is recognized, and the function is realized according to the recognition result.

一个例子中，所述终端上维护有要使用的功能与应用程序的映射关系；所述关联应用程序单元具体用于：根据对用户输入语音的识别结果，确定用户要使用的功能，将与用户要使用的功能具有映射关系的应用程序作为关联应用程序。In one example, the terminal maintains a mapping relationship between the function to be used and the application program; the associated application program unit is specifically used to: determine the function to be used by the user according to the recognition result of the user's input voice, and associate with the user The function to be used has an application with a mapping relationship as an associated application.

In the above example, the mapping relationship between functions to be used and application programs includes a mapping relationship between function keywords and application programs; the associated application program unit is specifically configured to: extract the function keyword from the recognition result of the user's input voice, and take the application program that has a mapping relationship with the function keyword as the associated application program.

In the above example, the mapping relationship between functions to be used and application programs includes: a mapping relationship, set by the user, between functions to be used and application programs; and/or a mapping in which the application program the user most frequently uses to implement a function is taken as the application program mapped to that function; and/or a mapping relationship, submitted by an application program, between that application program and the functions to be used.
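The keyword-based mapping and its three possible sources can be sketched as follows. The text does not say how the sources are ranked when they disagree, so the priority used here (user setting, then most frequent use, then application-submitted) is an assumption, as are the names `FunctionAppMapping`, `record_use`, `lookup`, and `associated_app`.

```python
from collections import Counter
from typing import Dict, Iterable, Optional


class FunctionAppMapping:
    """Hypothetical store for function keyword -> application mappings."""

    def __init__(self) -> None:
        self.user_set: Dict[str, str] = {}       # set explicitly by the user
        self.app_submitted: Dict[str, str] = {}  # submitted by an application
        self.usage: Dict[str, Counter] = {}      # per-keyword usage counts

    def record_use(self, keyword: str, app: str) -> None:
        """Count one use of `app` to implement the function `keyword`."""
        self.usage.setdefault(keyword, Counter())[app] += 1

    def lookup(self, keyword: str) -> Optional[str]:
        # Assumed priority: user setting > most frequent use > app-submitted.
        if keyword in self.user_set:
            return self.user_set[keyword]
        counts = self.usage.get(keyword)
        if counts:
            return counts.most_common(1)[0][0]
        return self.app_submitted.get(keyword)


def associated_app(recognized_text: str,
                   mapping: FunctionAppMapping,
                   keywords: Iterable[str]) -> Optional[str]:
    """Extract the first known function keyword found in the recognition
    result and return the application mapped to it, if any."""
    for keyword in keywords:
        if keyword in recognized_text:
            app = mapping.lookup(keyword)
            if app is not None:
                return app
    return None
```

Plain substring matching stands in for keyword extraction here; a real assistant would likely use its own language-understanding step.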

Optionally, the associated application program unit is specifically configured to: determine the function the user wants to use according to the recognition result of the user's input voice, display to the user, for selection, the names of several application programs on the terminal that can implement the function and support voice input, and take the application program selected by the user as the associated application program.
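The selection-based variant can be sketched as a filter followed by a user prompt. The `AppInfo` record and the `ask_user` callback are hypothetical; how the terminal actually displays the names and collects the choice is not specified in the text.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class AppInfo:
    """Hypothetical description of an installed application."""
    name: str
    functions: FrozenSet[str]
    supports_voice_input: bool


def candidate_apps(function: str, installed: List[AppInfo]) -> List[str]:
    """Names of apps that implement `function` and accept voice input."""
    return [app.name for app in installed
            if function in app.functions and app.supports_voice_input]


def select_associated_app(function: str,
                          installed: List[AppInfo],
                          ask_user: Callable[[List[str]], str]) -> str:
    """Show the candidate names to the user and return the chosen one."""
    names = candidate_apps(function, installed)
    if not names:
        raise LookupError(f"no voice-capable app implements {function!r}")
    choice = ask_user(names)
    if choice not in names:
        raise ValueError(f"{choice!r} was not among the displayed apps")
    return choice
```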

Optionally, the associated application program unit is specifically configured to: extract the application program name from the recognition result of the user's input voice, and take the application program specified in the input voice as the associated application program.
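One simple way to extract an application name from the recognition result is to look for installed application names in the recognized text. The sketch below assumes case-insensitive substring matching and prefers the longest match; both choices, and the function name `app_named_in_text`, are assumptions rather than anything the text prescribes.

```python
from typing import Iterable, Optional


def app_named_in_text(recognized_text: str,
                      installed_names: Iterable[str]) -> Optional[str]:
    """Return the installed app name mentioned in the recognized text,
    preferring the longest match, or None if no name is mentioned."""
    text = recognized_text.casefold()
    matches = [name for name in installed_names if name.casefold() in text]
    return max(matches, key=len) if matches else None
```

Preferring the longest match avoids picking a short name that happens to be a prefix of the name the user actually said.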

Optionally, the input voice transmission unit is specifically configured to: pass the recognition result and the user's input voice to the associated application program, so that the associated application program recognizes the user's input voice and implements the function according to both its own recognition result and the recognition result of the voice assistant.

Figure 6 shows a voice function control device provided by an embodiment of the present application. It is applied to a terminal application program that implements functions other than those of a voice assistant, and includes an input voice receiving unit and a function realization unit, wherein: the input voice receiving unit is configured to receive the user's input voice from the voice assistant; the function realization unit is configured to recognize the user's input voice and implement the function the user wants to use according to the recognition result.

Optionally, the input voice receiving unit is specifically configured to: receive, from the voice assistant, the user's input voice and the voice assistant's recognition result for that input voice; the function realization unit is specifically configured to: recognize the user's input voice, and implement the function the user wants to use according to both its own recognition result and the recognition result from the voice assistant.
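The application-side device of Figure 6 can be sketched in the same style. The text does not specify how the two recognition results are combined; the sketch assumes the application prefers its own result and falls back to the assistant's result only when its own recognition yields nothing. `VoiceRequest`, `TerminalApp`, `receive`, and `perform` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class VoiceRequest:
    """What the voice assistant passes to the application (assumed shape)."""
    audio: bytes
    assistant_text: Optional[str] = None  # assistant's recognition result, if sent


class TerminalApp:
    def __init__(self, recognize: Callable[[bytes], str]) -> None:
        # The application's own, domain-specific recognizer (assumed).
        self._recognize = recognize

    def receive(self, request: VoiceRequest) -> str:
        """Input voice receiving unit plus function realization unit."""
        own_text = self._recognize(request.audio)
        # Assumed combination rule: own result first, assistant's as fallback.
        text = own_text or request.assistant_text or ""
        return self.perform(text)

    def perform(self, text: str) -> str:
        """Placeholder for implementing the function the user wants."""
        return f"executing: {text}" if text else "nothing recognized"
```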

In one example, the device further includes a mapping relationship submitting unit, configured to submit to the voice assistant, according to the user's instruction, the mapping relationship between this application program and the functions to be used.

In the above example, the mapping relationship between this application program and the functions to be used includes a mapping relationship between this application program and function keywords.
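Submitting a mapping from the application to the voice assistant can be sketched as below. The registry interface is hypothetical: the text only says that the application submits its mapping to the voice assistant on the user's instruction, not how the assistant exposes that capability.

```python
from typing import Dict, Iterable


class AssistantRegistry:
    """Hypothetical assistant-side store of app-submitted mappings."""

    def __init__(self) -> None:
        self.keyword_to_app: Dict[str, str] = {}

    def submit_mapping(self, app_name: str, keywords: Iterable[str]) -> None:
        for keyword in keywords:
            self.keyword_to_app[keyword] = app_name


class MappingSubmitter:
    """Mapping relationship submitting unit inside an application."""

    def __init__(self, app_name: str, registry: AssistantRegistry) -> None:
        self.app_name = app_name
        self.registry = registry

    def on_user_instruction(self, keywords: Iterable[str]) -> None:
        # Runs only when the user asks for it, as the text requires.
        self.registry.submit_mapping(self.app_name, keywords)
```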

The above are only preferred embodiments of the present application and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present application shall fall within its scope of protection.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

Memory may include non-permanent memory in computer-readable media, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, which can store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape and magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

It should also be noted that the terms "comprise", "include", and any other variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.

Those skilled in the art should understand that the embodiments of the present application may be provided as a method, a system, or a computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) that contain computer-usable program code.

Claims

1. A voice function control method, applied to a voice assistant of a terminal, characterized by comprising:

determining an associated application program according to the recognition result of the user's input voice, the associated application program being used to implement the function the user wants to use;

passing the user's input voice to the associated application program, so that the associated application program recognizes the user's input voice and implements the function according to the recognition result.

2. The method according to claim 1, wherein the terminal maintains a mapping relationship between functions to be used and application programs;

the determining an associated application program according to the recognition result of the user's input voice includes: determining the function the user wants to use according to the recognition result of the user's input voice, and taking the application program that has a mapping relationship with that function as the associated application program.

3. The method according to claim 2, wherein the mapping relationship between the function to be used and the application program includes: the mapping relationship between the function keyword and the application program;

The determining the associated application program according to the recognition result of the user input voice includes: extracting the function keyword in the recognition result of the user input voice, and using the application program having a mapping relationship with the function keyword as the associated application program.

4. The method according to claim 2 or 3, wherein the mapping relationship between the function to be used and the application program includes:

a mapping relationship, set by the user, between functions to be used and application programs; and/or,

a mapping in which the application program the user most frequently uses to implement a function to be used is taken as the application program mapped to that function; and/or,

a mapping relationship, submitted by an application program, between that application program and the functions to be used.

5. The method according to claim 1, wherein the determining an associated application program according to the recognition result of the user's input voice comprises: determining the function the user wants to use according to the recognition result of the user's input voice, displaying to the user, for selection, the names of several application programs on the terminal that can implement the function and support voice input, and taking the application program selected by the user as the associated application program.

6. The method according to claim 1, wherein the determining an associated application program according to the recognition result of the user's input voice comprises: extracting the application program name from the recognition result of the user's input voice, and taking the application program specified in the input voice as the associated application program.

7. The method according to claim 1, wherein the passing the user's input voice to the associated application program comprises: passing the recognition result and the user's input voice to the associated application program, so that the associated application program recognizes the user's input voice and implements the function according to both its own recognition result and the recognition result of the voice assistant.

8. A voice function control method, applied to a terminal application program that implements functions other than those of a voice assistant, characterized by comprising:

receiving the user's input voice from the voice assistant;

recognizing the user's input voice, and implementing the function the user wants to use according to the recognition result.

9. The method according to claim 8, wherein the receiving the user's input voice from the voice assistant comprises: receiving, from the voice assistant, the user's input voice and the voice assistant's recognition result for that input voice;

the recognizing the user's input voice and implementing the function the user wants to use according to the recognition result comprises: recognizing the user's input voice, and implementing the function the user wants to use according to both the application program's own recognition result and the recognition result from the voice assistant.

10. The method according to claim 8, further comprising: submitting the mapping relationship between the application program and the function to be used to the voice assistant according to the user's instruction.

11. The method according to claim 10, wherein the mapping relationship between the application program and the functions to be used comprises: the mapping relationship between the application program and function keywords.

12. A voice function control device, applied to a terminal voice assistant, characterized in that it comprises:

The associated application program unit is used to determine the associated application program according to the recognition result of the voice input by the user; the associated application program is used to realize the function to be used by the user;

The input voice transmission unit is used to transmit the user's input voice to the associated application program, so that the associated application program can recognize the user's input voice, and implement the function according to the recognition result.

13. The device according to claim 12, wherein the terminal maintains a mapping relationship between functions to be used and applications;

The associated application program unit is specifically configured to: determine the function to be used by the user according to the recognition result of the user's input voice, and use the application program that has a mapping relationship with the function to be used by the user as the associated application program.

14. The device according to claim 13, wherein the mapping relationship between the function to be used and the application program includes: the mapping relationship between the function keyword and the application program;

The associated application program unit is specifically configured to: extract a functional keyword in a recognition result of the user input voice, and use an application program having a mapping relationship with the functional keyword as an associated application program.

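15. The device according to claim 13 or 14, wherein the mapping relationship between the function to be used and the application program includes:

a mapping relationship, set by the user, between functions to be used and application programs; and/or,

a mapping in which the application program the user most frequently uses to implement a function to be used is taken as the application program mapped to that function; and/or,

a mapping relationship, submitted by an application program, between that application program and the functions to be used.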

16. The device according to claim 12, wherein the associated application program unit is specifically configured to: determine the function the user wants to use according to the recognition result of the user's input voice, display to the user, for selection, the names of several application programs on the terminal that can implement the function and support voice input, and take the application program selected by the user as the associated application program.

17. The device according to claim 12, wherein the associated application program unit is specifically configured to: extract the application program name from the recognition result of the user's input voice, and take the application program specified in the input voice as the associated application program.

18. The device according to claim 12, wherein the input voice transmission unit is specifically configured to: pass the recognition result and the user's input voice to the associated application program, so that the associated application program recognizes the user's input voice and implements the function according to both its own recognition result and the recognition result of the voice assistant.

19. A voice function control device, applied to a terminal application program that implements functions other than those of a voice assistant, characterized in that it comprises:

an input voice receiving unit, configured to receive the user's input voice from the voice assistant;

a function realization unit, configured to recognize the user's input voice and implement the function the user wants to use according to the recognition result.

20. The device according to claim 19, wherein the input voice receiving unit is specifically configured to: receive, from the voice assistant, the user's input voice and the voice assistant's recognition result for that input voice;

The function realization unit is specifically configured to: recognize the user's input voice, and realize the function that the user wants to use according to its own recognition result and the recognition result from the voice assistant.

21. The device according to claim 19, further comprising: a mapping relationship submitting unit, configured to submit the mapping relationship between the application program and the function to be used to the voice assistant according to the user's instruction.

22. The device according to claim 21, wherein the mapping relationship between the application program and the function to be used comprises: a mapping relationship between the application program and function keywords.