US20150187355A1 - Text Editing With Gesture Control And Natural Speech - Google Patents
- Publication number
- US20150187355A1 (application US 14/577,600)
- Authority
- US
- United States
- Prior art keywords
- user
- transcription text
- editing
- text
- computer device
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F1/00—Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
- G06F1/16—Constructional details or arrangements
- G06F1/1613—Constructional details or arrangements for portable computers
- G06F1/163—Wearable computers, e.g. on a belt
-
- G06F17/24—
-
- G06F17/2735—
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/237—Lexical tools
- G06F40/242—Dictionaries
Definitions
- Mobile computing devices such as a laptop or notebook PC, a smart phone, and tablet computing device, are now common tools used for producing, analyzing, communicating, and consuming data in both business and personal life. Consumers continue to embrace a mobile digital lifestyle as the ease of access to digital information increases with high speed wireless communications technologies becoming ubiquitous.
- Popular uses of mobile computing devices include displaying large amounts of high-resolution computer graphics information and video content, often wirelessly streamed to the device. While these devices typically include a display screen, the preferred visual experience of a high resolution, large format display cannot be easily replicated in such mobile devices because the physical size of such device is limited to promote mobility.
- micro-displays can provide large-format, high-resolution color pictures and streaming video in a very small form factor.
- One application for such displays can be integrated into a wireless headset computer worn on the head of the user with a display within the field of view of the user, similar in format to eyeglasses, audio headset or video eyewear.
- a “wireless computing headset” device also referred to herein as a headset computer (HSC) or head mounted display (HMD), includes one or more small, high resolution micro-displays and associated optics to magnify the image.
- the high resolution micro-displays can provide super video graphics array (SVGA) (800×600) resolution or extended graphic arrays (XGA) (1024×768) resolution, or higher resolutions known in the art.
- a wireless computing headset contains one or more wireless computing and communication interfaces, enabling data and streaming video capability, and provides greater convenience and mobility than hands-dependent devices.
- Embodiments of the present invention have features of at least three forms of hands-free editing.
- Text Dictation Post-Processing, which concerns techniques for automatically correcting dictated text using resources from the user's own environment. Such techniques include the use of auto-correction algorithms (based on a standard language dictionary and/or the user's personal dictionary).
- Another feature concerns the use of speech commands to edit the message as a whole.
- Such commands include global find and replace commands, for example.
- Another feature concerns using speech and gesture commands for selecting and editing words in a document (e.g., a text file).
- the invention is a headset computer device including a microdisplay driven by a processor, a microphone configured to provide user utterances as input to the processor, and a speech processing engine executed by the processor.
- the speech processing engine configured to transcribe the user utterances and display resulting transcription text on the microdisplay, and be responsive to a user input directed to editing the displayed transcription text.
- being responsive to input from the user further may include providing editing assistance regarding the transcription text.
- the editing assistance may include suggested alternatives to one or more portions of the transcription text.
- the suggested alternatives are based at least on context of the portions of the transcription text within the transcription text.
- the editing assistance may include auto-correcting transcription text based on a dictionary.
- the dictionary may include one or more of a standard language dictionary and a personal dictionary.
- the editing assistance may include name correction using a local contact database.
- the user input directed to editing the displayed transcription text includes at least one spoken global language command.
- the user input may include at least one of a user control gesture, and a user control utterance, for selecting a portion of the transcription text.
- the user input may further include at least one of a user edit gesture, and a user edit utterance, for selecting an edit command to apply to the portion of the transcription text.
- the invention is a method of editing on a headset computer device, including transcribing user utterances to produce transcription text, displaying the transcription text, and responding to user input directed to editing the displayed transcription text.
- responding to user input further includes providing editing assistance.
- Providing editing assistance may further include suggesting alternatives to one or more portions of the transcription text.
- the alternatives may be based at least on context of the portions of the transcription text within the transcription text.
- the portions of the transcription text may be individual words, or groups of words (i.e., phrases).
- providing editing assistance may include auto-correcting transcription text based on a dictionary.
- Providing editing assistance may also include correcting names using a local contact database, such as a contacts list or an address book, or other information structure containing names, addresses, phone numbers, and so on.
- embodiments may use these information structures to correct street addresses and other components of the structure.
- the user input directed to editing the displayed transcription text includes at least one spoken global language command.
- the user input includes at least one of a user control gesture, and a user control utterance, for selecting a portion of the transcription text.
- the user input may further include at least one of a user edit gesture, and a user edit utterance, for selecting an edit command to apply to the portion of the transcription text.
- the invention is a non-transitory computer-readable medium for editing text, the non-transitory computer-readable medium comprising computer software instructions stored thereon.
- the computer software instructions when executed by at least one processor, causes a computer system to transcribe user utterances to produce transcription text, display the transcription text, and respond to user input directed to editing the displayed transcription text.
- FIGS. 1A-1B are schematic illustrations of a headset computer cooperating with a host computer (e.g., Smart Phone, laptop, etc.) according to principles of the present invention.
- FIG. 2 is a block diagram of flow of data and control in the embodiment of FIGS. 1A-1B .
- FIG. 3 is a schematic view of one embodiment with ASR (speech recognition module).
- FIG. 4 illustrates a method of editing text on a headset computer device according to embodiments of the invention.
- FIGS. 1A and 1B show an example embodiment of a wireless computing headset device 100 (also referred to herein as a headset computer (HSC) or head mounted display (HMD)) that incorporates a high-resolution (VGA or better) micro-display element 1010 , and other features described below.
- HSC 100 can include audio input and/or output devices, including one or more microphones, input and output speakers, geo-positional sensors (GPS), three to nine axis degrees of freedom orientation sensors, atmospheric sensors, health condition sensors, digital compass, pressure sensors, environmental sensors, energy sensors, acceleration sensors, position, attitude, motion, velocity and/or optical sensors, cameras (visible light, infrared, etc.), multiple wireless radios, auxiliary lighting, rangefinders, or the like and/or an array of sensors embedded and/or integrated into the headset and/or attached to the device via one or more peripheral ports 1020 ( FIG. 1B ).
- typically located within the housing of headset computing device 100 are various electronic circuits, including a microcomputer (single or multicore processors), one or more wired and/or wireless communications interfaces, memory or storage devices, various sensors, and a peripheral mount or mounts, such as a “hot shoe.”
- Example embodiments of the HSC 100 can receive user input through sensing voice commands, head movements 110, 111, 112, and hand gestures 113, or any combination thereof.
- a microphone or microphones operatively coupled to or integrated into the HSC 100 can be used to capture speech commands, which are then digitized and processed using automatic speech recognition techniques.
- Gyroscopes, accelerometers, and other micro-electromechanical system sensors can be integrated into the HSC 100 and used to track the user's head movements 110 , 111 , 112 to provide user input commands.
- Cameras or motion tracking sensors can be used to monitor a user's hand gestures 113 for user input commands.
- Such a user interface may overcome the disadvantages of hands-dependent formats inherent in other mobile devices.
- the HSC 100 can be used in various ways. It can be used as a peripheral display for displaying video signals received and processed by a remote host computing device 200 (shown in FIG. 1A ).
- the host 200 may be, for example, a notebook PC, smart phone, tablet device, or other computing device having less or greater computational complexity than the wireless computing headset device 100 , such as cloud-based network resources.
- the headset computing device 100 and host 200 can wirelessly communicate via one or more wireless protocols, such as Bluetooth®, Wi-Fi, WiMAX, 4G LTE or other wireless radio link 150 .
- (Bluetooth is a registered trademark of Bluetooth Sig, Inc. of 5209 Lake Washington Boulevard, Kirkland, Wash. 98033.)
- the host 200 may be further connected to other networks, such as through a wireless connection to the Internet or other cloud-based network resources, so that the host 200 can act as a wireless relay between the HSC 100 and the network 210 .
- some embodiments of the HSC 100 can establish a wireless connection to the Internet (or other cloud-based network resources) directly, without the use of a host wireless relay.
- components of the HSC 100 and the host 200 may be combined into a single device.
- FIG. 1B is a perspective view showing some details of an example embodiment of a headset computer 100 .
- the example embodiment HSC 100 generally includes a frame 1000, strap 1002, rear housing 1004, speaker 1006, cantilever (alternatively referred to as an arm or boom) 1008 with a built-in microphone, and a micro-display subassembly 1010.
- a head worn frame 1000 and strap 1002 are generally configured so that a user can wear the headset computer device 100 on the user's head.
- a housing 1004 is generally a low profile unit which houses the electronics, such as the microprocessor, memory or other storage device, along with other associated circuitry. Speakers 1006 provide audio output to the user so that the user can hear information.
- Micro-display subassembly 1010 is used to render visual information to the user. It is coupled to the arm 1008 .
- the arm 1008 generally provides physical support such that the micro-display subassembly is able to be positioned within the user's field of view 300 ( FIG. 1A ), preferably in front of the eye of the user or within its peripheral vision preferably slightly below or above the eye. Arm 1008 also provides the electrical or optical connections between the micro-display subassembly 1010 and the control circuitry housed within housing unit 1004 .
- the HSC display device 100 allows a user to select a field of view 300 within a much larger area defined by a virtual display 400 .
- the user can typically control the position, extent (e.g., X-Y or 3D range), and/or magnification of the field of view 300 .
- While what is shown in FIGS. 1A and 1B is a monocular micro-display presenting a single fixed display element supported on the face of the user with a cantilevered boom, it should be understood that other mechanical configurations for the remote control display device 100 are possible, such as a binocular display with two separate micro-displays (e.g., one for each eye) or a single micro-display arranged to be viewable by both eyes.
- FIG. 2 is a block diagram showing more detail of an embodiment of the HSC or HMD device 100 , host 200 and the data that travels between them.
- the HSC or HMD device 100 receives vocal input from the user via the microphone, hand movements or body gestures via positional and orientation sensors, the camera or optical sensor(s), and head movement inputs via the head tracking circuitry such as 3 axis to 9 axis degrees of freedom orientational sensing. These are translated by software (processors) in the HSC or HMD device 100 into keyboard and/or mouse commands that are then sent over the Bluetooth or other wireless interface 150 to the host 200 .
- the host 200 interprets these translated commands in accordance with its own operating system/application software to perform various functions.
- Among the commands is one to select a field of view 300 within the virtual display 400 and return that selected screen data to the HSC or HMD device 100 .
- a very large format virtual display area might be associated with application software or an operating system running on the host 200 .
- only a portion of that large virtual display area 400 within the field of view 300 is returned to and actually displayed by the micro display 1010 of HSC or HMD device 100 .
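The relationship between the virtual display 400 and the field of view 300 can be pictured as cropping a window out of a larger canvas. The following is a minimal sketch only: the function name `field_of_view`, the pixel units, and the clamping behavior at the edges are assumptions made for illustration and are not taken from the patent text.

```python
def field_of_view(virtual_w, virtual_h, center_x, center_y, view_w, view_h):
    """Return (left, top, right, bottom) of the viewport inside the virtual display.

    Hypothetical helper: assumes the viewport is no larger than the virtual
    display and clamps it so that it never extends past the edges.
    """
    left = min(max(center_x - view_w // 2, 0), virtual_w - view_w)
    top = min(max(center_y - view_h // 2, 0), virtual_h - view_h)
    return left, top, left + view_w, top + view_h


# An 800x600 field of view centered in a 4096x3072 virtual display.
print(field_of_view(4096, 3072, 2048, 1536, 800, 600))
```

Under these assumptions only the returned rectangle would need to be rendered on the micro-display, which is consistent with the idea that a small portion of the virtual display is returned to the headset.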
- the HSC 100 may take the form of the device described in a co-pending US Patent Publication Number 2011/0187640, which is hereby incorporated by reference in its entirety.
- the invention relates to the concept of using a Head Mounted Display (HMD) 1010 in conjunction with an external ‘smart’ device 200 (such as a smartphone or tablet) to provide information and control to the user hands-free.
- the invention requires transmission of small amounts of data, providing a more reliable data transfer method running in real-time.
- the amount of data to be transmitted over the connection 150 is small: simply instructions on how to lay out a screen, which text to display, and other stylistic information such as drawing arrows, background colors, images to include, etc.
- Additional data could be streamed over the same connection 150 or another connection and displayed on screen 1010, such as a video stream if required by the host 200.
- FIG. 3 shows an example embodiment of a wireless hands-free video computing headset 100 under voice command, according to one embodiment of the present invention.
- the user may be presented with an image on the micro-display 9010 , for example, as output by host computer 200 application mentioned above.
- a user of the HMD 100 can employ joint head-tracking and voice command text selection software module 9036 , either locally or from a remote host 200 , in which the user is presented with a sequence of screen views implementing hands free text selection on the micro-display 9010 and the audio of the same through the speaker 9006 of the headset computer 100 .
- Because the headset computer 100 is also equipped with a microphone 9020, the user can utter voice commands (e.g., to make command selections) as illustrated next with respect to embodiments of the present invention.
- FIG. 3 shows a schematic diagram illustrating the operative modules of the headset computer 100.
- controller 9100 accesses user command configuration module 9036 , which can be located locally to each HMD 100 or located remotely at a host 200 ( FIGS. 1A-1B ).
- User configurable speech command or speech command replacement software module 9036 contains instructions to display to a user an image of a pertinent request dialog box or the like.
- the graphics converter module 9040 receives the image instructions from the speech command module 9036 via bus 9103 and converts them into graphics to display on the monocular display 9010.
- the text-to-speech module 9035 b may, contemporaneous with the graphics display described above, convert the instructions from text selection software module 9036 into digital sound representations corresponding to the contents of the screen views 410 to be displayed.
- the text-to-speech module 9035b feeds the digital sound representations to the digital-to-analog converter 9021b, which in turn feeds speaker 9006 to present the audio output to the user.
- Speech command replacement/user reconfiguration software module 9036 can be stored locally at memory 9120 or remotely at a host 200 (FIG. 1A). The user can speak/utter the replacement command selection from the image, and the user's speech 9090 is received at microphone 9020. The received speech is then converted from an analog signal into a digital signal at analog-to-digital converter 9021a. Once the speech is converted from an analog to a digital signal, speech recognition module 9035a processes the speech into recognized speech.
- the recognized speech is compared against known speech and processed into text according to instructions of speech-to-text module 9036 .
- Typical dictation strategies rely on a user employing the touch screen of a mobile device to point and highlight the text to edit, possibly in conjunction with a keyboard to manually enter the corrections.
- the touch screen technique for entering dictation corrections may not be applicable.
- Text Dictation Post-Processing: A first stage in ensuring text is close to that desired by the user is dictation post-processing, i.e., processing of the text as soon as it is returned by the dictation engine (aka speech-to-text engine 9036).
- One post-processing technique is an auto-correction algorithm, using a standard language dictionary and/or the user's personal dictionary, one or both stored local to the device 100 .
- Memory 9120 may store these dictionaries, and instructions of modules 9035 a and 9036 may utilize the dictionaries in processing (i.e., recognizing and autocorrecting) speech submitted by the user.
- a “standard language dictionary” means a dictionary that has been compiled for use in general context, for example a dictionary provided by a third party entity for use by different end users.
- a personal dictionary, as used herein, means a specifically compiled dictionary for use by a narrower audience than would use a standard language dictionary.
- a personal dictionary may be used by a specific person, and may be modified and edited to be specifically used by that person.
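The dictionary-based auto-correction described above could be sketched roughly as follows. This is an illustration under stated assumptions, not the patented algorithm: the word lists, the `autocorrect` function, and the similarity cutoff are all hypothetical, and Python's standard `difflib` similarity is used as a stand-in for whatever matching the device actually performs. Punctuation and capitalization are ignored for brevity.

```python
import difflib

# Hypothetical dictionaries; a real device would load these from local memory.
STANDARD_DICTIONARY = {"please", "send", "the", "report", "to", "team", "today"}
PERSONAL_DICTIONARY = {"hsc", "microdisplay"}


def autocorrect(text, standard=STANDARD_DICTIONARY, personal=PERSONAL_DICTIONARY,
                cutoff=0.75):
    """Replace words found in neither dictionary with the closest known word."""
    vocabulary = sorted(personal) + sorted(standard)
    corrected = []
    for word in text.split():
        key = word.lower()
        if key in personal or key in standard:
            corrected.append(word)  # already a known word, keep as dictated
            continue
        # Closest vocabulary entry above the cutoff, if any.
        matches = difflib.get_close_matches(key, vocabulary, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else word)
    return " ".join(corrected)


print(autocorrect("please send the reprot today"))
```

Words with no sufficiently close dictionary entry are left unchanged, so the sketch only corrects what it can match with some confidence.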
- Another post-processing technique is name correction, based on a local contact database (such as the user's address book or contact list) on the device 100 (memory 9120 ).
- many dictation engines have no concept of the user's address book and may return names that, while correctly spelled in one environment, are not correctly spelled in the user's environment.
- Fuzzy matching algorithms can be used by modules 9035 a and 9036 to compare names found in the dictated text against those in the user's address book. In turn, modules 9035 a, 9036 may correct any obvious misspellings.
- a name interpreted as “Chris” might be replaced with “Kris” by module 9036 , where the term is phonetically the same, but needs spelling correction according to the user's address book in memory 9120 .
- the device 100 may also use context to make a proper correction. For example, if a user's address book includes “Kris Jones” and “Chris Baker,” the first name spoken with the last name would provide a clue as to whether the correct spelling should be “Kris” or “Chris.”
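The “Kris Jones” versus “Chris Baker” example can be sketched as a fuzzy match that uses the spoken last name as context. The sketch below is hypothetical: `correct_name`, the `CONTACTS` list, and the cutoff value are illustrative assumptions, and spelling similarity from Python's `difflib` stands in for the phonetic comparison the text describes.

```python
import difflib

# Hypothetical address book, using the names from the example above.
CONTACTS = ["Kris Jones", "Chris Baker"]


def correct_name(first, last, contacts=CONTACTS, cutoff=0.6):
    """Return the address-book spelling of `first`, using `last` as context."""
    by_last = {}
    for full_name in contacts:
        contact_first, contact_last = full_name.split(" ", 1)
        by_last.setdefault(contact_last.lower(), []).append(contact_first)

    candidates = by_last.get(last.lower())
    if not candidates:
        return first  # no contact with this last name, leave the text alone
    matches = difflib.get_close_matches(first, candidates, n=1, cutoff=cutoff)
    return matches[0] if matches else first


print(correct_name("Chris", "Jones"))
print(correct_name("Chris", "Baker"))
```

When the last name is not in the address book, the dictated spelling is kept, which avoids overcorrecting names that simply are not contacts.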
- Global Natural Language Commands: The Global Natural Language commands in speech processing modules 9035a and 9036 allow a user to issue commands that apply to the whole document. A “find and replace all” command is an example of a Global Natural Language Command.
- the spoken command is captured (parsed and recognized by module 9035 a ) and sent to the NLU engine (part of module 9036 ) for processing, along with the current message to edit.
- Such global commands may be programmed into the instruction set of modules 9035 a and 9036 .
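A global “find and replace all” command might be handled along the following lines. This is a minimal sketch in which a regular expression stands in for the NLU engine; the command grammar (“replace all X with Y”) and the function `apply_global_command` are assumptions for illustration only.

```python
import re

# Hypothetical command grammar; a real NLU engine would accept many phrasings.
COMMAND = re.compile(
    r"^replace (?:all|every) (?P<old>.+?) with (?P<new>.+)$", re.IGNORECASE
)


def apply_global_command(utterance, message):
    """Apply a recognized find-and-replace-all command to the whole message."""
    match = COMMAND.match(utterance.strip())
    if match is None:
        return message  # not a global command, leave the message unchanged
    pattern = re.compile(r"\b" + re.escape(match.group("old")) + r"\b",
                         re.IGNORECASE)
    # A function replacement avoids interpreting backslashes in the new text.
    return pattern.sub(lambda _: match.group("new"), message)


print(apply_global_command("replace all Tuesday with Thursday",
                           "Meet Tuesday. Confirm by tuesday noon."))
```

The command and the current message are passed in together, mirroring the description of the spoken command being sent for processing along with the message to edit.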
- the HSC 100 tracks head movements to selectively highlight words of a textual phrase within the user's field of view.
- the speech processing module 9036 may present a menu of options to the user, e.g., ⁇ cut
- the menu may be displayed through microdisplay 9010 , and audibly presented through speaker 9006 as previously operationally described.
- Third-party content providers, such as a dictionary and/or a thesaurus, could be accessed by modules 9035a and 9036 to provide definitions for words and to suggest alternative word choices.
- HSC 100 may obtain the selection of words in text by tracking the movement of the user's head, so as to create a scrolling effect that moves the cursor (pointer) along the displayed text to the desired word. For example, a user may cause word selection to progress from word to word in a sentence from left to right in his field of view by turning his head from left to right (i.e., clockwise) or tilting his head to the right. The user may stop the word progression to select a particular word by returning his head to a neutral position (i.e., facing straight ahead, tilted neither to the left nor to the right).
- the HSC 100 may present an edit options menu with several editing options, e.g., ⁇ cut
- the user may use head tilting and/or rotation as described above to select one of the edit options, and use a gesture, head movement or spoken utterance (or similar input) to execute the selected command.
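The head-tracked word selection described above can be pictured as a small state machine over orientation samples. The sketch below is hypothetical: `select_word`, the use of yaw angles in degrees, the 10-degree threshold, one-word-per-sample stepping, and the leftward (backward) movement are all assumptions; the text itself only describes left-to-right progression and a return to neutral.

```python
def select_word(words, yaw_samples, threshold_deg=10.0):
    """Return the word selected by a sequence of head-yaw samples, or None.

    Positive yaw (head turned right) advances the cursor one word per sample,
    negative yaw moves it back, and a return to neutral after any movement
    selects the word currently under the cursor.
    """
    index = 0
    moved = False
    for yaw in yaw_samples:
        if yaw > threshold_deg:
            index = min(index + 1, len(words) - 1)
            moved = True
        elif yaw < -threshold_deg:
            index = max(index - 1, 0)
            moved = True
        elif moved:
            return words[index]  # head back at neutral: stop and select
    return None  # the head never returned to neutral after moving


words = "send the report to Kris".split()
print(select_word(words, [0.0, 15.0, 15.0, 2.0]))
```

The same loop could drive selection within an edit options menu by passing the menu entries in place of the sentence words.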
- Embodiments 100 may employ autosuggest to render (on display 9010 and through speaker 9006 ) the identification of contacts, dates, and other entities to correct the spelling of words taken from a third party trained NLU model. For example, a user speaks the name “Mac-ken-zee” and the NLU model recognizes and presents the word “McKenzie.” Device 100 may use a “contacts” database to search for matches for McKenzie, and autosuggest “Mackenzie Adams (madams@gmail.com)” as the contact that most closely matches the recognized name “McKenzie.”
- HSC 100 may suggest alternative words such as synonyms by classifying the context of the statement. For instance, if the context of the statement is to do computations, for example “for divided by three,” then HSC 100 highlights the displayed word “for” on display 9010 , and speech module 9036 issues the suggested replacement “four” displayed on microdisplay 9010 and enunciated through speaker 9006 as illustrated.
- processing for suggesting alternative words based on context within a sentence or other structure may be accomplished with program procedures for modules 9035 a and 9036 , using for example one or more of the elements set forth below, or combinations of similar elements.
- Tokenize the sentence: First, partition each sentence into a list of word tokens and remove the stop words. Stop words are frequently occurring, insignificant words that appear in a database record, article, or a web page, etc.
- Action Classification Tagging: Using a training corpus, the context and action of the phrase are determined. For example, the detection of a mathematical operation “divided by” indicates a mathematical action, and a mathematical entity set is chosen to disambiguate the statement.
- Entity Disambiguation Tagging: This task is to identify the correct entity (like numeric, date time, or as a member of a set of entities) of each word in the sentence.
- the algorithm takes a sentence as input and a specified tag set. The output is a single best tag for each word. Tagging ambiguities are resolved by using a training corpus to compute the probability of a given word having a given tag in a given context.
- for each word, the gloss of each of its senses is compared to the glosses of every other word in a phrase.
- a word is assigned to the sense whose gloss shares the largest number of words in common with the glosses of the other words. For example, the word “for” is a preposition and the word “two” is a number.
- the context is determined to be a mathematical division operation that expects a numerical dividend and a divisor to complete the mathematical operation. Since “for” is not numeric, fuzzy logic is employed to determine the edit distance between “for” and the next closest number which is “four” and is numeric. The word “four” is then suggested as the appropriate substitution for “for” in the spoken phrase.
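The “for divided by three” example can be sketched with a simple operator lookup and Levenshtein edit distance. This is a simplified illustration, not the trained-corpus tagging described above: the operator list, the number-word list, `suggest_operands`, and the maximum distance of 1 are assumptions, and a hand-written operator table stands in for action classification.

```python
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]
# Hypothetical operator phrases that signal a mathematical action.
MATH_OPERATORS = [("divided", "by"), ("times",), ("plus",), ("minus",)]


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution
        prev = curr
    return prev[-1]


def suggest_operands(sentence, max_distance=1):
    """Suggest number words for non-numeric operands of a math operator."""
    tokens = sentence.lower().split()
    suggestions = {}
    for op in MATH_OPERATORS:
        n = len(op)
        for i in range(len(tokens) - n + 1):
            if tuple(tokens[i:i + n]) != op:
                continue
            # The operands are the tokens just before and just after the operator.
            for pos in (i - 1, i + n):
                if not 0 <= pos < len(tokens):
                    continue
                word = tokens[pos]
                if word in NUMBER_WORDS or word.isdigit():
                    continue  # already numeric
                best = min(NUMBER_WORDS, key=lambda num: edit_distance(word, num))
                if edit_distance(word, best) <= max_distance:
                    suggestions[word] = best
    return suggestions


print(suggest_operands("for divided by three"))
```

Note that this sketch skips stop-word removal, since “for” itself commonly appears on stop-word lists and would otherwise be dropped before it could be corrected.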
- FIG. 4 illustrates a method of editing text on a headset computer device, including transcribing user utterances to produce transcription text, displaying 404 the transcription text, responding 406 to user input directed to editing the displayed transcription text, and providing 408 editing assistance.
- the editing assistance may include one or more of (i) suggesting alternatives to one or more portions of the transcription text, (ii) auto-correcting transcription text based on a dictionary and (iii) correcting names using a local contact database, among others.
- certain embodiments of the example embodiments described herein may be implemented as logic that performs one or more functions.
- This logic may be hardware-based, software-based, or a combination of hardware-based and software-based.
- Some or all of the logic may be stored on one or more tangible, non-transitory, computer-readable storage media and may include computer-executable instructions that may be executed by a controller or processor.
- the computer-executable instructions may include instructions that implement one or more embodiments of the invention.
- the tangible, non-transitory, computer-readable storage media may be volatile or non-volatile and may include, for example, flash memories, dynamic memories, removable disks, and non-removable disks.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Computer Hardware Design (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
Description
- This application claims the benefit of U.S. Provisional Application No. 61/921,267, filed on Dec. 27, 2013. The entire teachings of the above application are incorporated herein by reference.
- Mobile computing devices, such as a laptop or notebook PC, a smart phone, and tablet computing device, are now common tools used for producing, analyzing, communicating, and consuming data in both business and personal life. Consumers continue to embrace a mobile digital lifestyle as the ease of access to digital information increases with high speed wireless communications technologies becoming ubiquitous. Popular uses of mobile computing devices include displaying large amounts of high-resolution computer graphics information and video content, often wirelessly streamed to the device. While these devices typically include a display screen, the preferred visual experience of a high resolution, large format display cannot be easily replicated in such mobile devices because the physical size of such device is limited to promote mobility. Another drawback of the aforementioned device types is that the user interface is hands-dependent, typically requiring a user to enter data or make selections using a keyboard (physical or virtual) or touch-screen display. As a result, consumers are now seeking a hands-free, high quality, portable, color display solution to augment or replace their hands-dependent mobile devices.
- Recently developed micro-displays can provide large-format, high-resolution color pictures and streaming video in a very small form factor. One application for such displays is integration into a wireless headset computer worn on the head of the user, with a display within the field of view of the user, similar in format to eyeglasses, an audio headset or video eyewear.
- A “wireless computing headset” device, also referred to herein as a headset computer (HSC) or head mounted display (HMD), includes one or more small, high resolution micro-displays and associated optics to magnify the image. The high resolution micro-displays can provide super video graphics array (SVGA) (800×600) resolution or extended graphics array (XGA) (1024×768) resolution, or higher resolutions known in the art.
- A wireless computing headset contains one or more wireless computing and communication interfaces, enabling data and streaming video capability, and provides greater convenience and mobility than hands-dependent devices.
- For more information concerning such devices, see co-pending patent applications entitled “Mobile Wireless Display Software Platform for Controlling Other Systems and Devices,” U.S. application Ser. No. 12/348,648 filed Jan. 5, 2009, “Handheld Wireless Display Devices Having High Resolution Display Suitable For Use as a Mobile Internet Device,” PCT International Application No. PCT/US09/38601 filed Mar. 27, 2009, and “Improved Headset Computer,” U.S. Application No. 61/638,419 filed Apr. 25, 2012, each of which is incorporated herein by reference in its entirety.
- As used herein, “HSC” headset computers, “HMD” head mounted display device, and “wireless computing headset” device may be used interchangeably.
- Embodiments of the present invention have features of at least three forms of hands-free editing.
- One feature is Text Dictation Post-Processing, which concerns techniques for automatically correcting dictated text using resources from the user's own environment. Such techniques include the use of auto-correction algorithms (based on a standard language dictionary and/or the user's personal dictionary).
- Another feature concerns the use of speech commands to edit the message as a whole. Such commands include global find and replace commands, for example.
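- As an illustration only, a whole-message command of this kind could be handled roughly as in the following Python sketch. It assumes the spoken command has already been transcribed to text. The function name `apply_global_command`, the regular expressions, and the two supported phrasings are hypothetical choices made for this example, not the command set of modules 9035 a and 9036.

```python
import re

# Hypothetical command grammar: "Replace <A> with <B>" or "Change <A> to <B>".
REPLACE_CMD = re.compile(
    r"^(?:replace|change)\s+(?:the\s+(?:name|word|phrase)\s+)?(.+?)\s+(?:with|to)\s+(.+)$",
    re.IGNORECASE,
)
# Hypothetical grammar for one "delete section" command.
DELETE_LAST_LINE_CMD = re.compile(r"^delete\s+the\s+last\s+line\b", re.IGNORECASE)


def apply_global_command(command, message):
    """Apply a transcribed global command to the whole message text."""
    command = command.strip()
    match = REPLACE_CMD.match(command)
    if match:
        old, new = match.group(1), match.group(2)
        # Replace every whole-word occurrence of the old phrase.
        pattern = r"\b" + re.escape(old) + r"\b"
        return re.sub(pattern, lambda _m: new, message)
    if DELETE_LAST_LINE_CMD.match(command):
        # Drop the final line of the message.
        return "\n".join(message.splitlines()[:-1])
    # Unrecognized commands leave the message unchanged.
    return message
```

For example, `apply_global_command("Replace the name Chris with John", text)` rewrites every whole-word occurrence of “Chris” in `text`. A deployed system would more likely pass the command to a natural language understanding engine than rely on fixed patterns.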
- Another feature concerns using speech and gesture commands for selecting and editing words in a document (e.g., a text file).
- In one aspect, the invention is a headset computer device including a microdisplay driven by a processor, a microphone configured to provide user utterances as input to the processor, and a speech processing engine executed by the processor. The speech processing engine is configured to transcribe the user utterances and display resulting transcription text on the microdisplay, and to be responsive to a user input directed to editing the displayed transcription text.
- In one embodiment, being responsive to input from the user further may include providing editing assistance regarding the transcription text. In another embodiment, the editing assistance may include suggested alternatives to one or more portions of the transcription text. In another embodiment, the suggested alternatives are based at least on context of the portions of the transcription text within the transcription text. The editing assistance may include auto-correcting transcription text based on a dictionary. The dictionary may include one or more of a standard language dictionary and a personal dictionary.
- In one embodiment, the editing assistance may include name correction using a local contact database. In another embodiment, the user input directed to editing the displayed transcription text includes at least one spoken global language command. In another embodiment, the user input may include at least one of a user control gesture, and a user control utterance, for selecting a portion of the transcription text. The user input may further include at least one of a user edit gesture, and a user edit utterance, for selecting an edit command to apply to the portion of the transcription text.
- In another aspect, the invention is a method of editing on a headset computer device, including transcribing user utterances to produce transcription text, displaying the transcription text, and responding to user input directed to editing the displayed transcription text.
- In one embodiment, responding to user input further includes providing editing assistance. Providing editing assistance may further include suggesting alternatives to one or more portions of the transcription text. The alternatives may be based at least on context of the portions of the transcription text within the transcription text. The portions of the transcription text may be individual words, or groups of words (i.e., phrases).
- In one embodiment, providing editing assistance may include auto-correcting transcription text based on a dictionary. Providing editing assistance may also include correcting names using a local contact database, such as a contacts list or an address book, or other information structure containing names, addresses, phone numbers, and so on. In addition to correcting names, embodiments may use these information structures to correct street addresses and other components of the structure.
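- A minimal Python sketch of name correction against a contact list follows. It is an assumption-laden illustration, not the claimed implementation: it uses spelling similarity from the standard-library `difflib` module as a stand-in for whatever fuzzy matching an embodiment employs, and the function name `correct_names`, the `cutoff` value, and the contact format ("First Last" strings) are hypothetical.

```python
import difflib


def correct_names(text, contacts, cutoff=0.6):
    """Replace capitalized words that closely match a contact's first name.

    If the following word is a contact's last name, only first names that
    go with that last name are considered (context-based correction).
    """
    first_names = [full.split()[0] for full in contacts]
    firsts_by_last = {}
    for full in contacts:
        parts = full.split()
        firsts_by_last.setdefault(parts[-1].lower(), []).append(parts[0])

    words = text.split()
    corrected = []
    for i, word in enumerate(words):
        # Only capitalized words are treated as possible names.
        if not word[:1].isupper():
            corrected.append(word)
            continue
        next_word = words[i + 1].lower() if i + 1 < len(words) else ""
        candidates = firsts_by_last.get(next_word, first_names)
        if word in candidates:
            # Already spelled as in the contact list.
            corrected.append(word)
            continue
        close = difflib.get_close_matches(word, candidates, n=1, cutoff=cutoff)
        corrected.append(close[0] if close else word)
    return " ".join(corrected)
```

With contacts “Kris Jones” and “Chris Baker”, the dictated text “Call Chris Jones today” would be corrected to use “Kris”, while “Call Chris Baker today” would be left unchanged. The sketch ignores punctuation attached to words and compares spelling rather than pronunciation, both of which a practical system would need to address.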
- In one embodiment, the user input directed to editing the displayed transcription text includes at least one spoken global language command. In another embodiment, the user input includes at least one of a user control gesture, and a user control utterance, for selecting a portion of the transcription text. The user input may further include at least one of a user edit gesture, and a user edit utterance, for selecting an edit command to apply to the portion of the transcription text.
- In another aspect, the invention is a non-transitory computer-readable medium for editing text, the non-transitory computer-readable medium comprising computer software instructions stored thereon. The computer software instructions, when executed by at least one processor, cause a computer system to transcribe user utterances to produce transcription text, display the transcription text, and respond to user input directed to editing the displayed transcription text.
- The foregoing will be apparent from the following more particular description of example embodiments of the invention, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments of the present invention.
- FIGS. 1A-1B are schematic illustrations of a headset computer cooperating with a host computer (e.g., Smart Phone, laptop, etc.) according to principles of the present invention.
- FIG. 2 is a block diagram of flow of data and control in the embodiment of FIGS. 1A-1B.
- FIG. 3 is a schematic view of one embodiment with ASR (speech recognition module).
- FIG. 4 illustrates a method of editing text on a headset computer device according to embodiments of the invention.
- A description of example embodiments of the invention follows.
- The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.
- FIGS. 1A and 1B show an example embodiment of a wireless computing headset device 100 (also referred to herein as a headset computer (HSC) or head mounted display (HMD)) that incorporates a high-resolution (VGA or better) micro-display element 1010, and other features described below.
- HSC 100 can include audio input and/or output devices, including one or more microphones, input and output speakers, geo-positional sensors (GPS), three to nine axis degrees of freedom orientation sensors, atmospheric sensors, health condition sensors, digital compass, pressure sensors, environmental sensors, energy sensors, acceleration sensors, position, attitude, motion, velocity and/or optical sensors, cameras (visible light, infrared, etc.), multiple wireless radios, auxiliary lighting, rangefinders, or the like and/or an array of sensors embedded and/or integrated into the headset and/or attached to the device via one or more peripheral ports 1020 (FIG. 1B).
- Typically located within the housing of headset computing device 100 are various electronic circuits including, a microcomputer (single or multicore processors), one or more wired and/or wireless communications interfaces, memory or storage devices, various sensors and a peripheral mount or mount, such as a “hot shoe.”
- Example embodiments of the HSC 100 can receive user input through sensing voice commands, head movements 110, 111, 112 and hand gestures 113, or any combination thereof. A microphone (or microphones) operatively coupled to or integrated into the HSC 100 can be used to capture speech commands, which are then digitized and processed using automatic speech recognition techniques. Gyroscopes, accelerometers, and other micro-electromechanical system sensors can be integrated into the HSC 100 and used to track the user's head movements 110, 111, 112 to provide user input commands. Cameras or motion tracking sensors can be used to monitor a user's hand gestures 113 for user input commands. Such a user interface may overcome the disadvantages of hands-dependent formats inherent in other mobile devices.
- The HSC 100 can be used in various ways. It can be used as a peripheral display for displaying video signals received and processed by a remote host computing device 200 (shown in FIG. 1A). The host 200 may be, for example, a notebook PC, smart phone, tablet device, or other computing device having less or greater computational complexity than the wireless computing headset device 100, such as cloud-based network resources. The headset computing device 100 and host 200 can wirelessly communicate via one or more wireless protocols, such as Bluetooth®, Wi-Fi, WiMAX, 4G LTE or other wireless radio link 150. (Bluetooth is a registered trademark of Bluetooth Sig, Inc. of 5209 Lake Washington Boulevard, Kirkland, Wash. 98033).
- In an example embodiment, the host 200 may be further connected to other networks, such as through a wireless connection to the Internet or other cloud-based network resources, so that the host 200 can act as a wireless relay between the HSC 100 and the network 210. Alternatively, some embodiments of the HSC 100 can establish a wireless connection to the Internet (or other cloud-based network resources) directly, without the use of a host wireless relay. In such embodiments, components of the HSC 100 and the host 200 may be combined into a single device.
- FIG. 1B is a perspective view showing some details of an example embodiment of a headset computer 100. The example embodiment HSC 100 generally includes, a frame 1000, strap 1002, rear housing 1004, speaker 1006, cantilever, or alternatively referred to as an arm or boom 1008 with a built in microphone, and a micro-display subassembly 1010.
- A head worn frame 1000 and strap 1002 are generally configured so that a user can wear the headset computer device 100 on the user's head. A housing 1004 is generally a low profile unit which houses the electronics, such as the microprocessor, memory or other storage device, along with other associated circuitry. Speakers 1006 provide audio output to the user so that the user can hear information. Micro-display subassembly 1010 is used to render visual information to the user. It is coupled to the arm 1008. The arm 1008 generally provides physical support such that the micro-display subassembly is able to be positioned within the user's field of view 300 (FIG. 1A), preferably in front of the eye of the user or within its peripheral vision preferably slightly below or above the eye. Arm 1008 also provides the electrical or optical connections between the micro-display subassembly 1010 and the control circuitry housed within housing unit 1004.
- According to aspects that will be explained in more detail below, the HSC display device 100 allows a user to select a field of view 300 within a much larger area defined by a virtual display 400. The user can typically control the position, extent (e.g., X-Y or 3D range), and/or magnification of the field of view 300.
- While what is shown in FIGS. 1A and 1B is a monocular micro-display presenting a single fixed display element supported on the face of the user with a cantilevered boom, it should be understood that other mechanical configurations for the remote control display device 100 are possible, such as a binocular display with two separate micro-displays (e.g., one for each eye) or a single micro-display arranged to be viewable by both eyes.
- FIG. 2 is a block diagram showing more detail of an embodiment of the HSC or HMD device 100, host 200 and the data that travels between them. The HSC or HMD device 100 receives vocal input from the user via the microphone, hand movements or body gestures via positional and orientation sensors, the camera or optical sensor(s), and head movement inputs via the head tracking circuitry such as 3 axis to 9 axis degrees of freedom orientational sensing. These are translated by software (processors) in the HSC or HMD device 100 into keyboard and/or mouse commands that are then sent over the Bluetooth or other wireless interface 150 to the host 200. The host 200 then interprets these translated commands in accordance with its own operating system/application software to perform various functions. Among the commands is one to select a field of view 300 within the virtual display 400 and return that selected screen data to the HSC or HMD device 100. Thus, it should be understood that a very large format virtual display area might be associated with application software or an operating system running on the host 200. However, only a portion of that large virtual display area 400 within the field of view 300 is returned to and actually displayed by the micro display 1010 of HSC or HMD device 100.
- In one embodiment, the HSC 100 may take the form of the device described in a co-pending US Patent Publication Number 2011/0187640, which is hereby incorporated by reference in its entirety.
- In another embodiment, the invention relates to the concept of using a Head Mounted Display (HMD) 1010 in conjunction with an external ‘smart’ device 200 (such as a smartphone or tablet) to provide information and control to the user hands-free. The invention requires transmission of small amounts of data, providing a more reliable data transfer method running in real-time.
- In this sense therefore, the amount of data to be transmitted over the connection 150 is small: simply instructions on how to lay out a screen, which text to display, and other stylistic information such as drawing arrows, or the background colors, images to include, etc.
- Additional data could be streamed over the same 150 or another connection and displayed on screen 1010, such as a video stream if required by the host 200.
- FIG. 3 shows an example embodiment of a wireless hands-free video computing headset 100 under voice command, according to one embodiment of the present invention. The user may be presented with an image on the micro-display 9010, for example, as output by host computer 200 application mentioned above. A user of the HMD 100 can employ joint head-tracking and voice command text selection software module 9036, either locally or from a remote host 200, in which the user is presented with a sequence of screen views implementing hands free text selection on the micro-display 9010 and the audio of the same through the speaker 9006 of the headset computer 100. Because the headset computer 100 is also equipped with a microphone 9020, the user can utter voice commands (e.g., to make command selections) as illustrated next with respect to embodiments of the present invention.
- FIG. 3 shows a schematic diagram illustrating the modules of the headset computer 100. FIG. 3 includes a schematic diagram of the operative modules of the headset computer 100.
- For the case of speech command replacement in speech driven applications, controller 9100 accesses user command configuration module 9036, which can be located locally to each HMD 100 or located remotely at a host 200 (FIGS. 1A-1B).
- User configurable speech command or speech command replacement software module 9036 contains instructions to display to a user an image of a pertinent request dialog box or the like. The graphics converter module 9040 converts the image instructions received from the speech command module 9036 via bus 9103 and converts the instructions into graphics to display on the monocular display 9010.
- The text-to-speech module 9035 b may, contemporaneous with the graphics display described above, convert the instructions from text selection software module 9036 into digital sound representations corresponding to the contents of the screen views 410 to be displayed. The text-to-speech module 9035 b feeds the digital sound representations to the digital-to-analog converter 9021 b, which in turn feeds speaker 9006 to present the audio output to the user.
- Speech command replacement/user reconfiguration software module 9036 can be stored locally at memory 9120 or remotely at a host 200 (FIG. 1A). The user can speak/utter the replacement command selection from the image and the user's speech 9090 is received at microphone 9020. The received speech is then converted from an analog signal into a digital signal at analog-to-digital converter 9021 a. Once the speech is converted from an analog to a digital signal, speech recognition module 9035 a processes the speech into recognized speech.
- The recognized speech is compared against known speech and processed into text according to instructions of speech-to-text module 9036.
- Advances in speech-to-text conversion technologies enable users to simply speak to their computing devices and have the spoken word converted, almost instantly, into text. This allows email, SMS, and other messages to be constructed in a hands-free manner.
- The accuracy, however, of the speech-to-text systems may be less than perfect. Failure rates increase as background noise increases, and as spoken accents become stronger.
- Furthermore, even if a spoken phrase is dictated with perfect accuracy, there will be many cases where the user, in reading the text transcription, will want to change some of the content before saving or sending the subject text file. If autocorrect functions are turned on, those functions may inadvertently change the user's input in a way that is contrary to the user's intentions.
- In all of these cases, the user may desire the ability to edit the dictated message, i.e., the resulting transcription content. Typical dictation strategies rely on a user employing the touch screen of a mobile device to point and highlight the text to edit, possibly in conjunction with a keyboard to manually enter the corrections.
- For the case of a truly hands-free device, such as a head-worn device 100 or accessory, the touch screen technique for entering dictation corrections may not be applicable.
- Text Dictation Post-Processing—A first stage in ensuring text is close to that desired by the user is in dictation post-processing, i.e., processing of the text as soon as it is returned by the dictation engine (aka speech-to-text engine 9036).
- One post-processing technique is an auto-correction algorithm, using a standard language dictionary and/or the user's personal dictionary, one or both stored local to the device 100. Memory 9120 may store these dictionaries, and instructions of modules 9035 a and 9036 may utilize the dictionaries in processing (i.e., recognizing and autocorrecting) speech submitted by the user. As used herein, a “standard language dictionary” means a dictionary that has been compiled for use in general context, for example a dictionary provided by a third party entity for use by different end users. A personal dictionary, as used herein, means a specifically compiled dictionary for use by a narrower audience than would use a standard language dictionary. A personal dictionary may be used by a specific person, and may be modified and edited to be specifically used by that person.
- Another post-processing technique is name correction, based on a local contact database (such as the user's address book or contact list) on the device 100 (memory 9120). For example, many dictation engines have no concept of the user's address book and may return names that, while correctly spelled in one environment, are not correctly spelled in the user's environment. Fuzzy matching algorithms can be used by modules 9035 a and 9036 to compare names found in the dictated text against those in the user's address book. In turn, modules 9035 a, 9036 may correct any obvious misspellings.
- For example, a name interpreted as “Chris” might be replaced with “Kris” by module 9036, where the term is phonetically the same, but needs spelling correction according to the user's address book in memory 9120.
- The device 100 may also use context to make a proper correction. For example, if a user's address book includes “Kris Jones” and “Chris Baker,” the first name spoken with the last name would provide a clue as to whether the correct spelling should be “Kris” or “Chris.”
- Global Natural Language Commands—The Global Natural Language commands in speech processing modules 9035 a and 9036 allow a user to issue commands that apply to the whole document. A “find and replace all” is an example of a Global Natural Language Command. The spoken command is captured (parsed and recognized by module 9035 a) and sent to the NLU engine (part of module 9036) for processing, along with the current message to edit.
- Specific examples of global natural language commands in one embodiment may include:
- (1) Replace Phrase A with Phrase B (e.g., “Replace the name Chris with John” or “Change 10:30 pm to 10:3 pm”)
(2) Delete Sections (e.g., “Delete the last line of the text” or “Delete the first paragraph”)
- Such global commands may be programmed into the instruction set of modules 9035 a and 9036.
- Speech and Gesture Commands for Editing Specific Words—In one embodiment, the HSC 100 tracks head movements to selectively highlight words of a textual phrase within the user's field of view. The speech processing module 9036 may present a menu of options to the user, e.g., {cut|copy|paste|replace|suggest|define}. The menu may be displayed through microdisplay 9010, and audibly presented through speaker 9006 as previously operationally described. Third Party content providers, such as a dictionary and/or a thesaurus, could be accessed by modules 9035 a and 9036 to provide definitions for words and to suggest alternative word choices.
- HSC 100 may obtain the selection of words in text by tracking the movement of the user's head, so as to create a scrolling effect to move the cursor (pointer) along the displayed text to get to the desired word. For example, a user may cause word selection to progress from word to word in a sentence from left to right in his field of view by turning his head from left to right (i.e., clockwise) or tilting his head to the right. The user may stop the word progression to select a particular word by returning his head to a neutral position (i.e., facing straight ahead, tilted neither to the left nor to the right).
- Once a particular word is selected, the HSC 100 may present an edit options menu with several editing options, e.g., {cut|copy|paste|replace|suggest|define}. The user may use head tilting and/or rotation as described above to select one of the edit options, and use a gesture, head movement or spoken utterance (or similar input) to execute the selected command.
- Embodiments 100 may employ autosuggest to render (on display 9010 and through speaker 9006) the identification of contacts, dates, and other entities to correct the spelling of words taken from a third party trained NLU model. For example, a user speaks the name “Mac-ken-zee” and the NLU model recognizes and presents the word “McKenzie.” Device 100 may use a “contacts” database to search for matches for McKenzie, and autosuggest “Mackenzie Adams (madams@gmail.com)” as the contact that most closely matches the recognized name “McKenzie.”
- HSC 100 may suggest alternative words such as synonyms by classifying the context of the statement. For instance, if the context of the statement is to do computations, for example “for divided by three,” then HSC 100 highlights the displayed word “for” on display 9010, and speech module 9036 issues the suggested replacement “four” displayed on microdisplay 9010 and enunciated through speaker 9006 as illustrated.
- In one embodiment, processing for suggesting alternative words based on context within a sentence or other structure may be accomplished with program procedures for modules 9035 a and 9036, using for example one or more of the elements set forth below, or combinations of similar elements.
- Tokenize the sentence—First, partition each sentence into a list of tokens. Each sentence is partitioned into a list of words, and we remove the stop words. Stop words are frequently occurring, insignificant words that appear in a database record, article, or a web page, etc.
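- The tokenization step might look like the following Python sketch. The stop word list here is a small hypothetical set chosen for illustration. It deliberately leaves out words such as “for” and “by”, which many general-purpose stop lists contain, because those words matter for the mathematical example discussed in this description.

```python
import re

# Hypothetical, deliberately small stop word list for illustration.
STOP_WORDS = {"a", "an", "the", "of", "is", "and", "please", "what"}


def tokenize(sentence):
    """Split a sentence into lowercase word tokens and drop stop words."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return [word for word in words if word not in STOP_WORDS]
```

Under these assumptions, the sentence “What is for divided by three?” reduces to the tokens `for`, `divided`, `by`, `three`.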
- Action Classification Tagging—Using a training corpus, the context and action of the phrase are determined. For example, the detection of a mathematical operation “divided by” indicates a mathematical action, and a mathematical entity set is chosen to disambiguate the statement.
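- One simple way to illustrate action classification from a training corpus is a bag-of-words overlap score, sketched below in Python. The training phrases, the action labels `math` and `message`, and the function names are all hypothetical. A real embodiment could use any trained classifier; this sketch only shows the idea of letting words such as “divided” and “by” vote for a mathematical action.

```python
from collections import Counter

# Hypothetical toy training corpus of (phrase, action label) pairs.
TRAINING = [
    ("four divided by two", "math"),
    ("six times three", "math"),
    ("ten plus five", "math"),
    ("send email to kris", "message"),
    ("reply to chris baker", "message"),
]


def train_actions(examples):
    """Count how often each word appears under each action label."""
    model = {}
    for phrase, action in examples:
        model.setdefault(action, Counter()).update(phrase.split())
    return model


def classify_action(tokens, model):
    """Return the action whose training words overlap most with the tokens."""
    scores = {
        action: sum(counts[token] for token in tokens)
        for action, counts in model.items()
    }
    best = max(scores, key=scores.get)
    # No overlap with any action means the phrase is left unclassified.
    return best if scores[best] > 0 else None
```

Here `classify_action(["for", "divided", "by", "three"], train_actions(TRAINING))` selects the `math` action even though “for” never appears in the mathematical training phrases, which is what allows a later step to treat “for” as a misrecognized operand.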
- Entity Disambiguation Tagging—This task is to identify the correct entity (such as numeric, date-time, or a member of a set of entities) of each word in the sentence. The algorithm takes a sentence and a specified tag set as input. The output is a single best tag for each word. Tagging ambiguities are resolved by using a training corpus to compute the probability of a given word having a given tag in a given context.
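- The following Python sketch illustrates one way such tagging could work, under the assumption that “context” is approximated by the tag of the preceding word. It counts tag frequencies in a tagged training corpus and picks the most frequent tag for each word given the previous tag, falling back to the word's overall most frequent tag. The tag names and function names are hypothetical.

```python
from collections import Counter, defaultdict


def train_tagger(tagged_sentences):
    """Count tags per (previous tag, word) context and per word alone."""
    context_counts = defaultdict(Counter)
    word_counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        previous_tag = "<s>"  # marks the start of a sentence
        for word, tag in sentence:
            context_counts[(previous_tag, word)][tag] += 1
            word_counts[word][tag] += 1
            previous_tag = tag
    return context_counts, word_counts


def tag_words(words, model, default="OTHER"):
    """Assign each word its most frequent tag in the observed context."""
    context_counts, word_counts = model
    previous_tag, tagged = "<s>", []
    for word in words:
        # Prefer counts for this exact context; back off to the word alone.
        counts = context_counts.get((previous_tag, word)) or word_counts.get(word)
        best = counts.most_common(1)[0][0] if counts else default
        tagged.append((word, best))
        previous_tag = best
    return tagged
```

Choosing the most frequent tag is equivalent to choosing the tag with the highest estimated probability for that word in that context, since the probability estimate is the count divided by the context total.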
- Find the most appropriate sense or meaning for every entity in a sentence—To disambiguate a word, the gloss of each of its senses is compared to the glosses of every other word in a phrase. A word is assigned to the sense whose gloss shares the largest number of words in common with the glosses of the other words. For example, the word “for” is a preposition and the word “two” is a number. The context is determined to be a mathematical division operation that expects a numerical dividend and a divisor to complete the mathematical operation. Since “for” is not numeric, fuzzy logic is employed to determine the edit distance between “for” and the next closest number which is “four” and is numeric. The word “four” is then suggested as the appropriate substitution for “for” in the spoken phrase.
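- The edit-distance step can be sketched in Python as follows. This is an illustration under stated assumptions: it uses the standard Levenshtein distance, a hypothetical vocabulary of number words from zero to ten, and a hypothetical helper `suggest_operands` that checks only the words immediately around “divided by”.

```python
# Hypothetical vocabulary of numeric words used for substitution.
NUMBER_WORDS = ["zero", "one", "two", "three", "four", "five",
                "six", "seven", "eight", "nine", "ten"]


def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, 1):
        current = [i]
        for j, char_b in enumerate(b, 1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute
            ))
        previous = current
    return previous[-1]


def closest_number_word(word):
    """Return the number word with the smallest edit distance to word."""
    return min(NUMBER_WORDS, key=lambda n: edit_distance(word.lower(), n))


def suggest_operands(tokens):
    """Suggest numeric replacements for non-numeric division operands."""
    suggestions = {}
    if "divided" in tokens:
        i = tokens.index("divided")
        # Dividend precedes "divided"; divisor follows "divided by".
        for position in (i - 1, i + 2):
            if 0 <= position < len(tokens) and tokens[position] not in NUMBER_WORDS:
                suggestions[tokens[position]] = closest_number_word(tokens[position])
    return suggestions
```

For the tokens `for`, `divided`, `by`, `three`, the only non-numeric operand is “for”, and its closest number word is “four” at an edit distance of one (a single inserted “u”).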
- FIG. 4 illustrates a method of editing text on a headset computer device, including transcribing user utterances to produce transcription text, displaying 404 the transcription text, responding 406 to user input directed to editing the displayed transcription text, and providing 408 editing assistance. The editing assistance may include one or more of (i) suggesting alternatives to one or more portions of the transcription text, (ii) auto-correcting transcription text based on a dictionary and (iii) correcting names using a local contact database, among others.
- It will be apparent that one or more embodiments described herein may be implemented in many different forms of software and hardware. Software code and/or specialized hardware used to implement embodiments described herein is not limiting of the embodiments of the invention described herein. Thus, the operation and behavior of embodiments are described without reference to specific software code and/or specialized hardware, it being understood that one would be able to design software and/or hardware to implement the embodiments based on the description herein.
- Further, certain embodiments of the example embodiments described herein may be implemented as logic that performs one or more functions. This logic may be hardware-based, software-based, or a combination of hardware-based and software-based. Some or all of the logic may be stored on one or more tangible, non-transitory, computer-readable storage media and may include computer-executable instructions that may be executed by a controller or processor. The computer-executable instructions may include instructions that implement one or more embodiments of the invention. The tangible, non-transitory, computer-readable storage media may be volatile or non-volatile and may include, for example, flash memories, dynamic memories, removable disks, and non-removable disks.
- While this invention has been particularly shown and described with references to example embodiments thereof, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the invention encompassed by the appended claims.
Claims (20)
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US14/577,600 US9640181B2 (en) | 2013-12-27 | 2014-12-19 | Text editing with gesture control and natural speech |
Applications Claiming Priority (2)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| US201361921267P | 2013-12-27 | 2013-12-27 | |
| US14/577,600 US9640181B2 (en) | 2013-12-27 | 2014-12-19 | Text editing with gesture control and natural speech |
Publications (2)
| Publication Number | Publication Date |
|---|---|
| US20150187355A1 (en) | 2015-07-02 |
| US9640181B2 (en) | 2017-05-02 |
Family
ID=52394351
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| US14/577,600 Active 2035-02-24 US9640181B2 (en) | 2013-12-27 | 2014-12-19 | Text editing with gesture control and natural speech |
Country Status (2)
| Country | Link |
|---|---|
| US (1) | US9640181B2 (en) |
| WO (1) | WO2015100172A1 (en) |
Cited By (166)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20150339098A1 (en) * | 2014-05-21 | 2015-11-26 | Samsung Electronics Co., Ltd. | Display apparatus, remote control apparatus, system and controlling method thereof |
| US20160170710A1 (en) * | 2014-12-12 | 2016-06-16 | Samsung Electronics Co., Ltd. | Method and apparatus for processing voice input |
| US20160283453A1 (en) * | 2015-03-26 | 2016-09-29 | Lenovo (Singapore) Pte. Ltd. | Text correction using a second input |
| US20170053647A1 (en) * | 2015-08-19 | 2017-02-23 | Hand Held Products, Inc. | Auto-complete methods for spoken complete value entries |
| DK201670539A1 (en) * | 2016-03-14 | 2017-10-02 | Apple Inc | Dictation that allows editing |
| US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
| US20180018308A1 (en) * | 2015-01-22 | 2018-01-18 | Samsung Electronics Co., Ltd. | Text editing apparatus and text editing method based on speech signal |
| US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
| US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
| US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
| US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
| US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
| US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
| US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
| US10083690B2 (en) | 2014-05-30 | 2018-09-25 | Apple Inc. | Better resolution when referencing to concepts |
| US10108612B2 (en) | 2008-07-31 | 2018-10-23 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
| US20180330716A1 (en) * | 2017-05-11 | 2018-11-15 | Olympus Corporation | Sound collection apparatus, sound collection method, sound collection program, dictation method, information processing apparatus, and recording medium recording information processing program |
| US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
| US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
| US10311871B2 (en) | 2015-03-08 | 2019-06-04 | Apple Inc. | Competing devices responding to voice triggers |
| US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
| US10332518B2 (en) | 2017-05-09 | 2019-06-25 | Apple Inc. | User interface for correcting recognition errors |
| US10354652B2 (en) | 2015-12-02 | 2019-07-16 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
| US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
| US10381016B2 (en) | 2008-01-03 | 2019-08-13 | Apple Inc. | Methods and apparatus for altering audio output signals |
| US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
| US10403283B1 (en) | 2018-06-01 | 2019-09-03 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
| US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
| US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
| US20190279623A1 (en) * | 2018-03-08 | 2019-09-12 | Kika Tech (Cayman) Holdings Co., Limited | Method for speech recognition dictation and correction by spelling input, system and storage medium |
| US10417405B2 (en) | 2011-03-21 | 2019-09-17 | Apple Inc. | Device access using voice authentication |
| US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
| US10417344B2 (en) | 2014-05-30 | 2019-09-17 | Apple Inc. | Exemplar-based natural language processing |
| US10431204B2 (en) | 2014-09-11 | 2019-10-01 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
| US10438595B2 (en) | 2014-09-30 | 2019-10-08 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
| US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
| US10453443B2 (en) | 2014-09-30 | 2019-10-22 | Apple Inc. | Providing an indication of the suitability of speech recognition |
| US10452227B1 (en) * | 2016-03-31 | 2019-10-22 | United Services Automobile Association (Usaa) | System and method for data visualization and modification in an immersive three dimensional (3-D) environment |
| US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
| US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
| US10497365B2 (en) | 2014-05-30 | 2019-12-03 | Apple Inc. | Multi-command single utterance input method |
| US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
| US10529332B2 (en) | 2015-03-08 | 2020-01-07 | Apple Inc. | Virtual assistant activation |
| US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
| US10580409B2 (en) | 2016-06-11 | 2020-03-03 | Apple Inc. | Application integration with a digital assistant |
| US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
| US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
| US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
| US10643611B2 (en) | 2008-10-02 | 2020-05-05 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
| US10657961B2 (en) | 2013-06-08 | 2020-05-19 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
| US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
| US10685180B2 (en) | 2018-05-10 | 2020-06-16 | International Business Machines Corporation | Using remote words in data streams from remote devices to autocorrect input text |
| US10684703B2 (en) | 2018-06-01 | 2020-06-16 | Apple Inc. | Attention aware virtual assistant dismissal |
| US10699717B2 (en) | 2014-05-30 | 2020-06-30 | Apple Inc. | Intelligent assistant for home automation |
| US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
| US10714117B2 (en) | 2013-02-07 | 2020-07-14 | Apple Inc. | Voice trigger for a digital assistant |
| US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
| US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
| US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
| US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
| US10741185B2 (en) | 2010-01-18 | 2020-08-11 | Apple Inc. | Intelligent automated assistant |
| US10748546B2 (en) | 2017-05-16 | 2020-08-18 | Apple Inc. | Digital assistant services based on device capabilities |
| US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
| US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
| US10769385B2 (en) | 2013-06-09 | 2020-09-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
| US10789945B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Low-latency intelligent automated assistant |
| US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
| US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
| US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
| US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
| US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
| US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
| US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
| US10904611B2 (en) | 2014-06-30 | 2021-01-26 | Apple Inc. | Intelligent automated assistant for TV user interactions |
| US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
| US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
| US10942702B2 (en) | 2016-06-11 | 2021-03-09 | Apple Inc. | Intelligent device arbitration and control |
| US10942703B2 (en) | 2015-12-23 | 2021-03-09 | Apple Inc. | Proactive assistance based on dialog communication between devices |
| US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
| US10990174B2 (en) | 2016-07-25 | 2021-04-27 | Facebook Technologies, Llc | Methods and apparatus for predicting musculo-skeletal position information using wearable autonomous sensors |
| US11010127B2 (en) | 2015-06-29 | 2021-05-18 | Apple Inc. | Virtual assistant for media playback |
| US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
| US11023513B2 (en) | 2007-12-20 | 2021-06-01 | Apple Inc. | Method and apparatus for searching using an active ontology |
| US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
| US11048473B2 (en) | 2013-06-09 | 2021-06-29 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
| US11070949B2 (en) | 2015-05-27 | 2021-07-20 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display |
| US11069336B2 (en) | 2012-03-02 | 2021-07-20 | Apple Inc. | Systems and methods for name pronunciation |
| US11069347B2 (en) | 2016-06-08 | 2021-07-20 | Apple Inc. | Intelligent automated assistant for media exploration |
| US20210225377A1 (en) * | 2020-01-17 | 2021-07-22 | Verbz Labs Inc. | Method for transcribing spoken language with real-time gesture-based formatting |
| US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
| US11120372B2 (en) | 2011-06-03 | 2021-09-14 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
| US11127397B2 (en) | 2015-05-27 | 2021-09-21 | Apple Inc. | Device voice control |
| US11126400B2 (en) | 2015-09-08 | 2021-09-21 | Apple Inc. | Zero latency digital assistant |
| US11133008B2 (en) | 2014-05-30 | 2021-09-28 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
| US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
| US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
| US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
| US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
| US11216069B2 (en) * | 2018-05-08 | 2022-01-04 | Facebook Technologies, Llc | Systems and methods for improved speech recognition using neuromuscular information |
| US11217251B2 (en) | 2019-05-06 | 2022-01-04 | Apple Inc. | Spoken notifications |
| US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
| US11231904B2 (en) | 2015-03-06 | 2022-01-25 | Apple Inc. | Reducing response latency of intelligent automated assistants |
| US11237797B2 (en) | 2019-05-31 | 2022-02-01 | Apple Inc. | User activity shortcut suggestions |
| US11269678B2 (en) | 2012-05-15 | 2022-03-08 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
| US11275889B2 (en) * | 2019-04-04 | 2022-03-15 | International Business Machines Corporation | Artificial intelligence for interactive preparation of electronic documents |
| US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
| US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
| US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
| US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
| US11314370B2 (en) | 2013-12-06 | 2022-04-26 | Apple Inc. | Method for extracting salient dialog usage from live data |
| US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
| US11350253B2 (en) | 2011-06-03 | 2022-05-31 | Apple Inc. | Active transport based notifications |
| US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
| US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
| US11388291B2 (en) | 2013-03-14 | 2022-07-12 | Apple Inc. | System and method for processing voicemail |
| US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
| US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
| US11468282B2 (en) | 2015-05-15 | 2022-10-11 | Apple Inc. | Virtual assistant in a communication session |
| US11467802B2 (en) | 2017-05-11 | 2022-10-11 | Apple Inc. | Maintaining privacy of personal information |
| US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
| US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
| US11481030B2 (en) | 2019-03-29 | 2022-10-25 | Meta Platforms Technologies, Llc | Methods and apparatus for gesture detection and classification |
| US11481031B1 (en) | 2019-04-30 | 2022-10-25 | Meta Platforms Technologies, Llc | Devices, systems, and methods for controlling computing devices via neuromuscular signals of users |
| US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
| US11493993B2 (en) | 2019-09-04 | 2022-11-08 | Meta Platforms Technologies, Llc | Systems, methods, and interfaces for performing inputs based on neuromuscular control |
| US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
| US11495218B2 (en) | 2018-06-01 | 2022-11-08 | Apple Inc. | Virtual assistant operation in multi-device environments |
| US11500672B2 (en) | 2015-09-08 | 2022-11-15 | Apple Inc. | Distributed personal assistant |
| US11526368B2 (en) | 2015-11-06 | 2022-12-13 | Apple Inc. | Intelligent automated assistant in a messaging environment |
| US11532306B2 (en) | 2017-05-16 | 2022-12-20 | Apple Inc. | Detecting a trigger of a digital assistant |
| US11567573B2 (en) | 2018-09-20 | 2023-01-31 | Meta Platforms Technologies, Llc | Neuromuscular text entry, writing and drawing in augmented reality systems |
| US11635736B2 (en) | 2017-10-19 | 2023-04-25 | Meta Platforms Technologies, Llc | Systems and methods for identifying biological structures associated with neuromuscular source signals |
| US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
| US11644799B2 (en) | 2013-10-04 | 2023-05-09 | Meta Platforms Technologies, Llc | Systems, articles and methods for wearable electronic devices employing contact sensors |
| US11657813B2 (en) | 2019-05-31 | 2023-05-23 | Apple Inc. | Voice identification in digital assistant systems |
| US20230168859A1 (en) * | 2020-04-23 | 2023-06-01 | JRD Communication (Shenzhen) Ltd. | Method and device for voice input using head control device |
| US11671920B2 (en) | 2007-04-03 | 2023-06-06 | Apple Inc. | Method and system for operating a multifunction portable electronic device using voice-activation |
| US11666264B1 (en) | 2013-11-27 | 2023-06-06 | Meta Platforms Technologies, Llc | Systems, articles, and methods for electromyography sensors |
| US20230186918A1 (en) * | 2019-07-15 | 2023-06-15 | Axon Enterprise, Inc. | Methods and systems for transcription of audio data |
| US11696060B2 (en) | 2020-07-21 | 2023-07-04 | Apple Inc. | User identification using headphones |
| US11765209B2 (en) | 2020-05-11 | 2023-09-19 | Apple Inc. | Digital assistant hardware abstraction |
| US11790914B2 (en) | 2019-06-01 | 2023-10-17 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
| US11797087B2 (en) | 2018-11-27 | 2023-10-24 | Meta Platforms Technologies, Llc | Methods and apparatus for autocalibration of a wearable electrode sensor system |
| US11798547B2 (en) | 2013-03-15 | 2023-10-24 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
| US11809483B2 (en) | 2015-09-08 | 2023-11-07 | Apple Inc. | Intelligent automated assistant for media search and playback |
| US11810578B2 (en) | 2020-05-11 | 2023-11-07 | Apple Inc. | Device arbitration for digital assistant-based intercom systems |
| US11838734B2 (en) | 2020-07-20 | 2023-12-05 | Apple Inc. | Multi-device audio adjustment coordination |
| US11853536B2 (en) | 2015-09-08 | 2023-12-26 | Apple Inc. | Intelligent automated assistant in a media environment |
| US11868531B1 (en) | 2021-04-08 | 2024-01-09 | Meta Platforms Technologies, Llc | Wearable device providing for thumb-to-finger-based input gestures detected based on neuromuscular signals, and systems and methods of use thereof |
| US11886805B2 (en) | 2015-11-09 | 2024-01-30 | Apple Inc. | Unconventional virtual assistant interactions |
| US11907423B2 (en) | 2019-11-25 | 2024-02-20 | Meta Platforms Technologies, Llc | Systems and methods for contextualized interactions with an environment |
| US11914848B2 (en) | 2020-05-11 | 2024-02-27 | Apple Inc. | Providing relevant data items based on context |
| US11921471B2 (en) | 2013-08-16 | 2024-03-05 | Meta Platforms Technologies, Llc | Systems, articles, and methods for wearable devices having secondary power sources in links of a band for providing secondary power in addition to a primary power source |
| US11961494B1 (en) | 2019-03-29 | 2024-04-16 | Meta Platforms Technologies, Llc | Electromagnetic interference reduction in extended reality environments |
| US12010262B2 (en) | 2013-08-06 | 2024-06-11 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
| US12014118B2 (en) | 2017-05-15 | 2024-06-18 | Apple Inc. | Multi-modal interfaces having selection disambiguation and text modification capability |
| US12051413B2 (en) | 2015-09-30 | 2024-07-30 | Apple Inc. | Intelligent device identification |
| US12057123B1 (en) * | 2020-11-19 | 2024-08-06 | Voicebase, Inc. | Communication devices with embedded audio content transcription and analysis functions |
| US12094460B2 (en) * | 2016-07-27 | 2024-09-17 | Samsung Electronics Co., Ltd. | Electronic device and voice recognition method thereof |
| US12197817B2 (en) | 2016-06-11 | 2025-01-14 | Apple Inc. | Intelligent device arbitration and control |
| US12223282B2 (en) | 2016-06-09 | 2025-02-11 | Apple Inc. | Intelligent automated assistant in a home environment |
| US12301635B2 (en) | 2020-05-11 | 2025-05-13 | Apple Inc. | Digital assistant hardware abstraction |
| US12504816B2 (en) | 2013-08-16 | 2025-12-23 | Meta Platforms Technologies, Llc | Wearable devices and associated band structures for sensing neuromuscular signals using sensor pairs in respective pods with communicative pathways to a common processor |
| US12554325B2 (en) | 2016-07-25 | 2026-02-17 | Meta Platforms Technologies, Llc | Methods and apparatuses for low latency body state prediction based on neuromuscular data |
| US12579768B2 (en) | 2018-01-25 | 2026-03-17 | Meta Platforms Technologies, Llc | Wearable electronic devices, extended reality systems including neuromuscular sensors, and methods for generating text from speech input and modifying the generated text based on neuromuscular data |
| US12619452B2 (en) | 2023-07-20 | 2026-05-05 | Apple Inc. | Intelligent automated assistant in a messaging environment |
Families Citing This Family (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| EP3791387B1 (en) * | 2018-05-08 | 2023-08-30 | Facebook Technologies, LLC. | Systems and methods for improved speech recognition using neuromuscular information |
| US11514893B2 (en) * | 2020-01-29 | 2022-11-29 | Microsoft Technology Licensing, Llc | Voice context-aware content manipulation |
| TW202240461A (en) * | 2021-03-03 | 2022-10-16 | Meta Platforms, Inc. | Text editing using voice and gesture inputs for assistant systems |
| US20220284904A1 (en) * | 2021-03-03 | 2022-09-08 | Meta Platforms, Inc. | Text Editing Using Voice and Gesture Inputs for Assistant Systems |
| US12422934B2 (en) * | 2022-04-08 | 2025-09-23 | Meta Platforms Technologies, Llc | Techniques for neuromuscular-signal-based detection of in-air hand gestures for text production and modification, and systems, wearable devices, and methods for using these techniques |
| US12464307B2 (en) * | 2023-04-10 | 2025-11-04 | Meta Platforms Technologies, Llc | Translation with audio spatialization |
Citations (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US5696521A (en) * | 1994-06-22 | 1997-12-09 | Astounding Technologies (M) Sdn. Bhd. | Video headset |
| US5829000A (en) * | 1996-10-31 | 1998-10-27 | Microsoft Corporation | Method and system for correcting misrecognized spoken words or phrases |
| US20070208567A1 (en) * | 2006-03-01 | 2007-09-06 | At&T Corp. | Error Correction In Automatic Speech Recognition Transcripts |
| US20100250250A1 (en) * | 2009-03-30 | 2010-09-30 | Jonathan Wiggs | Systems and methods for generating a hybrid text string from two or more text strings generated by multiple automated speech recognition systems |
| US20110115702A1 (en) * | 2008-07-08 | 2011-05-19 | David Seaberg | Process for Providing and Editing Instructions, Data, Data Structures, and Algorithms in a Computer System |
| US20120262488A1 (en) * | 2009-12-23 | 2012-10-18 | Nokia Corporation | Method and Apparatus for Facilitating Text Editing and Related Computer Program Product and Computer Readable Medium |
| US20130212515A1 (en) * | 2012-02-13 | 2013-08-15 | Syntellia, Inc. | User interface for text input |
| US20140142937A1 (en) * | 2012-11-21 | 2014-05-22 | Pauline S. Powledge | Gesture-augmented speech recognition |
| US20160070441A1 (en) * | 2014-09-05 | 2016-03-10 | Microsoft Technology Licensing, Llc | Display-efficient text entry and editing |
Family Cites Families (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8504369B1 (en) | 2004-06-02 | 2013-08-06 | Nuance Communications, Inc. | Multi-cursor transcription editing |
| US8855719B2 (en) | 2009-05-08 | 2014-10-07 | Kopin Corporation | Wireless hands-free computing headset with detachable accessories controllable by motion, body gesture and/or vocal commands |
| US20100315329A1 (en) | 2009-06-12 | 2010-12-16 | Southwest Research Institute | Wearable workspace |
| US8719014B2 (en) | 2010-09-27 | 2014-05-06 | Apple Inc. | Electronic device with text error correction based on voice recognition data |
| US9165381B2 (en) | 2012-05-31 | 2015-10-20 | Microsoft Technology Licensing, Llc | Augmented books in a mixed reality environment |
| US9640178B2 (en) | 2013-12-26 | 2017-05-02 | Kopin Corporation | User configurable speech commands |
2014
- 2014-12-19: US application US14/577,600, patent US9640181B2, status: Active
- 2014-12-19: WO application PCT/US2014/071577, publication WO2015100172A1, status: Ceased
Cited By (285)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US10318871B2 (en) | 2005-09-08 | 2019-06-11 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
| US11928604B2 (en) | 2005-09-08 | 2024-03-12 | Apple Inc. | Method and apparatus for building an intelligent automated assistant |
| US11979836B2 (en) | 2007-04-03 | 2024-05-07 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
| US11671920B2 (en) | 2007-04-03 | 2023-06-06 | Apple Inc. | Method and system for operating a multifunction portable electronic device using voice-activation |
| US12477470B2 (en) | 2007-04-03 | 2025-11-18 | Apple Inc. | Method and system for operating a multi-function portable electronic device using voice-activation |
| US11023513B2 (en) | 2007-12-20 | 2021-06-01 | Apple Inc. | Method and apparatus for searching using an active ontology |
| US10381016B2 (en) | 2008-01-03 | 2019-08-13 | Apple Inc. | Methods and apparatus for altering audio output signals |
| US9865248B2 (en) | 2008-04-05 | 2018-01-09 | Apple Inc. | Intelligent text-to-speech conversion |
| US10108612B2 (en) | 2008-07-31 | 2018-10-23 | Apple Inc. | Mobile device having human language translation capability with positional feedback |
| US10643611B2 (en) | 2008-10-02 | 2020-05-05 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
| US11348582B2 (en) | 2008-10-02 | 2022-05-31 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
| US11900936B2 (en) | 2008-10-02 | 2024-02-13 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
| US12361943B2 (en) | 2008-10-02 | 2025-07-15 | Apple Inc. | Electronic devices with voice command and contextual data processing capabilities |
| US11080012B2 (en) | 2009-06-05 | 2021-08-03 | Apple Inc. | Interface for a virtual digital assistant |
| US10795541B2 (en) | 2009-06-05 | 2020-10-06 | Apple Inc. | Intelligent organization of tasks items |
| US11423886B2 (en) | 2010-01-18 | 2022-08-23 | Apple Inc. | Task flow identification based on user intent |
| US12087308B2 (en) | 2010-01-18 | 2024-09-10 | Apple Inc. | Intelligent automated assistant |
| US12165635B2 (en) | 2010-01-18 | 2024-12-10 | Apple Inc. | Intelligent automated assistant |
| US10706841B2 (en) | 2010-01-18 | 2020-07-07 | Apple Inc. | Task flow identification based on user intent |
| US12431128B2 (en) | 2010-01-18 | 2025-09-30 | Apple Inc. | Task flow identification based on user intent |
| US10741185B2 (en) | 2010-01-18 | 2020-08-11 | Apple Inc. | Intelligent automated assistant |
| US10692504B2 (en) | 2010-02-25 | 2020-06-23 | Apple Inc. | User profiling for voice input processing |
| US10049675B2 (en) | 2010-02-25 | 2018-08-14 | Apple Inc. | User profiling for voice input processing |
| US10417405B2 (en) | 2011-03-21 | 2019-09-17 | Apple Inc. | Device access using voice authentication |
| US11350253B2 (en) | 2011-06-03 | 2022-05-31 | Apple Inc. | Active transport based notifications |
| US12556890B2 (en) | 2011-06-03 | 2026-02-17 | Apple Inc. | Active transport based notifications |
| US11120372B2 (en) | 2011-06-03 | 2021-09-14 | Apple Inc. | Performing actions associated with task items that represent tasks to perform |
| US11069336B2 (en) | 2012-03-02 | 2021-07-20 | Apple Inc. | Systems and methods for name pronunciation |
| US11269678B2 (en) | 2012-05-15 | 2022-03-08 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
| US12613730B2 (en) | 2012-05-15 | 2026-04-28 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
| US11321116B2 (en) | 2012-05-15 | 2022-05-03 | Apple Inc. | Systems and methods for integrating third party services with a digital assistant |
| US10079014B2 (en) | 2012-06-08 | 2018-09-18 | Apple Inc. | Name recognition system |
| US9971774B2 (en) | 2012-09-19 | 2018-05-15 | Apple Inc. | Voice-based media searching |
| US10714117B2 (en) | 2013-02-07 | 2020-07-14 | Apple Inc. | Voice trigger for a digital assistant |
| US11862186B2 (en) | 2013-02-07 | 2024-01-02 | Apple Inc. | Voice trigger for a digital assistant |
| US11636869B2 (en) | 2013-02-07 | 2023-04-25 | Apple Inc. | Voice trigger for a digital assistant |
| US11557310B2 (en) | 2013-02-07 | 2023-01-17 | Apple Inc. | Voice trigger for a digital assistant |
| US12277954B2 (en) | 2013-02-07 | 2025-04-15 | Apple Inc. | Voice trigger for a digital assistant |
| US12009007B2 (en) | 2013-02-07 | 2024-06-11 | Apple Inc. | Voice trigger for a digital assistant |
| US10978090B2 (en) | 2013-02-07 | 2021-04-13 | Apple Inc. | Voice trigger for a digital assistant |
| US11388291B2 (en) | 2013-03-14 | 2022-07-12 | Apple Inc. | System and method for processing voicemail |
| US11798547B2 (en) | 2013-03-15 | 2023-10-24 | Apple Inc. | Voice activated device for use with a voice-based digital assistant |
| US9966060B2 (en) | 2013-06-07 | 2018-05-08 | Apple Inc. | System and method for user-specified pronunciation of words for speech synthesis and recognition |
| US10657961B2 (en) | 2013-06-08 | 2020-05-19 | Apple Inc. | Interpreting and acting upon commands that involve sharing information with remote devices |
| US11727219B2 (en) | 2013-06-09 | 2023-08-15 | Apple Inc. | System and method for inferring user intent from speech inputs |
| US12073147B2 (en) | 2013-06-09 | 2024-08-27 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
| US11048473B2 (en) | 2013-06-09 | 2021-06-29 | Apple Inc. | Device, method, and graphical user interface for enabling conversation persistence across two or more instances of a digital assistant |
| US10769385B2 (en) | 2013-06-09 | 2020-09-08 | Apple Inc. | System and method for inferring user intent from speech inputs |
| US12010262B2 (en) | 2013-08-06 | 2024-06-11 | Apple Inc. | Auto-activating smart responses based on activities from remote devices |
| US11921471B2 (en) | 2013-08-16 | 2024-03-05 | Meta Platforms Technologies, Llc | Systems, articles, and methods for wearable devices having secondary power sources in links of a band for providing secondary power in addition to a primary power source |
| US12504816B2 (en) | 2013-08-16 | 2025-12-23 | Meta Platforms Technologies, Llc | Wearable devices and associated band structures for sensing neuromuscular signals using sensor pairs in respective pods with communicative pathways to a common processor |
| US11644799B2 (en) | 2013-10-04 | 2023-05-09 | Meta Platforms Technologies, Llc | Systems, articles and methods for wearable electronic devices employing contact sensors |
| US11666264B1 (en) | 2013-11-27 | 2023-06-06 | Meta Platforms Technologies, Llc | Systems, articles, and methods for electromyography sensors |
| US11314370B2 (en) | 2013-12-06 | 2022-04-26 | Apple Inc. | Method for extracting salient dialog usage from live data |
| US20150339098A1 (en) * | 2014-05-21 | 2015-11-26 | Samsung Electronics Co., Ltd. | Display apparatus, remote control apparatus, system and controlling method thereof |
| US12118999B2 (en) | 2014-05-30 | 2024-10-15 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
| US11133008B2 (en) | 2014-05-30 | 2021-09-28 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
| US12067990B2 (en) | 2014-05-30 | 2024-08-20 | Apple Inc. | Intelligent assistant for home automation |
| US10083690B2 (en) | 2014-05-30 | 2018-09-25 | Apple Inc. | Better resolution when referencing to concepts |
| US11810562B2 (en) | 2014-05-30 | 2023-11-07 | Apple Inc. | Reducing the need for manual start/end-pointing and trigger phrases |
| US11670289B2 (en) | 2014-05-30 | 2023-06-06 | Apple Inc. | Multi-command single utterance input method |
| US10699717B2 (en) | 2014-05-30 | 2020-06-30 | Apple Inc. | Intelligent assistant for home automation |
| US10657966B2 (en) | 2014-05-30 | 2020-05-19 | Apple Inc. | Better resolution when referencing to concepts |
| US10714095B2 (en) | 2014-05-30 | 2020-07-14 | Apple Inc. | Intelligent assistant for home automation |
| US10417344B2 (en) | 2014-05-30 | 2019-09-17 | Apple Inc. | Exemplar-based natural language processing |
| US10497365B2 (en) | 2014-05-30 | 2019-12-03 | Apple Inc. | Multi-command single utterance input method |
| US11699448B2 (en) | 2014-05-30 | 2023-07-11 | Apple Inc. | Intelligent assistant for home automation |
| US10878809B2 (en) | 2014-05-30 | 2020-12-29 | Apple Inc. | Multi-command single utterance input method |
| US11257504B2 (en) | 2014-05-30 | 2022-02-22 | Apple Inc. | Intelligent assistant for home automation |
| US10904611B2 (en) | 2014-06-30 | 2021-01-26 | Apple Inc. | Intelligent automated assistant for TV user interactions |
| US12200297B2 (en) | 2014-06-30 | 2025-01-14 | Apple Inc. | Intelligent automated assistant for TV user interactions |
| US11838579B2 (en) | 2014-06-30 | 2023-12-05 | Apple Inc. | Intelligent automated assistant for TV user interactions |
| US11516537B2 (en) | 2014-06-30 | 2022-11-29 | Apple Inc. | Intelligent automated assistant for TV user interactions |
| US10431204B2 (en) | 2014-09-11 | 2019-10-01 | Apple Inc. | Method and apparatus for discovering trending terms in speech requests |
| US10438595B2 (en) | 2014-09-30 | 2019-10-08 | Apple Inc. | Speaker identification and unsupervised speaker adaptation techniques |
| US10453443B2 (en) | 2014-09-30 | 2019-10-22 | Apple Inc. | Providing an indication of the suitability of speech recognition |
| US9986419B2 (en) | 2014-09-30 | 2018-05-29 | Apple Inc. | Social reminders |
| US10390213B2 (en) | 2014-09-30 | 2019-08-20 | Apple Inc. | Social reminders |
| US20160170710A1 (en) * | 2014-12-12 | 2016-06-16 | Samsung Electronics Co., Ltd. | Method and apparatus for processing voice input |
| US20180018308A1 (en) * | 2015-01-22 | 2018-01-18 | Samsung Electronics Co., Ltd. | Text editing apparatus and text editing method based on speech signal |
| US11231904B2 (en) | 2015-03-06 | 2022-01-25 | Apple Inc. | Reducing response latency of intelligent automated assistants |
| US12236952B2 (en) | 2015-03-08 | 2025-02-25 | Apple Inc. | Virtual assistant activation |
| US11087759B2 (en) | 2015-03-08 | 2021-08-10 | Apple Inc. | Virtual assistant activation |
| US10529332B2 (en) | 2015-03-08 | 2020-01-07 | Apple Inc. | Virtual assistant activation |
| US11842734B2 (en) | 2015-03-08 | 2023-12-12 | Apple Inc. | Virtual assistant activation |
| US10567477B2 (en) | 2015-03-08 | 2020-02-18 | Apple Inc. | Virtual assistant continuity |
| US10930282B2 (en) | 2015-03-08 | 2021-02-23 | Apple Inc. | Competing devices responding to voice triggers |
| US10311871B2 (en) | 2015-03-08 | 2019-06-04 | Apple Inc. | Competing devices responding to voice triggers |
| US10726197B2 (en) * | 2015-03-26 | 2020-07-28 | Lenovo (Singapore) Pte. Ltd. | Text correction using a second input |
| US20160283453A1 (en) * | 2015-03-26 | 2016-09-29 | Lenovo (Singapore) Pte. Ltd. | Text correction using a second input |
| US12001933B2 (en) | 2015-05-15 | 2024-06-04 | Apple Inc. | Virtual assistant in a communication session |
| US12154016B2 (en) | 2015-05-15 | 2024-11-26 | Apple Inc. | Virtual assistant in a communication session |
| US12333404B2 (en) | 2015-05-15 | 2025-06-17 | Apple Inc. | Virtual assistant in a communication session |
| US11468282B2 (en) | 2015-05-15 | 2022-10-11 | Apple Inc. | Virtual assistant in a communication session |
| US11070949B2 (en) | 2015-05-27 | 2021-07-20 | Apple Inc. | Systems and methods for proactively identifying and surfacing relevant content on an electronic device with a touch-sensitive display |
| US11127397B2 (en) | 2015-05-27 | 2021-09-21 | Apple Inc. | Device voice control |
| US10356243B2 (en) | 2015-06-05 | 2019-07-16 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
| US10681212B2 (en) | 2015-06-05 | 2020-06-09 | Apple Inc. | Virtual assistant aided communication with 3rd party service in a communication session |
| US11025565B2 (en) | 2015-06-07 | 2021-06-01 | Apple Inc. | Personalized prediction of responses for instant messaging |
| US11947873B2 (en) | 2015-06-29 | 2024-04-02 | Apple Inc. | Virtual assistant for media playback |
| US11010127B2 (en) | 2015-06-29 | 2021-05-18 | Apple Inc. | Virtual assistant for media playback |
| US20170053647A1 (en) * | 2015-08-19 | 2017-02-23 | Hand Held Products, Inc. | Auto-complete methods for spoken complete value entries |
| US10529335B2 (en) * | 2015-08-19 | 2020-01-07 | Hand Held Products, Inc. | Auto-complete methods for spoken complete value entries |
| US10410629B2 (en) * | 2015-08-19 | 2019-09-10 | Hand Held Products, Inc. | Auto-complete methods for spoken complete value entries |
| US11550542B2 (en) | 2015-09-08 | 2023-01-10 | Apple Inc. | Zero latency digital assistant |
| US11853536B2 (en) | 2015-09-08 | 2023-12-26 | Apple Inc. | Intelligent automated assistant in a media environment |
| US11500672B2 (en) | 2015-09-08 | 2022-11-15 | Apple Inc. | Distributed personal assistant |
| US12204932B2 (en) | 2015-09-08 | 2025-01-21 | Apple Inc. | Distributed personal assistant |
| US11126400B2 (en) | 2015-09-08 | 2021-09-21 | Apple Inc. | Zero latency digital assistant |
| US12386491B2 (en) | 2015-09-08 | 2025-08-12 | Apple Inc. | Intelligent automated assistant in a media environment |
| US11809483B2 (en) | 2015-09-08 | 2023-11-07 | Apple Inc. | Intelligent automated assistant for media search and playback |
| US12608171B2 (en) | 2015-09-08 | 2026-04-21 | Apple Inc. | Zero latency digital assistant |
| US11954405B2 (en) | 2015-09-08 | 2024-04-09 | Apple Inc. | Zero latency digital assistant |
| US12051413B2 (en) | 2015-09-30 | 2024-07-30 | Apple Inc. | Intelligent device identification |
| US11526368B2 (en) | 2015-11-06 | 2022-12-13 | Apple Inc. | Intelligent automated assistant in a messaging environment |
| US11809886B2 (en) | 2015-11-06 | 2023-11-07 | Apple Inc. | Intelligent automated assistant in a messaging environment |
| US11886805B2 (en) | 2015-11-09 | 2024-01-30 | Apple Inc. | Unconventional virtual assistant interactions |
| US10354652B2 (en) | 2015-12-02 | 2019-07-16 | Apple Inc. | Applying neural network language models to weighted finite state transducers for automatic speech recognition |
| US11853647B2 (en) | 2015-12-23 | 2023-12-26 | Apple Inc. | Proactive assistance based on dialog communication between devices |
| US10942703B2 (en) | 2015-12-23 | 2021-03-09 | Apple Inc. | Proactive assistance based on dialog communication between devices |
| DK201670539A1 (en) * | 2016-03-14 | 2017-10-02 | Apple Inc | Dictation that allows editing |
| US10860170B1 (en) | 2016-03-31 | 2020-12-08 | United Services Automobile Association (Usaa) | System and method for data visualization and modification in an immersive three dimensional (3-D) environment |
| US10452227B1 (en) * | 2016-03-31 | 2019-10-22 | United Services Automobile Association (Usaa) | System and method for data visualization and modification in an immersive three dimensional (3-D) environment |
| US11188189B1 (en) | 2016-03-31 | 2021-11-30 | United Services Automobile Association (Usaa) | System and method for data visualization and modification in an immersive three dimensional (3-D) environment |
| US11662878B1 (en) | 2016-03-31 | 2023-05-30 | United Services Automobile Association (Usaa) | System and method for data visualization and modification in an immersive three dimensional (3-D) environment |
| US11227589B2 (en) | 2016-06-06 | 2022-01-18 | Apple Inc. | Intelligent list reading |
| US11069347B2 (en) | 2016-06-08 | 2021-07-20 | Apple Inc. | Intelligent automated assistant for media exploration |
| US12223282B2 (en) | 2016-06-09 | 2025-02-11 | Apple Inc. | Intelligent automated assistant in a home environment |
| US11657820B2 (en) | 2016-06-10 | 2023-05-23 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
| US10733993B2 (en) | 2016-06-10 | 2020-08-04 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
| US11037565B2 (en) | 2016-06-10 | 2021-06-15 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
| US10067938B2 (en) | 2016-06-10 | 2018-09-04 | Apple Inc. | Multilingual word prediction |
| US12175977B2 (en) | 2016-06-10 | 2024-12-24 | Apple Inc. | Intelligent digital assistant in a multi-tasking environment |
| US11749275B2 (en) | 2016-06-11 | 2023-09-05 | Apple Inc. | Application integration with a digital assistant |
| US11809783B2 (en) | 2016-06-11 | 2023-11-07 | Apple Inc. | Intelligent device arbitration and control |
| US10942702B2 (en) | 2016-06-11 | 2021-03-09 | Apple Inc. | Intelligent device arbitration and control |
| US12293763B2 (en) | 2016-06-11 | 2025-05-06 | Apple Inc. | Application integration with a digital assistant |
| US10580409B2 (en) | 2016-06-11 | 2020-03-03 | Apple Inc. | Application integration with a digital assistant |
| US12197817B2 (en) | 2016-06-11 | 2025-01-14 | Apple Inc. | Intelligent device arbitration and control |
| US11152002B2 (en) | 2016-06-11 | 2021-10-19 | Apple Inc. | Application integration with a digital assistant |
| US12554325B2 (en) | 2016-07-25 | 2026-02-17 | Meta Platforms Technologies, Llc | Methods and apparatuses for low latency body state prediction based on neuromuscular data |
| US10990174B2 (en) | 2016-07-25 | 2021-04-27 | Facebook Technologies, Llc | Methods and apparatus for predicting musculo-skeletal position information using wearable autonomous sensors |
| US12094460B2 (en) * | 2016-07-27 | 2024-09-17 | Samsung Electronics Co., Ltd. | Electronic device and voice recognition method thereof |
| US10474753B2 (en) | 2016-09-07 | 2019-11-12 | Apple Inc. | Language identification using recurrent neural networks |
| US10553215B2 (en) | 2016-09-23 | 2020-02-04 | Apple Inc. | Intelligent automated assistant |
| US10043516B2 (en) | 2016-09-23 | 2018-08-07 | Apple Inc. | Intelligent automated assistant |
| US11281993B2 (en) | 2016-12-05 | 2022-03-22 | Apple Inc. | Model and ensemble compression for metric learning |
| US10593346B2 (en) | 2016-12-22 | 2020-03-17 | Apple Inc. | Rank-reduced token representation for automatic speech recognition |
| US12260234B2 (en) | 2017-01-09 | 2025-03-25 | Apple Inc. | Application integration with a digital assistant |
| US11656884B2 (en) | 2017-01-09 | 2023-05-23 | Apple Inc. | Application integration with a digital assistant |
| US11204787B2 (en) | 2017-01-09 | 2021-12-21 | Apple Inc. | Application integration with a digital assistant |
| US10741181B2 (en) | 2017-05-09 | 2020-08-11 | Apple Inc. | User interface for correcting recognition errors |
| US10332518B2 (en) | 2017-05-09 | 2019-06-25 | Apple Inc. | User interface for correcting recognition errors |
| US10417266B2 (en) | 2017-05-09 | 2019-09-17 | Apple Inc. | Context-aware ranking of intelligent response suggestions |
| US11599331B2 (en) | 2017-05-11 | 2023-03-07 | Apple Inc. | Maintaining privacy of personal information |
| US11467802B2 (en) | 2017-05-11 | 2022-10-11 | Apple Inc. | Maintaining privacy of personal information |
| US10777187B2 (en) * | 2017-05-11 | 2020-09-15 | Olympus Corporation | Sound collection apparatus, sound collection method, sound collection program, dictation method, information processing apparatus, and recording medium recording information processing program |
| US10395654B2 (en) | 2017-05-11 | 2019-08-27 | Apple Inc. | Text normalization based on a data-driven learning network |
| US10847142B2 (en) | 2017-05-11 | 2020-11-24 | Apple Inc. | Maintaining privacy of personal information |
| US10755703B2 (en) | 2017-05-11 | 2020-08-25 | Apple Inc. | Offline personal assistant |
| US20180330716A1 (en) * | 2017-05-11 | 2018-11-15 | Olympus Corporation | Sound collection apparatus, sound collection method, sound collection program, dictation method, information processing apparatus, and recording medium recording information processing program |
| US10726832B2 (en) | 2017-05-11 | 2020-07-28 | Apple Inc. | Maintaining privacy of personal information |
| US11301477B2 (en) | 2017-05-12 | 2022-04-12 | Apple Inc. | Feedback analysis of a digital assistant |
| US10791176B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Synchronization and task delegation of a digital assistant |
| US10789945B2 (en) | 2017-05-12 | 2020-09-29 | Apple Inc. | Low-latency intelligent automated assistant |
| US11862151B2 (en) | 2017-05-12 | 2024-01-02 | Apple Inc. | Low-latency intelligent automated assistant |
| US10410637B2 (en) | 2017-05-12 | 2019-09-10 | Apple Inc. | User-specific acoustic models |
| US11837237B2 (en) | 2017-05-12 | 2023-12-05 | Apple Inc. | User-specific acoustic models |
| US11380310B2 (en) | 2017-05-12 | 2022-07-05 | Apple Inc. | Low-latency intelligent automated assistant |
| US11538469B2 (en) | 2017-05-12 | 2022-12-27 | Apple Inc. | Low-latency intelligent automated assistant |
| US11405466B2 (en) | 2017-05-12 | 2022-08-02 | Apple Inc. | Synchronization and task delegation of a digital assistant |
| US11580990B2 (en) | 2017-05-12 | 2023-02-14 | Apple Inc. | User-specific acoustic models |
| US10482874B2 (en) | 2017-05-15 | 2019-11-19 | Apple Inc. | Hierarchical belief states for digital assistants |
| US12014118B2 (en) | 2017-05-15 | 2024-06-18 | Apple Inc. | Multi-modal interfaces having selection disambiguation and text modification capability |
| US10810274B2 (en) | 2017-05-15 | 2020-10-20 | Apple Inc. | Optimizing dialogue policy decisions for digital assistants using implicit feedback |
| US10748546B2 (en) | 2017-05-16 | 2020-08-18 | Apple Inc. | Digital assistant services based on device capabilities |
| US11532306B2 (en) | 2017-05-16 | 2022-12-20 | Apple Inc. | Detecting a trigger of a digital assistant |
| US12254887B2 (en) | 2017-05-16 | 2025-03-18 | Apple Inc. | Far-field extension of digital assistant services for providing a notification of an event to a user |
| US10403278B2 (en) | 2017-05-16 | 2019-09-03 | Apple Inc. | Methods and systems for phonetic matching in digital assistant services |
| US10311144B2 (en) | 2017-05-16 | 2019-06-04 | Apple Inc. | Emoji word sense disambiguation |
| US10303715B2 (en) | 2017-05-16 | 2019-05-28 | Apple Inc. | Intelligent automated assistant for media exploration |
| US11675829B2 (en) | 2017-05-16 | 2023-06-13 | Apple Inc. | Intelligent automated assistant for media exploration |
| US12026197B2 (en) | 2017-05-16 | 2024-07-02 | Apple Inc. | Intelligent automated assistant for media exploration |
| US11217255B2 (en) | 2017-05-16 | 2022-01-04 | Apple Inc. | Far-field extension for digital assistant services |
| US10909171B2 (en) | 2017-05-16 | 2021-02-02 | Apple Inc. | Intelligent automated assistant for media exploration |
| US10657328B2 (en) | 2017-06-02 | 2020-05-19 | Apple Inc. | Multi-task recurrent neural network architecture for efficient morphology handling in neural language modeling |
| US10445429B2 (en) | 2017-09-21 | 2019-10-15 | Apple Inc. | Natural language understanding using vocabularies with compressed serialized tries |
| US10755051B2 (en) | 2017-09-29 | 2020-08-25 | Apple Inc. | Rule-based natural language processing |
| US11635736B2 (en) | 2017-10-19 | 2023-04-25 | Meta Platforms Technologies, Llc | Systems and methods for identifying biological structures associated with neuromuscular source signals |
| US10636424B2 (en) | 2017-11-30 | 2020-04-28 | Apple Inc. | Multi-turn canned dialog |
| US10733982B2 (en) | 2018-01-08 | 2020-08-04 | Apple Inc. | Multi-directional dialog |
| US12579768B2 (en) | 2018-01-25 | 2026-03-17 | Meta Platforms Technologies, Llc | Wearable electronic devices, extended reality systems including neuromuscular sensors, and methods for generating text from speech input and modifying the generated text based on neuromuscular data |
| US10733375B2 (en) | 2018-01-31 | 2020-08-04 | Apple Inc. | Knowledge-based framework for improving natural language understanding |
| US10789959B2 (en) | 2018-03-02 | 2020-09-29 | Apple Inc. | Training speaker recognition models for digital assistants |
| US20190279623A1 (en) * | 2018-03-08 | 2019-09-12 | Kika Tech (Cayman) Holdings Co., Limited | Method for speech recognition dictation and correction by spelling input, system and storage medium |
| US10592604B2 (en) | 2018-03-12 | 2020-03-17 | Apple Inc. | Inverse text normalization for automatic speech recognition |
| US10818288B2 (en) | 2018-03-26 | 2020-10-27 | Apple Inc. | Natural assistant interaction |
| US12211502B2 (en) | 2018-03-26 | 2025-01-28 | Apple Inc. | Natural assistant interaction |
| US11710482B2 (en) | 2018-03-26 | 2023-07-25 | Apple Inc. | Natural assistant interaction |
| US10909331B2 (en) | 2018-03-30 | 2021-02-02 | Apple Inc. | Implicit identification of translation payload with neural machine translation |
| US11907436B2 (en) | 2018-05-07 | 2024-02-20 | Apple Inc. | Raise to speak |
| US11145294B2 (en) | 2018-05-07 | 2021-10-12 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
| US11487364B2 (en) | 2018-05-07 | 2022-11-01 | Apple Inc. | Raise to speak |
| US10928918B2 (en) | 2018-05-07 | 2021-02-23 | Apple Inc. | Raise to speak |
| US11900923B2 (en) | 2018-05-07 | 2024-02-13 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
| US11169616B2 (en) | 2018-05-07 | 2021-11-09 | Apple Inc. | Raise to speak |
| US11854539B2 (en) | 2018-05-07 | 2023-12-26 | Apple Inc. | Intelligent automated assistant for delivering content from user experiences |
| US11216069B2 (en) * | 2018-05-08 | 2022-01-04 | Facebook Technologies, Llc | Systems and methods for improved speech recognition using neuromuscular information |
| US10685180B2 (en) | 2018-05-10 | 2020-06-16 | International Business Machines Corporation | Using remote words in data streams from remote devices to autocorrect input text |
| US10984780B2 (en) | 2018-05-21 | 2021-04-20 | Apple Inc. | Global semantic word embeddings using bi-directional recurrent neural networks |
| US10684703B2 (en) | 2018-06-01 | 2020-06-16 | Apple Inc. | Attention aware virtual assistant dismissal |
| US11495218B2 (en) | 2018-06-01 | 2022-11-08 | Apple Inc. | Virtual assistant operation in multi-device environments |
| US11386266B2 (en) | 2018-06-01 | 2022-07-12 | Apple Inc. | Text correction |
| US11360577B2 (en) | 2018-06-01 | 2022-06-14 | Apple Inc. | Attention aware virtual assistant dismissal |
| US10984798B2 (en) | 2018-06-01 | 2021-04-20 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
| US11431642B2 (en) | 2018-06-01 | 2022-08-30 | Apple Inc. | Variable latency device coordination |
| US12386434B2 (en) | 2018-06-01 | 2025-08-12 | Apple Inc. | Attention aware virtual assistant dismissal |
| US10892996B2 (en) | 2018-06-01 | 2021-01-12 | Apple Inc. | Variable latency device coordination |
| US10403283B1 (en) | 2018-06-01 | 2019-09-03 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
| US12067985B2 (en) | 2018-06-01 | 2024-08-20 | Apple Inc. | Virtual assistant operations in multi-device environments |
| US10720160B2 (en) | 2018-06-01 | 2020-07-21 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
| US11009970B2 (en) | 2018-06-01 | 2021-05-18 | Apple Inc. | Attention aware virtual assistant dismissal |
| US12080287B2 (en) | 2018-06-01 | 2024-09-03 | Apple Inc. | Voice interaction at a primary device to access call functionality of a companion device |
| US11630525B2 (en) | 2018-06-01 | 2023-04-18 | Apple Inc. | Attention aware virtual assistant dismissal |
| US12061752B2 (en) | 2018-06-01 | 2024-08-13 | Apple Inc. | Attention aware virtual assistant dismissal |
| US10504518B1 (en) | 2018-06-03 | 2019-12-10 | Apple Inc. | Accelerated task performance |
| US10944859B2 (en) | 2018-06-03 | 2021-03-09 | Apple Inc. | Accelerated task performance |
| US10496705B1 (en) | 2018-06-03 | 2019-12-03 | Apple Inc. | Accelerated task performance |
| US11567573B2 (en) | 2018-09-20 | 2023-01-31 | Meta Platforms Technologies, Llc | Neuromuscular text entry, writing and drawing in augmented reality systems |
| US11010561B2 (en) | 2018-09-27 | 2021-05-18 | Apple Inc. | Sentiment prediction from textual data |
| US12367879B2 (en) | 2018-09-28 | 2025-07-22 | Apple Inc. | Multi-modal inputs for voice commands |
| US11893992B2 (en) | 2018-09-28 | 2024-02-06 | Apple Inc. | Multi-modal inputs for voice commands |
| US11462215B2 (en) | 2018-09-28 | 2022-10-04 | Apple Inc. | Multi-modal inputs for voice commands |
| US10839159B2 (en) | 2018-09-28 | 2020-11-17 | Apple Inc. | Named entity normalization in a spoken dialog system |
| US11170166B2 (en) | 2018-09-28 | 2021-11-09 | Apple Inc. | Neural typographical error modeling via generative adversarial networks |
| US11475898B2 (en) | 2018-10-26 | 2022-10-18 | Apple Inc. | Low-latency multi-speaker speech recognition |
| US11797087B2 (en) | 2018-11-27 | 2023-10-24 | Meta Platforms Technologies, Llc | Methods and apparatus for autocalibration of a wearable electrode sensor system |
| US11941176B1 (en) | 2018-11-27 | 2024-03-26 | Meta Platforms Technologies, Llc | Methods and apparatus for autocalibration of a wearable electrode sensor system |
| US11638059B2 (en) | 2019-01-04 | 2023-04-25 | Apple Inc. | Content playback on multiple devices |
| US11348573B2 (en) | 2019-03-18 | 2022-05-31 | Apple Inc. | Multimodality in digital assistant systems |
| US12136419B2 (en) | 2019-03-18 | 2024-11-05 | Apple Inc. | Multimodality in digital assistant systems |
| US11783815B2 (en) | 2019-03-18 | 2023-10-10 | Apple Inc. | Multimodality in digital assistant systems |
| US11481030B2 (en) | 2019-03-29 | 2022-10-25 | Meta Platforms Technologies, Llc | Methods and apparatus for gesture detection and classification |
| US11961494B1 (en) | 2019-03-29 | 2024-04-16 | Meta Platforms Technologies, Llc | Electromagnetic interference reduction in extended reality environments |
| US11275889B2 (en) * | 2019-04-04 | 2022-03-15 | International Business Machines Corporation | Artificial intelligence for interactive preparation of electronic documents |
| US11481031B1 (en) | 2019-04-30 | 2022-10-25 | Meta Platforms Technologies, Llc | Devices, systems, and methods for controlling computing devices via neuromuscular signals of users |
| US11217251B2 (en) | 2019-05-06 | 2022-01-04 | Apple Inc. | Spoken notifications |
| US12216894B2 (en) | 2019-05-06 | 2025-02-04 | Apple Inc. | User configurable task triggers |
| US11307752B2 (en) | 2019-05-06 | 2022-04-19 | Apple Inc. | User configurable task triggers |
| US11423908B2 (en) | 2019-05-06 | 2022-08-23 | Apple Inc. | Interpreting spoken requests |
| US12154571B2 (en) | 2019-05-06 | 2024-11-26 | Apple Inc. | Spoken notifications |
| US11475884B2 (en) | 2019-05-06 | 2022-10-18 | Apple Inc. | Reducing digital assistant latency when a language is incorrectly determined |
| US11705130B2 (en) | 2019-05-06 | 2023-07-18 | Apple Inc. | Spoken notifications |
| US11675491B2 (en) | 2019-05-06 | 2023-06-13 | Apple Inc. | User configurable task triggers |
| US11888791B2 (en) | 2019-05-21 | 2024-01-30 | Apple Inc. | Providing message response suggestions |
| US11140099B2 (en) | 2019-05-21 | 2021-10-05 | Apple Inc. | Providing message response suggestions |
| US11360739B2 (en) | 2019-05-31 | 2022-06-14 | Apple Inc. | User activity shortcut suggestions |
| US11496600B2 (en) | 2019-05-31 | 2022-11-08 | Apple Inc. | Remote execution of machine-learned models |
| US11657813B2 (en) | 2019-05-31 | 2023-05-23 | Apple Inc. | Voice identification in digital assistant systems |
| US11237797B2 (en) | 2019-05-31 | 2022-02-01 | Apple Inc. | User activity shortcut suggestions |
| US11289073B2 (en) | 2019-05-31 | 2022-03-29 | Apple Inc. | Device text to speech |
| US11360641B2 (en) | 2019-06-01 | 2022-06-14 | Apple Inc. | Increasing the relevance of new available information |
| US11790914B2 (en) | 2019-06-01 | 2023-10-17 | Apple Inc. | Methods and user interfaces for voice-based control of electronic devices |
| US20230186918A1 (en) * | 2019-07-15 | 2023-06-15 | Axon Enterprise, Inc. | Methods and systems for transcription of audio data |
| US12062374B2 (en) * | 2019-07-15 | 2024-08-13 | Axon Enterprise, Inc. | Methods and systems for transcription of audio data |
| US11493993B2 (en) | 2019-09-04 | 2022-11-08 | Meta Platforms Technologies, Llc | Systems, methods, and interfaces for performing inputs based on neuromuscular control |
| US11488406B2 (en) | 2019-09-25 | 2022-11-01 | Apple Inc. | Text detection using global geometry estimators |
| US12591304B2 (en) | 2019-11-25 | 2026-03-31 | Meta Platforms Technologies, Llc | Systems and methods for contextualized interactions with an environment |
| US11907423B2 (en) | 2019-11-25 | 2024-02-20 | Meta Platforms Technologies, Llc | Systems and methods for contextualized interactions with an environment |
| US20210225377A1 (en) * | 2020-01-17 | 2021-07-22 | Verbz Labs Inc. | Method for transcribing spoken language with real-time gesture-based formatting |
| US20230168859A1 (en) * | 2020-04-23 | 2023-06-01 | JRD Communication (Shenzhen) Ltd. | Method and device for voice input using head control device |
| US12340148B2 (en) * | 2020-04-23 | 2025-06-24 | JRD Communication (Shenzhen) Ltd. | Method and device for voice input using head control device |
| US11765209B2 (en) | 2020-05-11 | 2023-09-19 | Apple Inc. | Digital assistant hardware abstraction |
| US11810578B2 (en) | 2020-05-11 | 2023-11-07 | Apple Inc. | Device arbitration for digital assistant-based intercom systems |
| US12197712B2 (en) | 2020-05-11 | 2025-01-14 | Apple Inc. | Providing relevant data items based on context |
| US12301635B2 (en) | 2020-05-11 | 2025-05-13 | Apple Inc. | Digital assistant hardware abstraction |
| US11914848B2 (en) | 2020-05-11 | 2024-02-27 | Apple Inc. | Providing relevant data items based on context |
| US11924254B2 (en) | 2020-05-11 | 2024-03-05 | Apple Inc. | Digital assistant hardware abstraction |
| US11838734B2 (en) | 2020-07-20 | 2023-12-05 | Apple Inc. | Multi-device audio adjustment coordination |
| US12219314B2 (en) | 2020-07-21 | 2025-02-04 | Apple Inc. | User identification using headphones |
| US11750962B2 (en) | 2020-07-21 | 2023-09-05 | Apple Inc. | User identification using headphones |
| US11696060B2 (en) | 2020-07-21 | 2023-07-04 | Apple Inc. | User identification using headphones |
| US12057123B1 (en) * | 2020-11-19 | 2024-08-06 | Voicebase, Inc. | Communication devices with embedded audio content transcription and analysis functions |
| US11868531B1 (en) | 2021-04-08 | 2024-01-09 | Meta Platforms Technologies, Llc | Wearable device providing for thumb-to-finger-based input gestures detected based on neuromuscular signals, and systems and methods of use thereof |
| US12619452B2 (en) | 2023-07-20 | 2026-05-05 | Apple Inc. | Intelligent automated assistant in a messaging environment |
Also Published As
| Publication number | Publication date |
|---|---|
| US9640181B2 (en) | 2017-05-02 |
| WO2015100172A1 (en) | 2015-07-02 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| US9640181B2 (en) | Text editing with gesture control and natural speech | |
| US10402162B2 (en) | Automatic speech recognition (ASR) feedback for head mounted displays (HMD) | |
| US9830909B2 (en) | User configurable speech commands | |
| US9383816B2 (en) | Text selection using HMD head-tracker and voice-command | |
| US10388284B2 (en) | Speech recognition apparatus and method | |
| KR102002979B1 (en) | Leveraging head mounted displays to enable person-to-person interactions | |
| US9921805B2 (en) | Multi-modal disambiguation of voice assisted input | |
| US10013976B2 (en) | Context sensitive overlays in voice controlled headset computer displays | |
| KR102545666B1 (en) | Method for providing sentence based on persona and electronic device for supporting the same | |
| US20210193123A1 (en) | Speech decoding method and apparatus, computer device, and storage medium | |
| JP6392374B2 (en) | Head mounted display system and method for operating head mounted display device | |
| US20150220142A1 (en) | Head-Tracking Based Technique for Moving On-Screen Objects on Head Mounted Displays (HMD) | |
| CN103890836A (en) | Bluetooth or other wireless interface with power management for head mounted display | |
| US20210151046A1 (en) | Function performance based on input intonation | |
| US20150187371A1 (en) | Location Tracking From Natural Speech | |
| EP3550449A1 (en) | Search method and electronic device using the method | |
| US20150220506A1 (en) | Remote Document Annotation | |
| US8856006B1 (en) | Assisted speech input | |
| US20240330362A1 (en) | System and method for generating visual captions | |
| US20240233729A9 (en) | Transcription based on speech and visual input | |
| US20190369400A1 (en) | Head-Mounted Display System | |
| KR20200051984A (en) | Sign language translation with earphone | |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | AS | Assignment | Owner name: KOPIN CORPORATION, MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:PARKINSON, CHRISTOPHER;REEL/FRAME:034946/0293 Effective date: 20150205 Owner name: ASK ZIGGY, INC., CALIFORNIA Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:LEIB, SHAI;REEL/FRAME:034946/0308 Effective date: 20150122 |
| | AS | Assignment | Owner name: KOPIN CORPORATION, MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ASK ZIGGY, INC.;REEL/FRAME:041675/0816 Effective date: 20170321 |
| | STCF | Information on status: patent grant | Free format text: PATENTED CASE |
| | CC | Certificate of correction | |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |
| | MAFP | Maintenance fee payment | Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 8 |