
Researchers Submit Patent Application, "Multi-Stage Speaker Adaptation", for Approval

July 1, 2014



By a News Reporter-Staff News Editor at Information Technology Newsweekly -- From Washington, D.C., VerticalNews journalists report that a patent application by the inventors Petar Aleksic (Mountain View, CA) and Xin Lei (Mountain View, CA), filed on February 17, 2014, was made available online on June 19, 2014.

The patent's assignee is Google Inc.

News editors obtained the following quote from the background information supplied by the inventors: "Automatic speech recognition (ASR) technology can be used to map audio utterances to textual representations of those utterances. In some systems, ASR involves comparing characteristics of the audio utterances to an acoustic model of human voice. However, different speakers may exhibit different speech characteristics (e.g., pitch, accent, tempo, etc.). Consequently, the acoustic model may not perform well for all speakers."

As a supplement to the background information on this patent application, VerticalNews correspondents also obtained the inventors' summary information for this patent application: "In a first example embodiment, a first gender-specific speaker adaptation technique may be selected based on characteristics of a first set of feature vectors. The first set of feature vectors may correspond to a first unit of input speech, and may be configured for use in automatic speech recognition (ASR) of the first unit of input speech. A second set of feature vectors may be modified based on the first gender-specific speaker adaptation technique. The second set of feature vectors may correspond to a second unit of input speech. The modified second set of feature vectors may be configured for use in ASR of the second unit of input speech. A first speaker-dependent speaker adaptation technique may be selected based on characteristics of the second set of feature vectors. A third set of feature vectors may be modified based on the first speaker-dependent speaker adaptation technique. The third set of feature vectors may correspond to a third unit of input speech. The modified third set of feature vectors may be configured for use in ASR of the third unit of input speech.

"In a second example embodiment, a first set of feature vectors may be obtained. The first set of feature vectors may correspond to a first unit of input speech. Characteristics of the first set of feature vectors may be compared to a first gender-specific speech model and a second gender-specific speech model. The characteristics of the first set of feature vectors may be determined to fit the first gender-specific speech model better than the second gender-specific model. A second set of feature vectors may be obtained. The second set of feature vectors may correspond to a second unit of input speech. The second set of feature vectors may be modified based on a first gender-specific speaker adaptation technique associated with the first gender-specific speech model. After modifying the second set of feature vectors, characteristics of the second set of feature vectors may be compared to the first gender-specific speech model, the second gender-specific speech model, and a speaker-dependent speech model. The characteristics of the second set of feature vectors may be determined to fit the speaker-dependent speech model better than the first and second gender-specific models. A third set of feature vectors may be obtained. The third set of feature vectors may correspond to a third unit of input speech. The third set of feature vectors may be modified based on a speaker-dependent speaker adaptation technique associated with the speaker-dependent speech model.

"A third example embodiment may include a non-transitory computer-readable storage medium, having stored thereon program instructions that, upon execution by a computing device, cause the computing device to perform operations in accordance with the first and/or second example embodiments.

"A fourth example embodiment may include a computing device, comprising at least a processor and data storage. The data storage may contain program instructions that, upon execution by the processor, operate in accordance with the first and/or second example embodiments.

"These as well as other aspects, advantages, and alternatives will become apparent to those of ordinary skill in the art by reading the following detailed description with reference where appropriate to the accompanying drawings. Further, it should be understood that the description provided in this summary section and elsewhere in this document is intended to illustrate the claimed subject matter by way of example and not by way of limitation.

BRIEF DESCRIPTION OF THE FIGURES

"FIG. 1 depicts a distributed computing architecture, in accordance with an example embodiment.

"FIG. 2A is a block diagram of a server device, in accordance with an example embodiment.

"FIG. 2B depicts a cloud-based server system, in accordance with an example embodiment.

"FIG. 3 depicts a block diagram of a client device, in accordance with an example embodiment.

"FIG. 4 depicts an ASR system, in accordance with an example embodiment.

"FIG. 5 depicts aspects of an acoustic model, in accordance with an example embodiment.

"FIG. 6 depicts an ASR system search graph, in accordance with an example embodiment.

"FIG. 7 depicts an ASR system that supports speaker adaptation, in accordance with an example embodiment.

"FIG. 8A is a message flow diagram of speaker adaptation, in accordance with an example embodiment.

"FIG. 8B is another message flow diagram of speaker adaptation, in accordance with an example embodiment.

"FIG. 9 is a flow chart, in accordance with an example embodiment.

"FIGS. 10A and 10B are another flow chart, in accordance with an example embodiment."
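The staged selection described in the second example embodiment of the quoted summary can be illustrated with a short sketch. This code is not taken from the patent application. It assumes, purely for illustration, that each speech model is a diagonal Gaussian, that "fit" means average log-likelihood, and that each adaptation technique is a fixed offset added to every feature vector. The names `SpeechModel`, `best_fit`, and `multi_stage_adaptation`, and all numeric values, are hypothetical. The summary leaves open whether the second-stage comparison uses the modified or the unmodified second set of feature vectors; the sketch uses the unmodified set.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Vector = List[float]


@dataclass
class SpeechModel:
    """Hypothetical speech model: a diagonal Gaussian plus an adaptation offset."""
    name: str
    mean: Vector
    var: Vector
    shift: Vector  # assumed adaptation technique: offset added to each feature vector

    def log_likelihood(self, frames: Sequence[Vector]) -> float:
        """Average per-frame log-likelihood of the frames under this model."""
        total = 0.0
        for frame in frames:
            for x, m, v in zip(frame, self.mean, self.var):
                total += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
        return total / len(frames)

    def adapt(self, frames: Sequence[Vector]) -> List[Vector]:
        """Apply this model's (assumed) adaptation to every feature vector."""
        return [[x + s for x, s in zip(frame, self.shift)] for frame in frames]


def best_fit(frames: Sequence[Vector], models: Sequence[SpeechModel]) -> SpeechModel:
    """Return the model under which the frames score the highest likelihood."""
    return max(models, key=lambda model: model.log_likelihood(frames))


def multi_stage_adaptation(
    first: Sequence[Vector],
    second: Sequence[Vector],
    third: Sequence[Vector],
    gender_models: List[SpeechModel],
    speaker_model: SpeechModel,
) -> Tuple[str, str, List[Vector], List[Vector]]:
    # Stage 1: pick a gender-specific model from the first unit of speech,
    # then use its adaptation on the second unit.
    stage1 = best_fit(first, gender_models)
    second_adapted = stage1.adapt(second)

    # Stage 2: compare the second unit against the gender-specific models and
    # the speaker-dependent model, then adapt the third unit with the winner.
    stage2 = best_fit(second, gender_models + [speaker_model])
    third_adapted = stage2.adapt(third)
    return stage1.name, stage2.name, second_adapted, third_adapted


# Toy two-dimensional feature vectors; all values are made up for illustration.
gender_models = [
    SpeechModel("gender_model_1", [0.0, 0.0], [1.0, 1.0], [-0.5, -0.5]),
    SpeechModel("gender_model_2", [4.0, 4.0], [1.0, 1.0], [-4.0, -4.0]),
]
speaker_model = SpeechModel("speaker_dependent", [1.0, 1.0], [0.25, 0.25], [-1.0, -1.0])

first_unit = [[0.5, 0.5], [1.0, 1.5]]
second_unit = [[1.0, 1.0], [1.25, 0.75]]
third_unit = [[1.0, 1.25], [0.75, 1.0]]

stage1_name, stage2_name, second_adapted, third_adapted = multi_stage_adaptation(
    first_unit, second_unit, third_unit, gender_models, speaker_model
)
print(stage1_name, stage2_name)
print(second_adapted)
print(third_adapted)
```

In this toy setup the first unit only chooses between the two gender-specific models, while the second unit can also select the speaker-dependent model once enough speech is available to score it. A production system would typically use a richer model and a learned transform rather than a fixed offset.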

For additional information on this patent application, see: Aleksic, Petar; Lei, Xin. Multi-Stage Speaker Adaptation. Filed February 17, 2014 and posted June 19, 2014. Patent URL: http://appft.uspto.gov/netacgi/nph-Parser?Sect1=PTO2&Sect2=HITOFF&u=%2Fnetahtml%2FPTO%2Fsearch-adv.html&r=1266&p=26&f=G&l=50&d=PG01&S1=20140612.PD.&OS=PD/20140612&RS=PD/20140612

Keywords for this news article include: Google Inc., Information Technology, Information and Data Storage.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Information Technology Newsweekly

