Speech Recognition - Converging and Countering Diverse Platforms

Dec 1, 2011, 08:15

Sharad Gaikwad

We live in an era of continuous and rapid innovation. The first decade of the 21st century brought a number of discoveries, mistakes, and advances that have influenced people extensively.

In some cases these advances changed deep-seated human beliefs; in others, they opened up possibilities that people had thought impossible only years earlier. Technology has changed the face of many industry verticals and has diversified the processes and ways in which different tasks are accomplished.

Speech Recognition – Introduction:

Speech recognition is the ability of machines to process spoken commands. It enables "hands-free" control of various electronic devices, which benefits many disabled people, and it allows the automatic creation of "print-ready" dictation. Among the initial applications of speech recognition were automated telephone systems and medical dictation software (transcription).

Speech recognition, also known as automatic speech recognition, computer speech recognition, speech to text, or just STT, converts spoken words to text. The term "voice recognition" is sometimes used to refer to recognition systems that must be trained to a particular speaker, as is the case for most desktop recognition software. Recognizing the speaker can simplify the task of translating speech. Speech recognition is the broader term: it refers to technology that can recognize speech without being targeted at a single speaker, such as a call system that can recognize arbitrary voices.

In the late 1930s AT&T's Bell Labs produced the first electronic speech synthesizer, called the Voder. The machine was demonstrated at the 1939 World's Fair by trained operators who used a keyboard and foot pedals to play it and emit speech. Work of this kind was an early step toward the later efforts to systematize documentation and ease operational hindrances in the medical and healthcare arena.

Amalgamating Speech recognition in daily lives (Cellphones and devices):

Today, cell phones, PDAs, and other high-powered computing devices are common possessions. These devices were introduced with the simple objective of meeting the basic human need to communicate, and they double as a way of keeping oneself informed. As technology has advanced, cell phones and other devices have followed. With the availability of 3G- and 4G-enabled mobile devices with fast Internet connections and fast processors, mobile devices and cell phones have been combined with speech recognition technology.

Speech recognition applications on mobile devices include voice user interfaces such as voice dialing (e.g., "Call home"), call routing (e.g., "I would like to make a collect call"), and search (e.g., finding a podcast in which particular words were spoken). Today's speech recognition apps for smartphones can also learn from their errors. If an app gets a word wrong, the user can correct the mistake with the device's keyboard, and the correction is recorded on the server so that the error is less likely to recur.

A large number of devices are being rolled out with speech recognition pre-installed, and this trend is already well under way. Apple's iPhone 4S, for example, includes native speech recognition capabilities that allow users to voice-dial contacts in their address books. Along with this, Apple has introduced an interpretive speech recognition app called Siri. Siri on the iPhone 4S lets users use their voice to send messages, schedule meetings, place phone calls, dictate emails, play music, and much more. You ask Siri to do things just by talking the way you normally talk; Siri understands what you say, works out what you mean, and even talks back. Siri is so easy to use and does so much that you keep finding more ways to use it.

All these applications of speech recognition technology are a preview of a future in which mobile devices no longer require people to enter queries manually; instead, users will simply talk to a device and get the task done quickly.
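The correction feedback described above can be pictured with a small sketch. It is an illustration under stated assumptions, not how any particular vendor's server works: the `CorrectionStore` class, its `min_votes` threshold, and the word-level substitution are all hypothetical, and real services adapt their acoustic and language models rather than keeping a simple lookup table.

```python
from collections import Counter, defaultdict


class CorrectionStore:
    """Hypothetical server-side record of user corrections (illustration only)."""

    def __init__(self, min_votes=2):
        # A correction is applied only after it has been reported this many times.
        self.min_votes = min_votes
        self._votes = defaultdict(Counter)

    def record(self, recognized, corrected):
        """Store one user correction: the word the app produced and the fix."""
        if recognized != corrected:
            self._votes[recognized.lower()][corrected] += 1

    def apply(self, words):
        """Replace words that users have corrected often enough."""
        result = []
        for word in words:
            votes = self._votes.get(word.lower())
            if votes:
                best, count = votes.most_common(1)[0]
                if count >= self.min_votes:
                    result.append(best)
                    continue
            result.append(word)
        return result


store = CorrectionStore()
store.record("serie", "Siri")
store.record("serie", "Siri")
print(store.apply(["ask", "serie"]))
```

Requiring more than one report before a correction takes effect is a design choice in this sketch: it keeps a single user's typo from changing the output for everyone.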

Amalgamating Speech recognition in daily lives (Desktop Computers):

Computers have influenced our lives in such a way that it is very difficult to manage without one. Today's computers are made to meet all of the user's needs and to provide ease of use, which is the main purpose of the gadget.

There are two primary implementations of the technology on PCs: voice command, by which you use spoken words to control the computer, and voice dictation, which transcribes the words you speak into a text document. Voice recognition programs on desktops allow you to "train" them by reading many passages of text (the contents of which are known to the computer) so that the system can learn how you pronounce various words. It can be a monotonous process, but it can greatly improve the accuracy of the transcription.

Voice command tends to work better than dictation. That's because the vocabulary that the computer needs to learn is much smaller. You use standard commands such as "Open file" or "Click Start." With a little practice, you can control the computer's basic functions using your voice. Windows Speech Recognition is a feature introduced in Windows Vista and included in Windows 7, built using Microsoft's latest speech recognition technologies. Windows Speech Recognition provides excellent recognition accuracy that improves with each use as it adapts to your speaking style and vocabulary.

With Windows Speech Recognition, you are empowered right from the start: a guided setup and an interactive training experience familiarize you with key concepts and commands. Windows Speech Recognition also features an innovative, natural user interface that efficiently assists you in controlling your computer by voice. Whether you are starting an application, selecting a word, or correcting a sentence, you are always in control, and Windows Speech Recognition smoothly guides you to completing the task at hand. It boasts various features such as commanding, correction, dictation, disambiguation, an innovative user interface, an interactive tutorial, personalization (adaptation), and support for multiple languages. Speech recognition technology has changed the way people use computers today; it has greatly improved the user experience and eased many complex typing tasks.
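The point about vocabulary size can be illustrated with a small sketch. It assumes the recognizer has already produced some rough text and only needs to choose among a handful of known commands; the `COMMANDS` list, the `match_command` helper, and the `cutoff` value are hypothetical, and a real recognizer compares audio rather than strings.

```python
import difflib

# Hypothetical command vocabulary; a real system defines its own grammar.
COMMANDS = ["open file", "click start", "close window", "scroll down"]


def match_command(heard, cutoff=0.6):
    """Return the known command closest to the heard text, or None."""
    # Normalize case and whitespace before comparing.
    normalized = " ".join(heard.lower().split())
    matches = difflib.get_close_matches(normalized, COMMANDS, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(match_command("Open file"))
print(match_command("opn file"))
print(match_command("what is the weather today"))
```

With only a few candidates, even a slightly garbled input still lands on the intended command, and anything too far from the list is rejected. Dictation has no such short list to fall back on, which is one way to see why it is the harder task.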

At some point in the future, speech recognition may become speech understanding. The statistical models that allow computers to decide what a person just said may someday allow them to grasp the meaning behind the words. Although this would be a huge leap in terms of computational power and software sophistication, some researchers argue that speech recognition development offers the most direct line from the computers of today to true artificial intelligence. We can talk to our computers today. In 25 years, they may very well talk back.
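As a rough illustration of what "statistical models that decide what a person just said" can mean, the sketch below scores candidate transcriptions with a toy bigram model and keeps the most probable one. Everything in it is an assumption made for the example: the tiny `CORPUS`, the add-one smoothing, and the `log_score` and `pick_transcript` helpers. Real recognizers combine far larger language models with acoustic models.

```python
import math
from collections import Counter

# Tiny made-up training text; a real model would use a far larger corpus.
CORPUS = "call home now . call the office . open the file . open file now".split()

UNIGRAMS = Counter(CORPUS)
BIGRAMS = Counter(zip(CORPUS, CORPUS[1:]))
VOCAB_SIZE = len(UNIGRAMS)


def log_score(sentence):
    """Add-one smoothed bigram log-probability of a word sequence."""
    words = sentence.lower().split()
    total = 0.0
    for prev, word in zip(words, words[1:]):
        numerator = BIGRAMS[(prev, word)] + 1
        denominator = UNIGRAMS[prev] + VOCAB_SIZE
        total += math.log(numerator / denominator)
    return total


def pick_transcript(candidates):
    """Choose the candidate the model finds most likely.

    Scores are only comparable between candidates of the same length.
    """
    return max(candidates, key=log_score)


print(pick_transcript(["open the file", "open tea file"]))
```

The model prefers word sequences it has seen before, which is how it can pick between transcriptions that sound alike. It has no notion of what the words mean, which is the gap between recognition and understanding that the paragraph above describes.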
