Speech Recognition Technology Principles: How Speech Recognition Works

A speech recognition system converts spoken language into corresponding commands and passes them to an intelligent robot, which then carries out the corresponding actions. It plays a crucial role in artificial intelligence. So how does a speech recognition system work? Here is a brief introduction.

A speech recognition system is essentially a pattern recognition system built from three basic units: feature extraction, pattern matching, and a reference pattern library. A microphone converts the unknown speech into an electrical signal, which is fed to the input of the recognition system. After preprocessing, a speech model is built according to the characteristics of human speech, the input signal is analyzed, and the required features are extracted; on this basis, the templates needed for speech recognition are created. During recognition, the computer compares the stored speech templates with the features of the input speech signal according to the speech recognition model and, following a given search and matching strategy, finds the series of templates that best match the input speech. Then, according to the definitions of these templates, the computer's recognition result is obtained by table lookup. Clearly, the quality of this result depends directly on the choice of features, the quality of the speech model, and the accuracy of the templates.
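The matching step described above can be sketched in a few lines of Python. The article does not name a specific search and matching strategy, so this example assumes dynamic time warping (DTW), one classic choice for template-based recognition. The toy one-dimensional features, the `TEMPLATES` library, and the `COMMANDS` lookup table are all hypothetical and exist only for illustration.

```python
from math import inf

def dtw_distance(a, b):
    """Dynamic time warping distance between two sequences of feature vectors."""
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # Euclidean distance between the two feature vectors.
            d = sum((x - y) ** 2 for x, y in zip(a[i - 1], b[j - 1])) ** 0.5
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(features, template_library):
    """Return the label of the stored template closest to the input features."""
    return min(template_library,
               key=lambda label: dtw_distance(features, template_library[label]))

# Hypothetical reference pattern library (assumption: toy 1-D features).
TEMPLATES = {
    "forward": [[1.0], [2.0], [3.0], [2.0]],
    "stop": [[3.0], [3.0], [1.0], [0.0]],
}
# Hypothetical lookup table from template label to robot command.
COMMANDS = {"forward": "MOVE_FORWARD", "stop": "HALT"}

# The same word as the "forward" template, spoken more slowly.
utterance = [[1.0], [1.0], [2.0], [3.0], [3.0], [2.0]]
label = recognize(utterance, TEMPLATES)
print(label, COMMANDS[label])
```

DTW is used here because it tolerates differences in speaking rate: the slower utterance still aligns with the shorter "forward" template. A real system would use multi-dimensional acoustic features rather than single numbers.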


Building a speech recognition system involves two parts overall: training and recognition. Training is usually done offline: signal processing and knowledge mining are applied to large, pre-collected speech and language databases to obtain the "acoustic model" and "language model" that the system requires. Recognition is usually done online, automatically recognizing the user's speech in real time. The recognition process can usually be divided into two modules, a "front end" and a "back end". The main role of the front end is endpoint detection (removing excess silence and non-speech sounds), noise reduction, feature extraction, and so on. The role of the back end is to use the trained acoustic model and language model to perform statistical pattern recognition (also called "decoding") on the feature vectors of the user's speech and obtain the text they contain. In addition, the back end includes an "adaptive" feedback module that can learn from the user's speech and make the necessary corrections to the acoustic model and the language model, further improving recognition accuracy.
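As a rough illustration of the two modules, the sketch below pairs a very simple front end step (energy-based endpoint detection) with a very simple back end step (choosing the hypothesis with the best combined acoustic and language model score). The energy threshold, the frame length, the `lm_weight` parameter, and the probability values are all assumptions made for this example; the article does not specify how either step is implemented, and real decoders search over far larger hypothesis spaces.

```python
import math

def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def detect_endpoints(samples, frame_len=4, threshold=0.01):
    """Front end: return (start, end) sample indices of the speech region.

    Returns None when every frame falls below the energy threshold.
    """
    energies = frame_energies(samples, frame_len)
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len

def decode(acoustic_logp, language_logp, lm_weight=1.0):
    """Back end: pick the hypothesis W maximizing log P(X|W) + lm_weight * log P(W)."""
    return max(acoustic_logp,
               key=lambda w: acoustic_logp[w] + lm_weight * language_logp[w])

# Toy signal: silence, a short burst of "speech", then silence again.
signal = [0.0] * 8 + [0.5, -0.4, 0.6, -0.5, 0.3, -0.6, 0.4, -0.3] + [0.0] * 8
print(detect_endpoints(signal))  # (8, 16)

# Hypothetical scores: the acoustic model slightly prefers the wrong phrase,
# but the language model considers it much less likely.
acoustic = {"recognize speech": math.log(0.4), "wreck a nice beach": math.log(0.6)}
language = {"recognize speech": math.log(0.09), "wreck a nice beach": math.log(0.01)}
print(decode(acoustic, language))  # recognize speech
```

The second part shows why the back end combines both models: the acoustic evidence alone would pick the wrong phrase, and the language model corrects it.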

Speech recognition is a branch of pattern recognition and also belongs to the field of signal processing. It is closely related to phonetics, linguistics, mathematical statistics, and neurobiology. The purpose of speech recognition is to let machines "understand" human spoken language. This has two meanings: the first is to convert spoken language, word by word and sentence by sentence, into written language; the second is to understand the requests or queries contained in spoken language and respond correctly, without insisting on the correct conversion of every word.

Automatic speech recognition technology rests on three basic principles. First, the linguistic information in a speech signal is encoded in the way the short-time amplitude spectrum changes over time. Second, speech is readable: its acoustic signal can be represented by a few dozen distinctive, discrete symbols, regardless of the information the speaker is trying to convey. Third, speech interaction is a cognitive process and cannot be separated from the grammatical, semantic, and pragmatic structure of the language.
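To make the first principle concrete, the following sketch computes a short-time amplitude spectrum in plain Python: the signal is cut into overlapping frames, each frame is multiplied by a Hamming window, and the magnitude of its discrete Fourier transform is taken. The frame length, hop size, window choice, and synthetic test tone are all assumptions chosen for readability, not values taken from the article.

```python
import cmath
import math

def short_time_amplitude_spectrum(samples, frame_len=16, hop=8):
    """Return one amplitude spectrum (bins 0..frame_len/2) per overlapping frame."""
    # Hamming window to reduce spectral leakage at the frame edges.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1))
              for n in range(frame_len)]
    spectra = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_len)]
        # Direct DFT; fine for a small sketch, an FFT would be used in practice.
        spectra.append([
            abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len)))
            for k in range(frame_len // 2 + 1)
        ])
    return spectra

# Synthetic signal whose frequency content changes halfway through:
# DFT bin 2 in the first half, DFT bin 5 in the second half.
tone = ([math.sin(2 * math.pi * 2 * n / 16) for n in range(32)]
        + [math.sin(2 * math.pi * 5 * n / 16) for n in range(32)])

spectra = short_time_amplitude_spectrum(tone)
# Index of the strongest bin in each frame: the "temporal change pattern".
peaks = [max(range(len(s)), key=s.__getitem__) for s in spectra]
print(peaks)
```

The sequence of peak bins shifts from 2 to 5 as the frames move across the signal, which is a small-scale version of the time-varying spectral pattern that a recognizer extracts features from.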

The above is the principle behind speech recognition technology. Speech recognition technology in China has made great progress: recognition accuracy has gradually improved, and even some less common dialects can now be recognized. The speech recognition we use most often is probably Siri or a smart speaker. Because smart home systems are not yet widespread in China, there is still much room for growth in applications of this technology.
