Voice Recognition API

I have a Java application, and I want to implement a voice recognition feature in it.

It should work like this:

Suppose the user has made some recordings and labeled them "hey", "one", "two", and "tea". When he or she says "hey", the voice recognition API should recognize it as the first of those recordings. The recordings "hey", "one", "two", and "tea" could just as well be sounds that are not words in English.

I have already looked at some APIs that support speech recognition or have an audio fingerprinting algorithm, but I don't want to use them.

Let me explain why I don't use these APIs. First of all, speech recognition APIs try to understand words and convert them to text. However, this is limited to the languages the API supports. Even if a speech recognition API supports English, it can give bad results because of the user's poor pronunciation. So I don't want to use a speech recognition API in my application, because the feature shouldn't be language-based.

Besides, when I looked for a voice recognition API, I found the "audio fingerprint" APIs. I used the "musicg" API, which is open source, and developed a test application with it. The application records 4 different audio files that contain non-word sounds. After that, I recorded a sound similar to one of them, and the test application compared it with the earlier audio files using the musicg API. However, the results were also really bad.

As I mentioned before, I need a voice recognition feature that works just like the one on old phones.


Check Kaldi (http://kaldi-asr.org/) or this TensorFlow tutorial: https://www.tensorflow.org/tutorials/audio_recognition

In both cases you train the model yourself, so it's not language-based. You can train a model for a specific voice or accent, or for a specific context.

Also, maybe this project will be interesting to you: https://github.com/cmusphinx/g2p-seq2seq It doesn't use a language model and produces phonemes. Note that it is a grapheme-to-phoneme tool, so its input is spelled text rather than audio.
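If training a model is more than you need, the behavior described in the question (compare a new utterance against a handful of user-recorded samples and pick the closest one) can be sketched as plain template matching. The Java sketch below uses dynamic time warping (DTW), a common technique for speaker-dependent matching; whether old phones worked exactly this way is an assumption on my part. The class `DtwMatcher` and its methods are hypothetical names, not part of Kaldi, TensorFlow, or musicg. It also assumes each recording has already been turned into a sequence of feature frames (for example MFCC vectors); that extraction step is not shown, and the one-dimensional frames in `main` are made-up toy values.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: matches an utterance against user-recorded templates with DTW. */
final class DtwMatcher {

    /** Euclidean distance between two feature frames of equal length. */
    static double frameDistance(double[] x, double[] y) {
        double sum = 0.0;
        for (int i = 0; i < x.length; i++) {
            double d = x[i] - y[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /** Dynamic time warping cost between two sequences of feature frames. */
    static double dtw(double[][] a, double[][] b) {
        int n = a.length;
        int m = b.length;
        double[][] cost = new double[n + 1][m + 1];
        for (double[] row : cost) {
            Arrays.fill(row, Double.POSITIVE_INFINITY);
        }
        cost[0][0] = 0.0;
        for (int i = 1; i <= n; i++) {
            for (int j = 1; j <= m; j++) {
                // Cheapest way to reach (i, j): diagonal step, or stretch one of the sequences.
                double best = Math.min(cost[i - 1][j - 1],
                        Math.min(cost[i - 1][j], cost[i][j - 1]));
                cost[i][j] = frameDistance(a[i - 1], b[j - 1]) + best;
            }
        }
        return cost[n][m];
    }

    /** Returns the label of the template with the lowest DTW cost, or null if there is none. */
    static String bestMatch(Map<String, double[][]> templates, double[][] utterance) {
        String bestLabel = null;
        double bestCost = Double.POSITIVE_INFINITY;
        for (Map.Entry<String, double[][]> entry : templates.entrySet()) {
            double c = dtw(entry.getValue(), utterance);
            if (c < bestCost) {
                bestCost = c;
                bestLabel = entry.getKey();
            }
        }
        return bestLabel;
    }

    public static void main(String[] args) {
        // Toy one-dimensional "features"; real input would be e.g. MFCC frames (assumption).
        Map<String, double[][]> templates = new LinkedHashMap<>();
        templates.put("hey", new double[][] {{0.0}, {1.0}, {2.0}, {1.0}});
        templates.put("tea", new double[][] {{5.0}, {5.0}, {4.0}, {3.0}});

        // Same contour as "hey", but spoken more slowly.
        double[][] spoken = {{0.0}, {0.0}, {1.0}, {1.0}, {2.0}, {2.0}, {1.0}};
        System.out.println(bestMatch(templates, spoken)); // prints "hey"
    }
}
```

Because DTW lets either sequence stretch in time, the slower utterance still lines up with the "hey" template. In a real application you would probably also reject a match whose cost is above some threshold, so that unrelated sounds are not forced onto the nearest recording.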
