How to go about making an untrained speech to text converter?

I have had severe-to-profound deafness since a very early age, but luckily I can speak normally. Verbal communication has always been difficult for me because my speech recognition is impaired, even with lip-reading. I got through school and college just by reading boards, PowerPoint slides, books and the internet. I am doing fine at my current software engineering job, but lately I feel that I should put some effort into making my situation better.

Subtitles are my lifesaver in this country for understanding movies and shows on TV, and I have only been able to enjoy them for the last 7 years (I am 31 now).

I strongly feel the need to see subtitles in real life whenever I talk to someone, even strangers. I want to develop an untrained speech to text converter. As a start it does not even have to spell out exact words for me; cues on syllables or phonetics would be fine.

I have googled this for a while, but most results are either text to speech or half-baked attempts at speech recognition for giving voice commands to a computer. I would really like some pointers on how to start this project. Specifically, I need steps such as how to deal with audio files and what kind of processing I have to do to get approximate phonetics as fast as possible.

Answers

You might want to look at CMU's Sphinx project, which does speech to text in real time. They have some demos to try it out.

Have a look at the DSP guide. It's more about low-level stuff, but techniques like Fourier transforms and filtering are of great importance to audio processing. Even if you don't start from scratch, it can be good to appreciate the principles and applications.
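To give a feel for what the Fourier transform does with audio, here is a minimal sketch in plain Python (standard library only). It is not taken from the DSP guide or from Sphinx; the function names, the 8000 Hz sample rate and the 256-sample frame size are all assumptions chosen for illustration. It windows one short frame of samples, computes its magnitude spectrum with a naive DFT, and reports the strongest frequency.

```python
import cmath
import math


def hann(n):
    """Hann window of length n; tapers frame edges to reduce spectral leakage."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT; returns magnitudes for bins 0 .. n/2."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def dominant_frequency(frame, sample_rate):
    """Frequency in Hz of the strongest non-DC bin in one frame."""
    window = hann(len(frame))
    mags = magnitude_spectrum([x * w for x, w in zip(frame, window)])
    peak_bin = max(range(1, len(mags)), key=mags.__getitem__)  # skip bin 0 (DC)
    return peak_bin * sample_rate / len(frame)


# Synthetic test tone (assumed values): 1000 Hz sine sampled at 8000 Hz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(256)]
print(dominant_frequency(frame, sample_rate))  # 1000.0 for this synthetic tone
```

The naive loop is only meant to show the principle. For anything close to real time you would swap it for an FFT routine from a numerical library, and you would slide the frame across the incoming audio to get a spectrum per time step.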

That said, I bet that, starting from scratch, one could create something that can tell apart a basic set of sounds with a few days' work...
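As a very rough illustration of telling a few sound classes apart, the sketch below labels a frame as silence, "voiced" (vowel-like, periodic) or "noisy" (hiss-like, as in "s" or "sh") using two classic features: RMS energy and zero-crossing rate. This is a swapped-in toy technique, not what Sphinx does, and the thresholds `silence_rms` and `zcr_threshold` are hypothetical values that would need tuning on real recordings.

```python
import math
import random


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)


def rms(frame):
    """Root-mean-square amplitude of the frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def classify_frame(frame, silence_rms=0.01, zcr_threshold=0.25):
    """Crude three-way label; both thresholds are assumptions, not tuned values."""
    if rms(frame) < silence_rms:
        return "silence"
    return "noisy" if zero_crossing_rate(frame) > zcr_threshold else "voiced"


# Synthetic stand-ins for real audio (assumed 8000 Hz sample rate).
sample_rate = 8000
vowel_like = [math.sin(2 * math.pi * 150 * t / sample_rate) for t in range(256)]
rng = random.Random(0)
hiss_like = [rng.uniform(-0.5, 0.5) for _ in range(256)]
quiet = [0.0] * 256

for name, frame in [("vowel_like", vowel_like),
                    ("hiss_like", hiss_like),
                    ("quiet", quiet)]:
    print(name, classify_frame(frame))
```

A low-pitched periodic signal crosses zero only a few times per frame, while noise crosses roughly every other sample, which is why the zero-crossing rate separates the two. Distinguishing individual vowels or consonants would need spectral features on top of this.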

Here are some other questions that might give you ideas:

And take a look at SIL Linguistics Computing.

Good luck.




