Is it possible to do speech-to-text (speech recognition) and afterwards text-to-speech (using the same text and the same voice)?

I'm working on a program for tone-deaf people. I've been working with SAPI and a TTS voice. The program plays a 3D animation of a hand at the same time. The problem is that the voices (even when I set them to their slowest rate) are too fast for what I want. So I thought of using speech recognition, but the problem is that I have to do a lot of processing on the text before the animation starts.

So I want to know whether it would be possible to run speech recognition (on my own voice in a .wav file) and afterwards do the same processing as with TTS (with SAPI events...), but using the .wav file with my voice.

If it's possible, please tell me how. If you think there are better alternatives, let me know.

Thanks for your time (and excuse my English)

Jesuskiewicz

Answer

Now that I understand what you want to happen, I can say that, as far as I know, the SAPI SR engine doesn't really provide phoneme-level markup that's synchronized to the incoming text.

What you could try (although I have no real expectation for this to work) would be to take the audio, run it through a pronunciation grammar to generate phonemes, and then take the text elements to find the corresponding bits of audio.

When I say "pronunciation grammar", I mean a dictation grammar with the pronunciation model loaded. Set it up like this:

CComPtr<ISpRecoGrammar> cpGrammar;
// ... initialize SR engine and create a grammar ...
// Load the dictation topic that uses the pronunciation model.
HRESULT hr = cpGrammar->LoadDictation(L"Pronunciation", SPLO_STATIC);

In your recognition handler, you would need to parse out the elements:

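ISpRecoResult* ipReco;   // the result delivered with the recognition event
SPPHRASE* pPhrase = NULL;
if (SUCCEEDED(ipReco->GetPhrase(&pPhrase)))
{
    // ULONG matches the type of ulCountOfElements (avoids a signed/unsigned comparison).
    for (ULONG i = 0; i < pPhrase->Rule.ulCountOfElements; ++i)
    {
        const SPPHRASEELEMENT* pElem = pPhrase->pElements + i;
        // examine pElem->ulAudioTimeOffset, pElem->ulAudioSizeTime, etc.
    }
    ::CoTaskMemFree(pPhrase);   // the phrase is returned as one CoTaskMemAlloc'd block
}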
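
Once you have the elements, the timing fields can be turned into cues for the animation. The following is only a minimal sketch of that arithmetic, not tested against a real engine: ElementTiming, Cue, and BuildCues are hypothetical names of my own, and the struct simply mirrors pszDisplayText, ulAudioTimeOffset, and ulAudioSizeTime from SPPHRASEELEMENT so the code compiles without the SAPI headers. It relies on SAPI reporting those two time fields in 100-nanosecond units, relative to the start of the phrase.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical plain mirror of the timing fields of SPPHRASEELEMENT,
// so the arithmetic can be shown without the SAPI headers.
struct ElementTiming {
    std::wstring text;          // copy of pElem->pszDisplayText
    std::uint32_t offset100ns;  // copy of pElem->ulAudioTimeOffset
    std::uint32_t length100ns;  // copy of pElem->ulAudioSizeTime
};

// One animation cue: when a piece of text starts and stops,
// in milliseconds from the start of the recognized phrase.
struct Cue {
    std::wstring text;
    double startMs;
    double endMs;
};

// 100-nanosecond units: 10,000 of them make one millisecond.
constexpr double kUnitsPerMs = 10000.0;

// Convert each element's offset and length into a start/end time in milliseconds.
std::vector<Cue> BuildCues(const std::vector<ElementTiming>& elements)
{
    std::vector<Cue> cues;
    cues.reserve(elements.size());
    for (const ElementTiming& e : elements) {
        const double start = e.offset100ns / kUnitsPerMs;
        // Widen to double before adding so a long recording cannot overflow 32 bits.
        const double end =
            (static_cast<double>(e.offset100ns) + e.length100ns) / kUnitsPerMs;
        cues.push_back({e.text, start, end});
    }
    return cues;
}
```

In the recognition handler you would fill one ElementTiming per pElem inside the loop and pass the vector to BuildCues. Because the offsets are relative to the phrase, a .wav file that produces several recognition results would also need each phrase's own start position added to place the cues on the file's timeline.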

I hope this is enough to get you started...




