iOS Speech Recognition or Speech To Text

I am testing Apple's own speech recognition API, based on the sample code Apple provides in the Scrumdinger app.

The link for the scrumdinger app is here:
https://github.com/ahmaddorra/Scrumdinger

Specifically, I based it on MeetingView.swift: https://github.com/ahmaddorra/Scrumdinger/blob/main/Scrumdinger/MeetingView.swift

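I built a simple test app just to try out speech recognition, but somehow I could not get it to work. After pressing the Start button, I speak to the phone, but

print(speechRecognizer.transcript)

does not print any text.

I would appreciate it if someone could look at my implementation and tell me what I missed.

In the code below, the transcript is not printed.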

The whole github link for the iOS test app is here: https://github.com/somaria/scrumtranscript

import SwiftUI
import AVFoundation
import MediaPlayer

struct ContentView: View {

    @State var textValue: String = "The transcription text"
    @StateObject var speechRecognizer = SpeechRecognizer()

    private var player: AVPlayer { AVPlayer.sharedDingPlayer }

    var body: some View {
        VStack {
            Image(systemName: "mic")
                .imageScale(.large)
                .foregroundColor(.accentColor)
            Text(textValue)
            HStack(spacing: 64) {
                Button {
                    print("starting")
                    self.textValue = "New Text"

                    player.seek(to: .zero)
                    player.play()

                    speechRecognizer.reset()
                    speechRecognizer.transcribe()
                } label: {
                    Text("Start")
                }
                Button {
                    print("Stop")

                    speechRecognizer.stopTranscribing()
                    print(speechRecognizer.transcript)
                } label: {
                    Text("Stopping")
                }
            }
        }
        .padding()
    }
}


struct ContentView_Previews: PreviewProvider {
    static var previews: some View {
        ContentView()
    }
}
Answer

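The following SwiftUI code (based on the above) works with iOS 17+. It uses Apple's speech recognition code (also included below). I updated that code to use the newer AVAudioApplication.requestRecordPermission(completionHandler:) function, which silences the deprecation warning. Include the SpeechRecognizer file in your project, then add the SpeechView code. I have only tested it on iPhone, but it should work on iPads as well.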

import SwiftUI
import MediaPlayer
import AVFoundation
import Speech

struct SpeechView: View {
    @State var textValue: String = "Press start to record speech..."
    @State var speechRecognizer = SpeechRecognizer()
    
    @State private var spokenText: String = ""
    @State private var isRecording: Bool = false
        
    var body: some View {
        VStack(spacing: 50) {
            Image(systemName: isRecording ? "mic.fill" : "mic")
                .imageScale(.large)
                .scaleEffect(2.0)
                .foregroundColor(isRecording ? .accentColor : .primary)
            Text(isRecording ? textValue : "Press start to record speech...")
            if !isRecording {
                Button {
                    print("starting speech recognition")
                    self.textValue = ""
                    self.spokenText = ""
                    isRecording = true
                    speechRecognizer.record(to: $textValue)
                } label: {
                    Text("Start")
                }
            } else {
                Button {
                    print("stopping speech recognition")
                    isRecording = false
                    spokenText = textValue
                    speechRecognizer.stopRecording()
                    textValue = "Press start to record speech..."
                    print(spokenText)
                } label: {
                    Text("Stop")
                }
            }
            if !isRecording {
                Text(spokenText)
            }
        }
        .padding()
    }
}

#Preview {
    SpeechView()
}

//
//  SpeechRecognizer.swift
//  Scrumdinger
//
//  Created by Ahmad Dorra on 2/11/21.
//

import AVFoundation
import Foundation
import Speech
import SwiftUI

/// A helper for transcribing speech to text using AVAudioEngine.
struct SpeechRecognizer {
    private class SpeechAssist {
        var audioEngine: AVAudioEngine?
        var recognitionRequest: SFSpeechAudioBufferRecognitionRequest?
        var recognitionTask: SFSpeechRecognitionTask?
        let speechRecognizer = SFSpeechRecognizer()

        deinit {
            reset()
        }

        func reset() {
            recognitionTask?.cancel()
            audioEngine?.stop()
            audioEngine = nil
            recognitionRequest = nil
            recognitionTask = nil
        }
    }

    private let assistant = SpeechAssist()

    /**
        Begin transcribing audio.
     
        Creates a `SFSpeechRecognitionTask` that transcribes speech to text until you call `stopRecording()`.
        The resulting transcription is continuously written to the provided text binding.
     
        -  Parameters:
            - speech: A binding to a string where the transcription is written.
     */
    func record(to speech: Binding<String>) {
        relay(speech, message: "Requesting access")
        canAccess { authorized in
            guard authorized else {
                relay(speech, message: "Access denied")
                return
            }

            relay(speech, message: "Access granted")

            assistant.audioEngine = AVAudioEngine()
            guard let audioEngine = assistant.audioEngine else {
                fatalError("Unable to create audio engine")
            }
            assistant.recognitionRequest = SFSpeechAudioBufferRecognitionRequest()
            guard let recognitionRequest = assistant.recognitionRequest else {
                fatalError("Unable to create request")
            }
            recognitionRequest.shouldReportPartialResults = true

            do {
                relay(speech, message: "Booting audio subsystem")

                let audioSession = AVAudioSession.sharedInstance()
                try audioSession.setCategory(.record, mode: .measurement, options: .duckOthers)
                try audioSession.setActive(true, options: .notifyOthersOnDeactivation)
                let inputNode = audioEngine.inputNode
                relay(speech, message: "Found input node")

                let recordingFormat = inputNode.outputFormat(forBus: 0)
                inputNode.installTap(onBus: 0, bufferSize: 1024, format: recordingFormat) { (buffer: AVAudioPCMBuffer, when: AVAudioTime) in
                    recognitionRequest.append(buffer)
                }
                relay(speech, message: "Preparing audio engine")
                audioEngine.prepare()
                try audioEngine.start()
                assistant.recognitionTask = assistant.speechRecognizer?.recognitionTask(with: recognitionRequest) { (result, error) in
                    var isFinal = false
                    if let result = result {
                        relay(speech, message: result.bestTranscription.formattedString)
                        isFinal = result.isFinal
                    }

                    if error != nil || isFinal {
                        audioEngine.stop()
                        inputNode.removeTap(onBus: 0)
                        self.assistant.recognitionRequest = nil
                    }
                }
            } catch {
                print("Error transcribing audio: " + error.localizedDescription)
                assistant.reset()
            }
        }
    }
    
    /// Stop transcribing audio.
    func stopRecording() {
        assistant.reset()
    }
    
    private func canAccess(withHandler handler: @escaping (Bool) -> Void) {
        SFSpeechRecognizer.requestAuthorization { status in
            if status == .authorized {
                AVAudioApplication.requestRecordPermission { authorized in
                    handler(authorized)
                }
            } else {
                handler(false)
            }
        }
    }
    
    private func relay(_ binding: Binding<String>, message: String) {
        DispatchQueue.main.async {
            binding.wrappedValue = message
        }
    }
}
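
The permission check in canAccess(withHandler:) nests two callbacks. As a minimal sketch, assuming you would rather use async/await, the same two requests can be wrapped in checked continuations. The function name hasSpeechAccess() is hypothetical and not part of the original SpeechRecognizer.swift; the framework calls are the same ones used in the code above.

```swift
import AVFoundation
import Speech

/// Hypothetical helper, not part of the original SpeechRecognizer.swift.
/// Returns true only when both speech recognition and microphone access are granted.
@available(iOS 17.0, *)
func hasSpeechAccess() async -> Bool {
    // Ask for speech-recognition authorization first.
    let speechAuthorized = await withCheckedContinuation { (continuation: CheckedContinuation<Bool, Never>) in
        SFSpeechRecognizer.requestAuthorization { status in
            continuation.resume(returning: status == .authorized)
        }
    }
    guard speechAuthorized else { return false }

    // Then ask for microphone access, using the same call as canAccess(withHandler:).
    return await withCheckedContinuation { (continuation: CheckedContinuation<Bool, Never>) in
        AVAudioApplication.requestRecordPermission { granted in
            continuation.resume(returning: granted)
        }
    }
}
```

A caller could then write `guard await hasSpeechAccess() else { return }` inside a Task before starting the audio engine. Both prompts rely on the NSSpeechRecognitionUsageDescription and NSMicrophoneUsageDescription entries being present in the app's Info.plist.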

HTH




