Study Notes

Flutter speech-to-text

yeji Kim

Python model → Flutter app

To use a deep learning model trained in Python within a Flutter app, you can follow these general steps:
1. Train and save the model in Python: train your speech-to-text model in Python and save it in a format that can be reloaded later (e.g., a TensorFlow SavedModel or a PyTorch state dict).
2. Set up a backend server: use a framework such as Flask, Django, or FastAPI. The server loads the trained model and exposes an API endpoint that receives audio files from the Flutter app, runs the speech-to-text model, and returns the transcription.
3. Build the Flutter app: record audio, send the audio file to the backend server, and display the transcription results.
Here is a step-by-step guide:

Step 1: Train and Save the Model in Python

Assuming you have already trained your model, save it to a file. For example, using TensorFlow:
```python
import tensorflow as tf

# Assuming `model` is your trained model
model.save('path/to/saved_model')
```
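If the model was trained with PyTorch instead (also mentioned in the overview), the equivalent save/reload step is saving the state dict and loading it back into the same architecture. A minimal sketch, where the tiny `nn.Linear` is a stand-in for your real network:

```python
import torch
import torch.nn as nn

# Hypothetical placeholder model; substitute your trained network.
model = nn.Linear(16000, 10)

# Save only the weights (the recommended PyTorch approach).
torch.save(model.state_dict(), 'model_weights.pt')

# To serve the model later, rebuild the architecture and load the weights.
reloaded = nn.Linear(16000, 10)
reloaded.load_state_dict(torch.load('model_weights.pt'))
reloaded.eval()  # switch to inference mode for serving
```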

Step 2: Set Up the Backend Server

Set up a Flask server as an example:
```shell
pip install flask tensorflow librosa numpy
```
Create a server.py file:
```python
from flask import Flask, request, jsonify
import tensorflow as tf
import numpy as np
import librosa

app = Flask(__name__)

# Load the saved model
model = tf.keras.models.load_model('path/to/saved_model')

def preprocess_audio(file_path):
    # Load the audio file, resampled to 16 kHz mono
    audio, sr = librosa.load(file_path, sr=16000)
    # Assume the model expects input shape (None, 16000);
    # note that recent librosa requires `size` as a keyword argument
    audio = librosa.util.fix_length(audio, size=16000)
    audio = np.expand_dims(audio, axis=0)
    return audio

def decode_prediction(prediction):
    # Decode the model's output to text; this depends on how
    # the model was trained (e.g., CTC decoding)
    return "decoded text"

@app.route('/predict', methods=['POST'])
def predict():
    if 'file' not in request.files:
        return jsonify({"error": "No file provided"}), 400
    file = request.files['file']
    file_path = 'temp.wav'
    file.save(file_path)
    audio = preprocess_audio(file_path)
    prediction = model.predict(audio)
    transcription = decode_prediction(prediction)
    return jsonify({"transcription": transcription})

if __name__ == '__main__':
    app.run(debug=True)
```
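The `decode_prediction` stub depends entirely on how the model was trained. For a CTC-trained model, a common baseline is greedy decoding: take the argmax per frame, collapse repeated symbols, and drop blanks. A minimal sketch, where `CHARSET` is a made-up placeholder vocabulary you would replace with your model's:

```python
import numpy as np

# Hypothetical character set; index 0 is the CTC "blank" token.
CHARSET = ['', 'a', 'b', 'c', ' ']

def greedy_ctc_decode(logits):
    """Greedy CTC decoding: argmax per frame, collapse repeats, drop blanks."""
    best = np.argmax(logits, axis=-1)  # best class index per time frame
    collapsed = [k for i, k in enumerate(best) if i == 0 or k != best[i - 1]]
    return ''.join(CHARSET[k] for k in collapsed if k != 0)

# Toy logits: 6 frames, 5 classes; frame-wise argmax is [a, a, blank, b, b, c]
logits = np.zeros((6, 5))
for t, k in enumerate([1, 1, 0, 2, 2, 3]):
    logits[t, k] = 1.0
print(greedy_ctc_decode(logits))  # -> "abc"
```

Real systems usually improve on this with beam search and a language model, but greedy decoding is enough to verify the end-to-end pipeline.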

Step 3: Develop the Flutter App

Add dependencies to pubspec.yaml:
```yaml
dependencies:
  flutter:
    sdk: flutter
  http: ^0.13.3
  flutter_sound: ^9.2.13
  path_provider: ^2.0.1
```
Create the Flutter UI to record audio and send it to the backend server:
```dart
import 'dart:io';

import 'package:flutter/material.dart';
import 'package:flutter_sound/flutter_sound.dart';
import 'package:http/http.dart' as http;
import 'package:path_provider/path_provider.dart';

void main() {
  runApp(MyApp());
}

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(home: SpeechToTextScreen());
  }
}

class SpeechToTextScreen extends StatefulWidget {
  @override
  _SpeechToTextScreenState createState() => _SpeechToTextScreenState();
}

class _SpeechToTextScreenState extends State<SpeechToTextScreen> {
  final FlutterSoundRecorder _recorder = FlutterSoundRecorder();
  bool _isRecording = false;
  String _transcription = "";

  @override
  void initState() {
    super.initState();
    _recorder.openRecorder();
  }

  @override
  void dispose() {
    _recorder.closeRecorder();
    super.dispose();
  }

  Future<String> _tempFilePath() async {
    Directory tempDir = await getTemporaryDirectory();
    return '${tempDir.path}/temp.wav';
  }

  Future<void> _recordAudio() async {
    // Record 16 kHz mono WAV to match the server's preprocessing
    await _recorder.startRecorder(
      toFile: await _tempFilePath(),
      codec: Codec.pcm16WAV,
      sampleRate: 16000,
      numChannels: 1,
    );
    setState(() {
      _isRecording = true;
    });
  }

  Future<void> _stopRecording() async {
    await _recorder.stopRecorder();
    setState(() {
      _isRecording = false;
    });
    // Send the recorded file to the backend server
    File audioFile = File(await _tempFilePath());
    await _sendToServer(audioFile);
  }

  Future<void> _sendToServer(File audioFile) async {
    var request = http.MultipartRequest(
      'POST',
      Uri.parse('http://your_server_ip:your_server_port/predict'),
    );
    request.files.add(await http.MultipartFile.fromPath('file', audioFile.path));
    var response = await request.send();
    if (response.statusCode == 200) {
      var responseData = await response.stream.bytesToString();
      setState(() {
        // Update this according to your server's response format
        _transcription = responseData;
      });
    } else {
      setState(() {
        _transcription = "Error: ${response.statusCode}";
      });
    }
  }

  @override
  Widget build(BuildContext context) {
    return Scaffold(
      appBar: AppBar(title: Text("Speech to Text")),
      body: Center(
        child: Column(
          mainAxisAlignment: MainAxisAlignment.center,
          children: [
            _isRecording
                ? ElevatedButton(
                    onPressed: _stopRecording,
                    child: Text("Stop Recording"),
                  )
                : ElevatedButton(
                    onPressed: _recordAudio,
                    child: Text("Start Recording"),
                  ),
            SizedBox(height: 20),
            Text(_transcription),
          ],
        ),
      ),
    );
  }
}
```

Explanation

1. Python server: a Flask server loads the trained model and provides an API to handle audio file uploads; it preprocesses the audio, runs the model, and returns the transcription.
2. Flutter app: the app records audio to a temporary file, sends the file to the backend server, and displays the transcription returned by the server.

Note

Ensure the Flutter app can reach the backend server: both devices on the same network, or the server publicly accessible. On a local network, use the server machine's LAN IP address rather than localhost.
Use a Flutter package that actually supports recording, such as flutter_sound or record (audioplayers is playback-only), and request microphone permission (RECORD_AUDIO on Android, NSMicrophoneUsageDescription on iOS).
Handle errors and edge cases (missing files, unsupported formats, network failures) appropriately in both the backend server and the Flutter app.
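As one concrete example of server-side error handling, the endpoint can validate uploads before spending time on inference. This is a standalone sketch, not the original server; the 415 status and `.wav`-only policy are illustrative choices:

```python
import io
from flask import Flask, request, jsonify

app = Flask(__name__)
ALLOWED_EXTENSIONS = ('.wav',)

@app.route('/predict', methods=['POST'])
def predict():
    # Reject requests that carry no file part at all
    if 'file' not in request.files:
        return jsonify({"error": "No file provided"}), 400
    file = request.files['file']
    # Reject unsupported formats before running the model
    if not file.filename.lower().endswith(ALLOWED_EXTENSIONS):
        return jsonify({"error": "Only .wav files are accepted"}), 415
    # ... preprocess, run the model, decode ...
    return jsonify({"transcription": "decoded text"})

if __name__ == '__main__':
    # Quick self-check with Flask's built-in test client
    client = app.test_client()
    print(client.post('/predict', data={}).status_code)  # 400
    wav = {'file': (io.BytesIO(b'RIFF'), 'clip.wav')}
    print(client.post('/predict', data=wav).status_code)  # 200
```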