Installation Guide for OBS Recording Transcriber

This guide will help you install all the necessary dependencies for the OBS Recording Transcriber application, including the advanced features from Phase 3.

Prerequisites

Before installing the Python packages, you need to set up some prerequisites:

1. Python 3.8 or higher

Make sure you have Python 3.8 or higher installed. You can download it from python.org.
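You can confirm the interpreter meets this requirement with `python --version`, or with a short check like the following sketch:

```python
import sys

def check_python(min_version=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info >= min_version

if not check_python():
    raise SystemExit("Python 3.8 or higher is required")
```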

2. FFmpeg

FFmpeg is required for audio processing:

  • Windows:

    Download a release build from ffmpeg.org, extract it, and add the
    bin folder to your PATH, or install it with a package manager such
    as Chocolatey:

    choco install ffmpeg

  • macOS:

    brew install ffmpeg
    
  • Linux:

    sudo apt update
    sudo apt install ffmpeg
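Whichever platform you are on, you can confirm that FFmpeg is discoverable on your PATH with a quick standard-library check:

```python
import shutil

def ffmpeg_available():
    """Return the path to the ffmpeg executable, or None if not on PATH."""
    return shutil.which("ffmpeg")

path = ffmpeg_available()
print(f"ffmpeg found at {path}" if path else "ffmpeg not found; check your PATH")
```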
    

3. Visual C++ Build Tools (Windows only)

Some packages like tokenizers require C++ build tools:

  1. Download and install Visual C++ Build Tools
  2. During installation, select "Desktop development with C++"

Installation Steps

1. Create a Virtual Environment

# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate

2. Install PyTorch

For better performance, install PyTorch with CUDA support if you have an NVIDIA GPU:

# Windows/Linux with CUDA
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# macOS or CPU-only
pip install torch torchvision torchaudio
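Once installed, you can check which device PyTorch will use. This sketch degrades gracefully if PyTorch is missing:

```python
def detect_device():
    """Report the compute device PyTorch will use, if it is installed."""
    try:
        import torch
    except ImportError:
        return "not installed"
    return "cuda" if torch.cuda.is_available() else "cpu"

print(f"PyTorch device: {detect_device()}")
```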

3. Install Dependencies

# Install all dependencies from requirements.txt
pip install -r requirements.txt

4. Troubleshooting Common Issues

Tokenizers Installation Issues

If you encounter issues with tokenizers installation:

  1. Make sure you have Visual C++ Build Tools installed (Windows)
  2. Install the Rust toolchain from rustup.rs (tokenizers may need it to build from source)
  3. Install tokenizers separately:
    pip install tokenizers --no-binary tokenizers
    

PyAnnote.Audio Access

To use speaker diarization, you need a HuggingFace token with access to the pyannote models:

  1. Create an account on HuggingFace
  2. Generate an access token at huggingface.co/settings/tokens
  3. Request access to pyannote/speaker-diarization-3.0
  4. Set the token in the application when prompted or as an environment variable:
    # Windows
    set HF_TOKEN=your_token_here
    # macOS/Linux
    export HF_TOKEN=your_token_here
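In application code, the token set above can be read back from the environment. A minimal sketch (the HF_TOKEN variable name matches the commands above):

```python
import os

def get_hf_token():
    """Read the HuggingFace token from the environment, if set."""
    token = os.environ.get("HF_TOKEN")
    if token is None:
        raise RuntimeError(
            "HF_TOKEN is not set; export it or enter the token in the app"
        )
    return token
```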
    

Memory Issues with Large Files

If you encounter memory issues with large files:

  1. Use a smaller Whisper model (e.g., "base" instead of "large")
  2. Reduce the GPU memory fraction in the application settings
  3. Increase your system's swap space/virtual memory
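One way to apply the first tip programmatically is a small helper that maps available GPU memory to a model size. This is a hypothetical sketch; the thresholds are illustrative, not taken from the application:

```python
def pick_whisper_model(free_vram_gb):
    """Pick a Whisper model size for the available GPU memory.

    Thresholds are illustrative; tune them for your hardware.
    """
    if free_vram_gb >= 10:
        return "large"
    if free_vram_gb >= 5:
        return "medium"
    if free_vram_gb >= 2:
        return "small"
    return "base"

print(pick_whisper_model(4))
```

For the second tip, PyTorch exposes `torch.cuda.set_per_process_memory_fraction` to cap how much GPU memory a process may allocate.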

Running the Application

After installation, run the application with:

streamlit run app.py

Optional: Ollama Setup for Local Summarization

To use Ollama for local summarization:

  1. Install Ollama from ollama.ai
  2. Pull a model:
    ollama pull llama3
    
  3. Uncomment the Ollama line in requirements.txt and install:
    pip install ollama
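With the model pulled and the package installed, a summarization call might look like the following sketch. It assumes the Ollama server is running locally and falls back gracefully if it is not:

```python
def summarize(text, model="llama3"):
    """Summarize text via a local Ollama model; return None if unavailable."""
    try:
        import ollama
        response = ollama.chat(
            model=model,
            messages=[{"role": "user", "content": f"Summarize:\n{text}"}],
        )
        return response["message"]["content"]
    except Exception:
        # Ollama not installed, or the local server is not running.
        return None
```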
    

Verifying Installation

To verify that all components are working correctly:

  1. Run the application
  2. Check that GPU acceleration is available (if applicable)
  3. Test a small video file with basic transcription
  4. Gradually enable advanced features like diarization and translation

If you encounter any issues, check the application logs for specific error messages.