# Installation Guide for OBS Recording Transcriber

This guide will help you install all the necessary dependencies for the OBS Recording Transcriber application, including the advanced features from Phase 3.
## Prerequisites

Before installing the Python packages, you need to set up some prerequisites:

### 1. Python 3.8 or higher

Make sure you have Python 3.8 or higher installed. You can download it from python.org.
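You can confirm the interpreter meets this requirement with a short check (a minimal sketch; run it with the same `python` you will later use to create the virtual environment):

```python
import sys

def python_is_supported(min_version=(3, 8)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= min_version

if __name__ == "__main__":
    status = "OK" if python_is_supported() else "too old, please upgrade"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```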
### 2. FFmpeg

FFmpeg is required for audio processing:

- **Windows**:
  - Download a build from gyan.dev/ffmpeg/builds
  - Extract the ZIP file
  - Add the `bin` folder to your system PATH
- **macOS**:

  ```bash
  brew install ffmpeg
  ```

- **Linux**:

  ```bash
  sudo apt update
  sudo apt install ffmpeg
  ```
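After installing, you can verify that FFmpeg is actually reachable from Python. This is a minimal sketch using only the standard library; the function name is illustrative, not part of the application:

```python
import shutil
import subprocess

def ffmpeg_available():
    """Return the path to the ffmpeg binary if it is on PATH, else None."""
    return shutil.which("ffmpeg")

if __name__ == "__main__":
    path = ffmpeg_available()
    if path:
        # Print the first line of `ffmpeg -version` as a sanity check.
        out = subprocess.run(["ffmpeg", "-version"], capture_output=True, text=True)
        print(out.stdout.splitlines()[0])
    else:
        print("ffmpeg not found on PATH -- revisit the steps above")
```

On Windows, a `None` result usually means the `bin` folder was not added to PATH, or the terminal was not restarted after changing it.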
### 3. Visual C++ Build Tools (Windows only)

Some packages, such as `tokenizers`, require C++ build tools:

- Download and install Visual C++ Build Tools
- During installation, select "Desktop development with C++"
## Installation Steps

### 1. Create a Virtual Environment (Recommended)

```bash
# Create a virtual environment
python -m venv venv

# Activate the virtual environment
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate
```
### 2. Install PyTorch

For better performance, install PyTorch with CUDA support if you have an NVIDIA GPU:

```bash
# Windows/Linux with CUDA
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# macOS or CPU-only
pip install torch torchvision torchaudio
```
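Once PyTorch is installed, you can check which device it will use. This sketch deliberately tolerates a missing `torch` install so it can be run at any point during setup:

```python
def detect_device():
    """Report which device PyTorch will use, without assuming torch is installed."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if torch.cuda.is_available():
        # CUDA build installed and a compatible NVIDIA GPU was found.
        return f"cuda ({torch.cuda.get_device_name(0)})"
    return "cpu"

if __name__ == "__main__":
    print(detect_device())
```

If this reports `cpu` on a machine with an NVIDIA GPU, the CPU-only wheel was likely installed; reinstall using the CUDA index URL above.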
### 3. Install Dependencies

```bash
# Install all dependencies from requirements.txt
pip install -r requirements.txt
```
### 4. Troubleshooting Common Issues

#### Tokenizers Installation Issues

If you encounter problems installing `tokenizers`:

- Make sure you have Visual C++ Build Tools installed (Windows)
- Try installing Rust from rustup.rs
- Install `tokenizers` separately:

  ```bash
  pip install tokenizers --no-binary tokenizers
  ```
#### PyAnnote.Audio Access

To use speaker diarization, you need a HuggingFace token with access to the pyannote models:

- Create an account on HuggingFace
- Generate an access token at huggingface.co/settings/tokens
- Request access to pyannote/speaker-diarization-3.0
- Set the token in the application when prompted, or as an environment variable:

  ```bash
  # Windows
  set HF_TOKEN=your_token_here

  # macOS/Linux
  export HF_TOKEN=your_token_here
  ```
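A small sketch of how such a token lookup can work in Python. The `HF_TOKEN` variable name is the one used above; the function itself is illustrative and not the application's actual API:

```python
import os

def get_hf_token(cli_token=None):
    """Prefer an explicitly supplied token, fall back to the HF_TOKEN env var."""
    token = cli_token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "No HuggingFace token found: set HF_TOKEN or enter one when prompted"
        )
    return token
```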
#### Memory Issues with Large Files

If you encounter memory issues with large files:

- Use a smaller Whisper model (e.g., "base" instead of "large")
- Reduce the GPU memory fraction in the application settings
- Increase your system's swap space/virtual memory
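The first point can be sketched as a simple lookup. The weight sizes below are rough approximations of the Whisper checkpoint files (runtime memory use is higher), and the headroom multiplier is an assumed rule of thumb, not a measured value:

```python
# Approximate Whisper model weight sizes in GB (assumed, rounded values).
WHISPER_MODEL_SIZES_GB = {
    "tiny": 0.08,
    "base": 0.15,
    "small": 0.5,
    "medium": 1.5,
    "large": 3.0,
}

def largest_model_that_fits(free_memory_gb, headroom=3.0):
    """Pick the largest model whose weights, times a headroom factor, fit in memory."""
    candidates = [
        name
        for name, size in WHISPER_MODEL_SIZES_GB.items()
        if size * headroom <= free_memory_gb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda name: WHISPER_MODEL_SIZES_GB[name])
```

For example, with roughly 1 GB free this picks "base", matching the suggestion above to step down from "large" when memory is tight.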
## Running the Application

After installation, run the application with:

```bash
streamlit run app.py
```
## Optional: Ollama Setup for Local Summarization

To use Ollama for local summarization:

- Install Ollama from ollama.ai
- Pull a model:

  ```bash
  ollama pull llama3
  ```

- Uncomment the Ollama line in requirements.txt and install:

  ```bash
  pip install ollama
  ```
## Verifying Installation

To verify that all components are working correctly:

- Run the application
- Check that GPU acceleration is available (if applicable)
- Test a small video file with basic transcription
- Gradually enable advanced features like diarization and translation

If you encounter any issues, check the application logs for specific error messages.