Enhance README.md with Docker installation instructions and update Ollama API endpoint to be configurable via environment variable.
.dockerignore (new file, 77 lines)

@@ -0,0 +1,77 @@

```text
# Git and version control
.git
.gitignore
.gitattributes

# Docker files
Dockerfile
docker-compose.yml
.dockerignore

# Environment and config files
.env
.env.*
docker.env.example

# Documentation
*.md
docs/
DOCKER.md
README.md
INSTALLATION.md
GEMINI_INSIGHTS.md

# Python cache and virtual environments
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
venv/
env/
ENV/

# IDE and editor files
.vscode/
.idea/
*.swp
*.swo
*~

# OS generated files
.DS_Store
.DS_Store?
._*
.Spotlight-V100
.Trashes
ehthumbs.db
Thumbs.db

# Local directories that will be mounted as volumes
videos/
outputs/
cache/
config/

# Logs
*.log
logs/

# Temporary files
tmp/
temp/
*.tmp

# Test files
tests/
*_test.py
test_*.py

# Build artifacts
build/
dist/
*.egg-info/

# Jupyter notebooks
*.ipynb
.ipynb_checkpoints/
```
DOCKER.md (new file, 305 lines)

@@ -0,0 +1,305 @@
# Docker Deployment Guide for VideoTranscriber

This guide explains how to run VideoTranscriber in a Docker container while using Ollama models on your host system.

## Architecture Overview

```
┌──────────────────────────────────────────────────┐
│                    Host System                   │
│                                                  │
│   ┌────────────────┐      ┌───────────────────┐  │
│   │ Ollama Service │      │    Video Files    │  │
│   │  (port 11434)  │      │     Directory     │  │
│   └───────▲────────┘      └─────────▲─────────┘  │
│           │ HTTP API                │ volume     │
│   ┌───────┼─────────────────────────┼─────────┐  │
│   │       │    Docker Container     │         │  │
│   │  ┌────┴─────────────────────────┴──────┐  │  │
│   │  │           VideoTranscriber          │  │  │
│   │  │   - Streamlit App                   │  │  │
│   │  │   - Whisper Models                  │  │  │
│   │  │   - ML Dependencies                 │  │  │
│   │  └─────────────────────────────────────┘  │  │
│   └───────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘
```

## Quick Start

### Prerequisites

1. **Docker & Docker Compose** installed
2. **Ollama running on the host**:
   ```bash
   # Install Ollama (if not already installed)
   curl -fsSL https://ollama.ai/install.sh | sh

   # Start Ollama service
   ollama serve

   # Pull a model (in another terminal)
   ollama pull llama3
   ```

### 1. Set Up the Environment

```bash
# Copy environment template
cp docker.env.example .env

# Edit .env file with your paths
# Key settings to update:
VIDEO_PATH=/path/to/your/videos
OUTPUT_PATH=/path/to/save/outputs
HF_TOKEN=your_huggingface_token_if_needed
```

### 2. Create Required Directories

```bash
# Create directories for mounting
mkdir -p videos outputs cache config
```

### 3. Build and Run

```bash
# Build and start the container
docker-compose up -d

# View logs
docker-compose logs -f

# Access the application
# Open browser to: http://localhost:8501
```

## Configuration Options

### Environment Variables

| Variable | Description | Default | Required |
|----------|-------------|---------|----------|
| `VIDEO_PATH` | Host directory containing video files | `./videos` | Yes |
| `OUTPUT_PATH` | Host directory for outputs | `./outputs` | Yes |
| `CACHE_PATH` | Host directory for model cache | `./cache` | No |
| `OLLAMA_API_URL` | Ollama API endpoint | `http://host.docker.internal:11434/api` | No |
| `HF_TOKEN` | HuggingFace token for advanced features | - | No |
| `CUDA_VISIBLE_DEVICES` | GPU devices to use | - | No |

### Volume Mounts

| Host Path | Container Path | Purpose |
|-----------|----------------|---------|
| `${VIDEO_PATH}` | `/app/data/videos` | Input video files |
| `${OUTPUT_PATH}` | `/app/data/outputs` | Generated transcripts/summaries |
| `${CACHE_PATH}` | `/app/data/cache` | Model and processing cache |
| `${CONFIG_PATH}` | `/app/config` | Configuration files |

## Platform-Specific Setup

### Windows (Docker Desktop)

```yaml
# In docker-compose.yml - use bridge networking
networks:
  - videotranscriber-network

environment:
  - OLLAMA_API_URL=http://host.docker.internal:11434/api
```

### macOS (Docker Desktop)

Same as Windows - uses `host.docker.internal` to access host services.

### Linux

Option 1 - Host networking (recommended):

```yaml
# In docker-compose.yml
network_mode: host

environment:
  - OLLAMA_API_URL=http://localhost:11434/api
```

Option 2 - Bridge networking:

```yaml
environment:
  - OLLAMA_API_URL=http://172.17.0.1:11434/api  # Docker bridge IP
```
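If the bridge gateway on your system differs from `172.17.0.1`, you can look it up instead of guessing (a quick check, assuming the default `bridge` network):

```bash
# Print the gateway IP of the default Docker bridge network
docker network inspect bridge --format '{{range .IPAM.Config}}{{.Gateway}}{{end}}'
```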

## GPU Support

### NVIDIA GPU Setup

1. **Install NVIDIA Container Toolkit**:
   ```bash
   # Ubuntu/Debian
   curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg
   curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
     sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
     sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
   sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
   sudo systemctl restart docker
   ```

2. **Enable in docker-compose.yml**:
   ```yaml
   deploy:
     resources:
       reservations:
         devices:
           - driver: nvidia
             count: 1
             capabilities: [gpu]
   ```
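Once the GPU section is enabled and the container recreated, you can confirm that PyTorch sees the GPU from inside the container (a quick sanity check; relies on the CUDA-enabled torch installed by the Dockerfile):

```bash
# Should print "True" when the GPU is wired through correctly
docker-compose exec videotranscriber python -c "import torch; print(torch.cuda.is_available())"
```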

## Usage in Container

### Application Settings

When running in Docker, update these settings in the VideoTranscriber UI:

1. **Base Folder**: Set to `/app/data/videos`
2. **Ollama Models**: Should auto-detect from the host
3. **GPU Settings**: Will use the container GPU if configured

### File Access

- **Input Videos**: Place in your `${VIDEO_PATH}` directory on the host
- **Outputs**: Generated files appear in `${OUTPUT_PATH}` on the host
- **Cache**: Models are cached in `${CACHE_PATH}` for faster subsequent runs

## Troubleshooting

### Common Issues

#### 1. Can't Connect to Ollama

**Symptoms**: "Ollama service is not available" message

**Solutions**:
- Verify Ollama is running on the host: `curl http://localhost:11434/api/tags`
- Check firewall settings
- On Linux, try host networking mode
- Verify `OLLAMA_API_URL` in the environment

#### 2. No Video Files Detected

**Symptoms**: "No recordings found" message

**Solutions** (see the mount check below):
- Check that `VIDEO_PATH` points to the correct directory
- Ensure the directory contains supported formats (.mp4, .avi, .mov, .mkv)
- Check file permissions
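A quick way to confirm what the container actually sees at the mount point (sketch; uses the container paths from this guide):

```bash
# List the mounted videos directory from inside the container
docker-compose exec videotranscriber ls -la /app/data/videos
```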

#### 3. GPU Not Detected

**Symptoms**: Processing is slow, no GPU utilization

**Solutions**:
- Install the NVIDIA Container Toolkit
- Uncomment the GPU section in docker-compose.yml
- Verify: `docker run --rm --gpus all nvidia/cuda:11.0-base nvidia-smi`

#### 4. Permission Issues

**Symptoms**: Cannot write to the output directory

**Solutions**:
```bash
# Fix permissions
sudo chown -R $(id -u):$(id -g) outputs cache config
chmod -R 755 outputs cache config
```

### Debugging

```bash
# View container logs
docker-compose logs -f videotranscriber

# Execute a shell in the container
docker-compose exec videotranscriber bash

# Check Ollama connectivity from the container
# (expand the variable inside the container, where OLLAMA_API_URL is set)
docker-compose exec videotranscriber sh -c 'curl -f "$OLLAMA_API_URL/tags"'

# Monitor resource usage
docker stats videotranscriber
```

## Advanced Configuration

### Custom Dockerfile

For specialized requirements, modify the Dockerfile:

```dockerfile
# Add custom dependencies
RUN pip install your-custom-package

# Set custom environment variables
ENV YOUR_CUSTOM_VAR=value

# Copy custom configuration
COPY custom-config.yaml /app/config/
```

### Multi-Instance Deployment

Run multiple instances for different use cases (a sample override follows):

```bash
# Copy docker-compose.yml to docker-compose.prod.yml,
# then modify ports and paths in the copy
docker-compose -f docker-compose.prod.yml up -d
```
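For instance, a production copy might change only the published port, container name, and host paths (a sketch; the port and paths here are illustrative, not prescribed by the project):

```yaml
# docker-compose.prod.yml (illustrative values)
version: '3.8'
services:
  videotranscriber:
    build: .
    container_name: videotranscriber-prod
    ports:
      - "8502:8501"   # second instance on a different host port
    volumes:
      - "/srv/prod/videos:/app/data/videos"
      - "/srv/prod/outputs:/app/data/outputs"
```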

### CI/CD Integration

```yaml
# .github/workflows/docker.yml
name: Build and Deploy
on:
  push:
    branches: [main]
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: Build Docker image
        run: docker build -t videotranscriber .
```

## Performance Optimization

### Memory Management

```yaml
# In docker-compose.yml
deploy:
  resources:
    limits:
      memory: 8G
    reservations:
      memory: 4G
```

### Model Caching

- Use persistent volumes for `/app/data/cache`
- Pre-download models to reduce startup time (see the sketch below)
- Configure appropriate cache size limits
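One way to pre-download a Whisper model into the mounted cache before first use (a sketch; assumes the `openai-whisper` package from requirements.txt and the `WHISPER_CACHE` path set in docker-compose.yml):

```bash
# Warm the model cache so the first transcription doesn't wait on a download
docker-compose run --rm videotranscriber \
  python -c "import whisper; whisper.load_model('base', download_root='/app/data/cache/whisper')"
```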

### Network Optimization

- Use host networking on Linux for better performance
- Consider running Ollama and VideoTranscriber on the same machine
- Use SSD storage for the cache directories
Dockerfile (new file, 44 lines)

@@ -0,0 +1,44 @@
```dockerfile
FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Install system dependencies
RUN apt-get update && apt-get install -y \
    ffmpeg \
    git \
    wget \
    curl \
    build-essential \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better Docker layer caching
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Install PyTorch with CUDA support (adjust based on your needs)
RUN pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

# Copy application code
COPY . .

# Create directories for mounted volumes
RUN mkdir -p /app/data/videos /app/data/outputs /app/data/cache

# Set environment variables
ENV STREAMLIT_SERVER_PORT=8501
ENV STREAMLIT_SERVER_ADDRESS=0.0.0.0
ENV STREAMLIT_SERVER_HEADLESS=true
ENV STREAMLIT_BROWSER_GATHER_USAGE_STATS=false

# Expose Streamlit port
EXPOSE 8501

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=60s --retries=3 \
    CMD curl -f http://localhost:8501/_stcore/health || exit 1

# Start the application
CMD ["streamlit", "run", "app.py", "--server.port=8501", "--server.address=0.0.0.0"]
```
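If you want to try the image without Compose, a plain `docker` equivalent might look like this (a sketch mirroring the wiring in docker-compose.yml; the `--add-host` flag is only needed on Linux so that `host.docker.internal` resolves):

```bash
docker build -t videotranscriber .
docker run -d --name videotranscriber \
  -p 8501:8501 \
  -v "$(pwd)/videos:/app/data/videos" \
  -v "$(pwd)/outputs:/app/data/outputs" \
  -v "$(pwd)/cache:/app/data/cache" \
  -e OLLAMA_API_URL=http://host.docker.internal:11434/api \
  --add-host host.docker.internal:host-gateway \
  videotranscriber
```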
README.md (modified, +26 lines)

@@ -13,6 +13,32 @@ https://github.com/user-attachments/assets/990e63fc-232e-46a0-afdf-ca8836d46a13

## Installation

### 🐳 Docker Installation (Recommended)

**Benefits**: Isolated environment, no dependency conflicts, easy deployment

```bash
# 1. Clone the repository
git clone https://github.com/DataAnts-AI/VideoTranscriber.git
cd VideoTranscriber

# 2. Set up the environment
cp docker.env.example .env
# Edit .env with your video directory paths

# 3. Ensure Ollama is running on the host
ollama serve        # in a separate terminal
ollama pull llama3

# 4. Start with Docker Compose
docker-compose up -d

# 5. Access the application
# Open browser to: http://localhost:8501
```

See [DOCKER.md](DOCKER.md) for the complete Docker setup guide.

### Easy Installation (Recommended)

#### Windows
docker-compose.yml (new file, 51 lines)

@@ -0,0 +1,51 @@
```yaml
version: '3.8'

services:
  videotranscriber:
    build: .
    container_name: ${CONTAINER_NAME:-videotranscriber}
    ports:
      - "${HOST_PORT:-8501}:8501"
    volumes:
      # Mount your video files directory (change the left path to your actual videos folder)
      - "${VIDEO_PATH:-./videos}:/app/data/videos"
      # Mount output directory for transcripts and summaries
      - "${OUTPUT_PATH:-./outputs}:/app/data/outputs"
      # Mount cache directory for model caching (optional, improves performance)
      - "${CACHE_PATH:-./cache}:/app/data/cache"
      # Mount a config directory if needed
      - "${CONFIG_PATH:-./config}:/app/config"
    environment:
      # Ollama configuration for host access
      - OLLAMA_API_URL=${OLLAMA_API_URL:-http://host.docker.internal:11434/api}
      # Optional: HuggingFace token for advanced features
      - HF_TOKEN=${HF_TOKEN:-}
      # GPU configuration
      - CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-}
      # Cache settings
      - TRANSFORMERS_CACHE=/app/data/cache/transformers
      - WHISPER_CACHE=/app/data/cache/whisper
    # For GPU access (uncomment if you have an NVIDIA GPU and the NVIDIA Container Toolkit)
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]
    restart: unless-stopped
    # For Linux hosts, you might prefer host networking for better Ollama access
    # network_mode: host  # Uncomment for Linux hosts (and remove ports/networks below)
    # Use bridge networking for Windows/Mac with host.docker.internal
    networks:
      - videotranscriber-network
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8501/_stcore/health"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 60s

networks:
  videotranscriber-network:
    driver: bridge
```
docker.env.example (new file, 63 lines)

@@ -0,0 +1,63 @@
```bash
# VideoTranscriber Docker Configuration
# Copy this file to .env and modify the values as needed

# =============================================================================
# DOCKER VOLUME PATHS (Host Directories)
# =============================================================================

# Path to your video files directory on the host
# This directory will be mounted into the container at /app/data/videos
VIDEO_PATH=./videos

# Path where outputs (transcripts, summaries) will be saved on the host
# This directory will be mounted into the container at /app/data/outputs
OUTPUT_PATH=./outputs

# Path for caching ML models and processed files (improves performance)
# This directory will be mounted into the container at /app/data/cache
CACHE_PATH=./cache

# Optional: configuration directory for custom settings
CONFIG_PATH=./config

# =============================================================================
# OLLAMA CONFIGURATION
# =============================================================================

# Ollama API URL - how the container reaches the Ollama service on your host
# For Windows/Mac with Docker Desktop: use host.docker.internal
# For Linux: use host networking or the actual host IP
OLLAMA_API_URL=http://host.docker.internal:11434/api

# =============================================================================
# ML MODEL CONFIGURATION
# =============================================================================

# HuggingFace token for advanced features (speaker diarization, etc.)
# Get your token at: https://huggingface.co/settings/tokens
# Leave empty if not using advanced features
HF_TOKEN=

# GPU configuration
# Specify which GPU devices to use (leave empty for all available)
# Examples: "0" for the first GPU, "0,1" for the first two GPUs
CUDA_VISIBLE_DEVICES=

# =============================================================================
# DOCKER-SPECIFIC SETTINGS
# =============================================================================

# Container name (change if you want to run multiple instances)
CONTAINER_NAME=videotranscriber

# Host port to publish (used as the host side of the port mapping)
HOST_PORT=8501

# =============================================================================
# EXAMPLE USAGE
# =============================================================================
# 1. Copy this file: cp docker.env.example .env
# 2. Edit the paths to match your system
# 3. Make sure Ollama is running on your host: ollama serve
# 4. Start the container: docker-compose up -d
# 5. Access the app at: http://localhost:8501
```
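To confirm that Compose picks up your `.env` values, you can print the fully resolved configuration (standard Compose behavior):

```bash
# Print the effective configuration with variables substituted
docker-compose config
```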
```diff
@@ -13,8 +13,8 @@ import os
 logging.basicConfig(level=logging.INFO)
 logger = logging.getLogger(__name__)
 
-# Default Ollama API endpoint
-OLLAMA_API_URL = "http://localhost:11434/api"
+# Default Ollama API endpoint - configurable via environment variable
+OLLAMA_API_URL = os.environ.get("OLLAMA_API_URL", "http://localhost:11434/api")
 
 
 def check_ollama_available():
```
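The body of `check_ollama_available` is not shown in this view, but a minimal sketch of a check against the now-configurable endpoint could look like this (assumes the `requests` library and the `/api/tags` route used in the troubleshooting steps above):

```python
import os

import requests

# Same default as the diff above; override via the OLLAMA_API_URL env var
OLLAMA_API_URL = os.environ.get("OLLAMA_API_URL", "http://localhost:11434/api")


def check_ollama_available(timeout: float = 3.0) -> bool:
    """Return True if the Ollama API responds at the configured endpoint."""
    try:
        return requests.get(f"{OLLAMA_API_URL}/tags", timeout=timeout).ok
    except requests.RequestException:
        return False
```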