Update README.md

DataAnts-AI
2025-03-03 12:07:30 -05:00
committed by GitHub
parent 524f0d6a6c
commit 56e947cc71

@@ -3,6 +3,66 @@
## Project Overview
The Video Recording Transcriber is a Python application built with Streamlit that processes video recordings (particularly from OBS Studio) to generate transcripts and summaries using AI models. It uses Whisper for transcription and Hugging Face Transformers for summarization.
<img width="1190" alt="image" src="https://github.com/user-attachments/assets/0e60d6ba-cb00-4fa8-b401-d4f30a418c92" />
## Installation
### Easy Installation (Recommended)
#### Windows
1. Download or clone the repository
2. Run `install.bat` by double-clicking it
3. Follow the on-screen instructions
#### Linux/macOS
1. Download or clone the repository
2. Open a terminal in the project directory
3. Make the install script executable: `chmod +x install.sh`
4. Run the script: `./install.sh`
5. Follow the on-screen instructions
### Manual Installation
1. Clone the repository:
```
git clone https://github.com/DataAnts-AI/VideoTranscriber.git
cd VideoTranscriber
```
2. Install dependencies:
```
pip install -r requirements.txt
```
Notes:
- Ensure that the installed dependency versions are compatible with your system and with the features you use.
- The `torch` version should match the capabilities of your hardware (e.g., CUDA support for GPUs).
- Advanced features such as speaker diarization require a Hugging Face token.
- See `INSTALLATION.md` for detailed instructions and troubleshooting.
3. Run the application:
```
streamlit run app.py
```
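The note about matching `torch` to your hardware can be checked before launching the app. The following is a minimal sketch, not code from the repository: `pick_device` is a hypothetical helper, and it assumes only that PyTorch, when installed, exposes `torch.cuda.is_available()`.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled torch build can see a GPU, else "cpu"."""
    try:
        import torch  # imported lazily so the check also works without torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Suggested device: {pick_device()}")
```

If this reports `cpu` on a machine with an NVIDIA GPU, the installed `torch` build may lack CUDA support.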
## Usage
1. Set your base folder where OBS recordings are stored
2. Select a recording from the dropdown
3. Choose transcription and summarization models
4. Configure performance settings (GPU acceleration, caching)
5. Select export formats and compression options
6. Click "Process Recording" to start
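Steps 1 and 2 amount to scanning the base folder for video files and offering them in a dropdown. Here is a minimal sketch of that scan; `list_recordings` and `VIDEO_EXTENSIONS` are hypothetical names, and the extension set is an assumption, so the application's actual implementation may differ.

```python
from pathlib import Path

# Assumed set of container formats; OBS can write others depending on settings.
VIDEO_EXTENSIONS = {".mkv", ".mp4", ".mov", ".flv"}


def list_recordings(base_folder: str) -> list[str]:
    """Return the names of video files in base_folder, newest first."""
    folder = Path(base_folder)
    if not folder.is_dir():
        return []  # a missing folder yields an empty dropdown, not an error
    files = [
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in VIDEO_EXTENSIONS
    ]
    files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.name for p in files]
```

In a Streamlit app, the returned list could be passed as the options of `st.selectbox`.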
## Advanced Features
- **Speaker Diarization**: Identify and label different speakers in your recordings
- **Translation**: Automatically detect the source language and translate into multiple languages
- **Keyword Extraction**: Extract important keywords with timestamp links
- **Interactive Transcript**: Navigate through the transcript with keyword highlighting
- **GPU Acceleration**: Utilize your GPU for faster processing
- **Caching**: Save processing time by caching results
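One way to read the caching feature: a result is reused when the same file is processed again with the same model. The sketch below illustrates that idea under stated assumptions. `cache_key`, `cached_transcribe`, the `.cache` directory, and the `transcribe` callback are all hypothetical and not taken from the repository.

```python
import hashlib
import json
from pathlib import Path


def cache_key(video_path: str, model_name: str) -> str:
    """Hash the model name plus the file contents, reading in 1 MiB chunks."""
    digest = hashlib.sha256(model_name.encode("utf-8"))
    with open(video_path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def cached_transcribe(video_path, model_name, transcribe, cache_dir=".cache"):
    """Return a cached result if present; otherwise call transcribe and store it."""
    cache_file = Path(cache_dir) / f"{cache_key(video_path, model_name)}.json"
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8"))
    result = transcribe(video_path, model_name)  # must be JSON-serializable
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    cache_file.write_text(json.dumps(result), encoding="utf-8")
    return result
```

Keying on file contents rather than the file name means a renamed recording still hits the cache, at the cost of reading the file once to hash it.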
## Key Improvement Areas
### 1. UI Enhancements
@@ -92,14 +152,7 @@ The video Recording Transcriber is a Python application built with Streamlit tha
- Added named entity recognition for better content analysis
- Generated keyword index with timestamp references
- Provided speaker statistics and word count analysis
4. **Phase 4:** Integration with other tools and services (In progress)
## Technical Considerations
- Ensure compatibility with different Whisper model sizes
- Handle large files efficiently to prevent memory issues
- Provide graceful degradation when optional dependencies are missing
- Maintain backward compatibility with existing workflows
- Consider containerization for easier deployment
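For the point about large files, one common approach is to process a long recording in fixed-length windows rather than loading it whole. The sketch below only plans the windows. `plan_chunks` and its default lengths are hypothetical, and the README does not say how the application actually handles large files.

```python
def plan_chunks(duration_s: float, chunk_s: float = 600.0, overlap_s: float = 5.0):
    """Split [0, duration_s] into (start, end) windows of at most chunk_s seconds.

    Consecutive windows overlap by overlap_s so words at a boundary are not cut.
    """
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")
    chunks = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        chunks.append((start, end))
        if end >= duration_s:
            break  # the final window reached the end of the recording
        start = end - overlap_s
    return chunks
```

Each window can then be transcribed independently, which bounds peak memory by the chunk length instead of the full recording length.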
## Conclusion
The OBS Recording Transcriber has a solid foundation but can be significantly enhanced with the suggested improvements. The focus should be on improving user experience, adding offline processing capabilities, and expanding export options to make the tool more versatile for different use cases.
Reach out to support@dataants.org if you need assistance with any AI solutions. We offer support for n8n workflows, local RAG chatbots, and ERP and financial reporting.