# Audiobook Creator
AI-powered audiobook generator using the Kokoro TTS model.
Generates high-quality narrated .wav files from plain-text novels, with a GUI tool for auditing and fixing proper noun pronunciations per book.
## Features
- Multi-book support — each book's proper nouns, fixes, and audio are fully isolated
- Proper Noun GUI — hear every extracted name, mark it correct or type a phonetic fix
- Audiobook generation — one `.wav` per chapter, GPU-accelerated via CUDA
- In-GUI extraction — click one button to run NLP extraction and generate audio, no separate scripts needed
- Apply Fixes — writes a TTS-ready copy of the source text with all phonetic substitutions applied
## Project structure
Audio Text for Novel Lightbringer/ ← multi-file book (chapters as .txt)
Audio Master Nem Full.txt ← single-file book
gui_proper_noun_player.py ← proper noun auditing GUI
create_audiobook_lightbringer.py ← generate Lightbringer audiobook chapters
create_audiobook_nem.py ← generate Nem audiobook chapters
output_audiobook_lightbringer/ ← chapter WAV output
output_audiobook/ ← Nem WAV output
output_proper_nouns/<book>/ ← manifest + JSON fix data per book
proper_nouns_audio/<book>/ ← word audio + replacements cache per book
requirements.txt
setup_windows.bat ← one-click Windows setup
run_gui.bat ← launch GUI on Windows
run_audiobook.bat ← generate audiobook on Windows
---
## Setup (Windows - Easiest for Non-Tech Users)
1. **Download** the project as a ZIP file from GitHub
2. **Extract** the ZIP to a folder on your computer (e.g., `C:\audiobook-creator`)
3. **Double-click** `setup_windows.bat` and wait for it to finish installing everything (may take 10-20 minutes)
4. **Double-click** `run_gui.bat` to launch the Proper Noun Player GUI
5. **Double-click** `run_audiobook.bat` to generate audiobook chapters
That's it! The setup script handles Python installation, virtual environment, and all dependencies automatically.
---
## Setup (Linux / Mac)
```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu124  # CUDA 12.4
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

For CPU-only, replace the torch line with `pip install torch`.
## Setup (Windows)
See `SETUP_WINDOWS.md` for a step-by-step guide aimed at non-technical users.
## Usage
### Proper Noun GUI
Launch the GUI with `.venv/bin/python gui_proper_noun_player.py`, then:
- Select a book from the dropdown
- Click ⚙ Extract & Generate Audio — extracts proper nouns via spaCy and generates a TTS clip for each one
- Click words in the Review list to hear them; press Enter to mark correct or type a phonetic replacement first
- Click ⇄ Apply Fixes to Text to write a pronunciation-corrected copy of the source file
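
The Apply Fixes step can be pictured as a whole-word substitution over the source text. The sketch below is only an illustration under that assumption, not the GUI's actual code: the function name `apply_fixes` is hypothetical, and the fixes dictionary mirrors the `{"Nephi": "Kneephi"}` shape shown for `pronunciation_fixes.json`.

```python
import re


def apply_fixes(text: str, fixes: dict[str, str]) -> str:
    """Replace each whole-word occurrence of a key in `fixes` with its phonetic spelling.

    Hypothetical helper for illustration; the real tool may match differently.
    """
    if not fixes:
        return text
    # Longest keys first, so a multi-word name is tried before any shorter name it starts with.
    keys = sorted(fixes, key=len, reverse=True)
    alternation = "|".join(re.escape(key) for key in keys)
    # The lookarounds stop a match from starting or ending inside a longer word.
    pattern = re.compile(r"(?<!\w)(" + alternation + r")(?!\w)")
    return pattern.sub(lambda match: fixes[match.group(1)], text)


sample = "Nephi walked on. Nephites followed Nephi."
print(apply_fixes(sample, {"Nephi": "Kneephi"}))
# prints: Kneephi walked on. Nephites followed Kneephi.
```

Matching on word boundaries keeps a fix for one name from rewriting longer words that merely contain it, which matters when the output is fed straight to the TTS model.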
### Generate Audiobook
- All chapters: `.venv/bin/python create_audiobook_lightbringer.py`
- List chapters only: `.venv/bin/python create_audiobook_lightbringer.py --list`
- Preview clips: `.venv/bin/python create_audiobook_lightbringer.py --preview`
- Specific chapters: `.venv/bin/python create_audiobook_lightbringer.py 0 1 2`
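
For readers who want to see how a command line of this shape could be parsed, here is a minimal `argparse` sketch. It is an assumption about the interface, not the script's real implementation: it treats `--list` and `--preview` as plain on/off flags and the chapter arguments as integers, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the invocations shown above (illustrative only)."""
    parser = argparse.ArgumentParser(description="Generate audiobook chapters (sketch).")
    # Zero or more chapter numbers; an empty list is taken to mean "all chapters".
    parser.add_argument("chapters", nargs="*", type=int, help="chapter numbers to generate")
    # Assumed to be simple boolean switches.
    parser.add_argument("--list", action="store_true", help="list chapters and exit")
    parser.add_argument("--preview", action="store_true", help="generate preview clips")
    return parser


args = build_parser().parse_args(["0", "1", "2"])
print(args.chapters, args.list, args.preview)
# prints: [0, 1, 2] False False
```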
## Dependencies
| Package | Purpose |
|---|---|
| `kokoro` | Kokoro-82M TTS model |
| `torch` | GPU inference |
| `soundfile` / `sounddevice` | Audio I/O |
| `numpy` | Audio array operations |
| `spacy` + `en_core_web_sm` | Proper noun extraction (NER + PROPN) |
| `wordfreq` | Common-word filter during extraction |
## Output
| Path | Contents |
|---|---|
| `output_audiobook_lightbringer/` | `chapter_01_homecoming.wav`, … |
| `output_proper_nouns/<book>/manifest.json` | Word → WAV filename map |
| `output_proper_nouns/<book>/pronunciation_fixes.json` | `{"Nephi": "Kneephi", …}` |
| `output_proper_nouns/<book>/correct_words.json` | Words confirmed correct |
| `proper_nouns_audio/<book>/` | Per-word audio clips |
| `proper_nouns_audio/<book>/replacements_cache/` | Cached phonetic fix clips |
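
Because the per-book review state lives in plain JSON files, it can be inspected outside the GUI. The sketch below lists the words still awaiting review. It rests on assumptions: `correct_words.json` is taken to be a JSON list of words (its schema is not documented here), `pending_words` is a hypothetical helper, and the demo data other than `Nephi` is made up.

```python
import json
import tempfile
from pathlib import Path


def pending_words(book_dir: Path) -> list[str]:
    """Return manifest words that are neither marked correct nor given a phonetic fix."""
    def load(name, default):
        path = book_dir / name
        return json.loads(path.read_text(encoding="utf-8")) if path.exists() else default

    manifest = load("manifest.json", {})          # word -> WAV filename
    fixes = load("pronunciation_fixes.json", {})  # word -> phonetic spelling
    correct = load("correct_words.json", [])      # ASSUMED: a JSON list of words
    reviewed = set(fixes) | set(correct)
    return sorted(word for word in manifest if word not in reviewed)


# Demo against a throwaway directory; the names and filenames are invented sample data.
with tempfile.TemporaryDirectory() as tmp:
    book = Path(tmp) / "output_proper_nouns" / "demo_book"
    book.mkdir(parents=True)
    (book / "manifest.json").write_text(
        json.dumps({"Nephi": "nephi.wav", "Kalith": "kalith.wav", "Orrin": "orrin.wav"})
    )
    (book / "pronunciation_fixes.json").write_text(json.dumps({"Nephi": "Kneephi"}))
    (book / "correct_words.json").write_text(json.dumps(["Orrin"]))
    remaining = pending_words(book)
    print(remaining)
# prints: ['Kalith']
```

Missing files are treated as empty, so the helper also works on a book whose words have been extracted but not yet reviewed.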