Audiobook Creator

AI-powered audiobook generator using the Kokoro TTS model. Generates high-quality narrated .wav files from plain-text novels, with a GUI tool for auditing and fixing proper noun pronunciations per book.


Features

  • Multi-book support — each book's proper nouns, fixes, and audio are fully isolated
  • Proper Noun GUI — hear every extracted name, mark it correct or type a phonetic fix
  • Audiobook generation — one .wav per chapter, GPU-accelerated via CUDA
  • In-GUI extraction — click one button to run NLP extraction and generate audio, no separate scripts needed
  • Apply Fixes — writes a TTS-ready copy of the source text with all phonetic substitutions applied

Project structure

Audio Text for Novel Lightbringer/   ← multi-file book (chapters as .txt)
Audio Master Nem Full.txt            ← single-file book

gui_proper_noun_player.py            ← proper noun auditing GUI
create_audiobook_lightbringer.py     ← generate Lightbringer audiobook chapters
create_audiobook_nem.py              ← generate Nem audiobook chapters

output_audiobook_lightbringer/       ← chapter WAV output
output_audiobook/                    ← Nem WAV output
output_proper_nouns/<book>/          ← manifest + JSON fix data per book
proper_nouns_audio/<book>/           ← word audio + replacements cache per book

requirements.txt
setup_windows.bat                    ← one-click Windows setup
run_gui.bat                          ← launch GUI on Windows
run_audiobook.bat                    ← generate audiobook on Windows
---

## Setup (Windows - Easiest for Non-Tech Users)

1. **Download** the project as a ZIP file from GitHub
2. **Extract** the ZIP to a folder on your computer (e.g., `C:\audiobook-creator`)
3. **Double-click** `setup_windows.bat` and wait for it to finish installing everything (may take 10-20 minutes)
4. **Double-click** `run_gui.bat` to launch the Proper Noun Player GUI
5. **Double-click** `run_audiobook.bat` to generate audiobook chapters

That's it! The setup script handles Python installation, virtual environment, and all dependencies automatically.

---

## Setup (Linux / Mac)

```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu124   # CUDA 12.4
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

For CPU-only installs, replace the torch line with `pip install torch`.
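After installing, it can be useful to confirm whether the CUDA or the CPU-only build of torch is the one in use. The following is a minimal sketch, not part of this project: `pick_device` is a hypothetical helper name, and it simply falls back to CPU when torch is missing or no GPU is visible.

```python
def pick_device() -> str:
    """Return "cuda" when a CUDA-enabled torch build can see a GPU, else "cpu"."""
    try:
        # Imported lazily so the check also runs before torch is installed.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    print(f"Inference device: {pick_device()}")
```

If this prints `cpu` on a machine with an NVIDIA GPU, the CPU-only wheel was probably installed and the torch line above should be re-run with the CUDA index URL.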


Setup (Windows)

See SETUP_WINDOWS.md for a step-by-step guide aimed at non-technical users.


Usage

Proper Noun GUI

.venv/bin/python gui_proper_noun_player.py
  1. Select a book from the dropdown
  2. Click ⚙ Extract & Generate Audio — extracts proper nouns via spaCy and generates a TTS clip for each one
  3. Click words in the Review list to hear them; press Enter to mark correct or type a phonetic replacement first
  4. Click ⇄ Apply Fixes to Text to write a pronunciation-corrected copy of the source file
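The Apply Fixes step can be pictured as a whole-word substitution over the source text. The sketch below is an illustration under assumptions, not the GUI's actual code: `apply_fixes` is a hypothetical function name, and the fixes are assumed to be a plain name-to-spelling mapping like the one stored in `pronunciation_fixes.json`.

```python
import re


def apply_fixes(text: str, fixes: dict[str, str]) -> str:
    """Replace each whole-word occurrence of a name with its phonetic spelling."""
    if not fixes:
        return text
    # Try longer keys first so a multi-word key wins over a shorter key it contains.
    names = sorted(fixes, key=len, reverse=True)
    pattern = re.compile(r"\b(?:" + "|".join(re.escape(n) for n in names) + r")\b")
    # Look up the replacement for whichever key matched.
    return pattern.sub(lambda m: fixes[m.group(0)], text)


if __name__ == "__main__":
    sample = "Nephi spoke first. Nephihah waited."
    print(apply_fixes(sample, {"Nephi": "Kneephi"}))
```

The `\b` word boundaries keep a fix for one name from rewriting longer names that merely start with the same letters.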

Generate Audiobook

# All chapters
.venv/bin/python create_audiobook_lightbringer.py

# List chapters only
.venv/bin/python create_audiobook_lightbringer.py --list

# Preview clips
.venv/bin/python create_audiobook_lightbringer.py --preview

# Specific chapters
.venv/bin/python create_audiobook_lightbringer.py 0 1 2
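For readers adapting the generator script to another book, the command-line shape shown above (optional chapter indices plus `--list` and `--preview`) could be reproduced with `argparse` roughly as follows. This is a sketch of one possible parser, assuming the flags behave as simple switches; `build_parser` is a hypothetical name and the real script may parse its arguments differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented invocation forms."""
    parser = argparse.ArgumentParser(description="Generate audiobook chapters.")
    # Zero or more chapter indices; an empty list is taken to mean "all chapters".
    parser.add_argument("chapters", nargs="*", type=int,
                        help="chapter indices to generate (default: all)")
    parser.add_argument("--list", action="store_true",
                        help="list chapters only")
    parser.add_argument("--preview", action="store_true",
                        help="generate preview clips")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```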

Dependencies

| Package | Purpose |
| --- | --- |
| kokoro | Kokoro-82M TTS model |
| torch | GPU inference |
| soundfile / sounddevice | Audio I/O |
| numpy | Audio array operations |
| spacy + en_core_web_sm | Proper noun extraction (NER + PROPN) |
| wordfreq | Common-word filter during extraction |
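To show how spaCy and wordfreq can combine for extraction, here is a rough sketch: collect named entities and `PROPN` tokens, then drop anything that is common in everyday English. The function names, the entity label set, and the 3.5 Zipf threshold are all illustrative assumptions, not values taken from this project.

```python
def keep_candidate(word: str, zipf: float, max_zipf: float = 3.5) -> bool:
    """Keep capitalized words that are rare in everyday English."""
    return word[:1].isupper() and zipf < max_zipf


def extract_proper_nouns(text: str, max_zipf: float = 3.5) -> list[str]:
    """Return sorted proper-noun candidates found in `text`."""
    # Third-party imports are kept local; both packages come from requirements.txt.
    import spacy
    from wordfreq import zipf_frequency

    nlp = spacy.load("en_core_web_sm")
    doc = nlp(text)
    # Assumed label set: people, places, and organizations.
    labels = {"PERSON", "GPE", "LOC", "ORG", "NORP", "FAC"}
    candidates = {ent.text for ent in doc.ents if ent.label_ in labels}
    # Also include single tokens tagged as proper nouns by the POS tagger.
    candidates |= {tok.text for tok in doc if tok.pos_ == "PROPN"}
    return sorted(
        w for w in candidates
        if keep_candidate(w, zipf_frequency(w.lower(), "en"), max_zipf)
    )
```

Lowering `max_zipf` filters more aggressively, which shortens the review list but risks hiding names that happen to be ordinary words.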

Output

| Path | Contents |
| --- | --- |
| `output_audiobook_lightbringer/` | `chapter_01_homecoming.wav`, … |
| `output_proper_nouns/<book>/manifest.json` | Word → WAV filename map |
| `output_proper_nouns/<book>/pronunciation_fixes.json` | `{"Nephi": "Kneephi", …}` |
| `output_proper_nouns/<book>/correct_words.json` | Words confirmed correct |
| `proper_nouns_audio/<book>/` | Per-word audio clips |
| `proper_nouns_audio/<book>/replacements_cache/` | Cached phonetic fix clips |
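Because the review state lives in plain JSON files, it can be inspected outside the GUI. The sketch below lists words that have audio in the manifest but are neither fixed nor confirmed correct. It assumes `manifest.json` and `pronunciation_fixes.json` are JSON objects and that `correct_words.json` is a JSON array; the array format, the helper names, and the `lightbringer` folder name are assumptions made for illustration.

```python
import json
from pathlib import Path


def load_json(path: Path, default):
    """Read a JSON file, returning `default` when it does not exist yet."""
    return json.loads(path.read_text(encoding="utf-8")) if path.exists() else default


def pending_words(manifest: dict, fixes: dict, correct: list) -> list[str]:
    """Words present in the manifest that have not been fixed or marked correct."""
    reviewed = set(fixes) | set(correct)
    return sorted(word for word in manifest if word not in reviewed)


if __name__ == "__main__":
    # Hypothetical <book> folder name; substitute the one the GUI created.
    book_dir = Path("output_proper_nouns") / "lightbringer"
    manifest = load_json(book_dir / "manifest.json", {})
    fixes = load_json(book_dir / "pronunciation_fixes.json", {})
    correct = load_json(book_dir / "correct_words.json", [])
    print(pending_words(manifest, fixes, correct))
```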