Compare commits: f0e0adf24b...main (14 commits)

Commits: e9ddbb586a, 894144c84a, 69639342e3, 125cb25cf8, 8a1362fe0b, 0d00176a18, 3c2c3d241e, 224f97d0c6, 6e2e0f9af7, c1301fee18, 6781efe3f3, 44bc757f3f, 6cefc3c862, 949bd7c203
.envrc (new file, 2 lines)

@@ -0,0 +1,2 @@
export VIRTUAL_ENV="$PWD/.venv"
export PATH="$VIRTUAL_ENV/bin:$PATH"
.gitignore (vendored, 7 lines changed)

@@ -3,6 +3,9 @@ __pycache__/
*.pyc
*.pyo
.venv/
build/
dist/
*.spec

# Audio files
*.wav

@@ -14,6 +17,10 @@ proper_nouns_audio/
# Generated data (JSON files in output_proper_nouns/ are tracked)
output_proper_nouns/remaining_review.txt

# Generated PDFs and LaTeX files
*.pdf
*.tex

# Text files (except proper_nouns.txt)
*.txt
!proper_nouns.txt
.vscode/settings.json (new file, vendored, 4 lines)

@@ -0,0 +1,4 @@
{
    "python.defaultInterpreterPath": ".venv/bin/python",
    "python.terminal.activateEnvironment": true
}
README.md (new file, 125 lines)

@@ -0,0 +1,125 @@
# Audiobook Creator

AI-powered audiobook generator using the [Kokoro TTS](https://github.com/hexgrad/kokoro) model.
Generates high-quality narrated `.wav` files from plain-text novels, with a GUI tool for auditing and fixing proper noun pronunciations per book.

---

## Features
- **Multi-book support** — each book's proper nouns, fixes, and audio are fully isolated
- **Proper Noun GUI** — hear every extracted name, mark it correct or type a phonetic fix
- **Audiobook generation** — one `.wav` per chapter, GPU-accelerated via CUDA
- **In-GUI extraction** — click one button to run NLP extraction and generate audio, no separate scripts needed
- **Apply Fixes** — writes a TTS-ready copy of the source text with all phonetic substitutions applied

---
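The Apply Fixes step described above boils down to whole-word substitution of each flagged name. A minimal sketch of the idea (`apply_fixes` is an illustrative helper, not the GUI's actual code):

```python
import re

def apply_fixes(text: str, fixes: dict[str, str]) -> str:
    """Replace each proper noun with its phonetic spelling, whole words only."""
    for word, phonetic in fixes.items():
        # \b anchors keep "Nephi" from matching inside e.g. "Nephite"
        text = re.sub(rf"\b{re.escape(word)}\b", phonetic, text)
    return text

print(apply_fixes("Nephi spoke to Nephi.", {"Nephi": "Kneephi"}))
# → Kneephi spoke to Kneephi.
```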
## Project structure

```
Audio Text for Novel Lightbringer/   ← multi-file book (chapters as .txt)
Audio Master Nem Full.txt            ← single-file book

gui_proper_noun_player.py            ← proper noun auditing GUI
create_audiobook_lightbringer.py     ← generate Lightbringer audiobook chapters
create_audiobook_nem.py              ← generate Nem audiobook chapters

output_audiobook_lightbringer/       ← chapter WAV output
output_audiobook/                    ← Nem WAV output
output_proper_nouns/<book>/          ← manifest + JSON fix data per book
proper_nouns_audio/<book>/           ← word audio + replacements cache per book

requirements.txt
setup_windows.bat                    ← one-click Windows setup
run_gui.bat                          ← launch GUI on Windows
run_audiobook.bat                    ← generate audiobook on Windows
```

---
## Setup (Windows - Easiest for Non-Tech Users)

1. **Download** the project as a ZIP file from GitHub
2. **Extract** the ZIP to a folder on your computer (e.g., `C:\audiobook-creator`)
3. **Double-click** `setup_windows.bat` and wait for it to finish installing everything (may take 10-20 minutes)
4. **Double-click** `run_gui.bat` to launch the Proper Noun Player GUI
5. **Double-click** `run_audiobook.bat` to generate audiobook chapters

That's it! The setup script handles the virtual environment and all dependencies automatically (it will tell you if Python itself still needs installing).

---
## Setup (Linux / Mac)

```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu124  # CUDA 12.4
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

> For CPU-only: replace the torch line with `pip install torch`
---

## Setup (Windows)

See [SETUP_WINDOWS.md](SETUP_WINDOWS.md) for a step-by-step guide aimed at non-technical users.

---
## Usage

### Proper Noun GUI

```bash
.venv/bin/python gui_proper_noun_player.py
```

1. Select a book from the dropdown
2. Click **⚙ Extract & Generate Audio** — extracts proper nouns via spaCy and generates a TTS clip for each one
3. Click words in the Review list to hear them; press Enter to mark a word correct, or type a phonetic replacement and then press Enter to save a fix
4. Click **⇄ Apply Fixes to Text** to write a pronunciation-corrected copy of the source file
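The review state is stored as small JSON files on disk (the Output section below lists them), so recording a fix amounts to updating one word-to-spelling map. A sketch under that assumption (`save_fix` is a hypothetical helper, not the GUI's code):

```python
import json
from pathlib import Path

def save_fix(book_dir: Path, word: str, phonetic: str) -> None:
    """Record a phonetic fix in the book's pronunciation_fixes.json."""
    path = book_dir / "pronunciation_fixes.json"
    # Load the existing map if present, otherwise start fresh
    fixes = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    fixes[word] = phonetic
    path.write_text(json.dumps(fixes, indent=2, ensure_ascii=False), encoding="utf-8")
```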
### Generate Audiobook

```bash
# All chapters
.venv/bin/python create_audiobook_lightbringer.py

# List chapters only
.venv/bin/python create_audiobook_lightbringer.py --list

# Preview clips
.venv/bin/python create_audiobook_lightbringer.py --preview

# Specific chapters
.venv/bin/python create_audiobook_lightbringer.py 0 1 2
```

---
## Dependencies

| Package | Purpose |
|---|---|
| `kokoro` | Kokoro-82M TTS model |
| `torch` | GPU inference |
| `soundfile` / `sounddevice` | Audio I/O |
| `numpy` | Audio array operations |
| `spacy` + `en_core_web_sm` | Proper noun extraction (NER + PROPN) |
| `wordfreq` | Common-word filter during extraction |

---
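The `wordfreq` filter works on the idea that real proper nouns are rare in everyday English while false positives ("River", "Summer") are common. A sketch of that filtering step (the function, the injected callable, and the 3.5 cutoff are all illustrative assumptions, not the project's actual code):

```python
def filter_rare(words: list[str], zipf_of, threshold: float = 3.5) -> list[str]:
    """Keep only words rarer than the Zipf threshold (likely genuine proper nouns).

    `zipf_of` is a callable such as wordfreq's zipf_frequency(word, "en");
    common English words score high (around 5) and invented names near 0.
    """
    return [w for w in words if zipf_of(w) < threshold]

# With wordfreq installed you would pass its real frequency function:
#   from wordfreq import zipf_frequency
#   rare = filter_rare(candidates, lambda w: zipf_frequency(w, "en"))
```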
## Output

| Path | Contents |
|---|---|
| `output_audiobook_lightbringer/` | `chapter_01_homecoming.wav`, … |
| `output_proper_nouns/<book>/manifest.json` | Word → WAV filename map |
| `output_proper_nouns/<book>/pronunciation_fixes.json` | `{"Nephi": "Kneephi", …}` |
| `output_proper_nouns/<book>/correct_words.json` | Words confirmed correct |
| `proper_nouns_audio/<book>/` | Per-word audio clips |
| `proper_nouns_audio/<book>/replacements_cache/` | Cached phonetic fix clips |
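Since the manifest is a plain word-to-filename map, looking up a word's cached clip is a two-step read. A sketch under the file layout in the table above (`clip_for` is a hypothetical helper, and paths are resolved relative to the project root):

```python
import json
from pathlib import Path

def clip_for(book: str, word: str) -> Path:
    """Resolve the cached audio clip for a word via the book's manifest."""
    manifest_path = Path("output_proper_nouns") / book / "manifest.json"
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    return Path("proper_nouns_audio") / book / manifest[word]
```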
SETUP_WINDOWS.md (new file, 134 lines)

@@ -0,0 +1,134 @@
# Audiobook Creator — Windows 11 Setup Guide

This guide is written for someone who has never used Python or the command line.
Follow the steps in order and you will be generating audiobook chapters with your gaming GPU.

---

## What you will need

| Requirement | Why |
|---|---|
| Windows 11 PC with a modern NVIDIA GPU | Fast audio generation using CUDA |
| ~5 GB free disk space | Python, PyTorch, and the AI voice model |
| Internet connection (first-time only) | Downloads packages and the Kokoro voice model |

---
## Step 1 — Install Python

1. Go to **https://www.python.org/downloads/**
2. Click the big yellow **"Download Python 3.12.x"** button
3. Run the installer
4. **IMPORTANT:** On the very first screen of the installer, tick the checkbox that says **"Add Python to PATH"** before clicking Install Now

> If you missed that checkbox, uninstall Python from Windows Settings and reinstall it with the box ticked.

---
## Step 2 — Get the project files

You should have a folder called `audiobook_creator` (or similar) containing the project files. Make sure it includes these files:

```
setup_windows.bat
run_gui.bat
run_audiobook.bat
requirements.txt
gui_proper_noun_player.py
create_audiobook_lightbringer.py
Audio Text for Novel Lightbringer\   ← your chapter text files go here
```

If you received a ZIP file, extract it first so the folder is not inside another folder.

---
## Step 3 — Run Setup (one time only)

1. Open the project folder in File Explorer
2. Double-click **`setup_windows.bat`**
3. A black terminal window opens and runs through these steps automatically:
   - Checks Python is installed
   - Creates a private Python environment (`.venv` folder)
   - Downloads PyTorch with GPU (CUDA) support — **about 2.5 GB, this takes several minutes**
   - Installs the remaining packages (kokoro, spaCy, etc.)
   - Downloads the spaCy English language model
   - Downloads the Kokoro AI voice model — **about 330 MB**
4. When it says **"Setup complete!"**, press any key to close the window

You only need to do this once. If you run it again it will safely skip anything already installed.

---
## Step 4 — Review Proper Noun Pronunciations (GUI)

Before generating the audiobook, it helps to check how unusual names are pronounced.

1. Double-click **`run_gui.bat`**
2. The Proper Noun Pronunciation Auditor window opens
3. Select your book from the dropdown at the top
4. Click **⚙ Extract & Generate Audio** — this scans the text and creates a short audio clip for every proper noun found (takes a few minutes the first time)
5. Click any word in the **To Review** list to hear how it sounds
6. If it sounds wrong, type the phonetic spelling in the box at the bottom and press **Enter** to save a fix
   - Example: type `Kneephi` instead of `Nephi`
7. If it sounds correct, just press **Enter** without changing anything
8. When you are done reviewing, click **⇄ Apply Fixes to Text** to save a corrected copy of the source text

**Keyboard shortcuts:**

| Key | Action |
|---|---|
| Space | Replay current word |
| Enter | Mark correct (or save fix if text was changed) |
| Escape | Reset the fix box, go back to word list |
| s | Stop audio |
| ↑ / ↓ | Navigate the word list from the fix box |
| Delete | Move a word back to Review from Correct or Fixes |

---
## Step 5 — Generate the Audiobook

1. Double-click **`run_audiobook.bat`**
2. A menu appears — type the number of your choice and press Enter:

| Option | What it does |
|---|---|
| 1 | Generate **all chapters** — can take many hours, safe to leave running overnight |
| 2 | **List** detected chapters only — instant, nothing is generated |
| 3 | Generate a short **preview clip** of each chapter — quick sanity check |
| 4 | Generate **specific chapters** — enter chapter numbers separated by spaces |

3. When finished, `.wav` files will be in the `output_audiobook_lightbringer` folder

---
## Troubleshooting

**"Python was not found"**
→ Python is not installed, or you forgot to tick "Add Python to PATH" during installation. Uninstall and reinstall Python from https://www.python.org/downloads/, making sure to tick that box.

**The black window opens and immediately closes**
→ There was an error. To see it: press `Win + R`, type `cmd`, press Enter, then drag the `.bat` file into that black window and press Enter. The error message will stay visible.

**Audio generation is very slow (taking hours per chapter)**
→ The GPU version of PyTorch may not have installed correctly. Re-run `setup_windows.bat` — it will reinstall just that part.
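To check whether the GPU build of PyTorch is actually active, a small script like this (run with the project's `.venv\Scripts\python.exe`) can help; `cuda_status` is an illustrative helper, not part of the project:

```python
def cuda_status() -> str:
    """Report whether the CUDA build of PyTorch is usable on this machine."""
    try:
        import torch
    except ImportError:
        return "PyTorch is not installed in this environment"
    if torch.cuda.is_available():
        # Name of the first (and usually only) GPU PyTorch can see
        return f"CUDA OK: {torch.cuda.get_device_name(0)}"
    return "CPU-only PyTorch (or driver problem): re-run setup_windows.bat"

print(cuda_status())
```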
**"No .txt files found in Audio Text for Novel Lightbringer"**
→ Make sure your chapter `.txt` files are inside the `Audio Text for Novel Lightbringer` subfolder, not loose in the main project folder.

**The GUI says "No manifest yet"**
→ You need to click **⚙ Extract & Generate Audio** first for that book.

**Antivirus blocks the .bat files**
→ Right-click the `.bat` file, choose Properties, and click "Unblock" at the bottom. Then try again.

---
## Output files

| Folder | Contents |
|---|---|
| `output_audiobook_lightbringer\` | One `.wav` file per chapter |
| `output_proper_nouns\<book>\` | Pronunciation data (JSON files) |
| `proper_nouns_audio\<book>\` | Cached word audio clips |
create_audiobook.py (new file, 402 lines)

@@ -0,0 +1,402 @@
"""
create_audiobook.py
-------------------
Generic audiobook generator for text files that contain chapter headings.

Supported heading formats (single-line headings):
    - Prologue
    - Chapter 12
    - Chapter 12 - Chapter Name
    - Chapter - 12
    - Chapter - 12 - Chapter Name

Features:
    - Parses chapters from one or more input files/directories
    - Caches parsed chapter data for faster re-runs when source files are unchanged
    - Warns about missing chapter numbers (example: found 1,2,4 -> warns about 3)
    - Generates one .wav per chapter with Kokoro

Examples:
    python create_audiobook.py --input "Audio Text for Novel Lightbringer"
    python create_audiobook.py --input novel.txt --list
    python create_audiobook.py --input novel.txt 0 1 2 --voice am_michael
    python create_audiobook.py --input novel.txt --preview 3000
"""

from __future__ import annotations

import argparse
import hashlib
import json
import re
import time
from pathlib import Path

import numpy as np
import soundfile as sf
import torch
from kokoro import KPipeline

SAMPLE_RATE = 24000
SPEED = 1.0
LANG_CODE = "a"
VOICE = "am_onyx"
CACHE_VERSION = 1

PROLOGUE_RE = re.compile(r"^\s*Prologue\s*$", re.IGNORECASE)
CHAPTER_RE_1 = re.compile(r"^\s*Chapter\s*-\s*(\d+)(?:\s*-\s*(.+))?\s*$", re.IGNORECASE)
CHAPTER_RE_2 = re.compile(r"^\s*Chapter\s+(\d+)(?:\s*-\s*(.+))?\s*$", re.IGNORECASE)
RULE_RE = re.compile(r"^[_\-*\s]{3,}\s*$")
def _slug(text: str) -> str:
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


def _clean_text(text: str) -> str:
    text = RULE_RE.sub("", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def _fmt_duration(seconds: float) -> str:
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    if h > 0:
        return f"{h}h {m:02d}m {s:02d}s"
    if m > 0:
        return f"{m}m {s:02d}s"
    return f"{s}s"


def _chapter_heading(line: str) -> tuple[int, str, str] | None:
    stripped = line.strip()
    if PROLOGUE_RE.match(stripped):
        return (0, "Prologue", "Prologue")

    m = CHAPTER_RE_1.match(stripped)
    if not m:
        m = CHAPTER_RE_2.match(stripped)
    if not m:
        return None

    num = int(m.group(1))
    title = (m.group(2) or "").strip()
    label = f"Chapter {num}" + (f" - {title}" if title else "")
    return (num, title, label)


def _resolve_txt_files(inputs: list[str]) -> list[Path]:
    txt_files: list[Path] = []
    for raw in inputs:
        path = Path(raw)
        if path.is_file():
            if path.suffix.lower() == ".txt":
                txt_files.append(path)
            continue
        if path.is_dir():
            txt_files.extend(sorted(path.glob("*.txt")))

    deduped = sorted({p.resolve() for p in txt_files})
    return deduped
def _signature_for_files(files: list[Path]) -> list[dict]:
    sig = []
    for p in files:
        st = p.stat()
        sig.append({
            "path": str(p),
            "size": st.st_size,
            "mtime_ns": st.st_mtime_ns,
        })
    return sig


def _cache_path(output_dir: Path, files: list[Path]) -> Path:
    cache_dir = output_dir / ".cache"
    digest = hashlib.sha256("\n".join(str(p) for p in files).encode("utf-8")).hexdigest()[:12]
    return cache_dir / f"parse_{digest}.json"


def _load_cached_chapters(cache_file: Path, file_sig: list[dict]) -> list[dict] | None:
    if not cache_file.exists():
        return None

    try:
        data = json.loads(cache_file.read_text(encoding="utf-8"))
    except Exception:
        return None

    if data.get("version") != CACHE_VERSION:
        return None
    if data.get("file_signature") != file_sig:
        return None

    chapters = data.get("chapters")
    if not isinstance(chapters, list):
        return None
    return chapters


def _save_cached_chapters(cache_file: Path, file_sig: list[dict], chapters: list[dict]) -> None:
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "version": CACHE_VERSION,
        "file_signature": file_sig,
        "chapters": chapters,
    }
    cache_file.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
def _parse_chapters(files: list[Path]) -> tuple[list[dict], set[int]]:
    chapters: list[dict] = []
    duplicates: set[int] = set()
    seen: set[int] = set()
    current: dict | None = None

    def flush_current() -> None:
        if current is not None:
            current["text"] = "".join(current.pop("lines"))
            num = current["num"]
            if num in seen:
                duplicates.add(num)
                return
            seen.add(num)
            chapters.append(current)

    for fpath in files:
        with fpath.open("r", encoding="utf-8") as fh:
            for line in fh:
                info = _chapter_heading(line)
                if info is not None:
                    flush_current()
                    num, title, label = info
                    num_str = f"{num:02d}"
                    if num == 0:
                        slug = "chapter_00_prologue"
                    elif title:
                        slug = f"chapter_{num_str}_{_slug(title)}"
                    else:
                        slug = f"chapter_{num_str}"
                    current = {
                        "num": num,
                        "title": title,
                        "label": label,
                        "slug": slug,
                        "lines": [line],
                    }
                elif current is not None:
                    current["lines"].append(line)

    flush_current()
    chapters.sort(key=lambda c: c["num"])
    return chapters, duplicates


def load_all_chapters_with_cache(
    inputs: list[str], output_dir: Path, force_reparse: bool = False
) -> tuple[list[dict], bool, set[int], list[Path]]:
    files = _resolve_txt_files(inputs)
    if not files:
        raise FileNotFoundError("No .txt files found in --input paths")

    file_sig = _signature_for_files(files)
    cache_file = _cache_path(output_dir, files)

    if not force_reparse:
        cached = _load_cached_chapters(cache_file, file_sig)
        if cached is not None:
            return cached, True, set(), files

    chapters, duplicates = _parse_chapters(files)
    _save_cached_chapters(cache_file, file_sig, chapters)
    return chapters, False, duplicates, files


def warn_missing_chapters(chapters: list[dict]) -> None:
    nums = sorted(ch["num"] for ch in chapters if ch["num"] > 0)
    if not nums:
        return
    missing = [n for n in range(nums[0], nums[-1] + 1) if n not in set(nums)]
    if missing:
        print(f"WARNING: missing chapter numbers detected: {missing}")
def generate_audio(pipeline: KPipeline, text: str, voice: str, output_path: Path) -> float:
    t0 = time.monotonic()
    chunks = []
    for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
        if hasattr(chunk_audio, "numpy"):
            chunk_audio = chunk_audio.cpu().numpy()
        chunk_audio = np.atleast_1d(chunk_audio.squeeze())
        if chunk_audio.size > 0:
            chunks.append(chunk_audio)

    elapsed = time.monotonic() - t0
    if chunks:
        audio = np.concatenate(chunks, axis=0)
        sf.write(str(output_path), audio, SAMPLE_RATE)
        duration = len(audio) / SAMPLE_RATE
        print(
            f"  OK saved '{output_path.name}' "
            f"({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)"
        )
    else:
        print(f"  ERROR no audio produced for voice='{voice}'")
    return elapsed
def main() -> None:
    parser = argparse.ArgumentParser(description="Generate an audiobook from chapterized text files.")
    parser.add_argument(
        "chapters",
        nargs="*",
        type=int,
        help="Chapter numbers to generate (0 = Prologue). Default: all.",
    )
    parser.add_argument(
        "--input",
        nargs="+",
        required=True,
        help="One or more .txt files and/or directories containing .txt files.",
    )
    parser.add_argument(
        "--output",
        default="output_audiobook",
        help="Output directory for generated chapter audio.",
    )
    parser.add_argument("--list", action="store_true", help="Print detected chapters and exit.")
    parser.add_argument("--voice", default=VOICE, help=f"Kokoro voice to use (default: {VOICE}).")
    parser.add_argument(
        "--preview",
        nargs="?",
        const=3000,
        type=int,
        metavar="CHARS",
        help="Generate short preview clips capped at CHARS (default: 3000).",
    )
    parser.add_argument(
        "--reparse",
        action="store_true",
        help="Ignore cache and re-parse chapters from source files.",
    )
    args = parser.parse_args()

    output_dir = Path(args.output)
    output_dir.mkdir(parents=True, exist_ok=True)

    print("Loading chapters...")
    chapters, used_cache, duplicates, files = load_all_chapters_with_cache(
        args.input, output_dir, force_reparse=args.reparse
    )

    print(f"Input files: {len(files)}")
    print(f"Parse cache: {'HIT' if used_cache else 'MISS'}")

    if duplicates:
        print(f"WARNING: duplicate chapter numbers were found and ignored: {sorted(duplicates)}")

    if not chapters:
        print("WARNING: no chapters found.")
        print("Expected headings like: 'Prologue' or 'Chapter 12 - Name' or 'Chapter - 12'")
        return

    warn_missing_chapters(chapters)

    if args.list:
        print(f"\nDetected {len(chapters)} chapters:\n")
        print(f"  {'#':>4} {'Label':<45} {'Chars':>8} {'Output filename'}")
        print(f"  {'-' * 4} {'-' * 45} {'-' * 8} {'-' * 30}")
        for ch in chapters:
            chars = len(_clean_text(ch["text"]))
            print(f"  {ch['num']:>4} {ch['label']:<45} {chars:>8,} {ch['slug']}.wav")
        return

    if args.chapters:
        requested = set(args.chapters)
        run_chapters = [ch for ch in chapters if ch["num"] in requested]
        missing_req = sorted(requested - {ch["num"] for ch in run_chapters})
        if missing_req:
            print(f"WARNING: requested chapter(s) not found: {missing_req}")
    else:
        run_chapters = chapters

    if not run_chapters:
        print("No chapters selected. Use --list to see available chapters.")
        return

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Device: {device}")
    if device == "cuda":
        print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Voice: {args.voice}")

    chapter_chars = {ch["num"]: len(_clean_text(ch["text"])) for ch in run_chapters}
    total_chars = sum(chapter_chars.values())

    preview_note = f"PREVIEW MODE: capped at {args.preview:,} chars/chapter" if args.preview else ""
    if preview_note:
        print(preview_note)

    print("\nPlan:")
    for ch in run_chapters:
        print(f"  {ch['num']:>3} {ch['label']} ({chapter_chars[ch['num']]:,} chars)")
    print(f"  TOTAL: {total_chars:,} chars\n")

    print("Initializing Kokoro pipeline...")
    pipeline = KPipeline(lang_code=LANG_CODE)

    chars_per_sec: float | None = None
    timing_rows: list[tuple[str, int, float]] = []

    for ch in run_chapters:
        text = _clean_text(ch["text"])
        if not text:
            print(f"[{ch['label']}] WARNING empty text, skipping")
            continue

        if args.preview and len(text) > args.preview:
            cut = text.rfind(" ", 0, args.preview)
            text = text[: cut if cut > 0 else args.preview]

        chars = len(text)
        preview_tag = "_preview" if args.preview else ""
        out_path = output_dir / f"{ch['slug']}{preview_tag}.wav"

        if chars_per_sec is not None:
            eta = _fmt_duration(chars / chars_per_sec)
            print(f"\n[{ch['label']}] -> {out_path.name} (est. {eta})")
        else:
            print(f"\n[{ch['label']}] -> {out_path.name} (calibration run)")

        elapsed = generate_audio(pipeline, text, args.voice, out_path)
        timing_rows.append((ch["label"], chars, elapsed))

        done_chars = sum(c for _, c, _ in timing_rows)
        done_elapsed = sum(e for _, _, e in timing_rows)
        if done_elapsed > 0:
            chars_per_sec = done_chars / done_elapsed
            remaining = total_chars - done_chars
            eta_total = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
            print(f"  Speed: {chars_per_sec:.0f} chars/sec | Estimated remaining: {eta_total}")

    print("\nSummary:")
    print(f"  {'Chapter':<35} {'Chars':>7} {'Actual':>8} {'Est':>8}")
    print("  " + "-" * 65)
    for i, (label, chars, elapsed) in enumerate(timing_rows):
        actual_str = _fmt_duration(elapsed)
        prior_chars = sum(c for _, c, _ in timing_rows[:i])
        prior_elapsed = sum(e for _, _, e in timing_rows[:i])
        est_str = _fmt_duration(chars / (prior_chars / prior_elapsed)) if prior_elapsed > 0 else "(first)"
        print(f"  {label:<35} {chars:>7,} {actual_str:>8} {est_str:>8}")

    total_elapsed = sum(e for _, _, e in timing_rows)
    total_done_chars = sum(c for _, c, _ in timing_rows)
    print("  " + "-" * 65)
    print(f"  {'TOTAL':<35} {total_done_chars:>7,} {_fmt_duration(total_elapsed):>8}")
    print("\nDone.")


if __name__ == "__main__":
    main()
create_audiobook_lightbringer.py (new file, 311 lines)

@@ -0,0 +1,311 @@
"""
create_audiobook_lightbringer.py
────────────────────────────────
Generate the "A Darkness Rising" audiobook — one file per chapter/prologue.

Reads all .txt files from NOVEL_DIR, detects Prologue + Chapter headings,
and writes one .wav per chapter into OUTPUT_DIR.

Usage:
    python create_audiobook_lightbringer.py              # all chapters
    python create_audiobook_lightbringer.py --list       # list detected chapters
    python create_audiobook_lightbringer.py 0 1 2        # prologue + ch1 + ch2
    python create_audiobook_lightbringer.py --preview    # short preview clips

Output filenames:
    chapter_00_prologue.wav
    chapter_01_homecoming.wav
    chapter_02_the_anhuil_ehlar.wav
    ...
"""

import argparse
import re
import time
from pathlib import Path

import numpy as np
import soundfile as sf
import torch
from kokoro import KPipeline

# ── Config ─────────────────────────────────────────────────────────────────────
NOVEL_DIR = Path("Audio Text for Novel Lightbringer")
OUTPUT_DIR = Path("output_audiobook_lightbringer")
SAMPLE_RATE = 24000
SPEED = 1.0
LANG_CODE = "a"       # American English
VOICE = "am_onyx"     # default narrator voice

# Regex that matches a chapter/prologue heading line (case-insensitive).
# Group 1 captures the chapter number (or None for Prologue).
# Group 2 captures the optional subtitle after " - ".
_HEADING_RE = re.compile(
    r"^(?:Chapter\s+(\d+)\s*(?:-\s*(.+))?|(Prologue))\s*$",
    re.IGNORECASE,
)


# ── Helpers ────────────────────────────────────────────────────────────────────
def _slug(text: str) -> str:
    """Convert title text to a filesystem-safe slug."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


def load_all_chapters(novel_dir: Path) -> list[dict]:
    """
    Read all .txt files in *novel_dir* in sorted order, detect Prologue /
    Chapter headings, and return a list of chapter dicts:

        {
            "num": int,      # 0 = Prologue
            "title": str,    # subtitle portion, e.g. "Homecoming"
            "label": str,    # human label, e.g. "Chapter 1 - Homecoming"
            "slug": str,     # e.g. "chapter_01_homecoming"
            "text": str,     # full body text of the chapter
        }

    Chapters from multiple files are concatenated in sorted-filename order.
    """
    txt_files = sorted(novel_dir.glob("*.txt"))
    if not txt_files:
        raise FileNotFoundError(f"No .txt files found in '{novel_dir}'")

    # Collect (chapter_num, title_line, body_lines) across all files
    raw: list[tuple[int, str, list[str]]] = []  # (num, heading_text, body)
    current_num: int | None = None
    current_heading: str = ""
    current_body: list[str] = []

    def _flush():
        if current_num is not None:
            raw.append((current_num, current_heading, list(current_body)))

    for fpath in txt_files:
        lines = fpath.read_text(encoding="utf-8").splitlines()
        for line in lines:
            m = _HEADING_RE.match(line.strip())
            if m:
                _flush()
                if m.group(3):  # Prologue
                    current_num = 0
                    current_heading = "Prologue"
                else:  # Chapter N
                    current_num = int(m.group(1))
                    subtitle = (m.group(2) or "").strip()
                    current_heading = f"Chapter {current_num}" + (f" - {subtitle}" if subtitle else "")
                current_body = [line]  # keep heading inside text
            else:
                if current_num is not None:
                    current_body.append(line)
    _flush()

    # Build chapter dicts, deduplicated and sorted by number
    seen: set[int] = set()
    chapters: list[dict] = []
    for num, heading, body in sorted(raw, key=lambda x: x[0]):
        if num in seen:
            continue
        seen.add(num)
        # Derive subtitle / slug
        subtitle = ""
        sm = re.match(r"Chapter\s+\d+\s*-\s*(.+)", heading, re.IGNORECASE)
        if sm:
            subtitle = sm.group(1).strip()
        elif heading.lower() == "prologue":
            subtitle = "Prologue"

        num_str = f"{num:02d}"
        if subtitle:
            slug = f"chapter_{num_str}_{_slug(subtitle)}"
        else:
            slug = f"chapter_{num_str}"

        chapters.append({
            "num": num,
            "title": subtitle or heading,
            "label": heading,
            "slug": slug,
            "text": "\n".join(body),
        })

    return chapters
def clean_text(text: str) -> str:
    """Strip formatting artifacts and normalise whitespace for TTS."""
    # Remove horizontal-rule lines (underscores / asterisks / dashes)
    text = re.sub(r"^[_\-\*\s]{3,}\s*$", "", text, flags=re.MULTILINE)
    # Collapse 3+ blank lines to 2
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def _fmt_duration(seconds: float) -> str:
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    if h > 0:
        return f"{h}h {m:02d}m {s:02d}s"
    if m > 0:
        return f"{m}m {s:02d}s"
    return f"{s}s"
def generate_audio(pipeline: KPipeline, text: str, voice: str,
|
||||
output_path: Path) -> float:
|
||||
"""Generate audio and return wall-clock seconds elapsed."""
|
||||
t0 = time.monotonic()
|
||||
chunks = []
|
||||
for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
|
||||
if hasattr(chunk_audio, "numpy"):
|
||||
chunk_audio = chunk_audio.cpu().numpy()
|
||||
chunk_audio = np.atleast_1d(chunk_audio.squeeze())
|
||||
if chunk_audio.size > 0:
|
||||
chunks.append(chunk_audio)
|
||||
|
||||
elapsed = time.monotonic() - t0
|
||||
if chunks:
|
||||
audio = np.concatenate(chunks, axis=0)
|
||||
sf.write(str(output_path), audio, SAMPLE_RATE)
|
||||
duration = len(audio) / SAMPLE_RATE
|
||||
print(f" ✓ Saved '{output_path.name}' "
|
||||
f"({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)")
|
||||
else:
|
||||
print(f" ✗ No audio produced for voice='{voice}'")
|
||||
return elapsed
|


# ── Main ───────────────────────────────────────────────────────────────────────

def main() -> None:
    parser = argparse.ArgumentParser(
        description="Generate 'A Darkness Rising' audiobook, one file per chapter."
    )
    parser.add_argument(
        "chapters", nargs="*", type=int,
        help="Chapter numbers to generate (0 = Prologue). Default: all.",
    )
    parser.add_argument(
        "--list", action="store_true",
        help="Print detected chapters and exit.",
    )
    parser.add_argument(
        "--voice", default=VOICE,
        help=f"Kokoro voice to use (default: {VOICE}).",
    )
    parser.add_argument(
        "--preview", nargs="?", const=3000, type=int, metavar="CHARS",
        help="Generate short preview clips (default: 3000 chars). "
             "Output filenames get a _preview suffix.",
    )
    args = parser.parse_args()
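The `--preview` flag combines `nargs="?"` with `const`, which gives it three behaviours: absent, bare, and with an explicit value. A standalone sketch of that argparse pattern:

```python
import argparse

p = argparse.ArgumentParser()
p.add_argument("--preview", nargs="?", const=3000, type=int, metavar="CHARS")

print(p.parse_args([]).preview)                    # flag absent → None
print(p.parse_args(["--preview"]).preview)         # bare flag → const, 3000
print(p.parse_args(["--preview", "500"]).preview)  # explicit value → 500
```

This is why later code can treat `args.preview` as both a boolean ("preview mode on?") and a character budget.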

    print("Loading chapters …")
    all_chapters = load_all_chapters(NOVEL_DIR)

    if args.list:
        print(f"\nDetected {len(all_chapters)} chapters:\n")
        print(f" {'#':>4} {'Label':<45} {'Chars':>8} {'Output filename'}")
        print(f" {'─'*4} {'─'*45} {'─'*8} {'─'*30}")
        for ch in all_chapters:
            chars = len(clean_text(ch["text"]))
            print(f" {ch['num']:>4} {ch['label']:<45} {chars:>8,} {ch['slug']}.wav")
        return

    # Filter to requested subset
    if args.chapters:
        requested = set(args.chapters)
        run_chapters = [ch for ch in all_chapters if ch["num"] in requested]
        missing = requested - {ch["num"] for ch in run_chapters}
        if missing:
            print(f"⚠ Chapter(s) not found: {sorted(missing)}")
    else:
        run_chapters = all_chapters

    if not run_chapters:
        print("No chapters selected. Use --list to see available chapters.")
        return

    voice = args.voice
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Device: {device}")
    if device == "cuda":
        print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Voice: {voice}")

    OUTPUT_DIR.mkdir(exist_ok=True)

    # Pre-compute char counts
    chapter_chars = {ch["num"]: len(clean_text(ch["text"])) for ch in run_chapters}

    preview_note = (f" ⚡ PREVIEW MODE — capped at {args.preview:,} chars/chapter\n"
                    if args.preview else "")
    print(f"\n{preview_note}{'─'*65}")
    print(f" {'#':>4} {'Label':<40} {'Chars':>8}")
    print(f" {'─'*4} {'─'*40} {'─'*8}")
    for ch in run_chapters:
        print(f" {ch['num']:>4} {ch['label']:<40} {chapter_chars[ch['num']]:>8,}")
    print(f" {'─'*55}")
    total_chars = sum(chapter_chars.values())
    print(f" {'TOTAL':<45} {total_chars:>8,}\n")

    print("Initialising Kokoro pipeline …")
    pipeline = KPipeline(lang_code=LANG_CODE)

    chars_per_sec: float | None = None
    timing_rows: list[tuple[str, int, float]] = []

    for ch in run_chapters:
        text = clean_text(ch["text"])
        if not text:
            print(f"\n[{ch['label']}] ⚠ Empty text — skipping")
            continue

        preview_chars = args.preview
        if preview_chars and len(text) > preview_chars:
            cut = text.rfind(" ", 0, preview_chars)
            text = text[: cut if cut > 0 else preview_chars]
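The preview truncation cuts at the last space before the limit so a word is never split, falling back to a hard cut if the text has no space in range. As a standalone helper:

```python
def truncate_at_word(text: str, limit: int) -> str:
    """Cut text to at most `limit` chars, preferring the last word boundary."""
    if len(text) <= limit:
        return text
    cut = text.rfind(" ", 0, limit)      # last space strictly before `limit`
    return text[: cut if cut > 0 else limit]

print(truncate_at_word("the quick brown fox", 12))  # → "the quick"
print(truncate_at_word("short", 10))                # → "short"
```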

        chars = len(text)
        preview_tag = "_preview" if args.preview else ""
        out_path = OUTPUT_DIR / f"{ch['slug']}{preview_tag}.wav"

        if chars_per_sec is not None:
            eta_str = _fmt_duration(chars / chars_per_sec)
            print(f"\n[{ch['label']}] voice={voice} → {out_path.name} (est. {eta_str})")
        else:
            print(f"\n[{ch['label']}] voice={voice} → {out_path.name} (calibration run)")

        elapsed = generate_audio(pipeline, text, voice, out_path)
        timing_rows.append((ch["label"], chars, elapsed))

        total_done = sum(c for _, c, _ in timing_rows)
        total_elapsed_done = sum(e for _, _, e in timing_rows)
        if total_elapsed_done > 0:
            chars_per_sec = total_done / total_elapsed_done
            remaining = total_chars - total_done
            eta_overall = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
            print(f" ⏱ Speed: {chars_per_sec:.0f} chars/sec | Est. overall remaining: {eta_overall}")
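The ETA logic is a cumulative average: total characters finished divided by total wall-clock seconds gives a chars/sec rate, and the remaining characters divided by that rate gives the estimate. The arithmetic in isolation:

```python
def eta_seconds(done_chars: int, elapsed_s: float, remaining_chars: int) -> float:
    """Cumulative-average rate over everything finished so far."""
    rate = done_chars / elapsed_s        # chars per second
    return remaining_chars / rate

# 12,000 chars took 60 s → 200 chars/s, so 6,000 chars ≈ 30 s more.
print(eta_seconds(12000, 60.0, 6000))  # → 30.0
```

Updating the rate after every chapter means early over- or under-estimates wash out as more chapters finish.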

    # Summary
    print("\n" + "─" * 65)
    print(f" {'Chapter':<35} {'Chars':>7} {'Actual':>8} {'Est':>8}")
    print("─" * 65)
    for i, (label, chars, elapsed) in enumerate(timing_rows):
        actual_str = _fmt_duration(elapsed)
        prior_chars = sum(c for _, c, _ in timing_rows[:i])
        prior_elapsed = sum(e for _, _, e in timing_rows[:i])
        if prior_elapsed > 0:
            est_str = _fmt_duration(chars / (prior_chars / prior_elapsed))
        else:
            est_str = "(first)"
        print(f" {label:<35} {chars:>7,} {actual_str:>8} {est_str:>8}")
    total_elapsed = sum(e for _, _, e in timing_rows)
    print("─" * 65)
    print(f" {'TOTAL':<35} {sum(c for _,c,_ in timing_rows):>7,} "
          f"{_fmt_duration(total_elapsed):>8}")
    print("\nDone.")


if __name__ == "__main__":
    main()
@@ -4,13 +4,19 @@ audiobook_nem.py
Generate the Book of the Nem audiobook — one unique voice per book/section.

Usage:
    python audiobook_nem.py
    python create_audiobook_nem.py               # all enabled books
    python create_audiobook_nem.py --list        # list available book labels
    python create_audiobook_nem.py Introduction
    python create_audiobook_nem.py "Book of Hagoth"
    python create_audiobook_nem.py Introduction "Book of Hagoth"

To skip a section, comment out its entry in BOOKS below.
To permanently skip a section, comment out its entry in BOOKS below.
Output .wav files are written to OUTPUT_DIR (created automatically).
"""

import argparse
import re
import time
import numpy as np
import soundfile as sf
import torch
@@ -27,8 +33,12 @@ SPEED = 1.0
LANG_CODE = "a"  # 'a' = American English

# ── Available Kokoro voices (American English, lang_code='a') ──────────────────
# af_heart  – warm American female [downloaded]
# af_bella  – American female [downloaded]
# af_heart  – warm American female [downloaded]
# af_nicole – American female [downloaded]
# af_river  – American female [downloaded]
# af_sarah  – American female [downloaded]
# af_sky    – American female [downloaded]
# am_adam   – American male (deep) [downloaded]
# am_echo   – American male [downloaded]
# am_eric   – American male [downloaded]
@@ -40,30 +50,30 @@ LANG_CODE = "a"  # 'a' = American English
# am_santa  – American male [downloaded] (not used)

# ── Book definitions ───────────────────────────────────────────────────────────
# Format: (label, start_marker, voice, output_wav)
# start_marker – exact text of the FIRST line of the section header in the source
#                (leading/trailing whitespace is ignored when matching)
# Format: (label, (start_line1, start_line2), voice, output_wav)
# start_line1 – exact text of the FIRST line of the section header
# start_line2 – prefix of the SECOND line (used together for unambiguous matching)
# voice       – Kokoro voice name
# output_wav  – filename saved inside OUTPUT_DIR
#
# Comment out any line to skip that section entirely.
BOOKS = [
    # label                  start_marker                 voice         output_wav
    ("Introduction",           "Introduction",              "af_heart",   "00_introduction.wav"),
    ("Book of Hagoth",         "THE BOOK OF HAGOTH",        "am_fenrir",  "01_hagoth.wav"),
    ("Shi-Tugo I",             "THE FIRST BOOK OF SHI-TUGO", "am_eric",   "02_shi_tugo_1.wav"),
    ("Sanempet",               "THE BOOK OF SANEMPET",      "am_liam",    "03_sanempet.wav"),
    ("Oug",                    "THE BOOK OF OUG",           "am_michael", "04_oug.wav"),
    ("Temple Writings of Oug", "THE BOOK OF",               "am_michael", "05_temple_writings_oug.wav"),
    ("Sacred Temple Writings", "THE SACRED",                "am_michael", "06_sacred_temple_writings.wav"),
    ("Samuel the Lamanite I",  "THE FIRST BOOK",            "am_echo",    "07_samuel_lamanite_1.wav"),
    ("Samuel the Lamanite II", "THE SECOND BOOK",           "am_echo",    "08_samuel_lamanite_2.wav"),
    ("Manti",                  "THE BOOK OF MANTI",         "am_onyx",    "09_manti.wav"),
    ("Pa Nat I",               "THE FIRST BOOK OF PA NAT",  "af_nicole",  "10_pa_nat_1.wav"),
    ("Moroni I",               "THE FIRST BOOK OF MORONI",  "am_adam",    "11_moroni_1.wav"),
    ("Moroni II",              "THE SECOND BOOK OF MORONI", "am_adam",    "12_moroni_2.wav"),
    ("Moroni III",             "THE THIRD BOOK OF MORONI",  "am_adam",    "13_moroni_3.wav"),
    ("Shioni",                 "THE BOOK OF SHIONI",        "am_puck",    "14_shioni.wav"),
    # label                  (start_line1, start_line2)                               voice         output_wav
    ("Introduction",           ("Introduction", "The Book of the Nem"),                 "af_heart",   "00_introduction.wav"),
    ("Book of Hagoth",         ("THE BOOK OF HAGOTH", "THE SON OF HAGMENI,"),           "am_santa",   "01_hagoth.wav"),
    ("Shi-Tugo I",             ("THE FIRST BOOK OF SHI-TUGO", "FORMER WARRIOR, AMMONITE"), "am_eric", "02_shi_tugo_1.wav"),
    ("Sanempet",               ("THE BOOK OF SANEMPET", "THE SON OF HAGMENI,"),         "am_liam",    "03_sanempet.wav"),
    ("Oug",                    ("THE BOOK OF OUG", "THE SON OF SANEMPET"),              "am_michael", "04_oug.wav"),
    ("Temple Writings of Oug", ("THE BOOK OF", "THE TEMPLE WRITINGS"),                  "am_michael", "05_temple_writings_oug.wav"),
    ("Sacred Temple Writings", ("THE SACRED", "TEMPLE WRITINGS"),                       "am_michael", "06_sacred_temple_writings.wav"),
    ("Samuel the Lamanite I",  ("THE FIRST BOOK", "OF SAMUEL THE LAMANITE"),            "am_echo",    "07_samuel_lamanite_1.wav"),
    ("Samuel the Lamanite II", ("THE SECOND BOOK", "OF SAMUEL THE LAMANITE"),           "am_echo",    "08_samuel_lamanite_2.wav"),
    ("Manti",                  ("THE BOOK OF MANTI", "THE SON OF OUG"),                 "am_onyx",    "09_manti.wav"),
    ("Pa Nat I",               ("THE FIRST BOOK OF PA NAT", "THE DAUGHTER OF SHIMLEI"), "af_bella",   "10_pa_nat_1.wav"),
    ("Moroni I",               ("THE FIRST BOOK OF MORONI", "THE SON OF MORMON,"),      "am_adam",    "11_moroni_1.wav"),
    ("Moroni II",              ("THE SECOND BOOK OF MORONI", "THE SON OF MORMON,"),     "am_adam",    "12_moroni_2.wav"),
    ("Moroni III",             ("THE THIRD BOOK OF MORONI", "THE SON OF MORMON,"),      "am_adam",    "13_moroni_3.wav"),
    ("Shioni",                 ("THE BOOK OF SHIONI", "THE SON OF MORONI"),             "am_puck",    "14_shioni.wav"),
]

# ── Helpers ────────────────────────────────────────────────────────────────────
@@ -71,23 +81,36 @@ BOOKS = [

def load_and_split(source: Path, books: list) -> dict[str, str]:
    """
    Read the source file and split it into sections keyed by label.
    Each section starts at its start_marker line and ends just before the
    next section's start_marker.
    Each section starts at its (start_line1, start_line2) marker pair and
    ends just before the next section's marker.

    Marker positions are always detected from the *original* unmodified file
    (_ORIG_FILE) when it exists, so that phonetic fixes applied to section
    headings in the TTS-fixed file can never break section detection. The
    line numbers are identical in both files because word-level replacements
    never add or remove lines.
    """
    raw_lines = source.read_text(encoding="utf-8").splitlines()
    # Use the original (un-fixed) file for marker detection so phonetic
    # changes to heading lines don't break matching.
    marker_source = _ORIG_FILE if _ORIG_FILE.exists() else source
    marker_lines = marker_source.read_text(encoding="utf-8").splitlines()

    # Build a mapping: marker_text → index in BOOKS
    markers = [(label, marker.strip()) for label, marker, _, _ in books]
    # The content to actually return comes from `source` (may be fixed file).
    content_lines = source.read_text(encoding="utf-8").splitlines()

    # Find the line index of each marker's first occurrence
    # Build a mapping: (label, line1, line2) for each book
    markers = [(label, m[0].strip(), m[1].strip()) for label, m, _, _ in books]

    # Find the line index of each marker's first occurrence (two-line match)
    marker_positions: list[tuple[int, int]] = []  # (line_idx, books_idx)
    for book_idx, (label, marker) in enumerate(markers):
        for line_idx, line in enumerate(raw_lines):
            if line.strip() == marker:
    for book_idx, (label, m1, m2) in enumerate(markers):
        for line_idx, line in enumerate(marker_lines[:-1]):
            if (line.strip().upper() == m1.upper() and
                    marker_lines[line_idx + 1].strip().upper().startswith(m2.upper())):
                marker_positions.append((line_idx, book_idx))
                break
        else:
            print(f" ⚠ Marker not found for '{label}': '{marker}' — skipping")
            print(f" ⚠ Marker not found for '{label}': '{m1}' / '{m2}' — skipping")

    marker_positions.sort(key=lambda x: x[0])

@@ -97,8 +120,8 @@ def load_and_split(source: Path, books: list) -> dict[str, str]:
        if rank + 1 < len(marker_positions):
            end_line = marker_positions[rank + 1][0]
        else:
            end_line = len(raw_lines)
        text = "\n".join(raw_lines[line_idx:end_line]).strip()
            end_line = len(content_lines)
        text = "\n".join(content_lines[line_idx:end_line]).strip()
        sections[label] = text

    return sections
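The new two-line marker match pairs an exact (case-insensitive) first line with a prefix test on the second line, which disambiguates headers like "THE FIRST BOOK" that recur across sections. A standalone sketch of that matching rule, with a made-up document:

```python
def find_marker(lines: list[str], l1: str, l2: str) -> int:
    """Index of the first line equal to l1 (case-insensitive) whose
    successor starts with l2; -1 if no such pair exists."""
    for i, line in enumerate(lines[:-1]):
        if (line.strip().upper() == l1.upper() and
                lines[i + 1].strip().upper().startswith(l2.upper())):
            return i
    return -1

doc = ["intro", "THE FIRST BOOK", "OF SAMUEL THE LAMANITE, verse 1", "text"]
print(find_marker(doc, "The First Book", "of samuel"))  # → 1
```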
@ -118,8 +141,21 @@ def clean_text(text: str) -> str:
|
||||
return text.strip()
|
||||
|
||||
|
||||
def _fmt_duration(seconds: float) -> str:
|
||||
"""Format seconds as 'Xh Ym Zs', 'Xm Ys', or 'Xs'."""
|
||||
h, rem = divmod(int(seconds), 3600)
|
||||
m, s = divmod(rem, 60)
|
||||
if h > 0:
|
||||
return f"{h}h {m:02d}m {s:02d}s"
|
||||
if m > 0:
|
||||
return f"{m}m {s:02d}s"
|
||||
return f"{s}s"
|
||||
|
||||
|
def generate_audio(pipeline: KPipeline, text: str, voice: str,
                   output_path: Path) -> None:
                   output_path: Path) -> float:
    """Generate audio and return wall-clock seconds elapsed."""
    t0 = time.monotonic()
    chunks = []
    for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
        if hasattr(chunk_audio, "numpy"):
@@ -131,15 +167,55 @@ def generate_audio(pipeline: KPipeline, text: str, voice: str,
    if chunks:
        audio = np.concatenate(chunks, axis=0)
        sf.write(str(output_path), audio, SAMPLE_RATE)
        elapsed = time.monotonic() - t0
        duration = len(audio) / SAMPLE_RATE
        print(f" ✓ Saved '{output_path.name}' ({duration:.1f}s)")
        print(f" ✓ Saved '{output_path.name}' ({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)")
    else:
        elapsed = time.monotonic() - t0
        print(f" ✗ No audio produced for voice='{voice}'")
    return elapsed


# ── Main ───────────────────────────────────────────────────────────────────────

def main() -> None:
    # ── CLI ────────────────────────────────────────────────────────────
    parser = argparse.ArgumentParser(description="Generate Nem audiobook sections.")
    parser.add_argument(
        "books", nargs="*",
        help="Labels of sections to generate (default: all enabled books). "
             "Use --list to see available labels."
    )
    parser.add_argument(
        "--list", action="store_true",
        help="Print all enabled book labels and exit."
    )
    parser.add_argument(
        "--preview", nargs="?", const=3000, type=int, metavar="CHARS",
        help="Generate a short preview clip per book (default: 3000 chars). "
             "Output filenames get a _preview suffix."
    )
    args = parser.parse_args()

    enabled_labels = [label for label, _, _, _ in BOOKS]

    if args.list:
        print("Enabled books:")
        for label in enabled_labels:
            print(f"  {label}")
        return

    # Filter to requested subset, preserving BOOKS order
    if args.books:
        unknown = [b for b in args.books if b not in enabled_labels]
        if unknown:
            print(f"Unknown book label(s): {', '.join(unknown)}")
            print("Run with --list to see available labels.")
            return
        run_books = [b for b in BOOKS if b[0] in args.books]
    else:
        run_books = list(BOOKS)

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Device: {device}")
    if device == "cuda":
@@ -150,25 +226,95 @@ def main() -> None:
    print(f"\nSource: '{SOURCE_FILE}'"
          + (" ✓ (TTS fixed)" if SOURCE_FILE == _FIXED_FILE else
             " ⚠ (original — run 'Apply Fixes to Text' in the GUI to use phonetic fixes)"))
    # Always split using ALL books for correct section boundaries,
    # but only generate for run_books.
    sections = load_and_split(SOURCE_FILE, BOOKS)
    print(f" Found {len(sections)} sections.\n")
    print(f" Found {len(sections)} sections ({len(run_books)} selected).\n")

    print("Initialising Kokoro pipeline …")
    pipeline = KPipeline(lang_code=LANG_CODE)

    for label, marker, voice, wav_name in BOOKS:
        if label not in sections:
            continue  # marker was not found; warning already printed
    # Pre-compute char counts for all sections so we can estimate ETAs
    section_chars: dict[str, int] = {
        label: len(clean_text(sections[label]))
        for label, _, _, _ in run_books
        if label in sections
    }

        print(f"\n[{label}] voice={voice} → {wav_name}")
        text = clean_text(sections[label])
        if not text:
            print(" ⚠ Empty text — skipping")
    # Print char count summary before starting
    preview_note = f" ⚡ PREVIEW MODE — capped at {args.preview:,} chars/book\n" if args.preview else ""
    print(f"\n{preview_note}{'─' * 52}")
    print(f" {'Section':<30} {'Chars':>8}")
    print(f"{'─' * 52}")
    for label, _, _, wav_name in run_books:
        if label in section_chars:
            print(f" {label:<30} {section_chars[label]:>8,}")
    print(f"{'─' * 52}")
    total_chars = sum(section_chars.values())
    print(f" {'TOTAL':<30} {total_chars:>8,}")
    print()

    chars_per_sec: float | None = None  # derived from the first book that finishes
    timing_rows: list[tuple[str, int, float]] = []  # (label, chars, elapsed)

    for label, _marker, voice, wav_name in run_books:
        if label not in sections:
            continue

        out_path = OUTPUT_DIR / wav_name
        generate_audio(pipeline, text, voice, out_path)
        text = clean_text(sections[label])
        if not text:
            print(f"\n[{label}] ⚠ Empty text — skipping")
            continue

        # Preview mode: truncate to requested char limit at a word boundary
        preview_chars = args.preview
        if preview_chars:
            if len(text) > preview_chars:
                cut = text.rfind(" ", 0, preview_chars)
                text = text[: cut if cut > 0 else preview_chars]

        chars = len(text)

        # Print ETA once we have a calibration rate
        if chars_per_sec is not None:
            eta_sec = chars / chars_per_sec
            eta_str = _fmt_duration(eta_sec)
            print(f"\n[{label}] voice={voice} → {wav_name} (est. {eta_str})")
        else:
            print(f"\n[{label}] voice={voice} → {wav_name} (timing calibration run)")

        stem, ext = wav_name.rsplit(".", 1)
        preview_tag = "_preview" if preview_chars else ""
        out_path = OUTPUT_DIR / f"{stem}_{voice}{preview_tag}.{ext}"
        elapsed = generate_audio(pipeline, text, voice, out_path)
        timing_rows.append((label, chars, elapsed))

        # Update calibration as a cumulative average after every book
        total_chars_done = sum(c for _, c, _ in timing_rows)
        total_elapsed_done = sum(e for _, _, e in timing_rows)
        if total_elapsed_done > 0:
            chars_per_sec = total_chars_done / total_elapsed_done
            remaining = total_chars - total_chars_done
            eta_overall = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
            print(f" ⏱ Speed: {chars_per_sec:.0f} chars/sec | Est. overall remaining: {eta_overall}")

    # ── Summary ────────────────────────────────────────────────────────────────
    print("\n" + "─" * 60)
    print(f" {'Section':<30} {'Chars':>7} {'Actual':>8} {'Est':>8}")
    print("─" * 60)
    for i, (label, chars, elapsed) in enumerate(timing_rows):
        actual_str = _fmt_duration(elapsed)
        # Estimate using the cumulative rate *before* this book was added
        prior_chars = sum(c for _, c, _ in timing_rows[:i])
        prior_elapsed = sum(e for _, _, e in timing_rows[:i])
        if prior_elapsed > 0:
            est_str = _fmt_duration(chars / (prior_chars / prior_elapsed))
        else:
            est_str = "(first run)"
        print(f" {label:<30} {chars:>7,} {actual_str:>8} {est_str:>8}")
    total_elapsed = sum(e for _, _, e in timing_rows)
    print("─" * 60)
    print(f" {'TOTAL':<30} {sum(c for _,c,_ in timing_rows):>7,} {_fmt_duration(total_elapsed):>8}")
    print("\nDone.")

352 create_temple_voices.py Normal file
@@ -0,0 +1,352 @@
"""
create_temple_voices.py
────────────────────────
Generate the "Sacred Temple Writings" section of the Nem audiobook using one
distinct Microsoft Edge neural TTS voice per character (NOT Kokoro).

Uses the free edge-tts library which streams Microsoft Azure neural voices.
Audio is stitched into a single WAV and saved to OUTPUT_DIR.

Usage:
    python create_temple_voices.py                   # full render
    python create_temple_voices.py --preview 40      # first 40 segments only
    python create_temple_voices.py --print-segments  # inspect parsed segments
    python create_temple_voices.py --list-voices     # list available en voices

Voice assignments live in CHARACTER_VOICES below — easy to customise.
Run --list-voices to discover all available edge-tts voice names.
"""

import argparse
import asyncio
import re
import subprocess
import time
from collections import Counter
from pathlib import Path

import numpy as np
import soundfile as sf
import edge_tts

# ── File / output config ───────────────────────────────────────────────────────
_FIXED_FILE = Path("Audio Master Nem Full (TTS Fixed).txt")
_ORIG_FILE = Path("Audio Master Nem Full.txt")
SOURCE_FILE = _FIXED_FILE if _FIXED_FILE.exists() else _ORIG_FILE

OUTPUT_DIR = Path("output_temple_voices")
OUTPUT_FILE = "sacred_temple_writings_multivoice.wav"

SAMPLE_RATE = 24_000   # Hz — final WAV sample rate
PAUSE_SAME = 350       # ms silence between same-speaker segments
PAUSE_CHANGE = 650     # ms silence between different-speaker segments

# ── Section boundary markers (match create_audiobook_nem.py BOOKS order) ──────
# Sacred Temple Writings starts at "THE SACRED" / "TEMPLE WRITINGS"
# and ends just before "THE FIRST BOOK" / "OF SAMUEL THE LAMANITE"
_SEC_START_L1 = "THE SACRED"
_SEC_START_L2 = "TEMPLE WRITINGS"
_SEC_END_L1 = "THE FIRST BOOK"
_SEC_END_L2 = "OF SAMUEL THE LAMANITE"

# ── Character → edge-tts voice ────────────────────────────────────────────────
# Run python create_temple_voices.py --list-voices to see all available voices.
# Keys must match the speaker labels exactly as they appear in the source file.
CHARACTER_VOICES: dict[str, str] = {
    # ── Celestial beings ───────────────────────────────────────────────────────
    "Narrator": "en-US-GuyNeural",                               # calm neutral narrator
    "Elohim Heavenly Mother": "en-US-JennyNeural",               # warm, wise matriarch
    "Elohim Heavenly Father": "en-US-AndrewMultilingualNeural",  # expressive, authoritative
    "Jehovah": "en-US-AndrewNeural",                             # clear, gentle divine
    "Angel of the Lord": "en-US-BrianNeural",                    # ethereal divine messenger
    "Holy Ghost": "en-US-EricNeural",                            # quiet, inward, spiritual
    "Holy Ghost Elders": "en-US-BrianNeural",                    # measured elder council

    # ── Dark beings ────────────────────────────────────────────────────────────
    "Lucifer": "en-CA-LiamNeural",                               # smooth, persuasive tempter
    "Satan": "en-US-SteffanNeural",                              # cold, commanding adversary

    # ── Mortal / earth characters ──────────────────────────────────────────────
    "Michael": "en-US-RogerNeural",                              # noble warrior archangel
    "Adam": "en-US-ChristopherNeural",                           # earnest first man
    "Eve": "en-US-AriaNeural",                                   # curious, warm first woman

    # ── Apostles ───────────────────────────────────────────────────────────────
    "Peter": "en-GB-RyanNeural",                                 # firm British apostle
    "James": "en-AU-WilliamMultilingualNeural",                  # steady Australian voice
    "John": "en-IE-ConnorNeural",                                # gentle Irish apostle

    # ── Other roles ────────────────────────────────────────────────────────────
    "Preacher": "en-US-AvaNeural",                               # bold emphatic preacher
    "Mob": "en-US-MichelleNeural",                               # crowd / multitude voice
    "The Voice of the Mob": "en-US-MichelleNeural",              # alias used in some editions
}

# Voice used when a speaker label isn't found in CHARACTER_VOICES
FALLBACK_VOICE = "en-US-GuyNeural"

# Lines/patterns that are ceremony stage-directions → read by Narrator
_STAGE_NARRATOR = re.compile(
    r"^(Break for Instruction|Resume Session|All\s+arise|"
    r"CHAPTER\s*\d*|________________+|────+)",
    re.IGNORECASE,
)

# Lines to skip entirely (decorative / empty)
_SKIP_RE = re.compile(r"^[—\-_\s\u2014\u2013]*$")


# ── Section extraction ─────────────────────────────────────────────────────────

def extract_section(source: Path) -> str:
    """Return text of the Sacred Temple Writings section."""
    lines = source.read_text(encoding="utf-8").splitlines()
    in_sec = False
    out: list[str] = []

    for i, line in enumerate(lines):
        s = line.strip()
        if not in_sec:
            if (s.upper() == _SEC_START_L1 and
                    i + 1 < len(lines) and
                    lines[i + 1].strip().upper().startswith(_SEC_START_L2)):
                in_sec = True
        else:
            # End just before the next section
            if (s.upper() == _SEC_END_L1 and
                    i + 1 < len(lines) and
                    lines[i + 1].strip().upper().startswith(_SEC_END_L2)):
                break
            out.append(line)

    if not out:
        raise RuntimeError(
            f"Could not locate 'Sacred Temple Writings' in '{source}'.\n"
            "Ensure the source file has a line exactly matching "
            f"'{_SEC_START_L1}' followed by '{_SEC_START_L2}'."
        )
    return "\n".join(out)

# ── Segment parser ─────────────────────────────────────────────────────────────

def _speaker_regex(characters: list[str]) -> re.Pattern:
    """Regex matching [optional-number] CharacterName: text"""
    # Sort longest-first so "Holy Ghost Elders" matches before "Holy Ghost"
    names = sorted(characters, key=len, reverse=True)
    pat = "|".join(re.escape(n) for n in names)
    return re.compile(r"^\d*\s*(" + pat + r")\s*:\s*(.*)", re.IGNORECASE)


def parse_segments(text: str) -> list[tuple[str, str]]:
    """
    Convert section text into a list of (normalised_speaker, spoken_text) tuples.
    Non-attributed prose becomes Narrator lines.
    """
    char_re = _speaker_regex(list(CHARACTER_VOICES.keys()))

    # Build a quick lowercase→canonical lookup for speaker name normalisation
    canon: dict[str, str] = {k.lower(): k for k in CHARACTER_VOICES}

    segments: list[tuple[str, str]] = []
    cur_speaker = "Narrator"
    buf: list[str] = []

    def flush() -> None:
        combined = " ".join(l.strip() for l in buf if l.strip())
        if combined:
            segments.append((cur_speaker, combined))
        buf.clear()

    for raw in text.splitlines():
        line = raw.strip()

        if not line or _SKIP_RE.match(line):
            continue

        # Stage direction → Narrator reads it
        if _STAGE_NARRATOR.match(line):
            flush()
            cur_speaker = "Narrator"
            buf.append(line)
            continue

        # "The words of Jehovah … are in blue." — formatting note, skip
        if re.search(r"are in blue|words of jehovah", line, re.IGNORECASE):
            continue

        m = char_re.match(line)
        if m:
            flush()
            raw_name = m.group(1)
            cur_speaker = canon.get(raw_name.lower(), raw_name)
            spoken = m.group(2).strip()
            if spoken:
                buf.append(spoken)
        else:
            # Continuation of current speaker (or unattributed narrator prose)
            buf.append(line)

    flush()
    return segments
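The longest-first sort in `_speaker_regex` matters because regex alternation is first-match-wins: with "Holy Ghost" listed before "Holy Ghost Elders", the shorter name would always win on its shared prefix. A standalone sketch with two of the labels:

```python
import re

def speaker_regex(names):
    # Longest-first so "Holy Ghost Elders" is tried before the "Holy Ghost" prefix.
    pat = "|".join(re.escape(n) for n in sorted(names, key=len, reverse=True))
    return re.compile(r"^\d*\s*(" + pat + r")\s*:\s*(.*)", re.IGNORECASE)

rx = speaker_regex(["Holy Ghost", "Holy Ghost Elders"])
m = rx.match("12 Holy Ghost Elders: Hear us.")
print(m.group(1), "|", m.group(2))  # → Holy Ghost Elders | Hear us.
```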


# ── Audio generation ───────────────────────────────────────────────────────────

async def _tts_bytes(text: str, voice: str) -> bytes:
    """Stream edge-tts and return raw MP3 bytes."""
    communicate = edge_tts.Communicate(text, voice)
    data = bytearray()
    async for chunk in communicate.stream():
        if chunk["type"] == "audio":
            data.extend(chunk["data"])
    return bytes(data)


def _mp3_to_numpy(mp3: bytes) -> np.ndarray:
    """Decode MP3 bytes → mono float32 numpy array at SAMPLE_RATE using ffmpeg."""
    cmd = [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",           # read MP3 from stdin
        "-f", "f32le",            # raw 32-bit little-endian float PCM
        "-acodec", "pcm_f32le",
        "-ac", "1",               # mono
        "-ar", str(SAMPLE_RATE),  # resample to target rate
        "pipe:1",                 # write PCM to stdout
    ]
    result = subprocess.run(cmd, input=mp3, capture_output=True, check=True)
    return np.frombuffer(result.stdout, dtype=np.float32).copy()


def _silence(ms: int) -> np.ndarray:
    return np.zeros(int(SAMPLE_RATE * ms / 1000), dtype=np.float32)
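`_silence` converts a gap in milliseconds to a sample count at the configured rate. The arithmetic on its own, without the numpy buffer:

```python
SAMPLE_RATE = 24_000  # Hz, matching the script's config

def silence_samples(ms: int) -> int:
    """Milliseconds of silence → number of zero-valued samples at SAMPLE_RATE."""
    return int(SAMPLE_RATE * ms / 1000)

print(silence_samples(350), silence_samples(650))  # → 8400 15600
```

So the 350 ms same-speaker gap is 8,400 samples and the 650 ms speaker-change gap is 15,600.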


async def render(
    segments: list[tuple[str, str]],
    preview: int | None = None,
) -> np.ndarray:
    """Generate and stitch all segment audio; return concatenated float32 array."""
    if preview is not None:
        segments = segments[:preview]

    parts: list[np.ndarray] = []
    last_speaker: str | None = None
    t0 = time.monotonic()

    for idx, (speaker, text) in enumerate(segments, 1):
        voice = CHARACTER_VOICES.get(speaker, FALLBACK_VOICE)
        marker = "⚠" if speaker not in CHARACTER_VOICES else " "
        print(f" {marker}[{idx:>4}/{len(segments)}] {speaker:<28} {voice}")

        try:
            mp3 = await _tts_bytes(text, voice)
        except Exception as exc:
            print(f"   ↳ ERROR with '{voice}': {exc} — falling back to {FALLBACK_VOICE}")
            mp3 = await _tts_bytes(text, FALLBACK_VOICE)

        audio = _mp3_to_numpy(mp3)

        if parts:
            gap = PAUSE_SAME if speaker == last_speaker else PAUSE_CHANGE
            parts.append(_silence(gap))
        parts.append(audio)
        last_speaker = speaker

    elapsed = time.monotonic() - t0
    print(f"\n ✓ {len(segments)} segments in {elapsed:.0f}s")
    return np.concatenate(parts) if parts else np.array([], dtype=np.float32)
||||
|
||||
|
||||
# ── Voice listing ──────────────────────────────────────────────────────────────

async def _list_voices_async() -> None:
    voices = await edge_tts.list_voices()
    english = sorted(
        (v for v in voices if v["Locale"].startswith("en-")),
        key=lambda v: (v["Locale"], v["ShortName"]),
    )
    print(f"\n  {'Locale':<12} {'Short Name':<45} Gender")
    print("  " + "─" * 68)
    for v in english:
        print(f"  {v['Locale']:<12} {v['ShortName']:<45} {v['Gender']}")
    print(f"\n  {len(english)} English voices total.")


# ── CLI / main ─────────────────────────────────────────────────────────────────

def main() -> None:
    ap = argparse.ArgumentParser(
        description="Render Sacred Temple Writings with per-character edge-tts voices."
    )
    ap.add_argument("--list-voices", action="store_true",
                    help="Print all available English edge-tts voices and exit.")
    ap.add_argument("--print-segments", action="store_true",
                    help="Print parsed (speaker, text) segments and exit.")
    ap.add_argument("--preview", type=int, metavar="N",
                    help="Render only the first N segments (quick test).")
    args = ap.parse_args()

    if args.list_voices:
        asyncio.run(_list_voices_async())
        return

    # ── Extract & parse ────────────────────────────────────────────────────────
    print(f"Source : {SOURCE_FILE}")
    text = extract_section(SOURCE_FILE)
    print(f"Section: {len(text):,} chars extracted\n")

    segments = parse_segments(text)

    if args.print_segments:
        print(f"Parsed {len(segments)} segments:\n")
        for i, (spkr, txt) in enumerate(segments, 1):
            snippet = txt[:90] + ("…" if len(txt) > 90 else "")
            voice = CHARACTER_VOICES.get(spkr, f"{FALLBACK_VOICE} ⚠")
            print(f"  {i:>4}. [{spkr}] ({voice})\n        {snippet}\n")
        return

    # ── Summary table ──────────────────────────────────────────────────────────
    counts = Counter(s for s, _ in segments)
    unrecognised = {s for s in counts if s not in CHARACTER_VOICES}

    print(f"Parsed {len(segments)} segments across {len(counts)} speakers:\n")
    print(f"  {'Speaker':<28} {'Segs':>5}  {'Voice'}")
    print(f"  {'─'*28} {'─'*5}  {'─'*45}")
    for spkr, voice in CHARACTER_VOICES.items():
        if counts[spkr]:
            print(f"  {spkr:<28} {counts[spkr]:>5}  {voice}")
    for spkr in sorted(unrecognised):
        print(f"  {spkr:<28} {counts[spkr]:>5}  {FALLBACK_VOICE} ⚠ unrecognised")

    total_chars = sum(len(t) for _, t in segments)
    print(f"\n  Total chars: {total_chars:,}")
    if args.preview:
        print(f"  ⚡ PREVIEW MODE — rendering first {args.preview} segments only")

    # ── GPU note ───────────────────────────────────────────────────────────────
    # edge-tts is cloud-based (Microsoft Azure neural, free) — GPU not used.
    print("\nNote: edge-tts uses Microsoft's servers (free, no API key needed).\n"
          "      Render speed depends on your internet connection.\n")

    # ── Render ─────────────────────────────────────────────────────────────────
    OUTPUT_DIR.mkdir(exist_ok=True)
    out_path = OUTPUT_DIR / (
        f"sacred_temple_writings_preview{args.preview}.wav"
        if args.preview else OUTPUT_FILE
    )

    print("Rendering segments …\n")
    audio = asyncio.run(render(segments, args.preview))

    if audio.size > 0:
        sf.write(str(out_path), audio, SAMPLE_RATE)
        dur = len(audio) / SAMPLE_RATE
        m, s = divmod(int(dur), 60)
        print(f"\n✓ Saved '{out_path}' ({m}m {s:02d}s audio | {SAMPLE_RATE} Hz)")
    else:
        print("✗ No audio produced — check parsing with --print-segments")


if __name__ == "__main__":
    main()
@@ -18,6 +18,25 @@ from collections import defaultdict
 from pathlib import Path
 
 import spacy
+from wordfreq import top_n_list
+
+# ── Top 10 000 most-frequent English words ──────────────────────────
+TOP_10K_ENGLISH: frozenset[str] = frozenset(top_n_list("en", 10_000))
+
+# Words in the top-10k list that are genuine proper nouns in this text —
+# keep them despite the frequency filter.
+PROPER_NOUN_WHITELIST: frozenset[str] = frozenset({
+    # Biblical names
+    "aaron", "abel", "abraham", "adam", "cain", "eden", "egypt",
+    "elijah", "ephraim", "eve", "gad", "ham", "isaac", "israel",
+    "jacob", "james", "jehovah", "john", "joseph", "judah",
+    "laban", "lehi", "levi", "micah", "michael", "moses", "noah",
+    "peter", "pharaoh", "samuel", "sarah", "sarai", "seth", "simeon",
+    "timothy", "zion",
+    # Book-specific names that happen to match English words
+    "alma", "ether", "gideon", "limhi", "mormon", "moroni", "mulek",
+    "mosiah", "nephi", "satan", "sidon",
+})
 
 SOURCE = Path("Audio Master Nem Full.txt")
 OUTPUT = Path("proper_nouns.txt")
@@ -35,12 +54,29 @@ ORG_LABELS = {"ORG", "NORP"}
 OTHER_LABELS = {"EVENT", "WORK_OF_ART", "LAW", "PRODUCT", "LANGUAGE"}
 
-# All-caps lines are section headers, not spoken names — skip them.
-# Also skip very short tokens that are likely artefacts.
-SKIP_PATTERNS = re.compile(
-    r"^(THE|A|AN|AND|OF|IN|TO|FOR|BY|AT|IS|WAS|BE|HE|SHE|IT|"
-    r"CHAPTER|VERSE|YEA|BEHOLD|LORD|GOD|CHRIST|HOLY|GHOST)$"
-)
+# ── Noise filters ──────────────────────────────────────────────────────────────
+# Common English words that should be dropped when splitting multi-word entities.
+STOP_WORDS: set[str] = {
+    "A", "AN", "AND", "AS", "AT", "BE", "BUT", "BY",
+    "DO", "DID", "DOTH",
+    "EVEN", "FOR", "FROM",
+    "HAD", "HAS", "HAVE", "HATH", "HE", "HER", "HIS", "HOW",
+    "I", "IN", "IS", "IT", "ITS",
+    "MAY", "ME", "MORE", "MY",
+    "NAY", "NO", "NOT", "NOW",
+    "OF", "OR", "OUR",
+    "SHALL", "SHE", "SO", "SOME",
+    "THAT", "THE", "THEE", "THEIR", "THEN", "THERE", "THESE", "THEY",
+    "THIS", "THOSE", "THOU", "THUS", "THY", "TO",
+    "UP", "UPON", "US",
+    "WAS", "WE", "WHEN", "WHERE", "WHICH", "WHO", "WILL", "WITH",
+    "YE", "YEA", "YET", "YOU", "YOUR",
+    # Book-specific common words not worth flagging
+    "BEHOLD", "CHAPTER", "CHRIST", "GOD", "GHOST", "HOLY", "LORD", "VERSE",
+    # Generic nouns that slip through NER
+    "CITY", "DAYS", "DAY", "GREAT", "LAND", "MAN", "MEN", "NEW",
+    "PEOPLE", "SON", "TIME",
+}
 
 
 def is_noise(text: str) -> bool:
     t = text.strip()
@@ -48,9 +84,12 @@ def is_noise(text: str) -> bool:
         return True
     if t.isupper() and len(t) > 4:  # all-caps section header word
         return True
-    if SKIP_PATTERNS.match(t.upper()):
+    if t.upper() in STOP_WORDS:
         return True
-    if re.search(r"[^a-zA-Z\-' ]", t):  # contains digits or symbols
+    if re.search(r"[^a-zA-Z\-']", t):   # contains digits, spaces, or symbols
         return True
+    # Drop common English words (no hyphens) unless whitelisted as proper nouns.
+    if "-" not in t and t.lower() in TOP_10K_ENGLISH and t.lower() not in PROPER_NOUN_WHITELIST:
+        return True
     return False
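The new frequency filter can be exercised in isolation; a minimal sketch with a stand-in word list (the real script draws `TOP_10K_ENGLISH` from wordfreq's `top_n_list`, and the whitelist is far larger):

```python
TOP_10K_ENGLISH = {"the", "city", "great", "israel"}  # stand-in for wordfreq data
PROPER_NOUN_WHITELIST = {"israel"}                    # frequent words kept as names


def is_common(word: str) -> bool:
    """True when a hyphen-free word is frequent English and not whitelisted."""
    w = word.lower()
    return "-" not in word and w in TOP_10K_ENGLISH and w not in PROPER_NOUN_WHITELIST


print(is_common("City"))      # → True  (dropped as a common word)
print(is_common("Israel"))    # → False (whitelisted proper noun)
print(is_common("Nephihah"))  # → False (not a frequent English word)
```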
@@ -60,6 +99,11 @@ def canonical(text: str) -> str:
     return " ".join(text.split()).title()
 
 
+def split_words(phrase: str) -> list[str]:
+    """Split a phrase on spaces; hyphenated words are kept as one token."""
+    return phrase.split()
+
+
 # ── Read and process ───────────────────────────────────────────────────────────
 print(f"Reading '{SOURCE}' …")
 raw_text = SOURCE.read_text(encoding="utf-8")
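The hyphen behaviour of `split_words` falls out of plain `str.split`, which only breaks on whitespace:

```python
def split_words(phrase: str) -> list[str]:
    """Split a phrase on spaces; hyphenated words stay as one token."""
    return phrase.split()


print(split_words("Anti-Nephi-Lehi And Sidon"))  # → ['Anti-Nephi-Lehi', 'And', 'Sidon']
```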
@@ -71,20 +115,23 @@ doc = nlp(raw_text)
 buckets: dict[str, set[str]] = defaultdict(set)
 
 # 1. NER pass — trust spaCy's entity labels
+#    Multi-word entities (e.g. "Peter James John") are split into individual
+#    words; hyphenated words (e.g. "Anti-Nephi-Lehi") stay as one token.
 for ent in doc.ents:
-    name = canonical(ent.text)
-    if is_noise(name):
-        continue
-    if ent.label_ in PERSON_LABELS:
-        buckets["People & Characters"].add(name)
-    elif ent.label_ in PLACE_LABELS:
-        buckets["Places & Lands"].add(name)
-    elif ent.label_ in ORG_LABELS:
-        buckets["Groups & Nations"].add(name)
-    elif ent.label_ in OTHER_LABELS:
-        buckets["Other Named Things"].add(name)
-    else:
-        buckets["Other Named Things"].add(name)
+    phrase = canonical(ent.text)
+    for word in split_words(phrase):
+        if is_noise(word):
+            continue
+        if ent.label_ in PERSON_LABELS:
+            buckets["People & Characters"].add(word)
+        elif ent.label_ in PLACE_LABELS:
+            buckets["Places & Lands"].add(word)
+        elif ent.label_ in ORG_LABELS:
+            buckets["Groups & Nations"].add(word)
+        elif ent.label_ in OTHER_LABELS:
+            buckets["Other Named Things"].add(word)
+        else:
+            buckets["Other Named Things"].add(word)
 
 # 2. PROPN pass — catch names spaCy didn't recognise as entities
 #    Only include tokens that are inside a sentence (not at position 0)
@@ -97,13 +144,13 @@ for token in doc:
         continue  # skip all-caps
     if token.i == token.sent.start:
         continue  # skip sentence-initial (could be any word)
-    name = canonical(text)
-    if is_noise(name):
+    word = canonical(text)
+    if is_noise(word):
         continue
     # Only add if not already captured by NER
-    already_captured = any(name in s for s in buckets.values())
+    already_captured = any(word in s for s in buckets.values())
     if not already_captured:
-        buckets["Unclassified Proper Nouns"].add(name)
+        buckets["Unclassified Proper Nouns"].add(word)
 
 # ── Write output ───────────────────────────────────────────────────────────────
 GROUP_ORDER = [
801 format_scripture.py Normal file
@@ -0,0 +1,801 @@
#!/usr/bin/env python3
"""
format_scripture.py
═══════════════════
Convert the Book of the Nem plain-text file into two scripture-style PDFs:

    nem_kindle.pdf – single-column, sized for e-readers (4.5" × 6.5")
    nem_paper.pdf  – two-column, Book of Mormon style (5.5" × 8.5")

Requirements (Debian/Ubuntu):
    sudo apt-get install texlive-latex-extra texlive-fonts-recommended

The key packages used are:
    extsizes  – for 9 pt document class (paper format)
    tgpagella – TeX Gyre Pagella (Palatino-clone) font
    multicol  – two-column layout without hard page breaks
    microtype – improved text justification and hyphenation
    fancyhdr  – running headers and footers
    needspace – prevent orphaned headings

Usage:
    python format_scripture.py
    python format_scripture.py --input "Audio Master Nem Full.txt"
    python format_scripture.py --kindle-only
    python format_scripture.py --paper-only
    python format_scripture.py --output-dir ./pdfs
    python format_scripture.py --keep-tex      # keep .tex files for debugging
"""

import argparse
import re
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# ── Default paths ──────────────────────────────────────────────────────────────
INPUT_FILE = Path("Audio Master Nem Full.txt")
OUTPUT_DIR = Path("output_pdf")

# ══════════════════════════════════════════════════════════════════════════════
# LaTeX helper
# ══════════════════════════════════════════════════════════════════════════════

_LATEX_TRANS = str.maketrans({
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
    "\u2014": "---",        # em dash
    "\u2013": "--",         # en dash
    "\u2018": "`",          # left single quote
    "\u2019": "'",          # right single quote
    "\u201c": "``",         # left double quote
    "\u201d": "''",         # right double quote
    "\u2026": r"\ldots{}",  # ellipsis
    "\u00e9": r"\'e",
    "\u00e8": r"\`e",
    "\u00ea": r"\^e",
    "\u00e0": r"\`a",
    "\u00e2": r"\^a",
    "\u00f3": r"\'o",
    "\u00ed": r"\'{\i}",
})


def esc(text: str) -> str:
    """Escape special LaTeX characters in a string."""
    return text.translate(_LATEX_TRANS)

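`str.maketrans` accepts a dict keyed by single characters, each mapping to a replacement string, so one `translate` call applies every escape at once; a trimmed illustration of the table above:

```python
# Trimmed version of the translation table for demonstration
_TRANS = str.maketrans({
    "&": r"\&", "%": r"\%", "#": r"\#", "_": r"\_",
    "\u2014": "---",  # em dash → LaTeX em dash
})


def esc(text: str) -> str:
    return text.translate(_TRANS)


print(esc("Rock & Roll — 100%"))  # → Rock \& Roll --- 100\%
```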
# ══════════════════════════════════════════════════════════════════════════════
# Document element types
# ══════════════════════════════════════════════════════════════════════════════

@dataclass
class TitlePage:
    lines: list


@dataclass
class BookHeader:
    """One or more heading lines that introduce a new book/section."""
    lines: list  # list of str


@dataclass
class Chapter:
    num: int
    subtitle: Optional[str] = None


@dataclass
class SectionHeading:
    """Short heading within a chapter (e.g. MARRIAGE, BAPTISM)."""
    text: str


@dataclass
class Verse:
    num: int
    text: str


@dataclass
class Paragraph:
    text: str


# ══════════════════════════════════════════════════════════════════════════════
# Parser
# ══════════════════════════════════════════════════════════════════════════════

_RE_VERSE = re.compile(r"^\s*(\d+)\s+(.*)")
_RE_CHAPTER = re.compile(r"^\s*CHAPTER\s+(\d+)\s*$", re.IGNORECASE)
_RE_DIVIDER = re.compile(r"^_{4,}")

# Lines longer than this are treated as body paragraphs rather than headings
MAX_HEADING_LEN = 120


def _is_verse(line: str) -> bool:
    """Line starts with a verse number followed by text."""
    m = _RE_VERSE.match(line)
    return bool(m) and int(m.group(1)) > 0


def _is_chapter(line: str) -> bool:
    return bool(_RE_CHAPTER.match(line.strip()))


def _is_divider(line: str) -> bool:
    return bool(_RE_DIVIDER.match(line.strip()))


def _is_allcaps(line: str) -> bool:
    s = line.strip()
    return bool(s) and s == s.upper() and any(c.isalpha() for c in s)
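A quick check of how the line classifiers above carve up input (same regexes, reproduced here so the snippet stands alone):

```python
import re

_RE_VERSE = re.compile(r"^\s*(\d+)\s+(.*)")
_RE_CHAPTER = re.compile(r"^\s*CHAPTER\s+(\d+)\s*$", re.IGNORECASE)
_RE_DIVIDER = re.compile(r"^_{4,}")


def classify(line: str) -> str:
    """Rough line classification mirroring the parser's precedence."""
    if _RE_DIVIDER.match(line.strip()):
        return "divider"
    if _RE_CHAPTER.match(line.strip()):
        return "chapter"
    m = _RE_VERSE.match(line)
    if m and int(m.group(1)) > 0:
        return "verse"
    return "other"


print(classify("____________"))            # → divider
print(classify("Chapter 3"))               # → chapter (case-insensitive)
print(classify("12 And it came to pass"))  # → verse
print(classify("MARRIAGE"))                # → other (all-caps subheading)
```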
def parse(text: str) -> list:
    """Parse the scripture text into a list of Element objects."""
    lines = text.splitlines()
    elements = []
    n = len(lines)
    i = 0

    # ── Title page: short lines before the first divider ──────────────────────
    # Short lines (≤80 chars) are the actual title. Long prose before the first
    # divider is ignored so it does not duplicate the later labeled Introduction.
    title_lines = []
    while i < n and not _is_divider(lines[i]):
        title_lines.append(lines[i])
        i += 1
    actual_title = []
    for l in title_lines:
        s = l.strip()
        if not s:
            continue
        if len(s) <= 80:
            actual_title.append(s)
    if actual_title:
        elements.append(TitlePage(lines=actual_title))

    # ── Main pass ─────────────────────────────────────────────────────────────
    after_divider = False

    while i < n:
        raw = lines[i]
        line = raw.strip()

        # ── Divider ───────────────────────────────────────────────────────────
        if _is_divider(raw):
            after_divider = True
            i += 1
            continue

        # ── Blank line ────────────────────────────────────────────────────────
        if not line:
            i += 1
            continue

        # ── After a divider: collect section/book header ───────────────────
        # Collect all short non-verse non-chapter lines immediately following
        # the divider. Stop as soon as we hit a long prose line or body content.
        if after_divider:
            after_divider = False
            header_lines = []
            j = i
            while j < n:
                s = lines[j].strip()
                if not s:  # blank: keep scanning
                    j += 1
                    continue
                if _is_verse(lines[j]) or _is_chapter(lines[j]):
                    break  # reached verse/chapter body
                if len(s) > MAX_HEADING_LEN:
                    break  # long prose line: stop here
                header_lines.append(s)
                j += 1
            if header_lines:
                elements.append(BookHeader(lines=header_lines))
                i = j
                continue

        # ── Chapter heading ────────────────────────────────────────────────
        m = _RE_CHAPTER.match(line)
        if m:
            num = int(m.group(1))
            # Look ahead for an optional subtitle (short non-verse line)
            j = i + 1
            subtitle = None
            while j < n and not lines[j].strip():
                j += 1
            if j < n:
                ns = lines[j].strip()
                if (ns
                        and not _is_verse(lines[j])
                        and not _is_chapter(lines[j])
                        and not _is_divider(lines[j])
                        and len(ns) <= MAX_HEADING_LEN):
                    subtitle = ns
                    i = j + 1
                else:
                    i += 1
            else:
                i += 1
            elements.append(Chapter(num=num, subtitle=subtitle))
            continue

        # ── All-caps lines: either a BookHeader cluster or a SectionHeading ─
        # If the cluster of consecutive all-caps lines is followed (after any
        # blanks) by a CHAPTER heading, treat the whole cluster as a BookHeader.
        # Otherwise treat only the first line as a SectionHeading.
        if _is_allcaps(line) and len(line) <= MAX_HEADING_LEN and not _is_verse(raw):
            # Gather consecutive all-caps lines (blanks skipped)
            j = i
            caps_block = []
            while j < n:
                s = lines[j].strip()
                if not s:
                    j += 1
                    continue
                if (_is_allcaps(s)
                        and len(s) <= MAX_HEADING_LEN
                        and not _is_verse(lines[j])
                        and not _is_chapter(lines[j])
                        and not _is_divider(lines[j])):
                    caps_block.append(s)
                    j += 1
                else:
                    break
            # Look past any blanks to see if a chapter heading follows
            k = j
            while k < n and not lines[k].strip():
                k += 1
            if k < n and _is_chapter(lines[k]):
                # This cluster is a book/section header
                elements.append(BookHeader(lines=caps_block))
                i = j
            else:
                # Single inline section subheading (MARRIAGE, BAPTISM, etc.)
                elements.append(SectionHeading(text=caps_block[0] if caps_block else line))
                i = i + 1
            continue

        # ── Verse ─────────────────────────────────────────────────────────
        if _is_verse(raw):
            mfull = _RE_VERSE.match(raw)
            elements.append(Verse(num=int(mfull.group(1)), text=mfull.group(2).strip()))
            i += 1
            continue

        # ── Paragraph ─────────────────────────────────────────────────────
        elements.append(Paragraph(text=line))
        i += 1

    return elements


# ══════════════════════════════════════════════════════════════════════════════
# LaTeX generation
# ══════════════════════════════════════════════════════════════════════════════

_PREAMBLE_SHARED = r"""
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{tgpagella}
\usepackage{microtype}
\usepackage{fancyhdr}
\usepackage{needspace}
\setlength{\headheight}{14pt}
\addtolength{\topmargin}{-2pt}
\usepackage[hidelinks]{hyperref}
"""


def _hrule() -> str:
    return r"\noindent\rule{\linewidth}{0.3pt}"

# ── Kindle (single-column, e-reader sized) ────────────────────────────────────

def build_kindle_latex(elements: list) -> str:
    """Build a single-column LaTeX document sized for e-readers."""
    out = []
    # extarticle (from extsizes) gives us 11pt; plain article also supports it
    out.append(r"\documentclass[11pt]{extarticle}")
    out.append(r"""
\usepackage[paperwidth=4.5in,paperheight=6.5in,
            top=0.08in,bottom=0.5in,
            inner=0.42in,outer=0.38in,
            headheight=12pt,headsep=6pt,
            includehead]{geometry}""")
    out.append(_PREAMBLE_SHARED)
    out.append(r"""
\pagestyle{fancy}
\fancyhf{}
\fancyhead[C]{\small\itshape\nouppercase{\leftmark}}
\fancyfoot[C]{\small\thepage}
\renewcommand{\headrulewidth}{0.3pt}

\setlength{\parindent}{0pt}
\setlength{\parskip}{3pt plus 1pt minus 1pt}

\begin{document}
""")
    # Handle title page separately so we can insert TOC after it
    title_els = [e for e in elements if isinstance(e, TitlePage)]
    body_els = [e for e in elements if not isinstance(e, TitlePage)]
    if title_els:
        out.append(r"\clearpage")
        out.append(r"\thispagestyle{empty}")
        out.append(r"\vspace*{1.3in}")
        out.append(r"\begin{center}")
        for j, tl in enumerate(title_els[0].lines):
            s = tl.strip()
            if not s:
                continue
            if j < 3:
                out.append(r"{\LARGE\bfseries " + esc(s) + r"} \\[8pt]")
            else:
                out.append(r"{\large " + esc(s) + r"} \\[4pt]")
        out.append(r"\end{center}")
        out.append(r"\clearpage")
    out.append(r"\renewcommand{\contentsname}{Table of Contents}")
    out.append(r"\tableofcontents")
    out.append(r"\clearpage")
    _emit_elements(out, body_els, kindle=True)
    out.append(r"\end{document}")
    return "\n".join(out)


# ── Paper / BOM style (two-column) ────────────────────────────────────────────

def build_paper_latex(elements: list) -> str:
    """Build a two-column, Book of Mormon-style LaTeX document."""
    out = []
    # extarticle (from extsizes) for 9pt support
    out.append(r"\documentclass[9pt,twoside]{extarticle}")
    out.append(r"""
\usepackage[paperwidth=5.5in,paperheight=8.5in,
            top=0.08in,bottom=0.55in,
            inner=0.5in,outer=0.42in,
            headheight=10pt,headsep=5pt,
            includehead]{geometry}""")
    out.append(_PREAMBLE_SHARED)
    out.append(r"""
\usepackage{multicol}
\setlength{\columnsep}{0.22in}
\setlength{\columnseprule}{0.3pt}

\pagestyle{fancy}
\fancyhf{}
\fancyhead[LE]{\footnotesize\itshape\nouppercase{\leftmark}}
\fancyhead[RO]{\footnotesize\itshape\nouppercase{\rightmark}}
\fancyfoot[C]{\scriptsize\thepage}
\renewcommand{\headrulewidth}{0.3pt}

\setlength{\parindent}{0pt}
\setlength{\parskip}{1pt}

\begin{document}
""")

    # Emit the title page outside multicols (single-column block)
    title_els = [e for e in elements if isinstance(e, TitlePage)]
    body_els = [e for e in elements if not isinstance(e, TitlePage)]

    if title_els:
        out.append(r"\begin{center}")
        for j, tl in enumerate(title_els[0].lines):
            s = tl.strip()
            if not s:
                continue
            if j < 3:
                out.append(r"{\large\bfseries " + esc(s) + r"} \\[3pt]")
            else:
                out.append(r"{\small " + esc(s) + r"} \\[1pt]")
        out.append(r"\end{center}")
        out.append(r"\medskip")

    out.append(r"\renewcommand{\contentsname}{Table of Contents}")
    out.append(r"\tableofcontents")
    out.append(r"\clearpage")

    # Skip any leading front-matter paragraphs before the first section header.
    # For paper output, the intro should begin at the labeled "Introduction"
    # section rather than repeating the pre-divider prose block.
    first_section = next(
        (i for i, el in enumerate(body_els) if isinstance(el, BookHeader)),
        len(body_els),
    )
    paper_body_els = body_els[first_section:]

    # Split intro (before first real book) from main body.
    # A "real book" is a BookHeader that is followed by at least one Chapter
    # before the next BookHeader. "Introduction" and similar preamble sections
    # are BookHeaders too but have no chapters, so they stay in the intro.
    first_book = len(paper_body_els)
    for i, el in enumerate(paper_body_els):
        if isinstance(el, BookHeader):
            # Check if a Chapter follows before the next BookHeader
            for j in range(i + 1, len(paper_body_els)):
                if isinstance(paper_body_els[j], Chapter):
                    first_book = i
                    break
                if isinstance(paper_body_els[j], BookHeader):
                    break
            if first_book < len(paper_body_els):
                break
    intro_els = paper_body_els[:first_book]
    main_els = paper_body_els[first_book:]

    if intro_els:
        _emit_elements(out, intro_els, kindle=True, compact_headers=True)
        out.append(r"\clearpage")

    out.append(r"\begin{multicols}{2}")
    _emit_elements(out, main_els, kindle=False)
    out.append(r"\end{multicols}")
    out.append(r"\end{document}")
    return "\n".join(out)


# ── Body emitter ──────────────────────────────────────────────────────────────

def _emit_elements(
    out: list,
    elements: list,
    kindle: bool,
    indent: bool = False,
    compact_headers: bool = False,
) -> None:
    """Translate parsed Element objects into LaTeX markup."""

    for el in elements:

        # ── Title page (kindle only; paper handles it before multicols) ──────
        if isinstance(el, TitlePage):
            if kindle:
                out.append(r"\clearpage")
                out.append(r"\thispagestyle{empty}")
                out.append(r"\vspace*{1.3in}")
                out.append(r"\begin{center}")
                for j, tl in enumerate(el.lines):
                    s = tl.strip()
                    if not s:
                        continue
                    if j < 3:
                        out.append(r"{\LARGE\bfseries " + esc(s) + r"} \\[8pt]")
                    else:
                        out.append(r"{\large " + esc(s) + r"} \\[4pt]")
                out.append(r"\end{center}")
                out.append(r"\clearpage")

        # ── Book / section header ────────────────────────────────────────────
        elif isinstance(el, BookHeader):
            lines = el.lines

            if kindle:
                # Start a new page for each major book
                out.append(r"\clearpage")
                out.append(r"\phantomsection\addcontentsline{toc}{section}{" + esc(lines[0]) + r"}")
                out.append(r"\vspace*{0pt}" if compact_headers else r"\vspace*{0.1in}")
                out.append(r"\begin{center}")
                out.append(_hrule())
                out.append(r"\\[6pt]")
                out.append(r"{\bfseries\large " + esc(lines[0]) + r"}")
                for ln in lines[1:]:
                    out.append(r"\\ [3pt]{\normalsize\itshape " + esc(ln) + r"}")
                out.append(r"\\[6pt]")
                out.append(_hrule())
                out.append(r"\end{center}")
                out.append(r"\markboth{" + esc(lines[0]) + r"}{" + esc(lines[0]) + r"}")
                out.append(r"\vspace{5pt}")

            else:
                # Inline heading within the two-column flow
                # Refuse to start a new book in the bottom half of a column
                out.append(r"\needspace{0.5\textheight}")
                out.append(r"\phantomsection\addcontentsline{toc}{section}{" + esc(lines[0]) + r"}")
                out.append(r"\begin{center}")
                out.append(_hrule())
                out.append(r"\\[2pt]")
                out.append(r"{\bfseries " + esc(lines[0]) + r"}")
                for ln in lines[1:]:
                    out.append(r"\\ {\small\itshape " + esc(ln) + r"}")
                out.append(r"\\[2pt]")
                out.append(_hrule())
                out.append(r"\end{center}")
                out.append(r"\markboth{" + esc(lines[0]) + r"}{" + esc(lines[0]) + r"}")
                out.append(r"\vspace{2pt}")

        # ── Chapter heading ──────────────────────────────────────────────────
        elif isinstance(el, Chapter):
            label = f"CHAPTER {el.num}"

            if kindle:
                out.append(r"\phantomsection\addcontentsline{toc}{subsection}{" + esc(label) + r"}")
                out.append(r"\needspace{4\baselineskip}")
                out.append(r"\vspace{14pt}")
                out.append(r"\begin{center}")
                out.append(r"{\bfseries\large " + esc(label) + r"}")
                if el.subtitle:
                    out.append(r"\\ [3pt]{\normalsize\itshape " + esc(el.subtitle) + r"}")
                out.append(r"\end{center}")
                out.append(r"\markright{" + esc(label) + r"}")
                out.append(r"\vspace{6pt}")

            else:
                out.append(r"\phantomsection\addcontentsline{toc}{subsection}{" + esc(label) + r"}")
                out.append(r"\needspace{2\baselineskip}")
                out.append(r"\vspace{3pt}")
                out.append(r"\begin{center}")
                out.append(r"{\bfseries " + esc(label) + r"}")
                if el.subtitle:
                    out.append(r"\\ {\small\itshape " + esc(el.subtitle) + r"}")
                out.append(r"\end{center}")
                out.append(r"\markright{" + esc(label) + r"}")
                out.append(r"\vspace{1pt}")

        # ── Section subheading (MARRIAGE, BAPTISM, etc.) ────────────────────
        elif isinstance(el, SectionHeading):
            if kindle:
                out.append(r"\vspace{8pt}")
                out.append(r"\begin{center}{\bfseries " + esc(el.text) + r"}\end{center}")
                out.append(r"\vspace{4pt}")
            else:
                out.append(r"\vspace{3pt}")
                out.append(
                    r"\begin{center}{\bfseries\small " + esc(el.text) + r"}\end{center}"
                )
                out.append(r"\vspace{1pt}")

        # ── Verse ────────────────────────────────────────────────────────────
        elif isinstance(el, Verse):
            body = esc(el.text)
            if kindle:
                # Bold inline number (not superscript) for readability on screen
                vnum = r"\textbf{" + str(el.num) + r"}"
                out.append(r"\noindent " + vnum + r"~" + body)
                out.append(r"\par\smallskip")
            else:
                vnum = r"\textbf{" + str(el.num) + r"}"
                out.append(r"\noindent " + vnum + r"~" + body + r"\par")

        # ── Paragraph (prose intro, commentary, etc.) ───────────────────────
        elif isinstance(el, Paragraph):
            body = esc(el.text)
            if kindle:
                out.append(r"\noindent " + body)
                out.append(r"\par\smallskip")
            elif indent:
                out.append(body + r"\par\medskip")
            else:
                out.append(r"\noindent " + body + r"\par")


# ══════════════════════════════════════════════════════════════════════════════
# Utility: book limiter
# ══════════════════════════════════════════════════════════════════════════════

def truncate_to_books(elements: list, max_books: int) -> list:
    """Return only the first *max_books* BookHeader sections (and their content).

    Title-page and front-matter paragraphs before the first BookHeader are
    always kept.
    """
    if max_books <= 0:
        return elements
    count = 0
    result = []
    for el in elements:
        if isinstance(el, BookHeader):
            count += 1
            if count > max_books:
                break
        result.append(el)
    return result


# ══════════════════════════════════════════════════════════════════════════════
|
||||
# PDF compilation
|
||||
# ══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
def _find_compiler() -> tuple:
|
||||
"""Return (compiler_path, compiler_type) or (None, None) if none found."""
|
||||
import shutil
|
||||
# Also probe common absolute paths in case the dir isn't on $PATH
|
||||
candidates = {
|
||||
"pdflatex": ["/usr/bin/pdflatex", "/usr/local/bin/pdflatex"],
|
||||
"tectonic": ["/usr/bin/tectonic", "/usr/local/bin/tectonic"],
|
||||
}
|
||||
for cmd, extra_paths in candidates.items():
|
||||
found = shutil.which(cmd)
|
||||
if found:
|
||||
return found, cmd
|
||||
for p in extra_paths:
|
||||
if Path(p).exists():
|
||||
return p, cmd
|
||||
return None, None
|
||||
|
||||
|
||||
def compile_pdf(tex_src: str, output_pdf: Path,
|
||||
keep_tex: bool = False,
|
||||
compiler_path: str = "/usr/bin/pdflatex",
|
||||
compiler_type: str = "pdflatex") -> bool:
|
||||
"""
|
||||
Write *tex_src* into a temp directory, run the LaTeX compiler, and copy
|
||||
the resulting PDF to *output_pdf*. Supports ``pdflatex`` and ``tectonic``.
|
||||
Returns True on success.
|
||||
"""
|
||||
with tempfile.TemporaryDirectory() as tmp:
|
||||
tmp_path = Path(tmp)
|
||||
tex_file = tmp_path / "document.tex"
|
||||
tex_file.write_text(tex_src, encoding="utf-8")
|
||||
|
||||
if compiler_type == "tectonic":
|
||||
# Tectonic compiles in one pass and downloads missing packages.
|
||||
passes = 1
|
||||
cmd_base = [compiler_path, "document.tex"]
|
||||
else:
|
||||
# pdflatex needs two passes to get page headers right.
|
||||
passes = 2
|
||||
cmd_base = [compiler_path, "-interaction=nonstopmode",
|
||||
"-halt-on-error", "document.tex"]
|
||||
|
||||
for pass_num in range(1, passes + 1):
|
||||
result = subprocess.run(
|
||||
cmd_base, cwd=tmp, capture_output=True, text=True,
|
||||
)
|
||||
if result.returncode != 0:
|
||||
print(f" [compiler error on pass {pass_num}]", file=sys.stderr)
|
||||
print(result.stdout[-3000:], file=sys.stderr)
|
||||
if result.stderr:
|
||||
print(result.stderr[-1000:], file=sys.stderr)
|
||||
if keep_tex:
|
||||
dest = output_pdf.with_suffix(".tex")
|
||||
dest.write_text(tex_src, encoding="utf-8")
|
||||
print(f" TeX source saved to: {dest}", file=sys.stderr)
|
||||
return False
|
||||
|
||||
pdf_out = tmp_path / "document.pdf"
|
||||
if pdf_out.exists():
|
||||
output_pdf.parent.mkdir(parents=True, exist_ok=True)
|
||||
output_pdf.write_bytes(pdf_out.read_bytes())
|
||||
if keep_tex:
|
||||
dest = output_pdf.with_suffix(".tex")
|
||||
dest.write_text(tex_src, encoding="utf-8")
|
||||
return True
|
||||
|
||||
print(" [compiler ran but document.pdf was not produced]", file=sys.stderr)
|
||||
return False
|
||||
|
||||
|
||||
# ══════════════════════════════════════════════════════════════════════════════
|
||||
# Main
|
||||
# ══════════════════════════════════════════════════════════════════════════════
|
||||
|
||||
_INSTALL_INSTRUCTIONS = """
|
||||
No LaTeX compiler found. Install one of the following:
|
||||
|
||||
Arch / CachyOS / Manjaro:
|
||||
sudo pacman -S texlive-basic texlive-latex texlive-latexrecommended \\
|
||||
texlive-latexextra texlive-fontsrecommended
|
||||
|
||||
Debian / Ubuntu:
|
||||
sudo apt-get install texlive-latex-extra texlive-fonts-recommended
|
||||
|
||||
--- OR --- (self-contained, downloads packages on first use)
|
||||
sudo pacman -S tectonic
|
||||
# or: cargo install tectonic
|
||||
"""
|
||||
|
||||
|
||||
def main():
|
||||
parser = argparse.ArgumentParser(
|
||||
description="Generate scripture-style PDFs from the Book of the Nem text.",
|
||||
formatter_class=argparse.RawDescriptionHelpFormatter,
|
||||
epilog=__doc__,
|
||||
)
|
||||
parser.add_argument(
|
||||
"--input", type=Path, default=INPUT_FILE,
|
||||
help=f"Input plain-text file (default: {INPUT_FILE})",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--output-dir", type=Path, default=OUTPUT_DIR,
|
||||
help=f"Output directory (default: {OUTPUT_DIR})",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--kindle-only", action="store_true",
|
||||
help="Generate only the Kindle (single-column) PDF.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--paper-only", action="store_true",
|
||||
help="Generate only the paper (two-column) PDF.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--keep-tex", action="store_true",
|
||||
help="Save the intermediate .tex files alongside each PDF.",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--max-books", type=int, default=0, metavar="N",
|
||||
help="Limit output to the first N book sections (0 = no limit).",
|
||||
)
|
||||
parser.add_argument(
|
||||
"--tex-only", action="store_true",
|
||||
help="Write .tex files only — do not attempt PDF compilation. "
|
||||
"Useful when a LaTeX compiler is not available.",
|
||||
)
|
||||
args = parser.parse_args()
|
||||
|
||||
src_path: Path = args.input
|
||||
if not src_path.exists():
|
||||
sys.exit(f"ERROR: Input file not found: {src_path}")
|
||||
|
||||
print(f"Reading: {src_path}")
|
||||
text = src_path.read_text(encoding="utf-8", errors="replace")
|
||||
|
||||
elements = parse(text)
|
||||
if args.max_books > 0:
|
||||
elements = truncate_to_books(elements, args.max_books)
|
||||
print(f" Limiting to first {args.max_books} book(s).")
|
||||
books = sum(1 for e in elements if isinstance(e, BookHeader))
|
||||
chapters = sum(1 for e in elements if isinstance(e, Chapter))
|
||||
verses = sum(1 for e in elements if isinstance(e, Verse))
|
||||
print(f" Parsed: {books} books/sections, {chapters} chapters, {verses} verses")
|
||||
|
||||
out_dir: Path = args.output_dir
|
||||
out_dir.mkdir(parents=True, exist_ok=True)
|
||||
|
||||
# Locate compiler (unless --tex-only)
|
||||
compiler_path, compiler_type = None, None
|
||||
if not args.tex_only:
|
||||
compiler_path, compiler_type = _find_compiler()
|
||||
if not compiler_path:
|
||||
print(_INSTALL_INSTRUCTIONS, file=sys.stderr)
|
||||
print("Falling back to --tex-only mode: .tex files will be written "
|
||||
"but not compiled.", file=sys.stderr)
|
||||
args.tex_only = True
|
||||
else:
|
||||
print(f" Using compiler: {compiler_path}")
|
||||
|
||||
def _write_or_compile(tex: str, pdf_path: Path, label: str):
|
||||
if args.tex_only or args.keep_tex:
|
||||
tex_path = pdf_path.with_suffix(".tex")
|
||||
tex_path.write_text(tex, encoding="utf-8")
|
||||
print(f" ✓ TeX saved: {tex_path}")
|
||||
if args.tex_only:
|
||||
return
|
||||
print(f" Compiling {label} PDF …")
|
||||
ok = compile_pdf(tex, pdf_path, keep_tex=args.keep_tex,
|
||||
compiler_path=compiler_path,
|
||||
compiler_type=compiler_type)
|
||||
if ok:
|
||||
print(f" ✓ {pdf_path}")
|
||||
else:
|
||||
print(f" ✗ {label} PDF failed — see errors above.")
|
||||
|
||||
# ── Kindle PDF ────────────────────────────────────────────────────────────
|
||||
if not args.paper_only:
|
||||
print(f"\nKindle PDF (single-column, 4.5\"×6.5\") …")
|
||||
tex = build_kindle_latex(elements)
|
||||
_write_or_compile(tex, out_dir / "nem_phone.pdf", "Kindle")
|
||||
|
||||
# ── Paper / BOM-style PDF ────────────────────────────────────────────────
|
||||
if not args.kindle_only:
|
||||
print(f"\nPaper PDF (two-column BOM style, 5.5\"×8.5\") …")
|
||||
tex = build_paper_latex(elements)
|
||||
_write_or_compile(tex, out_dir / "nem_paper.pdf", "Paper")
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
main()
|
||||
File diff suppressed because it is too large
@ -0,0 +1,778 @@
|
||||
{
|
||||
"Aaagast": "aaagast.wav",
|
||||
"Abby": "abby.wav",
|
||||
"Abigail": "abigail.wav",
|
||||
"Abodey": "abodey.wav",
|
||||
"Abriyyah": "abriyyah.wav",
|
||||
"Abyss": "abyss.wav",
|
||||
"Adamantine": "adamantine.wav",
|
||||
"Addobes": "addobes.wav",
|
||||
"Adobbes": "adobbes.wav",
|
||||
"Aedrick": "aedrick.wav",
|
||||
"Aegis": "aegis.wav",
|
||||
"Aegrir": "aegrir.wav",
|
||||
"Afire": "afire.wav",
|
||||
"Agatha": "agatha.wav",
|
||||
"Agony": "agony.wav",
|
||||
"Agrarian": "agrarian.wav",
|
||||
"Aheer": "aheer.wav",
|
||||
"Ahman": "ahman.wav",
|
||||
"Ailondel": "ailondel.wav",
|
||||
"Airk": "airk.wav",
|
||||
"Al-Astan": "al_astan.wav",
|
||||
"Alchemist": "alchemist.wav",
|
||||
"Alvrin": "alvrin.wav",
|
||||
"Amarantha": "amarantha.wav",
|
||||
"Amaryllis": "amaryllis.wav",
|
||||
"Ananduil": "ananduil.wav",
|
||||
"Anaudriel": "anaudriel.wav",
|
||||
"Andrahel": "andrahel.wav",
|
||||
"Anhuil": "anhuil.wav",
|
||||
"Anhuil-Ehlar": "anhuil_ehlar.wav",
|
||||
"Anhuil-Elhar": "anhuil_elhar.wav",
|
||||
"Anjeer": "anjeer.wav",
|
||||
"Ankh": "ankh.wav",
|
||||
"Annalise": "annalise.wav",
|
||||
"Anointing": "anointing.wav",
|
||||
"Anoush": "anoush.wav",
|
||||
"Anuil": "anuil.wav",
|
||||
"Anvilhammer": "anvilhammer.wav",
|
||||
"Ara": "ara.wav",
|
||||
"Aragast": "aragast.wav",
|
||||
"Aragst": "aragst.wav",
|
||||
"Aralon": "aralon.wav",
|
||||
"Aran": "aran.wav",
|
||||
"Arans": "arans.wav",
|
||||
"Arashan": "arashan.wav",
|
||||
"Arbiter": "arbiter.wav",
|
||||
"Archmage": "archmage.wav",
|
||||
"Archwizard": "archwizard.wav",
|
||||
"Ardrick": "ardrick.wav",
|
||||
"Argast": "argast.wav",
|
||||
"Armbrook": "armbrook.wav",
|
||||
"Armory": "armory.wav",
|
||||
"Arn": "arn.wav",
|
||||
"Arn-Del": "arn_del.wav",
|
||||
"Asheer": "asheer.wav",
|
||||
"Aske": "aske.wav",
|
||||
"Aster": "aster.wav",
|
||||
"Astor": "astor.wav",
|
||||
"Astral": "astral.wav",
|
||||
"Astride": "astride.wav",
|
||||
"Astute": "astute.wav",
|
||||
"Avery": "avery.wav",
|
||||
"Avorein": "avorein.wav",
|
||||
"Await": "await.wav",
|
||||
"Awww": "awww.wav",
|
||||
"Axehammer": "axehammer.wav",
|
||||
"Ayana": "ayana.wav",
|
||||
"Ayron": "ayron.wav",
|
||||
"Azuremoon": "azuremoon.wav",
|
||||
"Badlands": "badlands.wav",
|
||||
"Baelen": "baelen.wav",
|
||||
"Bah": "bah.wav",
|
||||
"Ballista": "ballista.wav",
|
||||
"Bancroft": "bancroft.wav",
|
||||
"Baras": "baras.wav",
|
||||
"Barek": "barek.wav",
|
||||
"Barge": "barge.wav",
|
||||
"Barrik": "barrik.wav",
|
||||
"Battlelord": "battlelord.wav",
|
||||
"Bazaar": "bazaar.wav",
|
||||
"Bearas": "bearas.wav",
|
||||
"Bearasagain": "bearasagain.wav",
|
||||
"Bearasand": "bearasand.wav",
|
||||
"Bearasasked": "bearasasked.wav",
|
||||
"Bearasat": "bearasat.wav",
|
||||
"Bearasbegan": "bearasbegan.wav",
|
||||
"Bearasbowed": "bearasbowed.wav",
|
||||
"Bearascan": "bearascan.wav",
|
||||
"Bearasdown": "bearasdown.wav",
|
||||
"Bearasemerged": "bearasemerged.wav",
|
||||
"Bearasfelt": "bearasfelt.wav",
|
||||
"Bearasfor": "bearasfor.wav",
|
||||
"Bearashad": "bearashad.wav",
|
||||
"Bearashas": "bearashas.wav",
|
||||
"Bearasheld": "bearasheld.wav",
|
||||
"Bearashesitantly": "bearashesitantly.wav",
|
||||
"Bearasin": "bearasin.wav",
|
||||
"Bearasleading": "bearasleading.wav",
|
||||
"Bearasmust": "bearasmust.wav",
|
||||
"Bearasnodded": "bearasnodded.wav",
|
||||
"Bearasperplexed": "bearasperplexed.wav",
|
||||
"Bearasquickly": "bearasquickly.wav",
|
||||
"Bearasreleased": "bearasreleased.wav",
|
||||
"Bearassaid": "bearassaid.wav",
|
||||
"Bearassat": "bearassat.wav",
|
||||
"Bearassimply": "bearassimply.wav",
|
||||
"Bearasslowly": "bearasslowly.wav",
|
||||
"Bearassome": "bearassome.wav",
|
||||
"Bearasspeaks": "bearasspeaks.wav",
|
||||
"Bearassteeled": "bearassteeled.wav",
|
||||
"Bearasstood": "bearasstood.wav",
|
||||
"Bearasthat": "bearasthat.wav",
|
||||
"Bearasthen": "bearasthen.wav",
|
||||
"Bearasto": "bearasto.wav",
|
||||
"Bearastrailed": "bearastrailed.wav",
|
||||
"Bearaswandered": "bearaswandered.wav",
|
||||
"Bearaswho": "bearaswho.wav",
|
||||
"Bearaswith": "bearaswith.wav",
|
||||
"Beldvorth": "beldvorth.wav",
|
||||
"Belegast": "belegast.wav",
|
||||
"Berstag": "berstag.wav",
|
||||
"Beydell": "beydell.wav",
|
||||
"Blackfeather": "blackfeather.wav",
|
||||
"Blackroot": "blackroot.wav",
|
||||
"Blargh": "blargh.wav",
|
||||
"Bledvorth": "bledvorth.wav",
|
||||
"Blessings": "blessings.wav",
|
||||
"Bloodstone": "bloodstone.wav",
|
||||
"Bloodtone": "bloodtone.wav",
|
||||
"Bogard": "bogard.wav",
|
||||
"Boldar": "boldar.wav",
|
||||
"Bolton": "bolton.wav",
|
||||
"Bon": "bon.wav",
|
||||
"Boomer": "boomer.wav",
|
||||
"Bouldershaun": "bouldershaun.wav",
|
||||
"Boulevarde": "boulevarde.wav",
|
||||
"Brahma": "brahma.wav",
|
||||
"Bramble": "bramble.wav",
|
||||
"Brambleburr": "brambleburr.wav",
|
||||
"Brambleburrs": "brambleburrs.wav",
|
||||
"Branson": "branson.wav",
|
||||
"Bravado": "bravado.wav",
|
||||
"Brax": "brax.wav",
|
||||
"Braz": "braz.wav",
|
||||
"Brazen": "brazen.wav",
|
||||
"Brazenclaw": "brazenclaw.wav",
|
||||
"Brazenclaws": "brazenclaws.wav",
|
||||
"Breeches": "breeches.wav",
|
||||
"Brendan": "brendan.wav",
|
||||
"Brethren": "brethren.wav",
|
||||
"Brickhorn": "brickhorn.wav",
|
||||
"Caldwell": "caldwell.wav",
|
||||
"Calico": "calico.wav",
|
||||
"Caller": "caller.wav",
|
||||
"Camels": "camels.wav",
|
||||
"Canals": "canals.wav",
|
||||
"Captains": "captains.wav",
|
||||
"Caravan": "caravan.wav",
|
||||
"Caswold": "caswold.wav",
|
||||
"Causeway": "causeway.wav",
|
||||
"Cavalier": "cavalier.wav",
|
||||
"Cavern": "cavern.wav",
|
||||
"Cherrytree": "cherrytree.wav",
|
||||
"Chieftain": "chieftain.wav",
|
||||
"Chivalrous": "chivalrous.wav",
|
||||
"Chun": "chun.wav",
|
||||
"Citadel": "citadel.wav",
|
||||
"Clarn": "clarn.wav",
|
||||
"Claw": "claw.wav",
|
||||
"Cleric": "cleric.wav",
|
||||
"Cobblestone": "cobblestone.wav",
|
||||
"Contessa": "contessa.wav",
|
||||
"Corporal": "corporal.wav",
|
||||
"Cotswold": "cotswold.wav",
|
||||
"Councillor": "councillor.wav",
|
||||
"Councilman": "councilman.wav",
|
||||
"Councilmen": "councilmen.wav",
|
||||
"Councilor": "councilor.wav",
|
||||
"Crimson": "crimson.wav",
|
||||
"Crismon": "crismon.wav",
|
||||
"Cylan": "cylan.wav",
|
||||
"Dai": "dai.wav",
|
||||
"Dalthanis": "dalthanis.wav",
|
||||
"Dank": "dank.wav",
|
||||
"Dayr": "dayr.wav",
|
||||
"Dedric": "dedric.wav",
|
||||
"Delgra": "delgra.wav",
|
||||
"Delic": "delic.wav",
|
||||
"Denizen": "denizen.wav",
|
||||
"Denizens": "denizens.wav",
|
||||
"Deric": "deric.wav",
|
||||
"Derrbane": "derrbane.wav",
|
||||
"Derro": "derro.wav",
|
||||
"Derrobane": "derrobane.wav",
|
||||
"Dibble": "dibble.wav",
|
||||
"Diblon": "diblon.wav",
|
||||
"Dire": "dire.wav",
|
||||
"Dis": "dis.wav",
|
||||
"Dobson": "dobson.wav",
|
||||
"Dorian": "dorian.wav",
|
||||
"Dorza": "dorza.wav",
|
||||
"Dragonbane": "dragonbane.wav",
|
||||
"Dragonsbane": "dragonsbane.wav",
|
||||
"Drakor": "drakor.wav",
|
||||
"Draygon": "draygon.wav",
|
||||
"Drefan": "drefan.wav",
|
||||
"Ducan": "ducan.wav",
|
||||
"Duggan": "duggan.wav",
|
||||
"Dulak": "dulak.wav",
|
||||
"Dunca": "dunca.wav",
|
||||
"Dune": "dune.wav",
|
||||
"Dur": "dur.wav",
|
||||
"Dur-Hakan": "dur_hakan.wav",
|
||||
"Durgane": "durgane.wav",
|
||||
"Durthaim": "durthaim.wav",
|
||||
"Durthrim": "durthrim.wav",
|
||||
"Dwarf": "dwarf.wav",
|
||||
"Dwarven": "dwarven.wav",
|
||||
"Earlson": "earlson.wav",
|
||||
"Eastward": "eastward.wav",
|
||||
"Effigius": "effigius.wav",
|
||||
"Ehlar": "ehlar.wav",
|
||||
"El-Ran": "el_ran.wav",
|
||||
"El-Shen": "el_shen.wav",
|
||||
"Elan": "elan.wav",
|
||||
"Elessel": "elessel.wav",
|
||||
"Elf": "elf.wav",
|
||||
"Elhar": "elhar.wav",
|
||||
"Elishan": "elishan.wav",
|
||||
"Eliza": "eliza.wav",
|
||||
"Elliswan": "elliswan.wav",
|
||||
"Elliwsan": "elliwsan.wav",
|
||||
"Elodea": "elodea.wav",
|
||||
"Elshan": "elshan.wav",
|
||||
"Elven": "elven.wav",
|
||||
"Elvenkind": "elvenkind.wav",
|
||||
"Elves": "elves.wav",
|
||||
"Elvrathas": "elvrathas.wav",
|
||||
"Elysium": "elysium.wav",
|
||||
"Emaleen": "emaleen.wav",
|
||||
"Eminence": "eminence.wav",
|
||||
"Emissary": "emissary.wav",
|
||||
"Emporium": "emporium.wav",
|
||||
"Enaru": "enaru.wav",
|
||||
"Endaleth": "endaleth.wav",
|
||||
"Envoy": "envoy.wav",
|
||||
"Eppres": "eppres.wav",
|
||||
"Eradication": "eradication.wav",
|
||||
"Eru": "eru.wav",
|
||||
"Eshela": "eshela.wav",
|
||||
"Ethereal": "ethereal.wav",
|
||||
"Eushon": "eushon.wav",
|
||||
"Eushownava": "eushownava.wav",
|
||||
"Everdark": "everdark.wav",
|
||||
"Everytime": "everytime.wav",
|
||||
"Eylana": "eylana.wav",
|
||||
"Eylanan": "eylanan.wav",
|
||||
"Ezrin": "ezrin.wav",
|
||||
"F-Fine": "f_fine.wav",
|
||||
"F-Forgive": "f_forgive.wav",
|
||||
"Faerie": "faerie.wav",
|
||||
"Fairik": "fairik.wav",
|
||||
"Fargus": "fargus.wav",
|
||||
"Fark": "fark.wav",
|
||||
"Farraj": "farraj.wav",
|
||||
"Farush": "farush.wav",
|
||||
"Feasthall": "feasthall.wav",
|
||||
"Featherstone": "featherstone.wav",
|
||||
"Felaria": "felaria.wav",
|
||||
"Feliq": "feliq.wav",
|
||||
"Felnck": "felnck.wav",
|
||||
"Felnick": "felnick.wav",
|
||||
"Felnicks": "felnicks.wav",
|
||||
"Felnik": "felnik.wav",
|
||||
"Fenaya": "fenaya.wav",
|
||||
"Feneya": "feneya.wav",
|
||||
"Ferrus": "ferrus.wav",
|
||||
"Fey": "fey.wav",
|
||||
"Firebane": "firebane.wav",
|
||||
"Fireshard": "fireshard.wav",
|
||||
"Foomwairma": "foomwairma.wav",
|
||||
"Forger": "forger.wav",
|
||||
"Frandor": "frandor.wav",
|
||||
"Friarsdai": "friarsdai.wav",
|
||||
"Fumairma": "fumairma.wav",
|
||||
"Fumwairma": "fumwairma.wav",
|
||||
"Galantholas": "galantholas.wav",
|
||||
"Galathorn": "galathorn.wav",
|
||||
"Galen": "galen.wav",
|
||||
"Galonti": "galonti.wav",
|
||||
"Garb": "garb.wav",
|
||||
"Gareth": "gareth.wav",
|
||||
"Garvek": "garvek.wav",
|
||||
"Gaunt": "gaunt.wav",
|
||||
"Gavin": "gavin.wav",
|
||||
"Geez": "geez.wav",
|
||||
"Ghurauk": "ghurauk.wav",
|
||||
"Gilandras": "gilandras.wav",
|
||||
"Gilard": "gilard.wav",
|
||||
"Gilchis": "gilchis.wav",
|
||||
"Gilchris": "gilchris.wav",
|
||||
"Gilding": "gilding.wav",
|
||||
"Gilrick": "gilrick.wav",
|
||||
"Glades": "glades.wav",
|
||||
"Glanthalas": "glanthalas.wav",
|
||||
"Glantholas": "glantholas.wav",
|
||||
"Glimmerwyn": "glimmerwyn.wav",
|
||||
"Gloomstone": "gloomstone.wav",
|
||||
"Gnaum": "gnaum.wav",
|
||||
"Gnomish": "gnomish.wav",
|
||||
"Goblinkin": "goblinkin.wav",
|
||||
"Goldsheen": "goldsheen.wav",
|
||||
"Gorath": "gorath.wav",
|
||||
"Gore": "gore.wav",
|
||||
"Gorg": "gorg.wav",
|
||||
"Gorlyn": "gorlyn.wav",
|
||||
"Gorstad": "gorstad.wav",
|
||||
"Gotto": "gotto.wav",
|
||||
"Graces": "graces.wav",
|
||||
"Graffel": "graffel.wav",
|
||||
"Grandmaster": "grandmaster.wav",
|
||||
"Granitestone": "granitestone.wav",
|
||||
"Gratzel": "gratzel.wav",
|
||||
"Graystrom": "graystrom.wav",
|
||||
"Greathaven": "greathaven.wav",
|
||||
"Gregarious": "gregarious.wav",
|
||||
"Gregor": "gregor.wav",
|
||||
"Griffon": "griffon.wav",
|
||||
"Grimbold": "grimbold.wav",
|
||||
"Gripp": "gripp.wav",
|
||||
"Grizzled": "grizzled.wav",
|
||||
"Grog": "grog.wav",
|
||||
"Grogg": "grogg.wav",
|
||||
"Grotto": "grotto.wav",
|
||||
"Gruff": "gruff.wav",
|
||||
"Gruul": "gruul.wav",
|
||||
"Guardarm": "guardarm.wav",
|
||||
"Gustafson": "gustafson.wav",
|
||||
"Guza": "guza.wav",
|
||||
"Gylis": "gylis.wav",
|
||||
"Habani": "habani.wav",
|
||||
"Hagatha": "hagatha.wav",
|
||||
"Hakan": "hakan.wav",
|
||||
"Hallowed": "hallowed.wav",
|
||||
"Halthessala": "halthessala.wav",
|
||||
"Hammerhaft": "hammerhaft.wav",
|
||||
"Har": "har.wav",
|
||||
"Harbrim": "harbrim.wav",
|
||||
"Harbrin": "harbrin.wav",
|
||||
"Hardrock": "hardrock.wav",
|
||||
"Harrik": "harrik.wav",
|
||||
"Hauberk": "hauberk.wav",
|
||||
"Hazards": "hazards.wav",
|
||||
"Headmaster": "headmaster.wav",
|
||||
"Heed": "heed.wav",
|
||||
"Hells": "hells.wav",
|
||||
"Henceforth": "henceforth.wav",
|
||||
"Hendel": "hendel.wav",
|
||||
"Heshbani": "heshbani.wav",
|
||||
"Hesta": "hesta.wav",
|
||||
"Hestra": "hestra.wav",
|
||||
"Heykingygladtomeetyouireallylikeithereitremindsmeofmyhome": "heykingygladtomeetyouireallylikeithereitremindsmeofmyhome.wav",
|
||||
"Highlands": "highlands.wav",
|
||||
"Highlord": "highlord.wav",
|
||||
"Hillsfar": "hillsfar.wav",
|
||||
"Hmmm": "hmmm.wav",
|
||||
"Homecoming": "homecoming.wav",
|
||||
"Horblaster": "horblaster.wav",
|
||||
"Horde": "horde.wav",
|
||||
"Horgard": "horgard.wav",
|
||||
"Hornblade": "hornblade.wav",
|
||||
"Hornblaster": "hornblaster.wav",
|
||||
"Horned": "horned.wav",
|
||||
"Hrumph": "hrumph.wav",
|
||||
"Huen": "huen.wav",
|
||||
"Hylan": "hylan.wav",
|
||||
"Illuminant": "illuminant.wav",
|
||||
"Illuminated": "illuminated.wav",
|
||||
"Illumination": "illumination.wav",
|
||||
"Ilrodel": "ilrodel.wav",
|
||||
"Imp": "imp.wav",
|
||||
"Inquisitor": "inquisitor.wav",
|
||||
"Ironblade": "ironblade.wav",
|
||||
"Ironbound": "ironbound.wav",
|
||||
"Ironguard": "ironguard.wav",
|
||||
"Ironhold": "ironhold.wav",
|
||||
"Ironspear": "ironspear.wav",
|
||||
"Irontree": "irontree.wav",
|
||||
"Iston": "iston.wav",
|
||||
"Jabari": "jabari.wav",
|
||||
"Jabbed": "jabbed.wav",
|
||||
"Jacob": "jacob.wav",
|
||||
"Jad": "jad.wav",
|
||||
"Janson": "janson.wav",
|
||||
"Jasyen": "jasyen.wav",
|
||||
"Jayden": "jayden.wav",
|
||||
"Jaylan": "jaylan.wav",
|
||||
"Jaysen": "jaysen.wav",
|
||||
"Jewel": "jewel.wav",
|
||||
"Jors": "jors.wav",
|
||||
"Jovially": "jovially.wav",
|
||||
"Kaash": "kaash.wav",
|
||||
"Kah": "kah.wav",
|
||||
"Kalzaduum": "kalzaduum.wav",
|
||||
"Karnak": "karnak.wav",
|
||||
"Kaspar": "kaspar.wav",
|
||||
"Kassie": "kassie.wav",
|
||||
"Keldris": "keldris.wav",
|
||||
"Kelshard": "kelshard.wav",
|
||||
"Kelvesh": "kelvesh.wav",
|
||||
"Kelvin": "kelvin.wav",
|
||||
"Kelwane": "kelwane.wav",
|
||||
"Kev": "kev.wav",
|
||||
"Khaki": "khaki.wav",
|
||||
"Kihee": "kihee.wav",
|
||||
"Kihee-Uust": "kihee_uust.wav",
|
||||
"Kiiri": "kiiri.wav",
|
||||
"Kin": "kin.wav",
|
||||
"Kirri": "kirri.wav",
|
||||
"Kisleth": "kisleth.wav",
|
||||
"Knelt": "knelt.wav",
|
||||
"Knight-Corporal": "knight_corporal.wav",
|
||||
"Knight-Lieutenant": "knight_lieutenant.wav",
|
||||
"Knight-Major": "knight_major.wav",
|
||||
"Knight-Sergeant": "knight_sergeant.wav",
|
||||
"Knighthand": "knighthand.wav",
|
||||
"Knighthood": "knighthood.wav",
|
||||
"Knowin": "knowin.wav",
|
||||
"Kodan": "kodan.wav",
|
||||
"Kor": "kor.wav",
|
||||
"Kor-Roth": "kor_roth.wav",
|
||||
"Kordan": "kordan.wav",
|
||||
"Koreth": "koreth.wav",
|
||||
"Korin": "korin.wav",
|
||||
"Kraelheimgar": "kraelheimgar.wav",
|
||||
"Kraven": "kraven.wav",
|
||||
"Kris": "kris.wav",
|
||||
"Krisleth": "krisleth.wav",
|
||||
"Kronlin": "kronlin.wav",
|
||||
"Kudah": "kudah.wav",
|
||||
"Kuerana": "kuerana.wav",
|
||||
"Kunah": "kunah.wav",
|
||||
"Kwenal": "kwenal.wav",
|
||||
"Kyfurn": "kyfurn.wav",
|
||||
"Kylic": "kylic.wav",
|
||||
"Ladell": "ladell.wav",
|
||||
"Laird": "laird.wav",
|
||||
"Leng": "leng.wav",
|
||||
"Lesik": "lesik.wav",
|
||||
"Lightbinger": "lightbinger.wav",
|
||||
"Lightbrigner": "lightbrigner.wav",
|
||||
"Lightbringer": "lightbringer.wav",
|
||||
"Lightbringers": "lightbringers.wav",
|
||||
"Lightrbinger": "lightrbinger.wav",
|
||||
"Liu": "liu.wav",
|
||||
"Lon": "lon.wav",
|
||||
"Lon-Ell": "lon_ell.wav",
|
||||
"Longsword": "longsword.wav",
|
||||
"Lordship": "lordship.wav",
|
||||
"Lumisha": "lumisha.wav",
|
||||
"Lyceum": "lyceum.wav",
|
||||
"Macabress": "macabress.wav",
|
||||
"Madam": "madam.wav",
|
||||
"Magician": "magician.wav",
|
||||
"Magister": "magister.wav",
|
||||
"Magistry": "magistry.wav",
|
||||
"Magorian": "magorian.wav",
|
||||
"Majesties": "majesties.wav",
|
||||
"Maldrood": "maldrood.wav",
|
||||
"Malrood": "malrood.wav",
|
||||
"Manchu": "manchu.wav",
|
||||
"Marches": "marches.wav",
|
||||
"Marlee": "marlee.wav",
|
||||
"Masta": "masta.wav",
|
||||
"Matriarch": "matriarch.wav",
|
||||
"Matriarchs": "matriarchs.wav",
|
||||
"Meknathar": "meknathar.wav",
|
||||
"Menthal": "menthal.wav",
|
||||
"Ming": "ming.wav",
|
||||
"Minotaur": "minotaur.wav",
|
||||
"Minotaurs": "minotaurs.wav",
|
||||
"Mister": "mister.wav",
|
||||
"Misty": "misty.wav",
|
||||
"Mithral": "mithral.wav",
|
||||
"Mithrin": "mithrin.wav",
|
||||
"Mitral": "mitral.wav",
|
||||
"Mmmm": "mmmm.wav",
|
||||
"Moans": "moans.wav",
|
||||
"Molgol": "molgol.wav",
|
||||
"Monarchy": "monarchy.wav",
|
||||
"Morther": "morther.wav",
|
||||
"Motioning": "motioning.wav",
|
||||
"Mustaches": "mustaches.wav",
|
||||
"Mutters": "mutters.wav",
|
||||
"Mylee": "mylee.wav",
|
||||
"Nahzim": "nahzim.wav",
|
||||
"Nefaleem": "nefaleem.wav",
|
||||
"Nestor": "nestor.wav",
|
||||
"Nesven": "nesven.wav",
|
||||
"Neverthoughtidseeyouprancingaroundwithabunchofelfgirls": "neverthoughtidseeyouprancingaroundwithabunchofelfgirls.wav",
|
||||
"Nijel": "nijel.wav",
|
||||
"Nik": "nik.wav",
|
||||
"Nimbly": "nimbly.wav",
|
||||
"Nimgalad": "nimgalad.wav",
|
||||
"Nirvana": "nirvana.wav",
|
||||
"Noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest": "noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest.wav",
|
||||
"Nollon": "nollon.wav",
|
||||
"Nomadic": "nomadic.wav",
|
||||
"Nook": "nook.wav",
|
||||
"Nurn": "nurn.wav",
|
||||
"Nym": "nym.wav",
|
||||
"Oakheart": "oakheart.wav",
|
||||
"Oakleaf": "oakleaf.wav",
|
||||
"Odie": "odie.wav",
|
||||
"Odo": "odo.wav",
|
||||
"Ododrian": "ododrian.wav",
|
||||
"Odoiran": "odoiran.wav",
|
||||
"Odorain": "odorain.wav",
|
||||
"Odoriain": "odoriain.wav",
|
||||
"Odorian": "odorian.wav",
|
||||
"Odorians": "odorians.wav",
|
||||
"Ody": "ody.wav",
|
||||
"Off-Worlder": "off_worlder.wav",
|
||||
"Ogrin": "ogrin.wav",
|
||||
"Olde": "olde.wav",
|
||||
"Onas": "onas.wav",
|
||||
"Ooo": "ooo.wav",
|
||||
"Oorian": "oorian.wav",
|
||||
"Oranoc": "oranoc.wav",
|
||||
"Orbs": "orbs.wav",
|
||||
"Orehand": "orehand.wav",
|
||||
"Orgrin": "orgrin.wav",
|
||||
"Orin": "orin.wav",
|
||||
"Orkosh": "orkosh.wav",
|
||||
"Oroset": "oroset.wav",
|
||||
"Orson": "orson.wav",
|
||||
"Oslagil": "oslagil.wav",
|
||||
"Overlord": "overlord.wav",
|
||||
"Paladin": "paladin.wav",
|
||||
"Paladin-King": "paladin_king.wav",
|
||||
"Patriarch": "patriarch.wav",
|
||||
"Patriarchs": "patriarchs.wav",
|
||||
"Penance": "penance.wav",
|
||||
"Penelope": "penelope.wav",
|
||||
"Periwinkle": "periwinkle.wav",
|
||||
"Pilgrim": "pilgrim.wav",
|
||||
"Pinnacle": "pinnacle.wav",
|
||||
"Pricilla": "pricilla.wav",
|
||||
"Priestess": "priestess.wav",
|
||||
"Primer": "primer.wav",
|
||||
"Priscilla": "priscilla.wav",
|
||||
"Prologue": "prologue.wav",
|
||||
"Prudent": "prudent.wav",
|
||||
"Quartzhand": "quartzhand.wav",
|
||||
"Racah": "racah.wav",
|
||||
"Rachelle": "rachelle.wav",
|
||||
"Radiant": "radiant.wav",
|
||||
"Rah'Zi": "rah_zi.wav",
|
||||
"Rasheer": "rasheer.wav",
|
||||
"Raslan": "raslan.wav",
|
||||
"Ravenburg": "ravenburg.wav",
|
||||
"Ravenhill": "ravenhill.wav",
|
||||
"Ravensburg": "ravensburg.wav",
|
||||
"Razentia": "razentia.wav",
|
||||
"Realms": "realms.wav",
|
||||
"Redhorn": "redhorn.wav",
|
||||
"Reflexively": "reflexively.wav",
|
||||
"Reinys": "reinys.wav",
|
||||
"Retort": "retort.wav",
|
||||
"Roc": "roc.wav",
|
||||
"Rockport": "rockport.wav",
|
||||
"Rolands": "rolands.wav",
|
||||
"Rolden": "rolden.wav",
|
||||
"Rooks": "rooks.wav",
|
||||
"Roth": "roth.wav",
|
||||
"Rothsholm": "rothsholm.wav",
|
||||
"Rouge": "rouge.wav",
|
||||
"Rustigar": "rustigar.wav",
|
||||
"Sarnel": "sarnel.wav",
|
||||
"Satyrsdai": "satyrsdai.wav",
|
||||
"Scaly": "scaly.wav",
|
||||
"Scepter": "scepter.wav",
|
||||
"Seagull": "seagull.wav",
|
||||
"Sedition": "sedition.wav",
|
||||
"Seeker": "seeker.wav",
|
||||
"Sehlaba": "sehlaba.wav",
|
||||
"Seker": "seker.wav",
|
||||
"Seker-Ankh": "seker_ankh.wav",
|
||||
"Selna": "selna.wav",
|
||||
"Senica": "senica.wav",
|
||||
"Sentinel": "sentinel.wav",
|
||||
"Septuigen": "septuigen.wav",
|
||||
"Sergeant-Major": "sergeant_major.wav",
|
||||
"Serk": "serk.wav",
|
||||
"Sgt": "sgt.wav",
|
||||
"Shadeem": "shadeem.wav",
|
||||
"Shae": "shae.wav",
|
||||
"Shal": "shal.wav",
|
||||
"Shalahz": "shalahz.wav",
|
||||
"Shalaz": "shalaz.wav",
|
||||
"Shalazah": "shalazah.wav",
|
||||
"Shambhu": "shambhu.wav",
|
||||
"Shambu": "shambu.wav",
|
||||
"Shanay": "shanay.wav",
|
||||
"Shatterdawn": "shatterdawn.wav",
|
||||
"Shdeem": "shdeem.wav",
|
||||
"Shelna": "shelna.wav",
|
||||
"Shen": "shen.wav",
|
||||
"Shrouded": "shrouded.wav",
|
||||
"Shyrra": "shyrra.wav",
|
||||
"Sigil": "sigil.wav",
|
||||
"Silverbane": "silverbane.wav",
|
||||
"Silvernote": "silvernote.wav",
|
||||
"Silvervein": "silvervein.wav",
|
||||
"Silverwind": "silverwind.wav",
|
||||
"Sirjif": "sirjif.wav",
|
||||
"Sis": "sis.wav",
|
||||
"Skeptically": "skeptically.wav",
|
||||
"Slagg": "slagg.wav",
|
||||
"Slaver": "slaver.wav",
|
||||
"Slavers": "slavers.wav",
|
||||
"Slick": "slick.wav",
|
||||
"Solstice": "solstice.wav",
|
||||
"Soren": "soren.wav",
|
||||
"Sorrow": "sorrow.wav",
|
||||
"Sosa": "sosa.wav",
|
||||
"Soulseeker": "soulseeker.wav",
|
||||
"Soulsinger": "soulsinger.wav",
|
||||
"Sparks": "sparks.wav",
|
||||
"Spellbooks": "spellbooks.wav",
|
||||
"Spikehorn": "spikehorn.wav",
|
||||
"Stairwell": "stairwell.wav",
|
||||
"Stalker": "stalker.wav",
|
||||
"Stealthy": "stealthy.wav",
|
||||
"Steelaxe": "steelaxe.wav",
|
||||
"Steelclaw": "steelclaw.wav",
|
||||
"Steelhorn": "steelhorn.wav",
|
||||
"Steward": "steward.wav",
|
||||
"Stiletto": "stiletto.wav",
|
||||
"Stonefirger": "stonefirger.wav",
|
||||
"Stoneforger": "stoneforger.wav",
|
||||
"Stonehelm": "stonehelm.wav",
|
||||
"Stonehold": "stonehold.wav",
|
||||
"Stoner": "stoner.wav",
|
||||
"Sunder": "sunder.wav",
|
||||
"Surly": "surly.wav",
|
||||
"Swung": "swung.wav",
|
||||
"Symphonic": "symphonic.wav",
|
||||
"Ta-Lar": "ta_lar.wav",
|
||||
"Taeriel": "taeriel.wav",
|
||||
"Tailor": "tailor.wav",
|
||||
"Talaer": "talaer.wav",
|
||||
"Tallspear": "tallspear.wav",
|
||||
"Targoth": "targoth.wav",
|
||||
"Tarnen": "tarnen.wav",
|
||||
"Tathan": "tathan.wav",
|
||||
"Tavern": "tavern.wav",
|
||||
"Tellin": "tellin.wav",
  "Thane": "thane.wav",
  "Thanes": "thanes.wav",
  "Theocratic": "theocratic.wav",
  "Therak": "therak.wav",
  "Therondil": "therondil.wav",
  "Thorn": "thorn.wav",
  "Thranis": "thranis.wav",
  "Throgg": "throgg.wav",
  "Thunderstrike": "thunderstrike.wav",
  "Tien": "tien.wav",
  "Tillborne": "tillborne.wav",
  "Tinbreaker": "tinbreaker.wav",
  "Tome": "tome.wav",
  "Torak": "torak.wav",
  "Toren": "toren.wav",
  "Torgath": "torgath.wav",
  "Torgoth": "torgoth.wav",
  "Traitor": "traitor.wav",
  "Triesse": "triesse.wav",
  "Tumark": "tumark.wav",
  "Tumbler": "tumbler.wav",
  "Turcan": "turcan.wav",
  "Turog": "turog.wav",
  "Twinsdai": "twinsdai.wav",
  "Twyleen": "twyleen.wav",
  "Tyrant": "tyrant.wav",
  "Udda": "udda.wav",
  "Uhrn": "uhrn.wav",
  "Ulagra": "ulagra.wav",
  "Ulrik": "ulrik.wav",
  "Umbrin": "umbrin.wav",
  "Umfray": "umfray.wav",
  "Undwin": "undwin.wav",
  "Unison": "unison.wav",
  "Urhn": "urhn.wav",
  "Uryna": "uryna.wav",
  "Uust": "uust.wav",
  "Vagrant": "vagrant.wav",
  "Valdarin": "valdarin.wav",
  "Valeth": "valeth.wav",
  "Valindar": "valindar.wav",
  "Valinor": "valinor.wav",
  "Valis": "valis.wav",
  "Vanessa": "vanessa.wav",
  "Varann": "varann.wav",
  "Varsis": "varsis.wav",
  "Varu": "varu.wav",
  "Vedra": "vedra.wav",
  "Velicia": "velicia.wav",
  "Velvet": "velvet.wav",
  "Vendar": "vendar.wav",
  "Venessa": "venessa.wav",
  "Vengeance": "vengeance.wav",
  "Vermin": "vermin.wav",
  "Verness": "verness.wav",
  "Verr": "verr.wav",
  "Verr-": "verr.wav",
  "Verr-Asses": "verr_asses.wav",
  "Veya": "veya.wav",
  "Viscount": "viscount.wav",
  "Vizier": "vizier.wav",
  "Vlainor": "vlainor.wav",
  "Volan": "volan.wav",
  "Volstan": "volstan.wav",
  "Vorann": "vorann.wav",
  "Vorgak": "vorgak.wav",
  "Vorum": "vorum.wav",
  "Vuhnalya": "vuhnalya.wav",
  "Vyn": "vyn.wav",
  "Wallbreaker": "wallbreaker.wav",
  "Wanton": "wanton.wav",
  "Warfrost": "warfrost.wav",
  "Wargog": "wargog.wav",
  "Warstar": "warstar.wav",
  "Warthog": "warthog.wav",
  "Weaving": "weaving.wav",
  "Weee": "weee.wav",
  "Wettstein": "wettstein.wav",
  "Wh": "wh.wav",
  "Wha": "wha.wav",
  "Whatchya": "whatchya.wav",
  "Wheni": "wheni.wav",
  "Whitehand": "whitehand.wav",
  "Whoah": "whoah.wav",
  "Williamsburg": "williamsburg.wav",
  "Willowbrook": "willowbrook.wav",
  "Windrift": "windrift.wav",
  "Windsdai": "windsdai.wav",
  "Witchwyrd": "witchwyrd.wav",
  "Witchwyrds": "witchwyrds.wav",
  "Wolfclaw": "wolfclaw.wav",
  "Woodlan": "woodlan.wav",
  "Woodland": "woodland.wav",
  "Wooo": "wooo.wav",
  "Worlder": "worlder.wav",
  "Wrath": "wrath.wav",
  "Wuzy": "wuzy.wav",
  "Wynshorn": "wynshorn.wav",
  "Wyren": "wyren.wav",
  "Yahnig": "yahnig.wav",
  "Yan": "yan.wav",
  "Yar": "yar.wav",
  "Yer": "yer.wav",
  "Yolan": "yolan.wav",
  "Yoos": "yoos.wav",
  "Yurik": "yurik.wav",
  "Zalrek": "zalrek.wav",
  "Zeb": "zeb.wav",
  "Zelph": "zelph.wav",
  "Zha": "zha.wav",
  "Zhong": "zhong.wav",
  "Zhong-Goo": "zhong_goo.wav",
  "Zinger": "zinger.wav",
  "Zirak": "zirak.wav",
  "Zurn": "zurn.wav",
  "Zyzaren": "zyzaren.wav",
  "Zyzarn": "zyzarn.wav",
  "Zyzren": "zyzren.wav"
}
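For context, a per-book `manifest.json` like the one above maps each extracted proper noun to the `.wav` file holding its generated pronunciation (filenames are lowercased, with hyphens mapped to underscores). A minimal sketch of how such a manifest could be consumed, assuming Python; the function name and directory here are illustrative, not the project's actual API:

```python
import json

# Tiny excerpt of a hypothetical manifest (keys: proper nouns, values: wav files)
manifest = json.loads('{"Thane": "thane.wav", "Zhong-Goo": "zhong_goo.wav"}')

def audio_for(noun: str, audio_dir: str = "proper_nouns_audio/mybook"):
    """Return the audio path for a proper noun, or None if not yet generated."""
    filename = manifest.get(noun)
    return f"{audio_dir}/{filename}" if filename else None

print(audio_for("Zhong-Goo"))  # proper_nouns_audio/mybook/zhong_goo.wav
print(audio_for("Unknown"))    # None
```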
@ -0,0 +1,20 @@
{
  "Anhuil-Elhar": "An-WHEEL AY-Lar",
  "Anhuil-Ehlar": "An-WHEEL AY-Lar",
  "Aegrir": "Ay-Greer",
  "Baras": "BARE-iss",
  "Emaleen": "EMMA-lean",
  "Eushownava": "You-SHOWN-Eh-Vah",
  "Graffel": "Gra-FELL",
  "Greathaven": "GREAT-Haven",
  "Jaylan": "JAY-Lin",
  "Neverthoughtidseeyouprancingaroundwithabunchofelfgirls": "Never thought I'd see you prancing around with a bunch of elf girls",
  "Nijel": "NYE-jell",
  "Noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest": "No I've been here and there looking for my kin rumored to dwell here in this forest",
  "Odoiran": "Oh-DORIAN",
  "Ody": "Oh-Dee",
  "Seker-Ankh": "Seker-Ahnk",
  "Rasheer": "Raw-SHEAR",
  "Valinor": "Vala-nor",
  "Varsis": "Ver-Asis"
}
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -1,28 +1,35 @@
{
  "Gadianton Robbers": "Gadeeantun Robbers",
  "Gadianton": "Gadeeantun",
  "Coriantumr": "Coryantomer",
  "Laman": "Layman",
  "Lehi And Nephi": "Leehi And Nephi",
  "Lehi": "Leehi",
  "Lehi Mathonihah": "Leehi Mathonihah",
  "Lehis": "Leehis",
  "Lehies": "Leehis",
  "Liahona": "Leeahona",
  "Alma": "Al-ma",
  "Gadiantons": "Gadeeantuns",
  "Laban": "Layban",
  "Mosiah": "Moziah",
  "Mosiah The King": "Moziah The King",
  "Nehors": "Kneehores",
  "Samuel The Lamanite": "Samuel The Laymanite",
  "Tarry": "Tarery",
  "The Lamanite Twins": "The Laymanite Twins",
  "The Lamanites Of Ammon": "The Laymanites Of Ammon",
  "The Lamanites Of The Land Of Zarahemla": "The Laymanites Of The Land Of Zarahemla",
  "The Lamanites Of The Land Southward": "The Laymanites Of The Land Southward",
  "The Lamanites Of The People Of Ammon": "The Laymanites Of The People Of Ammon",
  "The Lamb'S Book Of Life": "The Lamb's Book Of Life",
  "The Land Of Nephi": "The Land Of Kneefi",
  "Nephites": "Kneefites",
  "Anti-Nephi-Lehies": "Anti-Kneef-eye-Leehis",
  "Lamanite": "Laymanite",
  "Lamanites": "Laymanites",
  "Lamb'S": "Lamb's",
  "Sarai": "Sa-rye",
  "Telestial": "Tea-lestial",
  "Lord'S": "Lord's",
  "Helaman": "He-la-mun",
  "Nephihah": "Kneef-eyehah",
  "Nephihet": "Kneef-eyehet",
  "Nephite": "Kneefight",
  "Nephi-Im": "Kneef-eye-Im",
  "Zenephi": "Ze-kneef-eye",
  "Nephitish": "Kneefight-ish",
  "Moroni": "Moh-roh-nye",
  "Nephi": "Knee-fye",
  "Hagar": "Hag-ar",
  "Oug": "Ohg",
  "Ougan": "Ohgan"
}
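A fixes file like the one above maps a proper noun's canonical spelling to a phonetic respelling. A minimal sketch of how such fixes could be applied to chapter text before it is handed to the TTS engine; the `apply_fixes` function is an assumption about usage, not the project's actual code:

```python
import re

# Two real entries from the fixes file above, used as sample data
fixes = {
    "Nephi": "Knee-fye",
    "Lamanites": "Laymanites",
}

def apply_fixes(text: str, fixes: dict) -> str:
    # Longest keys first so longer names win over overlapping shorter ones,
    # and word boundaries so "Nephi" does not fire inside "Nephites".
    for noun in sorted(fixes, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(noun)}\b", fixes[noun], text)
    return text

print(apply_fixes("Nephi spoke to the Lamanites.", fixes))
# Knee-fye spoke to the Laymanites.
```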
30 output_proper_nouns/visions_glory_canada/manifest.json Normal file
@ -0,0 +1,30 @@
{
  "Adam": "adam.wav",
  "Adam-Ondi-Ahman": "adam_ondi_ahman.wav",
  "Ahman": "ahman.wav",
  "Alma": "alma.wav",
  "Apostles": "apostles.wav",
  "Brethren": "brethren.wav",
  "Cardston": "cardston.wav",
  "Ephraim": "ephraim.wav",
  "Evolving": "evolving.wav",
  "Holies": "holies.wav",
  "Israel": "israel.wav",
  "Joseph": "joseph.wav",
  "Knelt": "knelt.wav",
  "Lehi": "lehi.wav",
  "Liahona": "liahona.wav",
  "Millennium": "millennium.wav",
  "Mormon": "mormon.wav",
  "Moroni": "moroni.wav",
  "Mosiah": "mosiah.wav",
  "Nauvoo": "nauvoo.wav",
  "Quorum": "quorum.wav",
  "Rachael": "rachael.wav",
  "Savior": "savior.wav",
  "Thummim": "thummim.wav",
  "Urim": "urim.wav",
  "Vignette": "vignette.wav",
  "Zachary": "zachary.wav",
  "Zion": "zion.wav"
}
18 projects.json Normal file
@ -0,0 +1,18 @@
[
  {
    "name": "Audio Text for Novel Lightbringer",
    "source_paths": [
      "/home/dillon/_code/voice_model/Audio Text for Novel Lightbringer/Audio Text for Novel Lightbringer.txt"
    ],
    "proper_nouns_output_dir": "output_proper_nouns/audio_text_for_novel_lightbringer",
    "proper_nouns_audio_dir": "proper_nouns_audio/audio_text_for_novel_lightbringer"
  },
  {
    "name": "visions glory canada",
    "source_paths": [
      "/home/dillon/_code/voice_model/Visions of Glory_ Zion in Canada pg 162-193.txt"
    ],
    "proper_nouns_output_dir": "output_proper_nouns/visions_glory_canada",
    "proper_nouns_audio_dir": "proper_nouns_audio/visions_glory_canada"
  }
]
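Each entry in `projects.json` isolates one book's source text and output directories, which is what keeps multi-book support working. A hypothetical sketch of loading it, assuming Python; the loader function itself is an assumption, though the schema matches the file above:

```python
import json

# Inline copy of one projects.json entry (same schema as the file above)
projects_json = """
[
  {
    "name": "visions glory canada",
    "source_paths": ["/home/dillon/_code/voice_model/Visions of Glory_ Zion in Canada pg 162-193.txt"],
    "proper_nouns_output_dir": "output_proper_nouns/visions_glory_canada",
    "proper_nouns_audio_dir": "proper_nouns_audio/visions_glory_canada"
  }
]
"""

def load_projects(text: str) -> dict:
    """Index projects by name so a GUI can look one up directly."""
    return {p["name"]: p for p in json.loads(text)}

projects = load_projects(projects_json)
print(projects["visions glory canada"]["proper_nouns_audio_dir"])
# proper_nouns_audio/visions_glory_canada
```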
1345 proper_nouns.txt
File diff suppressed because it is too large
42 run_audiobook.bat Normal file
@ -0,0 +1,42 @@
@echo off
setlocal EnableDelayedExpansion
title Create Audiobook

:: Change to the folder this .bat file lives in
cd /d "%~dp0"

:: Check setup has been run
if not exist .venv\Scripts\python.exe (
    echo ERROR: Setup has not been run yet.
    echo Please double-click setup_windows.bat first.
    pause
    exit /b 1
)

echo ============================================================
echo   Audiobook Creator
echo ============================================================
echo.
echo Options:
echo   1 - Generate ALL chapters (may take many hours)
echo   2 - List detected chapters only
echo   3 - Generate a short PREVIEW of each chapter
echo   4 - Generate specific chapters (enter numbers next)
echo.
set /p CHOICE="Enter choice (1/2/3/4): "

if "%CHOICE%"=="1" (
    .venv\Scripts\python create_audiobook_lightbringer.py
) else if "%CHOICE%"=="2" (
    .venv\Scripts\python create_audiobook_lightbringer.py --list
) else if "%CHOICE%"=="3" (
    .venv\Scripts\python create_audiobook_lightbringer.py --preview
) else if "%CHOICE%"=="4" (
    set /p CHAPTERS="Enter chapter numbers separated by spaces (e.g. 0 1 2): "
    rem Delayed expansion so the value just read is used; %CHAPTERS% would
    rem expand at parse time, before the prompt runs, and be empty or stale.
    .venv\Scripts\python create_audiobook_lightbringer.py !CHAPTERS!
) else (
    echo Invalid choice.
)

echo.
echo Done. Output files are in the output_audiobook_lightbringer folder.
pause
21 run_gui.bat Normal file
@ -0,0 +1,21 @@
@echo off
title Proper Noun GUI

:: Change to the folder this .bat file lives in
cd /d "%~dp0"

:: Check setup has been run
if not exist .venv\Scripts\python.exe (
    echo ERROR: Setup has not been run yet.
    echo Please double-click setup_windows.bat first.
    pause
    exit /b 1
)

echo Starting Proper Noun Player GUI...
.venv\Scripts\python gui_proper_noun_player.py
if errorlevel 1 (
    echo.
    echo The application closed with an error. See message above.
    pause
)
93 setup_windows.bat Normal file
@ -0,0 +1,93 @@
@echo off
setlocal EnableDelayedExpansion
title Audiobook Setup

echo ============================================================
echo   Audiobook Setup for Windows 11
echo ============================================================
echo.

:: ── 1. Check Python ──────────────────────────────────────────────────────────
echo [1/5] Checking Python installation...
python --version >nul 2>&1
if errorlevel 1 (
    echo.
    echo ERROR: Python was not found.
    echo.
    echo Please install Python 3.12 from https://www.python.org/downloads/
    echo IMPORTANT: On the installer, tick "Add Python to PATH" before clicking Install.
    echo.
    echo After installing, close this window and double-click setup_windows.bat again.
    pause
    exit /b 1
)

for /f "tokens=2 delims= " %%v in ('python --version 2^>^&1') do set PY_VER=%%v
echo Found Python %PY_VER%
echo.

:: ── 2. Create virtual environment ────────────────────────────────────────────
echo [2/5] Creating virtual environment (.venv)...
if exist .venv (
    echo .venv already exists, skipping creation.
) else (
    python -m venv .venv
    if errorlevel 1 (
        echo ERROR: Failed to create virtual environment.
        pause
        exit /b 1
    )
    echo Virtual environment created.
)
echo.

:: ── 3. Install PyTorch with CUDA (for gaming GPU) ────────────────────────────
echo [3/5] Installing PyTorch with CUDA 12.4 support (this may take a while)...
echo Downloading ~2.5 GB — please be patient.
echo.
.venv\Scripts\pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
if errorlevel 1 (
    echo.
    echo WARNING: CUDA build failed. Falling back to CPU-only PyTorch.
    echo Audio generation will be slower but will still work.
    .venv\Scripts\pip install torch
)
echo.

:: ── 4. Install remaining packages ────────────────────────────────────────────
echo [4/5] Installing remaining packages (kokoro, soundfile, sounddevice, spacy, wordfreq)...
.venv\Scripts\pip install -r requirements.txt
if errorlevel 1 (
    echo ERROR: Package installation failed. Check your internet connection.
    pause
    exit /b 1
)

echo Downloading spaCy English language model (en_core_web_sm, ~15 MB)...
.venv\Scripts\python -m spacy download en_core_web_sm
if errorlevel 1 (
    echo WARNING: spaCy model download failed. Proper noun extraction will not work
    echo until you re-run: .venv\Scripts\python -m spacy download en_core_web_sm
)
echo.

:: ── 5. Download the Kokoro TTS model ─────────────────────────────────────────
echo [5/5] Downloading the Kokoro TTS model (hexgrad/Kokoro-82M, ~330 MB)...
echo This only happens once.
echo.
.venv\Scripts\python -c "from kokoro import KPipeline; KPipeline(lang_code='a', repo_id='hexgrad/Kokoro-82M'); print('Model ready.')"
if errorlevel 1 (
    echo.
    echo WARNING: Model download failed. It will retry the first time you run the app.
    echo Make sure you have an internet connection on first launch.
)

echo.
echo ============================================================
echo   Setup complete!
echo.
echo   To launch the GUI: double-click run_gui.bat
echo   To create the audiobook: double-click run_audiobook.bat
echo ============================================================
echo.
pause