Compare commits


14 Commits

SHA1 Message Date
e9ddbb586a projects include proper noun stuff 2026-04-08 01:52:54 -06:00
894144c84a audio gen in gui 2026-04-08 01:42:29 -06:00
69639342e3 format doc script 2026-03-24 01:42:34 -06:00
125cb25cf8 improved time remaining display; added pronunciations 2026-03-13 01:07:17 -06:00
8a1362fe0b setup readme 2026-03-10 00:45:57 -06:00
0d00176a18 better readme 2026-03-10 00:30:53 -06:00
3c2c3d241e improved gui 2026-03-10 00:12:04 -06:00
224f97d0c6 prep for win 11 2026-03-09 23:36:50 -06:00
6e2e0f9af7 better word replacement 2026-02-26 15:08:44 -07:00
c1301fee18 deleting word from fixed removes it from correct at same time 2026-02-26 12:52:09 -07:00
6781efe3f3 improved estimation 2026-02-26 12:21:44 -07:00
44bc757f3f fixed book markers 2026-02-26 12:09:43 -07:00
6cefc3c862 improved proper noun parsing 2026-02-26 00:57:40 -07:00
949bd7c203 Clean correct_words.json: single words, filter stop words, keep two-part proper names 2026-02-25 23:50:52 -07:00
24 changed files with 4913 additions and 3702 deletions

2
.envrc Normal file

@@ -0,0 +1,2 @@
export VIRTUAL_ENV="$PWD/.venv"
export PATH="$VIRTUAL_ENV/bin:$PATH"

7
.gitignore vendored

@@ -3,6 +3,9 @@ __pycache__/
 *.pyc
 *.pyo
 .venv/
+build/
+dist/
+*.spec
 
 # Audio files
 *.wav
@@ -14,6 +17,10 @@ proper_nouns_audio/
 # Generated data (JSON files in output_proper_nouns/ are tracked)
 output_proper_nouns/remaining_review.txt
 
+# Generated PDFs and LaTeX files
+*.pdf
+*.tex
+
 # Text files (except proper_nouns.txt)
 *.txt
 !proper_nouns.txt

4
.vscode/settings.json vendored Normal file

@@ -0,0 +1,4 @@
{
"python.defaultInterpreterPath": ".venv/bin/python",
"python.terminal.activateEnvironment": true
}

125
README.md

@@ -0,0 +1,125 @@
# Audiobook Creator
AI-powered audiobook generator using the [Kokoro TTS](https://github.com/hexgrad/kokoro) model.
Generates high-quality narrated `.wav` files from plain-text novels, with a GUI tool for auditing and fixing proper noun pronunciations per book.
---
## Features
- **Multi-book support** — each book's proper nouns, fixes, and audio are fully isolated
- **Proper Noun GUI** — hear every extracted name, mark it correct or type a phonetic fix
- **Audiobook generation** — one `.wav` per chapter, GPU-accelerated via CUDA
- **In-GUI extraction** — click one button to run NLP extraction and generate audio, no separate scripts needed
- **Apply Fixes** — writes a TTS-ready copy of the source text with all phonetic substitutions applied
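Conceptually, applying fixes is whole-word substitution over the source text. A minimal sketch of the idea (the `apply_fixes` helper is illustrative, not the GUI's actual code):

```python
import re

def apply_fixes(text: str, fixes: dict[str, str]) -> str:
    """Replace each proper noun with its phonetic spelling, whole words only."""
    for word, replacement in fixes.items():
        # \b word boundaries keep "Nephi" from matching inside longer words
        text = re.sub(rf"\b{re.escape(word)}\b", replacement, text)
    return text

# fixes would come from output_proper_nouns/<book>/pronunciation_fixes.json
print(apply_fixes("Nephi answered.", {"Nephi": "Kneephi"}))  # Kneephi answered.
```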
---
## Project structure
```
Audio Text for Novel Lightbringer/ ← multi-file book (chapters as .txt)
Audio Master Nem Full.txt ← single-file book
gui_proper_noun_player.py ← proper noun auditing GUI
create_audiobook_lightbringer.py ← generate Lightbringer audiobook chapters
create_audiobook_nem.py ← generate Nem audiobook chapters
output_audiobook_lightbringer/ ← chapter WAV output
output_audiobook/ ← Nem WAV output
output_proper_nouns/<book>/ ← manifest + JSON fix data per book
proper_nouns_audio/<book>/ ← word audio + replacements cache per book
requirements.txt
setup_windows.bat ← one-click Windows setup
run_gui.bat ← launch GUI on Windows
run_audiobook.bat ← generate audiobook on Windows
```
---
## Setup (Windows - Easiest for Non-Tech Users)
1. **Download** the project as a ZIP file from GitHub
2. **Extract** the ZIP to a folder on your computer (e.g., `C:\audiobook-creator`)
3. **Double-click** `setup_windows.bat` and wait for it to finish installing everything (may take 10-20 minutes)
4. **Double-click** `run_gui.bat` to launch the Proper Noun Player GUI
5. **Double-click** `run_audiobook.bat` to generate audiobook chapters
That's it! The setup script checks your Python installation and handles the virtual environment and all dependencies automatically.
---
## Setup (Linux / Mac)
```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu124 # CUDA 12.4
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```
> For a CPU-only install, replace the torch line with `pip install torch`
---
## Setup (Windows)
See [SETUP_WINDOWS.md](SETUP_WINDOWS.md) for a step-by-step guide aimed at non-technical users.
---
## Usage
### Proper Noun GUI
```bash
.venv/bin/python gui_proper_noun_player.py
```
1. Select a book from the dropdown
2. Click **⚙ Extract & Generate Audio** — extracts proper nouns via spaCy and generates a TTS clip for each one
3. Click words in the Review list to hear them; press Enter to mark correct or type a phonetic replacement first
4. Click **⇄ Apply Fixes to Text** to write a pronunciation-corrected copy of the source file
### Generate Audiobook
```bash
# All chapters
.venv/bin/python create_audiobook_lightbringer.py
# List chapters only
.venv/bin/python create_audiobook_lightbringer.py --list
# Preview clips
.venv/bin/python create_audiobook_lightbringer.py --preview
# Specific chapters
.venv/bin/python create_audiobook_lightbringer.py 0 1 2
```
---
## Dependencies
| Package | Purpose |
|---|---|
| `kokoro` | Kokoro-82M TTS model |
| `torch` | GPU inference |
| `soundfile` / `sounddevice` | Audio I/O |
| `numpy` | Audio array operations |
| `spacy` + `en_core_web_sm` | Proper noun extraction (NER + PROPN) |
| `wordfreq` | Common-word filter during extraction |
---
## Output
| Path | Contents |
|---|---|
| `output_audiobook_lightbringer/` | `chapter_01_homecoming.wav`, … |
| `output_proper_nouns/<book>/manifest.json` | Word → WAV filename map |
| `output_proper_nouns/<book>/pronunciation_fixes.json` | `{"Nephi": "Kneephi", …}` |
| `output_proper_nouns/<book>/correct_words.json` | Words confirmed correct |
| `proper_nouns_audio/<book>/` | Per-word audio clips |
| `proper_nouns_audio/<book>/replacements_cache/` | Cached phonetic fix clips |

134
SETUP_WINDOWS.md Normal file

@@ -0,0 +1,134 @@
# Audiobook Creator — Windows 11 Setup Guide
This guide is written for someone who has never used Python or the command line.
Follow the steps in order and you will be generating audiobook chapters with your gaming GPU.
---
## What you will need
| Requirement | Why |
|---|---|
| Windows 11 PC with a modern NVIDIA GPU | Fast audio generation using CUDA |
| ~5 GB free disk space | Python, PyTorch, and the AI voice model |
| Internet connection (first-time only) | Downloads packages and the Kokoro voice model |
---
## Step 1 — Install Python
1. Go to **https://www.python.org/downloads/**
2. Click the big yellow **"Download Python 3.12.x"** button
3. Run the installer
4. **IMPORTANT:** On the very first screen of the installer, tick the checkbox that says **"Add Python to PATH"** before clicking Install Now
> If you missed that checkbox, uninstall Python from Windows Settings and reinstall it with the box ticked.
---
## Step 2 — Get the project files
You should have a folder called `audiobook_creator` (or similar) containing the project files. Make sure it includes these files:
```
setup_windows.bat
run_gui.bat
run_audiobook.bat
requirements.txt
gui_proper_noun_player.py
create_audiobook_lightbringer.py
Audio Text for Novel Lightbringer\ ← your chapter text files go here
```
If you received a ZIP file, extract it first so the folder is not inside another folder.
---
## Step 3 — Run Setup (one time only)
1. Open the project folder in File Explorer
2. Double-click **`setup_windows.bat`**
3. A black terminal window opens and runs through these steps automatically:
- Checks Python is installed
- Creates a private Python environment (`.venv` folder)
- Downloads PyTorch with GPU (CUDA) support — **about 2.5 GB, this takes several minutes**
- Installs the remaining packages (kokoro, spaCy, etc.)
- Downloads the spaCy English language model
- Downloads the Kokoro AI voice model — **about 330 MB**
4. When it says **"Setup complete!"**, press any key to close the window
You only need to do this once. If you run it again, it will safely skip anything already installed.
---
## Step 4 — Review Proper Noun Pronunciations (GUI)
Before generating the audiobook, it helps to check how unusual names are pronounced.
1. Double-click **`run_gui.bat`**
2. The Proper Noun Pronunciation Auditor window opens
3. Select your book from the dropdown at the top
4. Click **⚙ Extract & Generate Audio** — this scans the text and creates a short audio clip for every proper noun found (takes a few minutes the first time)
5. Click any word in the **To Review** list to hear how it sounds
6. If it sounds wrong, type the phonetic spelling in the box at the bottom and press **Enter** to save a fix
- Example: type `Kneephi` instead of `Nephi`
7. If it sounds correct, just press **Enter** without changing anything
8. When you are done reviewing, click **⇄ Apply Fixes to Text** to save a corrected copy of the source text
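Behind the scenes, each fix is stored as a plain word-to-replacement map in `output_proper_nouns\<book>\pronunciation_fixes.json`. A minimal example (the entry mirrors the `Nephi` example above):

```json
{
  "Nephi": "Kneephi"
}
```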
**Keyboard shortcuts:**
| Key | Action |
|---|---|
| Space | Replay current word |
| Enter | Mark correct (or save fix if text was changed) |
| Escape | Reset the fix box, go back to word list |
| s | Stop audio |
| ↑ / ↓ | Navigate the word list from the fix box |
| Delete | Move a word back to Review from Correct or Fixes |
---
## Step 5 — Generate the Audiobook
1. Double-click **`run_audiobook.bat`**
2. A menu appears — type the number of your choice and press Enter:
| Option | What it does |
|---|---|
| 1 | Generate **all chapters** — can take many hours, safe to leave running overnight |
| 2 | **List** detected chapters only — instant, nothing is generated |
| 3 | Generate a short **preview clip** of each chapter — quick sanity check |
| 4 | Generate **specific chapters** — enter chapter numbers separated by spaces |
3. When finished, `.wav` files will be in the `output_audiobook_lightbringer` folder
---
## Troubleshooting
**"Python was not found"**
→ Python is not installed, or you forgot to tick "Add Python to PATH" during installation. Uninstall and reinstall Python from https://www.python.org/downloads/ making sure to tick that box.
**The black window opens and immediately closes**
→ There was an error. To see it: press `Win + R`, type `cmd`, press Enter, then drag the `.bat` file into that black window and press Enter. The error message will stay visible.
**Audio generation is very slow (taking hours per chapter)**
→ The GPU version of PyTorch may not have installed correctly. Re-run `setup_windows.bat` — it will reinstall just that part.
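To check which PyTorch build is actually installed, you can run this from a terminal opened in the project folder (this assumes setup created the `.venv` folder; it prints the CUDA version, or `None` for the CPU-only build):

```shell
.venv\Scripts\python -c "import torch; print(torch.version.cuda, torch.cuda.is_available())"
```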
**"No .txt files found in Audio Text for Novel Lightbringer"**
→ Make sure your chapter `.txt` files are inside the `Audio Text for Novel Lightbringer` subfolder, not loose in the main project folder.
**The GUI says "No manifest yet"**
→ You need to click **⚙ Extract & Generate Audio** first for that book.
**Antivirus blocks the .bat files**
→ Right-click the `.bat` file, choose Properties, and click "Unblock" at the bottom. Then try again.
---
## Output files
| Folder | Contents |
|---|---|
| `output_audiobook_lightbringer\` | One `.wav` file per chapter |
| `output_proper_nouns\<book>\` | Pronunciation data (JSON files) |
| `proper_nouns_audio\<book>\` | Cached word audio clips |

402
create_audiobook.py Normal file

@@ -0,0 +1,402 @@
"""
create_audiobook.py
------------------
Generic audiobook generator for text files that contain chapter headings.
Supported heading formats (single-line headings):
- Prologue
- Chapter 12
- Chapter 12 - Chapter Name
- Chapter - 12
- Chapter - 12 - Chapter Name
Features:
- Parses chapters from one or more input files/directories
- Caches parsed chapter data for faster re-runs when source files are unchanged
- Warns about missing chapter numbers (example: found 1,2,4 -> warns about 3)
- Generates one .wav per chapter with Kokoro
Examples:
python create_audiobook.py --input "Audio Text for Novel Lightbringer"
python create_audiobook.py --input novel.txt --list
python create_audiobook.py --input novel.txt 0 1 2 --voice am_michael
python create_audiobook.py --input novel.txt --preview 3000
"""
from __future__ import annotations
import argparse
import hashlib
import json
import re
import time
from pathlib import Path
import numpy as np
import soundfile as sf
import torch
from kokoro import KPipeline
SAMPLE_RATE = 24000
SPEED = 1.0
LANG_CODE = "a"
VOICE = "am_onyx"
CACHE_VERSION = 1
PROLOGUE_RE = re.compile(r"^\s*Prologue\s*$", re.IGNORECASE)
CHAPTER_RE_1 = re.compile(r"^\s*Chapter\s*-\s*(\d+)(?:\s*-\s*(.+))?\s*$", re.IGNORECASE)
CHAPTER_RE_2 = re.compile(r"^\s*Chapter\s+(\d+)(?:\s*-\s*(.+))?\s*$", re.IGNORECASE)
RULE_RE = re.compile(r"^[_\-*\s]{3,}\s*$")
def _slug(text: str) -> str:
text = text.lower()
text = re.sub(r"[^a-z0-9]+", "_", text)
return text.strip("_")
def _clean_text(text: str) -> str:
text = RULE_RE.sub("", text)
text = re.sub(r"\n{3,}", "\n\n", text)
return text.strip()
def _fmt_duration(seconds: float) -> str:
h, rem = divmod(int(seconds), 3600)
m, s = divmod(rem, 60)
if h > 0:
return f"{h}h {m:02d}m {s:02d}s"
if m > 0:
return f"{m}m {s:02d}s"
return f"{s}s"
def _chapter_heading(line: str) -> tuple[int, str, str] | None:
stripped = line.strip()
if PROLOGUE_RE.match(stripped):
return (0, "Prologue", "Prologue")
m = CHAPTER_RE_1.match(stripped)
if not m:
m = CHAPTER_RE_2.match(stripped)
if not m:
return None
num = int(m.group(1))
title = (m.group(2) or "").strip()
label = f"Chapter {num}" + (f" - {title}" if title else "")
return (num, title, label)
def _resolve_txt_files(inputs: list[str]) -> list[Path]:
txt_files: list[Path] = []
for raw in inputs:
path = Path(raw)
if path.is_file():
if path.suffix.lower() == ".txt":
txt_files.append(path)
continue
if path.is_dir():
txt_files.extend(sorted(path.glob("*.txt")))
deduped = sorted({p.resolve() for p in txt_files})
return deduped
def _signature_for_files(files: list[Path]) -> list[dict]:
sig = []
for p in files:
st = p.stat()
sig.append({
"path": str(p),
"size": st.st_size,
"mtime_ns": st.st_mtime_ns,
})
return sig
def _cache_path(output_dir: Path, files: list[Path]) -> Path:
cache_dir = output_dir / ".cache"
digest = hashlib.sha256("\n".join(str(p) for p in files).encode("utf-8")).hexdigest()[:12]
return cache_dir / f"parse_{digest}.json"
def _load_cached_chapters(cache_file: Path, file_sig: list[dict]) -> list[dict] | None:
if not cache_file.exists():
return None
try:
data = json.loads(cache_file.read_text(encoding="utf-8"))
except Exception:
return None
if data.get("version") != CACHE_VERSION:
return None
if data.get("file_signature") != file_sig:
return None
chapters = data.get("chapters")
if not isinstance(chapters, list):
return None
return chapters
def _save_cached_chapters(cache_file: Path, file_sig: list[dict], chapters: list[dict]) -> None:
cache_file.parent.mkdir(parents=True, exist_ok=True)
payload = {
"version": CACHE_VERSION,
"file_signature": file_sig,
"chapters": chapters,
}
cache_file.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")
def _parse_chapters(files: list[Path]) -> tuple[list[dict], set[int]]:
chapters: list[dict] = []
duplicates: set[int] = set()
seen: set[int] = set()
current: dict | None = None
def flush_current() -> None:
if current is not None:
current["text"] = "".join(current.pop("lines"))
num = current["num"]
if num in seen:
duplicates.add(num)
return
seen.add(num)
chapters.append(current)
for fpath in files:
with fpath.open("r", encoding="utf-8") as fh:
for line in fh:
info = _chapter_heading(line)
if info is not None:
flush_current()
num, title, label = info
num_str = f"{num:02d}"
if num == 0:
slug = "chapter_00_prologue"
elif title:
slug = f"chapter_{num_str}_{_slug(title)}"
else:
slug = f"chapter_{num_str}"
current = {
"num": num,
"title": title,
"label": label,
"slug": slug,
"lines": [line],
}
elif current is not None:
current["lines"].append(line)
flush_current()
chapters.sort(key=lambda c: c["num"])
return chapters, duplicates
def load_all_chapters_with_cache(inputs: list[str], output_dir: Path, force_reparse: bool = False) -> tuple[list[dict], bool, set[int], list[Path]]:
files = _resolve_txt_files(inputs)
if not files:
raise FileNotFoundError("No .txt files found in --input paths")
file_sig = _signature_for_files(files)
cache_file = _cache_path(output_dir, files)
if not force_reparse:
cached = _load_cached_chapters(cache_file, file_sig)
if cached is not None:
return cached, True, set(), files
chapters, duplicates = _parse_chapters(files)
_save_cached_chapters(cache_file, file_sig, chapters)
return chapters, False, duplicates, files
def warn_missing_chapters(chapters: list[dict]) -> None:
nums = sorted(ch["num"] for ch in chapters if ch["num"] > 0)
if not nums:
return
missing = [n for n in range(nums[0], nums[-1] + 1) if n not in set(nums)]
if missing:
print(f"WARNING: missing chapter numbers detected: {missing}")
def generate_audio(pipeline: KPipeline, text: str, voice: str, output_path: Path) -> float:
t0 = time.monotonic()
chunks = []
for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
if hasattr(chunk_audio, "numpy"):
chunk_audio = chunk_audio.cpu().numpy()
chunk_audio = np.atleast_1d(chunk_audio.squeeze())
if chunk_audio.size > 0:
chunks.append(chunk_audio)
elapsed = time.monotonic() - t0
if chunks:
audio = np.concatenate(chunks, axis=0)
sf.write(str(output_path), audio, SAMPLE_RATE)
duration = len(audio) / SAMPLE_RATE
print(
f" OK saved '{output_path.name}' "
f"({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)"
)
else:
print(f" ERROR no audio produced for voice='{voice}'")
return elapsed
def main() -> None:
parser = argparse.ArgumentParser(description="Generate an audiobook from chapterized text files.")
parser.add_argument(
"chapters",
nargs="*",
type=int,
help="Chapter numbers to generate (0 = Prologue). Default: all.",
)
parser.add_argument(
"--input",
nargs="+",
required=True,
help="One or more .txt files and/or directories containing .txt files.",
)
parser.add_argument(
"--output",
default="output_audiobook",
help="Output directory for generated chapter audio.",
)
parser.add_argument("--list", action="store_true", help="Print detected chapters and exit.")
parser.add_argument("--voice", default=VOICE, help=f"Kokoro voice to use (default: {VOICE}).")
parser.add_argument(
"--preview",
nargs="?",
const=3000,
type=int,
metavar="CHARS",
help="Generate short preview clips capped at CHARS (default: 3000).",
)
parser.add_argument(
"--reparse",
action="store_true",
help="Ignore cache and re-parse chapters from source files.",
)
args = parser.parse_args()
output_dir = Path(args.output)
output_dir.mkdir(parents=True, exist_ok=True)
print("Loading chapters...")
chapters, used_cache, duplicates, files = load_all_chapters_with_cache(
args.input, output_dir, force_reparse=args.reparse
)
print(f"Input files: {len(files)}")
print(f"Parse cache: {'HIT' if used_cache else 'MISS'}")
if duplicates:
print(f"WARNING: duplicate chapter numbers were found and ignored: {sorted(duplicates)}")
if not chapters:
print("WARNING: no chapters found.")
print("Expected headings like: 'Prologue' or 'Chapter 12 - Name' or 'Chapter - 12'")
return
warn_missing_chapters(chapters)
if args.list:
print(f"\nDetected {len(chapters)} chapters:\n")
print(f" {'#':>4} {'Label':<45} {'Chars':>8} {'Output filename'}")
print(f" {'-' * 4} {'-' * 45} {'-' * 8} {'-' * 30}")
for ch in chapters:
chars = len(_clean_text(ch["text"]))
print(f" {ch['num']:>4} {ch['label']:<45} {chars:>8,} {ch['slug']}.wav")
return
if args.chapters:
requested = set(args.chapters)
run_chapters = [ch for ch in chapters if ch["num"] in requested]
missing_req = sorted(requested - {ch["num"] for ch in run_chapters})
if missing_req:
print(f"WARNING: requested chapter(s) not found: {missing_req}")
else:
run_chapters = chapters
if not run_chapters:
print("No chapters selected. Use --list to see available chapters.")
return
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {device}")
if device == "cuda":
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Voice: {args.voice}")
chapter_chars = {ch["num"]: len(_clean_text(ch["text"])) for ch in run_chapters}
total_chars = sum(chapter_chars.values())
preview_note = f"PREVIEW MODE: capped at {args.preview:,} chars/chapter" if args.preview else ""
if preview_note:
print(preview_note)
print("\nPlan:")
for ch in run_chapters:
print(f" {ch['num']:>3} {ch['label']} ({chapter_chars[ch['num']]:,} chars)")
print(f" TOTAL: {total_chars:,} chars\n")
print("Initializing Kokoro pipeline...")
pipeline = KPipeline(lang_code=LANG_CODE)
chars_per_sec: float | None = None
timing_rows: list[tuple[str, int, float]] = []
for ch in run_chapters:
text = _clean_text(ch["text"])
if not text:
print(f"[{ch['label']}] WARNING empty text, skipping")
continue
if args.preview and len(text) > args.preview:
cut = text.rfind(" ", 0, args.preview)
text = text[: cut if cut > 0 else args.preview]
chars = len(text)
preview_tag = "_preview" if args.preview else ""
out_path = output_dir / f"{ch['slug']}{preview_tag}.wav"
if chars_per_sec is not None:
eta = _fmt_duration(chars / chars_per_sec)
print(f"\n[{ch['label']}] -> {out_path.name} (est. {eta})")
else:
print(f"\n[{ch['label']}] -> {out_path.name} (calibration run)")
elapsed = generate_audio(pipeline, text, args.voice, out_path)
timing_rows.append((ch["label"], chars, elapsed))
done_chars = sum(c for _, c, _ in timing_rows)
done_elapsed = sum(e for _, _, e in timing_rows)
if done_elapsed > 0:
chars_per_sec = done_chars / done_elapsed
remaining = total_chars - done_chars
eta_total = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
print(f" Speed: {chars_per_sec:.0f} chars/sec | Estimated remaining: {eta_total}")
print("\nSummary:")
print(f" {'Chapter':<35} {'Chars':>7} {'Actual':>8} {'Est':>8}")
print(" " + "-" * 65)
for i, (label, chars, elapsed) in enumerate(timing_rows):
actual_str = _fmt_duration(elapsed)
prior_chars = sum(c for _, c, _ in timing_rows[:i])
prior_elapsed = sum(e for _, _, e in timing_rows[:i])
est_str = _fmt_duration(chars / (prior_chars / prior_elapsed)) if prior_elapsed > 0 else "(first)"
print(f" {label:<35} {chars:>7,} {actual_str:>8} {est_str:>8}")
total_elapsed = sum(e for _, _, e in timing_rows)
total_done_chars = sum(c for _, c, _ in timing_rows)
print(" " + "-" * 65)
print(f" {'TOTAL':<35} {total_done_chars:>7,} {_fmt_duration(total_elapsed):>8}")
print("\nDone.")
if __name__ == "__main__":
main()

create_audiobook_lightbringer.py Normal file

@@ -0,0 +1,311 @@
"""
create_audiobook_lightbringer.py
─────────────────────────────────
Generate the "A Darkness Rising" audiobook — one file per chapter/prologue.
Reads all .txt files from NOVEL_DIR, detects Prologue + Chapter headings,
and writes one .wav per chapter into OUTPUT_DIR.
Usage:
python create_audiobook_lightbringer.py # all chapters
python create_audiobook_lightbringer.py --list # list detected chapters
python create_audiobook_lightbringer.py 0 1 2 # prologue + ch1 + ch2
python create_audiobook_lightbringer.py --preview # short preview clips
Output filenames:
chapter_00_prologue.wav
chapter_01_homecoming.wav
chapter_02_the_anhuil_ehlar.wav
...
"""
import argparse
import re
import time
import numpy as np
import soundfile as sf
import torch
from pathlib import Path
from kokoro import KPipeline
# ── Config ─────────────────────────────────────────────────────────────────────
NOVEL_DIR = Path("Audio Text for Novel Lightbringer")
OUTPUT_DIR = Path("output_audiobook_lightbringer")
SAMPLE_RATE = 24000
SPEED = 1.0
LANG_CODE = "a" # American English
VOICE = "am_onyx" # default narrator voice
# Regex that matches a chapter/prologue heading line (case-insensitive).
# Group 1 captures the chapter number (or None for Prologue).
# Group 2 captures the optional subtitle after " - ".
_HEADING_RE = re.compile(
r"^(?:Chapter\s+(\d+)\s*(?:-\s*(.+))?|(Prologue))\s*$",
re.IGNORECASE,
)
# ── Helpers ────────────────────────────────────────────────────────────────────
def _slug(text: str) -> str:
"""Convert title text to a filesystem-safe slug."""
text = text.lower()
text = re.sub(r"[^a-z0-9]+", "_", text)
return text.strip("_")
def load_all_chapters(novel_dir: Path) -> list[dict]:
"""
Read all .txt files in *novel_dir* in sorted order, detect Prologue /
Chapter headings, and return a list of chapter dicts:
{
"num": int, # 0 = Prologue
"title": str, # subtitle portion, e.g. "Homecoming"
"label": str, # human label, e.g. "Chapter 1 - Homecoming"
"slug": str, # e.g. "chapter_01_homecoming"
"text": str, # full body text of the chapter
}
Chapters from multiple files are concatenated in sorted-filename order.
"""
txt_files = sorted(novel_dir.glob("*.txt"))
if not txt_files:
raise FileNotFoundError(f"No .txt files found in '{novel_dir}'")
# Collect (chapter_num, title_line, body_lines) across all files
raw: list[tuple[int, str, list[str]]] = [] # (num, heading_text, body)
current_num: int | None = None
current_heading: str = ""
current_body: list[str] = []
def _flush():
if current_num is not None:
raw.append((current_num, current_heading, list(current_body)))
for fpath in txt_files:
lines = fpath.read_text(encoding="utf-8").splitlines()
for line in lines:
m = _HEADING_RE.match(line.strip())
if m:
_flush()
if m.group(3): # Prologue
current_num = 0
current_heading = "Prologue"
else: # Chapter N
current_num = int(m.group(1))
subtitle = (m.group(2) or "").strip()
current_heading = f"Chapter {current_num}" + (f" - {subtitle}" if subtitle else "")
current_body = [line] # keep heading inside text
else:
if current_num is not None:
current_body.append(line)
_flush()
# Build chapter dicts, deduplicated and sorted by number
seen: set[int] = set()
chapters: list[dict] = []
for num, heading, body in sorted(raw, key=lambda x: x[0]):
if num in seen:
continue
seen.add(num)
# Derive subtitle / slug
subtitle = ""
sm = re.match(r"Chapter\s+\d+\s*-\s*(.+)", heading, re.IGNORECASE)
if sm:
subtitle = sm.group(1).strip()
elif heading.lower() == "prologue":
subtitle = "Prologue"
num_str = f"{num:02d}"
if subtitle:
slug = f"chapter_{num_str}_{_slug(subtitle)}"
else:
slug = f"chapter_{num_str}"
chapters.append({
"num": num,
"title": subtitle or heading,
"label": heading,
"slug": slug,
"text": "\n".join(body),
})
return chapters
def clean_text(text: str) -> str:
"""Strip formatting artifacts and normalise whitespace for TTS."""
# Remove horizontal-rule lines (underscores / asterisks / dashes)
text = re.sub(r"^[_\-\*\s]{3,}\s*$", "", text, flags=re.MULTILINE)
# Collapse 3+ blank lines to 2
text = re.sub(r"\n{3,}", "\n\n", text)
return text.strip()
def _fmt_duration(seconds: float) -> str:
h, rem = divmod(int(seconds), 3600)
m, s = divmod(rem, 60)
if h > 0:
return f"{h}h {m:02d}m {s:02d}s"
if m > 0:
return f"{m}m {s:02d}s"
return f"{s}s"
def generate_audio(pipeline: KPipeline, text: str, voice: str,
output_path: Path) -> float:
"""Generate audio and return wall-clock seconds elapsed."""
t0 = time.monotonic()
chunks = []
for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
if hasattr(chunk_audio, "numpy"):
chunk_audio = chunk_audio.cpu().numpy()
chunk_audio = np.atleast_1d(chunk_audio.squeeze())
if chunk_audio.size > 0:
chunks.append(chunk_audio)
elapsed = time.monotonic() - t0
if chunks:
audio = np.concatenate(chunks, axis=0)
sf.write(str(output_path), audio, SAMPLE_RATE)
duration = len(audio) / SAMPLE_RATE
print(f" ✓ Saved '{output_path.name}' "
f"({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)")
else:
print(f" ✗ No audio produced for voice='{voice}'")
return elapsed
# ── Main ───────────────────────────────────────────────────────────────────────
def main() -> None:
parser = argparse.ArgumentParser(
description="Generate 'A Darkness Rising' audiobook, one file per chapter."
)
parser.add_argument(
"chapters", nargs="*", type=int,
help="Chapter numbers to generate (0 = Prologue). Default: all.",
)
parser.add_argument(
"--list", action="store_true",
help="Print detected chapters and exit.",
)
parser.add_argument(
"--voice", default=VOICE,
help=f"Kokoro voice to use (default: {VOICE}).",
)
parser.add_argument(
"--preview", nargs="?", const=3000, type=int, metavar="CHARS",
help="Generate short preview clips (default: 3000 chars). "
"Output filenames get a _preview suffix.",
)
args = parser.parse_args()
print("Loading chapters …")
all_chapters = load_all_chapters(NOVEL_DIR)
if args.list:
print(f"\nDetected {len(all_chapters)} chapters:\n")
print(f" {'#':>4} {'Label':<45} {'Chars':>8} {'Output filename'}")
print(f" {'─'*4} {'─'*45} {'─'*8} {'─'*30}")
for ch in all_chapters:
chars = len(clean_text(ch["text"]))
print(f" {ch['num']:>4} {ch['label']:<45} {chars:>8,} {ch['slug']}.wav")
return
# Filter to requested subset
if args.chapters:
requested = set(args.chapters)
run_chapters = [ch for ch in all_chapters if ch["num"] in requested]
missing = requested - {ch["num"] for ch in run_chapters}
if missing:
print(f"⚠ Chapter(s) not found: {sorted(missing)}")
else:
run_chapters = all_chapters
if not run_chapters:
print("No chapters selected. Use --list to see available chapters.")
return
voice = args.voice
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {device}")
if device == "cuda":
print(f"GPU: {torch.cuda.get_device_name(0)}")
print(f"Voice: {voice}")
OUTPUT_DIR.mkdir(exist_ok=True)
# Pre-compute char counts
chapter_chars = {ch["num"]: len(clean_text(ch["text"])) for ch in run_chapters}
preview_note = (f" ⚡ PREVIEW MODE — capped at {args.preview:,} chars/chapter\n"
if args.preview else "")
print(f"\n{preview_note}{'─'*65}")
print(f" {'#':>4} {'Label':<40} {'Chars':>8}")
print(f" {'─'*4} {'─'*40} {'─'*8}")
for ch in run_chapters:
print(f" {ch['num']:>4} {ch['label']:<40} {chapter_chars[ch['num']]:>8,}")
print(f" {'─'*55}")
total_chars = sum(chapter_chars.values())
print(f" {'TOTAL':<45} {total_chars:>8,}\n")
print("Initialising Kokoro pipeline …")
pipeline = KPipeline(lang_code=LANG_CODE)
chars_per_sec: float | None = None
timing_rows: list[tuple[str, int, float]] = []
for ch in run_chapters:
text = clean_text(ch["text"])
if not text:
print(f"\n[{ch['label']}] ⚠ Empty text — skipping")
continue
preview_chars = args.preview
if preview_chars and len(text) > preview_chars:
cut = text.rfind(" ", 0, preview_chars)
text = text[: cut if cut > 0 else preview_chars]
chars = len(text)
preview_tag = "_preview" if args.preview else ""
out_path = OUTPUT_DIR / f"{ch['slug']}{preview_tag}.wav"
if chars_per_sec is not None:
eta_str = _fmt_duration(chars / chars_per_sec)
print(f"\n[{ch['label']}] voice={voice} → {out_path.name} (est. {eta_str})")
else:
print(f"\n[{ch['label']}] voice={voice} → {out_path.name} (calibration run)")
elapsed = generate_audio(pipeline, text, voice, out_path)
timing_rows.append((ch["label"], chars, elapsed))
total_done = sum(c for _, c, _ in timing_rows)
total_elapsed_done = sum(e for _, _, e in timing_rows)
if total_elapsed_done > 0:
chars_per_sec = total_done / total_elapsed_done
remaining = total_chars - total_done
eta_overall = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
print(f" ⏱ Speed: {chars_per_sec:.0f} chars/sec | Est. overall remaining: {eta_overall}")
# Summary
print("\n" + "─" * 65)
print(f" {'Chapter':<35} {'Chars':>7} {'Actual':>8} {'Est':>8}")
print("─" * 65)
for i, (label, chars, elapsed) in enumerate(timing_rows):
actual_str = _fmt_duration(elapsed)
prior_chars = sum(c for _, c, _ in timing_rows[:i])
prior_elapsed = sum(e for _, _, e in timing_rows[:i])
if prior_elapsed > 0:
est_str = _fmt_duration(chars / (prior_chars / prior_elapsed))
else:
est_str = "(first)"
print(f" {label:<35} {chars:>7,} {actual_str:>8} {est_str:>8}")
total_elapsed = sum(e for _, _, e in timing_rows)
print("─" * 65)
print(f" {'TOTAL':<35} {sum(c for _,c,_ in timing_rows):>7,} "
f"{_fmt_duration(total_elapsed):>8}")
print("\nDone.")
if __name__ == "__main__":
main()

create_audiobook_nem.py

@@ -4,13 +4,19 @@ audiobook_nem.py
 Generate the Book of the Nem audiobook — one unique voice per book/section.
 
 Usage:
-    python audiobook_nem.py
+    python create_audiobook_nem.py                          # all enabled books
+    python create_audiobook_nem.py --list                   # list available book labels
+    python create_audiobook_nem.py Introduction
+    python create_audiobook_nem.py "Book of Hagoth"
+    python create_audiobook_nem.py Introduction "Book of Hagoth"
 
-To skip a section, comment out its entry in BOOKS below.
+To permanently skip a section, comment out its entry in BOOKS below.
 Output .wav files are written to OUTPUT_DIR (created automatically).
 """
 
+import argparse
 import re
+import time
 import numpy as np
 import soundfile as sf
 import torch
@@ -27,8 +33,12 @@ SPEED = 1.0
 LANG_CODE = "a"   # 'a' = American English
 
 # ── Available Kokoro voices (American English, lang_code='a') ──────────────────
+# af_bella    American female           [downloaded]
 # af_heart    warm American female      [downloaded]
 # af_nicole   American female           [downloaded]
+# af_river    American female           [downloaded]
+# af_sarah    American female           [downloaded]
+# af_sky      American female           [downloaded]
 # am_adam     American male (deep)      [downloaded]
 # am_echo     American male             [downloaded]
 # am_eric     American male             [downloaded]
@@ -40,30 +50,30 @@ LANG_CODE = "a"   # 'a' = American English
 # am_santa    American male             [downloaded]  (not used)
 
 # ── Book definitions ───────────────────────────────────────────────────────────
-# Format: (label, start_marker, voice, output_wav)
-#   start_marker  exact text of the FIRST line of the section header in the source
-#                 (leading/trailing whitespace is ignored when matching)
+# Format: (label, (start_line1, start_line2), voice, output_wav)
+#   start_line1   exact text of the FIRST line of the section header
+#   start_line2   prefix of the SECOND line (used together for unambiguous matching)
 #   voice         Kokoro voice name
 #   output_wav    filename saved inside OUTPUT_DIR
 #
 # Comment out any line to skip that section entirely.
 BOOKS = [
-    # label                      start_marker                   voice         output_wav
-    ("Introduction",             "Introduction",                "af_heart",   "00_introduction.wav"),
-    ("Book of Hagoth",           "THE BOOK OF HAGOTH",          "am_fenrir",  "01_hagoth.wav"),
-    ("Shi-Tugo I",               "THE FIRST BOOK OF SHI-TUGO",  "am_eric",    "02_shi_tugo_1.wav"),
-    ("Sanempet",                 "THE BOOK OF SANEMPET",        "am_liam",    "03_sanempet.wav"),
-    ("Oug",                      "THE BOOK OF OUG",             "am_michael", "04_oug.wav"),
-    ("Temple Writings of Oug",   "THE BOOK OF",                 "am_michael", "05_temple_writings_oug.wav"),
-    ("Sacred Temple Writings",   "THE SACRED",                  "am_michael", "06_sacred_temple_writings.wav"),
-    ("Samuel the Lamanite I",    "THE FIRST BOOK",              "am_echo",    "07_samuel_lamanite_1.wav"),
-    ("Samuel the Lamanite II",   "THE SECOND BOOK",             "am_echo",    "08_samuel_lamanite_2.wav"),
-    ("Manti",                    "THE BOOK OF MANTI",           "am_onyx",    "09_manti.wav"),
-    ("Pa Nat I",                 "THE FIRST BOOK OF PA NAT",    "af_nicole",  "10_pa_nat_1.wav"),
-    ("Moroni I",                 "THE FIRST BOOK OF MORONI",    "am_adam",    "11_moroni_1.wav"),
-    ("Moroni II",                "THE SECOND BOOK OF MORONI",   "am_adam",    "12_moroni_2.wav"),
-    ("Moroni III",               "THE THIRD BOOK OF MORONI",    "am_adam",    "13_moroni_3.wav"),
-    ("Shioni",                   "THE BOOK OF SHIONI",          "am_puck",    "14_shioni.wav"),
+    # label                      (start_line1, start_line2)                                   voice         output_wav
+    ("Introduction",             ("Introduction", "The Book of the Nem"),                     "af_heart",   "00_introduction.wav"),
+    ("Book of Hagoth",           ("THE BOOK OF HAGOTH", "THE SON OF HAGMENI,"),               "am_santa",   "01_hagoth.wav"),
+    ("Shi-Tugo I",               ("THE FIRST BOOK OF SHI-TUGO", "FORMER WARRIOR, AMMONITE"),  "am_eric",    "02_shi_tugo_1.wav"),
+    ("Sanempet",                 ("THE BOOK OF SANEMPET", "THE SON OF HAGMENI,"),             "am_liam",    "03_sanempet.wav"),
+    ("Oug",                      ("THE BOOK OF OUG", "THE SON OF SANEMPET"),                  "am_michael", "04_oug.wav"),
+    ("Temple Writings of Oug",   ("THE BOOK OF", "THE TEMPLE WRITINGS"),                      "am_michael", "05_temple_writings_oug.wav"),
+    ("Sacred Temple Writings",   ("THE SACRED", "TEMPLE WRITINGS"),                           "am_michael", "06_sacred_temple_writings.wav"),
+    ("Samuel the Lamanite I",    ("THE FIRST BOOK", "OF SAMUEL THE LAMANITE"),                "am_echo",    "07_samuel_lamanite_1.wav"),
+    ("Samuel the Lamanite II",   ("THE SECOND BOOK", "OF SAMUEL THE LAMANITE"),               "am_echo",    "08_samuel_lamanite_2.wav"),
+    ("Manti",                    ("THE BOOK OF MANTI", "THE SON OF OUG"),                     "am_onyx",    "09_manti.wav"),
+    ("Pa Nat I",                 ("THE FIRST BOOK OF PA NAT", "THE DAUGHTER OF SHIMLEI"),     "af_bella",   "10_pa_nat_1.wav"),
+    ("Moroni I",                 ("THE FIRST BOOK OF MORONI", "THE SON OF MORMON,"),          "am_adam",    "11_moroni_1.wav"),
+    ("Moroni II",                ("THE SECOND BOOK OF MORONI", "THE SON OF MORMON,"),         "am_adam",    "12_moroni_2.wav"),
+    ("Moroni III",               ("THE THIRD BOOK OF MORONI", "THE SON OF MORMON,"),          "am_adam",    "13_moroni_3.wav"),
+    ("Shioni",                   ("THE BOOK OF SHIONI", "THE SON OF MORONI"),                 "am_puck",    "14_shioni.wav"),
 ]
@@ -71,23 +81,36 @@ BOOKS = [
 def load_and_split(source: Path, books: list) -> dict[str, str]:
     """
     Read the source file and split it into sections keyed by label.
-    Each section starts at its start_marker line and ends just before the
-    next section's start_marker.
+    Each section starts at its (start_line1, start_line2) marker pair and
+    ends just before the next section's marker.
+
+    Marker positions are always detected from the *original* unmodified file
+    (_ORIG_FILE) when it exists, so that phonetic fixes applied to section
+    headings in the TTS-fixed file can never break section detection. The
+    line numbers are identical in both files because word-level replacements
+    never add or remove lines.
     """
-    raw_lines = source.read_text(encoding="utf-8").splitlines()
+    # Use the original (un-fixed) file for marker detection so phonetic
+    # changes to heading lines don't break matching.
+    marker_source = _ORIG_FILE if _ORIG_FILE.exists() else source
+    marker_lines = marker_source.read_text(encoding="utf-8").splitlines()
 
-    # Build a mapping: marker_text → index in BOOKS
-    markers = [(label, marker.strip()) for label, marker, _, _ in books]
+    # The content to actually return comes from `source` (may be fixed file).
+    content_lines = source.read_text(encoding="utf-8").splitlines()
 
-    # Find the line index of each marker's first occurrence
+    # Build a mapping: (label, line1, line2) for each book
+    markers = [(label, m[0].strip(), m[1].strip()) for label, m, _, _ in books]
+
+    # Find the line index of each marker's first occurrence (two-line match)
     marker_positions: list[tuple[int, int]] = []   # (line_idx, books_idx)
-    for book_idx, (label, marker) in enumerate(markers):
-        for line_idx, line in enumerate(raw_lines):
-            if line.strip() == marker:
+    for book_idx, (label, m1, m2) in enumerate(markers):
+        for line_idx, line in enumerate(marker_lines[:-1]):
+            if (line.strip().upper() == m1.upper() and
+                    marker_lines[line_idx + 1].strip().upper().startswith(m2.upper())):
                 marker_positions.append((line_idx, book_idx))
                 break
         else:
-            print(f"  ⚠ Marker not found for '{label}': '{marker}' — skipping")
+            print(f"  ⚠ Marker not found for '{label}': '{m1}' / '{m2}' — skipping")
 
     marker_positions.sort(key=lambda x: x[0])
@@ -97,8 +120,8 @@ def load_and_split(source: Path, books: list) -> dict[str, str]:
         if rank + 1 < len(marker_positions):
             end_line = marker_positions[rank + 1][0]
         else:
-            end_line = len(raw_lines)
-        text = "\n".join(raw_lines[line_idx:end_line]).strip()
+            end_line = len(content_lines)
+        text = "\n".join(content_lines[line_idx:end_line]).strip()
         sections[label] = text
 
     return sections
@@ -118,8 +141,21 @@ def clean_text(text: str) -> str:
     return text.strip()
 
 
+def _fmt_duration(seconds: float) -> str:
+    """Format seconds as 'Xh Ym Zs', 'Xm Ys', or 'Xs'."""
+    h, rem = divmod(int(seconds), 3600)
+    m, s = divmod(rem, 60)
+    if h > 0:
+        return f"{h}h {m:02d}m {s:02d}s"
+    if m > 0:
+        return f"{m}m {s:02d}s"
+    return f"{s}s"
+
+
 def generate_audio(pipeline: KPipeline, text: str, voice: str,
-                   output_path: Path) -> None:
+                   output_path: Path) -> float:
+    """Generate audio and return wall-clock seconds elapsed."""
+    t0 = time.monotonic()
     chunks = []
     for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
         if hasattr(chunk_audio, "numpy"):
@@ -131,15 +167,55 @@ def generate_audio(pipeline: KPipeline, text: str, voice: str,
     if chunks:
         audio = np.concatenate(chunks, axis=0)
         sf.write(str(output_path), audio, SAMPLE_RATE)
+        elapsed = time.monotonic() - t0
         duration = len(audio) / SAMPLE_RATE
-        print(f"  ✓ Saved '{output_path.name}' ({duration:.1f}s)")
+        print(f"  ✓ Saved '{output_path.name}' ({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)")
     else:
+        elapsed = time.monotonic() - t0
         print(f"  ✗ No audio produced for voice='{voice}'")
+    return elapsed
 
 
 # ── Main ───────────────────────────────────────────────────────────────────────
 
 def main() -> None:
+    # ── CLI ────────────────────────────────────────────────────────────
+    parser = argparse.ArgumentParser(description="Generate Nem audiobook sections.")
+    parser.add_argument(
+        "books", nargs="*",
+        help="Labels of sections to generate (default: all enabled books). "
+             "Use --list to see available labels."
+    )
+    parser.add_argument(
+        "--list", action="store_true",
+        help="Print all enabled book labels and exit."
+    )
+    parser.add_argument(
+        "--preview", nargs="?", const=3000, type=int, metavar="CHARS",
+        help="Generate a short preview clip per book (default: 3000 chars). "
+             "Output filenames get a _preview suffix."
+    )
+    args = parser.parse_args()
+
+    enabled_labels = [label for label, _, _, _ in BOOKS]
+    if args.list:
+        print("Enabled books:")
+        for label in enabled_labels:
+            print(f"  {label}")
+        return
+
+    # Filter to requested subset, preserving BOOKS order
+    if args.books:
+        unknown = [b for b in args.books if b not in enabled_labels]
+        if unknown:
+            print(f"Unknown book label(s): {', '.join(unknown)}")
+            print(f"Run with --list to see available labels.")
+            return
+        run_books = [b for b in BOOKS if b[0] in args.books]
+    else:
+        run_books = list(BOOKS)
+
     device = "cuda" if torch.cuda.is_available() else "cpu"
     print(f"Device: {device}")
     if device == "cuda":
@@ -150,25 +226,95 @@ def main() -> None:
     print(f"\nSource: '{SOURCE_FILE}'"
           + ("  ✓ (TTS fixed)" if SOURCE_FILE == _FIXED_FILE else
              "  ⚠ (original — run 'Apply Fixes to Text' in the GUI to use phonetic fixes)"))
+
+    # Always split using ALL books for correct section boundaries,
+    # but only generate for run_books.
     sections = load_and_split(SOURCE_FILE, BOOKS)
-    print(f"  Found {len(sections)} sections.\n")
+    print(f"  Found {len(sections)} sections ({len(run_books)} selected).\n")
 
     print("Initialising Kokoro pipeline …")
     pipeline = KPipeline(lang_code=LANG_CODE)
 
-    for label, marker, voice, wav_name in BOOKS:
-        if label not in sections:
-            continue   # marker was not found; warning already printed
-
-        print(f"\n[{label}] voice={voice} → {wav_name}")
-        text = clean_text(sections[label])
-        if not text:
-            print("  ⚠ Empty text — skipping")
-            continue
-        out_path = OUTPUT_DIR / wav_name
-        generate_audio(pipeline, text, voice, out_path)
+    # Pre-compute char counts for all sections so we can estimate ETAs
+    section_chars: dict[str, int] = {
+        label: len(clean_text(sections[label]))
+        for label, _, _, _ in run_books
+        if label in sections
+    }
+
+    # Print char count summary before starting
+    preview_note = f"  ⚡ PREVIEW MODE — capped at {args.preview:,} chars/book\n" if args.preview else ""
+    print(f"\n{preview_note}{'─' * 52}")
+    print(f"  {'Section':<30} {'Chars':>8}")
+    print(f"{'─' * 52}")
+    for label, _, _, wav_name in run_books:
+        if label in section_chars:
+            print(f"  {label:<30} {section_chars[label]:>8,}")
+    print(f"{'─' * 52}")
+    total_chars = sum(section_chars.values())
+    print(f"  {'TOTAL':<30} {total_chars:>8,}")
+    print()
+
+    chars_per_sec: float | None = None   # derived from the first book that finishes
+    timing_rows: list[tuple[str, int, float]] = []   # (label, chars, elapsed)
+
+    for label, _marker, voice, wav_name in run_books:
+        if label not in sections:
+            continue
+        text = clean_text(sections[label])
+        if not text:
+            print(f"\n[{label}] ⚠ Empty text — skipping")
+            continue
+
+        # Preview mode: truncate to requested char limit at a word boundary
+        preview_chars = args.preview
+        if preview_chars:
+            if len(text) > preview_chars:
+                cut = text.rfind(" ", 0, preview_chars)
+                text = text[: cut if cut > 0 else preview_chars]
+        chars = len(text)
+
+        # Print ETA once we have a calibration rate
+        if chars_per_sec is not None:
+            eta_sec = chars / chars_per_sec
+            eta_str = _fmt_duration(eta_sec)
+            print(f"\n[{label}] voice={voice} → {wav_name} (est. {eta_str})")
+        else:
+            print(f"\n[{label}] voice={voice} → {wav_name} (timing calibration run)")
+
+        stem, ext = wav_name.rsplit(".", 1)
+        preview_tag = "_preview" if preview_chars else ""
+        out_path = OUTPUT_DIR / f"{stem}_{voice}{preview_tag}.{ext}"
+        elapsed = generate_audio(pipeline, text, voice, out_path)
+        timing_rows.append((label, chars, elapsed))
+
+        # Update calibration as a cumulative average after every book
+        total_chars_done = sum(c for _, c, _ in timing_rows)
+        total_elapsed_done = sum(e for _, _, e in timing_rows)
+        if total_elapsed_done > 0:
+            chars_per_sec = total_chars_done / total_elapsed_done
+            remaining = total_chars - total_chars_done
+            eta_overall = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
+            print(f"  ⏱ Speed: {chars_per_sec:.0f} chars/sec | Est. overall remaining: {eta_overall}")
+
+    # ── Summary ────────────────────────────────────────────────────────────────
+    print("\n" + "─" * 60)
+    print(f"  {'Section':<30} {'Chars':>7} {'Actual':>8} {'Est':>8}")
+    print("─" * 60)
+    for i, (label, chars, elapsed) in enumerate(timing_rows):
+        actual_str = _fmt_duration(elapsed)
+        # Estimate using the cumulative rate *before* this book was added
+        prior_chars = sum(c for _, c, _ in timing_rows[:i])
+        prior_elapsed = sum(e for _, _, e in timing_rows[:i])
+        if prior_elapsed > 0:
+            est_str = _fmt_duration(chars / (prior_chars / prior_elapsed))
+        else:
+            est_str = "(first run)"
+        print(f"  {label:<30} {chars:>7,} {actual_str:>8} {est_str:>8}")
+    total_elapsed = sum(e for _, _, e in timing_rows)
+    print("─" * 60)
+    print(f"  {'TOTAL':<30} {sum(c for _,c,_ in timing_rows):>7,} {_fmt_duration(total_elapsed):>8}")
 
     print("\nDone.")

create_temple_voices.py — new file (352 lines)

@ -0,0 +1,352 @@
"""
create_temple_voices.py
────────────────────────
Generate the "Sacred Temple Writings" section of the Nem audiobook using one
distinct Microsoft Edge neural TTS voice per character (NOT Kokoro).
Uses the free edge-tts library which streams Microsoft Azure neural voices.
Audio is stitched into a single WAV and saved to OUTPUT_DIR.
Usage:
python create_temple_voices.py # full render
python create_temple_voices.py --preview 40 # first 40 segments only
python create_temple_voices.py --print-segments # inspect parsed segments
python create_temple_voices.py --list-voices # list available en voices
Voice assignments live in CHARACTER_VOICES below — easy to customise.
Run --list-voices to discover all available edge-tts voice names.
"""
import argparse
import asyncio
import re
import subprocess
import time
from collections import Counter
from pathlib import Path
import numpy as np
import soundfile as sf
import edge_tts
# ── File / output config ───────────────────────────────────────────────────────
_FIXED_FILE = Path("Audio Master Nem Full (TTS Fixed).txt")
_ORIG_FILE = Path("Audio Master Nem Full.txt")
SOURCE_FILE = _FIXED_FILE if _FIXED_FILE.exists() else _ORIG_FILE
OUTPUT_DIR = Path("output_temple_voices")
OUTPUT_FILE = "sacred_temple_writings_multivoice.wav"
SAMPLE_RATE = 24_000 # Hz — final WAV sample rate
PAUSE_SAME = 350 # ms silence between same-speaker segments
PAUSE_CHANGE = 650 # ms silence between different-speaker segments
# ── Section boundary markers (match create_audiobook_nem.py BOOKS order) ──────
# Sacred Temple Writings starts at "THE SACRED" / "TEMPLE WRITINGS"
# and ends just before "THE FIRST BOOK" / "OF SAMUEL THE LAMANITE"
_SEC_START_L1 = "THE SACRED"
_SEC_START_L2 = "TEMPLE WRITINGS"
_SEC_END_L1 = "THE FIRST BOOK"
_SEC_END_L2 = "OF SAMUEL THE LAMANITE"
# ── Character → edge-tts voice ────────────────────────────────────────────────
# Run python create_temple_voices.py --list-voices to see all available voices.
# Keys must match the speaker labels exactly as they appear in the source file.
CHARACTER_VOICES: dict[str, str] = {
# ── Celestial beings ───────────────────────────────────────────────────────
"Narrator": "en-US-GuyNeural", # calm neutral narrator
"Elohim Heavenly Mother": "en-US-JennyNeural", # warm, wise matriarch
"Elohim Heavenly Father": "en-US-AndrewMultilingualNeural", # expressive, authoritative
"Jehovah": "en-US-AndrewNeural", # clear, gentle divine
"Angel of the Lord": "en-US-BrianNeural", # ethereal divine messenger
"Holy Ghost": "en-US-EricNeural", # quiet, inward, spiritual
"Holy Ghost Elders": "en-US-BrianNeural", # measured elder council
# ── Dark beings ────────────────────────────────────────────────────────────
"Lucifer": "en-CA-LiamNeural", # smooth, persuasive tempter
"Satan": "en-US-SteffanNeural", # cold, commanding adversary
# ── Mortal / earth characters ──────────────────────────────────────────────
"Michael": "en-US-RogerNeural", # noble warrior archangel
"Adam": "en-US-ChristopherNeural", # earnest first man
"Eve": "en-US-AriaNeural", # curious, warm first woman
# ── Apostles ───────────────────────────────────────────────────────────────
"Peter": "en-GB-RyanNeural", # firm British apostle
"James": "en-AU-WilliamMultilingualNeural", # steady Australian voice
"John": "en-IE-ConnorNeural", # gentle Irish apostle
# ── Other roles ────────────────────────────────────────────────────────────
"Preacher": "en-US-AvaNeural", # bold emphatic preacher
"Mob": "en-US-MichelleNeural", # crowd / multitude voice
"The Voice of the Mob": "en-US-MichelleNeural", # alias used in some editions
}
# Voice used when a speaker label isn't found in CHARACTER_VOICES
FALLBACK_VOICE = "en-US-GuyNeural"
# Lines/patterns that are ceremony stage-directions → read by Narrator
_STAGE_NARRATOR = re.compile(
r"^(Break for Instruction|Resume Session|All\s+arise|"
r"CHAPTER\s*\d*|________________+|────+)",
re.IGNORECASE,
)
# Lines to skip entirely (decorative / empty)
_SKIP_RE = re.compile(r"^[—\-_\s\u2014\u2013]*$")
# ── Section extraction ─────────────────────────────────────────────────────────
def extract_section(source: Path) -> str:
    """Return text of the Sacred Temple Writings section."""
    lines = source.read_text(encoding="utf-8").splitlines()
    in_sec = False
    out: list[str] = []
    for i, line in enumerate(lines):
        s = line.strip()
        if not in_sec:
            if (s.upper() == _SEC_START_L1 and
                    i + 1 < len(lines) and
                    lines[i + 1].strip().upper().startswith(_SEC_START_L2)):
                in_sec = True
        else:
            # End just before the next section
            if (s.upper() == _SEC_END_L1 and
                    i + 1 < len(lines) and
                    lines[i + 1].strip().upper().startswith(_SEC_END_L2)):
                break
            out.append(line)
    if not out:
        raise RuntimeError(
            f"Could not locate 'Sacred Temple Writings' in '{source}'.\n"
            "Ensure the source file has a line exactly matching "
            f"'{_SEC_START_L1}' followed by '{_SEC_START_L2}'."
        )
    return "\n".join(out)
# ── Segment parser ─────────────────────────────────────────────────────────────
def _speaker_regex(characters: list[str]) -> re.Pattern:
    """Regex matching   [optional-number] CharacterName:  text"""
    # Sort longest-first so "Holy Ghost Elders" matches before "Holy Ghost"
    names = sorted(characters, key=len, reverse=True)
    pat = "|".join(re.escape(n) for n in names)
    return re.compile(r"^\d*\s*(" + pat + r")\s*:\s*(.*)", re.IGNORECASE)


def parse_segments(text: str) -> list[tuple[str, str]]:
    """
    Convert section text into a list of (normalised_speaker, spoken_text) tuples.
    Non-attributed prose becomes Narrator lines.
    """
    char_re = _speaker_regex(list(CHARACTER_VOICES.keys()))
    # Build a quick lowercase→canonical lookup for speaker name normalisation
    canon: dict[str, str] = {k.lower(): k for k in CHARACTER_VOICES}

    segments: list[tuple[str, str]] = []
    cur_speaker = "Narrator"
    buf: list[str] = []

    def flush() -> None:
        combined = " ".join(l.strip() for l in buf if l.strip())
        if combined:
            segments.append((cur_speaker, combined))
        buf.clear()

    for raw in text.splitlines():
        line = raw.strip()
        if not line or _SKIP_RE.match(line):
            continue

        # Stage direction → Narrator reads it
        if _STAGE_NARRATOR.match(line):
            flush()
            cur_speaker = "Narrator"
            buf.append(line)
            continue

        # "The words of Jehovah … are in blue." — formatting note, skip
        if re.search(r"are in blue|words of jehovah", line, re.IGNORECASE):
            continue

        m = char_re.match(line)
        if m:
            flush()
            raw_name = m.group(1)
            cur_speaker = canon.get(raw_name.lower(), raw_name)
            spoken = m.group(2).strip()
            if spoken:
                buf.append(spoken)
        else:
            # Continuation of current speaker (or unattributed narrator prose)
            buf.append(line)

    flush()
    return segments
# ── Audio generation ───────────────────────────────────────────────────────────
async def _tts_bytes(text: str, voice: str) -> bytes:
    """Stream edge-tts and return raw MP3 bytes."""
    communicate = edge_tts.Communicate(text, voice)
    data = bytearray()
    async for chunk in communicate.stream():
        if chunk["type"] == "audio":
            data.extend(chunk["data"])
    return bytes(data)


def _mp3_to_numpy(mp3: bytes) -> np.ndarray:
    """Decode MP3 bytes → mono float32 numpy array at SAMPLE_RATE using ffmpeg."""
    cmd = [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",            # read MP3 from stdin
        "-f", "f32le",             # raw 32-bit little-endian float PCM
        "-acodec", "pcm_f32le",
        "-ac", "1",                # mono
        "-ar", str(SAMPLE_RATE),   # resample to target rate
        "pipe:1",                  # write PCM to stdout
    ]
    result = subprocess.run(cmd, input=mp3, capture_output=True, check=True)
    return np.frombuffer(result.stdout, dtype=np.float32).copy()


def _silence(ms: int) -> np.ndarray:
    return np.zeros(int(SAMPLE_RATE * ms / 1000), dtype=np.float32)


async def render(
    segments: list[tuple[str, str]],
    preview: int | None = None,
) -> np.ndarray:
    """Generate and stitch all segment audio; return concatenated float32 array."""
    if preview is not None:
        segments = segments[:preview]

    parts: list[np.ndarray] = []
    last_speaker: str | None = None
    t0 = time.monotonic()

    for idx, (speaker, text) in enumerate(segments, 1):
        voice = CHARACTER_VOICES.get(speaker, FALLBACK_VOICE)
        marker = "⚠" if speaker not in CHARACTER_VOICES else " "
        print(f"  {marker}[{idx:>4}/{len(segments)}] {speaker:<28} {voice}")
        try:
            mp3 = await _tts_bytes(text, voice)
        except Exception as exc:
            print(f"    ↳ ERROR with '{voice}': {exc} — falling back to {FALLBACK_VOICE}")
            mp3 = await _tts_bytes(text, FALLBACK_VOICE)
        audio = _mp3_to_numpy(mp3)

        if parts:
            gap = PAUSE_SAME if speaker == last_speaker else PAUSE_CHANGE
            parts.append(_silence(gap))
        parts.append(audio)
        last_speaker = speaker

    elapsed = time.monotonic() - t0
    print(f"\n{len(segments)} segments in {elapsed:.0f}s")
    return np.concatenate(parts) if parts else np.array([], dtype=np.float32)
# ── Voice listing ──────────────────────────────────────────────────────────────
async def _list_voices_async() -> None:
    voices = await edge_tts.list_voices()
    english = sorted(
        (v for v in voices if v["Locale"].startswith("en-")),
        key=lambda v: (v["Locale"], v["ShortName"]),
    )
    print(f"\n  {'Locale':<12} {'Short Name':<45} Gender")
    print("  " + "─" * 68)
    for v in english:
        print(f"  {v['Locale']:<12} {v['ShortName']:<45} {v['Gender']}")
    print(f"\n  {len(english)} English voices total.")
# ── CLI / main ─────────────────────────────────────────────────────────────────
def main() -> None:
    ap = argparse.ArgumentParser(
        description="Render Sacred Temple Writings with per-character edge-tts voices."
    )
    ap.add_argument("--list-voices", action="store_true",
                    help="Print all available English edge-tts voices and exit.")
    ap.add_argument("--print-segments", action="store_true",
                    help="Print parsed (speaker, text) segments and exit.")
    ap.add_argument("--preview", type=int, metavar="N",
                    help="Render only the first N segments (quick test).")
    args = ap.parse_args()

    if args.list_voices:
        asyncio.run(_list_voices_async())
        return

    # ── Extract & parse ────────────────────────────────────────────────────────
    print(f"Source : {SOURCE_FILE}")
    text = extract_section(SOURCE_FILE)
    print(f"Section: {len(text):,} chars extracted\n")
    segments = parse_segments(text)

    if args.print_segments:
        print(f"Parsed {len(segments)} segments:\n")
        for i, (spkr, txt) in enumerate(segments, 1):
            snippet = txt[:90] + ("…" if len(txt) > 90 else "")
            voice = CHARACTER_VOICES.get(spkr, f"⚠ {FALLBACK_VOICE}")
            print(f"  {i:>4}. [{spkr}]  ({voice})\n        {snippet}\n")
        return

    # ── Summary table ──────────────────────────────────────────────────────────
    counts = Counter(s for s, _ in segments)
    unrecognised = {s for s in counts if s not in CHARACTER_VOICES}
    print(f"Parsed {len(segments)} segments across {len(counts)} speakers:\n")
    print(f"  {'Speaker':<28} {'Segs':>5} {'Voice'}")
    print(f"  {'─'*28} {'─'*5} {'─'*45}")
    for spkr, voice in CHARACTER_VOICES.items():
        if counts[spkr]:
            print(f"  {spkr:<28} {counts[spkr]:>5} {voice}")
    for spkr in sorted(unrecognised):
        print(f"  {spkr:<28} {counts[spkr]:>5} {FALLBACK_VOICE}  ⚠ unrecognised")

    total_chars = sum(len(t) for _, t in segments)
    print(f"\n  Total chars: {total_chars:,}")
    if args.preview:
        print(f"  ⚡ PREVIEW MODE — rendering first {args.preview} segments only")

    # ── GPU note ───────────────────────────────────────────────────────────────
    # edge-tts is cloud-based (Microsoft Azure neural, free) — GPU not used.
    print("\nNote: edge-tts uses Microsoft's servers (free, no API key needed).\n"
          "      Render speed depends on your internet connection.\n")

    # ── Render ─────────────────────────────────────────────────────────────────
    OUTPUT_DIR.mkdir(exist_ok=True)
    out_path = OUTPUT_DIR / (
        f"sacred_temple_writings_preview{args.preview}.wav"
        if args.preview else OUTPUT_FILE
    )

    print("Rendering segments …\n")
    audio = asyncio.run(render(segments, args.preview))

    if audio.size > 0:
        sf.write(str(out_path), audio, SAMPLE_RATE)
        dur = len(audio) / SAMPLE_RATE
        m, s = divmod(int(dur), 60)
        print(f"\n✓ Saved '{out_path}' ({m}m {s:02d}s audio | {SAMPLE_RATE} Hz)")
    else:
        print("✗ No audio produced — check parsing with --print-segments")


if __name__ == "__main__":
    main()
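The pause-insertion logic in `render` (a short gap between consecutive lines from the same speaker, a longer one on a speaker change) can be sketched without any TTS dependency; `stitch` below is a hypothetical standalone version operating on pre-generated audio arrays:

```python
import numpy as np

SAMPLE_RATE = 24_000          # Hz, as in the script above
PAUSE_SAME, PAUSE_CHANGE = 350, 650   # ms, same constants as above

def stitch(segments, sample_rate=SAMPLE_RATE):
    """segments: list of (speaker, mono float32 array).
    Inserts PAUSE_SAME ms between same-speaker segments and
    PAUSE_CHANGE ms whenever the speaker changes."""
    parts, last = [], None
    for speaker, audio in segments:
        if parts:  # no leading gap before the first segment
            gap_ms = PAUSE_SAME if speaker == last else PAUSE_CHANGE
            parts.append(np.zeros(int(sample_rate * gap_ms / 1000), dtype=np.float32))
        parts.append(audio)
        last = speaker
    return np.concatenate(parts) if parts else np.array([], dtype=np.float32)

a = np.ones(1000, dtype=np.float32)
out = stitch([("Adam", a), ("Adam", a), ("Eve", a)])
# 3 × 1000 samples + one 350 ms gap (8400) + one 650 ms gap (15600)
print(len(out))  # → 27000
```

Keeping the gap decision out of the TTS call makes the pacing easy to tune: only the two millisecond constants change, and the stitched waveform length is exactly predictable.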


@@ -18,6 +18,25 @@ from collections import defaultdict
 from pathlib import Path
 
 import spacy
+from wordfreq import top_n_list
+
+# ── Top 10 000 most-frequent English words ──────────────────────────
+TOP_10K_ENGLISH: frozenset[str] = frozenset(top_n_list("en", 10_000))
+
+# Words in the top-10k list that are genuine proper nouns in this text —
+# keep them despite the frequency filter.
+PROPER_NOUN_WHITELIST: frozenset[str] = frozenset({
+    # Biblical names
+    "aaron", "abel", "abraham", "adam", "cain", "eden", "egypt",
+    "elijah", "ephraim", "eve", "gad", "ham", "isaac", "israel",
+    "jacob", "james", "jehovah", "john", "joseph", "judah",
+    "laban", "lehi", "levi", "micah", "michael", "moses", "noah",
+    "peter", "pharaoh", "samuel", "sarah", "sarai", "seth", "simeon",
+    "timothy", "zion",
+    # Book-specific names that happen to match English words
+    "alma", "ether", "gideon", "limhi", "mormon", "moroni", "mulek",
+    "mosiah", "nephi", "satan", "sidon",
+})
 
 SOURCE = Path("Audio Master Nem Full.txt")
 OUTPUT = Path("proper_nouns.txt")
@@ -35,12 +54,29 @@ ORG_LABELS = {"ORG", "NORP"}
 OTHER_LABELS = {"EVENT", "WORK_OF_ART", "LAW", "PRODUCT", "LANGUAGE"}
 
 # ── Noise filters ──────────────────────────────────────────────────────────────
-# All-caps lines are section headers, not spoken names — skip them.
-# Also skip very short tokens that are likely artefacts.
-SKIP_PATTERNS = re.compile(
-    r"^(THE|A|AN|AND|OF|IN|TO|FOR|BY|AT|IS|WAS|BE|HE|SHE|IT|"
-    r"CHAPTER|VERSE|YEA|BEHOLD|LORD|GOD|CHRIST|HOLY|GHOST)$"
-)
+# Common English words that should be dropped when splitting multi-word entities.
+STOP_WORDS: set[str] = {
+    "A", "AN", "AND", "AS", "AT", "BE", "BUT", "BY",
+    "DO", "DID", "DOTH",
+    "EVEN", "FOR", "FROM",
+    "HAD", "HAS", "HAVE", "HATH", "HE", "HER", "HIS", "HOW",
+    "I", "IN", "IS", "IT", "ITS",
+    "MAY", "ME", "MORE", "MY",
+    "NAY", "NO", "NOT", "NOW",
+    "OF", "OR", "OUR",
+    "SHALL", "SHE", "SO", "SOME",
+    "THAT", "THE", "THEE", "THEIR", "THEN", "THERE", "THESE", "THEY",
+    "THIS", "THOSE", "THOU", "THUS", "THY", "TO",
+    "UP", "UPON", "US",
+    "WAS", "WE", "WHEN", "WHERE", "WHICH", "WHO", "WILL", "WITH",
+    "YE", "YEA", "YET", "YOU", "YOUR",
+    # Book-specific common words not worth flagging
+    "BEHOLD", "CHAPTER", "CHRIST", "GOD", "GHOST", "HOLY", "LORD", "VERSE",
+    # Generic nouns that slip through NER
+    "CITY", "DAYS", "DAY", "GREAT", "LAND", "MAN", "MEN", "NEW",
+    "PEOPLE", "SON", "TIME",
+}
 
 
 def is_noise(text: str) -> bool:
     t = text.strip()
@@ -48,9 +84,12 @@ def is_noise(text: str) -> bool:
         return True
     if t.isupper() and len(t) > 4:          # all-caps section header word
         return True
-    if SKIP_PATTERNS.match(t.upper()):
+    if t.upper() in STOP_WORDS:
         return True
-    if re.search(r"[^a-zA-Z\-' ]", t):      # contains digits or symbols
+    if re.search(r"[^a-zA-Z\-']", t):       # contains digits, spaces, or symbols
+        return True
+    # Drop common English words (no hyphens) unless whitelisted as proper nouns.
+    if "-" not in t and t.lower() in TOP_10K_ENGLISH and t.lower() not in PROPER_NOUN_WHITELIST:
         return True
     return False
@@ -60,6 +99,11 @@ def canonical(text: str) -> str:
     return " ".join(text.split()).title()
 
 
+def split_words(phrase: str) -> list[str]:
+    """Split a phrase on spaces; hyphenated words are kept as one token."""
+    return phrase.split()
+
+
 # ── Read and process ───────────────────────────────────────────────────────────
 print(f"Reading '{SOURCE}'")
 raw_text = SOURCE.read_text(encoding="utf-8")
@ -71,20 +115,23 @@ doc = nlp(raw_text)
buckets: dict[str, set[str]] = defaultdict(set) buckets: dict[str, set[str]] = defaultdict(set)
# 1. NER pass — trust spaCy's entity labels # 1. NER pass — trust spaCy's entity labels
# Multi-word entities (e.g. "Peter James John") are split into individual
# words; hyphenated words (e.g. "Anti-Nephi-Lehi") stay as one token.
for ent in doc.ents: for ent in doc.ents:
name = canonical(ent.text) phrase = canonical(ent.text)
if is_noise(name): for word in split_words(phrase):
continue if is_noise(word):
if ent.label_ in PERSON_LABELS: continue
buckets["People & Characters"].add(name) if ent.label_ in PERSON_LABELS:
elif ent.label_ in PLACE_LABELS: buckets["People & Characters"].add(word)
buckets["Places & Lands"].add(name) elif ent.label_ in PLACE_LABELS:
elif ent.label_ in ORG_LABELS: buckets["Places & Lands"].add(word)
buckets["Groups & Nations"].add(name) elif ent.label_ in ORG_LABELS:
elif ent.label_ in OTHER_LABELS: buckets["Groups & Nations"].add(word)
buckets["Other Named Things"].add(name) elif ent.label_ in OTHER_LABELS:
else: buckets["Other Named Things"].add(word)
buckets["Other Named Things"].add(name) else:
buckets["Other Named Things"].add(word)
 # 2. PROPN pass — catch names spaCy didn't recognise as entities
 # Only include tokens that are inside a sentence (not at position 0)
@@ -97,13 +144,13 @@ for token in doc:
         continue  # skip all-caps
     if token.i == token.sent.start:
         continue  # skip sentence-initial (could be any word)
-    name = canonical(text)
-    if is_noise(name):
+    word = canonical(text)
+    if is_noise(word):
         continue
     # Only add if not already captured by NER
-    already_captured = any(name in s for s in buckets.values())
+    already_captured = any(word in s for s in buckets.values())
     if not already_captured:
-        buckets["Unclassified Proper Nouns"].add(name)
+        buckets["Unclassified Proper Nouns"].add(word)

 # ── Write output ───────────────────────────────────────────────────────────────
 GROUP_ORDER = [
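The per-word splitting this diff introduces can be checked in isolation. A minimal sketch, copying `canonical` and `split_words` as they appear in the diff (no spaCy needed; the sample strings are invented):

```python
def canonical(text: str) -> str:
    # Collapse whitespace and title-case, as in the script.
    return " ".join(text.split()).title()

def split_words(phrase: str) -> list[str]:
    # Split on spaces only; hyphenated words stay as one token.
    return phrase.split()

# A multi-word entity is split into individual words...
print(split_words(canonical("peter  james john")))  # ['Peter', 'James', 'John']
# ...while a hyphenated name survives as a single token.
print(split_words(canonical("anti-nephi-lehi")))    # ['Anti-Nephi-Lehi']
```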

801
format_scripture.py Normal file
View File

@ -0,0 +1,801 @@
#!/usr/bin/env python3
"""
format_scripture.py
════════════════════
Convert the Book of the Nem plain-text file into two scripture-style PDFs:
nem_phone.pdf single-column, sized for e-readers (4.5" × 6.5")
nem_paper.pdf two-column, Book of Mormon style (5.5" × 8.5")
Requirements (Debian/Ubuntu):
sudo apt-get install texlive-latex-extra texlive-fonts-recommended
The key packages used are:
extsizes for 9 pt document class (paper format)
tgpagella TeX Gyre Pagella (Palatino-clone) font
multicol two-column layout without hard page breaks
microtype improved text justification and hyphenation
fancyhdr running headers and footers
needspace prevent orphaned headings
Usage:
python format_scripture.py
python format_scripture.py --input "Audio Master Nem Full.txt"
python format_scripture.py --kindle-only
python format_scripture.py --paper-only
python format_scripture.py --output-dir ./pdfs
python format_scripture.py --keep-tex # keep .tex files for debugging
"""
import argparse
import re
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional
# ── Default paths ──────────────────────────────────────────────────────────────
INPUT_FILE = Path("Audio Master Nem Full.txt")
OUTPUT_DIR = Path("output_pdf")
# ══════════════════════════════════════════════════════════════════════════════
# LaTeX helper
# ══════════════════════════════════════════════════════════════════════════════
_LATEX_TRANS = str.maketrans({
"\\": r"\textbackslash{}",
"&": r"\&",
"%": r"\%",
"$": r"\$",
"#": r"\#",
"_": r"\_",
"{": r"\{",
"}": r"\}",
"~": r"\textasciitilde{}",
"^": r"\textasciicircum{}",
"\u2014": "---", # em dash
"\u2013": "--", # en dash
"\u2018": "`", # left single quote
"\u2019": "'", # right single quote
"\u201c": "``", # left double quote
"\u201d": "''", # right double quote
"\u2026": r"\ldots{}", # ellipsis
"\u00e9": r"\'e",
"\u00e8": r"\`e",
"\u00ea": r"\^e",
"\u00e0": r"\`a",
"\u00e2": r"\^a",
"\u00f3": r"\'o",
"\u00ed": r"\'{\i}",
})
def esc(text: str) -> str:
"""Escape special LaTeX characters in a string."""
return text.translate(_LATEX_TRANS)
# ══════════════════════════════════════════════════════════════════════════════
# Document element types
# ══════════════════════════════════════════════════════════════════════════════
@dataclass
class TitlePage:
lines: list
@dataclass
class BookHeader:
"""One or more heading lines that introduce a new book/section."""
lines: list # list of str
@dataclass
class Chapter:
num: int
subtitle: Optional[str] = None
@dataclass
class SectionHeading:
"""Short heading within a chapter (e.g. MARRIAGE, BAPTISM)."""
text: str
@dataclass
class Verse:
num: int
text: str
@dataclass
class Paragraph:
text: str
# ══════════════════════════════════════════════════════════════════════════════
# Parser
# ══════════════════════════════════════════════════════════════════════════════
_RE_VERSE = re.compile(r"^\s*(\d+)\s+(.*)")
_RE_CHAPTER = re.compile(r"^\s*CHAPTER\s+(\d+)\s*$", re.IGNORECASE)
_RE_DIVIDER = re.compile(r"^_{4,}")
# Lines longer than this are treated as body paragraphs rather than headings
MAX_HEADING_LEN = 120
def _is_verse(line: str) -> bool:
"""Line starts with a verse number followed by text."""
m = _RE_VERSE.match(line)
return bool(m) and int(m.group(1)) > 0
def _is_chapter(line: str) -> bool:
return bool(_RE_CHAPTER.match(line.strip()))
def _is_divider(line: str) -> bool:
return bool(_RE_DIVIDER.match(line.strip()))
def _is_allcaps(line: str) -> bool:
s = line.strip()
return bool(s) and s == s.upper() and any(c.isalpha() for c in s)
def parse(text: str) -> list:
"""Parse the scripture text into a list of Element objects."""
lines = text.splitlines()
elements = []
n = len(lines)
i = 0
# ── Title page: short lines before the first divider ──────────────────────
# Short lines (≤80 chars) are the actual title. Long prose before the first
# divider is ignored so it does not duplicate the later labeled Introduction.
title_lines = []
while i < n and not _is_divider(lines[i]):
title_lines.append(lines[i])
i += 1
actual_title = []
for l in title_lines:
s = l.strip()
if not s:
continue
if len(s) <= 80:
actual_title.append(s)
if actual_title:
elements.append(TitlePage(lines=actual_title))
# ── Main pass ─────────────────────────────────────────────────────────────
after_divider = False
while i < n:
raw = lines[i]
line = raw.strip()
# ── Divider ───────────────────────────────────────────────────────────
if _is_divider(raw):
after_divider = True
i += 1
continue
# ── Blank line ────────────────────────────────────────────────────────
if not line:
i += 1
continue
# ── After a divider: collect section/book header ───────────────────
# Collect all short non-verse non-chapter lines immediately following
# the divider. Stop as soon as we hit a long prose line or body content.
if after_divider:
after_divider = False
header_lines = []
j = i
while j < n:
s = lines[j].strip()
if not s: # blank: keep scanning
j += 1
continue
if _is_verse(lines[j]) or _is_chapter(lines[j]):
break # reached verse/chapter body
if len(s) > MAX_HEADING_LEN:
break # long prose line: stop here
header_lines.append(s)
j += 1
if header_lines:
elements.append(BookHeader(lines=header_lines))
i = j
continue
# ── Chapter heading ────────────────────────────────────────────────
m = _RE_CHAPTER.match(line)
if m:
num = int(m.group(1))
# Look ahead for an optional subtitle (short non-verse line)
j = i + 1
subtitle = None
while j < n and not lines[j].strip():
j += 1
if j < n:
ns = lines[j].strip()
if (ns
and not _is_verse(lines[j])
and not _is_chapter(lines[j])
and not _is_divider(lines[j])
and len(ns) <= MAX_HEADING_LEN):
subtitle = ns
i = j + 1
else:
i += 1
else:
i += 1
elements.append(Chapter(num=num, subtitle=subtitle))
continue
# ── All-caps lines: either a BookHeader cluster or a SectionHeading ─
# If the cluster of consecutive all-caps lines is followed (after any
# blanks) by a CHAPTER heading, treat the whole cluster as a BookHeader.
# Otherwise treat only the first line as a SectionHeading.
if _is_allcaps(line) and len(line) <= MAX_HEADING_LEN and not _is_verse(raw):
# Gather consecutive all-caps lines (blanks skipped)
j = i
caps_block = []
while j < n:
s = lines[j].strip()
if not s:
j += 1
continue
if (_is_allcaps(s)
and len(s) <= MAX_HEADING_LEN
and not _is_verse(lines[j])
and not _is_chapter(lines[j])
and not _is_divider(lines[j])):
caps_block.append(s)
j += 1
else:
break
# Look past any blanks to see if a chapter heading follows
k = j
while k < n and not lines[k].strip():
k += 1
if k < n and _is_chapter(lines[k]):
# This cluster is a book/section header
elements.append(BookHeader(lines=caps_block))
i = j
else:
# Single inline section subheading (MARRIAGE, BAPTISM, etc.)
elements.append(SectionHeading(text=caps_block[0] if caps_block else line))
i = i + 1
continue
# ── Verse ─────────────────────────────────────────────────────────
if _is_verse(raw):
mfull = _RE_VERSE.match(raw)
elements.append(Verse(num=int(mfull.group(1)), text=mfull.group(2).strip()))
i += 1
continue
# ── Paragraph ─────────────────────────────────────────────────────
elements.append(Paragraph(text=line))
i += 1
return elements
# ══════════════════════════════════════════════════════════════════════════════
# LaTeX generation
# ══════════════════════════════════════════════════════════════════════════════
_PREAMBLE_SHARED = r"""
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{tgpagella}
\usepackage{microtype}
\usepackage{fancyhdr}
\usepackage{needspace}
\setlength{\headheight}{14pt}
\addtolength{\topmargin}{-2pt}
\usepackage[hidelinks]{hyperref}
"""
def _hrule() -> str:
return r"\noindent\rule{\linewidth}{0.3pt}"
# ── Kindle (single-column, e-reader sized) ────────────────────────────────────
def build_kindle_latex(elements: list) -> str:
"""Build a single-column LaTeX document sized for e-readers."""
out = []
# extarticle (from extsizes) gives us 11pt; plain article also supports it
out.append(r"\documentclass[11pt]{extarticle}")
out.append(r"""
\usepackage[paperwidth=4.5in,paperheight=6.5in,
top=0.08in,bottom=0.5in,
inner=0.42in,outer=0.38in,
headheight=12pt,headsep=6pt,
includehead]{geometry}""")
out.append(_PREAMBLE_SHARED)
out.append(r"""
\pagestyle{fancy}
\fancyhf{}
\fancyhead[C]{\small\itshape\nouppercase{\leftmark}}
\fancyfoot[C]{\small\thepage}
\renewcommand{\headrulewidth}{0.3pt}
\setlength{\parindent}{0pt}
\setlength{\parskip}{3pt plus 1pt minus 1pt}
\begin{document}
""")
# Handle title page separately so we can insert TOC after it
title_els = [e for e in elements if isinstance(e, TitlePage)]
body_els = [e for e in elements if not isinstance(e, TitlePage)]
if title_els:
out.append(r"\clearpage")
out.append(r"\thispagestyle{empty}")
out.append(r"\vspace*{1.3in}")
out.append(r"\begin{center}")
for j, tl in enumerate(title_els[0].lines):
s = tl.strip()
if not s:
continue
if j < 3:
out.append(r"{\LARGE\bfseries " + esc(s) + r"} \\[8pt]")
else:
out.append(r"{\large " + esc(s) + r"} \\[4pt]")
out.append(r"\end{center}")
out.append(r"\clearpage")
out.append(r"\renewcommand{\contentsname}{Table of Contents}")
out.append(r"\tableofcontents")
out.append(r"\clearpage")
_emit_elements(out, body_els, kindle=True)
out.append(r"\end{document}")
return "\n".join(out)
# ── Paper / BOM style (two-column) ────────────────────────────────────────────
def build_paper_latex(elements: list) -> str:
"""Build a two-column, Book of Mormon-style LaTeX document."""
out = []
# extarticle (from extsizes) for 9pt support
out.append(r"\documentclass[9pt,twoside]{extarticle}")
out.append(r"""
\usepackage[paperwidth=5.5in,paperheight=8.5in,
top=0.08in,bottom=0.55in,
inner=0.5in,outer=0.42in,
headheight=10pt,headsep=5pt,
includehead]{geometry}""")
out.append(_PREAMBLE_SHARED)
out.append(r"""
\usepackage{multicol}
\setlength{\columnsep}{0.22in}
\setlength{\columnseprule}{0.3pt}
\pagestyle{fancy}
\fancyhf{}
\fancyhead[LE]{\footnotesize\itshape\nouppercase{\leftmark}}
\fancyhead[RO]{\footnotesize\itshape\nouppercase{\rightmark}}
\fancyfoot[C]{\scriptsize\thepage}
\renewcommand{\headrulewidth}{0.3pt}
\setlength{\parindent}{0pt}
\setlength{\parskip}{1pt}
\begin{document}
""")
# Emit the title page outside multicols (single-column block)
title_els = [e for e in elements if isinstance(e, TitlePage)]
body_els = [e for e in elements if not isinstance(e, TitlePage)]
if title_els:
out.append(r"\begin{center}")
for j, tl in enumerate(title_els[0].lines):
s = tl.strip()
if not s:
continue
if j < 3:
out.append(r"{\large\bfseries " + esc(s) + r"} \\[3pt]")
else:
out.append(r"{\small " + esc(s) + r"} \\[1pt]")
out.append(r"\end{center}")
out.append(r"\medskip")
out.append(r"\renewcommand{\contentsname}{Table of Contents}")
out.append(r"\tableofcontents")
out.append(r"\clearpage")
# Skip any leading front-matter paragraphs before the first section header.
# For paper output, the intro should begin at the labeled "Introduction"
# section rather than repeating the pre-divider prose block.
first_section = next(
(i for i, el in enumerate(body_els) if isinstance(el, BookHeader)),
len(body_els),
)
paper_body_els = body_els[first_section:]
# Split intro (before first real book) from main body.
# A "real book" is a BookHeader that is followed by at least one Chapter
# before the next BookHeader. "Introduction" and similar preamble sections
# are BookHeaders too but have no chapters, so they stay in the intro.
first_book = len(paper_body_els)
for i, el in enumerate(paper_body_els):
if isinstance(el, BookHeader):
# Check if a Chapter follows before the next BookHeader
for j in range(i + 1, len(paper_body_els)):
if isinstance(paper_body_els[j], Chapter):
first_book = i
break
if isinstance(paper_body_els[j], BookHeader):
break
if first_book < len(paper_body_els):
break
intro_els = paper_body_els[:first_book]
main_els = paper_body_els[first_book:]
if intro_els:
_emit_elements(out, intro_els, kindle=True, compact_headers=True)
out.append(r"\clearpage")
out.append(r"\begin{multicols}{2}")
_emit_elements(out, main_els, kindle=False)
out.append(r"\end{multicols}")
out.append(r"\end{document}")
return "\n".join(out)
# ── Body emitter ──────────────────────────────────────────────────────────────
def _emit_elements(
out: list,
elements: list,
kindle: bool,
indent: bool = False,
compact_headers: bool = False,
) -> None:
"""Translate parsed Element objects into LaTeX markup."""
for el in elements:
# ── Title page (kindle only; paper handles it before multicols) ──────
if isinstance(el, TitlePage):
if kindle:
out.append(r"\clearpage")
out.append(r"\thispagestyle{empty}")
out.append(r"\vspace*{1.3in}")
out.append(r"\begin{center}")
for j, tl in enumerate(el.lines):
s = tl.strip()
if not s:
continue
if j < 3:
out.append(r"{\LARGE\bfseries " + esc(s) + r"} \\[8pt]")
else:
out.append(r"{\large " + esc(s) + r"} \\[4pt]")
out.append(r"\end{center}")
out.append(r"\clearpage")
# ── Book / section header ────────────────────────────────────────────
elif isinstance(el, BookHeader):
lines = el.lines
if kindle:
# Start a new page for each major book
out.append(r"\clearpage")
out.append(r"\phantomsection\addcontentsline{toc}{section}{" + esc(lines[0]) + r"}")
out.append(r"\vspace*{0pt}" if compact_headers else r"\vspace*{0.1in}")
out.append(r"\begin{center}")
out.append(_hrule())
out.append(r"\\[6pt]")
out.append(r"{\bfseries\large " + esc(lines[0]) + r"}")
for ln in lines[1:]:
out.append(r"\\ [3pt]{\normalsize\itshape " + esc(ln) + r"}")
out.append(r"\\[6pt]")
out.append(_hrule())
out.append(r"\end{center}")
out.append(r"\markboth{" + esc(lines[0]) + r"}{" + esc(lines[0]) + r"}")
out.append(r"\vspace{5pt}")
else:
# Inline heading within the two-column flow
# Refuse to start a new book in the bottom half of a column
out.append(r"\needspace{0.5\textheight}")
out.append(r"\phantomsection\addcontentsline{toc}{section}{" + esc(lines[0]) + r"}")
out.append(r"\begin{center}")
out.append(_hrule())
out.append(r"\\[2pt]")
out.append(r"{\bfseries " + esc(lines[0]) + r"}")
for ln in lines[1:]:
out.append(r"\\ {\small\itshape " + esc(ln) + r"}")
out.append(r"\\[2pt]")
out.append(_hrule())
out.append(r"\end{center}")
out.append(r"\markboth{" + esc(lines[0]) + r"}{" + esc(lines[0]) + r"}")
out.append(r"\vspace{2pt}")
# ── Chapter heading ──────────────────────────────────────────────────
elif isinstance(el, Chapter):
label = f"CHAPTER {el.num}"
if kindle:
out.append(r"\phantomsection\addcontentsline{toc}{subsection}{" + esc(label) + r"}")
out.append(r"\needspace{4\baselineskip}")
out.append(r"\vspace{14pt}")
out.append(r"\begin{center}")
out.append(r"{\bfseries\large " + esc(label) + r"}")
if el.subtitle:
out.append(r"\\ [3pt]{\normalsize\itshape " + esc(el.subtitle) + r"}")
out.append(r"\end{center}")
out.append(r"\markright{" + esc(label) + r"}")
out.append(r"\vspace{6pt}")
else:
out.append(r"\phantomsection\addcontentsline{toc}{subsection}{" + esc(label) + r"}")
out.append(r"\needspace{2\baselineskip}")
out.append(r"\vspace{3pt}")
out.append(r"\begin{center}")
out.append(r"{\bfseries " + esc(label) + r"}")
if el.subtitle:
out.append(r"\\ {\small\itshape " + esc(el.subtitle) + r"}")
out.append(r"\end{center}")
out.append(r"\markright{" + esc(label) + r"}")
out.append(r"\vspace{1pt}")
# ── Section subheading (MARRIAGE, BAPTISM, etc.) ────────────────────
elif isinstance(el, SectionHeading):
if kindle:
out.append(r"\vspace{8pt}")
out.append(r"\begin{center}{\bfseries " + esc(el.text) + r"}\end{center}")
out.append(r"\vspace{4pt}")
else:
out.append(r"\vspace{3pt}")
out.append(
r"\begin{center}{\bfseries\small " + esc(el.text) + r"}\end{center}"
)
out.append(r"\vspace{1pt}")
# ── Verse ────────────────────────────────────────────────────────────
elif isinstance(el, Verse):
body = esc(el.text)
if kindle:
# Bold inline number (not superscript) for readability on screen
vnum = r"\textbf{" + str(el.num) + r"}"
out.append(r"\noindent " + vnum + r"~" + body)
out.append(r"\par\smallskip")
else:
vnum = r"\textbf{" + str(el.num) + r"}"
out.append(r"\noindent " + vnum + r"~" + body + r"\par")
# ── Paragraph (prose intro, commentary, etc.) ───────────────────────
elif isinstance(el, Paragraph):
body = esc(el.text)
if kindle:
out.append(r"\noindent " + body)
out.append(r"\par\smallskip")
elif indent:
out.append(body + r"\par\medskip")
else:
out.append(r"\noindent " + body + r"\par")
# ══════════════════════════════════════════════════════════════════════════════
# Utility: book limiter
# ══════════════════════════════════════════════════════════════════════════════
def truncate_to_books(elements: list, max_books: int) -> list:
"""Return only the first *max_books* BookHeader sections (and their content).
Title-page and front-matter paragraphs before the first BookHeader are always kept.
"""
if max_books <= 0:
return elements
count = 0
result = []
for el in elements:
if isinstance(el, BookHeader):
count += 1
if count > max_books:
break
result.append(el)
return result
# ══════════════════════════════════════════════════════════════════════════════
# PDF compilation
# ══════════════════════════════════════════════════════════════════════════════
def _find_compiler() -> tuple:
"""Return (compiler_path, compiler_type) or (None, None) if none found."""
import shutil
# Also probe common absolute paths in case the dir isn't on $PATH
candidates = {
"pdflatex": ["/usr/bin/pdflatex", "/usr/local/bin/pdflatex"],
"tectonic": ["/usr/bin/tectonic", "/usr/local/bin/tectonic"],
}
for cmd, extra_paths in candidates.items():
found = shutil.which(cmd)
if found:
return found, cmd
for p in extra_paths:
if Path(p).exists():
return p, cmd
return None, None
def compile_pdf(tex_src: str, output_pdf: Path,
keep_tex: bool = False,
compiler_path: str = "/usr/bin/pdflatex",
compiler_type: str = "pdflatex") -> bool:
"""
Write *tex_src* into a temp directory, run the LaTeX compiler, and copy
the resulting PDF to *output_pdf*. Supports ``pdflatex`` and ``tectonic``.
Returns True on success.
"""
with tempfile.TemporaryDirectory() as tmp:
tmp_path = Path(tmp)
tex_file = tmp_path / "document.tex"
tex_file.write_text(tex_src, encoding="utf-8")
if compiler_type == "tectonic":
# Tectonic compiles in one pass and downloads missing packages.
passes = 1
cmd_base = [compiler_path, "document.tex"]
else:
# pdflatex needs two passes to get page headers right.
passes = 2
cmd_base = [compiler_path, "-interaction=nonstopmode",
"-halt-on-error", "document.tex"]
for pass_num in range(1, passes + 1):
result = subprocess.run(
cmd_base, cwd=tmp, capture_output=True, text=True,
)
if result.returncode != 0:
print(f" [compiler error on pass {pass_num}]", file=sys.stderr)
print(result.stdout[-3000:], file=sys.stderr)
if result.stderr:
print(result.stderr[-1000:], file=sys.stderr)
if keep_tex:
dest = output_pdf.with_suffix(".tex")
dest.write_text(tex_src, encoding="utf-8")
print(f" TeX source saved to: {dest}", file=sys.stderr)
return False
pdf_out = tmp_path / "document.pdf"
if pdf_out.exists():
output_pdf.parent.mkdir(parents=True, exist_ok=True)
output_pdf.write_bytes(pdf_out.read_bytes())
if keep_tex:
dest = output_pdf.with_suffix(".tex")
dest.write_text(tex_src, encoding="utf-8")
return True
print(" [compiler ran but document.pdf was not produced]", file=sys.stderr)
return False
# ══════════════════════════════════════════════════════════════════════════════
# Main
# ══════════════════════════════════════════════════════════════════════════════
_INSTALL_INSTRUCTIONS = """
No LaTeX compiler found. Install one of the following:
Arch / CachyOS / Manjaro:
sudo pacman -S texlive-basic texlive-latex texlive-latexrecommended \\
texlive-latexextra texlive-fontsrecommended
Debian / Ubuntu:
sudo apt-get install texlive-latex-extra texlive-fonts-recommended
--- OR --- (self-contained, downloads packages on first use)
sudo pacman -S tectonic
# or: cargo install tectonic
"""
def main():
parser = argparse.ArgumentParser(
description="Generate scripture-style PDFs from the Book of the Nem text.",
formatter_class=argparse.RawDescriptionHelpFormatter,
epilog=__doc__,
)
parser.add_argument(
"--input", type=Path, default=INPUT_FILE,
help=f"Input plain-text file (default: {INPUT_FILE})",
)
parser.add_argument(
"--output-dir", type=Path, default=OUTPUT_DIR,
help=f"Output directory (default: {OUTPUT_DIR})",
)
parser.add_argument(
"--kindle-only", action="store_true",
help="Generate only the Kindle (single-column) PDF.",
)
parser.add_argument(
"--paper-only", action="store_true",
help="Generate only the paper (two-column) PDF.",
)
parser.add_argument(
"--keep-tex", action="store_true",
help="Save the intermediate .tex files alongside each PDF.",
)
parser.add_argument(
"--max-books", type=int, default=0, metavar="N",
help="Limit output to the first N book sections (0 = no limit).",
)
parser.add_argument(
"--tex-only", action="store_true",
help="Write .tex files only — do not attempt PDF compilation. "
"Useful when a LaTeX compiler is not available.",
)
args = parser.parse_args()
src_path: Path = args.input
if not src_path.exists():
sys.exit(f"ERROR: Input file not found: {src_path}")
print(f"Reading: {src_path}")
text = src_path.read_text(encoding="utf-8", errors="replace")
elements = parse(text)
if args.max_books > 0:
elements = truncate_to_books(elements, args.max_books)
print(f" Limiting to first {args.max_books} book(s).")
books = sum(1 for e in elements if isinstance(e, BookHeader))
chapters = sum(1 for e in elements if isinstance(e, Chapter))
verses = sum(1 for e in elements if isinstance(e, Verse))
print(f" Parsed: {books} books/sections, {chapters} chapters, {verses} verses")
out_dir: Path = args.output_dir
out_dir.mkdir(parents=True, exist_ok=True)
# Locate compiler (unless --tex-only)
compiler_path, compiler_type = None, None
if not args.tex_only:
compiler_path, compiler_type = _find_compiler()
if not compiler_path:
print(_INSTALL_INSTRUCTIONS, file=sys.stderr)
print("Falling back to --tex-only mode: .tex files will be written "
"but not compiled.", file=sys.stderr)
args.tex_only = True
else:
print(f" Using compiler: {compiler_path}")
def _write_or_compile(tex: str, pdf_path: Path, label: str):
if args.tex_only or args.keep_tex:
tex_path = pdf_path.with_suffix(".tex")
tex_path.write_text(tex, encoding="utf-8")
print(f" ✓ TeX saved: {tex_path}")
if args.tex_only:
return
print(f" Compiling {label} PDF …")
ok = compile_pdf(tex, pdf_path, keep_tex=args.keep_tex,
compiler_path=compiler_path,
compiler_type=compiler_type)
if ok:
print(f"{pdf_path}")
else:
print(f"{label} PDF failed — see errors above.")
# ── Kindle PDF ────────────────────────────────────────────────────────────
if not args.paper_only:
print(f"\nKindle PDF (single-column, 4.5\"×6.5\") …")
tex = build_kindle_latex(elements)
_write_or_compile(tex, out_dir / "nem_phone.pdf", "Kindle")
# ── Paper / BOM-style PDF ────────────────────────────────────────────────
if not args.kindle_only:
print(f"\nPaper PDF (two-column BOM style, 5.5\"×8.5\") …")
tex = build_paper_latex(elements)
_write_or_compile(tex, out_dir / "nem_paper.pdf", "Paper")
if __name__ == "__main__":
main()

File diff suppressed because it is too large

View File

@ -0,0 +1,778 @@
{
"Aaagast": "aaagast.wav",
"Abby": "abby.wav",
"Abigail": "abigail.wav",
"Abodey": "abodey.wav",
"Abriyyah": "abriyyah.wav",
"Abyss": "abyss.wav",
"Adamantine": "adamantine.wav",
"Addobes": "addobes.wav",
"Adobbes": "adobbes.wav",
"Aedrick": "aedrick.wav",
"Aegis": "aegis.wav",
"Aegrir": "aegrir.wav",
"Afire": "afire.wav",
"Agatha": "agatha.wav",
"Agony": "agony.wav",
"Agrarian": "agrarian.wav",
"Aheer": "aheer.wav",
"Ahman": "ahman.wav",
"Ailondel": "ailondel.wav",
"Airk": "airk.wav",
"Al-Astan": "al_astan.wav",
"Alchemist": "alchemist.wav",
"Alvrin": "alvrin.wav",
"Amarantha": "amarantha.wav",
"Amaryllis": "amaryllis.wav",
"Ananduil": "ananduil.wav",
"Anaudriel": "anaudriel.wav",
"Andrahel": "andrahel.wav",
"Anhuil": "anhuil.wav",
"Anhuil-Ehlar": "anhuil_ehlar.wav",
"Anhuil-Elhar": "anhuil_elhar.wav",
"Anjeer": "anjeer.wav",
"Ankh": "ankh.wav",
"Annalise": "annalise.wav",
"Anointing": "anointing.wav",
"Anoush": "anoush.wav",
"Anuil": "anuil.wav",
"Anvilhammer": "anvilhammer.wav",
"Ara": "ara.wav",
"Aragast": "aragast.wav",
"Aragst": "aragst.wav",
"Aralon": "aralon.wav",
"Aran": "aran.wav",
"Arans": "arans.wav",
"Arashan": "arashan.wav",
"Arbiter": "arbiter.wav",
"Archmage": "archmage.wav",
"Archwizard": "archwizard.wav",
"Ardrick": "ardrick.wav",
"Argast": "argast.wav",
"Armbrook": "armbrook.wav",
"Armory": "armory.wav",
"Arn": "arn.wav",
"Arn-Del": "arn_del.wav",
"Asheer": "asheer.wav",
"Aske": "aske.wav",
"Aster": "aster.wav",
"Astor": "astor.wav",
"Astral": "astral.wav",
"Astride": "astride.wav",
"Astute": "astute.wav",
"Avery": "avery.wav",
"Avorein": "avorein.wav",
"Await": "await.wav",
"Awww": "awww.wav",
"Axehammer": "axehammer.wav",
"Ayana": "ayana.wav",
"Ayron": "ayron.wav",
"Azuremoon": "azuremoon.wav",
"Badlands": "badlands.wav",
"Baelen": "baelen.wav",
"Bah": "bah.wav",
"Ballista": "ballista.wav",
"Bancroft": "bancroft.wav",
"Baras": "baras.wav",
"Barek": "barek.wav",
"Barge": "barge.wav",
"Barrik": "barrik.wav",
"Battlelord": "battlelord.wav",
"Bazaar": "bazaar.wav",
"Bearas": "bearas.wav",
"Bearasagain": "bearasagain.wav",
"Bearasand": "bearasand.wav",
"Bearasasked": "bearasasked.wav",
"Bearasat": "bearasat.wav",
"Bearasbegan": "bearasbegan.wav",
"Bearasbowed": "bearasbowed.wav",
"Bearascan": "bearascan.wav",
"Bearasdown": "bearasdown.wav",
"Bearasemerged": "bearasemerged.wav",
"Bearasfelt": "bearasfelt.wav",
"Bearasfor": "bearasfor.wav",
"Bearashad": "bearashad.wav",
"Bearashas": "bearashas.wav",
"Bearasheld": "bearasheld.wav",
"Bearashesitantly": "bearashesitantly.wav",
"Bearasin": "bearasin.wav",
"Bearasleading": "bearasleading.wav",
"Bearasmust": "bearasmust.wav",
"Bearasnodded": "bearasnodded.wav",
"Bearasperplexed": "bearasperplexed.wav",
"Bearasquickly": "bearasquickly.wav",
"Bearasreleased": "bearasreleased.wav",
"Bearassaid": "bearassaid.wav",
"Bearassat": "bearassat.wav",
"Bearassimply": "bearassimply.wav",
"Bearasslowly": "bearasslowly.wav",
"Bearassome": "bearassome.wav",
"Bearasspeaks": "bearasspeaks.wav",
"Bearassteeled": "bearassteeled.wav",
"Bearasstood": "bearasstood.wav",
"Bearasthat": "bearasthat.wav",
"Bearasthen": "bearasthen.wav",
"Bearasto": "bearasto.wav",
"Bearastrailed": "bearastrailed.wav",
"Bearaswandered": "bearaswandered.wav",
"Bearaswho": "bearaswho.wav",
"Bearaswith": "bearaswith.wav",
"Beldvorth": "beldvorth.wav",
"Belegast": "belegast.wav",
"Berstag": "berstag.wav",
"Beydell": "beydell.wav",
"Blackfeather": "blackfeather.wav",
"Blackroot": "blackroot.wav",
"Blargh": "blargh.wav",
"Bledvorth": "bledvorth.wav",
"Blessings": "blessings.wav",
"Bloodstone": "bloodstone.wav",
"Bloodtone": "bloodtone.wav",
"Bogard": "bogard.wav",
"Boldar": "boldar.wav",
"Bolton": "bolton.wav",
"Bon": "bon.wav",
"Boomer": "boomer.wav",
"Bouldershaun": "bouldershaun.wav",
"Boulevarde": "boulevarde.wav",
"Brahma": "brahma.wav",
"Bramble": "bramble.wav",
"Brambleburr": "brambleburr.wav",
"Brambleburrs": "brambleburrs.wav",
"Branson": "branson.wav",
"Bravado": "bravado.wav",
"Brax": "brax.wav",
"Braz": "braz.wav",
"Brazen": "brazen.wav",
"Brazenclaw": "brazenclaw.wav",
"Brazenclaws": "brazenclaws.wav",
"Breeches": "breeches.wav",
"Brendan": "brendan.wav",
"Brethren": "brethren.wav",
"Brickhorn": "brickhorn.wav",
"Caldwell": "caldwell.wav",
"Calico": "calico.wav",
"Caller": "caller.wav",
"Camels": "camels.wav",
"Canals": "canals.wav",
"Captains": "captains.wav",
"Caravan": "caravan.wav",
"Caswold": "caswold.wav",
"Causeway": "causeway.wav",
"Cavalier": "cavalier.wav",
"Cavern": "cavern.wav",
"Cherrytree": "cherrytree.wav",
"Chieftain": "chieftain.wav",
"Chivalrous": "chivalrous.wav",
"Chun": "chun.wav",
"Citadel": "citadel.wav",
"Clarn": "clarn.wav",
"Claw": "claw.wav",
"Cleric": "cleric.wav",
"Cobblestone": "cobblestone.wav",
"Contessa": "contessa.wav",
"Corporal": "corporal.wav",
"Cotswold": "cotswold.wav",
"Councillor": "councillor.wav",
"Councilman": "councilman.wav",
"Councilmen": "councilmen.wav",
"Councilor": "councilor.wav",
"Crimson": "crimson.wav",
"Crismon": "crismon.wav",
"Cylan": "cylan.wav",
"Dai": "dai.wav",
"Dalthanis": "dalthanis.wav",
"Dank": "dank.wav",
"Dayr": "dayr.wav",
"Dedric": "dedric.wav",
"Delgra": "delgra.wav",
"Delic": "delic.wav",
"Denizen": "denizen.wav",
"Denizens": "denizens.wav",
"Deric": "deric.wav",
"Derrbane": "derrbane.wav",
"Derro": "derro.wav",
"Derrobane": "derrobane.wav",
"Dibble": "dibble.wav",
"Diblon": "diblon.wav",
"Dire": "dire.wav",
"Dis": "dis.wav",
"Dobson": "dobson.wav",
"Dorian": "dorian.wav",
"Dorza": "dorza.wav",
"Dragonbane": "dragonbane.wav",
"Dragonsbane": "dragonsbane.wav",
"Drakor": "drakor.wav",
"Draygon": "draygon.wav",
"Drefan": "drefan.wav",
"Ducan": "ducan.wav",
"Duggan": "duggan.wav",
"Dulak": "dulak.wav",
"Dunca": "dunca.wav",
"Dune": "dune.wav",
"Dur": "dur.wav",
"Dur-Hakan": "dur_hakan.wav",
"Durgane": "durgane.wav",
"Durthaim": "durthaim.wav",
"Durthrim": "durthrim.wav",
"Dwarf": "dwarf.wav",
"Dwarven": "dwarven.wav",
"Earlson": "earlson.wav",
"Eastward": "eastward.wav",
"Effigius": "effigius.wav",
"Ehlar": "ehlar.wav",
"El-Ran": "el_ran.wav",
"El-Shen": "el_shen.wav",
"Elan": "elan.wav",
"Elessel": "elessel.wav",
"Elf": "elf.wav",
"Elhar": "elhar.wav",
"Elishan": "elishan.wav",
"Eliza": "eliza.wav",
"Elliswan": "elliswan.wav",
"Elliwsan": "elliwsan.wav",
"Elodea": "elodea.wav",
"Elshan": "elshan.wav",
"Elven": "elven.wav",
"Elvenkind": "elvenkind.wav",
"Elves": "elves.wav",
"Elvrathas": "elvrathas.wav",
"Elysium": "elysium.wav",
"Emaleen": "emaleen.wav",
"Eminence": "eminence.wav",
"Emissary": "emissary.wav",
"Emporium": "emporium.wav",
"Enaru": "enaru.wav",
"Endaleth": "endaleth.wav",
"Envoy": "envoy.wav",
"Eppres": "eppres.wav",
"Eradication": "eradication.wav",
"Eru": "eru.wav",
"Eshela": "eshela.wav",
"Ethereal": "ethereal.wav",
"Eushon": "eushon.wav",
"Eushownava": "eushownava.wav",
"Everdark": "everdark.wav",
"Everytime": "everytime.wav",
"Eylana": "eylana.wav",
"Eylanan": "eylanan.wav",
"Ezrin": "ezrin.wav",
"F-Fine": "f_fine.wav",
"F-Forgive": "f_forgive.wav",
"Faerie": "faerie.wav",
"Fairik": "fairik.wav",
"Fargus": "fargus.wav",
"Fark": "fark.wav",
"Farraj": "farraj.wav",
"Farush": "farush.wav",
"Feasthall": "feasthall.wav",
"Featherstone": "featherstone.wav",
"Felaria": "felaria.wav",
"Feliq": "feliq.wav",
"Felnck": "felnck.wav",
"Felnick": "felnick.wav",
"Felnicks": "felnicks.wav",
"Felnik": "felnik.wav",
"Fenaya": "fenaya.wav",
"Feneya": "feneya.wav",
"Ferrus": "ferrus.wav",
"Fey": "fey.wav",
"Firebane": "firebane.wav",
"Fireshard": "fireshard.wav",
"Foomwairma": "foomwairma.wav",
"Forger": "forger.wav",
"Frandor": "frandor.wav",
"Friarsdai": "friarsdai.wav",
"Fumairma": "fumairma.wav",
"Fumwairma": "fumwairma.wav",
"Galantholas": "galantholas.wav",
"Galathorn": "galathorn.wav",
"Galen": "galen.wav",
"Galonti": "galonti.wav",
"Garb": "garb.wav",
"Gareth": "gareth.wav",
"Garvek": "garvek.wav",
"Gaunt": "gaunt.wav",
"Gavin": "gavin.wav",
"Geez": "geez.wav",
"Ghurauk": "ghurauk.wav",
"Gilandras": "gilandras.wav",
"Gilard": "gilard.wav",
"Gilchis": "gilchis.wav",
"Gilchris": "gilchris.wav",
"Gilding": "gilding.wav",
"Gilrick": "gilrick.wav",
"Glades": "glades.wav",
"Glanthalas": "glanthalas.wav",
"Glantholas": "glantholas.wav",
"Glimmerwyn": "glimmerwyn.wav",
"Gloomstone": "gloomstone.wav",
"Gnaum": "gnaum.wav",
"Gnomish": "gnomish.wav",
"Goblinkin": "goblinkin.wav",
"Goldsheen": "goldsheen.wav",
"Gorath": "gorath.wav",
"Gore": "gore.wav",
"Gorg": "gorg.wav",
"Gorlyn": "gorlyn.wav",
"Gorstad": "gorstad.wav",
"Gotto": "gotto.wav",
"Graces": "graces.wav",
"Graffel": "graffel.wav",
"Grandmaster": "grandmaster.wav",
"Granitestone": "granitestone.wav",
"Gratzel": "gratzel.wav",
"Graystrom": "graystrom.wav",
"Greathaven": "greathaven.wav",
"Gregarious": "gregarious.wav",
"Gregor": "gregor.wav",
"Griffon": "griffon.wav",
"Grimbold": "grimbold.wav",
"Gripp": "gripp.wav",
"Grizzled": "grizzled.wav",
"Grog": "grog.wav",
"Grogg": "grogg.wav",
"Grotto": "grotto.wav",
"Gruff": "gruff.wav",
"Gruul": "gruul.wav",
"Guardarm": "guardarm.wav",
"Gustafson": "gustafson.wav",
"Guza": "guza.wav",
"Gylis": "gylis.wav",
"Habani": "habani.wav",
"Hagatha": "hagatha.wav",
"Hakan": "hakan.wav",
"Hallowed": "hallowed.wav",
"Halthessala": "halthessala.wav",
"Hammerhaft": "hammerhaft.wav",
"Har": "har.wav",
"Harbrim": "harbrim.wav",
"Harbrin": "harbrin.wav",
"Hardrock": "hardrock.wav",
"Harrik": "harrik.wav",
"Hauberk": "hauberk.wav",
"Hazards": "hazards.wav",
"Headmaster": "headmaster.wav",
"Heed": "heed.wav",
"Hells": "hells.wav",
"Henceforth": "henceforth.wav",
"Hendel": "hendel.wav",
"Heshbani": "heshbani.wav",
"Hesta": "hesta.wav",
"Hestra": "hestra.wav",
"Heykingygladtomeetyouireallylikeithereitremindsmeofmyhome": "heykingygladtomeetyouireallylikeithereitremindsmeofmyhome.wav",
"Highlands": "highlands.wav",
"Highlord": "highlord.wav",
"Hillsfar": "hillsfar.wav",
"Hmmm": "hmmm.wav",
"Homecoming": "homecoming.wav",
"Horblaster": "horblaster.wav",
"Horde": "horde.wav",
"Horgard": "horgard.wav",
"Hornblade": "hornblade.wav",
"Hornblaster": "hornblaster.wav",
"Horned": "horned.wav",
"Hrumph": "hrumph.wav",
"Huen": "huen.wav",
"Hylan": "hylan.wav",
"Illuminant": "illuminant.wav",
"Illuminated": "illuminated.wav",
"Illumination": "illumination.wav",
"Ilrodel": "ilrodel.wav",
"Imp": "imp.wav",
"Inquisitor": "inquisitor.wav",
"Ironblade": "ironblade.wav",
"Ironbound": "ironbound.wav",
"Ironguard": "ironguard.wav",
"Ironhold": "ironhold.wav",
"Ironspear": "ironspear.wav",
"Irontree": "irontree.wav",
"Iston": "iston.wav",
"Jabari": "jabari.wav",
"Jabbed": "jabbed.wav",
"Jacob": "jacob.wav",
"Jad": "jad.wav",
"Janson": "janson.wav",
"Jasyen": "jasyen.wav",
"Jayden": "jayden.wav",
"Jaylan": "jaylan.wav",
"Jaysen": "jaysen.wav",
"Jewel": "jewel.wav",
"Jors": "jors.wav",
"Jovially": "jovially.wav",
"Kaash": "kaash.wav",
"Kah": "kah.wav",
"Kalzaduum": "kalzaduum.wav",
"Karnak": "karnak.wav",
"Kaspar": "kaspar.wav",
"Kassie": "kassie.wav",
"Keldris": "keldris.wav",
"Kelshard": "kelshard.wav",
"Kelvesh": "kelvesh.wav",
"Kelvin": "kelvin.wav",
"Kelwane": "kelwane.wav",
"Kev": "kev.wav",
"Khaki": "khaki.wav",
"Kihee": "kihee.wav",
"Kihee-Uust": "kihee_uust.wav",
"Kiiri": "kiiri.wav",
"Kin": "kin.wav",
"Kirri": "kirri.wav",
"Kisleth": "kisleth.wav",
"Knelt": "knelt.wav",
"Knight-Corporal": "knight_corporal.wav",
"Knight-Lieutenant": "knight_lieutenant.wav",
"Knight-Major": "knight_major.wav",
"Knight-Sergeant": "knight_sergeant.wav",
"Knighthand": "knighthand.wav",
"Knighthood": "knighthood.wav",
"Knowin": "knowin.wav",
"Kodan": "kodan.wav",
"Kor": "kor.wav",
"Kor-Roth": "kor_roth.wav",
"Kordan": "kordan.wav",
"Koreth": "koreth.wav",
"Korin": "korin.wav",
"Kraelheimgar": "kraelheimgar.wav",
"Kraven": "kraven.wav",
"Kris": "kris.wav",
"Krisleth": "krisleth.wav",
"Kronlin": "kronlin.wav",
"Kudah": "kudah.wav",
"Kuerana": "kuerana.wav",
"Kunah": "kunah.wav",
"Kwenal": "kwenal.wav",
"Kyfurn": "kyfurn.wav",
"Kylic": "kylic.wav",
"Ladell": "ladell.wav",
"Laird": "laird.wav",
"Leng": "leng.wav",
"Lesik": "lesik.wav",
"Lightbinger": "lightbinger.wav",
"Lightbrigner": "lightbrigner.wav",
"Lightbringer": "lightbringer.wav",
"Lightbringers": "lightbringers.wav",
"Lightrbinger": "lightrbinger.wav",
"Liu": "liu.wav",
"Lon": "lon.wav",
"Lon-Ell": "lon_ell.wav",
"Longsword": "longsword.wav",
"Lordship": "lordship.wav",
"Lumisha": "lumisha.wav",
"Lyceum": "lyceum.wav",
"Macabress": "macabress.wav",
"Madam": "madam.wav",
"Magician": "magician.wav",
"Magister": "magister.wav",
"Magistry": "magistry.wav",
"Magorian": "magorian.wav",
"Majesties": "majesties.wav",
"Maldrood": "maldrood.wav",
"Malrood": "malrood.wav",
"Manchu": "manchu.wav",
"Marches": "marches.wav",
"Marlee": "marlee.wav",
"Masta": "masta.wav",
"Matriarch": "matriarch.wav",
"Matriarchs": "matriarchs.wav",
"Meknathar": "meknathar.wav",
"Menthal": "menthal.wav",
"Ming": "ming.wav",
"Minotaur": "minotaur.wav",
"Minotaurs": "minotaurs.wav",
"Mister": "mister.wav",
"Misty": "misty.wav",
"Mithral": "mithral.wav",
"Mithrin": "mithrin.wav",
"Mitral": "mitral.wav",
"Mmmm": "mmmm.wav",
"Moans": "moans.wav",
"Molgol": "molgol.wav",
"Monarchy": "monarchy.wav",
"Morther": "morther.wav",
"Motioning": "motioning.wav",
"Mustaches": "mustaches.wav",
"Mutters": "mutters.wav",
"Mylee": "mylee.wav",
"Nahzim": "nahzim.wav",
"Nefaleem": "nefaleem.wav",
"Nestor": "nestor.wav",
"Nesven": "nesven.wav",
"Neverthoughtidseeyouprancingaroundwithabunchofelfgirls": "neverthoughtidseeyouprancingaroundwithabunchofelfgirls.wav",
"Nijel": "nijel.wav",
"Nik": "nik.wav",
"Nimbly": "nimbly.wav",
"Nimgalad": "nimgalad.wav",
"Nirvana": "nirvana.wav",
"Noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest": "noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest.wav",
"Nollon": "nollon.wav",
"Nomadic": "nomadic.wav",
"Nook": "nook.wav",
"Nurn": "nurn.wav",
"Nym": "nym.wav",
"Oakheart": "oakheart.wav",
"Oakleaf": "oakleaf.wav",
"Odie": "odie.wav",
"Odo": "odo.wav",
"Ododrian": "ododrian.wav",
"Odoiran": "odoiran.wav",
"Odorain": "odorain.wav",
"Odoriain": "odoriain.wav",
"Odorian": "odorian.wav",
"Odorians": "odorians.wav",
"Ody": "ody.wav",
"Off-Worlder": "off_worlder.wav",
"Ogrin": "ogrin.wav",
"Olde": "olde.wav",
"Onas": "onas.wav",
"Ooo": "ooo.wav",
"Oorian": "oorian.wav",
"Oranoc": "oranoc.wav",
"Orbs": "orbs.wav",
"Orehand": "orehand.wav",
"Orgrin": "orgrin.wav",
"Orin": "orin.wav",
"Orkosh": "orkosh.wav",
"Oroset": "oroset.wav",
"Orson": "orson.wav",
"Oslagil": "oslagil.wav",
"Overlord": "overlord.wav",
"Paladin": "paladin.wav",
"Paladin-King": "paladin_king.wav",
"Patriarch": "patriarch.wav",
"Patriarchs": "patriarchs.wav",
"Penance": "penance.wav",
"Penelope": "penelope.wav",
"Periwinkle": "periwinkle.wav",
"Pilgrim": "pilgrim.wav",
"Pinnacle": "pinnacle.wav",
"Pricilla": "pricilla.wav",
"Priestess": "priestess.wav",
"Primer": "primer.wav",
"Priscilla": "priscilla.wav",
"Prologue": "prologue.wav",
"Prudent": "prudent.wav",
"Quartzhand": "quartzhand.wav",
"Racah": "racah.wav",
"Rachelle": "rachelle.wav",
"Radiant": "radiant.wav",
"Rah'Zi": "rah_zi.wav",
"Rasheer": "rasheer.wav",
"Raslan": "raslan.wav",
"Ravenburg": "ravenburg.wav",
"Ravenhill": "ravenhill.wav",
"Ravensburg": "ravensburg.wav",
"Razentia": "razentia.wav",
"Realms": "realms.wav",
"Redhorn": "redhorn.wav",
"Reflexively": "reflexively.wav",
"Reinys": "reinys.wav",
"Retort": "retort.wav",
"Roc": "roc.wav",
"Rockport": "rockport.wav",
"Rolands": "rolands.wav",
"Rolden": "rolden.wav",
"Rooks": "rooks.wav",
"Roth": "roth.wav",
"Rothsholm": "rothsholm.wav",
"Rouge": "rouge.wav",
"Rustigar": "rustigar.wav",
"Sarnel": "sarnel.wav",
"Satyrsdai": "satyrsdai.wav",
"Scaly": "scaly.wav",
"Scepter": "scepter.wav",
"Seagull": "seagull.wav",
"Sedition": "sedition.wav",
"Seeker": "seeker.wav",
"Sehlaba": "sehlaba.wav",
"Seker": "seker.wav",
"Seker-Ankh": "seker_ankh.wav",
"Selna": "selna.wav",
"Senica": "senica.wav",
"Sentinel": "sentinel.wav",
"Septuigen": "septuigen.wav",
"Sergeant-Major": "sergeant_major.wav",
"Serk": "serk.wav",
"Sgt": "sgt.wav",
"Shadeem": "shadeem.wav",
"Shae": "shae.wav",
"Shal": "shal.wav",
"Shalahz": "shalahz.wav",
"Shalaz": "shalaz.wav",
"Shalazah": "shalazah.wav",
"Shambhu": "shambhu.wav",
"Shambu": "shambu.wav",
"Shanay": "shanay.wav",
"Shatterdawn": "shatterdawn.wav",
"Shdeem": "shdeem.wav",
"Shelna": "shelna.wav",
"Shen": "shen.wav",
"Shrouded": "shrouded.wav",
"Shyrra": "shyrra.wav",
"Sigil": "sigil.wav",
"Silverbane": "silverbane.wav",
"Silvernote": "silvernote.wav",
"Silvervein": "silvervein.wav",
"Silverwind": "silverwind.wav",
"Sirjif": "sirjif.wav",
"Sis": "sis.wav",
"Skeptically": "skeptically.wav",
"Slagg": "slagg.wav",
"Slaver": "slaver.wav",
"Slavers": "slavers.wav",
"Slick": "slick.wav",
"Solstice": "solstice.wav",
"Soren": "soren.wav",
"Sorrow": "sorrow.wav",
"Sosa": "sosa.wav",
"Soulseeker": "soulseeker.wav",
"Soulsinger": "soulsinger.wav",
"Sparks": "sparks.wav",
"Spellbooks": "spellbooks.wav",
"Spikehorn": "spikehorn.wav",
"Stairwell": "stairwell.wav",
"Stalker": "stalker.wav",
"Stealthy": "stealthy.wav",
"Steelaxe": "steelaxe.wav",
"Steelclaw": "steelclaw.wav",
"Steelhorn": "steelhorn.wav",
"Steward": "steward.wav",
"Stiletto": "stiletto.wav",
"Stonefirger": "stonefirger.wav",
"Stoneforger": "stoneforger.wav",
"Stonehelm": "stonehelm.wav",
"Stonehold": "stonehold.wav",
"Stoner": "stoner.wav",
"Sunder": "sunder.wav",
"Surly": "surly.wav",
"Swung": "swung.wav",
"Symphonic": "symphonic.wav",
"Ta-Lar": "ta_lar.wav",
"Taeriel": "taeriel.wav",
"Tailor": "tailor.wav",
"Talaer": "talaer.wav",
"Tallspear": "tallspear.wav",
"Targoth": "targoth.wav",
"Tarnen": "tarnen.wav",
"Tathan": "tathan.wav",
"Tavern": "tavern.wav",
"Tellin": "tellin.wav",
"Thane": "thane.wav",
"Thanes": "thanes.wav",
"Theocratic": "theocratic.wav",
"Therak": "therak.wav",
"Therondil": "therondil.wav",
"Thorn": "thorn.wav",
"Thranis": "thranis.wav",
"Throgg": "throgg.wav",
"Thunderstrike": "thunderstrike.wav",
"Tien": "tien.wav",
"Tillborne": "tillborne.wav",
"Tinbreaker": "tinbreaker.wav",
"Tome": "tome.wav",
"Torak": "torak.wav",
"Toren": "toren.wav",
"Torgath": "torgath.wav",
"Torgoth": "torgoth.wav",
"Traitor": "traitor.wav",
"Triesse": "triesse.wav",
"Tumark": "tumark.wav",
"Tumbler": "tumbler.wav",
"Turcan": "turcan.wav",
"Turog": "turog.wav",
"Twinsdai": "twinsdai.wav",
"Twyleen": "twyleen.wav",
"Tyrant": "tyrant.wav",
"Udda": "udda.wav",
"Uhrn": "uhrn.wav",
"Ulagra": "ulagra.wav",
"Ulrik": "ulrik.wav",
"Umbrin": "umbrin.wav",
"Umfray": "umfray.wav",
"Undwin": "undwin.wav",
"Unison": "unison.wav",
"Urhn": "urhn.wav",
"Uryna": "uryna.wav",
"Uust": "uust.wav",
"Vagrant": "vagrant.wav",
"Valdarin": "valdarin.wav",
"Valeth": "valeth.wav",
"Valindar": "valindar.wav",
"Valinor": "valinor.wav",
"Valis": "valis.wav",
"Vanessa": "vanessa.wav",
"Varann": "varann.wav",
"Varsis": "varsis.wav",
"Varu": "varu.wav",
"Vedra": "vedra.wav",
"Velicia": "velicia.wav",
"Velvet": "velvet.wav",
"Vendar": "vendar.wav",
"Venessa": "venessa.wav",
"Vengeance": "vengeance.wav",
"Vermin": "vermin.wav",
"Verness": "verness.wav",
"Verr": "verr.wav",
"Verr-": "verr.wav",
"Verr-Asses": "verr_asses.wav",
"Veya": "veya.wav",
"Viscount": "viscount.wav",
"Vizier": "vizier.wav",
"Vlainor": "vlainor.wav",
"Volan": "volan.wav",
"Volstan": "volstan.wav",
"Vorann": "vorann.wav",
"Vorgak": "vorgak.wav",
"Vorum": "vorum.wav",
"Vuhnalya": "vuhnalya.wav",
"Vyn": "vyn.wav",
"Wallbreaker": "wallbreaker.wav",
"Wanton": "wanton.wav",
"Warfrost": "warfrost.wav",
"Wargog": "wargog.wav",
"Warstar": "warstar.wav",
"Warthog": "warthog.wav",
"Weaving": "weaving.wav",
"Weee": "weee.wav",
"Wettstein": "wettstein.wav",
"Wh": "wh.wav",
"Wha": "wha.wav",
"Whatchya": "whatchya.wav",
"Wheni": "wheni.wav",
"Whitehand": "whitehand.wav",
"Whoah": "whoah.wav",
"Williamsburg": "williamsburg.wav",
"Willowbrook": "willowbrook.wav",
"Windrift": "windrift.wav",
"Windsdai": "windsdai.wav",
"Witchwyrd": "witchwyrd.wav",
"Witchwyrds": "witchwyrds.wav",
"Wolfclaw": "wolfclaw.wav",
"Woodlan": "woodlan.wav",
"Woodland": "woodland.wav",
"Wooo": "wooo.wav",
"Worlder": "worlder.wav",
"Wrath": "wrath.wav",
"Wuzy": "wuzy.wav",
"Wynshorn": "wynshorn.wav",
"Wyren": "wyren.wav",
"Yahnig": "yahnig.wav",
"Yan": "yan.wav",
"Yar": "yar.wav",
"Yer": "yer.wav",
"Yolan": "yolan.wav",
"Yoos": "yoos.wav",
"Yurik": "yurik.wav",
"Zalrek": "zalrek.wav",
"Zeb": "zeb.wav",
"Zelph": "zelph.wav",
"Zha": "zha.wav",
"Zhong": "zhong.wav",
"Zhong-Goo": "zhong_goo.wav",
"Zinger": "zinger.wav",
"Zirak": "zirak.wav",
"Zurn": "zurn.wav",
"Zyzaren": "zyzaren.wav",
"Zyzarn": "zyzarn.wav",
"Zyzren": "zyzren.wav"
}
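The mapping above pairs each proper noun with a pre-generated pronunciation clip. Judging from the entries alone, the wav filename appears to be derived by lowercasing the word, turning hyphens and apostrophes into underscores, and trimming any leftover trailing underscore (e.g. "Verr-" → "verr.wav"). A minimal sketch of that inferred rule; this is reverse-engineered from the data, not taken from the project's own code:

```python
import re

def wav_name(word: str) -> str:
    """Derive the .wav filename used in the mapping above from a proper noun.

    Inferred convention: lowercase, replace hyphens/apostrophes with
    underscores, strip stray leading/trailing underscores.
    """
    return re.sub(r"[-']", "_", word.lower()).strip("_") + ".wav"

print(wav_name("Dur-Hakan"))  # dur_hakan.wav
print(wav_name("Rah'Zi"))     # rah_zi.wav
```

A helper like this would let a build step verify that every key in the JSON actually has a matching clip on disk before audiobook generation starts.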


@ -0,0 +1,20 @@
{
"Anhuil-Elhar": "An-WHEEL AY-Lar",
"Anhuil-Ehlar": "An-WHEEL AY-Lar",
"Aegrir": "Ay-Greer",
"Baras": "BARE-iss",
"Emaleen": "EMMA-lean",
"Eushownava": "You-SHOWN-Eh-Vah",
"Graffel": "Gra-FELL",
"Greathaven": "GREAT-Haven",
"Jaylan": "JAY-Lin",
"Neverthoughtidseeyouprancingaroundwithabunchofelfgirls": "Never thought I'd see you prancing around with a bunch of elf girls",
"Nijel": "NYE-jell",
"Noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest": "No I've been here and there looking for my kin rumored to dwell here in this forest",
"Odoiran": "Oh-DORIAN",
"Ody": "Oh-Dee",
"Seker-Ankh": "Seker-Ahnk",
"Rasheer": "Raw-SHEAR",
"Valinor": "Vala-nor",
"Varsis": "Ver-Asis"
}
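This file maps proper nouns to phonetic respellings fed to the TTS engine. One way such a map could be applied is a whole-word substitution pass over the chapter text; the function below is an illustrative sketch (the map format is from the file above, the replacement logic is hypothetical). The lookarounds also treat hyphens as part of a word so "Elhar" does not fire inside "Anhuil-Elhar":

```python
import re

def apply_pronunciations(text: str, pron: dict) -> str:
    """Replace each key with its phonetic respelling, whole words only.

    Longest keys are applied first so multi-part names like
    "Anhuil-Elhar" win over their shorter components.
    """
    for word in sorted(pron, key=len, reverse=True):
        pattern = rf"(?<![\w-]){re.escape(word)}(?![\w-])"
        text = re.sub(pattern, pron[word], text)
    return text

sample = {"Nijel": "NYE-jell", "Jaylan": "JAY-Lin"}
print(apply_pronunciations("Nijel met Jaylan.", sample))  # NYE-jell met JAY-Lin.
```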

File diff suppressed because it is too large

File diff suppressed because it is too large


@ -1,28 +1,35 @@
 {
-"Gadianton Robbers": "Gadeeantun Robbers",
 "Gadianton": "Gadeeantun",
 "Coriantumr": "Coryantomer",
 "Laman": "Layman",
-"Lehi And Nephi": "Leehi And Nephi",
 "Lehi": "Leehi",
-"Lehi Mathonihah": "Leehi Mathonihah",
 "Lehis": "Leehis",
 "Lehies": "Leehis",
 "Liahona": "Leeahona",
-"Moroni": "Morero-ni",
-"Alma": "Al-ma",
 "Gadiantons": "Gadeeantuns",
 "Laban": "Layban",
 "Mosiah": "Moziah",
-"Mosiah The King": "Moziah The King",
 "Nehors": "Kneehores",
-"Samuel The Lamanite": "Samuel The Laymanite",
 "Tarry": "Tarery",
-"The Lamanite Twins": "The Laymanite Twins",
-"The Lamanites Of Ammon": "The Laymanites Of Ammon",
-"The Lamanites Of The Land Of Zarahemla": "The Laymanites Of The Land Of Zarahemla",
-"The Lamanites Of The Land Southward": "The Laymanites Of The Land Southward",
-"The Lamanites Of The People Of Ammon": "The Laymanites Of The People Of Ammon",
-"The Lamb'S Book Of Life": "The Lamb's Book Of Life",
-"The Land Of Nephi": "The Land Of Kneefi"
+"Nephites": "Kneefites",
+"Anti-Nephi-Lehies": "Anti-Kneef-eye-Leehis",
+"Lamanite": "Laymanite",
+"Lamanites": "Laymanites",
+"Lamb'S": "Lamb's",
+"Sarai": "Sa-rye",
+"Telestial": "Tea-lestial",
+"Lord'S": "Lord's",
+"Helaman": "He-la-mun",
+"Alma": "Al-ma",
+"Nephihah": "Kneef-eyehah",
+"Nephihet": "Kneef-eyehet",
+"Nephite": "Kneefight",
+"Nephi-Im": "Kneef-eye-Im",
+"Zenephi": "Ze-kneef-eye",
+"Nephitish": "Kneefight-ish",
+"Moroni": "Moh-roh-nye",
+"Nephi": "Knee-fye",
+"Hagar": "Hag-ar",
+"Oug": "Ohg",
+"Ougan": "Ohgan"
 }
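The multi-word entries removed in this hunk match the cleanup described in commit 949bd7c2 ("single words, filter stop words, keep two-part proper names"). A hedged sketch of what such a filter might look like; the stop-word list and function are illustrative, not the repository's actual script:

```python
# Illustrative subset of a stop-word list; the real script presumably
# uses a fuller one (e.g. from wordfreq or spaCy).
STOP_WORDS = {"the", "of", "and", "a", "an", "in", "on"}

def keep_entry(name: str) -> bool:
    """Keep single words and two-part proper names; drop phrases
    containing stop words ("Samuel The Lamanite", "The Land Of Nephi")."""
    parts = name.split()
    if any(p.lower() in STOP_WORDS for p in parts):
        return False
    return len(parts) <= 2

names = ["Gadianton", "Samuel The Lamanite", "Anhuil-Elhar"]
print([n for n in names if keep_entry(n)])  # ['Gadianton', 'Anhuil-Elhar']
```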


@ -0,0 +1,30 @@
{
"Adam": "adam.wav",
"Adam-Ondi-Ahman": "adam_ondi_ahman.wav",
"Ahman": "ahman.wav",
"Alma": "alma.wav",
"Apostles": "apostles.wav",
"Brethren": "brethren.wav",
"Cardston": "cardston.wav",
"Ephraim": "ephraim.wav",
"Evolving": "evolving.wav",
"Holies": "holies.wav",
"Israel": "israel.wav",
"Joseph": "joseph.wav",
"Knelt": "knelt.wav",
"Lehi": "lehi.wav",
"Liahona": "liahona.wav",
"Millennium": "millennium.wav",
"Mormon": "mormon.wav",
"Moroni": "moroni.wav",
"Mosiah": "mosiah.wav",
"Nauvoo": "nauvoo.wav",
"Quorum": "quorum.wav",
"Rachael": "rachael.wav",
"Savior": "savior.wav",
"Thummim": "thummim.wav",
"Urim": "urim.wav",
"Vignette": "vignette.wav",
"Zachary": "zachary.wav",
"Zion": "zion.wav"
}


@ -0,0 +1,30 @@
{
"Adam": "adam.wav",
"Adam-Ondi-Ahman": "adam_ondi_ahman.wav",
"Ahman": "ahman.wav",
"Alma": "alma.wav",
"Apostles": "apostles.wav",
"Brethren": "brethren.wav",
"Cardston": "cardston.wav",
"Ephraim": "ephraim.wav",
"Evolving": "evolving.wav",
"Holies": "holies.wav",
"Israel": "israel.wav",
"Joseph": "joseph.wav",
"Knelt": "knelt.wav",
"Lehi": "lehi.wav",
"Liahona": "liahona.wav",
"Millennium": "millennium.wav",
"Mormon": "mormon.wav",
"Moroni": "moroni.wav",
"Mosiah": "mosiah.wav",
"Nauvoo": "nauvoo.wav",
"Quorum": "quorum.wav",
"Rachael": "rachael.wav",
"Savior": "savior.wav",
"Thummim": "thummim.wav",
"Urim": "urim.wav",
"Vignette": "vignette.wav",
"Zachary": "zachary.wav",
"Zion": "zion.wav"
}

projects.json Normal file

@ -0,0 +1,18 @@
[
{
"name": "Audio Text for Novel Lightbringer",
"source_paths": [
"/home/dillon/_code/voice_model/Audio Text for Novel Lightbringer/Audio Text for Novel Lightbringer.txt"
],
"proper_nouns_output_dir": "output_proper_nouns/audio_text_for_novel_lightbringer",
"proper_nouns_audio_dir": "proper_nouns_audio/audio_text_for_novel_lightbringer"
},
{
"name": "visions glory canada",
"source_paths": [
"/home/dillon/_code/voice_model/Visions of Glory_ Zion in Canada pg 162-193.txt"
],
"proper_nouns_output_dir": "output_proper_nouns/visions_glory_canada",
"proper_nouns_audio_dir": "proper_nouns_audio/visions_glory_canada"
}
]
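projects.json gives each project a name, one or more source text paths, and two output directories. A minimal sketch of validating that shape when loading; the field names come from the file above, but the loader itself is illustrative, not the app's own code:

```python
import json

# Fields every project entry in projects.json is expected to carry.
REQUIRED = ("name", "source_paths",
            "proper_nouns_output_dir", "proper_nouns_audio_dir")

def validate_projects(raw: str) -> list:
    """Parse a projects.json string and fail fast on missing fields."""
    projects = json.loads(raw)
    missing = [k for p in projects for k in REQUIRED if k not in p]
    if missing:
        raise ValueError(f"projects.json entries missing fields: {missing}")
    return projects
```

Failing at load time keeps a typo in a directory key from surfacing hours later, mid-generation.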

File diff suppressed because it is too large

run_audiobook.bat Normal file

@ -0,0 +1,42 @@
@echo off
title Create Audiobook
:: Change to the folder this .bat file lives in
cd /d "%~dp0"
:: Check setup has been run
if not exist .venv\Scripts\python.exe (
echo ERROR: Setup has not been run yet.
echo Please double-click setup_windows.bat first.
pause
exit /b 1
)
echo ============================================================
echo Audiobook Creator
echo ============================================================
echo.
echo Options:
echo 1 - Generate ALL chapters (may take many hours)
echo 2 - List detected chapters only
echo 3 - Generate a short PREVIEW of each chapter
echo 4 - Generate specific chapters (enter numbers next)
echo.
set /p CHOICE="Enter choice (1/2/3/4): "
if "%CHOICE%"=="1" (
.venv\Scripts\python create_audiobook_lightbringer.py
) else if "%CHOICE%"=="2" (
.venv\Scripts\python create_audiobook_lightbringer.py --list
) else if "%CHOICE%"=="3" (
.venv\Scripts\python create_audiobook_lightbringer.py --preview
) else if "%CHOICE%"=="4" (
rem Delayed expansion is needed here: %CHAPTERS% would expand when the
rem block is parsed, before the user has typed anything.
setlocal EnableDelayedExpansion
set /p CHAPTERS="Enter chapter numbers separated by spaces (e.g. 0 1 2): "
.venv\Scripts\python create_audiobook_lightbringer.py !CHAPTERS!
endlocal
) else (
echo Invalid choice.
)
echo.
echo Done. Output files are in the output_audiobook_lightbringer folder.
pause

run_gui.bat Normal file

@ -0,0 +1,21 @@
@echo off
title Proper Noun GUI
:: Change to the folder this .bat file lives in
cd /d "%~dp0"
:: Check setup has been run
if not exist .venv\Scripts\python.exe (
echo ERROR: Setup has not been run yet.
echo Please double-click setup_windows.bat first.
pause
exit /b 1
)
echo Starting Proper Noun Player GUI...
.venv\Scripts\python gui_proper_noun_player.py
if errorlevel 1 (
echo.
echo The application closed with an error. See message above.
pause
)

setup_windows.bat Normal file

@ -0,0 +1,93 @@
@echo off
setlocal EnableDelayedExpansion
title Audiobook Setup
echo ============================================================
echo Audiobook Setup for Windows 11
echo ============================================================
echo.
:: ── 1. Check Python ──────────────────────────────────────────────────────────
echo [1/5] Checking Python installation...
python --version >nul 2>&1
if errorlevel 1 (
echo.
echo ERROR: Python was not found.
echo.
echo Please install Python 3.12 from https://www.python.org/downloads/
echo IMPORTANT: On the installer, tick "Add Python to PATH" before clicking Install.
echo.
echo After installing, close this window and double-click setup_windows.bat again.
pause
exit /b 1
)
for /f "tokens=2 delims= " %%v in ('python --version 2^>^&1') do set PY_VER=%%v
echo Found Python %PY_VER%
echo.
:: ── 2. Create virtual environment ────────────────────────────────────────────
echo [2/5] Creating virtual environment (.venv)...
if exist .venv (
echo .venv already exists, skipping creation.
) else (
python -m venv .venv
if errorlevel 1 (
echo ERROR: Failed to create virtual environment.
pause
exit /b 1
)
echo Virtual environment created.
)
echo.
:: ── 3. Install PyTorch with CUDA (for gaming GPU) ────────────────────────────
echo [3/5] Installing PyTorch with CUDA 12.4 support (this may take a while)...
echo Downloading ~2.5 GB — please be patient.
echo.
.venv\Scripts\pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
if errorlevel 1 (
echo.
echo WARNING: CUDA build failed. Falling back to CPU-only PyTorch.
echo Audio generation will be slower but will still work.
.venv\Scripts\pip install torch
)
echo.
:: ── 4. Install remaining packages ────────────────────────────────────────────
echo [4/5] Installing remaining packages (kokoro, soundfile, sounddevice, spacy, wordfreq)...
.venv\Scripts\pip install -r requirements.txt
if errorlevel 1 (
echo ERROR: Package installation failed. Check your internet connection.
pause
exit /b 1
)
echo Downloading spaCy English language model (en_core_web_sm, ~15 MB)...
.venv\Scripts\python -m spacy download en_core_web_sm
if errorlevel 1 (
echo WARNING: spaCy model download failed. Proper noun extraction will not work
echo until you re-run: .venv\Scripts\python -m spacy download en_core_web_sm
)
echo.
:: ── 5. Download the Kokoro TTS model ─────────────────────────────────────────
echo [5/5] Downloading the Kokoro TTS model (hexgrad/Kokoro-82M, ~330 MB)...
echo This only happens once.
echo.
.venv\Scripts\python -c "from kokoro import KPipeline; KPipeline(lang_code='a', repo_id='hexgrad/Kokoro-82M'); print('Model ready.')"
if errorlevel 1 (
echo.
echo WARNING: Model download failed. It will retry the first time you run the app.
echo Make sure you have an internet connection on first launch.
)
echo.
echo ============================================================
echo Setup complete!
echo.
echo To launch the GUI: double-click run_gui.bat
echo To create the audiobook: double-click run_audiobook.bat
echo ============================================================
echo.
pause