Compare commits: f0e0adf24b...main (14 commits)

Commits: e9ddbb586a, 894144c84a, 69639342e3, 125cb25cf8, 8a1362fe0b, 0d00176a18, 3c2c3d241e, 224f97d0c6, 6e2e0f9af7, c1301fee18, 6781efe3f3, 44bc757f3f, 6cefc3c862, 949bd7c203
.envrc (new file, +2)
@@ -0,0 +1,2 @@
export VIRTUAL_ENV="$PWD/.venv"
export PATH="$VIRTUAL_ENV/bin:$PATH"
.gitignore (vendored, +7)
@@ -3,6 +3,9 @@ __pycache__/
 *.pyc
 *.pyo
 .venv/
+build/
+dist/
+*.spec

 # Audio files
 *.wav
@@ -14,6 +17,10 @@ proper_nouns_audio/
 # Generated data (JSON files in output_proper_nouns/ are tracked)
 output_proper_nouns/remaining_review.txt
+
+# Generated PDFs and LaTeX files
+*.pdf
+*.tex

 # Text files (except proper_nouns.txt)
 *.txt
 !proper_nouns.txt
.vscode/settings.json (vendored, new file, +4)
@@ -0,0 +1,4 @@
{
    "python.defaultInterpreterPath": ".venv/bin/python",
    "python.terminal.activateEnvironment": true
}
README.md (new file, +125)
@@ -0,0 +1,125 @@
# Audiobook Creator

AI-powered audiobook generator using the [Kokoro TTS](https://github.com/hexgrad/kokoro) model.
Generates high-quality narrated `.wav` files from plain-text novels, with a GUI tool for auditing and fixing proper noun pronunciations per book.

---

## Features

- **Multi-book support** — each book's proper nouns, fixes, and audio are fully isolated
- **Proper Noun GUI** — hear every extracted name, mark it correct or type a phonetic fix
- **Audiobook generation** — one `.wav` per chapter, GPU-accelerated via CUDA
- **In-GUI extraction** — click one button to run NLP extraction and generate audio; no separate scripts needed
- **Apply Fixes** — writes a TTS-ready copy of the source text with all phonetic substitutions applied

---

## Project structure

```
Audio Text for Novel Lightbringer/   ← multi-file book (chapters as .txt)
Audio Master Nem Full.txt            ← single-file book

gui_proper_noun_player.py            ← proper noun auditing GUI
create_audiobook_lightbringer.py     ← generate Lightbringer audiobook chapters
create_audiobook_nem.py              ← generate Nem audiobook chapters

output_audiobook_lightbringer/       ← chapter WAV output
output_audiobook/                    ← Nem WAV output
output_proper_nouns/<book>/          ← manifest + JSON fix data per book
proper_nouns_audio/<book>/           ← word audio + replacements cache per book

requirements.txt
setup_windows.bat                    ← one-click Windows setup
run_gui.bat                          ← launch GUI on Windows
run_audiobook.bat                    ← generate audiobook on Windows
```

---

## Setup (Windows - Easiest for Non-Technical Users)

1. **Download** the project as a ZIP file from GitHub
2. **Extract** the ZIP to a folder on your computer (e.g., `C:\audiobook-creator`)
3. **Double-click** `setup_windows.bat` and wait for it to finish installing everything (this may take 10-20 minutes)
4. **Double-click** `run_gui.bat` to launch the Proper Noun Player GUI
5. **Double-click** `run_audiobook.bat` to generate audiobook chapters

That's it! The setup script handles Python installation, the virtual environment, and all dependencies automatically.

---

## Setup (Linux / Mac)

```bash
python3.12 -m venv .venv
source .venv/bin/activate
pip install torch --index-url https://download.pytorch.org/whl/cu124  # CUDA 12.4
pip install -r requirements.txt
python -m spacy download en_core_web_sm
```

> For CPU-only use, replace the torch line with `pip install torch`.

---

## Setup (Windows)

See [SETUP_WINDOWS.md](SETUP_WINDOWS.md) for a step-by-step guide aimed at non-technical users.

---

## Usage

### Proper Noun GUI

```bash
.venv/bin/python gui_proper_noun_player.py
```

1. Select a book from the dropdown
2. Click **⚙ Extract & Generate Audio** — extracts proper nouns via spaCy and generates a TTS clip for each one
3. Click words in the Review list to hear them; press Enter to mark correct, or type a phonetic replacement first
4. Click **⇄ Apply Fixes to Text** to write a pronunciation-corrected copy of the source file

### Generate Audiobook

```bash
# All chapters
.venv/bin/python create_audiobook_lightbringer.py

# List chapters only
.venv/bin/python create_audiobook_lightbringer.py --list

# Preview clips
.venv/bin/python create_audiobook_lightbringer.py --preview

# Specific chapters
.venv/bin/python create_audiobook_lightbringer.py 0 1 2
```

---

## Dependencies

| Package | Purpose |
|---|---|
| `kokoro` | Kokoro-82M TTS model |
| `torch` | GPU inference |
| `soundfile` / `sounddevice` | Audio I/O |
| `numpy` | Audio array operations |
| `spacy` + `en_core_web_sm` | Proper noun extraction (NER + PROPN) |
| `wordfreq` | Common-word filter during extraction |

---

## Output

| Path | Contents |
|---|---|
| `output_audiobook_lightbringer/` | `chapter_01_homecoming.wav`, … |
| `output_proper_nouns/<book>/manifest.json` | Word → WAV filename map |
| `output_proper_nouns/<book>/pronunciation_fixes.json` | `{"Nephi": "Kneephi", …}` |
| `output_proper_nouns/<book>/correct_words.json` | Words confirmed correct |
| `proper_nouns_audio/<book>/` | Per-word audio clips |
| `proper_nouns_audio/<book>/replacements_cache/` | Cached phonetic fix clips |
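The Apply Fixes step above substitutes each saved phonetic spelling into the source text before TTS. A minimal sketch of that substitution, assuming the `pronunciation_fixes.json` format shown in the Output table (the helper name is illustrative, not the GUI's actual code):

```python
import re


def apply_pronunciation_fixes(text: str, fixes: dict[str, str]) -> str:
    # Whole-word replacement; longest names first so overlapping
    # keys resolve predictably.
    for word in sorted(fixes, key=len, reverse=True):
        text = re.sub(rf"\b{re.escape(word)}\b", fixes[word], text)
    return text


fixes = {"Nephi": "Kneephi"}
print(apply_pronunciation_fixes("Nephi raised his hand.", fixes))
# → Kneephi raised his hand.
```

The `\b` word boundaries keep the fix from mangling longer words that merely contain a fixed name.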
SETUP_WINDOWS.md (new file, +134)
@@ -0,0 +1,134 @@
# Audiobook Creator — Windows 11 Setup Guide

This guide is written for someone who has never used Python or the command line.
Follow the steps in order and you will be generating audiobook chapters with your gaming GPU.

---

## What you will need

| Requirement | Why |
|---|---|
| Windows 11 PC with a modern NVIDIA GPU | Fast audio generation using CUDA |
| ~5 GB free disk space | Python, PyTorch, and the AI voice model |
| Internet connection (first-time only) | Downloads packages and the Kokoro voice model |

---

## Step 1 — Install Python

1. Go to **https://www.python.org/downloads/**
2. Click the big yellow **"Download Python 3.12.x"** button
3. Run the installer
4. **IMPORTANT:** On the very first screen of the installer, tick the checkbox that says **"Add Python to PATH"** before clicking Install Now

> If you missed that checkbox, uninstall Python from Windows Settings and reinstall it with the box ticked.

---

## Step 2 — Get the project files

You should have a folder called `audiobook_creator` (or similar) containing the project files. Make sure it includes these files:

```
setup_windows.bat
run_gui.bat
run_audiobook.bat
requirements.txt
gui_proper_noun_player.py
create_audiobook_lightbringer.py
Audio Text for Novel Lightbringer\   ← your chapter text files go here
```

If you received a ZIP file, extract it first so the folder is not inside another folder.

---

## Step 3 — Run Setup (one time only)

1. Open the project folder in File Explorer
2. Double-click **`setup_windows.bat`**
3. A black terminal window opens and runs through these steps automatically:
   - Checks Python is installed
   - Creates a private Python environment (`.venv` folder)
   - Downloads PyTorch with GPU (CUDA) support — **about 2.5 GB, this takes several minutes**
   - Installs the remaining packages (kokoro, spaCy, etc.)
   - Downloads the spaCy English language model
   - Downloads the Kokoro AI voice model — **about 330 MB**
4. When it says **"Setup complete!"**, press any key to close the window

You only need to do this once. If you run it again, it will safely skip anything already installed.

---

## Step 4 — Review Proper Noun Pronunciations (GUI)

Before generating the audiobook, it helps to check how unusual names are pronounced.

1. Double-click **`run_gui.bat`**
2. The Proper Noun Pronunciation Auditor window opens
3. Select your book from the dropdown at the top
4. Click **⚙ Extract & Generate Audio** — this scans the text and creates a short audio clip for every proper noun found (takes a few minutes the first time)
5. Click any word in the **To Review** list to hear how it sounds
6. If it sounds wrong, type the phonetic spelling in the box at the bottom and press **Enter** to save a fix
   - Example: type `Kneephi` instead of `Nephi`
7. If it sounds correct, just press **Enter** without changing anything
8. When you are done reviewing, click **⇄ Apply Fixes to Text** to save a corrected copy of the source text

**Keyboard shortcuts:**

| Key | Action |
|---|---|
| Space | Replay current word |
| Enter | Mark correct (or save fix if text was changed) |
| Escape | Reset the fix box, go back to word list |
| s | Stop audio |
| ↑ / ↓ | Navigate the word list from the fix box |
| Delete | Move a word back to Review from Correct or Fixes |

---

## Step 5 — Generate the Audiobook

1. Double-click **`run_audiobook.bat`**
2. A menu appears — type the number of your choice and press Enter:

| Option | What it does |
|---|---|
| 1 | Generate **all chapters** — can take many hours, safe to leave running overnight |
| 2 | **List** detected chapters only — instant, nothing is generated |
| 3 | Generate a short **preview clip** of each chapter — quick sanity check |
| 4 | Generate **specific chapters** — enter chapter numbers separated by spaces |

3. When finished, the `.wav` files will be in the `output_audiobook_lightbringer` folder

---

## Troubleshooting

**"Python was not found"**
→ Python is not installed, or you forgot to tick "Add Python to PATH" during installation. Uninstall and reinstall Python from https://www.python.org/downloads/, making sure to tick that box.

**The black window opens and immediately closes**
→ There was an error. To see it: press `Win + R`, type `cmd`, press Enter, then drag the `.bat` file into that black window and press Enter. The error message will stay visible.

**Audio generation is very slow (taking hours per chapter)**
→ The GPU version of PyTorch may not have installed correctly. Re-run `setup_windows.bat` — it will reinstall just that part.

**"No .txt files found in Audio Text for Novel Lightbringer"**
→ Make sure your chapter `.txt` files are inside the `Audio Text for Novel Lightbringer` subfolder, not loose in the main project folder.

**The GUI says "No manifest yet"**
→ You need to click **⚙ Extract & Generate Audio** first for that book.

**Antivirus blocks the .bat files**
→ Right-click the `.bat` file, choose Properties, and click "Unblock" at the bottom. Then try again.

---

## Output files

| Folder | Contents |
|---|---|
| `output_audiobook_lightbringer\` | One `.wav` file per chapter |
| `output_proper_nouns\<book>\` | Pronunciation data (JSON files) |
| `proper_nouns_audio\<book>\` | Cached word audio clips |
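Option 1 above can run for many hours; the generator estimates remaining time by calibrating characters-per-second from chapters already finished. The arithmetic, as a standalone sketch (the function name is illustrative):

```python
def estimate_remaining_seconds(done_chars: int, done_elapsed: float, total_chars: int) -> float:
    """Project time left, assuming synthesis throughput stays constant."""
    chars_per_sec = done_chars / done_elapsed
    return (total_chars - done_chars) / chars_per_sec


# After synthesizing 1,000 chars in 20 s, with 11,000 chars planned in total:
print(estimate_remaining_seconds(1000, 20.0, 11000))  # → 200.0 (seconds)
```

The first chapter acts as the calibration run, so early estimates are rough and improve as more chapters complete.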
create_audiobook.py (new file, +402)
@@ -0,0 +1,402 @@
"""
create_audiobook.py
-------------------
Generic audiobook generator for text files that contain chapter headings.

Supported heading formats (single-line headings):
- Prologue
- Chapter 12
- Chapter 12 - Chapter Name
- Chapter - 12
- Chapter - 12 - Chapter Name

Features:
- Parses chapters from one or more input files/directories
- Caches parsed chapter data for faster re-runs when source files are unchanged
- Warns about missing chapter numbers (example: found 1, 2, 4 -> warns about 3)
- Generates one .wav per chapter with Kokoro

Examples:
    python create_audiobook.py --input "Audio Text for Novel Lightbringer"
    python create_audiobook.py --input novel.txt --list
    python create_audiobook.py --input novel.txt 0 1 2 --voice am_michael
    python create_audiobook.py --input novel.txt --preview 3000
"""

from __future__ import annotations

import argparse
import hashlib
import json
import re
import time
from pathlib import Path

import numpy as np
import soundfile as sf
import torch
from kokoro import KPipeline

SAMPLE_RATE = 24000
SPEED = 1.0
LANG_CODE = "a"
VOICE = "am_onyx"
CACHE_VERSION = 1

PROLOGUE_RE = re.compile(r"^\s*Prologue\s*$", re.IGNORECASE)
CHAPTER_RE_1 = re.compile(r"^\s*Chapter\s*-\s*(\d+)(?:\s*-\s*(.+))?\s*$", re.IGNORECASE)
CHAPTER_RE_2 = re.compile(r"^\s*Chapter\s+(\d+)(?:\s*-\s*(.+))?\s*$", re.IGNORECASE)
# re.MULTILINE so horizontal-rule lines anywhere in the text are matched,
# not just a rule that makes up the entire string
RULE_RE = re.compile(r"^[_\-*\s]{3,}\s*$", re.MULTILINE)


def _slug(text: str) -> str:
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


def _clean_text(text: str) -> str:
    text = RULE_RE.sub("", text)
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


def _fmt_duration(seconds: float) -> str:
    h, rem = divmod(int(seconds), 3600)
    m, s = divmod(rem, 60)
    if h > 0:
        return f"{h}h {m:02d}m {s:02d}s"
    if m > 0:
        return f"{m}m {s:02d}s"
    return f"{s}s"


def _chapter_heading(line: str) -> tuple[int, str, str] | None:
    stripped = line.strip()
    if PROLOGUE_RE.match(stripped):
        return (0, "Prologue", "Prologue")

    m = CHAPTER_RE_1.match(stripped)
    if not m:
        m = CHAPTER_RE_2.match(stripped)
    if not m:
        return None

    num = int(m.group(1))
    title = (m.group(2) or "").strip()
    label = f"Chapter {num}" + (f" - {title}" if title else "")
    return (num, title, label)


def _resolve_txt_files(inputs: list[str]) -> list[Path]:
    txt_files: list[Path] = []
    for raw in inputs:
        path = Path(raw)
        if path.is_file():
            if path.suffix.lower() == ".txt":
                txt_files.append(path)
            continue
        if path.is_dir():
            txt_files.extend(sorted(path.glob("*.txt")))

    deduped = sorted({p.resolve() for p in txt_files})
    return deduped


def _signature_for_files(files: list[Path]) -> list[dict]:
    sig = []
    for p in files:
        st = p.stat()
        sig.append({
            "path": str(p),
            "size": st.st_size,
            "mtime_ns": st.st_mtime_ns,
        })
    return sig


def _cache_path(output_dir: Path, files: list[Path]) -> Path:
    cache_dir = output_dir / ".cache"
    digest = hashlib.sha256("\n".join(str(p) for p in files).encode("utf-8")).hexdigest()[:12]
    return cache_dir / f"parse_{digest}.json"


def _load_cached_chapters(cache_file: Path, file_sig: list[dict]) -> list[dict] | None:
    if not cache_file.exists():
        return None

    try:
        data = json.loads(cache_file.read_text(encoding="utf-8"))
    except Exception:
        return None

    if data.get("version") != CACHE_VERSION:
        return None
    if data.get("file_signature") != file_sig:
        return None

    chapters = data.get("chapters")
    if not isinstance(chapters, list):
        return None
    return chapters


def _save_cached_chapters(cache_file: Path, file_sig: list[dict], chapters: list[dict]) -> None:
    cache_file.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "version": CACHE_VERSION,
        "file_signature": file_sig,
        "chapters": chapters,
    }
    cache_file.write_text(json.dumps(payload, ensure_ascii=False), encoding="utf-8")


def _parse_chapters(files: list[Path]) -> tuple[list[dict], set[int]]:
    chapters: list[dict] = []
    duplicates: set[int] = set()
    seen: set[int] = set()
    current: dict | None = None

    def flush_current() -> None:
        if current is not None:
            current["text"] = "".join(current.pop("lines"))
            num = current["num"]
            if num in seen:
                duplicates.add(num)
                return
            seen.add(num)
            chapters.append(current)

    for fpath in files:
        with fpath.open("r", encoding="utf-8") as fh:
            for line in fh:
                info = _chapter_heading(line)
                if info is not None:
                    flush_current()
                    num, title, label = info
                    num_str = f"{num:02d}"
                    if num == 0:
                        slug = "chapter_00_prologue"
                    elif title:
                        slug = f"chapter_{num_str}_{_slug(title)}"
                    else:
                        slug = f"chapter_{num_str}"
                    current = {
                        "num": num,
                        "title": title,
                        "label": label,
                        "slug": slug,
                        "lines": [line],
                    }
                elif current is not None:
                    current["lines"].append(line)

    flush_current()
    chapters.sort(key=lambda c: c["num"])
    return chapters, duplicates


def load_all_chapters_with_cache(
    inputs: list[str], output_dir: Path, force_reparse: bool = False
) -> tuple[list[dict], bool, set[int], list[Path]]:
    files = _resolve_txt_files(inputs)
    if not files:
        raise FileNotFoundError("No .txt files found in --input paths")

    file_sig = _signature_for_files(files)
    cache_file = _cache_path(output_dir, files)

    if not force_reparse:
        cached = _load_cached_chapters(cache_file, file_sig)
        if cached is not None:
            return cached, True, set(), files

    chapters, duplicates = _parse_chapters(files)
    _save_cached_chapters(cache_file, file_sig, chapters)
    return chapters, False, duplicates, files


def warn_missing_chapters(chapters: list[dict]) -> None:
    nums = sorted(ch["num"] for ch in chapters if ch["num"] > 0)
    if not nums:
        return
    missing = [n for n in range(nums[0], nums[-1] + 1) if n not in set(nums)]
    if missing:
        print(f"WARNING: missing chapter numbers detected: {missing}")


def generate_audio(pipeline: KPipeline, text: str, voice: str, output_path: Path) -> float:
    t0 = time.monotonic()
    chunks = []
    for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
        if hasattr(chunk_audio, "numpy"):
            chunk_audio = chunk_audio.cpu().numpy()
        chunk_audio = np.atleast_1d(chunk_audio.squeeze())
        if chunk_audio.size > 0:
            chunks.append(chunk_audio)

    elapsed = time.monotonic() - t0
    if chunks:
        audio = np.concatenate(chunks, axis=0)
        sf.write(str(output_path), audio, SAMPLE_RATE)
        duration = len(audio) / SAMPLE_RATE
        print(
            f"  OK saved '{output_path.name}' "
            f"({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)"
        )
    else:
        print(f"  ERROR no audio produced for voice='{voice}'")
    return elapsed


def main() -> None:
    parser = argparse.ArgumentParser(description="Generate an audiobook from chapterized text files.")
    parser.add_argument(
        "chapters",
        nargs="*",
        type=int,
        help="Chapter numbers to generate (0 = Prologue). Default: all.",
    )
    parser.add_argument(
        "--input",
        nargs="+",
        required=True,
        help="One or more .txt files and/or directories containing .txt files.",
    )
    parser.add_argument(
        "--output",
        default="output_audiobook",
        help="Output directory for generated chapter audio.",
    )
    parser.add_argument("--list", action="store_true", help="Print detected chapters and exit.")
    parser.add_argument("--voice", default=VOICE, help=f"Kokoro voice to use (default: {VOICE}).")
    parser.add_argument(
        "--preview",
        nargs="?",
        const=3000,
        type=int,
        metavar="CHARS",
        help="Generate short preview clips capped at CHARS (default: 3000).",
    )
    parser.add_argument(
        "--reparse",
        action="store_true",
        help="Ignore cache and re-parse chapters from source files.",
    )
    args = parser.parse_args()

    output_dir = Path(args.output)
    output_dir.mkdir(parents=True, exist_ok=True)

    print("Loading chapters...")
    chapters, used_cache, duplicates, files = load_all_chapters_with_cache(
        args.input, output_dir, force_reparse=args.reparse
    )

    print(f"Input files: {len(files)}")
    print(f"Parse cache: {'HIT' if used_cache else 'MISS'}")

    if duplicates:
        print(f"WARNING: duplicate chapter numbers were found and ignored: {sorted(duplicates)}")

    if not chapters:
        print("WARNING: no chapters found.")
        print("Expected headings like: 'Prologue' or 'Chapter 12 - Name' or 'Chapter - 12'")
        return

    warn_missing_chapters(chapters)

    if args.list:
        print(f"\nDetected {len(chapters)} chapters:\n")
        print(f"  {'#':>4} {'Label':<45} {'Chars':>8} {'Output filename'}")
        print(f"  {'-' * 4} {'-' * 45} {'-' * 8} {'-' * 30}")
        for ch in chapters:
            chars = len(_clean_text(ch["text"]))
            print(f"  {ch['num']:>4} {ch['label']:<45} {chars:>8,} {ch['slug']}.wav")
        return

    if args.chapters:
        requested = set(args.chapters)
        run_chapters = [ch for ch in chapters if ch["num"] in requested]
        missing_req = sorted(requested - {ch["num"] for ch in run_chapters})
        if missing_req:
            print(f"WARNING: requested chapter(s) not found: {missing_req}")
    else:
        run_chapters = chapters

    if not run_chapters:
        print("No chapters selected. Use --list to see available chapters.")
        return

    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"Device: {device}")
    if device == "cuda":
        print(f"GPU: {torch.cuda.get_device_name(0)}")
    print(f"Voice: {args.voice}")

    chapter_chars = {ch["num"]: len(_clean_text(ch["text"])) for ch in run_chapters}
    total_chars = sum(chapter_chars.values())

    preview_note = f"PREVIEW MODE: capped at {args.preview:,} chars/chapter" if args.preview else ""
    if preview_note:
        print(preview_note)

    print("\nPlan:")
    for ch in run_chapters:
        print(f"  {ch['num']:>3} {ch['label']} ({chapter_chars[ch['num']]:,} chars)")
    print(f"  TOTAL: {total_chars:,} chars\n")

    print("Initializing Kokoro pipeline...")
    pipeline = KPipeline(lang_code=LANG_CODE)

    chars_per_sec: float | None = None
    timing_rows: list[tuple[str, int, float]] = []

    for ch in run_chapters:
        text = _clean_text(ch["text"])
        if not text:
            print(f"[{ch['label']}] WARNING empty text, skipping")
            continue

        if args.preview and len(text) > args.preview:
            cut = text.rfind(" ", 0, args.preview)
            text = text[: cut if cut > 0 else args.preview]

        chars = len(text)
        preview_tag = "_preview" if args.preview else ""
        out_path = output_dir / f"{ch['slug']}{preview_tag}.wav"

        if chars_per_sec is not None:
            eta = _fmt_duration(chars / chars_per_sec)
            print(f"\n[{ch['label']}] -> {out_path.name} (est. {eta})")
        else:
            print(f"\n[{ch['label']}] -> {out_path.name} (calibration run)")

        elapsed = generate_audio(pipeline, text, args.voice, out_path)
        timing_rows.append((ch["label"], chars, elapsed))

        done_chars = sum(c for _, c, _ in timing_rows)
        done_elapsed = sum(e for _, _, e in timing_rows)
        if done_elapsed > 0:
            chars_per_sec = done_chars / done_elapsed
            remaining = total_chars - done_chars
            eta_total = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
            print(f"  Speed: {chars_per_sec:.0f} chars/sec | Estimated remaining: {eta_total}")

    print("\nSummary:")
    print(f"  {'Chapter':<35} {'Chars':>7} {'Actual':>8} {'Est':>8}")
    print("  " + "-" * 65)
    for i, (label, chars, elapsed) in enumerate(timing_rows):
        actual_str = _fmt_duration(elapsed)
        prior_chars = sum(c for _, c, _ in timing_rows[:i])
        prior_elapsed = sum(e for _, _, e in timing_rows[:i])
        est_str = _fmt_duration(chars / (prior_chars / prior_elapsed)) if prior_elapsed > 0 else "(first)"
        print(f"  {label:<35} {chars:>7,} {actual_str:>8} {est_str:>8}")

    total_elapsed = sum(e for _, _, e in timing_rows)
    total_done_chars = sum(c for _, c, _ in timing_rows)
    print("  " + "-" * 65)
    print(f"  {'TOTAL':<35} {total_done_chars:>7,} {_fmt_duration(total_elapsed):>8}")
    print("\nDone.")


if __name__ == "__main__":
    main()
create_audiobook_lightbringer.py (new file, +311)
@@ -0,0 +1,311 @@
"""
create_audiobook_lightbringer.py
─────────────────────────────────
Generate the "A Darkness Rising" audiobook — one file per chapter/prologue.

Reads all .txt files from NOVEL_DIR, detects Prologue + Chapter headings,
and writes one .wav per chapter into OUTPUT_DIR.

Usage:
    python create_audiobook_lightbringer.py            # all chapters
    python create_audiobook_lightbringer.py --list     # list detected chapters
    python create_audiobook_lightbringer.py 0 1 2      # prologue + ch1 + ch2
    python create_audiobook_lightbringer.py --preview  # short preview clips

Output filenames:
    chapter_00_prologue.wav
    chapter_01_homecoming.wav
    chapter_02_the_anhuil_ehlar.wav
    ...
"""

import argparse
import re
import time
from pathlib import Path

import numpy as np
import soundfile as sf
import torch
from kokoro import KPipeline

# ── Config ─────────────────────────────────────────────────────────────────────
NOVEL_DIR = Path("Audio Text for Novel Lightbringer")
OUTPUT_DIR = Path("output_audiobook_lightbringer")
SAMPLE_RATE = 24000
SPEED = 1.0
LANG_CODE = "a"    # American English
VOICE = "am_onyx"  # default narrator voice

# Regex that matches a chapter/prologue heading line (case-insensitive).
# Group 1 captures the chapter number (or None for Prologue).
# Group 2 captures the optional subtitle after " - ".
_HEADING_RE = re.compile(
    r"^(?:Chapter\s+(\d+)\s*(?:-\s*(.+))?|(Prologue))\s*$",
    re.IGNORECASE,
)


# ── Helpers ────────────────────────────────────────────────────────────────────

def _slug(text: str) -> str:
    """Convert title text to a filesystem-safe slug."""
    text = text.lower()
    text = re.sub(r"[^a-z0-9]+", "_", text)
    return text.strip("_")


def load_all_chapters(novel_dir: Path) -> list[dict]:
    """
    Read all .txt files in *novel_dir* in sorted order, detect Prologue /
    Chapter headings, and return a list of chapter dicts:
        {
            "num": int,    # 0 = Prologue
            "title": str,  # subtitle portion, e.g. "Homecoming"
            "label": str,  # human label, e.g. "Chapter 1 - Homecoming"
            "slug": str,   # e.g. "chapter_01_homecoming"
            "text": str,   # full body text of the chapter
        }
    Chapters from multiple files are concatenated in sorted-filename order.
    """
    txt_files = sorted(novel_dir.glob("*.txt"))
    if not txt_files:
        raise FileNotFoundError(f"No .txt files found in '{novel_dir}'")

    # Collect (chapter_num, title_line, body_lines) across all files
    raw: list[tuple[int, str, list[str]]] = []  # (num, heading_text, body)
    current_num: int | None = None
    current_heading: str = ""
    current_body: list[str] = []

    def _flush():
        if current_num is not None:
            raw.append((current_num, current_heading, list(current_body)))

    for fpath in txt_files:
        lines = fpath.read_text(encoding="utf-8").splitlines()
        for line in lines:
            m = _HEADING_RE.match(line.strip())
            if m:
                _flush()
                if m.group(3):  # Prologue
                    current_num = 0
                    current_heading = "Prologue"
                else:  # Chapter N
                    current_num = int(m.group(1))
                    subtitle = (m.group(2) or "").strip()
                    current_heading = f"Chapter {current_num}" + (f" - {subtitle}" if subtitle else "")
                current_body = [line]  # keep heading inside text
            else:
                if current_num is not None:
                    current_body.append(line)
    _flush()

    # Build chapter dicts, deduplicated and sorted by number
    seen: set[int] = set()
    chapters: list[dict] = []
    for num, heading, body in sorted(raw, key=lambda x: x[0]):
        if num in seen:
            continue
        seen.add(num)
|
||||||
|
# Derive subtitle / slug
|
||||||
|
subtitle = ""
|
||||||
|
sm = re.match(r"Chapter\s+\d+\s*-\s*(.+)", heading, re.IGNORECASE)
|
||||||
|
if sm:
|
||||||
|
subtitle = sm.group(1).strip()
|
||||||
|
elif heading.lower() == "prologue":
|
||||||
|
subtitle = "Prologue"
|
||||||
|
|
||||||
|
num_str = f"{num:02d}"
|
||||||
|
if subtitle:
|
||||||
|
slug = f"chapter_{num_str}_{_slug(subtitle)}"
|
||||||
|
else:
|
||||||
|
slug = f"chapter_{num_str}"
|
||||||
|
|
||||||
|
chapters.append({
|
||||||
|
"num": num,
|
||||||
|
"title": subtitle or heading,
|
||||||
|
"label": heading,
|
||||||
|
"slug": slug,
|
||||||
|
"text": "\n".join(body),
|
||||||
|
})
|
||||||
|
|
||||||
|
return chapters
|
||||||
|
|
||||||
|
|
||||||
|
def clean_text(text: str) -> str:
|
||||||
|
"""Strip formatting artifacts and normalise whitespace for TTS."""
|
||||||
|
# Remove horizontal-rule lines (underscores / asterisks / dashes)
|
||||||
|
text = re.sub(r"^[_\-\*\s]{3,}\s*$", "", text, flags=re.MULTILINE)
|
||||||
|
# Collapse 3+ blank lines to 2
|
||||||
|
text = re.sub(r"\n{3,}", "\n\n", text)
|
||||||
|
return text.strip()
|
||||||
|
|
||||||
|
|
||||||
|
def _fmt_duration(seconds: float) -> str:
|
||||||
|
h, rem = divmod(int(seconds), 3600)
|
||||||
|
m, s = divmod(rem, 60)
|
||||||
|
if h > 0:
|
||||||
|
return f"{h}h {m:02d}m {s:02d}s"
|
||||||
|
if m > 0:
|
||||||
|
return f"{m}m {s:02d}s"
|
||||||
|
return f"{s}s"
|
||||||
|
|
||||||
|
|
||||||
|
def generate_audio(pipeline: KPipeline, text: str, voice: str,
|
||||||
|
output_path: Path) -> float:
|
||||||
|
"""Generate audio and return wall-clock seconds elapsed."""
|
||||||
|
t0 = time.monotonic()
|
||||||
|
chunks = []
|
||||||
|
for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
|
||||||
|
if hasattr(chunk_audio, "numpy"):
|
||||||
|
chunk_audio = chunk_audio.cpu().numpy()
|
||||||
|
chunk_audio = np.atleast_1d(chunk_audio.squeeze())
|
||||||
|
if chunk_audio.size > 0:
|
||||||
|
chunks.append(chunk_audio)
|
||||||
|
|
||||||
|
elapsed = time.monotonic() - t0
|
||||||
|
if chunks:
|
||||||
|
audio = np.concatenate(chunks, axis=0)
|
||||||
|
sf.write(str(output_path), audio, SAMPLE_RATE)
|
||||||
|
duration = len(audio) / SAMPLE_RATE
|
||||||
|
print(f" ✓ Saved '{output_path.name}' "
|
||||||
|
f"({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)")
|
||||||
|
else:
|
||||||
|
print(f" ✗ No audio produced for voice='{voice}'")
|
||||||
|
return elapsed
|
||||||
|
|
||||||
|
|
||||||
|
# ── Main ───────────────────────────────────────────────────────────────────────
|
||||||
|
|
||||||
|
def main() -> None:
|
||||||
|
parser = argparse.ArgumentParser(
|
||||||
|
description="Generate 'A Darkness Rising' audiobook, one file per chapter."
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"chapters", nargs="*", type=int,
|
||||||
|
help="Chapter numbers to generate (0 = Prologue). Default: all.",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--list", action="store_true",
|
||||||
|
help="Print detected chapters and exit.",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--voice", default=VOICE,
|
||||||
|
help=f"Kokoro voice to use (default: {VOICE}).",
|
||||||
|
)
|
||||||
|
parser.add_argument(
|
||||||
|
"--preview", nargs="?", const=3000, type=int, metavar="CHARS",
|
||||||
|
help="Generate short preview clips (default: 3000 chars). "
|
||||||
|
"Output filenames get a _preview suffix.",
|
||||||
|
)
|
||||||
|
args = parser.parse_args()
|
||||||
|
|
||||||
|
print("Loading chapters …")
|
||||||
|
all_chapters = load_all_chapters(NOVEL_DIR)
|
||||||
|
|
||||||
|
if args.list:
|
||||||
|
print(f"\nDetected {len(all_chapters)} chapters:\n")
|
||||||
|
print(f" {'#':>4} {'Label':<45} {'Chars':>8} {'Output filename'}")
|
||||||
|
print(f" {'─'*4} {'─'*45} {'─'*8} {'─'*30}")
|
||||||
|
for ch in all_chapters:
|
||||||
|
chars = len(clean_text(ch["text"]))
|
||||||
|
print(f" {ch['num']:>4} {ch['label']:<45} {chars:>8,} {ch['slug']}.wav")
|
||||||
|
return
|
||||||
|
|
||||||
|
# Filter to requested subset
|
||||||
|
if args.chapters:
|
||||||
|
requested = set(args.chapters)
|
||||||
|
run_chapters = [ch for ch in all_chapters if ch["num"] in requested]
|
||||||
|
missing = requested - {ch["num"] for ch in run_chapters}
|
||||||
|
if missing:
|
||||||
|
print(f"⚠ Chapter(s) not found: {sorted(missing)}")
|
||||||
|
else:
|
||||||
|
run_chapters = all_chapters
|
||||||
|
|
||||||
|
if not run_chapters:
|
||||||
|
print("No chapters selected. Use --list to see available chapters.")
|
||||||
|
return
|
||||||
|
|
||||||
|
voice = args.voice
|
||||||
|
device = "cuda" if torch.cuda.is_available() else "cpu"
|
||||||
|
print(f"Device: {device}")
|
||||||
|
if device == "cuda":
|
||||||
|
print(f"GPU: {torch.cuda.get_device_name(0)}")
|
||||||
|
print(f"Voice: {voice}")
|
||||||
|
|
||||||
|
OUTPUT_DIR.mkdir(exist_ok=True)
|
||||||
|
|
||||||
|
# Pre-compute char counts
|
||||||
|
chapter_chars = {ch["num"]: len(clean_text(ch["text"])) for ch in run_chapters}
|
||||||
|
|
||||||
|
preview_note = (f" ⚡ PREVIEW MODE — capped at {args.preview:,} chars/chapter\n"
|
||||||
|
if args.preview else "")
|
||||||
|
print(f"\n{preview_note}{'─'*65}")
|
||||||
|
print(f" {'#':>4} {'Label':<40} {'Chars':>8}")
|
||||||
|
print(f" {'─'*4} {'─'*40} {'─'*8}")
|
||||||
|
for ch in run_chapters:
|
||||||
|
print(f" {ch['num']:>4} {ch['label']:<40} {chapter_chars[ch['num']]:>8,}")
|
||||||
|
print(f" {'─'*55}")
|
||||||
|
total_chars = sum(chapter_chars.values())
|
||||||
|
print(f" {'TOTAL':<45} {total_chars:>8,}\n")
|
||||||
|
|
||||||
|
print("Initialising Kokoro pipeline …")
|
||||||
|
pipeline = KPipeline(lang_code=LANG_CODE)
|
||||||
|
|
||||||
|
chars_per_sec: float | None = None
|
||||||
|
timing_rows: list[tuple[str, int, float]] = []
|
||||||
|
|
||||||
|
for ch in run_chapters:
|
||||||
|
text = clean_text(ch["text"])
|
||||||
|
if not text:
|
||||||
|
print(f"\n[{ch['label']}] ⚠ Empty text — skipping")
|
||||||
|
continue
|
||||||
|
|
||||||
|
preview_chars = args.preview
|
||||||
|
if preview_chars and len(text) > preview_chars:
|
||||||
|
cut = text.rfind(" ", 0, preview_chars)
|
||||||
|
text = text[: cut if cut > 0 else preview_chars]
|
||||||
|
|
||||||
|
chars = len(text)
|
||||||
|
preview_tag = "_preview" if args.preview else ""
|
||||||
|
out_path = OUTPUT_DIR / f"{ch['slug']}{preview_tag}.wav"
|
||||||
|
|
||||||
|
if chars_per_sec is not None:
|
||||||
|
eta_str = _fmt_duration(chars / chars_per_sec)
|
||||||
|
print(f"\n[{ch['label']}] voice={voice} → {out_path.name} (est. {eta_str})")
|
||||||
|
else:
|
||||||
|
print(f"\n[{ch['label']}] voice={voice} → {out_path.name} (calibration run)")
|
||||||
|
|
||||||
|
elapsed = generate_audio(pipeline, text, voice, out_path)
|
||||||
|
timing_rows.append((ch["label"], chars, elapsed))
|
||||||
|
|
||||||
|
total_done = sum(c for _, c, _ in timing_rows)
|
||||||
|
total_elapsed_done = sum(e for _, _, e in timing_rows)
|
||||||
|
if total_elapsed_done > 0:
|
||||||
|
chars_per_sec = total_done / total_elapsed_done
|
||||||
|
remaining = total_chars - total_done
|
||||||
|
eta_overall = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
|
||||||
|
print(f" ⏱ Speed: {chars_per_sec:.0f} chars/sec | Est. overall remaining: {eta_overall}")
|
||||||
|
|
||||||
|
# Summary
|
||||||
|
print("\n" + "─" * 65)
|
||||||
|
print(f" {'Chapter':<35} {'Chars':>7} {'Actual':>8} {'Est':>8}")
|
||||||
|
print("─" * 65)
|
||||||
|
for i, (label, chars, elapsed) in enumerate(timing_rows):
|
||||||
|
actual_str = _fmt_duration(elapsed)
|
||||||
|
prior_chars = sum(c for _, c, _ in timing_rows[:i])
|
||||||
|
prior_elapsed = sum(e for _, _, e in timing_rows[:i])
|
||||||
|
if prior_elapsed > 0:
|
||||||
|
est_str = _fmt_duration(chars / (prior_chars / prior_elapsed))
|
||||||
|
else:
|
||||||
|
est_str = "(first)"
|
||||||
|
print(f" {label:<35} {chars:>7,} {actual_str:>8} {est_str:>8}")
|
||||||
|
total_elapsed = sum(e for _, _, e in timing_rows)
|
||||||
|
print("─" * 65)
|
||||||
|
print(f" {'TOTAL':<35} {sum(c for _,c,_ in timing_rows):>7,} "
|
||||||
|
f"{_fmt_duration(total_elapsed):>8}")
|
||||||
|
print("\nDone.")
|
||||||
|
|
||||||
|
|
||||||
|
if __name__ == "__main__":
|
||||||
|
main()
|
||||||
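As a quick aside (not part of the diff above): the chapter-heading detection in this file hinges entirely on `_HEADING_RE`, and its behaviour is easy to check in isolation. The sketch below copies the same pattern and classifies a few illustrative lines; the `classify` helper is hypothetical, written only for this demonstration.

```python
import re

# Same pattern as _HEADING_RE in create_audiobook_lightbringer.py above.
_HEADING_RE = re.compile(
    r"^(?:Chapter\s+(\d+)\s*(?:-\s*(.+))?|(Prologue))\s*$",
    re.IGNORECASE,
)

def classify(line: str):
    """Return (num, subtitle) for a heading line, or None for body text."""
    m = _HEADING_RE.match(line.strip())
    if not m:
        return None
    if m.group(3):  # "Prologue" alternative matched
        return (0, "Prologue")
    return (int(m.group(1)), (m.group(2) or "").strip())

print(classify("Prologue"))                  # (0, 'Prologue')
print(classify("Chapter 1 - Homecoming"))    # (1, 'Homecoming')
print(classify("chapter 12"))                # (12, '')  — case-insensitive, no subtitle
print(classify("He opened Chapter 3 of the book."))  # None — anchors reject mid-sentence mentions
```

The `^…$` anchors are what keep a body sentence that merely mentions "Chapter 3" from being mistaken for a heading.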
@@ -4,13 +4,19 @@ audiobook_nem.py
 Generate the Book of the Nem audiobook — one unique voice per book/section.
 
 Usage:
-    python audiobook_nem.py
+    python create_audiobook_nem.py                                # all enabled books
+    python create_audiobook_nem.py --list                         # list available book labels
+    python create_audiobook_nem.py Introduction
+    python create_audiobook_nem.py "Book of Hagoth"
+    python create_audiobook_nem.py Introduction "Book of Hagoth"
 
-To skip a section, comment out its entry in BOOKS below.
+To permanently skip a section, comment out its entry in BOOKS below.
 Output .wav files are written to OUTPUT_DIR (created automatically).
 """
 
+import argparse
 import re
+import time
 import numpy as np
 import soundfile as sf
 import torch
@@ -27,8 +33,12 @@ SPEED = 1.0
 LANG_CODE = "a"  # 'a' = American English
 
 # ── Available Kokoro voices (American English, lang_code='a') ──────────────────
-# af_heart   – warm American female   [downloaded]
+# af_bella   – American female        [downloaded]
+# af_heart   – warm American female   [downloaded]
 # af_nicole  – American female        [downloaded]
+# af_river   – American female        [downloaded]
+# af_sarah   – American female        [downloaded]
+# af_sky     – American female        [downloaded]
 # am_adam    – American male (deep)   [downloaded]
 # am_echo    – American male          [downloaded]
 # am_eric    – American male          [downloaded]
@@ -40,30 +50,30 @@ LANG_CODE = "a"  # 'a' = American English
 # am_santa   – American male          [downloaded] (not used)
 
 # ── Book definitions ───────────────────────────────────────────────────────────
-# Format: (label, start_marker, voice, output_wav)
-#   start_marker – exact text of the FIRST line of the section header in the source
-#                  (leading/trailing whitespace is ignored when matching)
+# Format: (label, (start_line1, start_line2), voice, output_wav)
+#   start_line1  – exact text of the FIRST line of the section header
+#   start_line2  – prefix of the SECOND line (used together for unambiguous matching)
 #   voice        – Kokoro voice name
 #   output_wav   – filename saved inside OUTPUT_DIR
 #
 # Comment out any line to skip that section entirely.
 BOOKS = [
-    # label                     start_marker                    voice          output_wav
-    ("Introduction",            "Introduction",                 "af_heart",    "00_introduction.wav"),
-    ("Book of Hagoth",          "THE BOOK OF HAGOTH",           "am_fenrir",   "01_hagoth.wav"),
-    ("Shi-Tugo I",              "THE FIRST BOOK OF SHI-TUGO",   "am_eric",     "02_shi_tugo_1.wav"),
-    ("Sanempet",                "THE BOOK OF SANEMPET",         "am_liam",     "03_sanempet.wav"),
-    ("Oug",                     "THE BOOK OF OUG",              "am_michael",  "04_oug.wav"),
-    ("Temple Writings of Oug",  "THE BOOK OF",                  "am_michael",  "05_temple_writings_oug.wav"),
-    ("Sacred Temple Writings",  "THE SACRED",                   "am_michael",  "06_sacred_temple_writings.wav"),
-    ("Samuel the Lamanite I",   "THE FIRST BOOK",               "am_echo",     "07_samuel_lamanite_1.wav"),
-    ("Samuel the Lamanite II",  "THE SECOND BOOK",              "am_echo",     "08_samuel_lamanite_2.wav"),
-    ("Manti",                   "THE BOOK OF MANTI",            "am_onyx",     "09_manti.wav"),
-    ("Pa Nat I",                "THE FIRST BOOK OF PA NAT",     "af_nicole",   "10_pa_nat_1.wav"),
-    ("Moroni I",                "THE FIRST BOOK OF MORONI",     "am_adam",     "11_moroni_1.wav"),
-    ("Moroni II",               "THE SECOND BOOK OF MORONI",    "am_adam",     "12_moroni_2.wav"),
-    ("Moroni III",              "THE THIRD BOOK OF MORONI",     "am_adam",     "13_moroni_3.wav"),
-    ("Shioni",                  "THE BOOK OF SHIONI",           "am_puck",     "14_shioni.wav"),
+    # label                     (start_line1, start_line2)                                   voice          output_wav
+    ("Introduction",            ("Introduction", "The Book of the Nem"),                     "af_heart",    "00_introduction.wav"),
+    ("Book of Hagoth",          ("THE BOOK OF HAGOTH", "THE SON OF HAGMENI,"),               "am_santa",    "01_hagoth.wav"),
+    ("Shi-Tugo I",              ("THE FIRST BOOK OF SHI-TUGO", "FORMER WARRIOR, AMMONITE"),  "am_eric",     "02_shi_tugo_1.wav"),
+    ("Sanempet",                ("THE BOOK OF SANEMPET", "THE SON OF HAGMENI,"),             "am_liam",     "03_sanempet.wav"),
+    ("Oug",                     ("THE BOOK OF OUG", "THE SON OF SANEMPET"),                  "am_michael",  "04_oug.wav"),
+    ("Temple Writings of Oug",  ("THE BOOK OF", "THE TEMPLE WRITINGS"),                      "am_michael",  "05_temple_writings_oug.wav"),
+    ("Sacred Temple Writings",  ("THE SACRED", "TEMPLE WRITINGS"),                           "am_michael",  "06_sacred_temple_writings.wav"),
+    ("Samuel the Lamanite I",   ("THE FIRST BOOK", "OF SAMUEL THE LAMANITE"),                "am_echo",     "07_samuel_lamanite_1.wav"),
+    ("Samuel the Lamanite II",  ("THE SECOND BOOK", "OF SAMUEL THE LAMANITE"),               "am_echo",     "08_samuel_lamanite_2.wav"),
+    ("Manti",                   ("THE BOOK OF MANTI", "THE SON OF OUG"),                     "am_onyx",     "09_manti.wav"),
+    ("Pa Nat I",                ("THE FIRST BOOK OF PA NAT", "THE DAUGHTER OF SHIMLEI"),     "af_bella",    "10_pa_nat_1.wav"),
+    ("Moroni I",                ("THE FIRST BOOK OF MORONI", "THE SON OF MORMON,"),          "am_adam",     "11_moroni_1.wav"),
+    ("Moroni II",               ("THE SECOND BOOK OF MORONI", "THE SON OF MORMON,"),         "am_adam",     "12_moroni_2.wav"),
+    ("Moroni III",              ("THE THIRD BOOK OF MORONI", "THE SON OF MORMON,"),          "am_adam",     "13_moroni_3.wav"),
+    ("Shioni",                  ("THE BOOK OF SHIONI", "THE SON OF MORONI"),                 "am_puck",     "14_shioni.wav"),
 ]
 
 # ── Helpers ────────────────────────────────────────────────────────────────────
@@ -71,23 +81,36 @@ BOOKS = [
 def load_and_split(source: Path, books: list) -> dict[str, str]:
     """
     Read the source file and split it into sections keyed by label.
-    Each section starts at its start_marker line and ends just before the
-    next section's start_marker.
+    Each section starts at its (start_line1, start_line2) marker pair and
+    ends just before the next section's marker.
+
+    Marker positions are always detected from the *original* unmodified file
+    (_ORIG_FILE) when it exists, so that phonetic fixes applied to section
+    headings in the TTS-fixed file can never break section detection. The
+    line numbers are identical in both files because word-level replacements
+    never add or remove lines.
     """
-    raw_lines = source.read_text(encoding="utf-8").splitlines()
+    # Use the original (un-fixed) file for marker detection so phonetic
+    # changes to heading lines don't break matching.
+    marker_source = _ORIG_FILE if _ORIG_FILE.exists() else source
+    marker_lines = marker_source.read_text(encoding="utf-8").splitlines()
 
-    # Build a mapping: marker_text → index in BOOKS
-    markers = [(label, marker.strip()) for label, marker, _, _ in books]
+    # The content to actually return comes from `source` (may be fixed file).
+    content_lines = source.read_text(encoding="utf-8").splitlines()
 
-    # Find the line index of each marker's first occurrence
+    # Build a mapping: (label, line1, line2) for each book
+    markers = [(label, m[0].strip(), m[1].strip()) for label, m, _, _ in books]
+
+    # Find the line index of each marker's first occurrence (two-line match)
     marker_positions: list[tuple[int, int]] = []  # (line_idx, books_idx)
-    for book_idx, (label, marker) in enumerate(markers):
-        for line_idx, line in enumerate(raw_lines):
-            if line.strip() == marker:
+    for book_idx, (label, m1, m2) in enumerate(markers):
+        for line_idx, line in enumerate(marker_lines[:-1]):
+            if (line.strip().upper() == m1.upper() and
+                    marker_lines[line_idx + 1].strip().upper().startswith(m2.upper())):
                 marker_positions.append((line_idx, book_idx))
                 break
         else:
-            print(f"  ⚠ Marker not found for '{label}': '{marker}' — skipping")
+            print(f"  ⚠ Marker not found for '{label}': '{m1}' / '{m2}' — skipping")
 
     marker_positions.sort(key=lambda x: x[0])
 
@@ -97,8 +120,8 @@ def load_and_split(source: Path, books: list) -> dict[str, str]:
         if rank + 1 < len(marker_positions):
             end_line = marker_positions[rank + 1][0]
         else:
-            end_line = len(raw_lines)
-        text = "\n".join(raw_lines[line_idx:end_line]).strip()
+            end_line = len(content_lines)
+        text = "\n".join(content_lines[line_idx:end_line]).strip()
         sections[label] = text
 
     return sections
@@ -118,8 +141,21 @@ def clean_text(text: str) -> str:
     return text.strip()
 
 
+def _fmt_duration(seconds: float) -> str:
+    """Format seconds as 'Xh Ym Zs', 'Xm Ys', or 'Xs'."""
+    h, rem = divmod(int(seconds), 3600)
+    m, s = divmod(rem, 60)
+    if h > 0:
+        return f"{h}h {m:02d}m {s:02d}s"
+    if m > 0:
+        return f"{m}m {s:02d}s"
+    return f"{s}s"
+
+
 def generate_audio(pipeline: KPipeline, text: str, voice: str,
-                   output_path: Path) -> None:
+                   output_path: Path) -> float:
+    """Generate audio and return wall-clock seconds elapsed."""
+    t0 = time.monotonic()
     chunks = []
     for _, _, chunk_audio in pipeline(text, voice=voice, speed=SPEED):
        if hasattr(chunk_audio, "numpy"):
@@ -131,15 +167,55 @@ def generate_audio(pipeline: KPipeline, text: str, voice: str,
     if chunks:
         audio = np.concatenate(chunks, axis=0)
         sf.write(str(output_path), audio, SAMPLE_RATE)
+        elapsed = time.monotonic() - t0
         duration = len(audio) / SAMPLE_RATE
-        print(f"  ✓ Saved '{output_path.name}' ({duration:.1f}s)")
+        print(f"  ✓ Saved '{output_path.name}' ({_fmt_duration(duration)} audio | {_fmt_duration(elapsed)} wall-clock)")
     else:
+        elapsed = time.monotonic() - t0
         print(f"  ✗ No audio produced for voice='{voice}'")
+    return elapsed
 
 
 # ── Main ───────────────────────────────────────────────────────────────────────
 
 def main() -> None:
+    # ── CLI ────────────────────────────────────────────────────────────
+    parser = argparse.ArgumentParser(description="Generate Nem audiobook sections.")
+    parser.add_argument(
+        "books", nargs="*",
+        help="Labels of sections to generate (default: all enabled books). "
+             "Use --list to see available labels."
+    )
+    parser.add_argument(
+        "--list", action="store_true",
+        help="Print all enabled book labels and exit."
+    )
+    parser.add_argument(
+        "--preview", nargs="?", const=3000, type=int, metavar="CHARS",
+        help="Generate a short preview clip per book (default: 3000 chars). "
+             "Output filenames get a _preview suffix."
+    )
+    args = parser.parse_args()
+
+    enabled_labels = [label for label, _, _, _ in BOOKS]
+
+    if args.list:
+        print("Enabled books:")
+        for label in enabled_labels:
+            print(f"  {label}")
+        return
+
+    # Filter to requested subset, preserving BOOKS order
+    if args.books:
+        unknown = [b for b in args.books if b not in enabled_labels]
+        if unknown:
+            print(f"Unknown book label(s): {', '.join(unknown)}")
+            print(f"Run with --list to see available labels.")
+            return
+        run_books = [b for b in BOOKS if b[0] in args.books]
+    else:
+        run_books = list(BOOKS)
+
     device = "cuda" if torch.cuda.is_available() else "cpu"
     print(f"Device: {device}")
     if device == "cuda":
|
|||||||
print(f"\nSource: '{SOURCE_FILE}'"
|
print(f"\nSource: '{SOURCE_FILE}'"
|
||||||
+ (" ✓ (TTS fixed)" if SOURCE_FILE == _FIXED_FILE else
|
+ (" ✓ (TTS fixed)" if SOURCE_FILE == _FIXED_FILE else
|
||||||
" ⚠ (original — run 'Apply Fixes to Text' in the GUI to use phonetic fixes)"))
|
" ⚠ (original — run 'Apply Fixes to Text' in the GUI to use phonetic fixes)"))
|
||||||
|
# Always split using ALL books for correct section boundaries,
|
||||||
|
# but only generate for run_books.
|
||||||
sections = load_and_split(SOURCE_FILE, BOOKS)
|
sections = load_and_split(SOURCE_FILE, BOOKS)
|
||||||
print(f" Found {len(sections)} sections.\n")
|
print(f" Found {len(sections)} sections ({len(run_books)} selected).\n")
|
||||||
|
|
||||||
print("Initialising Kokoro pipeline …")
|
print("Initialising Kokoro pipeline …")
|
||||||
pipeline = KPipeline(lang_code=LANG_CODE)
|
pipeline = KPipeline(lang_code=LANG_CODE)
|
||||||
|
|
||||||
for label, marker, voice, wav_name in BOOKS:
|
# Pre-compute char counts for all sections so we can estimate ETAs
|
||||||
if label not in sections:
|
section_chars: dict[str, int] = {
|
||||||
continue # marker was not found; warning already printed
|
label: len(clean_text(sections[label]))
|
||||||
|
for label, _, _, _ in run_books
|
||||||
|
if label in sections
|
||||||
|
}
|
||||||
|
|
||||||
print(f"\n[{label}] voice={voice} → {wav_name}")
|
# Print char count summary before starting
|
||||||
text = clean_text(sections[label])
|
preview_note = f" ⚡ PREVIEW MODE — capped at {args.preview:,} chars/book\n" if args.preview else ""
|
||||||
if not text:
|
print(f"\n{preview_note}{'─' * 52}")
|
||||||
print(" ⚠ Empty text — skipping")
|
print(f" {'Section':<30} {'Chars':>8}")
|
||||||
|
print(f"{'─' * 52}")
|
||||||
|
for label, _, _, wav_name in run_books:
|
||||||
|
if label in section_chars:
|
||||||
|
print(f" {label:<30} {section_chars[label]:>8,}")
|
||||||
|
print(f"{'─' * 52}")
|
||||||
|
total_chars = sum(section_chars.values())
|
||||||
|
print(f" {'TOTAL':<30} {total_chars:>8,}")
|
||||||
|
print()
|
||||||
|
|
||||||
|
chars_per_sec: float | None = None # derived from the first book that finishes
|
||||||
|
timing_rows: list[tuple[str, int, float]] = [] # (label, chars, elapsed)
|
||||||
|
|
||||||
|
for label, _marker, voice, wav_name in run_books:
|
||||||
|
if label not in sections:
|
||||||
continue
|
continue
|
||||||
|
|
||||||
out_path = OUTPUT_DIR / wav_name
|
text = clean_text(sections[label])
|
||||||
generate_audio(pipeline, text, voice, out_path)
|
if not text:
|
||||||
|
print(f"\n[{label}] ⚠ Empty text — skipping")
|
||||||
|
continue
|
||||||
|
|
||||||
|
# Preview mode: truncate to requested char limit at a word boundary
|
||||||
|
preview_chars = args.preview
|
||||||
|
if preview_chars:
|
||||||
|
if len(text) > preview_chars:
|
||||||
|
cut = text.rfind(" ", 0, preview_chars)
|
||||||
|
text = text[: cut if cut > 0 else preview_chars]
|
||||||
|
|
||||||
|
chars = len(text)
|
||||||
|
|
||||||
|
# Print ETA once we have a calibration rate
|
||||||
|
if chars_per_sec is not None:
|
||||||
|
eta_sec = chars / chars_per_sec
|
||||||
|
eta_str = _fmt_duration(eta_sec)
|
||||||
|
print(f"\n[{label}] voice={voice} → {wav_name} (est. {eta_str})")
|
||||||
|
else:
|
||||||
|
print(f"\n[{label}] voice={voice} → {wav_name} (timing calibration run)")
|
||||||
|
|
||||||
|
stem, ext = wav_name.rsplit(".", 1)
|
||||||
|
preview_tag = "_preview" if preview_chars else ""
|
||||||
|
out_path = OUTPUT_DIR / f"{stem}_{voice}{preview_tag}.{ext}"
|
||||||
|
elapsed = generate_audio(pipeline, text, voice, out_path)
|
||||||
|
timing_rows.append((label, chars, elapsed))
|
||||||
|
|
||||||
|
# Update calibration as a cumulative average after every book
|
||||||
|
total_chars_done = sum(c for _, c, _ in timing_rows)
|
||||||
|
total_elapsed_done = sum(e for _, _, e in timing_rows)
|
||||||
|
if total_elapsed_done > 0:
|
||||||
|
chars_per_sec = total_chars_done / total_elapsed_done
|
||||||
|
remaining = total_chars - total_chars_done
|
||||||
|
eta_overall = _fmt_duration(remaining / chars_per_sec) if remaining > 0 else "0s"
|
||||||
|
print(f" ⏱ Speed: {chars_per_sec:.0f} chars/sec | Est. overall remaining: {eta_overall}")
|
||||||
|
|
||||||
|
# ── Summary ────────────────────────────────────────────────────────────────
|
||||||
|
print("\n" + "─" * 60)
|
||||||
|
print(f" {'Section':<30} {'Chars':>7} {'Actual':>8} {'Est':>8}")
|
||||||
|
print("─" * 60)
|
||||||
|
for i, (label, chars, elapsed) in enumerate(timing_rows):
|
||||||
|
actual_str = _fmt_duration(elapsed)
|
||||||
|
# Estimate using the cumulative rate *before* this book was added
|
||||||
|
prior_chars = sum(c for _, c, _ in timing_rows[:i])
|
||||||
|
prior_elapsed = sum(e for _, _, e in timing_rows[:i])
|
||||||
|
if prior_elapsed > 0:
|
||||||
|
est_str = _fmt_duration(chars / (prior_chars / prior_elapsed))
|
||||||
|
else:
|
||||||
|
est_str = "(first run)"
|
||||||
|
print(f" {label:<30} {chars:>7,} {actual_str:>8} {est_str:>8}")
|
||||||
|
total_elapsed = sum(e for _, _, e in timing_rows)
|
||||||
|
print("─" * 60)
|
||||||
|
print(f" {'TOTAL':<30} {sum(c for _,c,_ in timing_rows):>7,} {_fmt_duration(total_elapsed):>8}")
|
||||||
print("\nDone.")
|
print("\nDone.")
|
||||||
|
|
||||||
|
|
||||||
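The running ETA added in the loop above is just a cumulative chars-per-second average over all books finished so far. A minimal sketch of the same arithmetic with made-up timings (the `update_eta` helper and the numbers are illustrative only):

```python
def update_eta(timing_rows, total_chars):
    """Return (chars_per_sec, remaining_seconds) from (label, chars, elapsed)
    rows, mirroring the cumulative-average calibration in main() above."""
    done_chars = sum(c for _, c, _ in timing_rows)
    done_elapsed = sum(e for _, _, e in timing_rows)
    if done_elapsed <= 0:
        return None, None  # nothing finished yet: no rate to calibrate from
    rate = done_chars / done_elapsed
    remaining = max(total_chars - done_chars, 0)
    return rate, remaining / rate

# Two books finished: 12,000 chars in 60 s, then 18,000 chars in 90 s.
rows = [("Intro", 12_000, 60.0), ("Hagoth", 18_000, 90.0)]
rate, eta = update_eta(rows, total_chars=60_000)
print(rate)  # 200.0  chars/sec (30,000 chars / 150 s)
print(eta)   # 150.0  seconds left for the remaining 30,000 chars
```

Averaging over all completed books, rather than only the most recent one, smooths out per-book variation in synthesis speed.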
|
|||||||
352
create_temple_voices.py
Normal file
352
create_temple_voices.py
Normal file
@ -0,0 +1,352 @@
|
|||||||
|
"""
|
||||||
|
create_temple_voices.py
|
||||||
|
────────────────────────
|
||||||
|
Generate the "Sacred Temple Writings" section of the Nem audiobook using one
|
||||||
|
distinct Microsoft Edge neural TTS voice per character (NOT Kokoro).
|
||||||
|
|
||||||
|
Uses the free edge-tts library which streams Microsoft Azure neural voices.
|
||||||
|
Audio is stitched into a single WAV and saved to OUTPUT_DIR.
|
||||||
|
|
||||||
|
Usage:
|
||||||
|
python create_temple_voices.py # full render
|
||||||
|
python create_temple_voices.py --preview 40       # first 40 segments only
python create_temple_voices.py --print-segments   # inspect parsed segments
python create_temple_voices.py --list-voices      # list available en voices

Voice assignments live in CHARACTER_VOICES below — easy to customise.
Run --list-voices to discover all available edge-tts voice names.
"""

import argparse
import asyncio
import re
import subprocess
import time
from collections import Counter
from pathlib import Path

import numpy as np
import soundfile as sf
import edge_tts

# ── File / output config ───────────────────────────────────────────────────────
_FIXED_FILE = Path("Audio Master Nem Full (TTS Fixed).txt")
_ORIG_FILE = Path("Audio Master Nem Full.txt")
SOURCE_FILE = _FIXED_FILE if _FIXED_FILE.exists() else _ORIG_FILE

OUTPUT_DIR = Path("output_temple_voices")
OUTPUT_FILE = "sacred_temple_writings_multivoice.wav"

SAMPLE_RATE = 24_000   # Hz — final WAV sample rate
PAUSE_SAME = 350       # ms silence between same-speaker segments
PAUSE_CHANGE = 650     # ms silence between different-speaker segments

# ── Section boundary markers (match create_audiobook_nem.py BOOKS order) ──────
# Sacred Temple Writings starts at "THE SACRED" / "TEMPLE WRITINGS"
# and ends just before "THE FIRST BOOK" / "OF SAMUEL THE LAMANITE"
_SEC_START_L1 = "THE SACRED"
_SEC_START_L2 = "TEMPLE WRITINGS"
_SEC_END_L1 = "THE FIRST BOOK"
_SEC_END_L2 = "OF SAMUEL THE LAMANITE"

# ── Character → edge-tts voice ────────────────────────────────────────────────
# Run `python create_temple_voices.py --list-voices` to see all available voices.
# Keys must match the speaker labels exactly as they appear in the source file.
CHARACTER_VOICES: dict[str, str] = {
    # ── Celestial beings ──────────────────────────────────────────────────────
    "Narrator": "en-US-GuyNeural",                    # calm neutral narrator
    "Elohim Heavenly Mother": "en-US-JennyNeural",    # warm, wise matriarch
    "Elohim Heavenly Father": "en-US-AndrewMultilingualNeural",  # expressive, authoritative
    "Jehovah": "en-US-AndrewNeural",                  # clear, gentle divine
    "Angel of the Lord": "en-US-BrianNeural",         # ethereal divine messenger
    "Holy Ghost": "en-US-EricNeural",                 # quiet, inward, spiritual
    "Holy Ghost Elders": "en-US-BrianNeural",         # measured elder council

    # ── Dark beings ───────────────────────────────────────────────────────────
    "Lucifer": "en-CA-LiamNeural",                    # smooth, persuasive tempter
    "Satan": "en-US-SteffanNeural",                   # cold, commanding adversary

    # ── Mortal / earth characters ─────────────────────────────────────────────
    "Michael": "en-US-RogerNeural",                   # noble warrior archangel
    "Adam": "en-US-ChristopherNeural",                # earnest first man
    "Eve": "en-US-AriaNeural",                        # curious, warm first woman

    # ── Apostles ──────────────────────────────────────────────────────────────
    "Peter": "en-GB-RyanNeural",                      # firm British apostle
    "James": "en-AU-WilliamMultilingualNeural",       # steady Australian voice
    "John": "en-IE-ConnorNeural",                     # gentle Irish apostle

    # ── Other roles ───────────────────────────────────────────────────────────
    "Preacher": "en-US-AvaNeural",                    # bold emphatic preacher
    "Mob": "en-US-MichelleNeural",                    # crowd / multitude voice
    "The Voice of the Mob": "en-US-MichelleNeural",   # alias used in some editions
}

# Voice used when a speaker label isn't found in CHARACTER_VOICES
FALLBACK_VOICE = "en-US-GuyNeural"

# Lines/patterns that are ceremony stage-directions → read by Narrator
_STAGE_NARRATOR = re.compile(
    r"^(Break for Instruction|Resume Session|All\s+arise|"
    r"CHAPTER\s*\d*|________________+|────+)",
    re.IGNORECASE,
)

# Lines to skip entirely (decorative / empty)
_SKIP_RE = re.compile(r"^[—\-_\s\u2014\u2013]*$")


# ── Section extraction ─────────────────────────────────────────────────────────

def extract_section(source: Path) -> str:
    """Return text of the Sacred Temple Writings section."""
    lines = source.read_text(encoding="utf-8").splitlines()
    in_sec = False
    out: list[str] = []

    for i, line in enumerate(lines):
        s = line.strip()
        if not in_sec:
            if (s.upper() == _SEC_START_L1 and
                    i + 1 < len(lines) and
                    lines[i + 1].strip().upper().startswith(_SEC_START_L2)):
                in_sec = True
        else:
            # End just before the next section
            if (s.upper() == _SEC_END_L1 and
                    i + 1 < len(lines) and
                    lines[i + 1].strip().upper().startswith(_SEC_END_L2)):
                break
            out.append(line)

    if not out:
        raise RuntimeError(
            f"Could not locate 'Sacred Temple Writings' in '{source}'.\n"
            "Ensure the source file has a line exactly matching "
            f"'{_SEC_START_L1}' followed by '{_SEC_START_L2}'."
        )
    return "\n".join(out)


# ── Segment parser ─────────────────────────────────────────────────────────────

def _speaker_regex(characters: list[str]) -> re.Pattern:
    """Regex matching `[optional-number] CharacterName: text`."""
    # Sort longest-first so "Holy Ghost Elders" matches before "Holy Ghost"
    names = sorted(characters, key=len, reverse=True)
    pat = "|".join(re.escape(n) for n in names)
    return re.compile(r"^\d*\s*(" + pat + r")\s*:\s*(.*)", re.IGNORECASE)
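The longest-first sort matters because alternation in Python regexes matches the first alternative that succeeds, so "Holy Ghost" would otherwise shadow "Holy Ghost Elders". A standalone sketch of the same builder, exercised on a typical attributed line:

```python
import re

def speaker_regex(characters: list[str]) -> re.Pattern:
    # Longest-first so "Holy Ghost Elders" wins over its "Holy Ghost" prefix
    names = sorted(characters, key=len, reverse=True)
    pat = "|".join(re.escape(n) for n in names)
    return re.compile(r"^\d*\s*(" + pat + r")\s*:\s*(.*)", re.IGNORECASE)

rx = speaker_regex(["Holy Ghost", "Holy Ghost Elders"])
m = rx.match("12 Holy Ghost Elders: Hearken unto us.")
print(m.group(1), "|", m.group(2))  # Holy Ghost Elders | Hearken unto us.
```

The optional leading `\d*` is what lets verse-numbered dialogue lines ("12 Jehovah: …") still resolve to a speaker.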

def parse_segments(text: str) -> list[tuple[str, str]]:
    """
    Convert section text into a list of (normalised_speaker, spoken_text) tuples.
    Non-attributed prose becomes Narrator lines.
    """
    char_re = _speaker_regex(list(CHARACTER_VOICES.keys()))

    # Build a quick lowercase→canonical lookup for speaker name normalisation
    canon: dict[str, str] = {k.lower(): k for k in CHARACTER_VOICES}

    segments: list[tuple[str, str]] = []
    cur_speaker = "Narrator"
    buf: list[str] = []

    def flush() -> None:
        combined = " ".join(l.strip() for l in buf if l.strip())
        if combined:
            segments.append((cur_speaker, combined))
        buf.clear()

    for raw in text.splitlines():
        line = raw.strip()

        if not line or _SKIP_RE.match(line):
            continue

        # Stage direction → Narrator reads it
        if _STAGE_NARRATOR.match(line):
            flush()
            cur_speaker = "Narrator"
            buf.append(line)
            continue

        # "The words of Jehovah … are in blue." — formatting note, skip
        if re.search(r"are in blue|words of jehovah", line, re.IGNORECASE):
            continue

        m = char_re.match(line)
        if m:
            flush()
            raw_name = m.group(1)
            cur_speaker = canon.get(raw_name.lower(), raw_name)
            spoken = m.group(2).strip()
            if spoken:
                buf.append(spoken)
        else:
            # Continuation of current speaker (or unattributed narrator prose)
            buf.append(line)

    flush()
    return segments


# ── Audio generation ───────────────────────────────────────────────────────────

async def _tts_bytes(text: str, voice: str) -> bytes:
    """Stream edge-tts and return raw MP3 bytes."""
    communicate = edge_tts.Communicate(text, voice)
    data = bytearray()
    async for chunk in communicate.stream():
        if chunk["type"] == "audio":
            data.extend(chunk["data"])
    return bytes(data)


def _mp3_to_numpy(mp3: bytes) -> np.ndarray:
    """Decode MP3 bytes → mono float32 numpy array at SAMPLE_RATE using ffmpeg."""
    cmd = [
        "ffmpeg", "-hide_banner", "-loglevel", "error",
        "-i", "pipe:0",           # read MP3 from stdin
        "-f", "f32le",            # raw 32-bit little-endian float PCM
        "-acodec", "pcm_f32le",
        "-ac", "1",               # mono
        "-ar", str(SAMPLE_RATE),  # resample to target rate
        "pipe:1",                 # write PCM to stdout
    ]
    result = subprocess.run(cmd, input=mp3, capture_output=True, check=True)
    return np.frombuffer(result.stdout, dtype=np.float32).copy()


def _silence(ms: int) -> np.ndarray:
    return np.zeros(int(SAMPLE_RATE * ms / 1000), dtype=np.float32)
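The pause constants translate into sample counts with the same arithmetic `_silence` uses. A standalone sketch of that conversion, assuming the 24 kHz rate configured above:

```python
import numpy as np

SAMPLE_RATE = 24_000  # Hz, as configured above

def silence(ms: int) -> np.ndarray:
    # ms → number of zero-valued samples at SAMPLE_RATE
    return np.zeros(int(SAMPLE_RATE * ms / 1000), dtype=np.float32)

gap = silence(350)            # same-speaker pause
print(gap.shape, gap.dtype)   # (8400,) float32
```

So a 350 ms same-speaker gap is 8 400 zero samples and a 650 ms speaker change is 15 600, which is why the gaps can simply be concatenated between decoded segments.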

async def render(
    segments: list[tuple[str, str]],
    preview: int | None = None,
) -> np.ndarray:
    """Generate and stitch all segment audio; return concatenated float32 array."""
    if preview is not None:
        segments = segments[:preview]

    parts: list[np.ndarray] = []
    last_speaker: str | None = None
    t0 = time.monotonic()

    for idx, (speaker, text) in enumerate(segments, 1):
        voice = CHARACTER_VOICES.get(speaker, FALLBACK_VOICE)
        marker = "⚠" if speaker not in CHARACTER_VOICES else " "
        print(f" {marker}[{idx:>4}/{len(segments)}] {speaker:<28} {voice}")

        try:
            mp3 = await _tts_bytes(text, voice)
        except Exception as exc:
            print(f" ↳ ERROR with '{voice}': {exc} — falling back to {FALLBACK_VOICE}")
            mp3 = await _tts_bytes(text, FALLBACK_VOICE)

        audio = _mp3_to_numpy(mp3)

        if parts:
            gap = PAUSE_SAME if speaker == last_speaker else PAUSE_CHANGE
            parts.append(_silence(gap))
        parts.append(audio)
        last_speaker = speaker

    elapsed = time.monotonic() - t0
    print(f"\n ✓ {len(segments)} segments in {elapsed:.0f}s")
    return np.concatenate(parts) if parts else np.array([], dtype=np.float32)


# ── Voice listing ──────────────────────────────────────────────────────────────

async def _list_voices_async() -> None:
    voices = await edge_tts.list_voices()
    english = sorted(
        (v for v in voices if v["Locale"].startswith("en-")),
        key=lambda v: (v["Locale"], v["ShortName"]),
    )
    print(f"\n {'Locale':<12} {'Short Name':<45} Gender")
    print(" " + "─" * 68)
    for v in english:
        print(f" {v['Locale']:<12} {v['ShortName']:<45} {v['Gender']}")
    print(f"\n {len(english)} English voices total.")


# ── CLI / main ─────────────────────────────────────────────────────────────────

def main() -> None:
    ap = argparse.ArgumentParser(
        description="Render Sacred Temple Writings with per-character edge-tts voices."
    )
    ap.add_argument("--list-voices", action="store_true",
                    help="Print all available English edge-tts voices and exit.")
    ap.add_argument("--print-segments", action="store_true",
                    help="Print parsed (speaker, text) segments and exit.")
    ap.add_argument("--preview", type=int, metavar="N",
                    help="Render only the first N segments (quick test).")
    args = ap.parse_args()

    if args.list_voices:
        asyncio.run(_list_voices_async())
        return

    # ── Extract & parse ────────────────────────────────────────────────────────
    print(f"Source : {SOURCE_FILE}")
    text = extract_section(SOURCE_FILE)
    print(f"Section: {len(text):,} chars extracted\n")

    segments = parse_segments(text)

    if args.print_segments:
        print(f"Parsed {len(segments)} segments:\n")
        for i, (spkr, txt) in enumerate(segments, 1):
            snippet = txt[:90] + ("…" if len(txt) > 90 else "")
            voice = CHARACTER_VOICES.get(spkr, f"{FALLBACK_VOICE} ⚠")
            print(f" {i:>4}. [{spkr}] ({voice})\n      {snippet}\n")
        return

    # ── Summary table ──────────────────────────────────────────────────────────
    counts = Counter(s for s, _ in segments)
    unrecognised = {s for s in counts if s not in CHARACTER_VOICES}

    print(f"Parsed {len(segments)} segments across {len(counts)} speakers:\n")
    print(f" {'Speaker':<28} {'Segs':>5} {'Voice'}")
    print(f" {'─'*28} {'─'*5} {'─'*45}")
    for spkr, voice in CHARACTER_VOICES.items():
        if counts[spkr]:
            print(f" {spkr:<28} {counts[spkr]:>5} {voice}")
    for spkr in sorted(unrecognised):
        print(f" {spkr:<28} {counts[spkr]:>5} {FALLBACK_VOICE} ⚠ unrecognised")

    total_chars = sum(len(t) for _, t in segments)
    print(f"\n Total chars: {total_chars:,}")
    if args.preview:
        print(f" ⚡ PREVIEW MODE — rendering first {args.preview} segments only")

    # ── GPU note ───────────────────────────────────────────────────────────────
    # edge-tts is cloud-based (Microsoft Azure neural, free) — GPU not used.
    print("\nNote: edge-tts uses Microsoft's servers (free, no API key needed).\n"
          "      Render speed depends on your internet connection.\n")

    # ── Render ─────────────────────────────────────────────────────────────────
    OUTPUT_DIR.mkdir(exist_ok=True)
    out_path = OUTPUT_DIR / (
        f"sacred_temple_writings_preview{args.preview}.wav"
        if args.preview else OUTPUT_FILE
    )

    print("Rendering segments …\n")
    audio = asyncio.run(render(segments, args.preview))

    if audio.size > 0:
        sf.write(str(out_path), audio, SAMPLE_RATE)
        dur = len(audio) / SAMPLE_RATE
        m, s = divmod(int(dur), 60)
        print(f"\n✓ Saved '{out_path}' ({m}m {s:02d}s audio | {SAMPLE_RATE} Hz)")
    else:
        print("✗ No audio produced — check parsing with --print-segments")


if __name__ == "__main__":
    main()
@@ -18,6 +18,25 @@ from collections import defaultdict
 from pathlib import Path
 
 import spacy
+from wordfreq import top_n_list
+
+# ── Top 10 000 most-frequent English words ──────────────────────────
+TOP_10K_ENGLISH: frozenset[str] = frozenset(top_n_list("en", 10_000))
+
+# Words in the top-10k list that are genuine proper nouns in this text —
+# keep them despite the frequency filter.
+PROPER_NOUN_WHITELIST: frozenset[str] = frozenset({
+    # Biblical names
+    "aaron", "abel", "abraham", "adam", "cain", "eden", "egypt",
+    "elijah", "ephraim", "eve", "gad", "ham", "isaac", "israel",
+    "jacob", "james", "jehovah", "john", "joseph", "judah",
+    "laban", "lehi", "levi", "micah", "michael", "moses", "noah",
+    "peter", "pharaoh", "samuel", "sarah", "sarai", "seth", "simeon",
+    "timothy", "zion",
+    # Book-specific names that happen to match English words
+    "alma", "ether", "gideon", "limhi", "mormon", "moroni", "mulek",
+    "mosiah", "nephi", "satan", "sidon",
+})
 
 SOURCE = Path("Audio Master Nem Full.txt")
 OUTPUT = Path("proper_nouns.txt")
@@ -35,12 +54,29 @@ ORG_LABELS = {"ORG", "NORP"}
 OTHER_LABELS = {"EVENT", "WORK_OF_ART", "LAW", "PRODUCT", "LANGUAGE"}
 
 # ── Noise filters ──────────────────────────────────────────────────────────────
-# All-caps lines are section headers, not spoken names — skip them.
-# Also skip very short tokens that are likely artefacts.
-SKIP_PATTERNS = re.compile(
-    r"^(THE|A|AN|AND|OF|IN|TO|FOR|BY|AT|IS|WAS|BE|HE|SHE|IT|"
-    r"CHAPTER|VERSE|YEA|BEHOLD|LORD|GOD|CHRIST|HOLY|GHOST)$"
-)
+# Common English words that should be dropped when splitting multi-word entities.
+STOP_WORDS: set[str] = {
+    "A", "AN", "AND", "AS", "AT", "BE", "BUT", "BY",
+    "DO", "DID", "DOTH",
+    "EVEN", "FOR", "FROM",
+    "HAD", "HAS", "HAVE", "HATH", "HE", "HER", "HIS", "HOW",
+    "I", "IN", "IS", "IT", "ITS",
+    "MAY", "ME", "MORE", "MY",
+    "NAY", "NO", "NOT", "NOW",
+    "OF", "OR", "OUR",
+    "SHALL", "SHE", "SO", "SOME",
+    "THAT", "THE", "THEE", "THEIR", "THEN", "THERE", "THESE", "THEY",
+    "THIS", "THOSE", "THOU", "THUS", "THY", "TO",
+    "UP", "UPON", "US",
+    "WAS", "WE", "WHEN", "WHERE", "WHICH", "WHO", "WILL", "WITH",
+    "YE", "YEA", "YET", "YOU", "YOUR",
+    # Book-specific common words not worth flagging
+    "BEHOLD", "CHAPTER", "CHRIST", "GOD", "GHOST", "HOLY", "LORD", "VERSE",
+    # Generic nouns that slip through NER
+    "CITY", "DAYS", "DAY", "GREAT", "LAND", "MAN", "MEN", "NEW",
+    "PEOPLE", "SON", "TIME",
+}
+
 
 def is_noise(text: str) -> bool:
     t = text.strip()
@@ -48,9 +84,12 @@ def is_noise(text: str) -> bool:
         return True
     if t.isupper() and len(t) > 4:  # all-caps section header word
         return True
-    if SKIP_PATTERNS.match(t.upper()):
+    if t.upper() in STOP_WORDS:
         return True
-    if re.search(r"[^a-zA-Z\-' ]", t):  # contains digits or symbols
+    if re.search(r"[^a-zA-Z\-']", t):  # contains digits, spaces, or symbols
+        return True
+    # Drop common English words (no hyphens) unless whitelisted as proper nouns.
+    if "-" not in t and t.lower() in TOP_10K_ENGLISH and t.lower() not in PROPER_NOUN_WHITELIST:
         return True
     return False
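The new frequency filter can be seen in isolation. This sketch substitutes a tiny stand-in set for the real `frozenset(top_n_list("en", 10_000))` result, so it runs without `wordfreq` installed; the stand-in contents are an assumption for illustration only:

```python
# Stand-in for frozenset(top_n_list("en", 10_000)) — assumed values for illustration
TOP_10K_ENGLISH = frozenset({"the", "people", "land", "adam", "great"})
PROPER_NOUN_WHITELIST = frozenset({"adam"})

def freq_noise(word: str) -> bool:
    # Drop common English words (no hyphens) unless whitelisted as proper nouns
    w = word.lower()
    return "-" not in word and w in TOP_10K_ENGLISH and w not in PROPER_NOUN_WHITELIST

kept = [w for w in ["Adam", "People", "Nephi", "Anti-Nephi-Lehi"] if not freq_noise(w)]
print(kept)  # ['Adam', 'Nephi', 'Anti-Nephi-Lehi']
```

"People" is dropped as a common word, "Adam" survives via the whitelist, and hyphenated coinages bypass the frequency check entirely.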
@@ -60,6 +99,11 @@ def canonical(text: str) -> str:
     return " ".join(text.split()).title()
 
+
+def split_words(phrase: str) -> list[str]:
+    """Split a phrase on spaces; hyphenated words are kept as one token."""
+    return phrase.split()
+
 # ── Read and process ───────────────────────────────────────────────────────────
 print(f"Reading '{SOURCE}' …")
 raw_text = SOURCE.read_text(encoding="utf-8")
@@ -71,20 +115,23 @@ doc = nlp(raw_text)
 buckets: dict[str, set[str]] = defaultdict(set)
 
 # 1. NER pass — trust spaCy's entity labels
+# Multi-word entities (e.g. "Peter James John") are split into individual
+# words; hyphenated words (e.g. "Anti-Nephi-Lehi") stay as one token.
 for ent in doc.ents:
-    name = canonical(ent.text)
-    if is_noise(name):
-        continue
-    if ent.label_ in PERSON_LABELS:
-        buckets["People & Characters"].add(name)
-    elif ent.label_ in PLACE_LABELS:
-        buckets["Places & Lands"].add(name)
-    elif ent.label_ in ORG_LABELS:
-        buckets["Groups & Nations"].add(name)
-    elif ent.label_ in OTHER_LABELS:
-        buckets["Other Named Things"].add(name)
-    else:
-        buckets["Other Named Things"].add(name)
+    phrase = canonical(ent.text)
+    for word in split_words(phrase):
+        if is_noise(word):
+            continue
+        if ent.label_ in PERSON_LABELS:
+            buckets["People & Characters"].add(word)
+        elif ent.label_ in PLACE_LABELS:
+            buckets["Places & Lands"].add(word)
+        elif ent.label_ in ORG_LABELS:
+            buckets["Groups & Nations"].add(word)
+        elif ent.label_ in OTHER_LABELS:
+            buckets["Other Named Things"].add(word)
+        else:
+            buckets["Other Named Things"].add(word)
 
 # 2. PROPN pass — catch names spaCy didn't recognise as entities
 # Only include tokens that are inside a sentence (not at position 0)
@@ -97,13 +144,13 @@ for token in doc:
         continue  # skip all-caps
     if token.i == token.sent.start:
         continue  # skip sentence-initial (could be any word)
-    name = canonical(text)
-    if is_noise(name):
+    word = canonical(text)
+    if is_noise(word):
         continue
     # Only add if not already captured by NER
-    already_captured = any(name in s for s in buckets.values())
+    already_captured = any(word in s for s in buckets.values())
     if not already_captured:
-        buckets["Unclassified Proper Nouns"].add(name)
+        buckets["Unclassified Proper Nouns"].add(word)
 
 # ── Write output ───────────────────────────────────────────────────────────────
 GROUP_ORDER = [
801 format_scripture.py Normal file
@@ -0,0 +1,801 @@
#!/usr/bin/env python3
"""
format_scripture.py
═══════════════════
Convert the Book of the Nem plain-text file into two scripture-style PDFs:

    nem_kindle.pdf – single-column, sized for e-readers (4.5" × 6.5")
    nem_paper.pdf  – two-column, Book of Mormon style (5.5" × 8.5")

Requirements (Debian/Ubuntu):
    sudo apt-get install texlive-latex-extra texlive-fonts-recommended

The key packages used are:
    extsizes  – for 9 pt document class (paper format)
    tgpagella – TeX Gyre Pagella (Palatino-clone) font
    multicol  – two-column layout without hard page breaks
    microtype – improved text justification and hyphenation
    fancyhdr  – running headers and footers
    needspace – prevent orphaned headings

Usage:
    python format_scripture.py
    python format_scripture.py --input "Audio Master Nem Full.txt"
    python format_scripture.py --kindle-only
    python format_scripture.py --paper-only
    python format_scripture.py --output-dir ./pdfs
    python format_scripture.py --keep-tex   # keep .tex files for debugging
"""

import argparse
import re
import subprocess
import sys
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

# ── Default paths ──────────────────────────────────────────────────────────────
INPUT_FILE = Path("Audio Master Nem Full.txt")
OUTPUT_DIR = Path("output_pdf")

# ══════════════════════════════════════════════════════════════════════════════
# LaTeX helper
# ══════════════════════════════════════════════════════════════════════════════

_LATEX_TRANS = str.maketrans({
    "\\": r"\textbackslash{}",
    "&": r"\&",
    "%": r"\%",
    "$": r"\$",
    "#": r"\#",
    "_": r"\_",
    "{": r"\{",
    "}": r"\}",
    "~": r"\textasciitilde{}",
    "^": r"\textasciicircum{}",
    "\u2014": "---",        # em dash
    "\u2013": "--",         # en dash
    "\u2018": "`",          # left single quote
    "\u2019": "'",          # right single quote
    "\u201c": "``",         # left double quote
    "\u201d": "''",         # right double quote
    "\u2026": r"\ldots{}",  # ellipsis
    "\u00e9": r"\'e",
    "\u00e8": r"\`e",
    "\u00ea": r"\^e",
    "\u00e0": r"\`a",
    "\u00e2": r"\^a",
    "\u00f3": r"\'o",
    "\u00ed": r"\'{\i}",
})


def esc(text: str) -> str:
    """Escape special LaTeX characters in a string."""
    return text.translate(_LATEX_TRANS)
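A quick check of the escaping table on a string mixing LaTeX specials with typographic punctuation; this self-contained sketch repeats only the table entries it actually exercises:

```python
# Minimal excerpt of _LATEX_TRANS — just the entries this example needs
trans = str.maketrans({
    "&": r"\&",
    "%": r"\%",
    "\u2014": "---",   # em dash
    "\u201c": "``",    # left double quote
    "\u201d": "''",    # right double quote
})

def esc(text: str) -> str:
    return text.translate(trans)

print(esc("\u201cAlma & Amulek \u2014 100% done\u201d"))
# ``Alma \& Amulek --- 100\% done''
```

Because `str.translate` works per character, curly quotes and dashes become plain TeX ligatures in one pass, with no risk of double-escaping an already-escaped character.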

# ══════════════════════════════════════════════════════════════════════════════
# Document element types
# ══════════════════════════════════════════════════════════════════════════════

@dataclass
class TitlePage:
    lines: list


@dataclass
class BookHeader:
    """One or more heading lines that introduce a new book/section."""
    lines: list  # list of str


@dataclass
class Chapter:
    num: int
    subtitle: Optional[str] = None


@dataclass
class SectionHeading:
    """Short heading within a chapter (e.g. MARRIAGE, BAPTISM)."""
    text: str


@dataclass
class Verse:
    num: int
    text: str


@dataclass
class Paragraph:
    text: str


# ══════════════════════════════════════════════════════════════════════════════
# Parser
# ══════════════════════════════════════════════════════════════════════════════

_RE_VERSE = re.compile(r"^\s*(\d+)\s+(.*)")
_RE_CHAPTER = re.compile(r"^\s*CHAPTER\s+(\d+)\s*$", re.IGNORECASE)
_RE_DIVIDER = re.compile(r"^_{4,}")

# Lines longer than this are treated as body paragraphs rather than headings
MAX_HEADING_LEN = 120


def _is_verse(line: str) -> bool:
    """Line starts with a verse number followed by text."""
    m = _RE_VERSE.match(line)
    return bool(m) and int(m.group(1)) > 0


def _is_chapter(line: str) -> bool:
    return bool(_RE_CHAPTER.match(line.strip()))


def _is_divider(line: str) -> bool:
    return bool(_RE_DIVIDER.match(line.strip()))


def _is_allcaps(line: str) -> bool:
    s = line.strip()
    return bool(s) and s == s.upper() and any(c.isalpha() for c in s)
|
||||||
|
|
||||||
|
|
||||||
|
def parse(text: str) -> list:
|
||||||
|
"""Parse the scripture text into a list of Element objects."""
|
||||||
|
lines = text.splitlines()
|
||||||
|
elements = []
|
||||||
|
n = len(lines)
|
||||||
|
i = 0
|
||||||
|
|
||||||
|
# ── Title page: short lines before the first divider ──────────────────────
|
||||||
|
# Short lines (≤80 chars) are the actual title. Long prose before the first
|
||||||
|
# divider is ignored so it does not duplicate the later labeled Introduction.
|
||||||
|
title_lines = []
|
||||||
|
while i < n and not _is_divider(lines[i]):
|
||||||
|
title_lines.append(lines[i])
|
||||||
|
i += 1
|
||||||
|
actual_title = []
|
||||||
|
for l in title_lines:
|
||||||
|
s = l.strip()
|
||||||
|
if not s:
|
||||||
|
continue
|
||||||
|
if len(s) <= 80:
|
||||||
|
actual_title.append(s)
|
||||||
|
if actual_title:
|
||||||
|
elements.append(TitlePage(lines=actual_title))
|
||||||
|
|
||||||
|
# ── Main pass ─────────────────────────────────────────────────────────────
|
||||||
|
after_divider = False
|
||||||
|
|
||||||
|
while i < n:
|
||||||
|
raw = lines[i]
|
||||||
|
line = raw.strip()
|
||||||
|
|
||||||
|
# ── Divider ───────────────────────────────────────────────────────────
|
||||||
|
if _is_divider(raw):
|
||||||
|
after_divider = True
|
||||||
|
i += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# ── Blank line ────────────────────────────────────────────────────────
|
||||||
|
if not line:
|
||||||
|
i += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# ── After a divider: collect section/book header ───────────────────
|
||||||
|
# Collect all short non-verse non-chapter lines immediately following
|
||||||
|
# the divider. Stop as soon as we hit a long prose line or body content.
|
||||||
|
if after_divider:
|
||||||
|
after_divider = False
|
||||||
|
header_lines = []
|
||||||
|
j = i
|
||||||
|
while j < n:
|
||||||
|
s = lines[j].strip()
|
||||||
|
if not s: # blank: keep scanning
|
||||||
|
j += 1
|
||||||
|
continue
|
||||||
|
if _is_verse(lines[j]) or _is_chapter(lines[j]):
|
||||||
|
break # reached verse/chapter body
|
||||||
|
if len(s) > MAX_HEADING_LEN:
|
||||||
|
break # long prose line: stop here
|
||||||
|
header_lines.append(s)
|
||||||
|
j += 1
|
||||||
|
if header_lines:
|
||||||
|
elements.append(BookHeader(lines=header_lines))
|
||||||
|
i = j
|
||||||
|
continue
|
||||||
|
|
||||||
|
# ── Chapter heading ────────────────────────────────────────────────
|
||||||
|
m = _RE_CHAPTER.match(line)
|
||||||
|
if m:
|
||||||
|
num = int(m.group(1))
|
||||||
|
# Look ahead for an optional subtitle (short non-verse line)
|
||||||
|
j = i + 1
|
||||||
|
subtitle = None
|
||||||
|
while j < n and not lines[j].strip():
|
||||||
|
j += 1
|
||||||
|
if j < n:
|
||||||
|
ns = lines[j].strip()
|
||||||
|
if (ns
|
||||||
|
and not _is_verse(lines[j])
|
||||||
|
and not _is_chapter(lines[j])
|
||||||
|
and not _is_divider(lines[j])
|
||||||
|
and len(ns) <= MAX_HEADING_LEN):
|
||||||
|
subtitle = ns
|
||||||
|
i = j + 1
|
||||||
|
else:
|
||||||
|
i += 1
|
||||||
|
else:
|
||||||
|
i += 1
|
||||||
|
elements.append(Chapter(num=num, subtitle=subtitle))
|
||||||
|
continue
|
||||||
|
|
||||||
|
# ── All-caps lines: either a BookHeader cluster or a SectionHeading ─
|
||||||
|
# If the cluster of consecutive all-caps lines is followed (after any
|
||||||
|
# blanks) by a CHAPTER heading, treat the whole cluster as a BookHeader.
|
||||||
|
# Otherwise treat only the first line as a SectionHeading.
|
||||||
|
if _is_allcaps(line) and len(line) <= MAX_HEADING_LEN and not _is_verse(raw):
|
||||||
|
# Gather consecutive all-caps lines (blanks skipped)
|
||||||
|
j = i
|
||||||
|
caps_block = []
|
||||||
|
while j < n:
|
||||||
|
s = lines[j].strip()
|
||||||
|
if not s:
|
||||||
|
j += 1
|
||||||
|
continue
|
||||||
|
if (_is_allcaps(s)
|
||||||
|
and len(s) <= MAX_HEADING_LEN
|
||||||
|
and not _is_verse(lines[j])
|
||||||
|
and not _is_chapter(lines[j])
|
||||||
|
and not _is_divider(lines[j])):
|
||||||
|
caps_block.append(s)
|
||||||
|
j += 1
|
||||||
|
else:
|
||||||
|
break
|
||||||
|
# Look past any blanks to see if a chapter heading follows
|
||||||
|
k = j
|
||||||
|
while k < n and not lines[k].strip():
|
||||||
|
k += 1
|
||||||
|
if k < n and _is_chapter(lines[k]):
|
||||||
|
# This cluster is a book/section header
|
||||||
|
elements.append(BookHeader(lines=caps_block))
|
||||||
|
i = j
|
||||||
|
else:
|
||||||
|
# Single inline section subheading (MARRIAGE, BAPTISM, etc.)
|
||||||
|
elements.append(SectionHeading(text=caps_block[0] if caps_block else line))
|
||||||
|
i = i + 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# ── Verse ─────────────────────────────────────────────────────────
|
||||||
|
if _is_verse(raw):
|
||||||
|
mfull = _RE_VERSE.match(raw)
|
||||||
|
elements.append(Verse(num=int(mfull.group(1)), text=mfull.group(2).strip()))
|
||||||
|
i += 1
|
||||||
|
continue
|
||||||
|
|
||||||
|
# ── Paragraph ─────────────────────────────────────────────────────
|
||||||
|
elements.append(Paragraph(text=line))
|
||||||
|
i += 1
|
||||||
|
|
||||||
|
return elements
|
||||||
|
|
||||||
|
|
||||||
|
# ══════════════════════════════════════════════════════════════════════════════
# LaTeX generation
# ══════════════════════════════════════════════════════════════════════════════

_PREAMBLE_SHARED = r"""
\usepackage[T1]{fontenc}
\usepackage[utf8]{inputenc}
\usepackage{tgpagella}
\usepackage{microtype}
\usepackage{fancyhdr}
\usepackage{needspace}
\setlength{\headheight}{14pt}
\addtolength{\topmargin}{-2pt}
\usepackage[hidelinks]{hyperref}
"""


def _hrule() -> str:
    return r"\noindent\rule{\linewidth}{0.3pt}"


# ── Kindle (single-column, e-reader sized) ────────────────────────────────────

def build_kindle_latex(elements: list) -> str:
    """Build a single-column LaTeX document sized for e-readers."""
    out = []
    # extarticle (from extsizes) gives us 11pt; plain article also supports it
    out.append(r"\documentclass[11pt]{extarticle}")
    out.append(r"""
\usepackage[paperwidth=4.5in,paperheight=6.5in,
            top=0.08in,bottom=0.5in,
            inner=0.42in,outer=0.38in,
            headheight=12pt,headsep=6pt,
            includehead]{geometry}""")
    out.append(_PREAMBLE_SHARED)
    out.append(r"""
\pagestyle{fancy}
\fancyhf{}
\fancyhead[C]{\small\itshape\nouppercase{\leftmark}}
\fancyfoot[C]{\small\thepage}
\renewcommand{\headrulewidth}{0.3pt}

\setlength{\parindent}{0pt}
\setlength{\parskip}{3pt plus 1pt minus 1pt}

\begin{document}
""")
    # Handle title page separately so we can insert TOC after it
    title_els = [e for e in elements if isinstance(e, TitlePage)]
    body_els = [e for e in elements if not isinstance(e, TitlePage)]
    if title_els:
        out.append(r"\clearpage")
        out.append(r"\thispagestyle{empty}")
        out.append(r"\vspace*{1.3in}")
        out.append(r"\begin{center}")
        for j, tl in enumerate(title_els[0].lines):
            s = tl.strip()
            if not s:
                continue
            if j < 3:
                out.append(r"{\LARGE\bfseries " + esc(s) + r"} \\[8pt]")
            else:
                out.append(r"{\large " + esc(s) + r"} \\[4pt]")
        out.append(r"\end{center}")
        out.append(r"\clearpage")
    out.append(r"\renewcommand{\contentsname}{Table of Contents}")
    out.append(r"\tableofcontents")
    out.append(r"\clearpage")
    _emit_elements(out, body_els, kindle=True)
    out.append(r"\end{document}")
    return "\n".join(out)

# ── Paper / BOM style (two-column) ────────────────────────────────────────────

def build_paper_latex(elements: list) -> str:
    """Build a two-column, Book of Mormon-style LaTeX document."""
    out = []
    # extarticle (from extsizes) for 9pt support
    out.append(r"\documentclass[9pt,twoside]{extarticle}")
    out.append(r"""
\usepackage[paperwidth=5.5in,paperheight=8.5in,
            top=0.08in,bottom=0.55in,
            inner=0.5in,outer=0.42in,
            headheight=10pt,headsep=5pt,
            includehead]{geometry}""")
    out.append(_PREAMBLE_SHARED)
    out.append(r"""
\usepackage{multicol}
\setlength{\columnsep}{0.22in}
\setlength{\columnseprule}{0.3pt}

\pagestyle{fancy}
\fancyhf{}
\fancyhead[LE]{\footnotesize\itshape\nouppercase{\leftmark}}
\fancyhead[RO]{\footnotesize\itshape\nouppercase{\rightmark}}
\fancyfoot[C]{\scriptsize\thepage}
\renewcommand{\headrulewidth}{0.3pt}

\setlength{\parindent}{0pt}
\setlength{\parskip}{1pt}

\begin{document}
""")

    # Emit the title page outside multicols (single-column block)
    title_els = [e for e in elements if isinstance(e, TitlePage)]
    body_els = [e for e in elements if not isinstance(e, TitlePage)]

    if title_els:
        out.append(r"\begin{center}")
        for j, tl in enumerate(title_els[0].lines):
            s = tl.strip()
            if not s:
                continue
            if j < 3:
                out.append(r"{\large\bfseries " + esc(s) + r"} \\[3pt]")
            else:
                out.append(r"{\small " + esc(s) + r"} \\[1pt]")
        out.append(r"\end{center}")
        out.append(r"\medskip")

    out.append(r"\renewcommand{\contentsname}{Table of Contents}")
    out.append(r"\tableofcontents")
    out.append(r"\clearpage")

    # Skip any leading front-matter paragraphs before the first section header.
    # For paper output, the intro should begin at the labeled "Introduction"
    # section rather than repeating the pre-divider prose block.
    first_section = next(
        (i for i, el in enumerate(body_els) if isinstance(el, BookHeader)),
        len(body_els),
    )
    paper_body_els = body_els[first_section:]

    # Split intro (before first real book) from main body.
    # A "real book" is a BookHeader that is followed by at least one Chapter
    # before the next BookHeader. "Introduction" and similar preamble sections
    # are BookHeaders too but have no chapters, so they stay in the intro.
    first_book = len(paper_body_els)
    for i, el in enumerate(paper_body_els):
        if isinstance(el, BookHeader):
            # Check if a Chapter follows before the next BookHeader
            for j in range(i + 1, len(paper_body_els)):
                if isinstance(paper_body_els[j], Chapter):
                    first_book = i
                    break
                if isinstance(paper_body_els[j], BookHeader):
                    break
            if first_book < len(paper_body_els):
                break
    intro_els = paper_body_els[:first_book]
    main_els = paper_body_els[first_book:]

    if intro_els:
        _emit_elements(out, intro_els, kindle=True, compact_headers=True)
        out.append(r"\clearpage")

    out.append(r"\begin{multicols}{2}")
    _emit_elements(out, main_els, kindle=False)
    out.append(r"\end{multicols}")
    out.append(r"\end{document}")
    return "\n".join(out)

# ── Body emitter ──────────────────────────────────────────────────────────────

def _emit_elements(
    out: list,
    elements: list,
    kindle: bool,
    indent: bool = False,
    compact_headers: bool = False,
) -> None:
    """Translate parsed Element objects into LaTeX markup."""

    for el in elements:

        # ── Title page (kindle only; paper handles it before multicols) ──────
        if isinstance(el, TitlePage):
            if kindle:
                out.append(r"\clearpage")
                out.append(r"\thispagestyle{empty}")
                out.append(r"\vspace*{1.3in}")
                out.append(r"\begin{center}")
                for j, tl in enumerate(el.lines):
                    s = tl.strip()
                    if not s:
                        continue
                    if j < 3:
                        out.append(r"{\LARGE\bfseries " + esc(s) + r"} \\[8pt]")
                    else:
                        out.append(r"{\large " + esc(s) + r"} \\[4pt]")
                out.append(r"\end{center}")
                out.append(r"\clearpage")

        # ── Book / section header ────────────────────────────────────────────
        elif isinstance(el, BookHeader):
            lines = el.lines

            if kindle:
                # Start a new page for each major book
                out.append(r"\clearpage")
                out.append(r"\phantomsection\addcontentsline{toc}{section}{" + esc(lines[0]) + r"}")
                out.append(r"\vspace*{0pt}" if compact_headers else r"\vspace*{0.1in}")
                out.append(r"\begin{center}")
                out.append(_hrule())
                out.append(r"\\[6pt]")
                out.append(r"{\bfseries\large " + esc(lines[0]) + r"}")
                for ln in lines[1:]:
                    out.append(r"\\ [3pt]{\normalsize\itshape " + esc(ln) + r"}")
                out.append(r"\\[6pt]")
                out.append(_hrule())
                out.append(r"\end{center}")
                out.append(r"\markboth{" + esc(lines[0]) + r"}{" + esc(lines[0]) + r"}")
                out.append(r"\vspace{5pt}")

            else:
                # Inline heading within the two-column flow.
                # Refuse to start a new book in the bottom half of a column.
                out.append(r"\needspace{0.5\textheight}")
                out.append(r"\phantomsection\addcontentsline{toc}{section}{" + esc(lines[0]) + r"}")
                out.append(r"\begin{center}")
                out.append(_hrule())
                out.append(r"\\[2pt]")
                out.append(r"{\bfseries " + esc(lines[0]) + r"}")
                for ln in lines[1:]:
                    out.append(r"\\ {\small\itshape " + esc(ln) + r"}")
                out.append(r"\\[2pt]")
                out.append(_hrule())
                out.append(r"\end{center}")
                out.append(r"\markboth{" + esc(lines[0]) + r"}{" + esc(lines[0]) + r"}")
                out.append(r"\vspace{2pt}")

        # ── Chapter heading ──────────────────────────────────────────────────
        elif isinstance(el, Chapter):
            label = f"CHAPTER {el.num}"

            if kindle:
                out.append(r"\phantomsection\addcontentsline{toc}{subsection}{" + esc(label) + r"}")
                out.append(r"\needspace{4\baselineskip}")
                out.append(r"\vspace{14pt}")
                out.append(r"\begin{center}")
                out.append(r"{\bfseries\large " + esc(label) + r"}")
                if el.subtitle:
                    out.append(r"\\ [3pt]{\normalsize\itshape " + esc(el.subtitle) + r"}")
                out.append(r"\end{center}")
                out.append(r"\markright{" + esc(label) + r"}")
                out.append(r"\vspace{6pt}")

            else:
                out.append(r"\phantomsection\addcontentsline{toc}{subsection}{" + esc(label) + r"}")
                out.append(r"\needspace{2\baselineskip}")
                out.append(r"\vspace{3pt}")
                out.append(r"\begin{center}")
                out.append(r"{\bfseries " + esc(label) + r"}")
                if el.subtitle:
                    out.append(r"\\ {\small\itshape " + esc(el.subtitle) + r"}")
                out.append(r"\end{center}")
                out.append(r"\markright{" + esc(label) + r"}")
                out.append(r"\vspace{1pt}")

        # ── Section subheading (MARRIAGE, BAPTISM, etc.) ─────────────────────
        elif isinstance(el, SectionHeading):
            if kindle:
                out.append(r"\vspace{8pt}")
                out.append(r"\begin{center}{\bfseries " + esc(el.text) + r"}\end{center}")
                out.append(r"\vspace{4pt}")
            else:
                out.append(r"\vspace{3pt}")
                out.append(
                    r"\begin{center}{\bfseries\small " + esc(el.text) + r"}\end{center}"
                )
                out.append(r"\vspace{1pt}")

        # ── Verse ────────────────────────────────────────────────────────────
        elif isinstance(el, Verse):
            body = esc(el.text)
            if kindle:
                # Bold inline number (not superscript) for readability on screen
                vnum = r"\textbf{" + str(el.num) + r"}"
                out.append(r"\noindent " + vnum + r"~" + body)
                out.append(r"\par\smallskip")
            else:
                vnum = r"\textbf{" + str(el.num) + r"}"
                out.append(r"\noindent " + vnum + r"~" + body + r"\par")

        # ── Paragraph (prose intro, commentary, etc.) ────────────────────────
        elif isinstance(el, Paragraph):
            body = esc(el.text)
            if kindle:
                out.append(r"\noindent " + body)
                out.append(r"\par\smallskip")
            elif indent:
                out.append(body + r"\par\medskip")
            else:
                out.append(r"\noindent " + body + r"\par")

# ══════════════════════════════════════════════════════════════════════════════
# Utility: book limiter
# ══════════════════════════════════════════════════════════════════════════════

def truncate_to_books(elements: list, max_books: int) -> list:
    """Return only the first *max_books* BookHeader sections (and their content).

    Title-page and front-matter paragraphs before the first BookHeader are
    always kept.
    """
    if max_books <= 0:
        return elements
    count = 0
    result = []
    for el in elements:
        if isinstance(el, BookHeader):
            count += 1
            if count > max_books:
                break
        result.append(el)
    return result

# ══════════════════════════════════════════════════════════════════════════════
# PDF compilation
# ══════════════════════════════════════════════════════════════════════════════

def _find_compiler() -> tuple:
    """Return (compiler_path, compiler_type) or (None, None) if none found."""
    import shutil
    # Also probe common absolute paths in case the dir isn't on $PATH
    candidates = {
        "pdflatex": ["/usr/bin/pdflatex", "/usr/local/bin/pdflatex"],
        "tectonic": ["/usr/bin/tectonic", "/usr/local/bin/tectonic"],
    }
    for cmd, extra_paths in candidates.items():
        found = shutil.which(cmd)
        if found:
            return found, cmd
        for p in extra_paths:
            if Path(p).exists():
                return p, cmd
    return None, None


def compile_pdf(tex_src: str, output_pdf: Path,
                keep_tex: bool = False,
                compiler_path: str = "/usr/bin/pdflatex",
                compiler_type: str = "pdflatex") -> bool:
    """
    Write *tex_src* into a temp directory, run the LaTeX compiler, and copy
    the resulting PDF to *output_pdf*. Supports ``pdflatex`` and ``tectonic``.
    Returns True on success.
    """
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        tex_file = tmp_path / "document.tex"
        tex_file.write_text(tex_src, encoding="utf-8")

        if compiler_type == "tectonic":
            # Tectonic compiles in one pass and downloads missing packages.
            passes = 1
            cmd_base = [compiler_path, "document.tex"]
        else:
            # pdflatex needs two passes to get page headers right.
            passes = 2
            cmd_base = [compiler_path, "-interaction=nonstopmode",
                        "-halt-on-error", "document.tex"]

        for pass_num in range(1, passes + 1):
            result = subprocess.run(
                cmd_base, cwd=tmp, capture_output=True, text=True,
            )
            if result.returncode != 0:
                print(f"  [compiler error on pass {pass_num}]", file=sys.stderr)
                print(result.stdout[-3000:], file=sys.stderr)
                if result.stderr:
                    print(result.stderr[-1000:], file=sys.stderr)
                if keep_tex:
                    dest = output_pdf.with_suffix(".tex")
                    dest.write_text(tex_src, encoding="utf-8")
                    print(f"  TeX source saved to: {dest}", file=sys.stderr)
                return False

        pdf_out = tmp_path / "document.pdf"
        if pdf_out.exists():
            output_pdf.parent.mkdir(parents=True, exist_ok=True)
            output_pdf.write_bytes(pdf_out.read_bytes())
            if keep_tex:
                dest = output_pdf.with_suffix(".tex")
                dest.write_text(tex_src, encoding="utf-8")
            return True

        print("  [compiler ran but document.pdf was not produced]", file=sys.stderr)
        return False

# ══════════════════════════════════════════════════════════════════════════════
# Main
# ══════════════════════════════════════════════════════════════════════════════

_INSTALL_INSTRUCTIONS = """
No LaTeX compiler found. Install one of the following:

  Arch / CachyOS / Manjaro:
    sudo pacman -S texlive-basic texlive-latex texlive-latexrecommended \\
                   texlive-latexextra texlive-fontsrecommended

  Debian / Ubuntu:
    sudo apt-get install texlive-latex-extra texlive-fonts-recommended

  --- OR --- (self-contained, downloads packages on first use)
    sudo pacman -S tectonic
    # or: cargo install tectonic
"""


def main():
    parser = argparse.ArgumentParser(
        description="Generate scripture-style PDFs from the Book of the Nem text.",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog=__doc__,
    )
    parser.add_argument(
        "--input", type=Path, default=INPUT_FILE,
        help=f"Input plain-text file (default: {INPUT_FILE})",
    )
    parser.add_argument(
        "--output-dir", type=Path, default=OUTPUT_DIR,
        help=f"Output directory (default: {OUTPUT_DIR})",
    )
    parser.add_argument(
        "--kindle-only", action="store_true",
        help="Generate only the Kindle (single-column) PDF.",
    )
    parser.add_argument(
        "--paper-only", action="store_true",
        help="Generate only the paper (two-column) PDF.",
    )
    parser.add_argument(
        "--keep-tex", action="store_true",
        help="Save the intermediate .tex files alongside each PDF.",
    )
    parser.add_argument(
        "--max-books", type=int, default=0, metavar="N",
        help="Limit output to the first N book sections (0 = no limit).",
    )
    parser.add_argument(
        "--tex-only", action="store_true",
        help="Write .tex files only — do not attempt PDF compilation. "
             "Useful when a LaTeX compiler is not available.",
    )
    args = parser.parse_args()

    src_path: Path = args.input
    if not src_path.exists():
        sys.exit(f"ERROR: Input file not found: {src_path}")

    print(f"Reading: {src_path}")
    text = src_path.read_text(encoding="utf-8", errors="replace")

    elements = parse(text)
    if args.max_books > 0:
        elements = truncate_to_books(elements, args.max_books)
        print(f"  Limiting to first {args.max_books} book(s).")
    books = sum(1 for e in elements if isinstance(e, BookHeader))
    chapters = sum(1 for e in elements if isinstance(e, Chapter))
    verses = sum(1 for e in elements if isinstance(e, Verse))
    print(f"  Parsed: {books} books/sections, {chapters} chapters, {verses} verses")

    out_dir: Path = args.output_dir
    out_dir.mkdir(parents=True, exist_ok=True)

    # Locate compiler (unless --tex-only)
    compiler_path, compiler_type = None, None
    if not args.tex_only:
        compiler_path, compiler_type = _find_compiler()
        if not compiler_path:
            print(_INSTALL_INSTRUCTIONS, file=sys.stderr)
            print("Falling back to --tex-only mode: .tex files will be written "
                  "but not compiled.", file=sys.stderr)
            args.tex_only = True
        else:
            print(f"  Using compiler: {compiler_path}")

    def _write_or_compile(tex: str, pdf_path: Path, label: str):
        if args.tex_only or args.keep_tex:
            tex_path = pdf_path.with_suffix(".tex")
            tex_path.write_text(tex, encoding="utf-8")
            print(f"  ✓ TeX saved: {tex_path}")
            if args.tex_only:
                return
        print(f"  Compiling {label} PDF …")
        ok = compile_pdf(tex, pdf_path, keep_tex=args.keep_tex,
                         compiler_path=compiler_path,
                         compiler_type=compiler_type)
        if ok:
            print(f"  ✓ {pdf_path}")
        else:
            print(f"  ✗ {label} PDF failed — see errors above.")

    # ── Kindle PDF ────────────────────────────────────────────────────────────
    if not args.paper_only:
        print("\nKindle PDF (single-column, 4.5\"×6.5\") …")
        tex = build_kindle_latex(elements)
        _write_or_compile(tex, out_dir / "nem_phone.pdf", "Kindle")

    # ── Paper / BOM-style PDF ─────────────────────────────────────────────────
    if not args.kindle_only:
        print("\nPaper PDF (two-column BOM style, 5.5\"×8.5\") …")
        tex = build_paper_latex(elements)
        _write_or_compile(tex, out_dir / "nem_paper.pdf", "Paper")


if __name__ == "__main__":
    main()
File diff suppressed because it is too large.

@@ -0,0 +1,778 @@
{
    "Aaagast": "aaagast.wav",
    "Abby": "abby.wav",
    "Abigail": "abigail.wav",
    "Abodey": "abodey.wav",
    "Abriyyah": "abriyyah.wav",
    "Abyss": "abyss.wav",
    "Adamantine": "adamantine.wav",
    "Addobes": "addobes.wav",
    "Adobbes": "adobbes.wav",
    "Aedrick": "aedrick.wav",
    "Aegis": "aegis.wav",
    "Aegrir": "aegrir.wav",
    "Afire": "afire.wav",
    "Agatha": "agatha.wav",
    "Agony": "agony.wav",
    "Agrarian": "agrarian.wav",
    "Aheer": "aheer.wav",
    "Ahman": "ahman.wav",
    "Ailondel": "ailondel.wav",
    "Airk": "airk.wav",
    "Al-Astan": "al_astan.wav",
    "Alchemist": "alchemist.wav",
    "Alvrin": "alvrin.wav",
    "Amarantha": "amarantha.wav",
    "Amaryllis": "amaryllis.wav",
    "Ananduil": "ananduil.wav",
    "Anaudriel": "anaudriel.wav",
    "Andrahel": "andrahel.wav",
    "Anhuil": "anhuil.wav",
    "Anhuil-Ehlar": "anhuil_ehlar.wav",
    "Anhuil-Elhar": "anhuil_elhar.wav",
    "Anjeer": "anjeer.wav",
    "Ankh": "ankh.wav",
    "Annalise": "annalise.wav",
    "Anointing": "anointing.wav",
    "Anoush": "anoush.wav",
    "Anuil": "anuil.wav",
    "Anvilhammer": "anvilhammer.wav",
    "Ara": "ara.wav",
    "Aragast": "aragast.wav",
    "Aragst": "aragst.wav",
    "Aralon": "aralon.wav",
    "Aran": "aran.wav",
    "Arans": "arans.wav",
    "Arashan": "arashan.wav",
    "Arbiter": "arbiter.wav",
    "Archmage": "archmage.wav",
    "Archwizard": "archwizard.wav",
    "Ardrick": "ardrick.wav",
    "Argast": "argast.wav",
    "Armbrook": "armbrook.wav",
    "Armory": "armory.wav",
    "Arn": "arn.wav",
    "Arn-Del": "arn_del.wav",
    "Asheer": "asheer.wav",
    "Aske": "aske.wav",
    "Aster": "aster.wav",
    "Astor": "astor.wav",
    "Astral": "astral.wav",
    "Astride": "astride.wav",
    "Astute": "astute.wav",
    "Avery": "avery.wav",
    "Avorein": "avorein.wav",
    "Await": "await.wav",
    "Awww": "awww.wav",
    "Axehammer": "axehammer.wav",
    "Ayana": "ayana.wav",
    "Ayron": "ayron.wav",
    "Azuremoon": "azuremoon.wav",
    "Badlands": "badlands.wav",
    "Baelen": "baelen.wav",
    "Bah": "bah.wav",
    "Ballista": "ballista.wav",
    "Bancroft": "bancroft.wav",
    "Baras": "baras.wav",
    "Barek": "barek.wav",
    "Barge": "barge.wav",
    "Barrik": "barrik.wav",
    "Battlelord": "battlelord.wav",
    "Bazaar": "bazaar.wav",
    "Bearas": "bearas.wav",
    "Bearasagain": "bearasagain.wav",
    "Bearasand": "bearasand.wav",
    "Bearasasked": "bearasasked.wav",
    "Bearasat": "bearasat.wav",
    "Bearasbegan": "bearasbegan.wav",
    "Bearasbowed": "bearasbowed.wav",
    "Bearascan": "bearascan.wav",
    "Bearasdown": "bearasdown.wav",
    "Bearasemerged": "bearasemerged.wav",
    "Bearasfelt": "bearasfelt.wav",
    "Bearasfor": "bearasfor.wav",
    "Bearashad": "bearashad.wav",
    "Bearashas": "bearashas.wav",
    "Bearasheld": "bearasheld.wav",
    "Bearashesitantly": "bearashesitantly.wav",
    "Bearasin": "bearasin.wav",
    "Bearasleading": "bearasleading.wav",
    "Bearasmust": "bearasmust.wav",
    "Bearasnodded": "bearasnodded.wav",
    "Bearasperplexed": "bearasperplexed.wav",
    "Bearasquickly": "bearasquickly.wav",
    "Bearasreleased": "bearasreleased.wav",
    "Bearassaid": "bearassaid.wav",
    "Bearassat": "bearassat.wav",
    "Bearassimply": "bearassimply.wav",
    "Bearasslowly": "bearasslowly.wav",
    "Bearassome": "bearassome.wav",
    "Bearasspeaks": "bearasspeaks.wav",
    "Bearassteeled": "bearassteeled.wav",
    "Bearasstood": "bearasstood.wav",
    "Bearasthat": "bearasthat.wav",
    "Bearasthen": "bearasthen.wav",
    "Bearasto": "bearasto.wav",
    "Bearastrailed": "bearastrailed.wav",
    "Bearaswandered": "bearaswandered.wav",
    "Bearaswho": "bearaswho.wav",
    "Bearaswith": "bearaswith.wav",
    "Beldvorth": "beldvorth.wav",
    "Belegast": "belegast.wav",
    "Berstag": "berstag.wav",
    "Beydell": "beydell.wav",
    "Blackfeather": "blackfeather.wav",
    "Blackroot": "blackroot.wav",
    "Blargh": "blargh.wav",
    "Bledvorth": "bledvorth.wav",
    "Blessings": "blessings.wav",
    "Bloodstone": "bloodstone.wav",
    "Bloodtone": "bloodtone.wav",
    "Bogard": "bogard.wav",
    "Boldar": "boldar.wav",
    "Bolton": "bolton.wav",
    "Bon": "bon.wav",
    "Boomer": "boomer.wav",
    "Bouldershaun": "bouldershaun.wav",
    "Boulevarde": "boulevarde.wav",
    "Brahma": "brahma.wav",
    "Bramble": "bramble.wav",
    "Brambleburr": "brambleburr.wav",
    "Brambleburrs": "brambleburrs.wav",
    "Branson": "branson.wav",
    "Bravado": "bravado.wav",
    "Brax": "brax.wav",
    "Braz": "braz.wav",
    "Brazen": "brazen.wav",
    "Brazenclaw": "brazenclaw.wav",
    "Brazenclaws": "brazenclaws.wav",
    "Breeches": "breeches.wav",
    "Brendan": "brendan.wav",
    "Brethren": "brethren.wav",
    "Brickhorn": "brickhorn.wav",
    "Caldwell": "caldwell.wav",
    "Calico": "calico.wav",
    "Caller": "caller.wav",
    "Camels": "camels.wav",
    "Canals": "canals.wav",
    "Captains": "captains.wav",
    "Caravan": "caravan.wav",
    "Caswold": "caswold.wav",
    "Causeway": "causeway.wav",
    "Cavalier": "cavalier.wav",
    "Cavern": "cavern.wav",
    "Cherrytree": "cherrytree.wav",
    "Chieftain": "chieftain.wav",
    "Chivalrous": "chivalrous.wav",
    "Chun": "chun.wav",
    "Citadel": "citadel.wav",
    "Clarn": "clarn.wav",
    "Claw": "claw.wav",
    "Cleric": "cleric.wav",
    "Cobblestone": "cobblestone.wav",
    "Contessa": "contessa.wav",
    "Corporal": "corporal.wav",
    "Cotswold": "cotswold.wav",
    "Councillor": "councillor.wav",
    "Councilman": "councilman.wav",
    "Councilmen": "councilmen.wav",
    "Councilor": "councilor.wav",
    "Crimson": "crimson.wav",
    "Crismon": "crismon.wav",
    "Cylan": "cylan.wav",
    "Dai": "dai.wav",
    "Dalthanis": "dalthanis.wav",
    "Dank": "dank.wav",
    "Dayr": "dayr.wav",
    "Dedric": "dedric.wav",
    "Delgra": "delgra.wav",
    "Delic": "delic.wav",
    "Denizen": "denizen.wav",
    "Denizens": "denizens.wav",
    "Deric": "deric.wav",
    "Derrbane": "derrbane.wav",
    "Derro": "derro.wav",
    "Derrobane": "derrobane.wav",
    "Dibble": "dibble.wav",
    "Diblon": "diblon.wav",
    "Dire": "dire.wav",
    "Dis": "dis.wav",
    "Dobson": "dobson.wav",
    "Dorian": "dorian.wav",
    "Dorza": "dorza.wav",
    "Dragonbane": "dragonbane.wav",
    "Dragonsbane": "dragonsbane.wav",
    "Drakor": "drakor.wav",
    "Draygon": "draygon.wav",
    "Drefan": "drefan.wav",
    "Ducan": "ducan.wav",
    "Duggan": "duggan.wav",
    "Dulak": "dulak.wav",
    "Dunca": "dunca.wav",
    "Dune": "dune.wav",
    "Dur": "dur.wav",
    "Dur-Hakan": "dur_hakan.wav",
    "Durgane": "durgane.wav",
    "Durthaim": "durthaim.wav",
    "Durthrim": "durthrim.wav",
    "Dwarf": "dwarf.wav",
    "Dwarven": "dwarven.wav",
    "Earlson": "earlson.wav",
    "Eastward": "eastward.wav",
    "Effigius": "effigius.wav",
    "Ehlar": "ehlar.wav",
    "El-Ran": "el_ran.wav",
    "El-Shen": "el_shen.wav",
    "Elan": "elan.wav",
    "Elessel": "elessel.wav",
    "Elf": "elf.wav",
    "Elhar": "elhar.wav",
    "Elishan": "elishan.wav",
    "Eliza": "eliza.wav",
    "Elliswan": "elliswan.wav",
    "Elliwsan": "elliwsan.wav",
    "Elodea": "elodea.wav",
    "Elshan": "elshan.wav",
    "Elven": "elven.wav",
    "Elvenkind": "elvenkind.wav",
    "Elves": "elves.wav",
    "Elvrathas": "elvrathas.wav",
    "Elysium": "elysium.wav",
    "Emaleen": "emaleen.wav",
    "Eminence": "eminence.wav",
    "Emissary": "emissary.wav",
    "Emporium": "emporium.wav",
    "Enaru": "enaru.wav",
    "Endaleth": "endaleth.wav",
    "Envoy": "envoy.wav",
    "Eppres": "eppres.wav",
    "Eradication": "eradication.wav",
    "Eru": "eru.wav",
    "Eshela": "eshela.wav",
    "Ethereal": "ethereal.wav",
    "Eushon": "eushon.wav",
    "Eushownava": "eushownava.wav",
    "Everdark": "everdark.wav",
    "Everytime": "everytime.wav",
    "Eylana": "eylana.wav",
    "Eylanan": "eylanan.wav",
    "Ezrin": "ezrin.wav",
    "F-Fine": "f_fine.wav",
    "F-Forgive": "f_forgive.wav",
    "Faerie": "faerie.wav",
    "Fairik": "fairik.wav",
    "Fargus": "fargus.wav",
    "Fark": "fark.wav",
    "Farraj": "farraj.wav",
    "Farush": "farush.wav",
    "Feasthall": "feasthall.wav",
    "Featherstone": "featherstone.wav",
    "Felaria": "felaria.wav",
    "Feliq": "feliq.wav",
    "Felnck": "felnck.wav",
    "Felnick": "felnick.wav",
    "Felnicks": "felnicks.wav",
    "Felnik": "felnik.wav",
    "Fenaya": "fenaya.wav",
    "Feneya": "feneya.wav",
    "Ferrus": "ferrus.wav",
    "Fey": "fey.wav",
    "Firebane": "firebane.wav",
    "Fireshard": "fireshard.wav",
    "Foomwairma": "foomwairma.wav",
    "Forger": "forger.wav",
    "Frandor": "frandor.wav",
    "Friarsdai": "friarsdai.wav",
    "Fumairma": "fumairma.wav",
    "Fumwairma": "fumwairma.wav",
    "Galantholas": "galantholas.wav",
    "Galathorn": "galathorn.wav",
    "Galen": "galen.wav",
    "Galonti": "galonti.wav",
    "Garb": "garb.wav",
    "Gareth": "gareth.wav",
    "Garvek": "garvek.wav",
    "Gaunt": "gaunt.wav",
    "Gavin": "gavin.wav",
    "Geez": "geez.wav",
    "Ghurauk": "ghurauk.wav",
    "Gilandras": "gilandras.wav",
    "Gilard": "gilard.wav",
    "Gilchis": "gilchis.wav",
    "Gilchris": "gilchris.wav",
    "Gilding": "gilding.wav",
    "Gilrick": "gilrick.wav",
    "Glades": "glades.wav",
    "Glanthalas": "glanthalas.wav",
    "Glantholas": "glantholas.wav",
    "Glimmerwyn": "glimmerwyn.wav",
    "Gloomstone": "gloomstone.wav",
    "Gnaum": "gnaum.wav",
    "Gnomish": "gnomish.wav",
    "Goblinkin": "goblinkin.wav",
    "Goldsheen": "goldsheen.wav",
    "Gorath": "gorath.wav",
    "Gore": "gore.wav",
    "Gorg": "gorg.wav",
    "Gorlyn": "gorlyn.wav",
    "Gorstad": "gorstad.wav",
    "Gotto": "gotto.wav",
    "Graces": "graces.wav",
    "Graffel": "graffel.wav",
    "Grandmaster": "grandmaster.wav",
    "Granitestone": "granitestone.wav",
    "Gratzel": "gratzel.wav",
    "Graystrom": "graystrom.wav",
    "Greathaven": "greathaven.wav",
    "Gregarious": "gregarious.wav",
    "Gregor": "gregor.wav",
    "Griffon": "griffon.wav",
    "Grimbold": "grimbold.wav",
    "Gripp": "gripp.wav",
    "Grizzled": "grizzled.wav",
    "Grog": "grog.wav",
    "Grogg": "grogg.wav",
    "Grotto": "grotto.wav",
    "Gruff": "gruff.wav",
    "Gruul": "gruul.wav",
    "Guardarm": "guardarm.wav",
    "Gustafson": "gustafson.wav",
    "Guza": "guza.wav",
    "Gylis": "gylis.wav",
    "Habani": "habani.wav",
    "Hagatha": "hagatha.wav",
    "Hakan": "hakan.wav",
    "Hallowed": "hallowed.wav",
    "Halthessala": "halthessala.wav",
    "Hammerhaft": "hammerhaft.wav",
|
||||||
|
"Har": "har.wav",
|
||||||
|
"Harbrim": "harbrim.wav",
|
||||||
|
"Harbrin": "harbrin.wav",
|
||||||
|
"Hardrock": "hardrock.wav",
|
||||||
|
"Harrik": "harrik.wav",
|
||||||
|
"Hauberk": "hauberk.wav",
|
||||||
|
"Hazards": "hazards.wav",
|
||||||
|
"Headmaster": "headmaster.wav",
|
||||||
|
"Heed": "heed.wav",
|
||||||
|
"Hells": "hells.wav",
|
||||||
|
"Henceforth": "henceforth.wav",
|
||||||
|
"Hendel": "hendel.wav",
|
||||||
|
"Heshbani": "heshbani.wav",
|
||||||
|
"Hesta": "hesta.wav",
|
||||||
|
"Hestra": "hestra.wav",
|
||||||
|
"Heykingygladtomeetyouireallylikeithereitremindsmeofmyhome": "heykingygladtomeetyouireallylikeithereitremindsmeofmyhome.wav",
|
||||||
|
"Highlands": "highlands.wav",
|
||||||
|
"Highlord": "highlord.wav",
|
||||||
|
"Hillsfar": "hillsfar.wav",
|
||||||
|
"Hmmm": "hmmm.wav",
|
||||||
|
"Homecoming": "homecoming.wav",
|
||||||
|
"Horblaster": "horblaster.wav",
|
||||||
|
"Horde": "horde.wav",
|
||||||
|
"Horgard": "horgard.wav",
|
||||||
|
"Hornblade": "hornblade.wav",
|
||||||
|
"Hornblaster": "hornblaster.wav",
|
||||||
|
"Horned": "horned.wav",
|
||||||
|
"Hrumph": "hrumph.wav",
|
||||||
|
"Huen": "huen.wav",
|
||||||
|
"Hylan": "hylan.wav",
|
||||||
|
"Illuminant": "illuminant.wav",
|
||||||
|
"Illuminated": "illuminated.wav",
|
||||||
|
"Illumination": "illumination.wav",
|
||||||
|
"Ilrodel": "ilrodel.wav",
|
||||||
|
"Imp": "imp.wav",
|
||||||
|
"Inquisitor": "inquisitor.wav",
|
||||||
|
"Ironblade": "ironblade.wav",
|
||||||
|
"Ironbound": "ironbound.wav",
|
||||||
|
"Ironguard": "ironguard.wav",
|
||||||
|
"Ironhold": "ironhold.wav",
|
||||||
|
"Ironspear": "ironspear.wav",
|
||||||
|
"Irontree": "irontree.wav",
|
||||||
|
"Iston": "iston.wav",
|
||||||
|
"Jabari": "jabari.wav",
|
||||||
|
"Jabbed": "jabbed.wav",
|
||||||
|
"Jacob": "jacob.wav",
|
||||||
|
"Jad": "jad.wav",
|
||||||
|
"Janson": "janson.wav",
|
||||||
|
"Jasyen": "jasyen.wav",
|
||||||
|
"Jayden": "jayden.wav",
|
||||||
|
"Jaylan": "jaylan.wav",
|
||||||
|
"Jaysen": "jaysen.wav",
|
||||||
|
"Jewel": "jewel.wav",
|
||||||
|
"Jors": "jors.wav",
|
||||||
|
"Jovially": "jovially.wav",
|
||||||
|
"Kaash": "kaash.wav",
|
||||||
|
"Kah": "kah.wav",
|
||||||
|
"Kalzaduum": "kalzaduum.wav",
|
||||||
|
"Karnak": "karnak.wav",
|
||||||
|
"Kaspar": "kaspar.wav",
|
||||||
|
"Kassie": "kassie.wav",
|
||||||
|
"Keldris": "keldris.wav",
|
||||||
|
"Kelshard": "kelshard.wav",
|
||||||
|
"Kelvesh": "kelvesh.wav",
|
||||||
|
"Kelvin": "kelvin.wav",
|
||||||
|
"Kelwane": "kelwane.wav",
|
||||||
|
"Kev": "kev.wav",
|
||||||
|
"Khaki": "khaki.wav",
|
||||||
|
"Kihee": "kihee.wav",
|
||||||
|
"Kihee-Uust": "kihee_uust.wav",
|
||||||
|
"Kiiri": "kiiri.wav",
|
||||||
|
"Kin": "kin.wav",
|
||||||
|
"Kirri": "kirri.wav",
|
||||||
|
"Kisleth": "kisleth.wav",
|
||||||
|
"Knelt": "knelt.wav",
|
||||||
|
"Knight-Corporal": "knight_corporal.wav",
|
||||||
|
"Knight-Lieutenant": "knight_lieutenant.wav",
|
||||||
|
"Knight-Major": "knight_major.wav",
|
||||||
|
"Knight-Sergeant": "knight_sergeant.wav",
|
||||||
|
"Knighthand": "knighthand.wav",
|
||||||
|
"Knighthood": "knighthood.wav",
|
||||||
|
"Knowin": "knowin.wav",
|
||||||
|
"Kodan": "kodan.wav",
|
||||||
|
"Kor": "kor.wav",
|
||||||
|
"Kor-Roth": "kor_roth.wav",
|
||||||
|
"Kordan": "kordan.wav",
|
||||||
|
"Koreth": "koreth.wav",
|
||||||
|
"Korin": "korin.wav",
|
||||||
|
"Kraelheimgar": "kraelheimgar.wav",
|
||||||
|
"Kraven": "kraven.wav",
|
||||||
|
"Kris": "kris.wav",
|
||||||
|
"Krisleth": "krisleth.wav",
|
||||||
|
"Kronlin": "kronlin.wav",
|
||||||
|
"Kudah": "kudah.wav",
|
||||||
|
"Kuerana": "kuerana.wav",
|
||||||
|
"Kunah": "kunah.wav",
|
||||||
|
"Kwenal": "kwenal.wav",
|
||||||
|
"Kyfurn": "kyfurn.wav",
|
||||||
|
"Kylic": "kylic.wav",
|
||||||
|
"Ladell": "ladell.wav",
|
||||||
|
"Laird": "laird.wav",
|
||||||
|
"Leng": "leng.wav",
|
||||||
|
"Lesik": "lesik.wav",
|
||||||
|
"Lightbinger": "lightbinger.wav",
|
||||||
|
"Lightbrigner": "lightbrigner.wav",
|
||||||
|
"Lightbringer": "lightbringer.wav",
|
||||||
|
"Lightbringers": "lightbringers.wav",
|
||||||
|
"Lightrbinger": "lightrbinger.wav",
|
||||||
|
"Liu": "liu.wav",
|
||||||
|
"Lon": "lon.wav",
|
||||||
|
"Lon-Ell": "lon_ell.wav",
|
||||||
|
"Longsword": "longsword.wav",
|
||||||
|
"Lordship": "lordship.wav",
|
||||||
|
"Lumisha": "lumisha.wav",
|
||||||
|
"Lyceum": "lyceum.wav",
|
||||||
|
"Macabress": "macabress.wav",
|
||||||
|
"Madam": "madam.wav",
|
||||||
|
"Magician": "magician.wav",
|
||||||
|
"Magister": "magister.wav",
|
||||||
|
"Magistry": "magistry.wav",
|
||||||
|
"Magorian": "magorian.wav",
|
||||||
|
"Majesties": "majesties.wav",
|
||||||
|
"Maldrood": "maldrood.wav",
|
||||||
|
"Malrood": "malrood.wav",
|
||||||
|
"Manchu": "manchu.wav",
|
||||||
|
"Marches": "marches.wav",
|
||||||
|
"Marlee": "marlee.wav",
|
||||||
|
"Masta": "masta.wav",
|
||||||
|
"Matriarch": "matriarch.wav",
|
||||||
|
"Matriarchs": "matriarchs.wav",
|
||||||
|
"Meknathar": "meknathar.wav",
|
||||||
|
"Menthal": "menthal.wav",
|
||||||
|
"Ming": "ming.wav",
|
||||||
|
"Minotaur": "minotaur.wav",
|
||||||
|
"Minotaurs": "minotaurs.wav",
|
||||||
|
"Mister": "mister.wav",
|
||||||
|
"Misty": "misty.wav",
|
||||||
|
"Mithral": "mithral.wav",
|
||||||
|
"Mithrin": "mithrin.wav",
|
||||||
|
"Mitral": "mitral.wav",
|
||||||
|
"Mmmm": "mmmm.wav",
|
||||||
|
"Moans": "moans.wav",
|
||||||
|
"Molgol": "molgol.wav",
|
||||||
|
"Monarchy": "monarchy.wav",
|
||||||
|
"Morther": "morther.wav",
|
||||||
|
"Motioning": "motioning.wav",
|
||||||
|
"Mustaches": "mustaches.wav",
|
||||||
|
"Mutters": "mutters.wav",
|
||||||
|
"Mylee": "mylee.wav",
|
||||||
|
"Nahzim": "nahzim.wav",
|
||||||
|
"Nefaleem": "nefaleem.wav",
|
||||||
|
"Nestor": "nestor.wav",
|
||||||
|
"Nesven": "nesven.wav",
|
||||||
|
"Neverthoughtidseeyouprancingaroundwithabunchofelfgirls": "neverthoughtidseeyouprancingaroundwithabunchofelfgirls.wav",
|
||||||
|
"Nijel": "nijel.wav",
|
||||||
|
"Nik": "nik.wav",
|
||||||
|
"Nimbly": "nimbly.wav",
|
||||||
|
"Nimgalad": "nimgalad.wav",
|
||||||
|
"Nirvana": "nirvana.wav",
|
||||||
|
"Noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest": "noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest.wav",
|
||||||
|
"Nollon": "nollon.wav",
|
||||||
|
"Nomadic": "nomadic.wav",
|
||||||
|
"Nook": "nook.wav",
|
||||||
|
"Nurn": "nurn.wav",
|
||||||
|
"Nym": "nym.wav",
|
||||||
|
"Oakheart": "oakheart.wav",
|
||||||
|
"Oakleaf": "oakleaf.wav",
|
||||||
|
"Odie": "odie.wav",
|
||||||
|
"Odo": "odo.wav",
|
||||||
|
"Ododrian": "ododrian.wav",
|
||||||
|
"Odoiran": "odoiran.wav",
|
||||||
|
"Odorain": "odorain.wav",
|
||||||
|
"Odoriain": "odoriain.wav",
|
||||||
|
"Odorian": "odorian.wav",
|
||||||
|
"Odorians": "odorians.wav",
|
||||||
|
"Ody": "ody.wav",
|
||||||
|
"Off-Worlder": "off_worlder.wav",
|
||||||
|
"Ogrin": "ogrin.wav",
|
||||||
|
"Olde": "olde.wav",
|
||||||
|
"Onas": "onas.wav",
|
||||||
|
"Ooo": "ooo.wav",
|
||||||
|
"Oorian": "oorian.wav",
|
||||||
|
"Oranoc": "oranoc.wav",
|
||||||
|
"Orbs": "orbs.wav",
|
||||||
|
"Orehand": "orehand.wav",
|
||||||
|
"Orgrin": "orgrin.wav",
|
||||||
|
"Orin": "orin.wav",
|
||||||
|
"Orkosh": "orkosh.wav",
|
||||||
|
"Oroset": "oroset.wav",
|
||||||
|
"Orson": "orson.wav",
|
||||||
|
"Oslagil": "oslagil.wav",
|
||||||
|
"Overlord": "overlord.wav",
|
||||||
|
"Paladin": "paladin.wav",
|
||||||
|
"Paladin-King": "paladin_king.wav",
|
||||||
|
"Patriarch": "patriarch.wav",
|
||||||
|
"Patriarchs": "patriarchs.wav",
|
||||||
|
"Penance": "penance.wav",
|
||||||
|
"Penelope": "penelope.wav",
|
||||||
|
"Periwinkle": "periwinkle.wav",
|
||||||
|
"Pilgrim": "pilgrim.wav",
|
||||||
|
"Pinnacle": "pinnacle.wav",
|
||||||
|
"Pricilla": "pricilla.wav",
|
||||||
|
"Priestess": "priestess.wav",
|
||||||
|
"Primer": "primer.wav",
|
||||||
|
"Priscilla": "priscilla.wav",
|
||||||
|
"Prologue": "prologue.wav",
|
||||||
|
"Prudent": "prudent.wav",
|
||||||
|
"Quartzhand": "quartzhand.wav",
|
||||||
|
"Racah": "racah.wav",
|
||||||
|
"Rachelle": "rachelle.wav",
|
||||||
|
"Radiant": "radiant.wav",
|
||||||
|
"Rah'Zi": "rah_zi.wav",
|
||||||
|
"Rasheer": "rasheer.wav",
|
||||||
|
"Raslan": "raslan.wav",
|
||||||
|
"Ravenburg": "ravenburg.wav",
|
||||||
|
"Ravenhill": "ravenhill.wav",
|
||||||
|
"Ravensburg": "ravensburg.wav",
|
||||||
|
"Razentia": "razentia.wav",
|
||||||
|
"Realms": "realms.wav",
|
||||||
|
"Redhorn": "redhorn.wav",
|
||||||
|
"Reflexively": "reflexively.wav",
|
||||||
|
"Reinys": "reinys.wav",
|
||||||
|
"Retort": "retort.wav",
|
||||||
|
"Roc": "roc.wav",
|
||||||
|
"Rockport": "rockport.wav",
|
||||||
|
"Rolands": "rolands.wav",
|
||||||
|
"Rolden": "rolden.wav",
|
||||||
|
"Rooks": "rooks.wav",
|
||||||
|
"Roth": "roth.wav",
|
||||||
|
"Rothsholm": "rothsholm.wav",
|
||||||
|
"Rouge": "rouge.wav",
|
||||||
|
"Rustigar": "rustigar.wav",
|
||||||
|
"Sarnel": "sarnel.wav",
|
||||||
|
"Satyrsdai": "satyrsdai.wav",
|
||||||
|
"Scaly": "scaly.wav",
|
||||||
|
"Scepter": "scepter.wav",
|
||||||
|
"Seagull": "seagull.wav",
|
||||||
|
"Sedition": "sedition.wav",
|
||||||
|
"Seeker": "seeker.wav",
|
||||||
|
"Sehlaba": "sehlaba.wav",
|
||||||
|
"Seker": "seker.wav",
|
||||||
|
"Seker-Ankh": "seker_ankh.wav",
|
||||||
|
"Selna": "selna.wav",
|
||||||
|
"Senica": "senica.wav",
|
||||||
|
"Sentinel": "sentinel.wav",
|
||||||
|
"Septuigen": "septuigen.wav",
|
||||||
|
"Sergeant-Major": "sergeant_major.wav",
|
||||||
|
"Serk": "serk.wav",
|
||||||
|
"Sgt": "sgt.wav",
|
||||||
|
"Shadeem": "shadeem.wav",
|
||||||
|
"Shae": "shae.wav",
|
||||||
|
"Shal": "shal.wav",
|
||||||
|
"Shalahz": "shalahz.wav",
|
||||||
|
"Shalaz": "shalaz.wav",
|
||||||
|
"Shalazah": "shalazah.wav",
|
||||||
|
"Shambhu": "shambhu.wav",
|
||||||
|
"Shambu": "shambu.wav",
|
||||||
|
"Shanay": "shanay.wav",
|
||||||
|
"Shatterdawn": "shatterdawn.wav",
|
||||||
|
"Shdeem": "shdeem.wav",
|
||||||
|
"Shelna": "shelna.wav",
|
||||||
|
"Shen": "shen.wav",
|
||||||
|
"Shrouded": "shrouded.wav",
|
||||||
|
"Shyrra": "shyrra.wav",
|
||||||
|
"Sigil": "sigil.wav",
|
||||||
|
"Silverbane": "silverbane.wav",
|
||||||
|
"Silvernote": "silvernote.wav",
|
||||||
|
"Silvervein": "silvervein.wav",
|
||||||
|
"Silverwind": "silverwind.wav",
|
||||||
|
"Sirjif": "sirjif.wav",
|
||||||
|
"Sis": "sis.wav",
|
||||||
|
"Skeptically": "skeptically.wav",
|
||||||
|
"Slagg": "slagg.wav",
|
||||||
|
"Slaver": "slaver.wav",
|
||||||
|
"Slavers": "slavers.wav",
|
||||||
|
"Slick": "slick.wav",
|
||||||
|
"Solstice": "solstice.wav",
|
||||||
|
"Soren": "soren.wav",
|
||||||
|
"Sorrow": "sorrow.wav",
|
||||||
|
"Sosa": "sosa.wav",
|
||||||
|
"Soulseeker": "soulseeker.wav",
|
||||||
|
"Soulsinger": "soulsinger.wav",
|
||||||
|
"Sparks": "sparks.wav",
|
||||||
|
"Spellbooks": "spellbooks.wav",
|
||||||
|
"Spikehorn": "spikehorn.wav",
|
||||||
|
"Stairwell": "stairwell.wav",
|
||||||
|
"Stalker": "stalker.wav",
|
||||||
|
"Stealthy": "stealthy.wav",
|
||||||
|
"Steelaxe": "steelaxe.wav",
|
||||||
|
"Steelclaw": "steelclaw.wav",
|
||||||
|
"Steelhorn": "steelhorn.wav",
|
||||||
|
"Steward": "steward.wav",
|
||||||
|
"Stiletto": "stiletto.wav",
|
||||||
|
"Stonefirger": "stonefirger.wav",
|
||||||
|
"Stoneforger": "stoneforger.wav",
|
||||||
|
"Stonehelm": "stonehelm.wav",
|
||||||
|
"Stonehold": "stonehold.wav",
|
||||||
|
"Stoner": "stoner.wav",
|
||||||
|
"Sunder": "sunder.wav",
|
||||||
|
"Surly": "surly.wav",
|
||||||
|
"Swung": "swung.wav",
|
||||||
|
"Symphonic": "symphonic.wav",
|
||||||
|
"Ta-Lar": "ta_lar.wav",
|
||||||
|
"Taeriel": "taeriel.wav",
|
||||||
|
"Tailor": "tailor.wav",
|
||||||
|
"Talaer": "talaer.wav",
|
||||||
|
"Tallspear": "tallspear.wav",
|
||||||
|
"Targoth": "targoth.wav",
|
||||||
|
"Tarnen": "tarnen.wav",
|
||||||
|
"Tathan": "tathan.wav",
|
||||||
|
"Tavern": "tavern.wav",
|
||||||
|
"Tellin": "tellin.wav",
|
||||||
|
"Thane": "thane.wav",
|
||||||
|
"Thanes": "thanes.wav",
|
||||||
|
"Theocratic": "theocratic.wav",
|
||||||
|
"Therak": "therak.wav",
|
||||||
|
"Therondil": "therondil.wav",
|
||||||
|
"Thorn": "thorn.wav",
|
||||||
|
"Thranis": "thranis.wav",
|
||||||
|
"Throgg": "throgg.wav",
|
||||||
|
"Thunderstrike": "thunderstrike.wav",
|
||||||
|
"Tien": "tien.wav",
|
||||||
|
"Tillborne": "tillborne.wav",
|
||||||
|
"Tinbreaker": "tinbreaker.wav",
|
||||||
|
"Tome": "tome.wav",
|
||||||
|
"Torak": "torak.wav",
|
||||||
|
"Toren": "toren.wav",
|
||||||
|
"Torgath": "torgath.wav",
|
||||||
|
"Torgoth": "torgoth.wav",
|
||||||
|
"Traitor": "traitor.wav",
|
||||||
|
"Triesse": "triesse.wav",
|
||||||
|
"Tumark": "tumark.wav",
|
||||||
|
"Tumbler": "tumbler.wav",
|
||||||
|
"Turcan": "turcan.wav",
|
||||||
|
"Turog": "turog.wav",
|
||||||
|
"Twinsdai": "twinsdai.wav",
|
||||||
|
"Twyleen": "twyleen.wav",
|
||||||
|
"Tyrant": "tyrant.wav",
|
||||||
|
"Udda": "udda.wav",
|
||||||
|
"Uhrn": "uhrn.wav",
|
||||||
|
"Ulagra": "ulagra.wav",
|
||||||
|
"Ulrik": "ulrik.wav",
|
||||||
|
"Umbrin": "umbrin.wav",
|
||||||
|
"Umfray": "umfray.wav",
|
||||||
|
"Undwin": "undwin.wav",
|
||||||
|
"Unison": "unison.wav",
|
||||||
|
"Urhn": "urhn.wav",
|
||||||
|
"Uryna": "uryna.wav",
|
||||||
|
"Uust": "uust.wav",
|
||||||
|
"Vagrant": "vagrant.wav",
|
||||||
|
"Valdarin": "valdarin.wav",
|
||||||
|
"Valeth": "valeth.wav",
|
||||||
|
"Valindar": "valindar.wav",
|
||||||
|
"Valinor": "valinor.wav",
|
||||||
|
"Valis": "valis.wav",
|
||||||
|
"Vanessa": "vanessa.wav",
|
||||||
|
"Varann": "varann.wav",
|
||||||
|
"Varsis": "varsis.wav",
|
||||||
|
"Varu": "varu.wav",
|
||||||
|
"Vedra": "vedra.wav",
|
||||||
|
"Velicia": "velicia.wav",
|
||||||
|
"Velvet": "velvet.wav",
|
||||||
|
"Vendar": "vendar.wav",
|
||||||
|
"Venessa": "venessa.wav",
|
||||||
|
"Vengeance": "vengeance.wav",
|
||||||
|
"Vermin": "vermin.wav",
|
||||||
|
"Verness": "verness.wav",
|
||||||
|
"Verr": "verr.wav",
|
||||||
|
"Verr-": "verr.wav",
|
||||||
|
"Verr-Asses": "verr_asses.wav",
|
||||||
|
"Veya": "veya.wav",
|
||||||
|
"Viscount": "viscount.wav",
|
||||||
|
"Vizier": "vizier.wav",
|
||||||
|
"Vlainor": "vlainor.wav",
|
||||||
|
"Volan": "volan.wav",
|
||||||
|
"Volstan": "volstan.wav",
|
||||||
|
"Vorann": "vorann.wav",
|
||||||
|
"Vorgak": "vorgak.wav",
|
||||||
|
"Vorum": "vorum.wav",
|
||||||
|
"Vuhnalya": "vuhnalya.wav",
|
||||||
|
"Vyn": "vyn.wav",
|
||||||
|
"Wallbreaker": "wallbreaker.wav",
|
||||||
|
"Wanton": "wanton.wav",
|
||||||
|
"Warfrost": "warfrost.wav",
|
||||||
|
"Wargog": "wargog.wav",
|
||||||
|
"Warstar": "warstar.wav",
|
||||||
|
"Warthog": "warthog.wav",
|
||||||
|
"Weaving": "weaving.wav",
|
||||||
|
"Weee": "weee.wav",
|
||||||
|
"Wettstein": "wettstein.wav",
|
||||||
|
"Wh": "wh.wav",
|
||||||
|
"Wha": "wha.wav",
|
||||||
|
"Whatchya": "whatchya.wav",
|
||||||
|
"Wheni": "wheni.wav",
|
||||||
|
"Whitehand": "whitehand.wav",
|
||||||
|
"Whoah": "whoah.wav",
|
||||||
|
"Williamsburg": "williamsburg.wav",
|
||||||
|
"Willowbrook": "willowbrook.wav",
|
||||||
|
"Windrift": "windrift.wav",
|
||||||
|
"Windsdai": "windsdai.wav",
|
||||||
|
"Witchwyrd": "witchwyrd.wav",
|
||||||
|
"Witchwyrds": "witchwyrds.wav",
|
||||||
|
"Wolfclaw": "wolfclaw.wav",
|
||||||
|
"Woodlan": "woodlan.wav",
|
||||||
|
"Woodland": "woodland.wav",
|
||||||
|
"Wooo": "wooo.wav",
|
||||||
|
"Worlder": "worlder.wav",
|
||||||
|
"Wrath": "wrath.wav",
|
||||||
|
"Wuzy": "wuzy.wav",
|
||||||
|
"Wynshorn": "wynshorn.wav",
|
||||||
|
"Wyren": "wyren.wav",
|
||||||
|
"Yahnig": "yahnig.wav",
|
||||||
|
"Yan": "yan.wav",
|
||||||
|
"Yar": "yar.wav",
|
||||||
|
"Yer": "yer.wav",
|
||||||
|
"Yolan": "yolan.wav",
|
||||||
|
"Yoos": "yoos.wav",
|
||||||
|
"Yurik": "yurik.wav",
|
||||||
|
"Zalrek": "zalrek.wav",
|
||||||
|
"Zeb": "zeb.wav",
|
||||||
|
"Zelph": "zelph.wav",
|
||||||
|
"Zha": "zha.wav",
|
||||||
|
"Zhong": "zhong.wav",
|
||||||
|
"Zhong-Goo": "zhong_goo.wav",
|
||||||
|
"Zinger": "zinger.wav",
|
||||||
|
"Zirak": "zirak.wav",
|
||||||
|
"Zurn": "zurn.wav",
|
||||||
|
"Zyzaren": "zyzaren.wav",
|
||||||
|
"Zyzarn": "zyzarn.wav",
|
||||||
|
"Zyzren": "zyzren.wav"
|
||||||
|
}
|
||||||
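The manifest filenames appear to follow a fixed convention: the proper noun lowercased, with any run of non-alphanumeric characters (hyphens, apostrophes) collapsed to a single underscore and stripped from the ends, so "Kihee-Uust" maps to kihee_uust.wav and "Rah'Zi" to rah_zi.wav. A minimal sketch of that rule, inferred from the entries above (the function name is illustrative, not from the repo):

```python
import re

def manifest_filename(proper_noun: str) -> str:
    """Derive the manifest.json .wav filename for a proper noun.

    Assumed rule, inferred from the manifest entries: lowercase, collapse
    runs of non-alphanumeric characters to one underscore, strip edges.
    """
    slug = re.sub(r"[^a-z0-9]+", "_", proper_noun.lower()).strip("_")
    return f"{slug}.wav"

print(manifest_filename("Kihee-Uust"))  # kihee_uust.wav
print(manifest_filename("Rah'Zi"))      # rah_zi.wav
print(manifest_filename("Verr-"))       # verr.wav
```

The edge-stripping step explains why both "Verr" and "Verr-" resolve to the same verr.wav file.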
@ -0,0 +1,20 @@
{
"Anhuil-Elhar": "An-WHEEL AY-Lar",
"Anhuil-Ehlar": "An-WHEEL AY-Lar",
"Aegrir": "Ay-Greer",
"Baras": "BARE-iss",
"Emaleen": "EMMA-lean",
"Eushownava": "You-SHOWN-Eh-Vah",
"Graffel": "Gra-FELL",
"Greathaven": "GREAT-Haven",
"Jaylan": "JAY-Lin",
"Neverthoughtidseeyouprancingaroundwithabunchofelfgirls": "Never thought I'd see you prancing around with a bunch of elf girls",
"Nijel": "NYE-jell",
"Noivebeenhereandtherelookingformykinrumoredtodwellhereinthisforest": "No I've been here and there looking for my kin rumored to dwell here in this forest",
"Odoiran": "Oh-DORIAN",
"Ody": "Oh-Dee",
"Seker-Ankh": "Seker-Ahnk",
"Rasheer": "Raw-SHEAR",
"Valinor": "Vala-nor",
"Varsis": "Ver-Asis"
}
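Each override maps a noun's spelling to a phonetic respelling that is fed to the TTS engine in its place. A sketch of how such overrides might be substituted into chapter text before synthesis (a hypothetical helper, not the repo's actual code): it matches whole words only and tries longer keys first, so a multi-word key like "Anhuil-Elhar" wins over any shorter overlapping key.

```python
import re

def apply_pronunciation_overrides(text: str, overrides: dict[str, str]) -> str:
    """Replace each proper noun with its phonetic respelling before TTS.

    Hypothetical helper: whole-word matches only, longest keys first so
    multi-part names are substituted before their shorter fragments.
    """
    for noun in sorted(overrides, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(noun) + r"(?!\w)"
        text = re.sub(pattern, overrides[noun], text)
    return text

sample = apply_pronunciation_overrides(
    "Nijel waved at Ody.", {"Nijel": "NYE-jell", "Ody": "Oh-Dee"}
)
print(sample)  # NYE-jell waved at Oh-Dee.
```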
File diff suppressed because it is too large
File diff suppressed because it is too large
@ -1,28 +1,35 @@
 {
-"Gadianton Robbers": "Gadeeantun Robbers",
 "Gadianton": "Gadeeantun",
 "Coriantumr": "Coryantomer",
 "Laman": "Layman",
-"Lehi And Nephi": "Leehi And Nephi",
 "Lehi": "Leehi",
-"Lehi Mathonihah": "Leehi Mathonihah",
 "Lehis": "Leehis",
 "Lehies": "Leehis",
 "Liahona": "Leeahona",
-"Moroni": "Morero-ni",
-"Alma": "Al-ma",
 "Gadiantons": "Gadeeantuns",
 "Laban": "Layban",
 "Mosiah": "Moziah",
-"Mosiah The King": "Moziah The King",
 "Nehors": "Kneehores",
-"Samuel The Lamanite": "Samuel The Laymanite",
 "Tarry": "Tarery",
-"The Lamanite Twins": "The Laymanite Twins",
-"The Lamanites Of Ammon": "The Laymanites Of Ammon",
-"The Lamanites Of The Land Of Zarahemla": "The Laymanites Of The Land Of Zarahemla",
-"The Lamanites Of The Land Southward": "The Laymanites Of The Land Southward",
-"The Lamanites Of The People Of Ammon": "The Laymanites Of The People Of Ammon",
-"The Lamb'S Book Of Life": "The Lamb's Book Of Life",
-"The Land Of Nephi": "The Land Of Kneefi"
+"Nephites": "Kneefites",
+"Anti-Nephi-Lehies": "Anti-Kneef-eye-Leehis",
+"Lamanite": "Laymanite",
+"Lamanites": "Laymanites",
+"Lamb'S": "Lamb's",
+"Sarai": "Sa-rye",
+"Telestial": "Tea-lestial",
+"Lord'S": "Lord's",
+"Helaman": "He-la-mun",
+"Alma": "Al-ma",
+"Nephihah": "Kneef-eyehah",
+"Nephihet": "Kneef-eyehet",
+"Nephite": "Kneefight",
+"Nephi-Im": "Kneef-eye-Im",
+"Zenephi": "Ze-kneef-eye",
+"Nephitish": "Kneefight-ish",
+"Moroni": "Moh-roh-nye",
+"Nephi": "Knee-fye",
+"Hagar": "Hag-ar",
+"Oug": "Ohg",
+"Ougan": "Ohgan"
 }
30
output_proper_nouns/visions_glory_canada/manifest.json
Normal file
@ -0,0 +1,30 @@
{
"Adam": "adam.wav",
"Adam-Ondi-Ahman": "adam_ondi_ahman.wav",
"Ahman": "ahman.wav",
"Alma": "alma.wav",
"Apostles": "apostles.wav",
"Brethren": "brethren.wav",
"Cardston": "cardston.wav",
"Ephraim": "ephraim.wav",
"Evolving": "evolving.wav",
"Holies": "holies.wav",
"Israel": "israel.wav",
"Joseph": "joseph.wav",
"Knelt": "knelt.wav",
"Lehi": "lehi.wav",
"Liahona": "liahona.wav",
"Millennium": "millennium.wav",
"Mormon": "mormon.wav",
"Moroni": "moroni.wav",
"Mosiah": "mosiah.wav",
"Nauvoo": "nauvoo.wav",
"Quorum": "quorum.wav",
"Rachael": "rachael.wav",
"Savior": "savior.wav",
"Thummim": "thummim.wav",
"Urim": "urim.wav",
"Vignette": "vignette.wav",
"Zachary": "zachary.wav",
"Zion": "zion.wav"
}
18
projects.json
Normal file
@ -0,0 +1,18 @@
[
{
"name": "Audio Text for Novel Lightbringer",
"source_paths": [
"/home/dillon/_code/voice_model/Audio Text for Novel Lightbringer/Audio Text for Novel Lightbringer.txt"
],
"proper_nouns_output_dir": "output_proper_nouns/audio_text_for_novel_lightbringer",
"proper_nouns_audio_dir": "proper_nouns_audio/audio_text_for_novel_lightbringer"
},
{
"name": "visions glory canada",
"source_paths": [
"/home/dillon/_code/voice_model/Visions of Glory_ Zion in Canada pg 162-193.txt"
],
"proper_nouns_output_dir": "output_proper_nouns/visions_glory_canada",
"proper_nouns_audio_dir": "proper_nouns_audio/visions_glory_canada"
}
]
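projects.json describes one entry per book: a display name, the source text files, and where the proper-noun JSON and audio for that book live. A sketch of loading it and selecting a project by name (hypothetical loader; the field names match the file above):

```python
import json
from pathlib import Path

def load_project(config_path: str, name: str) -> dict:
    """Return the projects.json entry whose "name" field matches.

    Hypothetical loader, not the repo's actual code; field names are
    taken from projects.json ("name", "source_paths",
    "proper_nouns_output_dir", "proper_nouns_audio_dir").
    """
    projects = json.loads(Path(config_path).read_text())
    for project in projects:
        if project["name"] == name:
            return project
    raise KeyError(f"no project named {name!r}")
```

Keeping the per-book directories in this one file is what lets the GUI audit pronunciations for several books without the manifests colliding.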
1345
proper_nouns.txt
File diff suppressed because it is too large
42
run_audiobook.bat
Normal file
@ -0,0 +1,42 @@
@echo off
title Create Audiobook

:: Change to the folder this .bat file lives in
cd /d "%~dp0"

:: Check setup has been run
if not exist .venv\Scripts\python.exe (
    echo ERROR: Setup has not been run yet.
    echo Please double-click setup_windows.bat first.
    pause
    exit /b 1
)

echo ============================================================
echo Audiobook Creator
echo ============================================================
echo.
echo Options:
echo 1 - Generate ALL chapters (may take many hours)
echo 2 - List detected chapters only
echo 3 - Generate a short PREVIEW of each chapter
echo 4 - Generate specific chapters (enter numbers next)
echo.
set /p CHOICE="Enter choice (1/2/3/4): "

if "%CHOICE%"=="1" (
    .venv\Scripts\python create_audiobook_lightbringer.py
) else if "%CHOICE%"=="2" (
    .venv\Scripts\python create_audiobook_lightbringer.py --list
) else if "%CHOICE%"=="3" (
    .venv\Scripts\python create_audiobook_lightbringer.py --preview
) else if "%CHOICE%"=="4" (
    set /p CHAPTERS="Enter chapter numbers separated by spaces (e.g. 0 1 2): "
    call .venv\Scripts\python create_audiobook_lightbringer.py %%CHAPTERS%%
) else (
    echo Invalid choice.
)

echo.
echo Done. Output files are in the output_audiobook_lightbringer folder.
pause
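The menu forwards `--list`, `--preview`, or bare chapter numbers to create_audiobook_lightbringer.py. A sketch of a CLI that would accept exactly those invocations (an assumption about the script's interface, inferred from the menu; the real argument parser may differ):

```python
import argparse

def parse_cli(argv: list[str]) -> argparse.Namespace:
    """Parse the arguments run_audiobook.bat passes to the generator.

    Sketch of the assumed interface: zero or more chapter numbers as
    positionals, plus --list and --preview flags.
    """
    parser = argparse.ArgumentParser(description="Generate audiobook chapters")
    parser.add_argument("chapters", nargs="*", type=int,
                        help="chapter numbers to generate (default: all)")
    parser.add_argument("--list", action="store_true",
                        help="list detected chapters and exit")
    parser.add_argument("--preview", action="store_true",
                        help="generate a short preview of each chapter")
    return parser.parse_args(argv)

args = parse_cli(["--preview", "0", "1", "2"])
print(args.preview, args.chapters)  # True [0, 1, 2]
```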
21
run_gui.bat
Normal file
@ -0,0 +1,21 @@
@echo off
title Proper Noun GUI

:: Change to the folder this .bat file lives in
cd /d "%~dp0"

:: Check setup has been run
if not exist .venv\Scripts\python.exe (
    echo ERROR: Setup has not been run yet.
    echo Please double-click setup_windows.bat first.
    pause
    exit /b 1
)

echo Starting Proper Noun Player GUI...
.venv\Scripts\python gui_proper_noun_player.py
if errorlevel 1 (
    echo.
    echo The application closed with an error. See message above.
    pause
)
93
setup_windows.bat
Normal file
@ -0,0 +1,93 @@
@echo off
setlocal EnableDelayedExpansion
title Audiobook Setup

echo ============================================================
echo Audiobook Setup for Windows 11
echo ============================================================
echo.

:: ── 1. Check Python ──────────────────────────────────────────────────────────
echo [1/5] Checking Python installation...
python --version >nul 2>&1
if errorlevel 1 (
    echo.
    echo ERROR: Python was not found.
    echo.
    echo Please install Python 3.12 from https://www.python.org/downloads/
    echo IMPORTANT: On the installer, tick "Add Python to PATH" before clicking Install.
    echo.
    echo After installing, close this window and double-click setup_windows.bat again.
    pause
    exit /b 1
)

for /f "tokens=2 delims= " %%v in ('python --version 2^>^&1') do set PY_VER=%%v
echo Found Python %PY_VER%
echo.

:: ── 2. Create virtual environment ────────────────────────────────────────────
echo [2/5] Creating virtual environment (.venv)...
if exist .venv (
    echo .venv already exists, skipping creation.
) else (
    python -m venv .venv
    if errorlevel 1 (
        echo ERROR: Failed to create virtual environment.
        pause
        exit /b 1
    )
    echo Virtual environment created.
)
echo.

:: ── 3. Install PyTorch with CUDA (for gaming GPU) ────────────────────────────
echo [3/5] Installing PyTorch with CUDA 12.4 support (this may take a while)...
echo Downloading ~2.5 GB - please be patient.
echo.
.venv\Scripts\pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu124
if errorlevel 1 (
    echo.
    echo WARNING: CUDA build failed. Falling back to CPU-only PyTorch.
    echo Audio generation will be slower but will still work.
    .venv\Scripts\pip install torch
)
echo.

:: ── 4. Install remaining packages ────────────────────────────────────────────
echo [4/5] Installing remaining packages (kokoro, soundfile, sounddevice, spacy, wordfreq)...
.venv\Scripts\pip install -r requirements.txt
if errorlevel 1 (
    echo ERROR: Package installation failed. Check your internet connection.
    pause
    exit /b 1
)

echo Downloading spaCy English language model (en_core_web_sm, ~15 MB)...
.venv\Scripts\python -m spacy download en_core_web_sm
if errorlevel 1 (
    echo WARNING: spaCy model download failed. Proper noun extraction will not work
    echo until you re-run: .venv\Scripts\python -m spacy download en_core_web_sm
)
echo.

:: ── 5. Download the Kokoro TTS model ─────────────────────────────────────────
echo [5/5] Downloading the Kokoro TTS model (hexgrad/Kokoro-82M, ~330 MB)...
echo This only happens once.
echo.
.venv\Scripts\python -c "from kokoro import KPipeline; KPipeline(lang_code='a', repo_id='hexgrad/Kokoro-82M'); print('Model ready.')"
if errorlevel 1 (
    echo.
    echo WARNING: Model download failed. It will retry the first time you run the app.
    echo Make sure you have an internet connection on first launch.
)

echo.
echo ============================================================
echo Setup complete!
echo.
echo To launch the GUI: double-click run_gui.bat
echo To create the audiobook: double-click run_audiobook.bat
echo ============================================================
echo.
pause