Compare commits

5 Commits

SHA1 Message Date
1e02bf32d9 still working on crashes 2026-04-08 01:42:00 -06:00
2406b0a2e7 trying to fix crashes 2026-04-08 01:04:27 -06:00
38ca9cfbad crashing from wav file size i think 2026-04-08 00:48:05 -06:00
56be227245 fixed issues from removing other frontend 2026-04-08 00:02:56 -06:00
e25f8a9b63 removed electron 2026-04-07 23:08:27 -06:00
31 changed files with 718 additions and 525 deletions

.gitignore

@@ -21,9 +21,6 @@ htmlcov/
.idea/
.cursor/
# Submodules (can be cloned separately if needed)
CutScript/
# OS files
.env
.env.local


@@ -6,59 +6,59 @@ Features are grouped by priority. Check off items as they are implemented.
## 🔴 High Priority — Core editing gaps
- [x] **Cut / Mute sections** — select a time range and choose to cut (remove entirely) or mute (silence audio while video continues). Cut sections show as red overlays, mute sections as transparent blue overlays on the timeline over the transcript text and audio waveform. Backend: `ffmpeg -af volume=0` for mute, time-based cutting for removal.
1. [x] **Cut / Mute sections** — select a time range and choose to cut (remove entirely) or mute (silence audio while video continues). Cut sections show as red overlays, mute sections as transparent blue overlays on the timeline over the transcript text and audio waveform. Backend: `ffmpeg -af volume=0` for mute, time-based cutting for removal (a mute-filter sketch follows this list).
- [ ] **Silence / pause trimmer (in progress)** — detect pauses using min duration (ms) + amplitude threshold (dB), then apply detected pauses as cut ranges. Initial endpoint: `/audio/detect-silence`; UI includes filter controls and an "Apply As Cuts" action.
2. [x] **Silence / pause trimmer** — detect pauses using min duration (ms) + amplitude threshold (dB), then apply detected pauses as cut ranges. Initial endpoint: `/audio/detect-silence`; UI includes filter controls and an "Apply As Cuts" action.
- [ ] **Operation-level undo for batch actions** — explicit undo entry for actions like "Apply Silence Trim" so one shortcut/click reverts the whole operation, while still allowing normal fine-grained undo/redo steps.
3. [x] **Operation-level undo for batch actions** — explicit undo entry for actions like "Apply Silence Trim" so one shortcut/click reverts the whole operation, while still allowing normal fine-grained undo/redo steps.
- [ ] **Grouped silence-trim zones (editable batch)** — when pauses are applied, tag them as a batch (`trim_group_id`) so the user can: (1) delete all zones from that auto-trim pass at once, and (2) still select/resize/delete individual zones independently.
4. [ ] **Grouped silence-trim zones (editable batch)** — when pauses are applied, tag them as a batch (`trim_group_id`) so the user can: (1) delete all zones from that auto-trim pass at once, and (2) still select/resize/delete individual zones independently.
- [ ] **Edit silence-trim group settings after apply** — allow reopening a trim group and changing its detection settings (min pause ms, threshold dB, pre/post buffers), then reapplying updates to that group without affecting unrelated edits.
5. [ ] **Edit silence-trim group settings after apply** — allow reopening a trim group and changing its detection settings (min pause ms, threshold dB, pre/post buffers), then reapplying updates to that group without affecting unrelated edits.
- [ ] **Volume / gain control** — per-selection or global audio gain slider. Every editor has this. Descript users constantly complain it's missing. Backend: `ffmpeg -af volume=Xdb`.
6. [ ] **Volume / gain control** — per-selection or global audio gain slider. Every editor has this. Descript users constantly complain it's missing. Backend: `ffmpeg -af volume=Xdb`.
- [ ] **Speed adjustment** — slow down or speed up a selection or the whole clip. Backend: `ffmpeg -filter:v setpts` + `atempo`. Common use case: slightly speed up boring sections.
7. [ ] **Speed adjustment** — slow down or speed up a selection or the whole clip. Backend: `ffmpeg -filter:v setpts` + `atempo`. Common use case: slightly speed up boring sections.
- [ ] **Cut preview** — before committing a delete, play what the audio will sound like with that section removed (pre-listen across the edit point). Pure frontend using Web Audio API — splice the AudioBuffer and play the join.
8. [ ] **Cut preview** — before committing a delete, play what the audio will sound like with that section removed (pre-listen across the edit point). Pure frontend using Web Audio API — splice the AudioBuffer and play the join.
- [ ] **Timeline shows output length** — deleted regions should visually collapse (or show as narrow gaps) so the user sees the *output* duration, not just the source duration.
9. [ ] **Timeline shows output length** — deleted regions should visually collapse (or show as narrow gaps) so the user sees the *output* duration, not just the source duration.
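A minimal sketch of the mute path in item 1, assuming a plain `subprocess` call into ffmpeg (the backend elsewhere uses `ffmpeg-python`, so the real implementation will differ). The filter's timeline option silences only the selected range while the video stream is copied:
```python
import subprocess

def mute_range(src: str, dst: str, start: float, end: float) -> None:
    # Hypothetical helper: silence audio between start/end (seconds).
    # volume=0 is gated by the filter timeline, so audio outside the
    # range is untouched; -c:v copy skips re-encoding the video stream.
    audio_filter = f"volume=0:enable='between(t,{start},{end})'"
    subprocess.run(
        ["ffmpeg", "-y", "-i", src, "-af", audio_filter, "-c:v", "copy", dst],
        check=True,
    )
```
Items 6 and 7 are the same shape with a different filter string: `volume=3dB` for gain, or `setpts=PTS/1.5` plus `atempo=1.5` for speed (`atempo` is typically chained for factors outside 0.5-2.0).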
---
## 🟡 Medium Priority — Widely expected features
- [ ] **Transcript search (Ctrl+F)** — find words/phrases in the transcript and highlight matches. Pure frontend. Critical for long-form content. Jump between matches with Enter.
10. [ ] **Transcript search (Ctrl+F)** — find words/phrases in the transcript and highlight matches. Pure frontend. Critical for long-form content. Jump between matches with Enter.
- [ ] **Mark In / Out + delete (I / O keys)** — keyboard shortcuts to mark a time range on the timeline, then delete it. Faster than click-dragging words. Store the in/out points in state, `Delete` removes them.
11. [ ] **Mark In / Out + delete (I / O keys)** — keyboard shortcuts to mark a time range on the timeline, then delete it. Faster than click-dragging words. Store the in/out points in state, `Delete` removes them.
- [ ] **Low-confidence word highlighting** — WhisperX already returns `confidence` per word. Words below a threshold (e.g. < 0.6) should be visually underlined or tinted so the user knows where to double-check.
12. [ ] **Low-confidence word highlighting** — WhisperX already returns `confidence` per word. Words below a threshold (e.g. < 0.6) should be visually underlined or tinted so the user knows where to double-check.
- [ ] **Re-transcribe selection** — if Whisper gets a section wrong, let the user select a word range and re-run transcription on just that segment (optionally with a different model or language).
13. [ ] **Re-transcribe selection** — if Whisper gets a section wrong, let the user select a word range and re-run transcription on just that segment (optionally with a different model or language).
- [ ] **Word text correction** — allow editing the transcript text of a word without affecting its timing. Whisper gets homophones/proper nouns wrong constantly. Pure frontend state change; no backend needed.
14. [ ] **Word text correction** — allow editing the transcript text of a word without affecting its timing. Whisper gets homophones/proper nouns wrong constantly. Pure frontend state change; no backend needed.
- [ ] **Named timeline markers** — drop named marker pins on the waveform (like Resolve markers). Store as `{ id, time, label, color }` in the project. Rendered as colored triangles on the timeline canvas.
15. [ ] **Named timeline markers** — drop named marker pins on the waveform (like Resolve markers). Store as `{ id, time, label, color }` in the project. Rendered as colored triangles on the timeline canvas.
- [ ] **Chapters** — group markers into named chapter ranges. Useful for podcasts and lectures. Exportable as YouTube chapter timestamps in the description.
16. [ ] **Chapters** — group markers into named chapter ranges. Useful for podcasts and lectures. Exportable as YouTube chapter timestamps in the description (see the formatting sketch after this list).
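Item 16's YouTube export is mostly string formatting over the marker shape `{ id, time, label, color }` from item 15. A sketch in Python (hypothetical helper; YouTube additionally wants ascending stamps with the first chapter at 00:00):
```python
def to_youtube_chapters(markers: list[dict]) -> str:
    # Hypothetical formatter over {time: seconds, label: str} markers.
    lines = []
    for mk in sorted(markers, key=lambda mk: mk["time"]):
        total = int(mk["time"])
        h, rem = divmod(total, 3600)
        mins, secs = divmod(rem, 60)
        stamp = f"{h}:{mins:02d}:{secs:02d}" if h else f"{mins:02d}:{secs:02d}"
        lines.append(f"{stamp} {mk['label']}")
    return "\n".join(lines)

# to_youtube_chapters([{"time": 0, "label": "Intro"}, {"time": 95, "label": "Demo"}])
# -> "00:00 Intro\n01:35 Demo"
```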
---
## 🟢 Lower Priority — Differentiating / power features
- [ ] **Audio normalization / loudness targeting** — single "Normalize" button that targets a LUFS level (-14 for YouTube, -16 for Spotify). Backend: `ffmpeg -af loudnorm`. Very high value for podcasters, ~2-3 hours of work.
17. [ ] **Audio normalization / loudness targeting** — single "Normalize" button that targets a LUFS level (-14 for YouTube, -16 for Spotify). Backend: `ffmpeg -af loudnorm`. Very high value for podcasters, ~2-3 hours of work (one-pass `loudnorm` sketch after this list).
- [ ] **Background music track** — a second audio track for background music with volume ducking. Major gap in Descript that TalkEdit could own. Backend: `ffmpeg` amix + `asendcmd` for auto-ducking.
18. [ ] **Background music track** — a second audio track for background music with volume ducking. Major gap in Descript that TalkEdit could own. Backend: `ffmpeg` amix + `asendcmd` for auto-ducking.
- [ ] **Video zoom / punch-in** — scale and position the video (crop, zoom, pan). Used constantly on talking-head videos for emphasis. Backend: `ffmpeg -vf crop/scale/zoompan`.
19. [ ] **Video zoom / punch-in** — scale and position the video (crop, zoom, pan). Used constantly on talking-head videos for emphasis. Backend: `ffmpeg -vf crop/scale/zoompan`.
- [ ] **Multi-clip / append** — load a second video and append it to the timeline. Even without a full multi-track timeline, "append clip" is a heavily used workflow.
20. [ ] **Multi-clip / append** — load a second video and append it to the timeline. Even without a full multi-track timeline, "append clip" is a heavily used workflow.
- [ ] **Clip thumbnail strip** — video frame thumbnails along the timeline so users can navigate visually, not only by waveform. Backend: `ffmpeg` thumbnail extraction at regular intervals.
21. [ ] **Clip thumbnail strip** — video frame thumbnails along the timeline so users can navigate visually, not only by waveform. Backend: `ffmpeg` thumbnail extraction at regular intervals.
- [ ] **Batch silence removal** — full-file scan + remove all pauses above threshold in one click. Distinct from the manual trimmer above; this is a "fix the whole file" operation.
22. [ ] **Batch silence removal** — full-file scan + remove all pauses above threshold in one click. Distinct from the manual trimmer above; this is a "fix the whole file" operation.
- [ ] **Export to transcript text / SRT only** — some users just want a clean `.txt` or `.srt` of the edited transcript without rendering video.
23. [ ] **Export to transcript text / SRT only** — some users just want a clean `.txt` or `.srt` of the edited transcript without rendering video.
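For item 17, the one-pass form of `loudnorm` is a single filter invocation. A sketch assuming `subprocess`; a two-pass run (measure first, then correct with the measured values) is more accurate but slower:
```python
import subprocess

def normalize_loudness(src: str, dst: str, target_lufs: float = -14.0) -> None:
    # Hypothetical helper. I = integrated loudness target (LUFS),
    # TP = true-peak ceiling (dBTP), LRA = allowed loudness range (LU).
    audio_filter = f"loudnorm=I={target_lufs}:TP=-1.5:LRA=11"
    subprocess.run(["ffmpeg", "-y", "-i", src, "-af", audio_filter, dst], check=True)

# normalize_loudness("podcast.wav", "podcast_yt.wav")        # YouTube target
# normalize_loudness("podcast.wav", "podcast_sp.wav", -16.0)  # Spotify target
```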
---
@@ -66,28 +66,28 @@ Features are grouped by priority. Check off items as they are implemented.
These aren't features to build; they're things to make more visible in the UI and README:
- **100% offline / no account required** — CapCut requires login and sends data to servers. Descript is cloud-first. TalkEdit never leaves the machine.
- **Local AI models** — Ollama support means no API costs and no data leaving the device.
- **Word-level precision** — editing by deleting words (not dragging razor cuts) is faster for talking-head content than any timeline-based editor.
- **Works on long files** — virtualized transcript + chunked waveform handles 1hr+ content that bogs down CapCut.
24. **100% offline / no account required** — CapCut requires login and sends data to servers. Descript is cloud-first. TalkEdit never leaves the machine.
25. **Local AI models** — Ollama support means no API costs and no data leaving the device.
26. **Word-level precision** — editing by deleting words (not dragging razor cuts) is faster for talking-head content than any timeline-based editor.
27. **Works on long files** — virtualized transcript + chunked waveform handles 1hr+ content that bogs down CapCut.
---
## ✅ Already Implemented
- Word-level transcript editing (select, drag, shift-click, delete)
- Ctrl+click word to seek timeline to that position
- Waveform timeline with zoom (Ctrl+scroll), scroll, drag-to-scrub playhead
- Auto-scroll waveform when playhead goes off-screen
- AI filler word detection and removal (Ollama / OpenAI / Claude)
- AI clip suggestions for social media
- Noise reduction (DeepFilterNet or FFmpeg ANLMDN)
- Export: fast stream-copy or full reencode (MP4/MOV/WebM, 720p/1080p/4K)
- Captions: SRT, VTT, ASS burn-in with font/color/position options
- Speaker diarization
- Project save / load (.aive JSON format)
- Undo / redo (100-level history via Zundo)
- Multi-format input (MP4, MKV, MOV, AVI, WebM, M4A)
- Keyboard shortcuts (Space, J/K/L, arrows, Ctrl+Z/Shift+Z, Ctrl+S, Ctrl+E)
- Settings panel: AI provider config (Ollama, OpenAI, Claude)
- Cut/mute range creation on timeline with draggable zone edits and Delete-to-remove
28. Word-level transcript editing (select, drag, shift-click, delete)
29. Ctrl+click word to seek timeline to that position
30. Waveform timeline with zoom (Ctrl+scroll), scroll, drag-to-scrub playhead
31. Auto-scroll waveform when playhead goes off-screen
32. AI filler word detection and removal (Ollama / OpenAI / Claude)
33. AI clip suggestions for social media
34. Noise reduction (DeepFilterNet or FFmpeg ANLMDN)
35. Export: fast stream-copy or full reencode (MP4/MOV/WebM, 720p/1080p/4K)
36. Captions: SRT, VTT, ASS burn-in with font/color/position options
37. Speaker diarization
38. Project save / load (.aive JSON format)
39. Undo / redo (100-level history via Zundo)
40. Multi-format input (MP4, MKV, MOV, AVI, WebM, M4A)
41. Keyboard shortcuts (Space, J/K/L, arrows, Ctrl+Z/Shift+Z, Ctrl+S, Ctrl+E)
42. Settings panel: AI provider config (Ollama, OpenAI, Claude)
43. Cut/mute range creation on timeline with draggable zone edits and Delete-to-remove


@@ -1,4 +1,4 @@
# CutScript
# TalkEdit
An open-source, local-first, Descript-like text-based audio and video editor powered by AI. Edit audio/video by editing text — delete a word from the transcript and it's cut from the audio/video.
@@ -7,7 +7,7 @@ An open-source, local-first, Descript-like text-based audio and video editor pow
## Architecture
- **Electron + React** desktop app with Tailwind CSS
- **Tauri + React** desktop app with Tailwind CSS
- **FastAPI** Python backend (spawned as child process)
- **WhisperX** for word-level transcription with alignment
- **FFmpeg** for video processing (stream-copy and re-encode)
@@ -25,7 +25,7 @@ An open-source, local-first, Descript-like text-based audio and video editor pow
### Install
```bash
# Root dependencies (Electron, concurrently)
# Root scripts
npm install
# Frontend dependencies (React, Tailwind, Zustand)
@@ -37,32 +37,46 @@ cd backend && pip install -r requirements.txt && cd ..
### Run (Development)
Set a custom backend port once (optional):
```bash
# Start all three (backend + frontend + electron)
export BACKEND_PORT=8000
```
If you run the frontend separately, you can also set:
```bash
export VITE_BACKEND_PORT=$BACKEND_PORT
```
```bash
# Start frontend in browser
npm run dev
# Or start the full desktop app (backend + tauri)
npm run dev:tauri
```
Or run them separately:
```bash
# Terminal 1: Backend
cd backend && python -m uvicorn main:app --reload --port 8642
cd backend && python -m uvicorn main:app --reload --port 8000
# Terminal 2: Frontend
cd frontend && npm run dev
# Terminal 3: Electron
npx electron .
# Terminal 3: Tauri app shell
cd frontend && cargo tauri dev
```
## Project Structure
```
cutscript/
├── electron/ # Electron main process
│ ├── main.js # App entry, spawns Python backend
│ ├── preload.js # Secure IPC bridge
│ └── python-bridge.js
talkedit/
├── src-tauri/ # Tauri Rust host
│ ├── src/main.rs # App entry and backend orchestration
│ └── tauri.conf.json
├── frontend/ # React + Vite + Tailwind
│ └── src/
│ ├── components/ # VideoPlayer, TranscriptEditor, etc.
@@ -97,7 +111,7 @@ cutscript/
| Speaker diarization | Done |
| Virtualized transcript (react-virtuoso) | Done |
| Encrypted API key storage | Done |
| Project save/load (.cutscript) | Done |
| Project save/load (.aive) | Done |
| AI background removal | Planned |
## Keyboard Shortcuts


@@ -2,20 +2,25 @@ import logging
import os
import stat
import sys
import tempfile
from contextlib import asynccontextmanager
from pathlib import Path
from fastapi import FastAPI, Query, Request, HTTPException
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from fastapi.responses import StreamingResponse, FileResponse
import ffmpeg
from routers import transcribe, export, ai, captions, audio
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
# Dev log file — frontend forwards console.error/warn here so the agent can read it
DEV_LOG_PATH = Path(__file__).parent.parent / "webview.log"
# Dev log file — keep outside workspace to avoid dev watcher reload loops.
DEV_LOG_PATH = Path(
os.environ.get("TALKEDIT_DEV_LOG_PATH", str(Path(tempfile.gettempdir()) / "talkedit-webview.log"))
)
@asynccontextmanager
@@ -61,20 +66,69 @@ MIME_MAP = {
@app.get("/file")
async def serve_local_file(request: Request, path: str = Query(...)):
"""Stream a local file with HTTP Range support (required for video seeking)."""
async def serve_local_file(request: Request, path: str = Query(...), format: str = Query(None)):
"""Stream a local file with HTTP Range support (required for video seeking).
Optionally transcode audio files to MP3 for better browser compatibility."""
file_path = Path(path)
if not file_path.is_file():
logger.warning(f"[serve_file] File not found: {path}")
raise HTTPException(status_code=404, detail=f"File not found: {path}")
file_size = file_path.stat().st_size
content_type = MIME_MAP.get(file_path.suffix.lower(), "application/octet-stream")
original_ext = file_path.suffix.lower()
# Check if we should transcode this file
should_transcode = (
original_ext == '.wav' and
(format == 'mp3' or file_size > 10 * 1024 * 1024) # Transcode WAV if > 10MB or explicitly requested
)
if should_transcode:
logger.info(f"[serve_file] Transcoding {file_path.name} to MP3 (size: {file_size})")
# Create cache directory
cache_dir = Path(__file__).parent / "cache"
cache_dir.mkdir(exist_ok=True)
# Create cache filename
import hashlib
file_hash = hashlib.md5(str(file_path).encode()).hexdigest()
cache_path = cache_dir / f"{file_hash}.mp3"
# Check if cached version exists
if not cache_path.exists():
logger.info(f"[serve_file] Creating cached MP3: {cache_path}")
try:
# Transcode to MP3 using ffmpeg
(
ffmpeg
.input(str(file_path))
.output(str(cache_path), acodec='libmp3lame', ab='128k')
.run(overwrite_output=True, quiet=True)
)
except ffmpeg.Error as e:
logger.error(f"[serve_file] Transcoding failed: {e}")
# Fall back to original file
cache_path = file_path
else:
logger.info(f"[serve_file] Transcoding completed: {cache_path}")
else:
logger.info(f"[serve_file] Using cached MP3: {cache_path}")
# Use the transcoded file
file_path = cache_path
file_size = file_path.stat().st_size
content_type = "audio/mpeg"
else:
content_type = MIME_MAP.get(original_ext, "application/octet-stream")
range_header = request.headers.get("range")
logger.info(
f"[serve_file] {file_path.name} | size={file_size} | "
f"type={content_type} | range={range_header or 'none'}"
f"[serve_file] Serving {file_path.name} | size={file_size} | "
f"type={content_type} | range={range_header or 'none'} | transcoded={should_transcode}"
)
if content_type == "application/octet-stream":
@@ -153,6 +207,7 @@ async def dev_log(request: Request):
if args:
line += " " + " ".join(args)
line += "\n"
DEV_LOG_PATH.parent.mkdir(parents=True, exist_ok=True)
with open(DEV_LOG_PATH, "a") as f:
f.write(line)
return {"ok": True}
return {"ok": True, "path": str(DEV_LOG_PATH)}


@@ -118,7 +118,7 @@ async def export_video(req: ExportRequest):
# Audio enhancement: clean, then mux back into the exported video
if req.enhanceAudio:
try:
tmp_dir = tempfile.mkdtemp(prefix="cutscript_audio_")
tmp_dir = tempfile.mkdtemp(prefix="talkedit_audio_")
cleaned_audio = os.path.join(tmp_dir, "cleaned.wav")
clean_audio(output, cleaned_audio)
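The `enhanceAudio` branch writes the cleaned track to `cleaned.wav`; the "mux back" step the comment describes is a two-input stream map in ffmpeg. A sketch with a hypothetical helper name, stream-copying video and re-encoding the cleaned audio to AAC:
```python
import subprocess

def mux_cleaned_audio(video_path: str, cleaned_wav: str, dst: str) -> None:
    # Take video from input 0 and audio from input 1 (the cleaned WAV).
    subprocess.run(
        ["ffmpeg", "-y", "-i", video_path, "-i", cleaned_wav,
         "-map", "0:v:0", "-map", "1:a:0",
         "-c:v", "copy", "-c:a", "aac", "-shortest", dst],
        check=True,
    )
```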

close

@@ -1,7 +1,9 @@
#!/bin/bash
# Close TalkEdit and/or CutScript processes (Tauri dev, Electron, and Python backends)
# Close TalkEdit processes (Tauri dev and Python backend)
KILLED_ANY=0
BACKEND_PORT="${BACKEND_PORT:-8000}"
FRONTEND_PORT="${FRONTEND_PORT:-5173}"
kill_port() {
local port=$1
@@ -9,7 +11,7 @@ kill_port() {
local pids
pids=$(lsof -ti tcp:"$port" 2>/dev/null)
if [[ -n "$pids" ]]; then
echo "Stopping $name backend (port $port, PID $pids)..."
echo "Stopping $name (port $port, PID $pids)..."
kill "$pids" 2>/dev/null
KILLED_ANY=1
fi
@@ -27,23 +29,18 @@ kill_pattern() {
fi
}
# --- TalkEdit (Tauri, port 8000) ---
kill_port 8000 "TalkEdit"
# --- TalkEdit (Tauri) ---
kill_port "$BACKEND_PORT" "TalkEdit"
kill_pattern "tauri.*TalkEdit\|TalkEdit.*tauri\|cargo.*tauri dev\|/TalkEdit/target/debug" "TalkEdit (Tauri dev)"
# Vite dev server for TalkEdit (port 5173)
kill_pattern "vite.*5173\|rsbuild.*5173" "TalkEdit frontend dev server"
# Frontend dev server: first kill by listening port, then by known process patterns.
kill_port "$FRONTEND_PORT" "TalkEdit frontend"
kill_pattern "vite\|rsbuild\|npm.*run dev\|pnpm.*dev\|yarn.*dev" "TalkEdit frontend dev server"
# --- CutScript (Electron, port 8642) ---
kill_port 8642 "CutScript"
kill_pattern "electron.*CutScript\|CutScript.*electron" "CutScript (Electron)"
kill_pattern "vite.*CutScript\|CutScript.*vite" "CutScript frontend dev server"
# --- Orphaned uvicorn workers for either app ---
kill_pattern "uvicorn.*main:app.*--port 800[012]" "leftover uvicorn workers (TalkEdit)"
kill_pattern "uvicorn.*main:app.*--port 864" "leftover uvicorn workers (CutScript)"
# --- Orphaned uvicorn workers for TalkEdit ---
kill_pattern "uvicorn.*main:app.*--port ${BACKEND_PORT}" "leftover uvicorn workers (TalkEdit)"
if [[ $KILLED_ANY -eq 0 ]]; then
echo "Nothing to close — no TalkEdit or CutScript processes found."
echo "Nothing to close — no TalkEdit processes found."
else
echo "Done."
fi


@@ -1,131 +0,0 @@
const { app, BrowserWindow, ipcMain, dialog, safeStorage } = require('electron');
const path = require('path');
const { PythonBackend } = require('./python-bridge');
let mainWindow = null;
let pythonBackend = null;
const isDev = !app.isPackaged;
const BACKEND_PORT = 8642;
function createWindow() {
mainWindow = new BrowserWindow({
width: 1400,
height: 900,
minWidth: 1024,
minHeight: 700,
title: 'CutScript',
webPreferences: {
preload: path.join(__dirname, 'preload.js'),
contextIsolation: true,
nodeIntegration: false,
webSecurity: isDev ? false : true,
},
show: false,
});
if (isDev) {
mainWindow.loadURL('http://localhost:5173');
mainWindow.webContents.openDevTools();
} else {
mainWindow.loadFile(path.join(__dirname, '..', 'frontend', 'dist', 'index.html'));
}
mainWindow.once('ready-to-show', () => {
mainWindow.show();
});
mainWindow.on('closed', () => {
mainWindow = null;
});
}
app.whenReady().then(async () => {
pythonBackend = new PythonBackend(BACKEND_PORT, isDev);
await pythonBackend.start();
createWindow();
app.on('activate', () => {
if (BrowserWindow.getAllWindows().length === 0) {
createWindow();
}
});
});
app.on('window-all-closed', () => {
if (process.platform !== 'darwin') {
app.quit();
}
});
app.on('before-quit', () => {
if (pythonBackend) {
pythonBackend.stop();
}
});
// IPC Handlers
ipcMain.handle('dialog:openFile', async (_event, options) => {
const result = await dialog.showOpenDialog(mainWindow, {
properties: ['openFile'],
filters: [
{ name: 'Video Files', extensions: ['mp4', 'avi', 'mov', 'mkv', 'webm'] },
{ name: 'Audio Files', extensions: ['m4a', 'wav', 'mp3', 'flac'] },
{ name: 'All Files', extensions: ['*'] },
],
...options,
});
return result.canceled ? null : result.filePaths[0];
});
ipcMain.handle('dialog:saveFile', async (_event, options) => {
const result = await dialog.showSaveDialog(mainWindow, {
filters: [
{ name: 'Video Files', extensions: ['mp4', 'mov', 'webm'] },
{ name: 'Project Files', extensions: ['aive'] },
],
...options,
});
return result.canceled ? null : result.filePath;
});
ipcMain.handle('dialog:openProject', async () => {
const result = await dialog.showOpenDialog(mainWindow, {
properties: ['openFile'],
filters: [
{ name: 'AI Video Editor Project', extensions: ['aive'] },
],
});
return result.canceled ? null : result.filePaths[0];
});
ipcMain.handle('safe-storage:encrypt', (_event, data) => {
if (safeStorage.isEncryptionAvailable()) {
return safeStorage.encryptString(data).toString('base64');
}
return data;
});
ipcMain.handle('safe-storage:decrypt', (_event, encrypted) => {
if (safeStorage.isEncryptionAvailable()) {
return safeStorage.decryptString(Buffer.from(encrypted, 'base64'));
}
return encrypted;
});
ipcMain.handle('get-backend-url', () => {
return `http://localhost:${BACKEND_PORT}`;
});
ipcMain.handle('fs:readFile', async (_event, filePath) => {
const fs = require('fs');
return fs.readFileSync(filePath, 'utf-8');
});
ipcMain.handle('fs:writeFile', async (_event, filePath, content) => {
const fs = require('fs');
fs.writeFileSync(filePath, content, 'utf-8');
return true;
});


@@ -1,12 +0,0 @@
const { contextBridge, ipcRenderer } = require('electron');
contextBridge.exposeInMainWorld('electronAPI', {
openFile: (options) => ipcRenderer.invoke('dialog:openFile', options),
saveFile: (options) => ipcRenderer.invoke('dialog:saveFile', options),
openProject: () => ipcRenderer.invoke('dialog:openProject'),
getBackendUrl: () => ipcRenderer.invoke('get-backend-url'),
encryptString: (data) => ipcRenderer.invoke('safe-storage:encrypt', data),
decryptString: (encrypted) => ipcRenderer.invoke('safe-storage:decrypt', encrypted),
readFile: (path) => ipcRenderer.invoke('fs:readFile', path),
writeFile: (path, content) => ipcRenderer.invoke('fs:writeFile', path, content),
});


@@ -1,105 +0,0 @@
const { spawn } = require('child_process');
const path = require('path');
const http = require('http');
class PythonBackend {
constructor(port, isDev) {
this.port = port;
this.isDev = isDev;
this.process = null;
}
async start() {
// In dev mode, check if a backend is already running (e.g. from `npm run dev:backend`)
// If so, reuse it instead of spawning a duplicate.
if (this.isDev) {
const alreadyRunning = await this._isPortOpen(2000);
if (alreadyRunning) {
console.log(`[backend] Dev backend already running on port ${this.port} — reusing it.`);
return;
}
}
const backendDir = this.isDev
? path.join(__dirname, '..', 'backend')
: path.join(process.resourcesPath, 'backend');
const pythonCmd = process.platform === 'win32' ? 'python' : '/home/dillon/.pyenv/versions/3.11.15/bin/python';
this.process = spawn(pythonCmd, [
'-m', 'uvicorn', 'main:app',
'--host', '127.0.0.1',
'--port', String(this.port),
], {
cwd: backendDir,
stdio: ['pipe', 'pipe', 'pipe'],
env: { ...process.env, PYTHONUNBUFFERED: '1' },
});
this.process.stdout.on('data', (data) => {
console.log(`[backend] ${data.toString().trim()}`);
});
this.process.stderr.on('data', (data) => {
console.error(`[backend] ${data.toString().trim()}`);
});
this.process.on('error', (err) => {
console.error('[backend] Failed to start Python backend:', err.message);
});
this.process.on('exit', (code) => {
console.log(`[backend] Process exited with code ${code}`);
this.process = null;
});
await this._waitForReady(30000);
console.log(`[backend] Ready on port ${this.port}`);
}
_isPortOpen(timeoutMs) {
return new Promise((resolve) => {
const req = http.get(`http://127.0.0.1:${this.port}/health`, (res) => {
resolve(res.statusCode === 200);
});
req.on('error', () => resolve(false));
req.setTimeout(timeoutMs, () => { req.destroy(); resolve(false); });
req.end();
});
}
stop() {
if (this.process) {
if (process.platform === 'win32') {
spawn('taskkill', ['/pid', String(this.process.pid), '/f', '/t']);
} else {
this.process.kill('SIGTERM');
}
this.process = null;
}
}
_waitForReady(timeoutMs) {
const startTime = Date.now();
return new Promise((resolve, reject) => {
const check = () => {
if (Date.now() - startTime > timeoutMs) {
reject(new Error('Backend startup timed out'));
return;
}
const req = http.get(`http://127.0.0.1:${this.port}/health`, (res) => {
if (res.statusCode === 200) {
resolve();
} else {
setTimeout(check, 500);
}
});
req.on('error', () => setTimeout(check, 500));
req.end();
};
setTimeout(check, 1000);
});
}
}
module.exports = { PythonBackend };


@@ -3,10 +3,7 @@
<head>
<meta charset="UTF-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta http-equiv="Content-Security-Policy" content="default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline' https://fonts.googleapis.com; font-src 'self' data: https://fonts.gstatic.com; connect-src 'self' ipc: http://ipc.localhost http://localhost:* http://127.0.0.1:* ws://localhost:* ws://127.0.0.1:*; media-src 'self' file: blob: http://localhost:* http://127.0.0.1:*; img-src 'self' data: blob:;" />
<link rel="preconnect" href="https://fonts.googleapis.com" />
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin />
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@300;400;500;600;700&family=JetBrains+Mono:wght@400;500&display=swap" rel="stylesheet" />
<meta http-equiv="Content-Security-Policy" content="default-src 'self'; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; font-src 'self' data:; connect-src 'self' ipc: http://ipc.localhost http://localhost:* http://127.0.0.1:* ws://localhost:* ws://127.0.0.1:*; media-src 'self' file: blob: http://localhost:* http://127.0.0.1:*; img-src 'self' data: blob:;" />
<title>TalkEdit</title>
</head>
<body class="bg-editor-bg text-editor-text antialiased">


@@ -8,6 +8,8 @@
"name": "talkedit-frontend",
"version": "0.1.0",
"dependencies": {
"@fontsource/inter": "^5.2.8",
"@fontsource/jetbrains-mono": "^5.2.8",
"@tauri-apps/api": "^2",
"@tauri-apps/plugin-dialog": "^2",
"@tauri-apps/plugin-fs": "^2",
@@ -768,6 +770,24 @@
"node": ">=18"
}
},
"node_modules/@fontsource/inter": {
"version": "5.2.8",
"resolved": "https://registry.npmjs.org/@fontsource/inter/-/inter-5.2.8.tgz",
"integrity": "sha512-P6r5WnJoKiNVV+zvW2xM13gNdFhAEpQ9dQJHt3naLvfg+LkF2ldgSLiF4T41lf1SQCM9QmkqPTn4TH568IRagg==",
"license": "OFL-1.1",
"funding": {
"url": "https://github.com/sponsors/ayuhito"
}
},
"node_modules/@fontsource/jetbrains-mono": {
"version": "5.2.8",
"resolved": "https://registry.npmjs.org/@fontsource/jetbrains-mono/-/jetbrains-mono-5.2.8.tgz",
"integrity": "sha512-6w8/SG4kqvIMu7xd7wt6x3idn1Qux3p9N62s6G3rfldOUYHpWcc2FKrqf+Vo44jRvqWj2oAtTHrZXEP23oSKwQ==",
"license": "OFL-1.1",
"funding": {
"url": "https://github.com/sponsors/ayuhito"
}
},
"node_modules/@jridgewell/gen-mapping": {
"version": "0.3.13",
"resolved": "https://registry.npmjs.org/@jridgewell/gen-mapping/-/gen-mapping-0.3.13.tgz",


@@ -10,6 +10,8 @@
"preview": "vite preview"
},
"dependencies": {
"@fontsource/inter": "^5.2.8",
"@fontsource/jetbrains-mono": "^5.2.8",
"@tauri-apps/api": "^2",
"@tauri-apps/plugin-dialog": "^2",
"@tauri-apps/plugin-fs": "^2",


@@ -22,7 +22,7 @@ import {
VolumeX,
} from 'lucide-react';
const IS_ELECTRON = !!window.electronAPI;
const IS_DESKTOP = !!window.desktopAPI;
type Panel = 'ai' | 'settings' | 'export' | 'silence' | null;
@@ -49,6 +49,7 @@ export default function App() {
const [cutMode, setCutMode] = useState(false);
const [muteMode, setMuteMode] = useState(false);
const fileInputRef = useRef<HTMLInputElement>(null);
const lastVideoPathRef = useRef<string | null>(null);
useKeyboardShortcuts();
@@ -66,21 +67,52 @@
}, []);
useEffect(() => {
if (IS_ELECTRON) {
window.electronAPI!.getBackendUrl().then(setBackendUrl);
if (IS_DESKTOP) {
window.desktopAPI!.getBackendUrl().then(setBackendUrl);
}
// In Tauri on Linux/WebKit2GTK the ipc:// custom protocol is blocked by
// WebKit internals; postMessage fallback works but logs noisy warnings.
// The backend URL is fixed at 127.0.0.1:8000 so we rely on the store default.
}, [setBackendUrl]);
useEffect(() => {
if (!import.meta.env.DEV) return;
const previousVideoPath = lastVideoPathRef.current;
if (previousVideoPath !== videoPath) {
console.log('[app-state] videoPath transition', {
from: previousVideoPath,
to: videoPath,
wordCount: words.length,
isTranscribing,
});
if (previousVideoPath && !videoPath) {
console.warn('[app-state] videoPath cleared and UI will show welcome screen', {
previousVideoPath,
wordCount: words.length,
isTranscribing,
});
}
lastVideoPathRef.current = videoPath;
}
}, [videoPath, words.length, isTranscribing]);
const handleLoadProject = async () => {
if (!IS_ELECTRON) return;
if (!IS_DESKTOP) return;
try {
const projectPath = await window.electronAPI!.openProject();
if (import.meta.env.DEV) console.log('[app-action] loadProject:dialogOpen');
const projectPath = await window.desktopAPI!.openProject();
if (import.meta.env.DEV) console.log('[app-action] loadProject:dialogResult', { projectPath });
if (!projectPath) return;
const content = await window.electronAPI!.readFile(projectPath);
const content = await window.desktopAPI!.readFile(projectPath);
const data = JSON.parse(content);
if (import.meta.env.DEV) {
console.log('[app-action] loadProject:parsed', {
projectPath,
videoPath: data?.videoPath,
words: Array.isArray(data?.words) ? data.words.length : null,
segments: Array.isArray(data?.segments) ? data.segments.length : null,
});
}
useEditorStore.getState().loadProject(data);
} catch (err) {
console.error('Failed to load project:', err);
@@ -89,13 +121,13 @@ };
};
const handleSaveProject = async () => {
if (!IS_ELECTRON) return;
if (!IS_DESKTOP) return;
try {
const savePath = await window.electronAPI!.saveProject();
const savePath = await window.desktopAPI!.saveProject();
if (!savePath) return;
const data = useEditorStore.getState().saveProject();
const path = savePath.endsWith('.aive') ? savePath : `${savePath}.aive`;
await window.electronAPI!.writeFile(path, JSON.stringify(data, null, 2));
await window.desktopAPI!.writeFile(path, JSON.stringify(data, null, 2));
} catch (err) {
console.error('Failed to save project:', err);
alert(`Failed to save project: ${err}`);
@@ -103,9 +135,12 @@ };
};
const handleOpenFile = async () => {
if (IS_ELECTRON) {
const path = await window.electronAPI!.openFile();
if (IS_DESKTOP) {
if (import.meta.env.DEV) console.log('[app-action] openFile:dialogOpen');
const path = await window.desktopAPI!.openFile();
if (import.meta.env.DEV) console.log('[app-action] openFile:dialogResult', { path });
if (path) {
if (import.meta.env.DEV) console.log('[app-action] openFile:loadVideo', { path });
loadVideo(path);
await transcribeVideo(path);
}
@@ -113,6 +148,7 @@
// Browser: use the manual path input
const path = manualPath.trim();
if (path) {
if (import.meta.env.DEV) console.log('[app-action] openFile:webManualPath', { path });
loadVideo(path);
await transcribeVideo(path);
}
@@ -123,14 +159,16 @@
e.preventDefault();
const path = manualPath.trim();
if (!path) return;
if (import.meta.env.DEV) console.log('[app-action] manualSubmit:loadVideo', { path });
loadVideo(path);
await transcribeVideo(path);
};
const transcribeVideo = async (path: string) => {
if (import.meta.env.DEV) console.log('[app-action] transcribe:start', { path, whisperModel });
setTranscribing(true, 0, 'Checking model...');
try {
if (!window.electronAPI?.transcribe) {
if (!window.desktopAPI?.transcribe) {
throw new Error('Transcription not available');
}
// Step 1: ensure model is downloaded (may take a while on first run)
@@ -153,16 +191,26 @@
};
const modelLabel = MODEL_SIZES[whisperModel] ?? 'unknown size';
setTranscribing(true, 5, `Downloading ${whisperModel} model (${modelLabel})...`);
await window.electronAPI.ensureModel(whisperModel);
await window.desktopAPI.ensureModel(whisperModel);
if (import.meta.env.DEV) console.log('[app-action] transcribe:modelReady', { whisperModel });
// Step 2: run transcription
setTranscribing(true, 20, 'Transcribing audio...');
const data = await window.electronAPI.transcribe(path, whisperModel);
const data = await window.desktopAPI.transcribe(path, whisperModel);
if (import.meta.env.DEV) {
console.log('[app-action] transcribe:result', {
path,
words: Array.isArray(data?.words) ? data.words.length : null,
segments: Array.isArray(data?.segments) ? data.segments.length : null,
language: data?.language,
});
}
setTranscription(data);
} catch (err) {
console.error('Transcription error:', err);
alert(`Transcription failed. Check the console for details.\n\n${err}`);
} finally {
if (import.meta.env.DEV) console.log('[app-action] transcribe:finish', { path });
setTranscribing(false);
}
};
@@ -243,7 +291,7 @@ export default function App() {
English-only models are ~10% faster and more accurate for English content.
</p>
{IS_ELECTRON ? (
{IS_DESKTOP ? (
<div className="flex flex-col items-center gap-3">
<button
onClick={handleOpenFile}
@@ -313,9 +361,9 @@
<ToolbarButton
icon={<FolderOpen className="w-4 h-4" />}
label="Open"
onClick={IS_ELECTRON ? handleOpenFile : () => useEditorStore.getState().reset()}
onClick={IS_DESKTOP ? handleOpenFile : () => useEditorStore.getState().reset()}
/>
{IS_ELECTRON && (
{IS_DESKTOP && (
<ToolbarButton
icon={<Save className="w-4 h-4" />}
label="Save"
@@ -323,7 +371,7 @@
disabled={words.length === 0}
/>
)}
{IS_ELECTRON && (
{IS_DESKTOP && (
<ToolbarButton
icon={<FileInput className="w-4 h-4" />}
label="Load"
@@ -343,8 +391,8 @@
active={muteMode}
/>
<ToolbarButton
icon={<span className="text-[10px] font-semibold">PA</span>}
label="Pause Trim"
icon={<span className="text-[10px] font-semibold">ST</span>}
label="Silence Trim"
active={activePanel === 'silence'}
onClick={() => togglePanel('silence')}
disabled={!videoPath}


@@ -20,7 +20,7 @@ export default function ExportDialog() {
const handleExport = useCallback(async () => {
if (!videoPath) return;
const outputPath = await window.electronAPI?.saveFile({
const outputPath = await window.desktopAPI?.saveFile({
defaultPath: videoPath.replace(/\.[^.]+$/, '_edited.mp4'),
filters: [
{ name: 'MP4', extensions: ['mp4'] },


@@ -1,6 +1,6 @@
import { useState } from 'react';
import { useState, useEffect } from 'react';
import { useEditorStore } from '../store/editorStore';
import { Loader2, Scissors } from 'lucide-react';
import { Loader2, Scissors, Trash2, Play, Pause } from 'lucide-react';
type SilenceRange = {
start: number;
@@ -8,14 +8,31 @@
duration: number;
};
type TrimAction = 'cut' | 'mute';
export default function SilenceTrimmerPanel() {
const { videoPath, backendUrl, addCutRange, duration } = useEditorStore();
const {
videoPath,
backendUrl,
addCutRange,
addMuteRange,
removeCutRange,
removeMuteRange,
cutRanges,
muteRanges,
duration,
pauseUndo,
resumeUndo
} = useEditorStore();
const [minSilenceMs, setMinSilenceMs] = useState(500);
const [silenceDb, setSilenceDb] = useState(-35);
const [preBufferMs, setPreBufferMs] = useState(80);
const [postBufferMs, setPostBufferMs] = useState(120);
const [isDetecting, setIsDetecting] = useState(false);
const [ranges, setRanges] = useState<SilenceRange[]>([]);
const [trimAction, setTrimAction] = useState<TrimAction>('cut');
const [isActive, setIsActive] = useState(false);
const detectSilence = async () => {
if (!videoPath) return;
@@ -58,6 +75,9 @@ SilenceTrimmerPanel
};
const applyAsCuts = () => {
// Pause undo tracking to group all cuts into a single undo operation
pauseUndo();
const preBufferSeconds = preBufferMs / 1000;
const postBufferSeconds = postBufferMs / 1000;
const maxEnd = duration > 0 ? duration : Number.POSITIVE_INFINITY;
@@ -67,11 +87,71 @@
const start = Math.max(0, r.start + preBufferSeconds);
const end = Math.min(maxEnd, r.end - postBufferSeconds);
if (end - start >= 0.01) {
addCutRange(start, end);
if (trimAction === 'cut') {
addCutRange(start, end);
} else {
addMuteRange(start, end);
}
}
}
// Resume undo tracking - this creates a single undo entry for the entire batch
resumeUndo();
};
const removeExistingTrims = () => {
pauseUndo();
try {
// Remove all cut ranges that match detected silence ranges
cutRanges.forEach(range => {
ranges.forEach(silenceRange => {
if (Math.abs(range.start - silenceRange.start) < 0.1 &&
Math.abs(range.end - silenceRange.end) < 0.1) {
removeCutRange(range.id);
}
});
});
// Remove all mute ranges that match detected silence ranges
muteRanges.forEach(range => {
ranges.forEach(silenceRange => {
if (Math.abs(range.start - silenceRange.start) < 0.1 &&
Math.abs(range.end - silenceRange.end) < 0.1) {
removeMuteRange(range.id);
}
});
});
} finally {
resumeUndo();
}
};
const toggleActive = () => {
setIsActive(!isActive);
if (!isActive) {
// When activating, detect silence and apply
detectSilence().then(() => {
if (ranges.length > 0) {
applyAsCuts();
}
});
} else {
// When deactivating, remove existing trims
removeExistingTrims();
}
};
// Auto-detect when video changes
useEffect(() => {
if (videoPath && isActive) {
detectSilence().then(() => {
if (ranges.length > 0) {
applyAsCuts();
}
});
}
}, [videoPath]);
return (
<div className="p-4 space-y-4">
<div className="space-y-1">
@@ -142,43 +222,95 @@
</div>
</div>
<button
onClick={detectSilence}
disabled={isDetecting || !videoPath}
className="w-full flex items-center justify-center gap-2 px-4 py-2.5 bg-editor-accent hover:bg-editor-accent-hover disabled:opacity-50 rounded-lg text-sm font-medium transition-colors"
>
{isDetecting ? (
<>
<Loader2 className="w-4 h-4 animate-spin" />
Detecting pauses...
</>
) : (
'Detect Pauses'
)}
</button>
</div>
{ranges.length > 0 && (
<div className="space-y-2">
<div className="flex items-center justify-between">
<span className="text-xs font-medium">Detected {ranges.length} pause ranges</span>
<button
onClick={applyAsCuts}
className="flex items-center gap-1 px-2 py-1 text-xs bg-editor-accent/20 text-editor-accent rounded hover:bg-editor-accent/30"
>
<Scissors className="w-3 h-3" />
Apply As Cuts
</button>
</div>
<div className="max-h-56 overflow-y-auto space-y-1 pr-1">
{ranges.slice(0, 50).map((r, i) => (
<div key={`${r.start}-${r.end}-${i}`} className="px-2 py-1.5 rounded bg-editor-surface border border-editor-border text-xs">
{r.start.toFixed(2)}s - {r.end.toFixed(2)}s ({r.duration.toFixed(2)}s)
</div>
))}
</div>
<div className="space-y-1.5">
<label className="text-[11px] text-editor-text-muted font-medium">
Trim Action
</label>
<select
value={trimAction}
onChange={(e) => setTrimAction(e.target.value as TrimAction)}
className="w-full px-2.5 py-1.5 text-xs bg-editor-surface border border-editor-border rounded focus:border-editor-accent focus:outline-none"
>
<option value="cut">Cut (remove silence)</option>
<option value="mute">Mute (silence audio)</option>
</select>
</div>
)}
<div className="flex items-center justify-between">
<label className="text-[11px] text-editor-text-muted font-medium">
Active Mode
</label>
<button
onClick={toggleActive}
className={`flex items-center gap-2 px-3 py-1.5 text-xs rounded transition-colors ${
isActive
? 'bg-green-600 hover:bg-green-700 text-white'
: 'bg-editor-surface border border-editor-border hover:bg-editor-surface-hover'
}`}
>
{isActive ? (
<>
<Pause className="w-3 h-3" />
Active
</>
) : (
<>
<Play className="w-3 h-3" />
Inactive
</>
)}
</button>
</div>
<div className="flex gap-2">
<button
onClick={detectSilence}
disabled={isDetecting || !videoPath}
className="flex-1 flex items-center justify-center gap-2 px-3 py-2 bg-editor-accent hover:bg-editor-accent-hover disabled:opacity-50 rounded text-xs font-medium transition-colors"
>
{isDetecting ? (
<>
<Loader2 className="w-3 h-3 animate-spin" />
Detecting...
</>
) : (
'Detect Pauses'
)}
</button>
{ranges.length > 0 && (
<button
onClick={removeExistingTrims}
className="flex items-center gap-1 px-3 py-2 text-xs bg-red-600 hover:bg-red-700 text-white rounded transition-colors"
>
<Trash2 className="w-3 h-3" />
Remove Trims
</button>
)}
</div>
{ranges.length > 0 && (
<div className="space-y-2">
<div className="flex items-center justify-between">
<span className="text-xs font-medium">Detected {ranges.length} pause ranges</span>
<button
onClick={applyAsCuts}
className="flex items-center gap-1 px-2 py-1 text-xs bg-editor-accent/20 text-editor-accent rounded hover:bg-editor-accent/30"
>
<Scissors className="w-3 h-3" />
Apply As {trimAction === 'cut' ? 'Cuts' : 'Mutes'}
</button>
</div>
<div className="max-h-56 overflow-y-auto space-y-1 pr-1">
{ranges.slice(0, 50).map((r, i) => (
<div key={`${r.start}-${r.end}-${i}`} className="px-2 py-1.5 rounded bg-editor-surface border border-editor-border text-xs">
{r.start.toFixed(2)}s - {r.end.toFixed(2)}s ({r.duration.toFixed(2)}s)
</div>
))}
</div>
</div>
)}
</div>
</div>
);
}
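The panel posts the min-duration and dB controls to `/audio/detect-silence`, whose router isn't part of this diff. A plausible backend counterpart wraps ffmpeg's `silencedetect` filter and scrapes its stderr output (names and response shape are assumptions):
```python
import re
import subprocess

def detect_silence(path: str, min_silence_ms: int = 500, silence_db: int = -35) -> list[dict]:
    # Sketch of what the endpoint could do; the project's actual router may differ.
    # silencedetect prints silence_start / silence_end lines on stderr.
    audio_filter = f"silencedetect=noise={silence_db}dB:d={min_silence_ms / 1000}"
    proc = subprocess.run(
        ["ffmpeg", "-i", path, "-af", audio_filter, "-f", "null", "-"],
        capture_output=True, text=True,
    )
    ranges, start = [], None
    for line in proc.stderr.splitlines():
        if m := re.search(r"silence_start: ([\d.]+)", line):
            start = float(m.group(1))
        elif (m := re.search(r"silence_end: ([\d.]+)", line)) and start is not None:
            end = float(m.group(1))
            ranges.append({"start": start, "end": end, "duration": end - start})
            start = None
    return ranges
```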


@@ -5,24 +5,31 @@ import { Play, Pause, SkipBack, SkipForward, Volume2 } from 'lucide-react';
export default function VideoPlayer() {
const videoRef = useRef<HTMLVideoElement>(null);
const audioRef = useRef<HTMLAudioElement>(null);
const videoUrl = useEditorStore((s) => s.videoUrl);
const isPlaying = useEditorStore((s) => s.isPlaying);
const duration = useEditorStore((s) => s.duration);
const { seekTo, togglePlay } = useVideoSync(videoRef);
// Determine if this is an audio file based on the URL
const isAudioFile = videoUrl && (videoUrl.includes('.wav') || videoUrl.includes('.mp3') || videoUrl.includes('.m4a') || videoUrl.includes('.aac'));
const mediaRef = isAudioFile ? audioRef : videoRef;
const { seekTo, togglePlay } = useVideoSync(mediaRef as React.RefObject<HTMLVideoElement | HTMLAudioElement | null>);
const [displayTime, setDisplayTime] = useState(0);
useEffect(() => {
const video = videoRef.current;
if (!video) return;
const media = mediaRef.current;
if (!media) return;
let raf = 0;
const tick = () => {
setDisplayTime(video.currentTime);
setDisplayTime(media.currentTime);
raf = requestAnimationFrame(tick);
};
raf = requestAnimationFrame(tick);
return () => cancelAnimationFrame(raf);
}, [videoUrl]);
}, [videoUrl, mediaRef]);
const formatTime = (seconds: number) => {
const m = Math.floor(seconds / 60);
@@ -41,11 +48,11 @@
const skip = useCallback(
(delta: number) => {
const video = videoRef.current;
if (!video) return;
seekTo(Math.max(0, Math.min(duration, video.currentTime + delta)));
const media = mediaRef.current;
if (!media) return;
seekTo(Math.max(0, Math.min(duration, media.currentTime + delta)));
},
[seekTo, duration],
[seekTo, duration, mediaRef],
);
if (!videoUrl) {
@@ -59,13 +66,45 @@
return (
<div className="w-full h-full flex flex-col">
<div className="flex-1 flex items-center justify-center bg-black rounded-lg overflow-hidden min-h-0">
<video
ref={videoRef}
src={videoUrl}
className="max-w-full max-h-full object-contain"
playsInline
onClick={togglePlay}
/>
{isAudioFile ? (
<audio
ref={audioRef}
src={videoUrl}
className="max-w-full max-h-full"
controls={false}
preload="none"
onClick={togglePlay}
onError={(e) => {
console.error('Audio load error:', e);
console.error('Audio src:', videoUrl);
}}
onLoadStart={() => console.log('Audio load start:', videoUrl)}
onLoadedData={() => console.log('Audio loaded data')}
onCanPlay={() => console.log('Audio can play')}
onProgress={() => console.log('Audio progress event')}
onStalled={() => console.log('Audio stalled')}
onSuspend={() => console.log('Audio suspended')}
/>
) : (
<video
ref={videoRef}
src={videoUrl}
className="max-w-full max-h-full object-contain"
playsInline
preload="none"
onClick={togglePlay}
onError={(e) => {
console.error('Video load error:', e);
console.error('Video src:', videoUrl);
}}
onLoadStart={() => console.log('Video load start:', videoUrl)}
onLoadedData={() => console.log('Video loaded data')}
onCanPlay={() => console.log('Video can play')}
onProgress={() => console.log('Video progress event')}
onStalled={() => console.log('Video stalled')}
onSuspend={() => console.log('Video suspended')}
/>
)}
</div>
<div className="pt-2 space-y-1.5 shrink-0">


@@ -167,14 +167,14 @@ async function saveProject() {
modifiedAt: new Date().toISOString(),
};
const outputPath = await window.electronAPI?.saveFile({
const outputPath = await window.desktopAPI?.saveFile({
defaultPath: state.videoPath.replace(/\.[^.]+$/, '.aive'),
filters: [{ name: 'TalkEdit Project', extensions: ['aive'] }],
});
if (outputPath) {
if (window.electronAPI?.writeFile) {
await window.electronAPI.writeFile(outputPath, JSON.stringify(projectData, null, 2));
if (window.desktopAPI?.writeFile) {
await window.desktopAPI.writeFile(outputPath, JSON.stringify(projectData, null, 2));
} else {
const blob = new Blob([JSON.stringify(projectData, null, 2)], { type: 'application/json' });
const url = URL.createObjectURL(blob);


@@ -1,7 +1,7 @@
import { useCallback, useRef, useEffect } from 'react';
import { useEditorStore } from '../store/editorStore';
export function useVideoSync(videoRef: React.RefObject<HTMLVideoElement | null>) {
export function useVideoSync(videoRef: React.RefObject<HTMLVideoElement | HTMLAudioElement | null>) {
const rafRef = useRef<number>(0);
const {
setCurrentTime,
@@ -52,13 +52,13 @@ export function useVideoSync(videoRef: React.RefObject<HTMLVideoElement | null>)
}, [videoRef]);
useEffect(() => {
const video = videoRef.current;
if (!video) return;
const media = videoRef.current;
if (!media) return;
const onTimeUpdate = () => {
cancelAnimationFrame(rafRef.current);
rafRef.current = requestAnimationFrame(() => {
let t = video.currentTime;
let t = media.currentTime;
// Skip over deleted ranges and cut ranges (handle overlapping/chained ranges)
const allSkipRanges = [...deletedRanges, ...cutRanges];
@@ -79,19 +79,21 @@ export function useVideoSync(videoRef: React.RefObject<HTMLVideoElement | null>)
}
if (skipCount > 0) {
video.currentTime = t;
media.currentTime = t;
return;
}
// Mute/unmute based on mute ranges
let shouldMute = false;
for (const range of muteRanges) {
if (t >= range.start && t < range.end) {
shouldMute = true;
break;
// Mute/unmute based on mute ranges (only for video elements)
if ('muted' in media) {
let shouldMute = false;
for (const range of muteRanges) {
if (t >= range.start && t < range.end) {
shouldMute = true;
break;
}
}
media.muted = shouldMute;
}
video.muted = shouldMute;
setCurrentTime(t);
});
@@ -99,18 +101,18 @@ export function useVideoSync(videoRef: React.RefObject<HTMLVideoElement | null>)
const onPlay = () => setIsPlaying(true);
const onPause = () => setIsPlaying(false);
const onLoadedMetadata = () => setDuration(video.duration);
const onLoadedMetadata = () => setDuration(media.duration);
video.addEventListener('timeupdate', onTimeUpdate);
video.addEventListener('play', onPlay);
video.addEventListener('pause', onPause);
video.addEventListener('loadedmetadata', onLoadedMetadata);
media.addEventListener('timeupdate', onTimeUpdate);
media.addEventListener('play', onPlay);
media.addEventListener('pause', onPause);
media.addEventListener('loadedmetadata', onLoadedMetadata);
return () => {
video.removeEventListener('timeupdate', onTimeUpdate);
video.removeEventListener('play', onPlay);
video.removeEventListener('pause', onPause);
video.removeEventListener('loadedmetadata', onLoadedMetadata);
media.removeEventListener('timeupdate', onTimeUpdate);
media.removeEventListener('play', onPlay);
media.removeEventListener('pause', onPause);
media.removeEventListener('loadedmetadata', onLoadedMetadata);
cancelAnimationFrame(rafRef.current);
};
}, [videoRef, deletedRanges, cutRanges, muteRanges, setCurrentTime, setIsPlaying, setDuration]);


@@ -1,3 +1,11 @@
@import '@fontsource/inter/300.css';
@import '@fontsource/inter/400.css';
@import '@fontsource/inter/500.css';
@import '@fontsource/inter/600.css';
@import '@fontsource/inter/700.css';
@import '@fontsource/jetbrains-mono/400.css';
@import '@fontsource/jetbrains-mono/500.css';
@tailwind base;
@tailwind components;
@tailwind utilities;


@@ -1,13 +1,29 @@
/**
* Dev-only console interceptor.
* Forwards console.error / console.warn to the backend /dev/log endpoint,
* which appends them to webview.log so the agent can read it.
* which appends them to a backend-managed dev log file.
*/
if (import.meta.env.DEV) {
const BACKEND = 'http://127.0.0.1:8000';
type ConsoleFn = (...args: unknown[]) => void;
const serialize = (value: unknown): string => {
if (typeof value === 'string') return value;
if (value instanceof Error) {
return JSON.stringify({
name: value.name,
message: value.message,
stack: value.stack,
});
}
try {
return JSON.stringify(value);
} catch {
return String(value);
}
};
const forward = (level: string, orig: ConsoleFn): ConsoleFn =>
(...args: unknown[]) => {
orig(...args);
@@ -15,7 +31,7 @@ if (import.meta.env.DEV) {
fetch(`${BACKEND}/dev/log`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({ level, message: String(first ?? ''), args: rest.map(String) }),
body: JSON.stringify({ level, message: serialize(first ?? ''), args: rest.map(serialize) }),
}).catch(() => {/* backend not running yet */});
};


@@ -1,8 +1,8 @@
/**
* tauri-bridge.ts
*
* Polyfills window.electronAPI with Tauri equivalents so all existing
* call-sites in App.tsx, hooks, and stores continue to work unchanged.
* Exposes window.desktopAPI using Tauri equivalents so UI code can stay
* desktop-runtime agnostic.
*
* Imported once at the top of main.tsx.
*/
@@ -25,36 +25,56 @@ const EXPORT_FILTERS = [
{ name: 'Project Files', extensions: ['aive'] },
];
window.electronAPI = {
const BACKEND_PORT = import.meta.env.VITE_BACKEND_PORT || '8000';
const BACKEND_URL = `http://127.0.0.1:${BACKEND_PORT}`;
const debugBridge = (event: string, details?: Record<string, unknown>) => {
if (!import.meta.env.DEV) return;
console.log('[tauri-bridge]', event, details ?? {});
};
window.desktopAPI = {
openFile: async (_options?: Record<string, unknown>): Promise<string | null> => {
debugBridge('openFile:dialogOpen');
const result = await open({
multiple: false,
filters: VIDEO_FILTERS,
});
return typeof result === 'string' ? result : null;
const path = typeof result === 'string' ? result : null;
debugBridge('openFile:dialogResult', { path });
return path;
},
saveFile: async (_options?: Record<string, unknown>): Promise<string | null> => {
debugBridge('saveFile:dialogOpen');
const result = await save({ filters: EXPORT_FILTERS });
return result ?? null;
const path = result ?? null;
debugBridge('saveFile:dialogResult', { path });
return path;
},
openProject: async (): Promise<string | null> => {
debugBridge('openProject:dialogOpen');
const result = await open({
multiple: false,
filters: PROJECT_FILTERS,
});
return typeof result === 'string' ? result : null;
const path = typeof result === 'string' ? result : null;
debugBridge('openProject:dialogResult', { path });
return path;
},
saveProject: async (): Promise<string | null> => {
debugBridge('saveProject:dialogOpen');
const result = await save({ filters: PROJECT_FILTERS });
return result ?? null;
const path = result ?? null;
debugBridge('saveProject:dialogResult', { path });
return path;
},
getBackendUrl: (): Promise<string> => {
// Backend URL is fixed; avoid invoke() which triggers ipc:// CSP errors on Linux/WebKit2GTK
return Promise.resolve('http://127.0.0.1:8000');
// Use env-driven backend URL and avoid invoke() to bypass ipc:// noise on Linux/WebKit2GTK.
return Promise.resolve(BACKEND_URL);
},
encryptString: (data: string): Promise<string> => {
@@ -74,10 +94,12 @@
},
readFile: (path: string): Promise<string> => {
debugBridge('readFile', { path });
return readTextFile(path);
},
writeFile: async (path: string, content: string): Promise<boolean> => {
debugBridge('writeFile', { path, size: content.length });
await writeTextFile(path, content);
return true;
},


@@ -1,8 +1,8 @@
import React from 'react';
import ReactDOM from 'react-dom/client';
// Forward console.error/warn/log to backend in dev mode so we can tail webview.log
// Forward console.error/warn/log to backend in dev mode so we can tail the backend dev log.
import './lib/dev-logger';
// Must be imported before App so window.electronAPI is patched before any component runs.
// Must be imported before App so window.desktopAPI is patched before any component runs.
import './lib/tauri-bridge';
import App from './App';
import './index.css';


@@ -30,8 +30,8 @@ async function encryptAndStore(key: string, value: string): Promise<void> {
localStorage.removeItem(ENCRYPTED_KEY_PREFIX + key);
return;
}
if (window.electronAPI) {
const encrypted = await window.electronAPI.encryptString(value);
if (window.desktopAPI) {
const encrypted = await window.desktopAPI.encryptString(value);
localStorage.setItem(ENCRYPTED_KEY_PREFIX + key, encrypted);
} else {
localStorage.setItem(ENCRYPTED_KEY_PREFIX + key, btoa(value));
@@ -41,9 +41,9 @@ async function loadAndDecrypt(key: string): Promise<string> {
async function loadAndDecrypt(key: string): Promise<string> {
const stored = localStorage.getItem(ENCRYPTED_KEY_PREFIX + key);
if (!stored) return '';
if (window.electronAPI) {
if (window.desktopAPI) {
try {
return await window.electronAPI.decryptString(stored);
return await window.desktopAPI.decryptString(stored);
} catch {
return '';
}
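A short usage sketch for these helpers (the key name is illustrative). Note that the non-desktop fallback is plain base64 via `btoa`, which is encoding rather than encryption, so the stored value is only actually protected when `window.desktopAPI` is available:

```typescript
// Round-trip sketch; encryptAndStore/loadAndDecrypt are the module helpers above.
async function rememberApiKey(secret: string): Promise<string> {
  await encryptAndStore('ai-api-key', secret); // key name assumed
  return loadAndDecrypt('ai-api-key'); // '' when missing or undecryptable
}
```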

View File

@ -1,4 +1,5 @@
import { create } from 'zustand';
import { persist } from 'zustand/middleware';
import { temporal } from 'zundo';
import type { Word, Segment, DeletedRange, CutRange, MuteRange, TranscriptionResult, ProjectFile } from '../types/project';
@ -55,6 +56,8 @@ interface EditorActions {
getWordAtTime: (time: number) => number;
loadProject: (projectData: any) => void;
reset: () => void;
pauseUndo: () => void;
resumeUndo: () => void;
}
const initialState: EditorState = {
@ -82,12 +85,21 @@ const initialState: EditorState = {
let nextRangeId = 1;
const debugEditorStore = (event: string, details?: Record<string, unknown>) => {
if (!import.meta.env.DEV) return;
console.log('[editor-store]', event, details ?? {});
};
export const useEditorStore = create<EditorState & EditorActions>()(
temporal(
(set, get) => ({
persist(
temporal(
(set, get) => ({
...initialState,
setBackendUrl: (url) => set({ backendUrl: url }),
setBackendUrl: (url) => {
debugEditorStore('setBackendUrl', { url });
set({ backendUrl: url });
},
setExportedAudioPath: (path) => set({ exportedAudioPath: path }),
@ -114,13 +126,25 @@ export const useEditorStore = create<EditorState & EditorActions>()(
loadVideo: (path) => {
const backend = get().backendUrl;
const url = `${backend}/file?path=${encodeURIComponent(path)}`;
const buildMediaUrl = (filePath: string) => {
const isWav = filePath.toLowerCase().endsWith('.wav');
return isWav
? `${backend}/file?path=${encodeURIComponent(filePath)}&format=mp3`
: `${backend}/file?path=${encodeURIComponent(filePath)}`;
};
const url = buildMediaUrl(path);
debugEditorStore('loadVideo:start', {
path,
backend,
previousVideoPath: get().videoPath,
});
set({
...initialState,
backendUrl: backend,
videoPath: path,
videoUrl: url,
});
debugEditorStore('loadVideo:done', { path, url });
},
setTranscription: (result) => {
@ -130,6 +154,12 @@ export const useEditorStore = create<EditorState & EditorActions>()(
globalIdx += seg.words.length;
return annotated;
});
debugEditorStore('setTranscription', {
wordCount: result.words?.length ?? 0,
segmentCount: result.segments?.length ?? 0,
language: result.language,
currentVideoPath: get().videoPath,
});
set({
words: result.words,
segments: annotatedSegments,
@ -302,7 +332,18 @@ export const useEditorStore = create<EditorState & EditorActions>()(
loadProject: (data) => {
const backend = get().backendUrl;
const url = `${backend}/file?path=${encodeURIComponent(data.videoPath)}`;
const resolvedVideoPath = typeof data?.videoPath === 'string' ? data.videoPath : null;
if (!resolvedVideoPath) {
debugEditorStore('loadProject:invalidVideoPath', {
videoPathType: typeof data?.videoPath,
hasKeys: data && typeof data === 'object' ? Object.keys(data as Record<string, unknown>) : [],
});
throw new Error('Project file missing required videoPath string');
}
const isWav = resolvedVideoPath.toLowerCase().endsWith('.wav');
const url = isWav
? `${backend}/file?path=${encodeURIComponent(resolvedVideoPath)}&format=mp3`
: `${backend}/file?path=${encodeURIComponent(resolvedVideoPath)}`;
let globalIdx = 0;
const annotatedSegments = (data.segments || []).map((seg: Segment) => {
@ -311,10 +352,20 @@ export const useEditorStore = create<EditorState & EditorActions>()(
return annotated;
});
debugEditorStore('loadProject:start', {
videoPath: resolvedVideoPath,
words: Array.isArray(data?.words) ? data.words.length : null,
segments: Array.isArray(data?.segments) ? data.segments.length : null,
cutRanges: Array.isArray(data?.cutRanges) ? data.cutRanges.length : null,
muteRanges: Array.isArray(data?.muteRanges) ? data.muteRanges.length : null,
deletedRanges: Array.isArray(data?.deletedRanges) ? data.deletedRanges.length : null,
previousVideoPath: get().videoPath,
});
set({
...initialState,
backendUrl: backend,
videoPath: data.videoPath,
videoPath: resolvedVideoPath,
videoUrl: url,
words: data.words || [],
segments: annotatedSegments,
@ -324,10 +375,70 @@ export const useEditorStore = create<EditorState & EditorActions>()(
language: data.language || '',
exportedAudioPath: data.exportedAudioPath ?? null,
});
debugEditorStore('loadProject:done', {
videoPath: resolvedVideoPath,
url,
});
},
reset: () => set(initialState),
}),
{ limit: 100 },
reset: () => {
const stack = new Error().stack?.split('\n').slice(1, 6).join(' | ');
debugEditorStore('reset', {
previousVideoPath: get().videoPath,
stack,
});
set(initialState);
},
pauseUndo: () => {
// zundo attaches its temporal API to the store hook; the cast works around the persist-wrapped store typing.
const temporalStore = (useEditorStore as any).temporal;
if (temporalStore) {
temporalStore.getState().pause();
}
},
resumeUndo: () => {
// Same temporal-store access as in pauseUndo above.
const temporalStore = (useEditorStore as any).temporal;
if (temporalStore) {
temporalStore.getState().resume();
}
},
}),
{ limit: 100 },
),
{
name: 'talkedit-editor-session',
version: 1,
partialize: (state) => ({
videoPath: state.videoPath,
videoUrl: state.videoUrl,
exportedAudioPath: state.exportedAudioPath,
words: state.words,
segments: state.segments,
deletedRanges: state.deletedRanges,
cutRanges: state.cutRanges,
muteRanges: state.muteRanges,
language: state.language,
backendUrl: state.backendUrl,
currentTime: state.currentTime,
duration: state.duration,
}),
onRehydrateStorage: () => (state, error) => {
if (error) {
debugEditorStore('persist:rehydrate:error', { error: String(error) });
return;
}
debugEditorStore('persist:rehydrate:done', {
videoPath: state?.videoPath ?? null,
words: state?.words?.length ?? 0,
segments: state?.segments?.length ?? 0,
cutRanges: state?.cutRanges?.length ?? 0,
muteRanges: state?.muteRanges?.length ?? 0,
});
},
},
),
);
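One observation on the store above: `loadVideo` and `loadProject` now carry duplicate copies of the WAV-to-mp3 URL logic. A shared helper (a sketch, not code from this change) would keep the two paths from drifting, and would also make it easy to rebuild a rehydrated `videoUrl`, since the persisted URL embeds a possibly stale backend port:

```typescript
// Sketch: shared media-URL builder matching the two inline versions above.
// WAV sources are requested transcoded via the backend's format=mp3 query
// parameter; every other container streams as-is.
export function buildMediaUrl(backendUrl: string, filePath: string): string {
  const base = `${backendUrl}/file?path=${encodeURIComponent(filePath)}`;
  return filePath.toLowerCase().endsWith('.wav') ? `${base}&format=mp3` : base;
}
```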

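Since `pauseUndo`/`resumeUndo` exist to collapse batch operations into one undo step, a caller-side sketch may help (the `applySilenceCuts` wrapper and the `addCutRange` action name are assumptions): leave the first mutation tracked so zundo snapshots the pre-batch state, then pause history for the rest.

```typescript
// Sketch: grouping a batch of edits into a single undo entry using the
// pauseUndo/resumeUndo actions above. addCutRange is an assumed action name.
function applySilenceCuts(ranges: Array<{ start: number; end: number }>): void {
  const store = useEditorStore.getState();
  if (ranges.length === 0) return;
  // First mutation stays tracked so history captures the pre-batch state,
  // letting a single undo revert the whole batch.
  store.addCutRange(ranges[0].start, ranges[0].end);
  store.pauseUndo();
  try {
    for (const r of ranges.slice(1)) store.addCutRange(r.start, r.end);
  } finally {
    store.resumeUndo();
  }
}
```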
View File

@ -1,6 +1,10 @@
/// <reference types="vite/client" />
interface ElectronAPI {
interface ImportMetaEnv {
readonly VITE_BACKEND_PORT?: string;
}
interface DesktopAPI {
openFile: (options?: Record<string, unknown>) => Promise<string | null>;
saveFile: (options?: Record<string, unknown>) => Promise<string | null>;
openProject: () => Promise<string | null>;
@ -15,5 +19,5 @@ interface ElectronAPI {
}
interface Window {
electronAPI?: ElectronAPI;
desktopAPI?: DesktopAPI;
}

View File

@ -1 +1 @@
{"root":["./src/App.tsx","./src/main.tsx","./src/vite-env.d.ts","./src/components/AIPanel.tsx","./src/components/DevPanel.tsx","./src/components/ExportDialog.tsx","./src/components/SettingsPanel.tsx","./src/components/TranscriptEditor.tsx","./src/components/VideoPlayer.tsx","./src/components/WaveformTimeline.tsx","./src/hooks/useKeyboardShortcuts.ts","./src/hooks/useVideoSync.ts","./src/lib/dev-logger.ts","./src/lib/tauri-bridge.ts","./src/store/aiStore.ts","./src/store/editorStore.ts","./src/types/project.ts"],"errors":true,"version":"5.9.3"}
{"root":["./src/App.tsx","./src/main.tsx","./src/vite-env.d.ts","./src/components/AIPanel.tsx","./src/components/DevPanel.tsx","./src/components/ExportDialog.tsx","./src/components/SettingsPanel.tsx","./src/components/SilenceTrimmerPanel.tsx","./src/components/TranscriptEditor.tsx","./src/components/VideoPlayer.tsx","./src/components/WaveformTimeline.tsx","./src/hooks/useKeyboardShortcuts.ts","./src/hooks/useVideoSync.ts","./src/lib/dev-logger.ts","./src/lib/tauri-bridge.ts","./src/store/aiStore.ts","./src/store/editorStore.ts","./src/types/project.ts"],"version":"5.9.3"}

View File

@ -1,8 +1,8 @@
Here's a clear, actionable **summary** of what you (as a solo developer using AI tools heavily) should do to build and monetize this product, based on current market demand in 2026.
### What You Should Do (Step-by-Step Plan)
1. **Fork an existing open-source base** (don't start from scratch)
- Best choice: **CutScript** (newest, explicitly built as "offline Descript alternative" with text-based editing) or **Audapolis** (more mature, ~1.8k stars, wordprocessor-like experience for spoken-word video/audio).
1. **Build from the existing TalkEdit base** (don't start from scratch)
- Keep TalkEdit as the primary codebase and borrow ideas from mature open-source editors like **Audapolis** where useful.
- Reason: The hard parts (local Whisper transcription with word-level timestamps, syncing text deletions to video cuts, FFmpeg handling) are already solved. You save 4–8 weeks and focus on polish.
2. **Migrate/refactor to Tauri 2.0** (Rust backend + React/Vite + Tailwind + shadcn-ui frontend)
@ -48,13 +48,13 @@ That's it. No multi-track timelines, no voice cloning, no collaboration, no fanc
### Why This Will Work
- **Market demand is real**: Creators love text-based editing because it feels revolutionary for dialogue-heavy videos. They want it faster, cheaper, and private/offline. Existing alternatives are either cloud-based with subscriptions or clunky open-source tools.
- **Competition gap**: CutScript and Audapolis prove interest but lack slick UX and the "one magic button" polish. You can own the "delightful local Descript killer" niche.
- **Competition gap**: Existing local editors prove interest but often lack slick UX and the "one magic button" polish. You can own the "delightful local Descript killer" niche.
- **Solo-dev friendly**: Forking + AI code generation makes this realistic without a team.
Once you ship the MVP and get initial users, you can add nice-to-haves (e.g., custom filler lists, better subtitle export, optional cloud boost) based on real feedback.
**Next immediate actions**:
- Clone CutScript or Audapolis today and run it locally to see the current state.
- Continue from TalkEdit and benchmark against Audapolis today to compare current UX quality.
- Set up a new Tauri project and start refactoring the UI/transcript editor.
If you want, I can give you the exact Git commands, first AI prompts for refactoring, folder structure, or even sample code for the "Clean it" button + FFmpeg polish chain.

26
open
View File

@ -3,8 +3,11 @@
cd "$(dirname "$0")"
PROJECT_DIR="$PWD"
BACKEND_PORT=8000
export BACKEND_PORT="${BACKEND_PORT:-8000}"
export VITE_BACKEND_PORT="${VITE_BACKEND_PORT:-$BACKEND_PORT}"
export TALKEDIT_DEV_LOG_PATH="${TALKEDIT_DEV_LOG_PATH:-/tmp/talkedit-webview.log}"
BACKEND_URL="http://127.0.0.1:${BACKEND_PORT}/health"
FRONTEND_URL="http://127.0.0.1:5173"
# Check if backend is already running
if curl -sf "$BACKEND_URL" > /dev/null 2>&1; then
@ -16,20 +19,20 @@ else
# Try common terminal emulators in order
if command -v ghostty &>/dev/null; then
ghostty -e bash -c "cd '${BACKEND_DIR}' && '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
ghostty -e bash -c "cd '${BACKEND_DIR}' && TALKEDIT_DEV_LOG_PATH='${TALKEDIT_DEV_LOG_PATH}' '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
elif command -v kitty &>/dev/null; then
kitty --title "TalkEdit Backend" -- bash -c "cd '${BACKEND_DIR}' && '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
kitty --title "TalkEdit Backend" -- bash -c "cd '${BACKEND_DIR}' && TALKEDIT_DEV_LOG_PATH='${TALKEDIT_DEV_LOG_PATH}' '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
elif command -v alacritty &>/dev/null; then
alacritty --title "TalkEdit Backend" -e bash -c "cd '${BACKEND_DIR}' && '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
alacritty --title "TalkEdit Backend" -e bash -c "cd '${BACKEND_DIR}' && TALKEDIT_DEV_LOG_PATH='${TALKEDIT_DEV_LOG_PATH}' '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
elif command -v konsole &>/dev/null; then
konsole --new-tab -e bash -c "cd '${BACKEND_DIR}' && '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
konsole --new-tab -e bash -c "cd '${BACKEND_DIR}' && TALKEDIT_DEV_LOG_PATH='${TALKEDIT_DEV_LOG_PATH}' '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
elif command -v gnome-terminal &>/dev/null; then
gnome-terminal --title "TalkEdit Backend" -- bash -c "cd '${BACKEND_DIR}' && '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
gnome-terminal --title "TalkEdit Backend" -- bash -c "cd '${BACKEND_DIR}' && TALKEDIT_DEV_LOG_PATH='${TALKEDIT_DEV_LOG_PATH}' '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
elif command -v xterm &>/dev/null; then
xterm -T "TalkEdit Backend" -e bash -c "cd '${BACKEND_DIR}' && '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
xterm -T "TalkEdit Backend" -e bash -c "cd '${BACKEND_DIR}' && TALKEDIT_DEV_LOG_PATH='${TALKEDIT_DEV_LOG_PATH}' '${VENV_PYTHON}' -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT}; exec bash" &
else
echo "No supported terminal emulator found. Starting backend in background..."
cd "${BACKEND_DIR}" && "${VENV_PYTHON}" -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT} &
cd "${BACKEND_DIR}" && TALKEDIT_DEV_LOG_PATH="${TALKEDIT_DEV_LOG_PATH}" "${VENV_PYTHON}" -m uvicorn main:app --host 127.0.0.1 --port ${BACKEND_PORT} &
fi
# Wait up to 15s for backend to become ready
@ -47,4 +50,11 @@ else
done
fi
# Check if frontend is already running
if curl -sf "$FRONTEND_URL" > /dev/null 2>&1; then
echo "Frontend already running on port 5173."
else
echo "Frontend not running — Tauri will start it automatically."
fi
npx tauri dev
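The same readiness check the script performs with curl can be mirrored inside the webview before the first API call; a sketch against the `/health` endpoint polled above:

```typescript
// Sketch: poll GET /health (the endpoint dev.sh checks) until the backend
// answers or the timeout elapses.
async function waitForBackend(baseUrl: string, timeoutMs = 15_000): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    try {
      const res = await fetch(`${baseUrl}/health`);
      if (res.ok) return true;
    } catch {
      // Backend not accepting connections yet; retry shortly.
    }
    await new Promise((resolve) => setTimeout(resolve, 500));
  }
  return false;
}
```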

View File

@ -3,48 +3,15 @@
"version": "0.1.0",
"private": true,
"description": "TalkEdit — Open-source AI-powered text-based video editor",
"main": "electron/main.js",
"scripts": {
"tauri": "tauri",
"dev": "cd frontend && npm run dev -- --host",
"dev:tauri": "cd backend && python -m uvicorn main:app --reload --port 8642 & cd frontend && cargo tauri dev",
"dev": "VITE_BACKEND_PORT=${VITE_BACKEND_PORT:-${BACKEND_PORT:-8000}}; cd frontend && VITE_BACKEND_PORT=$VITE_BACKEND_PORT npm run dev -- --host",
"dev:tauri": "BACKEND_PORT=${BACKEND_PORT:-8000}; VITE_BACKEND_PORT=${VITE_BACKEND_PORT:-$BACKEND_PORT}; cd backend && python -m uvicorn main:app --reload --port $BACKEND_PORT & cd frontend && VITE_BACKEND_PORT=$VITE_BACKEND_PORT cargo tauri dev",
"build:tauri": "cd frontend && cargo tauri build",
"dev:frontend": "cd frontend && npm run dev",
"dev:backend": "cd backend && python -m uvicorn main:app --reload --port 8642",
"dev:frontend": "VITE_BACKEND_PORT=${VITE_BACKEND_PORT:-${BACKEND_PORT:-8000}}; cd frontend && VITE_BACKEND_PORT=$VITE_BACKEND_PORT npm run dev",
"dev:backend": "BACKEND_PORT=${BACKEND_PORT:-8000}; cd backend && python -m uvicorn main:app --reload --port $BACKEND_PORT",
"lint": "cd frontend && npm run lint"
},
"devDependencies": {
"concurrently": "^9.1.0",
"electron": "^33.2.0",
"electron-builder": "^25.1.0",
"wait-on": "^8.0.0"
},
"dependencies": {
"python-shell": "^5.0.0"
},
"build": {
"appId": "com.talkedit.app",
"productName": "TalkEdit",
"files": [
"electron/**/*",
"frontend/dist/**/*",
"backend/**/*",
"shared/**/*"
],
"extraResources": [
{
"from": "backend",
"to": "backend"
}
],
"win": {
"target": "nsis"
},
"mac": {
"target": "dmg"
},
"linux": {
"target": "AppImage"
}
}
"devDependencies": {},
"dependencies": {}
}

12
plan.md
View File

@ -1,6 +1,6 @@
# Plan for Building TalkEdit (Whisper.cpp + Tauri)
Based on your original idea summary and our discussions, here's a detailed plan to build a standalone, local audio/video editor app. We'll modify CutScript as the base, migrate to **Tauri 2.0** (Rust backend + React frontend) for tiny, dependency-free installers, and use **Whisper.cpp** for fast, accurate transcription. This keeps the scope minimal, focuses on text-based editing for spoken content, and targets podcasters/YouTubers.
Based on your original idea summary and our discussions, here's a detailed plan to build a standalone, local audio/video editor app. We'll continue evolving the existing TalkEdit codebase on **Tauri 2.0** (Rust backend + React frontend) for tiny, dependency-free installers, and use **Whisper.cpp** for fast, accurate transcription. This keeps the scope minimal, focuses on text-based editing for spoken content, and targets podcasters/YouTubers.
## 1. Overview
- **Goal**: Create an offline Descript alternative with word-level editing, transcription, and export. Users download one file (~10–20MB), install, and run—no Python, FFmpeg, or external deps.
@ -9,19 +9,19 @@ Based on your original idea summary and our discussions, here's a detailed plan
- **Key Differentiators**: Fully local, text-based editing like Google Docs, smart cuts with fades.
## 2. Tech Stack
- **Frontend**: React + Vite + Tailwind CSS + shadcn/ui (from CutScript; minimal changes).
- **Frontend**: React + Vite + Tailwind CSS + shadcn/ui.
- **Backend**: Tauri 2.0 (Rust) handles file I/O, FFmpeg calls, Whisper.cpp integration.
- **Transcription**: Whisper.cpp (via Rust bindings like `whisper-cpp-sys` or `whisper-rs`).
- **Audio/Video Processing**: FFmpeg (bundled or called via Rust wrappers like `ffmpeg-next`).
- **State Management**: Zustand (from CutScript).
- **State Management**: Zustand.
- **Packaging**: Tauri's `tauri build` for cross-platform installers.
- **AI Features**: Local models only (no APIs); optional Ollama for fillers.
## 3. Step-by-Step Development Plan
1. **Set Up Tauri in CutScript** (1–2 weeks):
1. **Set Up Tauri in TalkEdit** (1–2 weeks):
- Install `tauri-cli` globally.
- In CutScript root: `npx tauri init` (choose Rust backend, link to existing React frontend).
- Migrate Electron main.js to Tauri's `src/main.rs` (handle window, file dialogs).
- In TalkEdit root: `npx tauri init` (choose Rust backend, link to existing React frontend).
- Implement Tauri `src/main.rs` host flow (window lifecycle, file dialogs, backend coordination).
- Update `tauri.conf.json` for app metadata, bundle settings.
2. **Integrate Whisper.cpp in Rust** (2–3 weeks):

View File

@ -6,7 +6,7 @@
"build": {
"frontendDist": "../frontend/dist",
"devUrl": "http://localhost:5173",
"beforeDevCommand": "cd frontend && npm run dev",
"beforeDevCommand": "cd frontend && (lsof -i :5173 >/dev/null 2>&1 && echo 'Frontend dev server already running on port 5173' || npm run dev)",
"beforeBuildCommand": "cd frontend && npm run build"
},
"app": {
@ -23,7 +23,7 @@
}
],
"security": {
"csp": "default-src 'self'; connect-src 'self' http://127.0.0.1:8000; media-src 'self' http://127.0.0.1:8000; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; img-src 'self' data: blob:"
"csp": "default-src 'self'; connect-src 'self' http://127.0.0.1:* http://localhost:* ws://127.0.0.1:* ws://localhost:*; media-src 'self' http://127.0.0.1:* http://localhost:* file: blob:; script-src 'self' 'unsafe-inline' 'unsafe-eval'; style-src 'self' 'unsafe-inline'; font-src 'self' data:; img-src 'self' data: blob:"
}
},
"bundle": {