Initial CutScript release - Open-source AI-powered text-based video editor

CutScript is a local-first, Descript-like video editor where you edit video by editing text. Delete a word from the transcript and it's cut from the video.

Features:
- Word-level transcription with WhisperX
- Text-based video editing with undo/redo
- AI filler word removal (Ollama/OpenAI/Claude)
- AI clip creation for shorts
- Waveform timeline with virtualized transcript
- FFmpeg stream-copy (fast) and re-encode (4K) export
- Caption burn-in and sidecar SRT generation
- Studio Sound audio enhancement (DeepFilterNet)
- Keyboard shortcuts (J/K/L, Space, Delete, Ctrl+Z/S/E)
- Encrypted API key storage
- Project save/load (.aive files)

Architecture:
- Electron + React + Tailwind (frontend)
- FastAPI + Python (backend)
- WhisperX for transcription
- FFmpeg for video processing
- Multi-provider AI support

Performance optimizations:
- RAF-throttled time updates
- Zustand selectors for granular subscriptions
- Dual-canvas waveform rendering
- Virtualized transcript with react-virtuoso

Built on top of DataAnts-AI/VideoTranscriber, completely rewritten as a desktop application.

License: MIT
2026-03-03 06:31:04 -05:00
parent d1e1fedcae
commit 33cca5f552
73 changed files with 7463 additions and 3906 deletions
--- a/backend/routers/captions.py
+++ b/backend/routers/captions.py
@@ -0,0 +1,65 @@
+"""Caption generation endpoint."""
+
+import logging
+from typing import List, Optional
+
+from fastapi import APIRouter, HTTPException
+from fastapi.responses import PlainTextResponse
+from pydantic import BaseModel
+
+from services.caption_generator import generate_srt, generate_vtt, generate_ass, save_captions
+
+logger = logging.getLogger(__name__)
+router = APIRouter()
+
+
+class CaptionWord(BaseModel):
+    word: str
+    start: float
+    end: float
+    confidence: float = 0.0
+
+
+class CaptionStyle(BaseModel):
+    fontName: str = "Arial"
+    fontSize: int = 48
+    fontColor: str = "&H00FFFFFF"
+    backgroundColor: str = "&H80000000"
+    position: str = "bottom"
+    bold: bool = True
+
+
+class CaptionRequest(BaseModel):
+    words: List[CaptionWord]
+    deleted_indices: List[int] = []
+    format: str = "srt"
+    words_per_line: int = 8
+    style: Optional[CaptionStyle] = None
+    output_path: Optional[str] = None
+
+
+@router.post("/captions")
+async def generate_captions(req: CaptionRequest):
+    try:
+        words_dicts = [w.model_dump() for w in req.words]
+        deleted_set = set(req.deleted_indices)
+
+        if req.format == "srt":
+            content = generate_srt(words_dicts, deleted_set, req.words_per_line)
+        elif req.format == "vtt":
+            content = generate_vtt(words_dicts, deleted_set, req.words_per_line)
+        elif req.format == "ass":
+            style_dict = req.style.model_dump() if req.style else None
+            content = generate_ass(words_dicts, deleted_set, req.words_per_line, style_dict)
+        else:
+            raise HTTPException(status_code=400, detail=f"Unknown format: {req.format}")
+
+        if req.output_path:
+            saved = save_captions(content, req.output_path)
+            return {"status": "ok", "output_path": saved}
+        return PlainTextResponse(content, media_type="text/plain")
+    except HTTPException:
+        raise  # preserve the 400 for unknown formats instead of masking it as a 500
+    except Exception as e:
+        logger.exception("Caption generation failed")
+        raise HTTPException(status_code=500, detail=str(e))
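
The endpoint above delegates to `services.caption_generator`, which is not part of this hunk. As a rough illustration of what the SRT path might do with `words`, `deleted_indices`, and `words_per_line`, here is a minimal sketch. The names `format_srt_time` and `sketch_generate_srt` are hypothetical, and the grouping rule (fixed-size chunks of surviving words, one cue per chunk) is an assumption rather than the project's actual implementation.

```python
from typing import Dict, List, Set


def format_srt_time(seconds: float) -> str:
    """Render seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = max(0, round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def sketch_generate_srt(
    words: List[Dict], deleted: Set[int], words_per_line: int = 8
) -> str:
    """Hypothetical stand-in for generate_srt (assumed behaviour, not the real service)."""
    if words_per_line < 1:
        raise ValueError("words_per_line must be at least 1")

    # Drop every word whose index was marked as deleted in the editor.
    kept = [w for i, w in enumerate(words) if i not in deleted]

    cues = []
    for number, offset in enumerate(range(0, len(kept), words_per_line), start=1):
        chunk = kept[offset:offset + words_per_line]
        text = " ".join(w["word"] for w in chunk)
        # A cue spans from the first word's start to the last word's end.
        span = f"{format_srt_time(chunk[0]['start'])} --> {format_srt_time(chunk[-1]['end'])}"
        cues.append(f"{number}\n{span}\n{text}\n")

    # Cues are separated by a blank line, as SRT players expect.
    return "\n".join(cues)
```

Under these assumptions, a deleted filler word simply disappears from the cue text, while the cue timing still runs from the first surviving word's start to the last surviving word's end.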