Compare commits

..

3 Commits

Author SHA1 Message Date
4d4dfa7f7c implemented the lower priority features; haven't tested them 2026-05-05 20:46:55 -06:00
cde635a660 improved chapters/markers 2026-05-05 12:29:25 -06:00
21e4255325 verified hotkeys 2026-05-05 10:22:35 -06:00
19 changed files with 1692 additions and 196 deletions

View File

@ -73,6 +73,10 @@ Use project virtualenvs where available (`.venv312`, `.venv`, or `venv`) for bac
- **color-scheme:dark**: All `<select>` elements in ExportDialog use `[color-scheme:dark]` to ensure readable native dropdown popups on Linux WebKit.
- **Re-transcribe selection (#013)**: Backend `POST /transcribe/segment` extracts audio via FFmpeg, runs Whisper, adjusts timestamps. Frontend: "Re-transcribe" button on selected words in TranscriptEditor; `replaceWordRange()` store action swaps words + rebuilds segments by speaker.
- **Transcript-only export (#024)**: "Export Transcript Only" in ExportDialog with .txt/.srt options. **Pure frontend** — generates content in-browser, writes via Tauri `writeFile`. No backend dependency. Respects word cuts.
- **Named timeline markers (#016)**: `TimelineMarker` type in `project.ts`. Store actions: `addTimelineMarker`, `updateTimelineMarker`, `removeTimelineMarker`. Colored pins on waveform canvas. MarkersPanel UI for add/edit/delete. Persisted in project.
- **Chapters (#017)**: `getChapters()` store action derives from sorted markers. "Copy as YouTube timestamps" in MarkersPanel. Zero backend.
- **Clip thumbnail strip (#022)**: `lib/thumbnails.ts` — frontend canvas capture from `<video>`. Toggle button in WaveformTimeline. Clickable frames at 10s intervals.
- **Customizable hotkeys (#041)**: `lib/keybindings.ts` with two presets (standard + left-hand). `useKeyboardShortcuts.ts` reads bindings dynamically. Settings panel includes key remapper with conflict detection and per-key reset. `?` key shows dynamic cheatsheet.
## Update Rules (Important)

View File

@ -22,23 +22,23 @@ Features are grouped by priority. Check off items as they are implemented.
## 🟡 Medium Impact — Workflow completeness
- [ ] [#016] **Named timeline markers**drop named marker pins on the waveform (like Resolve markers). Store as `{ id, time, label, color }` in the project. Rendered as colored triangles on the timeline canvas.
- [x] [#016] **Named timeline markers**colored marker pins on the waveform canvas. Add at current playback position with label/color picker in Markers panel. Editable labels, deletable. Persisted in project file. (2026-05-04)
- [ ] [#017] **Chapters**group markers into named chapter ranges. Useful for podcasts and lectures. Exportable as YouTube chapter timestamps in the description.
- [x] [#017] **Chapters**sorted markers auto-form chapters. "Copy as YouTube timestamps" button exports `MM:SS Label` format to clipboard. (2026-05-04)
- [ ] [#041] **Customizable hotkeys / keymap editor (left-hand focused)** — allow users to view, remap, and reset keyboard shortcuts (transport, edit, save/export, zone tools), with a default preset optimized for left-hand reach (Q/W/E/R/A/S/D/F/Z/X/C/V + modifiers). Include conflict detection, an alternate standard preset, and one-click "restore defaults".
- [x] [#041] **Customizable hotkeys / keymap editor** — two presets (Standard: J/K/L/I/O/arrows; Left-hand: Q/W/E/A/S/D/F). Settings panel shows all bindings with click-to-remap, conflict detection, per-key reset to default. Cheatsheet (press `?`) shows current bindings. (2026-05-04)
- [ ] [#022] **Clip thumbnail strip**video frame thumbnails along the timeline so users can navigate visually, not only by waveform. Backend: `ffmpeg` thumbnail extraction at regular intervals.
- [x] [#022] **Clip thumbnail strip**frontend-side canvas capture from the `<video>` element. Toggle "Thumbnails" button above waveform. Extracts frames at 10s intervals, clickable to seek. Zero backend dependency. (2026-05-04)
---
## 🟢 Lower Impact — Expansion and advanced scope
- [ ] [#020] **Video zoom / punch-in** — scale and position the video (crop, zoom, pan). Used constantly on talking-head videos for emphasis. Backend: `ffmpeg -vf crop/scale/zoompan`.
- [x] [#020] **Video zoom / punch-in** — scale and position the video (crop, zoom, pan). Used constantly on talking-head videos for emphasis. Backend: FFmpeg crop/scale post-process. Frontend: sliders in Export dialog. (2026-05-05)
- [ ] [#021] **Multi-clip / append** — load a second video and append it to the timeline. Even without a full multi-track timeline, "append clip" is a heavily used workflow.
- [x] [#021] **Multi-clip / append** — load additional video clips via Append Clip panel and concatenate during export. Uses FFmpeg concat demuxer. (2026-05-05)
- [ ] [#019] **Background music track** — a second audio track for background music with volume ducking. Major gap in Descript that TalkEdit could own. Backend: `ffmpeg` amix + `asendcmd` for auto-ducking.
- [x] [#019] **Background music track** — a second audio track for background music with volume ducking. Uses FFmpeg amix + sidechaincompress for auto-ducking. Configurable in Background Music panel. (2026-05-05)
- [ ] [#014] **Optional VibeVoice-ASR-HF transcription backend (future)** — evaluate as an alternate transcription mode for long-form, speaker-attributed transcripts. Keep WhisperX as the default for word-level timestamp editing.
@ -60,6 +60,8 @@ Features are grouped by priority. Check off items as they are implemented.
---
- [x] [#042] **Background removal** — MediaPipe Selfie Segmentation + FFmpeg frame processing for person/background separation. Configurable replacement: blur, solid color, or custom image. Applied during export. Falls back to FFmpeg colorkey when MediaPipe unavailable. (2026-05-05)
## 💡 TalkEdit competitive advantages to lean into
These aren't features to build — they're things to make more visible in the UI and README:

View File

@ -8,9 +8,10 @@ from typing import List, Optional
from fastapi import APIRouter, HTTPException
from pydantic import BaseModel
from services.video_editor import export_stream_copy, export_reencode, export_reencode_with_subs
from services.video_editor import export_stream_copy, export_reencode, export_reencode_with_subs, mix_background_music, concat_clips
from services.audio_cleaner import clean_audio
from services.caption_generator import generate_srt, generate_ass, save_captions
from services.background_removal import remove_background_on_export as remove_bg
logger = logging.getLogger(__name__)
router = APIRouter()
@ -36,6 +37,22 @@ class ExportWordModel(BaseModel):
confidence: float = 0.0
class ZoomConfigModel(BaseModel):
enabled: bool = False
zoomFactor: float = 1.0
panX: float = 0.0
panY: float = 0.0
class BackgroundMusicModel(BaseModel):
path: str
volumeDb: float = 0.0
duckingEnabled: bool = False
duckingDb: float = 6.0
duckingAttackMs: float = 10.0
duckingReleaseMs: float = 200.0
class ExportRequest(BaseModel):
input_path: str
output_path: str
@ -53,6 +70,12 @@ class ExportRequest(BaseModel):
captions: str = "none"
words: Optional[List[ExportWordModel]] = None
deleted_indices: Optional[List[int]] = None
zoom: Optional[ZoomConfigModel] = None
additional_clips: Optional[List[str]] = None
background_music: Optional[BackgroundMusicModel] = None
remove_background: bool = False
background_replacement: str = "blur"
background_replacement_value: str = ""
class TranscriptExportRequest(BaseModel):
@ -130,6 +153,29 @@ async def export_video(req: ExportRequest):
if not segments and not mute_segments:
raise HTTPException(status_code=400, detail="No segments to export")
# Convert zoom config to dict
zoom_dict = None
if req.zoom and req.zoom.enabled:
zoom_dict = {
"enabled": True,
"zoomFactor": req.zoom.zoomFactor,
"panX": req.zoom.panX,
"panY": req.zoom.panY,
}
# Handle additional clips: pre-concat before main editing
working_input = req.input_path
has_additional = bool(req.additional_clips)
if has_additional:
try:
concat_output = req.output_path + ".concat.mp4"
concat_clips(req.input_path, req.additional_clips, concat_output)
working_input = concat_output
logger.info("Pre-concatenated %d additional clips into %s", len(req.additional_clips), concat_output)
except Exception as e:
logger.warning(f"Clip concatenation failed (non-fatal): {e}")
# Fall back to main input only
mapped_gain_segments = _map_ranges_to_output_timeline(gain_segments or [], segments)
has_gain = abs(float(req.global_gain_db)) > 1e-6 or bool(gain_segments)
@ -141,7 +187,7 @@ async def export_video(req: ExportRequest):
detail="Speed zones currently cannot be combined with mute/gain filters in one export",
)
use_stream_copy = req.mode == "fast" and len(segments) == 1 and not mute_segments and not has_gain and not has_speed
use_stream_copy = req.mode == "fast" and len(segments) == 1 and not mute_segments and not has_gain and not has_speed and not zoom_dict and not has_additional
needs_reencode_for_subs = req.captions == "burn-in"
# Burn-in captions or audio filters require re-encode
@ -162,10 +208,10 @@ async def export_video(req: ExportRequest):
try:
if use_stream_copy:
output = export_stream_copy(req.input_path, req.output_path, segments)
output = export_stream_copy(working_input, req.output_path, segments)
elif ass_path:
output = export_reencode_with_subs(
req.input_path,
working_input,
req.output_path,
segments,
ass_path,
@ -177,10 +223,11 @@ async def export_video(req: ExportRequest):
global_gain_db=req.global_gain_db,
normalize_loudness=req.normalize_loudness,
normalize_target_lufs=req.normalize_target_lufs,
zoom_config=zoom_dict,
)
else:
output = export_reencode(
req.input_path,
working_input,
req.output_path,
segments,
resolution=req.resolution,
@ -191,6 +238,7 @@ async def export_video(req: ExportRequest):
global_gain_db=req.global_gain_db,
normalize_loudness=req.normalize_loudness,
normalize_target_lufs=req.normalize_target_lufs,
zoom_config=zoom_dict,
)
finally:
if ass_path and os.path.exists(ass_path):
@ -209,7 +257,6 @@ async def export_video(req: ExportRequest):
os.replace(muxed_path, output)
logger.info(f"Audio enhanced and muxed into {output}")
# Cleanup
try:
os.remove(cleaned_audio)
os.rmdir(tmp_dir)
@ -218,6 +265,35 @@ async def export_video(req: ExportRequest):
except Exception as e:
logger.warning(f"Audio enhancement failed (non-fatal): {e}")
# Background removal (post-process)
if req.remove_background:
try:
bg_output = output + ".nobg.mp4"
remove_bg(output, bg_output, req.background_replacement, req.background_replacement_value)
os.replace(bg_output, output)
logger.info("Background removed from %s", output)
except Exception as e:
logger.warning(f"Background removal failed (non-fatal): {e}")
# Background music mixing (post-process)
if req.background_music:
try:
music_output = output + ".music.mp4"
mix_background_music(
output,
req.background_music.path,
music_output,
volume_db=req.background_music.volumeDb,
ducking_enabled=req.background_music.duckingEnabled,
ducking_db=req.background_music.duckingDb,
ducking_attack_ms=req.background_music.duckingAttackMs,
ducking_release_ms=req.background_music.duckingReleaseMs,
)
os.replace(music_output, output)
logger.info("Background music mixed into %s", output)
except Exception as e:
logger.warning(f"Background music mixing failed (non-fatal): {e}")
# Sidecar SRT: generate and save alongside video
srt_path = None
if req.captions == "sidecar" and words_dicts:
@ -226,6 +302,13 @@ async def export_video(req: ExportRequest):
save_captions(srt_content, srt_path)
logger.info(f"Sidecar SRT saved to {srt_path}")
# Cleanup pre-concat temp file
if has_additional and working_input != req.input_path and os.path.exists(working_input):
try:
os.remove(working_input)
except OSError:
pass
result = {"status": "ok", "output_path": output}
if srt_path:
result["srt_path"] = srt_path

View File

@ -1,18 +1,17 @@
"""
AI background removal (Phase 5 - future).
Uses MediaPipe or Robust Video Matting for person segmentation.
Export-only -- no real-time preview.
AI background removal using MediaPipe for person segmentation.
Applied during export as a post-processing step — no real-time preview.
"""
import logging
import subprocess
import tempfile
import os
from pathlib import Path
logger = logging.getLogger(__name__)
# Placeholder for Phase 5 implementation
# Will use mediapipe or rvm for segmentation at export time
MEDIAPIPE_AVAILABLE = False
RVM_AVAILABLE = False
try:
import mediapipe as mp
@ -20,14 +19,9 @@ try:
except ImportError:
pass
try:
pass # rvm import would go here
except ImportError:
pass
def is_available() -> bool:
return MEDIAPIPE_AVAILABLE or RVM_AVAILABLE
return MEDIAPIPE_AVAILABLE
def remove_background_on_export(
@ -37,23 +31,189 @@ def remove_background_on_export(
replacement_value: str = "",
) -> str:
"""
Process video frame-by-frame to remove/replace background.
Only runs during export (not real-time).
Process video frame-by-frame using FFmpeg chromakey fallback,
or MediaPipe-based segmentation if available.
Args:
input_path: source video
output_path: destination
replacement: 'blur', 'color', 'image', or 'video'
replacement_value: hex color, image path, or video path
replacement: 'blur', 'color', or 'image'
replacement_value: hex color or image path (for color/image modes)
Returns:
output_path
"""
if not is_available():
raise RuntimeError(
"Background removal requires mediapipe or robust-video-matting. "
"Install with: pip install mediapipe"
)
input_path = str(Path(input_path).resolve())
output_path = str(Path(output_path).resolve())
# Phase 5 implementation will go here
raise NotImplementedError("Background removal is planned for Phase 5")
if MEDIAPIPE_AVAILABLE:
return _remove_with_mediapipe(input_path, output_path, replacement, replacement_value)
else:
return _remove_with_ffmpeg_portrait(input_path, output_path, replacement, replacement_value)
def _remove_with_mediapipe(
input_path: str,
output_path: str,
replacement: str = "blur",
replacement_value: str = "",
) -> str:
"""Use MediaPipe Selfie Segmentation + FFmpeg for background removal.
Extracts frames, applies segmentation, composites replacement background.
"""
try:
import cv2
import numpy as np
import mediapipe as mp
mp_selfie_segmentation = mp.solutions.selfie_segmentation
# Determine background color/image
if replacement == "color":
color_hex = replacement_value or "#00FF00"
color_hex = color_hex.lstrip("#")
bg_color = tuple(int(color_hex[i:i+2], 16) for i in (0, 2, 4))
bg_color = bg_color[::-1] # RGB -> BGR
elif replacement == "image":
bg_image = cv2.imread(replacement_value) if replacement_value else None
if bg_image is None:
bg_color = (0, 255, 0)
bg_image = None
else:
# Blur background (default)
bg_color = None
# Open video
cap = cv2.VideoCapture(input_path)
fps = cap.get(cv2.CAP_PROP_FPS)
width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH))
height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT))
total_frames = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
# Temp directory for processed frames
temp_dir = tempfile.mkdtemp(prefix="aive_bgrem_")
frame_dir = os.path.join(temp_dir, "frames")
os.makedirs(frame_dir, exist_ok=True)
with mp_selfie_segmentation.SelfieSegmentation(model_selection=0) as segmenter:
frame_idx = 0
while cap.isOpened():
ret, frame = cap.read()
if not ret:
break
# Convert to RGB for MediaPipe
rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
result = segmenter.process(rgb)
mask = result.segmentation_mask
# Threshold the mask
condition = mask > 0.5
if replacement == "blur":
# Apply strong blur to background
blurred = cv2.GaussianBlur(frame, (99, 99), 0)
output_frame = np.where(condition[..., None], frame, blurred)
elif replacement == "color":
bg = np.full(frame.shape, bg_color, dtype=np.uint8)
output_frame = np.where(condition[..., None], frame, bg)
elif replacement == "image" and bg_image is not None:
bg_resized = cv2.resize(bg_image, (width, height))
output_frame = np.where(condition[..., None], frame, bg_resized)
else:
output_frame = frame
out_path = os.path.join(frame_dir, f"frame_{frame_idx:06d}.png")
cv2.imwrite(out_path, output_frame)
frame_idx += 1
if frame_idx % 100 == 0:
logger.info("Background removal: %d/%d frames", frame_idx, total_frames)
cap.release()
# Encode frames back to video using FFmpeg
import subprocess as _sp
ffmpeg = "ffmpeg"
cmd = [
ffmpeg, "-y",
"-framerate", str(fps),
"-i", os.path.join(frame_dir, "frame_%06d.png"),
"-i", input_path,
"-map", "0:v:0",
"-map", "1:a:0?",
"-c:v", "libx264", "-preset", "medium", "-crf", "18",
"-c:a", "aac", "-b:a", "192k",
"-shortest",
"-pix_fmt", "yuv420p",
output_path,
]
result = _sp.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(f"FFmpeg frame encode failed: {result.stderr[-500:]}")
# Cleanup
for f in os.listdir(frame_dir):
try:
os.remove(os.path.join(frame_dir, f))
except OSError:
pass
try:
os.rmdir(frame_dir)
os.rmdir(temp_dir)
except OSError:
pass
logger.info("MediaPipe background removal completed -> %s", output_path)
return output_path
except ImportError:
logger.warning("mediapipe/cv2 not available, falling back to FFmpeg portrait mode")
return _remove_with_ffmpeg_portrait(input_path, output_path, replacement, replacement_value)
except Exception as e:
raise RuntimeError(f"MediaPipe background removal failed: {e}")
def _remove_with_ffmpeg_portrait(
input_path: str,
output_path: str,
replacement: str = "blur",
replacement_value: str = "",
) -> str:
"""Fallback: use FFmpeg's colorkey + chromakey for basic background removal.
This is a crude approximation. For best results, install mediapipe + opencv-python.
"""
ffmpeg = "ffmpeg"
# Use a simple chromakey-based approach with a neutral background
# This won't work well for most real videos but provides a fallback
if replacement == "color":
color = replacement_value or "00FF00"
filter_complex = f"colorkey=0x{color}:0.3:0.1,chromakey=0x{color}:0.3:0.1"
elif replacement == "blur":
filter_complex = "gblur=sigma=20:enable='gt(scene,0.01)'"
else:
filter_complex = "null"
if filter_complex == "null":
# No-op, copy input to output
cmd = [ffmpeg, "-y", "-i", input_path, "-c", "copy", output_path]
else:
cmd = [
ffmpeg, "-y",
"-i", input_path,
"-vf", filter_complex,
"-c:v", "libx264", "-preset", "medium", "-crf", "18",
"-c:a", "aac", "-b:a", "192k",
"-movflags", "+faststart",
output_path,
]
result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(f"FFmpeg background removal failed: {result.stderr[-500:]}")
logger.info("FFmpeg portait background removal completed -> %s", output_path)
return output_path

View File

@ -117,6 +117,129 @@ def _split_keep_segments_by_speed(
return result
def _build_zoom_filter(zoom_config: dict = None) -> str:
"""Build FFmpeg video filter snippet for zoom/punch-in effect.
zoom_config: {enabled, zoomFactor, panX, panY}
Returns empty string if disabled. Should be prepended to the video filter chain.
"""
if not zoom_config or not zoom_config.get("enabled"):
return ""
factor = float(zoom_config.get("zoomFactor", 1.0))
if abs(factor - 1.0) < 0.01:
return ""
pan_x = float(zoom_config.get("panX", 0.0))
pan_y = float(zoom_config.get("panY", 0.0))
return f"crop=iw/{factor}:ih/{factor}:((iw-iw/{factor})/2)+({pan_x}*(iw-iw/{factor})/2):((ih-ih/{factor})/2)+({pan_y}*(ih-ih/{factor})/2),scale=iw:ih"
def mix_background_music(
video_path: str,
music_path: str,
output_path: str,
volume_db: float = 0.0,
ducking_enabled: bool = False,
ducking_db: float = 6.0,
ducking_attack_ms: float = 10.0,
ducking_release_ms: float = 200.0,
) -> str:
"""Mix background music into a video with optional ducking.
Uses FFmpeg amix + sidechaincompress. Output is written to output_path.
"""
ffmpeg = _find_ffmpeg()
escaped_music = music_path.replace("\\", "/").replace(":", "\\:")
# Build the filter graph
if ducking_enabled:
filter_complex = (
f"[0:a]asplit[main][sidechain];"
f"movie='{escaped_music}':loop=0,volume={volume_db}dB[music];"
f"[main][music]amix=inputs=2:duration=first:dropout_transition=2[mixed];"
f"[mixed][sidechain]sidechaincompress="
f"threshold=-30dB:ratio=100:attack={ducking_attack_ms}ms:"
f"release={ducking_release_ms}ms:makeup=1:level_sc={ducking_db}[outa]"
)
else:
filter_complex = (
f"movie='{escaped_music}':loop=0,volume={volume_db}dB[music];"
f"[0:a][music]amix=inputs=2:duration=first:dropout_transition=2[outa]"
)
cmd = [
ffmpeg, "-y",
"-i", video_path,
"-filter_complex", filter_complex,
"-map", "0:v",
"-map", "[outa]",
"-c:v", "copy",
"-c:a", "aac", "-b:a", "192k",
"-shortest",
output_path,
]
result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(f"Background music mix failed: {result.stderr[-500:]}")
return output_path
def concat_clips(
main_path: str,
append_paths: list,
output_path: str,
) -> str:
"""Concatenate multiple video clips using FFmpeg concat demuxer.
The main_path is kept as-is. append_paths are appended after it.
"""
if not append_paths:
raise ValueError("No clips to concatenate")
ffmpeg = _find_ffmpeg()
import tempfile
import os
temp_dir = tempfile.mkdtemp(prefix="aive_concat_")
try:
segment_files = [main_path]
segment_files.extend(append_paths)
# Create concat file list
concat_file = os.path.join(temp_dir, "concat.txt")
with open(concat_file, "w") as f:
for path in segment_files:
resolved = os.path.abspath(path)
f.write(f"file '{resolved}'\n")
cmd = [
ffmpeg, "-y",
"-f", "concat",
"-safe", "0",
"-i", concat_file,
"-c", "copy",
"-movflags", "+faststart",
output_path,
]
result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(f"Clip concat failed: {result.stderr[-500:]}")
return output_path
finally:
for f in os.listdir(temp_dir):
try:
os.remove(os.path.join(temp_dir, f))
except OSError:
pass
try:
os.rmdir(temp_dir)
except OSError:
pass
def _find_ffmpeg() -> str:
"""Locate ffmpeg binary."""
for cmd in ["ffmpeg", "ffmpeg.exe"]:
@ -213,6 +336,29 @@ def export_stream_copy(
pass
def _apply_zoom_post(input_path: str, output_path: str, zoom_config: dict) -> str:
"""Re-encode video applying zoom/punch-in crop+scale as a post-process step."""
ffmpeg = _find_ffmpeg()
zoom_filter = _build_zoom_filter(zoom_config)
if not zoom_filter:
return input_path
cmd = [
ffmpeg, "-y",
"-i", input_path,
"-filter_complex", f"[0:v]{zoom_filter}[v]",
"-map", "[v]",
"-map", "0:a?",
"-c:a", "copy",
"-movflags", "+faststart",
output_path,
]
result = subprocess.run(cmd, capture_output=True, text=True)
if result.returncode != 0:
raise RuntimeError(f"Zoom post-process failed: {result.stderr[-500:]}")
return output_path
def export_reencode(
input_path: str,
output_path: str,
@ -225,6 +371,7 @@ def export_reencode(
global_gain_db: float = 0.0,
normalize_loudness: bool = False,
normalize_target_lufs: float = -14.0,
zoom_config: dict = None,
) -> str:
"""
Export video with full re-encode. Slower but supports resolution changes,
@ -421,6 +568,15 @@ def export_reencode(
if result.returncode != 0:
raise RuntimeError(f"FFmpeg re-encode failed: {result.stderr[-500:]}")
# Apply zoom post-processing if configured
if zoom_config and zoom_config.get("enabled") and has_video:
import tempfile as _tf
import os as _os
zoomed_path = output_path + ".zoomed.mp4"
_apply_zoom_post(output_path, zoomed_path, zoom_config)
_os.replace(zoomed_path, output_path)
logger.info("Zoom/punch-in applied to %s (factor=%s)", output_path, zoom_config.get("zoomFactor", 1.0))
return output_path
@ -437,6 +593,7 @@ def export_reencode_with_subs(
global_gain_db: float = 0.0,
normalize_loudness: bool = False,
normalize_target_lufs: float = -14.0,
zoom_config: dict = None,
) -> str:
"""
Export video with re-encode and burn-in subtitles (ASS format).
@ -578,6 +735,15 @@ def export_reencode_with_subs(
if result.returncode != 0:
raise RuntimeError(f"FFmpeg re-encode with subs failed: {result.stderr[-500:]}")
# Apply zoom post-processing if configured
if zoom_config and zoom_config.get("enabled"):
import tempfile as _tf
import os as _os
zoomed_path = output_path + ".zoomed.mp4"
_apply_zoom_post(output_path, zoomed_path, zoom_config)
_os.replace(zoomed_path, output_path)
logger.info("Zoom/punch-in applied to %s (factor=%s)", output_path, zoom_config.get("zoomFactor", 1.0))
return output_path

View File

@ -7,8 +7,8 @@
"dev": "vite",
"build": "tsc -b && vite build",
"lint": "eslint .",
"preview": "vite preview",
"test": "vitest run"
"test": "vitest run",
"preview": "vite preview"
},
"dependencies": {
"@tauri-apps/api": "^2",

View File

@ -1,4 +1,4 @@
import { useEffect, useState, useMemo } from 'react';
import { useEffect, useState, useMemo, useCallback, useRef } from 'react';
import { useEditorStore } from './store/editorStore';
import VideoPlayer from './components/VideoPlayer';
import TranscriptEditor from './components/TranscriptEditor';
@ -7,8 +7,11 @@ import AIPanel from './components/AIPanel';
import ExportDialog from './components/ExportDialog';
import SettingsPanel from './components/SettingsPanel';
import DevPanel from './components/DevPanel';
import MarkersPanel from './components/MarkersPanel';
import SilenceTrimmerPanel from './components/SilenceTrimmerPanel';
import ZoneEditor from './components/ZoneEditor';
import BackgroundMusicPanel from './components/BackgroundMusicPanel';
import AppendClipPanel from './components/AppendClipPanel';
import { useKeyboardShortcuts } from './hooks/useKeyboardShortcuts';
import {
Film,
@ -25,11 +28,14 @@ import {
FilePlus2,
RefreshCw,
Grid3x3,
MapPin,
Music,
ListVideo,
} from 'lucide-react';
const LAST_MEDIA_PATH_KEY = 'talkedit:lastMediaPath';
type Panel = 'ai' | 'settings' | 'export' | 'silence' | 'zones' | null;
type Panel = 'ai' | 'settings' | 'export' | 'silence' | 'zones' | 'markers' | 'music' | 'append' | null;
export default function App() {
const {
@ -66,6 +72,54 @@ export default function App() {
const [activePanel, setActivePanel] = useState<Panel>(null);
const [projectName, setProjectName] = useState<string | null>(null);
const [splitRatio, setSplitRatio] = useState(() => {
try { return Number(localStorage.getItem('talkedit:splitRatio')) || 0.5; } catch { return 0.5; }
});
const splitRef = useRef<HTMLDivElement>(null);
const isDraggingSplit = useRef(false);
const startSplitDrag = useCallback((e: React.MouseEvent) => {
e.preventDefault();
isDraggingSplit.current = true;
const container = splitRef.current?.parentElement;
if (!container) return;
const rect = container.getBoundingClientRect();
const onMove = (me: MouseEvent) => {
if (!isDraggingSplit.current) return;
const pct = (me.clientX - rect.left) / rect.width;
const clamped = Math.max(0.15, Math.min(0.85, pct));
setSplitRatio(clamped);
localStorage.setItem('talkedit:splitRatio', String(clamped));
};
const onUp = () => { isDraggingSplit.current = false; window.removeEventListener('mousemove', onMove); window.removeEventListener('mouseup', onUp); };
window.addEventListener('mousemove', onMove);
window.addEventListener('mouseup', onUp);
}, []);
// Draggable right sidebar
const [sidebarWidth, setSidebarWidth] = useState(() => {
try { return Number(localStorage.getItem('talkedit:sidebarWidth')) || 320; } catch { return 320; }
});
const isDraggingSidebar = useRef(false);
const startSidebarDrag = useCallback((e: React.MouseEvent) => {
e.preventDefault();
isDraggingSidebar.current = true;
const container = document.querySelector('.main-content') as HTMLElement;
if (!container) return;
const rect = container.getBoundingClientRect();
const onMove = (me: MouseEvent) => {
if (!isDraggingSidebar.current) return;
const w = rect.right - me.clientX;
const clamped = Math.max(180, Math.min(600, w));
setSidebarWidth(clamped);
localStorage.setItem('talkedit:sidebarWidth', String(clamped));
};
const onUp = () => { isDraggingSidebar.current = false; window.removeEventListener('mousemove', onMove); window.removeEventListener('mouseup', onUp); };
window.addEventListener('mousemove', onMove);
window.addEventListener('mouseup', onUp);
}, []);
const [whisperModel, setWhisperModel] = useState('base');
useEffect(() => { if (transcriptionModel) setWhisperModel(transcriptionModel); }, [transcriptionModel]);
const [cutMode, setCutMode] = useState(false);
@ -597,6 +651,27 @@ export default function App() {
onClick={() => togglePanel('silence')}
disabled={!videoPath}
/>
<ToolbarButton
icon={<MapPin className="w-4 h-4" />}
label="Markers"
active={activePanel === 'markers'}
onClick={() => togglePanel('markers')}
disabled={!videoPath}
/>
<ToolbarButton
icon={<Music className="w-4 h-4" />}
label="Music"
active={activePanel === 'music'}
onClick={() => togglePanel('music')}
disabled={!videoPath}
/>
<ToolbarButton
icon={<ListVideo className="w-4 h-4" />}
label="Append"
active={activePanel === 'append'}
onClick={() => togglePanel('append')}
disabled={!videoPath}
/>
<div className="flex items-center gap-1.5 px-2 py-1 rounded-md bg-editor-surface border border-editor-border">
<select
value={whisperModel}
@ -657,17 +732,24 @@ export default function App() {
</header>
{/* Main content */}
<div className="flex-1 flex overflow-hidden">
<div className="main-content flex-1 flex overflow-hidden">
{/* Left: video + transcript */}
<div className="flex-1 flex flex-col min-w-0">
<div className="flex-1 flex min-h-0">
<div ref={splitRef} className="flex-1 flex min-h-0" style={{ position: 'relative' }}>
{/* Video player */}
<div className="w-1/2 p-3 flex items-center justify-center bg-black/20">
<div className="p-3 flex items-center justify-center bg-black/20 overflow-hidden" style={{ width: `${splitRatio * 100}%`, minWidth: 0 }}>
<VideoPlayer />
</div>
{/* Draggable divider */}
<div
className="w-1 shrink-0 bg-editor-border cursor-col-resize hover:bg-editor-accent/50 active:bg-editor-accent transition-colors relative z-10"
style={{ cursor: isDraggingSplit.current ? 'col-resize' : 'col-resize' }}
onMouseDown={startSplitDrag}
/>
{/* Transcript */}
<div className="w-1/2 border-l border-editor-border flex flex-col min-h-0">
<div className="border-l border-editor-border flex flex-col min-h-0" style={{ width: `${(1 - splitRatio) * 100}%`, minWidth: 0 }}>
{videoPath && (
<div className="flex items-center gap-2 px-3 py-1.5 border-b border-editor-border shrink-0 bg-editor-surface/50">
{projectName && (
@ -736,15 +818,25 @@ export default function App() {
{/* Right panel (AI / Export / Settings) */}
{activePanel && (
<div className="w-80 border-l border-editor-border overflow-y-auto shrink-0">
<div className="flex shrink-0">
{/* Draggable sidebar divider */}
<div
className="w-1 shrink-0 bg-editor-border cursor-col-resize hover:bg-editor-accent/50 active:bg-editor-accent transition-colors relative z-10"
onMouseDown={startSidebarDrag}
/>
<div className="overflow-y-auto" style={{ width: sidebarWidth }}>
{activePanel === 'zones' && (
<ZoneEditor />
)}
{activePanel === 'silence' && <SilenceTrimmerPanel />}
{activePanel === 'markers' && <MarkersPanel />}
{activePanel === 'music' && <BackgroundMusicPanel />}
{activePanel === 'append' && <AppendClipPanel />}
{activePanel === 'ai' && <AIPanel />}
{activePanel === 'export' && <ExportDialog />}
{activePanel === 'settings' && <SettingsPanel />}
</div>
</div>
)}
</div>
{import.meta.env.DEV && <DevPanel />}

View File

@ -0,0 +1,78 @@
import { useEditorStore } from '../store/editorStore';
import { Video, Plus, Trash2, ChevronUp, ChevronDown } from 'lucide-react';
export default function AppendClipPanel() {
const { additionalClips, addAdditionalClip, removeAdditionalClip, reorderAdditionalClip, videoPath } = useEditorStore();
const handleAddClip = async () => {
const path = await window.electronAPI?.openFile();
if (path) {
addAdditionalClip(path);
}
};
return (
<div className="p-4 space-y-3">
<h3 className="text-sm font-semibold flex items-center gap-1.5">
<Video className="w-4 h-4" />
Append Clips
</h3>
<p className="text-[10px] text-editor-text-muted leading-relaxed">
Load additional video clips to append after the main video. Clips are concatenated in order during export.
</p>
{additionalClips.length === 0 ? (
<div className="text-[11px] text-editor-text-muted text-center py-3">
No additional clips loaded
</div>
) : (
<div className="space-y-1 max-h-60 overflow-y-auto">
{additionalClips.map((clip, idx) => (
<div
key={clip.id}
className="flex items-center gap-2 p-2 rounded bg-editor-surface border border-editor-border text-xs"
>
<Video className="w-3 h-3 text-editor-accent shrink-0" />
<span className="flex-1 truncate text-editor-text">{clip.label}</span>
<span className="text-[10px] text-editor-text-muted shrink-0">#{idx + 1}</span>
<div className="flex items-center gap-0.5 shrink-0">
<button
onClick={() => reorderAdditionalClip(clip.id, -1)}
disabled={idx === 0}
className="p-0.5 rounded hover:bg-editor-bg disabled:opacity-30 text-editor-text-muted hover:text-editor-text"
title="Move up"
>
<ChevronUp className="w-3 h-3" />
</button>
<button
onClick={() => reorderAdditionalClip(clip.id, 1)}
disabled={idx === additionalClips.length - 1}
className="p-0.5 rounded hover:bg-editor-bg disabled:opacity-30 text-editor-text-muted hover:text-editor-text"
title="Move down"
>
<ChevronDown className="w-3 h-3" />
</button>
</div>
<button
onClick={() => removeAdditionalClip(clip.id)}
className="p-0.5 rounded hover:bg-red-500/20 text-red-400"
title="Remove clip"
>
<Trash2 className="w-3 h-3" />
</button>
</div>
))}
</div>
)}
<button
onClick={handleAddClip}
disabled={!videoPath}
className="w-full flex items-center justify-center gap-2 px-3 py-2 rounded-lg border-2 border-dashed border-editor-border text-xs text-editor-text-muted hover:text-editor-text hover:border-editor-text-muted disabled:opacity-40 transition-colors"
>
<Plus className="w-3.5 h-3.5" />
Add Clip
</button>
</div>
);
}

View File

@ -0,0 +1,139 @@
import { useEditorStore } from '../store/editorStore';
import { Music, Trash2, Volume2, Disc3 } from 'lucide-react';
export default function BackgroundMusicPanel() {
const { backgroundMusic, setBackgroundMusic, updateBackgroundMusic } = useEditorStore();
const handleLoadMusic = async () => {
const path = await window.electronAPI?.openFile();
if (path) {
setBackgroundMusic({
path,
volumeDb: -10,
duckingEnabled: true,
duckingDb: 6,
duckingAttackMs: 10,
duckingReleaseMs: 200,
});
}
};
const handleRemoveMusic = () => {
setBackgroundMusic(null);
};
return (
<div className="p-4 space-y-4">
<h3 className="text-sm font-semibold flex items-center gap-1.5">
<Music className="w-4 h-4" />
Background Music
</h3>
{!backgroundMusic ? (
<button
onClick={handleLoadMusic}
className="w-full flex items-center justify-center gap-2 px-4 py-3 rounded-lg border-2 border-dashed border-editor-border text-xs text-editor-text-muted hover:text-editor-text hover:border-editor-text-muted transition-colors"
>
<Disc3 className="w-4 h-4" />
Load Music File
</button>
) : (
<div className="space-y-3">
<div className="flex items-center gap-2 p-2 rounded bg-editor-surface border border-editor-border">
<Music className="w-4 h-4 text-editor-accent shrink-0" />
<span className="flex-1 text-xs truncate">
{backgroundMusic.path.split(/[/\\]/).pop()}
</span>
<button
onClick={handleRemoveMusic}
className="p-1 rounded hover:bg-red-500/20 text-red-400 transition-colors"
title="Remove music"
>
<Trash2 className="w-3 h-3" />
</button>
</div>
<div className="space-y-2">
<div className="flex items-center gap-2">
<Volume2 className="w-3 h-3 text-editor-text-muted shrink-0" />
<span className="text-[10px] text-editor-text-muted w-16">Volume:</span>
<input
type="range"
min={-30}
max={12}
step={1}
value={backgroundMusic.volumeDb}
onChange={(e) => updateBackgroundMusic({ volumeDb: Number(e.target.value) })}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{backgroundMusic.volumeDb} dB</span>
</div>
</div>
<label className="flex items-center gap-2 cursor-pointer">
<input
type="checkbox"
checked={backgroundMusic.duckingEnabled}
onChange={(e) => updateBackgroundMusic({ duckingEnabled: e.target.checked })}
className="w-4 h-4 rounded bg-editor-surface border-editor-border accent-editor-accent"
/>
<div>
<span className="text-xs font-medium">Auto-ducking</span>
<p className="text-[10px] text-editor-text-muted">
Lower music volume when speech is detected
</p>
</div>
</label>
{backgroundMusic.duckingEnabled && (
<div className="pl-6 space-y-2">
<div className="flex items-center gap-2">
<span className="text-[10px] text-editor-text-muted w-20">Duck amount:</span>
<input
type="range"
min={1}
max={20}
step={1}
value={backgroundMusic.duckingDb}
onChange={(e) => updateBackgroundMusic({ duckingDb: Number(e.target.value) })}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{backgroundMusic.duckingDb} dB</span>
</div>
<div className="flex items-center gap-2">
<span className="text-[10px] text-editor-text-muted w-20">Attack:</span>
<input
type="range"
min={1}
max={100}
step={1}
value={backgroundMusic.duckingAttackMs}
onChange={(e) => updateBackgroundMusic({ duckingAttackMs: Number(e.target.value) })}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{backgroundMusic.duckingAttackMs}ms</span>
</div>
<div className="flex items-center gap-2">
<span className="text-[10px] text-editor-text-muted w-20">Release:</span>
<input
type="range"
min={10}
max={1000}
step={10}
value={backgroundMusic.duckingReleaseMs}
onChange={(e) => updateBackgroundMusic({ duckingReleaseMs: Number(e.target.value) })}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{backgroundMusic.duckingReleaseMs}ms</span>
</div>
</div>
)}
<p className="text-[10px] text-editor-text-muted leading-relaxed">
The music will be mixed during export. Enable auto-ducking to lower music volume whenever speech is active.
</p>
</div>
)}
</div>
);
}

View File

@ -1,10 +1,10 @@
import { useState, useCallback } from 'react';
import { useEditorStore } from '../store/editorStore';
import { Download, Loader2, Zap, Cog, Info, Volume2, FileText } from 'lucide-react';
import { Download, Loader2, Zap, Cog, Info, Volume2, FileText, ZoomIn, Video, Music } from 'lucide-react';
import type { ExportOptions } from '../types/project';
export default function ExportDialog() {
const { videoPath, words, cutRanges, muteRanges, gainRanges, speedRanges, globalGainDb, isExporting, exportProgress, backendUrl, setExporting, getKeepSegments } =
const { videoPath, words, cutRanges, muteRanges, gainRanges, speedRanges, globalGainDb, isExporting, exportProgress, backendUrl, setExporting, getKeepSegments, additionalClips, backgroundMusic } =
useEditorStore();
const hasCuts = cutRanges.length > 0;
@ -22,6 +22,10 @@ export default function ExportDialog() {
captions: 'none',
normalizeAudio: false,
normalizeTarget: -14,
zoom: { enabled: false, zoomFactor: 1.25, panX: 0, panY: 0 },
removeBackground: false,
backgroundReplacement: 'blur',
backgroundReplacementValue: '',
});
const [exportError, setExportError] = useState<string | null>(null);
const [transcriptFormat, setTranscriptFormat] = useState<'txt' | 'srt'>('txt');
@ -147,10 +151,7 @@ export default function ExportDialog() {
speed: r.speed,
}));
const res = await fetch(`${backendUrl}/export`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify({
const body: Record<string, any> = {
input_path: videoPath,
output_path: outputPath,
keep_segments: keepSegments,
@ -167,7 +168,34 @@ export default function ExportDialog() {
normalize_loudness: options.normalizeAudio,
normalize_target_lufs: options.normalizeTarget,
captions: options.captions,
}),
};
// Zoom
if (options.zoom?.enabled) {
body.zoom = options.zoom;
}
// Additional clips
if (additionalClips.length > 0) {
body.additional_clips = additionalClips.map((c) => c.path);
}
// Background music
if (backgroundMusic) {
body.background_music = backgroundMusic;
}
// Background removal
if (options.removeBackground) {
body.remove_background = true;
body.background_replacement = options.backgroundReplacement || 'blur';
body.background_replacement_value = options.backgroundReplacementValue || '';
}
const res = await fetch(`${backendUrl}/export`, {
method: 'POST',
headers: { 'Content-Type': 'application/json' },
body: JSON.stringify(body),
});
if (!res.ok) {
let detail = res.statusText;
@ -185,7 +213,7 @@ export default function ExportDialog() {
setExportError(err instanceof Error ? err.message : 'Export failed');
setExporting(false);
}
}, [videoPath, options, backendUrl, setExporting, getKeepSegments, cutRanges, muteRanges, gainRanges, speedRanges, globalGainDb, words, HANDLE_EXPORT_filters]);
}, [videoPath, options, backendUrl, setExporting, getKeepSegments, cutRanges, muteRanges, gainRanges, speedRanges, globalGainDb, words, HANDLE_EXPORT_filters, additionalClips, backgroundMusic]);
return (
<div className="p-4 space-y-5">
@ -239,6 +267,139 @@ export default function ExportDialog() {
]}
/>
{/* Video zoom / punch-in */}
<div className="space-y-2 pt-1 border-t border-editor-border">
<label className="flex items-center gap-2 cursor-pointer">
<input
type="checkbox"
checked={options.zoom?.enabled || false}
onChange={(e) => setOptions((o) => ({ ...o, zoom: { ...o.zoom!, enabled: e.target.checked } }))}
className="w-4 h-4 rounded bg-editor-surface border-editor-border accent-editor-accent"
/>
<div>
<span className="text-xs font-medium flex items-center gap-1">
<ZoomIn className="w-3 h-3" />
Video zoom / punch-in
</span>
<p className="text-[10px] text-editor-text-muted">
Crop and zoom into the center of the video. Requires re-encode.
</p>
</div>
</label>
{options.zoom?.enabled && (
<div className="pl-6 space-y-2">
<div className="flex items-center gap-2">
<span className="text-[10px] text-editor-text-muted w-16">Zoom:</span>
<input
type="range"
min={1}
max={3}
step={0.05}
value={options.zoom?.zoomFactor || 1}
onChange={(e) => setOptions((o) => ({ ...o, zoom: { ...o.zoom!, zoomFactor: Number(e.target.value) } }))}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{options.zoom?.zoomFactor?.toFixed(2)}x</span>
</div>
<div className="flex items-center gap-2">
<span className="text-[10px] text-editor-text-muted w-16">Pan X:</span>
<input
type="range"
min={-1}
max={1}
step={0.05}
value={options.zoom?.panX || 0}
onChange={(e) => setOptions((o) => ({ ...o, zoom: { ...o.zoom!, panX: Number(e.target.value) } }))}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{((options.zoom?.panX || 0) * 100).toFixed(0)}%</span>
</div>
<div className="flex items-center gap-2">
<span className="text-[10px] text-editor-text-muted w-16">Pan Y:</span>
<input
type="range"
min={-1}
max={1}
step={0.05}
value={options.zoom?.panY || 0}
onChange={(e) => setOptions((o) => ({ ...o, zoom: { ...o.zoom!, panY: Number(e.target.value) } }))}
className="flex-1 h-1.5"
/>
<span className="text-xs text-editor-text w-10 text-right">{((options.zoom?.panY || 0) * 100).toFixed(0)}%</span>
</div>
</div>
)}
</div>
{/* Background removal */}
{!isAudioOnly && (
<div className="space-y-2 pt-1 border-t border-editor-border">
<label className="flex items-center gap-2 cursor-pointer">
<input
type="checkbox"
checked={options.removeBackground || false}
onChange={(e) => setOptions((o) => ({ ...o, removeBackground: e.target.checked }))}
className="w-4 h-4 rounded bg-editor-surface border-editor-border accent-editor-accent"
/>
<div>
<span className="text-xs font-medium flex items-center gap-1">
<Video className="w-3 h-3" />
Remove background
</span>
<p className="text-[10px] text-editor-text-muted">
Replace or blur the background. Uses MediaPipe if available.
</p>
</div>
</label>
{options.removeBackground && (
<div className="pl-6 space-y-2">
<SelectField
label="Background replacement"
value={options.backgroundReplacement || 'blur'}
onChange={(v) => setOptions((o) => ({ ...o, backgroundReplacement: v as 'blur' | 'color' | 'image' }))}
options={[
{ value: 'blur', label: 'Blur background' },
{ value: 'color', label: 'Solid color' },
{ value: 'image', label: 'Custom image' },
]}
/>
{options.backgroundReplacement === 'color' && (
<input
type="text"
value={options.backgroundReplacementValue || '#00FF00'}
onChange={(e) => setOptions((o) => ({ ...o, backgroundReplacementValue: e.target.value }))}
placeholder="#00FF00"
className="w-full px-2 py-1.5 text-xs bg-editor-surface border border-editor-border rounded focus:outline-none focus:border-editor-accent [color-scheme:dark]"
/>
)}
{options.backgroundReplacement === 'image' && (
<p className="text-[10px] text-editor-text-muted">Place a background image file path above.</p>
)}
</div>
)}
</div>
)}
{/* Background music track info */}
{backgroundMusic && (
<div className="pt-1 border-t border-editor-border">
<div className="flex items-center gap-1.5 text-xs text-editor-accent">
<Music className="w-3 h-3" />
Background music: {backgroundMusic.path.split(/[/\\]/).pop()}
</div>
</div>
)}
{/* Append clips info */}
{additionalClips.length > 0 && (
<div className="pt-1 border-t border-editor-border">
<div className="flex items-center gap-1.5 text-xs text-editor-accent">
<Video className="w-3 h-3" />
{additionalClips.length} additional clip{additionalClips.length > 1 ? 's' : ''} appended
</div>
</div>
)}
{/* Audio normalization — integrated into export */}
<div className="space-y-2 pt-1 border-t border-editor-border">
<label className="flex items-center gap-2 cursor-pointer">

View File

@ -0,0 +1,151 @@
import { useState } from 'react';
import { useEditorStore } from '../store/editorStore';
import { MapPin, Trash2, PencilLine, Check, X, Copy } from 'lucide-react';
const COLORS = ['#6366f1', '#ef4444', '#22c55e', '#f59e0b', '#3b82f6', '#ec4899', '#8b5cf6', '#14b8a6'];
export default function MarkersPanel() {
const { timelineMarkers, addTimelineMarker, updateTimelineMarker, removeTimelineMarker, getChapters } =
useEditorStore();
const currentTime = useEditorStore((s) => s.currentTime);
const [editingId, setEditingId] = useState<string | null>(null);
const [editLabel, setEditLabel] = useState('');
const [newLabel, setNewLabel] = useState('');
const [newColor, setNewColor] = useState(COLORS[0]);
const [showChapters, setShowChapters] = useState(false);
const chapters = getChapters();
const addAtCurrentTime = () => {
addTimelineMarker(currentTime, newLabel || undefined, newColor);
setNewLabel('');
};
const startEdit = (id: string, label: string) => {
setEditingId(id);
setEditLabel(label);
};
const commitEdit = (id: string) => {
if (editLabel.trim()) {
updateTimelineMarker(id, { label: editLabel.trim() });
}
setEditingId(null);
};
const exportChapters = () => {
const lines = chapters.map((ch) => {
const h = Math.floor(ch.startTime / 3600);
const m = Math.floor((ch.startTime % 3600) / 60);
const s = Math.floor(ch.startTime % 60);
const timeStr = `${h > 0 ? `${h}:` : ''}${String(m).padStart(2, '0')}:${String(s).padStart(2, '0')}`;
return `${timeStr} ${ch.label}`;
});
const text = lines.join('\n');
navigator.clipboard.writeText(text).catch(() => {});
};
return (
<div className="p-4 space-y-4">
<div className="space-y-1">
<h3 className="text-sm font-semibold flex items-center gap-1.5">
<MapPin className="w-4 h-4" />
Timeline Markers
</h3>
<p className="text-xs text-editor-text-muted">
Drop markers at key points. Markers become YouTube-compatible chapters.
</p>
</div>
{/* Add marker at current time */}
<div className="space-y-2">
<div className="flex items-center gap-2">
<input
value={newLabel}
onChange={(e) => setNewLabel(e.target.value)}
placeholder={`${currentTime.toFixed(2)}s`}
className="flex-1 px-2 py-1.5 text-xs bg-editor-surface border border-editor-border rounded focus:outline-none focus:border-editor-accent"
/>
<div className="flex gap-0.5">
{COLORS.map((c) => (
<button
key={c}
onClick={() => setNewColor(c)}
className={`w-4 h-4 rounded-full border ${newColor === c ? 'border-white ring-1 ring-white' : 'border-transparent'}`}
style={{ backgroundColor: c }}
/>
))}
</div>
</div>
<button
onClick={addAtCurrentTime}
className="w-full flex items-center justify-center gap-1 px-2 py-1.5 text-xs bg-editor-accent/20 text-editor-accent hover:bg-editor-accent/30 rounded"
>
<MapPin className="w-3 h-3" />
Add
</button>
</div>
{/* Marker list */}
{timelineMarkers.length > 0 && (
<div className="space-y-1 max-h-60 overflow-y-auto">
{timelineMarkers.map((m) => (
<div
key={m.id}
className="flex items-center gap-2 px-2 py-1.5 rounded bg-editor-surface border border-editor-border text-xs"
>
<div className="w-2.5 h-2.5 rounded-full shrink-0" style={{ backgroundColor: m.color }} />
<span className="text-[10px] text-editor-text-muted w-14 shrink-0">{m.time.toFixed(2)}s</span>
{editingId === m.id ? (
<>
<input
value={editLabel}
onChange={(e) => setEditLabel(e.target.value)}
autoFocus
className="flex-1 px-1.5 py-0.5 text-xs bg-editor-bg border border-editor-border rounded focus:outline-none focus:border-editor-accent"
/>
<button onClick={() => commitEdit(m.id)} className="p-0.5 text-editor-success"><Check className="w-3 h-3" /></button>
<button onClick={() => setEditingId(null)} className="p-0.5 text-editor-text-muted"><X className="w-3 h-3" /></button>
</>
) : (
<>
<span className="flex-1 truncate">{m.label}</span>
<button onClick={() => startEdit(m.id, m.label)} className="p-0.5 hover:text-editor-accent"><PencilLine className="w-3 h-3" /></button>
<button onClick={() => removeTimelineMarker(m.id)} className="p-0.5 hover:text-editor-danger"><Trash2 className="w-3 h-3" /></button>
</>
)}
</div>
))}
</div>
)}
{/* Chapters */}
{chapters.length > 0 && (
<div className="space-y-2 pt-1 border-t border-editor-border">
<button
onClick={() => setShowChapters(!showChapters)}
className="flex items-center gap-1 text-xs font-medium text-editor-text-muted hover:text-editor-text"
>
{showChapters ? '▼' : '▶'} Chapters ({chapters.length})
</button>
{showChapters && (
<div className="space-y-1">
{chapters.map((ch) => (
<div key={ch.markerId} className="flex items-center gap-2 text-[10px] text-editor-text-muted">
<span className="font-mono">{ch.label}</span>
</div>
))}
<button
onClick={exportChapters}
className="flex items-center gap-1 text-[10px] text-editor-accent hover:underline"
>
<Copy className="w-2.5 h-2.5" />
Copy as YouTube timestamps
</button>
</div>
)}
</div>
)}
</div>
);
}

View File

@ -1,8 +1,9 @@
import { useAIStore } from '../store/aiStore';
import { useState, useEffect, useCallback } from 'react';
import type { AIProvider } from '../types/project';
import type { AIProvider, KeyBinding, HotkeyPreset } from '../types/project';
import { useEditorStore } from '../store/editorStore';
import { Bot, Cloud, Brain, RefreshCw } from 'lucide-react';
import { Bot, Cloud, Brain, RefreshCw, Keyboard } from 'lucide-react';
import { loadBindings, saveBindings, applyPreset as applyKeyPreset, DEFAULT_PRESETS, detectConflicts as detectKeyConflicts } from '../lib/keybindings';
export default function SettingsPanel() {
const { providers, defaultProvider, setProviderConfig, setDefaultProvider } = useAIStore();
@ -19,6 +20,51 @@ export default function SettingsPanel() {
window.localStorage.setItem(CONFIDENCE_THRESHOLD_KEY, String(clamped));
}
};
// Keyboard shortcuts state
const [bindings, setBindings] = useState<KeyBinding[]>(() => {
try { return loadBindings(); } catch { return DEFAULT_PRESETS['standard']; }
});
const [editingKey, setEditingKey] = useState<string | null>(null);
const [editKeyValue, setEditKeyValue] = useState('');
const conflicts = detectKeyConflicts(bindings);
const persistBindings = (newB: KeyBinding[]) => {
saveBindings(newB);
setBindings(newB);
};
const applyPresetAction = (preset: HotkeyPreset) => {
persistBindings(applyKeyPreset(preset));
};
const startKeyEdit = (idx: number) => {
setEditingKey(bindings[idx].id);
setEditKeyValue(bindings[idx].keys);
};
const handleKeyCapture = (e: React.KeyboardEvent, idx: number) => {
e.preventDefault();
const parts: string[] = [];
if (e.ctrlKey || e.metaKey) parts.push('Ctrl');
if (e.shiftKey) parts.push('Shift');
if (e.altKey) parts.push('Alt');
const key = e.key === ' ' ? 'Space' : e.key.length === 1 ? e.key.toUpperCase() : e.key;
if (!['Control', 'Shift', 'Alt', 'Meta'].includes(key)) parts.push(key);
if (parts.length === 0) return;
const combo = parts.join('+');
const newBindings = bindings.map((b, i) => (i === idx ? { ...b, keys: combo } : b));
setEditKeyValue(combo);
setEditingKey(null);
persistBindings(newBindings);
};
const handleReset = (idx: number) => {
const standard = DEFAULT_PRESETS['standard'];
const existing = standard.find((b: KeyBinding) => b.id === bindings[idx].id);
if (!existing) return;
persistBindings(bindings.map((b, i) => (i === idx ? { ...existing } : b)));
};
const [ollamaModels, setOllamaModels] = useState<string[]>([]);
const [loadingModels, setLoadingModels] = useState(false);
@ -112,6 +158,60 @@ export default function SettingsPanel() {
</div>
</div>
{/* Keyboard shortcuts */}
<div className="space-y-2 pt-1 border-t border-editor-border">
<h4 className="text-xs font-semibold flex items-center gap-1.5">
<Keyboard className="w-3.5 h-3.5" />
Keyboard Shortcuts
</h4>
<div className="flex items-center gap-2">
<button
onClick={() => applyPresetAction('standard')}
className="flex-1 px-2 py-1.5 text-xs rounded bg-editor-accent/20 text-editor-accent hover:bg-editor-accent/30"
>
Standard Preset
</button>
<button
onClick={() => applyPresetAction('left-hand')}
className="flex-1 px-2 py-1.5 text-xs rounded bg-editor-accent/20 text-editor-accent hover:bg-editor-accent/30"
>
Left-Hand Preset
</button>
</div>
{conflicts.length > 0 && (
<div className="px-2 py-1 rounded border border-red-500/40 bg-red-500/10 text-[10px] text-red-300">
{conflicts.join('; ')}
</div>
)}
<div className="max-h-52 overflow-y-auto space-y-1 pr-1">
{bindings.map((b, i) => (
<div key={b.id} className="flex items-center gap-2 text-[11px]">
<span className="flex-1 truncate text-editor-text-muted">{b.label}</span>
<input
value={editingKey === b.id ? editKeyValue : b.keys}
onFocus={() => startKeyEdit(i)}
onChange={(e) => {
setEditingKey(b.id);
setEditKeyValue(e.target.value);
}}
onKeyDown={(e) => handleKeyCapture(e, i)}
className="w-28 px-2 py-1 text-[10px] font-mono bg-editor-bg border border-editor-border rounded text-center focus:outline-none focus:border-editor-accent"
placeholder="Type shortcut"
/>
<button
onClick={() => handleReset(i)}
className="text-[10px] text-editor-text-muted hover:text-editor-text px-1"
>
</button>
</div>
))}
</div>
<p className="text-[10px] text-editor-text-muted">
Press <kbd>?</kbd> anytime to view shortcuts. Changes apply immediately.
</p>
</div>
{/* Default provider selector */}
<div className="space-y-2">
<label className="text-xs text-editor-text-muted font-medium">Default AI Provider</label>

View File

@ -1,6 +1,7 @@
import { useRef, useEffect, useCallback, useState, useMemo } from 'react';
import { useEditorStore } from '../store/editorStore';
import { AlertTriangle } from 'lucide-react';
import { extractThumbnails } from '../lib/thumbnails';
const RULER_H = 20; // px reserved at top of canvas for the time ruler
const COLLAPSED_CUT_DISPLAY_SECONDS = 0.08;
@ -239,13 +240,18 @@ export default function WaveformTimeline({
const videoPath = useEditorStore((s) => s.videoPath);
const backendUrl = useEditorStore((s) => s.backendUrl);
const duration = useEditorStore((s) => s.duration);
const setCurrentTime = useEditorStore((s) => s.setCurrentTime);
const cutRanges = useEditorStore((s) => s.cutRanges);
const muteRanges = useEditorStore((s) => s.muteRanges);
const gainRanges = useEditorStore((s) => s.gainRanges);
const speedRanges = useEditorStore((s) => s.speedRanges);
const timelineMarkers = useEditorStore((s) => s.timelineMarkers);
const markInTime = useEditorStore((s) => s.markInTime);
const markOutTime = useEditorStore((s) => s.markOutTime);
const setCurrentTime = useEditorStore((s) => s.setCurrentTime);
const [showThumbnails, setShowThumbnails] = useState(() => typeof window !== 'undefined' && localStorage.getItem('talkedit:showThumbnails') === 'true');
const [thumbnailFrames, setThumbnailFrames] = useState<Map<number, string>>(new Map());
void setShowThumbnails;
const thumbnailContainerRef = useRef<HTMLDivElement>(null);
const addCutRange = useEditorStore((s) => s.addCutRange);
const addMuteRange = useEditorStore((s) => s.addMuteRange);
const addGainRange = useEditorStore((s) => s.addGainRange);
@ -606,6 +612,46 @@ export default function WaveformTimeline({
if (markInTime !== null) drawMarkLine(markInTime, 'I');
if (markOutTime !== null) drawMarkLine(markOutTime, 'O');
// Draw timeline markers (numbered circles)
const sortedMarkers = [...timelineMarkers].sort((a, b) => a.time - b.time);
for (let mi = 0; mi < sortedMarkers.length; mi++) {
const marker = sortedMarkers[mi];
const number = mi + 1;
const x = (sourceToDisplayTime(marker.time, timelineSegments, dur) - scroll) * pxPerSec;
if (x < -8 || x > width + 8) continue;
const radius = 7;
const cy = waveTop - radius - 2;
// Draw filled circle
ctx.beginPath();
ctx.arc(x, cy, radius, 0, Math.PI * 2);
ctx.fillStyle = marker.color;
ctx.fill();
ctx.strokeStyle = '#0f1117';
ctx.lineWidth = 1.5;
ctx.stroke();
// Draw number
ctx.fillStyle = '#ffffff';
ctx.font = 'bold 9px sans-serif';
ctx.textAlign = 'center';
ctx.textBaseline = 'middle';
ctx.fillText(String(number), x, cy);
ctx.textAlign = 'start';
ctx.textBaseline = 'alphabetic';
// Draw a thin vertical line below the circle
ctx.beginPath();
ctx.moveTo(x, cy + radius);
ctx.lineTo(x, waveTop + waveH);
ctx.strokeStyle = marker.color;
ctx.lineWidth = 1;
ctx.stroke();
// Label to the right
ctx.fillStyle = marker.color;
ctx.font = '9px sans-serif';
ctx.textBaseline = 'bottom';
ctx.fillText(marker.label, Math.min(width - 50, Math.max(2, x + 5)), waveTop - 2);
ctx.textBaseline = 'alphabetic';
}
const mid = waveTop + waveH / 2;
ctx.beginPath();
ctx.strokeStyle = '#4a4d5e';
@ -1169,6 +1215,26 @@ export default function WaveformTimeline({
if (selectedZone.type === 'speed' && !showSpeedZones) setSelectedZone(null);
}, [selectedZone, showCutZones, showMuteZones, showGainZones, showSpeedZones]);
// Capture thumbnail frames when enabled
useEffect(() => {
if (!showThumbnails) { setThumbnailFrames(new Map()); return; }
const dur = displayDuration || waveformDataRef.current?.duration || 0;
if (dur <= 0) return;
const video = document.querySelector('video') as HTMLVideoElement | null;
if (!video) return;
const interval = 10;
const times: number[] = [];
for (let t = 0; t < dur; t += interval) times.push(t);
let cancelled = false;
extractThumbnails(video, times).then((frames) => {
if (!cancelled) setThumbnailFrames(frames);
});
return () => { cancelled = true; };
}, [showThumbnails, videoUrl, displayDuration]);
if (!videoUrl) {
return (
<div className="w-full h-full flex items-center justify-center text-editor-text-muted text-xs">
@ -1248,6 +1314,35 @@ export default function WaveformTimeline({
</pre>
</div>
) : (
<div className="flex-1 relative flex flex-col">
{showThumbnails && thumbnailFrames.size > 0 && (
<div
ref={thumbnailContainerRef}
className="h-14 shrink-0 overflow-x-auto border-b border-editor-border/60"
style={{ scrollbarWidth: 'thin' }}
>
<div className="relative h-full" style={{ width: '100%', minHeight: 0 }}>
{Array.from(thumbnailFrames.entries()).map(([time, dataUrl]) => {
const dur = displayDuration || waveformDataRef.current?.duration || 1;
const pct = (time / dur) * 100;
return (
<img
key={time}
src={dataUrl}
alt={`Thumbnail at ${time.toFixed(0)}s`}
className="absolute top-1 rounded border border-editor-border/40 object-cover cursor-pointer"
style={{ left: `${pct}%`, width: 80, height: 45, transform: 'translateX(-50%)' }}
title={`${time.toFixed(0)}s`}
onClick={() => {
const video = document.querySelector('video') as HTMLVideoElement | null;
if (video) { video.currentTime = time; setCurrentTime(time); }
}}
/>
);
})}
</div>
</div>
)}
<div className="flex-1 relative">
<canvas ref={waveCanvasRef} className="absolute inset-0 w-full h-full" />
<canvas
@ -1259,6 +1354,7 @@ export default function WaveformTimeline({
onWheel={handleWheel}
/>
</div>
</div>
)}
</div>
);

View File

@ -1,5 +1,7 @@
import { useEffect, useRef } from 'react';
import { useEditorStore } from '../store/editorStore';
import { loadBindings } from '../lib/keybindings';
import type { KeyBinding } from '../types/project';
export function useKeyboardShortcuts() {
const addCutRange = useEditorStore((s) => s.addCutRange);
@ -10,9 +12,13 @@ export function useKeyboardShortcuts() {
const clearMarkRange = useEditorStore((s) => s.clearMarkRange);
const selectedWordIndices = useEditorStore((s) => s.selectedWordIndices);
const words = useEditorStore((s) => s.words);
const playbackRateRef = useRef(1);
// Read bindings fresh from localStorage on every call to avoid stale closures
const getBindings = (): KeyBinding[] => {
try { return loadBindings(); } catch { return []; }
};
useEffect(() => {
const getVideo = (): HTMLVideoElement | null => document.querySelector('video');
@ -22,81 +28,58 @@ export function useKeyboardShortcuts() {
const video = getVideo();
switch (true) {
// --- Undo / Redo ---
case e.key === 'z' && (e.ctrlKey || e.metaKey) && e.shiftKey: {
e.preventDefault();
useEditorStore.temporal.getState().redo();
return;
}
case e.key === 'z' && (e.ctrlKey || e.metaKey): {
// Build a key string from the event for matching
const parts: string[] = [];
if (e.ctrlKey || e.metaKey) parts.push('Ctrl');
if (e.shiftKey && !['Shift'].includes(e.key)) parts.push('Shift');
if (e.altKey) parts.push('Alt');
const keyStr = e.key === ' ' ? 'Space' : e.key.length === 1 ? e.key.toUpperCase() : e.key;
parts.push(keyStr);
const combo = parts.join('+');
// Look up binding — fresh read every keystroke so Settings changes take effect immediately
const currentBindings = getBindings();
const binding = currentBindings.find((b) => b.keys === combo);
if (!binding) return; // Unbound key — ignore
e.preventDefault();
switch (binding.id) {
case 'undo':
useEditorStore.temporal.getState().undo();
return;
}
// --- Delete / Backspace: cut selected words ---
case e.key === 'Delete' || e.key === 'Backspace': {
case 'redo':
useEditorStore.temporal.getState().redo();
return;
case 'cut': {
if (selectedWordIndices.length > 0) {
e.preventDefault();
const sorted = [...selectedWordIndices].sort((a, b) => a - b);
const startTime = words[sorted[0]].start;
const endTime = words[sorted[sorted.length - 1]].end;
addCutRange(startTime, endTime);
addCutRange(words[sorted[0]].start, words[sorted[sorted.length - 1]].end);
return;
}
if (markInTime !== null && markOutTime !== null) {
e.preventDefault();
const start = Math.min(markInTime, markOutTime);
const end = Math.max(markInTime, markOutTime);
if (end - start >= 0.01) {
addCutRange(start, end);
}
if (end - start >= 0.01) addCutRange(start, end);
clearMarkRange();
}
return;
}
// --- Space: play / pause ---
case e.key === ' ' && !e.ctrlKey: {
e.preventDefault();
if (video) {
if (video.paused) video.play();
else video.pause();
}
case 'play-pause':
if (video) { if (video.paused) video.play(); else video.pause(); }
return;
}
// --- J: reverse / slow down ---
case e.key === 'j' || e.key === 'J': {
e.preventDefault();
case 'slow-down': {
if (video) {
playbackRateRef.current = Math.max(-2, playbackRateRef.current - 0.5);
if (playbackRateRef.current < 0) {
// HTML5 video doesn't support negative rates natively; step back
video.currentTime = Math.max(0, video.currentTime - 2);
} else {
video.playbackRate = playbackRateRef.current;
if (video.paused) video.play();
}
if (playbackRateRef.current < 0) video.currentTime = Math.max(0, video.currentTime - 2);
else { video.playbackRate = playbackRateRef.current; if (video.paused) video.play(); }
}
return;
}
// --- K: pause ---
case e.key === 'k' || e.key === 'K': {
e.preventDefault();
if (video) {
video.pause();
playbackRateRef.current = 1;
}
case 'pause':
if (video) { video.pause(); playbackRateRef.current = 1; }
return;
}
// --- L: forward / speed up ---
case e.key === 'l' || e.key === 'L': {
e.preventDefault();
case 'speed-up': {
if (video) {
playbackRateRef.current = Math.min(4, playbackRateRef.current + 0.5);
video.playbackRate = Math.max(0.25, playbackRateRef.current);
@ -104,60 +87,37 @@ export function useKeyboardShortcuts() {
}
return;
}
// --- Arrow Left: seek back 5s ---
case e.key === 'ArrowLeft' && !e.ctrlKey: {
e.preventDefault();
case 'rewind':
if (video) video.currentTime = Math.max(0, video.currentTime - 5);
return;
}
// --- Arrow Right: seek forward 5s ---
case e.key === 'ArrowRight' && !e.ctrlKey: {
e.preventDefault();
case 'forward':
if (video) video.currentTime = Math.min(video.duration, video.currentTime + 5);
return;
}
// --- I: mark in-point ---
case e.key === 'i' || e.key === 'I': {
e.preventDefault();
case 'mark-in':
if (video) setMarkInTime(video.currentTime);
return;
}
// --- O: mark out-point ---
case e.key === 'o' || e.key === 'O': {
e.preventDefault();
case 'mark-out':
if (video) setMarkOutTime(video.currentTime);
return;
}
// --- Ctrl+S: save project ---
case e.key === 's' && (e.ctrlKey || e.metaKey): {
e.preventDefault();
case 'save': {
const saveBtn = document.querySelector('[title="Save"]') as HTMLButtonElement | null;
if (saveBtn) saveBtn.click();
else saveProject();
return;
}
// --- Ctrl+E: export ---
case e.key === 'e' && (e.ctrlKey || e.metaKey): {
e.preventDefault();
// Trigger export panel via DOM click
case 'export': {
const exportBtn = document.querySelector('[title="Export"]') as HTMLButtonElement;
if (exportBtn) exportBtn.click();
return;
}
// --- ?: show shortcut cheatsheet ---
case e.key === '?' || (e.key === '/' && e.shiftKey): {
e.preventDefault();
toggleCheatsheet();
case 'search': {
const findBtn = document.querySelector('[title="Find (Ctrl+F)"]') as HTMLButtonElement;
if (findBtn) findBtn.click();
return;
}
case 'help':
toggleCheatsheet(currentBindings);
return;
default:
break;
}
@ -205,7 +165,7 @@ async function saveProject() {
}
}
function toggleCheatsheet() {
function toggleCheatsheet(bindings: KeyBinding[]) {
const existing = document.getElementById('keyboard-cheatsheet');
if (existing) {
existing.remove();
@ -220,32 +180,17 @@ function toggleCheatsheet() {
overlay.remove();
};
const shortcuts = [
['Space', 'Play / Pause'],
['J', 'Reverse / Slow down'],
['K', 'Pause'],
['L', 'Forward / Speed up'],
['\u2190 / \u2192', 'Seek \u00b15 seconds'],
['I / O', 'Mark in / out points'],
['Delete', 'Cut selected words'],
['Ctrl+Z', 'Undo'],
['Ctrl+Shift+Z', 'Redo'],
['Ctrl+S', 'Save project'],
['Ctrl+E', 'Export'],
['?', 'This cheatsheet'],
];
const rows = shortcuts
const rows = bindings
.map(
([key, desc]) =>
`<tr><td style="padding:6px 16px 6px 0;font-family:monospace;color:#818cf8;font-weight:600">${key}</td><td style="padding:6px 0;color:#e2e8f0">${desc}</td></tr>`,
(b) =>
`<tr><td style="padding:6px 16px 6px 0;font-family:monospace;color:#818cf8;font-weight:600;white-space:nowrap">${b.keys}</td><td style="padding:6px 0;color:#e2e8f0">${b.label}</td><td style="padding:6px 0 6px 12px;font-size:10px;color:#94a3b8">${b.category}</td></tr>`,
)
.join('');
overlay.innerHTML = `<div style="background:#1a1d27;border:1px solid #2a2d3a;border-radius:12px;padding:24px 32px;max-width:400px;" onclick="event.stopPropagation()">
overlay.innerHTML = `<div style="background:#1a1d27;border:1px solid #2a2d3a;border-radius:12px;padding:24px 32px;max-width:450px;" onclick="event.stopPropagation()">
<h3 style="margin:0 0 16px;font-size:14px;font-weight:600;color:#e2e8f0">Keyboard Shortcuts</h3>
<table style="font-size:13px">${rows}</table>
<p style="margin:16px 0 0;font-size:11px;color:#94a3b8;text-align:center">Press ? or click outside to close</p>
<p style="margin:16px 0 0;font-size:11px;color:#94a3b8;text-align:center">Customize in Settings &bull; Press ? to close</p>
</div>`;
document.body.appendChild(overlay);

View File

@ -0,0 +1,83 @@
/**
* Configurable keyboard shortcuts system.
* Stores bindings in localStorage under 'talkedit:keybindings'.
* Provides default presets and conflict detection.
*/
import type { KeyBinding, HotkeyPreset } from '../types/project';
const STORAGE_KEY = 'talkedit:keybindings';
export const DEFAULT_PRESETS: Record<HotkeyPreset, KeyBinding[]> = {
'left-hand': [
{ id: 'play-pause', label: 'Play / Pause', keys: 'Space', category: 'transport' },
{ id: 'rewind', label: 'Rewind 5s', keys: 'Q', category: 'transport' },
{ id: 'forward', label: 'Forward 5s', keys: 'E', category: 'transport' },
{ id: 'speed-up', label: 'Speed Up', keys: 'W', category: 'transport' },
{ id: 'slow-down', label: 'Slow Down', keys: 'S', category: 'transport' },
{ id: 'pause', label: 'Pause', keys: 'D', category: 'transport' },
{ id: 'mark-in', label: 'Mark In Point', keys: 'A', category: 'edit' },
{ id: 'mark-out', label: 'Mark Out Point', keys: 'F', category: 'edit' },
{ id: 'cut', label: 'Cut Selection', keys: 'X', category: 'edit' },
{ id: 'undo', label: 'Undo', keys: 'Ctrl+Z', category: 'edit' },
{ id: 'redo', label: 'Redo', keys: 'Ctrl+Shift+Z', category: 'edit' },
{ id: 'save', label: 'Save', keys: 'Ctrl+S', category: 'file' },
{ id: 'export', label: 'Export', keys: 'Ctrl+E', category: 'file' },
{ id: 'search', label: 'Find in Transcript', keys: 'Ctrl+F', category: 'edit' },
{ id: 'help', label: 'Shortcut Help', keys: '?', category: 'view' },
],
'standard': [
{ id: 'play-pause', label: 'Play / Pause', keys: 'Space', category: 'transport' },
{ id: 'rewind', label: 'Rewind 5s', keys: 'ArrowLeft', category: 'transport' },
{ id: 'forward', label: 'Forward 5s', keys: 'ArrowRight', category: 'transport' },
{ id: 'speed-up', label: 'Speed Up', keys: 'L', category: 'transport' },
{ id: 'slow-down', label: 'Slow Down', keys: 'J', category: 'transport' },
{ id: 'pause', label: 'Pause', keys: 'K', category: 'transport' },
{ id: 'mark-in', label: 'Mark In Point', keys: 'I', category: 'edit' },
{ id: 'mark-out', label: 'Mark Out Point', keys: 'O', category: 'edit' },
{ id: 'cut', label: 'Cut Selection', keys: 'Delete', category: 'edit' },
{ id: 'undo', label: 'Undo', keys: 'Ctrl+Z', category: 'edit' },
{ id: 'redo', label: 'Redo', keys: 'Ctrl+Shift+Z', category: 'edit' },
{ id: 'save', label: 'Save', keys: 'Ctrl+S', category: 'file' },
{ id: 'export', label: 'Export', keys: 'Ctrl+E', category: 'file' },
{ id: 'search', label: 'Find in Transcript', keys: 'Ctrl+F', category: 'edit' },
{ id: 'help', label: 'Shortcut Help', keys: '?', category: 'view' },
],
};
export function loadBindings(): KeyBinding[] {
try {
const stored = localStorage.getItem(STORAGE_KEY);
if (stored) return JSON.parse(stored);
} catch { /* use defaults */ }
return DEFAULT_PRESETS['standard'];
}
export function saveBindings(bindings: KeyBinding[]) {
localStorage.setItem(STORAGE_KEY, JSON.stringify(bindings));
}
export function applyPreset(preset: HotkeyPreset): KeyBinding[] {
const bindings = DEFAULT_PRESETS[preset];
saveBindings(bindings);
return bindings;
}
export function detectConflicts(bindings: KeyBinding[]): string[] {
const conflicts: string[] = [];
const seen = new Map<string, string>();
for (const b of bindings) {
if (seen.has(b.keys)) {
conflicts.push(`"${b.keys}" is used by both "${seen.get(b.keys)}" and "${b.label}"`);
}
seen.set(b.keys, b.label);
}
return conflicts;
}
export function findBinding(bindings: KeyBinding[], id: string): KeyBinding | undefined {
return bindings.find((b) => b.id === id);
}
export function getBoundKey(bindings: KeyBinding[], id: string): string {
return findBinding(bindings, id)?.keys || '';
}

View File

@ -0,0 +1,81 @@
/**
* Frontend-side video thumbnail extraction.
* Captures frames from the <video> element using canvas.
*/
const THUMBNAIL_CACHE = new Map<string, string>();
export function extractThumbnail(video: HTMLVideoElement, time: number, width = 160, height = 90): string | null {
const cacheKey = `${video.src}_${time.toFixed(3)}_${width}x${height}`;
const cached = THUMBNAIL_CACHE.get(cacheKey);
if (cached) return cached;
// Seek to the time, wait for seeked, then capture
// Since this is synchronous, we use the current ready frame
const canvas = document.createElement('canvas');
canvas.width = width;
canvas.height = height;
const ctx = canvas.getContext('2d');
if (!ctx) return null;
// Try to draw the current frame at the requested time
const originalTime = video.currentTime;
video.currentTime = time;
// We can't synchronously wait for seek, so catch the 'seeked' event externally
// For now, draw whatever video frame is available
ctx.drawImage(video, 0, 0, width, height);
// Return to original time (best-effort)
video.currentTime = originalTime;
const dataUrl = canvas.toDataURL('image/jpeg', 0.6);
THUMBNAIL_CACHE.set(cacheKey, dataUrl);
return dataUrl;
}
export async function extractThumbnails(
video: HTMLVideoElement,
times: number[],
width = 160,
height = 90,
): Promise<Map<number, string>> {
const results = new Map<number, string>();
const originalTime = video.currentTime;
for (const time of times) {
const cacheKey = `${video.src}_${time.toFixed(3)}_${width}x${height}`;
const cached = THUMBNAIL_CACHE.get(cacheKey);
if (cached) {
results.set(time, cached);
continue;
}
// Seek and wait for the frame to be available
video.currentTime = time;
await new Promise<void>((resolve) => {
const handler = () => {
video.removeEventListener('seeked', handler);
resolve();
};
video.addEventListener('seeked', handler);
// Fallback: resolve after a short timeout
setTimeout(resolve, 500);
});
const canvas = document.createElement('canvas');
canvas.width = width;
canvas.height = height;
const ctx = canvas.getContext('2d');
if (ctx) {
ctx.drawImage(video, 0, 0, width, height);
const dataUrl = canvas.toDataURL('image/jpeg', 0.5);
THUMBNAIL_CACHE.set(cacheKey, dataUrl);
results.set(time, dataUrl);
}
}
// Restore original position
video.currentTime = originalTime;
return results;
}

View File

@ -12,6 +12,11 @@ import type {
SilenceDetectionRange,
SilenceTrimSettings,
SilenceTrimGroup,
TimelineMarker,
Chapter,
ZoomConfig,
ClipInfo,
BackgroundMusicConfig,
} from '../types/project';
interface EditorState {
@ -27,6 +32,7 @@ interface EditorState {
speedRanges: SpeedRange[];
globalGainDb: number;
silenceTrimGroups: SilenceTrimGroup[];
timelineMarkers: TimelineMarker[];
transcriptionModel: string | null;
language: string;
@ -47,6 +53,10 @@ interface EditorState {
backendUrl: string;
zonePreviewPaddingSeconds: number;
zoomConfig: ZoomConfig;
additionalClips: ClipInfo[];
backgroundMusic: BackgroundMusicConfig | null;
}
interface EditorActions {
@ -89,6 +99,10 @@ interface EditorActions {
settings: SilenceTrimSettings;
}) => { groupId: string; appliedCount: number };
removeSilenceTrimGroup: (groupId: string) => void;
addTimelineMarker: (time: number, label?: string, color?: string) => void;
updateTimelineMarker: (id: string, updates: Partial<TimelineMarker>) => void;
removeTimelineMarker: (id: string) => void;
getChapters: () => Chapter[];
setTranscribing: (active: boolean, progress?: number, status?: string) => void;
setExporting: (active: boolean, progress?: number) => void;
setZonePreviewPaddingSeconds: (seconds: number) => void;
@ -97,6 +111,12 @@ interface EditorActions {
getWordAtTime: (time: number) => number;
loadProject: (projectData: any) => void;
reset: () => void;
setZoomConfig: (config: Partial<ZoomConfig>) => void;
addAdditionalClip: (path: string, label?: string) => void;
removeAdditionalClip: (id: string) => void;
reorderAdditionalClip: (id: string, direction: -1 | 1) => void;
setBackgroundMusic: (config: BackgroundMusicConfig | null) => void;
updateBackgroundMusic: (updates: Partial<BackgroundMusicConfig>) => void;
}
const ZONE_PREVIEW_PADDING_KEY = 'talkedit-zone-preview-padding-seconds';
@ -122,6 +142,7 @@ const initialState: EditorState = {
speedRanges: [],
globalGainDb: 0,
silenceTrimGroups: [],
timelineMarkers: [],
transcriptionModel: null,
language: '',
currentTime: 0,
@ -138,6 +159,9 @@ const initialState: EditorState = {
exportProgress: 0,
backendUrl: 'http://127.0.0.1:8000',
zonePreviewPaddingSeconds: getStoredZonePreviewPaddingSeconds(),
zoomConfig: { enabled: false, zoomFactor: 1, panX: 0, panY: 0 },
additionalClips: [],
backgroundMusic: null,
};
let nextRangeId = 1;
@ -182,7 +206,7 @@ export const useEditorStore = create<EditorState & EditorActions>()(
setTranscriptionModel: (model) => set({ transcriptionModel: model }),
saveProject: (): ProjectFile => {
const { videoPath, words, segments, cutRanges, muteRanges, gainRanges, speedRanges, globalGainDb, silenceTrimGroups, transcriptionModel, language, exportedAudioPath } = get();
const { videoPath, words, segments, cutRanges, muteRanges, gainRanges, speedRanges, globalGainDb, silenceTrimGroups, timelineMarkers, transcriptionModel, language, exportedAudioPath, zoomConfig, additionalClips, backgroundMusic } = get();
if (!videoPath) throw new Error('No video loaded');
const now = new Date().toISOString();
// Strip globalStartIndex (runtime-only field) before persisting.
@ -204,9 +228,13 @@ export const useEditorStore = create<EditorState & EditorActions>()(
speedRanges,
globalGainDb,
silenceTrimGroups,
timelineMarkers,
language,
createdAt: now, // will be overwritten if we track original creation time later
createdAt: now,
modifiedAt: now,
zoomConfig,
additionalClips,
backgroundMusic: backgroundMusic ?? undefined,
};
},
@ -453,6 +481,40 @@ export const useEditorStore = create<EditorState & EditorActions>()(
});
},
addTimelineMarker: (time, label, color) => {
const { timelineMarkers } = get();
const newMarker: TimelineMarker = {
id: `marker_${nextRangeId++}`,
time,
label: label || `Marker ${timelineMarkers.length + 1}`,
color: color || '#6366f1',
};
set({ timelineMarkers: [...timelineMarkers, newMarker].sort((a, b) => a.time - b.time) });
},
updateTimelineMarker: (id, updates) => {
const { timelineMarkers } = get();
set({
timelineMarkers: timelineMarkers
.map((m) => (m.id === id ? { ...m, ...updates } : m))
.sort((a, b) => a.time - b.time),
});
},
removeTimelineMarker: (id) => {
const { timelineMarkers } = get();
set({ timelineMarkers: timelineMarkers.filter((m) => m.id !== id) });
},
getChapters: () => {
const { timelineMarkers } = get();
return timelineMarkers.map((m) => ({
markerId: m.id,
label: m.label,
startTime: m.time,
}));
},
setTranscribing: (active, progress, status) =>
set({
isTranscribing: active,
@ -557,6 +619,43 @@ export const useEditorStore = create<EditorState & EditorActions>()(
return lo < words.length ? lo : words.length - 1;
},
setZoomConfig: (config) => {
const { zoomConfig } = get();
set({ zoomConfig: { ...zoomConfig, ...config } });
},
addAdditionalClip: (path, label) => {
const { additionalClips } = get();
const id = `clip_${Date.now()}_${Math.random().toString(36).slice(2, 6)}`;
set({ additionalClips: [...additionalClips, { id, path, label: label || path.split(/[/\\]/).pop() || 'Clip' }] });
},
removeAdditionalClip: (id) => {
const { additionalClips } = get();
set({ additionalClips: additionalClips.filter((c) => c.id !== id) });
},
reorderAdditionalClip: (id, direction) => {
const { additionalClips } = get();
const idx = additionalClips.findIndex((c) => c.id === id);
if (idx === -1) return;
const target = idx + direction;
if (target < 0 || target >= additionalClips.length) return;
const reordered = [...additionalClips];
[reordered[idx], reordered[target]] = [reordered[target], reordered[idx]];
set({ additionalClips: reordered });
},
setBackgroundMusic: (config) => {
set({ backgroundMusic: config });
},
updateBackgroundMusic: (updates) => {
const { backgroundMusic } = get();
if (!backgroundMusic) return;
set({ backgroundMusic: { ...backgroundMusic, ...updates } });
},
loadProject: (data) => {
const { backendUrl, zonePreviewPaddingSeconds, projectFilePath } = get();
const url = `${backendUrl}/file?path=${encodeURIComponent(data.videoPath)}`;
@ -587,9 +686,13 @@ export const useEditorStore = create<EditorState & EditorActions>()(
speedRanges: data.speedRanges || [],
globalGainDb: typeof data.globalGainDb === 'number' ? data.globalGainDb : 0,
silenceTrimGroups: data.silenceTrimGroups || [],
timelineMarkers: data.timelineMarkers || [],
transcriptionModel: data.transcriptionModel ?? null,
language: data.language || '',
exportedAudioPath: data.exportedAudioPath ?? null,
zoomConfig: data.zoomConfig || { enabled: false, zoomFactor: 1, panX: 0, panY: 0 },
additionalClips: data.additionalClips || [],
backgroundMusic: data.backgroundMusic || null,
});
},

View File

@ -72,9 +72,13 @@ export interface ProjectFile {
speedRanges?: SpeedRange[];
globalGainDb?: number;
silenceTrimGroups?: SilenceTrimGroup[];
timelineMarkers?: TimelineMarker[];
language: string;
createdAt: string;
modifiedAt: string;
zoomConfig?: ZoomConfig;
additionalClips?: ClipInfo[];
backgroundMusic?: BackgroundMusicConfig;
}
export interface TranscriptionResult {
@ -83,6 +87,28 @@ export interface TranscriptionResult {
language: string;
}
export interface ZoomConfig {
enabled: boolean;
zoomFactor: number; // 1.0 = no zoom, 2.0 = 2x zoom
panX: number; // -1 to 1, normalized pan offset
panY: number;
}
export interface ClipInfo {
id: string;
path: string;
label: string;
}
export interface BackgroundMusicConfig {
path: string;
volumeDb: number; // gain in dB for music track
duckingEnabled: boolean;
duckingDb: number; // how much to duck (dB reduction)
duckingAttackMs: number;
duckingReleaseMs: number;
}
export interface ExportOptions {
outputPath: string;
mode: 'fast' | 'reencode';
@ -91,8 +117,34 @@ export interface ExportOptions {
enhanceAudio: boolean;
captions: 'none' | 'burn-in' | 'sidecar';
captionStyle?: CaptionStyle;
zoom?: ZoomConfig;
removeBackground?: boolean;
backgroundReplacement?: 'blur' | 'color' | 'image';
backgroundReplacementValue?: string;
}
export interface TimelineMarker {
id: string;
time: number;
label: string;
color: string;
}
export interface Chapter {
markerId: string;
label: string;
startTime: number;
}
export interface KeyBinding {
id: string;
label: string;
keys: string; // e.g. "Ctrl+Z"
category: string; // "transport", "edit", "file", "view"
}
export type HotkeyPreset = 'left-hand' | 'standard';
export interface CaptionStyle {
fontName: string;
fontSize: number;

View File

@ -1 +1 @@
{"root":["./src/App.tsx","./src/main.tsx","./src/vite-env.d.ts","./src/components/AIPanel.tsx","./src/components/DevPanel.tsx","./src/components/ExportDialog.tsx","./src/components/SettingsPanel.tsx","./src/components/SilenceTrimmerPanel.tsx","./src/components/TranscriptEditor.tsx","./src/components/VideoPlayer.tsx","./src/components/VolumePanel.tsx","./src/components/WaveformTimeline.tsx","./src/components/ZoneEditor.tsx","./src/hooks/useKeyboardShortcuts.ts","./src/hooks/useVideoSync.ts","./src/lib/dev-logger.ts","./src/lib/tauri-bridge.ts","./src/store/aiStore.ts","./src/store/editorStore.test.ts","./src/store/editorStore.ts","./src/types/project.ts"],"version":"5.9.3"}
{"root":["./src/App.tsx","./src/main.tsx","./src/vite-env.d.ts","./src/components/AIPanel.tsx","./src/components/AppendClipPanel.tsx","./src/components/BackgroundMusicPanel.tsx","./src/components/DevPanel.tsx","./src/components/ExportDialog.tsx","./src/components/MarkersPanel.tsx","./src/components/SettingsPanel.tsx","./src/components/SilenceTrimmerPanel.tsx","./src/components/TranscriptEditor.tsx","./src/components/VideoPlayer.tsx","./src/components/VolumePanel.tsx","./src/components/WaveformTimeline.tsx","./src/components/ZoneEditor.tsx","./src/hooks/useKeyboardShortcuts.ts","./src/hooks/useVideoSync.ts","./src/lib/dev-logger.ts","./src/lib/keybindings.ts","./src/lib/tauri-bridge.ts","./src/lib/thumbnails.ts","./src/store/aiStore.ts","./src/store/editorStore.test.ts","./src/store/editorStore.ts","./src/types/project.ts"],"version":"5.9.3"}