word-timestamps-extractor

Skill ID: $word-timestamps-extractor · Source

Purpose

Takes a narration audio file and returns a transcript with sentence-level and word-level timestamps, used for subtitles, deduplication, and aligning assets to scenes.

Run it after the voice audio is available (from $ausynclab-voice or a manual upload).
Run it before $semantic-asset-mapper and $video-render-plan-builder, since both need a word-level timeline.
Also run it when the user wants to change how subtitles are rendered or swap the source audio.

Requires OPENAI_API_KEY in .env. Type:

$word-timestamps-extractor — extract word timestamps from source/voice.wav, language vi

The agent calls Whisper/OpenAI transcription through a script, normalizes the output, and writes it to source/transcript_word_level.toml.
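The normalize-and-write step might look roughly like the sketch below. It is not the skill's actual script. It assumes the raw payload resembles a verbose JSON transcription response with a `words` list (`word`, `start`, `end`) and a `segments` list (`text`, `start`, `end`). The function names `normalize` and `to_toml` and the TOML layout (`[[sentences]]` and `[[words]]` tables) are hypothetical, chosen only for illustration.

```python
import json


def normalize(raw: dict) -> dict:
    """Reduce a raw transcription payload to sentence and word timings.

    Assumed input shape (hypothetical): {"language": ..., "segments": [...], "words": [...]}.
    """
    def clean(items, text_key):
        rows = []
        for item in items:
            text = str(item.get(text_key, "")).strip()
            if not text:
                continue  # drop empty or whitespace-only entries
            rows.append({
                text_key: text,
                "start": round(float(item["start"]), 3),  # seconds, millisecond precision
                "end": round(float(item["end"]), 3),
            })
        # Keep the timeline in chronological order
        return sorted(rows, key=lambda r: r["start"])

    return {
        "language": raw.get("language", ""),
        "sentences": clean(raw.get("segments", []), "text"),
        "words": clean(raw.get("words", []), "word"),
    }


def to_toml(doc: dict) -> str:
    """Serialize the normalized transcript as TOML arrays of tables.

    Strings are quoted with json.dumps, whose escapes are also valid
    in TOML basic strings for ordinary transcript text.
    """
    lines = [f'language = {json.dumps(doc["language"], ensure_ascii=False)}']
    for table in ("sentences", "words"):
        for row in doc[table]:
            lines.append("")
            lines.append(f"[[{table}]]")
            for key, value in row.items():
                if isinstance(value, str):
                    value = json.dumps(value, ensure_ascii=False)
                lines.append(f"{key} = {value}")
    return "\n".join(lines) + "\n"
```

Under these assumptions, the result of `to_toml(normalize(raw))` would be written to source/transcript_word_level.toml with UTF-8 encoding so that Vietnamese diacritics survive unchanged.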
