Building an Automated Video Generation Pipeline with Claude Code

These articles are AI-generated summaries. Please check the original sources for full details.

The pipeline that emerged

Aliaksei Zelianouski developed ‘Simona,’ a customized Claude Code setup capable of building its own toolset for video production. The entire creative effort cost $45.26 across multiple iterations and assets.

Why This Matters

While AI agents are often marketed as seamless, the technical reality involves managing ‘blind’ systems that cannot see their own visual output, leading to timing drifts and synchronization errors. The process demonstrates that high-fidelity output requires a hybrid approach: combining expensive generative AI for hero moments with cheap, deterministic ffmpeg scripts for the bulk of the runtime to manage costs and maintain control.

Key Insights

  • Cost-driven creative decisions: Total spend was $45.26 (2026); high per-clip costs forced a pivot from hyperrealistic images to a cheaper chalkboard style.
  • Skill accretion: The system uses ‘skills’ (small Python CLI wrappers with SKILL.md documentation) to freeze successful API paths and avoid rediscovery.
  • Deterministic editing over eyeballing: Due to LLM blindness, the pipeline relies on precise written editing patterns in ffmpeg rather than visual feedback loops.
  • Reference-to-video consistency: Using reference images and voice samples across models like Seedance 2.0 ensures character and narrator consistency between static and motion clips.
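
To make the "skill" idea concrete, here is a minimal sketch of what a small Python CLI wrapper might look like. It is not taken from Simona: the name `still_to_clip`, its arguments, and the choice of ffmpeg flags are assumptions made for illustration. The point is only that a known-good command line is frozen in one place so the agent does not have to rediscover it.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """CLI for a hypothetical 'still_to_clip' skill (name is illustrative)."""
    parser = argparse.ArgumentParser(
        prog="still_to_clip",
        description="Turn a still image into a fixed-length clip via ffmpeg.",
    )
    parser.add_argument("image", help="path to the input still")
    parser.add_argument("output", help="path to the output video")
    parser.add_argument("--seconds", type=float, default=4.0)
    parser.add_argument("--fps", type=int, default=25)
    return parser


def build_command(image: str, output: str, seconds: float, fps: int) -> list[str]:
    """Return the frozen ffmpeg argument list for this skill."""
    return [
        "ffmpeg", "-y",            # overwrite the output without prompting
        "-loop", "1", "-i", image,  # repeat the single input image
        "-t", f"{seconds:g}",      # clip length in seconds
        "-r", str(fps),            # output frame rate
        "-pix_fmt", "yuv420p",     # widely playable pixel format
        output,
    ]


def main(argv=None) -> str:
    """Parse arguments and print the command (dry run; nothing is executed)."""
    args = build_parser().parse_args(argv)
    line = shlex.join(build_command(args.image, args.output, args.seconds, args.fps))
    print(line)
    return line
```

In this sketch `main()` only prints the command. A real skill would presumably hand the list to `subprocess.run` and describe its arguments in the accompanying SKILL.md.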

Working Examples

Implementation of the Ken Burns effect (slow zoom) rendered at 4K and downscaled with lanczos to prevent jitter.

ffmpeg -i doors.png -vf "zoompan=z='1+(1.4-1)*on/99':d=100:\
x='iw/2-iw/zoom/2':y='ih/2-ih/zoom/2':s=3840x2160:fps=25,\
scale=1920:1080:flags=lanczos" -frames:v 100 scene.mp4
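
The zoom and centering expressions in the command above are plain arithmetic, so they can be checked outside ffmpeg. The sketch below mirrors them in Python; the helper names `zoom_at` and `crop_origin` are hypothetical, and it assumes the output frame counter runs from 0 to 99 over the 100 frames.

```python
def zoom_at(frame: int, total_frames: int = 100,
            start: float = 1.0, end: float = 1.4) -> float:
    """Linear zoom ramp: `start` on the first frame, `end` on the last."""
    if total_frames < 2:
        raise ValueError("need at least two frames to interpolate")
    return start + (end - start) * frame / (total_frames - 1)


def crop_origin(in_w: float, in_h: float, zoom: float) -> tuple[float, float]:
    """Top-left corner that keeps the zoomed window centered.

    Mirrors x='iw/2-iw/zoom/2' and y='ih/2-ih/zoom/2': the visible window
    is (in_w/zoom) by (in_h/zoom), so half of it is subtracted from the center.
    """
    return (in_w / 2 - in_w / zoom / 2, in_h / 2 - in_h / zoom / 2)
```

At `zoom_at(0)` the ramp starts at 1.0, where `crop_origin` returns the top-left corner of the image. As the zoom grows toward 1.4 the window shrinks and its origin moves inward, which is what produces the slow push-in.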

Mixing narration over ambient sound, using adelay to place each narration clip at its timestamp (in milliseconds, one value per channel) and normalize=0 to stop amix from attenuating the inputs.

ffmpeg ... -filter_complex \
"[1:a]adelay=300|300[a1];[2:a]adelay=4500|4500[a2];[3:a]adelay=10000|10000[a3];\
[0:a][a1][a2][a3]amix=inputs=4:duration=first:normalize=0[out]" ...
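
Writing this filter graph by hand gets error-prone as the number of narration clips grows. The sketch below generates the same string from a list of start offsets. The function name `build_mix_filter` is hypothetical, and it assumes input 0 is the ambient bed and inputs 1..N are stereo narration clips, as in the command above.

```python
def build_mix_filter(delays_ms: list[int]) -> str:
    """Build an adelay + amix filter_complex string.

    delays_ms[i] is the start offset, in milliseconds, of narration
    input i + 1; input 0 is the ambient track that sets the duration.
    """
    if not delays_ms:
        raise ValueError("at least one narration delay is required")

    chains = []
    labels = []
    for index, delay in enumerate(delays_ms, start=1):
        if delay < 0:
            raise ValueError("delays must be non-negative")
        label = f"[a{index}]"
        # One delay value per channel (left|right) for stereo input.
        chains.append(f"[{index}:a]adelay={delay}|{delay}{label}")
        labels.append(label)

    mixed_inputs = "[0:a]" + "".join(labels)
    mix = (
        f"{mixed_inputs}amix=inputs={len(delays_ms) + 1}"
        ":duration=first:normalize=0[out]"
    )
    return ";".join(chains + [mix])
```

Calling `build_mix_filter([300, 4500, 10000])` reproduces the filter graph shown above as a single line, ready to pass to `-filter_complex`.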

Practical Applications

  • Use case: Automated content creation using a modular ‘skill’ library (Simona) to wrap various AI APIs into reproducible CLI tools.
  • Pitfall: Granting agents unrestricted git access; a misdirected commit wiped two months of untracked assets.
  • Use case: Seamless transitions between static zooms and AI video by outpainting a frame and compositing the original back via ffmpeg overlay with feathered edges.
  • Pitfall: Relying on generative AI for realistic human faces in Seedance 2.0, which triggers content guardrails.
