Cursor fixed my music video (part 2)

Made another cover. Ran my avsync tool. Sync was off.

Instead of giving up or filing an issue on some random repo, I just… opened Cursor and fixed it myself. This is the story of that debugging session.

The problem

First cover worked great. Second cover? The output video’s sync drifted over time. Started fine, but by the end the audio and video were noticeably off.

I also wanted a new feature: the output video should match the replacement audio’s duration. No more manually trimming the result in iMovie.

The debugging spiral

Tried a bunch of things:

Different ffmpeg flags
Two-pass encoding instead of single-pass
Various trim approaches

Nothing worked. The sync kept drifting.

The funny part

Claude started analyzing the audio patterns more carefully. It computed correlation at different points in time to see where the drift was happening.

The results were… weird. The drift wasn’t linear. It jumped around randomly.

Then it checked the correlation confidence, and it was kinda low. So I checked my phone again and found that I had been syncing the wrong video file the entire time.

The right video worked immediately.
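For the curious, here is roughly what that kind of analysis can look like. This is a minimal sketch, not the actual code in the tool: the function names, the window size, and the 0.3 confidence threshold are all made up for illustration. It uses plain Python lists to stay dependency-free, where a real implementation would more likely use NumPy or an FFT. The idea is to estimate the offset separately in consecutive windows and look at how strong each match is. A real drift shows up as a lag that changes smoothly with high confidence, while a wrong file shows up as low confidence everywhere.

```python
import math
import random

LOW_CONFIDENCE = 0.3  # hypothetical threshold, chosen only for this sketch


def normalize(samples):
    """Return a zero-mean, unit-variance copy of a list of samples."""
    mean = sum(samples) / len(samples)
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / len(samples))
    return [(s - mean) / (std or 1.0) for s in samples]


def best_lag(reference, other, max_lag):
    """Return (lag, confidence) for the shift that best aligns the two signals.

    A positive lag means the same sound shows up `lag` samples later in
    `reference` than in `other`. Confidence is the mean product of the
    normalized, overlapping samples: close to 1 for a clean match, close to 0
    for unrelated audio.
    """
    n = min(len(reference), len(other))
    ref, oth = normalize(reference[:n]), normalize(other[:n])
    best = (0, -math.inf)
    for lag in range(-max_lag, max_lag + 1):
        lo, hi = max(0, -lag), min(n, n - lag)  # indices where both samples exist
        if hi - lo < 2:
            continue
        score = sum(ref[i + lag] * oth[i] for i in range(lo, hi)) / (hi - lo)
        if score > best[1]:
            best = (lag, score)
    return best


def drift_profile(reference, other, window, max_lag):
    """Run best_lag on consecutive windows; returns [(start, lag, confidence)]."""
    n = min(len(reference), len(other))
    profile = []
    for start in range(0, n - window + 1, window):
        lag, conf = best_lag(reference[start:start + window],
                             other[start:start + window], max_lag)
        profile.append((start, lag, conf))
    return profile


def looks_like_wrong_file(profile):
    """True when every window's confidence is below the assumed threshold."""
    return all(conf < LOW_CONFIDENCE for _, _, conf in profile)


if __name__ == "__main__":
    random.seed(0)
    take = [random.gauss(0, 1) for _ in range(3100)]       # stand-in for real audio
    unrelated = [random.gauss(0, 1) for _ in range(3000)]  # stand-in for the wrong file
    print(drift_profile(take[:3000], take[40:3040], window=1000, max_lag=60))
    print(looks_like_wrong_file(drift_profile(take[:3000], unrelated, 1000, 60)))
```

Checking the confidence before trusting the lag is the cheap part, and it is the check that would have ended this debugging session in a few seconds.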

The optimization rabbit hole

With the bug fixed, I got curious. “Can this be faster?”

The tool was taking 65 seconds. By the end of the session:

Change	Time
Original (two-pass, software encoding)	65s
Single-pass (only encode what we need)	38s
Hardware encoding (VideoToolbox)	17.6s
Hardware decoding too	14.1s
ffmpeg audio extraction	14s

About 4.6x faster (65s down to 14s). Same output quality.

The key optimizations:

Single-pass encoding instead of encoding the full video then trimming
VideoToolbox GPU encoding on macOS
Auto-detect frame rate instead of hardcoding 30fps
Use ffmpeg for audio extraction instead of librosa (way faster for video files)
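Put together, those optimizations might look something like the sketch below. To be clear, this is not the tool's real code: the function names, the 8M bitrate, the 22050 Hz sample rate, and the assumption that the offset is a non-negative number of seconds to skip in the video are all illustrative. The ffmpeg and ffprobe options themselves (-hwaccel videotoolbox, h264_videotoolbox, -ss, -t, -show_entries) are standard ones. The functions only build the command lists, so you can inspect them before running anything.

```python
import subprocess
from fractions import Fraction


def parse_frame_rate(text):
    """Turn ffprobe's rational frame rate (e.g. '30000/1001') into a Fraction."""
    return Fraction(text.strip().splitlines()[0])


def probe_frame_rate(video_path):
    """Ask ffprobe for the first video stream's frame rate instead of assuming 30fps."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=r_frame_rate",
         "-of", "default=noprint_wrappers=1:nokey=1", video_path],
        capture_output=True, text=True, check=True,
    )
    return parse_frame_rate(result.stdout)


def extract_audio_command(video_path, wav_path, sample_rate=22050):
    """ffmpeg command that drops the video (-vn) and writes mono audio for analysis."""
    return ["ffmpeg", "-y", "-i", video_path, "-vn", "-ac", "1",
            "-ar", str(sample_rate), wav_path]


def sync_command(video_path, audio_path, output_path,
                 offset_s, audio_duration_s, fps, hardware=True):
    """Single-pass command: seek, swap the audio, trim, and encode in one go.

    Assumes offset_s >= 0, i.e. the video has extra footage before the
    replacement audio starts.
    """
    cmd = ["ffmpeg", "-y"]
    if hardware:
        cmd += ["-hwaccel", "videotoolbox"]      # hardware decoding (macOS)
    cmd += ["-ss", f"{offset_s:.3f}", "-i", video_path,
            "-i", audio_path,
            "-map", "0:v:0", "-map", "1:a:0",    # video from input 0, audio from input 1
            "-t", f"{audio_duration_s:.3f}",     # stop when the replacement audio ends
            "-r", str(fps),
            "-c:v", "h264_videotoolbox" if hardware else "libx264"]
    if hardware:
        cmd += ["-b:v", "8M"]                    # assumed bitrate for this sketch
    cmd += ["-c:a", "aac", output_path]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; prints the command rather than running it.
    print(" ".join(sync_command("take.mov", "cover.wav", "out.mp4",
                                offset_s=1.25, audio_duration_s=183.5,
                                fps=Fraction(30000, 1001))))
```

Because -ss and -t sit in the same command as the encoder, only the frames that end up in the output get encoded, which is where the single-pass saving comes from.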

What I love about this

A year ago, if my video sync tool broke, I’d be stuck. File an issue, wait for a maintainer, maybe switch to a different tool.

Now I just fix it. The bug was my fault anyway (wrong file lol) but I didn’t know that. I just described the problem and iterated until it worked.

No context switching. No waiting. No explaining my setup to a random support person over email.

This is what coding with AI should feel like.

The tool

Published v0.2.0 to PyPI with all the improvements:

pip install audio-video-sync --upgrade

New in this version:

Output trimmed to match audio duration automatically
Hardware acceleration on macOS
Low confidence warning (would’ve caught my wrong-file mistake)

Hope this was fun to read :)

Sanjeed