Made another cover. Ran my avsync tool. Sync was off.
Instead of giving up or filing an issue on some random repo, I just… opened Cursor and fixed it myself. This is the story of that debugging session.
The problem
First cover worked great. Second cover? The output video’s sync drifted over time. Started fine, but by the end the audio and video were noticeably off.
I also wanted a new feature: the output video should match the replacement audio’s duration. No more manually trimming the result in iMovie.
The debugging spiral
Tried a bunch of things:
- Different ffmpeg flags
- Two-pass encoding instead of single-pass
- Various trim approaches
Nothing worked. The sync kept drifting.
The funny part
Claude started analyzing the audio patterns more carefully. It computed correlation at different points in time to see where the drift was happening.
The results were… weird. The drift wasn’t linear. It jumped around randomly.
Then it checked the correlation confidence, it was kinda low. So I checked my phone again to find that I was syncing the wrong video file the entire time.
The right video worked immediately.
The optimization rabbit hole
With the bug fixed, I got curious. “Can this be faster?”
The tool was taking 65 seconds. By the end of the session:
| Change | Time |
|---|---|
| Original (two-pass, software encoding) | 65s |
| Single-pass (only encode what we need) | 38s |
| Hardware encoding (VideoToolbox) | 17.6s |
| Hardware decoding too | 14.1s |
| ffmpeg audio extraction | 14s |
4.7x faster. Same output quality.
The key optimizations:
- Single-pass encoding instead of encoding the full video then trimming
- VideoToolbox GPU encoding on macOS
- Auto-detect frame rate instead of hardcoding 30fps
- Use ffmpeg for audio extraction instead of librosa (way faster for video files)
What I love about this
A year ago, if my video sync tool broke, I’d be stuck. File an issue, wait for a maintainer, maybe switch to a different tool.
Now I just fix it. The bug was my fault anyway (wrong file lol) but I didn’t know that. I just described the problem and iterated until it worked.
No context switching. No waiting. No explaining my setup to random support person on email.
This is what coding with AI should feel like.
The tool
Published v0.2.0 to PyPI with all the improvements:
pip install audio-video-sync --upgradeNew in this version:
- Output trimmed to match audio duration automatically
- Hardware acceleration on macOS
- Low confidence warning (would’ve caught my wrong-file mistake)
Hope this was fun to read :)