In VirtualDub, choose Audio -> Interleaving. There's a field for the audio track's delay in ms. Increasing the number makes the audio start later. I don't know how to play both audio tracks at once in VirtualDub, or whether that's even possible. What I usually do is play the video with the original audio and note some landmark that lines up with the song, like the first block, or a pattern of blocks, or a red in a sea of blues or something like that, and then zero in on the delay from there. Windows Movie Maker might give you an easier go of it.
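If you'd rather work out a starting number than guess, here's a rough Python sketch of the arithmetic. It assumes you've noted when the landmark shows up in the video and when the matching moment occurs in the song, both in seconds. The function name and the example times are made up for illustration; it isn't anything VirtualDub provides.

```python
def audio_delay_ms(video_landmark_s, audio_landmark_s):
    """Return the delay in ms that lines the song up with the video.

    video_landmark_s: time (seconds) at which the landmark appears in the video
    audio_landmark_s: time (seconds) at which the matching moment occurs in the song
    A positive result means the audio should start later; a negative one
    means it would need to start earlier.
    """
    return round((video_landmark_s - audio_landmark_s) * 1000)


# Hypothetical example: the landmark is at 12.48 s in the video
# and at 10.0 s in the song, so the audio needs to start later.
print(audio_delay_ms(12.48, 10.0))
```

You'll probably still need to nudge the value by ear afterwards, since the landmark times are only as accurate as your eyeballing.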
[edit] I just remembered - sometimes when I did that, the audio seemed fine on playback, but once I uploaded the video, the audio started at 0:00 anyway for some reason. For those videos, I had to open the song in Audacity, add the correct amount of silence at the start, export the song to mp3, and use that as the audio track with no delay. It's a bit of a hassle, so if WMM works, it might be the easier route.
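For what it's worth, the silence-padding step can also be scripted. This is only a sketch of the same idea, not what Audacity does internally: it assumes the song is an uncompressed PCM WAV file (Python's standard library can't read or write mp3, so any mp3 conversion would be a separate step), and the function name and file names are placeholders.

```python
import wave


def prepend_silence(src_path, dst_path, delay_ms):
    """Write a copy of a PCM WAV file with delay_ms of silence at the start.

    Returns the number of silent frames that were added.
    """
    with wave.open(src_path, "rb") as src:
        params = src.getparams()
        frames = src.readframes(src.getnframes())

    # Number of sample frames that cover the requested delay.
    n_silent = int(params.framerate * delay_ms / 1000)

    # 8-bit WAV samples are unsigned, so silence is 0x80;
    # wider samples are signed, so silence is 0x00.
    silent_byte = b"\x80" if params.sampwidth == 1 else b"\x00"
    silence = silent_byte * (n_silent * params.nchannels * params.sampwidth)

    with wave.open(dst_path, "wb") as dst:
        dst.setparams(params)
        # The frame count in the header is corrected when the file is closed.
        dst.writeframes(silence + frames)

    return n_silent


# Hypothetical usage: pad song.wav with 2480 ms of silence.
# prepend_silence("song.wav", "song_padded.wav", 2480)
```

The padded file then goes in as the audio track with the delay set to 0, same as with the Audacity route.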