My question is: how does this work from a technical standpoint? Is there a standard lip-sync markup language? Is the lip-sync data embedded in the sound file, or is it imported separately under the same name as the sound clip? How was the lip-sync produced in the first place?
I'm satisfied with what I've got right now (my sound with some random lip movement), but I'm still curious about how this works.