Just to illustrate, this is from the CastingWords folks. CastingWords is an external vendor, and they have their own format. They can produce a VTT or an SRT file, which would then timecode the transcript against the YouTube video.