Oh, I actually wasn’t thinking at that detailed a level. I think this is orthogonal to context length or other features of the models. I was making the point that low-resource languages do not have that much data, but as the base models, the pre-trained models, scale up, they seem to require less data to get good at…
