Right. That is a valid cluster. It can be very specific, so I retract that statement. I think it would be very interesting. I would probably make it a NeurIPS paper. I am not really a paper person, I am more interested in prototypes, but I am sure someone would want to say, “These are the metrics, let’s try a bunch of different models…” Compare and contrast: how are we able to evaluate them? I like that these metrics are also human. They are not just evaluations for a language model. Okay. This is a very good evaluation system. I think it would be worthwhile.
