I’ve always pushed for the cybersecurity metaphor. That requires something like the emergency response team: the CERT, the SOC, the ISAC, the post-release product incident response team, the PSIRT, and so on, which I’m sure you’re already aware of. And this week I’m happy to hear that folks who previously didn’t do open red teaming are now seriously considering, or are even about to agree on, the value of open red teaming. I think this has, without naming names, a real possibility of flipping open source models from the maverick of the family into, well, a member of the family doing good things, because it democratizes community input into open red teaming. It allows people to work on not just safety, but also alignment in a democratic way.
