I don’t think humanity is ready for that kind of precision persuasion. I read a paper by ByteDance researchers. It’s called Linear Alignment, that’s the title of the paper. It’s basically a very interesting way to take what’s called alignment fine-tuning, which happens during the training phase and was usually expensive, so you do it once and serve many models, and change it so that the reward model for each user, well, ByteDance has some users, can be applied at inference time. So I film myself doing a TikTok video or whatever, and then of the 5 million viewers, each viewer has a reward model, and then it tunes towards persuasiveness. So maybe it’s just some subtle color balances, maybe it’s a different caption, maybe it’s a different intonation, maybe a different procedure, it doesn’t matter. And so if that program knows the reward model for each user’s preferences, they can use each user’s own phone to do this alignment, so that they don’t have to spend GPUs or anything. It just automatically plays in a way that is what we call a superstimulus, right?
