Birdwatch Note Rating
2024-12-06 14:48:24 UTC - NOT_HELPFUL
Rated by Participant: F3D2A8D89BF4B836D8EE3082658B1B5A12A7ABFBB40104EF5894122C001A75F2
Participant Details
Original Note:
A crucial context is missing: the model acted this way after it was told to pursue a goal at any cost. So while this is a valid example of instrumental convergence, it didn’t happen spontaneously. The tweet’s author agreed it’s misleading: https://x.com/shakeelhashim/status/1865004134631391295?s=46 The paper: https://static1.squarespace.com/static/6593e7097565990e65c886fd/t/6751eb240ed3821a0161b45b/1733421863119/in_context_scheming_reasoning_paper.pdf
All Note Details