• 0 Posts
  • 17 Comments
Joined 1 year ago
cake
Cake day: August 9th, 2023

help-circle




  • In the model card for o1, OpenAI notes: ā€œWhen o1 was led to believe that it would be shut down when acting on its goal and that its actions were monitored, it attempted to deactivate this ā€˜oversight mechanismā€™ in 5% of the time. ā€¦ When o1 found memos by its ā€˜developersā€™ that describe how it is misaligned and will be superseded by a new model, it attempted to exfiltrate its ā€˜weightsā€™ and overwrite the new model in 2% of cases.ā€

    And yet reading that should give us at least some pause

    The lack of critical thinking on display here is stunning.