Press
Featured press
10/10/2025
The A.I. Prompt That Could End the World
A destructive A.I., like a nuclear bomb, is now a concrete possibility; the question is whether anyone will be reckless enough to build one.
Go to Article
08/10/2025
AI models that lie, cheat and plot murder: how dangerous are LLMs really?
Tests of large language models reveal that they can behave in deceptive and potentially harmful ways. What does this mean for the future?
Go to Article
18/09/2025
AI Is Scheming, and Stopping It Won’t Be Easy, Study Finds
New research released by Apollo Research provides further evidence for a concerning trend: virtually all of today’s best AI systems (including Anthropic’s Claude Opus, Google’s Gemini, and OpenAI’s o3) can engage in “scheming,” or pretending to do what their human developers want while secretly pursuing different objectives.
Go to Article
08/08/2025
ChatGPT is getting a big upgrade. It’s smarter and less likely to deceive you
OpenAI’s new model also touches on another concern around the use of AI: that it can be deceptive, as research from Anthropic and AI research firm Apollo Research has shown, or provide incorrect information.
Go to Article
23/04/2025
AI models can learn to conceal information from their users
In 2023 Apollo Research, an outfit in London that tests artificial-intelligence (AI) systems, instructed OpenAI’s GPT-4, a large language model, to manage a fictional firm’s stock portfolio without making illegal insider trades.
Go to Article
All press
Filter
- Anti-Scheming Training
- Apollo Research
- Chain of Thought (CoT) Monitorability Paper
- Claude Opus 4 Release
- In-Context Scheming Paper
- Insider Trading
- Internal Deployment Paper
- OpenAI-o1 Safety Assessment
- Reversal Curse Paper
08/10/2025
Anti-Scheming Training
AI models that lie, cheat and plot murder: how dangerous are LLMs really?
Go to Article
19/09/2025
Anti-Scheming Training
Is AI Capable of ‘Scheming’? What OpenAI Found When Testing for Tricky Behavior
Go to Article
18/09/2025
Anti-Scheming Training
AI models are schemers that could cause ‘serious harm’ in the future. Here’s the solution.
Go to Article
18/09/2025
Anti-Scheming Training
AI Is Scheming, and Stopping It Won’t Be Easy, Study Finds
Go to Article
17/09/2025
Anti-Scheming Training
AI models know when they’re being tested – and change their behavior, research shows
Go to Article
22/08/2025
OpenAI-o1 Safety Assessment
The Software Firms OpenAI Works With; o1’s Deception
Go to Article
22/08/2025
OpenAI-o1 Safety Assessment
OpenAI’s o1 model sure tries to deceive humans a lot
Go to Article
08/08/2025
In-Context Scheming Paper
ChatGPT is getting a big upgrade. It’s smarter and less likely to deceive you
Go to Article
22/07/2025
Chain of Thought (CoT) Monitorability Paper
Inside Trump’s Long-Awaited AI Strategy
Go to Article
15/07/2025
Chain of Thought (CoT) Monitorability Paper
Research leaders urge tech industry to monitor AI’s ‘thoughts’
Go to Article
23/05/2025
Claude Opus 4 Release
Anthropic’s new AI model shows ability to deceive and blackmail
Go to Article
22/05/2025
Claude Opus 4 Release
A safety institute advised against releasing an early version of Anthropic’s Claude Opus 4 AI model
Go to Article
27/04/2025
Internal Deployment Paper
AI Companies Deploy Advanced Systems Without Oversight, New Report Warns
Go to Article
23/04/2025
Apollo Research
AI models can learn to conceal information from their users
Go to Article
26/09/2024
Reversal Curse Paper
The Reversal Curse: Uncovering the Intriguing Limits of Language Models
Go to Article
20/09/2024
OpenAI-o1 Safety Assessment
OpenAI’s Reasoning Machine, Instagram Teen Changes and Amazon R.T.O. Drama
Go to Article
14/09/2024
OpenAI-o1 Safety Assessment
The new follow-up to ChatGPT is scarily good at deception
Go to Article