Research

Chris Akin Chris Akin

A Causal Framework for AI Regulation and Auditing

This article outlines a framework for evaluating and auditing AI to provide assurance of responsible development and deployment, focusing on catastrophic risks. We argue that responsible AI development requires comprehensive auditing that is proportional to AI systems’ capabilities and available affordances. This framework offers recommendations toward that goal and may be useful in the design of AI auditing and governance regimes.

Read More
Chris Akin Chris Akin

Our research on strategic deception presented at the UK’s AI Safety Summit

We investigate whether, under different degrees of pressure, GPT-4 can take illegal actions like insider trading and then lie about its actions. We find this behavior occurs consistently, and the model even doubles down when explicitly asked about the insider trade. This demo shows how, in pursuit of being helpful to humans, AI might engage in strategies that we do not endorse. This is why we aim to develop evaluations that tell us when AI models become capable of deceiving their overseers.

Read More