In the last few years, AI systems have grown more capable faster than almost anyone expected. Models trained on the open internet now write code, pass professional exams, and carry out multi-step tasks with little supervision. The trend line does not obviously stop.
That raises a set of questions that used to be academic and are now practical: how do we build systems this powerful and keep them aligned with human interests? How do we know what a model will do before we deploy it? Who gets to decide what it should do at all? What happens to institutions (jobs, science, democracy) if most cognitive work is done by machines?
"AI safety" is the loose label for work on these questions. It has a technical side (interpretability, alignment, evaluations, robustness) and a policy side (governance, coordination, oversight of frontier labs). It is not one field so much as several fields that happen to share a concern.
The aim of this group is to take that concern seriously: to read the primary sources, argue in good faith, and understand the strongest version of each position before agreeing or disagreeing with it.
Three readings to start with:

A detailed, scenario-based account of how the next few years of AI development might unfold, written by former OpenAI researchers. The most concrete picture of what's at stake.
The Anthropic CEO's case for what a world with powerful, well-aligned AI could actually look like. Readable, optimistic, and written by someone at the center of it.
MIRI's plain-language explanation of why building smarter-than-human AI is dangerous and why the problem is unsolved. Short, direct, and written by the researchers who have been working on this the longest.
If something obviously belongs here and isn't, email us at contact@ucsbaisafety.org.