A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety

From AcaWiki

Citation: Camille François, Ludovic Péran, Ayah Bdeir, Nouha Dziri, Will Hawkins, Yacine Jernite, Sayash Kapoor, Juliet Shen, Heidy Khlaaf, Kevin Klyman, Nik Marda, Marie Pellat, Deb Raji, Divya Siddarth, Aviya Skowron, Joseph Spisak, Madhulika Srikumar, Victor Storchan, Audrey Tang, Jen Weedon. A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety.
DOI (original publisher): 10.48550/ARXIV.2506.22183
Semantic Scholar (metadata): 10.48550/ARXIV.2506.22183
Sci-Hub (fulltext): 10.48550/ARXIV.2506.22183
Internet Archive Scholar (search for fulltext): A Different Approach to AI Safety: Proceedings from the Columbia Convening on Openness in Artificial Intelligence and AI Safety
Wikidata (metadata): Q135644843
Download: https://arxiv.org/abs/2506.22183

Summary

Reports the outcomes of the Columbia Convening on AI Openness and Safety (Nov 19, 2024) and its six-week preparatory program: (i) a community-informed research agenda on the intersection of openness and safety; (ii) a workflow-based map of post-training technical interventions and open-source tooling for deploying open foundation models safely; and (iii) a survey of the content-safety filter ecosystem with a development roadmap. It argues that openness (transparent weights, interoperable tooling, and public governance) can strengthen safety via independent scrutiny and decentralized mitigation, but it also highlights gaps: multimodal and multilingual benchmarks, defenses against prompt-injection and compositional attacks in agentic systems, and participatory mechanisms. The paper closes with five priority directions: participatory inputs; future-proof content filters; ecosystem-wide safety infrastructure; rigorous agentic safeguards; and expanded harm taxonomies.

Theoretical and Practical Relevance

Turns "openness helps safety" into an operational program: a post-training safety workflow that developers can map their practices against, tables of tools and interventions (with identified gaps) to target investment, and a consolidated content-filter landscape with trade-offs and a roadmap. It also points to ROOST, a new open, scalable safety-infrastructure effort, as a vehicle for building shared tooling, and notes the agenda's influence on the February 2025 French AI Action Summit. For practitioners and policymakers, this yields a concrete way to specify where to build and open tooling, how to choose and evaluate content filters, and where to prioritize research on agentic-system safeguards.