Chief AI Scientist Josh Joseph and BKC RA Seán Boddy address the risks that misalignment and loss of control pose to increasingly complex LLM-based agents.
Their paper, available on arXiv, proposes making the agency of LLM agents itself the direct target of regulatory observation and intervention. "This [representation engineering] approach enables a variety of regulatory mechanisms consistent with the motivations underlying the 'Scientist A' paradigm [Bengio et al., 2025a]: mandated testing protocols for high-risk applications, domain-specific agency limits calibrated to risk levels, insurance frameworks that price premiums based on measurable agency characteristics, and hard ceilings on agency levels to prevent societal-scale risks."
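To make the idea of "measurable agency characteristics" concrete, here is a minimal sketch of one common representation-engineering recipe: a difference-of-means probe over model activations. Everything below is illustrative, not taken from the paper; the simulated activations, the `agency_score` helper, and the regulatory `ceiling` threshold are hypothetical stand-ins for whatever measurements and limits a real framework would define.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: hidden-state activations from some LLM layer,
# collected while the model handles high-agency vs. low-agency prompts.
# Here they are simulated; in practice they would come from model internals.
dim = 64
high_agency = rng.normal(0.5, 1.0, size=(100, dim))
low_agency = rng.normal(-0.5, 1.0, size=(100, dim))

# Difference-of-means probe: a standard representation-engineering way to
# find a direction in activation space that tracks a concept.
direction = high_agency.mean(axis=0) - low_agency.mean(axis=0)
direction /= np.linalg.norm(direction)

def agency_score(activation: np.ndarray) -> float:
    """Project an activation onto the 'agency' direction, yielding a
    scalar a regulator could monitor or cap."""
    return float(activation @ direction)

# A hard ceiling, as mentioned in the quote, reduces to a threshold check
# on that scalar (the value 2.0 is arbitrary for this sketch).
ceiling = 2.0
within_limit = agency_score(high_agency[0]) <= ceiling
```

Under this framing, "domain-specific agency limits" would amount to different `ceiling` values per deployment context, and an insurer could price risk off the distribution of observed scores.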