Project Background
Foundation models represent a significant advancement in artificial intelligence (AI) and are capable of performing a wide range of tasks, such as text synthesis, image manipulation, and audio generation. Examples such as OpenAI’s GPT-3 and GPT-4, which underpin tools like ChatGPT, and image generators like Midjourney have attracted considerable attention from policymakers and the media.
To address the risks and harness the benefits of foundation models, regulatory and governance approaches are emerging across the private, public, and civil-society sectors. These approaches prioritize the evaluation and testing of models through methods such as red-teaming, piloting, and auditing, although this work is currently led primarily by private-sector labs. These practices also raise ethical and legal concerns, for example around the implications of refining or training products like ChatGPT in public.
Given the increasing deployment of powerful AI technologies by tech companies, it is crucial that policymakers and regulators understand the evaluation methods available and their potential limitations.
Project Overview
This project seeks to provide policymakers with evidence to inform foundation model governance. It will explore where evaluation obligations should sit within the foundation model supply chain, examine the limitations of current evaluation practices, and analyze the potential consequences of those limitations.
Drawing on existing literature on auditing and evaluating foundation models, as well as interviews with AI system developers and designers, the project aims to clarify what evaluation means in the context of AI and foundation models.
The initial output will be an explainer setting out what evaluation and related concepts, such as audits, red-teaming, and impact assessments, mean in the context of AI and foundation models. A subsequent paper will summarize the current state of foundation model evaluations, assess their theoretical and practical limitations, outline the implications for policymakers, and recommend actionable steps based on evaluation outcomes.