It’s not easy to know how to manage and deploy AI systems responsibly today. But the U.S. Government Accountability Office has recently developed the federal government’s first framework to help assure accountability and responsible use of AI systems. It defines the basic conditions for accountability throughout the entire AI life cycle — from design and development to deployment and monitoring — and lays out specific questions for leaders and organizations to ask, and the audit procedures to use, when assessing AI systems.
When it comes to managing artificial intelligence, there is no shortage of principles and concepts aiming to support fair and responsible use. But organizations and their leaders are often left scratching their heads when facing hard questions about how to responsibly manage and deploy AI systems today.
That’s why, at the U.S. Government Accountability Office, we’ve recently developed the federal government’s first framework to help assure accountability and responsible use of AI systems. The framework defines the basic conditions for accountability throughout the entire AI life cycle — from design and development to deployment and monitoring. It also lays out specific questions to ask, and audit procedures to use, when assessing AI systems along the following four dimensions: 1) governance, 2) data, 3) performance, and 4) monitoring.
Our goal in doing this work has been to help organizations and leaders move from theories and principles to practices that can actually be used to manage and evaluate AI in the real world.
Understand the Entire AI Life Cycle
Too often, oversight questions are asked about an AI system after it’s built and already deployed. But that is not enough: Assessments of an AI or machine-learning system should occur at every point in its life cycle. This will help identify system-wide issues that can be missed during narrowly defined “point-in-time” assessments.
Building on work done by the Organisation for Economic Co-operation and Development (OECD) and others, we have noted that the important stages of an AI system’s life cycle include:
Design: articulating the system’s objectives and goals, including any underlying assumptions and general performance requirements.
Development: defining technical requirements, collecting and processing data, building the model, and validating the system.
Deployment: piloting, checking compatibility with other systems, ensuring regulatory compliance, and evaluating user experience.
Monitoring: continuously assessing the system’s outputs and impacts (both intended and unintended), refining the model, and making decisions to expand or retire the system.
This view of AI is similar to the life-cycle approach used in software development. As we have noted in separate work on agile development, organizations should establish appropriate life-cycle activities that integrate planning, design, building, and testing to continually measure progress, reduce risks, and respond to feedback from stakeholders.
Include the Full Community of Stakeholders
At all stages of the AI life cycle, it is important to bring together the right set of stakeholders. Some experts are needed to provide input on the technical performance of a system. These technical stakeholders might include data scientists, software developers, cybersecurity specialists, and engineers.
But the full community of stakeholders goes beyond the technical experts. Stakeholders who can speak to the societal impact of a particular AI system’s implementation are also needed. These additional stakeholders include policy and legal experts, subject-matter experts, users of the system, and, importantly, individuals impacted by the AI system.
All stakeholders play an essential role in ensuring that ethical, legal, economic, or social concerns related to the AI system are identified, assessed, and mitigated. Input from a wide range of stakeholders — both technical and non-technical — is a key step to help guard against unintended consequences or bias in an AI system.
Four Dimensions of AI Accountability
As organizations, leaders, and third-party assessors focus on accountability over the entire life cycle of AI systems, there are four dimensions to consider: governance, data, performance, and monitoring. Within each area, there are important actions to take and things to look for.
Assess governance structures. A healthy ecosystem for managing AI must include governance processes and structures. Appropriate governance of AI can help manage risk, demonstrate ethical values, and ensure compliance. Accountability for AI means looking for solid evidence of governance at the organizational level, including clear goals and objectives for the AI system; well-defined roles, responsibilities, and lines of authority; a multidisciplinary workforce capable of managing AI systems; a broad set of stakeholders; and risk-management processes. Additionally, it is vital to look for system-level governance elements, such as documented technical specifications of the particular AI system, compliance, and stakeholder access to system design and operation information.
Understand the data. Most of us know by now that data is the lifeblood of many AI and machine-learning systems. But the same data that gives AI systems their power can also be a vulnerability. It is important to have documentation of how data is being used in two different stages of the system: when it is being used to build the underlying model and while the AI system is in actual operation. Good AI oversight includes having documentation of the sources and origins of data used to develop the AI models. Technical issues around variable selection and use of altered data also need attention. The reliability and representativeness of the data need to be examined, including the potential for bias, inequity, or other societal concerns. Accountability also includes evaluating an AI system’s data security and privacy.
Define performance goals and metrics. After an AI system has been developed and deployed, it is important not to lose sight of the questions, “Why did we build this system in the first place?” and “How do we know it’s working?” Answering these important questions requires robust documentation of an AI system’s stated purpose along with definitions of performance metrics and the methods used to assess that performance. Management and those evaluating these systems must be able to ensure an AI application meets its intended goals. It is crucial that these performance assessments take place at the broad system level but also focus on the individual components that support and interact with the overall system.
Review monitoring plans. AI should not be considered a “set it and forget it” system. It is true that many of AI’s benefits stem from its automation of certain tasks, often at a scale and speed beyond human ability. At the same time, continuous performance monitoring by people is essential. This includes establishing an acceptable range of model drift and sustaining monitoring to ensure that the system produces the expected results. Long-term monitoring must also assess whether the operating environment has changed and to what extent conditions support scaling up or expanding the system to other operational settings. Other important questions to ask are whether the AI system is still needed to achieve the intended goals, and what metrics are needed to determine when to retire a given system.
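For technical teams, the idea of an acceptable range of model drift can be made concrete with a simple check. The Python sketch below is one hypothetical illustration and is not taken from the framework itself. It assumes drift is measured as a drop in accuracy relative to a baseline recorded at validation time; the names `DriftPolicy` and `check_drift` and the threshold values are invented for this example. Real systems may call for other metrics and thresholds.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class DriftPolicy:
    """Hypothetical policy: how far accuracy may fall before people review the system."""
    baseline_accuracy: float  # accuracy recorded when the system was validated
    max_drop: float           # largest drop that is still considered acceptable


def accuracy(predictions, labels):
    """Fraction of predictions that match the observed outcomes."""
    if len(predictions) != len(labels) or not labels:
        raise ValueError("predictions and labels must be non-empty and the same length")
    return mean(1.0 if p == y else 0.0 for p, y in zip(predictions, labels))


def check_drift(policy, predictions, labels):
    """Compare recent accuracy with the baseline and flag the system for human review."""
    current = accuracy(predictions, labels)
    drop = policy.baseline_accuracy - current
    return {
        "current_accuracy": current,
        "drop": drop,
        "needs_review": drop > policy.max_drop,
    }


# Example values are assumptions chosen only to show the check in use.
policy = DriftPolicy(baseline_accuracy=0.90, max_drop=0.05)
report = check_drift(policy, predictions=[1, 1, 0, 0], labels=[1, 1, 0, 1])
print(report["needs_review"])
```

A check like this only flags a system for attention. Deciding whether to retrain, restrict, or retire it remains a judgment for the people responsible for monitoring.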
Think Like an Auditor
We have anchored our framework in existing government auditing and internal-control standards. This enables its audit practices and questions to be used by existing accountability and oversight resources that organizations already have access to. The framework is also written in plain language so that non-technical users can apply its principles and practices when interacting with technical teams. While our work has focused on accountability for the government’s use of AI, the approach and framework are easily adaptable to other sectors.
The full framework outlines specific questions and audit procedures covering the four dimensions described above (governance, data, performance, and monitoring). Executives, risk managers, and audit professionals — virtually anyone working to drive accountability for an organization’s AI systems — can immediately put this framework to use, because it actually defines audit practices and provides concrete questions to ask when assessing AI systems.
When it comes to building accountability for AI, it never hurts to think like an auditor.