Move Algorithmic-Driven Pay and Scheduling Systems From Surveillance Pay to Fair Wages
Employers increasingly rely on scheduling, timekeeping, and payroll software to determine hours, eligibility, and pay. When monitoring data and optimization rules feed these systems (what this memo refers to as “algorithmic wage-setting”), the practice rarely appears as a standalone tool. It shows up as configured rules and thresholds, time edits, automatic deductions, and eligibility flags that can quietly change compensable time and earnings. A 2025 Equitable Growth brief describes this dynamic as “surveillance pay”: the use of granular monitoring data integrated into pay systems to set compensation and calculate wages in ways that can disconnect time from pay and make outcomes harder to predict, audit, and challenge for discrimination.
States are already moving to regulate surveillance/algorithmic wage-setting, but proposals focus on prohibition and basic notice rights. This memo complements those efforts by centering the enforcement reality: payroll and timekeeping are the system of record and the regulatory choke point. It pairs guardrails on non-job-related data use with an enforcement operating model (audit-ready decision trails, integration/egress mapping, standardized audits and complaints triage, and minimum operational standards) so agencies can prove violations, correct errors quickly, and prevent repeat harm using preexisting wage-and-hour, civil rights, consumer protection, and procurement authority.
Challenge and Opportunity
Core labor protections such as minimum wage, overtime, predictable scheduling, and anti-discrimination rules increasingly run through proprietary workplace systems that employers and vendors configure but that workers and regulators often cannot see or challenge. As these tools spread across white- and blue-collar industries, including healthcare, retail, logistics, food service, manufacturing, construction, and public services, they can normalize hidden wage loss, income volatility, and unequal treatment, especially when employers use surveillance-derived metrics to change pay tiers, incentives, benefits eligibility, or hours without clear notice or a workable way to challenge errors.
Why payroll and timekeeping are the focus.
In most workplaces, pay and schedules do not come from a single “algorithmic wage tool.” Instead, they come from connected systems that track hours, assign shifts, and apply workplace rules. Those systems then feed into HR and payroll systems, which serve as the official record for compensation.
This memo focuses on payroll and timekeeping/scheduling because that’s where data turns into earnings: wages, hours paid, premiums, bonuses, and benefits eligibility. It is also where states can most realistically require auditable records, set clear limits on what data can influence pay decisions, and enforce worker rights.
Worker data typically flows through a simple data chain:
- Capture: timekeeping, scheduling, attendance, and productivity/monitoring tools record events (clock-ins, breaks, shift changes, performance flags).
- Integrate: HR and payroll systems (and their vendors/subcontractors) pull those inputs together and link them to pay rules.
- Decide: configured rules, thresholds, or models trigger pay-affecting actions—time edits, automatic deductions, eligibility flags for premiums/bonuses, schedule adjustments, and pay calculations.
- Pay out: the results appear in payroll as wages, hours paid, premiums, bonuses, and take-home pay.
Because payroll and timekeeping are the official record, regulators cannot rely on the paycheck alone. Instead, regulators need to see and audit the system’s decision trail, which includes the data sources that were used, the rules or thresholds that were applied, what changed (e.g., edits, deductions, eligibility flags), and who made (or approved) any changes.
Added risk pathway: third-party intermediaries.
Worker data does not always stay inside a single employer system. In some instances, third parties such as verification services, analytics intermediaries, and sometimes data brokers/resellers collect and commercialize worker-related data and feed it back into workplace tools in the form of aggregated scores, flags, or “risk/reliability” signals that can affect scheduling, wages, and/or compensation.
Without clear limits and disclosure on this type of third-party data sourcing and onward sharing for decisions that affect pay and timekeeping (including brokered data and broker-derived scores), non-job-related data can also shape pay and scheduling indirectly while obscuring provenance (who supplied it), purpose (why it was used), and accountability (who is responsible).
From a regulatory standpoint, the risks typically concentrate in four areas:
- Implementation/configuration failures: rollouts, integrations, default settings, or rule changes that trigger underpayment or missing premiums.
- Improper inputs/uses in pay or timekeeping decisions: non-job-related personal data (including surveillance-derived metrics and brokered inferences) used to set or modify wages, hours, eligibility, or incentives.
- Secondary use and onward sharing (data governance risk): worker pay/HR data repurposed, shared, or sold beyond payroll/service delivery, potentially re-entering decision systems as scores, flags, or eligibility signals.
- Black-box accountability gaps: systems that prevent workers, unions, and regulators from seeing which inputs and rules produced pay outcomes.
Understanding these risk areas matters from a regulatory standpoint because the question becomes not only what the paycheck says, but also which rules and inputs influenced any changes in wage or compensation calculations and whether those inputs are legitimate and traceable.
The recommendations that follow do three things: (1) cut off high-risk data inputs, (2) require audit-ready decision trails, and (3) give workers enforceable rights to notice, explanation, and correction.
Why states should act now.
We already have evidence that algorithmic pay is common in some sectors of the labor market, and that payroll “modernization” rollouts can cause widespread pay errors when software becomes the system of record. Even if “surveillance wages” are not yet widespread beyond the gig economy, that is the point: states can act upstream, before these tools harden into default infrastructure. In parallel, states are also introducing surveillance-pricing prohibitions, signaling growing legislative appetite to regulate data-driven personalization and discrimination before it becomes entrenched.
Below are examples of the ways this trend is taking shape:
- Algorithmic pay in app-based work. The best-known example of algorithmic wage-setting is ride-hail and delivery platforms, where algorithms determine pay and can be difficult for workers to predict or contest, a classic “black box” accountability problem. Bloomberg Law notes this “baseline model” is now being exported to other industries through vendors marketing automated pay tools.
- Payroll systems failing at scale: Workday rollouts. After Seattle rolled out Workday, a third-party algorithmic payment system, workers filed a wage theft class action alleging underpayments and payroll problems. The case illustrates how implementation/configuration failures can occur at scale when software becomes the system of record. In another example, Oregon state workers reached a $15 million class action settlement over wage errors tied to the implementation of Workday Payroll. Both cases show why “software errors” can function as de facto wage theft when employers run pay through complex proprietary systems that workers cannot debug and employers struggle to correct.
- Timekeeping rules driving wage loss: time edits, rounding, and auto meal deductions. In a federal class action against Yale New Haven Health, workers alleged the employer rounded and edited time records and automatically deducted meal breaks even when breaks were missed, interrupted, or not fully taken. The case shows how pay harm can come from timekeeping configuration choices (and the lack of a transparent, contestable audit trail), even when no one labels it “algorithmic wage-setting.”
- Timekeeping systems failing at scale: the Kronos/UKG outage. The 2021 ransomware attack on Kronos/UKG’s timekeeping platform left employers without their normal system of record for weeks, triggering wage-and-hour claims that employees went unpaid or underpaid when employers reconstructed hours and overtime. For example, Cargill reached a $2.4M settlement tied to unpaid wages/overtime allegations stemming from the outage, and other employers (e.g., Frito-Lay; Honda) have faced similar outage-related wage claims and settlements, illustrating how dependence on a single timekeeping/payroll platform can create systemic pay risk when the software fails.
These examples show how payroll and timekeeping systems are often the choke point because they encode pay rules, execute pay-affecting actions (like time edits and eligibility flags), and generate, or withhold, the audit trail regulators need to verify compliance.
Harms this proposal targets (and what we know about scope)
This memo targets a specific set of harms that arise when employers route compensation decisions through timekeeping, scheduling, and payroll systems (often with third-party inputs).
These harms fall into five buckets:
- Hidden wage loss and underpayment.
Examples include time edits and reclassifications, automatic deductions (e.g., meal breaks), missing premiums/differentials, or misapplied overtime triggers that reduce pay without a clear explanation or easy correction path.
What we know: wage-and-hour complaints and litigation regularly surface these mechanisms, especially when payroll/timekeeping becomes the system of record.
- Income volatility and scheduling instability.
Automated scheduling and rule-based eligibility can drive unpredictable hours, unstable earnings, and difficulty budgeting, even more so when rules change inside proprietary systems.
What we know: volatility is well-documented in app and gig-based labor markets and is a growing concern as similar logic moves into traditional workplaces.
- Discrimination and disparate impact at scale.
Surveillance-derived metrics, proxy variables, and eligibility flags can embed unequal treatment in pay, hours allocation, or access to premiums/bonuses, especially when workers cannot see or contest the underlying rule or data input.
What we know: civil rights risk is structural when decisioning relies on opaque metrics and limited contestability; disability advocates flag heightened vulnerability due to higher fixed costs and budgeting constraints.
- Accountability failures (“black box” enforcement gaps).
When the system’s decision trail is unavailable, employers can’t explain pay outcomes, workers can’t self-advocate, and agencies can’t prove violations, turning basic labor protections into an after-the-fact guessing game.
What we know: this is a recurring barrier in investigations and disputes involving payroll/timekeeping platforms and integrated tools.
- Data governance harms (secondary use and third-party re-entry).
Worker pay/HR data may be repurposed, shared onward, or reintroduced via third-party scores/flags (e.g., verification, analytics intermediaries, brokers), shaping pay and scheduling indirectly while obscuring provenance and accountability.
What we know: third-party ecosystems exist and can influence eligibility/access decisions; the risk increases when data egress and sourcing aren’t disclosed.
Given these harms, this memo seeks to reduce wage loss, volatility, and discrimination by (1) limiting high-risk inputs and secondary use, (2) requiring audit-ready decision trails and integration/egress visibility, and (3) giving workers practical rights to notice, explanation, and correction.
Plan of Action
Recommendation 1. Establish a clear guardrail on compensation data use.
Adopt legislation to create the bright-line ban, scope, and remedies, then reinforce it through existing wage/civil rights/UDAP enforcement and procurement requirements for public employers and contractors.
States should adopt a bright-line rule that bars employers and vendors from using non-job-related personal data, including brokered data and broker-derived scores or classifications, to set or change wages, hours, bonuses, differentials, benefits, or pay eligibility. “Non-job-related personal data” means any data or inference not reasonably necessary and proportionate to determine hours worked, pay owed, or job-related compensation factors, which are limited to seniority, job classification, documented skills/credentials, objective shift attributes (e.g., nights/weekends/hazard pay), location-based cost adjustments, and transparent performance metrics tied to job duties (not biometrics, health inferences, parenthood status, home address, or off-duty behavior). This targets the core risk: opaque, individualized wage manipulation.
To prevent loopholes and misclassification incentives, the guardrail should:
- Cover workers broadly. Apply to employees and to workers treated as independent contractors when an algorithm or platform determines compensation, since contractors are often the most vulnerable to misclassification and irregular pay and the most exposed to variable, algorithm-set pay.
- Define prohibited practices clearly. Treat “surveillance wage-setting” as the use of surveillance-derived data or inferences to determine compensation at an individualized level.
- Create real accountability. Pair agency enforcement (e.g., labor agencies and state attorneys general) with meaningful penalties and a private right of action. For this, states can look to bills such as Colorado’s HB26-1210, New York’s S8872 (and Assembly companion A9641), or Maryland’s HB0148 as models.
- Center disability and civil rights protection. Require accessibility, nondiscrimination testing, and meaningful appeal/human review for any permitted automated compensation practices.
- Preserve legitimate pay practices. Allow transparent, non-personalized wage premiums (e.g., seniority steps, COLA, hazard pay, shift differentials) and job-related market adjustments that do not rely on individualized surveillance.
- Limit secondary use and commercialization of compensation data. Prohibit employers and vendors from selling, licensing, or otherwise disclosing worker compensation data (and pay-derived eligibility flags) to third parties for purposes unrelated to payroll and scheduling/service delivery, and prohibit use of compensation data to train generalized AI models unless the data is truly deidentified and the use is strictly necessary for the contracted service.
Recommendation 2. Make enforcement practical: require audit-ready records for algorithmic pay and scheduling systems.
Use rulemaking/guidance and enforcement to require decision-trail records and standardized audits, reinforced through procurement requirements for public employers and contractors, and use targeted legislation only if agencies lack clear authority to compel retention/production or to cover vendors directly.
This recommendation targets two recurring failure modes: (1) rollout/configuration errors (especially during integrations) and (2) black-box systems that prevent regulators from showing what the software did and why. Guardrails only work if agencies can access the decision trail behind pay outcomes.
Agencies already use payroll records/paystubs, time and attendance data, schedules, job classifications and rate tables, and worker complaints. But those records often show only the outcome, not the mechanism; they rarely reveal which rules, inputs, or system changes produced a pay result. To enforce wage and civil rights protections when software mediates pay and scheduling, agencies must also require retention and production of:
- Automated decision records (audit trails): logs of pay-affecting actions (time edits, auto-deductions, pay recalculations, eligibility changes for premiums/bonuses), including timestamps and the rule or data source that triggered each change.
- Rule/configuration history: the pay rules in effect over time, plus change history and approvals.
- Integration and egress map: (a) upstream systems feeding decisioning (timekeeping, scheduling, attendance, productivity, location/ratings); (b) any third-party sources supplying inputs (verification databases and any data brokers/resellers), including which fields/scores they provide and which pay/eligibility rules use them; and (c) whether worker pay/HR data is shared onward or sold, and for what purposes, including any analytics, benchmarking, or model-training uses unrelated to payroll/service delivery.
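The record types listed above can be made concrete with a small data structure. The following is a minimal sketch, not a prescribed schema: the class name `PayDecisionRecord`, its field names, and the JSON Lines export format are all illustrative assumptions, chosen only to show how a single pay-affecting action could carry its triggering rule, rule version, data source, and responsible owner.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json


@dataclass(frozen=True)
class PayDecisionRecord:
    """One pay-affecting action plus the rule and input behind it (hypothetical schema)."""
    worker_id: str
    action: str          # e.g. "auto_meal_deduction" or "time_edit"
    minutes_delta: int   # change to paid minutes; negative means pay was reduced
    rule_id: str         # configured rule or threshold that fired
    rule_version: str    # links the action to the rule/configuration history
    data_source: str     # upstream system or third party that supplied the input
    approved_by: str     # responsible human owner, or "system" if fully automated
    occurred_at: str     # ISO 8601 timestamp


def export_records(records):
    """Serialize records as JSON Lines (one record per line) for production on request."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


# Illustrative values only; none of these identifiers come from a real system.
record = PayDecisionRecord(
    worker_id="W-1001",
    action="auto_meal_deduction",
    minutes_delta=-30,
    rule_id="MEAL-30",
    rule_version="2024-03-01",
    data_source="timekeeping",
    approved_by="system",
    occurred_at=datetime(2024, 3, 4, 12, 0, tzinfo=timezone.utc).isoformat(),
)
print(export_records([record]))
```

Keeping the rule version and data source on every record is what would let an auditor join an individual deduction back to the configuration history and the integration map, rather than seeing only the resulting paycheck.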
These missing records are not “nice to have”; they are the minimum evidence needed to audit pay outcomes when software is the system of record. To close this enforcement gap, states should do two things at once: (1) require retention and production of decision-trail records, and (2) standardize how agencies request, analyze, and enforce them.
Actions states can take now include:
- Modernize payroll recordkeeping. Require employers (and covered vendors where appropriate) to retain and produce audit trails, rule/configuration history, and integration/egress maps as standard payroll records.
- Standardize an audit protocol (labor agencies and state attorneys general). Use a shared checklist and data request template to compare system outputs to hours worked/pay owed and identify repeat patterns (missing premiums, unexplained deductions, volatility, disparate impact). A small interagency working group should maintain templates, secure intake, and a vendor/system map.
  - Rapid supply-chain mapping: for each investigation, map (1) payroll/HRIS, timekeeping, scheduling, and monitoring systems; (2) each vendor/subcontractor processing worker data; (3) third-party sources supplying scores/flags; (4) which fields feed which pay/eligibility rules; and (5) any onward sharing/sale of worker data.
  - Audit templates should include both case-level review (individual decision trails) and pattern tests (aggregate metrics that reveal systematic underpayment, volatility, or disparities after rollouts or rule changes).
- Use procurement as leverage. For public employers and contractors, require auditability, data retention, worker notice, and cooperation with investigations as contract conditions. Contracts should also prohibit undisclosed sale/sharing of workforce and pay data and prohibit using worker pay/HR data for analytics, benchmarking, or model training unrelated to the contracted service, with audit rights and penalties for noncompliance.
- Set minimum standards for pay-affecting vendor practices (rule-setting and procurement). States do not need to regulate every feature of payroll and scheduling software to reduce harm. A practical approach is to set a small set of baseline, enforcement-ready standards through state attorney general and labor enforcement guidance, settlement terms, and public procurement that target the most common ways software drives wage loss and blocks accountability.
To make this last action more concrete, states can start with a brief list of “minimum operational standards” that directly target the most common ways payroll and timekeeping systems reduce pay and block accountability.
States can pursue four minimum operational standards:
- No silent time edits or auto-deductions. Prohibit break deductions, time reclassifications (e.g., “idle time”), or post-hoc hour changes without clear worker notice. Require an easy, timely process to review and correct records. (Example: an automatic meal-break deduction should not reduce paid time when a worker did not take a break, took a shorter break, or had the break interrupted, unless the worker can easily confirm or correct the record.) States can also require logs that show what changed, when, and why.
- Document and notify pay rules and eligibility changes. Require documentation and worker notice when systems change eligibility for premiums, bonuses, differentials, or overtime triggers and when pay rules/configurations change. States can also seek to prohibit earnings-impacting changes without a traceable record and a responsible human owner.
- Exportable audit logs (decision trails) as payroll records. Require tamper-resistant, exportable logs that capture pay-affecting actions, the rule that triggered them, and the data source used, and treat these logs as payroll records subject to retention and production requirements.
- Burden-shifting when records are missing. If required records are missing, place the risk on the employer and not the worker—using a rebuttable presumption in favor of the worker’s reasonable account of hours/pay, plus escalating penalties for recordkeeping failures (civil fines, enhanced damages for systemic violations, monitoring/injunctive relief, and procurement consequences).
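One way to read the “tamper-resistant, exportable logs” standard above is as a hash-chained log, in which each entry's hash covers both its own content and the previous entry's hash. The sketch below is an illustration under stated assumptions, not a description of how any payroll vendor stores logs: the function names `append_entry` and `verify_chain` and the entry fields are hypothetical, and the scheme detects edits to recorded entries but would not, on its own, detect deletion of the most recent entries unless the latest hash is also stored somewhere outside the employer's control.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash, entry):
    """Hash the previous entry's hash together with this entry's canonical JSON."""
    payload = json.dumps(entry, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, entry):
    """Append a pay-affecting action, chaining it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"entry": entry, "prev_hash": prev_hash, "hash": _digest(prev_hash, entry)})


def verify_chain(log):
    """Return True only if every entry matches its recorded hash and links to its predecessor."""
    prev_hash = GENESIS_HASH
    for item in log:
        if item["prev_hash"] != prev_hash or item["hash"] != _digest(prev_hash, item["entry"]):
            return False
        prev_hash = item["hash"]
    return True


# Illustrative entries; identifiers and values are made up.
log = []
append_entry(log, {"worker_id": "W-1001", "action": "time_edit", "minutes_delta": -15})
append_entry(log, {"worker_id": "W-1001", "action": "auto_meal_deduction", "minutes_delta": -30})
intact = verify_chain(log)

# Simulate a silent after-the-fact change to the first entry.
log[0]["entry"]["minutes_delta"] = 0
tampered_detected = not verify_chain(log)
```

Because the log is plain JSON-serializable data, it can be exported as-is, and an agency can rerun the verification independently rather than relying on the employer's or vendor's assurance.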
When to act. Agencies should open an investigation when complaints jump right after a new system rollout, when time edits or auto-deductions show up unusually often, when workers can’t get a plain-English explanation or timely correction, or when it looks like third-party/non-work data is affecting pay, hours, or eligibility. To do this consistently, agencies should use a simple, standardized intake and escalation process that logs the employer, the vendor/system (when known), and the issue type and flags patterns that should be reviewed by a designated triage team.
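The triggers described above (complaints or time edits rising after a rollout) can be turned into a simple screening test. This is a rough sketch under assumed inputs: the function names, the per-pay-period tuples of edit counts and headcounts, and the default ratio threshold of 2.0 are illustrative choices, not values drawn from any statute or agency practice. A flag from a test like this is a reason to look closer, not proof of a violation.

```python
def edits_per_100_workers(edit_count, worker_count):
    """Normalize time edits by headcount so sites of different sizes are comparable."""
    if worker_count <= 0:
        raise ValueError("worker_count must be positive")
    return 100.0 * edit_count / worker_count


def flag_rollout_spike(pre_periods, post_periods, ratio_threshold=2.0):
    """Compare mean edit rates before and after a rollout.

    pre_periods / post_periods: lists of (edit_count, worker_count), one per pay period.
    ratio_threshold is an assumed screening value, not a legal standard.
    """
    if not pre_periods or not post_periods:
        raise ValueError("need at least one pay period on each side of the rollout")
    pre = [edits_per_100_workers(e, w) for e, w in pre_periods]
    post = [edits_per_100_workers(e, w) for e, w in post_periods]
    pre_rate = sum(pre) / len(pre)
    post_rate = sum(post) / len(post)
    if pre_rate == 0:
        flagged = post_rate > 0
    else:
        flagged = post_rate >= ratio_threshold * pre_rate
    return {"pre_rate": pre_rate, "post_rate": post_rate, "flagged": flagged}


# Made-up example: 200 workers, two pay periods before and two after a rollout.
result = flag_rollout_spike(
    pre_periods=[(10, 200), (12, 200)],
    post_periods=[(30, 200), (34, 200)],
)
print(result)
```

In this made-up example the rate rises from 5.5 to 16.0 edits per one hundred workers, so the case would be routed to the triage team for a closer look at the underlying decision trails.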
Recommendation 3. Guarantee worker-facing transparency and contestability: a right to know, a right to an explanation, and a right to correct.
Use agency guidance/rules and procurement to require notice, explanations, and fast corrections where agencies already have authority; use legislation to create new worker rights (access, deadlines, anti-retaliation) where needed; and use enforcement to hold employers and vendors accountable when notices or records are missing, false, or misleading.
Enforcement alone often leaves workers waiting months for relief. States should therefore require worker-facing transparency for any automated system that sets pay or materially shapes earnings through time classification, scheduling, differentials, bonuses, or pay eligibility so workers can spot problems early, document patterns, and seek timely correction. Aggregated reporting can help identify systemic issues, but it does not replace a worker’s right to see and contest the records that determine their individual pay.
Privacy and data-broker rules (e.g., CCPA/CPRA-style disclosure and Delete Act-style broker mechanisms) provide useful templates for disclosure and access rights in the worker-pay context.
A worker rights package focused on this issue would include:
- Right to notice. Provide clear, plain-language notice when automated systems affect pay, time classification, hours allocation, or eligibility for premiums/bonuses. Notices should name the system(s), the categories of data used, the outcomes affected, and whether the employer/vendor relies on third-party sources (including brokers/resellers, verification databases, or analytics intermediaries).
- Right to an explanation. When pay, eligibility, or schedules change due to an automated rule, provide a readable explanation showing what changed; the rule/threshold applied; the input data sources; the effective date; the responsible human owner; and the identity of any third-party source that supplied an input, score, or eligibility flag.
- Right to inspect core records. Provide access to pay-related artifacts needed for self-advocacy, including time entries and edits (including auto-deductions); premium/differential eligibility status; incentive/bonus formulas that apply; and a history of pay-affecting adjustments within the pay period.
- Aggregated reporting for pattern detection (privacy-protective). Require aggregated reporting (by job role/site/shift/pay period, with privacy thresholds) so workers and unions can detect systemic patterns (frequent time edits, missing premiums, pay volatility, disparities). Examples include: % of shifts with auto-deductions; time edits per one hundred workers per pay period; premium eligibility changes over time by job class/site; weekly hours volatility pre/post rollout; and disparity checks such as premium loss rates (suppressing groups too small to report without identifying individuals).
- Right to timely correction and human review. Require fast, accessible processes to challenge errors or automated determinations affecting pay, with clear deadlines for correction/back pay and escalation paths when errors cause hardship.
- Collective bargaining compatibility. Structure these rights so unions can incorporate them into CBAs and side letters, including access to aggregated reports (time edits/auto-deductions, eligibility changes, error rates, volatility, disparate impact indicators) and bargaining triggers when vendors, integrations, or pay rules change.
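The aggregated reporting right above depends on a privacy threshold so that small groups are not exposed. The sketch below shows one way such a report could work for a single metric (% of shifts with auto-deductions, by site). The function name, the input fields, and the minimum group size of 10 workers are assumptions made for illustration; an actual threshold would be set by rule or agreement.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 10  # assumed privacy threshold (distinct workers per reported group)


def auto_deduction_report(shifts, min_group_size=MIN_GROUP_SIZE):
    """Return % of shifts with an auto-deduction per site, suppressing small groups.

    shifts: iterable of dicts with 'site', 'worker_id', and 'auto_deducted' (bool).
    Sites with fewer than min_group_size distinct workers are reported as None.
    """
    total_shifts = defaultdict(int)
    deducted_shifts = defaultdict(int)
    workers = defaultdict(set)
    for shift in shifts:
        site = shift["site"]
        total_shifts[site] += 1
        if shift["auto_deducted"]:
            deducted_shifts[site] += 1
        workers[site].add(shift["worker_id"])

    report = {}
    for site, total in total_shifts.items():
        if len(workers[site]) < min_group_size:
            report[site] = None  # suppressed: too few workers to report safely
        else:
            report[site] = round(100.0 * deducted_shifts[site] / total, 1)
    return report


# Made-up data: site A has 10 workers with 2 shifts each; site B has only 3 workers.
shifts = [
    {"site": "A", "worker_id": f"A{i}", "auto_deducted": shift_no == 0}
    for i in range(10)
    for shift_no in range(2)
] + [
    {"site": "B", "worker_id": f"B{i}", "auto_deducted": True}
    for i in range(3)
]
report = auto_deduction_report(shifts)
print(report)
```

Counting distinct workers rather than shifts for the threshold is a deliberate choice in this sketch: a site with many shifts worked by only a few people would otherwise pass the threshold while still revealing individual records.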
Worker-facing transparency also strengthens enforcement: it creates documentation, reduces information asymmetry, and helps agencies identify employers and vendors that warrant priority investigation.
Conclusion
Fair and trustworthy workplace technology starts with something workers understand: a paycheck they can trust and a schedule they can plan around. The evidence is clear: algorithmic pay-setting is established in app-based work, and payroll/timekeeping failures show how software can produce systemic wage harm at scale. States can act now using existing labor, civil rights, consumer protection, and procurement authority, strengthened by a prohibition on surveillance wage-setting, enforcement-ready decision trails, and worker rights to notice, explanation, and correction, so that “efficiency” doesn’t come at the expense of fairness, dignity, accessibility, or basic economic security.
New legislation is not always necessary to act, but targeted legislation is often the cleanest way to close emerging gaps. Policymakers can approach AI-mediated pay and scheduling in three lanes:
1. Enforce existing laws now. A large share of the harms described in this memo can already be investigated and remedied under preexisting wage-and-hour enforcement, recordkeeping requirements, civil rights/equal pay law, consumer protection (UDAP), and procurement authority.
2. Use rulemaking and guidance to modernize existing authority. Even where statutes are strong, enforcement can fail if agencies cannot access the documentation that explains how software produced pay outcomes. States can often use rulemaking, guidance, and standardized audit protocols to clarify that payroll records and compliance obligations include automated decision records (audit logs), pay-rule/configuration history, and basic documentation of upstream data sources/integrations when software is the system of record.
3. Use new legislation as a targeted backstop. Where current law does not clearly reach upstream practices, especially the use of surveillance-derived or non-job-related personal data to set or modify compensation, targeted legislation can establish bright-line prohibitions (e.g., banning surveillance wage-setting), extend coverage to contractor/platform arrangements where algorithms determine pay, and ensure vendor accountability, cooperation, and meaningful remedies. Examples include Colorado’s HB26-1210 or New York’s proposed prohibition on algorithmic wage-setting (S8872 and Assembly companion A09641), and bills that explicitly address surveillance-based wage setting or wage discrimination (e.g., Maryland HB0148; Minnesota HF4131).
Policymakers should also expect to see broader bills that create baseline rights and duties for automated tools across a wider range of employment decisions (not only wages and scheduling, but also hiring, promotion, discipline, and termination). In that context, the guardrails in this memo, especially a prohibition on surveillance wage-setting, can be adopted as a compensation-focused module within a broader worker-tech protections package.
Colorado. Colorado’s HB26-1210, Prohibit Surveillance Price & Wage Setting, would prohibit individualized wage setting (and individualized pricing) when a “price or wage setting algorithm” uses surveillance data and the algorithm’s output is a substantial factor in determining the wage offered to a worker. The bill also takes an enforcement-ready approach: it treats violations as a deceptive trade practice under the Colorado Consumer Protection Act, authorizes the Attorney General to adopt rules, and requires entities using these systems to publish procedures that promote data accuracy and allow workers to request information about the data used to set wages and to correct or challenge that data.
New York. New York lawmakers are considering a direct prohibition on algorithmic wage-setting (S8872), including penalties and a private right of action. New York also has proposals in the broader worker-tech rights direction, such as measures focused on disclosure and inventories of automated employment decision-making tools in the public sector and related employment contexts. This illustrates a practical model: enforce now under existing wage, recordkeeping, and civil rights authority; use rulemaking to make records and audits enforcement-ready; and codify new guardrails where emerging tech creates gaps.
How State Leaders Can Put People First in AI Decision-Making
How State Leaders Can Put People First in AI Decision-Making is a framework to ask and answer the right foundational questions about artificial intelligence (AI) from the beginning. The public wants the government to take action to ensure the power of AI technology is used for good. In the current political climate, the work of state leaders is critical. The recommendations in this memo are focused on helping state leaders across the country ground decision-making about AI use in fairness, accountability, evidence-based inquiry, and inclusive governance so that AI can work for people.
Many state agencies have already deployed or are considering using AI in consequential decisions related to healthcare, housing, education, policing, finance, and other highly sensitive areas. While a few states have taken steps to implement decision-making mechanisms for certain AI systems, too many leaders are simply accepting narratives about AI’s purported public benefit at face value – jumping to the “how” of AI implementation before thoroughly vetting potential systems and deciding whether they are appropriate to use at all.
State officials may be eager, and even feel pressure, to tap into the potential benefits of AI in the hopes of better serving their constituents. But the personal, political, and operational risks of AI use should not be underestimated. People across the political spectrum are deeply concerned about the impact of AI on their lives and these concerns are well-founded. There have already been numerous examples where the failure to center people in AI decision-making and use has resulted in government systems that range from inefficient and wasteful to disruptive and downright dangerous, causing significant harm to, and backlash from, community members.
For AI’s potential benefits to be realized, state leaders need to implement consistent, inclusive people-first AI decision-making structures. Crucially, this process should ask the foundational question of whether to use AI in the first place. This policy memo provides timely guidance on:
- Why policymakers should adopt a people-centered AI decision-making process;
- What consistent process to follow as the foundation of any AI decision-making policy;
- How to operationalize this process through a flexible set of options designed to meet specific needs, structures, and opportunities in different states.
Rather than offering a one-size-fits-all approach, this memo provides a suite of mechanisms for engaging thoughtful AI decision-making with examples of how different state governments have tackled emerging AI issues. We give recommendations for how state leaders can implement the AI decision-making process for whichever path they choose, including methods to promote accountability so that the decision-making process is followed and can truly work to put people first.
Challenge and Opportunity
The use of AI by state agencies is growing. By 2024, 59% of state and local government employees reported that their agency had already made an AI application available for use and a majority of public sector employees reported using AI applications either several times a week or daily.
Generative AI (GenAI) systems and agentic AI systems are now joining machine learning and automated decision-making systems (ADS) that have been in use for many years – with the lines between the types of systems blurring as AI products become increasingly integrated.
AI is also being applied in many high-stakes situations where mistakes or bias can have life-altering ramifications. AI systems now make decisions that can affect the lives of tens of millions of low-income people in the United States, from determination of SNAP benefits, to Medicaid enrollment, to Social Security disability payments. Sixty percent of people in the United States live in a jurisdiction that employs some sort of pretrial risk assessment tool that uses AI. According to one AI surveillance vendor, thousands of police departments in the United States are using face surveillance.
While many policymakers may be enticed by the promise of AI, people across the country and political spectrum have deep concerns. As of 2025, only 17% of the general public believes AI will positively impact the United States. Americans broadly oppose AI being used in high-stakes decision-making, like health insurance, loan applications, and job screening. A 2025 poll of U.S. voters found that 82% said they do not trust technology leaders to tackle regulation independently. A supermajority – 69% – of the U.S. public does not think the government is doing enough to regulate AI.
How does the public feel about AI?
More than 50% of people in the U.S., and 65% of low-income people, fear being left behind by AI. Only 4 in 10 people ages 18-34 in the U.S. say that they “trust” AI and only 23% of people in the U.S. over age 55 trust AI systems. As AI advances, public anxiety grows. Polling reveals that 77% of people in the U.S. want companies to “take AI creation slowly to get it right the first time.”
Public concerns with AI are well-founded. Former high-profile staffers at several AI companies have warned that companies are moving too fast and minimizing AI’s deficiencies, with new AI systems “generating more errors, not fewer.” While the technology industry is pushing the pedal on AI, the public would like to hit the brakes and for leaders to “do something before it goes too far.”
In the rush to adopt AI, some government officials have been making mistakes. The most impacted communities, including low-income communities and communities of color, often end up excluded from public deliberation about government use of technology. There are already numerous examples of how these same communities bear the brunt when there is a lack of people-centered AI decision-making:
- The state of Michigan abandoned a $47-million automated unemployment fraud detection system after it was found to have a false positive rate of 93%. Michigan had to pay $20 million to settle a civil rights class action lawsuit and spend an additional $78 million to install a new system.
- Automated systems caused delays in obtaining critical unemployment benefits for people in several states, including in California where a fraud detection algorithm system improperly flagged 600,000 people as having potentially fraudulent characteristics.
- Discrimination related to the use of AI algorithms has been documented across numerous jurisdictions, including a family screening system used in Pennsylvania that gave more racially disparate recommendations than child welfare workers.
- AI errors are also creating dangerous interactions between police and community members. Police have wrongfully arrested people, many of them Black community members, because of AI-powered face surveillance. Maryland police also swarmed out of eight squad cars, surrounded a 16-year-old, and held him at gunpoint after his school’s AI security system mistakenly identified his bag of potato chips as a gun.
There are high costs for improper AI use – for the people whose lives are impacted, in the state dollars that are invested, and in how these actions can further undermine trust in government.
At their best, AI systems can help improve government functions. They have the potential to be used to triage community feedback, provide translation services that make government more accessible, facilitate emergency preparedness, or aid scientific research, among other uses. For example, Maryland’s Department of Labor is partnering with academic researchers to help test how AI can train staff and assist caseworkers with compliance regulations and other complex paperwork.
People want government leaders to take action to ensure AI technology is used for the public good. As the current administration has undermined safeguards at the federal level and issued executive orders attempting to stifle state action on AI, the continuing work of state leaders to safeguard rights and center people in AI decision-making has become even more critical.
A few states have already taken some steps to implement process mechanisms for AI decision-making and potential use. These include: Connecticut’s Act Concerning Artificial Intelligence, Automated Decision-Making and Personal Data Privacy and AI Responsible Use Framework; California’s State Guidelines for Evaluating Impacts of Generative AI on Vulnerable and Marginalized Communities; Maryland’s Responsible AI Policy; New York State’s 2024 LOADinG Act; and Texas’ Responsible Artificial Intelligence Governance Act.
While these steps are an important start, more needs to be done given what is at stake with AI use and its potential impact on people’s rights, livelihoods, and personal safety. For the potential benefits of AI to actually be realized for community members, strong state leadership in this moment is needed to pierce through the hype. This memo lays out a plan of action for state leaders to implement consistent, inclusive people-first AI decision-making structures that do not skip over the foundational questions of why and whether to use AI in the first place.
Plan of Action
State leaders should establish a people-centered decision-making process that consistently and thoughtfully considers why and whether to use AI before jumping to use policies or other safeguards. This process should be followed whenever a state is considering the acquisition or use of an artificial intelligence system, whether through formal procurement, partnerships, in-kind donations, or other means. This decision-making process should be utilized when considering any AI system that has the potential to impact people’s rights, opportunity, well-being, safety, and security.
In the following section we provide:
- The four key steps a people-first AI decision-making process should follow.
- Examples of AI uses that should be prohibited.
- Recommendations for how to operationalize the decision-making process via executive or legislative action, depending on the needs and structure of a given state government.
- Methods to ensure that the people-centered decision-making process is followed and enforceable.
The Four Key Steps for People-First AI Decision-Making
Step 1. Articulate a specific and inclusive “why” for AI use that centers the interests and voices of diverse community members to identify problems and needs.
State leaders should ensure that the first step in decision-making about any existing or potential use of an AI system is for an agency to articulate a specific and inclusive “why” that centers the interests and voices of a wide range of community members. Particular attention should be paid to historically marginalized communities. This community engagement should happen before the procurement or use of any AI system.
Key considerations for centering diverse community members include, but are not limited to:
Inclusivity and representation: Use multiple strategies to support participation from diverse stakeholders, including funding and support for state agency outreach. Develop potential partnerships with trusted local organizations such as community groups, faith-based organizations, schools, and neighborhood associations who can help spread the word, organize meetings, and share information and surveys with diverse community members.
Accessibility: Make it possible for diverse community members to be actively engaged through a combination of in-person and remote engagement mechanisms. Also provide asynchronous paper and online surveys distributed in multiple languages in easy-to-understand formats. Information about any proposed AI systems should describe how a system would work and what it would do in ways that the general public can understand. Schedule any in-person meetings in places and times when diverse community members will be able to attend and provide necessary support for participation, like childcare and transportation. Remote meetings should also be scheduled at a time in the day when working people and people with families can attend.
Power sharing: Centering diverse voices means meaningful collaboration, not token consultation. Community members should have genuine influence in determining which issues facing them are most important and how those issues should be addressed. State leaders should also listen to community members about any non-AI solutions that they would prefer and why.
Transparency and Accountability: Be clear about the engagement process and ensure it allows for serial feedback. Make sure materials are publicly published and easily accessible on a government website in a timely manner to allow public engagement with the process. Articulate how community input will be incorporated and have a mechanism to report back to the community on how their input influenced the ultimate decision.
California took important steps to promote effective community consultation when it issued the State Guidelines for Evaluating Impacts of Generative AI on Vulnerable and Marginalized Communities. Authored by the state Government Operations Agency, Office of Data and Innovation, and California Department of Technology, the guidelines recognize the need for a systematic approach that leads with meaningful engagement with diverse communities and how critical it is to specifically consider potential impacts on vulnerable and marginalized communities. Appendix B of California’s guidelines provides some additional helpful guidance on key principles, structures, activities, and focus questions for community consultation.
Step 2. Conduct an AI Impact Assessment that evaluates public benefits and risks, including how the AI system would use people’s information, its impact on rights, and risks of discrimination and bias.
Technology vendors often tout the benefits and downplay costs and risks. It is crucial that amidst the hype state leaders create the structures and processes to support evidence-based decisions about a potential system’s public benefit and risks and avoid AI “snake oil” that wastes state resources and does more harm than good.
State leaders should ensure that there is an AI Impact Assessment (AIIA) to evaluate and explain how the proposed AI system will work, the evidence for its effectiveness and potential public benefit, and its potential for harm (for implementation advice, see the section below, “Mechanisms to Operationalize People-Centered AI Decision-Making”). The process should include a public comment period for engagement with the AIIA so people can bring up additional information and concerns. Leaders should also ensure that any company they potentially contract with provides them with the necessary information to conduct an AIIA. Don’t let vendor claims, including claims about potential trade secrets, prevent meaningful review of the vendor’s products and services.
An AI Impact Assessment (AIIA) should include:
- The intended use cases of the AI system, including its potential public benefits.
- The people and groups most likely to be impacted by the AI use.
- Information about:
  - How the AI system was built and trained, including the information used.
  - The kind of information that the AI system would collect, use, and output.
  - How the public would be informed when they are interacting with the AI system.
- Evidence demonstrating the effectiveness of the proposed AI system at directly addressing and solving the stated purpose and achieving the desired public benefit.
- Report with specificity on the potential public harms of the AI system related to:
  - Civil liberties and civil rights, including privacy and surveillance, free speech, due process, human autonomy, and bias and discrimination.
  - Equitable access to education, health, housing, employment, and other public services and benefits.
  - Energy consumption, carbon emissions, water usage, electronic waste, and other similar environmental impacts.
- Financial impacts, including initial purchase, personnel and other ongoing costs, and any current or potential sources of funding.
- What alternative interventions have been considered to address the stated purpose for the AI system.
- How the AI system would be monitored and regularly evaluated as to public benefit and harms, including the feasibility of meaningful human oversight.
- The public comment period and how people can engage with the content of the AIIA.
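For agencies that track assessments in software, the checklist above could be captured as a structured record so that blank sections are flagged before the public comment period opens. The following is a minimal sketch, not a prescribed format: the class name `AIImpactAssessment`, its field names, and the `missing_sections` helper are all hypothetical and simply mirror the list items above.

```python
from dataclasses import dataclass, fields


@dataclass
class AIImpactAssessment:
    """Hypothetical record mirroring the AIIA checklist; field names are illustrative."""
    intended_use_cases: str = ""
    impacted_groups: str = ""
    system_information: str = ""       # how built and trained, data handled, public notice
    effectiveness_evidence: str = ""
    potential_harms: str = ""          # rights, equitable access, environmental impacts
    financial_impacts: str = ""
    alternatives_considered: str = ""
    monitoring_plan: str = ""
    public_comment_process: str = ""


def missing_sections(aiia: AIImpactAssessment) -> list[str]:
    """Return the names of checklist sections that are still blank."""
    return [f.name for f in fields(aiia) if not getattr(aiia, f.name).strip()]


# Example draft with only two sections filled in (contents are made up).
draft = AIImpactAssessment(
    intended_use_cases="Translate benefit notices into additional languages",
    impacted_groups="Residents with limited English proficiency",
)
print(missing_sections(draft))
```

A check like this only confirms that each section has some content; judging whether the content is adequate remains a matter for reviewers and the public comment process.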
Step 3. Use a decision-making standard that is based on diverse community considerations and an evidence-based inquiry that the public benefit justifies the proposed use and substantially outweighs the potential harms.
Decisions about why and how to deploy AI should be driven by the real needs and interests of impacted communities. Using the AI Impact Assessment and the input and preferences of potentially impacted communities, the agency or department should apply a public benefit standard, assessing whether such a purpose for the AI has been demonstrated and whether the evidence-based benefits of the particular use of AI substantially outweigh the potential harms.
This decision-making standard should give strong weight to the opinions of those who will be impacted by the technology, especially historically marginalized communities. Steps to accomplish this include:
- Reviewing comments, survey responses, and meeting notes from the community engagement.
- Summarizing community support and concerns in a report.
- Explaining how community input shaped the decision.
Decisions should clearly articulate what quantitative and qualitative evidence was relied on for the decision. These considerations should be memorialized in a publicly accessible document.
Step 4. Conduct timely, ongoing evaluation of AI systems to determine whether they should continue to be used.
If a state entity moves forward with use of a particular AI system, state leaders should require timely review that centers impacted communities in the qualitative and quantitative evaluation of whether the system is achieving the intended public benefit. This review should also identify any harms arising from the AI use. If public benefits of the particular use of AI do not continue to substantially outweigh the harms, the AI use should end.
The review and evaluation processes should ensure:
- Ongoing feedback loops with people using the AI system, including channels for community members to reach out about the benefits and problems related to the AI system.
- Proactive efforts to gather evidence of AI efficacy in meeting its original stated public benefit goals.
- Proactive efforts to assess whether the AI system has impacted people’s rights or otherwise affected equal opportunities to government services.
- An initial evaluation planned for no later than two years after initial use, with regular evaluations thereafter.
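The timing in the last item can be turned into concrete calendar deadlines. The sketch below is illustrative only: the memo sets the two-year outer limit for the first evaluation but does not define "regularly," so the one-year interval for later reviews is an assumption, and the function names `add_years` and `review_deadlines` are hypothetical.

```python
from datetime import date


def add_years(d: date, years: int) -> date:
    """Shift a date by whole years, mapping Feb 29 to Feb 28 when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def review_deadlines(first_use: date, count: int, interval_years: int = 1) -> list[date]:
    """Latest dates for the first `count` reviews.

    The first review is due within two years of first use (from the memo);
    the spacing of later reviews is an assumed parameter.
    """
    if count <= 0:
        return []
    deadlines = [add_years(first_use, 2)]
    while len(deadlines) < count:
        deadlines.append(add_years(deadlines[-1], interval_years))
    return deadlines


for deadline in review_deadlines(date(2025, 3, 1), count=3):
    print(deadline.isoformat())
```

These dates are outer limits; an agency could schedule reviews earlier, particularly for higher-risk systems.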
Recommendation 1. Some uses of AI are simply too dangerous. Get ahead by taking them off the table.
Putting people first in AI also means proactively prohibiting uses of AI systems and applications that are simply incompatible with democratic, civil, and human rights. Numerous evaluations from government leaders, academics, technologists, civil rights organizations, and groups representing vulnerable and marginalized communities have found that the threats stemming from the below applications of AI significantly outweigh the benefits. Your AI decision-making process should preclude the following:
- Social scoring systems
- Emotion recognition systems
- Facial surveillance or other biometric surveillance
- AI use in predictive policing or family policing
- AI control over a person’s life and liberty, including in criminal justice and immigration
Many prudent city and state government officials have already preemptively taken some dangerous AI uses off the table. Maryland’s AI policy prohibits AI that violates fundamental rights, such as social scoring and emotion recognition. Montana’s AI law bans using AI for cognitive behavioral manipulation and sets hard limits on dragnet mass surveillance. And many cities have prohibited government use of face surveillance.
Recommendation 2. Mechanisms to Operationalize People-Centered AI Decision-Making
How to best implement the AI decision-making framework depends on the particular needs, opportunities, and structure of each state government. States that have taken steps to create a consistent process for AI evaluation and adoption have done so through different legal and legislative mechanisms. Which option to pursue – executive action, legislation, agency guidelines, or a combination of the three – is a decision that should be made by those most familiar with the contours of their particular state.
Executive Action – A Governor can issue an executive order requiring all executive agencies to follow a people-centered AI decision-making process. This executive order can identify an agency, or a subset of existing agencies, to develop the process itself and coordinate among different department leaders and staff to provide expertise and oversight that ensures compliance. If relying on an existing agency or state department, state leaders may find that one already focused on technology, information services, operations, or administrative services is best suited to this role. Alternatively, an executive order can create a new entity to provide support.
- California’s Executive Order on AI provided for the Government Operations Agency, the California Department of General Services, the California Department of Technology, and the California Cybersecurity Integration Center to develop guidelines for government procurement and use of AI systems, including guidelines that agencies should follow for evaluating equity concerns of generative AI and its impact on marginalized communities.
- Maryland’s Executive Order created an AI subcabinet charged with facilitating statewide coordination on the responsible, ethical, and productive use of AI. Its work included recommending approaches and state policies.
Legislation – State lawmakers can enact legislation to require state entities to create and follow an AI decision-making process, either through direct statutory language or by tasking a state agency to develop policy and implementation guidelines.
- The New York state LOADinG Act required legal authorization for the use of automated decision-making systems by state agencies and required the publication of an impact assessment for any authorized AI use.
- Connecticut’s Public Act No. 23-16 directed the state’s Office of Policy and Management to develop Connecticut’s AI Responsible Use Framework. It also instructed the Department of Administrative Services to inventory all AI systems.
- Maryland enacted the Artificial Intelligence Governance Act of 2024 and required each executive agency to appoint an AI lead and for the State Department of Information Technology (DoIT) to work with the new AI subcabinet. The DoIT then issued the State of Maryland’s Responsible AI Policy and implementation guidance.
Recommendation 3. Provide Support Structures for State Agencies
State leaders should ensure that there are structures to support state agencies to operationalize the people-centered decision-making process, including conducting diverse community outreach, evidence-based AI Impact Assessment, and quantitative and qualitative evaluation.
This support can come from a variety of sources. State leaders can fund existing staff or agencies to serve as point people, create a diverse AI board, partner with academic institutions to provide expertise, or combine these strategies.
- In California, the Governor designated multiple state agencies to work on AI implementation. The state government also partnered with academic institutions including UC Berkeley and Stanford to convene an AI Summit bringing together stakeholders from labor, business, nonprofit, academia, and technology companies. This summit convened a series of workshops with grassroots organizations and diverse community members to gather input on AI use and potential policies for assessing impact on vulnerable communities. In December 2025, the Governor also launched the California Innovation Council, an initiative to tap “California’s best and brightest to advance responsible AI.”
- In Maryland, the Governor established an AI subcabinet by executive order and state leaders enacted legislation for the state Department of Information Technology (DoIT) to work with the new AI subcabinet. The state issued a Responsible AI Policy that includes roles and responsibilities for the DoIT, the AI subcabinet, and each executive agency. The DoIT implementation guidance also helps state agencies understand, operationalize, and follow the AI Policy. Each agency has a designated AI Lead and the DoIT has an AI team that holds weekly office hours and answers questions that agencies may have about using AI and following the state’s AI policy.
Recommendation 4. Ensure the Process is Followed Through with Transparency, Accountability, and Oversight
It is also essential for state leaders to make sure the decision-making process does not just work on paper, but truly translates into people-centered transparency, accountability, and oversight of AI systems.
Any legislation, executive order, or agency guidelines should provide for public and private enforcement mechanisms so people can take action if rules are not followed. State leaders should also require a public inventory, updated at least annually, of all AI systems so the public knows what is in use. As discussed earlier, all assessment materials need to be publicly published in a timely manner during the process.
After the decision-making process is completed, state leaders should ensure that any agency that moves forward with an AI system is required to establish a robust use policy that will help protect people from abuse, misuse, and mistakes, with ongoing evaluation of the benefits and harms of the AI system. Developing a robust use policy is outside the scope of this memo, but please see the FAQ section for some resources.
Conclusion
State leaders can make AI work for people.
The future of government use of AI is still being written, and state governments have a powerful role to play. What we do now will help determine whether the power of AI will work for or against people’s rights and dignity.
If AI is to serve rights, justice, and democracy, leaders at the state level must act to implement a people-first process that centers diverse community members and asks and answers foundational questions about “why” and “whether” to use AI before skipping to the “how” of AI implementation. The recommendations in this memo help state leaders meet this moment and ground decision-making about AI use in fairness, accountability, evidence-based inquiry, and inclusive governance.
The views and opinions expressed herein are solely those of the author and do not necessarily reflect the views, positions, or policies of any organization, employer, board, institution, client, or other entity with which the author is affiliated.
- Connecticut’s 2023 Act Concerning Artificial Intelligence, Automated Decision-Making and Personal Data Privacy required each state agency to inventory all uses of AI systems and mandated a process for evaluation. The state developed an AI Responsible Use Framework that requires each agency to conduct an AI impact assessment before implementing an AI system. It also created an Advisory Board that evaluates agency adoption of AI systems.
- California issued State Guidelines for Evaluating Impacts of Generative AI on Vulnerable and Marginalized Communities in December 2024 and directs state agencies to use these guidelines early in the AI consideration process, when assessing readiness and prior to initiating a procurement process. The guidelines provide an equity evaluation checklist where state agencies identify the communities potentially impacted by the AI system, conduct community outreach, and identify the potential forms of bias, mechanisms of oversight, and a process for transparency. These guidelines currently only apply to Generative AI systems, not all AI systems, and many of the provisions are recommendations, not requirements. On March 30, 2026, California Governor Newsom issued Executive Order N-5-26 that provides stipulations for AI procurement and contracting to prevent discrimination and harm to civil rights, among other issues.
- Maryland issued a Responsible AI Policy in 2025 that creates a governance framework for all AI systems, which includes an intake process, impact assessment, and other processes. It also prohibits real time biometric surveillance, social scoring, emotion analysis, fully automated decision-making procedures, and behavioral manipulation.
- New York State’s 2024 LOADinG Act requires that all existing AI systems be disclosed and prohibits the future or ongoing use of any AI system that has not been evaluated using an impact assessment and found to be safe and free from discrimination.
- Colorado’s Consumer Protections for Artificial Intelligence took effect on February 1, 2026, and requires both developers and deployers of artificial intelligence to disclose and preempt potentially dangerous use of the system in question through a variety of stipulations, including the completion of impact assessments.
- The Texas Responsible Artificial Intelligence Governance Act limits dangerous AI practices like social scoring, behavioral manipulation, discrimination, and biometric identification.
There have already been marked gaps in how “high risk” is interpreted. California enacted a law mandating annual inventory reports on all high-risk automated decision systems in use by the state. The report that the California Department of Technology issued identified no high-risk systems in use, despite publicly available examples of potentially worrisome ADS employed by different California agencies.
- AI Now – Algorithmic Impact Assessments Report: A Practical Framework for Public Agency Accountability
- Carnegie Endowment for International Peace – How Cities Use the Power of Public Procurement for Responsible AI
- Center for Democracy and Technology – AI Governance Checklist for Elected Officials: Advancing Responsible AI Adoption and Use in the Public Sector
- Data & Society – Democratizing AI: Principles for Meaningful Public Participation and Driving Change in Public Sector Technology through Community Input
- GovAI Coalition – Policy Templates and Knowledge-Sharing Tools
- Federal Office of Management and Budget – Advancing Governance, Innovation, and Risk Management for Agency Use of Artificial Intelligence
- Local Progress Impact Lab and AI Now – Local Leadership in the Era of Artificial Intelligence and the Tech Oligarchy
Empowering Communities through Community Benefit Agreements in AI-Fueled Data Center Development
The United States is experiencing an unprecedented surge in data center construction driven by AI infrastructure demand. Over 5,000 facilities are operating today, with investments of $400 billion in 2025 and an estimated $1.8 trillion between 2024 and 2030. This capital is arriving faster than environmental review processes, utility planning cycles, and community engagement frameworks were designed to accommodate. The consequences for communities are serious and well-documented: rising electricity bills, massive water consumption, e-waste, noise and light pollution, and billions in tax subsidies to some of the world’s most profitable corporations, often without meaningful public disclosure. These harms do not fall evenly: communities of color and low-income neighborhoods already carry disproportionate burdens.
Community Benefit Agreements (CBAs) are a legally binding, enforceable tool that allows communities to secure real commitments from data center developers before development proceeds. When properly structured — with specific numeric targets, secured financial obligations, independent monitoring, and meaningful enforcement — CBAs transform data center deals into durable community partnerships. Drawing on practitioner expertise from dozens of negotiations across sectors, emerging AI data center agreements, and new research on community harm and regulatory gaps, this memo makes the case for CBAs and provides a practical policy playbook for using them effectively, including potential provisions and considerations like enforceable harm mitigations, meaningful community investment, and lasting accountability mechanisms, to surface broad community needs while remaining adaptable to local contexts.
Challenge and Opportunity
Harms to Communities from Rapid Expansion of AI Infrastructure
U.S. data centers consumed 183 TWh of electricity in 2024 – more than 4% of total national consumption and roughly equivalent to the annual electricity demand of Pakistan – and consumption is projected to keep growing, to roughly 17% more by 2030. A typical AI-focused hyperscaler consumes as much electricity as 100,000 households; the largest under construction are expected to use 20 times as much. The scale is such that AI data center demand in Virginia alone contributed to an 833% increase in regional capacity market auction prices for 2025–2026. These prices are what electricity utilities and grid operators pay to ensure there will be enough power generation available during peak demand periods. The pressures translate directly into costs for ordinary ratepayers, and because they are structural costs baked into the grid, they also make it harder for communities to see, contest, or hold anyone accountable for the surge. Electricity prices in some data center-heavy regions have surged over 250% in five years, with estimates predicting data center electricity demand could double – or even triple – by 2028.
The scale of harm to nearby communities extends beyond electricity prices to increased water usage, e-waste, air and noise pollution, and adverse health effects. A single large data center can use up to 5 million gallons of water a day (with about a quarter of the usage from direct cooling), equivalent to a city of 50,000 people. Additionally, hardware disposal is projected to generate 1.2–5 million metric tons of e-waste from generative AI alone between 2020 and 2030. Diesel backup generators – required at every facility – emit particulate matter classified by the EPA as a likely human carcinogen. Per unit of electricity produced, diesel generators also emit 200–600 times more harmful nitrogen oxides than natural gas plants. Researchers estimate that data center backup generators in Virginia, operating at just 10% of permitted levels, could already cause 14,000 asthma symptom cases and 13–19 deaths annually, with public health costs of $220–$300 million per year spreading across multiple states – and with communities of color, low-income communities, and rural communities paying the bulk of that price.
But perhaps the most underappreciated community harm from the data center boom is fiscal: the extraordinary scale of tax subsidies that state and local governments have extended to some of the world’s most profitable companies, frequently without meaningful public disclosure or community input. Good Jobs First, which tracks corporate subsidies nationally, found that in 10 of the 20 states disclosing data center subsidy costs, programs cost over $100 million per year. Further, the opacity of these arrangements is striking: of 36 states with data center subsidy programs, only 11 publicly disclose which companies receive benefits. Virginia, the world’s largest data center market, forgoes nearly $1 billion annually in state and local revenue without telling the public which companies receive the money or how much each receives. Moreover, data centers, once fully built and operational, employ on average only 157 permanent workers, an extraordinarily low jobs return on billions in public subsidy that averages $1.4 million to $2.1 million in subsidies per permanent job. Additionally, companies frequently hide behind non-disclosure agreements (NDAs) to avoid public input and scrutiny, especially on critical details such as energy use, water consumption, and sometimes even the identity of the data center operator.
Centering Community Needs in AI Infrastructure Development
As data centers have proliferated and these harms have begun to be documented, the backlash against new developments has grown. Data Center Watch, which tracks grassroots opposition to large-scale projects across 28 U.S. states, found that between May 2024 and March 2025, $64 billion worth of data center projects were blocked or delayed by local opposition. In Q2 2025 alone, more project disruptions occurred than in the previous two years combined. Opposition is bipartisan and geographically broad. Nationwide polling found that only 44% of Americans would welcome a data center nearby, a lower acceptance rate than for gas plants, wind farms, or nuclear facilities.
This issue is an urgent priority because, while public concern over rising energy rates, water usage, and unchecked development is growing, no comprehensive mechanism currently exists to align the interests of communities, developers, and local governments.
As AI companies promise large-scale and incredible societal benefits from AI, they can show they are serious by making sure the data centers they are building to power the AI future benefit the communities they are in.
Why Community Benefit Agreements?
CBAs are legally binding agreements, negotiated between developers and community stakeholders, that secure enforceable commitments before development proceeds. Adapted from their successful use in bank merger oversight (under the Community Reinvestment Act) and clean energy project approvals, CBAs can:
- Establish environmental monitoring and reporting requirements more stringent than applicable permits.
- Secure financial contributions to community investment funds, backed by letters of credit that allow enforcement without costly litigation.
- Lock in local hiring commitments with specific numeric targets and apprenticeship pipelines.
- Create Community Advisory Boards with real authority and ongoing oversight throughout the life of the project.
- Make transparent what would otherwise remain hidden: water consumption, energy use, tax benefits, and environmental commitments.
In the absence of broader legislative and regulatory protections, CBAs offer a promising, underutilized, and legally binding tool to ensure adequate harm mitigation and to let communities share in the opportunities, not just the costs, of AI infrastructure. They have the additional benefit that they can be tailored to a community’s specific needs.
For instance, in late 2025, the city of Lancaster negotiated a legally binding CBA with the developers of the Lancaster AI Hub before construction was finalized, securing $20 million in community contributions. Key wins include a hard cap of 20,000 gallons per day of municipal water use per campus, a 100% clean energy requirement backed by tiered financial penalties of up to $10 million per building, strict noise limits tied to pre-construction ambient levels, and full public records transparency.
The agreement also commits developers to a local hiring plan, free first-responder training, and ongoing community engagement, demonstrating that municipalities can extract meaningful, enforceable protections from data center developers when they engage before key approvals are locked in. In this case the city negotiated the CBA, but communities themselves can win the same provisions in a legally binding CBA by working with community leaders, community-based organizations, and local policymakers, with enforcement mechanisms built in.
Importantly, CBAs do not require communities to support a project. They are negotiated exchanges. If a developer will not make commitments adequate to the community’s concerns, opposition — including calls for moratoriums — remains a legitimate and sometimes appropriate response. The credibility of that alternative is precisely what gives CBA negotiations their teeth.
Because policymaking, legislation, and other broader reforms take time, CBAs can be a particularly useful interim governance mechanism to meet the urgency of this moment.
Why now?
Hyperscalers are racing to secure sites, power contracts, and permits to meet AI demand. Because time to power is crucial for data center companies, communities and municipalities have genuine leverage right now, along with the need, the urgency, and the tools and resources to engage. Data center developments face political opposition that is delaying billions of dollars in projects. Developers need community support, or at minimum community acquiescence, to move through permitting processes that increasingly require public hearings, board votes, and, in some jurisdictions, community benefit plans.
Projected and current investments run to billions of dollars, their effects on communities are already being felt with more to come, and slower-moving broader reforms are not yet in place. CBAs are therefore not just a useful interim governance tool that can fill an urgent need; now is also the moment of maximum policy leverage.
Plan of Action
States should not rely on voluntary developer promises. They should create a statutory and regulatory framework that makes robust CBAs a condition for approval or subsidy in high-impact data center projects.
We recommend that CBAs be used, under defined conditions, as a policy tool for facilitation and solution-building that serves the objectives of all three parties: communities, developers, and local governments. Local policymakers should treat CBAs as a lever that enables communities to provide direct input, occupy an established space to negotiate impacts and mitigations, and secure reinvestment in ways that benefit the community.
Local governments can require CBAs (working alongside community-based organizations and other community leaders) if developers apply for permits, zoning, or other approvals to build out data centers – such that planning departments, zoning boards, or city councils can condition approval on compliance and can then impose penalties, delay permits, or revoke approvals if terms aren’t met.
The following recommendations highlight specific provisions that local policymakers (like the City of Lancaster in the Lancaster data center CBA) and community-based organizations advocating and negotiating on behalf of communities can use to protect communities from harm and to establish fairness, transparency, and accountability in the data center development process. Key provisions, along with their criticality, are also summarized in Summary Table 1 at the end of this proposal.
Recommendation 1. Policymakers (and CBOs and community leaders negotiating on behalf of communities) should utilize specific provisions to address harms and provide mitigations, to increase transparency, and to steward ongoing governance and accountability.
Harm Remediation
- Prohibit cost-shifting of energy rates to ratepayers. The impacts of a project’s energy demand on electricity affordability, grid infrastructure, and ratepayers are among the harms felt most directly by communities. CBAs should include measures to prevent or offset disproportionate burdens on residential customers and frontline communities, such as requiring developers to front the costs of any infrastructure upgrades and interconnection, or creating a new rate class for data centers (as in Oregon or Virginia).
- Require developers to go beyond regulatory compliance on environmental protections. The Lancaster CBA, for example, requires selective catalytic reduction on generators. In California, mitigations required or negotiated under the California Environmental Quality Act (CEQA) have included fence-line monitoring, health risk assessments, and restrictions more stringent than state permits. Every CBA should include independent real-time air monitoring with publicly available data, a community health fund financed by the developer, and diesel emission standards that go beyond what permits require.
- Require prioritization and usage of clean energy. The Lancaster CBA, for instance, requires 100% clean sourcing, with tiered penalties of $2.5M–$10M per building backed by a $10M Letter of Credit, and penalty proceeds directed to a Sustainable Development and Clean Energy Fund. Add third-party verification of Renewable Energy Certificates (RECs) and prohibit characterizing REC purchases as equivalent to direct clean energy generation without explicit disclosure. Where full clean energy sourcing is not achievable at the outset, the agreement should ratchet up clean energy requirements over time.
- Set a hard numeric cap on water usage with public reporting. Given the documented conflicts over water in drought-prone regions, water provisions are increasingly among the most contentious and most important elements of data center CBAs. The Lancaster CBA’s 20,000-gallon-per-day municipal water cap per campus, combined with closed-loop cooling requirements, is a strong model. Add quarterly public consumption reporting and a renegotiation trigger if operations expand beyond the scope contemplated at execution.
Transparency, Governance & Accountability
- Mandate public dashboards with ongoing reporting. These should cover water usage, energy usage, and pollution metrics such as hours of backup diesel generator operation and noise levels in decibels.
- Require full public disclosure of all tax incentives, Payments in Lieu of Taxes (PILOTs), and government subsidies received by the developer. Given that 25 of 36 states with data center subsidy programs do not disclose recipients, communities must insist on transparency in the CBA itself.
- Conduct impact assessments, including equity impact assessments.
- Create a Board with real enforcement authority. Every CBA needs a Community Advisory Board (CAB) with seats for environmental justice representatives and community residents (not just officials), with the authority to commission independent audits, apply defined financial penalties for violations, and seek injunctive relief directly. The CAB should also serve as the entity responsible for the community fund.
- Make enforcement penalties for violations clear and escalating. Community negotiators should insist on specific, escalating financial penalties for violations — not vague remediation language — with enforcement authority vested in the CAB.
- Include sunset and renegotiation triggers. Include mandatory renegotiation at five-year intervals or upon material changes in facility scope, ownership, or energy consumption. There should also be clear processes for decommissioning and long-term liability, to avoid stranded assets that leave local residents footing the bill. These could include decommissioning bonds (tied to facility footprint or power draw) posted at execution, a funded remediation escrow, and a specific site restoration timeline.
Recommendation 2. Policymakers and CBOs negotiating on behalf of communities should require investment in communities as a baseline condition for any equitable agreement.
The data center boom is generating extraordinary wealth. The hyperscalers building these facilities are among the most valuable companies in human history. The AI services that will run on this infrastructure will generate tens of billions of dollars in revenue. None of this wealth is being created in a vacuum: it is being created in specific communities, using specific community resources – land, water, electricity, roads, emergency services, and environmental carrying capacity. The communities that provide these resources deserve a meaningful share of the value they help create.
Beyond harm remediation, a CBA and its associated preparation and processes can serve as a platform to uncover, understand, and elevate broad community needs. Specific provisions should address these needs, with the ultimate goal of a more balanced and equitable distribution of the costs and benefits of AI development, given the wide ramifications of data center developments in host communities.
- Establish a Community Fund: CBA community funds can support locally determined priorities such as broadband access, AI and digital literacy programs, just transition pathways with apprenticeships and training, healthcare, and quality-of-life upgrades like parks and art, ensuring that the wealth generated by AI infrastructure is reinvested in the communities hosting it. They can also be used to offset any ratepayer costs of infrastructure upgrades that fall on parties other than the data center developers. Critically, nondisclosure agreements (NDAs) on government incentive terms must be prohibited, ensuring that subsidy arrangements are publicly accessible and communities can assess whether tax concessions are being offset by CBA commitments.
- Set Numeric Workforce Targets and Prohibit Misclassification: Workforce provisions should include specific local hiring targets – typically 30–50% of construction labor hours from defined geographies – written into the CBA itself rather than deferred to post-execution plans. Because operational data centers average only 157 permanent employees, workforce provisions should focus primarily on the construction phase, while leveraging the developer’s long-term presence to fund broader workforce training initiatives, including AI just transition opportunities, in the community.
- Secure Financial Commitments with Letters of Credit: Payments should be secured by a Letter of Credit or corporate guarantee from a sufficiently capitalized entity, with payment triggers tied to specific construction and operational milestones. For example, Lancaster commits $20M total, secured by a $20M Letter of Credit or corporate guarantee from a $100M+ net-worth entity, with payments triggered at construction financing and operations commencement per building.
- Explore Diverse Community Wealth-Sharing Mechanisms: Beyond direct cash funds, CBAs can incorporate a range of wealth-sharing tools such as community land trusts, local equity stakes in the facility, revenue-sharing agreements tied to facility profits, or dedicated funds for affordable housing and small business development – ensuring communities build lasting economic power rather than receiving one-time payments.
- Address AI-Specific Infrastructure Concerns: Although not as common yet, CBAs can also consider specific provisions addressing AI operations, data sourcing practices, and the risks of long-term infrastructure lock-in associated with AI systems.
Recommendation 3. Policymakers (and/or community negotiators) should proactively identify and put the supporting mechanisms in place for meaningful representation, negotiation, enforcement, and accountability.
The most common CBA failures are not in the provisions communities demand; they are in process and enforcement structure. When poorly structured, or negotiated after key approvals are in hand, CBAs can give the appearance of community benefit while delivering very little.
For CBAs to be effective, certain conditions and dependencies must be in place: investing in and strengthening community-level organizing and coalition-building; providing training and workshops on provisions and negotiations; providing thoughtful representation to prevent takeover; and building robust enforcement mechanisms so that benefits are delivered in practice. The legal history and use of CBAs in the bank merger approval process and in California’s CEQA “Opt-In” process, which requires a CBA, offer important lessons about levers, enforceability, and accountability, as well as recommendations on the negotiation and power-building process, listed below.
- Negotiate Early. Treat CBA execution as a precondition of permitting support, because negotiating leverage is greatest before approvals are granted. Work with local government officials to make clear to developers that permitting support is conditioned on a satisfactory CBA. The Lancaster CBA was negotiated after zoning opinions had been issued and demolition had begun, and its gaps (no specific hire targets, no independent community board, no air monitoring) directly reflect that reduced leverage.
- Build a United Coalition. Organize internally before engaging the developer, presenting a united front through a Community Advisory Board and a mediator if necessary – the coalition should exclude both intractable opponents and members prepared to support the project without a CBA.
- Establish Ground Rules First. Before negotiating specifics, use a memorandum of understanding (MOU) to set the terms for timeline, information-sharing, representation, and dispute resolution. This also prevents developers from selectively engaging sympathetic stakeholders while sidelining community members most directly affected.
- Secure Legal and Technical Representation, ideally with a cost-recovery agreement with the developer. Hire legal counsel with energy and environmental expertise and a technical expert to interpret site assessments, emissions modeling, and energy projections; unrepresented communities are structurally disadvantaged at the negotiating table. Negotiate a cost-recovery agreement requiring the developer to pay for community-selected legal counsel and technical experts, a practice well-established in permitting and utility interconnections that should become standard in CBA negotiations.
- Require Developer-Funded Community Review. Ask the developer to fund community technical review, a precedent well-established in CEQA practice. This can be paired with a due diligence phase in the negotiation, during which the developer provides documentation for the community coalition to review and respond to with recommendations.
- Demand Numeric Targets, Not Aspirational Language. Replace “good faith efforts” and vague commitments with specific, measurable targets subject to annual reporting and financial penalties for non-compliance, as bank merger advocates successfully did with dollar-denominated, geo-specific lending commitments.
- Prohibit NDAs on Environmental and Financial Data. Do not allow nondisclosure agreements on monitoring data, permits, consumption reports, or government incentive terms. The notorious Memphis xAI case, in which 35 unpermitted turbines operated in secret with health and environmental consequences for the community, illustrates the cost of unchecked secrecy. Lancaster’s CBA also correctly designates the CBA as a public record under Pennsylvania’s Right-to-Know Law.
- Negotiate CBAs and PILOT Agreements Together. CBAs and payment-in-lieu-of-taxes (PILOT) agreements must be negotiated in tandem with a cap on total payments, ensuring community investment funds supplement, and do not substitute for, any expected tax revenue.
- Specify Any Fund Governance in the Agreement. Ambiguous collective fund governance renders financial commitments meaningless – specify committee composition, voting rules, permitted uses, and annual reporting directly in the CBA.
- Frame Agreements Around Impact Mitigation, Not Approval. Require developers to first identify community concerns and propose mitigation before discussing payments. Framing money as the price of approval produces smaller commitments and less community ownership of outcomes.
- Know When CBAs Are Not the Right Tool. CBAs cannot substitute for strong environmental permitting, transparent subsidy disclosure, or robust utility regulation, and should not be pursued when permits are already in place, transparency has been denied, or a developer-backed document is being falsely presented as a community agreement. There are plenty of situations where opposition or moratorium might be more appropriate. Know the limitations of CBAs – their scope is limited to what the contracting parties agree to, and their enforceability depends on clear terms, specific metrics, secured financial obligations, and parties with the legal standing and resources to enforce them.
Conclusion
The extraordinary wealth generated by the AI data center boom is being built on community land, water, electricity, and environmental capacity. Yet, the communities bearing these burdens are seeing little of the benefit. The hyperscalers behind this buildout are among the most valuable companies in human history, and the AI services running on this infrastructure will generate billions in revenue. None of this wealth is created in a vacuum: it is created in specific places, using specific community resources, and the communities providing those resources deserve a meaningful share of the value they help create.
The current pattern in which vulnerable communities absorb the largest burdens, profitable companies receive the largest subsidies, and benefits flow primarily to shareholders, is neither inevitable nor acceptable. It reflects choices being made right now, as the buildout accelerates and the patterns of harm and benefit are being set. CBAs are a tool to make different choices: to insist that the communities hosting AI infrastructure share genuinely in its benefits, and that the costs of that infrastructure – to air quality, water systems, grid reliability, and community character – are borne by those who profit from it, not by those who simply happen to live nearby. The time to act is now.
Lancaster, PA, 2025
- The City of Lancaster negotiated a legally binding CBA with the developers of the Lancaster AI Hub before construction was finalized, securing $20 million in community contributions. Key wins include a hard cap of 20,000 gallons per day of municipal water use per campus, a 100% clean energy requirement backed by tiered financial penalties of up to $10 million per building, strict noise limits tied to pre-construction ambient levels, and full public records transparency. The agreement also commits developers to a local hiring plan, free first-responder training, and ongoing community engagement — demonstrating that municipalities can extract meaningful, enforceable protections from data center developers when they engage before key approvals are locked in.
Nashville MLS Soccer, Nashville, TN, 2018
- A coalition called Stand Up Nashville successfully advocated for this CBA in connection with a soccer stadium development project. The CBA includes, among other things, commitments on jobs that pay a living wage, hiring priorities, affordable housing, and a childcare center. As part of this CBA, Stand Up Nashville committed to support rezoning legislation for the stadium, which was widely opposed before the CBA. Nashville’s Mayor eventually supported the stadium project in large part due to the CBA.
Facebook Campus Expansion CBA, Menlo Park, CA, 2016
- This CBA, associated with an office expansion, is between Facebook and a coalition of community groups. In this agreement, Facebook made an almost $20 million commitment to affordable housing in the area, which led to an additional $60 million in other donor commitments.
Brookings: Why community benefit agreements are necessary for data centers
NAACP: CBA Template for Data Centers
Good Jobs First: Key Reforms: Community Benefits Agreements
Kapor Foundation: The Unequal Burden of Data Centers
AI Now Institute: North Star Data Center Policy Toolkit: State and Local Policy Interventions to Stop Rampant AI Data Center Expansion
From NAACP’s CBA Guide
In practice, this can mean:
1. The initial agreement pays for legal counsel and technical support, selected by and managed by the community coalition.
2. The next phase is either: (1) an agreement to establish binding requirements for transparency, impact studies, labor standards, and equity protections, which is contained in Article 3 of the template; OR (2) a due diligence phase, which requests information provided in Article 3.
3. An amendment is negotiated after the community has access to impact information on electric, environmental, housing, and infrastructure demands, which could be an amendment specifying the exact dollar amounts and project-specific mitigation measures.
This approach allows communities to understand the scale and type of impacts before finalizing the financial structure of the Community Benefits Agreement, while maintaining leverage and ensuring that non-opposition is tied to a complete, enforceable package of commitments.
From PolicyLink CBA Toolkit:
Unless developers face significant public pressure and/or legal leverage that jeopardizes public approval, developers are unlikely to compromise. A coalition may exert leverage to bring the developer to the table in a variety of ways: direct lobbying of elected officials and city staff, notifying any reporters covering the issue that the community has significant concerns, using social media to amplify the community’s voice and raise support, protests at the worksite or at City Hall, or artist-led community responses, like chalk art at the site or near City Hall.
Stakeholders & Roles:
A community coalition can include stakeholders such as individual residents, neighborhood councils, faith groups, local non-profits, local businesses, PTAs, and housing advocates. City administration staff and elected leaders can demonstrate inclusive leadership by (1) providing transparency around the project; (2) insisting on broad community support for project approval; and (3) encouraging CBA negotiations, without trying to influence them. Two to four coalition representatives should contact the elected officials (or city council staff) most involved in the proposed project and brief them on the coalition, its priorities, and any engagement it has had or plans to have with the developer. The coalition representatives should ask that the officials condition a vote in favor of the project upon the developer’s support for the coalition’s priorities.
Elected officials can be important allies in a CBA negotiation because they can persuade their colleagues on council to delay a vote on the project to allow more time for the coalition to negotiate with the developer. They can also apply pressure on the developer to reach an agreement with the coalition. The coalition should assess whether it can count on commitments of support from a majority of the committee and/or council members. Particularly if a coalition is new, support from key elected officials will help bring developers to the table. It may be necessary to take legal action against objectionable aspects of the development to inspire a willingness to negotiate.
Settlement Wins Against Big Tech Should Underwrite Digital Resilience Funds
Even historically large penalties have proven insufficient to create durable and effective deterrence against corporate wrongdoing. A better approach has eluded regulatory enforcers, legislators, attorneys general, and the judiciary. This challenge has been especially acute as enforcers have attempted to rein in the worst violations of the largest technology companies as we transition from the social media era to the AI era. Their scale and market power allow these companies to absorb even historic penalties as the cost of doing business, blunting the effectiveness of civil litigation and regulatory fines.
The stakes for more effective deterrence and a more robust remedies toolkit are rapidly compounding. Many emerging AI-related harms, including AI-induced psychosis, maladapted socialization, deepfake-driven bullying and harassment, suicide coaching, and declines in children’s literacy, bear the hallmarks of a public health crisis or environmental disaster rather than just discrete consumer injuries. The scale of these externalities invites greater prosecutorial and regulatory scrutiny but also demands a more creative enforcement playbook. When historic fines against these companies and their predecessors disappear into general treasuries, those funds remain largely inert instead of helping the public defend itself.
Injunctive relief and headline fines are important enforcement mechanisms, but if enforcement is to reach its deterrent potential and protect the public in the advanced algorithmic era, we must recognize that penalizing corporate misconduct is only half the battle. By allocating funds from tech settlements to investments in broad-based consumer education, digital literacy, independent research, or new enforcement and investigatory infrastructure, state attorneys general and the judiciary can transform these otherwise inert dollars into a sustained and active defense against digital harms.
Challenge and Opportunity
The Federal Trade Commission’s historic $5 billion settlement with Facebook in 2019 is perhaps the clearest example of a broken enforcement model. At the time of its announcement, the penalty was the largest ever imposed by the FTC on a company for violating consumer privacy. Even as a majority of the Commission approved the settlement, Commissioners Rebecca Slaughter and Rohit Chopra warned in their dissents that the penalty was unlikely to meaningfully deter the company or the broader market. They were right. The settlement imposed some compliance obligations, but none challenged the company’s underlying business model of aggressive data harvesting. The company’s stock price rose after the announcement. Within a few years, the FTC sought to reopen its privacy orders against Meta over subsequent alleged privacy violations, illustrating the failure of the penalty to sustainably alter corporate behavior.
The Facebook settlement was hopefully the high-water mark of a certain kind of enforcement paradigm. Fines should be larger. Behavioral and structural remedies should be stronger and imposed more often. Vital work has been done to turn that page and institute meaningful controls on data abuses and exploitative design. But as we continue to use fines and penalties, we have to confront a limitation in the enforcement model: when those dollars disappear into state or federal treasuries, they do little to address systemic technological disruption. To protect the public, enforcers can put settlement dollars to work. We need to invest directly in the public so our society is prepared to handle this wave of technological disruption.
Inert Fines, Federal Constraints, and State Action
What if the $5 billion Facebook settlement had been put to better use?
Imagine if even a portion of those funds had supported a sustained nationwide consumer education effort on digital literacy and the harms of social media use. A fraction of that money could have supported public education campaigns teaching people about manipulative design practices and how to take back their autonomy. The fine itself only punished the company’s past conduct on the supply side; investing that money in public education could have helped shift the demand side, changing the user behavior that made these products profitable.
Instead, as with most federal settlements, that money by law flowed directly from Facebook into the federal treasury. Federal enforcers have limited ability to direct such funds toward targeted public education or resilience efforts (with the notable exception of the Consumer Financial Protection Bureau (CFPB), which is allowed to direct civil penalties to a special consumer education fund).
While federal regulators like the Federal Trade Commission and the Department of Justice have obtained some landmark penalties, state attorneys general have increasingly become the primary defenders of Americans’ digital rights. In recent years the states have secured billions of dollars through aggressive enforcement: a $700 million settlement over Google’s app store practices; $391.5 million in a multistate effort over deceptive location tracking; and $1.4 billion from Meta for “using facial recognition without users’ permission.” The list goes on.
Crucially, state attorneys general have different constraints on how civil penalty funds may be used. Many states have their own Unfair or Deceptive Acts or Practices (UDAP) statutes, in addition to a variety of consumer protection laws. Under common law practice and state statutes, many AGs have more leeway in directing their settlement funds to organizations and causes “consistent with the objectives and purposes of the underlying cause of action.” Through multistate settlements, AGs have repeatedly demonstrated their ability to coordinate enforcement and reshape industry practices on a national scale.
But it’s fair to ask: how much will even a $1.4 billion payout change a company’s underlying market incentives and consumer behavior? What might that $1.4 billion accomplish if even a portion were invested in changing consumer behavior through consumer education, digital literacy, independent research, and resilience building?
Crafting forward-looking structural and behavioral remedies in a fast-changing industry is important and difficult. It was only in the past few years, decades into the internet era, that the Federal Trade Commission embraced data minimization and algorithmic deletion as the appropriate remedies for data abuses. Finding the appropriate remedies for AI-related harms is crucial work, but it will take time. If we put the dollars to work, negotiated settlements can help build deterrence and prevention right now.
Successful Lawsuits Against Defective Algorithms, Addictive Product Design
Recent state-level actions prove we can change the enforcement paradigm. Two recent cases target the root of these digital harms: defective algorithms and addictive product design. By framing these platforms as defective products engineered to exploit children, enforcers have bypassed the traditional tech liability shield. This breakthrough could open the floodgates for systemic accountability.
In March, a California jury awarded a 20-year-old plaintiff $6 million after finding that Meta and YouTube negligently designed their platforms and caused severe mental health crises under a theory of defective products liability. In the same month, a New Mexico jury levied a historic $375 million penalty against Meta for violating the state’s Unfair Practices Act by misleading parents about the safety of its products, thereby enabling child exploitation.
These verdicts could be bellwethers for a wave of impending litigation and settlements. Currently, a historic and “sprawling” set of consolidated lawsuits, known as Multidistrict Litigation (MDL) 3047, is proceeding in federal court. The litigation includes 41 attorneys general, hundreds of school districts, and thousands of individual personal injury suits, all consolidated and contesting the “‘unreasonably dangerous’ design of social media platforms.”
These cases, and others, mean billions of dollars may soon be changing hands. The critical policy question for state enforcers is whether those funds, after class members and direct victims are made whole through restitution, will disappear into general treasuries or be used to address the real problems at hand.
Plan of Action
Putting Settlement Dollars to Work
We propose putting settlement dollars to work through the creation of a Digital Resilience Fund. This is not a radical departure from current enforcement norms. Rather, it’s a call to accelerate adoption of a model that has been successfully deployed across other industries, as seen in Table 1. One such example, the Truth Initiative, informs this proposal.
In each of the cases in Table 1, enforcers recognized that penalties alone weren’t sufficient to ameliorate the harm from the underlying legal violations. Settlement dollars disappearing into general treasuries would have been a disservice. As victim advocates frequently note, one of the most profound ways to honor victims is by preventing others from becoming victims, whether the threat is from opioids, unlawful financial practices, smoking, or pollution.
Further, the time to act is now. The public is already convinced there’s a problem with AI accountability. Recent Gallup polling reveals a fascinating paradox regarding the next generation of consumers. While over half of Generation Z uses generative AI on a weekly basis, their optimism about the technology is plummeting. They are increasingly anxious about the technology’s impact, with majorities expressing fear that AI will come at a high cost to human creativity, critical thinking, and learning. The public can feel the ground shifting, but lacks the tools to fight back.
Recommendation 1. Establish a Digital Resilience Fund
The previous examples of settlement agreements all exemplify an important principle: settlement dollars ought not to be inert. Directing how those dollars are spent is an additional tool for deterring corporate wrongdoing, mitigating digital and AI harms, and hardening society to deal more effectively with disruptive technology. An educated public acts as a deterrent and could help steer the market toward deploying technologies that serve, rather than exploit, their users.
State AGs alone, or in concert with each other, or with legislators, can begin redirecting a portion of major technology settlement proceeds into a fund focused on education, research, and harm mitigation related to AI. These funds could be administered through an independent nonprofit (like the Truth Initiative), through an existing public foundation structure, or through state-level grant programs (like the opioid abatement funds). The precise institutional form is less important than the principle that settlement or fine dollars tied to AI-related harms ought to be used to build society’s capacity to stop or withstand those harms.
What a Digital Resilience Fund Could Do
Depending on the size of the settlement and the scope of the underlying harm, enforcers and lawmakers could scale these funds across a range of initiatives:
- Fund Counter-Marketing and Awareness Campaigns: A fund could drastically modernize consumer education around technology. The Truth Initiative showed that a well-funded and sophisticated campaign can change behavior. To compete against billion-dollar engagement engines, we need compelling communications that resonate with the public. We envision a “touch grass” message delivered with cultural fluency on the social platforms where harms occur, meeting the moment and helping people make different, more informed choices.
- Support Independent Research and Monitoring: Money for research means we’ll be equipped with a better understanding of how these tools affect behavior and mental well-being. Researchers could also identify evidence-based interventions that help save lives. The research could then be translated into public education materials and materials for evidence-based remedies, regulation, and legislation.
- Support Digital Literacy and AI Education at Scale: Counter-marketing can raise awareness, while digital literacy can build skills for this media era. This could mean grants to schools, libraries, or community organizations to teach students, educators, and families how AI systems can shape behavior, how to navigate a changing information environment, or how deepfakes can erode trust. As documented by the OECD, a whole-of-society approach to media literacy can be extremely effective against disinformation.
- Act as a Nimble Response Mechanism: As AI tools become more autonomous and agentic, new risks will emerge faster than legislation or litigation can mitigate them. A resilience campaign could launch educator toolkits or literacy campaigns as a first step while legislative efforts and litigation strategies are ongoing.
- Educate and Protect the Labor Force: AI and algorithmic harms will extend beyond social media. For settlements involving hiring software, worker surveillance, or discriminatory models, public education campaigns could also educate workers about their rights. Do professionals who work primarily on computers know that by “using AI” in their daily work, they may be training their replacements? Do they know how to communicate with their coworkers without being surveilled so they can take collective action?
- Establish a Multistate Investigatory and Research Apparatus: As comprehensive federal tech regulation remains stalled by gridlock, state enforcers have become the primary defenders of consumer rights. By pooling settlement resources, a coalition of states could establish shared or parallel investigatory infrastructure. Recent initiatives like the Governor’s Public Health Alliance show that states already have the logistical framework to pool expertise and coordinate parallel responses when federal infrastructure is lacking. This would provide state regulators, AGs, and legislators with the dedicated technical expertise, auditing capacity, and ongoing market monitoring needed to support future litigation and evidence-based regulation without waiting for federal action.
Together, these recommendations all work to influence cultural change about how our society views AI and evaluates corporate harms. Luckily, we have evidence that this kind of investment can be successful.
The Evidence for Culture Change Efficacy
It’s easy to look back at the 1980s “Just Say No” era and wonder if public education campaigns can actually do anything to change entrenched consumer behavior. But the data tells a different story. Well-funded and targeted campaigns have made a difference.
The Truth Initiative is the gold standard. Instead of dryly lecturing teenagers, the campaign exposed the manipulative marketing tactics of tobacco executives and helped cause a collapse in teen smoking, which dropped from nearly 23% in 2000 to less than 2% today. Peer-reviewed studies have shown that in just one year the campaign prevented 300,000 kids and young adults from becoming smokers.
The collapse in smoking is a generational public health accomplishment, but other interventions around the world have shown that public education works:
- Finland: The country has used a whole-of-society education approach for decades to combat disinformation.
- The Real Cost: The FDA has used digital and social media advertising to drive declines in teen vaping.
- This Girl Can: The UK tackled the gender gap in sports, inspiring over 1.6 million women to start exercising by dismantling cultural stigmas.
- Time to Change & It’s Not OK: The UK and New Zealand successfully used massive awareness campaigns to measurably reduce mental health discrimination and shift community norms around family violence.
- Dumb Ways to Die: Australia used a humorous digital campaign to secure a 20 percent drop in transit accidents.
When combined effectively, litigation, regulation, and education have a proven track record of changing social behavior. Protecting the public from the tech industry’s predatory business models and the next wave of AI harms is an enormous challenge, but we have the evidence that trying to build a healthier digital culture is absolutely worth the effort.
Guardrails and Guidance
To maintain public trust and prevent misuse of the funds it collects, any Digital Resilience Fund or similar initiative must operate under a narrow mandate focused on the remediation and prevention of AI-related harms and follow the best practices established in previous settlements. For example, the National Opioid Settlement Agreement provided a list of approved uses for funds focused solely on abatement. Some states have also instituted public dashboards to track the spending of settlement dollars transparently.
While many AGs already have the authority to direct these funds through settlement agreements, ultimately codifying them through state legislative frameworks may provide greater predictability and transparency for their long-term operation. Legislation may also be necessary to allow fines and penalties, not just settlements, to contribute to the fund in some states.
Conclusion
An informed public is a valuable partner in deterring corporate malfeasance.
Fines must be large enough to penalize lawbreaking, and structural and behavioral remedies must aggressively dismantle harmful corporate practices, especially with regard to the growing power of AI companies. These are the core instruments of any effective enforcement toolkit. However, if we really want to change these companies’ behavior, we have to change the market they operate in.
A well-funded digital literacy and culture campaign could fill this gap. By giving ordinary people the skills to spot deepfakes, resist manipulative algorithms, and protect their mental health, we empower them to demand safer products.
State attorneys general have an incredible opportunity to build on the historic work they have already done. From the Tobacco Master Settlement Agreement to the opioid abatement funds, the states have proven themselves the primary architects of massive, society-saving interventions.
As algorithmic harms increasingly mirror environmental disasters and public health crises, our response must be equally systemic. The next wave of technology settlements offers a generational opportunity to look beyond the standard playbook. Rather than treating historic recoveries as a simple windfall for state treasuries, enforcers must deploy these funds to protect our communities and build a stronger foundation for our democracy.
A public education fund has already been requested as a remedy in a technology case. Colorado requested the establishment of a “public education fund” as one of the remedies in the Google search monopoly case. Judge Amit Mehta declined to sign off on it, noting that (1) the states did not draw any connection between Google’s distribution agreements and the public’s perceptions of other search engines as a prong of the Sherman Act allegations and (2) the state’s “lack of specifics [about the potential program] is fatal.” Helpfully, this lays out a guide for future litigants to win a public education fund by drawing this connection and providing those specifics. In cases relating to consumer protection, deceptive practices, and product liability, consumer perceptions are central, so it will be even easier to demonstrate a connection to public education. The first enforcer to win such a remedy would serve as a model for others, creating a snowball effect. Note: This decision concerned a court-ordered remedy and does not limit settlements.
Prioritize Student Safety in K-12 Education By Establishing AI Procurement Guardrails
Artificial intelligence (AI) tools are rapidly entering K–12 education, influencing discipline, grading, placement, attendance monitoring, tutoring, and school safety. While these systems claim to promote efficiency and innovation, adoption has outpaced oversight. Opaque and insufficiently tested tools are increasingly shaping student outcomes without consistent transparency, civil rights review, or technical safeguards.
This presents material legal and operational risks. AI systems affecting discipline, eligibility, and monitoring may implicate education civil rights laws such as Title VI, Title IX, Section 504, and the Individuals with Disabilities Education Act (IDEA), particularly where disparate impacts arise from biased historical data. Tools that collect or process sensitive student information also raise compliance concerns under the Family Educational Rights and Privacy Act (FERPA) and related state laws. At the same time, many districts lack the capacity to evaluate vendor efficacy claims or negotiate contracts that protect against bias, privacy breaches, or vendor lock-in.
States and the U.S. Department of Education can address these risks using procurement and oversight tools already within their authority. This memo proposes six actionable steps: (1) establish statewide AI procurement guardrails; (2) require Algorithmic Impact Assessments for high-risk systems; (3) prohibit or strictly limit predictive-policing and law-enforcement-derived analytics in schools; (4) encourage ongoing performance monitoring and incident response; (5) create state-level technical assistance and vendor accountability programs; and (6) invest in leadership-level capacity building for superintendents and senior administrators. Together, these measures support safer adoption, reduce discrimination and privacy risks, strengthen fiscal stewardship, and build public trust.
Challenge and Opportunity
Recent incidents demonstrate that AI deployment in K–12 settings can create serious risks when implemented without adequate safeguards, transparency, or oversight. In one widely reported example, the Los Angeles Unified School District entered a contract with an education-tech startup that ultimately misused student data and put sensitive information at risk. In another case, an AI-enabled security camera system used at a school in Baltimore County Public Schools misidentified a bag of chips as a firearm, illustrating the potential for inaccurate automated threat detection systems to trigger unnecessary panic, disciplinary responses, or law enforcement intervention. These incidents underscore that AI systems deployed in schools can materially affect student privacy, safety, disciplinary outcomes, and civil rights, particularly when systems are introduced without sufficient testing, human oversight, or clear accountability mechanisms.
In some contexts, AI may present genuine opportunities for K–12 education when deployed thoughtfully and within appropriate limits. Certain tools may help educators identify students who need additional academic support, expand access to tutoring, streamline administrative tasks, and improve language accessibility for multilingual learners and families. Used as decision-support rather than decision-making systems, AI can help schools direct limited resources more efficiently and support individualized learning in ways that traditional software often cannot.
At the same time, the benefits of these systems depend heavily on how they are selected, designed, governed, and monitored in practice. Educational institutions are increasingly being asked to evaluate unproven products, some costing in the tens of millions of dollars, that make probabilistic inferences, adapt over time, and operate with limited transparency. These features differ substantially from conventional education technology. The distinctions matter because systems introduced to improve efficiency may also shape high-stakes educational outcomes in ways that are difficult to detect without structured oversight, and some can even be harmful to students and school communities.
The core challenge is not simply AI adoption, but that it is occurring through procurement systems designed for conventional software—not probabilistic tools that may influence discipline, placement, and safety. Most districts rely on standard ed-tech purchasing processes that rarely require structured review of training data, demographic performance, or long-term equity impacts, leaving high-stakes decisions without proportionate risk analysis.
This governance gap is amplified by fragmentation. Local school boards can adopt procedures to aid decision making. Meanwhile, thousands of districts negotiate independently with sophisticated vendors, often lacking the expertise to assess claims about accuracy, bias, or data security, or simply accepting “off-the-shelf” AI products and terms of service. Contracts can limit audit rights, assent to harmful data practices, and create vendor lock-in, with smaller districts particularly vulnerable to unverified assurances. In worst-case scenarios, oversight begins only after harm occurs, leaving districts reactive rather than preventative. In short, decentralized procurement, uneven capacity, and opaque vendor practices create structural risks that, absent coordinated state standards, may entrench inequities and erode public trust.
While the challenges are significant, states possess clear authority to address them. States retain primary responsibility for K–12 governance, including procurement standards, contracting requirements, and oversight of local education agencies. State legislatures and departments of education can issue guidance, promulgate regulations where authorized, condition funding on compliance, and coordinate with procurement offices, CIOs, and attorneys general to establish uniform contracting expectations.
In practice, states can establish baseline procurement checklists and disclosure requirements; mandate processes that promote better-informed decision making; develop model contract clauses addressing data minimization, audit rights, and termination; require pre-deployment review for high-risk systems; condition technology funding on governance criteria; and provide centralized technical assistance. The U.S. Department of Education plays a complementary role through civil rights enforcement, FERPA guidance, and grant-making authority under relevant statutes.
Importantly, not all AI use in schools is harmful. Tutoring systems that scaffold reasoning, or tools that identify students for additional support, can expand opportunity when transparent and used as decision-support rather than automated decision-making systems. The goal then is not to halt innovation, but to channel it responsibly.
Procurement is a high-leverage intervention point that can foster responsible innovation and technology integration. Rather than framing AI governance as a choice between bans and unregulated adoption, guardrails focus on conditions of purchase and deployment, preventing harms upstream before litigation, remediation, or public controversy arise. A statewide framework reduces fragmentation and strengthens negotiating leverage. Consistent standards lower compliance costs and incentivize vendors to compete on transparency, fairness testing, and privacy protections. By acting now, states and federal education leaders can shape procurement norms before harmful practices become entrenched, thereby supporting innovation while safeguarding students’ rights and trust.
Plan of Action
Recommendation 1. Establish Statewide AI Procurement Guardrails for K–12 Purchasing
Statewide procurement guardrails are the most feasible and immediate way to reduce risk in school-based AI adoption. Rather than requiring each district to independently develop technical and legal expertise, state departments of education can establish uniform, scalable standards grounded in existing authority and assist schools with product analysis resources. State departments of education already regulate contracts for textbooks, transportation, and student data systems. AI introduces new technical considerations, but these can be addressed through structured disclosure requirements, risk classification, and standardized contract provisions led by state departments of education.
Centralizing guardrails reduces duplication, strengthens negotiating leverage with vendors, and addresses civil rights and privacy risks upstream. A modest investment in templates and oversight can significantly reduce long-term legal, fiscal, and reputational exposure.
Classify and Define “High-Risk” AI Uses
States should classify AI systems by risk level so that safeguards are proportionate to the stakes involved. High-risk systems would include those that implicate rights found in the 2022 White House “Blueprint for an AI Bill of Rights.” Within an education context, that would include systems that affect discipline or expulsion, affect placement or eligibility decisions (including gifted or special education), generate behavioral risk scores or threat assessments, enable surveillance or facial recognition, affect grading or graduation eligibility, or predict student performance in ways tied to consequential outcomes.
These uses are high-risk because they directly affect educational access, safety, liberty interests, and protected status under federal civil rights law, potentially implicating due process protections and civil rights statutes such as Title VI, Title IX, Section 504, and the IDEA. State guidance should also clarify that systems qualify as high-risk when they materially influence high-stakes decisions or process highly sensitive data such as disability or biometric information. For example, in the K–12 education setting, systems that recommend student discipline, identify students as potential safety threats, determine eligibility for advanced academic programs, monitor student mental health or behavior, or evaluate students for special education interventions can influence school administrators’ decisions and substantially shape students’ educational opportunities, outcomes, and civil rights protections. They therefore warrant heightened oversight. AI that may materially influence high-stakes decisions should always be subject to human oversight and never be the sole basis for a decision.
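One way a state template might encode this classification is sketched below in Python. It is an illustration only: the category labels, the `AISystem` record, and the two-tier `RiskTier` scheme are hypothetical names chosen for the example, not terms drawn from any statute or existing state guidance. The sketch treats a system as high-risk if it performs any of the uses listed above or processes highly sensitive data.

```python
from dataclasses import dataclass, field
from enum import Enum


class RiskTier(Enum):
    HIGH = "high"
    STANDARD = "standard"


# Uses described above as high-risk. The string labels are hypothetical.
HIGH_RISK_USES = {
    "discipline_or_expulsion",
    "placement_or_eligibility",
    "behavioral_risk_score_or_threat_assessment",
    "surveillance_or_facial_recognition",
    "grading_or_graduation_eligibility",
    "consequential_performance_prediction",
}

# Data types described above as highly sensitive.
SENSITIVE_DATA = {"disability", "biometric"}


@dataclass
class AISystem:
    name: str
    uses: set = field(default_factory=set)
    data_categories: set = field(default_factory=set)


def classify(system: AISystem) -> RiskTier:
    """High-risk if any listed use applies or any highly sensitive data is processed."""
    if system.uses & HIGH_RISK_USES or system.data_categories & SENSITIVE_DATA:
        return RiskTier.HIGH
    return RiskTier.STANDARD
```

A real classification would still need human judgment about whether a tool "materially influences" a decision; a lookup like this can only flag systems for that review.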
Require a Pre-Purchase Review Checklist and Minimum Vendor Disclosures
For high-risk systems, states should require a structured pre-purchase review to ensure that key legal, technical, and operational risks are addressed before contracts are executed.
A standardized AI Procurement Review Checklist should require districts to document:
- Purpose and Use Case: The specific problem addressed, whether the system supplements or replaces human decision-making, and which student populations are affected including the risk that certain groups of students may be adversely impacted and how.
- Accuracy and Validation: Identity of the validation tester and any conflicts of interest including financial interest in the sale of the product being evaluated, documented error rates, validation studies (including demographic performance where available), and known system limitations.
- Disparate Impact and Treatment Monitoring: Whether the system has been tested for disparate impact and disparate treatment across protected groups, the methodology used, and plans for ongoing monitoring. Ideally, the review would also consider other legal indicia of discrimination, including whether the system has been tested for selective enforcement, hostile learning environment, proxy discrimination, and failures to provide accommodation, among others.
- Human Oversight and Contestability: Who reviews outputs, whether AI influences adverse decisions, and the process for student or parent appeals.
- Data Governance and Security: Categories of data used, retention timelines, subcontractor involvement, encryption standards, access controls, and incident response protocols.
- Overall Safety Declaration: A declaration that determines whether the product is safe or too dangerous for school access.
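As a rough illustration of how a standardized checklist could be made machine-checkable, the Python sketch below models the checklist items above as fields and reports which ones a district has not yet documented. The class and function names are hypothetical, and the "non-empty text" test is an assumption made for simplicity; an actual state template would define what counts as adequate documentation.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ProcurementChecklist:
    # One field per checklist item; None or blank means not yet documented.
    purpose_and_use_case: Optional[str] = None
    accuracy_and_validation: Optional[str] = None
    disparate_impact_monitoring: Optional[str] = None
    human_oversight_and_contestability: Optional[str] = None
    data_governance_and_security: Optional[str] = None
    overall_safety_declaration: Optional[str] = None


def missing_items(checklist: ProcurementChecklist) -> list:
    """Return the names of checklist items that are still undocumented."""
    return [
        f.name
        for f in fields(checklist)
        if not (getattr(checklist, f.name) or "").strip()
    ]


def ready_for_contract(checklist: ProcurementChecklist) -> bool:
    """Under this sketch, a contract is not executed while any item is missing."""
    return not missing_items(checklist)
```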
Vendors, or ideally independent evaluators, should also provide minimum disclosures, including general descriptions of training data sources, audit summaries (if available), subgroup performance metrics, data retention policies, subprocessor lists, server locations, and cybersecurity certifications (e.g., SOC 2). For example, subgroup performance metrics can help identify whether a tool performs differently for students with disabilities or students from different racial or linguistic backgrounds; data retention and server location disclosures help districts evaluate compliance with student privacy laws and data governance obligations; and cybersecurity certifications and subprocessor information help schools assess risks related to student data security and third-party access. These requirements do not mandate disclosure of proprietary algorithms, but they do require sufficient transparency for schools to conduct procurement diligence, evaluate potential harms, and ensure accountability for systems that may materially affect students’ educational experiences and opportunities.
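To make "subgroup performance metrics" concrete, the Python sketch below computes one such metric, the false positive rate per subgroup, for a tool that flags students. The function names, the record format, and the sample records are all invented for illustration; the gap it reports is a descriptive statistic, not a legal threshold for disparate impact.

```python
from collections import defaultdict


def false_positive_rates(records):
    """records: iterable of (subgroup, flagged, actual) tuples.

    Returns {subgroup: false positive rate} among students whose actual label is False.
    Subgroups with no such students are omitted rather than reported as zero.
    """
    flagged_negatives = defaultdict(int)
    negatives = defaultdict(int)
    for subgroup, flagged, actual in records:
        if not actual:
            negatives[subgroup] += 1
            if flagged:
                flagged_negatives[subgroup] += 1
    return {g: flagged_negatives[g] / n for g, n in negatives.items()}


def max_rate_gap(rates):
    """Largest absolute difference in rates between any two subgroups."""
    values = list(rates.values())
    return max(values) - min(values) if values else 0.0


# Invented sample: (subgroup, tool flagged the student, concern was real)
example_records = [
    ("group_a", True, False), ("group_a", False, False),
    ("group_a", False, False), ("group_a", False, False),
    ("group_b", True, False), ("group_b", True, False),
    ("group_b", False, False), ("group_b", False, False),
    ("group_b", True, True),
]
example_rates = false_positive_rates(example_records)
# example_rates == {"group_a": 0.25, "group_b": 0.5}; max_rate_gap(example_rates) == 0.25
```

In this invented sample the tool wrongly flags students in one group twice as often as in the other, which is the kind of difference a district would want disclosed and explained before purchase.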
Establish Standard Contract Clauses
Procurement guardrails are only effective if embedded in enforceable contracts. States should develop model AI contract provisions that districts are required or strongly encouraged to adopt.
Core clauses should include:
- Data Minimization and Use Limits: Collection limited to what is necessary as determined by the school district; prohibition on secondary uses (e.g., model training or resale) without parental consent; clear data deletion timelines upon termination of the contract.
- Security and Breach Notification: Encryption at rest and in transit; defined breach notification timelines; vendor liability for negligence-related breaches.
- Audit and Transparency Rights: District, state, or independent evaluator access to performance and compliance documentation; right to independent audit; annual performance updates.
- Termination and Exit Protections: Termination rights for material failures or civil rights concerns; data portability; certified data deletion or media sanitization upon termination of contract.
- Subcontractor and Data Location Controls: Full subprocessor disclosure; equivalent data protections; restrictions on undisclosed offshore data transfers; U.S.-based storage where feasible.
Embedding these provisions strengthens district leverage, reduces vendor lock-in, and creates enforceable accountability if harms occur.
Recommendation 2. Require an Algorithmic Impact Assessment (AIA) Before Deployment of High-Risk Systems
States, with the assistance of independent evaluators, should require districts to complete and publicly post an Algorithmic Impact Assessment (AIA) before deploying any high-risk AI system. An AIA is a structured evaluation of a system’s purpose, risks, legal implications, and mitigation strategies conducted prior to deployment.
By requiring AIAs, states shift governance upstream and identify civil rights, due process, privacy, and safety risks before systems affect students, rather than responding after harm occurs. This approach reinforces that educational innovation must comply with civil rights law, protect student data, and meet defined safety standards. AIAs do not prohibit AI use. They ensure that AI adoption is deliberate, transparent, and accountable.
When an AIA Is Required
An AIA should be mandatory for AI systems classified as high-risk under Recommendation 1, including systems that:
- Influence discipline or safety interventions
- Affect placement, eligibility, or grading decisions
- Conduct behavioral monitoring or surveillance
- Generate risk scores tied to adverse actions
The AIA must be completed prior to contract execution or system activation.
Core Contents of the AIA
To be meaningful, an AIA must provide sufficient detail to inform decision-makers and the public without requiring disclosure of proprietary source code. States should issue a standardized template to ensure consistency. An AIA should include:
- Purpose and Use: The problem addressed, how AI outputs will be used (advisory or determinative), and which student populations are affected.
- Data Inputs and Governance: Categories of data used; whether protected characteristics or proxies are included; sources of training data; retention timelines; data sharing practices (including subcontractors and cross-border transfers); and whether student data are used for ongoing model training. This enables risk evaluation under FERPA and civil rights law without requiring disclosure of proprietary datasets.
- Accuracy and Validation: Error rates, subgroup performance where feasible, validation methods, and known limitations. Where subgroup testing has not occurred, that must be disclosed.
- Disparate Impact, Disparate Treatment, and Oversight: Results of disparate-impact and treatment testing, ongoing monitoring plans, mitigation strategies, and confirmation that AI will not serve as the sole basis for adverse decisions without human review. The assessment should also identify which agency, department, or designated internal team is responsible for ongoing oversight, whether dedicated staff capacity exists to carry out that responsibility, and who will be accountable for responding if discriminatory outcomes, safety failures, or other material harms emerge after deployment.
- Due Process Protections: Notification procedures, appeal processes, review timelines, and mechanisms for correcting erroneous records.
- Privacy and Security Safeguards: Encryption standards, access controls, incident response protocols, breach notification timelines, and FERPA compliance.
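One way a state might make the standardized template machine-checkable is to treat the six content areas above as fields in a record and flag any that are left blank before a district posts the assessment. The sketch below is illustrative only: the class name, field names, and sample text are hypothetical and are not drawn from any existing state template.

```python
from dataclasses import dataclass, fields


@dataclass
class AlgorithmicImpactAssessment:
    """Hypothetical record mirroring the six content areas listed above."""
    purpose_and_use: str = ""
    data_inputs_and_governance: str = ""
    accuracy_and_validation: str = ""
    disparate_impact_and_oversight: str = ""
    due_process_protections: str = ""
    privacy_and_security_safeguards: str = ""


def missing_sections(aia):
    """Return the names of sections that are empty or whitespace-only."""
    return [f.name for f in fields(aia) if not getattr(aia, f.name).strip()]


# Hypothetical draft with only two sections filled in.
draft = AlgorithmicImpactAssessment(
    purpose_and_use="Advisory attendance alerts for grades 6-8.",
    accuracy_and_validation="Subgroup testing has not occurred.",
)
print(missing_sections(draft))
```

A check like this only confirms that each section has been filled in; it says nothing about whether the content is adequate, which still requires human review.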
Transparency and Public Posting Requirements
AIAs should balance meaningful transparency with protection of proprietary information. States should require a two-tiered structure:
- Public version: Narrative descriptions of the system’s purpose, data categories, data and privacy protections, performance metrics, limitations, safeguards, and high-level demographic testing results where feasible, accompanied by a plain-language summary that is accessible to parents and guardians.
- Regulator-facing appendix (as needed): More detailed technical documentation, validation studies, and contractual data governance materials.
Raw datasets and proprietary algorithms do not need to be disclosed. However, vague assurances (e.g., “tested for bias”) are insufficient. AIAs must include independent testing, documented metrics, and clear descriptions of methodologies.
Each AIA should be publicly posted on the district website prior to acquisition, including a parent-friendly summary (2–3 pages), contact information for questions or complaints, and a clear explanation of appeal rights. States can reduce administrative burden by providing standardized templates and maintaining a centralized statewide repository.
Public transparency strengthens trust, enables independent review, promotes vendor accountability, and supports cross-district learning.
Recommendation 3. Prohibit or Strictly Limit Predictive-Policing and Law-Enforcement-Derived Analytics in School Settings
Certain AI uses in schools pose heightened civil rights, due process, and safety risks that procurement safeguards alone cannot mitigate. Systems that replicate predictive-policing models or rely on law-enforcement-derived data warrant clear statutory limits.
Define and Prohibit High-Risk Predictive Discipline and Law-Enforcement-Derived Systems
The increasing use of AI-driven behavioral analytics, predictive monitoring tools, and school surveillance technologies raises significant concerns for student privacy, civil rights, due process, educational equity, and student safety. Systems that predict future misconduct, generate behavioral threat scores, or rely on law-enforcement-derived data risk replicating historical patterns of bias, normalizing heightened surveillance, and increasing unnecessary disciplinary or law enforcement intervention, particularly for students of color, students with disabilities, LGBTQ+ students, and other historically marginalized groups. Because these technologies can materially influence disciplinary outcomes, states should prohibit or strictly limit AI systems within the education setting that:
- Generate forward-looking risk scores used to justify discipline, suspension, expulsion, or law enforcement referral.
- Produce behavioral threat scores, such as “aggression” or “threat to school safety” scores, absent specific, individualized evidence.
- Rely primarily on law enforcement datasets or predictive policing models.
- Use facial recognition or other biometric surveillance.
- Identify or label objects as weapons.
- Integrate student data into external law enforcement analytics systems without explicit statutory authorization.
Predictive Discipline Risk Scoring
Research has documented disparities in school discipline; for example, Black students are disproportionately disciplined compared to White students for similar behaviors. Against that backdrop, the same AI risk score may lead to very different interventions for different students, which would exacerbate existing disparities. Assigning students algorithmic “risk scores” for future misconduct therefore raises serious equity concerns. Systems that aggregate attendance records, prior disciplinary history, or behavioral indicators risk replicating documented racial disparities embedded in historical data. Even if statistically predictive, such tools may institutionalize biased baselines and normalize heightened surveillance of certain students.
Accordingly, states should prohibit AI systems that use forward-looking misconduct predictions to justify disciplinary action or automated referrals without individualized human evaluation.
Law-Enforcement-Derived Analytics
AI systems adapted from policing contexts introduce additional risks. Law enforcement datasets often reflect patterns of over-policing, and incorporating them into school decision-making can import external bias into educational settings. States should prohibit systems that integrate criminal justice databases into student risk scoring, share student behavioral data with predictive law enforcement platforms absent a specific incident, or use facial recognition tied to law enforcement watchlists in routine school operations.
Schools are educational institutions—not extensions of the criminal justice system. Clear statutory boundaries are necessary to prevent normalization of predictive surveillance in learning environments.
Allow Narrow Exceptions Only with Heightened Safeguards
There may be limited scenarios where data analytics support school safety planning or student support interventions. In such cases, use should be permitted only under strict conditions:
- AI outputs may not serve as the sole basis for adverse action.
- All outputs must be reviewed by trained personnel.
- Clear documentation of evidentiary basis must accompany any intervention.
- Systems must undergo an Algorithmic Impact Assessment prior to deployment.
- Annual disparate impact and disparate treatment analysis must be conducted.
These safeguards increase the likelihood that technology supplements, not replaces, professional judgment.
Recommendation 4. Governance, Ongoing Performance Monitoring, Public Reporting, and Incident Response
AI systems evolve and update over time, interact with changing student populations, and may degrade in accuracy or fairness after deployment. Effective governance therefore requires a lifecycle oversight model that includes continuous monitoring, transparent reporting, and structured response mechanisms. This recommendation establishes that high-risk AI systems in schools are not “set and forget” technologies. They must be evaluated regularly against performance, equity, and safety benchmarks.
Ongoing Testing and Annual Public Reporting Requirements
States should require districts using AI systems to submit annual public reports summarizing system performance and impact, using a standardized template provided by the state education department to ensure consistency and reduce burden. Continuous oversight reflects four core principles: accuracy can degrade over time; equity requires ongoing monitoring; transparency builds trust; and accountability must be enforceable through mechanisms such as sunset or reauthorization. This recommendation does not presume failure; rather, it ensures responsible innovation through measurable outcomes and structured review.
Annual reporting should include:
- Performance Metrics: Error rates (false positives/negatives), misclassification rates, trends over time, and testing methods. For example, security systems should report alerts versus confirmed threats; grading systems should report educator overrides.
- Demographic Disparities: Disaggregated data on flagging rates across high-impact use cases, AI-linked disciplinary actions, and override rates across race, disability status, English learner status, and other relevant categories. Ongoing disparity monitoring supports compliance with federal civil rights obligations.
- Human Oversight: Frequency of overrides, instances where AI influenced adverse actions, and documentation of review processes to ensure systems are not functioning as automated decision-makers.
- Complaints and Resolution: Volume and type of complaints (accuracy, bias, privacy), resolution timelines, and corrective actions taken, which together serve as an early warning mechanism.
Embedding monitoring and reauthorization into state governance ensures AI systems remain tools for student support rather than unexamined sources of risk.
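The performance and disparity items in the reporting list above could be tabulated from simple district logs. The following is a minimal sketch, assuming a hypothetical log format in which each alert records whether it was confirmed and whether an educator overrode it, and each student record carries a group label and a flag indicator. The field names and sample data are invented for illustration. Note that alert logs alone cannot reveal false negatives, which require a separate record of incidents the system missed.

```python
from collections import defaultdict


def alert_metrics(alerts):
    """Summarize alerts: list of dicts with boolean 'confirmed' and 'overridden'."""
    total = len(alerts)
    confirmed = sum(1 for a in alerts if a["confirmed"])
    overridden = sum(1 for a in alerts if a["overridden"])
    return {
        "alerts": total,
        "confirmed": confirmed,
        # Share of alerts that were not confirmed (false positives among alerts).
        "unconfirmed_share": (total - confirmed) / total if total else 0.0,
        "override_rate": overridden / total if total else 0.0,
    }


def flag_rates_by_group(students):
    """Flag rate per group: list of dicts with 'group' and boolean 'flagged'."""
    counts = defaultdict(lambda: [0, 0])  # group -> [flagged, total]
    for s in students:
        counts[s["group"]][0] += int(s["flagged"])
        counts[s["group"]][1] += 1
    return {g: flagged / total for g, (flagged, total) in counts.items()}


# Hypothetical sample data.
alerts = [
    {"confirmed": True, "overridden": False},
    {"confirmed": False, "overridden": True},
    {"confirmed": False, "overridden": True},
    {"confirmed": False, "overridden": False},
]
students = (
    [{"group": "A", "flagged": f} for f in (True, True, False, False)]
    + [{"group": "B", "flagged": f} for f in (True, False, False, False)]
)
print(alert_metrics(alerts))
print(flag_rates_by_group(students))
```

Comparing flag rates across groups is only a first screen. A gap between groups signals that further review is needed; it does not by itself establish a civil rights violation, and any numerical threshold for concern would be a policy choice for the state.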
Rapid Incident Response Protocol
In addition to annual reporting, states should require districts to adopt a rapid incident response protocol for significant AI-related harms, including major student data breaches, unsafe outputs such as erroneous security alerts that trigger law enforcement involvement, systemic bias identified through internal review or complaint, and widespread false positives and negatives affecting multiple students. The protocol should recommend immediate containment—including suspension of system use where necessary—prompt notification to affected families and the state education department within a defined timeframe (such as 48–72 hours), a documented root cause analysis conducted in coordination with the vendor, and a corrective action plan with clear mitigation steps, accompanied by a public summary.
Districts should not hesitate to pause or suspend deployment when student safety, civil rights, or liability risks are implicated. In high-stakes environments, it is prudent to err on the side of intervention rather than adopt a “wait and see” approach. This framework aligns with established cybersecurity incident response standards and minimizes the risk of prolonged or compounded harm.
Sunset and Reauthorization Requirement
To prevent long-term entrenchment of ineffective or harmful systems, states should require periodic reauthorization of AI tools. Authorization should automatically sunset after three years unless renewed based on demonstrated accuracy, absence of unexplained demographic disparities, documented educational benefit, and compliance with reporting and audit requirements. This approach creates accountability without imposing permanent bans and incentivizes continuous system improvement.
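The sunset rule can be expressed as a short check that a state registry might run. This is a sketch under stated assumptions: the three-year period and the four renewal criteria come from the paragraph above, while the function names, dictionary keys, and the handling of a February 29 authorization date are hypothetical choices made for illustration.

```python
from datetime import date

SUNSET_YEARS = 3  # three-year sunset period described in the text


def sunset_date(authorized_on):
    """Date on which authorization lapses unless renewed."""
    target_year = authorized_on.year + SUNSET_YEARS
    try:
        return authorized_on.replace(year=target_year)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap year; fall back to Feb 28.
        return authorized_on.replace(year=target_year, day=28)


def is_authorized(authorized_on, today):
    """True while the current authorization has not yet reached its sunset."""
    return today < sunset_date(authorized_on)


def may_renew(review):
    """Renewal requires every criterion; a missing entry counts as unmet."""
    criteria = (
        "demonstrated_accuracy",
        "no_unexplained_disparities",
        "documented_educational_benefit",
        "reporting_and_audit_compliance",
    )
    return all(review.get(c, False) for c in criteria)


print(sunset_date(date(2025, 8, 1)))
print(may_renew({"demonstrated_accuracy": True}))
```

Treating a missing criterion as unmet reflects the idea that authorization lapses by default and that the burden of demonstrating continued fitness rests with the district and vendor.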
Recommendation 5. Create a State-Level Technical Assistance and Vendor Accountability Program
Procurement guardrails, AIAs, and monitoring will only succeed if districts have the capacity to implement them. Many districts, especially rural ones, lack high-speed internet and expertise in AI evaluation, data governance, and contract negotiation, and they receive less funding than their urban peers. A state-level technical assistance and vendor accountability program, buttressed with support from universities and independent evaluators, can close this gap and shape stronger market standards.
Without practical support, reforms risk becoming procedural rather than protective. Modest centralized investment can reduce duplication, strengthen negotiating leverage, reinforce civil rights and privacy compliance, and promote responsible innovation at far lower cost than reactive remediation after harm occurs.
Statewide Technical Assistance and Training
States should establish targeted training for procurement staff, technology leaders, and administrators overseeing AI adoption. Training should cover risk classification, completion of AIAs, evaluation of vendor claims and accuracy metrics, disparate impact and treatment analysis, contract negotiation best practices, and incident response obligations. Delivery can leverage existing professional development structures such as webinars, regional workshops, online modules, and standardized toolkits. To limit costs, states can partner with public universities, education service agencies, or nonprofit research centers with expertise in education technology and civil rights compliance.
Optional Statewide “Approved Vendor” Pathway
States may establish an optional pre-vetted or “approved vendor” pathway for AI systems, structured as conditional certification tied to transparency and compliance standards. Under this model, vendors voluntarily submit documentation demonstrating compliance with state disclosure, testing, and contract requirements. The state conducts a structured review. If approved, vendors are listed in a public registry. Districts may still procure from other vendors but must complete a full independent review.
States should also consider pairing any approved-vendor pathway with targeted compliance assistance, particularly for smaller or emerging vendors that may lack dedicated legal or regulatory staff but offer promising educational tools. This could include technical workshops, model disclosure templates, and guidance on meeting state testing, documentation, privacy, and contract expectations. Providing this support helps ensure that approval pathways do not inadvertently favor only large incumbent ed-tech companies with extensive compliance infrastructure, while still preserving rigorous standards for safety, transparency, and civil rights protections.
This approach reduces duplicative review, strengthens bargaining leverage through uniform standards, and incentivizes vendors to compete on transparency and validated performance rather than marketing claims. Approval should remain conditional, subject to ongoing compliance monitoring and revocation if standards are not met.
Independent Evaluation and Privacy-Preserving Audit Options
To strengthen accountability, states should mandate independent evaluation of high-risk AI systems through methods that balance transparency with student privacy and vendor intellectual property protections. Options include secure data enclaves, aggregated performance reviews under confidentiality agreements, de-identified or differential privacy testing environments, and partnerships with public universities for validation studies. Independent evaluation helps test vendor claims, detect disparate impact and disparate treatment, inform evidence-based policymaking, and build public trust. Where feasible, states may offer grants to support validation of widely used systems.
Ensuring Effective Transmission of State Guidance to Districts
State guidance does not always translate cleanly into local practice. Differences in district capacity, staffing, and procurement autonomy can result in uneven compliance. To improve implementation, states should integrate AI oversight into existing compliance or accreditation cycles, provide standardized templates and model contract language, designate a clear AI governance lead within the state education department, and phase implementation beginning with high-risk systems. Experience from data privacy, IDEA, and Title IX compliance shows that clear documentation and centralized technical assistance significantly improve consistency between state policy and district practice.
Recommendation 6. Leadership-Level Capacity Building for Superintendents and Senior District Officials
Procurement reform is insufficient without leadership capacity. Superintendents and senior administrators often make AI adoption decisions based on vendor presentations or innovation pressures, yet may lack training in algorithmic risk, civil rights implications, and technology contract governance. States should establish targeted AI governance training for superintendents, cabinet leaders, CTOs, chief academic officers, and school board members.
Core Training Components
Leadership-level training should cover core legal and governance competencies, including civil rights risks under Title VI, IDEA, Section 504, FERPA, and due process standards. Training should include strategic procurement literacy such as avoiding vendor lock-in and understanding critical audit and data provisions. Training should distinguish supportive from punitive AI use cases and build skills to recognize overpromising in vendor marketing. Lastly, training must contain sections on crisis preparedness, including how to respond to discriminatory or unsafe outcomes and communicate transparently with families and the public.
Delivery Mechanisms
States can integrate this leadership capacity-building into existing professional development structures, such as annual superintendent conferences, certification renewals, school board association trainings, and regional education service agency programs. Leveraging established forums avoids creating new bureaucratic layers while using trusted professional networks to promote consistent implementation.
Why Leadership Capacity Matters
State guidance does not always translate cleanly into district practice. Implementation is shaped by resource disparities, competing priorities, leadership turnover, and vendor influence. When superintendents understand the governance framework and its rationale, they are more likely to demand compliance with procurement guardrails, resist premature adoption, dedicate staff time to meaningful review, and support transparency and public reporting. Effective leadership now also requires understanding how to manage AI-enabled systems in practice, including how automated outputs interact with existing administrative processes, where human judgment must remain central, and how risks may emerge over time after deployment. Because AI governance is inherently cross-functional, district leaders must be prepared to coordinate legal, procurement, technical, and ethical considerations rather than treat AI as a purely technical issue delegated to IT staff alone. The goal is not to turn superintendents into technical specialists, but to ensure they can exercise informed oversight over AI-enabled decision environments. Without leadership buy-in, even well-designed safeguards risk remaining underutilized.
Conclusion
States must ensure AI integration supports student safety and well-being.
This is possible by adopting procurement guardrails, requiring Algorithmic Impact Assessments, limiting high-risk predictive uses, and mandating ongoing oversight. This framework relies on existing state authority over procurement, contracting, and oversight. By leveraging these tools, policymakers can focus on responsible implementation. Modest investments in technical assistance and leadership capacity can create clear, workable standards for districts and predictable expectations for vendors.
Done well, this approach delivers safer technology deployment. Specifically, it lowers discrimination risk and builds stronger data governance, which in turn supports better stewardship of public funds and greater public trust. AI in schools should expand opportunity, not erode it. Acting now, while adoption norms are still forming, allows education leaders to ensure innovation and student rights advance together.
This memo does not comprehensively address several important and distinct issues that warrant separate, dedicated analysis. For example, while this memo references students with disabilities, it does not fully examine the unique legal, educational, accessibility, and civil rights implications AI systems may pose for students protected under the IDEA and Section 504. Similarly, this memo does not comprehensively address the particular risks and considerations affecting English language learners, immigrant students, or multilingual families, each of which raises important questions related to language access, equity, and data governance. The memo does not broadly explore AI literacy or training needs for teachers, students, parents, or school administrators, despite the growing importance of ensuring school communities understand how these technologies function and affect educational environments. Given the confines of this memo, the analysis is intentionally focused on procurement guardrails and governance mechanisms for high-risk AI systems in K–12 education. Nevertheless, the issues not addressed here remain critically important and warrant substantial future analysis and policymaker attention.
How to Safely Bring AI into Law Enforcement: The Case of AI-Generated Police Reports
Commercial artificial intelligence (AI) tools that can produce police reports have recently emerged. Some police departments have already adopted this technology, and some individual officers are using publicly available AI tools. If AI could greatly reduce the time spent producing police reports, this could either substantially reduce the cost of policing or free up police officers for other work. However, if the resulting reports are inaccurate, incomplete, or biased, or if the process leaks confidential information, this could undermine the criminal justice system and harm citizens, perhaps causing an innocent person to be charged with a crime while the actual criminal is overlooked. At this time, both the benefits and the risks are poorly understood.
Yet, despite the uncertainty, each of the more than 18,000 law enforcement agencies in the U.S. must make its own decision about the use of AI. Most of these agencies do not have the expertise or resources to assess whether any of the AI-based products on the market are right for them, and if so, what training, departmental policies, and deployment strategies are needed to use the technology both safely and effectively.
This memo proposes fostering innovation in AI for policing without sacrificing safety through a combination of centralized actions by the U.S. Department of Justice and independent actions by state and local law enforcement agencies. The Department of Justice, through its National Institute of Justice, should establish a new research and evaluation program that will give state and local government agencies the information they need to make the best decisions about use of AI for police reports given their own needs and resources, and keep Congress and the Department of Justice abreast of AI use in policing nationwide as well. Each state and local agency should use this information to devise its own strategy, addressing issues such as whether to adopt AI, officer training, technology choice, budget, transparency, and other policies and procedures to use the technology where it is safe and effective.
While this memo focuses on use of AI for police reports, the recommended solution serves as a model for other AI use cases as well. Similar problems occur every time a large number of local government agencies are contemplating the use of AI in scenarios where the pros and cons are poorly understood, and there is potential for significant harm.
Challenge and Opportunity
Why Police Departments are Considering AI for Police Reports
Police reports are a cornerstone of law enforcement. These reports serve as the official record and generally the only written record of significant interactions between police officers and individuals, including arrests, crimes reported, and car crashes observed. The contents of police reports can influence important decisions, such as whether an individual is charged with a crime. When police officers testify in court about an incident that occurred months or years earlier, they typically rely on the police reports that they wrote soon after the incident to get the details right. When insurance companies want to assess liability, their decisions often depend on police reports. When police officers are accused of misconduct, investigators study the relevant police reports. When compiling crime statistics on which policy decisions will be made, critical data comes from police reports. It is therefore important for police reports to be accurate, complete, and unbiased.
Given their importance, it is no surprise that many police officers spend hours per day producing these reports. This comes at a cost. If the time spent on police reports could be reduced, then police departments could reduce the number of officers employed and thereby greatly reduce expenses, or reallocate officer time to other productive tasks, or some combination of the two. Many police departments in the U.S. are especially motivated now to free up their officers’ time, because there is a national shortage of qualified officers, and many departments have unfilled positions.
A number of companies have announced products that integrate AI into the writing of police reports. Some vendors such as Truleo and Axon have claimed that AI assistance can reduce the total time spent on police reports by 80% to 90%, which would yield tremendous cost savings if true. In response to such promises, some police departments have already adopted this technology. Given financial and staffing pressures, more departments are likely to follow.
But are the cost savings real? Are the reports produced when using AI reliable enough for their intended purpose? And what strategies for adoption will maximize both cost savings and report quality? Most police departments do not have the AI expertise on staff to answer those questions. Indeed, roughly three-fourths of law enforcement agencies in the U.S. have fewer than 25 police officers, and thus very few IT professionals.
How AI Would Be Used
The general idea is that information about the incident is fed into an AI-based system, which produces a draft report of what a particular police officer did and observed; that officer must then review the draft. The details vary from one AI-based product to another. In some cases, police officers feed this information into the system by typing relevant facts on a computer. In others, officers participate in an interactive oral interview with the system. In the most ambitious approach, recordings from a body-worn camera are uploaded to the AI system, with no direct involvement from the officer. Such systems transcribe the audio and use the resulting text; some analyze video as well. In all of these cases, once the AI-based system produces an initial draft, the officer inspects the draft, makes any changes he or she wishes, and signs off on the result.
The Risks of Using AI for Police Reports are Poorly Understood
AI-based products for police reports use generative AI, where an AI system is trained from a set of prior examples to understand which words and phrases are frequently used together. The system can then generate entirely new text for new circumstances by using the relationships observed in its training in combination with some new input data and some elements that are entirely random to avoid repetition and unnaturally formulaic text. Regardless of the domain, producing text using generative AI can be problematic.
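The mechanism described above can be illustrated with a deliberately tiny toy model. The sketch below learns only which word follows which in three invented example sentences and then generates text by picking a follower at random. It is an assumption-laden simplification: commercial products rely on large neural networks rather than word-pair tables, and the function names and example sentences here are hypothetical.

```python
import random
from collections import defaultdict


def train_bigrams(sentences):
    """Record, for each word, the words that followed it in the examples."""
    followers = defaultdict(list)
    for sentence in sentences:
        words = sentence.split()
        for current, nxt in zip(words, words[1:]):
            followers[current].append(nxt)
    return followers


def generate(followers, start, max_words, rng):
    """Build text by repeatedly sampling one observed follower at random."""
    words = [start]
    while len(words) < max_words and followers.get(words[-1]):
        words.append(rng.choice(followers[words[-1]]))
    return " ".join(words)


# Invented training examples.
examples = [
    "the driver refused the test",
    "the driver accepted the test",
    "the passenger refused transport",
]
model = train_bigrams(examples)

# Different seeds stand in for the random element; outputs may differ.
for seed in range(3):
    print(generate(model, "the", 5, random.Random(seed)))
```

Because each word is chosen only from what followed the previous word somewhere in the examples, the toy model can assemble sentences that appear in none of them, such as one in which the passenger rather than the driver refuses the test. Every step is consistent with the training data, yet the statement as a whole may be false for the incident at hand.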
First, generative AI can randomly produce “hallucinations,” i.e., information that is roughly consistent with the training data but incorrect in the current circumstance.
Second, when an AI model is trained on biased data, it tends to produce biased results. For example, if reckless driving citations in the training data are more likely to involve alcohol with young drivers than with old drivers, then hallucinations involving alcohol may be more likely with young drivers. Companies are rarely transparent about their training data sources, but some sources from law enforcement could easily be biased with respect to factors such as race, age, and gender.
Third, some generative AI models leak information in unexpected and often unseen ways. For example, if the system uses new inputs from users to improve (or “train”) the model, then a new input may later be revealed to other users. This happens with the widely used generative AI services that are offered for free to the public, and some officers already use those free tools. Even if new inputs are not used in this way, those new inputs could be transferred to a provider of AI-based services with weak defenses. If a police department allows its officers to use a system with inadequate protections, this would risk citizens’ privacy and possibly compromise future court cases. It is technically possible to design systems with better protection against leakage, but police departments typically have no way to tell which services have done so effectively. Given all of these risks, it is no surprise that some localities have sought to prohibit use of AI for police reports.
Of the various methods of putting information into the system described above, using recordings from body-worn cameras could save the most officer time, but it also brings additional risks that must be assessed. For example, when an officer in Utah uploaded the recording of an incident that occurred while a movie was playing in the background, the AI reportedly produced a police report claiming that the officer transformed into a frog. An error like that does no harm because it is easy to detect, but a different movie might have produced a far more dangerous error. Also, audio transcription is less reliable when people speak with accents or in African American Vernacular English. Using AI to accurately turn video into text can be even more challenging. Finally, with this approach there is no opportunity to record an officer’s subjective experience before the officer is influenced by AI-generated text, which some people have argued is important. Testing is required to understand the seriousness of these potential risks and to evaluate any mitigation strategies.
In 2025, I organized a research project at Carnegie Mellon University (CMU) to investigate use of generative AI for police reports. We produced police reports using three different kinds of generative AI technology, and observed that material inaccuracies do occur. For example, in one assault case, an input to the AI indicated that the victim was not transported to a medical facility without providing a reason, but the resulting report inaccurately claimed that the victim refused transport to a medical facility. We also observed that error rates varied from one AI product to another, as well as from one type of police report to another, perhaps because some types of reports are more complex than others. Thus, it matters which AI technology a police department chooses and under what circumstances it directs its officers to use that technology.
As long as AI is only used to produce the first draft of a report, problematic text does not compromise report quality if the police officer finds this text and rewrites it before submitting the final report. That may or may not be sufficient. As explained by MIT professor David Autor and Alphabet Senior Vice President James Manyika, AI systems that augment humans without replacing them can fail if the AI is not designed to collaborate with humans, such as when human pilots could not prevent an Air France flight from crashing after the autopilot failed because the tool gave the pilots limited situational awareness. Less obviously, the converse is also true: problems can occur if humans are not explicitly trained to collaborate with AI.
We also conducted experiments in which experienced police officers were asked to correct prewritten police reports that contained hallucinations, omissions, and “event swaps,” in which events appear in the wrong chronological order. We observed that officers missed many problems, including those that might matter in legal proceedings, such as when a report incorrectly indicated that a suspect was holding a knife when encountered. It is important to note that this occurred in a university research exercise rather than a professional setting, and that the officers had never been explicitly trained to edit AI-generated text, i.e., to collaborate with AI. Better results might be possible in real police departments that have adopted the right kind of training, but this requires more investigation.
Even an error that is not directly material to the case can do harm. A memo from the King County Prosecuting Attorney’s Office reports that, thanks to AI, “an otherwise excellent report included a reference to an officer who was not even at the scene. … And when an officer on the stand alleges that their report is accurate — they will be proven wrong…we do not want your officers certifying false police reports. The consequences will be devastating for the case, the community and the officer.” Defense attorneys can bring up this error every time that officer testifies for many years to come.
The Benefits of Using AI for Police Reports are Poorly Understood
On the positive side, many departments would save money if AI reduced the amount of time that each officer spends on police reports by just tens of minutes per week. This reduction could be within reach. One prominent survey found that 62% of officers spend more than two hours per day on police reports and 14% spend more than four, and there have been news articles quoting police officers who said that time savings from AI were substantial, although this is anecdotal. Yet the most rigorous study to date did not find any reduction in time spent when AI was introduced. This issue also deserves more investigation. Moreover, the impact of AI on time spent and police budgets will vary greatly between departments, so a single one-size-fits-all conclusion is inadequate. Savings depend on factors like the number of police incidents per week, the types of incidents that are most common, and how pervasive technology already is in the department.
The benefits and risks associated with AI also depend on the deployment strategy. For example, police departments may choose to use AI in cases where time savings are great and risks are low, or when time savings are insignificant and risks are high. Departments may choose to use AI in a transparent manner in which problems are easily observed and quickly corrected, or in an opaque manner. Research could provide guidance to police departments on whether and how to adopt this technology while minimizing risks.
Unfortunately, this research will rarely occur under current policies. Individual police departments are unlikely to invest their limited resources into testing commercial AI software products, developing new officer training programs, measuring whether AI saves time or money, or collecting best practices for adoption. If the federal government fails to act, some states or cities may fund useful work. However, even the state and local agencies with the largest budgets, such as the New York City Police Department and the California Highway Patrol, have little incentive to bear the full cost of making new discoveries and then informing the nation’s 18 thousand law enforcement agencies, most of which are small and have needs and resources that are quite different. There are university researchers doing this kind of work, but very few, and most police do not read academic journals. Informed decisions will only happen if the federal government takes action.
Plan of Action
Most of the actual decisions about whether police should use AI technologies at all, which specific AI technologies to acquire, and how those AI technologies should be used will be made by local officials. The specific decision-maker varies from locality to locality. For most of these decisions, police chiefs are critical. They can weigh in directly on issues such as officer training and department policies governing technology use, or can delegate that role. In some jurisdictions, police departments make independent decisions about procuring technology such as AI, whereas in others municipal Chief Information Officers may play a more decisive role. It should be the responsibility of the federal government to inform these decisions, regardless of which state or local official has the final say in any locality. Thus, this memo will make actionable recommendations to two audiences: the federal Department of Justice, and those who make decisions for state and local law enforcement agencies.
Recommendation 1. The Department of Justice, through the National Institute of Justice (NIJ) and in consultation with the National Institute of Standards and Technology (NIST), should create ongoing projects whose goal is to provide information to state and local agencies that helps these agencies make better decisions regarding use of generative AI for police reports.
The introduction of AI for police reports raises technical and operational questions that individual law enforcement agencies are poorly positioned to answer on their own. Addressing these questions falls within the mission of the National Institute of Justice (NIJ), the Department of Justice’s research and evaluation arm. NIJ is well positioned to generate and disseminate this evidence at a national scale, reducing duplication across thousands of agencies and enabling more consistent, evidence-based adoption decisions.
The NIJ should draw on expertise from multiple institutions to address these important questions. Universities should play a central role, because the best academic researchers are accustomed to inventing entirely new methods that address novel challenges and emerging technologies. NIJ should therefore establish a funding program to support external research. Other researchers already work for NIJ, where understanding of the problem domain is deep, so important work can also be done internally. There are also experienced AI researchers at NIST’s Center for AI Standards and Innovation. Although they typically lack law enforcement expertise, consultation with that center could help. Below are some examples of research that is needed.
Research on Evaluation Methodology for AI Products and Services
A new methodology must be created that can assess AI-based products and services for police reports, and quantitatively determine their ability to produce reports that are both accurate and complete under a wide variety of scenarios. This methodology should also assess the risk of leaking confidential information.
Research on How to Train Police to Edit AI-Generated Reports
Even when reports are generated by AI, it is the responsibility of a police officer to ensure quality through editing. Simply having a human involved does not mean that the report will be anywhere near as accurate or complete as if a human wrote it. Detecting and correcting subtle mistakes in text that someone else wrote is challenging, and few police officers have experience with the task. Extensive training may prove critical. For example, officers might first learn enough about how AI-based tools work to dispel any illusions that they are infallible. Then officers might learn the types of mistakes that AI tends to make, which are different from the types of mistakes that humans tend to make. Research is needed to develop training strategies, and determine their effectiveness.
Research on Benefits and Costs of AI
The primary motivation for adopting AI is to save time and money. Do AI tools really reduce the time spent on police reports, and if so, by how much? What are the lifecycle costs, including software, storage, IT support, and officer training? How do expected cost savings depend on factors that vary by police department, such as number of officers, the types of police report that are most common in the department, and existing IT infrastructure? How do they depend on technology choices, such as whether officers feed the AI by typing in information, participating in an audio interview, or uploading recordings from a body-worn camera?
Research on How Departments Can Perform Quality Control
Any organization that introduces a technology with unknown impact should have a way of measuring quality in context on an ongoing basis, and not just before deployment. How does a police department know if the reports generated with AI assistance are good enough, or if its officers are well-trained? One possibility might be to routinely assess the completed reports, such as by comparing AI-generated reports with video footage in a monthly audit as the Boulder Police Department tried or with officer-written reports as the Oklahoma City Police Department tried. Doing this as efficiently and effectively as possible may require a new method. Another might be to artificially inject errors of the kind that AI is likely to produce, and monitor whether injected errors are corrected. (One existing product from Axon already injects errors. Effectiveness may be limited because the injected errors are unlike those that AI is likely to produce, but this requires testing.) If a few officers consistently submit reports with injected errors or other problems, this may indicate that those officers need further training. If many officers consistently do so, then this may indicate a more systemic problem.
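The injected-error check described above can be made concrete with a small amount of bookkeeping. The following is a minimal sketch in Python, not a description of any existing product. It assumes a hypothetical audit log with one record per injected error, noting which officer edited the report and whether the error was corrected. The two thresholds are illustrative placeholders that a department would need to set from its own data.

```python
from collections import defaultdict

# Hypothetical thresholds chosen only for illustration; not drawn from any study.
MISS_RATE_THRESHOLD = 0.6   # an officer is flagged at or above this missed share
SYSTEMIC_SHARE = 0.5        # share of flagged officers that suggests a systemic problem


def missed_injection_rates(audit_records):
    """Return each officer's share of injected errors left uncorrected.

    audit_records: iterable of (officer_id, was_corrected) pairs, one per
    injected error. This record layout is an assumption of this sketch.
    """
    totals = defaultdict(int)
    missed = defaultdict(int)
    for officer_id, was_corrected in audit_records:
        totals[officer_id] += 1
        if not was_corrected:
            missed[officer_id] += 1
    return {officer: missed[officer] / totals[officer] for officer in totals}


def triage(rates, miss_threshold=MISS_RATE_THRESHOLD, systemic_share=SYSTEMIC_SHARE):
    """Separate an individual training need from a department-wide pattern."""
    flagged = sorted(o for o, rate in rates.items() if rate >= miss_threshold)
    if not flagged:
        return flagged, "no_action"
    share_flagged = len(flagged) / len(rates)
    if share_flagged >= systemic_share:
        return flagged, "systemic_review"
    return flagged, "individual_training"


# Made-up example records: (officer, whether the injected error was corrected).
records = [
    ("officer_a", True), ("officer_a", True),
    ("officer_b", False), ("officer_b", False),
    ("officer_c", True), ("officer_c", False),
]
rates = missed_injection_rates(records)
flagged, finding = triage(rates)
print(flagged, finding)
```

In this made-up example only one of three officers crosses the miss-rate threshold, so the sketch points to individual training rather than a systemic review. A real audit would also need enough injected errors per officer for the rates to be meaningful, which is one of the questions the proposed research could address.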
Other types of research and analysis are perennial and therefore should generally be led by staff within NIJ, although outside researchers could play a smaller role. Outside researchers tend to be less effective when success requires the trust of law enforcement agencies, or when being consistently accurate is more important than inventing something new. Examples include:
- Assessment of products and services on the market today: Once the research described above produces a methodology to evaluate the quality of an AI product or service, that method should be applied to each product. DoJ staff should use that methodology to assess every new product or major update of an existing product that comes on the market, and the results should be made available to law enforcement agencies and municipalities across the country. This is comparable to the NIST program which tests facial recognition products.
- Identifying best practices and tracking use: With thousands of law enforcement agencies making independent decisions about the use of AI, it is inevitable that some will adopt better strategies than others. One ongoing mission of this program should be to collect information about what law enforcement agencies are doing with AI and its impact, both positive and negative. From this, program staff should produce a set of best practices which can be widely disseminated, and continually revise these best practices over time.
All results and recommendations from this program should be made available directly to all of the 18 thousand law enforcement agencies in the U.S. The program should disseminate results to organizations that train police officers, including future police chiefs. This includes the FBI National Academy and state organizations like the California Commission on Peace Officer Standards and Training. It should also disseminate results through national organizations that serve state and local decision-makers, such as the National Association of Chiefs of Police, the Association of Public-Safety Communications Officials International, the U.S. Conference of Mayors, and the National Association of State Chief Information Officers.
The program should also provide annual summaries of use of AI for police reports in the U.S. to Congress, the Department of Justice, and the general public, so it is possible to track trends over time and detect potential concerns before they become problematic.
Recommendation 2. Any state or local law enforcement agency that is seriously considering adoption of AI for police reports should first produce a strategic plan using information provided by NIJ, knowledge of local needs and resources, and other available information.
Without an appropriate strategy in place, the use of AI for police reports is likely to produce reports that fail to meet the needs of the criminal justice system, potentially putting innocent people at risk, and wasting taxpayer money. An effective strategic plan can mitigate these risks. This plan should address the following.
- Choice of AI technology: Police departments must choose products and services carefully, and incorporate DoJ recommendations from the research program above into existing procurement processes. For example, they should not procure an AI system in which the data that is entered can be used to train the model, as this can make the data accessible and thus undermine privacy, or a system in which the risks of hallucination, omission or bias are deemed to be high. The National Association of State Procurement Officials should also include these recommendations in its guides. Given that individual officers can and already do use AI even when their department has not adopted it, departments should also explicitly prohibit officers from using publicly available AI systems.
- Phased deployment: When rolling out any technology that carries risk, it is helpful for an organization to deploy in phases, such that each phase expands the extent of use. After each phase, the organization must carefully assess whether deployment was successful before deciding whether to advance. Whenever problems are observed in these assessments, those problems must be addressed before proceeding. This has proven to be an important approach when adopting AI. In this case, that probably means initially using AI only for reports for which the consequence of errors is lower and/or the risk of inaccuracy is lower, e.g. traffic incidents rather than felony arrests. It also may mean initially limiting AI use to a small group of police officers who are already comfortable with the technology rather than scaling quickly to all officers.
- Transparency and oversight: The use of AI should be sufficiently transparent to stakeholders, the relevant legislative bodies (e.g. city councils, state legislatures), civilian oversight boards, and the community. For example, the policies and procedures about use of AI discussed above should be stated in advance and publicly accessible, as should plans for large-scale procurements. This is important both as a means of quickly detecting and correcting any problems that may emerge before they become serious, and for fostering community buy-in by keeping police use of AI consistent with public expectations and goals. In addition, when a police officer uses AI-based tools to produce a report, there should be an indication on that report regarding what role AI played, something that some police departments have deliberately concealed. There should also be an indication of where the information that was fed into the AI system came from, e.g. whether that input includes recordings from body-worn cameras, audio from interactions with dispatchers, etc. This allows all stakeholders, including defense and prosecuting attorneys, to apply an appropriate form of scrutiny to the resulting police reports.
- Policies and procedures: Disasters are possible even with good technology. Police departments should establish effective policies and procedures before using AI for actual reports, also considering DoJ recommendations. This includes creating an effective training program for officers, where effective training goes well beyond just knowing the features of the software and addresses effective quality control. It also includes determining whether officers are allowed to see AI-generated content before they have written their own observations, given the “elasticity of human memory.” Departments may adapt the process by which a police report is reviewed by people other than the author, e.g. by supervisors or experts. Departments should also define policies regarding which intermediate datasets used by AI software to produce police reports are retained, and who can access them. For example, if audio recordings are transcribed and summarized before generating reports, how long should the original recordings be retained?
Conclusion
In recent years, the capabilities of generative AI have advanced at an astonishing rate, leaving our understanding of how to make use of those capabilities far behind. This is particularly challenging for those who would like to use the potentially transformative capabilities of generative AI for producing police reports, and for other AI applications that share two qualities. First, there are dire consequences if use of the technology goes badly, such as the possibility that a flawed police report could lead authorities to charge the wrong person with a crime. Second, most of the decisions with significant impact are made by 18 thousand independent local government agencies with different needs and limited resources and AI expertise. It is hard to imagine how all of these agencies could make informed decisions regarding use of an emerging technology that is still poorly understood by tech-savvy institutions.
Some agencies will avoid the risk by never even considering AI for a purpose like this. However, they forgo any possibility of reaping potential benefits, such as a significant reduction in costs, or a reallocation of police time from paperwork to other productive activities. Other agencies will adopt AI, but in a way that does more harm than good, perhaps because they chose the wrong product or because they used it poorly. This paper proposes a two-pronged strategy that will give state and local decision-makers both the information they need to make good decisions, and the confidence that their decisions are right for their respective agencies.
The U.S. Department of Justice, through its National Institute of Justice, should establish a set of programs that all have the goal of providing actionable information to law enforcement agencies about use of AI for police reports. This includes the pros and cons of adopting the technology and how both vary from agency to agency, the strengths and weaknesses of AI products on the market, how to train officers in use of AI for police reports, how to perform continual quality control, and other best practices.
Each state or local law enforcement agency that is considering AI for police reports should produce a strategic plan that makes use of information provided by NIJ. Topics in the strategic plan would likely include the types of AI that should and should not be used, a phased approach to adoption, a transparency strategy that makes it easier to identify issues before they become highly problematic, and other policies and procedures.
My thanks to my CMU colleagues who worked on a 2025 research project on AI and police reports: Dr. Aleecia McDonald, Dylan Bonanno, Kai Collins, Ayana Curto, Katie Eisenman, Madeline Falk, Jane Fleischman, Harrison Green, En Hung, Wendy Jiang, Lily Klucinec, Isabella Krisky, Skylar Lukic, Tzen-Chuen Ng, Nicholas Ortiz, Miguel Rivera-Lanas, Christopher Rodas Ochoa, Keya Sharma, Autumn Swartz, Morgan van der Linde, Maximilian Vieweg, Sophie Vincens, Kemp Winkler, Avi Wong.
Yes. General-purpose generative AI tools have been available to the public for several years, including OpenAI’s ChatGPT, Google’s Gemini, Anthropic’s Claude, and Microsoft’s Copilot. Police departments did not officially embrace these tools, but individual officers have. For example, it was discovered that an ICE agent used ChatGPT to produce reports, which led the judge to respond that this “may explain the inaccuracy of these reports.” Such use is inevitable unless police departments adopt policies that prohibit use of these tools and actively inform officers about those policies.
Since then, companies have built tools intended for law enforcement by adopting a general-purpose AI-based tool, and adding features specific to police reports, such as additional training data and police-friendly interfaces. Relevant companies include Axon, Caseify, Central Square, Code Four, Police1, PoliceNarratives.ai, Policereports.ai and Truleo.
Building on general-purpose models gives companies the opportunity to outperform general-purpose models, perhaps by improving accuracy or reducing risk of information leakage. However, since the technical details underlying commercial products developed for law enforcement are typically opaque and proprietary, many potential buyers cannot know whether improvements are present. Evaluation by a trusted organization could address this problem, by testing the product directly and demanding technical details about product design.
The greatest risk is that AI tools will produce police reports with flaws that are not corrected in editing. Generative AI is inherently vulnerable to hallucinations that produce inaccurate information. AI tools can also omit critical facts, or put events in the wrong chronological order. AI can produce biased text, i.e. text may depend on characteristics of individuals in the report such as race, gender or age when those characteristics should be irrelevant. When an AI system is trained on biased data, the system is likely to perpetuate those biases.
Inaccuracies, omissions, event swaps and biased text can all be material in important decisions. Seemingly minor inaccuracies or omissions can have serious consequences, such as making an innocent bystander look deceptively guilty, or making it appear that police did not comply with applicable laws when they did. Inaccuracies can undermine legal proceedings. Even errors that are not material to the case can become problematic if a police officer later testifies that the police report is entirely correct, as this could put the officer’s entire testimony and reputation in doubt. Research is needed to understand these risks.
Yes, these recommendations are intended both for use of AI to produce police reports, and as a model for advancing safe, impactful and innovative adoption of AI and other technologies in similar cases. The goal is to adopt AI where (and only where) it brings improvements. The issues are similar whenever the following characteristics are present.
First, the technology being considered offers significant potential for benefits and significant risks for harm, so that “move fast and break things” is not the best approach. Adoption can be accelerated by addressing the concerns of potential adopters and building confidence.
Second, much is not known about how to use the technology safely, perhaps because the technology is as new as generative AI. Thus, someone should produce and disseminate information that will enable good informed decisions.
Third, local government agencies are the primary decision-makers. Unlike federal agencies and large companies, local governments have limited resources to investigate new technologies. Most for-profit companies that would advise them simply want to make a sale.
When these three characteristics are present, the federal government can provide critical information to decision-makers. Also, local governments can benefit from phased deployments with assessments after every phase, and transparency provisions.
The Trump Administration’s executive orders do not address AI for police reports specifically, but they seek ways to advance AI innovation and adoption using a strategy that is consistent with the recommendations in this memo.
President Trump issued an executive order calling for an AI action plan. America’s AI Action Plan has three pillars, the first of which is innovation. According to the Plan, “the United States needs to innovate faster and more comprehensively than our competitors in the development and distribution of new AI technology across every field, and dismantle unnecessary regulatory barriers that hinder the private sector in doing so.” Consistent with America’s AI Action Plan, this memo recommends creation of federal programs that foster innovation wherever that innovation benefits society without imposing barriers on state and local governments.
America’s AI Action Plan explicitly recommends evaluation, stating that “rigorous evaluations can be a critical tool in defining and measuring AI reliability and performance in regulated industries,” and directing the federal government to “support the development of the science of measuring and evaluating AI models, led by NIST at DOC, DOE, NSF, and other Federal science agencies.” This clearly includes NIJ assessments of AI for police reports.
Congress has passed no laws that specifically address use of AI for police reports, but two states have: Utah and California. These laws are consistent with this memo’s recommendations.
Under Utah’s Law Enforcement Usage of Artificial Intelligence Law, agencies must have policies that indicate which generative AI technologies employees can use, and for what tasks. The law also mandates that any police report created with AI assistance should include a disclaimer describing the role of AI, and a certification that the author reviewed the report for accuracy.
California’s Law Enforcement Agencies: Artificial Intelligence Law similarly mandates that police reports created with AI assistance include a disclaimer, and that agencies retain the initial draft of the report which was created entirely by AI and an audit trail of subsequent changes. Finally, the law prohibits vendors of AI-based tools from selling information that they obtain in this process.
These policies are consistent with recommendations of this memo, although this memo is not proposing mandates from the federal government. This memo would recommend that the NIJ collect data on the consequences of any state law, and use the lessons learned to recommend best practices to the other states.
FairCare Verification Offers a Human-Centered Path for AI in Medicaid
A wheelchair user with complex care needs submits a prior-authorization request that her physician supports. An algorithm-generated denial arrives with no meaningful explanation, only that her condition “does not meet medical necessity.” Her appeal languishes for weeks. By the time a human reviewer sees the case, her condition has deteriorated, confirming the algorithm’s prediction that she was “high-risk.” This scenario reflects concerns documented across automated denial systems in commercial insurance and other coverage settings, where algorithms have been used to guide or accelerate utilization review and prior-authorization decisions. In Medicaid managed care, similar dynamics are playing out as algorithmic systems make high-stakes decisions about patient care. These decisions include prior authorizations, risk scoring, triage, and fraud detection. Too often, affected patients, clinicians, and regulators cannot see how the system works, why a decision was made, or whether meaningful human oversight occurred. Growing evidence suggests these systems may systematically disadvantage low-income patients, people with disabilities, and racial and ethnic minorities, perpetuating health inequities at scale. Their deployment is broadly unpopular with the general public and with frontline healthcare workers, particularly nurses, whose clinical judgment is routinely overridden by opaque automated systems.
This memo focuses primarily on Medicaid because it is where vulnerable beneficiaries are already exposed to opaque automated decision systems through managed care organizations (MCOs). It also focuses on Medicaid because existing federal and state authorities can be used now, without waiting for new legislation. The memo proposes a policy framework (“FairCare Verification”) built around two core reforms: Community Algorithmic Impact Statements (CAIS) and Nursing-Led AI Audit Brigades (N-LABs). These reforms would be supported by patient-facing appeal and explainability protections and by contract-based limits on exploitative secondary uses of safety-net data.
These reforms can be advanced through guidance from the Centers for Medicare & Medicaid Services (CMS), enforcement by the HHS Office for Civil Rights (OCR), certification standards from the Office of the National Coordinator for Health Information Technology (ONC), and state Medicaid managed care contracts. Because managed care organizations and vendors often build products to the highest applicable compliance standard, Medicaid-focused guardrails can shape broader market behavior across healthcare.
Challenge and Opportunity
This section addresses the central problem created by growing reliance on algorithmic tools in Medicaid managed care and explains why the current policy moment creates an opening for targeted intervention. The core issue is not simply that AI is entering healthcare; it is that high-impact decisions about coverage and care are increasingly being shaped by systems that affected patients, clinicians, and regulators cannot meaningfully scrutinize.
Accountability Structures
Current accountability frameworks were built for human decision-makers. They assume that decisions can be explained, questioned, appealed, and attributed to a responsible actor. Algorithmic systems strain each of those assumptions. Civil rights enforcement is still largely organized around individual complaints, but algorithmic harms often emerge at the population level through statistical patterns that are difficult for any single patient to detect or prove, a challenge well documented in emerging legal analyses of algorithmic discrimination.
Medicaid is especially important because it is a joint federal-state program in which the federal government sets baseline rules while states administer benefits and contract with managed care organizations. That structure creates multiple adoption points for algorithmic systems, but it also creates multiple intervention points. This memo therefore focuses on Medicaid managed care, especially prior authorization and utilization management. Similar issues are emerging elsewhere in healthcare. For example, CMS’s Wasteful and Inappropriate Service Reduction (WISeR) Model is a traditional Medicare model that uses enhanced technologies, including AI and machine learning, along with human clinical review, to review selected services before or around payment. That initiative is distinct from this memo’s principal Medicaid focus, but it illustrates why automated review is becoming a broader policy concern.
Automated Accountability Risks
AI tools are already embedded across prior authorization, risk scoring, triage, patient communications, and fraud detection. Proponents argue that these systems can speed processing, reduce administrative delays, standardize decision-making, and detect fraud more efficiently than manual review. In a system with substantial administrative waste, those goals are not trivial. But the case for efficiency is often asserted rather than independently validated, and internal safeguards are rarely transparent to the communities most affected.
The concern is not limited to explicit use of race, disability, or other protected traits. High-impact systems can generate discriminatory effects through proxy variables such as prior healthcare spending, geographic indicators, housing instability, or employment status. Under Section 1557 of the Affordable Care Act, Medicaid programs may not discriminate on the basis of race, color, national origin, age, disability, or sex, yet algorithmic systems may produce exactly such differential outcomes through indirect pathways. The Optum algorithm controversy showed how a widely used population health tool could systematically under-identify Black patients for additional care even when they were equally sick. Patients with disabilities may be especially vulnerable to extended automated reviews or wrongful denials when algorithms fail to account for the complexity and variability of disability-related care needs. Separately, reporting on automated insurance denials has raised concern that the speed and scale of algorithmic review can sideline meaningful clinical judgment.
Population Differences and High-Impact Use
A related concern is the gap between the population on which a system was trained and the population on which it is deployed. One problem arises when a model is trained on national claims data that does not capture the disease burden, disability prevalence, language needs, or care-access barriers of a specific state Medicaid population. A second problem arises when a model trained in one hospital system is deployed in a different care environment with different workflows, staffing patterns, and patient needs. Both problems should be treated as core governance issues, not afterthoughts. A central purpose of Community Algorithmic Impact Statements is to force disclosure of source populations, deployment settings, and subgroup validation before high-impact use.
Existing legal tools already provide a starting point. Section 1557 of the ACA, Medicaid managed care regulations under 42 CFR Part 438, health IT certification authorities, and state utilization-management oversight all create avenues for oversight. The problem is not total absence of authority. The problem is that existing authority has not been translated into a practical governance framework for algorithmic systems before those systems become entrenched.
Public trust is also fragile. A 2023 Pew Research Center survey found that 60 percent of Americans would feel uncomfortable if their healthcare provider relied on AI for diagnosis and treatment. Frontline workforce opposition reinforces this concern. In a 2024 National Nurses United survey of more than 2,300 registered nurses, many respondents reported that AI tools undermined patient safety, conflicted with clinical judgment, or could not be modified when nurses disagreed with the output. Nurses are therefore well positioned to identify when automated systems conflict with bedside realities, create avoidable delays, or shift burdens onto patients and care teams. For that reason, this memo treats nurses not just as affected stakeholders, but as central participants in accountability.
The Strategic Window
There is a near-term opportunity to act because state Medicaid contracts are periodically renewed and routinely used to add new performance, reporting, and quality requirements. Several large states, including Texas, Florida, Ohio, and Illinois, have active or upcoming Medicaid managed care procurement cycles. These cycles create natural insertion points for algorithmic transparency, audit cooperation, and appeal safeguards. States do not need to wait for Congress to begin acting through procurement and contract oversight.
At the same time, policymakers are paying closer attention to adjacent problems in Medicare. CMS’s WISeR prior-authorization model has heightened concern about automated review and delayed care, even though CMS describes the model as combining enhanced technology with human clinical review. Bipartisan congressional inquiries into automated denial systems in both Medicare Advantage and Medicaid also signal growing political interest in this space. This proposal remains centered on Medicaid managed care, where states and federal administrators have especially clear opportunities to set guardrails for high-impact systems already being used in coverage and care management.
The political environment also calls for a realistic implementation strategy. The current federal administration has expressed skepticism toward disparate-impact frameworks, and a proactive federal push framed solely in those terms may face resistance. For that reason, the most durable near-term pathway is to emphasize patient protection, clinical accountability, fair process, transparency, and state contract authority, while preserving civil-rights enforcement as an essential backstop rather than the only implementation lever. This proposal protects patient autonomy through the right to appeal and receive an explanation. It supports clinical judgment by empowering nurses to challenge opaque algorithms. It also creates accountability without expanding government bureaucracy by leveraging existing external review infrastructure and state authority. These principles resonate across the political spectrum.
The urgency of this window is heightened by the introduction of new Medicaid work-reporting requirements. States are rushing on expedited timelines to build algorithmic systems for eligibility and compliance determinations, creating additional risks of erroneous benefit terminations and increased vendor lock-in. While this proposal focuses on AI in clinical and utilization management decision-making rather than eligibility processing, the governance frameworks proposed here, particularly CAIS transparency requirements, could be extended to eligibility determination systems as well.
Plan of Action
The recommendations below are designed to be mutually reinforcing, but the memo places greatest weight on two core interventions: Community Algorithmic Impact Statements and Nursing-Led AI Audit Brigades. The remaining proposals are narrower supports intended to make those two primary reforms workable in practice.
Recommendation 1. Require Community Algorithmic Impact Statements and establish Nursing-Led AI Audit Brigades.
Any Medicaid managed care organization, subcontractor, or vendor should be required to file a public Community Algorithmic Impact Statement before deploying AI for high-impact Medicaid decisions. Covered uses should include prior authorization, utilization management, care coordination prioritization, fraud flagging, triage, and other decisions that materially affect access to care. The filing should occur before deployment and annually thereafter.
CAIS should be modeled in part on the logic of environmental impact review, which requires public assessment of potential effects, alternatives, and mitigation before major federal actions. The goal is not to create a generic disclosure form. A CAIS should require a plain-language description of the system, its intended use, the population affected, the decisions it can influence, the data sources on which it relies, the source population on which it was trained, known limitations, plausible risks of harm, mitigation steps, monitoring plans, and available alternatives. For high-impact uses, it should also disclose subgroup performance testing and whether performance was evaluated on a population meaningfully similar to the state Medicaid population in which the tool will be deployed.
CAIS should classify systems as high-impact, moderate-impact, or advisory. High-impact systems would include prior-authorization denials, utilization-management restrictions, fraud flagging with downstream care consequences, and triage systems that materially affect access. These risk tiers draw on existing frameworks such as the NIST AI Risk Management Framework, adapted to Medicaid decision-making. High-impact systems should require pre-deployment filing, state review, and a public comment period before use. Moderate-impact systems should require annual reporting and post-deployment monitoring. Advisory tools should still be documented, but with lighter obligations.
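The tiering logic described above could be captured in a simple machine-readable filing record. The following is a minimal Python sketch, not a proposed standard: the class name CAISFiling, the field names, and the obligation labels are hypothetical illustrations of the elements and tier obligations described in this memo.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Tier(Enum):
    HIGH = "high-impact"
    MODERATE = "moderate-impact"
    ADVISORY = "advisory"


# Obligations per tier, paraphrased from the memo (labels are illustrative).
OBLIGATIONS = {
    Tier.HIGH: ["pre-deployment filing", "state review", "public comment period"],
    Tier.MODERATE: ["annual reporting", "post-deployment monitoring"],
    Tier.ADVISORY: ["basic documentation"],
}


@dataclass
class CAISFiling:
    """Hypothetical record holding a subset of the CAIS elements."""
    system_name: str
    intended_use: str
    tier: Tier
    training_population: str
    deployment_population: str
    known_limitations: List[str] = field(default_factory=list)
    subgroup_testing_disclosed: bool = False

    def obligations(self) -> List[str]:
        """Return the obligations attached to this filing's tier."""
        return list(OBLIGATIONS[self.tier])

    def missing_items(self) -> List[str]:
        """Flag disclosures that a high-impact filing still lacks."""
        gaps = []
        if self.tier is Tier.HIGH and not self.subgroup_testing_disclosed:
            gaps.append("subgroup performance testing")
        return gaps
```

A state reviewer could use a record like this to check, before a comment period opens, that a high-impact filing includes the subgroup testing disclosure the memo calls for.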
Testing for disparate impact across all protected characteristics presents measurement challenges, particularly for disability status. Unlike race and ethnicity, for which inference methodologies such as Bayesian Improved Surname Geocoding (BISG) exist when self-reported data is unavailable, no comparable inference methodology currently exists for disability status. RAND describes BISG as a method that combines surname and geocoded address information to estimate race and ethnicity when direct data are missing or incomplete. Medicaid claims data and eligibility categories may provide some basis for identifying disability-related disparities, but this remains an area requiring further methodological development. CAIS filings should document these measurement limitations transparently and describe the best available approaches for subgroup testing.
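To make the BISG logic concrete, the sketch below applies Bayes' rule to combine a surname-based probability with a geography-based likelihood. It is illustrative only: the function name bisg_posterior, the group labels, and every probability shown are made-up placeholders rather than Census or RAND figures, and production implementations involve additional data preparation and validation steps.

```python
def bisg_posterior(p_group_given_surname, p_geo_given_group):
    """Combine a surname prior with a geography likelihood via Bayes' rule.

    The posterior for each group is proportional to
    P(group | surname) * P(geography | group).
    """
    unnormalized = {
        group: p_group_given_surname[group] * p_geo_given_group[group]
        for group in p_group_given_surname
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no group has nonzero probability")
    return {group: value / total for group, value in unnormalized.items()}


# Placeholder inputs (hypothetical numbers, for illustration only).
surname_prior = {"group_a": 0.70, "group_b": 0.20, "group_c": 0.10}
geo_likelihood = {"group_a": 0.001, "group_b": 0.004, "group_c": 0.002}

posterior = bisg_posterior(surname_prior, geo_likelihood)
```

The output is a set of probabilities rather than a single label, which is why audits that rely on this kind of inference typically report subgroup estimates with stated uncertainty.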
At the same time, states should establish Nursing-Led AI Audit Brigades, or N-LABs, as independent audit teams. These teams should include not only registered nurses but also data scientists, health law or civil-rights experts, and at least one patient advocate or community health worker with lived Medicaid experience. The American Association of Colleges of Nursing reports more than 5 million registered nurses in the United States, making nursing the nation’s largest healthcare profession. Recent nursing scholarship on clinical AI auditing similarly emphasizes that assurance frameworks should include nursing leadership, not merely technical validation. The purpose of the N-LAB model is not to have nurses perform technical validation alone, nor to have data scientists audit systems without clinical grounding. The point is to create a multidisciplinary audit process in which each discipline evaluates the same system from a different but complementary vantage point. Including patient advocates ensures that audit priorities and scorecard criteria reflect beneficiary perspectives alongside clinical and technical expertise. Beneficiary input can also be channeled through existing Community Advisory Boards (required under 42 CFR 438.110) and Federally Qualified Health Center governing boards (which include patient majorities).
In practice, an N-LAB would operate in six steps. First, it would review the CAIS and underlying documentation. Second, it would obtain case samples, denial rationales, override data, and subgroup outcomes from the managed care organization or vendor. Third, data scientists would test for accuracy, subgroup disparities, calibration, and training-versus-deployment mismatch. Fourth, nurse auditors would assess clinical plausibility, workflow burden, appropriateness of overrides, and whether the system appears to displace rather than support professional judgment. Fifth, legal reviewers would analyze whether observed patterns raise concerns under Medicaid managed care rules, civil-rights obligations, grievance requirements, or contract terms. Sixth, the team would publish a public scorecard and, where necessary, require a corrective action plan.
N-LAB scorecards should rate systems on accuracy, subgroup performance, explainability, human-override capacity, documentation quality, and post-deployment monitoring. Systems rated “needs improvement” should be required to submit corrective action plans within 60 days. Systems rated “fails” for high-impact use should be suspended by the state Medicaid agency until corrective action is verified. State agencies, not N-LABs themselves, should retain final suspension authority.
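The rating-to-action rules in this paragraph could be encoded roughly as follows. This is a minimal sketch under stated assumptions: the ratings “needs improvement” and “fails” and the 60-day deadline come from the text above, while the “passes” label, the function name required_action, the handling of failing lower-tier systems, and the return format are hypothetical.

```python
from datetime import date, timedelta

CORRECTIVE_PLAN_DAYS = 60  # deadline stated in the memo


def required_action(rating, high_impact, rated_on):
    """Return (action, due_date) for a scorecard rating.

    rating: "passes" (hypothetical label), "needs improvement", or "fails".
    high_impact: True if the system is used for high-impact decisions.
    rated_on: datetime.date on which the scorecard was issued.
    """
    rating = rating.strip().lower()
    plan_due = rated_on + timedelta(days=CORRECTIVE_PLAN_DAYS)
    if rating == "fails" and high_impact:
        # The state Medicaid agency, not the N-LAB, would order the suspension.
        return ("suspend until corrective action is verified", None)
    if rating == "needs improvement":
        return ("submit corrective action plan", plan_due)
    if rating == "fails":
        # Assumption: the memo does not address failing lower-tier systems;
        # this sketch treats them like "needs improvement".
        return ("submit corrective action plan", plan_due)
    if rating == "passes":
        # Assumption: a passing system simply stays under routine monitoring.
        return ("continue routine monitoring", None)
    raise ValueError(f"unknown rating: {rating!r}")
```

Encoding the rules this way keeps the N-LAB's role limited to producing the rating, with the resulting action and deadline following mechanically and the suspension decision remaining with the state agency.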
Estimated cost remains modest relative to Medicaid program scale. A team costing roughly $500,000 to $750,000 annually, assuming roughly 1.25 to 1.9 million beneficiaries per audit team, would amount to approximately $0.40 per Medicaid beneficiary. This is an illustrative estimate based on comparable external quality review organization (EQRO) staffing models. A large state may require two to four teams depending on MCO and system volume. The stronger argument, however, is institutional: N-LABs translate abstract oversight into a repeatable operational process.
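The per-beneficiary figure follows directly from the two ranges given above. The short Python sketch below reproduces the arithmetic; the function name cost_per_beneficiary is a hypothetical helper, and the inputs are the memo's illustrative estimates rather than audited budget figures.

```python
def cost_per_beneficiary(annual_team_cost, beneficiaries_per_team):
    """Annual audit-team cost divided by the beneficiaries that team covers."""
    if beneficiaries_per_team <= 0:
        raise ValueError("beneficiaries_per_team must be positive")
    return annual_team_cost / beneficiaries_per_team


# Low end of both ranges: $500,000 per team covering 1.25 million beneficiaries.
low_estimate = cost_per_beneficiary(500_000, 1_250_000)   # 0.40 dollars

# High end of both ranges: $750,000 per team covering 1.9 million beneficiaries.
high_estimate = cost_per_beneficiary(750_000, 1_900_000)  # about 0.39 dollars
```

Both ends of the range land at roughly $0.40, so the estimate holds only if team cost and covered population scale together; a team costing $750,000 but covering 1.25 million beneficiaries would instead cost $0.60 per beneficiary.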
Implementation can proceed through existing authority. CMS can issue model CAIS guidance and encourage incorporation into managed care contracts. States can require filing and audit cooperation through requests for proposals and contract terms. ONC can reinforce these expectations by embedding documentation and audit-readiness requirements into certification-related pathways, building on ONC’s existing role coordinating EHR certification under the 21st Century Cures Act. The HHS Office for Civil Rights can use filings and scorecards as triggers for proactive review. If current political conditions make explicit framing under Section 1557 of the Affordable Care Act (which prohibits discrimination on the basis of race, color, national origin, age, disability, or sex) difficult, alternative language focusing on “differential outcomes based on personal characteristics” or “equitable access to care” can maintain legal force while remaining politically adaptable. Comparable regulatory actions suggest these steps can be implemented on realistic timelines: CMS issued comprehensive Medicaid managed care rules (42 CFR Part 438) with an 18-month implementation period; ONC implemented 21st Century Cures Act certification criteria within 24 months; and state insurance commissioners routinely issue bulletins with effective dates 6 to 12 months out.
Recommendation 2. Add algorithm-specific explainability and appeal protections for high-impact adverse decisions.
Some Medicaid rules already require plans to provide denial reasons to requesting providers (42 CFR § 438.404). This proposal does not duplicate that baseline. It adds a more usable and enforceable framework for algorithmic decisions by requiring patient-facing explanations, clinician-usable explanation materials, auditor documentation, and an independent review pathway when an automated or algorithmically informed adverse decision affects care. The following standards are adapted from emerging model documentation and explanation frameworks, including model cards for model reporting and post-hoc explanation tools such as LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations).
For patients, explanations should be plain-language, translated as needed, and specific enough to explain what decision was made, what key factors drove it, and what information could change the outcome. Boilerplate language such as “not medically necessary” is not enough when the decision was materially shaped by an algorithmic system. Patients should receive a short contestability notice with the adverse decision that explains the decision, the right to appeal, the timeline, and how to submit additional information. For complex models, post-hoc explanation methods such as LIME, SHAP, or counterfactual explanations can help translate opaque outputs into patient-accessible language.
For clinicians, the standard should be more operational. If a model materially shaped a denial or restriction, the responsible entity should be able to identify what inputs were used, which factors were most influential, what uncertainty surrounds the output, and how the case compares with relevant benchmarks or similar cases. The goal is not to force disclosure of source code or trade secrets. It is to provide enough information for meaningful clinical contestation and review.
For regulators and auditors, the system should be documented well enough to reconstruct individual decisions, evaluate subgroup performance, review validation methods, and identify known failure modes. A system that cannot generate meaningful patient-facing, clinician-facing, and auditor-facing explanations should not be used as the sole basis for a high-impact denial or restriction. That is the operative limit on insufficiently explainable systems. ONC should add these requirements to health IT certification where systems fall within certification-related pathways, specifying that certified systems must demonstrate capability to generate explanations meeting these standards.
This memo also proposes an enforceable appeal right for adverse algorithmic decisions, preserving at least the existing 60-day appeal period under federal Medicaid managed care rules while adding algorithm-specific notice and expedited independent review by an appropriately qualified clinician not employed by the plan. This mirrors the logic of existing external review structures, at comparable cost ($500 to $700 per review), but makes clear that algorithmic involvement triggers heightened transparency, documentation, and reversal tracking. The process would run as follows: (1) the patient receives an adverse decision with a plain-language contestability card explaining the decision, the algorithmic factors involved, and appeal rights, available in multiple languages; (2) the patient files an appeal within the applicable appeal period; (3) an independent clinician conducts de novo review; (4) a binding decision is issued within 15 business days; (5) reversals are reported to the relevant N-LAB; (6) patterns of reversals for a particular system trigger an automatic audit.
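Steps (5) and (6) imply some rule for deciding when reversals amount to a pattern. The memo does not define one, so the Python sketch below is only one possible approach: the function name systems_to_audit, the minimum of 20 appeals, and the 30 percent reversal-rate threshold are all hypothetical placeholders that a state would need to set for itself.

```python
from collections import Counter


def systems_to_audit(appeals, min_appeals=20, reversal_rate_threshold=0.30):
    """Return system IDs whose appeal reversal rate suggests a pattern.

    appeals: iterable of (system_id, was_reversed) pairs, one per decided appeal.
    min_appeals: hypothetical floor so a handful of cases cannot trigger an audit.
    reversal_rate_threshold: hypothetical share of reversals that triggers one.
    """
    totals = Counter()
    reversals = Counter()
    for system_id, was_reversed in appeals:
        totals[system_id] += 1
        if was_reversed:
            reversals[system_id] += 1
    return sorted(
        system_id
        for system_id, count in totals.items()
        if count >= min_appeals
        and reversals[system_id] / count > reversal_rate_threshold
    )


# Hypothetical appeal outcomes for two systems.
example_appeals = (
    [("system_a", True)] * 10 + [("system_a", False)] * 10
    + [("system_b", True)] * 2 + [("system_b", False)] * 28
)
flagged = systems_to_audit(example_appeals)
```

In this example only system_a is flagged, because half of its 20 appeals were reversed, while system_b's 2 reversals out of 30 fall below the placeholder threshold.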
These appeal rights should be understood as complements to, not substitutes for, bright-line regulatory rules where appropriate. A bright-line rule is a clear rule that draws a firm boundary and leaves little room for case-by-case balancing. For certain categories of decisions, such as prior authorizations for treatments meeting established clinical criteria, regulators should consider prohibiting automated denial without clinical review altogether, as has been proposed in the Medicare Advantage context. Audits and transparency mechanisms are most valuable for the many algorithmic applications where bright-line prohibitions are impractical.
Finally, safety-net data should not be treated as a free raw material for secondary commercial exploitation. Vendors using Medicaid and other safety-net data should be required to sign Community Data Covenants restricting secondary uses through purpose limitation, data minimization, retention limits, transparency, and benefit-sharing where appropriate. Existing education-sector data-use agreements provide a workable precedent for this kind of contractual approach, including the U.S. Department of Education’s Model Terms of Service and state-level models such as Alabama’s SDPC Alliance.
Recommendation 3. Use state Medicaid contracts as the primary near-term implementation pathway.
Because Medicaid is jointly governed, states do not need to wait for new federal actions. State Medicaid agencies can incorporate FairCare Verification (CAIS filing, N-LAB cooperation), minimum explanation requirements, and appeal obligations directly into managed care requests for proposals and contracts. These mechanisms fit naturally alongside existing reporting, quality, and utilization-management provisions. For example, a state could require: “Contractor shall file CAIS documents for all high-impact AI systems 90 days before deployment and cooperate with state-designated N-LAB audits. Failure to comply constitutes material breach.”
A state-based pathway is also more feasible in the current moment. It allows policymakers to frame the issue in terms of accountability, patient protection, fair process, and clinical integrity rather than relying exclusively on expansive new federal directives. It also creates opportunities for coalition-building among states that want to avoid fragmented vendor compliance and regulatory arbitrage.
To reduce patchwork harms, states should work through a consortium model. A coalition of large states could develop shared contract language, common filing templates, reciprocal recognition of audit findings, and aligned minimum standards for high-impact systems. This approach mirrors the National Association of Insurance Commissioners (NAIC) model law development process and multi-state pharmaceutical supplemental rebate agreements. Benefits include reducing compliance burden, increasing leverage over vendors, and accelerating de facto national standard-setting through market pressure. Where stronger requirements are desired, states can pass legislation, with Colorado’s SB 21-169 and Illinois’s AI Video Interview Act providing useful models.
Enforcement should sit inside this state-contract architecture rather than operate as a disconnected apparatus. Failure to file a required CAIS, cooperate with audit review, produce required explanation materials, or implement corrective action should count as a material contract breach. States can then use cure notices, financial penalties, suspension authority, and procurement consequences already familiar in managed care oversight. Federal agencies, including CMS through Medicaid matching-fund conditions, ONC through certification standards, and OCR through independent Section 1557 authority, should reinforce, not replace, this state-centered pathway.
Conclusion
The strongest immediate case for action is not that every healthcare AI use can be solved at once. It is that Medicaid beneficiaries are already exposed to high-impact algorithmic systems in coverage and care-management decisions, and that existing law and contract authority can be used now to make those systems more visible, more contestable, and more accountable.
“FairCare Verification,” the combination of Community Algorithmic Impact Statements and Nursing-Led AI Audit Brigades, is the memo’s central reform because it addresses the first-order governance problem: systems are being deployed without adequate public disclosure or credible independent review. Explainability rules, appeal rights, and data covenants matter, but they work best as supporting mechanisms around those two core interventions. No new legislation is required for initial implementation. CMS could issue initial guidance quickly; state Medicaid directors can add clauses at the next MCO contract renewal; state Medicaid agencies and MCOs can adopt policies at their next meeting. What is required is a practical governance strategy that fits current institutional realities, focuses clearly on Medicaid managed care, and gives patients, nurses, regulators, and community stakeholders structured ways to detect and correct harm before opaque systems become entrenched. The next 18 to 24 months may be a critical window before harmful practices become deeply embedded. The question is whether policymakers seize this opportunity or face far more difficult remediation after extensive harm accumulates.
Medicaid beneficiaries are especially vulnerable to opaque administrative systems and often have the least practical power to absorb delays, navigate appeals, or switch coverage arrangements. Similar concerns also exist in Medicare, including in traditional Medicare pilots, but Medicaid managed care offers especially clear and immediate levers for contract-based intervention. Medicaid requirements can also influence commercial insurers, shape vendor products, and set precedent across sectors because vendors and MCOs often build products to satisfy the strictest large-market contract standard rather than maintaining separate systems for every payer.
Existing requirements under 42 CFR § 438.404 generally address the obligation to provide a denial reason. This proposal goes further by requiring algorithm-specific disclosure, patient-facing contestability materials, clinician-usable explanation content, independent review for adverse algorithmic decisions, and systematic feedback from reversals into ongoing audit oversight.
The proposal does not require disclosure of source code or proprietary architecture. It requires disclosure of inputs, intended use, validation, subgroup performance, explanation capacity, and monitoring results—enough to evaluate what the system does and whether it can lawfully and safely be used in high-impact Medicaid decisions. Post-hoc explainability techniques (LIME, SHAP, counterfactual explanations) provide transparency without revealing trade secrets.
N-LAB audits cost approximately $0.40 per Medicaid beneficiary annually. CAIS filing requires staff time, not new technology. The appeal right leverages existing external review infrastructure and timelines. These costs are modest compared with remediating entrenched algorithmic discrimination years later.
The Federal Government Should Pilot a Decision Subject Representative Program for AI Systems
AI systems are regularly used to make decisions that directly impact individuals, from who gets a housing voucher to who gets a job to who is granted bail. These are contexts with a long history of social disparities, which makes it easy for discrimination to be encoded into automated decisions. The designs of these consequential AI decision systems are shaped by corporations and increasingly overseen by governments with little input from the public, specifically from users and individuals impacted by these decisions.
Executive branch agencies frequently engage the public in policy decisions via requests for comment and town halls. For decades, the Food and Drug Administration (FDA) has gone beyond traditional agency engagement processes via the Patient Representative Program (PRP), which recruits, trains, and embeds patients into oversight of the pharmaceutical industry, including decisions regarding clinical trial design, endpoints (evaluation metrics), risk/benefit analysis, product labeling, etc. This memo proposes creating a Decision Subject Representative Program inspired by the FDA’s Patient Representative Program.
While pharmaceutical drugs and consequential AI decision systems vary in scope and impact, both technologies need to be safe and effective to be trusted by the public and consumers. Public engagement has long been a tool for building trust and legitimacy in governance decisions while providing a complement to expertise associated with elite institutions. Three decades of FDA experience in systematizing patient engagement offer valuable inspiration for AI governance. Specifically, the General Services Administration (GSA) should pilot embedding Decision Subject Representatives into the procurement process for consequential AI decision systems, the National Institute of Standards and Technology (NIST) should pilot engaging Decision Subject Representatives in efforts to shape standards, and Congress could add a flexible Decision Subject Representatives Program (DSRP) to new regulatory proposals.
Challenge and Opportunity
Technologists have attempted to address concerns regarding bias and discrimination in consequential AI decision systems (AI systems that serve as a basis for a decision or judgment in consequential contexts such as education, employment, essential utilities, financial services, legal services, etc.) by analyzing statistical outcomes or applying fairness metrics. The challenge with this approach is that there are a variety of ways to conceptualize and measure fairness that cannot be encoded at the same time. Additionally, fairness metrics often rely on the availability of sensitive category data, which may be restricted by privacy laws and historic human rights laws. Instead, scholars argue that the application context matters and that those directly affected should be engaged in the selection and formation of fairness metrics.
More recently, scholars have advocated for a more holistic view of fairness that takes into account the sociotechnical context and the whole process of coming to a certain decision. This approach underscores the need for decision subjects to be included in the entire process of AI system design and deployment, with an emphasis on the assessment of risks and harms broadly, processes for contestability, and transparency measures. As consequential AI decision systems proliferate, it is imperative that the U.S. government pilot systems for engaging decision subjects.
Engaging decision subjects in AI governance faces many challenges. Efforts risk looking like participation washing, that is, engaging decision subjects for theatrical purposes without giving them real power or influence over the final decision. Participatory AI projects can also be inaccessible or exploitative. The FDA’s Patient Representative Program has grappled with these same challenges.
FDA’s Patient Representative Program
Officials at the FDA woke up to the power of lay expertise in pharmaceutical drug development in the wake of the AIDS epidemic when patients advocated to have their experiences considered in disputes pertaining to the design and methodology of drug trials.
In 1988, the FDA initiated the patient engagement process through the Office of AIDS Coordination. By 1993 the first Patient Representative served on an FDA Advisory Committee. Since then, the FDA has greatly expanded patient engagement with over 200 Patient Representatives, dedicated offices and programs, reporting systems, and regular public guidance aimed at incorporating patient experience data into regulatory decision-making.
The program has been largely implemented at the direction of FDA leadership. In 2012, Congress enacted the Food and Drug Administration Safety and Innovation Act (FDASIA). The act’s language is the first “official” codification of the FDA Patient Representative Program and other patient liaison activities. The law provided greater stability to the program and opened the door for more staff and educational programs.
Patient Representatives include patients, patient advocacy group members, family members and/or caregivers, and health care providers. The FDA recruits Patient Representatives through open applications, patient organizations, and staff outreach. Selected participants are vetted and onboarded as Special Government Employees (a category of federal worker for individuals who serve the government temporarily while maintaining employment elsewhere). They consult with FDA review divisions, serve on advisory committees, present at workshops, participate in the Patient Engagement Collaborative, which shapes practices in clinical trials, and take part in other regulatory activities where the patient perspective is needed. Patient Representatives receive training in FDA regulatory processes and can work with FDA staff to prepare to participate meaningfully in advisory committee meetings along with other stakeholders. The FDA covers travel costs and forgone salary for Patient Representatives participating in meetings and training.
FDA employees describe Patient Representatives’ expertise as “a street sense” based on personal experience, describing their views as “a value judgment overlay on top of measurable, empirical clinical trial evidence.” Participants often ask “questions [that] would never be raised” and push for clarity. When Patient Representatives engage with expert stakeholders, learning “works both ways” with clinicians altering their way of thinking based on what Patient Representatives share, and Patient Representatives gaining a better understanding of the science which they can share with their communities.
Over the years, FDA’s Patient Representative program has faced challenges. The drug-specific nature of engagement makes it difficult for patients to engage on cross-cutting issues, conflict of interest rules have made it hard for patients to engage with both drug companies and the FDA, and patients have expressed concerns about not actually knowing the impact they have on decisions. The FDA continues to address these concerns. In 2017, it initiated a Patient Affairs Staff to centralize support for Patient Representatives, and has launched new communication and transparency efforts to help patients understand their influence. While the degree to which patient representative views should be weighted alongside clinical trial data and processes for measuring patient influence will long be contested, the FDA’s program represents a model in which Patient Representatives are remunerated for their time and continually shape FDA processes.
How may the FDA’s Patient Representative Program inspire similar efforts for AI governance?
AI systems are different from pharmaceutical drugs. They are deployed cross-sector and individuals can unknowingly be impacted by an automated decision, whereas drug patients often know they are taking a drug. These differences will impact the way a Decision Subject Representative Program would be deployed, but the types of decisions Patient Representatives consult on are analogous to important decisions in AI governance today. The table below describes the types of governing decisions Patient Representatives have engaged in via the FDA’s drug approval and monitoring process and how those areas correspond with governing consequential AI decision systems. Similar to the FDA, federal agencies could engage Decision Subject Representatives throughout the lifecycle of consequential AI decision systems development including pre-market approval (most comparable to procurement), and the development of testing, evaluation, and transparency standards for deployers and developers (both voluntary and mandated by regulation).
Similar to the way the FDA engages patients in specific drugs (or categories of drugs) as needed, decision subject engagement would work best on applied systems where they can be assessed within a sociotechnical context – particularly in contexts where fairness may be an issue (e.g., education, employment, essential utilities, financial services, legal services). Similar to the FDA, Decision Subject Representatives could consult directly with agency staff on decisions described in Table 1 or serve on advisory committees with other relevant stakeholders.
Recruitment for a Decision Subject Representative Program will vary based on the context and will likely include partnering with civil society organizations and community groups that can recommend Decision Subject Representatives. For example, agencies governing (i.e., procuring, drafting standards for, or issuing testing mandates for) hiring software could recruit workers who have experience navigating AI systems to obtain a job, or individuals from communities that have historically faced discrimination in hiring, by partnering with workforce development programs (e.g., American Job Centers, local libraries). Agencies governing education technology, by contrast, could recruit students, parents, and teachers, particularly those from lower-income school districts, to serve as Decision Subject Representatives.
Similar to the FDA, a Decision Subject Representative Program must include extensive training for Decision Subject Representatives that covers agency (e.g., GSA) processes, why decision subject views are important, and how to avoid feeling intimidated by academic and industry expertise. Host agencies should also communicate the importance of decision subject perspectives to other stakeholders. Any pilot should be accompanied by an evaluation of the Decision Subject Representatives’ experience and impact on final decisions.
Plan of Action
Recommendation 1. GSA should consult Decision Subject Representatives when procuring consequential AI decision systems
Recent procurement guidance issued by the Trump administration and the Biden administration directs GSA to identify risks in high-impact AI systems (including what this memo refers to as consequential AI decision systems), conduct pre-award testing, and monitor performance, including quantitative success metrics. GSA should pilot onboarding Decision Subject Representatives as Special Government Employees to consult on these activities as they apply to consequential AI decision systems.
One benefit of engaging decision subjects in the procurement process is that private companies that build systems for the government and for industry may choose to adopt the practices and standards required to meet government requirements for their commercial offerings.
Decision Subject Representatives will have connections to a broader community of individuals impacted by consequential AI decision systems and can serve as a bridge to a wider set of experiences. Additionally, Decision Subject Representatives can consult on new agency programs aimed at engaging decision subjects more broadly over time.
Recommendation 2. NIST should engage Decision Subject Representatives in future Zero Draft development
Risk management, assessment, metrics, and documentation of AI systems will likely be shaped by international standards, especially as the International Organization for Standardization (ISO) responds to the European Union’s AI Act and similar efforts globally. International standards have traditionally focused on objective guidance and been shaped by industry actors. The need to consider context-specific harms in AI risk assessment has necessitated a recent shift towards sociotechnical standards, creating an imperative for broader stakeholder representation.
NIST, recognizing this shift, recently launched a “Zero Drafts” pilot with the express interest of engaging stakeholders in NIST proposals that are eventually submitted to standards bodies. The two initial topics, AI testing, evaluation, verification, and validation (TEVV) and transparency documentation, are horizontal standards (i.e., not specific to an applied AI system or specific AI use case) and therefore not as well suited to decision subject engagement. But the NIST Zero Drafts program is designed to be responsive to AI workstreams within the international standards bodies, which means it should eventually work on context-specific risk guidance such as AI in hiring, AI in education, AI in financial systems, AI in criminal justice, etc.
As context-specific or applied Zero Draft efforts begin, NIST should pilot engaging Decision Subject Representatives in stakeholder meetings and on edits to draft text. While standards are not regulatory, they can be referenced by regulators worldwide, including U.S. states. In this way, they represent a potential central point of influence over the design and assessment of consequential AI decision systems.
Recommendation 3. Congress should add flexible Decision Subject Representatives Programs to new regulatory proposals
With some exceptions, AI systems do not have to meet transparency or testing requirements to demonstrate they are safe or effective in order to enter or remain on the market. While transparency and testing guidance are currently the domain of NIST (and therefore voluntary), Congress is considering proposals to mandate risk assessment and transparency requirements for consequential AI decision systems (e.g., Algorithmic Accountability Act of 2025). Additionally, Congress has introduced proposals for a comprehensive new digital regulator that would issue regulations, oversee codes of conduct councils or advisory boards, and weigh in on decisions such as those listed in Table 1 (e.g., Digital Consumer Protection Commission Act, Digital Platform Commission Act).
Similar to the Food and Drug Administration Safety and Innovation Act (FDASIA) (2012) Section 1137, Congress could add legislative text to proposals such as those listed above that provides flexibility for agencies to onboard Decision Subject Representatives when they can contribute to decisions related to consequential AI decision systems.
An Example of Legislative Text
Inspired by FDASIA 2012 Section 1137
‘‘(a) IN GENERAL.—The Secretary [or Commission] shall develop and implement strategies to solicit the views of decision subjects during [procurement decisions] or [standards development] or [regulatory discussions] related to consequential AI decision systems including by –
(1) fostering participation of decision subjects who may serve as a special government employee in appropriate agency meetings with consequential AI decision systems developers, deployers, assessors, and investigators; and
(2) exploring means to provide for identification of decision subjects who do not have any, or have minimal, financial interests in companies that provide consequential AI decision systems”
Where DECISION SUBJECT means the person or party to whom the decision applies in a specific context.
Where CONSEQUENTIAL AI DECISION SYSTEMS means “any system, software, or process (including one derived from machine learning, statistics, or other data processing or artificial intelligence techniques and excluding passive computing infrastructure) that uses computation, the result of which serves as a basis for a decision or judgment” [followed by a list of critical contexts such as education, employment, essential utilities, financial services, legal services, etc.]
(definition inspired by the Algorithmic Accountability Act of 2022 and lineages therein, exact definition may be adjusted based on the bill context)
Conclusion
As AI systems are increasingly integrated into government and entrusted with decision-making roles, we risk further embedding bias and mistakes into AI-assisted decisions and outcomes. Existing tools from other domains, such as existing robust public engagement processes in drug development, when applied to AI deployment can help strengthen public trust in these systems and enhance perceptions of their legitimacy and the decisions they produce. Embedding Decision Subject Representatives in the procurement of consequential AI systems, regulatory processes, and agency decision-making represents a gold-standard approach. With minimal additional oversight and support, this practice can help drive the development of high-quality systems that are informed by real-world needs.
Similar to the pharmaceutical context, both companies designing and deploying consequential AI decision systems and governments procuring and overseeing consequential AI decision systems should engage decision subjects. Corbet and colleagues recently (2023) assessed participatory approaches to AI development and found that many projects struggle to provide decision subjects with meaningful influence over AI governance decisions.
As the FDA has worked to engage patients over the years, it has shared its learnings back with the pharma industry, leading to overall improvements in patient engagement related to both regulatory decisions and company drug development decisions. A Decision Subject Representative Program accompanied by rigorous evaluation could help inform best practices for industry public engagement efforts.
Engaging the public in science and technology policy involves building bridges between communities with different levels of power and access in society. It will be challenging and will require time, financial resources, and rigorous evaluation. AI fairness advocates should push for these activities both in industry and within government agencies.
Ensuring proper representation of viewpoints in science and technology policy is a perpetual challenge (democracy is hard). FDA Patient Representatives often serve because of their passion for representing their community (individuals living with a health condition) and often engage in online communities and forums. They can bridge not only their own experience but that of others in their community.
While there are only a few hundred Patient Representatives, the agency has several other efforts to engage patients including:
- Patient-Focused Drug Development: Processes for collecting and referencing qualitative patient and caregiver data in drug evaluations.
- Clinical Outcome Assessment: Efforts to integrate patient-reported outcomes, clinician-reported outcomes, and observer-reported outcomes into clinical trials.
- Patient testimony offered at workshops, advisory meetings, listening sessions, or via open comment periods on FDA guidance.
- FDA Adverse Event Reporting System (FAERS) database: Provides a way for patients to report adverse events.
These programs are often developed with input from Patient Representatives and create less time- and resource-intensive pathways for patient voices to be included in drug development and oversight. These programs also serve as entry points for recruiting Patient Representatives. Additionally, the existence of these programs has spurred an ecosystem of patient advocacy organizations, creating additional non-governmental pathways for engaging patients in drug development.
How State Governments Should Purchase AI to Ensure Fair, Transparent, and Accountable Use
State and local governments are rapidly procuring AI systems, but the contracts governing these tools overwhelmingly lack provisions for transparency, fairness, and accountability. While attention has been paid to the way the federal government procures AI, comparatively little attention has been paid to procurement by state and local governments. However, some of the most consequential AI systems spanning areas such as criminal justice, healthcare, and education are being deployed at these levels of government. Our analysis of thousands of provisions in state AI contracts across California, Florida, and Utah finds that 77% are standard boilerplate. Only 3.0% of provisions address cybersecurity, 5.3% address transparency, and 2.4% address fairness and accountability. Meanwhile, these procurement decisions lock in governance choices for years, with some contracts spanning a decade or more.
Procurement is not merely an administrative function—it is how AI enters government and the first line of defense for responsible AI in the public sector. Contract language is often a relatively low friction and politically viable tool that can generate concrete governance benefits without requiring new AI legislation. State governments should adopt three reforms: (1) standardized responsible AI contract clauses aligned with the NIST AI Risk Management Framework, (2) risk-tiered procurement review processes modeled on proven approaches in San José and Colorado, and (3) mandatory AI vendor fact sheets as a condition of contract award and renewal.
Challenge and Opportunity
Procurement is the first line of defense for responsible AI in the public sector
Governments adopt AI to save money and improve efficiency. But poorly written contracts can hard-code opacity, vendor lock-in, and weak accountability for years or decades. They also waste scarce public resources in ways that are difficult to unwind. According to our analysis of the Electronic Privacy Information Center (EPIC)’s dataset of more than 600 state contracts (2023), the median contract value is approximately $1 million.
Although procurement may sound like a technical or unfamiliar term to many, it is not merely an administrative function. It is a core governance tool. Anyone who cares about how technology is used in government should care about procurement, because it is how technology enters government. Procurement is the first line of defense for ensuring responsible AI in the public sector. Most AI policy debates focus downstream on regulation, but some of the most consequential decisions are made upstream in contracts. Legislation and regulation of AI can be difficult, especially at the state level. AI procurement promises to be a potent tool for security, transparency, fairness, and accountability, not just compliance and cost containment.
Standard procurement evaluates cost, vendor qualifications, and compliance with existing regulations, but governments typically lack the capacity to assess algorithmic risk. As a result, AI-specific considerations rarely enter the process. For example, agencies may not ask about bias testing, government access to training data, or requirements for vendors to disclose how the model makes decisions. A joint National Association of State Procurement Officials (NASPO) and National Association of State Chief Information Officers (NASCIO) report recommended that states prioritize bias mitigation, transparency, and accountability in AI procurement.
There is a growing race between technological change and government capacity
State and local governments are rapidly procuring AI systems, with EPIC documenting more than 600 such contracts in 2023 and our analysis identifying over 1,000 in California, Utah, and Florida alone. Governments are acquiring AI through both standalone procurements and renewals of broader technology contracts that now embed AI features. In both cases, procurement capacity has not kept pace with technical complexity, leaving many agencies ill-equipped to evaluate performance, negotiate price and scope, and ensure these tools are used effectively and responsibly.
Cooperative procurement can save time and resources, but it can also concentrate risk by locking many jurisdictions into the same contractual terms
Because procurement takes time and resources, governments often rely on cooperative purchasing agreements (arrangements in which one state competitively bids and negotiates a contract that other states and local governments can adopt without rerunning the procurement process) to buy goods and services together and reduce administrative costs. The National Association of State Procurement Officials (NASPO) is often the institutional vehicle for this process. It was founded in 1944 during World War II, following President Franklin D. Roosevelt’s signing of the Surplus War Property Disposal Act. In the EPIC dataset, more than 4 out of 5 state AI contracts were negotiated through the NASPO ValuePoint platform (NASPO’s flagship cooperative contract program). Cooperative procurement can increase bargaining power and reduce administrative costs for participating states. Yet it also makes the initial contract especially consequential, as boilerplate language often becomes the template for all participating jurisdictions.
In our ongoing research, we analyzed AI contracts from three states—Utah (which initiated many NASPO agreements), California, and Florida—classifying 3,771 individual contract provisions across 215 contracts.
We found that 77% of provisions are standard boilerplate, such as force majeure and indemnification clauses. Transparency provisions (audit rights, reporting obligations) are the most common substantive category at 5.3%. Cybersecurity provisions (data encryption, breach notification, access controls) account for 3.0%, and fairness and accountability provisions (non-discrimination, bias testing, algorithmic accountability) are about 2.4%.
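The arithmetic behind these shares can be checked with a short sketch. It assumes the percentages apply to all 3,771 classified provisions; the category keys are illustrative labels, and the counts are only approximations because the reported percentages are rounded.

```python
TOTAL_PROVISIONS = 3771  # provisions classified in the analysis described above

# Reported shares of all provisions, in percent (keys are illustrative labels).
shares = {
    "boilerplate": 77.0,
    "transparency": 5.3,
    "cybersecurity": 3.0,
    "fairness_accountability": 2.4,
}

# Share of provisions not covered by the four reported categories.
other_share = round(100.0 - sum(shares.values()), 1)

# Approximate counts implied by the rounded percentages.
approx_counts = {
    name: round(TOTAL_PROVISIONS * pct / 100) for name, pct in shares.items()
}

print(f"Not broken down in the text: {other_share}%")
for name, count in approx_counts.items():
    print(f"{name}: about {count} provisions")
```

Under these assumptions, the three substantive categories together cover roughly one provision in ten, and a further share of provisions falls outside the four categories reported here.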
Long-term contracts are often poorly suited to rapidly evolving technologies and governance norms
Contract terms may also be lengthy. In the EPIC data, the average contract length was seven years. Some contracts even span a decade. When governments experience a failed AI implementation, they often respond by signing longer, not shorter, contracts. In the aftermath of failure, agencies may turn to more established vendors that appear credible and reliable, even if they are more expensive.
In 2013, Michigan’s Unemployment Insurance Agency entered into a $47 million contract with Fast Enterprises to design and run the Michigan Integrated Data Automated System, or MiDAS. The system incorporated algorithm-based fraud detection tools. From 2013 to 2015, MiDAS wrongly accused more than 34,000 unemployed individuals of fraud. In 2022, the state replaced it with the Deloitte-developed Unemployment Framework for Automated Claim and Tax Services, known as uFACTS. It is projected to cost about $78 million over a 10-year contract. Throughout this fiasco, little attention was paid to how the original contract was negotiated and structured. Nor was there meaningful scrutiny of whether procurement practices improved when the state later signed an even larger contract with Deloitte.
Critically, neither the original $47 million MiDAS contract nor the replacement $78 million uFACTS agreement included meaningful provisions for algorithmic transparency, bias testing, or independent performance auditing, precisely the types of clauses that could have flagged the system’s 93% false-positive rate before it devastated tens of thousands of families. The MiDAS debacle cost the state over $125 million across two contracts, falsely accused 40,000 residents, and resulted in a $20 million class-action settlement. In short, the absence of responsible AI contract provisions creates real-world harm.
Locking in AI governance decisions for years, or even a decade, leaves little room to adapt. It places states and local governments in a vulnerable position, as the underlying models and risks can evolve dramatically within just a few years. Once a contract is signed, the window for negotiating transparency, fairness, or accountability provisions largely closes. Revisiting core terms mid-contract is costly and legally complex, which means the initial procurement decision effectively sets the governance framework for the system’s entire operational life.
Vendor lock-in compounds these risks. Once an AI system is deployed under a long-term contract, governments may lose meaningful control over the data the system processes. Vendors may retain proprietary rights over training data, model architectures, or performance analytics, making it difficult for the government to audit system behavior or switch providers. When institutional knowledge becomes embedded in vendor-controlled platforms—as happened when Arkansas could not explain the details of a model used to determine Medicaid benefits—the dependency becomes nearly irreversible. In Idaho, a state agency refused to disclose its benefits allocation formula, claiming it was a vendor trade secret, effectively shielding a public decision-making system from public accountability.
Contracts are an underutilized policy lever
Although state governments rarely include responsible AI provisions in their contracts, these clauses represent an important policy lever. Based on the EPIC data, all 50 states, as well as DC and Guam, have entered into AI-related contracts.
Contract language is often a relatively low friction and politically viable tool that can generate concrete governance benefits without requiring new AI legislation. Moreover, vendors tend to be repeat players, with companies such as Deloitte, Accenture, and Pondera providing various types of government technology. This fact creates opportunities to negotiate principles across various AI products. Clearer contract language standards also benefit smaller companies and new entrants by demystifying expectations and lowering the barrier for bidders that lack dedicated government affairs teams.
Nonetheless, a contract’s leverage is time sensitive. Once it is signed, the window of opportunity largely closes. Revisiting or unwinding core terms can be difficult and costly. Governments therefore need to use the negotiation process to exercise their purchasing power to reduce risk and strengthen transparency and accountability. The cost of failing to do so is substantial. These agreements are often sticky and are frequently reused as boilerplate language, allowing weaknesses to persist across agencies and over time.
What role do policy networks play in AI procurement reform?
There are growing AI communities within state and local governments that view procurement as an underutilized governance tool. The GovAI Coalition, launched by San José in 2023, has expanded to more than 3,000 members across 900 government agencies. The San José City Council has since approved the Coalition’s transition into an independent nonprofit organization. Within the coalition, procurement is one of the core committees, and vendors are not permitted to serve on it. There are also networks such as the National Association of State Chief Information Officers and the Beeck Center for Social Impact and Innovation’s State Chief Data Officers Network, where best practice sharing, information gathering, and coalition building are active. These networks enable state and local governments to use their collective purchasing power more strategically in their dealings with vendors.
Plan of Action
State governments have both the authority and the practical tools to strengthen AI procurement today. The following three recommendations can be implemented through existing procurement authority, without requiring new legislation, and draw on proven models already in use.
Recommendation 1. State procurement offices should adopt standardized responsible AI contract clauses aligned with the NIST AI Risk Management Framework.
AI procurement should not rely solely on traditional cost-benefit analysis, but also incorporate a systematic risk-benefit assessment. The EU’s AI Act, which entered into force in 2024, distinguishes between high- and low-risk AI systems and is accompanied by model contractual clauses tailored to different risk categories. In the U.S., the National Institute of Standards and Technology (NIST) has developed the AI Risk Management Framework (2023), a cross-sector tool to guide risk evaluation and mitigation. Aligning these risk assessment frameworks with standardized contract clauses would substantially improve responsible AI procurement practices across state and local governments, while also reducing administrative burdens. Even if adoption is not mandatory, such resources can encourage more proactive engagement with responsible AI provisions by lowering the cost of asking the right questions, identifying relevant information, and translating risk considerations into clear contractual language.
IEEE Standard 3119-2025, an international standard specifically for AI procurement, provides a ready-made framework covering problem definition, solicitation, vendor evaluation, and contract monitoring. A multi-state working group convened through NASPO—building on its existing collaboration with NASCIO on AI procurement—could adapt these standards into model contract clauses within 12 months. At minimum, clauses should address: data governance and retention, algorithmic bias testing, explainability requirements for high-risk decisions, breach notification procedures specific to AI systems, and performance benchmarks with renewal contingencies. Canada’s Algorithmic Impact Assessment and the EU’s model contractual clauses for AI offer proven international templates.
Recommendation 2. States should implement risk-tiered AI procurement review processes, modeled on San José’s Digital Privacy Office approach.
The City of San José, located in the heart of Silicon Valley, has already adopted this risk analysis approach. When a city department submits a procurement request, the Digital Privacy Office assesses its risk level. If the system is deemed low risk, the request is approved without creating a backlog. If it is classified as high risk, the office conducts an impact assessment and requires the vendor to complete a structured AI FactSheet. This simple document helps government officials know what questions to ask and how to communicate with vendors about them. It covers training and test data, model characteristics, update procedures, performance metrics, and related information. These materials are then reviewed by cybersecurity and privacy teams, followed by testing and ongoing monitoring.
Source: City of San José website (2026)
This approach can be elevated to the state level by establishing a similar risk analysis procedure within the procurement process. The Colorado Office of Information Technology (OIT) already uses a NIST-based risk assessment framework to evaluate all generative AI use cases and ensure that procurement complies with state law and data security requirements, providing a state-level proof of concept.
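The tiered workflow described above can be sketched as a simple triage routine. This is a minimal illustration, not San José’s or Colorado’s actual procedure: the two screening questions and every name in the code are hypothetical assumptions, and a real office would define its own risk criteria.

```python
from dataclasses import dataclass
from enum import Enum


class RiskTier(Enum):
    LOW = "low"
    HIGH = "high"


@dataclass
class ProcurementRequest:
    system_name: str
    # Hypothetical screening questions; a real office would define its own.
    informs_decisions_about_individuals: bool
    processes_sensitive_data: bool


def classify(request: ProcurementRequest) -> RiskTier:
    """Assumed rule: a 'yes' on either screening question means high risk."""
    if request.informs_decisions_about_individuals or request.processes_sensitive_data:
        return RiskTier.HIGH
    return RiskTier.LOW


def review_steps(request: ProcurementRequest) -> list[str]:
    """Return the review steps named in the text for the request's tier."""
    if classify(request) is RiskTier.LOW:
        # Low-risk requests are approved without further review.
        return ["approve"]
    return [
        "impact assessment",
        "vendor AI fact sheet",
        "cybersecurity and privacy review",
        "testing",
        "ongoing monitoring",
    ]


if __name__ == "__main__":
    print(review_steps(ProcurementRequest("meeting transcription tool", False, False)))
    print(review_steps(ProcurementRequest("benefits eligibility scorer", True, True)))
```

The point of the design is that the costly steps run only for the high-risk tier, so most requests clear quickly while the systems most likely to affect residents receive the full review.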
States with existing AI governance infrastructure are natural pilots. California’s Governor issued an executive order in 2023 directing the development of AI procurement guidelines, and the state has since published purchasing rules for generative AI. Colorado’s AI Act (SB 24-205) already requires reasonable care for high-risk AI systems. These states, alongside jurisdictions active in the GovAI Coalition, could pilot risk-tiered review processes within existing procurement office budgets. San José’s Digital Privacy Office operates within the city’s IT department without a dedicated budget line, demonstrating that this model can be implemented by designating existing staff rather than creating new offices. NASCIO has also made AI governance a top priority for 2026.
Recommendation 3. State governments should require AI vendors to complete structured AI fact sheets as a condition of contract award and renewal.
One relatively easy reform to implement is to adopt shorter-term contracts with built-in opportunities for revision or modification after a clearly defined period of use and evaluation. This recommendation aligns with the call by Lewis and Pahlka (2025) to avoid rigid procurement cycles and embrace more modular, outcome-driven buys. Renewal should be contingent on demonstrated performance. The guiding principle is simple: no test, no renewal. As part of contract negotiations, vendors should be required to provide an AI fact sheet and update it as needed. No high-risk, high-impact, high-stakes AI system should be launched or renewed without appropriate testing and ongoing monitoring.
The AI fact sheet can serve as a condition of contract award and renewal. It should function as a “nutrition label” for government AI systems, modeled on San José’s vendor-facing template and inspired by IBM Research’s AI FactSheets 360. At minimum, the template should capture: training data provenance and representativeness, model performance metrics and known limitations, bias audit results across protected classes, update and versioning procedures, data retention and deletion policies, and human oversight mechanisms. Fact sheets should be updated whenever the model is retrained or its scope of use changes, and must be submitted as a condition of both initial contract award and each renewal cycle. New York City’s Local Law 144 demonstrates that mandatory AI disclosure requirements are implementable, though its enforcement challenges underscore the importance of tying disclosure to the procurement process itself, where the government has direct leverage, rather than relying solely on post-deployment regulation.
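One way to picture the minimum template and the “no test, no renewal” principle is as a structured record with a completeness check. The sketch below is illustrative only: the field names are assumptions that mirror the minimum items listed above, not San José’s or IBM’s actual schema, and the renewal rule is a simplified stand-in for whatever testing criteria a state adopts.

```python
from dataclasses import dataclass, fields


@dataclass
class AIFactSheet:
    # Illustrative field names mirroring the minimum items listed in the text.
    training_data_provenance: str = ""
    model_performance_and_limitations: str = ""
    bias_audit_results: str = ""
    update_and_versioning_procedures: str = ""
    data_retention_and_deletion: str = ""
    human_oversight_mechanisms: str = ""


def missing_sections(sheet: AIFactSheet) -> list[str]:
    """Return the names of sections the vendor left blank."""
    return [f.name for f in fields(sheet) if not getattr(sheet, f.name).strip()]


def eligible_for_renewal(sheet: AIFactSheet, passed_testing: bool) -> bool:
    """Assumed rule: renewal needs a complete fact sheet and passed testing."""
    return passed_testing and not missing_sections(sheet)


if __name__ == "__main__":
    draft = AIFactSheet(training_data_provenance="State claims data, 2018 to 2022")
    print(missing_sections(draft))
    print(eligible_for_renewal(draft, passed_testing=True))
```

Treating the fact sheet as structured data rather than free text would let a procurement office check completeness automatically at award and at each renewal, leaving staff time for judging the substance of what vendors report.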
There is a role for the federal government
The federal government can also reinforce and scale these organic, though still scattered, reform efforts. The AI in Government Act of 2020 and Office of Management and Budget Memorandum M-25-21 offer a federal-level template that states can adapt to their own procurement contexts. Perhaps the most effective thing the federal government can do in this space is avoid preempting state efforts to innovate. Recent legislation and executive orders, including proposed moratoriums on state AI rulemaking advanced in federal budget and regulatory packages, have attempted to create regulatory ceilings on state efforts. Such efforts could prematurely stunt useful state innovation.
Conclusion
Procurement is how technology, including AI, enters government. It is the first line of defense for responsible AI in the public sector. When procurement fails, the downstream consequences can be significant and long-lasting. AI procurement is not a narrow technical issue. It is the mechanism through which governments quietly govern AI at scale. Strengthening procurement today will shape AI outcomes for decades. By adopting standardized contract clauses, risk-tiered review processes, and mandatory vendor fact sheets, state governments can use their existing procurement authority to build transparency, fairness, and accountability into AI systems from the outset.
When a state agency needs an AI system, it follows one of three paths: issuing a competitive request for proposals (RFP), using an exemption (for emergencies or sole-source purchases), or purchasing through a cooperative agreement like those administered by NASPO ValuePoint, where a single “lead state” negotiates terms that dozens of other states can adopt. In competitive bidding, agencies define the problem, draft an RFP specifying scope and terms, evaluate vendor bids on cost and technical merit, negotiate final contract terms, and monitor vendor performance through the contract’s life. However, as EPIC’s report documents, many AI systems enter government through cooperative purchasing agreements or emergency exemptions that bypass competitive bidding entirely — meaning AI-specific considerations like bias testing and data governance never get evaluated. EPIC identified 621 AI contracts across all 50 states, finding that the top ten vendors alone accounted for over $715 million in potential contract value.
Cooperative procurement allows multiple government entities to purchase goods and services under a single contract, reducing administrative costs and increasing bargaining power. The National Association of State Procurement Officials (NASPO) facilitates this through the ValuePoint platform. In the EPIC dataset, more than 4 out of 5 state AI contracts were negotiated through NASPO ValuePoint. While this efficiency is valuable, it means a single contract’s terms—including any gaps in AI governance provisions—can propagate across dozens of jurisdictions.
Once an AI system is deployed under a long-term contract, governments may lose meaningful control over the data the system processes and the decisions it produces. Vendors may retain proprietary rights over training data, model architectures, or performance analytics, making it difficult for the government to audit system behavior or switch providers. Over time, institutional knowledge becomes embedded in vendor-controlled platforms — staff learn the vendor’s system rather than the underlying process, and the data needed to transition to a new provider may not be readily exportable. These dynamics create high switching costs and reduce the government’s bargaining power at renewal. Shorter contract terms with performance-contingent renewal clauses (Recommendation 3) help mitigate these risks by preserving the government’s ability to reassess and, if necessary, change course.
Risk-tiered review ensures low-risk AI systems are approved quickly: San José’s model only triggers full review for high-risk systems, avoiding bottlenecks. Standardized contract clauses and fact sheet templates actually reduce negotiation time by providing ready-made language that procurement officers can adopt rather than draft from scratch. Also, the cost of upfront review is far less than the cost of failure downstream, and cooperative procurement means the review investment is shared across participating jurisdictions.
Several federal frameworks support the recommendations in this memo. The AI in Government Act of 2020 established requirements for federal AI governance. OMB Memorandum M-25-21 emphasizes structured governance, accountability, and public trust in federal AI use. The NIST AI Risk Management Framework provides a cross-sector tool for risk evaluation. While procurement is primarily a state and local function, federal guidance can reinforce state-level reforms by encouraging contract transparency and model standards.
The OIT AI governance framework was implemented by designating existing staff rather than creating a new office. A NASPO-convened working group could develop model contract clauses once for shared use across all member states, amortizing development costs across dozens of jurisdictions. IEEE 3119-2025 provides a ready-made procurement framework that reduces the need for states to develop standards independently. The cost of inaction (failed AI deployments, legal liability, and harm to constituents) far exceeds the cost of reform. AI initiative failure rates in government settings reach 70-85%, and the federal government already spends 80% of its $100 billion IT budget maintaining legacy systems.
Finally, implementation costs should be understood not only as personnel expenses but also as internal coordination burdens created by fragmented procurement processes. Clear ownership across agencies is essential to manage these risks and ensure accountable, responsible AI procurement from start to finish.
Making Rural Communities Visible in Artificial Intelligence Through Rural Proofing in Kansas and Beyond
A road can show connection, but not access. Rural communities might appear in data and public systems, yet still remain invisible when AI systems do not reflect distance, transportation barriers, service gaps, workforce constraints, smaller data sets, and local strengths. Rural proofing gives Kansas and other rural states a practical way to make these realities visible in the AI-driven decisions already shaping health and social services.
Artificial intelligence (AI) is increasingly shaping decisions across public health systems, including how needs are identified, how resources are distributed, and how services are delivered. As a result, AI will play an important role in the future of healthy rural communities. When designed and governed carefully, AI can improve access, resource planning, coordination, and service delivery. When rural contexts are overlooked, AI systems can reproduce uneven outcomes and risk deepening existing disparities. In rural areas, where health systems often operate with fewer providers, thinner infrastructure, and less margin for error (meaning fewer backup resources when something goes wrong), these risks can be especially significant.
This memo examines rural invisibility in AI-related health systems, defined as the underrepresentation of rural communities in data, system design, validation, and governance. It explains why these gaps matter and why AI should be developed, tested, and governed with rural communities in mind. The term “rural” can be defined in a variety of ways, but this memo leans on the shared understanding of rural places as those with fewer people, less population density, and greater distance to services. While each rural community has a different history, strengths, resources and challenges, this memo – and the concept of “rural-proofing”, explained within – recognizes there are many shared challenges commonly faced by rural communities.
At both the national and state levels, there is an opportunity for more intentional action to recognize rural invisibility in AI systems as a policy issue. States can position themselves as proactive leaders in rural AI governance by aligning with federal frameworks while developing practical, state-level approaches. Kansas can become a leader in developing and implementing practical rural-proofing approaches that can serve as a model for other rural states. To do so, the state should take five connected steps: 1) make rural context a required part of any Kansas AI task force; 2) require rural proofing before agencies adopt or expand high-impact AI tools; 3) institutionalize rural listening through trusted local partners; 4) document the Kansas model as a public blueprint other states can adapt; and 5) build a statewide rural AI literacy framework for residents, students, frontline workers, and public agencies.
Challenges and Opportunities
Rural communities have strong social connectedness, local knowledge, community leadership, and deep relationships that support resilience and innovation. Yet, they often face lower population density, greater geographic dispersion, and more limited access to services and infrastructure. In these settings, AI decisions in one domain can quickly affect others, making locally grounded context and community-level oversight especially important. As AI adoption grows, its effects on rural communities reach well beyond any single tool or system. What’s at stake is broader: how rural needs are represented in data, who has a voice in how AI decisions are made and governed, and how the benefits and burdens of AI systems and infrastructure are distributed across communities. These dynamics raise important questions about whether AI systems adequately account for rural conditions, populations, and lived experiences.
Rural Invisibility in AI Systems
Rural invisibility in AI systems occurs when rural communities are underrepresented in the data, assumptions, design, validation, and governance that shape how systems are built and used. That can make rural needs harder to see and rural harms harder to detect. In practice, it means that AI systems may be built on assumptions that do not reflect rural realities, leaving rural communities overlooked in decisions about resources, services, and policy.
The body of evidence, including the 2025 scoping review, illustrates how this invisibility carries into practice. It highlights that rural AI research is underdeveloped and that models underperform in rural settings, and the consequences of those failures are rarely studied where they are felt most. As the 2025 National Rural Health Association policy brief notes, the challenge is not simply whether rural systems use AI, but whether technologies reflect the realities of fragmented records, thin staffing, and delayed care pathways. When those realities remain invisible in design and implementation, the consequences can include missed, delayed, or incorrect diagnosis, misallocation of resources, and greater strain on rural providers.
Gaps in AI Governance Frameworks
It is important to assess how well current governance approaches perform across different contexts. Current AI governance frameworks, including the National Institute of Standards and Technology (NIST) Artificial Intelligence Risk Management Framework and guidance from the Organization for Economic Co-operation and Development (OECD), provide a strong foundation by emphasizing fairness, transparency, accountability, and risk mitigation, but they provide limited guidance on how to operationalize these principles in rural environments. These frameworks often do not fully account for differences across settings. For example, communities and organizations vary in data availability, institutional capacity, and service infrastructure. They also differ in their ability to evaluate and govern AI tools, especially when staffing, technical expertise, and resources are uneven. Most frameworks do not require testing across small or geographically distinct populations, which can make it harder to see how AI performs in rural areas and allow disparities to go unnoticed.
In addition, current frameworks do not specify how local knowledge, professional judgment, or community perspectives, particularly those from rural communities, should be incorporated into AI oversight and decision-making, which can reinforce both algorithmic invisibility and broader forms of rural invisibility in AI. While they emphasize stakeholder engagement, they leave implementation largely undefined, which can limit the ability to identify context-specific risk. These gaps also matter because AI already shapes public benefits, legal navigation, housing, and service coordination. When trained on data shaped by past inequities, AI can deepen disparities rather than reduce them. This is why AI governance must move beyond general principles and explicitly incorporate rural proofing, accountability, and meaningful community involvement.
Federal policy remains an important lever because it can help push state policy forward by signaling priorities, shaping governance expectations, and giving states a stronger foundation for action. Current federal guidance provides a foundation for responsible AI use but offers more limited practical direction for rural settings, where sparse data, limited staffing, and fragmented service systems can affect how AI works in practice. Even though the recommendations in this memo focus primarily on actions at the state level, federal guidance on addressing rural invisibility in AI across health, education, and social systems can help create the conditions for states to act more effectively and equitably on behalf of rural communities.
The White House Office of Science and Technology Policy (OSTP) or the Domestic Policy Council (DPC) is well positioned to lead coordination across federal agencies, ensuring that rural AI implementation challenges are recognized in efforts affecting health, education, and social systems. Building on that coordination, the Office of Management and Budget (OMB) is well positioned to reinforce this work through its existing governance and procurement role to clarify how existing expectations for artificial intelligence procurement, validation, monitoring, oversight, and accountability apply in rural-serving settings. The Department of Health and Human Services (HHS), the Department of Agriculture (USDA), and the Department of Education (ED) should then help translate that guidance into practice for artificial intelligence systems and programs that directly affect rural communities. The National Institute of Standards and Technology (NIST) should provide supplemental examples showing how artificial intelligence risks can present differently in rural settings. This would strengthen implementation under existing frameworks without requiring the development of a separate framework.
Federal agencies should use existing programs to strengthen the rural data infrastructure, technical assistance, workforce readiness, and governance capacity needed for responsible AI implementation in rural communities. HHS, USDA, and ED can support rural-serving institutions directly, while NIST and other federal partners can provide tools, guidance, and practical examples to help organizations implement AI responsibly and effectively.
The Need for Rural Proofing
Rural proofing is the process of systematically checking whether policies, tools, and investments reflect rural realities, avoid unintended rural harms, and support fair outcomes for rural communities. In practice, it means asking early and explicitly how a policy or AI system will function in places with lower population density, greater distance from services, thinner infrastructure, smaller administrative capacity, and different patterns of need and service use.
When applied to AI, rural proofing makes rural conditions visible across system design, data, deployment, and oversight. This includes defining clear use cases, keeping communities involved in decisions about AI, explaining what the system does and does not do, and regularly reviewing whether it creates unequal results. It also means regularly reviewing system performance, checking for weak results in small or low-volume populations, documenting when rural data is limited, and being transparent about how those limitations affect outcomes. Rather than treating rural impact as an afterthought, rural-proofing makes rural context and rural strengths a core part of design, implementation, oversight, and evaluation. Within governance processes, it also helps ensure that policies and decisions are informed by rural needs, contexts, and strengths rather than assumptions developed elsewhere.
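One of the checks described above, reviewing results for small or low-volume populations and documenting when rural data is limited, can be illustrated with a short sketch. This is not a method prescribed by this memo or by any agency. The function names, the group labels, the minimum sample size of 30, and the 5-point accuracy gap are all hypothetical assumptions chosen for illustration; a real review would set them with program staff and rural partners.

```python
from dataclasses import dataclass


@dataclass
class GroupReport:
    group: str
    n: int               # number of records seen for this group
    accuracy: float      # share of records where prediction matched outcome
    small_sample: bool   # True when n is below the review threshold


def review_by_group(records, min_n=30):
    """Summarize accuracy per group from (group, predicted, actual) tuples.

    min_n is a hypothetical threshold below which results are treated
    as too thin to validate.
    """
    counts = {}
    for group, predicted, actual in records:
        total, correct = counts.get(group, (0, 0))
        counts[group] = (total + 1, correct + (predicted == actual))
    return {
        group: GroupReport(group, total, correct / total, total < min_n)
        for group, (total, correct) in counts.items()
    }


def rural_gap_flags(reports, reference="urban", max_gap=0.05):
    """Flag groups that lack data or trail the reference group.

    reference and max_gap are illustrative assumptions, not standards.
    """
    ref = reports[reference]
    flags = []
    for name, report in reports.items():
        if name == reference:
            continue
        if report.small_sample:
            # Too few records: document the limitation instead of
            # reporting an accuracy number that looks reliable.
            flags.append((name, "insufficient data for validation"))
        elif ref.accuracy - report.accuracy > max_gap:
            flags.append((name, "underperforms reference group"))
    return flags
```

The design choice worth noting is that a small group is flagged as lacking data before any accuracy comparison is made, so a handful of correct predictions in a low-volume county cannot be read as evidence that the tool works there.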
Because many rural systems operate with limited staff, tight budgets, and shared regional responsibilities, AI governance requirements must be practical. Federal and state agencies should give rural-serving organizations the time, funding, and support needed to review systems, raise concerns, and participate in oversight. They should also provide plain-language documentation so local leaders, frontline staff, and community members can understand how decisions are being made. Finally, rural proofing requires clear accountability. When AI systems cause harm or fail to work fairly in rural communities, agencies and vendors should have a clear process to identify the problem, respond to it, and fix it (see Figure 1).
Plan of Action
Addressing rural invisibility in AI algorithms and systems across health and social sectors requires coordinated national attention and action, including the integration of rural proofing into national AI governance efforts. Because national frameworks often serve as guidance for states, progress at the national level is needed to provide the standards, expectations, and resources that support states in adapting AI governance to their specific contexts. In the meantime, states can begin building their own pathways by aligning with existing frameworks, piloting approaches in priority areas, and strengthening internal capacity.
Kansas as a Blueprint
As one of the nation’s rural states, Kansas has a strong interest in ensuring that AI systems work effectively for rural communities. As AI becomes increasingly integrated into sectors that are important to rural Kansans, including health care, education, transportation, agriculture, emergency response, public benefits, and other public services, rural-proofing can help ensure that AI tools are responsive to rural contexts.
For Kansas, this could include leveraging existing rural health infrastructure, engaging local stakeholders, and testing practical approaches that can be scaled as clearer national direction emerges. The Centers for Medicare & Medicaid Services (CMS) Rural Health Transformation Program offers one practical pathway for aligning rural technology investment and technical assistance in Kansas with rural AI proofing principles. The Kansas Legislative Artificial Intelligence Task Force should explicitly include rural context as a defined part of its charge, membership, and workplan. The Kansas Office of Information Technology Services (OITS), the Information Technology Executive Council (ITEC), the Kansas Department of Health and Environment (KDHE), the Kansas Department for Aging and Disability Services (KDADS), and the Kansas Department for Children and Families (DCF) should work collectively to translate broad AI governance principles into practical oversight and implementation for rural health and social systems.
Furthermore, implementation of these recommendations can be staged based on current capacity, allowing agencies to begin with foundational actions and progressively build toward a more coordinated, statewide approach over time (see Figure 2).
Recommendation 1. Ensure The Kansas Legislative Artificial Intelligence Task Force and Any Future State-Level Task Forces Explicitly Include a Focus on Rural Context and Health
The Kansas Legislative Artificial Intelligence Task Force, given its role in shaping AI policy and direction, should explicitly include rural context as a defined part of its charge, membership, and workplan. The current task force already includes legislators, executive branch leadership, universities, health systems, agriculture, and private sector technology members. The task force’s scope could include reviewing AI use in rural contexts, incorporating rural and frontline voices into decisions around AI procurement and deployment, and issuing guidance on procurement, oversight, and accountability in rural health and social systems.
In practice, Kansas can build on the existing role of OITS by extending its coordination function to include AI-specific responsibilities, such as setting standards for evaluation, interoperability, and responsible use across agencies. ITEC can provide statewide governance direction by aligning AI efforts with broader IT strategy and policy. Service agencies, including KDHE, KDADS, and DCF, would implement these efforts within health and social systems. This structure gives Kansas a practical model that other states can adapt by pairing a statewide IT authority with the agencies that directly manage public benefits, care, and social services.
- Define Scope. State-level Kansas AI task forces, working groups, and advisory bodies should explicitly include AI use and governance in rural contexts as a core scope of work, embedding rural considerations directly into the charge, membership, and workplan rather than treating them as secondary or optional.
- Coordinate Governance. Cross-agency coordination should occur through OITS, with statewide governance direction set through ITEC, while agencies including KDHE, KDADS, and DCF should identify and document where AI affects health and social service access.
- Establish Structure. Kansas should establish a clear governance structure that translates national AI principles into practical oversight, procurement, and implementation decisions tailored to rural conditions.
Recommendation 2. Require Rural Proofing for AI Used in Kansas Health and Social Service Programs
AI-enabled tools are expanding across eligibility decisions, care coordination, analytics, and service delivery. Because of this, Kansas should strengthen AI governance within the agencies that directly shape health and social outcomes. In practice, this work should begin with KDHE, KDADS, and DCF with cross-agency coordination support from OITS. Rather than relying only on broad fairness principles, these agencies should use a practical rural-proofing process to assess whether AI tools work reliably in rural settings with different staffing levels, service access, broadband conditions, data volume, and administrative capacity. Taking these steps now would help Kansas clarify oversight responsibilities, procurement standards, and rural risk before AI becomes more deeply embedded in public systems.
- Inventory. KDHE, KDADS, and DCF should identify and document current and planned uses of automated decision tools and AI systems used in decision-making that affect eligibility, benefits, care access, case management, service coordination, navigation, and enforcement. Agencies should require vendors to inventory AI used in care management, prior authorization, utilization management, member outreach, and provider network decisions. Kansas can then expand this inventory approach to other agencies that shape health, including housing, transportation, workforce, education, justice, and environmental systems.
- Rural AI Proofing. Agencies should apply rural-proofing review before they procure, renew, expand, modify, or deploy high-impact AI systems. This review should assess whether the tool performs reliably in rural settings, whether rural data are sufficient for validation, whether lower service use is being misread as lower need, and whether the system creates added burdens in places with limited staff, broadband, transportation, or service infrastructure.
- Governance and Coordination. Kansas should establish a centralized cross-agency approach to AI governance to ensure consistency and avoid fragmented implementation across agencies. A coordinated structure, led jointly by OITS and state procurement, should define statewide oversight, reporting expectations, and minimum standards for the use of AI and automated decision tools.
- Strengthen and Formalize AI Governance. Within this structure, agencies should strengthen and formalize AI governance by assigning clear oversight responsibility, documenting risk management and vendor review practices, incorporating transparency and explainability requirements, and building internal and community-level AI literacy. Agencies should also implement consistent oversight and reporting expectations for vendor AI use, including requirements for rural proofing, audit, and review mechanisms. OITS, in coordination with state procurement, should provide cross-agency technical guidance, ensure consistency in procurement standards, and align AI use with state IT and data governance policies. Individual agencies should retain program-level oversight within their statutory authority while operating within this coordinated governance framework.
- Vendor Requirements. Require vendors to explain how their AI tools perform in rural settings, disclose known data and performance limits, identify human review points, and provide plain-language documentation on system purpose, intended use, and potential failure points.
Recommendation 3. Institutionalize Rural Listening through Trusted Intermediaries
Meaningful engagement with rural communities is especially important in this context because AI systems are often designed and evaluated far from the places where their effects will be felt. However, engagement alone is insufficient. This recommendation draws a deliberate distinction between consultation, where agencies ask communities what they think, and co-governance, where rural communities hold real influence over AI decisions that affect them. Kansas should aim for co-governance, not just input collection. In rural areas, where access to care, public services, transportation, broadband, and legal support may already be limited, even small design flaws or inaccurate assumptions can have outsized consequences. Regular listening with rural residents and trusted local partners can help surface needs, barriers, and unintended harms that may otherwise remain invisible in statewide decision-making.
- Support. OITS or the Governor’s Office should coordinate recurring AI listening sessions, with participation from KDHE, KDADS, and DCF, and with research and facilitation support provided by Kansas public universities.
- Leverage Partners. OITS, KDHE, KDADS, and DCF should engage Kansas public universities to support session design, facilitation, documentation, synthesis of findings, and evaluation to ensure structured feedback and continuity across sessions.
- Conduct Sessions. KDHE, KDADS, and DCF, in partnership with county and local public health agencies, behavioral health providers, university extension networks, libraries, legal aid and court self-help programs, and community-based organizations, should conduct sessions using a shared calendar and unified feedback process.
- Apply Findings. KDHE, KDADS, and DCF should use listening sessions as a rural-proofing mechanism to assess how AI-enabled tools in health, benefits, and social service programs affect access to care, legal protections, and social determinants of health in rural settings.
- Integrate Oversight. OITS, KDHE, KDADS, and DCF should integrate findings into state oversight by documenting recurring rural issues, flagging systems for review, and using insights to strengthen procurement, monitoring, and accountability for AI systems used in Kansas programs.
Recommendation 4. Establish a Kansas ‘Rural AI Health Governance Blueprint’ for Other Rural States to Replicate
Clear leadership at the state level matters because rural proofing is unlikely to be applied consistently if agencies and vendors are left to interpret it on their own. A statewide approach creates shared expectations, strengthens accountability, and makes clear that rural context should be built into procurement, oversight, and evaluation from the beginning. This approach is also replicable because it relies on documented processes, practical tools, review steps, and implementation lessons that other rural states can adapt to fit their own governance structures, service systems, and community conditions. The framework should also incorporate AI infrastructure impacts, including data center siting, to ensure rural-proofing standards address the distribution of resource, environmental, and land use burdens associated with AI development.
- Document Framework. OITS, in coordination with ITEC and participating agencies such as KDHE, KDADS, and DCF, should document Kansas’s rural AI governance framework in a public implementation guide, including rural-proofing standards, shared workflows, listening session models, and transparency practices.
- Assess and Evaluate. KDHE, KDADS, and DCF should partner with Kansas public universities to assess outcomes, identify lessons learned, and produce evidence-based recommendations to strengthen rural AI governance over time.
- Share and Scale. OITS and participating agencies should share implementation tools, templates, rural-proofing checklists, and model policies through interstate networks such as the National Governors Association, the National Conference of State Legislatures, and state rural health associations so other states can adapt the Kansas model.
- Pilot Collaboration. OITS, with support from Kansas public universities and partner agencies, should pilot cross-state collaboration with at least two predominantly rural states to test whether the Kansas workflow can transfer across different governance structures, agency arrangements, and service systems.
- Demonstrate Practice. OITS, state service agencies, and university partners should position Kansas as a practical demonstration state by showing how rural proofing can translate broad AI governance expectations into workable state practice that reflects rural conditions, administrative limits, and community realities.
Recommendation 5. Establish a Standardized and Contextualized Kansas Rural AI Health Literacy Framework
Kansas should complement the upstream AI governance framework with a statewide Rural Health AI Literacy Framework to ensure residents, students, and frontline workers can engage AI systems critically. Unlike general AI literacy, which often focuses on basic awareness of AI tools and digital skills, rural health AI literacy should prepare residents, students, frontline workers, and public institutions to understand how AI can shape health access, eligibility, referrals, triage, service coordination, and related decisions in rural communities. Governance structures alone are insufficient if communities lack shared standards for understanding how AI affects eligibility, health access, agriculture, transportation, and legal services in rural settings. The Kansas State Department of Education (KSDE), in coordination with the Kansas Board of Regents (KBOR) and the Kansas Office of Information Technology Services (OITS), should lead development of tiered, age-appropriate AI literacy competencies spanning K–12, postsecondary education, and public-sector roles.
To operationalize this framework, Kansas should:
- Integrate Curriculum. KSDE should integrate AI literacy into digital literacy, civics, agricultural education, and career and technical education standards, with support from Regional Education Service Centers for teacher training.
- Embed in Higher Education. KBOR should embed AI literacy modules into general education requirements and first-year seminars across public universities and community colleges, aligning with Higher Learning Commission expectations.
- Establish Rural Hubs. Kansas State University and Cooperative Extension should serve as rural AI literacy hubs delivering applied programming for health, agriculture, and local government sectors.
- Expand Community Delivery. State agencies and education partners should collaborate with public libraries, Tribal education departments, workforce development boards, and community-based organizations to deliver multilingual AI literacy programming statewide.
- Train Public Workforce. State agencies should implement baseline AI literacy training for employees in health, eligibility, and human service roles.
Conclusion
As AI becomes more embedded in public systems affecting health and social outcomes, it is important to account for rural context, particularly in Kansas, where many communities operate under conditions that differ from those assumed in typical AI development and deployment environments. These conditions include greater data sparsity, lower service density, and constrained institutional capacity for oversight. The proposed recommendations aim to operationalize responsible AI principles through coordinated cross-agency governance, integration of rural proofing into existing structures, and stronger community engagement in AI decision-making. By acting now, Kansas can build a more accountable model for rural AI governance and offer other rural states a practical path forward.
Rural health refers to the health outcomes, service access, and community conditions that shape well-being in rural communities. It includes access to healthcare, behavioral health, substance use treatment, prevention, workforce capacity, transportation, and the social determinants of health that affect whether rural residents can receive timely and appropriate care.
Common federal rural definitions include those developed by the U.S. Census Bureau, the Office of Management and Budget, and the U.S. Department of Agriculture Economic Research Service. The ideas, challenges, and recommendations presented here draw on, but are not limited by, common rural definitions used across public health and health care. While rurality exists on a spectrum, definitions often use some combination of population thresholds, population density, housing density, and proximity to dense urban areas to define levels of rurality and urbanicity.
State agencies should establish AI governance structures and policies, inventory current and planned AI use, assess whether tools are necessary and can function effectively in rural settings, document rural data limitations and oversight responsibilities, require vendor disclosure, and provide plain-language information about how systems work and how human review and oversight are incorporated into decision-making.
AI vendors should explain how their systems perform in rural settings, disclose known data and performance limitations, identify human review points, and provide plain-language documentation on system purpose, intended use, and conditions under which performance may vary.
Listening sessions help state agencies hear directly from rural residents, frontline workers, and local organizations about how AI affects access to care, benefits, legal navigation, and other services in practice. The memo recommends using those findings to improve procurement, monitoring, and accountability.
Public Participation IS the Ingenuity We Need
Building Blocks To Make Public Participation Solutions Work
Participation is not a distraction from governing — it is how government governs well. When treated as compliance, it comes too late and excludes those most affected, weakening legitimacy. Designed as a strategic asset, it builds trust, eases implementation, and supports more durable decisions.
Implications for democratic governance
- Participation is how decisions are made. Engagement should focus on clearly defined choices, constraints, and tradeoffs so decisions can move forward.
- Who participates, and when, shapes what government hears. Participation must reflect the full scope of impacts, not just the most visible or organized voices.
- Legitimacy depends on follow-through. Agencies should explain how input was considered so people can make sense of the outcome, even when they disagree.
Capacity needs
- Learn and adapt throughout the process. Track who is participating, what input is generated, and how it is used — then adjust the approach as needed.
- Engage early enough to matter — and continuously where possible. Bring people in while options are still open, and stay engaged so feedback informs both decisions and implementation.
- Structure input for decisions. Ask targeted questions and use formats that help compare impacts, priorities, and implementation considerations.
- Train for real-world engagement. Equip staff to facilitate conversations, synthesize input into key themes, and navigate disagreement.
- Make participation feasible. Offer enough lead time, plain-language materials, and multiple ways to engage (e.g., virtual, in-person, asynchronous) that account for different access needs.
Government, at its best, is democracy’s promise made real: the mechanism through which a society turns values into action and public voice into policy. But that mechanism has corroded. Rebuilding it starts with something simple — treating the public not as a problem to manage, but as a source of ingenuity government cannot function without.
Public participation, as the federal government executes it today, rarely builds trust: the public hearing held after decisions are already made, the comment period that produces thousands of responses with no visible impact, the listening session where officials take notes but never engage. The other version, where a highly organized few monopolize the public ear and distort the public response to serve their niche interests, is equally demoralizing. Critics are right to call out this failure. Across the political spectrum, there is a shared diagnosis: the current system of public engagement too often functions as a series of veto points, rewarding obstruction over problem-solving and delay over delivery.
That erosion matters enormously right now. Americans have grown deeply skeptical that government can solve hard problems, and climate change, with its complexity and its demands on every sector of the economy, may be the hardest challenge it has ever faced. As the authors ask in the opening argument of the Center for Regulatory Ingenuity, if government can’t effectively address challenges it deems an “existential threat,” what good is it, and can democracy overcome this downward spiral of mutually reinforcing cynicism?
We believe it can. Today, we are living through a mid-transition moment in climate policy, in which the technologies we need exist, the economics are increasingly favorable, and many obstacles are governmental: slow processes, fragile coalitions, and policies that get built and then litigated into irrelevance. In that context, the instinct to streamline is understandable. But a government that privileges artificially weighted listening, or avoids listening because it didn’t plan well, doesn’t move faster. It moves blindly or with bias. It builds the wrong things, in the wrong places, for the wrong reasons, and then wonders why nothing sticks. The argument here is not for more or less participation but for better participation, treated as a strategic asset rather than a box to check, and designed with the same rigor as any other policy instrument. Done well, public engagement demonstrates that government can listen, adapt, and earn trust. At a moment when democratic institutions are fragile, that demonstration is not incidental to the work of governing; it’s central to it.
Climate change is not just a technical problem, it’s a governance problem, and increasingly, a democratic one. It reaches into every community, every economy, and every aspect of daily life, from how people power their homes to how they move through their cities to whether their communities remain livable at all. There is no technocratic solution that can bypass the public. The choices climate demands (where to build, what to prioritize, who bears costs, and who benefits) are fundamentally democratic choices. If democratic government cannot make those choices in ways that are effective and legitimate, it will not just fail on climate. It will fail at its most essential purpose: helping people shape the conditions of their shared future.
Definitions and the Role in Governance
Before making the case for better participation, it helps to be precise about what we mean in a space where a range of practices often get conflated.
In January 2025, the White House Office of Management and Budget issued a memorandum that the authors of this paper helped develop, laying out a federal framework for broadening public participation and community engagement. The memo itself was built through the practices it recommends: a public request for information drew input from hundreds of participants and nearly 300 written comments, documented in a public summary that informed the final guidance. It offers definitions worth building on. Public participation is any process that engages the public in government decision-making — helping shape policies, regulations, or research, or soliciting new ideas and innovations. It is inherently transactional: it seeks to inform and obtain input from those interested in or affected by agency action. Community engagement, by contrast, is primarily relational — the consistent building of relationships with communities over time, informed by the history those communities have with an agency, and transparent about the real opportunities and real limitations of that relationship.
Public participation without community engagement can produce processes that feel hollow — technically open, but not genuinely accessible to the people most affected. Community engagement without public participation can build trust and goodwill that never translates into actual influence over decisions. Together, they reinforce each other: relationships make meaningful input possible, and meaningful input, when reflected in decisions, strengthens those relationships over time. But they are not interchangeable, and not always needed in equal measure. One of the memo’s most important contributions is helping government actors think more carefully about which tool fits which moment; a public comment period serves a different purpose than a years-long relationship with a frontline community, and conflating them produces both bad process and bad outcomes. Resources like EPA’s Public Involvement Spectrum make this concrete, mapping participation options from basic outreach through information exchange, recommendations, agreements, and stakeholder action, each carrying a different promise to the public and requiring a different level of agency commitment and design.
The OMB memo does not add new mandates: federal statutes from the Administrative Procedure Act to the National Environmental Policy Act already require public participation across a wide range of agency functions. Instead, it strengthens and clarifies what good practice looks like within those existing requirements while making that guidance accessible to anyone in government who wants to apply it beyond those requirements.
That the memo exists at all reflects a recognition that the statutory floor was never enough and democratic practice does not end on election day. The case for meaningful participation rests on the idea that government derives its legitimacy from the people it serves, and that legitimacy has to be earned continuously.
Why Engagement Often Fails
If the case for participation is so strong, why does it so often fail to deliver? Not because the theory is wrong, but because the dominant models were built for a different era, and even well-intentioned efforts, when filtered through broken processes, can deepen the cynicism they were meant to address.
This matters because the damage compounds. Every bad process reinforces the conviction that participation is theater (or illegitimate, or corruption sanitized) and makes it harder to do the real thing next time. These failures matter in any policy domain, but for climate they are something closer to existential, not just for policy outcomes but for faith in democratic governance itself.
The Design Failures Are Structural, Not Incidental
The most common problems with public engagement are baked into the dominant models. Traditional notice-and-comment processes, for instance, were designed for transparency, judicial review, and weighting technical and legal expertise over lived experience — and they achieve those aims in a narrow procedural sense. But transparency is not the same as accessibility, and access is not the same as influence. As Nicholas Bagley has argued, procedural rules now actively exacerbate the very problems they were designed to solve. Notice-and-comment tends to reward the organized, the well-resourced, or the professionally represented, producing voluminous records that reflect the priorities of those who could afford to engage, not necessarily the communities most affected by the decision. A single well-funded group can submit thousands of pages of technical comments; a frontline community facing the same decision may have no idea the comment period exists, and no capacity to respond to it even if they did.
Formal public hearings share similar pathologies. They are designed to create a record, not a dialogue, and they tend to produce exactly what their format invites. Participants deliver prepared statements into a microphone. Agency officials listen without responding. The atmosphere is frequently adversarial, structured in ways that entrench “us versus them” dynamics rather than creating any genuine opportunity for exchange, learning, or compromise. People leave feeling unheard not because their words weren’t recorded, but because nothing about the process suggested anyone was actually listening.
One-way communication runs across many engagement formats. As the IAP2 Public Participation Toolbox makes clear, even well-designed information-sharing tools — fact sheets, websites, press releases — have significant limitations when they substitute for genuine dialogue rather than supporting it. Listening sessions, informational webinars, and town halls designed primarily to transmit information leave the public as a passive audience (or unheard correspondent) rather than active participants. When people cannot ask questions, push back, or engage in real exchange with the decision-makers who affect their lives, the engagement reinforces rather than reduces the distance between government and community.
Timing compounds all of these design problems. Engagement that happens too late in the decision-making process (after the key choices have been made, after alternatives have narrowed, after political and financial commitments are in place) is unlikely to meaningfully influence outcomes no matter how well it is conducted. Collaboration works best when it begins early, when there is still genuine room to shape the purpose, alternatives, and design of a proposed action. By the time a formal public comment period opens on many major decisions, the substantive work is essentially done. Community input at that stage can affect the margins but rarely the fundamentals, and communities know it. When people show up, offer input, and watch nothing change, they stop showing up.
The scale and complexity of engagement materials create their own barrier. Technical documents running hundreds of pages, written in regulatory language accessible only to specialists, are not a neutral feature of the process — they are a filter. Effective strategic communication requires understanding who the audience is, what they already know, and how the issue connects to their lives, none of which is served by dense regulatory documents distributed through official channels. When the baseline requirement for participation is fluency in administrative law and agency-specific jargon, the people best positioned to engage are lawyers and lobbyists, not the residents of a community downstream from a proposed facility. Simplifying materials is not dumbing them down. It is recognizing that the expertise of affected communities is just as relevant as the expertise of credentialed professionals — and that accessing it requires meeting people where they are, not where the agency finds it convenient.
Perhaps most corrosively, some procedural tools originally designed to protect communities have been repurposed as obstruction. When participation requirements become veto points, and when the primary function of an engagement process is to create grounds for litigation rather than to genuinely improve decisions, they undermine both the efficiency that critics of government rightly demand and the democratic accountability that participation is supposed to deliver. Small, organized, well-resourced groups can exploit procedural requirements to delay or block decisions that broader communities support, effectively capturing processes intended for the many and wielding them on behalf of the few. This is the participation failure that most directly drives the “just build it” impulse, and it is a legitimate grievance. A system that was designed to give communities a voice has, in too many cases, been captured, producing neither democratic legitimacy nor efficient delivery. At the same time, when communities are left out of upstream planning, these veto points often become one of the only avenues available to influence decisions.
The Barriers to Inclusion Run Just as Deep
Even when processes are reasonably well-designed, they often fail to reach the communities that matter most. The gap between who participates and who is affected often follows the contours of existing inequality with uncomfortable precision.
Awareness is the first and most basic barrier. Many people simply don’t know that an opportunity to engage exists. Believe it or not, federal agency websites are not where most Americans spend their time, and the Federal Register is not how most communities learn about decisions that will affect them (even for experienced users). When engagement opportunities are announced through official channels that already skew toward the educated, the connected, and the English-proficient, the resulting participant pool reflects those skews. Compounding this is a lack of clarity about what participation is even for: many people who are aware of an opportunity don’t understand how their input could make a difference, or whether it ever has. That uncertainty is itself a barrier, and it is one that agencies rarely address directly.
The materials and communications that agencies use to invite and support participation often create their own exclusions. Technical documents written for specialists, notices distributed through unfamiliar channels, comment periods with deadlines that give working families no realistic time to respond, engagement formats that assume reliable broadband and digital literacy — none of these are neutral design choices. A sound public engagement plan starts by understanding audience needs and building authentic, reciprocal relationships — the opposite of defaulting to formats that are convenient for the agency. Each design choice narrows the pool of who can meaningfully participate, and the cumulative effect is systematic. When engagement materials are only available in English in communities where many residents speak other languages primarily, the process has already decided who counts. When comment periods close before community organizations have had time to mobilize their members, the timeline has already decided who counts. These are decisions that agencies make, often without fully recognizing them as decisions at all.
Physical access barriers operate similarly. Transportation costs, distance to venues, the inaccessibility of meeting spaces for people with disabilities, the difficulty of attending a weekday hearing while working multiple jobs — these are all practical obstacles that might seem minor in isolation but compound into systematic exclusion for communities that are often most directly exposed to the environmental and infrastructure decisions being made. Virtual engagement has opened some of these doors, but it has closed others: digital access gaps, limited bandwidth in rural and low-income communities, and the particular challenges of meaningful online participation for older adults or those with limited technology experience mean that remote options solve some access problems while creating new ones.
Trust may be the most intractable barrier of all, because it cannot be addressed by logistical improvements alone. Communities that have experienced past harms from government (broken promises, extractive processes, decisions that hurt them and were made without them) carry a rational skepticism about whether this time will be any different. That skepticism is not ignorance or apathy. It is an accurate reading of a track record. Overcoming it requires more than a well-designed meeting or a plain-language summary. It requires demonstrated consistency over time: showing up when there is no immediate decision to be made, acknowledging historical harms directly rather than implicitly, and following through on commitments in ways that are visible and verifiable. This is precisely what Hollie Gilman and Sabeel Rahman mean when they argue in Civic Power that meaningful participation isn’t ultimately about better meetings — it is about redistributing power so that those closest to the problem are genuinely part of the solution.
Privacy concerns add another dimension that is easy to overlook from a position of power or privilege. In communities with historically complex or adversarial relationships with government, such as those experiencing overpolicing, immigrant communities, or other underserved groups, the act of identifying oneself in public participation can seem risky. Showing up on the record or speaking at a public hearing can feel less like civic engagement and more like exposure. These are not irrational fears; they reflect lived experience and how government has used information, access, interests, and related levers against communities. Genuine participation in these contexts will not occur just by invitation. Agencies must actively and continuously address the conditions that make showing up feel unsafe (e.g., by using intermediaries, anonymous or aggregated input, or other protective measures) and acknowledge the prior harm.
One additional factor shapes who participates and whether their participation endures: organization. Even when agencies improve conditions for participation, durable community voice does not emerge automatically; it depends on collective capacity, especially in communities historically marginalized by race or class. Without organizational support, individuals often lack the connections, information, and trust to translate lived experience into effective engagement. Organization helps aggregate that experience, sustain engagement, and convert it into usable input for government. In its absence, participation remains fragmented, reinforcing the advantage of those already organized and resourced.
The Benefits of Effective Engagement as a Strategic Asset
In the early 2000s, a proposed bridge crossing the St. Croix River (a National Scenic Riverway on the Minnesota-Wisconsin border) had been stuck in gridlock for five decades. By the time serious planning kicked off in the 1980s and 1990s, the 1931 structure was already fracture-critical. Stakeholder groups and a disparate range of public institutions, representing sharply conflicting interests and mutual distrust, had reliably blocked every attempt to move forward.
What finally broke the logjam wasn’t a better technical study, a more powerful agency directive, a least-common-denominator solution, or a decision to simply ignore the interests of a set of stakeholders. It was a deliberate shift to structured collaboration, away from the usual method of talking at one another with concerns and with no means of finding points of alignment, resolution, or tradeoffs. Twenty-eight stakeholder groups shaped the bridge’s location and design, with a comprehensive mitigation package addressing the natural, social, and cultural impacts of the new bridge. Moreover,
Relationships and communication among the stakeholders improved remarkably during the problem-solving process. In the words of one stakeholder, “We were able to spend the time necessary to get over our natural inclination to not trust people from the other side. […] We had enough time and enough space to come to a conclusion that everybody could feel comfortable with.”
That outcome wasn’t the result of exhaustion, but of deliberate design and an emphasis on negotiation rather than government serving as an answering machine that never calls back.
Before making that case, it is worth naming what good engagement is not. Smart, well-designed participation is not a mechanism for local communities to veto decisions that serve broader public interests. When a neighborhood is asked whether it wants new apartments, or a community facing a new transmission line is simply asked whether it approves, the answer is predictably no, and designing processes that guarantee that outcome is not good participation. It is a design failure that produces exactly the NIMBY dynamics that rightly frustrate those who want to build. The problem is not that people oppose things. It is that poorly designed processes over-sample the most proximate opposition, structure engagement queries without a sense of purpose or audience, start with technicalities rather than longstanding trust, avoid the potential for early negotiation, or systematically exclude the regional, national, and future stakeholders who have just as much at stake in the outcome. These failures aren’t inevitable.
Good engagement produces better decisions.
Agencies have technical expertise, legal authority, and institutional knowledge, but they routinely lack on-the-ground understanding of how policies will actually land in specific communities, what tradeoffs matter most to the people affected, and what solutions might work that no one in a headquarters office has thought of yet. Meaningful participation fills those gaps. Well-designed engagement generates solutions that are more effective, empowers people from different backgrounds, and builds the local networks that make implementation actually work. It brings in lived experience that data alone cannot capture, surfaces local knowledge that improves policy design, and produces decisions that are more responsive to the full range of affected interests rather than the loudest or most proximate ones.
Research on structured deliberation consistently shows that when people are given good information and genuine opportunity to reason together — rather than just reacting to proposals that feel threatening — they reach more nuanced, durable conclusions than either polarized public opinion or top-down expert judgment alone produces. Deliberation, in this sense, is not just a democratic value. It is a technical tool for overcoming the polarization that makes hard policy decisions feel impossible. It is also, critically, a tool for helping people reason past their immediate self-interest toward a broader understanding of tradeoffs, which is precisely what is missing when participation processes sample only those with the most to lose from a particular change, rather than those with the most at stake in the outcome. Done well, it changes which voices dominate and acknowledges the role of power. That shift is the difference between a participation process that ratifies the preferences of whoever showed up and one that actually informs a decision. This is not an argument for endless input or process without limits. As James Goodwin argues in this collection, the most effective participation is targeted rather than open-ended, focused on the core disputes that actually need resolving, rather than generating voluminous records that obscure more than they illuminate. The goal is engagement that is both more inclusive and more purposeful: asking the right questions of the right people at the right time (which means going well beyond the immediate neighborhood to find the people whose lives will be shaped by a decision).
But asking the right questions requires doing the work before the room fills up. Good engagement doesn’t begin with an open-ended invitation to say whatever comes to mind. It begins with agencies doing enough homework (with experts and communities) to frame the problem clearly: what is actually being decided, what constraints are real, what tradeoffs exist, and where there is genuine room for public input to influence the outcome. That frame is one of mutual respect.
This is where standards matter, as explored later in this essay. The difference between engagement that produces insight and engagement that produces noise is largely a design question.
Good engagement builds the trust that makes government function.
Scholars have considered the consequences of low trust through many lenses, but the legitimacy of democracies relies on trust. Lower trust means less engagement with functions that government performs uniquely or drives, whether disaster response, weather warnings, federal benefits, security functions, public health, or independent data collection and analysis — and that lesser engagement means those functions work less well for everyone else. This has a cascading impact as democratic institutions weaken when government cannot, does not, or is not believed to deliver on expectations of its citizens.
When people experience decision-making as transparent, accessible, and genuinely responsive to their input, trust builds. When they show up and feel unheard, or never show up at all because no one made it possible, trust erodes, most quickly and most consequentially for those communities that can least afford to lose it. This isn’t incidental: trust is the medium through which every other benefit of engagement operates. When it erodes, the downstream consequences reach far beyond any single process, a pattern examined above in the discussion of why engagement so often fails.
While study after study shows that fewer than half of Americans trust the federal government, far fewer (21%) believe it listens to the public, and just 15% believe it is transparent, both according to a 2024 national survey. This underscores a deep perception that government is neither responsive nor accountable to the people it serves. Every engagement process is a small test of whether democracy is meaningful, especially when government is asking people to accept changes as visible and consequential as those required by climate policy.
Good engagement reduces conflict and can prevent litigation.
As Andy Gordon has argued at FAS, listening is a prerequisite for discovery and a requirement for success on any ambitious public goal where stakes are high. The ARPA-I national listening tour demonstrated this concretely: starting with questions rather than answers, and drawing on distributed expertise from every layer of the transportation system, produced an agenda-setting process that no small group behind closed doors could have replicated. The same principle holds for climate and environmental policy. The instinct to streamline participation (to save time, to avoid change, to avoid the NIMBY trap) often backfires in the most concrete terms. Decisions made without adequate input tend to generate opposition downstream, when it is far more costly to address: through litigation, organized resistance, implementation failures, and the kind of sustained community distrust that shadows projects for years. The veto-point problem, in other words, is not a feature of too much participation. It is what participation looks like when it arrives too late, is structured too adversarially, and samples too narrowly. Fix the design (broaden who is heard, start earlier, frame the problem honestly) and participation stops functioning as a veto mechanism and starts functioning as the evidence base that allows government to make hard calls with confidence. An agency that has genuinely sought broad, representative input is in a far stronger position to defend a difficult decision. When that engagement runs through organized, representative groups, it also enables more effective deal-making — where tradeoffs can be negotiated and outcomes reflect a broader, more representative set of community interests.
Collaborative approaches improve the quality of decision-making and increase public trust precisely because they bring the right stakeholders in early (engaging more upstream), before positions have hardened and before the public record has closed. Agreements get built earlier, key voices feel heard rather than steamrolled, impacts are better understood across not only the loudest voices but also those most widely affected, and mitigation is transparent. And with that, the potential for litigation drops, compliance improves, and implementation becomes something communities feel invested in rather than something done to them.
A Framework for Doing It Right
Moving with the public is how government earns trust and makes decisions people can understand, accept, and stand behind.
Participation doesn’t eliminate conflict, and isn’t meant to. The challenge is structuring disagreement so decisions can still be made — and hold up to scrutiny — by clarifying tradeoffs, surfacing impacts, and narrowing options. Climate policy makes this especially clear; decisions about energy, land use, and infrastructure must move quickly while navigating disagreement about costs, risks, and local impacts.
Participation must reflect not just who shows up, but the full set of people affected, including those whose interests are less visible but equally consequential. Engagement can overrepresent highly organized or locally affected groups, even when decisions carry broader regional or national benefits. Designing participation to reflect those broader impacts — and the perspectives not in the room — helps avoid decisions that are responsive but not well-balanced.
The OMB memo advances a shift from participation as a single event (a hearing or listening session) to a practice agencies must design intentionally, tailor to context, and improve over time. The memo’s five principles — Purposeful, Respectful, Transparent and Accountable, Accessible, and Learning-Focused — offer a practical framework aligned with the decision, the stakes, and the people affected.
At their best, these principles help government:
- reach the right people (not just the easiest to reach)
- ask questions tied to real choices and constraints (not just general opinions)
- use input to shape decisions and follow-through (not just record it)
The goal isn’t consensus; it’s decisions that can move forward. Climate urgency makes getting this right non-negotiable. This framework is designed for exactly the conditions climate policy creates: high stakes, contested choices, deep skepticism, and no margin for processes that consume time without building trust.
Five Guiding Principles for Meaningful Engagement
Is it purposeful?
Purposeful engagement starts with clarity. What decision is being made? What is open to input at this stage? What is constrained by law, budget, or timing? Who is affected, and when can input still affect outcomes? It also means asking participants to respond to specific, decision-relevant questions. Engagement begins early enough for communities to help shape options, not simply react to them.
Why it matters
Without a clear purpose tied to a decision, engagement captures reactions to incomplete information or comes after key choices are already set. When choices are complex, participants may rely on partial or misleading assumptions rather than the factors shaping them. Framing questions around real constraints and tradeoffs produces more informed, actionable input. Inviting input outside an agency’s authority or capacity can also overwhelm staff and create unmet expectations.
What it looks like
- Engage before major choices are locked in.
- Present a limited set of realistic options (e.g., 2–3 alternatives).
- Ask targeted questions tied to specific considerations (e.g., cost, siting, mitigation, local impacts), not general preferences (e.g., “do you support this project?”).
- Define what input is in scope, and how out-of-scope input will be routed (e.g., partners, interagency coordination, future tracking).
Is it respectful?
Respectful engagement treats communities as knowledgeable partners and acknowledges the costs of participation. It reduces barriers where possible and ensures participation is worthwhile and relevant.
Why it matters
When engagement feels extractive, one-sided, or not worth the time required, participation drops and trust erodes, especially in communities already bearing environmental and infrastructure burdens. This affects not just who participates, but the relevance of the input received. People are more likely to engage — and stay engaged — when the process is clearly connected to decisions, provides enough context, and is worth their effort.
What it looks like
- Partner with trusted intermediaries to co-design or co-host engagement.
- Provide support (e.g., stipends, childcare, travel reimbursement) where feasible (see, for example, community compensation guidelines from the Colorado Department of Human Services and Washington State Office of Equity).
- Set clear norms for dialogue and how input will be documented.
- Train staff in facilitation, cultural competence, and navigating disagreement.
- Explain how input will be considered by decision-makers.
Is it transparent and accountable?
Transparent and accountable engagement sets clear expectations about what is being decided, how public input will be used, and how decisions will be communicated.
Why it matters
People don’t need to agree with outcomes to see them as legitimate, but they do need to understand how decisions were made. Transparency clarifies how input connects to decisions — including what can and cannot change — and helps maintain trust even when alignment is difficult. Accountability comes from closing the loop: showing what was heard, and what followed. Not all input will change outcomes, but its role should be visible.
What it looks like
- Communicate decision criteria, constraints, and timelines up front.
- Explain how input will be used before engagement begins.
- Share examples of how prior feedback influenced agency decisions or plans.
- After the engagement phase, show how input was considered by decision-makers (e.g., agency response to comments, draft revisions).
Is it accessible?
Accessible engagement removes practical barriers and proactively invites participation from those most affected but least likely to show up by default.
Why it matters
Barriers determine who participates and whose perspectives are heard. When participation depends on time, resources, technical fluency, or familiarity with government, input skews toward those advantages and may not reflect the full scope of public impacts. As a result, decisions rely on a narrower — and potentially distorted — set of inputs. Broadening access improves both representation and the quality of information decisions rely on.
What it looks like
- Offer multiple ways to participate (e.g., in-person, virtual, evenings, weekends) and provide input (e.g., written, audio, mapping tools).
- Use accessibility-by-default design principles (e.g., plain language, compatibility with assistive technologies, screen reader-friendly materials).
- Include stakeholders beyond those most visible or organized (e.g., future residents, regional beneficiaries).
- Conduct outreach through trusted community channels (e.g., local organizations, faith groups, libraries, community centers).
- Provide captioning, translation, interpretation, and other accessibility supports as standard practice — not only upon request.
Is it learning-focused?
Learning-focused engagement is iterative and adaptive. Agencies assess whether engagement is reaching the right people, producing helpful input, and informing decisions — and adjust accordingly.
Why it matters
Without iteration, engagement repeats the same gaps in participation and input. Learning in real time allows agencies to adjust how engagement is designed and delivered. It also prevents wasted effort by enabling course correction, saving resources and community goodwill — essential in fast-moving environmental and infrastructure contexts.
What it looks like
- Collect participant feedback (e.g., on clarity, accessibility, usefulness).
- Conduct internal debriefs to identify what worked and what didn’t.
- Monitor who is participating and who is missing.
- Adjust outreach, format, or framing mid-process.
- Use clear measures to assess the effectiveness of participation.
Is this engagement tied to a decision that is still open — and are participants being asked to respond to clearly defined choices, constraints, or tradeoffs?
Is this engagement worth people’s time — and are participants equipped to provide informed, relevant input?
Is it clear how input will be used — and how participants will see how it was considered?
Are we reaching and enabling participation beyond those most visible or organized — and who may still be missing?
Are we using what we learn to adjust this process in real time — and to improve future engagement?
Matching Methods to the Moment
No single method fits every situation. Effective participation matches the approach to the decision, based on who is affected, what is open to input, and what participation is feasible. Different stages of the policy lifecycle call for different forms of engagement:
- Early stages: define problems and surface lived experience.
- Mid-stage decisions: compare options and test tradeoffs.
- Implementation: troubleshoot and adapt.
The five principles set the standard for how to engage. The next step is choosing methods that apply them — fitting the decision, the audience, and the agency’s constraints (e.g., timeline, resources, legal requirements). Frameworks like the IAP2 Spectrum of Participation, the T.I.E.R.S. Public Engagement Framework, and the National Coalition for Dialogue and Deliberation’s Engagement Streams Framework can help avoid a common trap: defaulting to the same level or type of participation regardless of context.
Effective participation isn’t about asking more people more questions. It’s about selecting approaches that produce usable input.
That alignment also applies to who is included. Decisions with regional or national consequences require engagement that includes those who will benefit or bear indirect impacts. For example, housing or transmission projects often draw input primarily from current residents, even when benefits accrue to future residents or regional users.
This is not about giving any group veto power, but ensuring decision-making reflects the full distribution of impacts and interests.
Evidence from the Field
Well-designed participation isn’t just a process — it’s a governance tool. Examples from environmental and infrastructure policy show it can inform decisions, improve design, clarify contested evidence, and build the capacity for better engagement over time.
Participation that changes decisions
A 2025 study analyzing 108 Environmental Impact Statements under the National Environmental Policy Act (NEPA) found that public comments led to substantive changes in agency decisions in the majority of cases:
- 62% involved meaningful changes to decisions,
- 64% modified project alternatives,
- 42% changed mitigation plans, and
- when preferred alternatives shifted, agencies directly credited public input as the reason.
While the study did not assess outcome quality, longstanding NEPA success stories suggest these changes often strengthen project design, e.g., by identifying overlooked impacts, informing mitigation strategies, or incorporating local knowledge into technical analysis.
Place-sensitive design
Infrastructure decisions highlight the importance of place-sensitive engagement because impacts vary by location, history, and lived experience.
The Bipartisan Policy Center’s examination of a U.S. Department of Energy-funded carbon storage demonstration project in Illinois shows how early, sustained engagement helped build understanding and trust around geologic carbon storage. Engagement began years before site selection and relied on trusted local experts, multiple outreach strategies, and two-way communication to familiarize communities with the technology and its potential impacts. These efforts contributed to broad-based support and community willingness to host the project, illustrating how early engagement can shape perceptions of risk and benefit and improve the conditions under which projects move forward.
Similarly, analysis by Acadia Center and Clean Air Task Force found that opposition and delays were reduced, and public support for infrastructure grew, when clean energy planners took local siting and environmental concerns seriously and equipped communities to participate meaningfully.
These examples underscore an important balance. Place-sensitive engagement works best when local input is considered alongside broader system needs, so place-based concerns inform — but do not override — decisions with wider public benefits.
Joint fact-finding and shared inquiry
In disputes involving scientific uncertainty and contested values, joint fact-finding — where agencies, experts, and stakeholders collaboratively define questions, gather evidence, and interpret findings — produces more credible, usable information. Rather than positioning agencies and communities as adversaries, these approaches shift focus from competing claims to shared inquiry, helping participants develop a common understanding of facts and tradeoffs.
Environmental dispute cases show that joint fact-finding can narrow disagreements and reduce mistrust even when consensus is not possible. In practice, these processes help participants clarify what is known, what remains uncertain, and where value-based disagreements persist — allowing decisions to move forward on a more transparent and informed basis.
Environmental justice case studies documented by the U.S. Environmental Protection Agency (EPA) likewise illustrate how collaborative inquiry can enhance participation and buy-in from affected communities. In several cases, involving community members directly in data collection and interpretation improved the relevance of findings, increased confidence in the results, and fostered more constructive dialogue between agencies and communities — strengthening both the substance of decisions and their implementation.
Community-led research and data governance
Community-based participatory research offers another pathway to stronger decisions by changing who controls the production of knowledge. Analysis from the Brookings Institution shows that when communities help set research priorities and interpret findings, the results better reflect local context and needs. Traditional research models often reflect externally defined agendas that lack community-specific knowledge, limiting their usefulness for decision-making.
Community-led approaches, by contrast, redistribute control over research and data governance, enabling communities to shape how information is generated and used. While joint fact-finding focuses on how agencies, experts, and stakeholders collaboratively interpret evidence in decision-making contexts, community-led research changes who sets the agenda in the first place. In practice, this can produce more relevant inputs for policy and planning, strengthen the connection between data and lived experience, and support ongoing partnerships that extend beyond a single engagement process.
Evaluation
Effective engagement improves through feedback. For example, the U.S. Army Corps of Engineers’ evaluation of public involvement in flood risk management pilots found that engagement tied to clear decision points and structured activities (e.g., working groups, facilitated discussions) strengthened agency capacity for public involvement and improved two-way dialogue with communities. Participating staff reported that these efforts helped teams better understand community concerns, identify information gaps, and structure engagement more systematically.
Internal debriefs and participant feedback informed adjustments across project phases, helping teams refine outreach, coordination, and how input is organized and applied.
These findings spotlight another benefit of well-designed engagement: not just contributing to individual decisions, but building the knowledge, relationships, and processes that make more informed and collaborative decision-making possible.
What These Examples Show
Taken together, these examples point to a clear pattern: engagement works best when tied to decisions still being shaped and structured to produce usable input from those affected. In these conditions, participation does more than gather input — it improves decisions and delivery.
As noted earlier, participation doesn’t remove disagreement. It makes it more manageable by clarifying what is known and uncertain, surfacing tradeoffs, and reflecting a more balanced set of perspectives. This better equips decision-makers to explain and defend their choices.
Building Capacity to Deliver
Well-designed participation doesn’t substitute for agency capacity — it sharpens it, especially at the state and local levels where timelines are tight, staff are limited, and decisions are high stakes.
Participation is often treated as an added burden on already stretched institutions. But when targeted and structured, it helps agencies use existing capacity more effectively by identifying concerns early and reducing downstream conflict, redesign, and delay. This isn’t just an equity argument; it’s a speed and delivery argument.
What matters isn’t whether agencies “have capacity,” but whether participation is designed to support decisions that can be explained and sustained.
Strengthening State Capacity
Next, let’s look at the specific elements necessary to improve public participation.
Leadership and Governance
Why it matters
Engagement succeeds when it is treated as core governance, not a communications add-on. When leaders treat participation as part of decision-making, it affects how processes are designed, how staff are incentivized, and how tradeoffs are handled.
What it looks like
- Designate engagement leads to coordinate across programs and align input with decisions that cut across issues or policy areas.
- Embed clear ownership within program teams.
- Build engagement milestones into project timelines.
- Have senior leaders review engagement summaries alongside legal, technical, and budget analyses.
- Reflect participation goals in agency strategic plans, implementation plans, and performance reviews.
Skills and Culture
Why it matters
Engagement failures are often organizational, not technical. Without the right skills and norms, staff may struggle to use public input or navigate conflict, and even well-intentioned engagement can break down.
What it looks like
- Train staff to interpret and weigh public input alongside technical and operational constraints.
- Synthesize input into themes and areas of agreement or tension so it can be compared and shared without revisiting individual comments.
- Design engagement plans across functions (policy, legal, technical, communications).
- Set expectations that engagement is part of policy development and implementation, not a parallel process.
- Reinforce these norms through staffing, timelines, and accountability.
Tools and Resources
Why it matters
Without the right tools and resources, engagement can generate more input than agencies can realistically analyze or respond to. The issue isn’t just volume, but whether input can be organized, interpreted, and applied to decisions.
What it looks like
- Use practical checklists and templates for planning, documentation, and follow-up.
- Use formats that produce comparable, decision-relevant input (e.g., facilitated discussions, guided prompts, prioritization exercises).
- Use analysis approaches that account for different types of input and perspectives (e.g., written comments vs. oral input, surveys vs. community discussions), not just the easiest responses to process.
- Establish pre-approved contracting mechanisms for facilitators, translators, and interpreters.
- Track commitments and follow-through internally (e.g., via dashboards).
Learning and Improvement Systems
Why it matters
Adaptation requires more than intent. Learning at the project level isn’t sufficient — without shared systems, agencies tend to apply the same approaches across teams and over time, regardless of effectiveness.
What it looks like
- Standardize internal reporting that connects public input to decisions and next steps (e.g., “what we heard / what we’re doing” summaries).
- Establish criteria for adjusting outreach or formats based on participation patterns (e.g., extend timelines if turnout is low, add targeted outreach to missing groups).
- Create shared repositories so lessons learned carry across teams and inform future projects.
- Incorporate outcome checks over time (e.g., whether engagement reduced conflict or improved implementation).
Strengthening Public Capacity
This challenge extends beyond agencies themselves. Participation design should account for both agency capacity and who can realistically participate. When engagement skews toward people who face fewer barriers to participation, it raises equity concerns and weakens the quality of information and problem-solving.
Reducing Participation Barriers
Why it matters
Reducing barriers isn’t about paying for feedback. It’s about making participation feasible, informed, and reflective of those most affected, including those who will experience the long-term impacts.
What it looks like
- Provide context in advance so participants can engage without technical expertise.
- Provide information in multiple languages and accessible formats.
- Share clear examples of useful input (e.g., specific impacts, leading practices).
- Present side-by-side comparisons of relevant options and tradeoffs (e.g., anticipated impacts, costs, timelines).
- Offer opportunities for the public to ask clarifying questions before providing input (e.g., Q&A sessions, office hours).
Equipping Communities to Engage
Why it matters
Meaningful engagement often requires skills, time, and capacity that some communities — especially smaller or under-resourced ones — may not have. Without intentional outreach and resourcing, agencies hear repeatedly from the same well-resourced groups. Just as important, without support for organized participation, engagement struggles to translate into representative voice, particularly for communities historically marginalized by race or class.
What it looks like
- Provide practical tools to support effective participation (e.g., how-to videos, comment templates, guides to agency processes).
- Share meeting materials early so communities have time to review, discuss, and respond.
- Provide small grants or stipends so trusted partners can convene discussions, synthesize input, and support representative participation.
- Designate clear agency points of contact to help communities navigate participation processes.
Developing Long-Term Relationships
Why it matters
Engagement is faster and more collaborative when relationships already exist. When agencies engage only during moments of controversy, participation is more likely to feel reactive and transactional.
What it looks like
- Maintain engagement beyond one-off interactions, including through light-touch channels (e.g., periodic newsletters, community listservs).
- Maintain continuity in points of contact through clear handoffs and shared records so relationships and institutional knowledge persist as staff change.
- Communicate regularly and follow through outside active decision processes.
- Acknowledge past harms or broken trust.
- Return to communities after implementation to share outcomes.
Return on Investment
Trust, as this paper has argued, is the load-bearing wall. Investing in state and public capacity builds it — so participation helps decisions progress rather than stall. Well-designed engagement enables agencies to use input and respond credibly. When communities can participate fully, decisions are better grounded, easier to explain, and more likely to hold.
Looking Ahead
As engagement tools evolve, the same principles apply. Digital participation, civic technology, and AI-assisted analysis are already being used to help governments reach more people and make sense of large volumes of input. But without deliberate design, they risk introducing new exclusions and harms, such as unequal access, privacy concerns, and bias.
Internationally, several examples illustrate both the promise and the tradeoffs. France used AI tools to analyze millions of contributions submitted during the Grand Débat National, grouping input by theme to make citizen insights more accessible to policymakers; the process highlighted tensions between scale and nuance, and raised questions about how input is aggregated and whose perspectives are preserved. The UK government uses an AI consultation analysis tool to analyze consultation responses at scale, reducing manual work and improving the speed and consistency of analysis. Its evaluation documents practical challenges, including the need for human review and concerns regarding bias, accuracy across groups, public trust, and transparency in interpretation.
These examples point to real potential: making participation more scalable, accessible, and usable in decision-making. But they also reinforce the same lesson as the rest of this paper — the impact of these tools depends less on the technology itself than on how they are designed and governed.
The path forward is not to adopt emerging tools wholesale, but to hold them to the same standards.
- Purposeful: Use tools to help answer specific questions or synthesize input, not simply because they are new or signal “modernization.”
- Respectful: Handle community data, stories, and participation with care, including clear consent and appropriate privacy protections, especially when considering AI tools governed by terms-of-use agreements.
- Transparent and Accountable: Explain how input is collected, analyzed, and used, especially when automation or AI is involved.
- Accessible: Design for inclusion rather than assume technology access or digital fluency.
- Learning-Focused: Assess whether tools improve engagement and decision-making, and adjust or retire tools that do not.
Used this way, technology doesn’t replace judgment or governance; it strengthens both.
As agencies adopt new tools, they can also build on existing human infrastructure — such as promotores and other community-based messengers — that has long been used in public health and service delivery to support outreach, build trust, and connect agencies with communities.
When participation is designed to clarify tradeoffs, surface real impacts, and support accountable decisions, it becomes what democratic governance needs most right now, and what this collection argues is still within reach: a government that listens well enough to be worth believing in.
Strengthening the Federal Cycle of Learning and Adaptation by Closing the Loops
The federal government has a feedback-loop problem.
Regularly generated information, including evidence, performance information, and qualitative insights from implementation, too often fails to shape decisions. Evidence may be reviewed without changing priorities; performance data may be tracked without clarifying what it informs; and implementation feedback may reach leadership without surfacing what works for whom and why, or suggesting next steps. The components of a cyclical learning system linking priorities, questions, evidence, decisions, and implementation information exist in theory and on paper, but the connective tissue that turns these components into a functioning cycle of learning and adjustment is lacking. Information and artifacts alone don’t necessarily facilitate learning and adaptation; strengthening federal feedback loops requires embedding translation and use into decision-making from the start.
This memo is not a case for new infrastructure. The Evidence Act, learning agendas, evaluation plans, performance frameworks, and customer experience authorities already exist; what they do not yet add up to is a learning system. This memo proposes turning the infrastructure we have into the learning system we need. It is addressed to the federal program leaders, policy officials, evaluation and evidence staff, performance officers, and strategic planning teams who already work inside that infrastructure and are best positioned to make it function as intended.
Challenge and Opportunity
The federal government already operates within a broad cycle of goal-setting, evidence generation, performance review, implementation, and reporting. On paper and in principle, this cycle should allow for learning, adjustment, and improvement to federal programs over time. In practice, however, agencies vary in how consistently they translate such information into planning, decision-making, or course-correction. Federal agencies have made progress in building and using evidence, but translating that information into timely operational or policy revisions remains uneven.
The core problem isn’t production; it’s translation, and the translation failure shows up as “so what” gaps on both sides of the information pipeline. On the input side, receivers of information are often left asking what they’re supposed to do with it; on the output side, a second question appears: “Is it my job to act on this, and if so, how?” Research findings are often too slow, too caveated, or too disconnected from immediate policy and management questions. Performance data may show quantitative changes in outputs, costs, or enrollment without revealing the mechanisms behind them or the practical implications for implementation, or cueing the design apparatus that could apply these insights. Feedback from frontline service providers and affected users might reach leadership mainly through quantitative indicators, dashboards, or status updates, which don’t always capture lived experience, causal explanation, or informed suggestions for course correction. Without named owners and defined next steps, even the most actionable information tends to circulate rather than convert.
Three gaps sit behind this pattern. First, a context gap: decision-makers often lack the full picture because qualitative indicators and customer experience research arrive separately from, or later than, quantitative evidence, leaving them with only a partial view of what’s working well or driving implementation problems. Second, an action gap: even with the full picture, it’s not always obvious which lever applies, on what timeline, or with what tradeoff. Third, an ownership gap: it’s often unclear who is responsible for translating any given signal into a decision, and this ambiguity means that insights can be observed without being acted on. Together, these three gaps leave evidence and feedback insufficiently integrated into decision-making routines.
The problem is also structural. Decision-makers face turnover, competing priorities, time limits, and management pressures, so evidence needs a more robust pathway to reach them. Without clear translation, trusted messengers, and defined or mandated use points, even the most relevant information can arrive too late, at the wrong moment, or in language too jargon-heavy to influence decisions, resulting in missed adaptation opportunities.
The federal government doesn’t need an entirely new learning architecture. It needs to make the one it already has more usable. Agencies can do this in three ways. First, they can build stronger translation functions by creating space for “knowledge brokers” (people or teams whose core function is to translate evidence into decision-relevant language and maintain the relationships that make the translation trusted). Second, they can incorporate the use of evidence, performance, and implementation feedback into policy and program work from the start. Third, they can create better pathways for implementation and lived-experience feedback to reach leadership in ways that resonate with them and support action.
Formal federal guidance envisions a closed “loop” linking goals, priority questions, evidence, and performance information, along with review, decision-making, implementation, reporting, and feedback. In practice, the loop often degrades at key handoffs: evidence-use capacity, coordination and integration, translation into action, ownership and execution, and upstream learning from outcomes. The most significant recurring bottleneck occurs during the transition from review and interpretation to decision and prioritization, where information is generated and reviewed but does not reliably translate into action.
Plan of Action
We need to shift from a system that collects information to a system that uses it.
Agencies should create or strengthen embedded translation functions that connect evidence, performance information, implementation experience, and policy levers at the moment decisions are being made. The key is to move from a dissemination model to a utilization model. Instead of “produce, disseminate, and hope for uptake”, agencies should do the following:
Recommendation 1. Designate a knowledge broker to facilitate regular decision briefs…
or routines that create structured opportunities to clarify what’s known, what’s uncertain, and what actions are available – beginning with a defined set of high-priority issues rather than every decision the agency needs to make.
This recommendation targets the translation-into-action bottleneck in Figure 1: the handoff from review and interpretation to decision and prioritization, which is the most significant recurring point of failure in the federal cycle of learning and adaptation. In practice, this means assigning the function to a role or small team housed in an existing performance, evaluation, strategy, or program office and requiring that group to support recurring decision points with short decision briefs. Those briefs should identify the decision; synthesize relevant evidence, performance trends, and implementation feedback; and specify available actions, tradeoffs, and owners.
For example, in the rollout of the FAFSA Simplification Act, a knowledge broker tied specifically to this initiative could have translated readiness indicators, beneficiary feedback, and information from financial aid administrators into decision-ready synthesis for the officials with decision authority who were attempting to course correct in real time. Instead, significant delays turned the rollout into a high-profile implementation issue.
Crucially, this function should start narrow. Rather than positioning a knowledge broker as an all-purpose translator for an agency’s full decision load, the initial portfolio could be scoped to a small set of priority issues. Starting narrow lets the broker establish the credibility and relationships that make translation trusted, and refine the routine and its outputs, before the portfolio scales. This gives agencies a defined mechanism for turning reviewed information into decisions rather than leaving that handoff informal. Within the federal government, the Office of Evaluation Sciences has modeled how an embedded team of evidence translators can work alongside program offices rather than from a silo.
Recommendation 2. Start policy initiatives and evaluation planning with a real question…
or decision, identify the user, specify the lever, and clarify – in advance – what different findings would imply. Incorporating this thinking upstream changes the role of evidence from a retrospective input to an operational tool.
This recommendation mainly addresses the translation into action bottleneck, and secondarily, the evidence-use capacity bottleneck. Agencies can operationalize this by building decision framing into existing learning agenda and evaluation planning processes, both of which are already required under the Evidence Act. Before an evidence product is commissioned or a performance indicator is selected, program offices can be required to answer four questions on the record: What decision is this for? Who will use it? What lever would change as a result? What finding would lead to what action?
As a hypothetical example, suppose USDA’s Food and Nutrition Service (FNS) wants to commission a new evaluation or analysis of SNAP redetermination churn (the pattern of households losing SNAP benefits at recertification and then re-enrolling, often for procedural reasons rather than eligibility). The four questions noted above can be answered on the record before the work begins. The decision is whether to issue new guidance to state agencies on recertification practices and what that guidance should encourage. The user is the FNS administrator and the relevant policy office, with state-level SNAP directors as the implementing audience. The lever is subregulatory guidance. The mapping of findings to actions would specify, in advance, which patterns or insights would trigger which response. This way, when the findings arrive, USDA wouldn’t be starting from scratch with the “what do we do with this” question; the decision architecture would already be in place.
Pre-specifying these conditions in a short decision-framing memo that travels with the work turns evidence from a retrospective deliverable into a tool scoped to a specific decision or policy window. The same logic extends to decision memo templates themselves, which can include standing prompts such as “what evidence informed this decision?” and “how will we learn about this in real time?”, so that utilization is built in.
Recommendation 3. Create pathways for easier access to mixed methods evidence and insights from lived experience.
Recommendation 3 targets the coordination and integration bottleneck and the upstream learning bottleneck: the gap where evaluation, performance, administrative, qualitative, and customer experience data move at different speeds, live in different places, and reach decision-makers as parallel streams. Insights from lived experience – what programs actually look and feel like to the people using them – are particularly likely to be separated from the information that reaches leadership, arriving as anecdotes, if they arrive at all. Because these problems are distinct, the recommendations can be broken down and addressed at the agency level.
Recommendation 4. Standardize decision-ready formats that consolidate quantitative and qualitative evidence.
Agencies should build standardized decision templates and briefs that present quantitative indicator-level data alongside narrative summaries of lived experience and implementation conditions, so decision-makers aren’t expected to synthesize across disparate sources on their own timelines. The resulting artifacts should be tied to recurring decision moments (budgeting, guidance revisions, program reauthorization) so that they can be used in real time.
In the FAFSA Simplification Act rollout, Department of Education leadership faced this problem: application data, technical readiness indicators, information from financial aid administrators, and user feedback existed separately and moved at different speeds. A standardized decision-ready format could have pulled those streams together, pairing completion trend data with brief narratives or exemplary quotes about what applicants and financial aid offices were actually encountering, rather than leaving leaders alone to assemble the picture in real time.
Recommendation 5. Actively use existing generic clearance mechanisms for rapid qualitative and user experience research.
Agencies should make use of standing generic clearance mechanisms that allow them to fast-track small qualitative and user experience studies (for example, up to 100 respondents, completed within a fixed time window) when unexpected findings need rapid explanation. This would allow agencies to run a tightly scoped evaluation in weeks rather than months, which is the operational timescale at which decisions frequently move. Without it, the qualitative evidence needed to explain a performance anomaly often arrives after the decision or policy window has closed.
For SNAP redetermination churn, this would let FNS turn around a short, scoped evaluation of why participants are dropping off at a specific step in the recertification cycle in weeks rather than months. The insights could then inform the next round of guidance rather than coming in after the fact.
Recommendation 6. Build customer-first indicators into existing federal reporting requirements.
Beneficiary and frontline experience should become part of the evidence base by default rather than by exception. Most federal programs already have reporting infrastructure, and layering in a modest set of customer-first indicators that use the existing infrastructure rather than building new information collection requirements ensures that user perspectives are consistently available as routine inputs.
Within the federal government, the customer experience and life experience work coordinated through OMB and performance.gov has demonstrated that lived experience can be collected and used at scale within existing authorities, which can be considered a foundation to build from rather than reinvent. At the state level, Minnesota’s Story Collective, housed within Minnesota Management and Budget (MN MMB), pairs administrative and performance data with qualitative, lived-experience narratives to give decision-makers a richer view of what their programs are actually producing.
These recommendations also address a common weakness in the federal system: evidence and performance information sit within the same broad ecosystem but move at different speeds, use different tools, and often reach different audiences. This is all the more reason to create a translation layer that can synthesize across them. Agencies need staff and routines that can connect evaluation, administrative data, performance indicators, qualitative input, and implementation realities into decision-relevant guidance. Without that connective tissue, agencies are left with parallel streams of information that don’t consistently converge at the point where action occurs.
Conclusion
The federal government already generates a great deal of information about what it’s doing and how it’s performing, but information isn’t the same as learning, and learning isn’t the same as adaptation. The gap between them is where the “so what” goes unanswered, and where the federal feedback loop breaks down.
Closing these loops doesn’t require new infrastructure or authority. It requires three shifts in how the existing system is used: designating knowledge brokers to carry translation across the handoff from review to decision-making, building decision framing into policy and evaluation work from the start so that evidence is scoped to the decisions it’s meant to inform, and creating pathways that move mixed-methods evidence and lived experience into decision-makers’ hands in formats and timeframes that match how decisions actually happen in the federal environment. Whether the question is about USDA responding to SNAP redetermination churn or the Department of Education learning from an application rollout in real time, the underlying pattern is the same: the signals exist, but the translation that turns signals into actionable insights doesn’t reliably happen.
If the government wants a system of learning and adaptation that improves results in real time, it has to treat translation, utilization, and adaptation as core functions of governance rather than as afterthoughts.