Algorithmic Accountability and the Litigation of Predictive Failures
Systemic Error in Automated Decision Systems
The public conversation about artificial intelligence has focused heavily on hallucinations, on systems fabricating case citations or inventing facts. That problem is real. But it is not the most consequential form of AI failure now reaching the courts. The more pervasive category is predictive error: systems that do not invent information but that get the answer wrong. An algorithm that denies a Medicare patient post-acute care based on a recovery timeline disconnected from clinical reality. A fraud-detection system that wrongly accuses approximately 40,000 people of unemployment insurance fraud, with a 93% false positive rate. A hiring tool that tells a deaf applicant to practise active listening.
These are not glitches. They are the products of systems designed to apply statistical averages to individual circumstances, and the litigation they have generated is redefining how courts allocate responsibility for automated decisions. This article examines predictive failures across healthcare, insurance, employment, public benefits, criminal justice, real estate and securities markets, identifies the legal doctrines emerging from the resulting litigation, and assesses the accountability frameworks that are beginning to take shape.
Healthcare and Insurance
The nH Predict Algorithm and Medicare Advantage
The litigation involving UnitedHealthcare and its nH Predict algorithm is the most prominent example of a predictive system producing systematically incorrect outcomes in healthcare. nH Predict was developed to forecast how long patients would require care in skilled nursing facilities or post-acute settings. The tool generated recovery timelines that were then used as the basis for coverage determinations.1,2
The problem was structural. nH Predict was calibrated to predict average length of stay, but it was applied as a functional ceiling on coverage. Internal investigations found that while the tool reduced average stays by 15% to 25%, it did so at the expense of accuracy.2 The plaintiffs allege that over 90% of patient claim denials generated by the algorithm were reversed when challenged through internal appeals or proceedings before federal administrative law judges. The court treated that allegation as sufficient at the pleading stage, and UnitedHealth disputes it.2 A reversal rate that high would not indicate a system identifying unnecessary care. It would indicate a system systematically under-predicting the care required by legitimate patients.
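The effect of treating an average as a ceiling can be illustrated with a minimal sketch. The patient figures below are hypothetical and the logic is not nH Predict's actual method, which is not public in this detail. The sketch only assumes that a predicted average stay is applied as a cap on covered days.

```python
from statistics import mean

# Hypothetical clinically required days of care for ten patients (not real data).
needed_days = [8, 10, 12, 14, 14, 16, 18, 20, 24, 44]

# An accurate prediction of the AVERAGE stay for this group.
predicted_average = mean(needed_days)  # 18

# Assumption: the average is applied as a ceiling on every patient's coverage.
covered_days = [min(days, predicted_average) for days in needed_days]

# Days of needed care left uncovered for each patient.
shortfall = [need - covered for need, covered in zip(needed_days, covered_days)]
under_covered = sum(1 for days in shortfall if days > 0)
average_covered = mean(covered_days)

print(f"Predicted average stay: {predicted_average} days")
print(f"Average covered stay under the cap: {average_covered} days")
print(f"Patients cut off before their needed care ended: {under_covered} of {len(needed_days)}")
print(f"Total uncovered days: {sum(shortfall)}")
```

In this toy data the prediction of the average is exactly right, yet the cap still cuts off every patient whose need exceeds it, and the patient with the longest need loses the most. The average covered stay falls while the shortfall is concentrated on the sickest patients.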
In Estate of Gene B. Lokken et al. v. UnitedHealth Group, Inc., decided by Judge John R. Tunheim in the District of Minnesota on 13 February 2025, the court allowed a putative class action to proceed by focusing on the insurer’s contractual representations.3 If an insurer represents to policyholders that decisions will be based on clinical judgment but instead relies on a flawed algorithm, the plaintiffs can sue for breach of contract and breach of the implied covenant of good faith. The ruling separates the coverage denial from the use of the inaccurate tool, creating a distinct pathway for algorithmic accountability in healthcare.
Range Compression in Personal Injury Claims
In casualty and liability insurance, the shift from human-led negotiation to software-driven settlement recommendations has produced what claimants describe as range compression. Systems used by Allstate and Progressive categorise claims, flag fraud suspicion, and suggest settlement ranges based on standardised inputs such as diagnosis and procedure codes.1 These systems often fail to account for variables that resist quantification. Pain intensity, long-term mobility impairment and caregiving burdens do not fit into structured data categories, and algorithms that omit them will systematically undervalue the claims they assess.
In 2024, Progressive agreed to a $48 million settlement in New York to resolve a class action alleging that third-party software adjustments (Mitchell International’s WorkCenter Total Loss) had systematically undervalued total loss claims for approximately 93,000 policyholders.1 In 2010, Allstate paid $10 million to 45 states following a multi-state NAIC examination of its use of the Colossus bodily injury claim software.1 In both cases, the algorithm was optimised to reduce average payouts rather than to produce accurate valuations of individual losses. When the optimisation target is a business metric rather than an accuracy metric, undervaluation is not a bug. It is the system working as designed.
Employment and Algorithmic Agency
Mobley v. Workday
Mobley v. Workday, Inc. has become the leading case on liability for automated recruitment. The plaintiff, a 40-year-old black job seeker with a disability, alleged that Workday’s AI-driven applicant screening system systematically rejected his applications for nearly 100 positions.4,5
In July 2024, the court allowed discrimination claims to proceed against Workday on an agency theory, while dismissing the separate employment-agency theory.6 The reasoning on the surviving claim was that the software did more than execute employer-defined criteria. It used its own AI models to evaluate, score and rank candidates, making it an active participant in the hiring decision rather than a passive tool.6,7 If the agency theory succeeds at trial, it will prevent companies from outsourcing compliance obligations to technology vendors by holding both the employer and the vendor accountable for discriminatory outcomes.
Proxy Variables and Contextual Failure
Predictive hiring tools reproduce bias because they learn from historical data. If a company’s historical top performers have been predominantly white men from specific universities, an AI system will learn to rank candidates with those profiles higher, regardless of whether other candidates are objectively more qualified.8,9 A 2024 University of Washington study tested 120 first names across three large language models against over 500 real job listings. The models preferred white-associated names 85% of the time and black-associated names 9% of the time.4,8
In March 2025, the ACLU of Colorado filed a complaint against Intuit and HireVue regarding an AI video interview tool that gave a deaf employee feedback that she needed to practise active listening.7,8 The system had misidentified her lack of auditory response as a lack of professional engagement. This was not a hallucination. It was a model that could not account for the context of disability, and it illustrates the limits of systems that reduce human behaviour to statistical patterns.
Separately, Kistler et al. v. Eightfold AI Inc., filed in Contra Costa County Superior Court in January 2026 and removed to federal court on 2 March 2026, raises an issue under the Fair Credit Reporting Act.6 The plaintiffs argue that AI-generated match scores, which rank candidates from 0 to 5 using social media and internet activity data, function as consumer reports. If that argument succeeds, vendors would be required to give candidates the right to access their scores and dispute inaccuracies, a requirement that most proprietary screening systems are not currently designed to meet.
Public Benefits and Automated Stategraft
When governments automate the adjudication of public benefits, the consequences of predictive error are borne by populations with the fewest resources to challenge them. Two cases illustrate the pattern.
Robodebt
Australia’s Robodebt scheme operated from 2015 to 2019. It was designed to identify welfare overpayments by comparing social security records with Australian Taxation Office data.10,11 The algorithm divided annual income into fortnightly increments and compared them against reported fortnightly earnings. The method assumed income was earned evenly across the year. For anyone with variable or seasonal earnings, the assumption was wrong.
A student who earned nothing for ten months but worked full-time for two months would, under the averaged model, be treated as having earned income in every fortnight of the year, including the fortnights in which benefits had been correctly paid.11 The mathematical error is explained by Jensen’s Inequality. Benefit eligibility is not a linear function of income, so averaging the input before applying the eligibility rules produces a different result from applying the rules to the actual data. The system generated more than 400,000 false debt notices.10,13 The government settled the resulting class action for A$1.2 billion in 2020, though the total financial relief, including refunds and debt waivers, exceeded that figure.10,13 The court ruled the scheme unlawful, in part because it shifted the burden of proof onto citizens to disprove debts generated by a flawed prediction.12,14
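The averaging error can be made concrete with a minimal sketch. The payment rule, thresholds and dollar figures below are hypothetical simplifications chosen for clean arithmetic. They are not the actual social security rules or the scheme's code. The only assumption that matters is that entitlement is a non-linear function of fortnightly income.

```python
FORTNIGHTS_PER_YEAR = 26

def entitlement(income, full_payment=500.0, free_area=300.0, taper=0.5):
    """Hypothetical fortnightly payment: reduced by `taper` for each dollar
    earned above `free_area`, and never below zero."""
    reduction = max(0.0, income - free_area) * taper
    return max(0.0, full_payment - reduction)

def alleged_debt(assessed_incomes, payments_received):
    """Sum of amounts by which payments exceed entitlement on the assessed income."""
    return sum(
        max(0.0, paid - entitlement(income))
        for income, paid in zip(assessed_incomes, payments_received)
    )

# Student: no income for 22 fortnights, then full-time work for 4 fortnights
# (roughly ten months and two months).
actual_income = [0.0] * 22 + [3250.0] * 4

# Payments were made correctly on the income actually reported each fortnight.
payments = [entitlement(income) for income in actual_income]

# Averaging: annual income spread evenly across the year (13000 / 26 = 500).
annual_income = sum(actual_income)
averaged_income = [annual_income / FORTNIGHTS_PER_YEAR] * FORTNIGHTS_PER_YEAR

debt_on_actual = alleged_debt(actual_income, payments)      # 0.0
debt_on_averaged = alleged_debt(averaged_income, payments)  # 22 * 100 = 2200.0

print(f"Debt when rules are applied to actual income: {debt_on_actual:.2f}")
print(f"Debt when rules are applied to averaged income: {debt_on_averaged:.2f}")
```

The annual income is identical in both calculations. The debt appears only because the averaged figure attributes earnings to fortnights in which the student earned nothing.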
Michigan MiDAS
Michigan’s Integrated Data Automated System was implemented to detect unemployment insurance fraud. Between 2013 and 2015, the system issued over 60,000 fraud determinations, wrongly accusing approximately 40,000 residents. Subsequent audits found a false positive rate of 93%.15,16,17 The system was over-calibrated to treat any discrepancy between state and federal records as intentional misrepresentation. A typo in an employer’s name was sufficient to trigger a fraud determination.15,17
Michigan levied a 400% penalty on alleged fraud, the highest in the nation.17,18 Because the system also automated notifications, often sending them to out-of-date addresses, many victims were unaware of the accusation until their bank accounts were garnished.17 The litigation in Bauserman v. Unemployment Insurance Agency forced the state to acknowledge violations of procedural due process. The Michigan Supreme Court ruled that victims could recover monetary damages for constitutional-tort claims, and a $20 million settlement was approved by the Michigan Court of Claims in January 2024.15,16
Criminal Justice and Predictive Policing
COMPAS and LSI-R
Risk assessment algorithms such as COMPAS and the LSI-R inform sentencing and parole decisions by predicting recidivism. These tools rely on static factors including education, family history and employment, variables that frequently serve as proxies for race and socioeconomic status.20 The Department of Justice’s Criminal Division has described such assessments as dangerous because they penalise individuals for their social environment rather than their conduct.20
ProPublica’s 2016 analysis found that black defendants who did not go on to reoffend were falsely flagged as high risk at a rate of 45%, compared with 23% for white defendants in equivalent circumstances. The disparity is roughly two to one.20 White defendants were correspondingly more likely to be incorrectly classified as low risk. The model optimises for overall accuracy without ensuring equitable error rates across demographic groups.7,21
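The gap between overall accuracy and per-group error rates can be sketched with two confusion matrices. The counts below are hypothetical, chosen so that the false positive rates echo the 45% and 23% figures above. They are not ProPublica's data, and the group labels are placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Confusion:
    """Outcome counts for one group: 'positive' means flagged as high risk."""
    tp: int  # flagged high risk, did reoffend
    fp: int  # flagged high risk, did not reoffend
    tn: int  # flagged low risk, did not reoffend
    fn: int  # flagged low risk, did reoffend

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.tn + self.fn)

    def false_positive_rate(self) -> float:
        # Share of people who did NOT reoffend but were flagged high risk.
        return self.fp / (self.fp + self.tn)

    def false_negative_rate(self) -> float:
        # Share of people who DID reoffend but were flagged low risk.
        return self.fn / (self.fn + self.tp)

# Hypothetical counts, 2,000 defendants per group.
group_a = Confusion(tp=720, fp=450, tn=550, fn=280)
group_b = Confusion(tp=500, fp=230, tn=770, fn=500)

for name, group in (("Group A", group_a), ("Group B", group_b)):
    print(
        f"{name}: accuracy={group.accuracy():.3f}, "
        f"FPR={group.false_positive_rate():.2f}, "
        f"FNR={group.false_negative_rate():.2f}"
    )
```

Both hypothetical groups have the same overall accuracy, so a model evaluated on accuracy alone would look even-handed. The errors fall differently: one group bears more false high-risk flags, the other more false low-risk classifications.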
Transparency Litigation
In EPIC v. DOJ, the Electronic Privacy Information Center sued for government records on risk assessments and predictive policing.20 The case revealed a previously undisclosed DOJ report to the White House acknowledging the potential for disparate impacts and the erosion of consistent sentencing, even as the government continued to promote these tools.20
In Chicago, journalists sued to obtain information about the city’s Strategic Subject List, an algorithmic tool that ranked individuals based on their likelihood of involvement in a shooting.22 The list was criticised for targeting individuals based on social networks and prior arrests for non-violent offences, leading to repeated police contact with people who had never been convicted of a serious crime.22,23 The system identified a statistical correlation between an individual and a criminal network, but treated correlation as a basis for intervention, with the attendant infringement of civil liberties.23,24
Real Estate, Credit Scoring and the Devaluation of Equity
Automated Valuation Models
AVMs are designed to provide objective property estimates for mortgage lending. Research by the Urban Institute found that these models systematically produce larger errors for black homeowners.25 In a study of Atlanta and Memphis, AVMs generated valuation errors 3.4 percentage points higher for black homeowners than for white homeowners, and undervalued black-owned properties by an average of 5% compared to comparable white-owned homes.25
The undervaluation reflects the model’s reliance on historical sales data from neighbourhoods shaped by segregation. If a neighbourhood has been historically devalued through redlining, the algorithm treats that devaluation as a market signal and incorporates it into future predictions.25,26 The bias is not introduced by the model. It is inherited from the data and perpetuated through automation.
Algorithmic Credit Scoring
Financial institutions are increasingly using machine learning to analyse alternative data, including social media activity, browsing habits and device choices, to predict creditworthiness.29 The intention is to serve the ‘credit invisible’, but the method introduces proxy variables that correlate with race and class. A model might predict that an individual who uses a particular device or visits certain websites is less likely to repay a loan, regardless of their actual financial history.29
A 2022 Bloomberg investigation found that Wells Fargo’s mortgage algorithm approved only 47% of black refinancing applicants, compared with 72% of white applicants.29 The failure was one of representation bias. The model was trained on data that did not adequately represent the financial behaviours of minority communities, producing skewed results that excluded high-earning minority borrowers from favourable loan terms.29,30
Securities Fraud and AI-Washing
A growing area of litigation involves companies making misleading claims about the capabilities of their predictive analytics to attract investment. When the systems fail to deliver, or when it emerges that the claimed AI was largely non-functional, the result is securities fraud exposure.
In Helo v. Sema4 Holdings Corp., a securities fraud class action filed in federal court in Connecticut, the plaintiffs alleged that statements about a proprietary health intelligence platform using advanced AI were materially misleading, as the platform reportedly lacked the claimed predictive capability.31 Zillow Group faced litigation after its algorithmic home-buying programme, Zillow Offers, failed to accurately predict housing market movements, leading the company to overpay for thousands of properties and shut down the business unit at a loss of approximately $528 million.31
These cases establish that when a company’s business model depends on the predictive accuracy of an algorithm, a wrong outcome is not merely a technical failure. It is a material risk requiring disclosure to investors.
Conclusion
The pattern across these sectors is consistent. Predictive analytics produce incorrect outcomes when they are used to manage complex human systems through reductive statistical averages. Whether the context is the denial of post-acute care, the undervaluation of homes in black neighbourhoods, or the automated extraction of welfare debts, the failures share common features: loss of human context, reliance on biased historical data and absence of meaningful human oversight.
The litigation trends identified here, particularly the agency theory in employment law and the constitutional tort claims in public benefits, indicate that algorithmic impunity is ending. Organisations can no longer defend a wrong outcome by pointing to the algorithm. As regulatory frameworks such as the Colorado AI Act (effective 30 June 2026) come into force, the burden will increasingly fall on developers and deployers to demonstrate that their predictive models are accurate, equitable, and transparent.5
The technology is not the problem. The problem is deploying it without the accountability structures that the law requires for any system that determines individual rights and entitlements. Building those structures is the task ahead.
References
1. AI In Insurance Claim Review Raises Concerns Over Delays, Freedom for All Americans.
2. When Faulty AI Falls Into the Wrong Hands: The Risks of Erroneous AI, IJOC.
3. Lawsuit over AI usage by Medicare Advantage plans allowed to proceed, DLA Piper (2025).
4. AI Hiring Bias Lawsuits Are About to Surge, Reworked.
5. AI Bias in Hiring: Algorithmic Recruiting and Your Rights, Sanford Heisler Sharp (2025).
6. Two Lawsuits Expose AI Accountability Gaps in Hiring, Veris Insights.
7. When Machines Discriminate: The Rise of AI Bias Lawsuits, Quinn Emanuel.
8. AI Hiring Bias: Real Cases, Legal Consequences, and Prevention, Responsible AI Labs.
9. AI Hiring Tools Under Legal Scrutiny: Lessons for Employers, Hoyer Law Group.
10. Quantifying Losses for 443,000 Australians in a $1.2 Billion Robodebt Class Action, Vincents.
11. Robodebt not only broke the laws of the land, it also broke laws of mathematics, University of Wollongong (2023).
12. Why Robodebt failed, ANU Reporter.
13. Learning from the failures of Robodebt, Victoria Legal Aid.
14. Managing unintended consequences of algorithmic decision-making: The case of Robodebt, Sage Journals.
15. Michigan Unemployment Insurance False Fraud Determinations, BTAH.
16. Case Over the Michigan Unemployment Insurance Agency’s Faulty Automated System Finally Settled, Ford School (2024).
17. Automated Stategraft: Faulty Programming and Improper Collections in Michigan’s Unemployment Insurance Program, Wisconsin Law Review.
18. Michigan’s MiDAS Unemployment System: Algorithm Alchemy Created Lead, Not Gold, IEEE Spectrum.
19. The MIDAS Touch: Atuahene’s “Stategraft” and Unregulated AI, UNM Digital Repository.
20. EPIC v. DOJ (Criminal Justice Algorithms), EPIC.
21. DOJ Report on AI in Criminal Justice: Key Takeaways, Council on Criminal Justice.
22. Police departments sued over predictive policing programs, Police1.
23. The Dangers of Unregulated AI in Policing, Brennan Center for Justice.
24. The Legal Risks of Big Data Policing, American University Law.
25. Do Automated Valuation Models Reinforce Disparities in Home Values?, Urban Institute.
26. Racial Disparities in Automated Valuation Models: New Evidence Using Property Condition and Machine Learning, HUD.
27. Don’t trust AI for home valuation, estate agents warn, Mortgage Professional America.
28. Are AVMs Sabotaging Property Valuations?, Seaport Real Estate Services.
29. When Algorithms Judge Your Credit: Understanding AI Bias in Lending Decisions, Accessible Law, UNT Dallas.
30. When Algorithms Deny Loans: The Fraught Fight to Purge Bias from AI, IoT For All.
31. Consequences Clear for Firms that are AI-Washing, Labaton.


