Federal Judges Using AI — Who’s Deciding Cases?
A Northwestern University–affiliated study of federal judges found that generative artificial intelligence tools have entered many chambers but are not yet a routine part of most judges’ decision-making.
The study surveyed a stratified random sample of federal judges at the bankruptcy, magistrate, district court, and court of appeals levels. Researchers sent questionnaires to roughly 500 judges (reported as 500 or 502 in source summaries) drawn from a population of 1,738 active federal judges and received 112 responses, a 22.3–22.4% response rate. The authors reported an approximate overall margin of error of ±9% at a 95% confidence level and noted larger margins for subgroup estimates, along with potential self‑selection and social‑desirability biases and other sampling limitations.
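Those headline figures are internally consistent. As a check, here is a minimal sketch in Python, assuming the standard worst-case margin-of-error formula for a proportion with a finite-population correction (an assumption; the study does not publish its calculation), that reproduces both the response rate and the roughly ±9% margin:

```python
import math

# Reported survey parameters from the source summaries.
POPULATION = 1738   # active federal judges
SENT = 500          # questionnaires sent (reported as 500 or 502)
RESPONSES = 112     # completed questionnaires
Z = 1.96            # z-score for a 95% confidence level

# Response rate: 112 / 500 = 22.4% (112 / 502 gives 22.3%).
response_rate = RESPONSES / SENT

# Worst-case margin of error for a proportion (p = 0.5), with a
# finite-population correction because the 112 respondents come from
# a population of only 1,738 judges. This is the assumed textbook
# formula, not one taken from the study itself.
p = 0.5
standard_error = math.sqrt(p * (1 - p) / RESPONSES)
fpc = math.sqrt((POPULATION - RESPONSES) / (POPULATION - 1))
margin_of_error = Z * standard_error * fpc

print(f"response rate:   {response_rate:.1%}")     # 22.4%
print(f"margin of error: ±{margin_of_error:.1%}")  # ~9.0%
```

Subgroup estimates rest on far fewer responses (six appellate judges, for instance), which is why the authors flag much larger margins for those breakdowns.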
Across respondents, more than 60% (61.6%, the complement of the non‑user share) reported having used at least one listed AI tool in their judicial work, while 38.4% reported never using any of the tools covered by the questionnaire. Frequency of use was uneven: 22.4% of respondents reported using AI weekly or daily (5.4% daily and 17.0% weekly in one account), and larger shares reported monthly or rare use. Use also varied by judge type: bankruptcy judges reported the highest rate of weekly or daily use (32.2% in one breakdown), magistrate judges reported higher-than-average use (21.9% in one breakdown), and district court judges reported the highest rate of never using AI (46.5% in one breakdown). Only six federal appellate judges responded.
Judges and chambers showed a clear preference for legal‑specific, vendor‑integrated AI tools over stand‑alone, general‑purpose systems. Named tools included legal products such as Westlaw AI‑Assisted Research/Deep Research, Lexis+ AI/Protégé, CoCounsel, Vincent AI, Harvey, and Legora, and general tools such as ChatGPT, Claude, Copilot, Gemini, Grok, and Perplexity. Reported adoption rates differed by tool: one account put use of Westlaw's AI products at 38.4% of respondents and ChatGPT at 28.6%. Frequency also differed by tool type, with one report noting 5.4% daily use of legal‑specific tools versus 0.9% daily use of general‑purpose tools.
Legal research was the most common AI application, reported by 30.0% of judges and by 39.8% of other chambers staff; document review was the next most common, reported by 15.5% of judges and 16.7% of chambers staff. Use of AI to draft or edit documents filed in cases was minimal: in one account, 1.8% reported using AI to draft filed orders, opinions, or judgments and 2.7% reported using it to edit such filings; drafting and editing of non‑filed documents showed somewhat higher but still low rates. A small share of judges reported using AI to affect decision‑making directly: 1.8% said they use AI to make decisions and 4.5% said they use it to inform decisions. Several respondents reported uncertainty about or lack of awareness of their staff's AI practices, and at least one judge recounted an instance in which AI produced largely fabricated citations.
Attitudes were mixed. Respondents were nearly evenly split between optimism and concern about AI’s role in the judiciary, with one account giving 43% optimistic, 42% concerned, and 14.8% neutral. Common concerns included AI hallucinations, reliance on incorrect or fabricated authorities, and potential erosion of legal skills. Optimistic responses emphasized efficiency gains and assistance with research.
Court‑provided AI training and chambers policies were uneven and often lacking. About 61.1% of judges said they had not been offered AI training or did not know whether training was offered; one account broke this down as 45.5% saying no training had been provided and 15.7% unsure. Among judges who recalled being offered training, roughly 73.8% attended. Chambers policies ranged from formal prohibitions to encouragement: roughly one‑third of judges said they permit or encourage AI use in chambers, about 20% reported formal prohibitions, 17.6% discouraged use without a formal prohibition in one account, and about 24.1% reported having no official chambers policy. Counting the judges who discouraged use without a formal policy alongside those with no policy raises the share lacking an official policy to about 41.7% (17.6% + 24.1%) in one summary. Some judges who permitted AI imposed limits, such as banning AI‑generated content from orders, opinions, or case communications.
Personal and professional use were correlated: about 38% of judges reported daily or weekly AI use outside of work, while 26.9% reported rare personal use and 25.9% reported never using AI personally in one account; another account reported 20.4% of judges never using AI in either personal life or work.
The authors and observers concluded that the federal judiciary remains in an early phase of AI governance, with no single model for chambers policy yet dominant. The federal judiciary created a task force to examine whether new AI policies are needed. The study’s authors recommended expanded judiciary‑specific AI training and clearer, intentional policies as the next practical steps and indicated plans for further research and collaboration with judges on responsible AI deployment.
Real Value Analysis
Overall judgment: the article reports useful facts but provides almost no practical help to an ordinary reader. It is mainly descriptive, reporting survey research on who among federal judges is using AI and how chambers are addressing it. That information can build general awareness, but the piece stops short of offering actionable guidance, explanations of underlying causes, or steps people can take in response.
Actionability
The article does not give clear, concrete steps a reader can use immediately. It tells you that a majority of surveyed federal judges have used AI, who uses it more often, and that many judges lack training or chambers policies, but it does not tell a reader how to adopt, evaluate, or govern AI tools. There are no instructions, checklists, or resources that a nonexpert could follow to try AI responsibly, to request training, to implement a chambers policy, or to verify whether AI influenced a judicial decision. If you are a judge, court staff, lawyer, or member of the public seeking specific guidance on safe or appropriate AI use in judicial settings, the article gives no usable roadmap.
Educational depth
The article gives useful surface facts and basic statistics from a survey, but it lacks deeper explanation. It does not explain the survey methodology in detail, such as how respondents were selected beyond “randomly selected,” response bias implications, question wording, or statistical margins of error. It reports percentages (overall AI use, weekly use, training offered and attended, policy distribution) without analyzing causes, timelines, or mechanisms that produced those numbers. It does not explore why bankruptcy and magistrate judges use AI more, what specific AI functions judges tried, how AI outputs were assessed for accuracy, or the risks and limitations judges encountered. Because of that, the article teaches only what was observed, not why it matters or how the patterns arose.
Personal relevance
For most members of the public the direct relevance is limited. The findings mainly affect judges, court staff, and people working with the federal judiciary; they do not change individual safety, health, or finances for ordinary citizens in any immediate way. For attorneys and legal professionals, the article signals that AI is entering judicial workstreams and could affect legal research practices or submissions to courts, which is moderately relevant. For policy makers and court administrators it is more relevant because it highlights gaps in training and policy. But the piece does not translate those signals into specific actions those groups should take.
Public service function
The article has some public-service value by documenting the early-stage state of AI governance in the federal judiciary and noting the existence of a federal judiciary task force on AI. However, it does not offer safety guidance, warnings, or actionable recommendations for the public. It does not tell litigants how to check whether an AI tool affected a decision, what disclosures should be expected, or how to raise concerns about AI use in court. As a result it functions more as reporting than as public guidance.
Practical advice quality
There is essentially no practical advice in the article. Statements about training and policy distribution are informative but not instructive. The few implications the article suggests—such as an unmet demand for more visible AI training—are not turned into concrete guidance. Any ordinary reader who wants to act (seek training, press for policies, adapt practice) will not find step-by-step help here.
Long-term impact
The article helps by flagging a longer-term trend: the judiciary is entering an early phase of AI governance. That may help institutions anticipate the need to develop policies and training. But because it lacks guidance on what good governance looks like, how to balance transparency and confidentiality in courts, or how to audit AI outputs, the long-term utility is limited. It does not equip readers to plan or respond concretely beyond being aware the issue exists.
Emotional and psychological impact
The article is neutral and factual; it does not seek to strongly alarm or reassure. Because it lacks practical remedies, readers worried about AI in courts may feel helpless, and curious readers may come away with more questions than clarity. Overall it neither calms nor meaningfully empowers the audience.
Clickbait and sensationalism
The article appears straightforward and non-sensational. It avoids dramatic or exaggerated claims, reports measured percentages, notes the small sample sizes in some judge categories, and does not overstate conclusions beyond the survey findings.
Missed opportunities to teach or guide
The article missed many chances to be more useful. It could have:
- Explained survey design, sampling limitations, and what the response rates mean for interpreting results.
- Described typical AI tasks judges used, examples of appropriate and inappropriate uses, and checks judges used to verify AI output.
- Suggested basic policy elements a chambers AI policy might include (disclosure, permitted tools, auditing, data handling).
- Offered guidance for judges, court staff, lawyers, and litigants about training, disclosure expectations, or how to raise concerns about AI use in court.
- Pointed readers to concrete resources such as model AI-use policies or training curricula (the article did not provide these).
Practical, usable guidance you can use now
If you want to act on the issues the article raises, here are realistic, general steps and reasoning you can use without needing specific external data.
If you are a judge or court staff member: Start a short, practical conversation in your chambers about AI use. Define whether staff may use AI tools for drafting or research and set simple rules: require staff to disclose when AI substantially contributed to a draft or legal analysis; require human verification of factual claims and citations produced by AI; and prohibit submitting AI-generated material to a court without review and, where appropriate, disclosure. Offer a brief, mandatory training session that covers common AI failure modes (hallucinations, incorrect citations, privacy leaks) and how to check outputs (verify citations directly, cross-check facts with primary sources). Keep rules simple, focused on safety and transparency, and iterate them as you learn.
If you are an attorney working with the federal judiciary: Assume judges or their staff might use AI tools for research or drafting. When you rely on AI for filings, verify every factual assertion and citation against primary sources and be prepared to disclose the use of AI if required by local rules. When you suspect AI affected a judicial opinion or process that materially matters to your case, raise the issue respectfully by asking whether AI tools were used in chambers or in drafting and, if so, what safeguards were applied.
If you are a court administrator or policymaker: Prioritize visible, short training modules for judges and staff that explain risks and verification practices. Develop a simple model policy template that courts can adapt: permitted uses, disclosure expectations, data handling rules, mandatory human review, and an audit mechanism. Start with a pilot program in one division and collect feedback before scaling.
If you are a member of the public concerned about AI in courts: Ask simple questions of your representative or local court administration about whether they have policies and training on AI. Request transparency: ask courts to publish summary policies and whether AI was used in deciding cases that materially affect people. Use public comment opportunities to encourage sensible safeguards like disclosure and human review.
How to evaluate similar reporting in the future
When you read articles about new technologies entering institutions, check these points to assess usefulness:
- Who was surveyed, what was the response rate, and how might response bias affect the results?
- What exactly counts as "use" of the technology: occasional testing or substantive reliance?
- Are there examples of the tool's outputs and of the safeguards used?
- Does the article offer concrete policy options or training approaches?
If those are missing, treat the report as informational but incomplete.
Short risk checklist for everyday assessment
When deciding whether to trust or rely on an AI output in a professional setting, use this basic logic: confirm the source of the claim in primary materials when possible, verify citations and facts independently, treat any surprising or consequential claim with skepticism until corroborated, and require a named human reviewer who accepts responsibility for the final content. This approach reduces the chance of being misled by AI errors without needing specialized tools.
These steps are practical, low-cost, and grounded in general principles of verification, transparency, and incremental policy development. They give a reader concrete actions to take even though the original article did not provide them.
Bias analysis
"found that a majority of federal judges reported using an artificial intelligence tool at least once in their judicial work."
This phrase frames the result as a majority without giving the exact percent in the sentence, which can make the finding sound stronger than it might be. It helps the impression that AI use is widespread among judges while hiding how large the majority is. The wording nudges readers to view AI adoption as common. It favors emphasizing uptake rather than the underlying response rate or details.
"survey sent to 500 randomly selected federal judges yielded 112 responses"
This line shows a low response rate (112 of 500) but does not highlight how that weakens representativeness. By just stating the numbers, the text hides the potential for nonresponse bias. It helps present the findings as if they apply broadly to federal judges while omitting the risk that respondents differ from nonrespondents.
"with bankruptcy judges having the highest response rate and only six federal appellate judges responding"
Calling out high participation from bankruptcy judges while noting that only six appellate judges responded signals uneven sampling. The phrasing lets the reader infer that the results reflect some judge types more than others without an explicit caveat. It acknowledges but does not weigh the thin appellate perspective, which helps conclusions appear more general than warranted.
"over 60% of surveyed federal judges have used an AI platform, with the tools most commonly applied in chambers for legal research."
Saying "over 60% of surveyed federal judges" ties the percent to respondents but may still imply broad judicial adoption. The phrase "most commonly applied in chambers for legal research" generalizes the context of use without giving exact breakdowns. This wording downplays the variety of uses and may make AI use seem narrowly and professionally focused, which can soften concern about other uses.
"22.4% of respondents use AI on a weekly or daily basis, while more than one third reported never using any of the tools covered by the questionnaire."
Presenting both a specific minority (22.4%) and "more than one third" is selective: the precise smaller figure is given, while the larger non-use group is described vaguely. This mix of precise and vague numbers shifts emphasis toward the active users and dilutes the impact of the sizable non-user group. It helps readers focus on adoption rather than nonuse.
"A small share of judges reported that AI affects decision-making: 1.8% said they use AI to make decisions and 4.5% said they use it to inform decisions."
Labeling these percentages as a "small share" frames the influence of AI on decisions as minimal. The phrase steers readers toward seeing AI as not materially affecting rulings, which downplays potential risks. It favors reassurance over highlighting that any judges use AI in decision-making.
"judges may not be aware of all AI activity by their staff."
This sentence uses "may" to suggest uncertainty while implying an information gap. It signals possible hidden AI use without providing evidence, nudging readers to suspect underreported activity. The phrasing introduces speculation presented as a plausible concern rather than a tested finding.
"About 61.1% of judges said they had not been offered AI training or did not know whether training was offered; among those offered training, 73.8% attended."
Combining "had not been offered" with "or did not know" merges distinct issues—lack of training and lack of awareness—into one statistic. That blurs whether training was unavailable or simply unnoticed. The second clause about attendance rate emphasizes demand once training is offered, which supports the interpretation that visible training should increase uptake. This structure nudges toward the conclusion of unmet training demand.
"Approximately one third of judges said they allow AI use in their chambers, while nearly one quarter reported having no official chambers policy on AI."
Contrasting "allow AI use" with "no official chambers policy" frames policy as either permissive or missing, ignoring other nuanced arrangements. The wording sets up a binary impression and helps the conclusion that governance is inconsistent. It downplays possible informal norms or conditional permissions.
"the study characterized the distribution of chambers policies as ranging from formal prohibitions to narrow encouragement, and concluded that the judiciary remains in an early phase of AI governance with no single model yet dominant."
Phrases like "early phase" and "no single model yet dominant" present a narrative of nascent, unsettled governance. This interprets the data and guides readers to see the judiciary as still developing rules. It helps a storyline of emerging governance rather than showing all possible interpretations of policy diversity.
"The federal judiciary created a task force last year to examine whether new AI policies are needed."
This statement links the task force's creation to the topic but gives no detail about the task force's scope. Placing it at the end emphasizes institutional attention and lends authority to the study's concern. It gently pushes the view that policy action is underway and needed, without supplying evidence that the task force was created because of the study or that the issue is urgent.
Emotion Resonance Analysis
The passage expresses a restrained set of emotions that are mostly implied through word choice and factual framing rather than through overt feeling. One clear emotion is caution or concern, suggested by phrases that emphasize uncertainty and incompleteness: words such as “may not be aware,” “unmet demand,” “early phase,” and “no single model yet dominant” signal a cautious stance about AI use and governance. This caution is moderate in strength; it does not sound alarmist or panicked but it does highlight gaps and potential risks. Its purpose is to make the reader notice unresolved issues and to prompt careful attention to how AI is being adopted and governed. A reader will likely feel the passage is raising a sensible warning and thus be nudged toward concern or attentiveness rather than reassurance.
A related emotion is prudence or deliberation, present when the report notes actions being taken and studied—such as the creation of a task force and the survey’s careful statistics. Words like “examined,” “survey,” “task force,” and the methodical reporting of percentages and response rates convey measured, thoughtful evaluation. This tone is mild to moderate in strength and serves to build trust in the information: it reassures readers that the subject is being looked at methodically, which reduces alarm and encourages confidence in ongoing oversight.
There is a subdued sense of curiosity or interest, implied by the detailed breakdowns of who uses AI, how often, and what policies exist. The listing of percentages, the distinctions among judge types, and the note that some judges do not know whether training is offered all reflect an investigative attitude. This curiosity is low in emotional intensity but functional; it guides the reader to view the topic as one worth exploring further, supporting engagement without strong persuasion.
A faint feeling of ambivalence appears in the juxtaposition of adoption and nonuse—over 60% have used AI at least once, yet more than one third reported never using the tools, and nearly one quarter have no official policy. This ambivalence is mild but purposeful: it frames the situation as mixed rather than settled and steers the reader away from binary judgments toward a nuanced view that adoption is uneven.
There is also a subtle implication of responsibility or accountability tied to the observation that judges “may not be aware of all AI activity by their staff” and that many had not been offered training. This carries a low-to-moderate sense of duty: the language implies that more oversight, training, and clearer policies might be needed. The effect on the reader is to encourage concern about governance gaps and to promote support for measures that increase awareness and training.
Finally, a restrained sense of progress or cautious optimism is present in references to usage statistics and the existence of a task force. Phrases about “use” being common in chambers and that a task force was “created” suggest forward movement. This feeling is weak to moderate and serves to balance concern with recognition that the judiciary is actively engaging the issue, which can reassure readers and make them more receptive to incremental solutions.
The writer uses emotion mainly through careful framing and selective factual emphasis rather than through explicitly emotive language. Instead of saying judges are “worried” or “excited,” the text uses phrases like “may not be aware,” “unmet demand,” and “early phase of AI governance” to evoke caution and the need for action without overtly stating feelings. Repetition of statistics and contrasts—such as the split between frequent users and those who never use AI, and the difference in response rates across judge types—works like a rhetorical device that increases the emotional impact by making the unevenness and uncertainty more vivid. The presence of concrete numbers and procedural details functions as an appeal to reason that also conveys trustworthiness; grounded facts reduce emotional excess while still prompting concern. Comparison is implicit when the text places high-use groups (bankruptcy and magistrate judges) beside low-use groups (appellate and many district judges), and that contrast magnifies the sense of inconsistency and unsettled governance. Overall, these techniques steer the reader toward measured concern, trust in careful study, and openness to governance measures by emphasizing gaps, showing that steps are being taken, and presenting clear, repeatable data rather than dramatic language.

