I’ve talked with a lot of people involved in clinical research about AI. We use it extensively and have learned where it works and where it may introduce risk. I often describe AI as a genius 9-year-old with ADHD.
It is fast, creative, and capable of doing impressive work when the assignment is clear. It can also get distracted, misunderstand the instruction, fill in blanks that should have stayed blank, or run confidently in a direction no responsible adult intended.
The analogy is imperfect, as most analogies are. But it helps put the technology into perspective.
AI can process information at speeds humans cannot match. It can summarize, draft, compare, organize, and identify patterns with remarkable efficiency. It can also overlook an important constraint, misread the practical meaning of a request, or proceed based on assumptions that sound reasonable until someone with experience looks closely.
That is inconvenient in many industries. In clinical research, it can become dangerous.
In April 2026, multiple outlets reported that a Claude-powered Cursor AI coding agent deleted a company’s production database and backups in seconds, despite instructions not to take destructive action without authorization.¹
The reported failure was not simply that the AI produced a bad suggestion. It acted. It had access to systems it should not have been able to damage. It moved quickly, made assumptions, and did not stop at the point where a competent human would have said, “Wait, this seems risky.”
That was a software company. Now consider the clinical research version.
An AI tool may summarize a protocol but soften an exclusion criterion. It may identify possible patients for pre-screening but miss a key lab value. It may draft a feasibility response that sounds confident but overstates site capacity. It may generate patient-facing language that is technically accurate but ethically unacceptable. It may review source documentation and create a clean narrative that does not quite match the medical record.
Or, more concerning, it may recommend a monitoring action based on incomplete context or take an action inside a live trial system before the right person has reviewed it.
That is where the industry needs to slow down long enough to be smart.
AI is already being used in clinical research, and it will become more prevalent. That is not speculation. FDA has acknowledged the growing use of AI across the drug development life cycle, including nonclinical, clinical, postmarketing, and manufacturing activities.² It has also issued draft guidance around the use of AI to support regulatory decision-making for drugs and biological products.³
One of the most important concepts in FDA’s approach is “context of use.”³ That phrase deserves more attention.
Using AI to reformat internal meeting notes is one thing. Using AI to support eligibility review, feasibility analysis, safety narratives, regulatory submissions, monitoring decisions, or patient communications is something quite different.
The risk changes with the task, and the controls should change with the risk.
This is where clinical research has an advantage, at least in theory. We already understand controlled processes. We already understand documentation. We already understand delegated authority, source verification, protocol adherence, audit trails, and the uncomfortable reality that “it seemed reasonable at the time” is not an adequate inspection defense.
AI should fit into that discipline. It should not be allowed to wander around it wearing a lab coat.
The better question is not simply, “Can AI do this?” The better question is, “Should AI be allowed to do this without qualified human review?”
For practical purposes, I would place clinical research AI use into three broad categories.
Low-risk assistance includes drafting, formatting, creating internal outlines, preparing training summaries, or organizing noncritical information.
Moderate-risk support includes protocol summaries, feasibility planning, recruitment analysis, site performance review, patient pre-screening support, and monitoring trend analysis.
High-risk or restricted use includes eligibility determinations, safety assessments, adverse event causality support, source data interpretation, regulatory claims, patient-facing clinical guidance, trial system changes, and anything that could affect participant rights, safety, well-being, or data integrity.
Those categories will not answer every question, but they create a useful starting point. They also force the right internal conversation before a technology decision becomes an operational habit.
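For teams that want to make this concrete, here is a minimal sketch, purely illustrative, of how those tiers might be written down inside an internal tool or SOP appendix. The tier names, example tasks, and controls are my assumptions for discussion, not a validated taxonomy.

```python
# Illustrative only: a minimal mapping of AI task tiers to required controls.
# Tier names, example tasks, and controls are assumptions for discussion,
# not a validated or regulator-endorsed taxonomy.

RISK_TIERS = {
    "low": {
        "example_tasks": ["drafting", "formatting", "internal outlines", "training summaries"],
        "required_controls": ["spot check by the task owner"],
    },
    "moderate": {
        "example_tasks": ["protocol summaries", "feasibility planning", "pre-screening support"],
        "required_controls": ["documented review by qualified staff", "traceable output"],
    },
    "high": {
        "example_tasks": ["eligibility determinations", "safety assessments", "regulatory claims"],
        "required_controls": ["PI, medical, or regulatory review", "no autonomous action", "full audit record"],
    },
}

def controls_for(tier: str) -> list[str]:
    """Return the controls required before an output in this tier can be used."""
    return RISK_TIERS[tier]["required_controls"]

print(controls_for("high"))
```

The value is not the code. The value is forcing a team to name the tier, and the required review, before a tool ever touches real work.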
Here are the guardrails I believe sponsors, CROs, sites, and technology vendors should be building now.
- Keep instructions short, precise, and sequenced.
AI performs better when the workflow is broken into steps.
A vague instruction such as “review this protocol and tell us who qualifies” invites the system to do too much at once. A better approach is to ask it to summarize the inclusion criteria, summarize the exclusion criteria, identify ambiguous language, create a structured checklist, and then stop for review before applying that checklist to any actual data.
That final instruction matters.
In clinical research, stopping at the right point is often as important as moving quickly.
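To illustrate what that sequencing might look like in practice, here is a minimal sketch. The step prompts and the ask_model helper are hypothetical placeholders for whatever approved tooling an organization actually uses; the point is the hard stop before any checklist touches patient data.

```python
# Illustrative sketch of a sequenced protocol-review workflow with an explicit stop.
# `ask_model` is a hypothetical placeholder for an organization's approved AI interface;
# the step prompts are assumptions for illustration.

def ask_model(prompt: str, document: str) -> str:
    """Placeholder for a call to an approved AI tool. Replace with real tooling."""
    raise NotImplementedError("Wire this to your organization's approved AI interface.")

def review_protocol(protocol_text: str) -> dict:
    steps = {
        "inclusion_summary": "Summarize the inclusion criteria only. Do not interpret them.",
        "exclusion_summary": "Summarize the exclusion criteria only. Do not interpret them.",
        "ambiguities": "List any criteria whose wording is ambiguous or needs clarification.",
        "checklist_draft": "Draft a structured eligibility checklist from the summaries above.",
    }
    outputs = {name: ask_model(prompt, protocol_text) for name, prompt in steps.items()}

    # Hard stop: the draft checklist goes to a qualified reviewer before it is
    # applied to any actual patient data. The AI's part of the work ends here.
    outputs["status"] = "PENDING_HUMAN_REVIEW"
    return outputs
```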
- Separate suggestion from action.
AI may suggest, but humans should authorize.
That distinction needs to be designed into the workflow, not buried in a policy that no one reads after implementation.
An AI tool should not independently modify source systems, send patient communications, change trial data, submit regulatory documents, alter recruitment lists, or trigger operational actions without appropriate review.
Clinical research has enough trouble with version control when humans are involved. We do not need an enthusiastic digital intern updating live systems because it “reasoned” its way into a shortcut.
- Limit permissions aggressively.
AI agents should not receive broad access because someone thinks it might be convenient later.
If the task is protocol summarization, the AI agent does not need write access to the CTMS. If the task is patient pre-screening support, the agent does not need authority to contact patients. If the task is feasibility support, the AI does not need authority to submit the final response.
Permission design is not merely an IT issue. In clinical research, permission design is a quality issue.
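A minimal sketch of task-scoped, least-privilege grants follows. The system names and scope labels are assumptions for illustration; the point is that write access stays empty unless someone deliberately grants it.

```python
# Illustrative sketch of task-scoped, least-privilege grants for an AI agent.
# System names ("ctms", "edc") and scope labels are assumptions for illustration.

from dataclasses import dataclass, field

@dataclass(frozen=True)
class AgentGrant:
    task: str
    read_scopes: frozenset = field(default_factory=frozenset)
    write_scopes: frozenset = field(default_factory=frozenset)  # empty unless deliberately granted

# The grant is defined by the task, not by what might be convenient later.
protocol_summary_grant = AgentGrant(
    task="protocol_summarization",
    read_scopes=frozenset({"protocol_documents"}),
    # No write access to the CTMS, EDC, or any other live trial system.
)

def can_write(grant: AgentGrant, system: str) -> bool:
    """An agent may write only where the grant explicitly says so."""
    return system in grant.write_scopes

assert not can_write(protocol_summary_grant, "ctms")
```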
- Put the right human in the loop.
“Human-in-the-loop” sounds comforting, but it is too vague.
A coordinator may be the right person to review recruitment language. A PI or Sub-Investigator must review clinical judgments. A regulatory professional should review submission-sensitive content. A sponsor or CRO quality lead may need to review outputs that affect trial oversight, inspection readiness, or operational commitments.
The review role should match the risk of the output. A warm body with a login is not a control.
- Document AI use when it matters.
If AI supports a trial-critical decision, there should be a record of what was asked, what was produced, who reviewed it, what was accepted, what was rejected, and what final action was taken.
That does not mean every AI-assisted email needs a validation package. It does mean AI use should be traceable when it touches participant safety, data integrity, regulatory decision-making, feasibility commitments, monitoring actions, or trial-critical operations.
FDA’s current AI work emphasizes concepts such as context of use, risk-based assessment, data governance, documentation, performance, and life cycle management.² ³
Those principles should not live only in regulatory submissions. They are useful operational principles for the rest of us too.
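As one illustrative sketch, and not a required or validated schema, a traceable record of AI use for trial-critical work might capture fields along these lines.

```python
# Illustrative sketch of a traceable record for AI-assisted, trial-critical work.
# Field names are assumptions for discussion, not a required or validated schema.

from dataclasses import dataclass
from datetime import datetime

@dataclass
class AIUseRecord:
    task: str               # what the AI was asked to support
    prompt_summary: str     # what was asked, in substance
    output_summary: str     # what the AI produced
    reviewer: str           # who reviewed it, by name and role
    accepted: str           # what was accepted
    rejected: str           # what was rejected, and why
    final_action: str       # what action was ultimately taken
    reviewed_at: datetime   # when the review happened
```

One such record per trial-critical output is usually enough. The goal is traceability, not a validation package for every email.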
- Test AI against examples you already understand.
Before using AI to summarize protocols, test it against protocols your team knows well.
Before using AI to identify possible patients, test it against cases where eligibility has already been determined.
Before using AI to draft feasibility responses, compare its outputs against high-quality prior responses.
Do not evaluate AI by whether the output sounds polished. Evaluate it by whether the output is correct, complete, cautious, and appropriately limited.
Clinical research has been fooled by polished nonsense before. We usually call it an over-optimistic feasibility response.
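Here is a minimal sketch of that kind of test for pre-screening support, assuming a set of cases whose eligibility has already been adjudicated by qualified staff. The ai_prescreen helper is a hypothetical placeholder for whatever tool is being evaluated.

```python
# Illustrative sketch: testing AI pre-screening support against cases whose
# eligibility has already been adjudicated by qualified staff.
# `ai_prescreen` is a hypothetical placeholder for the tool under evaluation.

def ai_prescreen(case: dict) -> bool:
    """Placeholder: return the tool's eligible/ineligible call for one case."""
    raise NotImplementedError

def evaluate(adjudicated_cases: list[dict]) -> dict:
    """Compare the tool's calls to known outcomes, counting misses in both directions."""
    false_includes = false_excludes = 0
    for case in adjudicated_cases:
        ai_call = ai_prescreen(case)
        if ai_call and not case["adjudicated_eligible"]:
            false_includes += 1   # the tool would have passed an ineligible patient
        elif not ai_call and case["adjudicated_eligible"]:
            false_excludes += 1   # the tool would have missed an eligible patient
    return {
        "cases": len(adjudicated_cases),
        "false_includes": false_includes,
        "false_excludes": false_excludes,
    }
```

The point is to count misses in both directions, not to admire how polished the output looks.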
- Build escalation rules.
AI should be instructed, configured, and tested to stop when uncertainty is high.
In clinical research, “I am not sure” can be a very good answer.
The system should escalate when criteria are ambiguous, source data are incomplete, medical judgment is required, protocol language conflicts with operational reality, patient safety may be affected, or the output could create a regulatory commitment.
It should also escalate before any irreversible action. That one seems obvious, but apparently the software world has been kind enough to provide us with a very expensive reminder.
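For illustration only, escalation logic can be made explicit rather than implied. The trigger names below are assumptions standing in for real signals from a team's own workflow; what matters is that missing or uncertain signals default to stopping.

```python
# Illustrative sketch of explicit escalation triggers. The trigger names and simple
# boolean flags are assumptions standing in for real signals from a team's workflow.

ESCALATION_TRIGGERS = (
    "criteria_ambiguous",
    "source_data_incomplete",
    "medical_judgment_required",
    "protocol_conflicts_with_operations",
    "patient_safety_possibly_affected",
    "creates_regulatory_commitment",
    "action_is_irreversible",
)

def must_escalate(signals: dict) -> bool:
    """Escalate to a qualified human if any trigger is raised.
    Missing or unknown signals default to True: when the system cannot tell, it stops and asks."""
    return any(signals.get(trigger, True) for trigger in ESCALATION_TRIGGERS)
```

Defaulting unknowns to escalation is the coded version of "I am not sure" being a very good answer.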
- Never confuse fluency with reliability.
This may be the greatest risk. AI can write with confidence when it is wrong. It can make weak assumptions look organized. It can produce a clean paragraph that quietly skipped the one detail that mattered.
Clinical research has spent decades building systems to protect participants and preserve data integrity. We should be very careful before allowing polished language to bypass those systems.
None of this means clinical research should avoid AI.
Quite the opposite.
AI can help sites understand protocols faster. It can help sponsors and CROs improve feasibility. It can reduce administrative burden. It can support training, recruitment planning, document review, data review, and study startup efficiency.
Used well, AI may give time back to investigators, coordinators, CRAs, project managers, feasibility teams, and study leaders who are buried under repetitive work that should have been simplified years ago.
But clinical research does not need reckless autonomy. It needs disciplined acceleration.
So yes, invite the genius 9-year-old into the room. Give clear instructions, break the work into manageable steps, keep dangerous tools out of reach, check the work, and require qualified adult supervision before anything irreversible happens.
And never mistake speed for judgment.
In clinical research, faster is only better when participant safety, data integrity, regulatory credibility, and public trust come along for the ride.
Endnotes
1. The Guardian, “Claude-Powered AI Agent’s Confession After Deleting a Firm’s Entire Database: ‘I Violated Every Principle I Was Given,’” April 29, 2026, https://www.theguardian.com/technology/2026/apr/29/claude-ai-deletes-firm-database.
2. U.S. Food and Drug Administration, “Artificial Intelligence for Drug Development,” accessed May 7, 2026, https://www.fda.gov/about-fda/center-drug-evaluation-and-research-cder/artificial-intelligence-drug-development.
3. U.S. Food and Drug Administration, “Considerations for the Use of Artificial Intelligence to Support Regulatory Decision-Making for Drug and Biological Products,” draft guidance, January 2025, https://www.fda.gov/regulatory-information/search-fda-guidance-documents/considerations-use-artificial-intelligence-support-regulatory-decision-making-drug-and-biological.

