Revolutionizing Ulcerative Colitis Assessment
How AI-powered endoscopy review improves accuracy and efficiency
Marcela Vieira Fleury Curado, M.D. – Medical Director, Gastroenterology at Clario
In the evolving landscape of gastroenterology, artificial intelligence (AI) is transforming the accuracy and efficiency of endoscopic assessments, particularly in ulcerative colitis (UC) clinical trials. Central reading – where expert gastroenterologists evaluate colonoscopy videos using scoring systems like the Mayo Endoscopic Subscore (MES) – is essential for assessing disease severity consistently. However, manual review is time-consuming, and interpretation can vary from reader to reader.
A study by researchers from Clario and Atrius Health demonstrates how AI-powered frame detection can streamline endoscopic video review, reduce workload, and improve grading consistency. The system automates the identification of the most relevant frames, allowing gastroenterologists to focus on key video segments, speed up assessments, and make more reliable clinical decisions. By filtering out non-informative frames, AI not only enhances the accuracy of disease grading but also reduces the time burden and fatigue on clinicians, making UC clinical trials more efficient for both researchers and patients.
The challenge of manual MES scoring
Assessing UC severity accurately requires meticulous review of colonoscopy videos, but the process is hampered by several inefficiencies.
Primary issues include:
- High inter-reader variability: Different gastroenterologists may interpret the same degree of severity differently, leading to inconsistencies in grading.
- Time-intensive review process: Hours of video footage must be examined manually to determine disease severity, placing a heavy workload on reviewers.
- Non-informative frames: Portions of these videos often contain uninformative segments, including blurry frames, out-of-body views, or obstructed visuals, which slow down review without contributing to meaningful assessment.
Addressing these challenges requires a method to automatically highlight the most relevant video frames while filtering out non-informative segments.
AI-powered frame detection: A more efficient approach
Researchers developed an AI system capable of automatically detecting and prioritizing informative frames from endoscopic videos. The system, trained on an aggregated dataset of over 100,000 annotated endoscopic frames, uses neural networks to identify and remove unhelpful video segments, thereby making UC grading more efficient.
A key feature of this system is its temporal smoothing module, which analyzes each video frame by frame and smooths the resulting predictions across neighboring frames. This ensures that only the most relevant frames – those that truly reflect disease severity, or the absence of it – are presented for assessment.
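To make the idea concrete, here is a minimal sketch of how such filtering might work, assuming the network outputs a per-frame informativeness probability. The window size and threshold are illustrative placeholders, not the configuration used in the study:

```python
import numpy as np

def smooth_scores(frame_scores, window=15):
    """Moving-average temporal smoothing over per-frame
    informativeness probabilities (one float per frame)."""
    kernel = np.ones(window) / window
    # mode="same" keeps one smoothed score per input frame
    return np.convolve(frame_scores, kernel, mode="same")

def select_informative_frames(frame_scores, window=15, threshold=0.5):
    """Return indices of frames whose smoothed score clears the
    threshold; everything else is treated as non-informative."""
    smoothed = smooth_scores(np.asarray(frame_scores, dtype=float), window)
    return np.flatnonzero(smoothed >= threshold)

# Toy input: a blurry stretch in the middle of an otherwise clear video
scores = [0.9] * 40 + [0.1] * 20 + [0.85] * 40
kept = select_informative_frames(scores)
print(f"Kept {len(kept)} of {len(scores)} frames")
```

Smoothing over a window, rather than thresholding each frame in isolation, prevents a single sharp frame inside a blurry stretch (or one blurry frame inside a clear stretch) from flipping the keep/discard decision back and forth.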
To evaluate its effectiveness, the AI model was tested on a large aggregated dataset of 1,564 UC patients and a total of 357 hours of video footage.
The findings showed:
- 80.02% accuracy in detecting non-informative frames
- 3 minutes 9 seconds saved per video on average by filtering uninformative segments
- 83 total hours saved across the dataset (consistent with roughly 1,564 videos × 3 minutes 9 seconds ≈ 82 hours)
- 77.61% mean video informativeness level (mVIL), indicating that most retained frames provided meaningful diagnostic information
Boosting the accuracy of automated disease scoring
In addition to saving time, AI-powered frame selection can improve the performance of automated MES grading. The system was evaluated by comparing AI-predicted scores with those from expert gastroenterologists. Results showed a high correlation between AI-generated MES scores and human assessments, both in the full dataset and when central and local readings aligned.
The metric used to assess scoring agreement was the Quadratic Weighted Kappa (QWK), which measures how closely AI-generated scores matched expert assessments, with larger disagreements penalized more heavily.
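For readers who want to reproduce the agreement metric, QWK is readily available in scikit-learn. The scores below are made-up toy values for illustration, not data from the study:

```python
from sklearn.metrics import cohen_kappa_score

# Toy AI-predicted and expert MES scores for the same videos
# (MES ranges from 0 = normal to 3 = severe disease)
ai_scores     = [0, 1, 2, 3, 2, 1, 0, 3]
expert_scores = [0, 1, 2, 2, 2, 1, 1, 3]

qwk = cohen_kappa_score(ai_scores, expert_scores, weights="quadratic")
print(f"Quadratic Weighted Kappa: {qwk:.3f}")
```

Because the weighting is quadratic, an AI score of 3 against an expert score of 0 costs far more than a score of 1 against 0, which matches clinical intuition about the severity scale.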
The impact of AI-driven filtering on QWK was as follows:
Video Type | Before AI Filtering (QWK) | After AI Filtering (QWK) | Improvement
---|---|---|---
Full-Length Videos | 0.666 | 0.690 | +0.024
Concordant Readings | 0.706 | 0.725 | +0.019
Adjudicated Readings | 0.615 | 0.646 | +0.031
The data confirms that removing non-informative frames enhances the accuracy of AI-assisted disease grading. Even small improvements in QWK translate into more consistent and reliable UC assessments, benefiting clinical trials and patient care.
Reducing reviewer fatigue and enhancing workflow
The introduction of AI-powered frame detection is not just about improving accuracy – it also reduces fatigue among gastroenterologists. Physicians reviewing endoscopic videos must maintain high levels of focus to ensure consistent, accurate grading. Unnecessary exposure to irrelevant footage could contribute to mental strain and variability in assessments.
By trimming videos to focus only on essential frames, AI enhances both the efficiency and quality of clinical evaluations. This allows gastroenterologists to allocate more focus to complex cases rather than sifting through redundant footage. The AI system can be integrated into the central reading platform, streamlining video review by optionally enabling the automated filtering of non-assessable intervals.
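As a sketch of what that integration might look like, the retained frames can be grouped into contiguous time spans that the review player jumps between. This helper is hypothetical and not Clario's actual platform API:

```python
def keep_intervals(kept_frames, fps=30.0, max_gap=1):
    """Group kept frame indices into contiguous (start_sec, end_sec)
    spans so a review player can skip the filtered-out footage.
    A gap larger than `max_gap` frames starts a new span."""
    if len(kept_frames) == 0:
        return []
    spans = []
    start = prev = kept_frames[0]
    for frame in kept_frames[1:]:
        if frame - prev > max_gap:
            spans.append((start / fps, (prev + 1) / fps))
            start = frame
        prev = frame
    spans.append((start / fps, (prev + 1) / fps))
    return spans

# Example: frames 0-39 and 60-99 survived filtering in a 30 fps video
print(keep_intervals(list(range(40)) + list(range(60, 100))))
# [(0.0, 1.33...), (2.0, 3.33...)]
```

Keeping the filtering optional, as described above, lets a reader fall back to the full-length video whenever a trimmed segment looks ambiguous.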
The future of AI in gastroenterology
AI-supported video analysis represents a significant leap forward in UC clinical trials, but its implications extend beyond UC alone. Similar AI frameworks could be applied to other gastrointestinal conditions that rely on endoscopic evaluation, such as Crohn’s disease or colorectal cancer.
Moreover, as AI models continue to learn from larger and more diverse datasets, their accuracy and efficiency will improve, further refining the role of AI-assisted assessments in gastroenterology.
The successful implementation of AI-powered informative frame detection in UC trials paves the way for broader adoption of AI in clinical workflows. By enhancing accuracy and improving efficiency, AI is set to redefine the future of disease assessment and treatment planning.