NCT07500428: BUST-AI Bench

Construction of a Benchmark for Breast Ultrasound AI Interpretation and Performance Evaluation of Multimodal AI Models

Recruiting now · Last updated 30 March 2026
What this trial tests

An observational trial testing Multimodal AI Model Diagnostic Evaluation in Breast Neoplasms, with an enrollment of 1,380 participants. Currently enrolling.

Timeline
Start date: 12 March 2026
Primary completion: 1 December 2026
Estimated completion: 1 March 2027

Quick facts

Lead sponsor: Peking Union Medical College Hospital
Status: Recruiting now
Study type: Observational
Enrollment: 1,380
Start date: 12 March 2026
Primary completion: 1 December 2026
Estimated completion: 1 March 2027
Sites: 1 location in China

Drugs / interventions tested

Multimodal AI Model Diagnostic Evaluation

Conditions studied

Breast Neoplasms, Breast Diseases

Sponsor

Peking Union Medical College Hospital

Who can join

Women aged 18 to 75 with Breast Neoplasms or Breast Diseases. Only patients with the condition are eligible; healthy volunteers are not accepted.

Sponsor's own description

This single-center, retrospective, observational study aims to construct a standardized benchmark evaluation system for intelligent breast ultrasound image interpretation and to systematically assess the diagnostic performance of current mainstream multimodal artificial intelligence (AI) models. De-identified B-mode breast ultrasound images with confirmed pathological diagnoses will be retrospectively collected from the institutional archive (2018-2025) and supplemented with images from published open-access datasets. Expert radiologists with varying experience levels will independently annotate all images according to the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) v2025 criteria, including glandular tissue composition, lesion characterization (mass vs. non-mass lesion), morphological descriptors, and final BI-RADS classification. Baseline deep learning models (CNN-based ResNet-50 and Transformer-based USFM) will be trained to establish performance baselines and to stratify cases by diagnostic difficulty through cross-architecture consensus. Multiple multimodal large language models (MLLMs), including both general-purpose and medical-domain models, will then be evaluated via standardized API calls using BI-RADS-guided chain-of-thought prompts at temperature 0 for reproducibility. Primary endpoints include BI-RADS classification accuracy and diagnostic AUC for benign-malignant differentiation. Model robustness and safety will be assessed through out-of-distribution rejection testing, temperature-stability experiments, and thinking-mode ablation studies. This study adheres to the FLAIR and TRIPOD-LLM reporting guidelines.
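The two primary endpoints named in the description, BI-RADS classification accuracy and AUC for benign-malignant differentiation, can be illustrated with a minimal sketch. This is not the sponsor's evaluation code. The function names, the sample cases, and the idea that each model returns a numeric malignancy score are all assumptions made here for illustration.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases where the predicted BI-RADS category equals the reference."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the Mann-Whitney statistic.

    Probability that a malignant case (label 1) receives a higher score
    than a benign case (label 0); tied scores count as half a win.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical data: four cases with reference and model-predicted BI-RADS categories.
reference_birads = ["3", "4A", "2", "5"]
predicted_birads = ["3", "4B", "2", "5"]

# Hypothetical pathology labels (1 = malignant) and model malignancy scores.
pathology = [0, 1, 0, 1]
malignancy_score = [0.10, 0.80, 0.40, 0.35]

print(accuracy(reference_birads, predicted_birads))  # 0.75
print(roc_auc(pathology, malignancy_score))          # 0.75
```

The pairwise AUC is quadratic in the number of cases, which is fine for a sketch; a rank-based implementation would be a more practical choice at the scale of a full benchmark.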

Publications & conference data

No peer-reviewed publications indexed yet for this trial.

Data sources for this page

Drug Landscape aggregates and links these public records for informational use only. Always verify against the primary source before clinical or regulatory decisions. Canonical URL: https://druglandscape.com/trial/NCT07500428.

Primary sources: FDA, ClinicalTrials.gov, EMA, SEC EDGAR, ChEMBL, Wikidata