NCT02795806

NLM Scrubber: NLM's Software Application to De-identify Clinical Text Documents

ENROLLING BY INVITATION · Last updated 6 April 2026
What this trial tests

Observational study in Personally Identifiable Information with an enrollment of 50,000 participants. Enrolling by invitation.

Timeline
Start: 25 May 2016
Primary endpoint: 31 January 2027
Estimated completion: 31 January 2027

Quick facts

Lead sponsor: National Library of Medicine (NLM)
Status: ENROLLING BY INVITATION
Study type: OBSERVATIONAL
Enrollment: 50,000
Start date: 25 May 2016
Primary completion: 31 January 2027
Estimated completion: 31 January 2027
Sites: 1 location in the United States

Conditions studied

Personally Identifiable Information

Sponsor

National Library of Medicine (NLM)

Who can join

Ages 1 day and older, any sex, with Personally Identifiable Information. Only patients with the condition are included; healthy volunteers are not accepted.

Sponsor's own description

Background: Electronic health records contain a vast amount of data about diseases and treatments. Researchers could use this data to test their ideas, but they would need records from more than just their own group of patients, and access to those records is restricted to protect patient privacy. The U.S. National Library of Medicine (NLM) has created a computer tool called NLM Scrubber. This program recognizes and deletes personal information from health records. The researchers who developed this program now need access to the original records. This will allow them to see how well the program removes personal information from patient records and how they can make it more accurate.

Objectives: To find ways to improve clinical text de-identification.

Eligibility: No new participants. Researchers will review data that have already been collected.

Design: Researchers will collect a random sample of reports from different doctors in different fields. Researchers will manually remove personal information from the records. Researchers will also automatically remove personal information from the original records using NLM Scrubber. Researchers will compare the results of the computer program against the manual changes. They will note when the program has failed to remove personal information, and when it has incorrectly deleted non-personal health information. Researchers will use the results to revise the program. They will keep testing it until the de-identification process is complete.
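The comparison described in the design can be illustrated with a small sketch. This is not NLM Scrubber's code or the study's stated scoring method. It assumes, purely for illustration, that both the manual and the automatic redactions are recorded as character-offset spans and that a redaction only counts as matched when the two spans are identical. The names `Span`, `Comparison`, and `compare_spans`, and the sample offsets, are all hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# Assumption: a redaction is a (start, end) character-offset pair, end exclusive.
Span = Tuple[int, int]


class Comparison(NamedTuple):
    missed: Set[Span]        # personal information the program left in place
    over_removed: Set[Span]  # non-personal text the program deleted
    recall: float            # share of manual redactions the program also made
    precision: float         # share of program redactions that were correct


def compare_spans(manual: Set[Span], automatic: Set[Span]) -> Comparison:
    """Compare automatic redactions with the manual ones using exact matching."""
    matched = manual & automatic
    missed = manual - automatic
    over_removed = automatic - manual
    # With nothing to find (or nothing removed), treat the score as perfect.
    recall = len(matched) / len(manual) if manual else 1.0
    precision = len(matched) / len(automatic) if automatic else 1.0
    return Comparison(missed, over_removed, recall, precision)


# Hypothetical offsets for a single made-up report.
manual = {(0, 8), (20, 30), (45, 55)}
automatic = {(0, 8), (20, 30), (60, 70)}
result = compare_spans(manual, automatic)
print(result.missed, result.over_removed)
```

In this made-up case the program misses one span of personal information and removes one span the reviewers did not mark, which correspond to the two kinds of error the description says the researchers will note. Exact matching is the strictest choice; a real evaluation might instead credit partial overlaps.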

Publications & conference data

No peer-reviewed publications indexed yet for this trial.

Data sources for this page

Drug Landscape aggregates and links these public records for informational use only. Always verify against the primary source before clinical or regulatory decisions. Canonical URL: https://druglandscape.com/trial/NCT02795806.