
Projects at TBIC

Current Projects & Collaborations

Patient Screening for Trial Eligibility

Insufficient patient enrollment in clinical trials remains a serious and costly problem, and is often considered the most critical issue for the clinical trials community to solve. Two frequently cited reasons are health care providers' lack of awareness of appropriate trials and the difficulty of correlating eligibility criteria with patient characteristics. Eligibility criteria specify the characteristics of study participants and provide a checklist for screening and recruiting those participants. They are essential to every clinical research study. Computable representations of eligibility criteria can significantly accelerate electronic screening of clinical research study participants and improve research recruitment efficiency. The adoption of Electronic Health Record (EHR) systems is growing rapidly in the U.S., and as a result very large quantities of patient clinical data are becoming available in electronic format. Secondary use of these clinical data is essential to realizing their potential for effective scientific research. Our hypothesis is that an automated process based on natural language processing (NLP) can detect patients eligible for a specific clinical trial by linking the information extracted from the narrative description of the trial's eligibility criteria to the corresponding clinical information extracted from the EHR, and then alerting the clinicians caring for the patient.

Improvements & Automation to the Problems & Allergens List

Medical errors are recognized as the cause of numerous deaths, and although some are difficult to avoid, many are preventable. Computerized physician order-entry systems with decision support have been proposed to reduce the risk of medication errors, but these systems rely on structured and coded information in the electronic health record (EHR). Unfortunately, a substantial proportion of the information available in the EHR is only mentioned in narrative clinical documents. Electronic lists of problems and allergies are available in most EHRs, but users must manage them manually: adding new problems, modifying existing ones, and removing those that are no longer relevant. Consequently, these electronic lists are often incomplete, inaccurate, and out of date. As a solution to these problems, we are developing, implementing, and evaluating a new system to automatically extract structured and coded medical problems and allergies from clinical narrative text in the EHR of patients with cancer. This not only helps health care providers maintain complete and timely lists of problems and allergies, giving them an efficient overview of a patient, but also helps health care organizations meet meaningful use requirements. This project is funded by the National Cancer Institute.


Automated Clinical Text De-Identification

Secondary use of clinical data is essential to realizing the potential for high-quality health care, improved health care management, and effective clinical research. De-identification of patient data has been proposed as a way to both facilitate secondary uses of clinical data and protect patient data confidentiality. The majority of clinical data found in the EHR is represented as narrative clinical notes, and manually de-identifying clinical text is tedious and costly. To address these issues, we are developing and evaluating a new system to automatically de-identify clinical notes found in the EHR, in order to improve the availability of clinical text for secondary uses and to strengthen the protection of patient data confidentiality. This will improve access to richer, more detailed, and more accurate clinical data for clinical researchers. It will also ease research data sharing and help health care organizations protect patient data confidentiality. This project is funded by the National Institute of General Medical Sciences (NIGMS).


Automated Patient Electronic Health Record Summarization

CliniWhiz is a Phase II federally funded technology transfer project (STTR, NCI) to enhance an existing system that automatically extracts structured and coded medical problems and allergies from clinical narrative text. See the Problems & Allergens List project above for a fuller description of the problem space.

Past Projects & Collaborations

Treatment Performance & Quality Measures Assessment

The development of high-quality health care and improved health care management relies on methods to encourage and assess adherence to evidence-based care. This assessment is typically based on quality or performance measures such as the Healthcare Effectiveness Data and Information Set (HEDIS) published by NCQA, or the CMS Quality Measures. Similarly, as part of multiple U.S. federal incentives for "meaningful" adoption and use of Electronic Health Record (EHR) systems, various performance measures have been developed and implemented.

In general, these quality or performance measures rely on clinical data that can often only be found in the unstructured part of the EHR: clinical notes. Manual chart review is therefore the most common approach to obtaining these measures. To enable more scalable and sensitive approaches, NLP-based clinical information extraction methods are being developed, with the eventual goal of automatically computing quality and performance metrics from the EHR content.