FAQs

Table of Contents

General Questions

Q: What does i2b2 stand for and where can I get more information?
Q: What is the scope of data for the MUSC i2b2 project?
Q: Who has access to the i2b2 project? How do I gain access?
Q: How is patient data protected?
Q: What version of i2b2 are we using?
Q: How/where do I report an issue or enhancement?
Q: What if I would like additional help in navigating how to use i2b2 for my project or study?

Navigating in the Application

Q: Is there a user guide for i2b2?
Q: Where do the items in Navigate Terms come from?
Q: Why are there duplicate items in the Navigate Terms hierarchy?
Q: What ontologies are utilized?
Q: What is the SHARED folder in the Workplace section for? What about the folder with my id?
Q: What are the items in the Previous Queries section?
Q: How do you view data from previous queries?
Q: How do I perform a Temporal Query?

Understanding the Data

Q: What is the date range of data?
Q: When is i2b2 refreshed?
Q: Is there a data dictionary? Where is it?
Q: There are several age concepts and values, which one should I use?
Q: What is the scope of the Diagnosis and Procedures data?
Q: What is the scope of the Medications data?
Q: How do I utilize the RXNORM ontology for Medications in my queries?
Q: Not all lab results seem to be available, why is this?
Q: What vitals are included in i2b2?
Q: How are the chronic condition concepts derived under the computable phenotypes category?
Q: How are the Charlson Comorbidity scores calculated?
Q: Where does the clinical trial data come from?
Q: What is the scope of the Cancer Registry data and where does it come from?

Using Plugins

Q: How do I use the i2b2 plugins?
Q: How do I make a patient set and where can I locate it?
Q: Where can I get more details on specific plugins?
Q: How can I format the Export XLS Plugin output so all of the data is on the same row for the same patient?
Q: Are there suggestions to avoid timeout issues while using the Export XLS Plugin?

General Questions

Q: What does i2b2 stand for and where can I get more information?
A: i2b2 stands for Informatics for Integrating Biology and the Bedside. You can find out more by visiting the i2b2 website.

Q: What is the scope of data for the MUSC i2b2 project?
A: The data for the MUSC i2b2 project includes a subset of the data from the Research Data Warehouse. New data domains are added to the project quarterly. A full list of available fields in the Research Data Warehouse and the i2b2 interface can be found in our data dictionary.

Q: Who has access to the i2b2 project? How do I gain access?
A: All MUSC faculty and sponsored staff members have access. All non-faculty users must be sponsored by a MUSC faculty member; please send a completed Sponsorship Form to datarequest@musc.edu. Access is generally granted within 1 to 2 business days.

Q: How is patient data protected?
A: The data that populates i2b2 is a limited data set. All HIPAA identifiers have been removed except for the dates of service. Any query that matches 25 or fewer patients returns 0 as the patient count, which prevents users from analyzing small patient sets with the plugin tools.
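The small-count rule can be illustrated with a minimal sketch. The function and constant names are hypothetical; only the threshold of 25 comes from the answer above, and the actual i2b2 implementation may differ.

```python
# Hypothetical illustration of the small-count rule; not the actual i2b2 code.
SMALL_COUNT_THRESHOLD = 25  # from the FAQ: counts of 25 or fewer are reported as 0


def displayed_patient_count(true_count: int) -> int:
    """Return the patient count a user would see for a query result."""
    if true_count <= SMALL_COUNT_THRESHOLD:
        return 0
    return true_count


if __name__ == "__main__":
    for n in (0, 25, 26, 1200):
        print(n, "->", displayed_patient_count(n))
```

A result of 0 can therefore mean either "no matching patients" or "a small number of matching patients".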

Q: What version of i2b2 are we using?
A: We are using web client version 1.7.08a. Details can be found in the Release Notes.

Q: How/where do I report an issue or enhancement?
A: Please email datarequest@musc.edu to report an issue or suggest an enhancement.

Q: What if I would like additional help in navigating how to use i2b2 for my project or study?
A: Please check out the resources available to you on the BMIC website. If you still need additional help, you can submit a Self-Service Research Data & Feasibility Consultation service request at sparc.musc.edu.

Navigating in the Application

Q: Is there a user guide for i2b2?
A: Yes, to access the user guide press Help in the upper right section of the web client.

Q: Where do the items in Navigate Terms come from?
A: The items that appear in the Navigate Terms portion of the i2b2 Query & Analysis Tool are built from standard medical ontologies or customized to match MUSC’s data in the Research Data Warehouse. The terms are meant to be understood by researchers familiar with our organization’s data.

Q: Why are there duplicate items in the Navigate Terms hierarchy?
A: The concepts that appear in the Navigate Terms portion of the i2b2 Query & Analysis Tool are sometimes part of multiple medical ontologies and therefore appear multiple times.

Q: What ontologies are utilized?
A: The concepts that appear in the Navigate Terms are described below:

  • Navigation Folder Name: Allergies
    Ontology: Variation of SNOMED
    Additional Information: SNOMED website
  • Navigation Folder Name: Assessments
    Ontology: Custom
    Additional Information: Based on subset of MUSC observations
  • Navigation Folder Name: Cancer Registry
    Ontology: Based on NAACCR
    Additional Information: NAACCR website
  • Navigation Folder Name: Demographics
    Ontology: MUSC Local Codes
    Additional Information: Epic
  • Navigation Folder Name: Diagnoses
    Ontology: ICD9-CM and ICD10-CM
    Additional Information: ICD9 – prior to October 2015 and ICD10 – after October 2015
  • Navigation Folder Name: Immunizations
    Ontology: CDC
    Additional Information: CDC website
  • Navigation Folder Name: Labs
    Ontology: LOINC
    Additional Information: LOINC website
  • Navigation Folder Name: Medications
    Ontology: RxNorm
    Additional Information: RxNorm website
  • Navigation Folder Name: Problem List
    Ontology: ICD9-CM
  • Navigation Folder Name: Procedures
    Ontology: CPT, ICD9-CM, ICD10-PCS
  • Navigation Folder Name: Research Permissions
    Ontology: MUSC Local Codes
    Additional Information: Epic
  • Navigation Folder Name: Visit
    Ontology: MUSC Local Codes
    Additional Information: Epic
  • Navigation Folder Name: Vital Signs
    Ontology: Custom

Q: What is the SHARED folder in the Workplace section for? What about the folder with my id?
A: Items in the Workplace SHARED folder are queries that are visible to other users with access to the same project. If you want to share a query with others in your group, drag the query to the SHARED folder. These queries won’t be deleted. Items in the NetID folder are personal queries that won’t be deleted. If you want to save a query for personal use, drag it to your personal folder.

Q: What are the items in the Previous Queries section?
A: Any query you execute is saved in the Previous Queries section. The default name is built from the first characters of the items placed in the Group boxes. You can overwrite the default name, or rename the query after it is saved by right-clicking it. Only the most recent queries are visible in the Previous Queries section; older queries roll off. To keep a query long term, drag it to your personal folder. You can also right-click a previous query to delete it. To change the maximum number of queries displayed, press the ‘Set Options’ button and overwrite the default of 20.

Q: How do you view data from previous queries?
A: There are two options: you can replay the previous query to see the prior results or rerun against current data. Dragging the previous query to the “Query Name” bar will replay the results of the original query. Dragging the query to the “Group 1” panel will allow you to re-run the query against the current data. If the data has changed, the result sets will be different. For details, reference the i2b2 help guide or select the “i2b2 Previous Queries” section from the i2b2 user guide in the web client.

Q: How do I perform a Temporal Query?
A: The steps below summarize how to set up a Temporal Query. Full instructions with pictures are in the i2b2 User Guide, available by pressing the Help button in the upper right section of the screen.

  • Change the Temporal Constraint from Treat all groups independently to Define sequence of events
  • A second entry appears with a drop down displaying the default selection of Population in which events occur. Select your base set of patients. (If you don’t provide a filter here, the entire patient population is used.)
  • Press the drop down arrow next to Population in which events occur and select Event 1. Define your Event 1.
  • Change the drop down from Event 1 to Event 2. (This step is optional; you can add events if you need more than two.)
  • Change the drop down from Event # to Define order of events. A new set of columns appears, allowing you to define the rules related to the events.
  • Press the Run Query button.
  • Temporal queries have a (t) in front of the name in the Previous Queries panel.

Understanding the Data

Q: What is the date range of data?
A: The current date range, available domains, and patient counts are displayed in the i2b2 interface. An example is below.

[Image: i2b2 data range]

Q: When is i2b2 refreshed?
A: A monthly incremental procedure runs on the 3rd of each month for data entered or updated during the previous month. Utilize the date range displayed in the i2b2 interface to confirm the date range of data loaded.
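As a rough illustration of which dates a given refresh covers, the sketch below computes the previous calendar month for a run date. The function name is hypothetical, and treating "the previous month" as a full calendar month is an assumption; the date range shown in the i2b2 interface remains the authoritative source.

```python
from datetime import date, timedelta


def previous_month_window(run_date: date) -> tuple[date, date]:
    """Return the first and last day of the month before run_date."""
    first_of_this_month = run_date.replace(day=1)
    last_of_previous = first_of_this_month - timedelta(days=1)
    return last_of_previous.replace(day=1), last_of_previous


if __name__ == "__main__":
    # Hypothetical refresh run on the 3rd of March 2024.
    start, end = previous_month_window(date(2024, 3, 3))
    print(start, "to", end)
```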

Q: Is there a data dictionary? Where is it?
A: Yes, the data dictionary is available on the Biomedical Informatics Center website. This dictionary contains all available fields in the research data warehouse. Fields that are available in i2b2 are specified with an “X” in the “Present_in_i2b2” column.

Q: There are several age concepts and values, which one should I use?
A: The “Age” concept under Demographics is calculated based on the current age of the patient on the day you are running the query. The “Age at Visit” concept under Visit is the age of the patient at the visit date.
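The difference between the two concepts can be sketched as follows. The dates are made up, and counting age in whole years is an assumption; the exact calculation used by i2b2 is not documented here.

```python
from datetime import date


def age_on(birth_date: date, on_date: date) -> int:
    """Whole years between birth_date and on_date."""
    years = on_date.year - birth_date.year
    # Subtract one year if the birthday has not yet occurred in on_date's year.
    if (on_date.month, on_date.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years


birth = date(1960, 6, 15)      # hypothetical patient
visit = date(2015, 3, 1)       # hypothetical visit date
query_day = date(2024, 6, 15)  # hypothetical day the query is run

age_at_visit = age_on(birth, visit)  # "Age at Visit": fixed for that visit
age = age_on(birth, query_day)       # "Age": changes as time passes

if __name__ == "__main__":
    print("Age at Visit:", age_at_visit)
    print("Age:", age)
```

Use "Age at Visit" when the question is how old a patient was when care was delivered, and "Age" when the question is how old the patient is now.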

Q: What is the scope of the Diagnosis and Procedures data?
A: The Diagnosis and Procedures data are based on final billing codes.

Q: What is the scope of the Medications data?
A: The medication information exposed in i2b2 consists of medication orders.

Q: How do I utilize the RXNORM ontology for Medications in my queries?
A: Medications are represented through the VA Drug Classes and RXNORM ingredients. If a medication has multiple ingredients, the medication will have a fact for each ingredient.
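A small sketch of the "one fact per ingredient" behavior, using made-up orders and a hypothetical record layout (the real fact table has more columns):

```python
# Hypothetical medication orders: (patient, order description, ingredients).
orders = [
    ("P1", "acetaminophen / codeine tablet", ["acetaminophen", "codeine"]),
    ("P2", "lisinopril tablet", ["lisinopril"]),
]


def ingredient_facts(orders):
    """Expand each order into one (patient, ingredient) fact per ingredient."""
    return [
        (patient, ingredient)
        for patient, _description, ingredients in orders
        for ingredient in ingredients
    ]


if __name__ == "__main__":
    for fact in ingredient_facts(orders):
        print(fact)
```

In this sketch a query on either ingredient finds the combination product, and the number of facts can exceed the number of orders.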

Q: Not all lab results seem to be available, why is this?
A: Only lab results that have an associated LOINC code are exposed in i2b2.

Q: What vitals are included in i2b2?
A: The minimum, maximum, and median measurements per day are available.
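The sketch below shows how raw readings collapse into one daily summary. The readings and the function name are hypothetical; it only illustrates what "min, max, and median per day" means for a single vital sign and a single patient.

```python
from collections import defaultdict
from datetime import datetime
from statistics import median

# Hypothetical systolic blood pressure readings for one patient.
readings = [
    (datetime(2023, 1, 5, 8, 0), 118.0),
    (datetime(2023, 1, 5, 12, 30), 131.0),
    (datetime(2023, 1, 5, 20, 15), 124.0),
    (datetime(2023, 1, 6, 9, 0), 120.0),
]


def daily_summary(readings):
    """Group readings by calendar day and compute min, max, and median."""
    by_day = defaultdict(list)
    for taken_at, value in readings:
        by_day[taken_at.date()].append(value)
    return {
        day: {"min": min(values), "max": max(values), "median": median(values)}
        for day, values in by_day.items()
    }


if __name__ == "__main__":
    for day, stats in sorted(daily_summary(readings).items()):
        print(day, stats)
```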

Q: How are the chronic condition concepts derived under the computable phenotypes category?
A: The chronic conditions are derived monthly. We implemented a variation of the 27 chronic condition algorithms from the CMS Chronic Conditions Data Warehouse (CCW). We use only ‘claims’-related diagnosis codes and apply the inclusion/exclusion rules regarding the claims (any DX on the claim, ONLY first or second DX on the claim, ONLY principal DX on the claim). We do not apply any restriction on the reference period or on the number or type of claims. More details are available in the CMS Chronic Conditions Data Warehouse (CCW), CCW Condition Algorithms (PDF), and CCW Chronic Condition Reference List (PDF).
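The three diagnosis-position rules can be sketched as below. This is an illustration only: the rule names, the function, and the example codes are hypothetical, and it assumes each claim lists its diagnoses in order with the principal diagnosis first.

```python
def claim_matches(claim_dx, condition_codes, rule):
    """Check whether a claim qualifies for a condition under a position rule.

    claim_dx: diagnosis codes on the claim, principal diagnosis first (assumed).
    condition_codes: set of codes that define the condition.
    rule: "any", "first_or_second", or "principal" (hypothetical names).
    """
    if rule == "any":
        candidates = claim_dx
    elif rule == "first_or_second":
        candidates = claim_dx[:2]
    elif rule == "principal":
        candidates = claim_dx[:1]
    else:
        raise ValueError(f"unknown rule: {rule}")
    return any(code in condition_codes for code in candidates)


if __name__ == "__main__":
    claim = ["I10", "E11.9", "J44.9"]  # made-up claim
    copd_codes = {"J44.9"}             # made-up code set
    for rule in ("any", "first_or_second", "principal"):
        print(rule, claim_matches(claim, copd_codes, rule))
```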

Q: How are the Charlson Comorbidity scores calculated?
A: We calculate two Charlson Comorbidity scores - one with an adjustment for age and one without an adjustment for age. Below are the score adjustments for age:

  • Age Range: < 50
    Score: 0
  • Age Range: 50 to 59
    Score: 1
  • Age Range: 60 to 69
    Score: 2
  • Age Range: 70 to 79
    Score: 3
  • Age Range: >= 80
    Score: 4
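The age adjustment can be written as a small lookup. The function name is hypothetical, and the sketch assumes that ages under 50 receive a score of 0, consistent with the ranges listed above.

```python
def charlson_age_points(age: int) -> int:
    """Return the age adjustment added to the age-adjusted Charlson score."""
    if age < 50:
        return 0  # assumption: the under-50 range scores 0
    if age < 60:
        return 1
    if age < 70:
        return 2
    if age < 80:
        return 3
    return 4


if __name__ == "__main__":
    for age in (35, 50, 69, 80):
        print(age, "->", charlson_age_points(age))
```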

This table (PDF) lists the diagnosis codes for each Charlson category in the ICD-10 and Enhanced ICD-9-CM columns of Table 1, "ICD-9-CM and ICD-10 Coding Algorithms for Charlson Comorbidities".

The weights for the Charlson categories are listed below:

  • Weight: 1
    Charlson Category: Cerebrovascular disease; Chronic pulmonary disease; Congestive heart failure; Dementia; Diabetes with chronic complication; Diabetes without chronic complication; Mild liver disease; Myocardial infarction; Peptic ulcer disease; Peripheral vascular disease; Rheumatic disease
  • Weight: 2
    Charlson Category: Any malignancy, including lymphoma and leukemia, except malignant neoplasm of skin; Hemiplegia or paraplegia; Renal disease
  • Weight: 3
    Charlson Category: Moderate or severe liver disease
  • Weight: 6
    Charlson Category: AIDS/HIV; Metastatic solid tumor
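A minimal sketch of how these weights combine into a score. It assumes a plain sum over the distinct categories a patient has, plus an optional age adjustment. Whether MUSC applies any hierarchy rules between related categories (for example mild versus moderate or severe liver disease) is not stated here, so none are applied. The names `CHARLSON_WEIGHTS` and `charlson_score` are hypothetical.

```python
CHARLSON_WEIGHTS = {
    "Cerebrovascular disease": 1,
    "Chronic pulmonary disease": 1,
    "Congestive heart failure": 1,
    "Dementia": 1,
    "Diabetes with chronic complication": 1,
    "Diabetes without chronic complication": 1,
    "Mild liver disease": 1,
    "Myocardial infarction": 1,
    "Peptic ulcer disease": 1,
    "Peripheral vascular disease": 1,
    "Rheumatic disease": 1,
    "Any malignancy, including lymphoma and leukemia, "
    "except malignant neoplasm of skin": 2,
    "Hemiplegia or paraplegia": 2,
    "Renal disease": 2,
    "Moderate or severe liver disease": 3,
    "AIDS/HIV": 6,
    "Metastatic solid tumor": 6,
}


def charlson_score(categories, age_points=0):
    """Sum the weights of a patient's distinct categories, plus age points."""
    # Assumption: a simple sum with no hierarchy between related categories.
    return sum(CHARLSON_WEIGHTS[c] for c in set(categories)) + age_points


if __name__ == "__main__":
    patient = ["Congestive heart failure", "Renal disease"]
    print("Unadjusted:", charlson_score(patient))
    print("Age-adjusted (2 age points):", charlson_score(patient, age_points=2))
```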

Q: Where does the clinical trial data come from?
A: The clinical trial data comes from the RSCH record in Epic. Not all trials are entered into Epic, so the trial data included in the Research Data Warehouse and i2b2 covers only a subset of the patients enrolled in trials at the institution level. Only patients associated with a study that has a valid NCT number, and who have one of the active ENROLLMENT_STATUS values below at the time of the monthly incremental, are included.

  • Consented – In Screening
  • Enrolled Follow-Up Only
  • Enrolled – Receiving Treatment AND/OR Intervention
  • Enrolled – Without Treatment AND/OR Intervention
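The inclusion rule above can be sketched as a filter. The function name is hypothetical, and the meaning of "valid" is an assumption: the sketch checks for the ClinicalTrials.gov identifier format (NCT followed by eight digits), which may not match the check actually used.

```python
import re

# Active ENROLLMENT_STATUS values, as listed in the FAQ.
ACTIVE_STATUSES = {
    "Consented – In Screening",
    "Enrolled Follow-Up Only",
    "Enrolled – Receiving Treatment AND/OR Intervention",
    "Enrolled – Without Treatment AND/OR Intervention",
}

# Assumption: "valid" means the ClinicalTrials.gov format, NCT plus 8 digits.
NCT_PATTERN = re.compile(r"NCT\d{8}")


def is_included(nct_number, enrollment_status):
    """Return True if a patient-study record would be loaded under the rule."""
    has_valid_nct = bool(nct_number and NCT_PATTERN.fullmatch(nct_number))
    return has_valid_nct and enrollment_status in ACTIVE_STATUSES


if __name__ == "__main__":
    print(is_included("NCT01234567", "Consented – In Screening"))
    print(is_included(None, "Consented – In Screening"))
    print(is_included("NCT01234567", "Withdrawn"))
```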

Q: What is the scope of the Cancer Registry data and where does it come from?
A: The data that populates the Cancer Registry fields comes from the registry managed by MUSC’s Hollings Cancer Center (HCC). All data from patients diagnosed 5/12/2012 or later are available in i2b2.

Using Plugins

Q: How do I use the i2b2 plugins?
A: Access the plugins page by clicking Analysis Tools in the top right-hand corner of the screen. Click once on the desired plugin from the Plugins field at the bottom of the screen to select it. Select the Specify Data tab to provide input, and drag and drop patient sets from the Previous Queries box and concepts from the Navigate Terms box. Select the View Results tab to see results. Select the Plugin Help tab for more detailed information about the plugin.
Note: You must have created a patient set beforehand to use plugins. See below.

Q: How do I make a patient set and where can I locate it?
A: Before using plugins, you must run a query in the Find Patients tab. Select the patient set option in the Run Query pop-up box to produce a patient set from your query results.
To locate your patient set(s), expand a query folder and then the results folder under Previous Queries in the lower left-hand corner of the webclient.

Q: Where can I get more details on specific plugins?
A: For more detailed information, refer to the User Guide to i2b2 Plugins.

Q: How can I format the Export XLS Plugin output so all of the data is on the same row for the same patient?
A: At the top of the Output Options tab for the plugin, change the Formatting by selecting ‘1 row per patient, 1 column per observation set’ from the drop down. The default Formatting option is ‘1 row per observation (duplicates removed, 1 column per observation set)’. The following details are available on the “Plug In Help” tab.

  • 1 row per observation (duplicates removed, 1 column per observation set): A new row is created for each observation. All observation details (concept code, value, unit, ...) are written into one cell. One column is created for each concept that has been dragged onto the input box in step 1. Attention: Duplicate entries are removed. This format only returns a list of the different observations that were found.
  • 1 row per observation (all, with timestamps, 1 column per observation set): Similar to the option above, but: timestamps of the observations are tabulated as well. Therefore, duplicates are not possible and nothing is removed.
  • 1 row per observation (detailed, 1 column per observation detail): This is the most detailed option. A new row is created for each observation and all observation details (concept code, value, unit, ...) are written to dedicated columns.
  • 1 row per patient, 1 column per observation set: A new row is created for each patient. One column is created for each concept that has been dragged onto the input box. All observations of a patient are then written into one cell (with respect to the concept column). Note: This is the only output option where the first column starting number will match the specified value of 'Starting Patient'.
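To make the "1 row per patient, 1 column per observation set" layout concrete, the sketch below reshapes a few made-up observations into that form. It is not the plugin's code; the function name, the sample data, and the "; " separator inside each cell are assumptions for illustration.

```python
from collections import defaultdict

# Hypothetical observations: (patient number, concept column, observation text).
observations = [
    (1, "Hemoglobin A1c", "7.2 %"),
    (1, "Hemoglobin A1c", "6.8 %"),
    (1, "Diabetes", "E11.9"),
    (2, "Diabetes", "E11.9"),
]


def one_row_per_patient(observations, concepts):
    """Build one row per patient with one cell per concept column."""
    cells = defaultdict(lambda: defaultdict(list))
    for patient, concept, text in observations:
        cells[patient][concept].append(text)
    rows = []
    for patient in sorted(cells):
        # All observations for a concept are joined into a single cell.
        row = [patient] + ["; ".join(cells[patient][c]) for c in concepts]
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in one_row_per_patient(observations, ["Hemoglobin A1c", "Diabetes"]):
        print(row)
```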

Q: Are there suggestions to avoid timeout issues while using the Export XLS Plugin?
A: The i2b2 community offers the following suggestions:

  • If a query would return a very large result set, the server automatically pages the result. This causes a considerable delay and sometimes fails or hangs due to timeouts. If you encounter this problem, you can page the query manually by setting the 'Query Subgroup Size' value on the "Specify Data" tab. This is still slower than an 'at-once' query, but faster than automatic paging, and it avoids server overload. The necessary value cannot be predicted in general and depends strongly on the number of observations returned, but 20 to 50 is a good starting point. Higher values result in faster processing but a higher risk of server overload. Press the HELP button for more details.
  • If a patient set contains thousands of patients and you are not sure whether the concepts you are specifying will result in a long run time or a large result set, start with a subset of the patient set (e.g. 'Starting Patient': 1 and 'Number of Patients': 500). You can always run the export again with subsequent subsets (e.g. 'Starting Patient': 501 and 'Number of Patients': 500). Press the HELP button for more details.
  • Avoid checking ‘Resolve Concept/Modifier Codes’ or ‘Include Ontology Path of Concepts’ on the “Output Options” tab until the final export of data.
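When exporting a large patient set in subsets, it can help to list the 'Starting Patient' and 'Number of Patients' values for each run in advance. The helper below is a hypothetical planning aid, not part of i2b2; the batch size of 500 follows the example in the suggestions above.

```python
def patient_batches(total_patients, batch_size=500):
    """Yield ('Starting Patient', 'Number of Patients') pairs covering the set."""
    start = 1
    while start <= total_patients:
        count = min(batch_size, total_patients - start + 1)
        yield start, count
        start += count


if __name__ == "__main__":
    # Hypothetical patient set of 1,200 patients.
    for start, count in patient_batches(1200):
        print(f"Starting Patient: {start}, Number of Patients: {count}")
```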