Cash Award


1st Prize


2nd Prize


3rd Prize

Sponsored by the Suit Endowment Fund and the Mary R. Boyvey Dean's Excellence Fund at the School of Information, University of Texas at Austin


The goal of the AI Health Data Challenge is to promote data-driven and AI-driven approaches to enable better health. Participants can join individually or as a team to develop either a tool or a data analytics package based on the MIMIC datasets and the PubMed Knowledge Graph. Potential examples include (but are not limited to):

  • Applying and developing novel AI algorithms to automatically generate bounding boxes that annotate MIMIC CXR chest X-ray images, enable visual question answering using CXR radiology reports and images, automatically generate radiology reports from CXR chest X-ray images, or create human-centered AI approaches for medical imaging diagnosis.
  • Increasing the interpretability of AI approaches for patient risk prediction based on MIMIC EHR datasets.
  • Developing apps that enable evidence-based care for doctors and patients, based on MIMIC EHR datasets and using the FHIR standards.
  • Applying graph mining to the PubMed Knowledge Graph.
  • Building a frontend for the PubMed Knowledge Graph.

Data Access

· MIMIC: There is a formal process for requesting access to the MIMIC datasets. You will need to pass the Human Subject Online Course from MIT to get permission to download the data. This process can take anywhere from several days to several weeks, so plan accordingly. Participants must agree to the MIMIC data use agreement.

· PubMed Knowledge Graph: The PubMed Knowledge Graph covers PubMed articles from 1800 to 2020. It combines bio-entities extracted from 29 million PubMed articles using BioBERT, disambiguated author names, funding data integrated through NIH ExPORTER, affiliation history and educational background of authors from ORCID, and fine-grained affiliation data from MapAffil (Xu et al., 2020). By integrating these credible multi-source datasets, the knowledge graph captures connections among bio-entities (e.g., gene/protein, disease, drug/chemical, species, and mutations), authors, articles, affiliations, and funding. The author name disambiguation reaches an F1 score of 98.09%. The knowledge graph contains 14,830,461 authors, 18,361,409 bio-entities, 8,300,984 affiliations, and 102,070 NIH-funded projects. For more details, please visit:

Important Dates

(all deadlines are US Central Time)

June 7th, 2021

AI Health Data Challenge submission deadline

June 27th, 2021

Notification for final round

July 7th, 2021

Competition for final awards