AI Health Data Challenge

Sponsored by the Suit Endowment Fund and the Mary R. Boyvey Dean’s Excellence Fund at the School of Information, The University of Texas at Austin.

Cash Awards

3rd Prize
2nd Prize
1st Prize

The AI Health Data Challenge is a competition for enthusiastic student scientists who want to showcase their analytical and technical skills. Students can participate individually or as a team, using data-driven and AI-driven approaches to enable better health based on the MIMIC datasets and the PubMed Knowledge Graph.

Potential Examples

Designated Datasets


There is a formal process for requesting access to MIMIC datasets.

You must pass the Human Subject Online Course from MIT to get permission to download the dataset. This process can take anywhere from several days to several weeks, so plan accordingly.

Participants must agree to the MIMIC data use agreement.

PubMed Knowledge Graph

The PubMed Knowledge Graph covers PubMed articles from 1800 to 2020. It combines bio-entities extracted from 29 million PubMed articles using BioBERT, disambiguated author names, funding data integrated through NIH ExPORTER, the affiliation history and educational background of authors from ORCID, and fine-grained affiliation data from MapAffil (Xu et al., 2020).

By integrating these credible multi-source datasets, the PubMed Knowledge Graph connects bio-entities (e.g., genes/proteins, diseases, drugs/chemicals, species, and mutations), authors, articles, affiliations, and funding. Its author name disambiguation reaches an F1 score of 98.09%.

The PubMed Knowledge Graph contains 14,830,461 authors, 18,361,409 bio-entities, 8,300,984 affiliations, and 102,070 NIH-funded projects.

For more details, please visit:

Submission Guidelines


A link to your GitHub repository (optional)


A 2-5 page report on how you built the app. The report should describe your methods in detail and include screenshots of your app/tool.


A 5-minute video about your app/tool

Important Dates

June 7, 2021


5 PM Central Time (U.S.)

June 27, 2021

Notification for final round

5 PM Central Time (U.S.)

July 7, 2021

Competition for final awards

5 PM Central Time (U.S.)

Judging Criteria

The judging panel will consist of experts in healthcare, data science, artificial intelligence, and entrepreneurship. Your submission will be judged on the following criteria:


Does your app/tool usefully address a healthcare issue?
Is your app/tool easy to use?


What are the new features in your app/tool?
Are there any creative and exciting things in your app/tool?


Can others reproduce your app/tool?

Past Winners

1st Prize winner of 2021

Patient-Based Supervised Contrastive Learning for Thoracic Disorders Identification and Localization in Chest X-Ray Images

This project proposed a simple and effective end-to-end framework that uses supervised contrastive learning to identify thoracic disorders in chest X-ray images.

1st Prize winner of 2021

Exploring the effect of Discharge Summaries for the Prediction of 30-day unplanned patient readmission to the ICU

This project incorporates medical notes (e.g., discharge notes) along with demographic data available in the MIMIC-III dataset to visualize patterns and train a model that predicts patient readmission to the ICU.

2nd Prize winner of 2021

COVID-19 Portal: Integrating Literature, Clinical Trials, and Knowledge Graphs

This project built a website on the PubMed Knowledge Graph to help people track COVID-19 research trends by analyzing papers and the relationships among bio-entities, researchers, and institutions.

3rd Prize winner of 2021

Racial Disparity in Medication Prescriptions in MIMIC-III

This project used the MIMIC-III database to design a system that investigates racial and ethnic disparities in medication prescriptions, with the aim of promoting fair health care treatment for all racial groups.

1st Prize winner of 2020


2nd Prize winner of 2020

i-Radiodiagno: Clinically Accurate Report Generation

Our Organizers