Biomedical Informatics & Data Science

About Us

The BioMedical Informatics and Data Science program is dedicated to the advancement of data analytics, machine learning and basic/clinical research informatics at Children's Hospital Los Angeles.

How We Support

The BioMedical Informatics and Data Science program promotes data-driven science across the entire research enterprise at CHLA, including practices, functions, solutions and new ventures, to foster a broader data ecosystem toward innovative clinical discoveries. The program engages directly with the diverse and interdisciplinary research community at CHLA and USC to support the mission of the research enterprise.

At the core of the program are the design, implementation and operation of novel information systems, analytics software and scientific computing resources. The program is also dedicated to educating our research community about optimized data solutions.

The BioMedical Informatics and Data Science team provides the foundation for data-driven and machine learning-enabled research at CHLA, encouraging widespread adoption and use of a broad spectrum of data assets.

  • Clinical Research Cohort Discovery

Cohort discovery gives the CHLA research community a self-service way to explore hypotheses by querying de-identified patient counts. Researchers can then use these counts to support an IRB study protocol submission and, once approved, obtain access to the full data set.

The BioMedical Informatics and Data Science program currently supports three (3) systems for cohort discovery:

Trinetx, Inc

The Trinetx, Inc solution allows cohort discovery on HIPAA Safe Harbor de-identified data ranging from November 2004 to December 2018 and is limited to 100 selected labs. In addition to CHLA data, data from about 60 other institutions can be queried.

I2B2 Core

The I2B2 Core system provides cohort discovery on HIPAA LDS (limited dataset) de-identified data including all clinical observations until 2018.


Los Angeles Data Resource (LADR)

The Los Angeles Data Resource (LADR) is a data sharing consortium organized by UCLA that includes several institutions within the LA area. Data queries are executed at all participating institutions, resulting in federated cohort discovery. Researchers can then obtain IRB approval from the respective institution to access the full data set.