Data Science & Biomedical Informatics

About Us

The Data Science & Biomedical Informatics program is dedicated to the advancement of data analytics, machine learning, and basic/clinical research informatics at Children's Hospital Los Angeles.

How We Support

The Data Science & Biomedical Informatics program promotes data-driven science across the entire research enterprise at CHLA, including practices, functions, solutions, and new ventures, to foster a broader data ecosystem toward innovative clinical discoveries. The program engages directly with the diverse and interdisciplinary research community at CHLA and USC to support the mission of the research enterprise.

At the core of the program are the design, implementation, and operation of cutting-edge information systems, analytics software, and scientific computing resources. The program is also dedicated to educating our research community about optimized data solutions.

The team provides the foundation for data-driven and machine learning-enabled research at CHLA, encouraging widespread adoption and use of a broad spectrum of data assets.

Clinical Research Cohort Discovery

Cohort discovery through the Enterprise Data Lake (EDL) and HealtheIntent gives the CHLA research community a self-service process for exploring hypotheses using patient counts. The de-identified count information can then be used to submit an IRB study protocol and obtain access to the full data set.

The Data Science & Biomedical Informatics program is currently supported by TSRI and Information Services.

TriNetX, Inc.

The TriNetX, Inc. solution allows cohort discovery on HIPAA Safe Harbor de-identified data ranging from November 2004 to December 2018 and is limited to 100 selected labs. In addition to CHLA data, data from about 60 other institutions can be queried.

I2B2 Core

The I2B2 Core system provides cohort discovery on HIPAA limited data set (LDS) de-identified data, including all clinical observations until 2018.


Los Angeles Data Resource (LADR)

The Los Angeles Data Resource (LADR) is a data-sharing consortium organized by UCLA that includes several institutions in the LA area. Data queries are executed at all institutions, resulting in federated cohort discovery. Researchers can then obtain IRB approval from the respective institution to access the full data set.