BCHSI / UCSF Datasets Selected to be Integrated into NAIRR Pilot
The U.S. National Science Foundation has announced two major advancements in America's AI infrastructure: the launch of the Integrated Data Systems and Services (NSF IDSS) program to build out national-scale data systems, and the selection of 10 datasets for integration into the National Artificial Intelligence Research Resource (NAIRR) Pilot. Two of the selected datasets are affiliated with BCHSI / UCSF. The selection followed a competitive NSF-led process in partnership with an interagency group of 12 federal agencies. Read the NSF news article.
Learn more below!
The microbiome dataset from the DREAM Challenge is one of the 10 datasets selected for integration into the NAIRR Pilot. The challenge was led by the March of Dimes Prematurity Research Center at UCSF (Tomiko Oskotsky, MD, Scientific Director, March of Dimes Prematurity Research Center, UCSF, and Marina Sirota, PhD, Professor and BCHSI Interim Director).
This dataset includes more than 3,500 vaginal microbiome samples from approximately 1,300 pregnant women across multiple studies, harmonized using the MaLiAmPi tool developed by Jonathan Golob. More details on the dataset and the DREAM Challenge are available in the team’s Cell Reports Medicine publication.
This dataset, along with other pregnancy-related omics datasets, is accessible through the March of Dimes Prematurity Research Data Repository (https://pretermbirthdb.org). See the MOD press release. The UCSF MOD team is thrilled to see this resource shared more broadly through NSF’s platform and looks forward to the future opportunities it will enable.
UCSF Industry Documents Library
BCHSI Associate Director, Knowledge Computing, Gundolf Schenk, PhD, and team collaborated with UCSF’s Industry Documents Library (IDL) to explore how to de-identify archival documents using “Philter” (Protected Health Information Filter), a tool they created. The IDL is a digital archive containing over 25 million publicly released documents from industries that impact public health. The partnership with BCHSI has supported the IDL in its efforts to identify and safeguard personal information before documents are made public. The UCSF IDL is frequently used by UCSF faculty, staff, and students in public health research and is also one of the 10 datasets selected for integration into the NAIRR Pilot.
