Data Resources
University of Utah Resources
- Wasatch Front Research Data Center (WFRDC) is a member of the national network of Federal Statistical Research Data Centers (FSRDC), and serves the urban core of Utah, often called the Wasatch Front, as well as the state and the Intermountain region.
- Utah Population Database (UPDB) at Huntsman Cancer Institute at the University of Utah is one of the world’s richest sources of in-depth information that supports research on genetics, epidemiology, demography, and public health.
- MesoWest collects weather data from numerous public sources and provides access to current and past weather data.
- Sloan Digital Sky Survey has created the most detailed three-dimensional maps of the universe ever made, with deep multi-color images of one third of the sky, and spectra for more than three million astronomical objects.
- Research Data Management provides information on some of the most important aspects of data management.
University of Utah Services and Groups
- University of Utah Health Data Science Services (DSS) is the research data concierge for UHealth EDW/Epic data. DSS offers services including Feasibility, Datasets, Analytics, Clinical Trials, Natural Language Processing, Data Management, Tools and Applications, and Collaborations.
- Biomedical Informatics Core (BMIC) of the Clinical and Translational Science Institute (CTSI) provides comprehensive clinical and translational research informatics support to researchers through a variety of means, including education, consultation, and service delivery.
- Cores and Recharge Centers include 42 shared core facilities that offer a variety of advanced technologies, equipment, and capabilities. The Cores facilitate research with specialized equipment run by dedicated and highly trained directors and staff.
- Interdisciplinary Exchange for Utah Science (NEXUS) is a research center that serves the University of Utah’s strategic research priorities. The mission of NEXUS is to address society’s grand social challenges by promoting interdisciplinary research and team science.
- Scientific Computing and Imaging (SCI) Institute has established itself as an internationally recognized leader in visualization, scientific computing, and image analysis research applied to a broad range of domains.
- Surgical Population Analysis Research Core (SPARC) strives to improve surgical care by supporting a core group of faculty, connecting them to a diverse set of big datasets, providing analytic and visualization services, and obtaining extramural funding.
- Technology Licensing Office aims to be a leader in innovation management that creates value for the University of Utah, its stakeholders, and society.
National and International Resources
- All of Us Research Hub is one of the largest biomedical data resources of its kind with health data from a diverse group of participants from across the United States.
- NIH NHGRI Genomic Analysis, Visualization and Informatics Lab-space (AnVIL) is a cloud-based genomic data sharing and analysis platform. AnVIL facilitates integration and computing on and across large datasets generated by NHGRI programs, as well as initiatives funded by National Institutes of Health (NIH), or by other agencies that support human genomics research.
- Environmental Influences on Child Health Outcomes (ECHO)-wide Cohort incorporates longitudinal data on a growing set of 30,000 pregnancies and 50,000 children from 69 pediatric cohorts to investigate how exposure to environmental factors (including physical, chemical, biological, social, behavioral, natural, and built environments) impacts child health and development.
- EPIC Cosmos data set combines billions of clinical data points in a way that forms a high quality, representative, and integrated data set that can be used to change the health and lives of people everywhere.
- Health Care Artificial Intelligence Code of Conduct (AICC) is a pivotal initiative of the National Academy of Medicine (NAM), aimed at providing a guiding framework to ensure that AI algorithms and their application in health, medical care, and health research perform accurately, safely, reliably, and ethically in the service of better health for all.
- Inter-university Consortium for Political and Social Research (ICPSR) advances and expands social and behavioral research, acting as a global leader in data stewardship and providing rich data resources and responsive educational opportunities for present and future generations.
- National Aeronautics and Space Administration (NASA) Commercial SmallSat Data Acquisition (CSDA) Program provides high-quality commercial Earth observation data to NSF-funded researchers at no additional cost. Request access here.
- National AI Research Resource Pilot is a first step towards a shared research infrastructure that will strengthen and democratize access to critical resources necessary to power responsible AI discovery and innovation. See the full list of government and non-government contributed resources aligned with the NAIRR Pilot goals, such as pre-trained models, AI ready datasets, and relevant platforms.
- National Center for Science and Engineering Statistics (NCSES) includes public use files from several surveys, including: FFRDC Research and Development Survey (FFRDC R&D), Higher Education Research and Development Survey (HERD), National Survey of College Graduates (NSCG), Survey of Doctorate Recipients (SDR), Survey of Graduate Students and Postdoctorates in Science and Engineering (GSS), Survey of Science and Engineering Research Facilities, and Scientists and Engineers Statistical Data System (SESTAT) Integrated File.
- NASA Climate Data Democratization Partnership will provide researchers with new ways to access and utilize NASA’s climate data, including navigation via timestamps, fields, and other dimensions.
- NASA Software Catalog offers hundreds of new software programs you can download for free to use in a wide variety of technical applications.
- NHLBI TOPMed: Omics Phenotypes of Heart, Lung, and Blood Disorders accepts applications for three years of access, due October 17 in 2023 and 2024 (LOI due 30 days prior).
- NIH: Access to Data Management Coordinating Center for Diagnostic Centers of Excellence offers access for three to five years.
- NIH Cloud Lab removes barriers to cloud adoption by providing no-cost, customized, and scientifically relevant training, making it easier for researchers to learn about and explore the cloud with confidence. The program provides $500 of cloud credits.
- NIH National Library of Medicine Center for Clinical Observational Investigations aims to reduce barriers to finding and evaluating relevant clinical datasets by providing a curated metadata profile comprising an overview, basic statistics, and concept counts for each clinical dataset.
- NIH National Library of Medicine Dataset Catalog is a catalog of biomedical datasets from various repositories for users to search, discover, retrieve, and connect with datasets to accelerate scientific research.
- National COVID Cohort Collaborative uses privacy-preserving record linkage to synthesize data from multiple different sources including electronic health records (EHR), claims data, and other administrative data to provide a comprehensive understanding of the impact of COVID-19 on individuals and communities.
- NIH ScHARe-hosted datasets include 190 datasets, such as CDC’s BRFSS – Behavioral Risk Factor Surveillance System.
- PhenX Toolkit is an expert-generated, web-based catalog of recommended measurement protocols for human-subject research.
- Physiome Repository is a list of processed model exposures, with all models containing documentation and associated metadata.
- NIH Common Fund’s Stimulating Peripheral Activity to Relieve Conditions (SPARC) Portal integrates data, knowledge, computational modeling, and spatial mapping for the peripheral nervous system, with the aim of advancing scientific understanding and delivering significant impacts for clinical medicine.
- NIH NCI-DOE Collaboration AI/ML Resources include Modeling Outcomes Using Surveillance Data and Scalable AI for Cancer (MOSSAIC), AI-Driven Multiscale Investigation of the RAS/RAF Activation Lifecycle (ADMIRRAL), Innovative Methodologies and New Data for Predictive Oncology Model Evaluation (IMPROVE), Accelerating Therapeutics for Opportunities in Medicine (ATOM), CANcer Distributed Learning Environment (CANDLE), and the Predictive Oncology Model and Data Clearinghouse (MoDaC).
- Data Resources in Polycystic Kidney Disease
Common Fund Datasets
- Common Fund Data Ecosystem Search Portal
- 4D Nucleome (4DN): Reference nucleomics and imaging data sets, including an expanding tool set for open data processing and visualization
- Extracellular RNA Communication (exRNA): Catalog of exRNA molecules found in human biofluids like plasma, saliva, and urine; and potential exRNA biomarkers for diseases
- Gabriella Miller Kids First (KF): Data from whole-genome sequencing of cohorts with structural birth defects and/or susceptibility to childhood cancer, with associated phenotypic and clinical data
- Genotype-Tissue Expression (GTEx): Whole-genome and RNA sequence data from multiple human tissues to study tissue-specific gene expression and regulation, including tissue samples
- Glycoscience (GL): A data integration and dissemination project for carbohydrate and glycoconjugate related data
- Human BioMolecular Atlas Program (HuBMAP): An open and global platform to map healthy cells in the human body to determine how the relationships between cells can affect the health of an individual
- H3Africa: The initiative consists of 51 African projects across 30 African countries that include population-based genomic studies of common, non-communicable disorders such as heart and renal disease, as well as communicable diseases such as tuberculosis
- Human Microbiome Project (HMP): Characterization of five major human body sites and three cohorts using multiomics, including 16S and metagenomic shotgun sequencing
- Illuminating the Druggable Genome (IDG): Data on understudied druggable proteins, including mRNA and protein expression data, phenotype associations, bioactivity data, drug target interactions, disease links, and functional information
- Integrated Human Microbiome Project (iHMP): Microbiome, epigenomic, metabolomic, and phenotypic data for three cohorts
- Knockout Mouse Phenotyping Program (KOMP2): Data from broad, standardized phenotyping of a genome-wide collection of mouse knockouts
- Library of Integrated Network-based Cellular Signatures (LINCS): Molecular signatures that describe how different types of cells respond to a variety of agents that disrupt normal cellular function
- Metabolomics Workbench: Metabolomics data and metadata from studies on cells, tissues, and organisms
- Molecular Transducers of Physical Activity in Humans (MoTrPAC): Data contain assay-specific results, associated metadata, quality control reports, and animal phenotype data related to molecular transducers that underlie the effects of physical activity
- Stimulating Peripheral Activity to Relieve Conditions (SPARC): Maps and tools to identify and influence therapeutic targets that exist within the neural circuitry of a wide range of organs and tissues
- Undiagnosed Diseases Network (UDN): Clinical, multiomics, and model organism data that help provide answers for patients and families affected by undiagnosed conditions
Are we missing key resources?
This is a growing list, and we appreciate your help in keeping it complete. Please contact Penny Atkins with additional resources that should be included.