Rachel Liao is the Head of Scientific Strategy in the Data Sciences Platform at the Broad Institute and a co-Principal Investigator (co-PI) for AnVIL. In this guest blog post, Rachel shares how the AnVIL platform can help researchers meet the new NIH Data Management and Sharing Policy requirements effective as of January 25, 2023.
The past two decades have seen an explosion in data generation, driven by new technologies like single-cell profiling and by the completion of foundational projects like the Human Genome Project, which was itself funded by the NIH through the creation of a new NIH institute, the National Human Genome Research Institute (NHGRI). As a result, the research community now generates dramatically more genomics and other organism-scale datasets, and researchers have tremendous opportunities not only to generate their own data but also to leverage compatible datasets generated by others to study the myriad factors that contribute to the overall health of individuals and, by extension, populations.
For this reason, the National Institutes of Health (NIH) has been a longtime and consistent supporter of robust data sharing policies and practices, notably through the Genomic Data Sharing Policy and other efforts around data sharing. Now, with the release of the new Data Management and Sharing (DMS) Policy, the NIH has further solidified its commitment to “accelerating the pace of biomedical research, enabling validation of research results, and providing accessibility to high-value datasets” through responsible data management.
The new DMS Policy lays out practical requirements and expectations for the management and sharing of all newly generated NIH-funded data, with the aim of increasing the FAIR-ness of the data (that is, its findability, accessibility, interoperability, and reusability) and maximizing the value of NIH-funded research. However, researchers are left on their own to find an appropriate repository, one that both meets the requirements of the DMS Policy and supports data submitters who have limited time and funding.
We are therefore delighted to introduce the AnVIL platform, which offers intuitive tools and attentive user support to help researchers meet all their DMS Policy needs. Funded by NHGRI, AnVIL already has a reputation as a gold-standard repository for hosting and managing large genomics datasets. Now, in response to the community’s need for a repository that meets the more general DMS Policy requirements, AnVIL is offering its repository services to researchers across scientific disciplines, including:
- Secure, cloud-native data storage and management using intuitive self-service tools
- A robust and flexible analysis environment connected to thousands of methods and tools, and the ability to add your own
- A detailed library of support documentation and responsive real-time user support resources
- Data access controls that can be customized to your dataset’s specific Data Use Limitations (DULs), in accordance with global standards like the GA4GH Data Use Ontology (DUO)
- A data access management service, the Data Use Oversight System (DUOS), that supports Data Access Committees with semi-automated handling of data access requests and authorizations
Beyond these services, submitting your data to the AnVIL platform does more than satisfy the DMS Policy requirements: it contributes to a federated data ecosystem that connects your dataset to a network of more than 10 petabytes of NIH-funded data, which researchers are using to advance our ability to prevent and treat human diseases.
As you consider your options for meeting the new DMS Policy requirements, we invite you to reach out to us here or visit this page for more information on how to leverage the AnVIL platform for your data management and sharing needs.