Dockstore is a repository for publishing, sharing, and discovering reproducible computational analyses that are described using portable workflow languages. It is the product of a growing collaboration among the University of California, Santa Cruz (UCSC), the Ontario Institute for Cancer Research (OICR), and, recently, the Broad Institute. In this guest post, Beth Sheets, Program Manager at UCSC, offers a rundown of the latest Dockstore features that workflow enthusiasts should know about.
Dockstore has been evolving rapidly thanks to support from NHGRI AnVIL and NHLBI BioData Catalyst, two NIH initiatives centered on hosting petabytes of data in the cloud for researchers to analyze securely and collaboratively, using interconnected platforms that include Dockstore and Terra. As the platform’s capabilities expand, we’re seeing growing numbers of tool developers and organizations contributing content to Dockstore, as well as an influx of researchers who are new to working in the cloud and are looking for workflows that are available “out of the box”.
This kind of growth highlights two key requirements of any repository of community-contributed content: we need to ensure that contributors can manage their content conveniently, and that researchers are able to find the right content for their needs. Today I want to highlight some of the improvements we’ve recently made in this space, which we hope will help meet the needs of our expanding user community and lay solid foundations for sustainable growth over time.
New ways to organize and search for workflows
Over the past year, we’ve introduced two new features, Organizations and Collections, that content contributors can use to organize their workflows. The Organizations feature is meant for grouping workflows produced by a particular team, institution or consortium. At the next level down, the Collections feature is intended to help group content by topic or use case. For example, if you look up workflows published by the Broad Institute, you’ll see that they are further grouped into Collections that reflect different functional areas and projects.
Screenshot of the Broad Institute’s Organization page on Dockstore showing 2 of the 7 Collections currently maintained by Broad teams.
Both Organizations and Collections are controlled by verified members of the organizations in question. Creating a new organization involves making a request to the Dockstore curation team, who verify the requester’s affiliation; this helps ensure that users can trust the provenance of the workflows they find in the repository. Organizations have three membership levels (Admin, Maintainer, Member) with varying permissions, so administrators can manage their organization’s membership while limiting editing privileges to certain members. There are currently 31 verified organizations that publish their workflows in Dockstore!
In a similar vein, content contributors can now link their ORCID profile to their Dockstore account, which makes it easier for anyone interested in their content to look up their qualifications and published work. Currently, ORCIDs are displayed as part of the user thumbnails shown in the list of an organization’s members, and in the list of users who have “starred” an organization or a specific workflow.
The process of linking your ORCID profile is straightforward; see this documentation article for instructions. Keep an eye on this feature as we plan to enhance our integration with ORCID to offer more ways to share your workflow contributions with the research community.
Integrating development, deployment and publication
From a functional standpoint, we also want to make it easy for contributors to keep their content up to date with minimal manual intervention. To that end, we recently published a GitHub App that lets developers who use GitHub set up their code repository to sync automatically with Dockstore. The process involves a one-time registration step per repository or organization, described here, and adding a short .yml configuration file to your GitHub repository that lists the workflow(s) to be synced. Note that this can also be used to register workflows in Dockstore in the first place, bypassing the “normal” manual workflow registration process.
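To give a rough idea of what that configuration file might contain, here is a minimal Python sketch that prints one. It assumes the file is named .dockstore.yml and uses the keys version, workflows, name, subclass and primaryDescriptorPath, which reflect our understanding of the format at the time of writing; please check the Dockstore documentation for the authoritative schema. The workflow name and descriptor path are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class WorkflowEntry:
    """One workflow to sync. All values used below are illustrative only."""
    name: str                     # label for the workflow on Dockstore (hypothetical)
    subclass: str                 # workflow language, e.g. "WDL" or "CWL"
    primary_descriptor_path: str  # path to the main descriptor inside the repository


def render_dockstore_yml(entries, version="1.2"):
    """Render a small YAML document listing the workflows to sync.

    The key names are assumptions about the configuration format,
    not a definitive schema.
    """
    lines = [f"version: {version}", "workflows:"]
    for entry in entries:
        lines.append(f"  - name: {entry.name}")
        lines.append(f"    subclass: {entry.subclass}")
        lines.append(f"    primaryDescriptorPath: {entry.primary_descriptor_path}")
    return "\n".join(lines) + "\n"


# Hypothetical repository containing a single WDL workflow.
example = render_dockstore_yml(
    [WorkflowEntry("align-reads", "WDL", "/workflows/align.wdl")]
)
print(example)
```

In practice you would most likely write the few resulting lines by hand and commit them at the root of your repository; generating the file is mainly useful if you maintain many workflows and want their entries to stay consistent.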
When your content on Dockstore is ready for an official release, you can create an immutable “snapshot” version of your workflow and request a Digital Object Identifier (DOI) for it through Zenodo, so that others can cite it in their own work.
Screenshot of an example workflow’s version management page showing the snapshot creation and DOI request action links.
You can find detailed instructions on creating the snapshot and DOI in the Dockstore documentation.
A video playlist to learn more about how to use Dockstore to the fullest
Finally, if all this sounds lovely but you’re not sure how to get started with Dockstore, check out our new video series on YouTube. It was originally a full end-to-end online workshop that we recorded, which had the advantage of covering all the topics you’re likely to want to learn as a newcomer to the platform in a well-ordered progression. At a runtime of over 2 hours, however, it was daunting and impractical for most viewers. So we cut it up into a playlist of short, digestible segments that maintains the original continuity yet is a lot easier to tackle in increments, and also lets you jump to specific topics much more conveniently.
I hope this will help you take full advantage of Dockstore’s latest improvements, and I look forward to following up to announce other enhancements to the platform that are currently in progress.
For a short review of how to import a workflow from Dockstore starting from within a Terra workspace, see this video, starting at 2:11.