Kathleen Morrill is a PhD candidate studying the genomics of behavior in dogs in the Karlsson Lab at UMass Chan Medical School. In this guest blog post, she gives us a pawtastic tour of canine science initiatives and lays out the ulti-mutt vision of a “dogs in the cloud” data ecosystem.
Mankind’s closest companion is also our ideal counterpart for studying the biology of disease. Dogs share our household environment, live emotional day-to-day lives, and face many of the same health challenges that we do. Like humans, dogs are a naturally diverse population, with comparable variability in health conditions and just as many genetic and environmental factors at play. That’s why dogs attract the interest of many in the scientific research community: we have a ton to learn about the biology of dogs, and those findings can help inform our understanding of our own health.
Facilitating canine science initiatives
Thanks to large-scale community science studies on the biological, genetic, and environmental factors of health and behavior in dogs, canine science is entering its own informatics era. People and their dogs can participate in projects whose aims resemble those of human research studies. For instance, Darwin’s Ark is a survey-based genetic study of behavior in companion dogs. The project has already enrolled over 30,000 dogs and sequenced the genomes of 10% of them, which has helped genetically map canine behavior and, at the same time, address pervasive stereotypes about dog breeds.
Longitudinal studies that follow dogs over their natural lifespans have the potential to answer important questions about healthy aging by capturing high-resolution health and environmental data, in addition to genomes and other biological data. For example, the Dog Aging Project was launched to follow all kinds of dogs over time, collecting genomes, microbiomes, and epigenomes, with parallels to the Framingham Heart Study and the All of Us Research Program. Just as in human studies, these canine studies involve tracking samples, managing health records, and performing standardized analysis, which makes shared resources and computational tools more important than ever. Accordingly, the Dog Aging Project has set up data repositories and analysis tools to run on Terra.bio, which also supports the All of Us Researcher Workbench.
Human and canine genomics for health in tandem
For a long time, dog genetics has heeled on the progress made in human genetics. Now, just as the gaps in the human reference genome are finally filled, the dog boasts *five* independent de novo reference assemblies (including one for the intriguing Australian dingo), each aiming to contribute to a more complete understanding of the dog genome. Efforts like the Dog Genome Project and Dog10K – similar in spirit to the 1000 Genomes Project, and aiming to deeply characterize canine genetic variation – have vastly expanded the collection of high-coverage whole-genome sequences available for dogs. As canine genomic and epigenomic resources continue to grow, it is exciting to envision a shared Cloud data ecosystem for dogs that is more on par with what is available for human studies.
The National Institutes of Health has already taken the first step by creating the Integrated Canine Data Commons, a Cloud-based resource for genomics, transcriptomics, and clinical trial data from dogs with cancer.
How does canine cancer fit within the mandate of the NIH, you ask? The idea is that cross-species analysis between human and canine studies might help us capture shared genetic and epigenetic pathways to health and disease, from cancer to neuropsychiatric conditions. It seems that from a comparative biology perspective too, people can indeed find no better companion than dogs.
A launching pad for bioinformatics
This vision of a canine cloud ecosystem also presents a tremendous opportunity for genomics education: students and trainees looking to explore large data sets can get an excellent start by learning on canine genomic data. Because open-data canine projects pose fewer barriers to data access, these learners may find dogs to be a great way to work with paired individual-level genotypes and phenotypes analogous to those in human studies. Newcomers interested in bioinformatics can get started on real genomic analysis using a Cloud-based platform like Terra, without the need for access to institutional clusters or specialized support.
To get started right away, you can check out the public workspace containing surveys and genomes from dogs enrolled in Darwin’s Ark. To access the Dog Aging Project’s curated collection of dog data, go to the open data access page on the Dog Aging Project website and register for data access credentials. And consider adding your own dog (unless you are a cat person) to these ever-growing canine initiatives!