Researchers at the University of California, including UCSB’s National Center for Ecological Analysis and Synthesis, have partnered with dozens of other universities and agencies to create DataONE, a global data access and preservation network for earth and environmental scientists that will support breakthroughs in environmental research.
DataONE (Data Observation Network for Earth) is one of two $20 million awards made this year as part of the National Science Foundation’s DataNet program. Universities and government agencies coalesced to address the mounting need for organizing and serving up vast amounts of highly diverse and inter-related but often incompatible scientific data. Resulting studies will range from research that illuminates fundamental environmental processes to the identification of environmental problems and potential solutions.
The award will support advanced research as well as data acquisition, storage, mining, integration and visualization for DataONE. The resulting computing and processing “cyberinfrastructure” will be made permanently available for use by the broader UC community and international science communities. DataONE is led by the University of New Mexico and includes additional partner organizations across the United States, as well as from Europe, Africa, South America, Asia and Australia.
“Scientists have spent hundreds of years collecting environmental data — measuring temperature, counting fish and butterflies,” says Stephanie Hampton, deputy director of NCEAS. “We already know quite a lot, when you estimate the volume of scientific data that must exist out there, but the challenge is to find those data sets and then put them together in a manner that helps to address the important questions for science and society. DataONE will be that portal for environmental data.”
The DataONE team will study how a vast digital data network can provide secure and permanent access into the future, and also encourage scientists to share their data. The team will help determine data and data citation standards and will create the tools for organizing, managing and publishing data.
“Institutions have made extensive investments in infrastructure for managing data at their local institutions and in discipline-specific consortia, but these systems generally don’t interoperate,” says Matthew Jones, director of Informatics Research and Development at NCEAS. “DataONE will provide a critically needed interoperability layer that will allow scientists from diverse domains to collaborate on pressing environmental science challenges.”
Scientific data integration and management also occupy computer science researchers who develop methods and tools that support all stages of the data life cycle. “Effective annotation and integration of data, and efficient management of data lineage information are hot research topics in the database and scientific workflows communities,” says Bertram Ludaescher, professor of computer science at UC Davis, whose team specializes in scientific workflow and data integration technologies and in the storage and querying of data provenance.
Libraries have traditionally played a critical role in preserving and providing access to scholarly materials, and recently have begun to focus on the complex challenges associated with managing scientific data. “Libraries don’t have the capacity to address these challenges individually,” says Patricia Cruse, director of the UC Curation Center at the California Digital Library. “We need to partner with researchers, information technologists, and domain specialists to address these complex problems.”
DataONE includes experts from library, computer and environmental sciences explicitly to bridge these worlds and to develop an infrastructure to serve science for many decades to come.