Shared Data Science Infrastructure (SDSI)
A Presidential Initiative for Interdisciplinary Research (PIIR) in Data Driven Science

Open data promise to enable more efficient and effective decision making, foster innovation that society can benefit from, and drive organizational and sector change through transparency.

Making big open data available, for example as downloads on the web, is a positive step toward these goals, but access alone is not sufficient: significant barriers remain in obtaining and using big data. Data-driven scientists around the world effectively face a new digital divide, because the barrier to entering data-driven science is prohibitive. Only a few places with well-established infrastructure and deep expertise can attempt large-scale data analyses. The necessary expertise spans five areas: programmatic access to data sources for data acquisition and cleaning, data storage and retrieval, data mining, scalable data infrastructure design, and visualization.

The need for expertise in these five different areas significantly raises the cost of entry. As a result, data-driven experiments are often not replicated, reusability of experimental data is low, and the data associated with and produced by such experiments are often inaccessible, obsolete, or worse. Moreover, building analysis infrastructure that processes ultra-large-scale data efficiently can be costly and difficult. There are efforts to simplify large-scale data analysis; however, we do not yet have user-centric solutions that democratize innovation in data-driven science. Other efforts give users access to a set of web-based exploratory analysis tools and report descriptive statistics over datasets, but any new idea that the data providers did not anticipate meets the same barriers. Many scientists therefore cannot innovate for themselves. The problem is particularly acute for small colleges and historically Black colleges and universities (HBCUs), which lack both expertise and resources and are essentially disenfranchised from data-driven science.

This project brings together a transdisciplinary team to lower the barrier to entry into data-driven science for ISU researchers and other data-driven scientists around the world by enabling them to harness open data for 21st-century science and engineering. By doing so, we aim to prepare data-driven scientists for the grand challenges of the next decade, create unique data science capabilities for research and education, and leverage federal, state, local, and private investments to facilitate shared and collaborative data-driven science.

Presidential research initiative promotes big thinking in data-driven science

The third round of funding from an Iowa State University presidential initiative will build four research teams that will use big data to benefit human and animal health, improve cities and build new tools for researchers. Iowa State President Steven Leath launched the Presidential Initiative for Interdisciplinary Research in 2012. The program provides seed funding to establish research teams from across campus to tackle emerging societal challenges. The goal is to help the teams grow into well-funded, cross-disciplinary research groups...

Iowa State University hosts data-driven research summer school

Iowa State University hosted the Midwest Big Data Summer School in June, attracting nearly 150 participants from universities and organizations around the Midwest. The week-long course introduced early-career researchers to data-driven research...