Dr. Katy Prudic, University of Arizona, School of Natural Resources and the Environment
Co-sponsored by EEB / EEOB.
Abstract. Massive data from citizen science observations are becoming available at a rapidly increasing rate, with no apparent end in sight. Examples include iNaturalist, eButterfly, GBIF, Nature's Notebook, sensors, and satellites, oh my. The emerging field of biodiversity data science presents researchers with many exciting research and training opportunities, as well as challenges. Success in biodiversity data science requires scalable statistical inference integrated with computational science, plus a serious amount of domain science to guide questions and determine when approaches go horribly wrong. In this talk, I discuss some of these challenges and opportunities, and emphasize the importance of incorporating domain knowledge in the development and application of biodiversity data science methods. I illustrate the key points using several case studies, including analysis of data from large-scale, butterfly-focused citizen science web platforms; integrative analysis of different types and sources of data; reproducible and replicable research; and cloud computing focused on providing data products to inform stakeholder decision making.