News from Michigan State University

Novel Statistical Model Pieces Together Biodiversity Puzzle

Thousands of species are in decline worldwide and thousands more are of unknown status, possibly going extinct unnoticed, according to the International Union for Conservation of Nature.

But as species dwindle, data about them are on the rise, from citizen science projects like iNaturalist to technologically advanced data collection sites like the National Science Foundation’s National Ecological Observatory Network (NEON), and could help scientists keep pace with record changes in biodiversity.

Elise Zipkin, associate professor in the Department of Integrative Biology.

Elise Zipkin, associate professor in MSU’s Department of Integrative Biology in the College of Natural Science, will use a three-year, $783,676 NSF grant to unite diverse data sources like these into a novel and flexible statistical modeling framework, which she calls an Integrated Community Model, with the aim of assessing the status, trends and dynamics of biodiversity.

“We cannot ignore the growing amount of opportunistic citizen science data, but it comes with huge challenges—no design, no randomization—all the things the scientific community knows are important for making an inference beyond the area of study,” explained Zipkin, whose grant was awarded by NSF’s Division of Biological Infrastructure. “My lab is asking: what can we do with this wealth of data? How can we make these data valuable for basic and applied research?”

The Zipkin Lab is well positioned to develop the groundbreaking methodology. For the last six years, its members have specialized in estimating how the abundance and distribution of both single species and whole communities of species are affected by climate and environmental change.

Zipkin’s lab has done quite a bit of work developing approaches that integrate data for single species, including evaluating factors influencing monarch butterfly declines. Photo credit: David Pavlik.

“Our lab has done quite a bit of work developing approaches that integrate data for single species,” said Zipkin, who used multiple data sources from Mexico, the U.S. and southern Canada to evaluate the factors influencing monarch butterfly declines. “Simultaneously, we’ve also been working on community modeling approaches where we analyze multi-species data that comes from a single source, such as transect surveys that record all bird species encountered.”

Integrating the sheer amount of biodiversity data across the many available sources is finally possible thanks to high-performance computers. The remaining challenge is designing a framework flexible enough to analyze high-volume data with diverse structures, and one that works across many different kinds of organisms.

Imagine trying to piece together a jigsaw puzzle with millions of individual pieces coming from thousands of different boxes, and fast, before some of the rarer pieces disappear.

“One of the main motivations for developing Integrated Community Models is to gain better inferences on rare species that may have only a few data entries in any single data source,” explained Zipkin, whose previous research demonstrated the consequences of biodiversity loss for rare species. “Integrated Community Models could help provide unprecedented information about the status of rare species and the factors that influence their dynamics.”

This project will develop approaches to estimate population trends of species communities, such as small mammals in the desert Southwest and breeding birds, using multiple data sources. Photo credits: Allison Sussman (mouse) and Erin Zylstra (robins).

Due to the rate at which biodiversity is declining worldwide, data analysis efficiency and speed are top priorities. For example, the NSF NEON sites, which will collect standardized, high-quality ecological data across the country for the next 30 years, are a major source of data for MSU and researchers across the United States.

“One aspect of the model is to make data, such as those collected by NEON, more useful to researchers by combining them with other, complementary collection efforts,” Zipkin said. “NEON is at the beginning of data collection, so integrating NEON data with other data sources and types can make it exceptionally valuable for trend analyses right now, and not just in another decade when the time series are long enough.”

By conducting computer-generated simulations in tandem with empirical case studies, Zipkin will be able to test the conditions under which the Integrated Community Model will be most successful.

“Data integration is the future,” Zipkin said. “There are many species with unknown conservation status, so approaches that allow researchers to simultaneously analyze all available data can go a long way towards rapid, accurate assessments.”

Many questions about how and where biodiversity shifts happen cannot be answered unless scientists gather and analyze more data at larger spatial scales than ever before. Zipkin sees the Integrated Community Model as a critical step towards revealing those answers.

“Ultimately, understanding which species are declining and why is critical to develop effective conservation plans and policies that can protect biodiversity.”

Val Osowski via MSU Today