UC Santa Barbara announced on Mar. 26 that it is leading a major initiative, the Big Bee Project, to modernize natural history collections through advanced imaging and data analysis. The project involves collaboration with 13 U.S. institutions and aims to create over one million high-resolution images of bee specimens along with annotated datasets of bee traits, supported by $3 million in funding from the National Science Foundation.
This effort is significant because it allows scientists to apply new research techniques such as artificial intelligence, big data analytics, and networked databases to physical specimens that have traditionally been difficult to study at scale. “How can we apply these techniques to natural history collections, especially when much of the intrinsic information a specimen has to offer is difficult to quantify?” asks Katja Seltmann, director of UC Santa Barbara’s Cheadle Center for Biodiversity & Ecological Restoration.
Seltmann’s team used machine learning, computer vision, and crowdsourcing methods in their research. For example, they uploaded detailed bee photos to the Notes From Nature database, where more than 5,000 volunteers helped take body measurements for an annotated dataset. According to Seltmann’s group, the volunteers’ measurements were comparable in quality to those of trained scientists.
Another aspect of the project involved using computer vision and machine learning tools to analyze hair density and color across hundreds of bee species. This work led researchers to publish findings about how bees adapt their hair coverage in response to different climates and environmental changes. In partnership with B.S. Manjunath, a UC Santa Barbara engineering professor and expert in computer vision, the team also explored automating species identification through analysis of wing structure.
Beyond entomology, the project piloted innovative methods for quantifying complex biological traits and established new standards for research using museum collections. “So scientists have really been pushing to turn these [traits] into quantitative things, like numbers, matrices and graphs,” Seltmann explained.
Manjunath pointed to further applications of his lab’s BisQue platform, which supports image storage and analysis in the cloud. “Storing images is a no-brainer,” he said. His group continues to develop the platform, including integrating natural language processing so that scientists can interact with the software more easily.
While this phase of the Big Bee Project concludes soon, Seltmann plans to keep expanding access so that researchers from various fields can use this growing resource. “We can ask brand new questions because it is now far easier than it ever was to pull patterns from specimens and turn those into things that we can analyze with statistics,” she said.



