1
Graves SJ, Marconi S, Stewart D, Harmon I, Weinstein B, Kanazawa Y, Scholl VM, Joseph MB, McGlinchy J, Browne L, Sullivan MK, Estrada-Villegas S, Wang DZ, Singh A, Bohlman S, Zare A, White EP. Data science competition for cross-site individual tree species identification from airborne remote sensing data. PeerJ 2023; 11:e16578. [PMID: 38144190 PMCID: PMC10749090 DOI: 10.7717/peerj.16578] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/25/2023] [Accepted: 11/13/2023] [Indexed: 12/26/2023]
Abstract
Data on individual tree crowns from remote sensing have the potential to advance forest ecology by providing information about forest composition and structure with a continuous spatial coverage over large spatial extents. Classifying individual trees to their taxonomic species over large regions from remote sensing data is challenging. Methods to classify individual species are often accurate for common species, but perform poorly for less common species and when applied to new sites. We ran a data science competition to help identify effective methods for the task of classification of individual crowns to species identity. The competition included data from three sites to assess each method's ability to generalize patterns across two sites simultaneously and apply methods to an untrained site. Three different metrics were used to assess and compare model performance. Six teams participated, representing four countries and nine individuals. The highest performing method from a previous competition in 2017 was applied and used as a baseline to understand advancements and changes in successful methods. The best species classification method was based on a two-stage fully connected neural network that significantly outperformed the baseline random forest and gradient boosting ensemble methods. All methods generalized well by showing relatively strong performance on the trained sites (accuracy = 0.46-0.55, macro F1 = 0.09-0.32, cross entropy loss = 2.4-9.2), but generally failed to transfer effectively to the untrained site (accuracy = 0.07-0.32, macro F1 = 0.02-0.18, cross entropy loss = 2.8-16.3). Classification performance was influenced by the number of samples with species labels available for training, with most methods predicting common species at the training sites well (maximum F1 score of 0.86) relative to the uncommon species where none were predicted.
Classification errors were most common between species in the same genus and different species that occur in the same habitat. Most methods performed better than the baseline in detecting if a species was not in the training data by predicting an untrained mixed-species class, especially in the untrained site. This work has highlighted that data science competitions can encourage advancement of methods, particularly by bringing in new people from outside the focal discipline, and by providing an open dataset and evaluation criteria from which participants can learn.
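The three metrics named in the abstract (accuracy, macro F1, and cross entropy loss) can be illustrated with a minimal sketch. This is not the competition's evaluation code: the function names, the species codes "A", "B", "C", and the toy predictions are all hypothetical, and macro F1 is assumed here to be averaged over the classes present in the true labels.

```python
import math

def accuracy(y_true, y_pred):
    # Fraction of crowns whose predicted species matches the label.
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # Unweighted mean of per-class F1 over classes in the true labels
    # (assumption), so rare species count as much as common ones.
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

def cross_entropy(y_true, probs, eps=1e-15):
    # Mean negative log probability assigned to the true class;
    # probabilities are clipped to avoid log(0).
    total = 0.0
    for t, p in zip(y_true, probs):
        total -= math.log(min(max(p.get(t, 0.0), eps), 1.0))
    return total / len(y_true)

# Toy data (hypothetical): one common species "A", two rare ones.
y_true = ["A", "A", "A", "B", "C"]
y_pred = ["A", "A", "A", "A", "A"]  # always predicts the common species
probs = [{"A": 0.5, "B": 0.25, "C": 0.25} for _ in y_true]

acc = accuracy(y_true, y_pred)
f1 = macro_f1(y_true, y_pred)
ce = cross_entropy(y_true, probs)
print(acc, f1, ce)
```

In this toy case a model that always predicts the common species reaches an accuracy of 0.6 but a macro F1 of only 0.25, which mirrors the gap between accuracy and macro F1 reported in the abstract.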
Affiliation(s)
- Sarah J. Graves
  - Nelson Institute for Environmental Studies, University of Wisconsin-Madison, Madison, Wisconsin, United States
- Sergio Marconi
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, United States
- Dylan Stewart
  - Department of Electrical and Computer Engineering, University of Florida, Gainesville, Florida, United States
- Ira Harmon
  - Department of Computer and Information Sciences and Engineering, University of Florida, Gainesville, Florida, United States
- Ben Weinstein
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, United States
- Yuzi Kanazawa
  - Artificial Intelligence Laboratory, Fujitsu Laboratories Ltd., Kawasaki, Kanagawa, Japan
- Victoria M. Scholl
  - Earth Lab, Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado at Boulder, Boulder, Colorado, United States
  - Department of Geography, University of Colorado at Boulder, Boulder, Colorado, United States
- Maxwell B. Joseph
  - Earth Lab, Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado at Boulder, Boulder, Colorado, United States
- Joseph McGlinchy
  - Earth Lab, Cooperative Institute for Research in Environmental Sciences (CIRES), University of Colorado at Boulder, Boulder, Colorado, United States
- Luke Browne
  - Yale School of the Environment, Yale University, New Haven, Connecticut, United States
- Megan K. Sullivan
  - Yale School of the Environment, Yale University, New Haven, Connecticut, United States
- Daisy Zhe Wang
  - Department of Computer and Information Sciences and Engineering, University of Florida, Gainesville, Florida, United States
- Aditya Singh
  - Department of Agricultural & Biological Engineering, University of Florida, Gainesville, Florida, United States
- Stephanie Bohlman
  - School of Forest, Fisheries, and Geomatics Sciences, University of Florida, Gainesville, Florida, United States
- Alina Zare
  - Department of Electrical and Computer Engineering, University of Florida, Gainesville, Florida, United States
  - Informatics Institute, University of Florida, Gainesville, Florida, United States
  - Biodiversity Institute, University of Florida, Gainesville, Florida, United States
- Ethan P. White
  - Department of Wildlife Ecology and Conservation, University of Florida, Gainesville, Florida, United States
  - Informatics Institute, University of Florida, Gainesville, Florida, United States
  - Biodiversity Institute, University of Florida, Gainesville, Florida, United States
3
Musinsky J, Goulden T, Wirth G, Leisso N, Krause K, Haynes M, Chapman C. Spanning scales: The airborne spatial and temporal sampling design of the National Ecological Observatory Network. Methods Ecol Evol 2022. [DOI: 10.1111/2041-210x.13942] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Affiliation(s)
- John Musinsky
  - National Ecological Observatory Network, Battelle Boulder CO USA
- Tristan Goulden
  - National Ecological Observatory Network, Battelle Boulder CO USA
- Keith Krause
  - National Ecological Observatory Network, Battelle Boulder CO USA
- Mitch Haynes
  - National Ecological Observatory Network, Battelle Boulder CO USA
- Cameron Chapman
  - National Ecological Observatory Network, Battelle Boulder CO USA
4
Li D, Record S, Sokol ER, Bitters ME, Chen MY, Chung YA, Helmus MR, Jaimes R, Jansen L, Jarzyna MA, Just MG, LaMontagne JM, Melbourne BA, Moss W, Norman KEA, Parker SM, Robinson N, Seyednasrollah B, Smith C, Spaulding S, Surasinghe TD, Thomsen SK, Zarnetske PL. Standardized NEON organismal data for biodiversity research. Ecosphere 2022. [DOI: 10.1002/ecs2.4141] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/02/2023]
Affiliation(s)
- Daijiang Li
  - Department of Biological Sciences Louisiana State University Baton Rouge Louisiana USA
  - Center for Computation and Technology Louisiana State University Baton Rouge Louisiana USA
- Sydne Record
  - Department of Biology Bryn Mawr College Bryn Mawr Pennsylvania USA
  - Department of Wildlife, Fisheries, and Conservation Biology University of Maine Orono Maine USA
- Eric R. Sokol
  - National Ecological Observatory Network (NEON), Battelle Boulder Colorado USA
  - Institute of Arctic and Alpine Research (INSTAAR) University of Colorado Boulder Boulder Colorado USA
- Matthew E. Bitters
  - Department of Ecology and Evolutionary Biology University of Colorado Boulder Boulder Colorado USA
- Melissa Y. Chen
  - Department of Ecology and Evolutionary Biology University of Colorado Boulder Boulder Colorado USA
- Y. Anny Chung
  - Departments of Plant Biology and Plant Pathology University of Georgia Athens Georgia USA
- Matthew R. Helmus
  - Integrative Ecology Lab, Center for Biodiversity, Department of Biology Temple University Philadelphia Pennsylvania USA
- Lara Jansen
  - Department of Environmental Science and Management Portland State University Portland Oregon USA
- Marta A. Jarzyna
  - Department of Evolution, Ecology and Organismal Biology The Ohio State University Columbus Ohio USA
  - Translational Data Analytics Institute The Ohio State University Columbus Ohio USA
- Michael G. Just
  - Ecological Processes Branch U.S. Army ERDC CERL Champaign Illinois USA
- Brett A. Melbourne
  - Department of Ecology and Evolutionary Biology University of Colorado Boulder Boulder Colorado USA
- Wynne Moss
  - Department of Ecology and Evolutionary Biology University of Colorado Boulder Boulder Colorado USA
- Kari E. A. Norman
  - Department of Environmental Science, Policy, and Management University of California Berkeley Berkeley California USA
- Stephanie M. Parker
  - National Ecological Observatory Network (NEON), Battelle Boulder Colorado USA
- Natalie Robinson
  - National Ecological Observatory Network (NEON), Battelle Boulder Colorado USA
- Bijan Seyednasrollah
  - School of Informatics, Computing and Cyber Systems Northern Arizona University Flagstaff Arizona USA
- Colin Smith
  - Environmental Data Initiative University of Wisconsin‐Madison Madison Wisconsin USA
- Sarah Spaulding
  - Institute of Arctic and Alpine Research (INSTAAR) University of Colorado Boulder Boulder Colorado USA
- Thilina D. Surasinghe
  - Department of Biological Sciences Bridgewater State University Bridgewater Massachusetts USA
- Sarah K. Thomsen
  - Department of Integrative Biology Oregon State University Corvallis Oregon USA
- Phoebe L. Zarnetske
  - Department of Integrative Biology Michigan State University East Lansing Michigan USA
  - Ecology, Evolution, and Behavior Program Michigan State University East Lansing Michigan USA
5
Gill NS, Mahood AL, Meier CL, Muthukrishnan R, Nagy RC, Stricker E, Duffy KA, Petri L, Morisette JT. Six central questions about biological invasions to which NEON data science is poised to contribute. Ecosphere 2021. [DOI: 10.1002/ecs2.3728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Nathan S. Gill
  - Department of Natural Resources Management Texas Tech University Lubbock Texas 79410 USA
- Adam L. Mahood
  - Earth Lab Cooperative Institute for Research in the Environmental Sciences at the University of Colorado Boulder Boulder Colorado 80309 USA
  - Geography Department University of Colorado Boulder Boulder Colorado 80309 USA
- Courtney L. Meier
  - National Ecological Observatory Network Battelle Boulder Colorado 80301 USA
- Ranjan Muthukrishnan
  - Environmental Resilience Institute Indiana University Bloomington Bloomington Indiana 47408 USA
- R. Chelsea Nagy
  - Earth Lab Cooperative Institute for Research in the Environmental Sciences at the University of Colorado Boulder Boulder Colorado 80309 USA
- Eva Stricker
  - Department of Biology University of New Mexico Albuquerque New Mexico 87131 USA
- Katharyn A. Duffy
  - School of Informatics, Computing & Cyber Systems Northern Arizona University Flagstaff Arizona 86011 USA
- Laís Petri
  - School for Environment and Sustainability University of Michigan Ann Arbor Michigan 48109 USA
- Jeffrey T. Morisette
  - National Invasive Species Council U.S. Department of the Interior Washington DC 20240 USA