1
Morueta-Holme N, Iversen LL, Corcoran D, Rahbek C, Normand S. Unlocking ground-based imagery for habitat mapping. Trends Ecol Evol 2024; 39:349-358. [PMID: 38087707 DOI: 10.1016/j.tree.2023.11.005] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2023] [Revised: 11/06/2023] [Accepted: 11/14/2023] [Indexed: 04/05/2024]
Abstract
Fine-grained environmental data across large extents are needed to resolve the processes that impact species communities from local to global scales. Ground-based images (GBIs) have the potential to capture habitat complexity at biologically relevant spatial and temporal resolutions. Moving beyond existing applications of GBIs for species identification and monitoring ecological change from repeat photography, we describe promising approaches to habitat mapping, leveraging multimodal data and computer vision. We illustrate empirically how GBIs can be applied to predict distributions of species at fine scales along Street View routes, or to automatically classify and quantify habitat features. Further, we outline future research avenues using GBIs that can bring a leap forward in analyses for ecology and conservation with this underused resource.
Affiliation(s)
- N Morueta-Holme
- Center for Macroecology, Evolution and Climate, Globe Institute, University of Copenhagen, Copenhagen, Denmark.
- L L Iversen
- Department of Biology, McGill University, Montréal, Québec, H3A 1B1, Canada
- D Corcoran
- Section for Ecoinformatics & Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark; Center for Sustainable Landscapes under Global Change, Department of Biology, Aarhus University, Aarhus, Denmark
- C Rahbek
- Center for Macroecology, Evolution and Climate, Globe Institute, University of Copenhagen, Copenhagen, Denmark; Center for Global Mountain Biodiversity, Globe Institute, University of Copenhagen, Copenhagen, Denmark; Institute of Ecology, Peking University, Beijing, China; Danish Institute for Advanced Study, University of Southern Denmark, Odense, Denmark
- S Normand
- Section for Ecoinformatics & Biodiversity, Department of Biology, Aarhus University, Aarhus, Denmark; Center for Sustainable Landscapes under Global Change, Department of Biology, Aarhus University, Aarhus, Denmark; Center for Landscape Research in Sustainable Agricultural Futures, Department of Biology, Aarhus University, Aarhus, Denmark

2
Berezin CT, Aguilera LU, Billerbeck S, Bourne PE, Densmore D, Freemont P, Gorochowski TE, Hernandez SI, Hillson NJ, King CR, Köpke M, Ma S, Miller KM, Moon TS, Moore JH, Munsky B, Myers CJ, Nicholas DA, Peccoud SJ, Zhou W, Peccoud J. Ten simple rules for managing laboratory information. PLoS Comput Biol 2023; 19:e1011652. [PMID: 38060459 PMCID: PMC10703290 DOI: 10.1371/journal.pcbi.1011652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/18/2023] Open
Abstract
Information is the cornerstone of research, from experimental (meta)data and computational processes to complex inventories of reagents and equipment. These 10 simple rules discuss best practices for leveraging laboratory information management systems to transform this large information load into useful scientific findings.
Affiliation(s)
- Casey-Tyler Berezin
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Luis U. Aguilera
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Sonja Billerbeck
- Molecular Microbiology Unit, Faculty of Science and Engineering, University of Groningen, Groningen, the Netherlands
- Philip E. Bourne
- School of Data Science, University of Virginia, Charlottesville, Virginia, United States of America
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia, United States of America
- Douglas Densmore
- College of Engineering, Boston University, Boston, Massachusetts, United States of America
- Paul Freemont
- Department of Infectious Disease, Imperial College, London, United Kingdom
- Thomas E. Gorochowski
- School of Biological Sciences, University of Bristol, Bristol, United Kingdom
- BrisEngBio, University of Bristol, Bristol, United Kingdom
- Sarah I. Hernandez
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Nathan J. Hillson
- Biological Systems and Engineering Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- US Department of Energy Agile BioFoundry, Emeryville, California, United States of America
- US Department of Energy Joint BioEnergy Institute, Emeryville, California, United States of America
- Connor R. King
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Michael Köpke
- LanzaTech, Skokie, Illinois, United States of America
- Shuyi Ma
- Center for Global Infectious Disease Research, Seattle Children’s Hospital, University of Washington Medicine, Seattle, Washington, United States of America
- Katie M. Miller
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Tae Seok Moon
- Department of Energy, Environmental & Chemical Engineering, Washington University in St. Louis, St. Louis, Missouri, United States of America
- Jason H. Moore
- Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, California, United States of America
- Brian Munsky
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Chris J. Myers
- Department of Electrical, Computer & Energy Engineering, University of Colorado Boulder, Boulder, Colorado, United States of America
- Dequina A. Nicholas
- Department of Molecular Biology & Biochemistry, University of California Irvine, Irvine, California, United States of America
- Samuel J. Peccoud
- Department of Electrical and Computer Engineering, Colorado State University, Fort Collins, Colorado, United States of America
- Wen Zhou
- Department of Statistics, Colorado State University, Fort Collins, Colorado, United States of America
- Jean Peccoud
- Department of Chemical and Biological Engineering, Colorado State University, Fort Collins, Colorado, United States of America

3
Masum H, Bourne PE. Ten simple rules for humane data science. PLoS Comput Biol 2023; 19:e1011698. [PMID: 38127691 PMCID: PMC10734991 DOI: 10.1371/journal.pcbi.1011698] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/23/2023] Open
Affiliation(s)
- Hassan Masum
- Waterloo Institute for Complexity and Innovation, Waterloo, Canada
- Philip E. Bourne
- School of Data Science, University of Virginia, Virginia, United States of America

4
Meyer MN, Basl J, Choffnes D, Wilson C, Lazer DMJ. Enhancing the ethics of user-sourced online data collection and sharing. Nature Computational Science 2023; 3:660-664. [PMID: 38177316 DOI: 10.1038/s43588-023-00490-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2024]
Affiliation(s)
- Michelle N Meyer
- Department of Bioethics and Decision Sciences, Geisinger Health System, Danville, PA, USA
- Behavioral Insights Team, Steele Institute for Health Innovation, Geisinger Health System, Danville, PA, USA
- John Basl
- Department of Philosophy and Religion, Northeastern University, Boston, MA, USA
- Ethics Institute, Northeastern University, Boston, MA, USA
- David Choffnes
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Cybersecurity and Privacy Institute, Northeastern University, Boston, MA, USA
- Christo Wilson
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- Cybersecurity and Privacy Institute, Northeastern University, Boston, MA, USA
- David M J Lazer
- Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
- College of Social Sciences and Humanities, Northeastern University, Boston, MA, USA
- Network Science Institute, Northeastern University, Boston, MA, USA
- The Institute for Quantitative Social Science, Harvard University, Cambridge, MA, USA

5
Mayer K, Pfeffer J. Editorial: Critical data and algorithm studies. Front Big Data 2023; 6:1193412. [PMID: 37234688 PMCID: PMC10206293 DOI: 10.3389/fdata.2023.1193412] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/24/2023] [Accepted: 04/18/2023] [Indexed: 05/28/2023] Open
Affiliation(s)
- Katja Mayer
- Science and Technology Studies, University of Vienna, Vienna, Austria
- Jürgen Pfeffer
- Computational Social Science and Big Data, Technical University of Munich, Munich, Germany

6
The viewer doesn't always seem to care: response to fake animal rescues on YouTube and implications for social media self-policing policies. People and Nature 2023. [DOI: 10.1002/pan3.10416] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/26/2023] Open

7
Polisar J, Davies C, Morcatty T, Da Silva M, Zhang S, Duchez K, Madrid J, Lambert AE, Gallegos A, Delgado M, Nguyen H, Wallace R, Arias M, Nijman V, Ramnarace J, Pennell R, Novelo Y, Rumiz D, Rivero K, Murillo Y, Salas MN, Kretser HE, Reuter A. Multi-lingual multi-platform investigations of online trade in jaguar parts. PLoS One 2023; 18:e0280039. [PMID: 36689405 PMCID: PMC9870105 DOI: 10.1371/journal.pone.0280039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2021] [Accepted: 12/20/2022] [Indexed: 01/24/2023] Open
Abstract
We conducted research to understand online trade in jaguar parts and develop tools of utility for jaguars and other species. Our research aimed to identify potential trade across 31 online platforms in Spanish, Portuguese, English, Dutch, French, Chinese, and Vietnamese. We identified 230 posts from between 2009 and 2019. We screened the images of animal parts shown in search results to verify whether they were from jaguars; 71 posts on 12 different platforms in four languages were accompanied by images identified as definitely jaguar, including a total of 125 jaguar parts (50.7% posts in Spanish, 25.4% Portuguese, 22.5% Chinese and 1.4% French). Search effort varied among languages due to staff availability. Standardizing for effort across languages, by dividing the number of posts advertising jaguars by search time and by the number of individual searches completed via term/platform combinations, changed the proportions: the rankings of posts adjusted for effort were led by Portuguese, Chinese, and Spanish. Teeth were the most common part; 156 posts offered at least 367 teeth and, of these, 95 were assessed as definitely jaguar, 71 of which could be linked to a location, with the majority offered for sale from Mexico, China, Bolivia, and Brazil (26.8, 25.4, 16.9, and 12.7% respectively). The second most traded items, skins and derivative items, were only identified from Latin America: Brazil (7), followed by Peru (6), Bolivia (3), Mexico (2 and 1 skin piece), and Nicaragua and Venezuela (1 each). Whether by number of posts or pieces, the most common parts were teeth, skins/pieces of skins, heads, and bodies. Our research took place within a longer-term project to assist law enforcement in host countries to better identify potential illegal trade, and it presents a snapshot of online jaguar trade and methods that may also have utility for many species traded online.
Affiliation(s)
- John Polisar
- Wildlife Conservation Society, Jaguar Conservation Program, Bronx, New York, United States of America
- Department of Environment and Development, Zamorano Biodiversity Center, Zamorano University, Tegucigalpa, Honduras
- Charlotte Davies
- Wildlife Conservation Society, Counter Wildlife Trafficking Program (Global), Bronx, New York, United States of America
- Thais Morcatty
- Oxford Wildlife Trade Research Group, Oxford Brookes University, Oxford, United Kingdom
- RedeFauna—Rede de Pesquisa em Diversidade, Conservação e Uso da Fauna da Amazônia, Manaus, Brazil
- Song Zhang
- Xianda College of Economics and Humanities, Shanghai International Studies University, Shanghai, China
- Kurt Duchez
- Wildlife Conservation Society, Guatemala Program, Flores, Guatemala
- Julio Madrid
- Wildlife Conservation Society, Guatemala Program, Flores, Guatemala
- Ana Elisa Lambert
- Wildlife Conservation Society, Latin America Illegal Wildlife Trade Program, Lima, Peru
- School of Environment, Education, and Development, University of Manchester, Manchester, United Kingdom
- Ana Gallegos
- Wildlife Conservation Society, Peru Program, Lima, Peru
- Marcela Delgado
- Wildlife Conservation Society, Colombia Program, Cali, Colombia
- Ha Nguyen
- Wildlife Conservation Society, Vietnam Program, Ha Noi, Vietnam
- Robert Wallace
- Wildlife Conservation Society, Bolivia Program, La Paz, Bolivia
- Melissa Arias
- WWF Amazon Coordination Unit, Quito, Ecuador
- Department of Zoology, Interdisciplinary Centre for Conservation Science, Oxford-Martin Programme on Illegal Wildlife Trade, University of Oxford, Oxford, United Kingdom
- Vincent Nijman
- Oxford Wildlife Trade Research Group, Oxford Brookes University, Oxford, United Kingdom
- Jon Ramnarace
- Wildlife Conservation Society, Belize Program, Belize City, Belize
- Roberta Pennell
- Wildlife Conservation Society, Belize Program, Belize City, Belize
- Yamira Novelo
- Wildlife Conservation Society, Belize Program, Belize City, Belize
- Damian Rumiz
- Museo de Historia Natural Noel Kempff Mercado, Santa Cruz, Bolivia
- Kathia Rivero
- Museo de Historia Natural Noel Kempff Mercado, Santa Cruz, Bolivia
- Monica Nuñez Salas
- Universidad del Pacífico, Lima, Perú
- Department of Geography, Environment, and Society, University of Minnesota, Minneapolis, Minnesota, United States of America
- Heidi E. Kretser
- Wildlife Conservation Society, Global Conservation Program, Bronx, New York, United States of America
- Department of Natural Resources and the Environment, Cornell University, Ithaca, New York, United States of America
- Adrian Reuter
- Wildlife Conservation Society, Latin America Illegal Wildlife Trade Program, Mexico City, Mexico

8
Favaretto M, De Clercq E, Caplan A, Elger BS. United in Big Data? Exploring scholars' opinions on academic-industry partnership and the use of corporate data in digital behavioral research. PLoS One 2023; 18:e0280542. [PMID: 36662904 PMCID: PMC9858826 DOI: 10.1371/journal.pone.0280542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2021] [Accepted: 01/03/2023] [Indexed: 01/21/2023] Open
Abstract
The growing amount of data produced through digital technologies holds great promise for advancing behavioral research. Scholars worldwide now have the chance to access an incredible amount of personal information, thanks to the digital trace users continuously leave behind them. Private corporations play a crucial role in this scenario as the leading collectors of data on users, thus creating new incentives for partnerships between academic institutions and private companies. Due to the concerns that academic-company partnerships might raise and the ethical issues connected with Big Data research, our study explores the challenges and opportunities associated with the academic use of corporate data. We conducted 39 semi-structured interviews with academic scholars (professors, senior researchers, and postdocs) involved in Big Data research in Switzerland and the United States. We also investigated their opinions on using corporate data for scholarly research. Researchers generally showed an interest in using corporate data; however, they coincidentally shared ethical reservations towards this practice, such as threats to research integrity and concerns about a lack of transparency of companies' practices. Furthermore, participants mentioned issues of scholarly access to corporate data that might both disadvantage the academic research community and create issues of scientific validity. Academic-company partnerships could be a positive development for the advancement of scholarly behavioral research. However, strategies should be implemented to appropriately guide collaborations and appropriate use of corporate data, like implementing updated protocols and tools to govern conflicts of interest and the institution of transparent regulatory bodies to ensure adequate oversight of academic-corporate research collaborations.
Affiliation(s)
- Eva De Clercq
- Institute for Biomedical Ethics, University of Basel, Basel, Switzerland
- Arthur Caplan
- Division of Medical Ethics, NYU Grossman School of Medicine, New York, NY, United States of America

9
Filazzola A, Xie G, Barrett K, Dunn A, Johnson MTJ, MacIvor JS. Using smartphone-GPS data to quantify human activity in green spaces. PLoS Comput Biol 2022; 18:e1010725. [PMID: 36520687 PMCID: PMC9754188 DOI: 10.1371/journal.pcbi.1010725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/12/2022] [Accepted: 11/10/2022] [Indexed: 12/23/2022] Open
Abstract
Cities are growing in density and coverage globally, increasing the value of green spaces for human health and well-being. Understanding the interactions between people and green spaces is also critical for biological conservation and sustainable development. However, quantifying green space use is particularly challenging. We used an activity index of anonymized GPS data from smart devices provided by Mapbox (www.mapbox.com) to characterize human activity in green spaces in the Greater Toronto Area, Canada. The goals of our study were to describe i) a methodological example of how anonymized GPS data could be used for human-nature research and ii) associations between park features and human activity. We describe some of the challenges and solutions with using this activity index, especially in the context of green spaces and biodiversity monitoring. We found the activity index was strongly correlated with visitation records (i.e., park reservations) and that these data are useful to identify high or low-usage areas within green spaces. Parks with a more extensive trail network typically experienced higher visitation rates and a substantial proportion of activity remained on trails. We identified certain land covers that were more frequently associated with human presence, such as rock formations, and found a relationship between human activity and tree composition. Our study demonstrates that anonymized GPS data from smart devices are a powerful tool for spatially quantifying human activity in green spaces. These data could help to minimize trade-offs in the management of green spaces for human use and biological conservation, which will continue to be a significant challenge over the coming decades because of accelerating urbanization coupled with population growth. Importantly, we include a series of recommendations when using activity indexes for managing green spaces that can assist with biomonitoring and supporting sustainable human use.
Affiliation(s)
- Alessandro Filazzola
- Centre for Urban Environments, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Apex Resource Management Solutions, Ottawa, Ontario, Canada
- Garland Xie
- Centre for Urban Environments, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada
- Andrea Dunn
- Conservation Halton, Burlington, Ontario, Canada
- Marc T. J. Johnson
- Centre for Urban Environments, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Department of Biology, University of Toronto Mississauga, Mississauga, Ontario, Canada
- James Scott MacIvor
- Centre for Urban Environments, University of Toronto Mississauga, Mississauga, Ontario, Canada
- Apex Resource Management Solutions, Ottawa, Ontario, Canada
- Department of Biological Sciences, University of Toronto Scarborough, Toronto, Ontario, Canada

10
Hou Q, Waury K, Gogishvili D, Feenstra KA. Ten quick tips for sequence-based prediction of protein properties using machine learning. PLoS Comput Biol 2022; 18:e1010669. [PMID: 36454728 PMCID: PMC9714715 DOI: 10.1371/journal.pcbi.1010669] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 12/03/2022] Open
Abstract
The ubiquitous availability of genome sequencing data explains the popularity of machine learning-based methods for the prediction of protein properties from their amino acid sequences. Over the years, while revising our own work, reading submitted manuscripts as well as published papers, we have noticed several recurring issues, which make some reported findings hard to understand and replicate. We suspect this may be due to biologists being unfamiliar with machine learning methodology, or conversely, machine learning experts may miss some of the knowledge needed to correctly apply their methods to proteins. Here, we aim to bridge this gap for developers of such methods. The most striking issues are linked to a lack of clarity: how were annotations of interest obtained; which benchmark metrics were used; how are positives and negatives defined. Others relate to a lack of rigor: If you sneak in structural information, your method is not sequence-based; if you compare your own model to "state-of-the-art," take the best methods; if you want to conclude that some method is better than another, obtain a significance estimate to support this claim. These, and other issues, we will cover in detail. These points may have seemed obvious to the authors during writing; however, they are not always clear-cut to the readers. We also expect many of these tips to hold for other machine learning-based applications in biology. Therefore, many computational biologists who develop methods in this particular subject will benefit from a concise overview of what to avoid and what to do instead.
Affiliation(s)
- Qingzhen Hou
- Department of Biostatistics, School of Public Health, Cheeloo College of Medicine, Shandong University, Shandong, P. R. China
- National Institute of Health Data Science of China, Shandong University, Shandong, P. R. China
- Katharina Waury
- Department of Computer Science, Bioinformatics Group, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- Dea Gogishvili
- Department of Computer Science, Bioinformatics Group, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands
- K. Anton Feenstra
- Department of Computer Science, Bioinformatics Group, Vrije Universiteit Amsterdam, Amsterdam, the Netherlands

11
Schwitter N, Pretari A, Marwa W, Lombardini S, Liebe U. Big data and development sociology: An overview and application on governance and accountability through digitalization in Tanzania. Frontiers in Sociology 2022; 7:909458. [PMID: 36466797 PMCID: PMC9712952 DOI: 10.3389/fsoc.2022.909458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2022] [Accepted: 10/24/2022] [Indexed: 06/17/2023]
Abstract
The digital revolution and the widespread use of the internet have changed many realms of empirical social science research. In this paper, we discuss the use of big data in the context of development sociology and highlight its potential as a new source of data. We provide a brief overview of big data and development research, discuss different data types, and review example studies, before introducing our case study on active citizenship in Tanzania which expands on an Oxfam-led impact evaluation. The project aimed at improving community-driven governance and accountability through the use of digital technology. Twitter and other social media platforms were introduced to community animators as a tool to hold national and regional key stakeholders accountable. We retrieve the complete Twitter timelines up to October 2021 from all ~200 community animators and influencers involved in the project (over 1.5 million tweets). We find that animators have started to use Twitter as part of the project, but most have stopped tweeting in the long term. Employing a dynamic difference-in-differences design, we also do not find effects of Oxfam-led training workshops on different aspects of animators' tweeting behavior. While most animators have stopped using Twitter in the long run, a few have continued to use social media to raise local issues and to be part of conversations to this day. Our case study showcases how (big) social media data can be part of an intervention, and we end with recommendations on how to use digital data in development sociology.
Affiliation(s)
- Nicole Schwitter
- Department of Sociology, University of Warwick, Coventry, United Kingdom
- Ulf Liebe
- Department of Sociology, University of Warwick, Coventry, United Kingdom

12
Milne R, Sheehan M, Barnes B, Kapper J, Lea N, N'Dow J, Singh G, Martín-Uranga A, Hughes N. A concentric circles view of health data relations facilitates understanding of sociotechnical challenges for learning health systems and the role of federated data networks. Front Big Data 2022; 5:945739. [PMID: 36238653 PMCID: PMC9552575 DOI: 10.3389/fdata.2022.945739] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2022] [Accepted: 08/08/2022] [Indexed: 11/13/2022] Open
Abstract
The ability to use clinical and research data at scale is central to hopes for data-driven medicine. However, in using such data researchers often encounter hurdles–both technical, such as differing data security requirements, and social, such as the terms of informed consent, legal requirements and patient and public trust. Federated or distributed data networks have been proposed and adopted in response to these hurdles. However, to date there has been little consideration of how FDNs respond to both technical and social constraints on data use. In this Perspective we propose an approach to thinking about data in terms that make it easier to navigate the health data space and understand the value of differing approaches to data collection, storage and sharing. We set out a socio-technical model of data systems that we call the “Concentric Circles View” (CCV) of data-relationships. The aim is to enable a consistent understanding of the fit between the local relationships within which data are produced and the extended socio-technical systems that enable their use. The paper suggests this model can help understand and tackle challenges associated with the use of real-world data in the health setting. We use the model to understand not only how but why federated networks may be well placed to address emerging issues and adapt to the evolving needs of health research for patient benefit. We conclude that the CCV provides a useful model with broader application in mapping, understanding, and tackling the major challenges associated with using real world data in the health setting.
Affiliation(s)
- Richard Milne
- Wellcome Connecting Science, Cambridge, United Kingdom
- Kavli Centre for Ethics, Science and the Public, Faculty of Education, University of Cambridge, Cambridge, United Kingdom
- Mark Sheehan
- Ethox Centre, Nuffield Department of Population Health, University of Oxford, Oxford, United Kingdom
- Oxford National Institute for Health and Care Research (NIHR) Biomedical Research Centre, Oxford University Hospitals Trust, Oxford, United Kingdom
- Brendan Barnes
- European Federation of Pharmaceutical Industries and Associations, Brussels, Belgium
- Janek Kapper
- Estonian Chamber of Disabled People/European Patients Forum, The Estonian Inflammatory Bowel Disease Society, Tallinn, Estonia
- Nathan Lea
- Institute for Innovation Through Health Data (i-HD), Gent, Belgium
- James N'Dow
- Academic Urology Unit, University of Aberdeen, Aberdeen, United Kingdom
- Nigel Hughes
- Janssen Research and Development, Beerse, Belgium
- *Correspondence: Nigel Hughes

13
Diversifying the genomic data science research community. Genome Res 2022; 32:gr.276496.121. [PMID: 35858750 PMCID: PMC9341509 DOI: 10.1101/gr.276496.121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 06/02/2022] [Indexed: 11/25/2022]
Abstract
Over the past 20 years, the explosion of genomic data collection and the cloud computing revolution have made computational and data science research accessible to anyone with a web browser and an internet connection. However, students at institutions with limited resources have received relatively little exposure to curricula or professional development opportunities that lead to careers in genomic data science. To broaden participation in genomics research, the scientific community needs to support these programs in local education and research at underserved institutions (UIs). These include community colleges, historically Black colleges and universities, Hispanic-serving institutions, and tribal colleges and universities that support ethnically, racially, and socioeconomically underrepresented students in the United States. We have formed the Genomic Data Science Community Network to support students, faculty, and their networks to identify opportunities and broaden access to genomic data science. These opportunities include expanding access to infrastructure and data, providing UI faculty development opportunities, strengthening collaborations among faculty, recognizing UI teaching and research excellence, fostering student awareness, developing modular and open-source resources, expanding course-based undergraduate research experiences (CUREs), building curriculum, supporting student professional development and research, and removing financial barriers through funding programs and collaborator support.
|
14
|
Cox A. The Ethics of AI for Information Professionals: Eight Scenarios. Journal of the Australian Library and Information Association 2022. [DOI: 10.1080/24750158.2022.2084885] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
Affiliation(s)
- Andrew Cox
- Information School, The University of Sheffield, Sheffield, UK
| |
|
15
|
Takats C, Kwan A, Wormer R, Goldman D, Jones HE, Romero D. Ethical and Methodological Considerations of Twitter Data for Public Health Research: A Systematic Review (Preprint). J Med Internet Res 2022; 24:e40380. [DOI: 10.2196/40380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/17/2022] [Revised: 11/08/2022] [Accepted: 11/13/2022] [Indexed: 11/15/2022] Open
|
16
|
Building a culture of responsible neurotech: Neuroethics as socio-technical challenges. Neuron 2022; 110:2057-2062. [PMID: 35671759 DOI: 10.1016/j.neuron.2022.05.005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/20/2021] [Revised: 02/28/2022] [Accepted: 05/05/2022] [Indexed: 11/24/2022]
Abstract
Scientists around the globe are joining the race to achieve engineering feats to read, write, modulate, and interface with the human brain in a broadening continuum of invasive to non-invasive ways. The expansive implications of neurotechnology for our conception of health, mind, decision-making, and behavior has raised social and ethical considerations that are inextricable from neurotechnological progress. We propose "socio-technical" challenges as a framing to integrate neuroethics into the engineering process. Intentionally aligning societal and engineering goals within this framework offers a way to maximize the positive impact of next-generation neurotechnologies on society.
|
17
|
Keddy KH, Saha S, Kariuki S, Kalule JB, Qamar FN, Haq Z, Okeke IN. Using big data and mobile health to manage diarrhoeal disease in children in low-income and middle-income countries: societal barriers and ethical implications. The Lancet Infectious Diseases 2022; 22:e130-e142. [DOI: 10.1016/s1473-3099(21)00585-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/01/2021] [Revised: 08/23/2021] [Accepted: 08/31/2021] [Indexed: 12/28/2022]
|
18
|
Igumbor JO, Bosire EN, Vicente-Crespo M, Igumbor EU, Olalekan UA, Chirwa TF, Kinyanjui SM, Kyobutungi C, Fonn S. Considerations for an integrated population health databank in Africa: lessons from global best practices. Wellcome Open Res 2022; 6:214. [PMID: 35224211 PMCID: PMC8844538 DOI: 10.12688/wellcomeopenres.17000.1] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Accepted: 08/12/2021] [Indexed: 12/17/2022] Open
Abstract
Background: The rising digitisation and proliferation of data sources and repositories cannot be ignored. This trend expands opportunities to integrate and share population health data. Such platforms have many benefits, including the potential to efficiently translate information arising from such data to evidence needed to address complex global health challenges. There are pockets of quality data on the continent that may benefit from greater integration. Integration of data sources is however under-explored in Africa. The aim of this article is to identify the requirements and provide practical recommendations for developing a multi-consortia public and population health data-sharing framework for Africa. Methods: We conducted a narrative review of global best practices and policies on data sharing and its optimisation. We searched eight databases for publications and undertook an iterative snowballing search of articles cited in the identified publications. The Leximancer software© enabled content analysis and selection of a sample of the most relevant articles for detailed review. Themes were developed through immersion in the extracts of selected articles using inductive thematic analysis. We also performed interviews with public and population health stakeholders in Africa to gather their experiences, perceptions, and expectations of data sharing. Results: Our findings described global stakeholder experiences on research data sharing. We identified some challenges and measures to harness available resources and incentivise data sharing. We further highlight progress made by the different groups in Africa and identified the infrastructural requirements and considerations when implementing data sharing platforms. Furthermore, the review suggests key reforms required, particularly in the areas of consenting, privacy protection, data ownership, governance, and data access. Conclusions: The findings underscore the critical role of inclusion, social justice, public good, data security, accountability, legislation, reciprocity, and mutual respect in developing a responsive, ethical, durable, and integrated research data sharing ecosystem.
Affiliation(s)
- Jude O Igumbor
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
| | - Edna N Bosire
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
| | - Marta Vicente-Crespo
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa; African Population and Health Research Centre, Nairobi, Kenya
| | - Ehimario U Igumbor
- Nigeria Centre for Disease Control, Abuja, Nigeria; School of Public Health, University of the Western Cape, Cape Town, Western Cape, South Africa
| | - Uthman A Olalekan
- Warwick-Centre for Applied Health Research and Delivery (WCAHRD), Division of Health Sciences, Warwick Medical School, University of Warwick, Coventry, UK
| | - Tobias F Chirwa
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
| | | | | | - Sharon Fonn
- School of Public Health, University of the Witwatersrand, Johannesburg, Gauteng, 2193, South Africa
| |
|
19
|
Lee BD, Gitter A, Greene CS, Raschka S, Maguire F, Titus AJ, Kessler MD, Lee AJ, Chevrette MG, Stewart PA, Britto-Borges T, Cofer EM, Yu KH, Carmona JJ, Fertig EJ, Kalinin AA, Signal B, Lengerich BJ, Triche TJ, Boca SM. Ten quick tips for deep learning in biology. PLoS Comput Biol 2022; 18:e1009803. [PMID: 35324884 PMCID: PMC8946751 DOI: 10.1371/journal.pcbi.1009803] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Indexed: 12/05/2022] Open
Affiliation(s)
- Benjamin D. Lee
- In-Q-Tel Labs, Arlington, Virginia, United States of America
- School of Engineering and Applied Sciences, Harvard University, Cambridge, Massachusetts, United States of America
- Department of Genetics, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Anthony Gitter
- Department of Biostatistics and Medical Informatics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
- Morgridge Institute for Research, Madison, Wisconsin, United States of America
| | - Casey S. Greene
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Department of Biochemistry and Molecular Genetics, University of Colorado School of Medicine, Aurora, Colorado, United States of America
- Center for Health AI, University of Colorado School of Medicine, Aurora, Colorado, United States of America
| | - Sebastian Raschka
- Department of Statistics, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Finlay Maguire
- Faculty of Computer Science, Dalhousie University, Halifax, Nova Scotia, Canada
| | - Alexander J. Titus
- University of New Hampshire, Manchester, New Hampshire, United States of America
- Bioeconomy.XYZ, Manchester, New Hampshire, United States of America
| | - Michael D. Kessler
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland, United States of America
| | - Alexandra J. Lee
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Marc G. Chevrette
- Wisconsin Institute for Discovery and Department of Plant Pathology, University of Wisconsin-Madison, Madison, Wisconsin, United States of America
| | - Paul Allen Stewart
- Department of Biostatistics and Bioinformatics, Moffitt Cancer Center, Tampa, Florida, United States of America
| | - Thiago Britto-Borges
- Section of Bioinformatics and Systems Cardiology, Klaus Tschira Institute for Integrative Computational Cardiology, University Hospital Heidelberg, Heidelberg, Germany
- Department of Internal Medicine III (Cardiology, Angiology, and Pneumology), University Hospital Heidelberg, Heidelberg, Germany
| | - Evan M. Cofer
- Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey, United States of America
- Graduate Program in Quantitative and Computational Biology, Princeton University, Princeton, New Jersey, United States of America
| | - Kun-Hsing Yu
- Department of Biomedical Informatics, Harvard Medical School, Boston, Massachusetts, United States of America
- Department of Pathology, Brigham and Women’s Hospital, Boston, Massachusetts, United States of America
| | - Juan Jose Carmona
- Philips Healthcare, Cambridge, Massachusetts, United States of America
| | - Elana J. Fertig
- Department of Oncology, Johns Hopkins University, Baltimore, Maryland, United States of America
- Department of Biomedical Engineering, Department of Applied Mathematics and Statistics, Convergence Institute, Johns Hopkins University, Baltimore, Maryland, United States of America
| | - Alexandr A. Kalinin
- Medical Big Data Group, Shenzhen Research Institute of Big Data, Shenzhen, China
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
| | - Brandon Signal
- School of Medicine, College of Health and Medicine, University of Tasmania, Hobart, Australia
| | - Benjamin J. Lengerich
- Computer Science Department, Carnegie Mellon University, Pittsburgh, Pennsylvania, United States of America
| | - Timothy J. Triche
- Center for Epigenetics, Van Andel Research Institute, Grand Rapids, Michigan, United States of America
- Department of Pediatrics, College of Human Medicine, Michigan State University, East Lansing, Michigan, United States of America
- Department of Translational Genomics, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
| | - Simina M. Boca
- Innovation Center for Biomedical Informatics, Georgetown University Medical Center, District of Columbia, United States of America
- Department of Oncology, Georgetown University Medical Center, Washington, DC, United States of America
- Department of Biostatistics, Bioinformatics and Biomathematics, Georgetown University Medical Center, Washington, DC, United States of America
- Cancer Prevention and Control Program, Lombardi Comprehensive Cancer Center, Washington, DC, United States of America
| |
|
20
|
Fungtammasan A, Lee A, Taroni J, Wheeler K, Chin CS, Davis S, Greene C. Ten simple rules for large-scale data processing. PLoS Comput Biol 2022; 18:e1009757. [PMID: 35143491 PMCID: PMC8830682 DOI: 10.1371/journal.pcbi.1009757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022] Open
Affiliation(s)
- Arkarachai Fungtammasan
- DNAnexus, Inc., Mountain View, California, United States of America
- * E-mail: (AF); (C-SC); (SD); (CG)
| | - Alexandra Lee
- Genomics and Computational Biology Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania, United States of America
| | - Jaclyn Taroni
- Childhood Cancer Data Lab, Alex’s Lemonade Stand Foundation, Philadelphia, Pennsylvania, United States of America
| | - Kurt Wheeler
- Childhood Cancer Data Lab, Alex’s Lemonade Stand Foundation, Philadelphia, Pennsylvania, United States of America
| | - Chen-Shan Chin
- DNAnexus, Inc., Mountain View, California, United States of America
- * E-mail: (AF); (C-SC); (SD); (CG)
| | - Sean Davis
- Center for Health AI, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
- Department of Medicine, Divisions of Medical Oncology and Hematology, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
- * E-mail: (AF); (C-SC); (SD); (CG)
| | - Casey Greene
- Center for Health AI, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
- Department of Biochemistry and Molecular Genetics, University of Colorado Anschutz School of Medicine, Aurora, Colorado, United States of America
- * E-mail: (AF); (C-SC); (SD); (CG)
| |
|
21
|
Contaxis N, Clark J, Dellureficio A, Gonzales S, Mannheimer S, Oxley PR, Ratajeski MA, Surkis A, Yarnell AM, Yee M, Holmes K. Ten simple rules for improving research data discovery. PLoS Comput Biol 2022; 18:e1009768. [PMID: 35143479 PMCID: PMC8830647 DOI: 10.1371/journal.pcbi.1009768] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/04/2022] Open
Affiliation(s)
- Nicole Contaxis
- NYU Health Sciences Library, NYU Langone Health, New York, New York, United States of America
- * E-mail:
| | - Jason Clark
- Montana State University Library, Montana State University, Bozeman, Montana, United States of America
| | - Anthony Dellureficio
- Medical library, Memorial Sloan Kettering Cancer Center, New York, New York, United States of America
| | - Sara Gonzales
- Galter Health Sciences Library and Learning Center, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
| | - Sara Mannheimer
- Montana State University Library, Montana State University, Bozeman, Montana, United States of America
| | - Peter R. Oxley
- Samuel J. Wood Library and C.V. Starr Biomedical Information Center, Weill-Cornell Medicine, New York, New York, United States of America
| | - Melissa A. Ratajeski
- Health Sciences Library System, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Alisa Surkis
- NYU Health Sciences Library, NYU Langone Health, New York, New York, United States of America
| | - Amy M. Yarnell
- Health Sciences and Human Services Library, University of Maryland—Baltimore, Baltimore, Maryland, United States of America
| | - Michelle Yee
- NYU Health Sciences Library, NYU Langone Health, New York, New York, United States of America
| | - Kristi Holmes
- Galter Health Sciences Library and Learning Center, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
- Department of Preventive Medicine, Northwestern University Feinberg School of Medicine, Chicago, Illinois, United States of America
| |
|
22
|
Martínez-García M, Hernández-Lemus E. Data Integration Challenges for Machine Learning in Precision Medicine. Front Med (Lausanne) 2022; 8:784455. [PMID: 35145977 PMCID: PMC8821900 DOI: 10.3389/fmed.2021.784455] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 09/27/2021] [Accepted: 12/28/2021] [Indexed: 12/19/2022] Open
Abstract
A main goal of Precision Medicine is that of incorporating and integrating the vast corpora on different databases about the molecular and environmental origins of disease, into analytic frameworks, allowing the development of individualized, context-dependent diagnostics, and therapeutic approaches. In this regard, artificial intelligence and machine learning approaches can be used to build analytical models of complex disease aimed at prediction of personalized health conditions and outcomes. Such models must handle the wide heterogeneity of individuals in both their genetic predisposition and their social and environmental determinants. Computational approaches to medicine need to be able to efficiently manage, visualize and integrate, large datasets combining structure, and unstructured formats. This needs to be done while constrained by different levels of confidentiality, ideally doing so within a unified analytical architecture. Efficient data integration and management is key to the successful application of computational intelligence approaches to medicine. A number of challenges arise in the design of successful designs to medical data analytics under currently demanding conditions of performance in personalized medicine, while also subject to time, computational power, and bioethical constraints. Here, we will review some of these constraints and discuss possible avenues to overcome current challenges.
Affiliation(s)
- Mireya Martínez-García
- Clinical Research Division, National Institute of Cardiology ‘Ignacio Chávez’, Mexico City, Mexico
| | - Enrique Hernández-Lemus
- Computational Genomics Division, National Institute of Genomic Medicine (INMEGEN), Mexico City, Mexico
- Center for Complexity Sciences, Universidad Nacional Autónoma de México, Mexico City, Mexico
| |
|
23
|
|
24
|
Thompson RM, Hall J, Morrison C, Palmer NR, Roberts DL. Ethics and governance for internet-based conservation science research. Conservation Biology: The Journal of the Society for Conservation Biology 2021; 35:1747-1754. [PMID: 34057267 DOI: 10.1111/cobi.13778] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 03/12/2020] [Revised: 04/23/2021] [Accepted: 05/19/2021] [Indexed: 06/12/2023]
Abstract
Internet-based research is increasingly important for conservation science and has wide-ranging applications and contexts, including culturomics, illegal wildlife trade, and citizen science. However, online research methods pose a range of ethical and legal challenges. Online data may be protected by copyright, database rights, or contract law. Privacy rights may also restrict the use and access of data, as well as ethical requirements from institutions. Online data have real-world meaning, and the ethical treatment of individuals and communities must not be marginalized when conducting internet-based research. As ethics frameworks originally developed for biomedical applications are inadequate for these methods, we propose that research activities involving the analysis of preexisting online data be treated analogous to offline social science methods, in particular, nondeceptive covert observation. By treating internet users and their data with respect and due consideration, conservationists can uphold the public trust needed to effectively address real-world issues.
Affiliation(s)
- Ruth M Thompson
- Durrell Institute of Conservation and Ecology, School of Anthropology and Conservation, University of Kent, Canterbury, Kent, UK
| | - Jordan Hall
- Information Compliance Office, Darwin College, University of Kent, Canterbury, Kent, UK
| | - Chris Morrison
- Copyright, Licensing & Policy, Information Services, Templeman Library, University of Kent, Canterbury, Kent, UK
| | - Nicole R Palmer
- Research Ethics and Governance, Research Services, The Registry, University of Kent, Canterbury, Kent, UK
| | - David L Roberts
- Durrell Institute of Conservation and Ecology, School of Anthropology and Conservation, University of Kent, Canterbury, Kent, UK
- Department of Zoology, University of Oxford, Oxford, UK
- Oxford Martin School, University of Oxford, Oxford, UK
| |
|
25
|
Whitman M. Modeling Ethics: Approaches to Data Creep in Higher Education. Science and Engineering Ethics 2021; 27:71. [PMID: 34796403 DOI: 10.1007/s11948-021-00346-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/03/2021] [Accepted: 10/06/2021] [Indexed: 06/13/2023]
Abstract
Though rapid collection of big data is ubiquitous across domains, from industry settings to academic contexts, the ethics of big data collection and research are contested. A nexus of data ethics issues is the concept of creep, or repurposing of data for other applications or research beyond the conditions of original collection. Data creep has proven controversial and has prompted concerns about the scope of ethical oversight. Institutional review boards offer little guidance regarding big data, and problematic research can still meet ethical standards. While ethics seem concrete through institutional deployment, I frame ethics as produced. Informed by my ethnographic research at a large public university in the U.S., I explore ethics through two models: ethics as institutional procedures and ethics as acts and intentions. The university where I conducted fieldwork is the development grounds for a predictive model that uses student data to anticipate academic success. While students consent to data collection, the circumstances of consent and the degree to which they are informed are not so apparent, as many data are a product of creep. Drawing from interviews and participant observation with administrators, data scientists, developers, and students, I examine data ethics, from a larger institutional model to everyday enactments related to data creep. After demonstrating the limits of such models, I propose a remodeling of ethics that draws on recent works on data, justice, and refusal to pose generative questions for rethinking ethics in institutional contexts.
Affiliation(s)
- Madisson Whitman
- Center for Science and Society, Columbia University, New York, NY, USA.
| |
|
26
|
Ferretti A, Ienca M, Velarde MR, Hurst S, Vayena E. The Challenges of Big Data for Research Ethics Committees: A Qualitative Swiss Study. J Empir Res Hum Res Ethics 2021; 17:129-143. [PMID: 34779661 PMCID: PMC8721531 DOI: 10.1177/15562646211053538] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
Big data trends in health research challenge the oversight mechanism of the Research Ethics Committees (RECs). The traditional standards of research quality and the mandate of RECs illuminate deficits in facing the computational complexity, methodological novelty, and limited auditability of these approaches. To better understand the challenges facing RECs, we explored the perspectives and attitudes of the members of the seven Swiss Cantonal RECs via semi-structured qualitative interviews. Our interviews reveal limited experience among REC members with the review of big data research, insufficient expertise in data science, and uncertainty about how to mitigate big data research risks. Nonetheless, RECs could strengthen their oversight by training in data science and big data ethics, complementing their role with external experts and ad hoc boards, and introducing precise shared practices.
Affiliation(s)
- Agata Ferretti
- Health Ethics and Policy Lab, Department of Health Sciences and Technology, ETH Zürich, Switzerland
| | - Marcello Ienca
- Health Ethics and Policy Lab, Department of Health Sciences and Technology, ETH Zürich, Switzerland; College of Humanities, Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland
| | - Minerva Rivas Velarde
- Department of Radiology and Medical Informatics, Faculty of Medicine, University of Geneva, Switzerland
| | - Samia Hurst
- Institute for Ethics, History, and the Humanities, Faculty of Medicine, University of Geneva, Switzerland
| | - Effy Vayena
- Health Ethics and Policy Lab, Department of Health Sciences and Technology, ETH Zürich, Switzerland
| |
|
27
|
Wilson SL, Way GP, Bittremieux W, Armache JP, Haendel MA, Hoffman MM. Sharing biological data: why, when, and how. FEBS Lett 2021; 595:847-863. [PMID: 33843054 PMCID: PMC10390076 DOI: 10.1002/1873-3468.14067] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 12/13/2022]
Affiliation(s)
- Samantha L Wilson
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada
| | - Gregory P Way
- Imaging Platform, Broad Institute of MIT and Harvard, Cambridge, MA, USA
| | - Wout Bittremieux
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA, USA; Department of Computer Science, University of Antwerp, Antwerpen, Belgium
| | - Jean-Paul Armache
- Department of Biochemistry & Molecular Biology, The Huck Institutes of Life Sciences, Pennsylvania State University, University Park, PA, USA
| | | | - Michael M Hoffman
- Princess Margaret Cancer Centre, University Health Network, Toronto, ON, Canada; Department of Medical Biophysics, Department of Computer Science, University of Toronto, Toronto, ON, Canada; Vector Institute, Toronto, ON, Canada
| |
|
28
|
Lazer D, Hargittai E, Freelon D, Gonzalez-Bailon S, Munger K, Ognyanova K, Radford J. Meaningful measures of human society in the twenty-first century. Nature 2021; 595:189-196. [PMID: 34194043 DOI: 10.1038/s41586-021-03660-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 03/03/2021] [Accepted: 05/20/2021] [Indexed: 02/06/2023]
Abstract
Science rarely proceeds beyond what scientists can observe and measure, and sometimes what can be observed proceeds far ahead of scientific understanding. The twenty-first century offers such a moment in the study of human societies. A vastly larger share of behaviours is observed today than would have been imaginable at the close of the twentieth century. Our interpersonal communication, our movements and many of our everyday actions, are all potentially accessible for scientific research; sometimes through purposive instrumentation for scientific objectives (for example, satellite imagery), but far more often these objectives are, literally, an afterthought (for example, Twitter data streams). Here we evaluate the potential of this massive instrumentation-the creation of techniques for the structured representation and quantification-of human behaviour through the lens of scientific measurement and its principles. In particular, we focus on the question of how we extract scientific meaning from data that often were not created for such purposes. These data present conceptual, computational and ethical challenges that require a rejuvenation of our scientific theories to keep up with the rapidly changing social realities and our capacities to capture them. We require, in other words, new approaches to manage, use and analyse data.
Affiliation(s)
- David Lazer
- Network Science Institute, Northeastern University, Boston, MA, USA; Institute for Quantitative Social Science, Harvard University, Cambridge, MA, USA.
| | - Eszter Hargittai
- Department of Communication and Media Research, University of Zurich, Zurich, Switzerland
| | - Deen Freelon
- Hussman School of Journalism and Media, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | | | - Kevin Munger
- Department of Political Science, Pennsylvania State University, State College, PA, USA
| | - Katherine Ognyanova
- School of Communication and Information, Rutgers University, New Brunswick, NJ, USA
| | - Jason Radford
- Network Science Institute, Northeastern University, Boston, MA, USA
| |
|
29
|
Abstract
Enacting an AI system typically requires three iterative phases where AI engineers are in command: selection and preparation of the data, selection and configuration of algorithmic tools, and fine-tuning of the different parameters on the basis of intermediate results. Our main hypothesis is that these phases involve practices with ethical questions. This paper maps these ethical questions and proposes a way to address them in light of a neo-republican understanding of freedom, defined as absence of domination. We thereby identify different types of responsibility held by AI engineers and link them to concrete suggestions on how to improve professional practices. This paper contributes to the literature on AI and ethics by focusing on the work necessary to configure AI systems, thereby offering an input to better practices and an input for societal debates.
|
30
|
|
31
|
Rice WL, Pan B. Understanding changes in park visitation during the COVID-19 pandemic: A spatial application of big data. Wellbeing, Space and Society 2021; 2:100037. [PMID: 34934999 PMCID: PMC8677329 DOI: 10.1016/j.wss.2021.100037] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 10/27/2020] [Revised: 02/27/2021] [Accepted: 05/04/2021] [Indexed: 05/05/2023]
Abstract
In the spring of 2020, the COVID-19 pandemic changed the daily lives of people around the world. In an effort to quantify these changes, Google released an open-source dataset pertaining to regional mobility trends-including park visitation trends. Changes in park visitation are calculated from an earlier baseline period for measurement. Park visitation is robustly linked to positive wellbeing indicators across the lifespan, and has been shown to support wellbeing during the COVID-19 pandemic. Therefore, this dataset offers vast application potential, containing aggregated information from location data collected via smartphones worldwide. However, empirical analysis of these data is limited. Namely, the factors influencing reported changes in mobility and the degree to which these changes can be directly attributable to COVID-19 remain unknown. This study aims to address these gaps in our understanding of the changes in park visitation, the causes of these changes (e.g., safer-at-home orders, amount of COVID-19 cases per county, climate, etc.) and possible impacts to wellbeing by constructing and testing a spatial regression model. Results suggest that elevation and latitude serve as primary influences of reported changes in park visitation from the baseline period. Therefore, it is surmised that Google's reported changes in park-related mobility are only partially the function of COVID-19.
Affiliation(s)
- William L Rice
  - Department of Society and Conservation, W.A. Franke College of Forestry & Conservation, University of Montana, Missoula, MT 59812, USA
  - Parks, Tourism, & Recreation Management Program, W.A. Franke College of Forestry & Conservation, University of Montana, Missoula, MT 59812, USA
- Bing Pan
  - Department of Recreation, Park, and Tourism Management, College of Health and Human Development, Pennsylvania State University, University Park, PA 16802, USA
32
Harrington LA, Auliya M, Eckman H, Harrington AP, Macdonald DW, D'Cruze N. Live wild animal exports to supply the exotic pet trade: A case study from Togo using publicly available social media data. CONSERVATION SCIENCE AND PRACTICE 2021. [DOI: 10.1111/csp2.430] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 12/31/2022] Open
Affiliation(s)
- Lauren A. Harrington
  - Wildlife Conservation Research Unit, Department of Zoology, University of Oxford, Recanati‐Kaplan Centre, Abingdon, UK
- Mark Auliya
  - Department of Conservation Biology, Helmholtz Centre for Environmental Research GmbH – UFZ, Leipzig, Germany
  - Zoological Research Museum Alexander Koenig, Bonn, Germany
- Alix P. Harrington
  - Wildlife Conservation Research Unit, Department of Zoology, University of Oxford, Recanati‐Kaplan Centre, Abingdon, UK
- David W. Macdonald
  - Wildlife Conservation Research Unit, Department of Zoology, University of Oxford, Recanati‐Kaplan Centre, Abingdon, UK
- Neil D'Cruze
  - Wildlife Conservation Research Unit, Department of Zoology, University of Oxford, Recanati‐Kaplan Centre, Abingdon, UK
  - World Animal Protection, London, UK
33
Väisänen T, Heikinheimo V, Hiippala T, Toivonen T. Exploring human-nature interactions in national parks with social media photographs and computer vision. CONSERVATION BIOLOGY: THE JOURNAL OF THE SOCIETY FOR CONSERVATION BIOLOGY 2021; 35:424-436. [PMID: 33749054 DOI: 10.1111/cobi.13704] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 02/14/2020] [Revised: 08/07/2020] [Accepted: 08/14/2020] [Indexed: 06/12/2023]
Abstract
Understanding the activities and preferences of visitors is crucial for managing protected areas and planning conservation strategies. Conservation culturomics promotes the use of user-generated online content in conservation science. Geotagged social media content is a unique source of in situ information on human presence and activities in nature. Photographs posted on social media platforms are a promising source of information, but analyzing large volumes of photographs manually remains laborious. We examined the application of state-of-the-art computer-vision methods to studying human-nature interactions. We used semantic clustering, scene classification, and object detection to automatically analyze photographs taken in Finnish national parks by domestic and international visitors. Our results showed that human-nature interactions can be extracted from user-generated photographs with computer vision. The different methods complemented each other by revealing broad visual themes related to level of the data set, landscape photogeneity, and human activities. Geotagged photographs revealed distinct regional profiles for national parks (e.g., preferences in landscapes and activities), which are potentially useful in park management. Photographic content differed between domestic and international visitors, which indicates differences in activities and preferences. Information extracted automatically from photographs can help identify preferences among diverse visitor groups, which can be used to create profiles of national parks for conservation marketing and to support conservation strategies that rely on public acceptance. The application of computer-vision methods to automatic content analysis of photographs should be explored further in conservation culturomics, particularly in combination with rich metadata available on social media platforms.
Affiliation(s)
- Tuomas Väisänen
  - Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, 00014, Finland
  - Helsinki Institute of Sustainability Science, University of Helsinki, Helsinki, 00014, Finland
- Vuokko Heikinheimo
  - Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, 00014, Finland
  - Helsinki Institute of Sustainability Science, University of Helsinki, Helsinki, 00014, Finland
- Tuomo Hiippala
  - Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, 00014, Finland
  - Helsinki Institute of Sustainability Science, University of Helsinki, Helsinki, 00014, Finland
  - Department of Languages, University of Helsinki, Helsinki, 00014, Finland
- Tuuli Toivonen
  - Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, 00014, Finland
  - Helsinki Institute of Sustainability Science, University of Helsinki, Helsinki, 00014, Finland
  - Conservation Science Group, University of Cambridge, Cambridge, CB2 3EJ, U.K.
34
Li J, Hu Q. Using culturomics and social media data to characterize wildlife consumption. CONSERVATION BIOLOGY: THE JOURNAL OF THE SOCIETY FOR CONSERVATION BIOLOGY 2021; 35:452-459. [PMID: 33749024 DOI: 10.1111/cobi.13703] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/15/2020] [Revised: 10/02/2020] [Accepted: 10/11/2020] [Indexed: 06/12/2023]
Abstract
Wildlife provides food, medicine, clothing, and other necessities for humans, but overexploitation can disrupt the sustainability of wildlife resources and severely threaten global biodiversity. Understanding the characteristics of consumer behavior is helpful for wildlife managers and policy makers, but the traditional survey methods are laborious and time-consuming. In contrast, culturomics may more efficiently identify the features of wildlife consumption. As a case study of the culturomics approach, we examined tiger bone wine consumption in China based on social media and Baidu search engine data. Tiger bone wine is one of the most purchased tiger products; its consumption is closely related to tiger poaching, which greatly threatens wild tiger survival. We searched a popular social media website for the term "tiger bone wine" and focused on posts that were originally created from 1 January 2012 to 31 December 2018. We filtered and classified posts related to the purchase, sale, or consumption of tiger bone wine and extracted information on providers, consumption motivations, year of production, and place of origin of the tiger bone wines based on the texts and photos of these posts. We found 756 posts related to tiger bone wine consumption, 113 of which mentioned providers of tiger bone wine, including friends (53%), elder relatives (37%), peer relatives (7%), and others (3%). Out of the 756 posts, 266 indicated the motivations of tiger bone wine consumption. Tiger bone wines were consumed as a tonic (34%), medicine (23%), game product (30%), and a symbol of wealth (28%). Some posts indicated ≥2 consumption motivations. These findings were consistent with the search queries from Baidu index. Such information could help develop targeted strategies for tiger conservation. The culturomics approach illustrated by our study is a rapid and cost-efficient way to characterize wildlife consumption.
Affiliation(s)
- Juan Li
  - School of Engineering, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province, 310024, China
- Qi Hu
  - School of Life Sciences, Westlake University, 18 Shilongshan Road, Hangzhou, Zhejiang Province, 310024, China
  - Institute of Biology, Westlake Institute for Advanced Study, Hangzhou, Zhejiang Province, 310024, China
35
Sandbrook C, Clark D, Toivonen T, Simlai T, O'Donnell S, Cobbe J, Adams W. Principles for the socially responsible use of conservation monitoring technology and data. CONSERVATION SCIENCE AND PRACTICE 2021. [DOI: 10.1111/csp2.374] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 12/26/2022] Open
Affiliation(s)
- Chris Sandbrook
  - Department of Geography, University of Cambridge, Cambridge, UK
- Douglas Clark
  - University of Saskatchewan, Saskatoon, Saskatchewan, Canada
- Trishant Simlai
  - Department of Geography, University of Cambridge, Cambridge, UK
- Jennifer Cobbe
  - Department of Geography, University of Cambridge, Cambridge, UK
- William Adams
  - Department of Geography, University of Cambridge, Cambridge, UK
36
Parker MS, Burgess AE, Bourne PE. Ten simple rules for starting (and sustaining) an academic data science initiative. PLoS Comput Biol 2021; 17:e1008628. [PMID: 33600414 PMCID: PMC7891724 DOI: 10.1371/journal.pcbi.1008628] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/18/2022] Open
Affiliation(s)
- Micaela S. Parker
  - Academic Data Science Alliance, Seattle, Washington, United States of America
- Arlyn E. Burgess
  - School of Data Science, University of Virginia, Charlottesville, Virginia, United States of America
- Philip E. Bourne
  - School of Data Science, University of Virginia, Charlottesville, Virginia, United States of America
37
Keep your distance: Using Instagram posts to evaluate the risk of anthroponotic disease transmission in gorilla ecotourism. PEOPLE AND NATURE 2021. [DOI: 10.1002/pan3.10187] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/07/2022] Open
38
McKeown A, Mourby M, Harrison P, Walker S, Sheehan M, Singh I. Ethical Issues in Consent for the Reuse of Data in Health Data Platforms. SCIENCE AND ENGINEERING ETHICS 2021; 27:9. [PMID: 33538942 PMCID: PMC7862505 DOI: 10.1007/s11948-021-00282-0] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 02/14/2020] [Accepted: 12/21/2020] [Indexed: 05/08/2023]
Abstract
Data platforms represent a new paradigm for carrying out health research. In the platform model, datasets are pooled for remote access and analysis, so novel insights for developing better stratified and/or personalised medicine approaches can be derived from their integration. If the integration of diverse datasets enables development of more accurate risk indicators, prognostic factors, or better treatments and interventions, this obviates the need for the sharing and reuse of data; and a platform-based approach is an appropriate model for facilitating this. Platform-based approaches thus require new thinking about consent. Here we defend an approach to meeting this challenge within the data platform model, grounded in: the notion of 'reasonable expectations' for the reuse of data; Waldron's account of 'integrity' as a heuristic for managing disagreement about the ethical permissibility of the approach; and the element of the social contract that emphasises the importance of public engagement in embedding new norms of research consistent with changing technological realities. While a social contract approach may sound appealing, however, it is incoherent in the context at hand. We defend a way forward guided by that part of the social contract which requires public approval for the proposal and argue that we have moral reasons to endorse a wider presumption of data reuse. However, we show that the relationship in question is not recognisably contractual and that the social contract approach is therefore misleading in this context. We conclude stating four requirements on which the legitimacy of our proposal rests.
Affiliation(s)
- Alex McKeown
  - Department of Psychiatry, Wellcome Centre for Ethics and Humanities, Warneford Hospital, University of Oxford, Oxford, OX3 7JX, UK
- Miranda Mourby
  - Centre for Health, Law and Emerging Technologies (HeLEX), University of Oxford, Oxford, UK
- Paul Harrison
  - Department of Psychiatry, Oxford Health NHS Foundation Trust, University of Oxford, Oxford, UK
- Sophie Walker
  - Department of Psychiatry, University of Oxford, Oxford, UK
- Mark Sheehan
  - Ethox, Wellcome Centre for Ethics and Humanities, University of Oxford, Oxford, UK
- Ilina Singh
  - Department of Psychiatry, Wellcome Centre for Ethics and Humanities, University of Oxford, Oxford, UK
39
Papamitsiou Z, Filippakis ME, Poulou M, Sampson D, Ifenthaler D, Giannakos M. Towards an educational data literacy framework: enhancing the profiles of instructional designers and e-tutors of online and blended courses with new competences. SMART LEARNING ENVIRONMENTS 2021; 8:18. [PMCID: PMC8446468 DOI: 10.1186/s40561-021-00163-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/25/2021] [Accepted: 08/31/2021] [Indexed: 06/15/2023]
Abstract
In the era of digitalization of learning and teaching processes, Educational Data Literacy (EDL) is highly valued and is becoming essential. EDL is conceptualized as the ability to collect, manage, analyse, comprehend, interpret, and act upon educational data in an ethical, meaningful, and critical manner. The professionals in the field of digitally supported education, i.e., Instructional Designers (IDs) and e-Tutors (eTUTs) of online and blended courses, need to be ready to inform their decisions with educational data, and face the upcoming data-related challenges; they need to update and enhance their profiles with relevant competences. This paper proposes a framework for EDL competence profiles of IDs/eTUTs and evaluates the proposal with the participation of worldwide professionals (N = 210) with experience in digitally supported education. The evaluation aims at validating the proposal and assesses (a) the current EDL-readiness of IDs/eTUTs; and (b) the extent to which the framework captures and describes the essential EDL competences. The findings indicate that professionals are not EDL-competent yet, but the proposed dimensions and related competences are offering a solid approach to support EDL development.
Affiliation(s)
- Zacharoula Papamitsiou
  - Department of Computer Science, Norwegian University of Science and Technology, Sem Sælands vei 9, IT-Bygget, Gløshaugen, 7034 Trondheim, Norway
- Michail Giannakos
  - Department of Computer Science, Norwegian University of Science and Technology, Sem Sælands vei 9, IT-Bygget, Gløshaugen, 7034 Trondheim, Norway
40
Rice WL, Pan B. Understanding changes in park visitation during the COVID-19 pandemic: A spatial application of big data. WELLBEING, SPACE AND SOCIETY 2021; 2:100037. [PMID: 34934999 DOI: 10.31235/osf.io/97qa4] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/27/2020] [Revised: 02/27/2021] [Accepted: 05/04/2021] [Indexed: 05/20/2023]
41
Ilan Y. Second-Generation Digital Health Platforms: Placing the Patient at the Center and Focusing on Clinical Outcomes. Front Digit Health 2020; 2:569178. [PMID: 34713042 PMCID: PMC8521820 DOI: 10.3389/fdgth.2020.569178] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 06/03/2020] [Accepted: 10/02/2020] [Indexed: 12/13/2022] Open
Abstract
Artificial intelligence (AI) digital health systems have drawn much attention over the last decade. However, their implementation into medical practice occurs at a much slower pace than expected. This paper reviews some of the achievements of first-generation AI systems, and the barriers facing their implementation into medical practice. The development of second-generation AI systems is discussed with a focus on overcoming some of these obstacles. Second-generation systems are aimed at focusing on a single subject and on improving patients' clinical outcomes. A personalized closed-loop system designed to improve end-organ function and the patient's response to chronic therapies is presented. The system introduces a platform which implements a personalized therapeutic regimen and introduces quantifiable individualized-variability patterns into its algorithm. The platform is designed to achieve a clinically meaningful endpoint by ensuring that chronic therapies will have sustainable effect while overcoming compensatory mechanisms associated with disease progression and drug resistance. Second-generation systems are expected to assist patients and providers in adopting and implementing of these systems into everyday care.
42
Favaretto M, De Clercq E, Gaab J, Elger BS. First do no harm: An exploration of researchers' ethics of conduct in Big Data behavioral studies. PLoS One 2020; 15:e0241865. [PMID: 33152039 PMCID: PMC7644008 DOI: 10.1371/journal.pone.0241865] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/22/2020] [Accepted: 10/21/2020] [Indexed: 11/24/2022] Open
Abstract
Research ethics has traditionally been guided by well-established documents such as the Belmont Report and the Declaration of Helsinki. At the same time, the introduction of Big Data methods, that is having a great impact in behavioral research, is raising complex ethical issues that make protection of research participants an increasingly difficult challenge. By conducting 39 semi-structured interviews with academic scholars in both Switzerland and United States, our research aims at exploring the code of ethics and research practices of academic scholars involved in Big Data studies in the fields of psychology and sociology to understand if the principles set by the Belmont Report are still considered relevant in Big Data research. Our study shows how scholars generally find traditional principles to be a suitable guide to perform ethical data research but, at the same time, they recognized and elaborated on the challenges embedded in their practical application. In addition, due to the growing introduction of new actors in scholarly research, such as data holders and owners, it was also questioned whether responsibility to protect research participants should fall solely on investigators. In order to appropriately address ethics issues in Big Data research projects, education in ethics, exchange and dialogue between research teams and scholars from different disciplines should be enhanced. In addition, models of consultancy and shared responsibility between investigators, data owners and review boards should be implemented in order to ensure better protection of research participants.
Affiliation(s)
- Eva De Clercq
  - Institute for Biomedical Ethics, University of Basel, Basel, Switzerland
- Jens Gaab
  - Division of Clinical Psychology and Psychotherapy, Faculty of Psychology, University of Basel, Basel, Switzerland
43
Jung H, Ventura T, Chung JS, Kim WJ, Nam BH, Kong HJ, Kim YO, Jeon MS, Eyun SI. Twelve quick steps for genome assembly and annotation in the classroom. PLoS Comput Biol 2020; 16:e1008325. [PMID: 33180771 PMCID: PMC7660529 DOI: 10.1371/journal.pcbi.1008325] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Indexed: 12/13/2022] Open
Abstract
Eukaryotic genome sequencing and de novo assembly, once the exclusive domain of well-funded international consortia, have become increasingly affordable, thus fitting the budgets of individual research groups. Third-generation long-read DNA sequencing technologies are increasingly used, providing extensive genomic toolkits that were once reserved for a few select model organisms. Generating high-quality genome assemblies and annotations for many aquatic species still presents significant challenges due to their large genome sizes, complexity, and high chromosome numbers. Indeed, selecting the most appropriate sequencing and software platforms and annotation pipelines for a new genome project can be daunting because tools often only work in limited contexts. In genomics, generating a high-quality genome assembly/annotation has become an indispensable tool for better understanding the biology of any species. Herein, we state 12 steps to help researchers get started in genome projects by presenting guidelines that are broadly applicable (to any species), sustainable over time, and cover all aspects of genome assembly and annotation projects from start to finish. We review some commonly used approaches, including practical methods to extract high-quality DNA and choices for the best sequencing platforms and library preparations. In addition, we discuss the range of potential bioinformatics pipelines, including structural and functional annotations (e.g., transposable elements and repetitive sequences). This paper also includes information on how to build a wide community for a genome project, the importance of data management, and how to make the data and results Findable, Accessible, Interoperable, and Reusable (FAIR) by submitting them to a public repository and sharing them with the research community.
Affiliation(s)
- Hyungtaek Jung
  - School of Biological Sciences, The University of Queensland, St Lucia, Queensland, Australia
  - Centre for Agriculture and Bioeconomy, Queensland University of Technology, Brisbane, Queensland, Australia
- Tomer Ventura
  - Genecology Research Centre, School of Science and Engineering, University of the Sunshine Coast, Sippy Downs, Queensland, Australia
- J. Sook Chung
  - Institute of Marine and Environmental Technology, University of Maryland Center for Environmental Science, Baltimore, Maryland, United States of America
- Woo-Jin Kim
  - Genetics and Breeding Research Center, National Institute of Fisheries Science, Geoje, Korea
- Bo-Hye Nam
  - Biotechnology Research Division, National Institute of Fisheries Science, Busan, Korea
- Hee Jeong Kong
  - Biotechnology Research Division, National Institute of Fisheries Science, Busan, Korea
- Young-Ok Kim
  - Biotechnology Research Division, National Institute of Fisheries Science, Busan, Korea
- Min-Seung Jeon
  - Department of Life Science, Chung-Ang University, Seoul, Korea
- Seong-il Eyun
  - Department of Life Science, Chung-Ang University, Seoul, Korea
44
Wood SA, Winder SG, Lia EH, White EM, Crowley CSL, Milnor AA. Next-generation visitation models using social media to estimate recreation on public lands. Sci Rep 2020; 10:15419. [PMID: 32963262 PMCID: PMC7508982 DOI: 10.1038/s41598-020-70829-x] [Citation(s) in RCA: 21] [Impact Index Per Article: 5.3] [Received: 01/01/2020] [Accepted: 07/31/2020] [Indexed: 11/15/2022] Open
Abstract
Outdoor and nature-based recreation provides countless social benefits, yet public land managers often lack information on the spatial and temporal extent of recreation activities. Social media is a promising source of data to fill information gaps because the amount of recreational use is positively correlated with social media activity. However, despite the implication that these correlations could be employed to accurately estimate visitation, there are no known transferable models parameterized for use with multiple social media data sources. This study tackles these issues by examining the relative value of multiple sources of social media in models that estimate visitation at unmonitored sites and times across multiple destinations. Using a novel dataset of over 30,000 social media posts and 286,000 observed visits from two regions in the United States, we compare multiple competing statistical models for estimating visitation. We find social media data substantially improve visitor estimates at unmonitored sites, even when a model is parameterized with data from another region. Visitation estimates are further improved when models are parameterized with on-site counts. These findings indicate that while social media do not fully substitute for on-site data, they are a powerful component of recreation research and visitor management.
Affiliation(s)
- Spencer A Wood
  - eScience Institute, University of Washington, Seattle, WA, USA
  - EarthLab, University of Washington, Seattle, WA, USA
- Emilia H Lia
  - EarthLab, University of Washington, Seattle, WA, USA
- Eric M White
  - Pacific Northwest Research Station, US Forest Service, Olympia, WA, USA
- Adam A Milnor
  - Rivers, Trails, and Conservation Assistance Program, National Park Service, Tucson, AZ, USA
45
Mehta N, Zhu L, Lam K, Stall NM, Savage R, Read SH, Wu W, Pop P, Faulkner C, Bronskill SE, Rochon PA. Health Forums and Twitter for Dementia Research: Opportunities and Considerations. J Am Geriatr Soc 2020; 68:2881-2889. [PMID: 32894780 DOI: 10.1111/jgs.16790] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 04/16/2020] [Revised: 07/24/2020] [Accepted: 07/27/2020] [Indexed: 12/13/2022]
Abstract
BACKGROUND/OBJECTIVES: Social media platforms are promising sources for large quantities of participant-driven research data and circumvent some common challenges when conducting dementia research. This study provides a summary of key considerations and recommendations about using these platforms as research tools for dementia.
DESIGN: Mixed methods.
SETTING: Alzheimer's Society's online Dementia Talking Point forum from inception to April 17, 2018, and Twitter in February and March 2018.
PARTICIPANTS: All users of Dementia Talking Point who posted in subforums labeled "I have dementia" and "I care for a person with dementia," and Twitter users whose posts contained the keywords "dementia," "Alzheimer," or "Alzheimer's."
MEASUREMENTS: We quantified the average daily number of dementia-related posts on each platform and number of words per post. Guided by a codebook, we conducted thematic content analysis of 5% of the 15,513 posts collected from Dementia Talking Point, and 10% of the 25,948 comprehensible posts from Twitter containing "dementia," "Alzheimer," or "Alzheimer's." We also summarized research-relevant characteristics inherent to platforms and posts.
RESULTS: On average, Dementia Talking Point provided fewer than two new daily dementia-related posts with 213.5 to 241.5 words, compared with 7,883 new daily Twitter posts with 14.5 words. Persons with dementia (PWDs) commonly shared dementia-related concerns (75.7%), experiences (68.6%), and requests for, as well as offers of, information and support (44.3% and 38.6%, respectively). Caregivers commonly shared caregiving experience (67.0%) and requests for information and support (52.5%). Most common dementia-related Twitter posts were derogatory use of the term dementia (14.5%), advocacy, fundraising, and awareness (11.6%), and research dissemination (8.0%). Recommendations about these platforms' unique technical and ethical considerations are outlined.
CONCLUSIONS: Understanding the priorities of PWDs and their caregivers remains important to understand how clinicians can best support them. This study will help clinicians and researchers to better leverage online health forums and Twitter for such dementia-related information.
Affiliation(s)
- Nishila Mehta
- Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada.,Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada.,Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada
| | - Lynn Zhu
- Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada.,Rotman Research Institute, Baycrest Health Sciences, Toronto, Ontario, Canada
| | - Kenneth Lam
- Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada.,Division of Geriatric Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Nathan M Stall
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada.,Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada.,Division of Geriatric Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Rachel Savage
- Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada.,Division of Geriatric Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
| | - Stephanie H Read
- Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada
| | - Wei Wu
- Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada
| | - Paula Pop
- Division of Geriatric Medicine, Department of Medicine, McMaster University, Hamilton, Ontario, Canada
| | - Colin Faulkner
- Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada.,Institute of Medical Sciences, University of Toronto, Toronto, Ontario, Canada
| | - Susan E Bronskill
- Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada; Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada; ICES, Toronto, Ontario, Canada
| | - Paula A Rochon
- Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada; Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Ontario, Canada; Women's College Research Institute, Women's College Hospital, Toronto, Ontario, Canada; Division of Geriatric Medicine, Department of Medicine, University of Toronto, Toronto, Ontario, Canada
|
46
|
Cesare N, Oladeji O, Ferryman K, Wijaya D, Hendricks‐Muñoz KD, Ward A, Nsoesie EO. Discussions of miscarriage and preterm births on Twitter. Paediatr Perinat Epidemiol 2020; 34:544-552. [PMID: 31912544 PMCID: PMC7496231 DOI: 10.1111/ppe.12622] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/31/2019] [Revised: 10/22/2019] [Accepted: 11/17/2019] [Indexed: 11/29/2022]
Abstract
BACKGROUND Experiences typically considered private, such as miscarriages and preterm births, are being discussed publicly on social media and Internet discussion websites. These data can provide timely illustrations of how individuals discuss miscarriages and preterm births, as well as insights into the wellbeing of women who have experienced a miscarriage. OBJECTIVES To characterise how users discuss the topic of miscarriage and preterm births on Twitter, analyse trends and drivers, and describe the perceived emotional state of women who have experienced a miscarriage. METHODS We obtained 291 443 Twitter postings on miscarriages and preterm births from January 2017 through December 2018. Latent Dirichlet Allocation (LDA) was used to identify major topics of discussion. We applied time series decomposition methods to assess temporal trends and identify major drivers of discussion. Furthermore, four coders labelled the emotional content of 7282 personal miscarriage disclosure tweets into the following non-mutually exclusive categories: grief/sadness/depression, anger, relief, isolation, annoyance, and neutral. RESULTS Topics in our data fell into eight groups: celebrity disclosures, Michelle Obama's disclosure, politics, healthcare, preterm births, loss and anxiety, flu vaccine and ectopic pregnancies. Political discussions around miscarriages were largely due to a misunderstanding of the difference between abortions and miscarriages. Grief and annoyance were the most commonly expressed emotions within the miscarriage self-disclosures, at 50.6% (95% confidence interval [CI] 49.1, 52.2) and 16.2% (95% CI 15.2, 17.3), respectively. Postings increased with celebrity disclosures, pharmacists' refusal of prescribed medications and outrage over the high rate of preterm births in the United States. Miscarriage disclosures by celebrities also led to disclosures by women who had similar experiences.
CONCLUSIONS This study suggests that increases in discussions of miscarriage on social media are associated with several factors, including celebrity disclosures. Additionally, there is a misunderstanding of the potential physical, emotional and psychological impacts on individuals who lose a pregnancy due to a miscarriage.
Affiliation(s)
- Nina Cesare
- Department of Global Health, School of Public Health, Boston University, Boston, MA, USA
| | - Olubusola Oladeji
- Department of Global Health, School of Public Health, Boston University, Boston, MA, USA
| | - Kadija Ferryman
- Department of Technology, Culture, and Society, Tandon School of Engineering, New York University, New York, NY, USA
| | - Derry Wijaya
- Department of Computer Science, Boston University, Boston, MA, USA
| | - Karen D. Hendricks‐Muñoz
- Department of Pediatrics, Virginia Commonwealth University School of Medicine, Richmond, VA, USA; Children's Hospital of Richmond, Richmond, VA, USA
| | - Alyssa Ward
- Children's Hospital of Richmond, Richmond, VA, USA
| | - Elaine O. Nsoesie
- Department of Global Health, School of Public Health, Boston University, Boston, MA, USA
|
47
|
Bafeta A, Bobe J, Clucas J, Gonsalves PP, Gruson-Daniel C, Hudson KL, Klein A, Krishnakumar A, McCollister-Slipp A, Lindner AB, Misevic D, Naslund JA, Nebeker C, Nikolaidis A, Pasquetto I, Sanchez G, Schapira M, Scheininger T, Schoeller F, Sólon Heinsfeld A, Taddei F. Ten simple rules for open human health research. PLoS Comput Biol 2020; 16:e1007846. [PMID: 32881878 PMCID: PMC7470254 DOI: 10.1371/journal.pcbi.1007846] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/19/2022]
Affiliation(s)
- Aïda Bafeta
- Center for Research and Interdisciplinarity (CRI), Université de Paris, INSERM U1284, Paris, France
| | - Jason Bobe
- Institute for Next Generation Healthcare, New York, New York, United States of America
| | - Jon Clucas
- MATTER Lab, Child Mind Institute, New York, New York, United States of America
| | | | - Célya Gruson-Daniel
- COSTECH, Université de Technologie de Compiègne, Compiègne, France; LabCMO, Université du Québec à Montréal, Université Laval, Montreal, Canada
| | - Kathy L. Hudson
- Hudson Works LLC, Washington, District of Columbia, United States of America
| | - Arno Klein
- MATTER Lab, Child Mind Institute, New York, New York, United States of America
| | - Anirudh Krishnakumar
- Center for Research and Interdisciplinarity (CRI), Université de Paris, INSERM U1284, Paris, France
| | | | - Ariel B. Lindner
- Center for Research and Interdisciplinarity (CRI), Université de Paris, INSERM U1284, Paris, France
| | - Dusan Misevic
- Center for Research and Interdisciplinarity (CRI), Université de Paris, INSERM U1284, Paris, France
| | - John A. Naslund
- Department of Global Health and Social Medicine, Harvard Medical School, Boston, Massachusetts, United States of America
| | - Camille Nebeker
- Department of Family Medicine and Public Health, School of Medicine, University of California San Diego, San Diego, California, United States of America
| | - Aki Nikolaidis
- Center for the Developing Brain, Child Mind Institute, New York, New York, United States of America
| | - Irene Pasquetto
- Harvard Kennedy School, Harvard University, Cambridge, Massachusetts, United States of America
| | | | - Matthieu Schapira
- Structural Genomics Consortium and Department of Pharmacology & Toxicology, University of Toronto, Toronto, Canada
| | - Tohar Scheininger
- Healthy Brain Network, Child Mind Institute, New York, New York, United States of America
| | - Félix Schoeller
- Center for Research and Interdisciplinarity (CRI), Université de Paris, INSERM U1284, Paris, France
| | - Anibal Sólon Heinsfeld
- Center for the Developing Brain, Child Mind Institute, New York, New York, United States of America
| | - François Taddei
- Center for Research and Interdisciplinarity (CRI), Université de Paris, INSERM U1284, Paris, France
|
48
|
Sun W, Nasraoui O, Shafto P. Evolution and impact of bias in human and machine learning algorithm interaction. PLoS One 2020; 15:e0235502. [PMID: 32790666 PMCID: PMC7425868 DOI: 10.1371/journal.pone.0235502] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 12/22/2019] [Accepted: 06/17/2020] [Indexed: 12/22/2022]
Abstract
Traditionally, machine learning algorithms relied on reliable labels from experts to build predictions. More recently, however, algorithms have been receiving data from the general population in the form of labeling, annotations, etc. The result is that algorithms are subject to bias that is born from ingesting unchecked information, such as biased samples and biased labels. Furthermore, people and algorithms are increasingly engaged in interactive processes wherein neither the human nor the algorithms receive unbiased data. Algorithms can also make biased predictions, leading to what is now known as algorithmic bias. On the other hand, humans' reactions to the output of machine learning methods with algorithmic bias worsen the situation by making decisions based on biased information, which will probably be consumed by algorithms later. Some recent research has focused on the ethical and moral implications of machine learning algorithmic bias on society. However, most research has so far treated algorithmic bias as a static factor, which fails to capture the dynamic and iterative properties of bias. We argue that algorithmic bias interacts with humans in an iterative manner, which has a long-term effect on algorithms' performance. For this purpose, we present an iterated-learning framework that is inspired by human language evolution to study the interaction between machine learning algorithms and humans. Our goal is to study two sources of bias that interact: the process by which people select information to label (human action); and the process by which an algorithm selects the subset of information to present to people (iterated algorithmic bias mode). We investigate three forms of iterated algorithmic bias (personalization filter, active learning, and random) and how they affect the performance of machine learning algorithms by formulating research questions about the impact of each type of bias.
Based on statistical analyses of the results of several controlled experiments, we found that the three different iterated bias modes, as well as initial training data class imbalance and human action, do affect the models learned by machine learning algorithms. We also found that iterated filter bias, which is prominent in personalized user interfaces, can lead to more inequality in estimated relevance and to a limited human ability to discover relevant data. Our findings indicate that the relevance blind spot (items from the testing set whose predicted relevance probability is less than 0.5 and that thus risk being hidden from humans) amounted to 4% of all relevant items when using a content-based filter that predicts relevant items. A similar simulation using a real-life rating data set found that the same filter resulted in a blind spot size of 75% of the relevant testing set.
Affiliation(s)
- Wenlong Sun
- Department of Computer Engineering and Computer Science, University of Louisville, Louisville, Kentucky, United States of America
| | - Olfa Nasraoui
- Department of Computer Engineering and Computer Science, University of Louisville, Louisville, Kentucky, United States of America
| | - Patrick Shafto
- Department of Mathematics and Computer Science, Rutgers University - Newark, Newark, New Jersey, United States of America
|
49
|
Briney K, Coates H, Goben A. Foundational Practices of Research Data Management. RESEARCH IDEAS AND OUTCOMES 2020. [DOI: 10.3897/rio.6.e56508] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 11/12/2022]
Abstract
The importance of research data has grown as researchers across disciplines seek to ensure reproducibility, facilitate data reuse, and acknowledge data as a valuable scholarly commodity. Researchers are under increasing pressure to share their data for validation and reuse. Adopting good data management practices allows researchers to efficiently locate their data, understand it, and use it throughout all of the stages of a project and in the future. Additionally, good data management can streamline data analysis, visualization, and reporting, thus making publication less stressful and time-consuming. By implementing foundational practices of data management, researchers set themselves up for success by formalizing processes and reducing common errors in data handling, which can free up more time for research. This paper provides an introduction to best practices for managing all types of data.
|
50
|
Poom A, Järv O, Zook M, Toivonen T. COVID-19 is spatial: Ensuring that mobile Big Data is used for social good. BIG DATA & SOCIETY 2020; 7:2053951720952088. [PMID: 34191995 PMCID: PMC7453154 DOI: 10.1177/2053951720952088] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 05/08/2023]
Abstract
The mobility restrictions related to the COVID-19 pandemic have resulted in the biggest disruption to individual mobilities in modern times. The crisis is clearly spatial in nature, and examining the geographical aspect is important in understanding the broad implications of the pandemic. The avalanche of mobile Big Data makes it possible to study the spatial effects of the crisis with spatiotemporal detail at the national and global scales. However, the current crisis also highlights serious limitations in the readiness to take advantage of mobile Big Data for social good, both within and beyond the interests of the health sector. We propose two strategic pathways for the future use of mobile Big Data for societal impact assessment, addressing access to both raw mobile Big Data and aggregated data products. Both pathways require careful consideration of privacy issues, harmonized and transparent methodologies, and attention to the representativeness, reliability and continuity of data. The goal is to be better prepared to use mobile Big Data in future crises.
Affiliation(s)
- Age Poom
- Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, Finland
- Helsinki Institute of Sustainability Science, Institute of Urban and Regional Studies, University of Helsinki, Helsinki, Finland
- Mobility Lab, Department of Geography, University of Tartu, Tartu, Estonia
- Age Poom, University of Helsinki, Gustaf Hällströmin katu 2, Helsinki 00014, Finland.
| | - Olle Järv
- Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, Finland
- Helsinki Institute of Sustainability Science, Institute of Urban and Regional Studies, University of Helsinki, Helsinki, Finland
| | - Matthew Zook
- Department of Geography, University of Kentucky, Lexington, KY, USA
| | - Tuuli Toivonen
- Digital Geography Lab, Department of Geosciences and Geography, University of Helsinki, Helsinki, Finland
- Helsinki Institute of Sustainability Science, Institute of Urban and Regional Studies, University of Helsinki, Helsinki, Finland
|