Community engagement and data quality: best practices and lessons learned from a citizen science project on birdsong

Jäckel, Denise; Mortega, Kim G.; Darwin, Sarah; Brockmeyer, Ulrich; Sturm, Ulrike; Lasseck, Mario; Moczek, Nicola; Lehmann, Gerlind U. C.; Voigt-Heucke, Silke L.

doi:10.1007/s10336-022-02018-8

Original Article
Open access
Published: 13 October 2022

Journal of Ornithology, Volume 164, pages 233–244 (2023)

Abstract

Citizen Science (CS) is a research approach that has become popular in recent years and offers innovative potential for dialect research in ornithology. As the scepticism about CS data is still widespread, we analysed the development of a 3-year CS project based on the song of the Common Nightingale (Luscinia megarhynchos) to share best practices and lessons learned. We focused on the data scope, individual engagement, spatial distribution and species misidentifications from recordings generated before (2018, 2019) and during the COVID-19 outbreak (2020) with a smartphone using the ‘Naturblick’ app. The number of nightingale song recordings and individual engagement increased steadily and peaked in the season during the pandemic. 13,991 nightingale song recordings were generated by anonymous (64%) and non-anonymous participants (36%). As the project developed, the spatial distribution of recordings expanded (from Berlin based to nationwide). The rates of species misidentifications were low, decreased in the course of the project (10–1%) and were mainly affected by vocal similarities with other bird species. This study further showed that community engagement and data quality were not directly affected by dissemination activities, but that the former was influenced by external factors and the latter benefited from the app. We conclude that CS projects using smartphone apps with an integrated pattern recognition algorithm are well suited to support bioacoustic research in ornithology. Based on our findings, we recommend setting up CS projects over the long term to build an engaged community which generates high data quality for robust scientific conclusions.

Zusammenfassung

Gesellschaftliches Engagement und Datenqualität: Bewährte Praktiken und Erfahrungen aus einem bürgerwissenschaftlichen Projekt zum Vogelgesang

Citizen Science (CS) ist eine Forschungsmethode, die in den letzten Jahren an Bedeutung gewonnen hat und innovatives Potenzial für die Dialektforschung in der Ornithologie bietet. Da die Vorbehalte gegenüber CS-Daten immer noch weit verbreitet sind, haben wir die Entwicklung eines dreijährigen CS-Projekts zum Gesang der Nachtigall (Luscinia megarhynchos) analysiert, um bewährte Praktiken und gewonnene Erfahrungen darzustellen. Wir fokussierten uns auf den Datenumfang, das individuelle Engagement von Teilnehmenden, die räumliche Verteilung und die Fehlbestimmungen von Arten aus Aufnahmen, die vor (2018, 2019) und während des COVID-19-Ausbruchs (2020) mit einem Smartphone unter Verwendung der "Naturblick" App erstellt wurden. Die Anzahl der Aufnahmen von Nachtigallgesängen und das individuelle Engagement stiegen stetig an und erreichten ihren Höhepunkt in der Saison während der Pandemie. 13.991 Aufnahmen von Nachtigallgesängen wurden von anonymen (64%) und nicht-anonymen Teilnehmenden (36%) erstellt. Im Laufe des Projekts weitete sich die räumliche Verteilung der Aufnahmen aus (von Berlin auf bundesweit). Die Rate der Fehlbestimmungen war gering, ging im Laufe des Projekts zurück (von 10% auf 1%) und wurde hauptsächlich von gesanglichen Ähnlichkeiten mit anderen Vogelarten beeinflusst. Unsere Studie zeigte außerdem, dass das gesellschaftliche Engagement und die Datenqualität nicht direkt von den durchgeführten Disseminationsaktivitäten beeinflusst wurden, sondern dass erstere von externen Faktoren abhingen und letztere von der App profitierte. Wir schließen daraus, dass CS-Projekte, die Smartphone-Apps mit einem integrierten Mustererkennungsalgorithmus verwenden, gut geeignet sind, um die bioakustische Forschung in der Ornithologie zu unterstützen. 
Auf der Grundlage unserer Ergebnisse empfehlen wir, CS-Projekte langfristig zu etablieren, um eine aktive Teilnehmergemeinschaft (Community) aufzubauen, die qualitativ hochwertige Daten für fundierte wissenschaftliche Schlussfolgerungen generiert.

Introduction

Citizen science (CS) is a research approach in which volunteers (non-professional scientists) and academic researchers (professional scientists) work together in one or several research processes to gain scientific knowledge (Bonney et al. 2014). In recent years, CS has been increasingly used in ecological and environmental research (e.g., Planillo et al. 2021). With the development of technical tools for citizen scientists such as interactive platforms or smartphone apps, data collection has recently increased (Falk et al. 2019). This technological improvement expanded local (e.g., Urban Birds Conservation Program of Vitoria-Gasteiz) and national projects (e.g., Dialects of Czech Yellowhammers: Diblíková et al. 2019) to global ones with real time data transmission (e.g., Xeno-Canto, Cornell Lab of Ornithology). Citizen scientists generated data on a large temporal and spatial scale that could otherwise not be obtained (Diblíková et al. 2019; Searfoss et al. 2020). The aim of CS studies can be quite diverse, e.g., they can serve conservation, raise awareness or monitor environmental efforts (e.g., Roy et al. 2012; Planillo et al. 2021). In the field of ornithology, CS has a long tradition and contains some long-running projects; for example, the National Audubon Society’s Christmas Bird Count, the North American Breeding Bird Survey and the Pan-European Common Bird Monitoring Scheme. The number of long-term projects has increased ever since, while professional and lay bird enthusiasts are continuing to upload data on online platforms and archives (e.g., eBird: Sullivan et al. 2009, ornitho.de: Frick and Jaehne 2013, Project FeederWatch). Indeed, Randler (2021) showed that people using ornitho.de have better birding skills compared to other birders. Citizen scientists with a longer engagement provide a large amount of valid data, e.g., to document the worldwide presence or absence of birds (Lepczyk et al. 2005).

Ecological estimates of diversity or abundance based on CS data can be affected by project structure and thus community engagement as well as data quality (Dickinson et al. 2010). Community engagement, in the sense of generated data, has been found to decrease with project duration (Bruckermann et al. 2021). In contrast, the COVID-19 outbreak starting in 2020 increased individual engagement and data scope in CS projects (Hochachka et al. 2021). The pandemic also changed human activities around the world in terms of data distribution, probably due to a greater desire to spend time in nature (Venter et al. 2021), and quality (Phillips et al. 2021). Data quality is multidimensional and may be expressed by more than a dozen factors (reviewed in Lewandowski and Specht 2015) such as anonymity or inexperience of citizen scientists and project duration (Dickinson et al. 2010). In bird studies, this can lead to misidentifications. Identification is a complex task that relies on several factors such as vocal and visual similarities between co-occurring species or species richness (Kelling et al. 2015).

We here aimed to gain a better understanding of the community engagement and data quality in a CS project that was based on the song of the Common Nightingale (Luscinia megarhynchos) before (2018, 2019) and during the COVID-19 pandemic (2020). The nightingale is an ideal study object to address scientific questions using a CS approach. Nightingales possess a very memorable melodic and complex song which can frequently be heard from mid-April to late June in parks and gardens (Glutz von Blotzheim 1988) in Berlin, Germany. The literal translation in German ('Nacht'igall) as well as in English ('night'ingale) is a "night singer" and suggests a purely nocturnal song. Yet, nightingales are known to sing during both the night and the day-time (Amrhein et al. 2004). In the course of three nightingale breeding seasons, we invited all citizens to record nightingales with their smartphones. Based on previous CS projects which showed that difficult tasks reduce data quality (Kosmala et al. 2016), we decided against strict protocols or specific instructions (duration, number, place or time of the recordings). Instead, we chose a low-threshold approach to target citizen scientists with little or no prior ornithological knowledge. The intent was to reach a wide audience, resulting in a diverse, high level of community engagement. The project, which started in Berlin in 2018, has been expanded to cover all of Germany from 2019 to 2020. We allowed the participants to engage anonymously or non-anonymously. As it has been shown that many dissemination activities increase data quality (Bryant and Oliver 2009), we aimed to provide information about nightingale song on the project website, in press coverage, and at scientific or cultural face-to-face events, mostly in Berlin and mainly before the COVID-19 pandemic.

The nightingale citizen science project (Forschungsfall Nachtigall) was based on a 2016 pilot project and previous dialect findings in the nightingale song (master thesis: Schehka 2004; doctoral thesis: Weiss 2012). Dialects are song variations between different populations and time periods (Catchpole and Slater 2008). For dialect studies, many recordings with a wide spatial distribution are needed. A growing body of literature demonstrates that this can easily be obtained through the power of worldwide participating citizen scientists (e.g., Diblíková et al. 2019; Searfoss et al. 2020). Our opportunistic approach indeed led to a large collection of geo-referenced nightingale song recordings. We previously showed that the majority of our CS data were valid enough (Jäckel et al. 2021) and of high value for dialect studies (Jäckel et al. 2022). The development of the project over three nightingale breeding seasons has not yet been studied.

Here, we thus focused on the community engagement (data scope, spatial distribution) and data quality (species misidentifications) before and during the COVID-19 pandemic. We investigated (1) the data scope in terms of the number of participants, cumulative duration and number of recordings from participants who took part either anonymously or non-anonymously, (2) the spatial and temporal distribution of recordings, and (3) species misidentifications, in total and separately for anonymous and non-anonymous users, and their underlying patterns. For 2020, during the COVID-19 pandemic, we expected a decrease in community engagement due to our reduced dissemination activities, yet an increase in individual engagement. Because the project was promoted with a wider geographic reach after the first year, we expected that recordings would be more widely distributed in the last 2 years. We predicted that particularly common and other melodious bird species would be mistaken for nightingales, especially during the night, as lay people often assume that only nightingales have a nocturnal song.

Methods

The nightingale citizen science project and its cooperation with the ‘Naturblick’ app

The nightingale citizen science project was launched in 2018 as a collaboration with the 'Naturblick' app at the Museum für Naturkunde (MfN) in Berlin, Germany. The app has been available since 2016 and was already widely used in 2017, with almost 50,000 downloads (Sturm and Tscholl 2019). As a special feature, the app includes a pattern recognition algorithm (PRA) which automatically identifies bird species based on cross-correlation via template matching of spectrogram segments (Lasseck 2016; Stehle et al. 2020). In recent years, the PRA has improved through the use of neural networks and deep learning (Lasseck 2018).

Using diverse dissemination activities such as events both inside and outside the MfN, midnight excursions and press coverage, the public was invited to download the 'Naturblick' app on their smartphones and record nightingale songs (for details see Jäckel et al. 2021). Public events were free of charge, included two to 180 participants and by and large took place in Berlin. Press coverage occurred mainly before and during the breeding season in the form of radio interviews, newspaper articles and social media posts. The app featured a bespoke button showing a nightingale, to highlight the citizen science project. By clicking on this button, participants could transmit their recordings directly or make use of the PRA to aid in species identification (Fig. 1). As part of this process, the three bird species whose vocalisations most closely matched the recording were presented (Stehle et al. 2020). Participants were also allowed to choose whether they wanted to submit the recordings anonymously or non-anonymously with an individual username. Because of technical limitations, it was not possible to submit recordings more than once or with a duration of more than two minutes per recording. Temporal (day, time) and spatial (GPS coordinates) information were automatically captured in the metadata, if permitted by the participant.

Development of the data scope over the study period

To get a better understanding of the community development over the course of the project, we descriptively analysed the data scope before (2018, 2019) and during the COVID-19 pandemic (2020). We used the number of participants, cumulative duration and number of all recordings by participants who took part anonymously or non-anonymously in 1, 2, or 3 years as parameters. As dissemination activities are supposed to have an impact (Bryant and Oliver 2009), we examined whether their reduction in 2020 had any effect. Fewer dissemination activities were provided during the 2020 nightingale breeding season. That year was dominated by the COVID-19 pandemic lockdown and face-to-face events were not permitted in Berlin. Instead, online field trips and events were undertaken, a method which has been found to be equally reliable (Rögele et al. 2022).
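The grouping described above can be illustrated with a minimal Python sketch. The record layout (username or `None` for anonymous submissions, year, duration in seconds) and all values are hypothetical and only serve to show how recordings might be assigned to the anonymous, 1-year, 2-year and 3-year cohorts; this is not the authors' analysis code.

```python
from collections import defaultdict

# Hypothetical records: (username or None for anonymous, year, duration in seconds).
recordings = [
    (None, 2018, 45.0),
    ("user_a", 2018, 60.0),
    ("user_a", 2019, 30.0),
    ("user_a", 2020, 90.0),
    ("user_b", 2019, 120.0),
    (None, 2020, 15.0),
]


def summarise_by_cohort(records):
    """Sum recording counts and durations per participation cohort."""
    # First pass: collect the set of years in which each named user took part.
    years_by_user = defaultdict(set)
    for user, year, _ in records:
        if user is not None:
            years_by_user[user].add(year)

    # Second pass: assign every recording to its cohort and accumulate totals.
    summary = defaultdict(lambda: {"recordings": 0, "duration_s": 0.0})
    for user, _, duration in records:
        if user is None:
            cohort = "anonymous"
        else:
            cohort = f"{len(years_by_user[user])} year(s)"
        summary[cohort]["recordings"] += 1
        summary[cohort]["duration_s"] += duration
    return dict(summary)


summary = summarise_by_cohort(recordings)
```

In this sketch, anonymous submissions can only be counted as recordings, since individual anonymous participants cannot be told apart.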

Development of spatial distribution of CS recordings in Germany

To enable comparisons between populations, it is important to obtain recordings from different places. Citizen scientists, therefore, have to cover a wide spatial distribution. We aimed to reach people from Berlin, where the MfN is based, and all over Germany, since participation in this project was possible from any location with a smartphone and the 'Naturblick' app. Germany was separated into four regions (North, East, South, West) which included quadrants with an area of 100 × 100 km each. We divided the number of quadrants that contained nightingale song recordings by the number of total quadrants in Germany to determine the percentages for each year and whether the spatial distribution increased with time.
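As an illustration of the coverage measure, the following Python sketch assigns points to 100 × 100 km cells and reports the share of occupied cells. It assumes that recording positions are already available as projected coordinates in kilometres; the coordinates and the total number of cells are hypothetical placeholders, not values from the study.

```python
import math

CELL_KM = 100  # edge length of one quadrant, as stated in the text


def cell_of(easting_km, northing_km, cell_km=CELL_KM):
    """Return the (column, row) index of the grid cell containing a point."""
    return (math.floor(easting_km / cell_km), math.floor(northing_km / cell_km))


def coverage_percent(points_km, total_cells):
    """Percentage of grid cells that contain at least one recording."""
    occupied = {cell_of(e, n) for e, n in points_km}
    return 100.0 * len(occupied) / total_cells


# Hypothetical projected coordinates (km) of four recordings.
points = [(12.0, 40.0), (55.0, 99.0), (130.0, 20.0), (250.0, 310.0)]

# total_cells is a placeholder; the real number of quadrants is not given here.
pct = coverage_percent(points, total_cells=10)
```

Computing this value separately for each year would give the kind of yearly percentages reported in the Results.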

Underlying patterns of species misidentifications

Species identification is challenging for citizen scientists (Crall et al. 2011) and one of the major data quality issues. We aimed to elucidate and better understand the development of species misidentifications over the course of the project and underlying patterns. MP3 and m4a files were converted into the WAV format using the WaveLab 7 program to analyse them visually and acoustically with Avisoft SASLab Pro 5.2 (R. Specht, Berlin, Germany, sampling rate = 22,050 Hz, FFT = 1024 points, Hamming window, overlap 93.75%). Recordings were sorted into four types (nightingale songs, nightingale calls, other bird species, i.e., species misidentifications of the nightingale, and no bird song). We examined the effect of dissemination activities in all three nightingale breeding seasons and of participation (i.e., anonymity, experience) on the recording types and descriptively analysed the respective percentages (number of recordings per classified type/all recordings).
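To make the spectrogram settings more concrete, the sketch below derives the hop size, frequency bin spacing and time step from the stated sampling rate (22,050 Hz), FFT length and overlap using the standard short-time Fourier transform relationships. It is an illustration only; how Avisoft SASLab Pro reports its resolution values is not confirmed here, and the function name is hypothetical.

```python
def spectrogram_resolution(sample_rate_hz, fft_points, overlap_fraction):
    """Derive basic STFT resolution figures from the analysis settings."""
    # Number of samples the analysis window advances between frames.
    hop = round(fft_points * (1.0 - overlap_fraction))
    return {
        "hop_samples": hop,
        # Spacing between adjacent frequency bins.
        "freq_bin_spacing_hz": sample_rate_hz / fft_points,
        # Time between successive spectrogram columns.
        "time_step_ms": 1000.0 * hop / sample_rate_hz,
    }


res = spectrogram_resolution(22050, 1024, 0.9375)
```

With these settings the window advances by 64 samples, which corresponds to a frequency bin spacing of roughly 21.5 Hz and a time step of roughly 2.9 ms.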

To identify temporal factors for underlying patterns (calendar week and time of day), we descriptively compared the number of nightingale songs and other bird species recorded across all study years. The peak singing time of nightingales (23:00–1:00 h) as reported in the literature (see Jäckel et al. 2021) and other species of birds (5:00–7:00 h and 19:00–21:00 h) were accounted for to see if they have an influence on the rate of species misidentifications. To understand if and when certain species are most often misidentified as nightingales, we compared recordings from day (04:00–22:00 h) and night (22:00–04:00 h). Not all of the other bird species could be identified (too short, disturbing background noise).
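The day and night split can be sketched in a few lines of Python. The boundary handling is an assumption made for this example: a recording at exactly 22:00 is counted as night and one at exactly 04:00 as day, which the text does not specify.

```python
from datetime import time

NIGHT_START = time(22, 0)
NIGHT_END = time(4, 0)


def period_of(t):
    """Classify a clock time as 'night' (22:00 to 04:00) or 'day'."""
    # The night interval wraps around midnight, so two comparisons are needed.
    if t >= NIGHT_START or t < NIGHT_END:
        return "night"
    return "day"


# Hypothetical recording times.
labels = [period_of(t) for t in (time(23, 30), time(3, 59), time(4, 0), time(12, 15))]
```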

As there were few species misidentifications in total (about 1–10% of submissions), we decided to work on the taxonomic group level. We analysed underlying patterns in terms of similarities (vocal, visual and species abundance) between the taxonomic group of the other bird species and the nightingale. Vocal similarities were determined by the following parameters: melodic (yes/no), complex (yes/no), nightingale song elements (whistle/trill/buzz) and usual time of singing (night/dawn/day). Nightingales are visually recognisable on their song post. Thus, we determined for the evaluation of visual similarities, whether the plumage was identical to the nightingale (brown), similar (black) or different. As frequently occurring species are easier to identify than rarely occurring ones (Falk et al. 2019), we determined the species abundance in Germany from the ‘Berliner Ornithologische Arbeitsgemeinschaft e.V.’ (Berlin Ornithological Society) breeding bird monitoring (http://www.orniberlin.de/) and divided it into six groups (group 1 = ≤ 1,000, group 2 = ≤ 10,000, group 3 = ≤ 100,000, group 4 = ≤ 500,000, group 5 = ≤ 1,000,000, group 6 = > 1,000,000).
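The six abundance groups can be expressed as a simple binning function. The following Python sketch uses the upper bounds listed above; the function name and the example counts are hypothetical, and the unit of the counts (e.g., breeding pairs) is not specified in the text.

```python
from bisect import bisect_left

# Inclusive upper bounds of groups 1 to 5; anything larger falls into group 6.
UPPER_BOUNDS = [1_000, 10_000, 100_000, 500_000, 1_000_000]


def abundance_group(count):
    """Return the abundance group (1 to 6) for a given abundance count."""
    # bisect_left keeps a count equal to a bound inside that bound's group.
    return bisect_left(UPPER_BOUNDS, count) + 1


groups = [abundance_group(c) for c in (800, 1_000, 1_001, 750_000, 2_000_000)]
```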

Statistical analyses

We performed all statistical analyses with R version 4.1.2 (R-Team 2021). A possible correlation between (i) the dissemination activities and the number of participants, (ii) cumulative duration and the recording type, or (iii) number of recordings and the recording type was determined using Spearman's rank correlation test. We used a principal component analysis (PCA) and a general linear model (GLM) to compare similarities (vocal, visual and species abundance) between taxonomic groups and nightingales.
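For readers unfamiliar with Spearman's rank correlation, the following Python sketch computes the coefficient as the Pearson correlation of average ranks. The analyses in this study were run in R; this block is a language-neutral illustration only, the input values are hypothetical, and it computes the coefficient without the accompanying p-value.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the full run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho; undefined (division by zero) if x or y is constant."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical yearly values: number of activities and number of participants.
activities = [10, 25, 5]
participants = [100, 300, 200]
rho = spearman_rho(activities, participants)
```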

Results

Development of the data scope over the study period

The project showed a positive community development over the project duration in terms of the data scope. The number of dissemination activities (Fig. 2a, Online Resource 1), cumulative duration (Fig. 2b) and number of participants (Fig. 2c) increased from 2018 to 2019 and decreased in 2020. The number of recordings increased continuously over the nightingale breeding seasons (Fig. 2d), owing to higher individual engagement, especially in 2020. The majority of data were recorded anonymously, followed by citizen scientists who participated non-anonymously in 1, 2 and 3 years. Dissemination activities did not correlate with the number of participants, cumulative duration or number of recordings (Spearman's rank correlation test: p > 0.05).

Development of spatial distribution of CS recordings in Germany

Most recordings were obtained from the East of Germany, followed by the West, North and South (Fig. 3). This matches the natural distribution of the nightingale (reported by the German national bird breeding count by the ‘Dachverband Deutscher Avifaunisten’; Gedeon et al. 2014). During 2018, we had the lowest spatial distribution in terms of the percentage of quadrants which contained nightingale song recordings (42%). In 2020, during the COVID-19 pandemic, the spatial distribution for all regions was larger (90%) than in 2019 (74%). Over the years, the data were more geographically spread and no longer came mainly from Berlin as in the first year.

Underlying patterns of species misidentifications

The data quality of the recording types showed a positive development over the 3 years. The percentage of nightingale songs and calls (Fig. 4a) increased steadily over the study period. At the same time, the percentage of other bird species and no birds decreased continuously. In 2020, during the COVID-19 pandemic, the number of species misidentifications was lowest. Dissemination activities did not correlate with the recording types (Spearman's rank correlation test: p > 0.05). The percentages were similar between anonymous and non-anonymous participants (Fig. 4b, Online Resource 2). The highest percentage of recordings with nightingale songs and the fewest with other bird species were generated by non-anonymous citizen scientists who took part in all 3 years, followed by those who took part in 2 years, in 1 year and anonymously.

A descriptive analysis of recordings from all years revealed that species misidentification was not affected by the song timing across calendar weeks or time of day. Most nightingale songs were recorded in the 16th and 17th calendar week (Fig. 5a). Recordings with other bird songs were made primarily between the 18th and 20th calendar week (see Jäckel et al. 2021). After the high nightingale breeding season when song is most common, slightly more (26th week) or the same amount (27th week) of other bird songs were recorded. Most recordings of nightingale song were made between 23:00 and 0:00 h and of other bird species at 21:00 and 4:00 h (Fig. 5b). There were more recordings with species misidentifications of other bird species recorded at hours outside the times when the nightingale song is regularly found (23:00 and 1:00 h).

During the day, European Blackbird (Turdus merula, n = 295), Song Thrush (Turdus philomelos, n = 35) and House Sparrow (Passer domesticus, n = 31) were the species that were mainly misidentified as nightingales by participants (Table 1). European Blackbird (n = 30), Song Thrush (n = 12) and European Robin (Erithacus rubecula, n = 5) were the species mainly misidentified at night. The Thrush Nightingale (Luscinia luscinia), the Common Nightingale's sister species (Vokurková et al. 2013), was recorded fourteen times.

Table 1 The number of species misidentifications on the species level for 601 recordings of other bird species during the whole study period (2018–2020)

A PCA of three similarity traits (vocal, visual and species abundance) showed patterns of species misidentifications. Two principal components explained 71% of the species misidentifications with other species for vocal and visual similarities (Fig. 6, PC1 = 44%) and species abundance (PC2 = 28%) between the taxonomic groups and the nightingale. Vocal similarities mainly in the Muscicapoidea and Sylvioidea and partly in Certhioidea may have had an influence (for an overview of the species see Table 1). Visual similarities of Passeroidea and Certhioidea may have had an effect. Species abundance may be a factor for Passeroidea. A general linear model of the first principal component showed that species misidentifications on the level of the taxonomic groups were influenced significantly by the vocal similarity (GLM: df = 24, p-value < 0.005; Table 2).

Table 2 Results of a general linear model testing whether the first principal component (PC1) showed relationships between the traits and the taxonomic group

Discussion

In this paper, we investigated the development of community engagement and data quality of a 3-year CS project based on the song of the nightingale before and during the COVID-19 pandemic. We found an increase in the dynamics of community engagement over the course of the project in both the data scope and spatial distribution. The number of species misidentifications decreased during the study period and species misidentifications were mostly affected by vocal similarities of other species. In the following, we will discuss our findings in more detail with regard to two aspects: best practices and lessons learned that we believe could be useful to grow and improve (future) avian-based CS projects.

Best practices: success factors in citizen science

The nightingale CS project steadily grew in numbers of recordings collected, despite all the challenges that CS brings (Bonney et al. 2014; Wittman et al. 2019) and the COVID-19 pandemic brought in 2020. In total, 13,991 nightingale song recordings were submitted to the project by anonymous (64%) and non-anonymous participants (36%). We achieved the goal of high community engagement, even though only a few non-anonymous citizen scientists participated in all three nightingale breeding seasons. Similar to other CS projects (e.g., Segal et al. 2015; Seymour and Haklay 2017), most of the non-anonymous participants took part in only a single nightingale breeding season. However, proportionately, these 1-year citizen scientists produced many and anonymous ones produced the majority of recordings. Anonymity may have been a success factor, without which two thirds of the data would not have been obtained. Bryant and Oliver (2009) demonstrated that dissemination activities do not seem to be a contributor to success for the duration or community engagement of participation in CS projects. Similarly, we assume that some citizen scientists liked joining our diverse dissemination activities but rarely generated recordings, while others recorded but did not attend. Even though press coverage and events did not seem to have an influence on the data scope (i.e., number of citizen scientists, duration and number of recordings), their reduction during the COVID-19 outbreak led to fewer participants and possibly shorter recording duration than in the previous years. These fewer citizen scientists in 2020 were individually more engaged and produced more recordings. The enhanced use as well as the improvement of the PRA may have led to an increase in the number and a decrease in the duration of recordings in the last year. Our findings are also consistent with other CS projects that reported higher individual engagement during the pandemic (e.g., Sánchez-Clavijo et al. 2021; Hochachka et al. 2021) based presumably on an increased desire to experience nature (Flaccus 2020) or outdoor activities (Venter et al. 2021). The geographical spread of our opportunistic data increased over the course of the project and was similar to the nightingale distribution in Germany (Gedeon et al. 2014). Moreover, our project was successful enough to expand the initial geographical focus from Berlin to a nationwide data coverage.

The low-threshold approach which engaged a wide audience of citizen scientists was also successful in this project. Contrary to Falk et al. (2019), participants as a mass may have improved, resulting in higher data accuracy than in other CS projects (e.g., Kosmala et al. 2016). In fact, the data quality of anonymous users was similar to that of those who participated for multiple years non-anonymously. This revealed that quality was negatively affected neither by the reduction of our dissemination activities (Bryant and Oliver 2009) nor by the anonymity of participants (Dickinson et al. 2010) but was positively influenced by the project duration and the improved PRA.

Lessons learned: suggestions for future CS projects on birdsong

As there is evidence in the literature that in many cases, data quality is linked to avoidable errors in the study design (e.g., Bowser et al. 2020), we suggest the following measures for future CS projects:

Most species misidentifications did not occur at night, as expected, but were affected by vocal similarities of other species. We should, for example, have provided more information through the use of the already integrated 'Species Portrait' and 'Trait Selection' features of the app, which could have reduced confusions with other species based on vocal similarities. Citizen scientists should have been supported with more specific instructions for the recordings, i.e., how (smartphone orientation), where (regions), and when (during the night). Such a task-led approach facilitates data collection for citizen scientists (Moczek et al. 2021) and enables them to get a common understanding of data quality (Land-Zandstra et al. 2021). Positive feedback and confirmation of the number of successfully recorded nightingale songs likewise might have helped to increase quality (Peltola and Arpin 2018) and motivate citizen scientists to participate long-term (Pandya and Dibner 2018). It has been suggested that rewards (Reeves et al. 2017), greater involvement in scientific processes and valuing individual contributions (Dowthwaite and Sprinks 2019) could lead to data generated not by a few participants (90-9-1 rule: Gasparini et al. 2020), but by many and further over the whole project duration. In future projects, the aspects that influence data quality could be validated and specified by additional demographic data (e.g., leisure activities; Lee and Scott 2004). Equally informative would be a self-assessment item or a scale of skills and knowledge to compare the recordings (type, number and duration).

Overall, the project was very successful and has positively evolved over the years. As individual engagement, data scope, spatial distribution, and data quality have largely increased over the nightingale breeding seasons, this strongly suggests that CS projects need to be implemented over the long term. It also became apparent that it was indeed a good idea to focus on a specific species like the nightingale, since species identification is a difficult task (Crall et al. 2011) and the nightingale proved to be a charismatic focal bird.

Conclusion

The nightingale citizen science project has demonstrated that CS is a research approach that can contribute large datasets with data quality valid to science through increasing community development and engagement over time. For dialect research, many and diverse recordings from various locations are necessary, making our CS data highly valuable. The findings from our study may also offer great value for other CS projects to gain insights into best practices and to avoid systematic species misidentifications in the future, which can lead to biased ecological estimates (Dickinson et al. 2010). In sum, our study may inspire other existing and evolving CS projects in ornithology to adapt their study design with regard to their modes of community engagement and ways to ensure data quality.

Data availability

The datasets generated during and/or analysed during the current study are available from the corresponding author on reasonable request.

References

Amrhein V, Kunc HP, Naguib M (2004) Seasonal patterns of singing activity vary with time of day in the nightingale (Luscinia megarhynchos). Auk 121:110–117. https://doi.org/10.1093/auk/121.1.110
Bonney R, Shirk JL, Phillips T, Wiggins A, Ballard HL, Miller-Rushing AJ, Parrish JK (2014) Next steps for citizen science. Science 343:1436–1437. https://doi.org/10.1126/science.1251554
Bowser A, Cooper C, De Sherbinin A, Wiggins A, Brenton P, Chuang TR et al (2020) Still in need of norms: the state of the data in citizen science. Citiz Sci Theory Pract 5:1–16. https://doi.org/10.5334/CSTP.303
Bruckermann T, Greving H, Brandt M, Harms U (2021) Daten sammeln reicht mir! Bürgerwissenschaften: Welche Aktivitäten interessieren Beteiligte? IPN Journal 8:24–29. www.ipn.uni-kiel.de/de/publikationen/ipn-journal/ipn-journal-no-8
Bryant J, Oliver MB (2009) Media effects: advances in theory and research. Routledge, London
Catchpole CK, Slater PJB (2008) Bird song: biological themes and variations. Cambridge University Press
Crall AW, Newman GJ, Stohlgren TJ, Holfelder KA, Graham J, Waller DM (2011) Assessing citizen science data quality: an invasive species case study. Conserv Lett 4:433–442. https://doi.org/10.1111/j.1755-263X.2011.00196.x
Diblíková L, Pipek P, Petrusek A, Svoboda J, Bílková J, Vermouzek Z et al (2019) Detailed large-scale mapping of geographical variation of Yellowhammer Emberiza citrinella song dialects in a citizen science project. Ibis 161:401–414. https://doi.org/10.1111/ibi.12621
Dickinson JL, Zuckerberg B, Bonter DN (2010) Citizen science as an ecological research tool: challenges and benefits. Annu Rev Ecol Evol Syst 41:149–172. https://doi.org/10.1146/annurev-ecolsys-102209-144636
Dowthwaite L, Sprinks J (2019) Citizen science and the professional-amateur divide: lessons from differing online practices. J Sci Commun 18:1–16. https://doi.org/10.22323/2.18010206
Falk S, Foster G, Comont R, Conroy J, Bostock H, Salisbury A, Kilbey D, Bennett J, Smith B (2019) Evaluating the ability of citizen scientists to identify Bumblebee (Bombus) species. PLoS ONE 14(1–13):e0218614. https://doi.org/10.1371/journal.pone.0218614
Frick S, Jaehne S (2013) Das Internetportal „ornitho.de“ - Instrument zum Artenmonitoring. Landschaftspflege Und Naturschutz in Thüringen 50:174–175
Flaccus G (2020) Bird-watching Soars Amid COVID-19 as Americans Head Outdoors. Associated Press. https://apnews.com/article/94a1ea5938943d8a70fe794e9f629b13
Gasparini M, Clarisó R, Brambilla M, Cabot J (2020) Participation inequality and the 90–9–1 principle in open source. In: Proceedings of the 16th International Symposium on Open Collaboration 6:1–7. https://doi.org/10.1145/3412569.3412582
Gedeon K, Eikhorst W, Fischer S, Flade M, Frick S, Geiersberger I et al (2014) Atlas Deutscher Brutvogelarten. Stiftung Vogelmonitoring Deutschland
Glutz von Blotzheim UN (1988) Handbuch der Vögel Mitteleuropas. Aula, Wiesbaden
Hochachka WM, Alonso H, Gutiérrez-Expósito C, Miller E, Johnston A (2021) Regional variation in the impacts of the COVID-19 pandemic on the quantity and quality of data collected by the project eBird. Biol Cons 254:1–41. https://doi.org/10.1016/j.biocon.2021.108974
Jäckel D, Mortega KG, Sturm U, Brockmeyer U, Khorramshahi O, Voigt-Heucke SL (2021) Opportunities and limitations: a comparative analysis of citizen science and expert recordings for bioacoustic research. PLoS ONE 16:1–25. https://doi.org/10.1371/journal.pone.0253763
Jäckel D, Mortega KG, Brockmeyer U, Lehmann GUC, Voigt-Heucke SL (2022) Unravelling the stability of nightingale song over time and space using open, citizen science and shared data. Front Ecol Evol 10:1–13. https://doi.org/10.3389/fevo.2022.778610
Kelling S, Johnston A, Hochachka WM, Iliff M, Fink D, Gerbracht J et al (2015) Can observation skills of citizen scientists be estimated using species accumulation curves? PLoS ONE 10:1–20. https://doi.org/10.1371/journal.pone.0139600
Kosmala M, Wiggins A, Swanson A, Simmons B (2016) Assessing data quality in citizen science. Front Ecol Environ 14:551–560. https://doi.org/10.1002/fee.1436
Land-Zandstra A, Agnello G, Gültekin YS (2021) Participants in citizen science, vol 13. Springer Nature, pp 243–259. https://doi.org/10.1007/978-3-030-58278-4_13
Lasseck M (2016) Improving bird identification using multiresolution template matching and feature selection during training. In CLEF (Working Notes), pp 490–501
Lasseck M (2018) Audio-based Bird species identification with deep convolutional neural networks. In: CEUR Workshop Proceedings, pp 1–11
Lepczyk CA (2005) Integrating published data and citizen science to describe bird diversity across a landscape. J Appl Ecol 42:672–677. https://doi.org/10.1111/j.1365-2664.2005.01059.x
Lee JH, Scott D (2004) Measuring birding specialization: a confirmatory factor analysis. Leis Sci 26:245–260. https://doi.org/10.1080/01490400490461387
Lewandowski E, Specht H (2015) Influence of volunteer and project characteristics on data quality of biological surveys. Conserv Biol 29:713–723. https://doi.org/10.1111/cobi.12481
Moczek N, Hecker S, Voigt-Heucke SL (2021) The known unknowns: what citizen science projects in Germany know about their volunteers—and what they don’t know. Sustainability 13:1–10. https://doi.org/10.3390/su132011553
Pandya RE, Dibner KA (2018) Learning through citizen science: enhancing opportunities by design. National Academies Press, Washington, D.C.
Peltola T, Arpin I (2018) Science for everybody? Bridging the socio-economic gap in urban biodiversity monitoring. UCL Press
Phillips TB, Parker A, Bowser A, Haklay M (2021) Publicly generated data: the role of citizen science for knowledge production, action, and public engagement. In: Closing the knowledge-implementation gap in conservation science. Wildlife Research Monographs, vol 4, Springer, Cham. https://doi.org/10.1007/978-3-030-81085-6_4
Planillo A, Fiechter L, Sturm U, Voigt-Heucke S, Kramer-Schadt S (2021) Citizen science data for urban planning: comparing different sampling schemes for modelling urban bird distribution. Landsc Urban Plan 211:713–715. https://doi.org/10.1016/j.landurbplan.2021.104098
Randler C (2021) Users of a citizen science platform for bird data collection differ from other birdwatchers in knowledge and degree of specialization. Glob Ecol Conserv 27:1–10. https://doi.org/10.1016/j.gecco.2021.e01580
Reeves N, Tinati R, Zerr S, Van Kleek MG, Simperl E (2017) From crowd to community: a survey of online community features in citizen science projects. In: Proceedings of the 2017 ACM Conference on computer supported cooperative work and social computing, pp 2137–2152
Rögele A, Scheiter K, Randler C (2022) Can involvement induced by guidance foster scientific reasoning and knowledge of participants of a citizen science project? Int J Sci Educ 12:1–17. https://doi.org/10.1080/21548455.2022.2043567
Roy HE, Pocock MJ, Preston CD, Roy DB, Savage J, Tweddle JC, Robinson LD (2012) Understanding citizen science and environmental monitoring: final report on behalf of UK Environmental Observation Framework. Wallingford, NERC/Centre for Ecology & Hydrology
Sánchez-Clavijo LM, Martínez-Callejas SJ, Acevedo-Charry O, Diaz-Pulido A, Gómez-Valencia B, Ocampo-Peñuela N et al (2021) Differential reporting of biodiversity in two citizen science platforms during COVID-19 lockdown in Colombia. Biol Cons 256:1–9. https://doi.org/10.1016/j.biocon.2021.109077
Searfoss AM, Liu WC, Creanza N (2020) Geographically well-distributed citizen science data reveals range-wide variation in the Chipping Sparrow’s simple song. Anim Behav 161:63–76. https://doi.org/10.1016/j.anbehav.2019.12.012
Segal A, Gal YA, Simpson RJ, Homsy V, Hartswood M, Page KR, Jirotka M (2015) Improving productivity in citizen science through controlled intervention. In: Proceedings of the 24th International Conference on World Wide Web, pp 331–337. https://doi.org/10.1145/2740908.2743051
Seymour V, Haklay M (2017) Exploring engagement characteristics and behaviours of environmental volunteers. Citiz Sci Theory Pract 2:1–13. https://doi.org/10.5334/cstp.66
Stehle M, Lasseck M, Khorramshahi O, Sturm U (2020) Evaluation of acoustic pattern recognition of nightingale (Luscinia megarhynchos) recordings by citizens. Res Ideas Outcomes 6:1–9. https://doi.org/10.3897/rio.6.e50233
Sturm U, Tscholl M (2019) The role of digital user feedback in a user-centred development process in citizen science. J Sci Commun 18:1–19. https://doi.org/10.22323/2.18010203
Sullivan BL, Wood CL, Iliff MJ, Bonney RE, Fink D, Kelling S (2009) eBird: a citizen-based bird observation network in the biological sciences. Biol Cons 142:2282–2292. https://doi.org/10.1016/j.biocon.2009.05.006
Venter ZS, Barton DN, Gundersen V, Figari H, Nowell MS (2021) Back to nature: Norwegians sustain increased recreational use of urban green space months after the COVID-19 outbreak. Landsc Urban Plan 214:1–10. https://doi.org/10.1016/j.landurbplan.2021.104175
Vokurková J, Petrusková T, Reifová R, Kozman A, Mořkovský L, Kipper S et al (2013) The causes and evolutionary consequences of mixed singing in two hybridizing songbird species (Luscinia spp.). PLoS ONE 8:1–10. https://doi.org/10.1371/journal.pone.0060172
Wittmann J, Girman D, Crocker D (2019) Using iNaturalist in a coverboard protocol to measure data quality: suggestions for project design. Citiz Sci Theory Pract 4:1–12. https://doi.org/10.5334/cstp.131
https://support.rstudio.com/hc/en-us/articles/206212048-Citing-RStudio

Acknowledgements

We thank our funders for financing the project: the BMBF (Förderkennzeichen: 01BF1709) as ‘Forschungsfall Nachtigall’ (FFN) during 2018 and 2019 and the MfN during 2020. We are grateful for the work of all our citizen scientists and of Nadja Tata, Alexander Buhl and Julia Rostin as part of the FFN. Furthermore, we thank the ‘Stadtnatur entdecken’ team for the development of the ‘Naturblick’ app and their ongoing support.

Funding

Open Access funding enabled and organized by Projekt DEAL.

Author information

Authors and Affiliations

Museum für Naturkunde Berlin, Leibniz Institute for Evolution and Biodiversity Science, Berlin, Germany
Denise Jäckel, Kim G. Mortega, Sarah Darwin, Ulrich Brockmeyer, Ulrike Sturm, Mario Lasseck & Silke L. Voigt-Heucke
Life Sciences Faculty, Humboldt-Universität zu Berlin, Berlin, Germany
Denise Jäckel
PLAN Institute for Architectural and Environmental Psychology, Berlin, Germany
Nicola Moczek
Evolutionary Ecology, Department of Biology, Humboldt-Universität zu Berlin, Berlin, Germany
Gerlind U. C. Lehmann
Animal Behaviour, Institute of Biology, Freie Universität Berlin, Berlin, Germany
Silke L. Voigt-Heucke

Contributions

Conceptualization: DJ, KGM, SLV-H; data curation: DJ; formal analysis: DJ, UB; project administration: DJ, KGM, US, SLV-H; software: ‘Naturblick’ (app); validation: DJ, KGM, SLV-H; visualization: UB, DJ; writing (original draft): DJ; writing (review & editing): KGM, SD, US, ML, NM, GUCL, SLV-H.

Corresponding author

Correspondence to Denise Jäckel.

Ethics declarations

Conflict of interest

This work was supported by BMBF (Förderkennzeichen: 01BF1709). The authors have no relevant financial or non-financial interests to disclose. The authors have no competing interests to declare that are relevant to the content of this article. All authors certify that they have no affiliations with or involvement in any organization or entity with any financial or non-financial interest in the subject matter or materials discussed in this manuscript. The authors have no financial or proprietary interests in any material discussed in this article.

Ethics approval

This study investigates the development of a citizen science project based on the nightingale song. The data of the citizen scientists were shared with our project with their approval via the ‘Naturblick’ app. For the recordings, we obtained the consent of participants to analyse their data. In Germany, the approval of an ethics committee is not required for such research questions and was therefore not obtained. We have thus received all the permissions required in Germany.

Additional information

Communicated by T. S. Osiejuk.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (XLSX 10 KB)

Supplementary file2 (XLSX 10 KB)

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Jäckel, D., Mortega, K.G., Darwin, S. et al. Community engagement and data quality: best practices and lessons learned from a citizen science project on birdsong. J Ornithol 164, 233–244 (2023). https://doi.org/10.1007/s10336-022-02018-8

Received: 13 May 2022
Revised: 23 August 2022
Accepted: 31 August 2022
Published: 13 October 2022
Issue Date: January 2023
DOI: https://doi.org/10.1007/s10336-022-02018-8
