Research Article |
Corresponding author: Jonathan Silvertown ( jonathan.silvertown@ed.ac.uk ) Academic editor: Vincent Smith
© 2015 Jonathan Silvertown, Martin Harvey, Richard Greenwood, Mike Dodd, Jon Rosewell, Tony Rebelo, Janice Ansine, Kevin McConway.
This is an open access article distributed under the terms of the Creative Commons Attribution License (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Citation:
Silvertown J, Harvey M, Greenwood R, Dodd M, Rosewell J, Rebelo T, Ansine J, McConway K (2015) Crowdsourcing the identification of organisms: A case-study of iSpot. ZooKeys 480: 125-146. https://doi.org/10.3897/zookeys.480.8803
|
Accurate species identification is fundamental to biodiversity science, but the natural history skills required for this are neglected in formal education at all levels. In this paper we describe how the web application ispotnature.org and its sister site ispot.org.za (collectively, “iSpot”) are helping to solve this problem by combining learning technology with crowdsourcing to connect beginners with experts. Over 94% of observations submitted to iSpot receive a determination. External checking of a sample of 3,287 iSpot records verified > 92% of them. To mid 2014, iSpot crowdsourced the identification of 30,000 taxa (>80% at species level) in > 390,000 observations with a global community numbering > 42,000 registered participants. More than half the observations on ispotnature.org were named within an hour of submission. iSpot uses a unique, 9-dimensional reputation system to motivate and reward participants and to verify determinations. Taxon-specific reputation points are earned when a participant proposes an identification that achieves agreement from other participants, weighted by the agreers’ own reputation scores for the taxon. This system is able to discriminate effectively between competing determinations when two or more are proposed for the same observation. In 57% of such cases the reputation system improved the accuracy of the determination, while in the remainder it either improved precision (e.g. by adding a species name to a genus) or revealed false precision, for example where a determination to species level was not supported by the available evidence. We propose that the success of iSpot arises from the structure of its social network that efficiently connects beginners and experts, overcoming the social as well as geographic barriers that normally separate the two.
Biodiversity, Citizen Science, Crowdsourcing, Identification, Learning, Learning design, social networking
All fields of biodiversity science depend upon the fundamental requirement that the species under study must be correctly identified. Although biodiversity science has experienced explosive growth due to the advent of sophisticated tools such as next generation sequencing that can characterise microbiomes containing thousands of operational taxonomic units (
Crowdsourcing is the practice of seeking the execution of a task or the solution to a problem through an open invitation for anyone to participate (
iSpot crowdsources the identification of organisms shown in photographs submitted by participants to an online social network designed and dedicated to the purpose. Facebook, Twitter and Flickr are all widely used for identification (
iSpot is explicitly designed around a social network structure suited to verifiable identification and learning. Three different types of social network structure are schematically represented in Figure
Schematic of three network structures, from equal status, to expert status and a hybrid system. (Icons http://www.icons-land.com)
Fig.
The third type of social network structure (Fig.
The hybrid social network structure of iSpot shown schematically in Fig.
The conceptual social network structure of iSpot, showing the group (3 of the 8) compartmentalization and its interaction. Not shown is the learner-mentor interaction within each group. (Icons http://www.icons-land.com)
Formal network analysis of iSpot will be the subject of a future publication, but a preliminary insight into the iSpot social network can be obtained Fig.
The network linking participants who posted observations to iSpotnature.org without an identification and those providing a likely identification for those observations. The diagram is based on the sample of 5,000 identifications made up until 1 July 2014. This activity occurred over 32 days. The network contains 1,110 nodes linked by 2,876 edges. Red nodes (83.23%) are participants who only received identifications, green nodes (6.38%) are participants who only made identifications and blue nodes (10.39%) are participants who both made and received identifications.
A different perspective on the social network is provided by the map shown in Fig.
A map of the social network of a single participant who contributed about 500 observations to iSpotnature.org at the locations shown by blue dots, with other iSpot participants, living at locations shown by red stars, who provided determinations for those observations. The location of the observation and the location of the person identifying it are joined by a line.
The threefold requirement to identify experts, verify identifications and involve beginners in the social network is achieved through iSpot’s purpose-designed reputation system. There is a large and diffuse literature about systems for establishing reputation and trust in online social networks (
In the specific instance of designing a reputation system for online identification of organisms we have two advantages that do not apply in e-commerce. Firstly, high reputation in the online social network can be pre-assigned (seeded) to designated individuals who have already earned expert status in identification in an offline context. Secondly, social interactions in iSpot are mediated by a social object, the observation, and are not purely dependent on social relationships (
The iSpot reputation system works as follows. Each participant has a 9-dimensional reputation (Fig.
iSpot attaches badges to usernames when displayed on screen to signal a participant’s reputation, with a different icon for each of the 8 groups. For example a stylised bird indicates bird reputation, a butterfly indicates invertebrate reputation and so on. A participant may have up to five icons in any group, with the number of icons awarded being scaled geometrically with the numerical reputation score that has been earned. Only a low score is required to earn one bird icon, but a very high cumulative score is required to earn five. A participant’s full, 9-dimensional reputation is visible on their profile page (Fig.
When an iSpot participant posts an observation they are required to select which of the 8 groups it belongs to and they can also add the name of the organism if they think they know it. In order to encourage participants to utilize and hence test whatever level of knowledge they may have, iSpot allows a choice of three levels of confidence when a name is proposed. Where we have them, region-specific dictionaries of scientific and vernacular names are used to check that proferred names are correctly spelled and are current. Where we do not have a local dictionary, we use the global Catalogue of Life (“Catalogue of Life http://www.catalogueoflife.org/”). A participant who is suggesting a name must select between: “I’m as sure as I can be”, “It’s likely to be this but I can’t be certain” and “It might be this”.
The operation of the reputation system is illustrated in Fig.
We have evaluated the characteristics of the iSpot system against the reference model for online reputation systems proposed by Vavilis et al. (
The characteristics of the iSpot reputation system evaluated against the requirements for online reputation systems proposed by Vavilis et al. (
ID | Requirement | iSpot reputation system |
---|---|---|
R1 | Ratings should discriminate user behaviour. | Ratings are algorithmically awarded based upon weighted agreements and discriminate between user activity in different taxa (groups). |
R2 | Reputation should discriminate user behaviour. | Reputation is a scaled function of ratings and hence discriminates user behaviour. |
R3 | The reputation system should be able to discriminate “incorrect” ratings. | The reputation system only awards a Likely ID when its threshold is crossed and it selects better supported over less well supported names when more than one is proposed. |
R4 | An entity should not be able to provide rating for itself. | Users proposing a name cannot ‘agree’ with themselves. |
R5 | Aggregation of ratings should be meaningful. | Multiple agreements are the norm and represent the ratings of different users. There is no incentive for gratuitous agreement. |
R6 | Reputation should be assessed using a sufficient amount of information. | Reputation is earned from all agreements and is cumulative within groups. |
R7 | The reputation system should differentiate reputation information by the interaction it represents. | The 8 dimensions of the reputation system differentiate users by their taxonomic field of expertise. Social points are earned and aggregated separately from ID-based reputations. |
R8 | Reputation should capture the evolution of user behavior. | Reputation, earned by contributing correct identifications is dynamic, but it can only increase. |
R9 | Users should not gain advantage of their new status. | This requirement is contrary to the learning principle utilized in iSpot, which is that new users need to be encouraged with early rewards. In iSpot, it is easier to earn the first one or two reputation badges in a taxon than later ones. However, although it is easier to gain early status, relative ranks of users can only be changed by making valued contributions. |
R10 | New users should not be penalized for their status. | New users are encouraged (See R9). |
R11 | Users should not be able to directly modify ratings. | Users cannot directly modify their ratings. |
R12 | Users should not be able to directly modify reputation values. | Users cannot directly modify their reputation. |
R13 | Users should not be responsible for directly calculating their own reputation | Reputation is calculated algorithmically, not by users themselves. |
The Vavilis et al. model does not rate reputation systems directly according to how easy they are for participants to game, although defence against gaming is an important requirement. The iSpot system is quite resistant to gaming for a number of reasons.
A sybil attack is when a participant enhances their reputation by creating fake accounts that give their primary account positive feedback. Here too, there is no real incentive in iSpot for a participant to do this. The fake accounts would start with zero reputation and their influence on the Likely IDs won by the primary account would therefore be negligible. High taxonomic reputation can only be earned by consistently proposing names that attract the agreement of highly-ranked participants.
It is possible for a participant to submit fake, duplicate or ‘stolen’ observations, but reputation would only be earned from these if they were correctly named and agreed-with. In practice we have not seen this happen, probably because such behaviour would quickly become apparent to other iSpot participants and would therefore be self-defeating.
The two iSpot websites between them received more than 2.2 million visits from nearly 900,000 unique visitors (Table
Cumulative usage statistics up until 30 June 2014 for the two iSpot platforms www.ispotnature.org (mainly UK observations, launched June 2009) and iSpot Southern Africa www.ispot.org.za (launched June 2012). Data available from the Dryad Digital Repository: http://doi.org/10.5061/dryad.r0005
Number of | UK | % | ZA | % | Total | % |
---|---|---|---|---|---|---|
Unique visitors | 724,272 | - | 166,936 |
- | 891,208 | - |
Participants registered | 35,988 | - | 6,455 | - | 42,443 | - |
Observations uploaded | 262,942 | - | 127,222 | - | 390,164 | - |
Determinations made | 322,076 | - | 159,675 | - | 481,751 | - |
Agreements given | 1,020,539 | - | 241,431 | - | 1,261,970 | - |
Images uploaded | 423,501 | - | 287,776 | - | 711,277 | - |
Registered participants adding an observation | 13,644 | 37.65% | 1,895 | 29% | 15,539 | 37% |
Registered participants adding a determination | 8,408 | 23.20% | 1,555 | 24% | 9,963 | 23% |
Registered participants adding an agreement | 5,870 | 16.20% | 1,087 | 17% | 6,957 | 16% |
As at June 30th 2014, iSpot had a global total of more than 42,000 participants (Table
a iSpot participants who made at least one observation, ranked on the horizontal axis by the number of observations each made (shown on the vertical axis). b iSpot participants who made at least one determination ranked on the horizontal axis by the number of identifications each made. n = 201,711 observations on ispotnature.org for both.
The majority of observations (85%) submitted to ispotnature.org came from the UK, but the remainder were observed in 138 other countries or territories (Fig.
The global distribution of observations made on ispotnature.org and ispot.org.za up to December 2013.
How well does iSpot perform its core function of attaching verified names (determinations) to observations and how scaleable is this? Fig.
For ispotnature.org in the 4-year period shown: a The percentage of observations submitted each month that received a likely ID and the percentage where an expert provided or agreed the determination b The number of observations submitted per month.
The bar chart in Fig.
Of 109, 447 observations submitted to ispot.org.za, 95% received a name. Across all taxa, 83% were determined at species level, but this percentage varied from a high of 97% for mammals (n = 3,242) and 98% for herptiles (n = 5,513) to only 52% for fungi & lichens (n = 1,961) and 45% for insects (n = 15,590). The remaining determinations in the latter groups were made at the levels of genus, family or order.
To obtain an independent check on the names acquired by iSpot observations we submitted 46,736 observations of plants and moths made in the UK to the biological recording website iRecord (http://www.brc.ac.uk/irecord) run by the UK Biological Records Centre. Because of the volume of records submitted, only about 8% of them, a sample of 2,234 plants and 1,053 moths, were examined by experts in iRecord, but of these 94% of the plant records and 92% of the moths were verified.
iSpot is not only efficient in providing verified names, but it is also fast. Over half the UK observations submitted without a name were determined with one hour and 88% of such observations acquired a likely ID within 24 hours (Fig.
The cumulative frequency distribution for the time taken for a likely ID to be acquired by observations submitted to ispotnature.org without an organism name. n = 100,703 observations.
iSpot relies on photographs for identification. How accurate and precise are the determinations made this way? To answer this question we analysed in greater detail the 156,491 British observations made during the three years 2011–2013 that received at least one identification using a name that is present in the UK Species Inventory (UKSI) (“UK Species Inventory” 2014). Likely ID status is the only measure we have of accuracy, but precision can be represented by the taxonomic rank at which the Likely ID was made. There were 10, 672 taxa recorded in the whole sample, 80% at species rank, 14.5% at genus, 4% at family, 0.7% at order and 0.2% at phylum rank.
In 85% of the sample (132,835 observations), the likely ID was the only name submitted. Among the remaining 15% of observations (n = 23,606), the majority (21,613), comprising 13.81% of the whole sample, had two UKSI names proposed. Three names were proposed in 1,846 cases, four names in 135 cases and 12 observations had five names. All proposed names remain visible on iSpot and cannot be removed by the proposer. This preserves the discourse that takes place around contentious observations, providing a useful demonstration, especially for beginners, of how names may be arrived at.
Taking just the sample of 14,611 observations where there were two names and the alternative name as well as the Likely ID were present in the UKSI, we were able to calculate improvements to the accuracy and precision of the Likely ID compared with the offered alternative. An increase in accuracy without a change in precision occurred when the Likely ID and its alternative had the same taxonomic rank, for example when Pieris napi the Green-veined White butterfly was the likely ID and the alternative name suggested was another butterfly in the same genus, P. rapae, the Small White. This particular confusion occurred 38 times out of 251 P. rapae IDs, while the reverse mistake occurred only 3 times out of 296 P. napi IDs. In other words, though the two butterflies were reported with similar frequency, the Green-veined white was mistaken for the Small White about ten times as often as the reverse. The other common British species in this genus is the Large white, Pieris brassicae (228 IDs) which was mistaken for P. rapae 15 times, but for P. napi only twice.
In the two-name sample as a whole (Table
Taxonomic ranks of Likely IDs and of alternative names for a sample of 14,611 observations of UK species submitted to iSpot.
Rank of alternative name | |||||||
---|---|---|---|---|---|---|---|
Rank of Likely ID | Phylum | Class | Order | Family | Genus | Species | Total |
Phylum | 1 | 1 | - | 2 | 1 | 8 | 13 |
Class | 2 | 3 | 1 | 6 | 12 | 51 | 75 |
Order | - | 5 | 14 | 29 | 40 | 118 | 206 |
Family | 4 | 10 | 68 | 86 | 128 | 441 | 737 |
Genus | 4 | 18 | 118 | 200 | 295 | 1,807 | 2,442 |
Species | 26 | 113 | 388 | 737 | 1,985 | 7,889 | 11,138 |
Total | 37 | 150 | 589 | 1,060 | 2,461 | 10,314 | 14,611 |
Percentage of changes in taxonomic rank for the data shown in Table
Initial taxonomic rank | |||||||
---|---|---|---|---|---|---|---|
Change in taxonomic rank (Inference) | Phylum | Class | Order | Family | Genus | Species | All ranks |
Up (False precision in the alternative name) | - | 1% | 0% | 3% | 7% | 24% | 18% |
None (Improved accuracy in the Likely ID) | 3% | 2% | 2% | 8% | 12% | 76% | 57% |
Down (Improved precision in the Likely ID) | 97% | 97% | 97% | 88% | 81% | - | 25% |
We defined a change in the precision with which a Likely ID was made by comparing its rank to the rank of its alternative. An increase in precision occured when the rank of the Likely ID was lower than the rank of its alternative. For example when the Likely ID was Bombus lucorum and the alternative was Bombus sp. Improvements in precision from genus to species occurred in 81% of cases (Table
The reverse relationship of ranks, with the rank of the Likely ID higher than the alternative, indicated that the alternative determination had been made with false precision. For example, there were 11 cases where the name Bibio marci (St. Mark’s fly) was proposed, but where the Likely ID was Bibio sp. because the precise species of the genus Bibio could not be determined. In the whole dataset, 18% of comparisons were of this type (Table
Commentators on the progress of science tend to emphasize the impact of technological advances in hardware and with good reason. Particle accelerators, genome sequencers, satellites, MRI scanners and fast computers, to name just a handful of obvious examples, have been revolutionary. However, as Nielsen has argued in Reinventing Discovery (
The results reported here are the fruit of six years of community-building using both online and face-to-face methods in the UK (
iSpot is not the first or the biggest source of identification on the internet, but we believe that this is the first report of how effective the process can be in a social network that has been specifically designed for the purpose. As such, iSpot is not only a source of identification and learning, but also a laboratory in which we can discover how to improve both. For example, there is valuable information in the mistakes that people make. In our analysis of accuracy and precision, we found that the mistakes made among the 3 common Pieris butterflies were not symmetrical, suggesting that lack of familiarity with all the species, particularly P. napi, rather than their similarity was the source of the difficulty that naive participants had in identifying these species. This kind of quantitative information will be used in future to improve the identfication tools available in iSpot. Although the sample size we analysed here was large (n = 23,606), it covered only a small proportion (15%) of observations and therefore many taxa of interest will have been omitted. A recently introduced feature is the iSpot quiz which draws content from our entire database of observations. In future we will be able to use data derived from mistakes that people make in the quiz to extend our analysis of identification errors to a much higher proportion of species.
While we have anecdotal evidence from comments made by participants that they have learned, we do not have direct, quantitative evidence of learning in iSpot yet. However, we do know from a previous analysis of 400 participants’ behaviour that they provided determinations for fewer than 40% of their very first observations, but that they themselves determined more than 60% of their 50th observations (
We also have evidence that the effective operation of iSpot is becoming less dependent upon the limited number of experts whom we seeded with high reputations at the start. In January 2010, designated experts were contributing to 60% of determinations, but four years later 60% of observations were being determined without their input (Fig.
The iSpot reputation system achieves its three goals of identifying expertise, verifying identifications of organisms and involving beginners. It is resistant to gaming and performs well against generic requirements (Table
Firstly, although all agreements are used in computing Likely IDs, those contributed by highly ranked and expert participants have much greater weight than those given by beginners. This makes the system robust to mistakes by beginners who will be the greatest source of errors, while still allowing participants of all abilities to contribute and to be involved. Secondly, higher-ranked and expert participants tend to guard their reputation by exercising caution, so while their agreements carry considerable weight they are also less prone to error. Higher ranked and expert participants are more aware of alternative taxa and problematic groups and tend to retreat to higher level identifications, thus providing the most appropriate degree of taxonomic precision and preserving the kudos in their reputation (though in fact iSpot reputation itself cannot be lost). iSpot also facilitates caution by attaching three levels of confidence when names are proposed and by providing a comments thread in every observation where experts can explain in detail why an organism cannot be x, why it could be y or z and how to tell y and z apart. As soon as an observation acquires a name, iSpot displays thumbnails of other observations of the species in a carousel on the page. This provides a visual check, so that misidentified observations are often immediately apparent. Participants quickly learn to both check for a match with the other observations and to be more cautious.
Thirdly, we suggest that the low granularity of the reputation classification in taxonomic and geographical space is supplemented by additional, hidden granularity engendered by the small-world properties of the social network. In a small-world network, people who are unknown to one another are linked by mutual acquaintances and the distance between any two nodes scales as the log of the number of nodes (
We have concluded that the strength of the iSpot reputation system lies in its combination of apparent simplicity with hidden complexity. It follows from this that the way to improve the system is not to make it more complicated for the user by increasing the number of dimensions of reputation, but to make the underlying source of a participant’s reputation more visible. We are working on intuitive ways of doing this.
Finally, we should mention that although we did not initially set out to create a reference database for species identification, that is increasingly what iSpot provides. We know from feedback that visitors, including environmental impact assessors, conservation officials, horticulturists, landscapers and even expert taxonomists, regularly use iSpot to check identifications. As iSpot continues to expand with new communities around the world, we expect to teach, to learn and to discover more.
We are indebted to the team who assisted RG in developing the iSpot software: Richard Lovelock, Nick Freear, Kevin McLeod, Greg Roach, Phil Downs and Will Woods. We are grateful to the tens of thousands of members of the public who have contributed observations and determinations to iSpot. We thank Muki Haklay, Michael Pocock, John Tweddle, Eileen Scanlon, Mike Sharples and Will Woods for comments on the manuscript. We acknowledge funding from the Big Lottery Fund for England (through OPAL), British Ecological Society, Garfield Weston Foundation, South African National Biodiversity Institute, Wolfson Foundation (through the OpenScience Laboratory), British Council and The Open University.