Iberian Peninsula and Balearic Island Bathynellacea (Crustacea, Syncarida) database

Abstract This is the first published database of Bathynellacea. It includes all data on bathynellids (Crustacea, Bathynellacea) collected in the 64 years from 1949 to 2013 on the Iberian Peninsula and the Balearic Islands. The samples come from groundwater (caves, springs, wells and the hyporheic habitat associated with rivers) and were obtained both in sampling campaigns and in occasional sampling conducted throughout the Iberian Peninsula and the Balearic Islands. The dataset lists occurrence data on bathynellid distribution, sampling sites (with localities, county and geographic coordinates), taxonomic information (from family to species level) and sampling sources (collector and sampling dates) for all records. The descriptions of new species and the species identifications were carried out by an expert taxonomist (AIC) with 25 years of experience in the study of bathynellids (see references). Many of the sampling sites are type localities of species endemic to the Iberian Peninsula. The dataset includes 409 sample records corresponding to two families, 12 genera and 58 species, of which 42 are formally described and 16 are unpublished taxa; 47 samples are still under study. All species known from the study area are included; together they amount to nearly a quarter of the Bathynellacea species known worldwide (250 species).

Some of this information has never been published, and the rest is scattered across separate sources published over a long period of time, so we deemed it necessary to pool everything into a single dataset containing all the information available for each bathynell sample. The dataset is thus a significant contribution of basic information on Iberian Bathynellacea which, given the rarity of the species and their extreme habitat, can be useful for studies of subterranean biodiversity, ecology and conservation, as well as for Global Change estimations (the dataset includes sampling efforts in successive years). Our aims in publishing this dataset are 1) to provide information on the diversity and distribution of the Iberian and Macaronesic groundwater fauna, 2) to describe the Bathynellacea collection of AIC and the MNCN, and 3) to offer the scientific community the first dataset of Bathynellacea in the world, in the hope of encouraging other researchers to publish their groundwater fauna datasets.
Additional information: Section 2 of the bibliography lists the publications citing the bathynells included in this dataset. Table 3 includes information on all the new species of Bathynellacea described from 1986 to the present, including the catalogue number of the type series in the classic Crustacea collection of the MNCN and, where available, the vouchers of the Tissue and DNA Collection of the MNCN for the DNA extractions from specimens of type localities.
Study area description: The study area includes 195 sites throughout the Iberian Peninsula and the Balearic Islands, with sampling dates ranging from 1949 to 2013.
Most localities sampled are in karstic areas (Ayala Carcedo et al. 1986; García-Codrón 1983; Puch 1998). Sampling is always done in groundwater: caves, springs, wells and the interstitial environment of epigean rivers, where the stygobionts living in them can be collected. The general aim, apart from the specific objectives of each project, was to identify the Bathynellacea crustacean fauna inhabiting the subterranean waters of Spain and Portugal (Fauna Ibérica).
Design description: This dataset was developed to determine the current distribution patterns of bathynellid species at the scale of the Iberian Peninsula. It also contributes to the knowledge of groundwater biodiversity in the Iberian Peninsula and helps to identify endemic fauna at different geographic scales (country, counties and localities). Prior to digitisation, the pre-existing taxonomic identifications were reviewed by the specialist AIC. The dataset was exported to DarwinCore v1.2 format and uploaded to the IPT of the GBIF Spanish node (http://www.gbif.es:8080/ipt). The DarwinCore elements included in the dataset structure are listed in the dataset description section.

Data published through GBIF: http://www.gbif.es:8080/ipt/resource.do?r=mncn-aic

Taxonomic coverage
General taxonomic coverage description: This collection of Bathynellacea, a group of Crustacea Malacostraca, contains all the species known from Spain and Portugal as well as all the localities where bathynells have been found within the region considered. The collection includes all the material obtained in the Iberian Peninsula and the Balearic Islands except the samples collected between 1949 and 1968 in Portugal, which have been lost. Most of the collection is identified to species level. Samples that could not be identified to species level, because adult specimens or males were lacking, have been identified to genus or family level. We have found 12 genera belonging to two families (Table 1), Parabathynellidae (63.8% of the species and 68% of the records) and Bathynellidae (36.2% of the species and 32% of the records) (Figure 2A). In the Parabathynellidae family five genera have been identified: Iberobathynella Schminke, 1973 (22 species plus five unpublished, found in all habitats), Paraiberobathynella Camacho & Serban, 1998 (two species, found in wells and interstitial river banks), Hexaiberobathynella Camacho & Serban, 1998 (two species, found in wells and interstitial river banks), Guadalopebathynella Camacho & Serban, 1998 (one species, found in interstitial river banks) and Hexabathynella Schminke, 1972 (four species plus one unpublished, found in caves and interstitial river banks) (see Figure 2B).
In the Bathynellidae family seven genera have been identified: Vejdovskybathynella Serban & Leclerc, 1984 (four species plus three unpublished, found only in caves), Paradoxiclamousella Camacho et al., 2013 (two species plus two unpublished, found in caves, springs and interstitial river banks), Clamousella Serban, Coineau & Delamare Deboutteville, 1971 (three unpublished species, found in interstitial river banks), Hispanobathynella Serban, 1989 (one species in a cave), Bathynella Vejdovsky, 1882 (cf.) (four species incertae sedis, found in interstitial river banks and one cave), Bathynellidae gen. nov. 1 (unpublished genus and species, found in wells and interstitial river banks) and Bathynellidae gen. nov. 2 (unpublished genus and species, found in a cave) (Figure 2B). In addition there are 47 more samples, 33 of the Bathynellidae family and 14 of the Parabathynellidae family, still under study, a number of which probably belong to new genera. In summary, we have so far identified 58 species (16 unpublished), all endemic to Portugal and Spain. Twenty-seven of these have been described as new species only in recent years (see Table 3 and Reference List 2). The 16 species still pending formal description are also new to science. This dataset includes all species of Bathynellacea known from the study area, and nearly a quarter of all the species known worldwide (Camacho 2006, Camacho and Valdecasas 2008).

[Table 2, totals row: wells 32, interstitial 157, springs 11, caves 209; 409 records in total.]

[...] (93.1%) and only a small portion (3.9%) from Portugal (with 17 species registered) (Table 2 and Figure 3). The region with most samples and most species is Cantabria (23.2% of the records and 20 species), followed by Burgos (13.4% of the records and 11 species) and Asturias (11.7% of the records and 12 species); from all of Andalucía there are 37 records in total (9%) and from Aragón 21 records (5.1%), followed by only 4.2% of the records from Levante.
In the other provinces included in the dataset there are fewer than 3 records (Salamanca, Pontevedra, La Coruña, Álava, Lugo, Cuenca, Navarra, León, Gerona, Lérida and Vizcaya), while from Madrid there are 47 records, but these come from only 3 localities sampled many times and yielding only 3 different species.
Regarding the Balearic Islands, only 7 records are included (samples from caves), all from the island of Mallorca (1.7% of the records and only 1 species) (Figure 3). There are no records from the provinces of Zamora, Barcelona, Cáceres, Badajoz, Albacete, Segovia, Guipúzcoa and Logroño. Considering the habitats sampled, most records come from caves (51.1%), mainly from Cantabria (40.7%); interstitial epigean river banks account for 38.4%, mainly from Madrid and Portugal; a few records are from springs (2.7%), mainly from Cantabria; and 32 records are from wells, found mainly in Andalucía and Levante (see Table 2). The sample distribution by provinces and habitat can be seen in Figures 3 and 4 respectively.
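The habitat percentages quoted above can be cross-checked against the record counts. The following is a minimal Python sketch, assuming the per-habitat counts are those of the Table 2 totals (209 caves, 157 interstitial, 11 springs) together with the 32 well records stated in the text; the variable and function names are illustrative only.

```python
# Record counts per habitat. The 32 well records are stated in the text;
# the other three counts are assumed from the totals of Table 2.
habitat_records = {"caves": 209, "interstitial": 157, "springs": 11, "wells": 32}

total = sum(habitat_records.values())


def share(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {habitat: share(n, total) for habitat, n in habitat_records.items()}

for habitat, pct in shares.items():
    print(f"{habitat}: {habitat_records[habitat]} records ({pct}%)")
```

Under these assumptions the four counts add up to the 409 records of the dataset, and the shares for caves, interstitial habitats and springs agree with the 51.1%, 38.4% and 2.7% given in the text.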

Method step description:
The collection was digitised with MS Excel software, compatible with DarwinCore v1.2 or DarwinCore v1.4. Pre-digitisation phase: The identification of each specimen from each sample was reviewed recently; the correction of former imprecisions and the discovery of cryptic species (due, for example, to the use of molecular techniques) led to the modification of some records in the Excel file used as the starting point for this work. The initial files had few fields for each sampling site and sampling date (date, locality, province, habitat, collector and the species found, with data on the family, genus, species and author).
Digitisation phase: Starting from the initial Excel file, the standard fields for a DarwinCore v1.2 database were added as needed, and the geographical data (UTM coordinates) were included. Coordinates were either taken with a GPS when the samples were collected (PASCALIS samples and all those taken after the year 2000) or obtained from grey (speleological reports) or published (Notenboom and Meijers 1984; Puch 1998) literature (i.e., the precise GPS location of the entrance of the caves where bathynellid samples were collected), as well as from type specimens.
Creation of the dataset: The dataset was exported as a file in DarwinCore v1.2 format. The DarwinCore elements included in the dataset structure are listed in the dataset description section. A Darwin Core table was prepared from the original database project. The field-to-field mapping was fine-tuned with the support of GBIF-Spain's Coordination Unit. The resulting table was imported into the Darwin Test tool (http://www.gbif.es/darwin_test/Darwin_test_in.php, Ortega-Maqueda and Pando 2008). This tool allows detailed metadata to be added to the dataset and also performs a number of quality checks on the data (compliance of the dataset structure with Darwin Core, geographic consistency, date format, etc.; currently over sixty such checks are carried out). Once the potential errors flagged have been checked and corrected, a Darwin Core Archive is generated, also by the Darwin Test tool. The resulting DwC-A is then uploaded to GBIF-Spain's IPT installation (http://www.gbif.es:8080/ipt/). From there, the dataset is made public, registered in GBIF, and indexed and published by the GBIF data portal.
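As an illustration of the kind of checks mentioned above (structure, geographic consistency, date format), the following Python sketch validates a single occurrence record. It is not the Darwin Test tool and does not reproduce its checks: the field names follow current Darwin Core terms rather than the v1.2 names used in the dataset, and the bounding box, the required-field list and the function name are assumptions made for the example.

```python
from datetime import datetime

# Hypothetical bounding box (decimal degrees) roughly covering the Iberian
# Peninsula and the Balearic Islands; not taken from the dataset.
LAT_RANGE = (35.0, 44.5)
LON_RANGE = (-10.0, 5.0)
# Illustrative required fields, named after current Darwin Core terms.
REQUIRED = ("scientificName", "decimalLatitude", "decimalLongitude", "eventDate")


def check_record(record):
    """Return a list of problems found in one occurrence record (a dict)."""
    problems = [f"missing {field}" for field in REQUIRED if not record.get(field)]
    if problems:
        return problems
    # Geographic consistency: coordinates must be numeric and inside the box.
    try:
        lat = float(record["decimalLatitude"])
        lon = float(record["decimalLongitude"])
    except ValueError:
        problems.append("non-numeric coordinates")
    else:
        in_lat = LAT_RANGE[0] <= lat <= LAT_RANGE[1]
        in_lon = LON_RANGE[0] <= lon <= LON_RANGE[1]
        if not (in_lat and in_lon):
            problems.append("coordinates outside study area")
    # Date format: ISO YYYY-MM-DD, within the sampling period of the dataset.
    try:
        date = datetime.strptime(record["eventDate"], "%Y-%m-%d")
    except ValueError:
        problems.append("date not in YYYY-MM-DD format")
    else:
        if not 1949 <= date.year <= 2013:
            problems.append("date outside 1949-2013")
    return problems


# Two made-up records, used only to exercise the function.
good = {"scientificName": "Iberobathynella sp.", "decimalLatitude": "43.2",
        "decimalLongitude": "-4.0", "eventDate": "2003-07-15"}
bad = dict(good, decimalLatitude="53.2", eventDate="15/07/2003")
print(check_record(good))
print(check_record(bad))
```

Returning a list of problems per record, rather than stopping at the first one, mirrors the workflow described above, in which flagged errors are reviewed and corrected before the archive is generated.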
The dataset was transformed to a DarwinCore Archive format with metadata to ensure rapid discovery of this biodiversity resource and future publishing as a citable academic paper (see Chavan and Penev 2011).
Study extent description: This collection begins with the sampling campaigns by AIC in northern Spain for his doctoral thesis in 1983. Most of the data prior to 1976 are bibliographic (3.9%), although some samples studied by AIC were Bathynellacea obtained between 1976 and 1978 by R. Rouch et coll. (8.3%) in three short sampling trips to different areas of the Iberian Peninsula. In addition, from 1984 to 1986 Jos Notenboom, assisted by Ines Meijers and later by P. van der Hurk & R. Leys (1986), took groundwater samples throughout Spain (12.7%) looking for stygobiont amphipods for Notenboom's doctoral thesis, and all the Bathynellacea they found in these samples were also donated to AIC for study. In the following years AIC continued obtaining samples of this fauna throughout Spain in the framework of different research projects. It is worth noting the PASCALIS European project (2002-2004) (7.6%), in which AIC and his team conducted intensive sampling of groundwater fauna in the Cantabrian mountain ranges, an area where continuous sampling has been done since then together with C. Puch (65.3% of samples), substantially increasing the number of Bathynellacea records in Spain. The samples are mainly from the north of the Iberian Peninsula (Asturias, Cantabria and the north of Burgos; see Table 2 and Figures 1 and 3), although there is also a good representation of all the karstic areas of the Peninsula. The karstic areas of the Balearic Islands are still underrepresented (see geographic coverage section). The first sample recorded is from Portugal and was collected in 1949; the first bathynell from Spain dates from 1950 and was recorded in the Cueva de Genova (Genova cave) in Mallorca by the Romanian researchers Orghidan and Tabaccaru (Margalef 1951).
In the 1950s and 1960s bathynells were found occasionally in samples from Portugal, Andalucia and Mallorca; in the 1970s there were also few discoveries, and it is only from the 1980s onwards that most of the Bathynellacea samples of this dataset were found and studied. Figure 5 shows a graph of how the knowledge of bathynells has evolved over the last 70 years. Figure 6 shows the sampling effort in the Iberian Peninsula, expressed as the number of bathynell records included. The collection currently consists of over 409 samples with several thousand specimens and more than 2000 scientific preparations, among which are the type series of all the new species described. The specimens are deposited in both the Collection of Crustaceans and the Tissues and DNA Collection of the MNCN.
Sampling description: Material of this collection has been collected in four ways: 1) Samples collected by Rouch et coll. in two short sampling campaigns in the Iberian Peninsula (1976 and 1977), which have been studied by AIC. 2) Samples collected in the sampling campaigns of Jos Notenboom et coll. to the Iberian Peninsula in 1984, 1985 and 1986, within the framework of his PhD thesis. These samples have also been studied by AIC. In addition, some individual samples, with more or less extensive associated information, have been donated to AIC by fellow researchers (D. Jaume, A. Tinaut, J. Rodriguez, A. García-Valdecasas, P. Rodriguez, C. Boutin, E. Bello and C. Noreña).
The methods used in collecting this type of sample are described in Camacho 1992 and 1994. The samples are fixed in the field in formalin 4% or ethanol 96°, or are frozen. Each sample collected is studied under a binocular microscope in order to isolate the bathynellid specimens found.
The specimens used for morphological study are stored in alcohol (70%). The specimens used for molecular study are directly frozen at -80 °C. A complete dissection of all anatomical parts of the specimens of the type series is necessary for taxonomic study. The permanent preparations include the dissections together with entire specimens kept in special metal slides, using glycerine gelatine stained with methylene blue as the mounting medium. Anatomical examinations are performed using an oil immersion lens (100×) of an interference microscope.
The specific techniques used in molecular analysis for taxonomic application are detailed in Camacho et al. 2011, 2012 and 2013a.
Quality control description: The reliability and consistency of the systematics are backed by the experience in Bathynellacea taxonomy of AIC, who made all the identifications. Recently, the identifications have also been confirmed with molecular data. The validation and cleaning of the associated geographical information was carried out in several steps as a key part of the digitisation process.

Dataset description
Object name: Darwin Core Archive Iberian Peninsula and Balearic Island Bathynellacea (Crustacea, Syncarida) database
Character encoding: UTF-8
Format name: Darwin Core Archive format
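A Darwin Core Archive is a zip file containing one or more delimited text tables plus descriptor and metadata files, so its core table can be read with standard tools. The Python sketch below is a hedged example, not a description of this particular archive: the core file name `occurrence.txt`, the tab delimiter, the column names and the single row of placeholder data are all assumptions, and the tiny archive is built in memory so that the example runs without downloading anything. In a real archive the descriptor file names the core table and its delimiter.

```python
import csv
import io
import zipfile


def read_core_rows(archive_bytes, core_name="occurrence.txt"):
    """Read a tab-delimited core table from a zipped archive into dicts.

    core_name is an assumed file name; check the archive's descriptor
    for the actual one.
    """
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        with archive.open(core_name) as raw:
            # Decode as UTF-8, the character encoding stated for the dataset.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter="\t"))


# Build a stand-in archive in memory; the row is placeholder data,
# not a real record from the dataset.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "occurrence.txt",
        "scientificName\tlocality\nIberobathynella sp.\tplaceholder locality\n",
    )

rows = read_core_rows(buffer.getvalue())
print(len(rows), rows[0]["scientificName"])
```

To use it on a downloaded archive, the file's bytes would be passed in place of the in-memory buffer, for example `read_core_rows(open(path, "rb").read())` with a hypothetical local `path`.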