Southern Ocean Echinoids database – An updated version of Antarctic, Sub-Antarctic and cold temperate echinoid database

Abstract This database includes over 7,100 georeferenced occurrence records of sea urchins (Echinodermata: Echinoidea) obtained from samples collected in the Southern Ocean (+180°W/+180°E; -35°/-78°S) during oceanographic cruises led over 150 years, from 1872 to 2015. Echinoids are common organisms of Southern Ocean benthic communities. A total of 201 species is recorded, which display contrasting depth ranges and distribution patterns across austral provinces and bioregions. Echinoid species show various ecological traits including different nutrition and reproductive strategies. Information on taxonomy, sampling sites, and sampling sources are also made available. Environmental descriptors that are relevant to echinoid ecology are also made available for the study area (-180°W/+180°E; -45°/-78°S) and for the following decades: 1955–1964, 1965–1974, 1975–1984, 1985–1994 and 1995–2012. They were compiled from different sources and transformed to the same grid cell resolution of 0.1° per pixel. We also provide future projections for environmental descriptors established based on the Bio-Oracle database (Tyberghein et al. 2012).


Project title: Species distribution modelling of Echinoids in the Southern Ocean
Personnel: Salomé Fabri-Ruiz, Thomas Saucède, Bruno Danis, Bruno David Funding SF-R's work is supported by a PhD grant from French Ministry of Higher Education and Research. The project is a contribution to Biogeosciences laboratory (CNRS UMR6292) and contribution #15 to the vERSO project (www.versoproject.be) funded by the Belgian Science Policy Office (BELSPO, contract n°BR/132/A1/vERSO).

Study area descriptions/ descriptors
The study area extends from the Antarctic continent in the south to 35° S latitude to the north; it comprises the sub-Polar, Antarctic, Polar Frontal, and sub-Antarctic zones. The Southern Ocean is characterized by unique oceanographic features mainly including an unusually deep continental shelf ranging from 450 m to 1000 m depth (Clarke and Johnston 2003), and the Antarctic Circumpolar Current (ACC), the strongest and largest current of the planet that flows clockwise from west to east around Antarctica (Barker and Thomas 2004) and conditions marine species dispersal (Griffiths et al. 2009). Four major marine fronts are distributed from north to south : Subtropical Front (STF), Sub-Antarctic Front (SAF), Polar Front (PF), Antarctic Divergence (AD), and separate water masses of different physical and biotic properties (Sokolov andRintoul 2002, Roquet et al. 2009). One of these major fronts is the Polar Front that acts as a biogeographic barrier to the dispersal of many invertebrates between sub-Antarctic and Antarctic waters (Koubbi 1993, Clarke et al. 2005.

Design description
Nowadays, ecological niche modelling is commonly used in macroecological and biogeographic studies to enhance mapping and understanding of species distribution patterns. Models also constitute useful tools for marine area management purposes (Sánchez-Carnero et al. 2016), predicting invasive species distribution (Václavík and Meentemeyer 2012), identifying biodiversity hot spots and highlighting potential impacts of climate change on species distribution (Elith and Leathwick 2009). Extensive and consistent databases are essential to biogeographic studies to explore species distribution patterns in the Southern Ocean (De Broyer and Koubbi 2014). Reliability and robustness of distribution models are mainly conditioned by the quality and accuracy of occurrence data (Graham et al. 2007, Lobo 2008, Osborne and Leitão 2009. With this in mind, the creation of SCAR-Marbin in 2005(Griffiths et al. 2011) and RAMS in 2010(De Broyer and Danis 2011 allowed the first Antarctic marine biodiversity data compilation.
Objectives of our project are to produce robust and reliable species distribution models at the scale of the Southern Ocean, an area where distribution data are very heterogeneous and sampling gaps frequent.
This requires consistent and comprehensive datasets. For this purpose, an extensive echinoid occurrence dataset was compiled, updated, and checked for accuracy. This dataset is presented here.
Taxonomic information was updated according to the most recent literature. For example, Sterechinus bernasconiae Larrain, 1975 is now considered a junior synonym of Gracilechinus multidentatus   (Saucède et al. 2015). We checked for taxonomic accuracy using the World Echinoidea database (Kroh and Mooi 2017) and experts knowledge. However, mentions of former species identifications are kept in the dataset and clearly distinguished from updated taxonomy.
The dataset includes historical data sampled in the Southern Ocean over a century and a half from the Challenger expedition to the most recent oceanographic campaigns led on the Kerguelen Plateau, in Adelie Land and around the Antarctic Peninsula ( Figure 1). All compiled georeferenced locations were scanned and checked for accuracy.
DatasetName links the origins of occurrence records which are from academic collections (British Antarctic Survey Collection, Burgundy University collection, …), published articles, former databases , Pierrat et al. 2012 or cruise reports.
In order to quantify sampling effort, a 3° by 3° cell grid was shaped (Clarke et al. 2007) and each record of the database was assigned to a grid cell. Following Griffiths et al. (2011) sample and species numbers were both counted for each grid cell using ArcGIS v10.2 (ESRI 2011) and Microsoft Access (2013).
We also provide oceanographic features as environmental maps for physical and abiotic parameters that are relevant to echinoid ecology. Environmental data come from the World Ocean Circulation Experiment 2013 database and depth data come from ETOPO1 (Amante and Eakins 2009). Cell resolution was set up at 0.1 degree with the R 3.3.0 software. These data needed to be corrected for precise depth accuracy, which was performed using ArcGIS and following the protocol proposed by Guillaumot et al. (2016). A seafloor temperature layer was generated based on available temperatures for multiple depth layers of the water column. However, due to missing data, some values were interpolated using the nearest neighbour method with Arctoolbox (ESRI 2011).

Sampling effort and data description
The database includes more than 7,100 georeferenced records ( Figure 1). It is an updated version of the former database "Antarctic, Sub-Antarctic and cold temperate echinoid database" (Pierrat et al. 2012) that contains 1,000 additionnal records compared to Pierrat et al. 2012. This new version includes new records from the most recent oceanographic campaigns led in the Southern Ocean (e.g. POKER II, PRO- Sampling effort has long been heterogeneous in the Southern Ocean. It has been the highest along the Antarctic Peninsula and off New Zealand (>200 samples), two areas characterized by a high species number (25-30) (Figure 2a, 2b). In contrast, the number of species remains low (2-5 species) in the region of the Kerguelen Plateau while it has been intensively sampled as well (POKER 2 and PROTEKER cruises).
Our knowledge of genus and species distributions is strongly biased by the quality of sampling effort. Figure 3 highlights the link between the number of samples available and the recorded number of species and genera per grid cell.
Several areas have been little sampled including the waters close to the sea ice margin and deep oceanic basins, most records being concentrated in the first 400 meters ( Figure 4) and in the vicinity of scientific stations like in the north of the Kerguelen Plateau, in Adelie Land or along the Antarctic Peninsula. Conversely, the South Kerguelen Plateau and the west of the Ross Sea have been little explored. These under-sampled parts of the Southern Ocean constitute challenging areas for future

Latitudinal gradient
Main biogeographic features of Southern Ocean echinoids is a constant decrease of genus richness southward whereas species richness decreases from 35°S to 60°S, increases from 60°s to 65°S, then decreases again southward until 70°S (Figure 5). Such a pattern has already been published for Southern Ocean echinoids (Saucède et al. 2014) and herein supported by new data addition. The high number of species recorded between 60°S and 65°S could be due to the high sampling effort devoted to the region of the Antarctic Peninsula (Figure 2a-b) while conversely, sampling effort decreases southward until 70°S.

Environmental data
Environmental data were compiled from the following sources: Smith and Sandwell 1997, Boyer et al. 2013, Douglass et al. 2014. Environmental data are provided in raster format (Fabri-Ruiz et al, 2017). Mean surface temperature, mean seafloor temperature, mean surface salinity, and their respective amplitudes (winter minus summer averages) were calculated for the following decades: [1955 to 1964], [1965 to 1974], [1975 to 1984], [1985 to 1994] and [1995 to 2012]. Future projections are provided for mean surface temperature and salinity and for different IPCC scenarios (A2, A1B, B1) (IPCC, 5 th ) they were downloaded from the Bio-Oracle database ).

General taxonomic coverage
The database includes occurrence records of all echinoid species reported in the Southern Ocean from the Antarctic continent to 35°S latitude. Echinoids are common organisms of Southern Ocean benthic communities. They have contrasting depth ranges and distribution patterns across austral provinces and bioregions, ranging from coastal areas to the abyssal zone. Echinoid species show various ecological traits including different nutrition and reproductive strategies. In total, 201 species belonging to 31 families were recorded. Many of them are endemic to the Southern Ocean.

Environmental parameters description
Environmental descriptors for the Southern Ocean were compiled from various sources but most of them come from the World Ocean Atlas (Boyer et al. 2013) for current parameters. Available data are mean surface temperature, mean seafloor temperature, mean surface salinity and their respective amplitudes (winter minus summer averages) were calculated for the following decades: [1955 to 1964], [1965 to 1974], [1975 to 1984], [1985 to 1994] and [1995 to 2012]. Future projections are provided for mean surface temperature and salinity and for different IPCC scenarios (A2, A1B, B1) (IPCC 5 th ); they were downloaded from the Bio-Oracle database ).