Intraspecific variation and phylogeography of the millipede model organism, the Black Pill Millipede Glomerismarginata (Villers, 1789) (Diplopoda, Glomerida, Glomeridae)

Abstract The Black Pill Millipede, Glomerismarginata, is the best studied millipede species and a model organism for Diplopoda. Glomerismarginata is widespread, with numerous colour morphs occurring across its range, especially in the south. This study investigates whether colour morphs might represent cryptic species as well as the haplotype diversity and biogeography of G.marginata. The results of the COI barcoding fragment analysis include 97 G.marginata, as well as 21 specimens from seven potentially related species: G.intermedia Latzel, 1884, G.klugii Brandt, 1833 (G.undulata C.L. Koch, 1844), G.connexa Koch, 1847, G.hexasticha Brandt, 1833, G.maerens Attems, 1927, G.annulata Brandt, 1833 and G.apuana Verhoeff, 1911. The majority of the barcoding data was obtained through the German Barcode of Life project (GBOL). Interspecifically, G.marginata is separated from its congeners by a minimum uncorrected genetic distance of 12.9 %, confirming its monophyly. Uncorrected intraspecific distances of G.marginata are comparable to those of other widespread Glomeris species, varying between 0–4.7%, with the largest genetic distances (>2.5 %) found at the Mediterranean coast. 97 sampled specimens of G.marginata yielded 47 different haplotypes, with identical haplotypes occurring at large distances from one another, and different haplotypes being present in populations occurring in close proximity. The highest number of haplotypes was found in the best-sampled area, western Germany. The English haplotype is identical to northern Spain; specimens from southern Spain are closer to French Mediterranean specimens. Analyses (CHAO1) show that approximately 400 different haplotypes can be expected in G.marginata. To cover all haplotypes, it is projected that up to 6,000 specimens would need to be sequenced, highlighting the impossibility of covering the whole genetic diversity in barcoding attempts of immobile soil arthropod species.

After discovering a new chemical compound in G. marginata (Glomerin: Schildknecht et al. 1966), further studies on the defensive secretions of G. marginata were conducted by several authors (Meinwald et al. 1966, Carrel 1984. For a long time (see Shear et al. 2011) G. marginata was the only animal species known to sequester quinazolinone alkaloids. Glomeris marginata is the only millipede species in which the embryonic and postembryonic development is thoroughly known (Dohle 1964, Juberthie-Jupeau 1967, Enghoff et al. 1993, Janssen 2004, Prpic 2004. The unusual mating behaviour of pill millipedes (involving the sperm ejaculation on a piece of soil before the transfer to the female) was studied extensively in the Black Pill Millipede (e.g., Haacker 1964). The ecology of the species was also the subject of numerous studies (for single aspects e.g., Nicholson et al. 1966, Van der Drift 1975, David and Gillon 2002, Rawlins et al. 2006; for the role in species communities e.g., Dunger andSteinmetzger 1981 andVoigtländer 2011). The Black Pill Millipede was also the first myriapod species in which the pheromone producing postgonopodial glands were studied (Juberthie-Jupeau 1976).
Glomeris marginata is commonly included in arthropod phylogenetic analyses (e.g., Regier 2001Regier , 2005. The Black Pill Millipede is the only species of the Diplopoda in which gene expressions of different genes, including Hox genes, were widely researched (e.g., Prpic and Tautz 2003, Prpic 2005. Recently, the embryonic expression of Wnt genes was studied for the first time in myriapods (Janssen and Posnien 2014) in this species. Additionally, the embryonic development, especially the embryonic development of the segmentation inside the Myriapoda, is currently nowhere as well known as in G. marginata (Enghoff et al. 1993, Janssen 2011, Minelli 2015. The same applies to the neurogenesis (Dove 2003).
Despite the high importance of G. marginata for general studies of millipedes, and arthropod segmentation patterns in general, little to no taxonomic studies or population genetic studies of the species were conducted in recent decades. Recent genetic studies in congeneric pill millipedes allowed the detection of several synonymies as well as cryptic species, and clarified the taxonomic status of several Glomeris species (Hoess and Scholl 1999, 2001, Wesener 2015a, 2015b, Conrad and Wesener 2016. The lack of taxonomic studies in G. marginata is even more surprising considering the unusual wide distribution of the species (Kime and Enghoff 2011). Glomeris marginata is the only pill millipede reaching northern Europe. Its southernmost distribution is the south-eastern part of Spain alongside the southern border of the Pyrenees. The area of distribution of G. marginata covers France, England/Wales and Ireland, the whole of Germany except southern Bavaria and Saxony and extends north through Denmark to southern Sweden/Norway (Hoess 1999, Kime andEnghoff 2011: p. 104). Glomeris marginata is the most common pill millipede species in Germany (Reip et al. 2016).
While adult G. marginata normally can be easily distinguished from their congeners by their shiny completely black-brown colour with brightly coloured creamywhite tergal margins (see Schubart 1934: 32, Hoess 2000 Figure 1A), several unusual specimens (grey or reddish, with prominent white marks, or with orange or reddish margins, see 2A,B), currently interpreted as colour morphs, are often encountered. Such unusual specimens resemble other species of the genus, such as G. intermedia Latzel, 1884 (Figure 2B, C), which shares a similar, but more western, distribution pattern than G. marginata, or G. annulata Koch, 1847 ( Figure 2D), a local endemic in southern France (Hoess 2000, Kime andEnghoff 2011). Two other local endemic species, G. apuana Verhoeff, 1911(see Wesener 2015b and G. maerens Attems, 1927 (Figures 2E-G) not only occur in areas directly bordering the known distribution of G. marginata, but also show a similar colour pattern. Furthermore, the species G. klugii Brandt, 1833 / G. undulata C.L. Koch, 1844 andG. connexa Koch, 1847 sometimes also appear in dark-brown colour forms.
In this work, it is tested whether G. marginata and its different colour variants form a monophyletic taxon based on barcoding mt-DNA COI data. The phylogeographic relationship and the possible origin of the species are also ascertained. Finally, the relationship of the Black Pill Millipede to the other, similar coloured congeneric species, G. annulata, G. apuana, and G. maerens is clarified.  (Villers, 1789) colour morphs. A main coloration form, center immature specimens showing the perplexa colour pattern; Germany, Landskrone B strongly lightened adult perplexa pattern, France, Pays de la Loire C red mutant, Germany, Bonn D strongly red-banded form, from France, Montauroux E more weakly red-brown banded from, France, same population as D. A, D, E photographed by Jan Philip Oeyen B by ZFMK C by Dennis Rödder.  Latzel, 1884, with sympatric G. marginata, Germany, Landskrone, 2015D G. annulata Brandt, 1833, France, Gard, Courry, 2015E G. cf. lugubris Attems, 1952, Spain, Cádiz/ Sierra de Grazalema, 2008, preserved speci-menF G. cf. maerens Attems, 1927, Spain, Aragón/Teruel, 2010, preserved specimen G G. maerens, Spain, Tarragona/Montsià, 2017; B-D photographed by Jan Philip Oeyen.

Selection of specimens
Based on the project German Barcoding of Life (GBOL, http://www.bolgermany.de), 80 specimens of G. marginata from different locations were selected from the collection of the ZFMK (Zoologisches Forschungsmuseum Alexander Koenig, Bonn, Germany). All specimens of G. annulata, G. apuana and the G. maerens species-group came from the collection of the ZFMK, while the two specimens of G. hexasticha were collected by the first author. Six additional COI-sequences of G. marginata were obtained from former projects of the authors (see Spelda et al. 2011 andWesener et al. 2010). These sequences are available from GenBank (see Table 1 for accession numbers). Also, the COI-sequences of the outgroup species G. intermedia, G. klugii/ undulata, and G. connexa were obtained from the work of Spelda et al. (2011). An additional 11 French COI-sequences of G. marginata were available in BOLD (downloadable at the Public Data Portal, http://www.boldsystem.org, see Table 1 for BOLDnumbers) by end of November 2015. In total 97 COI-sequences of G. marginata and 21 of the seven outgroup species were obtained for this study (93 newly sequenced, 14 from GenBank and 11 from BOLD).
The specimens of G. marginata were collected from a major part of the distribution region in NW Europe, covering the region from NE Spain to northern Germany ( Figure 3). Material from the north-eastern part of the range (Denmark-Sweden-Norway) was not available. For the different analyses, two datasets were created, one which contained the 97 G. marginata sequences only, and a second one combining the G. marginata sequences with the 21 outgroup specimens.

DNA extraction, PCR, and sequencing
From the analysed specimens, genomic mtDNA (the barcoding region of COI) was extracted from muscle tissue applying a standard extraction protocol (see e.g., Wesener et al. 2015) at the ZFMK. Also, the PCR and sequencing protocols were identical to those used in a previous work (Wesener et al. 2015). All specimens and the aliquots of the DNA extractions were deposited in the collection of the ZFMK. All new sequences (80 G. marginata, two G. annulata, and nine G. maerens sp. as Glomeris sp.) were deposited in GenBank (see Table 1 for accession numbers).

Aligning and control
Sequences were aligned by hand in BIOEDIT (Hall 1999), version 7.2.5 (for final data set see Suppl. material S1). To rule out the accidental amplification of nuclear copies of the mitochondrial COI gene, the whole dataset was translated into amino acids following the 'invertebrate' code in MEGA 7 (Tamura et al. 2013); internal stop codons were absent in our dataset. There were in total 657 positions in the final dataset, gaps were absent. Voucher specimens and aliquots of the DNA extractions were stored in natural history collections and are available for each analysed sequence (see Table 1). Table 1. Analysed specimens, voucher and Genbank code, collection locality and bioregion (see Table 2).

Assignment to biogeographic regions
All specimens of G. marginata were assigned to a biogeographic region of the main sub-country level (bioregion) (see Table 1, column BioRegion and Table 2). The structuring of the specimens with their origin in Germany is based on the official map of natural regions of Germany, the "Großregionen", 1 st level (Meynen andSchmithüsen 1953-1962, see also "Naturräumliche Großregionen Deutschlands" at http://de.wikipedia.org). Due to their disproportionately large size, the regions "Norddeutsches Tiefland" and "Mittelgebirgsschwelle" are additionally each divided into a western and eastern part according to Figure 4. The structuring of the specimens with their origin in France is based on the "régions biogéographiques pour l'évaluation de l'état de conservation en France" (see http://inpn.mnhn.fr/programme/ rapportage-directives-nature/presentation). Additionally, the regions "France Atlantique" and "France Continentale" -due to their size -are each divided into a northern and southern part as shown in Figure 5. The ecological region "France alpine" is geographically divided into France Alps and France Pyrenees. The single specimen from Great Britain is located in southern England. For Spain, we used the regions of southern Pyrenees and the Cantabrian Mountains. In total 14 biogeographic regions were assigned in four countries (see Table 2).

Phylogenetic and distance analysis
Analyses were conducted in MEGA 7 (Kumar et al. 2015). The uncorrected pairwise distances (p-distances) were calculated with all codon positions included. Ambiguous positions were removed for each sequence pair. The distance matrix was exported to MICROSOFT EXCEL for further calculations of minimum interspecific and maximum intraspecific distances (see Suppl. material S2). A model test, as implemented in MEGA 7, was performed to find the best fitting maximum likelihood substitution model for the complete sequence set. The model with the lowest AICc value (Akaike Information Criterion, corrected) are considered to describe the best substitution pattern. Codon positions included were 1 st + 2 nd + 3 rd . The model test selected the General Time Reversible model (Tavaré 1986) with gamma distribution and invariant sites (GTR+G+I) as the best fitting model (AIC: 7988, lnL: -3750).
The evolutionary history was inferred by using the maximum likelihood method based on the selected GTR+G+I model. Initial tree(s) for the heuristic search were obtained automatically by applying NJ/BioNJ algorithms to a matrix of pairwise distances estimated using the Maximum Composite Likelihood (MCL) approach, and then selecting the topology with superior log likelihood value. The discrete gamma distribution was used with five categories to model evolutionary rate differences among sites. The analysis involved the complete sequence set (G. marginata + outgroup species). Codon positions included were "1st+2nd+3rd" (Missing Data: partial deletion). The bootstrap consensus tree inferred from 1,000 replicates (Felsenstein 1985) is taken to represent the evolutionary history of the analysed taxa. Trees were built with FIGTREE 1.4.2 and drawn to scale, with branch lengths measured in the number of substitutions per site.

Spatial relationship
Besides the genetic p-distances (see above) for all G. marginata specimen pairs (4656 pairs) the geographical distances were calculated based on the more exact method of calculation, the Euclidean geometry: The earth's radius (= er) in central Europe is 6,367 km. Lat1 and Lon1 are the latitude and longitude of the location of specimen 1, Lat2 and Lon2 those of specimen 2. For the full dataset see Suppl. material S3. A chart was plotted to show the relationship between the genetic and geographical distance.

Haplotype analysis
A haplotype analysis was conducted with DNASP (Librado and Rozas 2009) by assigning the genetic code to "mtDNA Drosophila" for invertebrates. The G. marginata sequences were grouped to haplotypes (DNASP / Generate / HaploType Data File, excluding sites with missing data). The haplotypes were marked by geography.
In a second run the sequences were grouped again by considering only non-synonymous changes. In this second step all synonymous changes were discarded. For this an interim sequence set with only non-synonymous changes was created (DNASP / Generate / Polymorphic Data File / "only Non-synonymous") and afterwards the Haplotype file was built. Because of the unequal sampling with a bias to the German fauna within the GBOL-project, no comparative population analysis was possible.
The previous first haplotype data file was used as a basis for a TCS Networks analysis (Clement et al. 2002). A TCS-network was created with the software POPART (Leigh 2015). For this a frequency matrix of haplotypes to bioregions was created in MICROSOFT EXCEL and according the software manual transformed to the POP-ART-nexus format (see Suppl. material S4).

Haplotype richness estimation
The potential number of haplotypes for the complete distribution area was estimated with ESTIMATES 9.1.0 (Colwell 2013). For this, the CHAO1-estimator (Chao 1984) based on the haplotype distribution (instead of a species distribution) was calculated (for the underlying data file see Suppl. Material S5). Together with the ACE-index the CHAO1-estimator is the main estimator for individually based abundance data (Gotelli and Colwell 2010). It is based on the number of all OTUs (operational taxonomic units, in this study the haplotypes) with one sequence in relation to the number of all OTUs with two sequences. With 10,000 randomized runs the haplotype accumulation curve (rarefaction curve) and the 95 % lower and upper boundaries of confidence intervals were calculated and additionally also their extrapolation curves (formulas in detail see Colwell et al. 2012).

Phylogenetic relationship of G. marginata with similar species
The minimum interspecific distance of G. marginata to other Glomeris species ranges from 12.9-15.9 % (see Table 3). There is a clear barcoding gap between the maximum intraspecific distance (5.0 %) and the minimum interspecific distance (12.9 %) (see also Figure 6). Glomeris connexa and the G. maerens species-group are closest to G. marginata. The separation of the outgroup species to G. marginata is clearly visible in the graphical mapping of the phylogenetic analyses (see Figure 7). The G. marginata specimens, together with G. connexa, G. apuana, and the G. maerens-group, form a distinct clade separate from the other species. The other four species (G. hexasticha, G. klugii/undulata, G. intermedia, and G. annulata) form a single clade. Statistical support for both clades is rather low, not exceeding 82 %.
The specimens of the G. maerens species-group cluster together with a minimum interspecific distance (10.5 %) to the other species, but the G. maerens specimens fall into three clades with a maximum intraspecific distance of up to 9.1 % (see Figure 7).

Intraspecific variation of G. marginata
All 97 specimens of G. marginata form a well-supported clade (bootstrap value 100 %, not shown in Figure 7). The 97 specimens of G. marginata have a maximum intraspecific distance of 5.0 %. The intraspecific distance chart ( Figure 6, blue bars) shows three peaks (at: 0 %, 0.9 % and 3.0 %) within the p-distances of the G. marginata specimens; within the range every p-distance value is present. There is no gap in the distribution of the p-distance values.

Geographical relationship of G. marginata specimens
The specimens from northern Germany and eastern France show the lowest genetic distance (≈ 1 %) to the rest of all samples. The specimens from western and southern France show the highest median distance (≈ 3-4 %) to those of other populations (see Table 4 and Suppl. Material S2). The maximum and the mean p-distance of G. marginata within the north-eastern part of the distribution (≈ 4 % or ≈ 1 %, respectively) is lower than in the southwestern part (≈ 5 % or ≈ 3-4 %, respectively). Specimens from Mediterranean France group most distantly from the rest, with a maximum p-distance of 5.0 %.
The plot of the genetic p-distance to the geographical distances of all samples (4,656 possible pairs) shows no distinct relationship between both values (see Figure 8). There is a small and negligible trend of +0.00001 % p-distance/km-distance. The coefficient of determination R² with ≈ 0.1 is extremely low. For example, two specimens collected only 43 km apart (77 to 78, see Figure 8: green circle and Table 5) show a genetic pdistance of 3.8 %, while contrarily two specimens with a geographical distance of more than 1,000 km (43 to 54, see Figure 8, grey circle and Table 5) belong to an identical haplotype (0 % p-distance). The geographically most distant analysed specimens (41 to 49, Figure 8: red circle) show a p-distance of 2.1 %.

Haplotypes/regions
Within the 657 sites of the 97 sequences of G. marginata, 74 were polymorphic which resulted from a total number of 81 mutations. The total number of synonymous changes is 71 and the total number of replacement changes is six. In the haplotype analysis, within the 97 samples, 47 haplotypes were detected, with 79 polymorphic sites. Haplotype diversity is 0.93, nucleotide diversity Pi is 0.017. 38 haplotypes (81 % of all haplotypes) consist of only one specimen (^ = 38 specimens ≙ 39 % of all specimens) and 42 haplotypes (89 % of all haplotypes) represents only specimens from one bioregion (^ = 48 specimens ^ = 49 % of all specimens). Nine haplotypes are represented in our dataset with two or more specimens (^ = 59 specimens ^ = 61 % of all specimens).  Table 5. Examples of specimen pairs with small and great ratio of p-distance (p-dist.) to geographical distance (geo-dist in km). Green marked: specimen pairs with exceptionally high p-dist. but low geo-dist.
(representative for dots of upper-left side of Figure 8: green box). Light-blue marked: specimens of the same location with the highest p-dist ( Figure 8: blue circle). Orange marked: specimen pair with exceptionally low p-dist. but high geo-dist. (representative for dots of lower-right side of Figure 8: red circle). Grey-blue marked: most distant specimen pair with identical haplotype (Figure 8: grey circle). The dataset was divided into five major haplotype lineages (see Figure 9 and partially Table 6). The major haplotype lineage V is basal to all other and shows a higher internal genetic variability (to their member subgroups and specimens: Ø 2.4 %) than the other haplotype lineages of G. marginata. Haplotype lineage V consists of several loosely connected subgroups, mainly from the French Mediterranean, the French Pyrenees and Spanish Cantabria (FR.MED, FR.PYR and ES.CC) (see Figure 9, Figure 10, black circled). This basal group is connected to the bioregion DE.MGSW via specimens 35 and 36 ( Figure 8, Table 1). The area occupied by lineage V excludes all other major haplotype lineages, which do not extend to the two South French regions (FR. MED and FR.PYR), or to the more western Spanish Cantabrian Mountains (ES.CC).

SpecimenID
The other four haplotype lineages I-IV show a wider area of distribution, but genetically less diversity. Major haplotype lineages I and IV are closely related (see Figure 9).  Table 6).
Haplotype lineage I occurs in an area reaching from the French Alps to NE Europe, with the main haplotype diversity in the German "Mittelgebirgsschwelle", eastern part (DE.MGSO). Haplotype lineage II shows a central distribution with a high proportion of specimens in the German "Mittelgebirgsschwelle", western part (DE. MGSW). Lineage II has the greatest distribution area and includes several subordinated haplotypes in the region DE.MGSW. Haplotype lineage III occurs in NW Europe with the most specimens in the France Atlantique, northern part (FR.ATLN). Additionally, the specimen from Great Britain (GB.EM) belongs to this group and has even Haplotype lineages I-III and partially lineage IV are especially poor in haplotypes. Four haplotypes, one in each lineage (see Table 6), are especially rich in specimens, 17, 15, 10, and 4, respectively, together representing 47 % (46 specimens) of all analysed G. marginata. Additional haplotypes can be added to those four main haplotypes, differing only by a few basepairs. 65 specimens can therefore be grouped into these haplotype lineages (I-IV in Table 6 and Figure 8, ^ = 67 % of all specimens). Every well-sampled bioregion has many haplotypes. The haplotype/specimen-rate is always higher than 0.3 (see Table 7). The less sampled a region is, the higher the haplotype/samples rate is. At the French Pyrenees and the Spanish Cantabrian Mountains, every sample of G. marginata represents a different haplotype. The three especially wellrepresented major haplotypes of lineages I-III were collected in 5, 4 or 3 different bioregions (see Table 6). These three haplotypes/lineages each cover a large geographical range, with all three overlapping centrally in the bioregion DE.MGSW, our best-sampled region.
The haplotype lineage III mainly connects the northern French bioregion (FR. ATLN) with central Germany (DE.MGSW). One direct connection exists between the southern French/Spanish (FR.MED, FR.PYR and ES.CC) and the northern French populations (specimen 57, FR.ATLN, Table 1).

Haplotype network of G. marginata
Based on the 47 haplotypes the TCS analysis shows a complex net of different possible evolutionary pathways between the haplotypes (see Figure 10). The clustering of the main four haplotypes (four largest filled circles in Figure 10) is similar to our phylogenetic tree Table 6. Number of samples and bioregions (BioR) to major haplotypes (mHapT) and lineages. Figure 9 Number ( Figure 9), with adjacent and closely related haplotypes forming distinct lineages (coloured oval lines in Figure 10). The haplotypes of the southern Mediterranean France and southern Spain are building a complex, highly disjunctive net (black oval line in Figure 10).

Haplotype number estimation
The rarefaction curve shows no saturation for the number of haplotypes (see Figure 11, 12). The estimation of CHAO1 shows that there could be overall 404 haplotypes in G. marginata (95 % confidence interval: 140-1,426 haplotypes). By extrapolation with rarefaction curves (Colwell et al. 2012) we estimate that a mean of 6,612 samples would be needed to be analysed to find all potential 404 different haplotypes. To reach the 95 % lower boundary (140 haplotypes) at least an additional 274 specimens need to be included.

Colour morphs of G. marginata
The dataset contains one specimen of the grey colour morph, eight with the "perplexa" pattern and four with red margins. Those 13 distinctly coloured specimens are marked in our specimen tree (see Figure 9 with symbols "G", "P", and "R"). The grey specimen belongs to the major haplotype of the lineage I. The specimens with the red margin are scattered in the tree and therefore do not cluster together. They are mainly found in Mediterranean France, therefore placed mainly in the lumping group V, but one specimen groups with lineage IV (Figure 9). The "perplexa" form is even more scattered over the tree, occurring in several bioregions.

Glomeris annulata, G. apuana, and G. maerens
The three local endemic species, despite some similarities in the coloration ( Figures 2D-G), are genetically clearly distinct from G. marginata, separated by p-distances of more than 13 %. Further studies should investigate the G. maerens-group in northern Spain. All three species (G. maerens, G. lugubris Attems, 1927, andG. obsoleta Attems, 1952) of the group were described by Attems from Spain (G. maerens: Tarragona and Lérida; G. lugubris: Cádiz; G. obsoleta: Barcelona) and show a similar obscure black-brown colouration (see examples in Figure 2E-G). Due to their geographically close type locations and quite similar colour, as well as thoracic shield striation pattern (both with two main striae) G. maerens and G. obsoleta may be synonyms. Therefore, the examined specimens could not be assigned to either species. However, our analysis recovers a considerable variation inside the species-group, with p-distances of 7.5-9.1 % which hints at the existence of several independent species in the G. maerens complex.

Monophyly of G. marginata
Glomeris marginata is genetically distant but related to G. connexa, with a p-distance of 12.9 %. Based on the COI-data, the G. maerens species group is more closely related to G. connexa/G. apuana than to G. marginata. The genetic distance of G. marginata to the other tested species (G. klugii/undulata, G. intermedia, G. hexasticha, and G. annulata) is, with a p-distance up to 15.9 %, even more pronounced.
In comparison to vertebrate species (e.g., fishes: 0.32 %, Keskin and Atar 2013 or rodents: 2.1 %, Li et al. 2015) a maximum intraspecific variation of a p-distance of 5 % is rather high. However, such an intraspecific variation of 5 % was also found in another widespread central European Glomeris, G. klugii/undulata (Wesener and Conrad 2016). A minimum p-distance of 12.9 % of G. marginata to the most closely related species (a factor of 2.6 to the maximum intraspecific p-distance), shows a clear barcoding gap to the nearest congener, G. connexa.
The known colour morphs of G. marginata do not represent single lineages or even subspecies. The conspicuously red borders in specimens from southern France (Figures 1D,E) are present in several lineages and sub-lineages ( Figure 9, marked with R). The same applies to the perplexa-form ( Figures 1A, B, 9, marked with P). The grey form is even a member of the main haplotype of the eastern lineage I (Figure 9, marked with G). Unfortunately, specimens of the brown form of northern Germany could not yet be sequenced, but they appear always syntopically with specimens of the black form ( Figure 2A). Therefore, any relevant divergence from those haplotypes cannot be expected.
The COI-gene is clearly working as a barcoding gene to identify and discriminate G. marginata specimens from the other Glomeris species.

Geographical relationship of G. marginata specimens
Syntopical specimens as well as specimens with a maximum geographical distance of 1,701 km (Germany, Brandenburg to Spain, La Rioja) were analysed. There is no obvious relationship between geographical and genetic distance. There are specimen pairs of the same haplotype (p-distance = 0) which were collected more than 1,000 km apart. This distance of 1,000 km seems to be the maximum distance G. marginata could spread without experiencing genetic changes. Specimen pairs with a geographical distance larger than 1,000 km experienced at least a few mutations in the COI gene, with a minimum p-distance of ≈ 0.8 % in our dataset (see Figure 7).
On the other hand, local specimens can show high genetic variation. Even from nearby locations specimen pairs show a p-distance as high as 3 %. Such a mutation rate is unlikely to have happened locally, but is more likely the result of a different geographical origin of the source populations. As such large genetic distances between different populations of G. marginata are common, a human-influenced dispersal seems not to be the reason behind the regular high COI-variance.

Haplotype regions, origin and potential migration patterns
The haplotype analysis shows five main haplotype lineages in G. marginata (Figure 9). Four of those (I-IV) show a wide distribution in northern Europe, one (lineage V) is restricted to southern Europe.
The haplotype lineage V is highly genetically variable, therefore a combination into a single group is not justified. Four rather distinct lineages not forming a monophylum could be seen in Figure 9 (coloured in different shades of green). Additionally, a block with unrelated singular haplotypes (see Figure 9 between lineage III and V) could be assigned to this fifth major haplotype lineage. Most of the specimens of these unrelated singular haplotypes are coming from the Mediterranean. These unrelated haplotypes are linked to the region DE.MGSW (specimens 19 and 31; Figure 8).
The examined northern European regions are mainly inhabited by specimens of the haplotype lineages I-IV, showing a low variance in their p-distance to one another (see Table 7). The specimen pairs within the whole North European area have a mean p-distance of 1.8 %. In contrast the French Mediterranean and French Pyrenees specimens of G. marginata show a higher p-distance (FR.MED: 2.2 % and FR.PYRN: 2.1 %). The specimen pairs of G. marginata within the geographically smaller South European bioregions (FR.MED, FR.PYRN, ES.PYRS, and ES.CC) have a mean p-distance of 2.5 %, higher than those observed in the entire North of Europe (1.8 %). With further sampling in southern Europe and collecting of similar haplotypes those values might decrease, however, further sampling will also reveal new haplotypes (see Figure 11). A saturation of the number of haplotypes is not detectable (see Figures 11, 12).
With the before mentioned mean p-distance of 2.5 %, the small south European area of bioregions contains a much higher genetic diversity in G. marginata than the much larger northern Europe. To develop such a higher genetic diversity, the south European populations of G. marginata must be older than the northern European populations. Northern Europe must have been colonized by G. marginata more recently. The main dispersal into those northern areas could only have been started after the last glaciation retreated during the early Holocene starting around 11,000 years ago (Roberts 2014).
Our data does not reveal how far north the distribution of G. marginata reached and how high any genetic diversity of the species was before the ice age. However, the south European mixed populations could be regarded as a remnant of old haplotype lineages of G. marginata, which are not any more present in the north European populations.
The geographical coverage of our analysed specimens is biased towards western Germany (MGSW, see Figure 3). For the colonization of northern Europe there are two possible scenarios. The new dispersal could have started from the south, or the dispersal could have started from a glacial refugium in northern Europe. The two scenarios are, however, not mutually exclusive and could have been concurrent. From a genetic point of view the northern populations differ from the southern populations. There are only a few and weak links between north and south. Therefore, a single or main colonization from the south to the north is not plausible.
Contrarily, all main haplotype lineages I-IV, which are exclusively found in northern Europe are linked to the bioregion DE.MGSW (Figure 3). The main redistribution over northern Europe could have been started from central Germany, which shows high haplotype diversity in G. marginata. From the bioregion DE.MGSW four major migrations could have led to the current distribution of the main haplotype lineages I-IV. Haplotype lineage I might have spread mainly to the north-east, haplotype lineage III to the North-West and haplotype lineage II only westwards. Haplotypelineage IV spread to the bioregion FR.CONN. The colonisations by the haplotype lineages were probably independent.

Haplotype number estimation
With this work, for the first time, a survey of almost 100 barcodes is presented for a diplopod species. On average, every haplotype in our study is based on two specimens (97 specimens / 47 haplotypes). In reality, the majority of haplotypes (38 haplotypes ^ = 81 %) are represented by only one specimen. The haplotype number estimation has shown that these 97 successfully sequenced specimens are just providing an overview of the real haplotype diversity in G. marginata. With the current data we are still far away from a complete collection of all haplotypes of the species. Many more specimens need to be collected to reach at least the lower estimated boundary of 140 haplotypes.
In general, this also means that haplotype analysis should not be based on few specimens and not only on specimens of a certain region, but always from specimens covering the whole distribution area of a species (Elias et al. 2007, Bergsten et al. 2012, Jordal and Kambestadt 2014. With the current data we should have a good base to cover the whole range of haplotypes. Further new haplotypes should mainly cluster within the current main lineages I to IV or should end up within the haplotype complex V with its four subgroups.
Many new haplotypes would simply represent the missing mutation steps present in the TCS-network of Figure 10 by dashes between the nodes. Probably most of the haplotypes representing end nodes in the current TCS-network are not representing the real end nodes of the mutation chains.

Nomenclatorial acts
In the year 1789 the species with the common name Cloporte bordé (bordered woodlouse) was first described by the French naturalist Charles Joseph de Villers (1724Villers ( -1810 as Oniscus marginatus. He used few, but descriptive words: "niger, segmentis corporis luteo marginatis" [black, segments of the body with yellow margin]. Within a few years the species has been named and described four times again (see below). Thirteen years after the description the French zoologist Pierre André Latreille (1762Latreille ( -1833 placed the species in his new genus Glomeris Latreille, 1802.  Almost one hundred years later several subspecies or variations were added by Verhoeff, Latzel, and Attems. Those taxa represent different versions of the pale form which was first named G. perplexa by Latzel (1895), all now regarded as synonyms of the nominate species.
We do not recognize any subspecies of G. marginata. Therefore the subspecies