Front. Mar. Sci., 08 December 2022
Sec. Marine Evolutionary Biology, Biogeography and Species Diversity
This article is part of the Research Topic Cephalopods in the Anthropocene: Multiple Challenges in a Changing Ocean

The geographic problem in cephalopod genomics

Michael Vecchione1*, Michael J. Sweeney2, Paula L. Rothman3
  • 1National Oceanic and Atmospheric Administration, National Marine Fisheries Service (NOAA/NMFS) National Systematics Lab., National Museum of Natural History, Washington, DC, United States
  • 2Retired, Durham, NC, United States
  • 3Department of Invertebrate Zoology, Smithsonian Museum of Natural History, Washington, DC, United States

Publications describing genomes of various cephalopod species have recently proliferated. Some papers have involved large geographic distances between the collection locality of sequenced specimens and the type locality of the presumed species. However, cryptic species have been demonstrated in many cephalopods. Therefore, even if the sequenced specimen is very similar morphologically to the species in question, the likelihood that it is a member of the species in question decreases with increasing distance from the type locality. An associated problem is that many publications do not provide information adequate to determine the source locality for the genomic sequence. We reviewed a decade of literature on mitochondrial genomes of cephalopods and found a total of 43 publications containing 48 species within 23 genera. Of the 48 species, only 17 could be evaluated for our geographic question. Distances between sampling locality and type locality of the named species ranged from 0 nautical miles (sampled at type locality) to half-way around the world. Where data were present for distance calculation, the average for the 17 species was 3785 km (2044 nmi).


Introduction

Determination of genetic sequences has revolutionized understanding of evolutionary relationships. Increasingly sophisticated methods have allowed this revolution to progress greatly throughout the last few decades to include inferences about entire genomes. Accordingly, the literature describing cephalopod genomes, especially those of the mitochondria, has increased greatly over the past 10 years (O'Brien, 2018). The primary goal of most of these publications has been to resolve phylogenetic relationships within extant Cephalopoda.

Another result of the widespread use of genetic sequencing, including “barcodes” and other sequences shorter than entire genomes, has been increasing recognition that species whose distributions were once considered very broad or even global are actually complexes of morphologically similar species, with geographic ranges forming a patchwork within the broad range of the complex. Examples include taxa within the families Sepiolidae (Fernandez-Alvarez et al., 2021), Loliginidae (Sales et al., 2017), Chtenopterygidae (Escanez et al., 2018), Ommastrephidae (Fernandez-Alvarez et al., 2020; Xu et al., 2020a), Spirulidae (Hoffmann et al., 2021), and Octopodidae (Avendano et al., 2020; Amor and Hart, 2021). Because of these species complexes, both those currently recognized and those possibly to be discovered in the future, a substantial potential exists for misidentification of specimens collected for genomic sequencing (e.g., Lima et al., 2017; Salvi et al., 2021). This potential is especially high if the genomic specimen is not collected within the normal range of the nominal sequenced species (i.e., the species named based on morphological identification). We are concerned that authors who use specimens from the nearest convenient area to sample a presumed species, or from sources where the actual collection locality cannot be verified (e.g., fish markets, aquarium dealers), could be using a different species than the one they report and, as a result, that sequences in genomic databases may be misidentified.

Materials and methods

We surveyed the past decade of genomic literature on Cephalopoda to compare collection localities with the designated type localities of the nominal species and thereby determine the extent of this potential problem. Only publications describing complete mitochondrial genomes were analyzed. Each publication was examined to determine the collection locality of the specimen used for genomic analysis. Type localities for the nominal species are available online. We converted both the sample locality and the type locality to latitude/longitude and then calculated the distance between them using the NOAA Latitude/Longitude Distance Calculator. The repository and accession numbers for published genome sequences were also recorded (Table 1).
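The distance step described above can be illustrated with a short sketch. It is not the NOAA calculator itself, whose internal method is not described here. It assumes a spherical Earth with a mean radius of 6371.0088 km and uses the haversine formula, and the function name and constant are hypothetical choices made for this illustration.

```python
import math

# Assumption: spherical Earth with the mean radius below. The NOAA tool may
# use a different Earth model, so results can differ slightly.
EARTH_RADIUS_KM = 6371.0088


def great_circle_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees.

    Haversine formula; latitudes are positive north, longitudes positive east.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


# A specimen sampled exactly at the type locality gives a distance of zero.
print(great_circle_km(10.0, 20.0, 10.0, 20.0))
# Two antipodal points on the equator are half-way around the world,
# roughly 20,015 km under this Earth model.
print(great_circle_km(0.0, 0.0, 0.0, 180.0))
```

A spherical model is usually adequate at the scale considered here, because the distances of interest span hundreds to thousands of kilometers and the choice of Earth model changes them only slightly.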


Table 1 Mitochondrial genome sequences for cephalopods in recent literature.


Results

An online search of the previous ten years of Cephalopoda genomic literature found a total of 58 genomic descriptions within 43 publications containing 48 different species in 23 genera (Table 1). For many species sequenced (70%), either the collection locality or the type locality (from the original description) was missing or was too general (e.g., Australia). In addition, a sequence was excluded from our distance analysis if either locality was indeterminate (e.g., Tsukiji fishery market), if there were multiple type localities (e.g., syntypes), or if the genome was derived from combined specimens from multiple localities.
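The exclusion criteria above can be written as a simple filter. The sketch below is only an illustration of that logic: the record structure, field names, and example entries are hypothetical and do not come from Table 1.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in decimal degrees


@dataclass
class GenomeRecord:
    # Hypothetical record layout, not the structure of Table 1.
    species: str
    sample_coord: Optional[Coord]  # None if missing, too general, or indeterminate
    type_coord: Optional[Coord]    # None if missing, too general, or indeterminate
    n_type_localities: int = 1     # more than 1 for, e.g., syntypes
    pooled_localities: bool = False  # genome built from specimens of several localities


def is_evaluable(rec: GenomeRecord) -> bool:
    """Return True only if a single sample-to-type distance can be computed."""
    return (
        rec.sample_coord is not None
        and rec.type_coord is not None
        and rec.n_type_localities == 1
        and not rec.pooled_localities
    )


# Placeholder entries for illustration only.
records = [
    GenomeRecord("species A", (35.0, 139.0), (35.0, 139.0)),
    GenomeRecord("species B", None, (40.0, 14.0)),             # market specimen
    GenomeRecord("species C", (10.0, 100.0), (5.0, 95.0), 2),  # syntypes
    GenomeRecord("species D", (0.0, 0.0), (1.0, 1.0), 1, True),
]
print([r.species for r in records if is_evaluable(r)])
```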

Of the 48 species sequenced, only 17 could be evaluated for our geographic question (Table 1). Distances calculated ranged from 0 km (sampled at type locality) to half-way around the world in a different ocean basin. The average distance between sampling locality and type locality for the 17 species for which data were adequate for distance calculation was 3785 km (2044 nmi).
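As a check on the units, the reported average can be converted from kilometers to nautical miles using the defined length of the international nautical mile (1.852 km). The upper bound for "half-way around the world" is also shown, assuming a spherical Earth with a mean radius of 6371.0088 km; that radius is an assumption of this sketch, not a value from the study.

```python
import math

KM_PER_NMI = 1.852  # international nautical mile, exact by definition
EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius (spherical model)

mean_km = 3785  # average distance reported for the 17 evaluable species
mean_nmi = mean_km / KM_PER_NMI
print(round(mean_nmi))  # 2044

# Largest possible great-circle distance: half the circumference,
# roughly 20,015 km under the assumed Earth model.
half_way_km = math.pi * EARTH_RADIUS_KM
print(half_way_km, half_way_km / KM_PER_NMI)
```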

Incidentally, as we reviewed this literature for geographic information, we also noticed that very few of the publications included any indication that voucher specimens or unprocessed tissue were preserved in established archival collections for future research. For example, of the 17 species mentioned above, only 5 (29.4%) had vouchered specimens. Thus, only 5 of the 48 species accounts (10.4%) included both adequate geographic information and archived specimens.
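The two percentages follow directly from the counts given in the text, as this short check shows (variable names are illustrative):

```python
species_total = 48  # species with published mitochondrial genomes
evaluable = 17      # species with adequate locality data
vouchered = 5       # evaluable species that also had vouchered specimens


def pct(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print(pct(vouchered, evaluable))      # 29.4
print(pct(vouchered, species_total))  # 10.4
```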


Discussion

Our point here is not that any of these publications is wrong. Rather, we want to highlight the potential for taxonomic errors in publications where the sampling area is very distant from the species’ type locality. As pointed out by one of the reviewers, for coastal cephalopod species in complex habitats, such errors are possible even at very small distances. Any taxonomic error introduced by this geographic mismatch may be compounded when the sequence is archived in a genomic database and the database is used for other investigations.

We therefore recommend that specimens selected for genomic sequencing be collected as close to the type locality of the species as possible. Although we recognize that it may not always be possible to sample the type locality, we recommend that the genomic sample be from the same biogeographic province (e.g., GOODS, 2009 or subsequent modifications by various authors) or “Large Marine Ecosystem” (LME – Sherman and Duda, 2011) as the type locality. The collecting locality should always be included in any publication resulting from DNA sequencing. Furthermore, specimens should not be selected for sequencing from a source where the actual collecting locality cannot be determined confidently (e.g., fishery landings). Finally, although our primary purpose here is to highlight the need for sequenced specimens to come from as close to the type locality as possible, we also recommend that sequenced specimens and any unprocessed tissue be vouchered in an established archival collection. Relevant information about archived material (e.g., museum catalogue number) should be included in resulting publications.

Data availability statement

The original contributions presented in the study are included in the article/supplementary material. Further inquiries can be directed to the corresponding author.

Ethics statement

Ethical review and approval was not required for the animal study because this is a literature review. No animals (live or preserved) were used.

Author contributions

MS conceived the idea. MV and MS analyzed the data. PR accumulated the references. MV wrote the first draft. All authors contributed to the final manuscript and approved the submitted version.


Funding

NOAA, NMFS, which employs MV. The US Government is committed to open-access publication.


Acknowledgments

The manuscript was improved by the comments from three reviewers and the Associate Editor.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.


Keywords: biogeography, genomics, species complex, type locality, sampling

Citation: Vecchione M, Sweeney MJ and Rothman PL (2022) The geographic problem in cephalopod genomics. Front. Mar. Sci. 9:1090034. doi: 10.3389/fmars.2022.1090034

Received: 04 November 2022; Accepted: 21 November 2022;
Published: 08 December 2022.

