Data from: Lineage diversification of fringe-toed lizards (Phrynosomatidae: Uma notata complex) in the Colorado Desert: Delimiting species in the presence of gene flow

Main Authors: Gottscho, Andrew D., Wood, Dustin A., Vandergast, Amy G., Lemos-Espinal, Julio, Gatesy, John, Reeder, Tod W.
Format: info dataset Journal
Published: 2017
Subjects:
Online Access: https://zenodo.org/record/4997609
Table of Contents:
  • Multi-locus nuclear DNA data were used to delimit species of fringe-toed lizards of the Uma notata complex, which are specialized for living in wind-blown sand habitats in the deserts of southwestern North America, and to infer whether Quaternary glacial cycles or Tertiary geological events were important in shaping the historical biogeography of this group. We analyzed ten nuclear loci collected using Sanger sequencing, together with genome-wide sequence and single-nucleotide polymorphism (SNP) data collected using restriction-associated DNA (RAD) sequencing. A combination of species discovery methods (concatenated phylogenies, parametric and non-parametric clustering algorithms) and species validation approaches (coalescent-based species tree/isolation-with-migration models) was used to delimit species, infer phylogenetic relationships, and estimate effective population sizes, migration rates, and speciation times. Uma notata, U. inornata, U. cowlesi, and an undescribed species from Mohawk Dunes, Arizona (U. sp.) were supported as distinct in the concatenated analyses and by clustering algorithms, and all operational taxonomic units were decisively supported as distinct species by ranking hierarchical nested speciation models with Bayes factors based on coalescent-based species tree methods. However, significant unidirectional gene flow (2NM > 1) from U. cowlesi and U. notata into U. rufopunctata was detected under the isolation-with-migration model. Therefore, we conservatively delimit four species-level lineages within this complex (U. inornata, U. notata, U. cowlesi, and U. sp.), treating U. rufopunctata as a hybrid population (U. notata x cowlesi). Both concatenated and coalescent-based estimates of speciation times support the hypotheses that speciation within the complex occurred during the late Pleistocene, and that the geological evolution of the Colorado River delta during this period was an important process shaping the observed phylogeographic patterns.
  • Sanger_sequence_data: Sequencher v4.7 (Gene Codes Corp., Ann Arbor, MI) was used to analyze data quality, trim primer sequences, produce alignments, and call heterozygous sites to produce these fasta files, including alignments from Gottscho et al. (2014). See Appendix A for specimen information.
  • PHASE: PHASE v2.1 (Stephens et al. 2001) and seqPHASE (Flot 2010) were used to determine haplotypes; input and output files are provided.
  • Geneland: Included are input files, R scripts, and output files for an analysis of 10 Sanger loci in the R package Geneland v4.0.3 (Guillot 2008; Guillot et al. 2005, 2008).
  • starBEAST: Included are input xml files and a summary of results for Bayes Factor Delimitation (BFD, Grummer et al. 2014) and *BEAST (Heled and Drummond 2010) analysis of species trees in BEAST v1.8.1 (Drummond et al. 2012).
  • raw_HiSeq_data1, raw_HiSeq_data2, raw_HiSeq_data3, raw_HiSeq_data4: Each of these datasets was collected using ddRADseq (Peterson et al. 2012) and sequenced on a single lane of an Illumina HiSeq 2500 at UC Riverside. The data are already demultiplexed by Illumina index and adapter index. See Appendix A for specimen information.
  • pyRAD: We used pyRAD v2.1.2 (Eaton 2014) with muscle3.8.31 and usearch7.0.1090 to filter and process raw data files. Two example parameter files are provided. Please see "raw_HiSeq_data" to download the .fastq files.
  • raxml: We used RAxML v8.1.1 (Stamatakis 2014) to create a maximum-likelihood phylogeny for our concatenated data. Input and output files are provided.
  • beast2: The folder BFD* contains .xml input files and output logs for Bayes Factor Delimitation with genomic data (Leache et al. 2014) implemented in SNAPP/BEAST 2.3.1 (Bryant et al. 2012, Bouckaert et al. 2014). The concatenated folder includes input and output for a concatenated analysis of RAD data in BEAST 2.1.2. The SNAPP folder contains species tree input and output for SNAPP 1.1.5 implemented in BEAST 2.1.2.
  • smartPCA_Admixture: These two programs are grouped together because they share a common data format. Admixture was described by Alexander et al. (2009), smartPCA by Patterson et al. (2006). There is a folder with the R script used to create the input files, and separate folders for each analysis; R scripts for visualizing results are also provided.
  • DAPC: We used Discriminant Analysis of Principal Components in the R package adegenet 2.0.0 (Jombart et al. 2010) with the structure (.str) file output from pyRAD. The input file and an annotated R script are provided.
  • gphocs: Included are data files, control files, log files, and output files for G-PhoCS v1.2.3 (Gronau et al. 2011). Results are summarized in G-PhoCs_results_120715.xlsx.
  • Funding provided by: National Science Foundation
  • Crossref Funder Registry ID: http://dx.doi.org/10.13039/100000001
  • Award Number: DEB-1406589
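The Bayes Factor Delimitation entries above (starBEAST and beast2) rank competing species models by their marginal likelihood estimates. As a rough illustration of that ranking step only, the Python sketch below computes 2 ln(BF) for each model relative to the best-scoring one, assuming the log marginal likelihoods have already been estimated elsewhere. The function name `rank_models`, the model labels, and the numeric values are hypothetical placeholders for illustration; they are not taken from the dataset or its results files.

```python
def rank_models(log_mle):
    """Rank models by log marginal likelihood (highest first).

    Returns a list of (model_name, two_ln_bf) tuples, where two_ln_bf is
    2 * (best log marginal likelihood - this model's log marginal likelihood).
    The best model therefore has a value of 0.0.
    """
    if not log_mle:
        raise ValueError("log_mle must contain at least one model")
    best = max(log_mle.values())
    ranked = sorted(log_mle.items(), key=lambda item: item[1], reverse=True)
    return [(name, 2.0 * (best - value)) for name, value in ranked]


# Hypothetical log marginal likelihoods (placeholders, not real results).
example = {
    "model_A": -1000.0,
    "model_B": -1012.5,
    "model_C": -1100.0,
}

for name, two_ln_bf in rank_models(example):
    print(f"{name}\t2lnBF vs. best = {two_ln_bf:.1f}")
```

Under the commonly cited Kass and Raftery scale, 2 ln(BF) values above 10 are usually read as very strong support for the better model; whether that threshold matches the one applied in this study is an assumption here.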