S. ischiensis Organellar Genome Features
We report here the first complete plastid and mitochondrial genomes of S. ischiensis, assembled from whole-genome sequencing (WGS; combined HiSeq and MiSeq runs) of Nextera libraries prepared from total genomic DNA. The combined WGS runs generated over 150 million reads (152,144,044), with a minimum read length of 75 bp for the HiSeq run and 162 bp for the MiSeq run, quality scores >20, and an average sequencing depth of 39X across the two organellar genomes. Both genomes were circular-mapping, with the plastid genome at 138,101 bp (see Fig. 1) and the mitochondrial genome at 41,773 bp (see Fig. 2).
Like those of many other photosynthetic heterokonts, the S. ischiensis plastid genome consisted of a large single-copy region (LSC, 83,435 bp) and a small single-copy region (SSC, 43,822 bp) separated by two inverted repeat regions (5,422 bp each; see Additional file 4: Table S2), and was intermediate in genome size. Across lineages, gene numbers ranged from 141 to 192, protein-coding genes from 127 to 156, and tRNAs from 27 to 34, with most taxa having six rRNAs (see Additional file 4: Table S2). Many of these features were intermediate for S. ischiensis, consistent with its phylogenetic position in the gyristan heterokont tree of life [19, 23].
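As a quick consistency check on the quadripartite structure described above, the reported region lengths can be summed and compared with the reported plastid genome size. The sketch below is illustrative only: the function name `quadripartite_length` is hypothetical, and it assumes that the 5,422 bp figure is the length of each of the two inverted repeat copies.

```python
# Region lengths (bp) as reported in the text for the S. ischiensis plastid genome.
LSC_BP = 83_435             # large single-copy region
SSC_BP = 43_822             # small single-copy region
IR_BP = 5_422               # one inverted repeat copy (assumed to be the per-copy length)
REPORTED_TOTAL_BP = 138_101 # reported plastid genome size


def quadripartite_length(lsc: int, ssc: int, ir: int, ir_copies: int = 2) -> int:
    """Total length of a circular genome made of LSC + SSC + ir_copies * IR."""
    return lsc + ssc + ir_copies * ir


total_bp = quadripartite_length(LSC_BP, SSC_BP, IR_BP)
ir_fraction = 2 * IR_BP / total_bp  # share of the genome occupied by both IR copies

print(f"Summed regions: {total_bp:,} bp")
print(f"Matches reported size: {total_bp == REPORTED_TOTAL_BP}")
print(f"Inverted repeats: {ir_fraction:.1%} of the genome")
```

Under that assumption the four regions sum to 138,101 bp, which agrees with the reported plastid genome size.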
The size, number of overlapping gene regions (10), and gene density of the S. ischiensis mitochondrial genome were more similar to those of its phaeophycean sister group than to those of unicellular heterokonts, whereas general mt genome architecture was very consistent across heterokonts and lacked this unicellular/multicellular pattern (see Additional file 4). The GC content of S. ischiensis (41%) was higher than that of most heterokonts, but gene content was similar (65 genes with 28 protein-coding, three rRNAs, 26 tRNAs and no ORFs; see Additional file 5: Table S3). Overall, many S. ischiensis mitogenome features were either intermediate in nature or more phaeophycean-like (see Additional file 5: Table S3).
Timing the Diversification of Heterokonts
Our mtDNA-based analysis robustly placed the emergence of gyristan heterokonts, and the split between the autotrophic Ochrophyta and the heterotrophic pseudofungal (oomycete) clades, in the first half of the Paleozoic (406-540 Ma; see Fig. 3, Additional file 6: Table S4a), with diversification of their respective lineages into the Mesozoic/Cenozoic. These estimates are consistent with available fossil data [see 9, 26], but in sharp contrast to a deep molecular clock study of the eukaryote tree of life [see 27]. We argue that our estimates more accurately reflect true heterokont diversification patterns because of our robust lineage representation (especially among the ochrophyte lineages). However, our estimates for heterotrophic oomycetes are limited to the Saprolegniales and Peronosporales, orders that are better studied because they contain important saprobic and pathogenic pseudofungal taxa. Our data place their divergence towards the end of the Mesozoic era (~100 Ma), consistent with previous oomycete clock estimates, fossil evidence and pathogenic plant associations [16].
For the autotrophic Ochrophyta, our mtDNA- and cpDNA-based analyses provide one of the first robust datings of the origin of the main ochrophyte clades first identified in the five-gene phylogeny of Yang et al. (2012) [19] and further resolved in the work of Derelle et al. (2016) [Derelle et al. 2016]. Our analyses (mtDNA, cpDNA) robustly captured the Khakista assemblage of the larger Diatomista clade (SIII clade of Yang et al. (2012)) and the core lineages of the Chrysista assemblage (SI-PX clades of Yang et al. (2012)), but gave inconsistent placements for some of the SII classes of Yang et al. (2012), such as the Pelagophyceae or Synurophyceae (Figs. 3 & 4, Additional file 6: Table S4 a & b). We therefore focus on the clades that were reliably resolved and discuss their origin and diversification patterns. We estimate that the Khakista assemblage diverged early in the Mesozoic (245 to 223 Ma, Figs. 3 & 4), with the lineages of Diatomeae (diatoms) diversifying into the Cenozoic, consistent with fossil evidence and theoretical ocean conditions [9, 20] (Figs. 3 & 4, Additional file 6: Table S4b). The core Chrysista assemblages started diversifying in the middle of the Paleozoic (~366 Ma, Fig. 4), with the lineages of Yang's PX clade (Xanthophyceae, Phaeophyta) diverging towards the beginning of the Mesozoic (~252-221 Ma, Figs. 3 & 4, Additional file 6: Table S4 a & b). The morphologically complex Phaeophyceae diverged towards the end of the Mesozoic (136-116 Ma), with major orders like the kelps (Laminariales) diversifying into the Cenozoic (Figs. 3 & 4, Additional file 6: Table S4 a & b). These estimates are consistent with other molecular clock data and fossil evidence that place the diversification of important phytoplankton and coastal lineages in the Mesozoic to early Cenozoic [9, 11, 27].
Emergence of Oceanic and Coastal Benthic Ecosystems
Autotrophic gyristan heterokonts (Ochrophyta), with their red-algal-derived plastids, include many of the key lineages forming today's phytoplankton and coastal benthic marine ecosystems [e.g., 9, 11, 27]. Thus, accurately dating their diversification patterns is key to understanding the emergence and formation of these critical ecosystems. Robust fossil evidence and physical data for ocean environments paint a picture of phytoplanktonic ecosystems shifting from communities dominated by green lineages to lineages with red-algal-derived plastids after the late Permian extinction event, when oceans shifted to more oxygen-rich, stable environments [see 9]. Theoretically, these lineages radiated late in the Mesozoic and into the Cenozoic, forming modern phytoplankton communities dominated by diatoms, coccolithophores and dinoflagellates. Our estimates for diatom diversification patterns are consistent with this evolutionary picture and support the emergence of modern phytoplankton communities after the Permian extinction, radiating into the Cenozoic (see Figs. 3 & 4, Additional file 6: Table S4 a & b).
Fossil evidence for the plant components of the world's coastal marine ecosystems is much less robust. However, when combined with physical estimates of ocean conditions (e.g., oxygen levels), it points to their emergence and diversification towards the end of the Mesozoic and into the Cenozoic [see 9, 11]. Our molecular-clock-based estimates are consistent with these data and place the origin of the brown algae (Phaeophyceae) towards the end of the Mesozoic, with diversification of the major lineages (e.g., Laminariales, Fucales) into the Cenozoic (Figs. 3 & 4, Additional file 6: Table S4 a & b). These lineages are the main architects of benthic coastal communities like the kelp forests of the Pacific Northwest and the rockweed beds of the Atlantic seaboard. Our data place the origin of these ecosystems in the late Mesozoic to early Cenozoic, consistent with dates from other work on the evolution of the Phaeophyceae, the fossil record and the proposed nature of the oceans [9, 11, 26].
Dating the Last Transition to Multicellularity
Gyristan heterokonts include many lineages that shifted from unicellularity to some form of multicellularity, such as hyphal coenocytic habits (e.g., pseudofungi or xanthophytes) or full morphological complexity (brown algae/phaeophytes) [5, 6, 15, 16]. These lineages represent some of the most recent shifts to complexity in the eukaryotic tree of life, with the Phaeophyceae being the most recent to shift to complex multicellularity [see 5, 6, 12]. Our data from mitochondrial and plastid genes provide some of the first robust estimates for these events, placing the origin of the Schizocladiophyceae (filamentous habit) in the Mesozoic, some 300-500 million years after the origin of metazoans or red algae (see Figs. 3 & 4, Additional file 6: Table S4 a & b) [10]. The origin of the brown algae (Phaeophyceae) occurred about a hundred million years later (136-116 Ma), with their diversification and the emergence of brown-algal-dominated coastal marine ecosystems like kelp forests or Sargasso Sea beds extending into the Cenozoic, consistent with the cooling of the oceans and the expansion of shallow coastal environments.