Research Article

Genome streamlining in a minute herbivore that manipulates its host plant

School of Biological Sciences, University of Utah, United States
Laboratory of Agrozoology, Department of Plants and Crops, Faculty of Bioscience Engineering, Ghent University, Belgium
Department of Evolutionary and Population Biology, Institute for Biodiversity and Ecosystem Dynamics, University of Amsterdam, Netherlands
Department of Plant Biotechnology and Bioinformatics, Ghent University, Belgium
Center for Plant Systems Biology, VIB, Belgium
Department of Human Genetics, University of Utah School of Medicine, United States
Department of Plant and Environmental Sciences, University of Copenhagen, Denmark
Centre for Microbial Ecology and Genomics, Department of Biochemistry, Genetics and Microbiology, University of Pretoria, South Africa
Henry Eyring Center for Cell and Genome Science, University of Utah, United States

Oct 23, 2020

https://doi.org/10.7554/eLife.56689

Open access
Copyright information

Version of Record

Accepted for publication after peer review and revision.

Download
Cite
Share
CommentOpen annotations (there are currently 0 annotations on this page).

Version of Record published: December 15, 2020 (This version)
Accepted Manuscript published: October 23, 2020 (Go to version)
Accepted: October 22, 2020
Received: June 4, 2020

1. Related to
Genome Reduction: Does size really matter?

David G Heckel

Insight Dec 8, 2020
Further reading

Abstract
eLife digest
Introduction
Results
Discussion
Materials and methods
Data availability
References
Article and author information
Metrics

Abstract

The tomato russet mite, Aculops lycopersici, is among the smallest animals on earth. It is a worldwide pest on tomato and can potently suppress the host’s natural resistance. We sequenced its genome, the first of an eriophyoid, and explored whether there are genomic features associated with the mite’s minute size and lifestyle. At only 32.5 Mb, the genome is the smallest yet reported for any arthropod and, reminiscent of microbial eukaryotes, exceptionally streamlined. It has few transposable elements, tiny intergenic regions, and is remarkably intron-poor, as more than 80% of coding genes are intronless. Furthermore, in accordance with ecological specialization theory, this defense-suppressing herbivore has extremely reduced environmental response gene families such as those involved in chemoreception and detoxification. Other losses associate with this species’ highly derived body plan. Our findings accelerate the understanding of evolutionary forces underpinning metazoan life at the limits of small physical and genome size.

eLife digest

Arthropods are a group of invertebrates that include insects – such as flies or beetles – arachnids – like spiders or scorpions – and crustaceans – including shrimp and woodlice. One of the tiniest species of arthropods, measuring less than 0.2 millimeters, is the tomato russet mite Aculops lycopersici. This arachnid is among the smallest animals on Earth, even smaller than some single-celled organisms, and only has four legs, unlike other arachnids. It is a major pest on tomato plants, which are toxic to many other animals, and it feeds on the top cell layer of the stems and leaves. Tomato growers need a way to identify and treat tomato russet mite infestations, but this tiny species remains something of a mystery.

One way to tackle this pest may be to take a closer look at its genome, as this could reveal what genes the mite uses to detoxify its diet. Examining the mite’s genome could also reveal information about how evolution handles creatures becoming smaller. An area of particular interest is the overall size of its genome. Not all of the DNA in a genome is part of genes that code for proteins; there are also sections of so-called ‘non-coding’ DNA. These sequences play important roles in controlling how and when cells use their genes. In the human genome, for example, just 1% of the DNA codes for protein. In fact, most human protein-coding genes are interrupted by sequences of non-coding DNA, called introns.

Here, Greenhalgh, Dermauw et al. sequence the entire tomato russet mite genome and reveal that not only is the mite's body size miniature: these tiny animals have the smallest arthropod genome reported to date, almost a hundred times smaller than the human genome. Part of this genetic miniaturization seems to be down to massive loss of non-coding DNA. Around 40% of the mite genome codes for protein, and 80% of its protein coding genes contain no introns. The rest of the miniaturization involves loss of genes themselves. The mites have lost some of the genes that determine body structure, which could explain why they have fewer legs than other arachnids. Additionally, they only carry a small set of genes involved in sensing chemicals and clearing toxins, which could explain why they are mostly found on tomato plants.

Greenhalgh, Dermauw et al.’s findings shed light on what may happen to the genome at the extremes of size evolution. Sequencing the genomes of other mites could reveal when in evolutionary history this genetic miniaturization occurred. Furthermore, a better understanding of the tomato russet mite genome could lead to the development of methods to detect the infestation of plants earlier and be highly beneficial for tomato agriculture.

Introduction

The free-living microarthropod Aculops lycopersici (Tryon) belongs to the superfamily of the Eriophyoidea (Arthropoda: Chelicerata: Acari: Acariformes) that harbors the smallest plant-eating animals on earth (Keifer, 1946; Navia et al., 2010; Sabelis and Bruin, 1996). Eriophyoids are known by many names including gall, blister, bud, and rust mites, depending on the type of damage they cause (Hoy, 2004). Since the 1930s, the tomato russet mite A. lycopersici has been reported as a minor pest of cultivated tomato (Solanum lycopersicum L.) worldwide (Massee, 1937). For unknown reasons, it has emerged in recent years as a significant pest of tomatoes in European greenhouses (Moerkens et al., 2018). While it is extremely small – only ~50 μm wide and 175 μm in length (Figure 1a,b) – it can reach high population densities (Figure 1c). The damage it causes to plants superficially resembles that of microbial disease (Figure 1d), for which it is often misdiagnosed, and controlling it is troublesome (Gerson and Weintraub, 2012; Van Leeuwen et al., 2010).

Figure 1

Download asset Open asset

The tomato russet mite *Aculops lycopersici* is a devastating pest of tomato.

(a) Habitus of the eriophyoid mite *A. lycopersici.* Male (left) and female (right) mites are slender, worm-like animals bearing, in contrast to non-eriophyoid mites with four pairs of legs, only two pairs of small legs (indicated by L1 and L2). (b) Low temperature (LT) - scanning electron microscopy (SEM) image of *A. lycopersici* on a leaf of *S. lycopersicum*. (c) *A. lycopersici* populations can rapidly build to extremely large numbers on tomato stems and leaves. (d) *A. lycopersici* damage of heavily infested tomato plants is shown. Scale bars in panels a and b represent 0.05 mm.

The mite feeds on plant epidermal cells (Royalty and Perring, 1988), which are relatively low in nutrients, with needle-shaped mouth parts (stylets) that allow the transfer of saliva and the uptake of cell contents (Nuzzaci and Alberti, 1996). The first visible signs of a russet mite infestation are a rapid local collapse of the leaf hairs (trichomes) on the stem, leaflet or petiole upon which the mites are feeding (van Houten et al., 2013). This is followed by withering and necrosis of infested leaves, which ultimately leads to a bronzed or russet color, from which the mite owes its name (Jeppson et al., 1975; Kawai and Haque, 2004). Although it is now a global pest on tomato, it can survive on many related solanaceous plants (nightshade family) such as potato, tobacco, petunia, nightshade, and various peppers (Perring and Farrar, 1986), as well as on a few hosts outside the nightshade family (Perring and Royalty, 1996; Rice and Strong, 1962).

The Eriophyoidea belong to the Chelicerata, a subphylum of Arthropoda which includes spiders, scorpions, ticks, and mites. The Eriophyoidea consists of three families – Phytoptidae, Eriophyidae (or eriophyids, to which A. lycopersici belongs), and Diptilomiopidae, and comprises 357 herbivorous genera found on more than 1800 different plant species (Oldfield, 1996; Zhang, 2011). Eriophyoids are known to manipulate host plant resource allocation and resistance, and many species do so by inducing the formation of plant galls (de Lillo et al., 2018), possibly by secreting molecular mimics of plant hormones in their saliva (De Lillo and Monfreda, 2004; de Lillo and Skoracka, 2010). Although A. lycopersici is not a gall-inducing species, it nevertheless manipulates the defense mechanisms of its tomato host to its benefit. Through an unknown mechanism during feeding, this mite suppresses the jasmonic acid (JA) signaling pathway (Glas et al., 2014; Schimmel et al., 2018). This blocks the ability of the tomato host plant to produce defensive metabolites and proteins against herbivorous insects and mites (Alba et al., 2015; Howe and Jander, 2008), thereby rendering the plant defenseless. The consequences of suppressing host defenses for the herbivore’s selective environment may be variable depending on the degree of host specialization (Blaazer et al., 2018; Kant et al., 2015) but for mite species that can feed on multiple hosts, there are indications of a trade-off between the ability to suppress defenses and the ability to cope with xenobiotics (Kant et al., 2008; Wybouw et al., 2015). Many species of eriophyoid mites cause little damage to their hosts (Jeppson et al., 1975), or alternatively induce damage indirectly as vectors of pathogens (Navia et al., 2013). In contrast, while A. lycopersici is not known to vector plant diseases, its ability to alter the chemistry and morphology of tomato severely weakens the plants, which are then overwhelmed and killed by exponentially growing A. lycopersici populations (Figure 1c,d; Perring, 1996).

In addition to being a priority pest of tomato, A. lycopersici and related eriophyoids are among the most extreme examples of miniaturization in arthropods. As one of the smallest documented animal species (Bailey and Keifer, 1943), with dimensions smaller than some single-celled organisms (Polilov, 2015), it is not surprising that A. lycopersici has a derived morphology. Compared to almost all adult arachnids outside of the Eriophyoidea, which have a body plan with eight legs, A. lycopersici has only four legs (Figure 1a,b). Further, reproductive structures, which are located at the terminal end in other mites, are positioned in the central ventral region (Nuzzaci and Alberti, 1996). This type of morphology has resulted in altered reproductive behavior wherein males, instead of direct insemination, deposit spermatophores (packets of sperm) in the environment that are subsequently picked up by females (Al-Azzazy and Alhewairini, 2018; Oldfield and Michalska, 1996). Despite these morphological and behavioral innovations, A. lycopersici retains the haplodiploid mechanism of sex determination characteristic of many other mite species (Anderson, 1954). Further, female A. lycopersici mites can lay up to four eggs per day, and the generation time is as little as 5 days under optimal conditions (Kawai and Haque, 2004; Rice and Strong, 1962). These features, which resemble those of other agriculturally important mite herbivores, result in rapid overexploitation of the host plant and have undoubtedly contributed to the importance of this species as a field and greenhouse pest of tomato.

Here, we present the genome of A. lycopersici, the first for an eriophyoid mite. At only 32.5 Mb, it is the smallest arthropod genome reported to date (Grbić et al., 2011; Waldron et al., 2017). As revealed by contrasting the genomic architecture of the tomato russet mite with other sequenced arthropods, including the two-spotted spider mite Tetranychus urticae (Grbić et al., 2011), a generalist herbivore often found in co-infestations alongside A. lycopersici (Glas et al., 2014), we elucidate mechanisms underlying dramatic genome reduction. In particular, we observed typical features of streamlined genomes (Arkhipova, 2018; Hessen et al., 2010a), including a marked reduction in the distance between adjacent genes, and few repetitive sequences. Massive loss of introns was apparent. Moreover, reductions in specific genes and gene families, such as environmental response genes, associate with A. lycopersici’s ability to suppress host plant defenses as well as its derived morphology. The genome therefore sheds light not only on mechanisms of extreme metazoan genome reduction, but also on the interplay between gene content and the lifestyle of small herbivores that manipulate their environment.

Results

Genome size, assembly, and annotation

We assembled the genome of A. lycopersici into seven scaffolds of cumulative length 32.53 Mb, of which 99.98% is represented on scaffolds 1–5 of lengths 12.44, 10.50, 3.66, 3.57 and 2.36 Mb, respectively. The remaining two scaffolds are each <6 kb in length, in addition to a mitochondrial genome scaffold. The observed assembly length is similar to the length estimated by a k-mer analysis with genomic sequence reads (34.81 Mb). Separate genome completeness estimates with CEGMA (Parra et al., 2007) and BUSCO (Simão et al., 2015) located 90.7% and 86.0% of the expected core eukaryotic genes, respectively; these values are within the same range as those for T. urticae, the only other sequenced chelicerate herbivore, and for which a high-quality Sanger assembly is available (95.16% and 92.07%, respectively). As an additional assessment of completeness, we generated a de novo assembly of the A. lycopersici transcriptome using deep, paired-end Illumina RNA-seq reads derived from mixed sex and developmental stages, and aligned it to the genome sequence. We found that 98.2% of transcript contigs could be located on the reference sequence. Of the remaining 243 unplaced transcript sequences, only eight had similarity to known arthropod sequences; the others had homology to bacterial, fungal, or plant sequences, or lacked homology to sequences in existing databases.

Features of extreme genome reduction in A. lycopersici

Annotation of the A. lycopersici genome by automated methods, coupled with extensive manual curation, revealed only 10,263 protein-coding genes. As assessed against other mite genomes, including T. urticae, Dermatophagoides pteronyssinus (the European house dust mite) (Waldron et al., 2017), and Metaseiulus occidentalis (a phytoseiid predatory mite) (Hoy et al., 2016), as well as the Drosophila melanogaster and human genomes, several features of genic organization in A. lycopersici stand out (Table 1). The fraction of the genome comprising coding sequence is highest in A. lycopersici, and the distance between genes is the lowest. Associated with the compact genic landscape of A. lycopersici (Figure 2 and Figure 2—figure supplements 1–6), the percentage of the genome consisting of transposable elements was merely 1.54%, which is more than fourfold less than that observed in several other mite genomes, or in the insect D. melanogaster (Figure 2—figure supplement 1, Supplementary file 1 — ‘Table S1’ Tab). Nevertheless, sequences homologous to the major classes of transposable elements, such as DNA transposons, including Helitrons, as well as both long terminal repeat (LTR) and non-LTR retrotransposons, were detected (Supplementary file 1 — ‘Table S1’ Tab and ‘Table S2’ Tab). Across the A. lycopersici genome, extended regions of low genic composition and high TE density were not observed (Figure 2—figure supplement 2), consistent with the purported holocentric chromosome architecture (lack of regional centromeres) of eriophyoid mites (Helle and Wysoki, 1996).

Figure 2 with 6 supplements see all

Download asset Open asset

Number of conserved introns and intron loss rate across 18 metazoan species.

(a) Phylogenetic tree built from 147 single copy orthologues (left; numbers at nodes indicate bootstrap support), and a histogram of introns present at 29,447 conserved positions identified by the software package Malin (right). (b) Phylogenetic tree with branch lengths labeled and scaled to the intron loss rate calculated by Malin. The unedited tree in both panels is given in Figure 2—figure supplement 3, and was, together with 2371 orthologous protein clusters (Supplementary file 2), used as input for Malin. (c) Alignment of *A. lycopersici* aculy03g01320 (which encodes an ADP-ribosylation factor-like 8, or Arl8, protein) with single copy orthologues from five other mite and insect species as indicated. Analogous positions of phase 0, 1, and 2 introns are denoted by colored triangles (legend, bottom right), with amino acids at the analogous intronic positions indicated beneath (identity, similarity, and non-similarity are indicated by ‘*', ‘:', and ‘.', respectively, for aculy03g01320 and its orthologue from *D. pteronyssinus,* the most closely related genome; in descending order, the sequence identifiers are aculy03g01320.1, g8154.t1, tetur10g00460, rna18006, BGIBMGA010943-RA, and FBtr0339723). The letter ‘E’ indicates that this intron position is conserved across other model organisms in Eukaryota; *Dictyostelium purpureum* (GenBank Accession XM_003283650), *C. elegans* (NM_070390.9), *H. sapiens* (NM_018184.3), *Monosiga brevicollis* (XM_001744342.1), and *A. thaliana* (NM_114847.5). (d) Local protein alignment, after panel c, revealing a candidate imprecise intron loss event in *aculy04g10480* (which encodes a polymerase delta-interacting protein) in *A. lycopersici* (insertion of S and N amino acid residues, top). Numbers denote positions in the *A. lycopersici* orthologue; sequence identifiers, in descending order, are aculy04g10480.1, g5664.t1, tetur01g12540, rna9399, BGIBMGA013121-RA, and FBtr0078681. Panels (c) and (d) are drawn based on Malin output. Other findings for intronic features and factors contributing to *A. lycopersici*’s genome reduction, and the supporting analyses, are presented in Figure 2—figure supplements 1, 2, 4, 5 and 6.

Table 1

Genome metrics for A. lycopersici, other mite species, D. melanogaster and H. sapiens.

Species	Genome size (Mb)	PCG*	% intronless^†	Coding %^‡	Intergenic %^§	Intronic %^¶	Intergenic M	Intron M
A. lycopersici	32.53	10,263	83.67	42.26	45.12	12.62	538 bp	170 bp
D. pteronyssinus	70.76	12,530	25.29	35.26	46.00	18.73	542 bp	75 bp
T. urticae	90.83	19,086	18.26	22.10	54.12	23.78	1302 bp	94 bp
M. occidentalis	151.90	17,310	24.97	15.25	59.14	25.61	2035 bp	135 bp
D. melanogaster	143.73	13,931	16.37	15.60	57.37	27.03	1228 bp	69 bp
H. sapiens	3088.27	19,636	6.74	1.10	68.14	30.77	23,279 bp	1,505 bp

*PCG: protein coding genes.

^†Percent coding genes with no introns.
^‡Percentage of genome in coding regions.

^§Percentage of genome in between genes.
^¶Percentage of genome in introns.

M = Median. See ‘Genome metric calculations’ in Materials and methods and Table 1—source data 1 for more information.

Table 1—source data 1 GFF3 annotation file of the A. lycopersici genome.: https://cdn.elifesciences.org/articles/56689/elife-56689-table1-data1-v2.zip
Download elife-56689-table1-data1-v2.zip

We also observed that the A. lycopersici genome has only 3057 introns in coding sequences (CDS introns), which is more than an order of magnitude fewer than the 44,881 in the 90 Mb T. urticae genome, and the 35,841 in the 70.8 Mb D. pteronyssinus genome. Strikingly, nearly 84% of A. lycopersici protein coding genes were intronless, which is more than threefold higher than observed for the other mite species we analyzed, and more than fivefold higher than for D. melanogaster (Table 1). To further investigate the dynamics of intron evolution, we evaluated patterns of intron gain and loss in orthologous genes among A. lycopersici and 17 other animal genomes using the Malin analysis pipeline (Csurös, 2008; Figure 2, and Figure 2—figure supplements 3 and 4, and Supplementary file 2). At 29,447 conserved intron sites (Figure 2a), A. lycopersici has a mere 207 introns. This is an ~11 fold reduction from that seen in the species with the next lowest counts, the European house dust mite D. pteronyssinus, at 2292. Apart from A. lycopersici, Acari intron loss rates were broadly similar to those observed for other arthropods, except for M. occidentalis, for which high rates of both intron loss and gain were apparent, a finding previously reported (Hoy et al., 2016). However, the rate of intron loss in A. lycopersici was higher than observed in M. occidentalis (Figure 2b), and in contrast to M. occidentalis, intron gains were minimal (Figure 2—figure supplement 4). The only evidence for retention of the minor spliceosome in A. lycopersici comes from the presence of a single canonical U12 (minor) intron in the gene aculy03g00270 that encodes an ultra-conserved calcium channel (splice sites AT-AC in intron one of length 12.5 kb). Splicing of this large intron is supported by RNA-seq read alignments, and the orthologous intron one of the T. urticae orthologue of this gene is one of the three U12 introns documented previously in T. urticae (Grbić et al., 2011).

Although relatively few conserved introns are present in the A. lycopersici genome, they exhibit a bias toward 5’ gene ends (Figure 2—figure supplement 5), and compared to most arthropods, the median intron length is larger (Table 1 and Figure 2—figure supplement 6). In a single copy (orthologous) gene set for which introns were lost in A. lycopersici, but conserved in five other closely related or high-quality mite or insect genomes (see Materials and methods), the impact of intron loss on A. lycopersici-encoded protein sequences was generally minimal. In fact, in the respective protein sequences spanning 97 of 100 A. lycopersici-specific intron loss events (97%), multi-species alignments did not reveal insertions or deletions (indels) of amino acid residues (e.g. Figure 2c, and Supplementary file 1 — ‘Table S3’ Tab and Supplementary file 3); for the remaining few cases (3%), the respective sites of loss events in A. lycopersici were coincident with the gain or loss of one or several amino acid residues (e.g. Figure 2d). Within this gene set, similar findings were apparent for the larger number of A. lycopersici intron losses as compared to intron sites conserved between the two closest relatives (D. pteronyssinus and T. urticae; Supplementary file 3). Despite striking examples of intronless genes arising from the loss of multiple conserved introns, as for aculy03g01320 (Figure 2c), some A. lycopersici genes have both lost and retained arthropod conserved introns (i.e. aculy02g00250, aculy03g02140, and aculy01g28080, Supplementary file 3).

Gene family contractions predominate in A. lycopersici

As revealed by the clustering algorithm implemented in the CAFE software (Han et al., 2013), A. lycopersici exhibits one of the highest rates of gene family contractions (1725), and by far the lowest rate of gene family expansions (206), among the 18 metazoans we analyzed (Figure 3; input data for the analysis are provided in Supplementary file 4 and Supplementary file 5). It also has the lowest average expansion per gene family (Supplementary file 1 — ‘Table S4’ Tab). Of the 105 gene families that were identified as ‘rapidly evolving’ in A. lycopersici, only four – as represented by orthogroups (OGs) OG0000007 (containing an Asteroid domain: IPR026832), OG0000546 (containing a Major Facilitator Superfamily, or MFS, domain: IPR011701), OG0000583 (containing a Troponin domain: IPR001978), and OG0002260 (hypothetical proteins) – were identified as expanding. The remaining 101 families were all identified as contracting (Supplementary file 1 — ‘Table S5’ Tab). Six of these contracting families did contain more than 10 members in A. lycopersici (OG0000000, containing a Zinc finger C2H2-type domain: IPR013087; OG0000003, containing a Homeobox domain: IPR001356; OG0000005, containing a Serine protease, trypsin domain: IPR001254; OG0000014, containing a Cytochrome P450 domain: IPR001128; OG0000015, containing a G-protein-coupled receptor, rhodopsin-like domain: IPR000276; G0000025, containing a Homeobox domain: IPR001356) and, except for OG0000014 containing members of the P450 family, which is known to have only few orthologous relationships (Feyereisen, 2011), on average 72.2% of retained A. lycopersici genes had an orthologue in the majority of chelicerate species (Supplementary file 1 — ‘Table S6’ Tab). Further, among the 101 rapidly contracted gene families we identified families previously implicated in mite and insect xenobiotic detoxification (Dermauw et al., 2013a; Dermauw et al., 2013b; Snoeck et al., 2018; Van Leeuwen and Dermauw, 2016) – carboxyl/choline esterases (CCEs: OG0000021 and OG0001201), cytochrome P450 monooxygenases (CYPs: OG0000014, OG0000030 and OG0000052), glutathione-S-transferases (GSTs: OG0000102, OG0000124), short-chain dehydrogenases/reductases (SDRs: OG0000096), ATP-binding cassette (ABC) transporters (ABCs: OG0000051 and OG0000109) and MFS proteins (OG0000029, OG0000071, OG0000099, OG0000187) (Supplementary file 1 — ‘Table S5’ Tab and ‘Table S7’ Tab). Given the role of these families in herbivory and host plant use (Després et al., 2007; Heckel, 2014; Van Leeuwen and Dermauw, 2016), we analyzed a selection of these gene families in detail (see the following sections).

Figure 3

Download asset Open asset

CAFE analysis of 6487 metazoan orthogroups.

The number of expanding orthogroups are indicated in green font, while contracting orthogroups are indicated in red font. The number of rapidly expanding or contracting orthogroups (p-value<0.05) is shown in parentheses and details regarding these orthogroups can be found in Supplementary file 1 — ‘Table S5’ Tab and ‘Table S7’ Tab.

We also found 315 orthogroups with no members in A. lycopersici but at least one member in all other arthropod species. This is the highest number of absent orthogroups of all arthropods included in our analysis, is ~2-fold more than those lacking in D. pteronyssinus (171), and more than threefold those absent in T. urticae (101) (Supplementary file 1 — ‘Table S8’ Tab). A gene ontology (GO) enrichment analysis for D. melanogaster members within these conserved arthropod orthogroups without A. lycopersici members revealed that N-acetylglucosamine metabolic process (GO:0006044), transferase activity (GO:0016740) and Golgi apparatus (GO:000579) were the most highly significantly enriched GO terms within the Biological Process, Molecular Function and Cellular Component GO categories, respectively (Supplementary file 1 — ‘Table S9’ Tab). Lastly, we found that 427 D. melanogaster essential genes (Aromolaran et al., 2020) coded for members of 390 orthogroups. Forty-eight of these essential orthogroups did not have members within the Acariformes, the mite superorder comprising A. lycopersici, D. pteronyssinus, and T. urticae, while 21 (5.4%) orthogroups were absent in A. lycopersici but present in other acariform mites (Supplementary file 1 — ‘Table S10’ Tab).

Furthermore, in a number of cases, orthogroups absent in A. lycopersici harbor conserved genes with potential roles in the development of tissues or structures that are absent or modified in the russet mite relative to other chelicerates or insects (see also Discussion, and Results section, ‘Loss of highly conserved transcription factors’). For instance, orthologues of Drosophila unkempt, a known developmental regulator, and Drosophila dachs, essential for appendage growth, are both absent in A. lycopersici but present in all other arthropods (OG0002898 and OG0006002, respectively). Dachs is known to interact with four-jointed (Buckles et al., 2001), which is also absent in A. lycopersici, even though it is present in all insect and chelicerate species included in our analysis (OG0003305). Finally, fat belongs, together with dachs and four-jointed, to the Fat/Hippo pathway and plays a key-role in tissue proliferation and development in both invertebrates and vertebrates (Simon et al., 2010). Although dachsous, another player in this pathway, is present (aculy04g02000 in OG0001018), a fat orthologue could not be identified in A. lycopersici while this orthologue was found in other acariform mites (OG0000383, Supplementary file 1 — ‘Table S7’ Tab).

Detoxification genes

We curated the A. lycopersici genome for sequences encoding established detoxification enzymes (Després et al., 2007; Heckel, 2014; Van Leeuwen and Dermauw, 2016) including GSTs, CCEs, and CYPs. In A. lycopersici, detoxification gene families are especially reduced, with a mere 4 GSTs, 8 CCEs, and only 23 CYPs (Table 2, Figure 4a, and Figure 4—figure supplements 1, 2 and 3; Van Leeuwen and Dermauw, 2016). In particular, the number of GSTs and CCEs is remarkably low (see Discussion). This finding was corroborated by mining of the A. lycopersici transcriptome assembly (the 4 GSTs and 8 CCEs present in the genome assembly were also present in transcriptome assembly, with no other transcript contigs with homology to GSTs or CCEs identified). Of note, half of the GSTs and almost all (7 out of 8) CCE genes in A. lycopersici are evolutionarily conserved across chelicerates or arthropods (Figure 4—figure supplements 1 and 2). We also examined transporters of the ABC family and MFS proteins that have been implicated in detoxification responses in arthropod species, although transporters in both of these families have diverse other roles as well (de la Paz Celorio-Mancera et al., 2013; Dermauw et al., 2013a; Dermauw et al., 2013b; Dermauw and Van Leeuwen, 2014; Govind et al., 2010). In contrast to genes encoding ‘classic’ detoxification enzymes like CYPs, CCEs, or GSTs, dramatic reductions in ABC transporter genes were not observed. For example, A. lycopersici has 9 ABCC and 16 ABCG transporters, while 22 and 2 are present in M. occidentalis and 39 and 23 are present in T. urticae, respectively (Table 2, Figure 4—figure supplement 4). Further, in contrast to the trend for contractions of the classic detoxification gene families, we also observed two A. lycopersici expansions - comprising three orthogroups, OG0000024, OG0000546, and OG0006109 - of the MFS, which is involved in membrane-based transport of small molecules (Figure 4b, Figure 4—figure supplement 5; Pao et al., 1998; Yan, 2015).

Table 2

Detoxification enzyme (CYPs, GSTs, CCEs) and ABC transporter gene family size in A. lycopersici, T. urticae, M. occidentalis, and D. melanogaster.

Detoxification enzyme	A. lycopersici	T. urticae	M. occidentalis	D. melanogaster
CYPs (total)	23	78*	63	86
CYP2	1	38	16	7
CYP3	17	9	23	36
CYP4	2	26	19	32
Mito Clan	3	5	5	11
GSTs (total)	4	31	13	37
Delta/Epsilon	1	16	3	25
Mu	2	12	5	0
Omega	0	2	3	5
Sigma	0	0	0	1
Theta	0	0	0	4
Zeta	1	1	1	2
Unknown	0	0	1	0
CCEs (total)	8	69	44	35
Dietary class (clade A-C)	0	0	0	13
Hormone class
D (integument CCEs)	0	0	0	3
E (secreted beta-esterases)	0	0	0	2
F (dipteran JHEs^†)	0	0	0	3
F' (chelicerate JHEs)	0	2	1	0
Neurodevelopmental class
H (glutactins)	0	0	0	4
J (AChE)	1	1	1	1
J' (Acari-specific CCEs)	0	32	19	0
J'' (Acari-specific CCEs)	0	22	15	0
K (gliotactin)	1	1	1	1
L (neuroligins)	2	5	5	4
M (neurotactin)	1	1	0	1
U (unchar. conserv. clade in Acariformes/L. polyphemus)	2	3	0	0
I (unchar. conserv. clade in insects)	0	0	0	2
No clear clade assignment	1	2	2	1
ABCs (total)	44	103	55	56
ABCA	4	9	8	10
ABCB-FT^‡	3	2	1	4
ABCB-HT^§	1	2	4	4
ABCC	9	39	22	14
ABCD	2	2	4	2
ABCE	1	1	1	1
ABCF	3	3	3	3
ABCG	16	23	2	15
ABCH	5	22	6	3
Unknown	0	0	4	0
Total	79	281	175	214

Numbers and class/clade/subfamily assignments were derived from previous studies (Grbić et al., 2011; Wei et al., 2020; Wu and Hoy, 2016) and this study.

*Of the 81 T. urticae CYPs identified by Grbić et al., 2011, three CYP genes (tetur46g00150, tetur46g00170 and tetur47g00090) and tetur602g00010 were considered as allelic variants and a pseudogene, respectively, and one new full-length CYP gene (tetur01g13730) was identified in this study.
^†JHE, juvenile hormone esterases.

^‡FT, full transporter.
^§HT, half transporter.

Chemosensory and related receptors

To see if A. lycopersici’s specialized lifestyle has had a notable impact on chemoreception, we also exhaustively mined and annotated the A. lycopersici genome for gustatory receptors (GRs), degenerin/epithelial Na+ channels (ENaCs), ionotropic receptors (IR) and transient receptor potential (TRP) channels. Members of these four families have been previously documented to play important roles in sensing environmental (chemical) cues in other arthropod species (Damann et al., 2008; Hoy et al., 2016; Ngoc et al., 2016; Robertson et al., 2003; Rytz et al., 2013; Whiteman and Pierce, 2008). The GR family, which contains seven transmembrane spanning regions (Touhara and Vosshall, 2009) and is linked to the detection of sweet and bitter compounds (Silbering and Benton, 2010), was the most strongly reduced, with only two of these genes identified (Figure 4c, Figure 4—figure supplement 6), as opposed to the 447 intact GRs reported in T. urticae (Ngoc et al., 2016). Further, only four ENaCs are present in the A. lycopersici genome (Figure 4d, Figure 4—figure supplement 7). Members of this family have recently been shown or suggested to be chemoreceptors for diverse compounds in insects and mites, but some family members likely have highly conserved roles in acid sensing (Ben-Shahar, 2011; Silbering and Benton, 2010), as well as in the perception of mechanical or osmotic cues (Ben-Shahar, 2011; Zelle et al., 2013). Of the two ENaCs likely to play these conserved roles in T. urticae, one is in a well-supported clade with a single ENaC in the tomato russet mite (aculy04g09940) (Figure 4 , Figure 4—figure supplement 7).

Figure 4 with 20 supplements see all

Download asset Open asset

Gene family contractions and mini-expansions in *A. lycopersici.*

Maximum likelihood phylogenetic analysis of selected detoxification and chemosensory families among *A. lycopersici*, *T. urticae*, *M. occidentalis* and *D. melanogaster*. (a) Glutathione-S-transferases (GSTs); the different GST classes (zeta, theta, delta, epsilon, omega, mu, sigma) are indicated with arches. (b) Major facilitator superfamily (MFS). (c) Gustatory receptors (GRs). (d) Epithelial Na+ Channels (ENaCs). All trees are midpoint rooted and only topology is shown. Gustatory receptors for *D. melanogaster* as well as the species-specific class A and B expansions identified in *T. urticae* are collapsed for clarity. Only bootstrap values above 70 are shown. Phylogenetic reconstructions for gene families, or analyses of domain losses in *A. lycopersici* in arthropod conserved genes, are given in Figure 4—figure supplements 1–20. For panels a-d, the detailed versions for each tree, including sequence identifiers, can be found in Figure 4—figure supplements 1, 5, 6 and 7, respectively. The alignments used for phylogenetic inference can be found in Supplementary file 7.

The IR family, which has been linked to odorant detection (Joseph and Carlson, 2015), humidity and temperature sensing in D. melanogaster (Enjin et al., 2016), is markedly reduced in A. lycopersici compared to most insects and M. occidentalis (Hoy et al., 2016). However, the numbers are similar to those in T. urticae (each has four putative IRs with strong bootstrap support), including homologues of the highly conserved IR25a and IR93a receptors (Figure 4—figure supplement 8). Interestingly, A. lycopersici may have as few as six ionotropic glutamate receptors (iGluRs), as compared to 14 in T. urticae (Figure 4—figure supplement 8); proteins in this family are related to IRs, but have ultra-conserved roles in synaptic transmission in animals (Benton et al., 2009).

Finally, we found both expansions and contractions of the TRP family (Figure 4—figure supplement 9). Like the other sequenced herbivorous mite, T. urticae, no orthologue of TRPA1 was located, but orthologues for TRPgamma, NopmC, and TRPML are present, with three copies of NopmC as compared to T. urticae's two. Unlike T. urticae, members of the TRPP and TRPM clades were completely absent in the russet mite, but strikingly, two putative members of the TRPV clade (Inactive and Nanchung), previously thought to be lost in mites and ticks (Peng et al., 2015; Regier et al., 2010), appear to be present.

Loss of highly conserved transcription factors

Among two vertebrates, one nematode and the 15 arthropod species we analyzed, A. lycopersici has the lowest number (364) of transcription factor (TF) genes (Supplementary file 1 — ‘Table S11’ Tab). Nevertheless, when accounting for the total number of genes by species, the TF fraction in A. lycopersici (3.55%) is higher than that of T. urticae (2.98%), and is within the range reported for metazoan animals (4.7% ±1.4, Charoensawan et al., 2010). However, a lower number of the PFAM TF domains Zinc finger (zf-C2H2 and zf-CCHC), Forkhead, Homeobox, Hormone (nuclear) receptor, HLH, bZIP_2 and T-box were found in A. lycopersici compared to all other species included in our analysis (Supplementary file 1 — ‘Table S11’ Tab). In addition, A. lycopersici orthologues of the Hairy Orange protein family (hey, cwo and deadpan) have lost the Hairy Orange domain (Figure 4—figure supplement 10), while an orthologue of D. melanogaster SoxNeuro could not be identified in A. lycopersici despite being present in the spider and Acari genomes examined (Figure 4—figure supplement 11). Among nuclear receptors (NRs), we identified eight canonical NRs in the A. lycopersici genome (E78, HR3, EcR, two RXRs, ERR, FTZ-F1, HR96) that contained both a DNA-binding domain (DBD) and a ligand-binding domain (LBD). However, no homologues of the evolutionary conserved NRs HNF4, HR39, HR78, and HR83 (Bodofsky et al., 2017; Bonneton and Laudet, 2012), nor a homologue of the T. urticae Photoreceptor-specific NR (PNR), were detected in the A. lycopersici genome, even though HR78, HNF4, and PNR are present in D. pteronyssinus (Supplementary file 1 — ‘Table S12.1’ Tab and ‘Table S12.2’ Tab). Further, for six nuclear receptors (E75, DSF, HR4, HR38, HR51, and SVP) that are evolutionary conserved across arthropods and normally have a canonical (DBD+LBD) structure (Fahrbach et al., 2012; Grbić et al., 2011; Hwang et al., 2014; Litoff et al., 2014), an LBD was not predicted for the respective A. lycopersici homologues. LBDs for all of these except HR4 were predicted for both the D. pteronyssinus and T. urticae homologues (Supplementary file 1 — ‘Table S12.1’ Tab and ‘Table S12.2’ Tab, Figure 4—figure supplements 12–17).

The basic helix-loop-helix (bHLH) gene family is an ancient family found in fungi, plants, and animals, and members of this family are essential both for organisms to respond to environmental factors, as well as for cellular differentiation during development (Skinner et al., 2010). The D. melanogaster achaete and scute bHLH genes play crucial roles in bristle development (García-Bellido and de Celis, 2009). Within the bHLH family group we found that T. urticae, M. occidentalis and I. scapularis have five bHLH proteins with an achaete-scute InterPro domain (IPR015660), while only three were found in both D. pteronyssinus (g4111.t1, g7028.t1 and g6164.t1) and A. lycopersici (aculy01g18470, aculy01g18540 and aculy02g28230).

A number of other specific transcription factors that are highly conserved among most arthropods are also absent from the A. lycopersici genome. For A. lycopersici, we were unable to identify proboscipedia, a member of the Hox gene family. Members of this family (labial, proboscipedia, Hox3/zen, Deformed, Sex combs reduced, fushi tarazu, Antennapedia, Ultrabithorax, abdominal-A, and Abdominal-B) encode homeodomain transcription factors and act to determine the identity of segments along the anterior–posterior axis in arthropods (Hughes and Kaufman, 2002). Proboscipedia is present in all chelicerate genomes (horseshoe crab, scorpions, spiders, mites and ticks) for which Hox genes have been analyzed (Figure 5, Supplementary file 1 — ‘Table S13.1’ Tab and ‘Table S13.2’ Tab, Supplementary file 6; Di et al., 2015; Hoy et al., 2016; Kenny et al., 2016; Schwager et al., 2017), and is believed to be ancestral to all arthropods (Pace et al., 2016). Of particular note, proboscipedia is located in close proximity (<35 kb) of labial in Acariformes, but in Aculops labial was the only Hox gene that was present on scaffold 2 (Supplementary file 1 — ‘Table S14’ Tab). Furthermore, A. lycopersici lacks a homologue of the T-box encoding gene org-1 (Figure 4—figure supplement 18), which in D. melanogaster plays a pivotal role in diversification of circular visceral muscle (Schaub and Frasch, 2013). Finally, we also mined the A. lycopersici genome for transcription factors and other genes involved in circadian rhythm (so-called ‘clock’ genes) (Supplementary file 1 — ‘Table S15’ Tab). Orthologues of the helix-loop-helix TFs cycle, Clock and tango and the bZIP TF vrille were identified in the A. lycopersici genome. However, we did not identify period and timeless, known negative regulators of Clock and cycle (Lee et al., 1999; Peschel and Helfrich-Förster, 2011). Other circadian regulators, like the circadian photoreceptor cryptochrome and the bZIP TF PAR-domain protein 1ε, were also not identified, even though these are present in T. urticae (Hoy et al., 2016).

Figure 5

Download asset Open asset

Hox genes in Acari and other ecdysozoan lineages.

Hox orthology groups are indicated by different colored boxes. Gray boxes with a dashed outline represent missing Hox genes. Some species have duplications of Hox genes and these are indicated by multiple boxes that overlap. *T. castaneum*, *H. dujardini* and *C. elegans* were selected as representative species for the Hox gene clusters of Insecta, Tardigrada and Nematoda, respectively.

Horizontally transferred genes

We identified 18 putatively intact horizontal gene transfer (HGT) candidate genes (Supplementary file 1 — ‘Table S16’ Tab), and performed subsequent phylogenetic analyses that suggested that nine were acquired from a foreign source. Seven of these genes code for UDP-glycosyltransferases (UGTs), members of which have well documented roles in xenobiotic detoxification (Snoeck et al., 2019). Phylogenetic inference with all T. urticae¸ D. pteronyssinus and A. lycopersici UGTs (80, 27, and 7, respectively) indicated that the seven UGTs in the tomato russet mite genome were the result of a lineage-specific expansion (Figure 4—figure supplement 19). Although we did not observe a clear phylogenetic signature of HGT (Wybouw et al., 2016), our phylogenetic reconstruction is consistent with previous studies which indicated that, prior to the formation of the Acariformes lineage, an ancestral mite species laterally acquired a UGT gene copy from a bacterial source (Ahn et al., 2014; Wybouw et al., 2018).

Two intact genes of bacterial origin (aculy01g38350 and aculy04g02470) were also identified in the tomato russet mite genome that are predicted to code for enzymes in the microbial and plant pantothenate biosynthesis pathway (an apparent duplicate of aculy01g38350 was also uncovered, but the coding sequence was disrupted, and it lacked expression, suggesting it is a pseudogene) (Figure 6). PCR amplification linked both laterally acquired genes with either neighboring intron-containing genes (aculy01g38350) or conserved eukaryotic genes (aculy04g02470 is located next to aculy04g02480, which encodes a Gtr1/RagA protein); in addition, an aculy01g38350 transcript (Illumina contig 1934) had a polyA tail, suggestive of eukaryotic transcription (Figure 6—figure supplement 1). Pantothenate, or vitamin B5, is a life-essential compound, and whereas plants and bacteria are able to synthesize this compound de novo, animals rely on dietary uptake. Genes for pantothenate synthesis are present in tetranychid mites, and genomic and phylogenetic approaches have pointed to an ancient HGT event prior to speciation within the Tetranychidae family for both genes. Constrained tree tests rejected the topology where ketopantoate hydroxymethyltransferase of A. lycopersici was the sister lineage to the group of spider mite biosynthetic proteins, but not for pantoate β-alanine ligase, suggesting that A. lycopersici acquired the ketopantoate hydroxymethyltransferase gene from a different bacterial donor species (Figure 6, Approximately Unbiased tests, p-value cut-off of 0.01).

Figure 6 with 1 supplement see all

Download asset Open asset

Maximum-likelihood phylogenetic inference for ketopantoate hydroxymethyltransferase and pantoate β-alanine ligase of *A. lycopersici*.

(a) Ketopantoate hydroxymethyltransferase. (b) Pantoate β-alanine ligase. Branches are color coded depending on their position within the tree of life; plants: green, animals: orange, fungi: red and bacteria: blue. RAxML phylogenetic reconstructions are consistent with the evolutionary scenario of independent horizontal transfer events of the two pantothenate biosynthetic genes in the *A. lycopersici* lineage, tetranychid spider mites, and hemipterans. Only RAxML bootstrap support values higher than 70 are depicted and the scale bars represent 0.2 amino acid substitutions per site. Informative nodes were identical and well-supported in another maximum-likelihood analysis (IQ-TREE; an asterisk indicates nodes with ultrafast bootstrap values above or equal to 95 in the IQ-TREE analyses). Plant homologues were used to root both phylogenetic trees. The alignments used for phylogenetic inference can be found in Supplementary file 7.

In T. urticae the acquisition of pantothenate biosynthetic genes is accompanied by the horizontal gene transfer of two methylenetetrahydrofolate dehydrogenases (MTHFDs), enzymes of the folate pathway and connected to the pantothenate biosynthesis pathway (Wybouw et al., 2018). Although such a HGT was not detected in A. lycopersici, an expansion of MTHFDs was detected compared to other mite species (OG0000706 in Supplementary file 1 — ‘Table S7’ Tab).

Secreted proteins

Small molecules or proteins produced in salivary glands are one mechanism by which arthropod herbivores can manipulate the defenses of their host plants. As A. lycopersici is able to potently suppress tomato defenses (Glas et al., 2014; Schimmel et al., 2018), we predicted its secretome, and found that 612 of the 10,263 annotated A. lycopersici proteins (6%) are putatively secreted (Supplementary file 1 — ‘Table S17’ Tab). Only one of the more than 600 secreted A. lycopersici proteins (aculy02g17370, a glycosyl hydrolase, family 13, IPR013780) had a best BLASTp hit with a T. urticae protein that was previously identified in T. urticae saliva using an LC-MS/MS (Jonckheere et al., 2016). More than half (351) of these proteins were absent in orthogroups in non-herbivorous arthropod species, and are less than 350 amino acids in length. Only 15 of these 351 proteins belonged to an orthogroup with more than one member in A. lycopersici (OG0006384, OG0009325, OG0009954 and OG0010904). Among these, OG0009325 contains three short A. lycopersici proteins <90 amino acids in length (aculy01g11450, aculy01g12600, and aculy01g12690). Of note, the gene encoding the single T. urticae representative in this group, tetur24g01070, was previously found to be overexpressed in the T. urticae salivary gland region (Jonckheere et al., 2016). OG0006384, on the other hand, contains cysteine peptidases (Peptidase C1A, papain C-terminal domain; InterPro IPR000668), which are enzymes reported to have key roles in plant-pathogen/pest interactions (Shindo and Van der Hoorn, 2008), and for which two lineage-specific expansions are present in A. lycopersici (Figure 4—figure supplement 20).

Small RNA pathways

We also characterized components of small RNA pathways that might be of potential relevance for agricultural control methods. The A. lycopersici genome harbors highly conserved miRNA sequences, such as let-7, miR-1, and miR-9a (Supplementary file 1 — ‘Table S18’ Tab). However, in contrast to T. urticae, a clear A. lycopersici homologue of Exportin-5, a dsRNA-binding protein mediating nuclear transport of pre-miRNAs (Bohnsack et al., 2004; Kim, 2005), is lacking, suggesting a deviating miRNA pathway in A. lycopersici. In line with the latter hypothesis, we could not identify an A. lycopersici homologue of Staufen, while this gene is present in T. urticae (Grbić et al., 2011; Supplementary file 1 — ‘Table S19’ Tab) and other arachnids (OrthoDb v 9.1, group EOG091G07A0 and EOG090Z04UZ, respectively) and was shown to negatively modulate miRNA activity in the nematode C. elegans (Ren et al., 2016).

The A. lycopersici genome contains, in line with T. urticae, clear homologues of Dicer-1, Loquacious, Drosha and Pasha and an expansion of the AGO1 and PIWI/AGO3 subfamilies. Of note, we found one A. lycopersici protein (aculy02g00240) that was highly homologous to the T. castaneum Dicer-1 enzyme (bitscore of 294) and that contained both an RNA-binding domain (PAZ-domain, cl00301) and the RNAse III domain (cd00593) while two A. lycopersici proteins (aculy02g04810 and aculy02g19970) showed reciprocal BLASTp hits with T. castaneum Dicer-2 and Dicer-1, respectively, but were relatively short (about 500 amino acids (aa) compared to 1726 aa for aculy02g00240) and only contained the RNAse III domain. However, the genes encoding these proteins are located next to a sequencing gap in the current assembly and it could be that gene-models for these Dicer-like enzymes are not complete. Similar to T. urticae, we could not identify clear homologues of R2D2 and AGO2 (Grbić et al., 2011; Supplementary file 1 — ‘Table S19’ Tab), suggesting that the siRNA pathway is either absent or non-canonical in both mite species (Okamura et al., 2011).

Further, important players in the PIWI-interacting RNA (piRNA) pathway (Iwasaki et al., 2015) were identified in the A. lycopersici genome (PIWI/AGO3, Zucchini, Armitage, Maelstrom and SoYb; Supplementary file 1 — ‘Table S19’ Tab), while homologues of Armitage and Zucchini could not be identified in T. urticae, which is in line with the recently suggested non-canonical piRNA pathway in T. urticae (Supplementary file 1 — ‘Table S19’ Tab, Huang et al., 2014; Mondal et al., 2018b).

Finally, RNA-dependent polymerases are known to be essential for the amplification of the RNA silencing effect (systemic RNAi) in C. elegans and some plants (Tomoyasu et al., 2008). Genes encoding these enzymes are absent in insect genomes while 1 to 5 have been reported in Acari genomes (Grbić et al., 2011; Hoy et al., 2016; Joga et al., 2016; Mondal et al., 2018a; Zong et al., 2009). Surprisingly, we could not identify RNA-dependent polymerase genes in the A. lycopersici genome (Supplementary file 1 — ‘Table S19’ Tab), which might indicate that these genes have been lost since the divergence of Eriophyoidea from other acariform lineages. However, as systemic RNAi does seem to occur in some insect orders, for example, Coleoptera (Joga et al., 2016), we cannot exclude that systemic RNAi might also occur in A. lycopersici.

Discussion

Genome size varies enormously within the Acari. While tick genomes can be larger than 2 Gb (Gulia-Nuss et al., 2016), those of mite species belonging to the Acariformes are small (Gregory and Young, 2020). This is especially true for mites within the order Sarcoptiformes, including dust mites and scabies mites, for which genomes of lengths ~55-60 Mb have been reported (Chan et al., 2015; Rider et al., 2015). Eriophyoid mites like A. lycopersici have traditionally been placed in the order of the Trombidiformes, but recent work suggests they belong to the Sarcoptiformes, or a sister taxon (Arribas et al., 2020; Bolton et al., 2017; Klimov et al., 2018; Xue et al., 2017). Our work supports this conjecture, as within Acariformes, A. lycopersici fell in a well-supported clade with the house dust mite D. pteronyssinus (Sarcoptiformes), as opposed to T. urticae (Trombidiformes) (Figure 2—figure supplement 3).

Mirroring that of its closest sequenced relatives, the genome of A. lycopersici is tiny. At 32.5 Mb, it is the smallest reported to date for an arthropod and among the smallest metazoan genomes sequenced so far (Slyusarev et al., 2020). Its size is also consistent with cytological data that eriophyoid mites have few chromosomes that are extremely small (Helle and Wysoki, 1996; Helle and Wysoki, 1983) and with several trends. In broad terms eukaryotic genome sizes correlate positively with larger cell (nuclei) sizes, and vary inversely with cell division times (Elliott and Gregory, 2015; and references therein). While little is known about the minimal cell sizes for A. lycopersici, the whole mite is smaller than many single eukaryotic cells and neuron somata sizes of less than 1 μm have been observed for another eriophyoid mite of similar size (Whitmoyer et al., 1972). A. lycopersici is also half the size (or less) of mites like D. pteronyssinus, and its minute physical stature and genome size are consistent with a recent analysis that revealed a positive correlation within Acari between organismal size and haploid DNA content (Gregory and Young, 2020). The A. lycopersici generation time, a potential (albeit imperfect) proxy for cell cycle progression, is also near the minimum reported for other mites, or for microinsects (Danks, 2006; Kawai and Haque, 2004; Rice and Strong, 1962). The force(s) that have led to the small physical and genome size of A. lycopersici are not known. However, russet mites can use their short stylets only to feed on plant epidermal cells (Royalty and Perring, 1988). This is in contrast to many other (larger) herbivores, including other herbivorous mites like T. urticae (Bensoussan et al., 2016), that can reach and consume the photosynthetically active, sugar-rich mesophyll cells (Borsuk and Brodersen, 2019; Koroleva et al., 2000) underneath the epidermis. The nutrient-poor diet of A. lycopersici may favor small physical size, and under some conditions, nutrient limitations have been proposed to select specifically for low DNA content (Hessen et al., 2010a). Regardless, the rapid generation time of A. lycopersici facilitates dense populations on its host (Figure 1c,d), and outcrossing by deposition of spermatophores (Al-Azzazy and Alhewairini, 2018) in the environment may approximate panmixia, and hence high effective population sizes, and therefore more efficient selection against the accumulation of non-coding sequences associated with large eukaryotic genomes (Lynch et al., 2011). Therefore, a collection of life history features may underlie the streamlining observed in the A. lycopersici genome.

In addition to a very low content of repetitive sequences, a derived genomic organization underpins the reduced A. lycopersici genome. As compared to the ~3 fold larger T. urticae genome (Grbić et al., 2011), the relative intergenic and intronic fractions are reduced, while compared to the ~2 fold larger D. pteronyssinus genome (Waldron et al., 2017), the intergenic fraction is nearly identical, while the genomic percent in introns is less. The latter reduction reflects massive intron loss in A. lycopersici, as 83.7% of genes were intronless, a value more than threefold higher than for T. urticae or D. pteronyssinus. As observed in other intron-poor species (Mourier and Jeffares, 2003), we observed greater retention of 5′ introns in A. lycopersici, potentially a consequence of intron loss via 3′-biased intron removal by recombination with cDNAs following reverse transcription of spliced transcripts (also known as Reverse Transcriptase-Mediated Intron Loss, or RTMIL) (Mourier and Jeffares, 2003; Roy and Gilbert, 2005). Alternatively, or in concert, the pattern may reflect retention of 5′ introns rich in cis regulatory sequences (Roy and Gilbert, 2005), an explanation consistent with A. lycopersici’s relatively long median intron lengths as compared to other insects and mites with compact genomes (Table 1, Figure 2—figure supplement 6). Previously, comparisons of intron loss events among close relatives, where few mutational steps have occurred, have been important in establishing plausible mechanisms of intron loss (Yenerall et al., 2011; Zhu and Niu, 2013). Such analyses are challenging to perform for A. lycopersici, as the time of divergence from the most recent common ancestor with a sequenced genome is hundreds of millions of years. Nevertheless, for a set of A. lycopersici intron losses in highly conserved genes – for which confident assignment of intron positions could be made in multi-species protein alignments – the overwhelming majority of loss events were consistent with precise intron excisions (i.e. Figure 2c). This pattern is consistent with a major role for intron removal via RTMIL, which has also been suggested to be a frequent mechanism underlying intron loss events in the genomes of (comparatively) closely related Drosophila species (Yenerall et al., 2011). However, a more prominent role for precise (or nearly precise) genomic deletions of introns as a loss mechanism in A. lycopersici cannot be ruled out, especially as our analysis necessarily involved conserved genes for which imprecise intronic deletions would likely be highly detrimental. A. lycopersici also has a very rapid generation time, and as it is evolutionary distant from its closest sequenced relatives (Figure 3), many lineage-specific uncommon mutation events (such as genomic deletion of introns) have potentially been sampled. Currently, more closely related genomes are needed to distinguish between RTMIL or genomic deletions as the predominant driver of intron loss in A. lycopersici, as well as to assess contributions of other possible mechanisms – for instance, retrotransposition by target-primed reverse transcription of spliced transcripts (Cordaux and Batzer, 2009; Wang et al., 2014), with subsequent loss of source, intron-containing loci. Likewise, more closely related genomes will be critical to establish the timing of intron losses. As additional genomes in this lineage become available, eriophyoid mites promise to be an attractive system to investigate the dynamics of intron evolution.

Apart from the dearth of introns, the complement of coding genes in the A. lycopersici genome deviates from that of relatives with larger genomes, and seems to be associated with its reduced morphology and distinct life history (Lindquist and Oldfield, 1996). Compared to other arthropods, a mere handful of gene families were expanded, including one that encodes a troponin domain. While this result was unexpected, as troponin performs a conserved role in muscle contraction and is single or low copy number in most arthropods, in a transcriptome assembly of Aceria tosichella, a non-galling eriophyoid pest of wheat and other grasses, an expansion of troponin-encoding genes was also observed (Gupta et al., 2019). Possibly, this expansion may be related to the derived body musculature of eriophyoids, as their skeletal and peripheral musculature is very pronounced, with the latter enabling the maintenance of body turgidity (Nuzzaci and Alberti, 1996). Nevertheless, the dominant force in shaping the genic composition of A. lycopersici is loss, including for genes involved in highly conserved metazoan or arthropod cell processes (e.g. for the Golgi apparatus), as well as gene families and specific genes (or conserved domains) involved in many aspects of arthropod development and physiology. The latter include Hairy Orange domain proteins, nuclear receptors, and other transcription factors that have broadly conserved roles in animal development (Holland, 2013; Iso et al., 2003; Pflugfelder et al., 2017; Sebé-Pedrós and Ruiz-Trillo, 2017; Shimeld et al., 2010), and whose reduction (or simplification by domain loss) in A. lycopersici may be related to the eriophyoid body plan. For example, in contrast to other mites, A. lycopersici has no orthologue of the T-box gene org-1, which in D. melanogaster plays a pivotal role in diversification of circular visceral muscle (Schaub and Frasch, 2013). This musculature is reduced in the Eriophyoidea (Nuzzaci and Alberti, 1996; Whitmoyer et al., 1972) compared to other mites (Alberti and Crooker, 1985; Coons, 1978; Mathieson and Lehane, 2002), as it also is in studied microinsects (Polilov, 2015). Furthermore, in most chelicerates, the Hox gene pb is expressed in the pedipalps and in three to four pairs of legs (Barnett and Thomas, 2013; Schwager et al., 2015; Telford and Thomas, 1998). Whether the lack of pb in the A. lycopersici genome is related to the reduction in legs in Eriophyoidea is unknown; however, pb has also been lost in other ecdysozoan animals such as Nematoda (Aboobaker and Blaxter, 2003) and Tardigrada (Smith et al., 2016; Yoshida et al., 2017), lineages that either lack legs (Nematoda) or in which leg formation has been suggested to be highly aberrant (‘walking heads’, Maderspacher, 2016) from the panarthropodan ancestor (Tardigrada, Smith and Goldstein, 2017). Further, in D. melanogaster mutants of both dachs and four-jointed, each of which is absent in A. lycopersici, have similar phenotypes including shortened legs (Buckles et al., 2001). A. lycopersici-specific losses in cell cycle regulatory genes like unkempt and fat are also candidates to underlie allometric changes in tissues and organs, a general feature of diminutive mites (like A. lycopersici) and microinsects (Danks, 2006; Polilov, 2015).

A remarkable feature of the genome of T. urticae is the presence of hundreds of genes acquired from fungal or bacterial sources, including microbe-derived UGTs (Wybouw et al., 2018). While a modest number of UGTs of putative bacterial origin are present in A. lycopersici, horizontally transferred genes were otherwise absent, except for two genes in the pathway for the synthesis of pantothenate, an essential B vitamin. Previous studies have shown that pantothenate biosynthetic genes have been laterally transferred into tetranychid mites, the silverleaf whitefly, and nematodes (Chen et al., 2016; Craig et al., 2009; Wybouw et al., 2018; Ren et al., 2020). In A. lycopersici, the HGT event of ketopantoate hydroxymethyltransferase appears to be distinct from the transfer in the tetranychid mite lineage. The apparent independent HGT of pantothenate biosynthetic genes in Acariformes, coupled with acquisitions in insect lineages, is a strong signal of adaptive significance for de novo pantothenate biosynthesis in arthropod herbivores.

Finally, nowhere were reductions in A. lycopersici gene families more striking than in genes associated with host plant use. Recently, the importance of chemosensory receptors in host plant use and breadth has attracted intense interest (Gloss et al., 2019; Ngoc et al., 2016). A. lycopersici completely lacks the expansion of chemosensory receptors reported (to varying extents) in nearly all other arthropods, as only a handful of members are present for any of the characterized chemosensory receptor families. This finding is consistent with a reduced role for chemosensation in specialist herbivores, although it may also reflect a more general loss of sensory structures during miniaturization, as the number of sensilla (which include sites of chemosensation) are dramatically reduced in microinsects (Polilov, 2015), as well as in eriophyoid mites (Figure 1a,b; Lindquist and Oldfield, 1996). Next to chemosensory receptor genes, the detoxification gene complement of A. lycopersici is minimal compared to the generalist herbivore T. urticae (Dermauw et al., 2013b; Grbić et al., 2011), as well as to insect herbivores (Rane et al., 2019). This was particularly striking for CCEs and GSTs, for which lineage-specific expansions are absent, and for which most members are in highly conserved clades that likely perform more general (non-detoxification) roles. Several of the few notable lineage-specific expansions in A. lycopersici do involve subfamilies of the MFS. However, while some MFS genes are differentially regulated upon host shift or xenobiotic exposure in T. urticae (Dermauw et al., 2013b), MFS proteins have diverse roles, and additional work is needed to assess if MFS mini-expansions in A. lycopersci are associated with host use.

The minimal detoxification gene repertoire and the paucity of chemoreceptor genes in A. lycopersici are in line with ecological specialization theory that predicts that herbivores with a narrow host range only need a limited number of environmental response genes (Berenbaum, 2002; Rane et al., 2019). However, although A. lycopersici has a narrow host-range relative to the spider mite T. urticae, it can be found on related solanaceous plant species (Perring and Farrar, 1986), as well as on several hosts outside the nightshade family (Perring and Royalty, 1996; Rice and Strong, 1962). Hence, the extent to which this mite has specialized on these hosts is unclear. Nevertheless, the minimal detoxification and chemoreception repertoire gene sets support the idea that modification of the local environment by defense suppression may alter selection imposed by the environment, thereby reducing the requirement for environmental response genes (Laland et al., 2016). How eriophyoids manipulate their hosts is unknown, but likely involves orally delivered salivary metabolites (De Lillo and Monfreda, 2004), or alternatively secreted proteins, termed effectors. Currently, the molecular nature of herbivore effectors, and their mechanisms of action, are poorly understood (Blaazer et al., 2018; Erb and Reymond, 2019). However, proteins secreted by the larvae of several lepidopteran species have been shown to attenuate plant defenses, including by physical interaction with a component of the JA signal transduction pathway (Chen et al., 2019; Musser et al., 2002). Further, a salivary ferritin from the whitefly Bemisia tabaci suppresses oxidative signals in tomato, and blunts JA-mediate defense responses (Su et al., 2019), and expression of salivary products of unknown molecular function from spider mites in plants was recently demonstrated to impair defense signaling downstream of the phytohormone salicylic acid (Villarroel et al., 2016), and may also act to suppress JA signaling (Schimmel et al., 2017). The divergent molecular nature of these effectors mirrors findings from plant-pathogen (Toruño et al., 2016) and plant-nematode (Rehman et al., 2016) systems, where secreted effectors can be highly species-specific, hindering identification based solely on sequence information. These findings highlight the need for functional studies to establish if secreted proteins (or metabolites) in A. lycopersici saliva underlie this mite’s ability to potently suppress tomato defenses. More generally, as additional genomes of herbivores that induce or suppress plant defenses become available – and that vary in their magnitude and mechanisms of host suppression – the A. lycopersici genome will serve as a key reference for comparative studies to test hypotheses surrounding the evolution of gene families that respond to or modulate plant defenses.

Conclusion

At only 32.5 Mb, the A. lycopersici genome is the smallest sequenced arthropod genome to date. In contrast to its closest sequenced relatives, the majority of genes lack introns, few repetitive sequences are present, and many genes conserved in most animals are absent. Compared to its larger relatives, the simplification of A. lycopersici’s body plan, and that of eriophyoid mites more generally, is reminiscent of that observed in other microarthropods (Maderspacher, 2016). The compressed genome architecture of A. lycopersici is in line with genome streamlining concepts (Hessen et al., 2010a; Hessen et al., 2010b), some of which speculate that maintaining a high growth rate in nutritionally limited environments (in this study the plant epidermis) may be a driver for the evolution of compact genomes. Further, the extreme reduction of several environmental response gene families aligns with predictions that follow from ecological specialization theories (Devictor et al., 2010; Futuyma and Moreno, 1988; Laland et al., 2016) since the mite’s suppression of plant defenses may allow for such families to minimize during the course of its evolution. Finally, this first eriophyoid genome provides a resource for methods of early detection of mite infestations using molecular markers, and its reduced complement of defense genes – a common source of pesticide resistance – may also reveal novel Achilles’ heels for the control of A. lycopersici. But foremost, this genome is a milestone for accelerating our understanding of the evolutionary forces underpinning metazoan life at the limits of small physical and genome size.

Share this article

Cite this article

The tomato russet mite Aculops lycopersici is a devastating pest of tomato.

Number of conserved introns and intron loss rate across 18 metazoan species.

Genome metrics for A. lycopersici, other mite species, D. melanogaster and H. sapiens.

Table 1—source data 1

CAFE analysis of 6487 metazoan orthogroups.

Detoxification enzyme (CYPs, GSTs, CCEs) and ABC transporter gene family size in A. lycopersici, T. urticae, M. occidentalis, and D. melanogaster.

Gene family contractions and mini-expansions in A. lycopersici.

Hox genes in Acari and other ecdysozoan lineages.

Maximum-likelihood phylogenetic inference for ketopantoate hydroxymethyltransferase and pantoate β-alanine ligase of A. lycopersici.

Author details

Robert Greenhalgh

Contribution

Contributed equally with

Competing interests

Wannes Dermauw

Present address

Contribution

Contributed equally with

For correspondence

Competing interests

Joris J Glas

Present address

Contribution

Competing interests

Stephane Rombauts

Contribution

Competing interests

Nicky Wybouw

Contribution

Competing interests

Jainy Thomas

Contribution

Competing interests

Juan M Alba

Contribution

Competing interests

Ellen J Pritham

Contribution

Competing interests

Saioa Legarrea

Contribution

Competing interests

René Feyereisen

Contribution

Competing interests

Yves Van de Peer

Contribution

Competing interests

Thomas Van Leeuwen

Contribution

Competing interests

Richard M Clark

Contribution

For correspondence

Competing interests

Merijn R Kant

Contribution

For correspondence

Competing interests

Downloads (link to download the article as PDF)

Open citations (links to open the citations from this article in various online reference manager services)

Cite this article (links to download the citations from this article in formats compatible with various reference manager tools)

Categories and tags

Further reading