Introduction

Molecular information from both mitochondrial (mt) sequences and gene rearrangements has been employed for deducing phylogenetic relationships at different hierarchical levels for several lineages of arthropods, including crustaceans. Among the mt protein-coding genes investigated in Crustacea, the cytochrome oxidase subunit I and subunit II genes (COI and COII) generally provide a wide range of evolutionary information (eg Meyran et al, 1997; Haye et al, 2004; Lörz and Held, 2004). Moreover, the crystal structure of these proteins is known at high resolution (Saraste, 1990, 1999; Capaldi, 1996; Tsukihara et al, 1996), allowing the comparison of amino-acid (aa) sequence alignments with structural information (Lunt et al, 1996; Davolos and Maclean, in preparation).

The mt rearrangements, which generally involve the tRNA genes, are regarded as reliable phylogenetic markers because of their relatively low likelihood of homoplasy (Rawlings et al, 2001; Kitaura et al, 2002; Roehrdanz et al, 2002; Shao et al, 2003). For the Crustacea, the mt COI-tRNALeu(UUR)-COII arrangement is considered ancestral (Crease, 1999) and probably arose at the base of an insect–crustacean clade (Boore, 1999). However, recently completed mtDNA sequences show that the position of the gene encoding tRNALeu(UUR) between COI and COII is not shared by all crustaceans. To date, the sequences of entire crustacean mt genomes show that tRNALeu(UUR) is not located between COI and COII in the copepod Tigriopus japonicus (Machida et al, 2002), in the cephalocarid Hutchinsoniella macracantha (Lavrov et al, 2004) or in the ostracod Vargula hilgendorfii (Ogoh and Ohmiya, 2004). Moreover, mobility of leucine tRNAs has been observed in malacostracan decapods, defining specific nodes in molecular phylogenies (Morrison et al, 2002).

In the present study, we analyse mt regions of crustacean taxa of the family Talitridae (Amphipoda), examining chiefly supralittoral species of the genera Orchestia, Talitrus and Talorchestia from the Mediterranean-East Atlantic area. The talitrid amphipods are particularly interesting because of their adaptations to a semi-terrestrial or terrestrial life. Despite this, little is known about the patterns and processes of molecular evolution within this crustacean group. Here, we have chosen to investigate the mt region from COI to COII to seek preliminary indications of its utility in inferring evolutionary relationships.

Interestingly, the tRNALeu(UUR) gene was absent between the COI and COII genes and a short AT-rich noncoding (NC) region, varying in length, and apparently capable of forming secondary structures, was found. This outcome raised several questions. Are the putative secondary structures of the NC region involved in the rearrangement process? Have the short intergenic NC regions with putative stem loops any functional role? Has this NC region phylogenetic potential in comparison with coding regions? Here, we discuss the observed COI-NC-COII rearrangement in comparison with previously published studies.

Materials and methods

Sampling and DNA extraction

Orchestia gammarellus (Pallas, 1766), O. mediterranea (Costa, 1857), Talitrus saltator (Montagu, 1808) and Talorchestia deshayesii (Audouin, 1826) from supralittoral environments of the Isle of Cumbrae (Cu), Scotland and the Isle of Wight (Wi), UK, were analysed. A Scottish (Loch Linnhe) and a Swedish (Isle of Öland) specimen of O. gammarellus were examined only for a ca. 400-bp COI fragment (see below). In addition, a Mediterranean sample of T. saltator (Soverato, on the Italian Ionian coast; So) was included in the analysis as a preliminary long-distance comparison. For the region between COI and COII (Figure 1), we also analysed two other species of Orchestia: the freshwater O. cavimana (Heller, 1865) (Bracciano Lake, central Italy) and the Mediterranean O. montagui (Audouin, 1826) (Civitavecchia, on the Italian Tyrrhenian coast). Specimens were collected during the years 2000–2002 and stored in 95% ethanol until DNA extraction. A Chelex (Sigma) protocol (5% (w/v) of the chelating resin in 10 mM Tris/1 mM EDTA solution with 10 μg/ml of proteinase K; Walsh et al, 1991) was used to extract total genomic DNA. Muscle tissue from two to three pereopods of each individual (generally adult males) was placed in 5% (w/v) Chelex solution and incubated at 60°C for ca. 2 h. After centrifugation, the DNA supernatant was stored in 5% Chelex at −20°C.

Figure 1

(a) Schematic map of the putative talitrid mt COI-NC-COII rearrangement with the positions of the primers used (not to scale). (b) Alignment of the NC sequences for the talitrid species. The relatively conserved nucleotides are in boldface; the repeated sequence (ATAAAA) found in O. gammarellus is underlined. Dashes indicate gaps. The locality abbreviations are given in Materials and methods.

PCR amplification and sequencing

A ca. 400-bp fragment of the COI gene was obtained using the primer BI-COI (5′-GATACCCGAGCTTATTTTAC-3′; Power et al, 1999) and the newly designed SUBIR (5′-GTATCGTCGAGGTATGCCTCTTAA-3′; Figure 1a). An amplicon about 0.8 kb long, spanning from the 3′ end of the COI gene to the 5′ end of the COII gene, was obtained using the newly designed primer SUBIF (5′-AAGAGGCATACCTCGACGATACTC-3′) and COII-CROZ (5′-CCACAAATTTCTGAACATTGACC-3′; Crozier et al, 1989; Figure 1a). SUBIF or its variant (SUBIF-CAV: 5′-AAGCGGAATGCCTCGCCGATACTC-3′) was used with a newly designed primer (COII-TAL2: 5′-ATGATTTCGATTGTTTGGTTTTGTA-3′; Figure 1a) mainly to explore the region between the COI and COII genes and the adjacent coding portions. PCR amplification was conducted using QIAgen Taq DNA Polymerase; the cycling parameters were: 95°C for 5 min; followed by 35 cycles at 94°C for 30 s, 50–56°C for 1 min, 72°C for 1 min–1 min 20 s; followed by 72°C for 7 min. The size of the PCR products was estimated by electrophoresis in 1% agarose gels. PCR products were purified (QIAgen Kit) and subsequently used as templates for cycle sequencing. This was carried out (30 cycles with 20 s at 96°C, 10 s at 50°C and 4 min at 60°C) with an ABI Prism BigDye Terminator Kit. Sequencing products, generally of both strands, were cleaned (using DyeEx Spin Kit, QIAgen), electrophoresed and resolved on an automated ABI 377 sequencer (Perkin-Elmer).
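As a quick consistency check on the primer sequences given above, the following Python sketch compares the reverse complement of SUBIF with SUBIR. It is an illustration only, not part of the original protocol, and the helper names are hypothetical. With the sequences as printed, the two primers are complementary over 22 nt, which suggests that they anneal to the same COI site on opposite strands, so that the ca. 400-bp fragment and the ca. 0.8-kb amplicon meet at that site.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def suffix_prefix_overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


# Primer sequences as given in the text (5' to 3')
SUBIF = "AAGAGGCATACCTCGACGATACTC"
SUBIR = "GTATCGTCGAGGTATGCCTCTTAA"

subif_rc = reverse_complement(SUBIF)
overlap = suffix_prefix_overlap(subif_rc, SUBIR)
print(subif_rc, overlap)
```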

Sequence analysis

COI and COII sequences were submitted to GenBank (accession numbers AY185145-AY185166, AY211517, AY221738, AY342008, AY352583 and AY536886). They were aligned with homologous crustacean sequences using the program CLUSTAL X (Thompson et al, 1997) and analysed using MEGA version 2.1 (Kumar et al, 2001). The aa sequences were classified into structural classes: hydrophobic membrane-spanning or transmembrane helices (M), external loops (E), internal loops (I), and the carboxy- (COOH) and amino- (NH2) terminal regions. Hydropathic region plots were obtained using TMPred (available at www.ch.embnet.org).

A preliminary phylogenetic analysis of the coding sequences was conducted with MEGA and PAUP* version 4.0b10 (Swofford, 2001). Neighbour-joining (NJ) trees were obtained from genetic distance matrices using the K2P method (Kimura, 1980). Maximum-parsimony (MP) reconstructions were performed with unweighted codon positions, with weighting of the second codon positions relative to the first and third positions, with downweighting of transitions relative to transversions, and with combinations of these weighting schemes (the details are available upon request). In addition, NJ and MP analyses were performed on alignments of the aa residues inferred from the COI and COII sequences. In all the phylogenetic analyses, support was assessed with 1000 bootstrap replicates. Homologous sequences of representative crustaceans (the branchiopod Daphnia pulex and two malacostracan decapod species, Pagurus longicarpus and Penaeus monodon) extracted from GenBank were used for comparison.
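For reference, the K2P distance between two aligned sequences is computed from the proportions of transitional (P) and transversional (Q) differences as d = −½ ln[(1 − 2P − Q)√(1 − 2Q)] (Kimura, 1980). The Python sketch below illustrates the formula only; it is not the MEGA implementation, and the function name and the decision to skip any site with a gap or ambiguous base (pairwise deletion) are assumptions made for the illustration.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    n = transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        # Assumption: sites with gaps or ambiguity codes are ignored
        if a not in "ACGT" or b not in "ACGT":
            continue
        n += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if n == 0:
        raise ValueError("no comparable sites")
    p = transitions / n
    q = transversions / n
    # math.log raises ValueError if the sequences are too divergent
    # for the correction to be defined (argument <= 0).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


# Toy example: 10 sites, one transition and one transversion
d = k2p_distance("AAAAAAAAAA", "GAAAAAAAAC")
print(round(d, 4))
```

The corrected distance in the toy example is larger than the uncorrected proportion of differing sites (0.2), as expected for a model that allows for multiple substitutions at the same site.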

The sequences between the COI and COII genes (Figure 1b) were analysed for base composition and sequence variability, and to identify potential secondary structures. The Mfold tool, based on the energy minimization method (available at http://biotools.idtdna.com/mFold/mfold.asp), was used to predict secondary structures for the single-stranded DNA sequence of the NC region.
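The two simplest quantities in this step, AT content and the presence of inverted repeats that could pair into a stem, can be illustrated with a short Python sketch. This is a naive exact-match scan, not the energy-minimization approach of Mfold; the function names, the default stem and loop sizes and the toy sequence are all hypothetical and are not taken from the talitrid data.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def at_content(seq: str) -> float:
    """Fraction of A and T among the unambiguous bases of `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "AT" for b in bases) / len(bases)


def find_hairpins(seq: str, stem: int = 5, min_loop: int = 3, max_loop: int = 8):
    """List (start, loop_length) for perfect inverted repeats of length `stem`.

    A hit means seq[start:start+stem] could pair with the stretch that
    begins `loop_length` bases after it. No thermodynamics are considered.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem - min_loop + 1):
        target = reverse_complement(seq[i:i + stem])
        for loop in range(min_loop, max_loop + 1):
            j = i + stem + loop
            if j + stem > len(seq):
                break
            if seq[j:j + stem] == target:
                hits.append((i, loop))
    return hits


# Made-up sequence: ATAAA ... TTTAT can pair around a 4-nt loop
toy = "CATAAAGGCCTTTATC"
print(at_content(toy), find_hairpins(toy))
```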

Results

COI and COII sequence analysis

In the talitrid COI and COII regions, which were relatively rich in AT (ca. 66.4 and 73%, respectively), there were no insertions, deletions, frameshifts or stop codons.
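A check of this kind can be sketched in Python as below. It scans a coding fragment, in a given reading frame, for termination codons under the invertebrate mt genetic code (NCBI translation table 5, in which TAA and TAG are stops and TGA encodes tryptophan). The function name and the toy sequence are hypothetical; this is an illustration, not the procedure used to screen the talitrid sequences.

```python
# Stop codons in the invertebrate mitochondrial code (NCBI table 5)
INVERTEBRATE_MT_STOPS = {"TAA", "TAG"}


def find_stop_codons(cds: str, frame: int = 0):
    """Return (position, codon) for every in-frame stop codon in `cds`.

    `frame` is the 0-based offset of the first complete codon. The caller
    decides whether a stop at the very end is the legitimate terminator.
    """
    cds = cds.upper()
    stops = []
    for i in range(frame, len(cds) - 2, 3):
        codon = cds[i:i + 3]
        if codon in INVERTEBRATE_MT_STOPS:
            stops.append((i, codon))
    return stops


# Made-up fragment: ATA GGA TGA TTC TAA
toy = "ATAGGATGATTCTAA"
print(find_stop_codons(toy))           # only the terminal TAA; TGA is not a stop
print(find_stop_codons(toy, frame=1))  # a shifted frame exposes a TAG
```

An internal stop in the expected frame, or a length change that is not a multiple of three relative to homologous sequences, would be a warning sign of a frameshift or a nuclear pseudogene.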

The talitrid COI gene includes a TAA termination codon; the initiation codon in the COII gene was ATA (O. gammarellus, O. mediterranea and O. montagui), ATY (T. saltator) or ATC (O. cavimana; T. deshayesii). The alignment and homology with crustacean sequences from complete mt genomes extracted from GenBank are apparently unambiguous, but at the 3′-end of the COI sequence of Argulus americanus, D. pulex, Panulirus japonicus and Portunus trituberculatus, we include the first five nucleotides of the adjacent tRNALeu(UUR) gene (Figure 2), making a complete TAA stop codon, as previously suggested for P. monodon (Wilson et al, 2000) and Triops cancriformis (Umetsu et al, 2002).

Figure 2

Alignment of the last 60 bp at the 3′ end of the COI gene of the talitrid amphipods along with additional crustacean sequences extracted from GenBank. In A. americanus, D. pulex, P. japonicus and P. trituberculatus a 5-bp (TCTAA) overlap between the last and the termination codon of COI and the tRNALeu(UUR) gene is shown. Dots correspond to identical nucleotides and dashes indicate deletions; TAA and TAG indicate stop codons. The locality abbreviations are given in Materials and methods.

O. gammarellus showed no DNA variation in Britain whereas COI variation was recorded for T. saltator at the aa level. The relatively distant Mediterranean sample of T. saltator was more distinct.

The transmembrane helices M8 and M10 and the external loop E5 identified by Lunt et al (1996) proved to be highly conserved. M12 was relatively conserved, while the COOH region and internal loops I4 and I5 were variable. For the COII region, following the terminology of Caterino and Sperling (1999), the talitrid NH2 region (aa ca. 1–25) showed high conservation. The aa residues ca. 26–48 and 60–80 appear to form hydrophobic regions and probably correspond to membrane-spanning helices 1 and 2 (M1, M2). These regions and the tract separating them (I1) were variable. Hydropathy profiles (not shown) confirmed the above aa sequence analysis.

Considering all the COI and COII sites (ca. 785 bp) for the four talitrid species (O. gammarellus, O. mediterranea, T. saltator and T. deshayesii), the substitutions found were chiefly synonymous, but the COII fragment showed a higher rate of nonsynonymous substitutions than COI (19.7% vs 16.6%). Plots of transitions and transversions against K2P genetic distances for the talitrid taxa indicated saturation at third positions and near-saturation at first positions in the COI regions.
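The counts underlying such a saturation plot can be sketched as follows: for a pair of aligned coding sequences, transitions and transversions are tallied separately for each codon position. The function name, the frame argument and the toy sequences are assumptions for illustration; plotting the tallies against K2P distance for all sequence pairs is left out.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def ts_tv_by_codon_position(seq1: str, seq2: str, frame: int = 0):
    """Return {codon_position: (transitions, transversions)} for two
    aligned coding sequences whose first complete codon starts at `frame`."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    counts = {1: [0, 0], 2: [0, 0], 3: [0, 0]}
    for i in range(frame, len(seq1)):
        a, b = seq1[i].upper(), seq2[i].upper()
        # Skip identical sites and sites with gaps or ambiguity codes
        if a not in "ACGT" or b not in "ACGT" or a == b:
            continue
        position = (i - frame) % 3 + 1
        is_transition = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
        counts[position][0 if is_transition else 1] += 1
    return {pos: tuple(c) for pos, c in counts.items()}


# Toy pair: ATG GCA vs ATA GCC (one transition and one transversion,
# both at third codon positions)
result = ts_tv_by_codon_position("ATGGCA", "ATAGCC")
print(result)
```

Saturation is suggested when, for a given codon position, the number of transitions stops increasing (or is overtaken by transversions) as the genetic distance between pairs grows.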

Partition homogeneity testing (using the incongruence length difference test in PAUP*) with heuristic searching revealed no significant difference between the regions of the two genes (P=0.73), and we therefore performed a preliminary phylogenetic analysis pooling the COI and COII sequences. All the phylogenetic analyses yielded estimates of relationships that were substantially concordant with one another: the talitrid taxa were monophyletic, and within this group two separate clades were evident, a cluster comprising T. saltator and T. deshayesii and another including O. gammarellus and O. mediterranea (Figure 3). In the MP trees (not shown), the T. saltator–T. deshayesii clade showed higher bootstrap values on upweighting transversions and/or second positions.

Figure 3

NJ tree for the talitrid species and crustacean taxa (D. pulex, P. longicarpus and P. monodon; see text) based on the aa sequences inferred from ca. 780 bp of the COI-COII genes using p-distances. The tree is rooted with D. pulex as ‘outgroup’; numbers above each branch are bootstrap support values. For the locality abbreviations see Materials and methods.

NC region analysis

In all the talitrid species, an AT-rich (ca. 90%) NC region (Figure 1) was evident between the COI and COII genes. In a manually refined alignment of the NC sequences (Figure 1b), a proportion of sites were relatively conserved. Differences (in length and in sequence element) were observed among the species of Orchestia and also between the Mediterranean and the British specimens of T. saltator. A repeated sequence (ATAAAA) was found in O. gammarellus only (Figure 1b), accounting for the greatest length.

All the NC sequences can apparently form hairpins (eg see Figure 4). A rigorous analysis of structure prediction and functional conformation is outside the scope of this paper.

Figure 4

Potential stem-loop structures present in the AT-rich NC region of O. cavimana (ΔG=−1.22 kcal/mol).

Discussion

COI and COII sequences

In the talitrid COI and COII sequences, there was no evidence of nuclear pseudogenes. All the COI coding sequences reported here shared amino acids in highly conserved regions (Saraste, 1990, 1999; Capaldi, 1996; Tsukihara et al, 1996). The talitrid COII data support previous observations that the structural regions of this gene are poor predictors of variability (Caterino and Sperling, 1999; Masta, 2000).

The alignment with crustacean COI sequences suggested a TAA termination codon for the talitrid COI gene (Figure 2). Moreover, for A. americanus, D. pulex, P. japonicus and P. trituberculatus, there appears to be a 5-bp overlap between the last and the termination codon of COI and the tRNALeu(UUR) gene (Figure 2), as proposed for P. monodon (Wilson et al, 2000) and T. cancriformis (Umetsu et al, 2002). COII started with ATA, ATC or ATT, which have been found as initiation codons in other animal mtDNAs (eg Wilson et al, 2000; Machida et al, 2002).

Our investigations on talitrid COI and COII are at a preliminary stage, but they can be compared with previous allozyme analyses, which indicated geographic differentiation, particularly within T. saltator (eg De Matthaeis et al, 1998, 2000). In addition, the relationships among the three talitrid genera (Orchestia, Talitrus and Talorchestia; Figure 3) contrast with those proposed by Conceição et al (1998), but are in agreement with the estimates obtained by De Matthaeis et al (1998).

NC region

The COI-tRNALeu(UUR)-COII organisation probably evolved at the base of an insect–crustacean clade (Boore, 1999). However, recent lines of evidence from complete mt genomes suggest that the tRNALeu(UUR) position between COI and COII is not shared by all crustaceans and insects (Shao et al, 2001; Machida et al, 2002; Morrison et al, 2002; Roehrdanz et al, 2002; Lavrov et al, 2004; Ogoh and Ohmiya, 2004). The absence of tRNALeu(UUR) between the COI and COII genes in the talitrid species (Figure 1) thus reinforces the observation that tRNA gene rearrangements are much more common than those of protein-coding and rRNA genes. Also, the derived COI-NC-COII arrangement (Figure 1) and the sequence-based phylogeny (Figure 3) provisionally support the monophyly of these talitrid taxa.

Relatively short mt NC regions have been observed in crustaceans (eg Valverde et al, 1994a; Hickerson and Cunningham, 2000; Yamauchi et al, 2003) and in other invertebrates. Such NC regions also show high AT bias and high length variability, and are able to form stem-loop structures, as observed even in regions close to the COI, COII or tRNALeu(UUR) genes (Smith and Bush, 1997; De La Rua et al, 2000; Shao et al, 2001; Soucy and Danforth, 2002; Dowton et al, 2003; Yamauchi et al, 2003). They might have a role in transcription in addition to that of the mt control region (see Valverde et al, 1994b; Roberti et al, 2003). Moreover, the secondary structure of the NC regions could play a role similar to that of tRNA genes in the processing of mt transcripts (Valverde et al, 1994a; Garesse et al, 1997). To date, however, little is known concerning their possible biological roles (eg Helfenbein et al, 2001; Lavrov et al, 2002).

The short inverted sequences of NC regions (and tRNA genes), which form secondary structures, may be involved in the rearrangement process (Macey et al, 1997; Lavrov et al, 2002; Tomita et al, 2002). Recently, Higgs et al (2003) and Rawlings et al (2003) suggested a process of duplication, anticodon mutation and deletion of mt tRNALeu genes. Nevertheless, alternative mechanisms of mt gene rearrangement can be considered (see Dowton et al, 2003). An independent translocation event from a hypothetical initial COI-NC-tRNALeu(UUR)-COII or COI-tRNALeu(UUR)-NC-COII arrangement could also explain our data, although the possibility of a reciprocal transposition of the tRNALeu gene with an NC region cannot be excluded. This suggestion is conjectural, and only a complete talitrid mt genome sequence can give a clear answer.