Introduction

Phosphorus (Pi) is a key nutrient for plant growth and development, is the second essential macronutrient required for plant growth and development alongside nitrogen1 and is also an essential component of fertilizers used to sustain modern agriculture because it helps quality and yield. Pi is part of important biomolecules and, in the form of phosphate, pyrophosphate, adenosine triphosphate, adenosine diphosphate, or adenosine monophosphate, involved in various metabolic functions and many living cell regulatory processes are P-dependent2. Globally, many soils are deficient in phosphate (Pi)3, Pi concentrations within plant cells are typically 1000-times those outside4. Therefore, plants must have specialized transport vehicles to transfer Pi from the soil to the cells to counter the large concentration gradient at the root-soil interface. The active phosphorus in the soil and the phosphorus used in various metabolic processes ultimately require direct uptake and transport through the "phosphate transporter protein (PHT)", a highly efficient plant root uptake and transport system, which is responsible for the transport of inorganic phosphorus in plants under the regulation of transcription factors. Genome sequence analyses and experimental evidence have indicated that plants contain numerous Pi transporter families, including PHT1, PHT2, PHT3, PHT4, and PHT5, which were distinguished by their protein sequences, structures, locations, and functions5,6,7,8,9,10,11.

The PHT1 family has been most extensively studied in plants, and the main pathway for Pi to enter plants from the soil is through the PHT1 family located on the plasma membrane, PHT type 1 protein is also defined as a major contributor to the Pi uptake system12 and the plant has several PHT1 transporter members that play a key role in P acquisition, transport and remobilization. The Arabidopsis AtPHT1 gene family contains nine members, The rice OsPHT1 gene family has 13 members13. The first high H + /PI phosphate transporter identified in higher plants was AtPT1 from Arabidopsis14. This gene has an important role in the uptake of phosphorus from the soil15. Analysis of expressed sequence tags (ESTs) and genomic sequences revealed that nine genes in Arabidopsis are similar to AtPT1, with PHT1;1 transcript being the most abundant16. Its overexpression increases Pi uptake in Arabidopsis17. The PHT2 gene family contains one member each in Arabidopsis thaliana and rice13, and orthologous gene have been characterized in wheat, alfalfa rhizobia and several Solanum species14. Arabidopsis PHT2;1 is the first member of the PHT2 family to be identified, with OsPHT2;1 in rice being the putative low-affinity phosphate transporter gene involved in the accumulation and transport of Pi in the plant18. The third family of plant Pi transporters is localized to the mitochondria, Three PHT3 genes have been identified in Arabidopsis8 and six members identified in rice13, and homologs have been cloned and partially characterized soybean, maize, and birch19,20. The Arabidopsis genome contains six PHT4 genes (PHT4;1 to PHT4;6), and chloroplast localization of PHT4;1, PHT4;2, PHT4;4, and PHT4;5 has been analyzed using GFP fusions or immunoblotting21,22,23. In Arabidopsis, phosphate transporter proteins containing the SPX–MFS structural domain have been designated as members of the phosphate transporter protein family PHT5 or VPT. The results also confirmed that PHT5 could increase plant tolerance under low phosphorus stress. The results also confirmed that PHT5 could increase plant tolerance under low phosphorus stress24. GmPHT5 in soybean may be responsible for transporting phosphorus from root vascular tissues to nodules, especially under phosphorus deficiency stress, and therefore GmPT5 may be a key transporter in regulating nodule phosphorus homeostasis25.

Soybean (Glycine max (L.) Merr.) is an economical and agronomical crop, which has been adopted as a diet staple throughout many parts of the world6, However, the low availability of phosphorus (P) in the soil is the main nutrient limiting factor for soybean yield26,27. Previous studies have reported that phosphorus addition improves drought tolerance in soybean28,29. Phosphorus application increased nitrogen fixation rates in soybean30,31 which is sensitive to drought stress32,33,34. In production areas within China, drought is the most challenging stress for soybean35 and the non-renewable resource of P is predicted to have limited availability, soybean seed yield will be threatened by both water and P deficits in the future36. Therefore, it is urgent that researchers develop plants with enhanced efficiency of soil phosphorus use under such conditions, improve the capacity of soybean to absorb phosphorus. Because Pi is moved from the soil into plant cells in response to excess phosphate, genomic analyses have been conducted with Pi transporter families in Arabidopsis, rice and apple.

Currently, the study of phosphorus transport protein genes in Arabidopsis, rice and other plants has gained great progress, while PHTs have been rarely studied in soybean. Therefore, based on the published Arabidopsis thaliana genome, this study identified and characterized putative soybean PHTs family genes through phylogenetic analysis, collinearity analysis, motif, promoter element analysis, and investigated the expression profile of PHTs genes in response to low phosphorus and drought. Transcriptome sequencing was used to analyze the induction of expression of root and leaf phosphorus transporter-related genes in soybean by low phosphorus treatment. The objective was to present a foundation for further functional dissection of GmPHTs so that genetic engineering approaches can be applied to improve the efficiency of phosphate uptake by stressed soybean plants.

Result

Identification of the PHTs family genes in soybean

Based on the homology with PHTs family members in Arabidopsis, a total of 57 PHTs genes were identified in the whole genome of soybean (Table 1). Each subfamily member also has its own characteristics. The differences in the isoelectric point and hydrophilicity of the subfamily proteins are relatively obvious, Except for the GmPHT5 subfamily, the isoelectric points of all subfamily members are basic, the isoelectric point of the GmPHT5 subfamily is acidic, at around 6. The GmPHT1, GmPHT2, GmPHT3, GmPHT4, and GmPHT5 subfamilies had positive hydrophilicity values ranging from 0.101 to 0.659, while the GmPHO1 subfamily had negative hydrophilicity values ranging from − 0.255 to 0.064. In contrast, the hydrophilicity among the members of the GmPHT4 subfamily varied considerably, with GRAVY values ranging from 0.2 to 0.6. The differences in the basic physicochemical properties of the proteins reflect the differences in their physiological functions. In particular, the differences in isoelectric point and hydrophilicity exhibited among the members within the GmPHT4 subfamily herald the diverse functional differentiation among the members of the GmPHT4 subfamily.

Table 1 Localization and physicochemical properties of members of the GmPHTs protein gene family.

Phylogenetic analysis of the GmPHTs genes

To determine the evolutionary relationships of the members of the GmPHTs gene family genes with those of the soybean species, we constructed a phylogenetic tree comprising 92 PHTs proteins, from soybean (57), Arabidopsis (35), based on a multialignment via MEGA 6. Our results showed that all the PHTs homologs could be classified into six clades. GmPHT1 subfamily (14 members), GmPHT2 subfamily (2 members), GmPHT3 subfamily (9 members), GmPHT4 subfamily (12 members), GmPHT5 subfamily (6 members) and GmPHO1 subfamily (14 members), respectively, highly similar to each other. In addition, almost all of the soybean PHTs genes appeared as pairs with the PHTs members in Arabidopsis in terms of phylogenetic relationships (Fig. 1).

Figure 1
figure 1

Phylogenetic tree of coding nucleotide sequences of the phosphate transporter family (PHT) in soybean and Arabidopsis thaliana. Note: The names of the genes used in soybean are referred to Table 1, and the names of the genes used in Arabidopsis are as follows. At1g68740(AtPHO1.1), At2g03260(AtPHO1.2), At1g14040(AtPHO1.3), At4g25350(AtPHO1.4), At2g03240(AtPHO1.5), AT2G03240(AtPHO1.6), At1g26730(AtPHO1.7), At1g35350(AtPHO1.8), At3g29060(AtPHO1.9), At1g69480(AtPHO1.10), AT3G29060(AtPHO1.11), AT4G25350(AtPHO1.12), AT5G35730(AtPHO1.13), AT5G43350(AtPHT1;1), AT5G43370(AtPHT1;2), AT5G43360(AtPHT1;3), AT2G38940(AtPHT1;4), AT2G32830(PHT1;5), AT5G43340(AtPHT1;6), AT3G54700(AtPHT1;7), AT1G20860(AtPHT1;8), AT1G76430(AtPHT1;9), AT3G26570(AtPHT2;1), AT5G14040(AtPHT3;1), AT3G48850(AtPHT3;2), AT2G17270(AtPHT3;3), AT2G29650(AtPHT4;1), AT2G38060(AtPHT4;2), AT3G46980(AtPHT4;3), AT4G00370(PHT4;4), AT5G20380(AtPHT4;5), AT5G44370(AtPHT4;6), AT1G63010(AtPHT5;1), AT4G11810(PHT5;2), AT4G22990(AtPHT5;3),

Collinearity analysis among members of the GmPHTs gene family

There are currently three ways in which duplicated genes are generated, including whole-genome duplication (polyploidy), large segmental duplication events and tandem duplication event37. Chromosomal distribution and Collinearity analysis of the GmPHTs gene (Fig. 2) shows that, 49% of the 57 GmPHTs gene family members were found to arise through large segmental duplication events and 14% through tandem duplication event. In the GmPHT1 subfamily, the GmPHT1.1 and GmPHT1.4, GmPHT1.2 and GmPHT1.11 genes are generated by a large segmental duplication event in the soybean chromosome; the GmPHT1.6 and GmPHT1.7, GmPHT1.13 and GmPHT1.14 genes are generated by tandem duplication event, two of the GmPHT2 members were formed by large segmental duplication events. GmPHT 3.1 and GmPHT3.5, GmPHT3.3 and GmPHT3.4, and GmPHT3.9 in the GmPHT3 subfamily have common ancestor genes with GmPHT3.6 and GmPHT3.7 which were generated by large segmental duplication events; ancestor genes of GmPHT3.6 and GmPHT3.7 genes Tandem duplications occurred. Both the GmPHT4 and GmPHT5 subfamilies expanded the gene family size with only large segmental duplication events (GmPHT4.1 and GmPHT4.8, GmPHT4.5 and GmPHT4.9, GmPHT4.6 and GmPHT4.11 genes; GmPHT5.1 and GmPHT5:4, GmPHT5.3 and GmPHT5.6 genes). of the GmPHO1 subfamily, GmPHO1.1 and GmPHO1.4, GmPHO1.2 and GmPHO1.7, GmPHO1.8 and GmPHO1.14 genes were generated by large segmental duplication events, and GmPHO1.11, GmPHO1.12 and GmPHO1.13 genes were formed by tandem duplication event. It suggested that large segmental duplication also played a major driving force for GmPHTs evolution in addition to tandem duplication.

Figure 2
figure 2

GmPHTs collinearity analysis results. Note: Members of the same subfamily are represented by the same color. Solid lines between genes indicate collinearity between the two. The genes enclosed in boxes are formed in the form of tandem repeats. The covariance data between chromosomes are cited Liu, et al. 201538.

Gene structure and protein-specific motif analysis of the GmPHTs gene

From the perspective of gene structure, the GmPHO1 subfamily can be divided into 2 Groups, and Group I contain 8 genes including GmPHO1.3, GmPHO1.5, GmPHO1.6, GmPHO1.9, GmPHO1.10, GmPHO1.11, GmPHO1.12 and GmPHO1.13. The three pairs of GmPHO1 genes form Group II, which differs from the Group I genes mainly in the length of the first exon. Members of the GmPHT3 subfamily have 5 or 6 exons, but the difference is mainly in whether the last exon is split or not. GmPHT4.1 and GmPHT4.8 both contain 8 exons, GmPHT4.5 and GmPHT4.9 genes contain only 1 exon, and GmPHT4.7 and GmPHT4.10 genes contain a maximum of 15 exons. The 14 members of the GmPHT1 subfamily can also be divided into two categories based on the results of gene structure, with the exception of GmPHT1.1 and GmPHT1.4, which have two exons, the other 10 genes in Group III have only one exon, and GmPHT1.3 and GmPHT1.12, which have similar structures and three exons. GmPHO1 subfamily has a maximum of 10 specific motifs and the number of motifs is the same for each member of the GmPHT3 and GmPHT5 subfamilies. The protein sequences of the GmPHT1 subfamily members contain 7–9 specific motifs, notably, the four gene pairs generated by large block repeats or tandem repeats and the GmPHT1.5 gene have exactly the same 9 motifs, whereas the protein encoded by the five genes generated by non-repetitive events has only 7 motifs except for 8 and 9. Only the number of specific motifs varied significantly among members of the GmPHT4 subfamily, ranging from 1 to 4 (Fig. 3).

Figure 3
figure 3

Analysis of the gene structure and protein-specific motifs of GmPHTs.

Identification of the cis-acting regulatory elements

To understand the potential transcriptional regulation of GmPHTs, we performed the analysis based on the DNA sequences of the promoter regions. The 2.0-kb upstream region of the initiation codon as the promoter region of the gene to analyze which specific regulatory elements are present in the promoter region of each gene (Fig. 4, Table S2). The GmPHTs gene family has the most light-responsive elements, accounting for 43.26% of all promoter motifs and every gene has more or less of this type of promoter element. The next most abundant is the abiotic stress response element, which mainly responds to adverse environmental conditions such as heat, low temperature, anaerobic, drought, and low phosphorus. The phytohormone response elements mainly include abscisic acid, methyl jasmonate, gibberellin, ethylene, salicylic acid, and growth hormone response elements. Except for the GmPHT3.1 gene, the promoters of all genes had phytohormone-like response elements, differing only in type and number. Biological stress response elements were the least, accounting for only about 5%, and none of the members of the PHT2 subfamily had biological stress response elements. It is thus hypothesized that each member of the GmPHT genes family has an important role in the growth and development of soybean and resistance to various external growth-adverse stimuli. The basic biological function of PHT proteins has been reported to be the transport of inorganic phosphorus, and the application of phosphorus has a mitigating effect on drought39, so among the promoter elements, the low phosphorus response element P1BS and the drought-inducing element MBS are the focus of our attention. Only 13 of the 57 genes have promoter regions containing both the low phosphorus response element P1BS and the drought-inducing element MBS. Therefore, further expression analysis was performed for 13 genes, including GmPHT1.1, GmPHT1.7, GmPHT1.10, GmPHT1.14, GmPHT3.4, GmPHT3.5, GmPHT3.6, GmPHT4.7, GmPHT4.8, GmPHT4.10, GmPHO1.4, GmPHO1.5 and GmPHO1.7.

Figure 4
figure 4

The promoter element composition of GmPHTs gene.

Differential expression profiles of the GmPHTs genes under low phosphorus and drought stress

So far, little is known about the specificity of tissues or organs of the soybean PHT genes, which may elucidate their functions in detail, because the GmPHTs family members in soybean have not been systemically examined. By the results of the preliminary qRT-PCR experiments (Data not shown) and the gene microarray data of soybean, GmPHT1.10, GmPHT1.14, GmPHT3.4 and GmPHT3.6 genes were not expressed among the 13 genes. Therefore, only the remaining nine genes were examined separately for their expression under simultaneous low phosphorus drought treatment conditions (Fig. 5). GmPHT1.1 gene was highly expressed in roots only, GmPHT3.5 gene was expressed in leaves only, GmPHT1.7, GmPHT4.7, GmPHT4.8, GmPHT4.10, GmPHO1.4, GmPHO1.5 and GmPHO1.7 genes were expressed in both roots and leaves. The GmPHT1.7 and GmPHT4.8 genes were expressed in both leaves and roots, and it was inferred that the GmPHT1.7 and GmPHT4.8 genes may play important roles in both roots and leaves, and the GmPHT1.1 gene acts synergistically in the roots during the seedling stage of soybean. In contrast, GmPHT4.10, GmPHO1.4, GmPHO1.5, and GmPHO1.7 genes were expressed in the roots but at low levels, and these three genes may not play a major role in the roots.

Figure 5
figure 5

qRT-PCR results of GmPHT genes in soybean roots and leaves.

The expression pattern of these nine genes changed again when soybean was treated with both low phosphorus and drought for 12 h (Fig. 6). GmPHT3.5, GmPHT4.7, GmPHT4.10, GmPHO1.5, GmPHO1.7, and GmPHT1.7 was significantly reduced. The expression of the GmPHT1.1 gene in the roots increased gradually with time, and its expression increased significantly with low phosphorus and drought treatment for 12 h. The expression pattern of GmPHO1.4 gene in leaves was similar to that of GmPHT1.1 gene, both of which decreased first and then showed a significant increase with time. Only the expression of GmPHT4.8 gene in leaves was not affected by low phosphorus and drought, and the expression was basically the same as that of the control, but the expression in roots was significantly lower.

Figure 6
figure 6

qRT-PCR results of nine GmPHT genes in roots and leaves of soybean treated with low phosphorus as well as drought.

Coexpression networks of the PHTs family genes in soybean

To further unravel the coexpression relationships between PHTs family genes and other genes, we calculated the interaction weight values of the target gene sets based on the FPKM values from the RNA-seq data. Figure 7 shows all the genes interacting with GmPHTs genes. In general, among the interacting genes, GmPHT1:1 interacted with 31 genes, while GmPHO1:4 interacted with 57 genes. Three genes interacted most strongly with GmPHT1:1, and six genes were co-expressed with GmPHO1:4. No other genes were found that interacted strongly with other PHTs genes under low phosphorus stress (Fig. 7). These include ubiquitin family genes, and ubiquitin-conjugating enzyme genes. Ubiquitination modifications play a central regulatory role in the plant response to low phosphorus stress and may be involved in regulating the uptake of phosphorus.

Figure 7
figure 7

Coexpression networks of 2 PHTs family genes in soybean. (A), Glyma02G005800 (GmPHT1.1) coexpression network diagram; (B), Glyma02G130200 (GmPHO1.4) coexpression network diagram; The size of the node is proportional to the degree of the node, the more edges connected to the node, the greater its degree and the larger the node; the color indicates the strength of the interaction, the redder the color the stronger the interaction. The PHTs family genes located in the center of the network, while the most coexpressed genes were displayed in each network.

Discussion

As an essential, nonsubstitutable element for plant growth, P plays a wide range of structural and biological roles40. P deficiency influences numerous biological processes in crop plants, such as rice, soybean and wheat, markedly reducing crop yield41,42,43,44. Plants need to obtain the phosphorus they need from the soil, but as the phosphorus concentration in the soil is lower than that in plant tissue, this sharp concentration gradient between the plant and the soil indicates the indispensable roles of PHTs, which can regulate Pi absorption 45.

Pi transporters currently identified in Arabidopsis, rice, soybean, maize, wheat, poplar, Medicago truncatula, and the Halophyte E. salsugineum46. The concerted action of Pi transporters ensures Pi acquisition and distribution among tissues and cytosolic Pi homeostasis44. Therefore, engineered alterations of the expression of Pi transporters provide an opportunity to optimize uptake and distribution of Pi in crops to improve yield47. A systematic analysis of the whole PHT gene family in soybean was conducted in this study, which will provide an opportunity to find candidate genes that play roles in phosphate use efficiency and abiotic stress response. A total of 57 PHT genes are identified in soybean and show quite high similarity to their corresponding members of the PHT gene family in Arabidopsis (Fig. 1). This implies that these PHT genes are evolutionarily conserved and functionally similar in soybean and Arabidopsis.

Gene duplication is considered one of the primary driving forces in the evolution of genomes and genetic systems. Of all the duplication patterns, segmental and tandem duplications have been purported to be the two main causes of gene family expansion in plants48. In the current research, 29 of the 57 GmPHTs (87.8%) in the soybean genome were associated with large segmental duplication events, Moreover, only 7 tandem duplication events were identified (Fig. 2). Taken together, our results indicate that large blockbuster duplication and tandem duplication events are the main force for the expansion of the PHT gene family in soybean. A relatively large number of members within a family suggests successful expansion and rearrangement of the genome by extensive duplication that occurred frequently during evolution49.

Cis elements may control the efficiency of promoters and thus regulate the expression of the genes that they control by interacting with the corresponding trans-regulatory factors50. Studies of cis elements might play a crucial role in dissecting the functions of genes. But the cis elements of all PHT gene family members in soybean have not yet been identified. In this study, the upstream regions of all the GmPHTs contain cis elements affecting light, hormone, defense and tissue-specific expression. These findings suggest that the expression of these GmPHT genes might be regulated by many factors, such as light-, hormone-, and defense-related factors. The presence of these cis elements in soybean GmPHT genes and their likely roles in regulating gene expression suggest that GmPHT genes may be involved in various stress responses in soybean Some cis elements function under phosphate deficiency, for example, the P1BS element can be bound to PHR1 (a MYB transcription factor) to potentially regulate the phosphate starvation response in Arabidopsis51, PHR1 has been suggested as a central integrator to role in the transcriptional regulation of phosphate starvation responses52. In the current study, P1BS-like elements were found in 13 of the 57 GmPHT promoters examined suggesting that expression of these genes might be influenced by Pi starvation. Similarly, P1BS-like elements have also been detected in OsPHT genes in rice50. These elements are found to be in the promoters of phosphate regulated genes, not only in Arabidopsis but also in other plant species such as barley and rice 50,53,54, which suggests that they may participate in a conserved signaling pathway for the phosphate starvation response in plants. The existence of P1BS-like elements in soybean, rice, and Arabidopsis suggests that they may serve a similar role in different species.

Drought and Pi deficiency stress often interfere with plant metabolism via various stress signals and hormonal changes that play essential roles in regulation processes under natural conditions55. So far, several studies have found that Pi transporters are involved in response to drought stress. Recently, several genes encoding Pi transporters from poplar were found to be regulated by drought stress, these genes especially those up-regulated by drought stress at low Pi level may contribute to drought tolerance of poplar plants in Pi-limited soils11. Under drought stress, 18 PHT genes were upregulated in the leaves, and 11 PHT genes were downregulated in Brassica napus roots56. The expression of 15 MdPHT genes was upregulated in apple under drought stress57. Cao et al. (2020) found that overexpression of StPHT1;7 in potato influenced plant growth and tolerance to drought stress58. Therefore, it is urgent that researchers develop plants with enhanced efficiency of soil phosphorus use under such conditions. Secondly, using drought inducible promoters to control the expression of Pi transporters or their regulatory factors may be a potential strategy to improve crop PUE under drought stress conditions. Drought can prevent Pi uptake by reducing Pi distribution in the root system59. Pi can also increase tissue osmotic pressure and enhance tolerance to drought stress by adjusting the water content in the cell structure and soluble sugar content in the cells60. Plants fertilised with Pi show moderate growth under drought compared to unfertilised plants61. In addition, phosphorus fertilizers are often used to reduce Pi deficiency in the soil, improve drought tolerance and promote plant growth62. In Alnus cremastogyne, phosphorus fertilisation alleviates drought stress by regulating its antioxidant and osmotic potential63. Several recent studies have highlighted the role of phosphorus in alleviating drought stress in plants by promoting growth, especially of seedlings64. Plants primarily use phosphate transporters (PHT) located in the cell membranes of plant roots to take up Pi from the soil, which is then transported into the plant65. Increased phosphorus uptake enhances tolerance to low phosphorus66,67 and drought stress62,68. Therefore, engineered alterations of the expression of Pi transporters provide an opportunity to optimize uptake and distribution of Pi in crops to improve yield.

The number of Pi transporters present may reflect the complexity and significance of the process of Pi transport5. Most PHT1 family genes expressed in roots show up-regulation in phosphate deprived Arabidopsis plants; In rice, high-affinity of OsPHT9 and OsPHT10 for Pi transport have been reported, as well as specific induction of their expression by Pi starvation69 The transcript levels of a few PHT genes in Arabidopsis were increased by Pi deficiency70; however, the expression of most PHO family members was not induced by Pi deficiency in poplar, suggesting that PHT genes might have different functions in woody species and herbs. In addition, some PHT4 family members might be low-affinity transporters, consistent with the relatively high Pi levels that have been reported in the cytosol and other subcellular compartments7. In this study, the expression patterns of GmPHT genes were analyzed under low phosphorus, drought and simultaneous low phosphorus drought stress conditions, and the results showed that the expression patterns of the genes differed under different treatment conditions. These results suggest that of crosstalk occurs among the P-starvation response and salt and drought stress responses, and Pi uptake in plants likely changes in association with the altered expression of the PHT1 genes under drought and salt stresses47.Our results show that under simultaneous drought and low phosphorus treatments, the expression of GmPHT1.7 and GmPHT1.1 genes in roots significantly increased, while the expression of GmPHO1.4 gene in leaves showed a significant increase with the continuation of treatment time, and the expression of this gene in roots significantly decreased. The expression of GmPHT4.8 gene in leaves was not affected by low phosphorus and drought, but the expression in roots was significantly reduced (Fig. 6). The identification of these genes is therefore of great significance for the development of transgenic crops to cope with low phosphorus and drought stress.

Conclusions

Plant Pi transporters play an essential role in Pi acquisition and distribution. The family of phosphate transporters (PHTs) mediates the uptake and translocation of Pi inside the plants. In this study, a total of 57 PHT-domain-containing protein genes were identified in soybean genome, (GmPHT1, GmPHT2, GmPHT3, GmPHT4, GmPHT5 and GmPHO1) had been found. Only 13 of the 57 genes have promoter regions containing both the low phosphorus response element P1BS and the drought-inducing element MBS, 4 genes were not expressed among the 13 genes. Therefore, only the remaining nine genes were examined separately for their expression under simultaneous low phosphorus drought treatment conditions. GmPHT1.7, GmPHT1.1, GmPHO1.4 and GmPHT4.8 genes displayed various expression patterns under low phosphorus and drought stress. The above results implied that they may play different roles in phosphate nutrition of Soybean different tissues and development stages. These results provide references for the further study of Soybean PHT-domain-containing proteins family genes. Moreover, this study provides the selection of candidate genes for functional research and genome editing in Soybean phosphate nutrition.

Materials and methods

Identification of the GmPHT family genes in soybean

The Arabidopsis (Arabidopsis thaliana) PHT protein sequence was used for TBlastN homology alignment in the soybean genome database (Phytozome V12.1) to obtain sequences similar to members of the AtPHT gene family. The obtained sequences were then analyzed in the NCBI CDD database (v3.15) for conserved structural domains to identify whether the obtained similar sequences were GmPHTs proteins. The sequence of the protein finally identified as GmPHTs was sequence corrected in the NCBI soybean database. Data on the isoelectric point, molecular weight, and hydrophilicity of the proteins were obtained using the ProtParam tool in the ExPASy (https://web.expasy.org/) online analysis website.

Phylogenetic, Multiple alignment and protein motif analysis

The PHT protein sequences of Arabidopsis and soybean were selected for multiple sequence alignment using the ClustalW program in BioEdit (v7.0.5) software. The PHT protein sequences of yeast were used as outgroup to construct a phylogenetic tree by MEGA (v7.0.26) software. The method used was the Jones-Taylor-Thornton (JTT) model, and Bootstrap was 1000 replicates.

The PLAZA website (https://bioinformatics.psb.ugent.be/plaza) was used to analyze the collinearity of the soybean genome and finally to obtain the regions where large segments of duplication occurred in the soybean genome.

The number of introns and exons of genes was analyzed and counted using the online analysis website GSDS (v2.0) (http: //gsds.cbi.pku.edu.cn). The unique or shared motifs in the proteins were done using the MEME line analysis website (http://meme-suite.org/).

Promoter element analysis in GmPHT genes

The gene sequence was obtained from the soybean genome database phytozome (V12.1) and the 2000 bp sequence upstream of the gene was used as the promoter region. The promoter element prediction was performed for all GmPHT members using the promoter online prediction website PlantCare (http://bioinformatics.psb.ugent.be/ webtools/plantcare/html/). The promoter elements other than the basic promoter elements (TATA box and CAAT box) were mapped using the GSDS online mapping tool for promoter elements.

Plant materials and growth condition

The experiments were conducted with the main cultivar Changnong 26 in Jilin Province (Validation No.: JI Audited Bean 2,010,004). To identify tissue-preferentially expressed genes, seedlings of Chang Nong 26 were grown under normal conditions and cultured in sand culture at 25 °C with 50% humidity and 13 h of light and 9 h of darkness, using 1/4 Hoagland (containing 0.5 mM KH2PO4) as the culture medium to provide essential nutrients for the soybean. The plants were converted to hydroponic culture when they reached the three-leaf stage, and the culture medium was still 1/4 Hoagland containing 0.5 mM KH2PO4. After 12 h of hydroponic culture, the plants were transferred to 1/4 Hoagland culture with 0.01 mM KH2PO4, 1/4 Hoagland culture with 2% PEG8000 (containing 0.5 mM KH2PO4) and 1/4 Hoagland culture with 2% PEG8000 containing 0.01 mM KH2PO4 for Hydroponic culture. The plants were cultivated in a glasshouse (natural light, 75% relative humidity) and irrigated with 50 ml nutrient solution. Three treatments were set up with three replications, continuous normal phosphorus treatment, continuous low phosphorus, and stressed low phosphorus treatment. Samples were taken on time for 12 h in hydroponics, and the roots and leaves were stored separately at − 80 °C for backup.

Coexpression networks of the GmPHT family genes using RNA-seq data

Gene coexpression network analysis was performed based on the RNA-seq data. For the RNA-seq experiment, fully expanded leaves and roots were taken when the plants were treated with low phosphorus nutrient solution until to 3 leaves separately for RNA extraction with three biological replicates. The transcript abundance (FPKM value) of each gene was calculated based on the length of the gene and the reads mapped to that gene. The interactions of the target gene sets were retrieved from the STRING protein database (http://string-db.org/), and the weight value of the target gene sets was calculated using the WGCNA R package based on the FPKM values. The gene coexpression networks were visualized by Cytoscape software71.

Quantitative real-time PCR analysis

RNA was extracted using Trizol (TAKARA, Beijing) and reverse transcribed using PrimeScriptTM RT reagent Kit with gDNA Eraser kit (TAKARA, Beijing). Fluorescent quantitative PCR primers were designed according to the specific sequences of each gene, see Table S1. Letin was used as the internal reference gene, and the fluorescent quantitative PCR was performed using the TAKARA SYBR Premix Ex Taq II kit (TAKARA, Beijing, China) with an ABI StepOne Plus real-time fluorescent quantitative PCR instrument (ABI, USA, model 7300). 25 μl of the reaction system contained 1 μl cDNA, 10 μl SYBR The PCR program was: 95 °C for 10 min (pre-denaturation), 95 °C for 3 s (denaturation) and 60 °C for 1 min (annealing-extension) for 40 cycles. The lysis curve procedure was: 95 °C for 15 s, 60 °C for 1 min, and 95 °C for 15 s. Data were calculated using the 2−ΔΔct algorithm for relative expression72. All experiments were performed in three replicates. Statistix (v 8.1) software was used to analyze the significance of the data.

Statistical analysis of data

Statistics was performed by Duncan’s test or Student’s t test. Significance of differences was defined as *P < 0.05, **P < 0.01.