Introduction

Rice (Oryza sativa), as a wholesome and nutritious cereal grain, provides the staple food for more than half of the world's population 1. Along with the growth of economies and population throughout the world, there is an increasing demand for high-yield and high-quality rice. The grain quality of rice, including that of milling, appearance, cooking and eating, and nutrition, is determined by multiple physicochemical properties 2, and studies on the involved key genes and relevant regulatory mechanisms will help to illustrate the regulatory networks of rice quality control and benefit for the breeding efforts.

Starch, which is composed of amylose and amylopectin, comprises 90% of dry endosperm of rice seed. The apparent amylose content (AAC) is recognized as an important determinant of the appearance and texture of grain 3. Thai jasmine rice (KDML 105), one of the best-quality rice with good cooking and eating qualities, has low amylose content (AC) and medium gel consistency (GC) 4.

The nutritional quality includes the proportions of protein, fat, mineral and other substances 5. The rice protein is rich in methionine and cysteine, and compares favorably with proteins of other cereals, but is poor in lysine and threonine, making it an incomplete protein source for human infants 6. Storage triacylglycerols are stored in oil bodies containing phospholipids and oleosins at the surface 7, and oil bodies are abundant in embryo and aleurone layer of rice seed. However, the aleurone layer is usually removed by milling because it turns rancid upon storage 8. In addition, oil also influences other aspects of seed quality, and a recent report showed that the quality and viability of Arabidopsis seeds can be enhanced by suppressing phospholipase D 9.

The near-infrared spectroscopy (NIRS) technology, which is based on the fact that several natural products can readily absorb NIR radiation at a specific region or wavelength, has been used for rapid and accurate analysis of chemical and physical properties without sample destruction 10. NIRS is time-saving and low-cost, and previous studies have used NIRS to determine quality characteristics such as moisture, protein, starch, amylose and lipids in buckwheat, wheat, barley, maize, soybean and rice 11, 12, 13, 14, 15, 16, 17; the practice was widely accepted in the trade of these products.

To date, many genes related to important agronomic traits and seed qualities have been identified through breeding and genetic transformation protocols 1, 5, 18, and quantitative trait locus (QTL) analysis and map-based cloning have been applied to identify loci associated with seed quality control 19, 20. Chinese geneticists and breeders have made great progress in the hybrid rice technology 21, and marker-assisted selection has been widely used to pyramid functional genes into popular hybrid rice cultivars to improve the seed quality 22, 23. As a multifactorial trait, modulations of genes involved in photosynthesis, plant architecture and transcriptional networks controlling plant development were proved to be effective in improving seed quality 24. However, systemic studies on rice seed quality control and the corresponding rice resources are still lacking.

Generation of T-DNA insertion populations is an important strategy to study gene functions on a large-scale basis 25. Currently, several databases for Arabidopsis have been established to collect information on T-DNA insertion populations, including SALK (http://signal.salk.edu/cgi-bin/tdnaexpress), GABI-Kat (http://www.gabi-kat.de/) and SAIL (http://www.tmri.org/en/partnership/sail_collection.aspx) 26, 27, 28. For rice, several resources with large collections of insertion mutants were also generated, including those in Australia (http://www.pi.csiro.au/fgrttpub/) 29, China (http://rmd.ncpgr.cn/; http://trim.sinica.edu.tw/) 30, 31, 32, Europe (http://orygenesdb.cirad.fr/) 33, France (http://urgi.versailles.inra.fr/OryzaTagLine/) 34, Japan (http://pfg101.nias.affrc.go.jp/~miyao/pub/tos17/index.html.en) 35, 36, Korea (http://141.223.132.44/pfg/) 37, 38 and USA (http://sundarlab.ucdavis.edu/) 39. These resources helped to identify and study genes involved in rice development 40, 41. Some mutants with marked changes of seed quality were also identified 42, 43.

To facilitate the coordination of rice functional genomic studies, especially that on seed quality regulation, we have developed the Shanghai T-DNA Insertion Population (SHIP, http://ship.plantsignal.cn) of rice by large-scale Agrobacterium-mediated transformation. The flanking sequences of T-DNAs were identified by thermal asymmetric interlaced-PCR (TAIL-PCR) 44, 45 and mapped onto the rice genome. After multiple generations of planting and resistance screening, more than 11 000 putative homozygous lines were obtained and analysis on the seed quality of around half of them with NIRS identified a total of 551 mutants with altered seed quality. Further analysis on the corresponding flanking sequence tags (FSTs) revealed that genes of diverse molecular functions are involved in seed quality control.

Results

Generation of a rice mutant population of 63 000 T-DNA insertion lines

A project aiming to generate a large T-DNA insertion population was initiated in 2003. To date, using Zhonghua 11 or Nipponbare, two rice japonica varieties, as materials, more than 63 000 independent transgenic lines have been generated through Agrobacterium-mediated transformation. In all, 15 000 independent lines were analyzed by TAIL-PCR to identify the T-DNA FSTs. Based on the finding from previous studies that isolation of FSTs from the left border was more efficient than that from the right one, due to a higher frequency of inverted T-DNA repeats involving two right borders 46, a set of nested sequence-specific primers corresponding to the left border of T-DNA and a shorter arbitrary degenerate (AD) primer were utilized in the present study. Overall, 12 948 FSTs were isolated till February 2008. A detailed analysis of these sequences by aligning against the rice pseudo-chromosome version 5 (The Institute for Genome Research (TIGR)) indicated that 8 840 (68.3%) of them were mapped onto the rice genome with on average 96.8% homology (all the FSTs have been submitted to RiceGE, http://signal.salk.edu/cgi-bin/RiceGE). After removing the T-DNA footprint, the size of the genomic sequence of these FSTs averaged at 241 bp. The other FSTs included 2 127 (16.4%) T-DNA tandom repeats, 1 564 (12.1%) binary vector sequences and sequences that did not hit at all (3.2%, probably due to non-specific amplification).

To test the accuracy of FSTs (whether the T-DNA insertion was indeed at the predicted position), PCR amplification was performed using a primer pair, of which one is the T-DNA left border primer and the other is a reverse primer corresponding to the isolated FSTs. The results showed that, among the 118 lines tested, T-DNA insertions of 69 lines (58.5%) were confirmed to be at the predicted position, which is relatively lower than that in Arabidopsis 28.

Further, to analyze the locus numbers of T-DNAs, T1 seeds of 2 097 independent lines were germinated in sterile water containing hygromycin and those with a segregation ratio of 3 resistant:1 sensitive were regarded as containing a single locus of T-DNA. This preliminary analysis showed that around 64.7% (1 357 lines) of the transgenic lines contain a single locus of T-DNA.

Distribution of T-DNA insertions on rice genome

Assignment of 8 840 FSTs on chromosomes showed that FSTs are distributed all over the chromosomes; however, the distribution was not even and the number of insertions was related to the chromosome size. Calculation of the T-DNA density (per Mb) showed the range from 15.60 (Chr. 12) to 32.27 (Chr. 3), with an average of 23.76, indicating a positive correlation with the chromosome size. The relatively longer chromosomes, 1 and 3, had an apparently higher density of T-DNA insertions (i.e., contained 28.03% of the total number of insertions), while chromosomes 6, 7 and 10-12 had a relatively lower density (Table 1). Linear regression analysis indicated that the chromosome size was significantly positively correlated with the insertion numbers (correlation coefficient = 0.924, P < 0.001), as well as with the insertion density (correlation coefficient = 0.729, P < 0.01) (Supplementary information, Figure S1).

Table 1 Distribution of T-DNA insertions on rice chromosomes

Further analysis on the distributional relationship between T-DNA insertions and several other features of the rice genome, including expressed genes and transposable elements (annotated by TIGR), was performed. In addition, as small RNAs were proposed to regulate transposon mobility and assembly of heterochromatin 47, the distribution of small RNAs [publicly available data, 48 was also analyzed. The whole genome was divided into non-overlapping windows with the size of 500 kb, and the numbers of T-DNA insertions, expressed genes, transposable elements and small RNAs mapping to each window were determined, which were then used for analysis of Pearson's correlation coefficient (PCC). The results revealed that the distribution of T-DNA insertions on the whole genome was positively correlated with expressed genes (PCC = 0.39, P < 0.001), but negatively correlated with transposable elements (PCC = −0.25, P < 0.001) and small RNAs (PCC = −0.37, P < 0.001). These correlations show the same trend, but different degrees for each individual chromosome (relatively higher on chromosomes 6 and 7, while lower on chromosomes 2 and 11, Figure 1).

Figure 1
figure 1

The distribution correlation between T-DNA insertions and expressed genes (black column), transposable elements (striped column) and small RNAs (dot column). “All” indicates the whole genome.

As T-DNA insertions were positively correlated with expressed genes, the distribution of T-DNA insertions within genes and intergenic regions was further analyzed. To facilitate the analysis, the DNA sequence from ATG to the stop codon, together with the sequence upstream (2 500 bp) from ATG and downstream (750 bp) from the stop codon, was defined as a gene, and the segments which were not included in genes were defined as intergenic regions. Overall, 6 586 (74.50%) T-DNAs located in genes and 2 254 (25.50%) in intergenic regions. Calculation of the insertion density revealed that recovered T-DNAs were preferentially located at the untranslated regions of genes, and that the density of T-DNA insertions near the gene-coding region was higher than the densities of those far from the coding region (Table 2).

Table 2 Distribution of T-DNA insertions in the genes and intergenic regions of rice genome

Functional annotation (Gene Ontology, “molecular function”) showed the tagged genes fall into multiple functions, and interestingly, some categories were significantly over-represented, including those for hydrolase, nucleic acid binding and motor functions, while some, such as genes in the categories of transcription factor, nuclease, transferase and oxygen binding, were under-represented (Table 3).

Table 3 Gene Ontology annotation of FSTs of the insertion population that mapped to the genic region (The GO terms were based on the TIGR product annotations.)

Identification of putative homozygous mutant lines and characterization of mutants with altered seed qualities by NIRS

Considering that T-DNA insertion mutagenesis was recessive, and that some phenotypes could not be exhibited in the heterozygous T1 plants, we tried to identify putative homozygous lines by resistance selection and use them for further analysis. Overall, more than 11 000 putative homozygous lines were obtained, and by using primers located at the T-DNA left border and two sides of the predicted insertion sites of FSTs, 49 (71.0%) of 69 putative homozygous lines were confirmed as homozygous at the insertion positions. This provides a good resource for functional genomic studies through forward genetics (http://www.plantsignal.cn).

The obtained putative homozygous lines were further used for seed quality analysis by measuring the contents of starch, amylose, protein and fat through NIRS. The calibration equations were developed and validated by Foss Company, and the statistical indices of the calibration and cross-validation set are listed in Supplementary information, Table S1. Overall, 5 868 putative homozygous lines were analyzed (the data distributions are shown in Figure 2). For each of four constituents, the mean values showed no significant difference between the homozygote population and WT (Zhonghua 11); however, the whole data set of the homozygote population showed wider ranges. The mean content of starch in Zhonghua 11 was 75.02%, ranging from 69.06% to 79.38%, which was consistent with the known content of starch in japonica varieties (70-80%), while that of the homozygote population was 77.51%, ranging from 64.28% to 85.79%. Contents of amylose, protein and fat also showed similar mean values and wider ranges in the homozygote population (Table 4).

Figure 2
figure 2

Distribution plots of wild-type (black column) and homozygote population (white column) for starch, amylose, protein and fat contents (for wild type, n = 50; for homozygote population, n = 5 868).

Table 4 The seed quality values through near-infrared spectroscopy measurement

Shapiro-Wilk test revealed that WT followed the normal distribution of quality (P > 0.05). Based on this, the 99% confidence interval limit was used as the threshold of mutant selection. In total, 287 lines with altered starch content (4.9% of the total tested lines) were selected in the first-round measurement; among them, 252 lines showed the same trend in the second-round measurement, with a repeat rate of 87.8%. Similarly, 162 lines (2.8%) were identified related to amylose changes. The reported protein content in WT seeds is 8.50 ± 0.35% (49, similar to that in our analysis, 8.79 ± 0.31%), and finally 50 lines with altered protein contents were identified. As the fat content in milled rice is very low, the putative mutants with higher fat content (> 1.1%) were identified (188 lines, 3.2%) (Table 4).

Among the identified lines, some showed changes in a single constituent, whereas some showed changes in two or three constituents of seeds (Figure 3), suggesting that the corresponding genes may be involved in some form of common regulation during the synthesis and metabolism of these constituents, or that perturbation in biosynthesis/metabolism of one constituent may result in the altered content of another. Among 551 lines, 58 lines showed changes in starch and fat contents, agreeing well with the fact that starch metabolism affects fat content 50. Few mutants had changes in both protein and amylose, implying that the regulation of synthesis and metabolism between protein and amylose was more independent. The number of mutants with altered starch and ACs was 15, similar to the number of mutants with altered protein and fat contents (Figure 3).

Figure 3
figure 3

Venn diagrams show overlapping mutant lines with changes in the four constituents of seed quality.

Functional categories of genes related to seed quality control

The corresponding mutated genes of the 551 lines were analyzed using the Gene Ontology annotation, and the chi-square test was performed to determine whether the genes have functional preference. The results showed that genes with functions of signal transducer, catalytic, transcription regulator and transporter were present in all kinds of mutants (Figure 4), suggesting that the alteration of seed quality may be due to the modification of some common genetic factors. In addition, some GO categories were present in one or two kinds of mutants exclusively, and were functionally closely related. For example, many encoded proteins with carbohydrate-binding activity were identified in mutants with altered starch and ACs, while those with lipid-binding activity were enriched in mutants with altered fat contents, indicating that such mutants may be attributable to genes directly involved in the cognate synthesis or metabolism pathways. Other genes have diverse molecular functions, including protein-binding and enzyme regulator activities.

Figure 4
figure 4

Function annotation of mutant genes associated with seed quality changes by “molecular function” using the Gene Ontology according to the TIGR model.

As expected, some of the mutant genes have been reported to be involved in seed quality regulation, and their defective expression resulted in altered seed quality (Table 5). Most of these genes encode enzymes involved in the metabolism pathway of nutrients, and PCR analysis confirmed the T-DNA insertion in these genes (Figure 5A). Further, considering that seed quality can be influenced by many factors, such as the environment or somaclonal variation, mutant line SHIP_ZSF0781 was selected to test the genetic linkage between the insertion and phenotype. This line had an insertion at the upstream of gene Os08g37800, which encodes a carbon catabolite derepressing protein kinase. The homozygote lines contained higher content of starch, and comparison between the homozygote of T2 generation and several independent progeny lines from one heterozygous T1 plant confirmed the genetic linkage between insertion and the increased starch content. Among eight progeny, four (Figure 5B, 4-7) were homozygous (confirmed by PCR analysis, lower panel) and had similarly increased starch content as the homozygote (Figure 5B, lane 2), and three heterozygous lines (Figure 5B, 8-10) showed slightly increased starch contents.

Table 5 Reported genes that are involved in the seed quality (s) control in the characterized mutants
Figure 5
figure 5

Analysis of the T-DNA insertion and genetic linkage of the seed quality with T-DNA insertion. (A) Confirmation of T-DNA insertions. Five homozygotes with mutations corresponding to genes that were previously reported to control seed quality were analyzed by PCR amplification, each using a primer pair matched on the genome (flanking the insertion site, G) or that of a T-DNA left border primer and a reverse primer (L). For each mutant, four independent plants were tested. The corresponding genomic fragment of WT was amplified and used as control. (B) Analysis of the genetic linkage in a mutant line SHIP_ZSF0781 with altered starch content. The starch contents of WT (1, 3), homozygote mutant line (2) and different progeny lines from one heterozygous T1 plant (4-11, of which 4-7 are homozygotes, 8-10 are heterozygous lines and 11 is WT) were measured (upper panel), showing good correlation with the gene mutation situation, as analyzed by PCR amplification (lower panel). * indicates the starch content has a significant difference compared with WT (P < 0.05).

Data collection of SHIP

All information collected has been integrated into the database SHIP (http://ship.plantsignal.cn), which includes 8 840 FSTs mapping on the rice genome, 11 000 putative homozygous lines, seed quality data of 5 868 putative homozygous lines analyzed with NIRS and some observable phenotypes during growth under agronomical conditions. The SHIP database can be accessed freely and searched using keywords or through the BLAST program, and will be updated regularly.

Discussion

Distribution of T-DNA insertions on chromosomes

Our study showed that the distribution of T-DNA insertions was correlated with the chromosome size, which is consistent with previous reports that T-DNA insertions are not randomly distributed on the chromosomes 57, 58. The larger chromosomes contain more T-DNA insertions and also show higher insertion density (Table 1; Supplementary information, Figure S1). This may be due to the fact that large chromosomes have higher euchromatin, into which the T-DNAs preferentially integrate 59. In addition, the T-DNA distribution was positively correlated with expressed genes but negatively with transposable elements and small RNAs. Indeed, it has been suggested that the positive correlation between expressed genes and T-DNA insertions may be due to the “open” state of actively transcribed regions, rendering the DNA in these regions more accessible 58, 60, 61.

The negative correlation with small RNAs suggests that small RNAs also play roles in influencing T-DNA integration by maintaining the DNA in a condensed state. Another explanation for the few T-DNA distributions in transposable elements is that the drug resistance marker gene in T-DNA would be silenced by the small RNAs originating from such regions. As reported, in many eukaryotes transposable elements can generate 21-24-nucleotide siRNAs 48, 62, which are involved in the silencing of flanking genes 63. T-DNAs were also preferentially recovered from the sites upstream and downstream of genes near the coding region, likely due to the presence of multiple cis-elements that bind transcription factors or other regulatory factors, creating a more “open state” than the coding sequence 58.

The homozygous T-DNA insertion population provides a good resource for functional genomic studies

As a staple crop for human food, many efforts have been made to improve the yield and seed quality of rice by breeding 21. QTL and map-based cloning were used to identify genes associated with agricultural traits. One disadvantage of these strategies is that they are time-consuming, whereas it is much easier to identify target genes based on phenotypes caused by T-DNA insertion mutations. Alternatively, T-DNA insertion populations provide a large platform for studying gene functions using forward or reverse genetics strategies. Nowadays, many T-DNA insertion populations have been developed, but most of them were not screened for homozygotes, and many recessive characters could be missed. The homozygotes obtained in this study are useful for characterizing mutants with altered seed quality. Additionally, the obtained homozygote population can be used for other kinds of functional screenings, such as responses to hormones or stress conditions and so on. As 2 920 unigenes with diverse molecular functions were harbored in the population, it would facilitate functional studies of these genes through the reverse genetics strategy. Systemic studies on seed quality control and the identification of 551 putative mutants, and confirmation of the altered seed qualities caused by T-DNA insertion (Figure 5), demonstrated that the insertion population, especially the homozygote population, would be of great value in mutant screening, particularly for those causing a recessive phenotype.

Genes associated with seed quality control are involved in diverse molecular functions

Mutants with alterations in the four kinds of constituents were identified and some mutants show changes of multiple constituents. Among the identified mutant lines, the most frequent overlap was between starch and fat, whereas the mutants for protein and ACs showed the least frequent overlap (Figure 3). Although amylose is one composition of starch, only 15 mutants have altered amounts of both starch and amylose. This may be due to the fact that amylose constitutes only up to 20% of the total starch, and mutants with altered amylose may not necessarily result in significant changes in total starch.

Some mutations are found in genes involved in the individual metabolism pathway, suggesting that seed quality can be regulated by directly modifying the genes involved in the synthesis or metabolism pathways of starch, fat and so on 41, 42, 43. Additionally, other genes fall into many other diverse molecular function categories and some genes are associated with more than one changed constituents of seeds, indicating the close relationship between the regulation of synthesis and metabolism of these constituents. For example, some metabolism pathways, such as glycolysis, are involved in all of the synthesis pathways of these constituents. Also, the genes associated with each individual constituent may be regulated by the same regulatory factors, for example by the same transcription factors.

Among the mutated genes, some have been previously shown to play roles in regulating seed quality (Table 5), such as those coding for enzymes involved in the biosynthesis of starch and fat. A gene for carbon catabolite derepressing protein kinase was identified, which is an ortholog of two conserved Arabidopsis thaliana protein kinases KIN10/KIN11. KIN10/KIN11 control convergent reprogramming of transcription in plant energy signaling, the mutation of which impairs starch mobilization at night and growth 64. In yeast and mammals, the orthologous genes Snf1/AMPK are inactivated by sugars and play central roles in energy signaling 65. Our study suggests that the function of this family of protein kinase is also conserved in rice. The modification of cognate genes can be used as a strategy to improve energy production, especially starch accumulation in crops.

Previous studies have focused on genes directly involved in the biosynthesis/metabolism of starch, protein and fat 66, 67, 68, 69, 70, 71, 72, 73; our analysis suggests that many other genes may also be involved in seed quality control. Analysis of the underlying complex regulatory network should broaden our understanding of the relevant regulatory mechanisms.

In conclusion, a population of 11 000 putative homozygous lines was generated, and a part of them was used to measure the seed quality using NIRS, resulting in the characterization of mutants with increased or decreased contents of starch, amylose and protein and with higher contents of fat. Further studies will provide new insights into the regulation of seed quality; in addition, these mutants can be used as genetic materials directly for different purposes of the food industry and brewing.

Materials and Methods

Plant materials and transformation

Rice (Oryza sativa) plants were cultivated in a phytotron with a light (12 h, 28 °C)/dark (12 h, 22 °C) cycle. Two japonica varieties, Nipponbare and Zhonghua 11, were used for transformation at the early stage of this project. In the later stage only Zhonghua 11 was used and most of the transformants are with Zhonghua 11 background. The construct pSMR-J18R 74 was used to transform rice through Agrobacterium-mediated transformation 75. For seed quality analysis, only the putative homozygous transformants from Zhonghua 11 were measured.

TAIL-PCR and sequence analysis

For each transformant, genomic DNA was extracted from the leaf sample (25 mg) of a 2-week-old seedling. TAIL-PCR amplifications were performed as reported previously 44, 45, 58. The nest primers for the left border of T-DNA and AD primers are listed in Supplementary information, Table S2. All the flanking sequences obtained were aligned against the rice pseudo-chromosomes by BLASTN 76, with a cut-off of 1e–5 (if several hits were detected, that with the lowest E-value was chosen). The mapped insertion sites were then annotated using TIGR genome annotations release 5 (http://rice.tigr.org) and the insertion patterns were indicated as exon, intron, upstream (within 2 500 bp from ATG) or downstream (within 750 bp from the stop codon), or intergenic.

Test of locus number of T-DNA insertion and homozygote screening

To test the locus number of T-DNA insertion, the segregation ratio of hygromycin resistance was calculated. Thirty seeds from each T0 plant were sowed in sterile water containing 30 mg/l hygromycin, and cultivated in a phytotron with a light (18 h, 28 °C)/dark (6 h, 22 °C) cycle for 7 days. The line showing 3 resistant:1 sensitive segregation ratio of hygromycin resistance was considered as containing a single locus of T-DNA. To obtain the putative homozygous lines, T2 seeds were harvested from individual T1 plants and used to analyze the hygromycin resistance. For each line, 30 seeds of 12 individual plants were sowed in sterile water containing hygromycin and those showing complete hygromycin resistance were considered as homozygotes.

Measurement of seed quality by NIRS

Homozygous T2 plants were grown in the field for propagation and the harvested T3 seeds were used for seed quality analysis (Figure 6). For each line, 25 g of paddy sample was dehulled to brown rice, and then milled for 90 s to well-milled rice, which was used for analysis using an NIRSystems model 6500 spectrophotometer (Foss NIRSystems, Inc., Silver Spring, MD, USA) equipped with a transport module, in the reflectance mode. Different calibration equations for starch, amylose, protein and fat were developed and validated by Foss Company. The value of each line was the average of three independent measurements.

Figure 6
figure 6

A flow diagram showing mutant analysis and data collection present in SHIP.

To set the criteria for selecting the mutants with altered seed qualities, the distributions of each content in wild-type (Zhonghua 11) plants were analyzed. All data sets were tested for normality using Shapiro-Wilk test (when P > 0.05, the data followed the normal distribution) 77, and the confidence limits of the 99% confidence interval were calculated and set as the criteria for mutant selection 78. Lines with values beyond the criteria were considered to have altered seed quality. After confirmation by second-round measurement, those with the same change trend were selected and analyzed.

Functional annotation of the characterized genes

The annotation information from rice gene ontology (http://www.tigr.org) was used to annotate the identified genes. To determine the significance of over- or under-representation of group genes for each GO term, the chi-square test was used.

Confirmation of the mutant lines related to reported genes

Five lines with putative mutation of the previously reported genes were selected and genomic DNA was extracted from four independent plants of each homozygous line. The T-DNA insertion was confirmed through PCR amplification using primers flanking the putative insertion site. The primers specific to each candidate gene were as follows: SHIP_ZS F0092 (F92-1: 5′-AGG GGC TGG AGT GCT AAA TA-3′; F92-2: 5′-GGA AAT AGA ATG ATA AAG GCG T-3′), SHIP_ZSF0235 (F235-1: 5′-TCT GTG ACC ACC TGA GCC T-3′; F235-2: 5′-GCC TTG ACC ATT CGT CCA TA-3′), SHIP_ZSF0781 (F781-1: 5′-GAG ACG AAT CTT TTG AGC CT-3′; F781-2: 5′-GCT TGG ATT TGG ATG GAC GG-3′), SHIP_ZSF0676 (F676-1: 5′-CAG ACG ACG ACG CTG ACG A-3′; F676-2: 5′-CAA GTT GGC AAA CGA TGG CT-3′), SHIP_ZD0819 (819-1: 5′-GTT GAA CGC GGA GAT AGA TTA-3′; 819-2: 5′-TGA GCA TTA GCA GTA GAG CC-3′), SHIP_ZS3493 (3493-1: 5′-AAC TTG CTC GTT TCA CTT CA-3′; 3493-2: 5′-CAA TGA GAT GGA GGG AGT AG-3′). The T-DNA left border primer NTL2 (5′-ATA GGG TTT CGC TCA TGT GTT GAG C-3′) was used.

For detailed analysis of mutant SHIP_ZSF0781, eight independent progenies separated from one heterozygous parent were tested through PCR amplification. The quality of seeds harvested from these progenies independently was measured through NIRS.

( Supplementary Information is linked to the online version of the paper on the Cell Research website.)