News Column

"Means and Methods for the Determination of Prediction Models Associated with a Phenotype" in Patent Application Approval Process

August 28, 2014

By a News Reporter-Staff News Editor at Agriculture Week -- A patent application by the inventors Inze, Dirk G. (Moorsel-Aalst, BE); Gonzalez, Nathalie (Merelbeke, BE); De Bodt, Stefanie (Gent, BE); Saeys, Yvan (Oudenaarde, BE), filed on June 25, 2012, was made available online on August 14, 2014, according to news reporting originating from Washington, D.C., by VerticalNews correspondents.

This patent application is assigned to Universiteit Gent.

The following quote was obtained by the news editors from the background information supplied by the inventors: "The heritable differences in genomes that are reflected in the variation of the expression of a particular phenotype, and which contribute to the range of phenotypes observed for any of a number of phenotypes, form the basis for decisions in plant and animal breeding. Typically, any one phenotype will be modulated by multiple genetic factors and differences of these genetic factors between individuals can be associated with a variation in the phenotypic outcome between individuals. In the instance where the phenotype is the product of one or more transgenes or where the phenotype is influenced by one or more transgenes, it is expected that several genetic factors in the organism's genome contributes to the phenotype of the transgene or to the phenotype influenced by the transgene. The possibility to manipulate plant phenotypes that affect the production of food, fiber and renewable energy has important agricultural consequences. Indeed, the most important goal in plant breeding is to meet a product concept by selecting the most promising plants as founders for further breeding or by selecting the best germplasm candidates for introduction of a transgene. Breeders are faced with a constant challenge to improve and shorten the timelines of the breeding processes. The outcome of a phenotype may be impacted by constitutive genes or more typically by genes which are only expressed at specific points in time during development in a plant. Allelic variants of constitutive genes, copy number variations, deletions, the presence of specific microRNA populations, promoter variations may all impact the genetic outcome of a particular phenotype. There is currently no magic approach for identifying genes which are correlated with important plant phenotypes. Forward genetics is limited as mutations in many genes may generate only moderate or weak phenotypes. Similarly, although reverse genetics allows for directed assay of gene perturbations, saturated phenotyping for many plant phenotypes is impractical.

"In the prior art, several attempts were made to identify simple genes, like individual transcripts, in order to describe certain plant phenotypes, and even more complex plant phenotypes like biomass production and growth. Most of these attempts had no satisfactory outcome, since complex traits usually are related to a more complex network of transcripts all partially representing such complex phenotypes.

"Another approach which has been proposed in the art is the computational identification of likely candidate genes for desired phenotypes, allowing for focused, efficient use of reverse genetics. An emerging approach for prioritizing candidate genes is network-guided guilt by association. In this approach, functional associations are first determined between genes in a genome on the basis of extensive experimental data sets such as microarray data sets. Probabilistic functional gene networks aim at integrating heterogeneous biological data into a single model, enhancing both model accuracy and coverage. Once a suitable network is generated, new candidate genes are proposed for phenotypes based upon network associations with genes previously linked to these phenotypes. Such network-guided screening has been successfully applied to the reference flowering plant, Arabidopsis thaliana (Insuk Lee et al., (2009) Nature Biotechnology 28(2) 149). Obviously, a key to progress towards breeding better crops has been to understand the changes in cellular, biochemical and molecular machinery that occur associated with a particular phenotype. The development of genetically engineered plants by the overexpression or downregulation of selected genes seems to be a viable option to hasten the breeding of 'improved' plants but has thus far not generated a significant impact on the generation of crops with improved quantitative traits such as yield, drought tolerance and abiotic stress tolerance.

"A further aspect is the unpredictable performance of a particular transgene in a given plant genetic background. In the past, a great deal of scientific effort has been invested in the development of transformation systems in plants. Transformation is normally used to introduce single novel genes into a plant and this gene usually modifies a single important characteristic of the recipient line. There are still barriers, however, to the transformation of agronomically-proven important crop genotypes, and several of these can be overcome by conventional crossing strategies. In some crop species only certain cultivars can be transformed efficiently and these often yield less than the most modern varieties and elite breeding material. In these cases, conventional breeding is used to transfer a promising transgene from a donor cultivar to a modern variety, and thus combine benefits of transformation and conventional breeding methods. To have the optimum potential, transgenic varieties should have genetic backgrounds which have been selected for maximum yield and good quality characteristics under normal agronomic conditions. The genotype of an elite variety is a complex assembly of genes controlling a large number of characters. To have the best effect, transgenes should be introduced (e.g., by crossing or transformation) in genetic backgrounds with an optimal plant transcriptional network able to synergize with the introduced transgene. It is known that every genetic background has its modifiers genes which influence the expression of a particular transgene. The speed with which transgenes are transferred into improved genetic backgrounds is accelerated by the application of marker-assisted breeding techniques. Marker-assisted backcrossing programs can introgress transgenes into elite varieties by selecting indirectly for the large numbers of alleles (with complex interactions) that make up a superior genotype. The latter is done without the need to identify the individual genes involved or to understand their modes of action. In the prior art methods have been described for the identification of loci modulating transgene performance in plant breeding through the screening of germplasm entries (see, for example, WO2009002924).

"Notwithstanding the foregoing, the current scientific opinion is that distinct gene networks operate in different genetic backgrounds or exist in plants grown in various environmental conditions. These gene networks contribute to the presence of a particular phenotype. A specific gene network for a given phenotype could be a valuable breeder tool to assist breeders in selecting the most valuable plant, with an expected phenotype, from, for example, a germplasm collection of immature plants or could assist breeders in selecting the most valuable genotype for the introduction of a trait able to influence a particular phenotype. It is a challenge to identify such gene networks which are specifically associated with a predicted phenotype of interest in a plant."
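
For readers unfamiliar with the network-guided guilt-by-association approach mentioned in the background, the idea can be illustrated with a minimal Python sketch. It is not taken from the patent application or from the cited Arabidopsis study: the gene names, association weights, and the scoring rule (summing the weights of a candidate's links to genes already tied to the phenotype) are all hypothetical and chosen only to show the principle.

```python
from collections import defaultdict


def rank_candidates(edges, seed_genes):
    """Score each non-seed gene by the summed weight of its links to seed genes."""
    scores = defaultdict(float)
    for gene_a, gene_b, weight in edges:
        if gene_a in seed_genes and gene_b not in seed_genes:
            scores[gene_b] += weight
        elif gene_b in seed_genes and gene_a not in seed_genes:
            scores[gene_a] += weight
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical functional network: (gene, gene, association weight).
EDGES = [
    ("geneA", "geneB", 0.9),
    ("geneA", "geneC", 0.4),
    ("geneB", "geneC", 0.5),
    ("geneB", "geneD", 0.3),
    ("geneC", "geneD", 0.8),
    ("geneD", "geneE", 0.7),
]
# Hypothetical genes assumed to be already linked to the phenotype.
SEEDS = {"geneA", "geneB"}

ranking = rank_candidates(EDGES, SEEDS)
for gene, score in ranking:
    print(f"{gene}: {score:.2f}")
```

In this toy network geneC ranks ahead of geneD because it is connected to both seed genes, while geneE receives no score because it has no direct link to a seed gene.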

In addition to the background information obtained for this patent application, VerticalNews journalists also obtained the inventors' summary information for this patent application: "Demonstrated is that a combination of a set of absolute expression-values of specific genes in combination with a statistical model (i.e., herein defined as a plant phenotype predictor) is associated with a high likelihood of a specific predicted phenotype of interest. In other words, it was found that the specific composition and its absolute expression values of a gene expression network represents (or is associated with or corresponds with) a complex phenotype of interest of a plant, such as, for example, leaf biomass production.

"Accordingly, the disclosure relates to methods of predicting a future phenotype of interest in an organism such as a plant. In one embodiment, the disclosure enables the artisan to associate the presence of absolute gene expression signatures in plants, in combination with a suitable statistical model, with a predicted phenotype of interest in an organism such as a plant.

"Accordingly, the present disclosure for the first time provides the above-described direct proof that the output of a specific plant phenotype predictor is highly correlated with the expression of a certain phenotype of a plant, like, for example, leaf biomass production. One further merit of the disclosure is the successful demonstration that a future plant phenotype can be predicted based on the presence of an absolute gene expression signature in a plant present in a collection of immature plants.

"Moreover, it could be shown in the context of this disclosure, that for training the statistical model to be applied for predicting the phenotype of interest, not necessarily those plants have (or this group of plants has) to be analyzed (e.g., by performing a gene expression profile analysis of a particular tissue of each of the plants) for which the prediction is intended to be carried out. As also exemplary, shown in the appended experimental part, the prediction of the expression of a phenotype can also be carried out for plants, which were not employed for establishing the plant phenotype predictor. The latter means that the plant phenotype predictor was calculated (or established) in a training population and that the plant phenotype predictor can be used in other plants, which do not belong to the training population. In still other words, a prediction of the presence of a future phenotype is also possible for such plants which were cultivated independently from those plants which were initially employed (or 'analyzed,' according to the methods described herein) for the training of the correlation model. Hence, the methods provided herein can also be applied, when the (group of) plants employed for generating the correlation model were grown independently of the (group of) plants for which the phenotype of interest is to be predicted. The meaning of plants which 'were employed' refers to the fact that a gene expression profiling method is applied on the plants. It is expected that slight differences in environmental conditions, which exist between independent cultivations, do not constrain the predictiveness of the plant phenotype predictor with respect to the potential for the presence of a corresponding plant phenotype. These are further advantages of the present disclosure.

"In still other words, the present disclosure, relates in a genotype independent manner to the identification of plants comprising a predicted phenotype of interest based on calculating the correspondence between a plant phenotype predictor and the phenotype of interest with a statistical model.

"The findings provided herein offer agricultural potential for a number of applied purposes. For example, the possibility to predict the presence of certain plant phenotypes on the basis of the presence of one or more absolute gene expression signatures, in combination with an established statistical model established in a training set of plants, in one or more immature plants present in a group of plants revolutionizes the selection and thus breeding processes of plants. Particularly, with respect to biomass producers, such as trees that are cultivated for many years or even decades before harvest, the means and methods of the present disclosure are highly advantageous. The identification of certain plants that are capable of expressing (a) certain phenotype(s) in a desired manner, for example, potentially high biomass producers, already at an early growth stage, preferably an immature growth stage, even at the seed stage, can result in enormous time and cost-savings, especially in selection and breeding procedures.
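
The workflow the inventors describe, in which a statistical model is trained on expression values from one group of plants and then applied to independently grown plants, can be sketched in a few lines of Python. This is an illustration only and not the inventors' method: it substitutes a simple nearest-centroid classifier for the support vector machines named in the figure captions, and the expression values, gene count, and class labels are invented.

```python
from math import dist
from statistics import mean


def train_centroids(profiles, labels):
    """Return one mean expression profile (centroid) per phenotype class."""
    centroids = {}
    for label in set(labels):
        rows = [p for p, l in zip(profiles, labels) if l == label]
        centroids[label] = [mean(column) for column in zip(*rows)]
    return centroids


def predict(centroids, profile):
    """Assign the class whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: dist(centroids[label], profile))


# Invented absolute expression values for three genes, measured in young
# plants whose final phenotype class is assumed to be known (training set).
TRAIN_PROFILES = [
    [8.1, 2.0, 5.5],
    [7.9, 2.3, 5.1],
    [4.2, 6.1, 5.3],
    [4.0, 5.8, 5.6],
]
TRAIN_LABELS = ["large", "large", "small", "small"]

# Invented profiles from plants grown separately from the training set.
NEW_PROFILES = [[7.5, 2.5, 5.0], [4.5, 6.0, 5.4]]

model = train_centroids(TRAIN_PROFILES, TRAIN_LABELS)
predictions = [predict(model, profile) for profile in NEW_PROFILES]
print(predictions)  # ['large', 'small']
```

The point of the sketch is the separation of the two steps: the model is built once from the training plants and is then reused, unchanged, on plants that played no part in training.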


"FIG. 1: Correlation initial leaf size versus final leaf size.

"FIG. 2a: Prediction of final leaf size. Classification results using support vector machines on 100 real (dark) and random (grey) datasets.

"FIG. 2b: Prediction of leaf size at harvest. Classification results using support vector machines on 100 real (dark) and random (grey) datasets.

"FIG. 2c: Prediction of final rosette size. Classification results using support vector machines on 100 real (black) and random (grey) datasets.

"FIG. 2d: Classification based on mechanism results using support vector machines on 100 real (black) and random (grey) datasets.

"FIG. 3: Summary of regression analysis

"FIG. 4: Co-expression network of the growth predictors based on the expression data in small plants (PCC >0.65).

"FIG. 5: Co-expression network of the growth predictors based on the expression data in large plants (PCC >0.65)."
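
The captions for FIGS. 4 and 5 refer to co-expression networks built with a Pearson correlation coefficient (PCC) cutoff of 0.65. A minimal Python sketch of that kind of construction follows. The expression values and gene names are invented, and applying the cutoff to the signed correlation rather than to its absolute value is an assumption, since the captions do not say which was used.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def coexpression_edges(expression, threshold=0.65):
    """Link every gene pair whose PCC across samples exceeds the threshold."""
    edges = []
    for gene_1, gene_2 in combinations(sorted(expression), 2):
        r = pearson(expression[gene_1], expression[gene_2])
        if r > threshold:
            edges.append((gene_1, gene_2, round(r, 3)))
    return edges


# Invented expression values for four genes across five samples.
EXPRESSION = {
    "gene1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "gene2": [2.1, 3.9, 6.2, 8.0, 9.9],  # rises together with gene1
    "gene3": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as gene1 rises
    "gene4": [3.0, 1.0, 4.0, 1.0, 5.0],  # only weakly related to the others
}

edges = coexpression_edges(EXPRESSION)
for gene_1, gene_2, r in edges:
    print(gene_1, gene_2, r)
```

With these values only gene1 and gene2 are linked: gene3 is strongly but negatively correlated with both, and gene4 correlates too weakly with any other gene to pass the cutoff.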

For the URL and more information on this patent application, see: Inze, Dirk G.; Gonzalez, Nathalie; De Bodt, Stefanie; Saeys, Yvan. Means and Methods for the Determination of Prediction Models Associated with a Phenotype. Filed June 25, 2012 and posted August 14, 2014. Patent URL:

Keywords for this news article include: Agricultural, Universiteit Gent.

Our reports deliver fact-based news of research and discoveries from around the world. Copyright 2014, NewsRx LLC


Source: Agriculture Week
