1
Jurado-Ruiz F, Nguyen TP, Peller J, Aranzana MJ, Polder G, Aarts MGM. LeTra: a leaf tracking workflow based on convolutional neural networks and intersection over union. Plant Methods 2024; 20:11. [PMID: 38233879 PMCID: PMC10795293 DOI: 10.1186/s13007-024-01138-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2023] [Accepted: 01/08/2024] [Indexed: 01/19/2024]
Abstract
BACKGROUND The study of plant photosynthesis is essential for productivity and yield. Thanks to the development of high-throughput phenotyping (HTP) facilities based on chlorophyll fluorescence imaging, photosynthetic traits can be measured in a reliable, reproducible and efficient manner. In most state-of-the-art HTP platforms, these traits are analyzed automatically at the individual plant level, but information at the leaf level is often restricted by the need for manual annotation. Automated leaf tracking over time is therefore highly desired. Methods for tracking individual leaves are still uncommon, convoluted, or require large datasets. Hence, applications and libraries with different techniques are required. New phenotyping platforms are initiated now more frequently than ever; however, the application of advanced computer vision techniques, such as convolutional neural networks, is still growing at a slow pace. Here, we provide a method for leaf segmentation and tracking through the fine-tuning of Mask R-CNN and intersection over union as a solution for leaf tracking on top-down images of plants. We also provide datasets and code for training and testing on both detection and tracking of individual leaves, aiming to stimulate the community to expand the current methodologies on this topic.
RESULTS We tested detection and segmentation on 523 Arabidopsis thaliana leaves at three different stages of development, obtaining a mean F-score of 0.956 on detection and 0.844 on segmentation overlap measured by intersection over union (IoU). On the tracking side, we tested nine different plants with 191 leaves. A total of 161 leaves were tracked without issues, amounting to 84.29% correct tracking, and a Higher Order Tracking Accuracy (HOTA) of 0.846. In our case study, leaf age and leaf order influenced photosynthetic capacity and photosynthetic response to light treatments. Leaf-dependent photosynthesis varies according to the genetic background.
CONCLUSION The method provided is robust for leaf tracking on top-down images. Although one of the strong components of the method is the low requirement in training data to achieve a good base result (based on fine-tuning), most of the tracking issues found could be solved by expanding the training dataset for the Mask R-CNN model.
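The abstract above describes tracking leaves by intersection over union between segmented masks. As a rough illustration only, the following minimal sketch shows how IoU-based matching between two consecutive frames could work. It is not the LeTra implementation: the set-of-pixels mask representation, the greedy matching strategy, the 0.5 threshold, and the names `rect`, `iou` and `match_leaves` are all assumptions made for this example.

```python
from typing import Dict, Set, Tuple

Pixel = Tuple[int, int]
Mask = Set[Pixel]  # a mask is modelled as a set of (row, col) pixels (assumption)


def rect(r0: int, c0: int, r1: int, c1: int) -> Mask:
    """Hypothetical helper: a rectangular mask covering rows r0..r1-1, cols c0..c1-1."""
    return {(r, c) for r in range(r0, r1) for c in range(c0, c1)}


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel masks (0.0 when both are empty)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_leaves(prev: Dict[int, Mask], curr: Dict[int, Mask],
                 threshold: float = 0.5) -> Dict[int, int]:
    """Greedy one-to-one matching: maps each current mask id to a previous leaf id.

    Pairs are visited from highest to lowest IoU; pairs below the (assumed)
    threshold are ignored, so unmatched current masks would become new leaves.
    """
    pairs = sorted(
        ((iou(p, c), pid, cid) for pid, p in prev.items() for cid, c in curr.items()),
        reverse=True,
    )
    assigned: Dict[int, int] = {}
    used_prev = set()
    for score, pid, cid in pairs:
        if score < threshold:
            break
        if pid in used_prev or cid in assigned:
            continue
        assigned[cid] = pid
        used_prev.add(pid)
    return assigned


# Two leaves in the previous frame; in the current frame both shifted by one
# pixel and a third, new mask appeared.
previous = {1: rect(0, 0, 4, 4), 2: rect(10, 10, 14, 14)}
current = {0: rect(10, 11, 14, 15), 1: rect(0, 1, 4, 5), 2: rect(20, 20, 22, 22)}
print(match_leaves(previous, current))
```

In this toy case each shifted mask keeps an IoU of 0.6 with its predecessor and inherits that leaf's identity, while the third mask stays unmatched. A greedy pass is the simplest choice; an optimal assignment (for example the Hungarian algorithm) would be a natural alternative when leaves overlap heavily.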
Affiliation(s)
- Federico Jurado-Ruiz
  - Center for Research in Agricultural Genomics (CRAG), Cerdanyola, 08193, Barcelona, Spain
- Thu-Phuong Nguyen
  - Laboratory of Genetics, Wageningen University and Research (WUR), Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
- Joseph Peller
  - Greenhouse Horticulture, Wageningen University and Research (WUR), Wageningen, The Netherlands
- María José Aranzana
  - Center for Research in Agricultural Genomics (CRAG), Cerdanyola, 08193, Barcelona, Spain
  - Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Barcelona, Spain
- Gerrit Polder
  - Greenhouse Horticulture, Wageningen University and Research (WUR), Wageningen, The Netherlands
- Mark G M Aarts
  - Laboratory of Genetics, Wageningen University and Research (WUR), Droevendaalsesteeg 1, 6708 PB, Wageningen, The Netherlands
2
López-Girona E, Zhang Y, Eduardo I, Mora JRH, Alexiou KG, Arús P, Aranzana MJ. A deletion affecting an LRR-RLK gene co-segregates with the fruit flat shape trait in peach. Sci Rep 2017; 7:6714. [PMID: 28751691 PMCID: PMC5532255 DOI: 10.1038/s41598-017-07022-0] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 04/10/2017] [Accepted: 06/20/2017] [Indexed: 01/01/2023]
Abstract
In peach, the flat phenotype is caused by a partially dominant allele in heterozygosis (Ss); fruits from homozygous trees (SS) abort a few weeks after fruit setting. Previous research has identified an SSR marker (UDP98-412) highly associated with the trait and suitable for marker-assisted selection (MAS). Here we report a ∼10 kb deletion affecting the gene PRUPE.6G281100, 400 kb upstream of UDP98-412, co-segregating with the trait. This gene is a leucine-rich repeat receptor-like kinase (LRR-RLK) orthologous to the Brassinosteroid insensitive 1-associated receptor kinase 1 (BAK1) group. PCR markers suitable for MAS confirmed its strong association with the trait in a collection of 246 cultivars. They were used to evaluate the DNA from a round fruit derived from a somatic mutation of the flat variety 'UFO-4', revealing that the mutation affected the flat-associated allele (S). Protein BLAST alignment identified significant hits with genes involved in different biological processes. The best protein hit was with AtRLP12, which may functionally complement CLAVATA2, a key regulator that controls the stem cell population size. RT-PCR analysis revealed the absence of transcription of the partially deleted allele. The data support PRUPE.6G281100 as a candidate gene for flat shape in peach.
Affiliation(s)
- Elena López-Girona
  - IRTA (Institut de Recerca i Tecnologia Agroalimentàries), Barcelona, Spain
- Yu Zhang
  - Centre for Research in Agricultural Genomics (CRAG) CSIC-IRTA-UAB-UB, Campus UAB, Bellaterra, Barcelona, Spain
- Iban Eduardo
  - IRTA (Institut de Recerca i Tecnologia Agroalimentàries), Barcelona, Spain
- Pere Arús
  - IRTA (Institut de Recerca i Tecnologia Agroalimentàries), Barcelona, Spain
3
Donoso JM, Eduardo I, Picañol R, Batlle I, Howad W, Aranzana MJ, Arús P. High-density mapping suggests cytoplasmic male sterility with two restorer genes in almond × peach progenies. Hortic Res 2015; 2:15016. [PMID: 26504569 PMCID: PMC4595988 DOI: 10.1038/hortres.2015.16] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 02/06/2015] [Accepted: 04/09/2015] [Indexed: 05/18/2023]
Abstract
Peach (Prunus persica) and almond (Prunus dulcis) are two sexually compatible species that produce fertile offspring. Almond, a highly polymorphic species, is a potential source of new genes for peach that has a strongly eroded gene pool. Here we describe the genetics of a male sterile phenotype that segregated in two almond ('Texas') × peach ('Earlygold') progenies: an F2 (T×E) and a backcross one (T1E) to the 'Earlygold' parent. High-density maps were developed using a 9k peach SNP chip and 135 simple-sequence repeats. Three highly syntenic and collinear maps were obtained: one for the F2 (T×E) and two for the backcross, T1E (for the hybrid) and E (for 'Earlygold'). A major reduction of recombination was observed in the interspecific maps (T×E and T1E) compared to the intraspecific parent (E). The E map also had extensive monomorphic genomic regions suggesting the presence of large DNA fragments identical by descent. Our data for the male sterility character were consistent with the existence of cytoplasmic male sterility, where individuals having the almond cytoplasm required the almond allele in at least one of two independent restorer genes, Rf1 and Rf2, to be fertile. The restorer genes were located in a 3.4 Mbp fragment of linkage group 2 (Rf1) and 1.4 Mbp of linkage group 6 (Rf2). Both fragments contained several genes coding for pentatricopeptide proteins, demonstrated to be responsible for restoring fertility in other species. The implications of these results for using almond as a source of novel variability in peach are discussed.
Affiliation(s)
- José Manuel Donoso
  - IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB; Campus UAB, Bellaterra (Cerdanyola del Vallès), 08193 Barcelona, Spain
- Iban Eduardo
  - IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB; Campus UAB, Bellaterra (Cerdanyola del Vallès), 08193 Barcelona, Spain
- Roger Picañol
  - IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB; Campus UAB, Bellaterra (Cerdanyola del Vallès), 08193 Barcelona, Spain
- Ignasi Batlle
  - IRTA. Centre de Mas de Bover. Crta. De Reus – El Morell Km 3.8. 43120 Constantí, Tarragona, Spain
- Werner Howad
  - IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB; Campus UAB, Bellaterra (Cerdanyola del Vallès), 08193 Barcelona, Spain
- María José Aranzana
  - IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB; Campus UAB, Bellaterra (Cerdanyola del Vallès), 08193 Barcelona, Spain
- Pere Arús
  - IRTA, Centre de Recerca en Agrigenòmica CSIC-IRTA-UAB-UB; Campus UAB, Bellaterra (Cerdanyola del Vallès), 08193 Barcelona, Spain
4
Zhao K, Aranzana MJ, Kim S, Lister C, Shindo C, Tang C, Toomajian C, Zheng H, Dean C, Marjoram P, Nordborg M. An Arabidopsis example of association mapping in structured samples. PLoS Genet 2006; 3:e4. [PMID: 17238287 PMCID: PMC1779303 DOI: 10.1371/journal.pgen.0030004] [Citation(s) in RCA: 443] [Impact Index Per Article: 24.6] [Received: 09/12/2006] [Accepted: 11/22/2006] [Indexed: 01/04/2023]
Abstract
A potentially serious disadvantage of association mapping is the fact that marker-trait associations may arise from confounding population structure as well as from linkage to causative polymorphisms. Using genome-wide marker data, we have previously demonstrated that the problem can be severe in a global sample of 95 Arabidopsis thaliana accessions, and that established methods for controlling for population structure are generally insufficient. Here, we use the same sample together with a number of flowering-related phenotypes and data-perturbation simulations to evaluate a wider range of methods for controlling for population structure. We find that, in terms of reducing the false-positive rate while maintaining statistical power, a recently introduced mixed-model approach that takes genome-wide differences in relatedness into account via estimated pairwise kinship coefficients generally performs best. By combining the association results with results from linkage mapping in F2 crosses, we identify one previously known true positive and several promising new associations, but also demonstrate the existence of both false positives and false negatives. Our results illustrate the potential of genome-wide association scans as a tool for dissecting the genetics of natural variation, while at the same time highlighting the pitfalls. The importance of study design is clear; our study is severely under-powered both in terms of sample size and marker density. Our results also provide a striking demonstration of confounding by population structure. While statistical methods can be used to ameliorate this problem, they cannot always be effective and are certainly not a substitute for independent evidence, such as that obtained via crosses or transgenic experiments. Ultimately, association mapping is a powerful tool for identifying a list of candidates that is short enough to permit further genetic study. 
There is currently tremendous interest in using association mapping to find the genes responsible for natural variation, particularly for human disease. In association mapping, researchers seek to identify regions of the genome where individuals who are phenotypically similar (e.g., they all have the same disease) are also unusually closely related. A potentially serious problem is that spurious correlations may arise if the population is structured so that members of a subgroup tend to be much more closely related. We have previously demonstrated that this problem can be severe in Arabidopsis thaliana, and that established statistical methods for controlling for population structure are insufficient. Here, we evaluate a broader range of methods. We find that a recently introduced mixed-model approach generally performs best. By combining the association results with results from linkage mapping in F2 crosses, we identify one previously known true positive and several promising new associations, but also demonstrate the existence of both false positives and false negatives. Our results illustrate the potential of genome-wide association scans as a tool for dissecting the genetics of natural variation, while at the same time highlighting the pitfalls.
Affiliation(s)
- Keyan Zhao
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- María José Aranzana
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Sung Kim
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Clare Lister
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Chikako Shindo
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Chunlao Tang
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Christopher Toomajian
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Honggang Zheng
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Caroline Dean
  - Department of Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Paul Marjoram
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Magnus Nordborg
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
5
Aranzana MJ, Kim S, Zhao K, Bakker E, Horton M, Jakob K, Lister C, Molitor J, Shindo C, Tang C, Toomajian C, Traw B, Zheng H, Bergelson J, Dean C, Marjoram P, Nordborg M. Genome-wide association mapping in Arabidopsis identifies previously known flowering time and pathogen resistance genes. PLoS Genet 2005; 1:e60. [PMID: 16292355 PMCID: PMC1283159 DOI: 10.1371/journal.pgen.0010060] [Citation(s) in RCA: 274] [Impact Index Per Article: 14.4] [Received: 08/29/2005] [Accepted: 10/10/2005] [Indexed: 11/19/2022]
Abstract
There is currently tremendous interest in the possibility of using genome-wide association mapping to identify genes responsible for natural variation, particularly for human disease susceptibility. The model plant Arabidopsis thaliana is in many ways an ideal candidate for such studies, because it is a highly selfing hermaphrodite. As a result, the species largely exists as a collection of naturally occurring inbred lines, or accessions, which can be genotyped once and phenotyped repeatedly. Furthermore, linkage disequilibrium in such a species will be much more extensive than in a comparable outcrossing species. We tested the feasibility of genome-wide association mapping in A. thaliana by searching for associations with flowering time and pathogen resistance in a sample of 95 accessions for which genome-wide polymorphism data were available. In spite of an extremely high rate of false positives due to population structure, we were able to identify known major genes for all phenotypes tested, thus demonstrating the potential of genome-wide association mapping in A. thaliana and other species with similar patterns of variation. The rate of false positives differed strongly between traits, with more clinal traits showing the highest rate. However, the false positive rates were always substantial regardless of the trait, highlighting the necessity of an appropriate genomic control in association studies.
Affiliation(s)
- María José Aranzana
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Sung Kim
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Keyan Zhao
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Erica Bakker
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Matthew Horton
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Katrin Jakob
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Clare Lister
  - Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- John Molitor
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Chikako Shindo
  - Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Chunlao Tang
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Christopher Toomajian
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Brian Traw
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Honggang Zheng
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
- Joy Bergelson
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Caroline Dean
  - Cell and Developmental Biology, John Innes Centre, Norwich, United Kingdom
- Paul Marjoram
  - Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Magnus Nordborg
  - Molecular and Computational Biology, University of Southern California, Los Angeles, California, United States of America
6
Aranzana MJ, Carbó J, Arús P. Microsatellite variability in peach [Prunus persica (L.) Batsch]: cultivar identification, marker mutation, pedigree inferences and population structure. Theor Appl Genet 2003; 106:1341-1352. [PMID: 12750778 DOI: 10.1007/s00122-002-1128-5] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.1] [Received: 05/27/2002] [Accepted: 08/08/2002] [Indexed: 05/23/2023]
Abstract
A collection of 212 peach and nectarine cultivars covering a wide variation of the species was studied with 16 polymorphic single-locus microsatellite, or simple-sequence repeat (SSR), markers. The average number of alleles per locus was 7.3, 35% of the cultivar × locus combinations analyzed were heterozygous, and 87% of the cultivars studied could be individually identified. Most of the groups where two or more cultivars had the same SSR fingerprint included known peach mutants or possible synonymies. Pedigree information was tested with the SSR data. Five unexpected genotypes, due to a mutation at five SSR loci, were found when comparing the SSR fingerprint of 14 known mutant cultivars and putative synonymous cultivars. The pedigree data were not consistent with the observed data in 11 out of 38 cases that could be analyzed. The group of non-melting fruit flesh cultivars, generally used by the canning industry, was more variable and genetically distant than the rest of the cultivars tested. Based on their level of homozygosity it was possible to separate those cultivars that were obtained by modern breeding technologies from those that were selected from traditional orchards after generations of seed propagation. The former had a distribution of genotypic frequencies close to a random mating model while the latter had a higher level of homozygosity. The implications of these data for the use of SSR fingerprints in breeder's rights protection and peach breeding are discussed.
Affiliation(s)
- M J Aranzana
  - IRTA, Departament de Genètica Vegetal, Carretera de Cabrils s/n; 08348 Cabrils, Barcelona, Spain
7
Aranzana MJ, Pineda A, Cosson P, Dirlewanger E, Ascasibar J, Cipriani G, Ryder CD, Testolin R, Abbott A, King GJ, Iezzoni AF, Arús P. A set of simple-sequence repeat (SSR) markers covering the Prunus genome. Theor Appl Genet 2003; 106:819-825. [PMID: 12647055 DOI: 10.1007/s00122-002-1094-y] [Citation(s) in RCA: 62] [Impact Index Per Article: 3.0] [Received: 04/24/2002] [Accepted: 08/06/2002] [Indexed: 05/24/2023]
Abstract
A set of 109 microsatellite primer pairs recently developed for peach and cherry have been studied in the almond × peach F2 progeny previously used to construct a saturated Prunus map containing mainly restriction fragment length polymorphism markers. All but one gave amplification products, and 87 (80%) segregated in the progeny and detected 96 loci. The resulting Prunus map contains a total of 342 markers covering a total distance of 522 cM. The approximate position of nine additional simple sequence repeats (SSRs) was established by comparison with other almond and peach maps. SSRs were placed in all eight linkage groups of this map, and their distribution was relatively even, providing genome-wide coverage with an average density of 5.4 cM/SSR. Twenty-four single-locus SSRs, highly polymorphic in peach, and each falling within 24 evenly spaced approximately 25-cM regions covering the whole Prunus genome, are proposed as a 'genotyping set' useful as a reference for fingerprinting, pedigree and genetic analysis of this species.
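The figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below assumes that the reported average density is the total map length divided by the number of SSR loci, which is a plausible reading of the abstract and not something it states explicitly; the function names are hypothetical.

```python
def average_spacing(total_cm: float, n_loci: int) -> float:
    """Average map distance per locus, in cM (assumed definition of 'density')."""
    return total_cm / n_loci


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole


# Values taken from the abstract: 522 cM map, 96 SSR loci,
# 87 segregating primer pairs out of 109 tested.
print(f"{average_spacing(522, 96):.1f} cM/SSR")
print(f"{percent(87, 109):.0f}% of primer pairs segregated")
```

Under that assumption the two results round to the 5.4 cM/SSR and 80% given in the abstract.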
Affiliation(s)
- M J Aranzana
  - Institut de Recerca i Tecnologia Agroalimentàries (IRTA), Departament de Genética Vegetal, Carretera de Cabrils s/n, 08348 Cabrils (Barcelona), Spain