1
Xu M, Kim H, Yang J, Fuentes A, Meng Y, Yoon S, Kim T, Park DS. Embracing limited and imperfect training datasets: opportunities and challenges in plant disease recognition using deep learning. FRONTIERS IN PLANT SCIENCE 2023; 14:1225409. [PMID: 37810377 PMCID: PMC10557492 DOI: 10.3389/fpls.2023.1225409] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/19/2023] [Accepted: 08/30/2023] [Indexed: 10/10/2023]
Abstract
Recent advancements in deep learning have brought significant improvements to plant disease recognition. However, achieving satisfactory performance often requires high-quality training datasets, which are challenging and expensive to collect. Consequently, the practical application of current deep learning-based methods in real-world scenarios is hindered by the scarcity of high-quality datasets. In this paper, we argue that embracing poor datasets is viable and aim to explicitly define the challenges associated with using these datasets. To delve into this topic, we analyze the characteristics of high-quality datasets, namely, large-scale images and desired annotation, and contrast them with the limited and imperfect nature of poor datasets. Challenges arise when the training datasets deviate from these characteristics. To provide a comprehensive understanding, we propose a novel and informative taxonomy that categorizes these challenges. Furthermore, we offer a brief overview of existing studies and approaches that address these challenges. We point out that our paper sheds light on the importance of embracing poor datasets, enhances the understanding of the associated challenges, and contributes to the ambitious objective of deploying deep learning in real-world applications. Finally, to facilitate progress, we describe several outstanding questions and point out potential future directions. Although our primary focus is on plant disease recognition, we emphasize that the principles of embracing and analyzing poor datasets are applicable to a wider range of domains, including agriculture. Our project is publicly available at https://github.com/xml94/EmbracingLimitedImperfectTrainingDatasets.
Affiliation(s)
- Mingle Xu
  - Department of Electronic Engineering, Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju, Republic of Korea
- Hyongsuk Kim
  - Department of Electronic Engineering, Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju, Republic of Korea
- Jucheng Yang
  - College of Artificial Intelligence, Tianjin University of Science and Technology, Tianjin, China
- Alvaro Fuentes
  - Department of Electronic Engineering, Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju, Republic of Korea
- Yao Meng
  - Department of Electronic Engineering, Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju, Republic of Korea
- Sook Yoon
  - Department of Computer Engineering, Mokpo National University, Muan, Republic of Korea
- Taehyun Kim
  - National Institute of Agricultural Sciences, Wanju, Republic of Korea
- Dong Sun Park
  - Department of Electronic Engineering, Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonju, Republic of Korea
2
Cui Z, Li K, Kang C, Wu Y, Li T, Li M. Plant and Disease Recognition Based on PMF Pipeline Domain Adaptation Method: Using Bark Images as Meta-Dataset. PLANTS (BASEL, SWITZERLAND) 2023; 12:3280. [PMID: 37765444 PMCID: PMC10534746 DOI: 10.3390/plants12183280] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2023] [Revised: 09/11/2023] [Accepted: 09/13/2023] [Indexed: 09/29/2023]
Abstract
Efficient image recognition is important in crop and forest management. However, it faces many challenges, such as the large number of plant species and diseases, the variability of plant appearance, and the scarcity of labeled data for training. To address this issue, we modified a SOTA Cross-Domain Few-shot Learning (CDFSL) method based on prototypical networks and attention mechanisms. We employed attention mechanisms to perform feature extraction and prototype generation by focusing on the most relevant parts of the images, then used prototypical networks to learn the prototype of each category and classify new instances. Finally, we demonstrated the effectiveness of the modified CDFSL method on several plant and disease recognition datasets. The results showed that the modified pipeline was able to recognize several cross-domain datasets using generic representations, and achieved up to 96.95% and 94.07% classification accuracy on datasets with the same and different domains, respectively. In addition, we visualized the experimental results, demonstrating the model's stable transfer capability between datasets and the model's high visual correlation with plant and disease biological characteristics. Moreover, by extending the classes of different semantics within the training dataset, our model can be generalized to other domains, which implies broad applicability.
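The classification step named in this abstract (learn one prototype per category, then assign a new instance to the nearest prototype) can be sketched as follows. This is only a minimal illustration of the general prototypical-network idea, not the authors' pipeline: the attention-based feature extractor is not shown, the class labels and 2-D embeddings are hypothetical toy values, and Euclidean distance is an assumption, since the abstract does not state the metric.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def build_prototypes(support):
    """Map each class label to the mean of its support embeddings."""
    return {label: mean_vector(vecs) for label, vecs in support.items()}

def classify(query, prototypes):
    """Return the label whose prototype is nearest in Euclidean distance."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))

# Hypothetical toy embeddings standing in for the output of a feature extractor.
support = {
    "healthy": [[0.0, 0.0], [0.2, 0.0]],
    "rust": [[1.0, 1.0], [1.0, 0.8]],
}
prototypes = build_prototypes(support)
prediction = classify([0.9, 0.7], prototypes)
print(prediction)
```

With only a few labeled examples per class, the prototype is simply their mean embedding, which is why this style of classifier is commonly used in few-shot settings.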
Affiliation(s)
- Mingyang Li
  - Co-Innovation Center for Sustainable Forestry in Southern China, Nanjing Forestry University, Nanjing 210037, China; (Z.C.); (K.L.); (C.K.); (Y.W.); (T.L.)
3
Wu X, Deng H, Wang Q, Lei L, Gao Y, Hao G. Meta-learning shows great potential in plant disease recognition under few available samples. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2023; 114:767-782. [PMID: 36883481 DOI: 10.1111/tpj.16176] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/27/2022] [Revised: 02/15/2023] [Accepted: 02/23/2023] [Indexed: 05/27/2023]
Abstract
Plant diseases worsen the threat of food shortage with the growing global population, and disease recognition is the basis for the effective prevention and control of plant diseases. Deep learning has made significant breakthroughs in the field of plant disease recognition. Compared with traditional deep learning, meta-learning can still maintain more than 90% accuracy in disease recognition with small samples. However, there is no comprehensive review on the application of meta-learning in plant disease recognition. Here, we mainly summarize the functions, advantages, and limitations of meta-learning research methods and their applications for plant disease recognition in few-data scenarios. Finally, we outline several research avenues for utilizing current and future meta-learning in plant science. This review may help plant science researchers obtain faster, more accurate, and more credible solutions through deep learning with fewer labeled samples.
Affiliation(s)
- Xue Wu
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, Guizhou, China
- Hongyu Deng
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, Guizhou, China
- Qi Wang
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, Guizhou, China
- Liang Lei
  - School of Physics & Optoelectronic Engineering, Guangdong University of Technology, Guangzhou, 550000, Guangzhou, China
- Yangyang Gao
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, Guizhou, China
- Gefei Hao
  - National Key Laboratory of Green Pesticide, Key Laboratory of Green Pesticide and Agricultural Bioengineering, Ministry of Education, Center for Research and Development of Fine Chemicals, State Key Laboratory of Public Big Data, Guizhou University, Guiyang, 550025, Guizhou, China
4
Jiang J, Li J, Li J, Pei H, Li M, Zou Q, Lv Z. A Machine Learning Method to Identify Umami Peptide Sequences by Using Multiplicative LSTM Embedded Features. Foods 2023; 12:1498. [PMID: 37048319 PMCID: PMC10094688 DOI: 10.3390/foods12071498] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/26/2023] [Revised: 03/24/2023] [Accepted: 03/30/2023] [Indexed: 04/05/2023]
Abstract
Umami peptides enhance the umami taste of food and have good food processing properties, nutritional value, and numerous potential applications. Wet testing for the identification of umami peptides is a time-consuming and expensive process. Here, we report iUmami-DRLF, which applies a logistic regression (LR) method solely to features extracted from peptide sequences by a pre-trained deep learning model, unified representation (UniRep, based on multiplicative LSTM). The findings demonstrate that deep representation learning significantly enhanced the capability and predictive precision of models in identifying umami peptides based solely on peptide sequence information. Newly validated taste sequences were also used to test iUmami-DRLF and other predictors, and the results indicate that iUmami-DRLF has better robustness and accuracy and remains valid at higher probability thresholds. The iUmami-DRLF method can aid further studies on enhancing the umami flavor of food to satisfy the need for an umami-flavored diet.
Affiliation(s)
- Jici Jiang
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Jiayu Li
  - College of Life Science, Sichuan University, Chengdu 610065, China
- Junxian Li
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Hongdi Pei
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
  - Wu Yuzhang Honors College, Sichuan University, Chengdu 610065, China
- Mingxin Li
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  - Yangtze Delta Region Institute (Quzhou), University of Electronic Science and Technology of China, Quzhou 324000, China
- Zhibin Lv
  - College of Biomedical Engineering, Sichuan University, Chengdu 610065, China
5
Xu M, Yoon S, Jeong Y, Park DS. Transfer learning for versatile plant disease recognition with limited data. FRONTIERS IN PLANT SCIENCE 2022; 13:1010981. [PMID: 36507376 PMCID: PMC9726777 DOI: 10.3389/fpls.2022.1010981] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Accepted: 10/20/2022] [Indexed: 06/17/2023]
Abstract
Deep learning has improved significantly in recent years at recognizing plant diseases from their images. To achieve decent performance, current deep learning models tend to require a large-scale dataset. However, collecting a dataset is expensive and time-consuming. Hence, limited data is one of the main challenges to achieving the desired recognition accuracy. Although transfer learning is heavily discussed and verified as an effective and efficient method to mitigate the challenge, most proposed methods focus on one or two specific datasets. In this paper, we propose a novel transfer learning strategy that achieves high performance for versatile plant disease recognition on multiple plant disease datasets. Our transfer learning strategy differs from the current popular one in the following ways. First, PlantCLEF2022, a large-scale dataset related to plants with 2,885,052 images and 80,000 classes, is utilized to pre-train a model. Second, we adopt a vision transformer (ViT) model instead of a convolutional neural network. Third, the ViT model undergoes transfer learning twice to save computations. Fourth, the model is first pre-trained on ImageNet with a self-supervised loss function and then with a supervised loss function on PlantCLEF2022. We apply our method to 12 plant disease datasets, and the experimental results suggest that our method surpasses the popular one by a clear margin for different dataset settings. Specifically, our proposed method achieves a mean testing accuracy of 86.29 over the 12 datasets in a 20-shot case, 12.76 higher than the current state-of-the-art method's accuracy of 73.53. Furthermore, our method outperforms other methods on one plant growth stage prediction dataset and one weed recognition dataset. To encourage the community and related applications, we have made our code and pre-trained model public.
Affiliation(s)
- Mingle Xu
  - Department of Electronics Engineering, Jeonbuk National University, Jeonbuk, South Korea
  - Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonbuk, South Korea
- Sook Yoon
  - Department of Computer Engineering, Mokpo National University, Jeonnam, South Korea
- Yongchae Jeong
  - Department of Electronics Engineering, Jeonbuk National University, Jeonbuk, South Korea
- Dong Sun Park
  - Department of Electronics Engineering, Jeonbuk National University, Jeonbuk, South Korea
  - Core Research Institute of Intelligent Robots, Jeonbuk National University, Jeonbuk, South Korea
6
Alejandrino JD, II RSC, Sybingco E, Palconit MGB, Bautista MGAC, Bandala AA, Dadios EP. fMaize: A Seamless Image Filtering and Deep Transfer EfficientNet-b0 Model for Sub-Classifying Fungi Species Infecting Zea mays Leaves. JOURNAL OF ADVANCED COMPUTATIONAL INTELLIGENCE AND INTELLIGENT INFORMATICS 2022. [DOI: 10.20965/jaciii.2022.p0914] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
Identifying fungi infecting Zea mays leaves and sub-classifying them, so that the correct course of management can be taken in the earlier stages, is lucrative. This study aimed to develop a nondestructive and low-cost classification model of corn leaves infected by Setosphaeria turcica (ST), Cercospora zeae-maydis (CZM), and Puccinia sorghi (PS) fungi using image filtering and a transfer learning model. Corn leaf images were categorized based on fungal infection and stored in an image library. All images were then processed to show different intensities, which were then utilized to filter the images. An original RGB-based CNN model was compared with selected pre-trained VGG16 and EfficientNet-b0 models, with both unfiltered and filtered RGB images as inputs. Results showed that the EfficientNet-b0 model with filtered images (fMaize) exhibited the highest accuracy of 97.63%, sensitivity of 97.99%, specificity of 97.38%, quality index of 97.68%, and F-score of 96.48%. Consequently, the experimental results revealed that deep transfer learning models fed with filtered images produced higher accuracy than models that simply employed RGB images. Thus, transfer learning was proven to be a valuable tool in enhancing CNN image classification accuracy.