1
Ohana-Levi N, Derumigny A, Peeters A, Ben-Gal A, Bahat I, Katz L, Netzer Y, Naor A, Cohen Y. A multifunctional matching algorithm for sample design in agricultural plots. Comput Electron Agric 2021; 187:106262. [PMID: 34381288 PMCID: PMC8329933 DOI: 10.1016/j.compag.2021.106262] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2020] [Revised: 06/07/2021] [Accepted: 06/08/2021] [Indexed: 06/13/2023]
Abstract
Collection of accurate and representative data from agricultural fields is required for efficient crop management. Since growers have limited available resources, there is a need for advanced methods to select representative points within a field in order to best satisfy sampling or sensing objectives. The main purpose of this work was to develop a data-driven method for selecting locations across an agricultural field given observations of some covariates at every point in the field. These chosen locations should be representative of the distribution of the covariates in the entire population and represent the spatial variability in the field. They can then be used to sample an unknown target feature whose sampling is expensive and cannot be realistically done at the population scale. An algorithm for determining these optimal sampling locations, namely the multifunctional matching (MFM) criterion, was based on matching of moments (functionals) between sample and population. The selected functionals in this study were standard deviation, mean, and Kendall's tau. An additional algorithm defined the minimal number of observations that could represent the population according to a desired level of accuracy. The MFM was applied to datasets from two agricultural plots: a vineyard and a peach orchard. The data from the plots included measured values of slope, topographic wetness index, normalized difference vegetation index, and apparent soil electrical conductivity. The MFM algorithm selected the number of sampling points according to a representation accuracy of 90% and determined the optimal location of these points. The algorithm was validated against values of vine or tree water status measured as crop water stress index (CWSI). Algorithm performance was then compared to two other sampling methods: the conditioned Latin hypercube sampling (cLHS) model and a uniform random sample with spatial constraints. 
Comparison among sampling methods was based on measures of similarity between the target variable population distribution and the distribution of the selected sample. MFM represented CWSI distribution better than the cLHS and the uniform random sampling, and the selected locations showed smaller deviations from the mean and standard deviation of the entire population. The MFM functioned better in the vineyard, where spatial variability was larger than in the orchard. In both plots, the spatial pattern of the selected samples captured the spatial variability of CWSI. MFM can be adjusted and applied using other moments/functionals and may be adopted by other disciplines, particularly in cases where small sample sizes are desired.
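The matching idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' MFM implementation: it assumes the covariates are already standardized to comparable scales, uses an unweighted sum of squared differences as the mismatch score, and replaces whatever optimizer the paper uses with a plain random search. The names `functionals`, `mismatch`, and `select_sample` are hypothetical, as are the parameters `n_trials` and `seed`.

```python
import random
import statistics
from itertools import combinations


def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length sequences (O(n^2) pair scan)."""
    balance = 0  # concordant pairs minus discordant pairs
    for i, j in combinations(range(len(x)), 2):
        product = (x[i] - x[j]) * (y[i] - y[j])
        if product > 0:
            balance += 1
        elif product < 0:
            balance -= 1
    n_pairs = len(x) * (len(x) - 1) / 2
    return balance / n_pairs


def functionals(rows):
    """Mean and standard deviation per covariate, plus tau for each covariate pair."""
    columns = list(zip(*rows))
    values = [statistics.fmean(c) for c in columns]
    values += [statistics.pstdev(c) for c in columns]
    values += [kendall_tau(a, b) for a, b in combinations(columns, 2)]
    return values


def mismatch(sample_rows, population_values):
    """Sum of squared differences between sample and population functionals."""
    sample_values = functionals(sample_rows)
    return sum((s - p) ** 2 for s, p in zip(sample_values, population_values))


def select_sample(rows, k, n_trials=200, seed=0):
    """Random search (an assumption, not the paper's optimizer): keep the
    size-k subset whose functionals are closest to the population's."""
    rng = random.Random(seed)
    target = functionals(rows)
    best_idx, best_score = None, float("inf")
    for _ in range(n_trials):
        idx = rng.sample(range(len(rows)), k)
        score = mismatch([rows[i] for i in idx], target)
        if score < best_score:
            best_idx, best_score = sorted(idx), score
    return best_idx, best_score
```

Here `rows` would hold one tuple of covariate values (for example slope, wetness index, NDVI, and electrical conductivity) per field location, and the returned indices are the candidate sampling locations. The spatial constraints and the sample-size selection step mentioned in the abstract are not covered by this sketch.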
Affiliation(s)
- N. Ohana-Levi
  - Independent Researcher, Variability, Ashalim 85512, Israel
- A. Derumigny
  - Department of Applied Mathematics, Delft University of Technology, Mourik Broekmanweg 6, 2628 XE Delft, the Netherlands
- A. Peeters
  - TerraVision Lab, Midreshet Ben-Gurion 8499000, Israel
- A. Ben-Gal
  - Institute of Soil, Water and Environmental Sciences, Agricultural Research Organization, Gilat Research Center, Mobile post Negev 2, 85280, Israel
- I. Bahat
  - Institute of Agricultural Engineering, Agricultural Research Organization, Volcani Center, P.O. Box 15159, Rishon LeZion 7505101, Israel
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Hebrew University of Jerusalem, The Robert H. Smith Faculty of Agriculture, Food & Environment, Rehovot 76100, Israel
- L. Katz
  - Institute of Soil, Water and Environmental Sciences, Agricultural Research Organization, Gilat Research Center, Mobile post Negev 2, 85280, Israel
  - Institute of Agricultural Engineering, Agricultural Research Organization, Volcani Center, P.O. Box 15159, Rishon LeZion 7505101, Israel
  - The Robert H. Smith Institute of Plant Sciences and Genetics in Agriculture, The Hebrew University of Jerusalem, The Robert H. Smith Faculty of Agriculture, Food & Environment, Rehovot 76100, Israel
  - Department of Soil and Water Sciences, The Robert H. Smith Faculty of Agriculture, Food and Environment, The Hebrew University of Jerusalem, P.O. Box 12, Rehovot 7610001, Israel
- Y. Netzer
  - Department of Agriculture and Oenology, Eastern R&D Center, Israel
  - Department of Chemical Engineering, Ariel University, Ariel 40700, Israel
- A. Naor
  - Department of Precision Agriculture, MIGAL Galilee Research Institute, Kiryat Shmona 11016, Israel
- Y. Cohen
  - Institute of Agricultural Engineering, Agricultural Research Organization, Volcani Center, P.O. Box 15159, Rishon LeZion 7505101, Israel
2
Liu D, Cai T, Lok A, Zheng Y. Nonparametric Maximum Likelihood Estimators of Time-Dependent Accuracy Measures for Survival Outcome Under Two-Stage Sampling Designs. J Am Stat Assoc 2017; 113:882-892. [PMID: 30555194 PMCID: PMC6291304 DOI: 10.1080/01621459.2017.1295866] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/01/2014] [Revised: 12/01/2016] [Indexed: 12/24/2022]
Abstract
Large prospective cohort studies of rare chronic diseases require thoughtful planning of study designs, especially for biomarker studies when measurements are based on stored tissue or blood specimens. Two-phase designs, including nested case-control (Thomas, 1977) and case-cohort (Prentice, 1986) sampling designs, provide cost-effective strategies for conducting biomarker evaluation studies. Existing literature for biomarker assessment under two-phase designs largely focuses on simple inverse probability weighting (IPW) estimators (Cai and Zheng, 2011; Liu et al., 2012). Drawing on recent theoretical development on the maximum likelihood estimators for relative risk parameters in two-phase studies (Scheike and Martinussen, 2004; Zeng et al., 2006), we propose nonparametric maximum likelihood based estimators to evaluate the accuracy and predictiveness of a risk prediction biomarker under both types of two-phase designs. In addition, hybrid estimators that combine IPW estimators and maximum likelihood estimation procedure are proposed to improve efficiency and alleviate computational burden. We derive large sample properties of proposed estimators and evaluate their finite sample performance using numerical studies. We illustrate new procedures using a two-phase biomarker study aiming to evaluate the accuracy of a novel biomarker, des-γ-carboxy prothrombin, for early detection of hepatocellular carcinoma (Lok et al., 2010).
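The simple IPW approach that the abstract contrasts with can be sketched as follows. This is only an illustration of inverse probability weighting in a two-phase subsample, not the nonparametric maximum likelihood or hybrid estimators the authors propose. It assumes that each subject's event status by the time horizon is fully observed (no censoring) and that each subject's phase-two sampling probability is known. The function name `ipw_tpr_fpr` and its parameters are hypothetical.

```python
def ipw_tpr_fpr(marker, event_by_t, sampling_prob, threshold):
    """IPW estimates of the true and false positive rates of `marker > threshold`.

    marker        -- biomarker values for subjects selected into phase two
    event_by_t    -- True if the subject had the event by the time horizon
    sampling_prob -- each subject's probability of selection into phase two
    Censoring is ignored here, which is a simplifying assumption.
    """
    case_total = case_positive = 0.0
    control_total = control_positive = 0.0
    for m, d, p in zip(marker, event_by_t, sampling_prob):
        weight = 1.0 / p  # each sampled subject stands in for 1/p cohort members
        if d:
            case_total += weight
            case_positive += weight * (m > threshold)
        else:
            control_total += weight
            control_positive += weight * (m > threshold)
    return case_positive / case_total, control_positive / control_total
```

In a case-cohort or nested case-control design, cases are typically sampled with probability close to one and controls with much smaller probabilities, so the weights mainly rescale the control group back to its share of the full cohort.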
Affiliation(s)
- Dandan Liu
  - Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN 37232
- Tianxi Cai
  - Department of Biostatistics, Harvard School of Public Health, Boston, Massachusetts 02115
- Anna Lok
  - Division of Gastroenterology, University of Michigan, Ann Arbor, MI 48109
- Yingye Zheng
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA 98109