1. Liang Z, Shao S, Lv Z, Li D, Sleigh JW, Li X, Zhang C, He J. Constructing a Consciousness Meter Based on the Combination of Non-Linear Measurements and Genetic Algorithm-Based Support Vector Machine. IEEE Trans Neural Syst Rehabil Eng 2020; 28:399-408. [PMID: 31940541] [DOI: 10.1109/tnsre.2020.2964819]
Abstract
OBJECTIVE Constructing a framework to evaluate consciousness is an important issue in neuroscience research and clinical practice. However, there is still no systematic framework for quantifying altered consciousness along the dimensions of both level and content. This study builds a framework to differentiate the following states: coma, general anesthesia, minimally conscious state (MCS), and normal wakefulness. METHODS This study analyzed electroencephalography (EEG) recorded from frontal channels in patients with disorders of consciousness (either coma or MCS), patients under general anesthesia, and healthy participants in normal waking consciousness (NWC). Four non-linear methods, permutation entropy (PE), sample entropy (SampEn), permutation Lempel-Ziv complexity (PLZC), and detrended fluctuation analysis (DFA), together with relative power (RP), were used to extract features from the EEG recordings. A genetic algorithm-based support vector machine (GA-SVM) classified the states of consciousness based on the extracted features. A multivariable linear regression model then built EEG indices for level and content of consciousness. RESULTS The PE differentiated all four states of consciousness (p<0.001). Altered contents of consciousness for NWC, MCS, coma, and general anesthesia were best differentiated by SampEn and PLZC. In contrast, the levels of consciousness for these four states were best differentiated by the RP of gamma and PE. A multi-dimensional index, combined with the GA-SVM, showed that the integration of PE, PLZC, SampEn, and DFA had the highest classification accuracy (92.3%). The GA-SVM was better than random forest and neural networks at differentiating these four states. The 'coordinate values' in the dimensions of level and content were constructed by the multivariable linear regression model and the non-linear measures PE, PLZC, SampEn, and DFA.
CONCLUSIONS Multi-dimensional measurements, especially the PE, SampEn, PLZC, and DFA, when combined with GA-SVM, are promising methods for constructing a framework to quantify consciousness.
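Permutation entropy, the measure the study found most discriminative, is computed from the distribution of ordinal patterns in the signal. A minimal sketch follows; the embedding dimension and lag used by the study are not reproduced here, so the defaults below are assumptions:

```python
from collections import Counter
from math import factorial, log

def permutation_entropy(signal, m=3, lag=1):
    """Normalized permutation entropy of a 1-D signal: the Shannon
    entropy of the distribution of ordinal patterns of length m,
    divided by log(m!) so the result lies in [0, 1]."""
    counts = Counter()
    for i in range(len(signal) - (m - 1) * lag):
        window = signal[i:i + m * lag:lag]
        # ordinal pattern = the argsort of the window
        counts[tuple(sorted(range(m), key=lambda k: window[k]))] += 1
    total = sum(counts.values())
    h = -sum((c / total) * log(c / total) for c in counts.values())
    return h / log(factorial(m))
```

A monotone signal has a single ordinal pattern and entropy 0; an irregular signal approaches 1, which is why the measure separates deep anesthesia or coma from wakefulness.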
2. Tatliyer A, Cervantes I, Formoso-Rafferty N, Gutiérrez JP. The Statistical Scale Effect as a Source of Positive Genetic Correlation Between Mean and Variability: A Simulation Study. G3 (Bethesda) 2019; 9:3001-3008. [PMID: 31320386] [PMCID: PMC6723139] [DOI: 10.1534/g3.119.400497]
Abstract
The selection objective for animal production is the highest income with the lowest production cost, while ensuring the highest animal welfare. A selection experiment for environmental variability of birth weight in mice showed a correlated response in the mean after 20 generations, starting from a crossed panmictic population. The relationship between birth weight and its environmental variability explained the correlated response, and the scale effect represents a potential cause of this correlation: the higher the mean, the higher the variability. The objective of this study was to quantify by simulation the genetic correlation between a trait and its environmental variability that is attributable to the scale effect, across coefficients of variation (CV) and heritabilities ranging from 0.05 to 0.50. The resulting genetic correlation ranged from 0.1335 to 0.7021, being highest for the highest heritability and the lowest CV. For a trait with heritability between 0.25 and 0.35 and CV between 0.15 and 0.25, the scale effect generated a genetic correlation between 0.43 and 0.57. The genetic coefficient of variation (GCV) affecting residual variability was modulated by a strength parameter that reduces the impact of the scale effect; GCV ranged from 0.0050 to 1.4984, and the strength of the scale effect might lie in the range between 0 and 1. The scale effect would explain many reported genetic correlations and the additive genetic variance for the variability. This is relevant when increasing the mean of a trait jointly with the reduction of its variability.
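The mechanism behind the scale effect can be illustrated with a Monte Carlo sketch (illustrative only, not the paper's simulation model): each individual's genetic value shifts its mean, the residual SD is CV times that individual mean, and the induced absolute residual then correlates positively with the genetic value.

```python
import math
import random

def scale_effect_correlation(n=50000, h2=0.3, cv=0.2, mu=10.0, seed=1):
    """Correlation between an individual's genetic value and the
    absolute residual it induces when the residual SD scales with
    the individual mean (constant CV). A sketch under simplifying
    assumptions about how h2 and cv enter the model."""
    rng = random.Random(seed)
    sigma_a = math.sqrt(h2) * cv * mu  # additive SD on the trait scale
    g, dev = [], []
    for _ in range(n):
        a = rng.gauss(0.0, sigma_a)
        sd = cv * (mu + a)             # the scale effect itself
        g.append(a)
        dev.append(abs(rng.gauss(0.0, sd)))
    mg, md = sum(g) / n, sum(dev) / n
    cov = sum((x - mg) * (y - md) for x, y in zip(g, dev)) / n
    vg = sum((x - mg) ** 2 for x in g) / n
    vd = sum((y - md) ** 2 for y in dev) / n
    return cov / math.sqrt(vg * vd)
```

The returned correlation is positive even though no genetic effect on variability was simulated, which is the paper's point: a constant CV alone manufactures a mean-variability correlation.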
Affiliation(s)
- Adile Tatliyer
- Department of Animal Science, Faculty of Agriculture, Kahramanmaras Sutcu Imam University, Avsar Campus, 46100, Onikisubat, Kahramanmaras, Turkey
- Isabel Cervantes
- Department of Animal Production, Faculty of Veterinary, Complutense University of Madrid, Avda. Puerta de Hierro s/n, E-28040-Madrid, Spain
- Nora Formoso-Rafferty
- Department of Animal Production, Faculty of Veterinary, Complutense University of Madrid, Avda. Puerta de Hierro s/n, E-28040-Madrid, Spain
- Juan Pablo Gutiérrez
- Department of Animal Production, Faculty of Veterinary, Complutense University of Madrid, Avda. Puerta de Hierro s/n, E-28040-Madrid, Spain
3.
Affiliation(s)
- Elie Dolgin
- Elie Dolgin is a science writer in Somerville, Massachusetts.
4. Manu VS, Veglia G. Genetic algorithm optimized triply compensated pulses in NMR spectroscopy. J Magn Reson 2015; 260:136-43. [PMID: 26473327] [PMCID: PMC4628891] [DOI: 10.1016/j.jmr.2015.09.010]
Abstract
Sensitivity and resolution in NMR experiments are affected by magnetic field inhomogeneities (both external and RF), errors in pulse calibration, and offset effects due to the finite length of RF pulses. To remedy these problems, built-in compensation mechanisms for these experimental imperfections are often necessary. Here, we propose a new family of phase-modulated constant-amplitude broadband pulses with high compensation for RF inhomogeneity and heteronuclear coupling evolution. These pulses were optimized using a genetic algorithm (GA), a global optimization method inspired by Nature's evolutionary processes. The newly designed π and π/2 pulses belong to the 'type A' (or general rotors) class of symmetric composite pulses. These GA-optimized pulses are relatively short compared to other general rotors and can be used for excitation and inversion, as well as for refocusing in spin-echo experiments. The performance of the GA-optimized pulses was assessed in Magic Angle Spinning (MAS) solid-state NMR experiments using a crystalline U-(13)C, (15)N NAVL peptide as well as U-(13)C, (15)N microcrystalline ubiquitin. GA optimization of NMR pulse sequences opens a window for improving current experiments and designing new robust pulse sequences.
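The evolutionary loop underlying such a pulse search follows the standard GA pattern. A generic real-coded sketch is below; it is not the authors' encoding of pulse phases, and the operator choices (tournament selection, uniform crossover, Gaussian mutation, elitism) are common defaults assumed for illustration:

```python
import random

def genetic_algorithm(fitness, n_genes, pop_size=40, generations=60,
                      mutation_rate=0.2, seed=0):
    """Minimal real-coded GA. `fitness` maps a list of floats in
    [0, 1] to a score to be maximized; the best individual found
    after the given number of generations is returned."""
    rng = random.Random(seed)
    pop = [[rng.random() for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        next_pop = [pop[0][:], pop[1][:]]            # elitism: keep the best two
        while len(next_pop) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)  # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            child = [rng.choice(pair) for pair in zip(p1, p2)]  # uniform crossover
            if rng.random() < mutation_rate:           # Gaussian mutation
                i = rng.randrange(n_genes)
                child[i] = min(1.0, max(0.0, child[i] + rng.gauss(0.0, 0.1)))
            next_pop.append(child)
        pop = next_pop
    return max(pop, key=fitness)
```

In the pulse-design setting, the chromosome would hold the phases of the composite pulse and the fitness would score the simulated rotation against the target over a range of offsets and RF amplitudes.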
Affiliation(s)
- V S Manu
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, United States
- Gianluigi Veglia
- Department of Biochemistry, Molecular Biology, and Biophysics, University of Minnesota, Minneapolis, MN 55455, United States; Department of Chemistry, University of Minnesota, Minneapolis, MN 55455, United States.
5.
Abstract
Worries about fraudulent data should give way to broader critiques of Mendel's legacy
Affiliation(s)
- Gregory Radick
- School of Philosophy, Religion and History of Science, University of Leeds, Leeds, UK.
6. Das S, Pan I, Das S, Gupta A. Improved model reduction and tuning of fractional-order PI(λ)D(μ) controllers for analytical rule extraction with genetic programming. ISA Trans 2012; 51:237-261. [PMID: 22036301] [DOI: 10.1016/j.isatra.2011.10.004]
Abstract
A genetic algorithm (GA) has been used in this study for a new approach to suboptimal model reduction in the Nyquist plane and optimal time-domain tuning of proportional-integral-derivative (PID) and fractional-order (FO) PI(λ)D(μ) controllers. Simulation studies show that the new Nyquist-based model reduction technique outperforms the conventional H(2)-norm-based reduced parameter modeling technique. With the tuned controller parameters and reduced-order model parameter dataset, optimum tuning rules have been developed with a test-bench of higher-order processes via genetic programming (GP). The GP performs a symbolic regression on the reduced process parameters to evolve a tuning rule which provides the best analytical expression to map the data. The tuning rules are developed for a minimum time-domain integral performance index described by a weighted sum of error index and controller effort. From the reported Pareto optimal front of the GP-based optimal rule extraction technique, a trade-off can be made between the complexity of the tuning formulae and the control performance. The efficacy of the single-gene and multi-gene GP-based tuning rules has been compared with the original GA-based control performance for the PID and PI(λ)D(μ) controllers, handling four different classes of representative higher-order processes. These rules are very useful for process control engineers, as they inherit the power of the GA-based tuning methodology but can be easily calculated without the need to run the computationally intensive GA every time. Three-dimensional plots of the required variation in PID/fractional-order PID (FOPID) controller parameters with reduced process parameters are shown as a guideline for the operator. Parametric robustness of the reported GP-based tuning rules has also been shown with credible simulation examples.
Affiliation(s)
- Saptarshi Das
- School of Nuclear Studies & Applications (SNSA), Jadavpur University, Salt Lake Campus, LB-8, Sector 3, Kolkata-700098, India.
7.
Abstract
Missing values are a common problem in genetic association studies concerned with single-nucleotide polymorphisms (SNPs). Since many statistical methods cannot handle missing values, such values need to be removed prior to the actual analysis. Considering only complete observations, however, often leads to an immense loss of information. Therefore, procedures are required that can be used to impute such missing values. In this study, an imputation procedure based on a weighted k nearest neighbors algorithm is presented. This approach, called KNNcatImpute, searches for the k SNPs that are most similar to the SNP whose missing values need to be replaced and uses these k SNPs to impute the missing values. Alternatively, KNNcatImpute can search for the k nearest subjects. In this situation, the missing values of an individual are imputed by considering subjects showing a DNA pattern similar to the one of this individual. In a comparison to other imputation approaches, KNNcatImpute shows the lowest rates of falsely imputed genotypes when applied to the SNP data from the GENICA study, a candidate SNP study dedicated to the identification of genetic and gene-environment interactions associated with sporadic breast cancer. Moreover, KNNcatImpute can also be applied to data from genome-wide association studies, as an application to a subset of the HapMap data demonstrates.
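The column-wise variant of the idea can be sketched compactly: find the k SNP columns most similar to the one with missing genotypes and take a similarity-weighted vote. The mismatch distance and the weights below are simplifying assumptions, not the exact measures used by KNNcatImpute:

```python
from collections import Counter

def knn_cat_impute(data, k=3):
    """Impute missing genotypes (None) in a subjects-by-SNPs matrix
    by a weighted vote over the k most similar SNP columns."""
    n_snp = len(data[0])
    cols = [[row[j] for row in data] for j in range(n_snp)]

    def distance(a, b):
        # proportion of mismatches over positions observed in both columns
        pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
        if not pairs:
            return 1.0
        return sum(x != y for x, y in pairs) / len(pairs)

    result = [row[:] for row in data]
    for j, col in enumerate(cols):
        missing = [i for i, v in enumerate(col) if v is None]
        if not missing:
            continue
        neighbors = sorted((jj for jj in range(n_snp) if jj != j),
                           key=lambda jj: distance(col, cols[jj]))[:k]
        for i in missing:
            votes = Counter()
            for jj in neighbors:
                if cols[jj][i] is not None:
                    # weight each neighbor's genotype by its similarity
                    votes[cols[jj][i]] += 1.0 - distance(col, cols[jj])
            if votes:
                result[i][j] = votes.most_common(1)[0][0]
    return result
```

Swapping the roles of rows and columns gives the subject-wise variant the abstract also describes, where an individual's missing genotypes are filled from subjects with similar DNA patterns.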
8. Khajeh A, Modarress H. QSPR prediction of flash point of esters by means of GFA and ANFIS. J Hazard Mater 2010; 179:715-720. [PMID: 20381958] [DOI: 10.1016/j.jhazmat.2010.03.060]
Abstract
A quantitative structure property relationship (QSPR) study was performed to develop a model for predicting the flash point of esters based on a diverse set of 95 compounds. The five most important descriptors were selected from a set of 1124 descriptors to build the QSPR model by means of a genetic function approximation (GFA). To account for the nonlinear behavior of these molecular descriptors, an adaptive neuro-fuzzy inference system (ANFIS) was used. The squared correlation coefficients of the ANFIS and GFA models for the test set were 0.969 and 0.965, respectively. The results obtained show the ability of the developed GFA and ANFIS models to predict the flash point of esters.
Affiliation(s)
- Aboozar Khajeh
- Islamic Azad University, Birjand Branch, Birjand, Southern Khorasan, Iran
9.
Abstract
Artificial neural networks (ANNs) were used to predict the nanoparticle size and micropore surface area of polylactic acid nanoparticles prepared by a double emulsion method. Different batches were prepared while varying polymer and surfactant concentration, as well as homogenization pressure. Two commercial ANN programs were evaluated: Neuroshell Predictor, a black-box package adopting both neural and genetic strategies, and Neurosolutions, which allows step-by-step building of the network. Results were compared to those obtained by a statistical method. Predictions from the ANNs were more accurate than those calculated using non-linear regression. Neuroshell Predictor allowed quantification of the relative importance of the inputs. Furthermore, by varying the network topology and parameters in Neurosolutions, it was possible to obtain output values closer to the experimental values. ANNs therefore represent a promising tool for the analysis of processes involving the preparation of polymeric carriers and for the prediction of their physical properties.
Affiliation(s)
- Névine Rizkalla
- Faculté de Pharmacie, Université de Montréal, Montréal, Canada
10.
11. van der Lee JH, Svrcek WY, Young BR. A tuning algorithm for model predictive controllers based on genetic algorithms and fuzzy decision making. ISA Trans 2008; 47:53-9. [PMID: 17870075] [DOI: 10.1016/j.isatra.2007.06.003]
Abstract
Model predictive control (MPC) is a valuable tool for the process control engineer in a wide variety of applications. Because of this, the structure of an MPC can vary dramatically from application to application. There have been a number of works dedicated to MPC tuning for specific cases, but since MPCs can differ significantly, these tuning methods become inapplicable and a trial-and-error tuning approach must be used, which can be quite time consuming and can result in non-optimal tuning. In an attempt to resolve this, a generalized automated tuning algorithm for MPCs was developed. The approach is numerically based and combines a genetic algorithm with multi-objective fuzzy decision-making. Its key advantages are that genetic algorithms are not problem specific and only need to be adapted to account for the number and ranges of tuning parameters of a given MPC, and that multi-objective fuzzy decision-making can handle qualitative statements of what optimum control is while using multiple inputs to determine the tuning parameters that best match the desired results. This is particularly useful for multi-input, multi-output (MIMO) cases, where the definition of "optimum" control is subject to the opinion of the control engineer tuning the system. A case study is presented to illustrate the use of the tuning algorithm, including how different definitions of "optimum" control can arise and how they are accounted for in the multi-objective decision-making algorithm. The resulting tuning parameters from each of the definition sets are compared; their variation to meet each definition of optimum control shows that the generalized automated tuning approach for MPCs is feasible.
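The multi-objective fuzzy scoring step can be sketched in a few lines: each performance metric is mapped to a membership in [0, 1] and the overall desirability of a candidate parameter set is the minimum membership, so the weakest objective dominates. The linear membership shape and the metric names below are illustrative assumptions, not the paper's exact formulation:

```python
def fuzzy_desirability(metrics, targets):
    """Overall fuzzy desirability of a set of tuning parameters.
    `metrics` maps metric name -> achieved value; `targets` maps
    metric name -> (best, worst). Membership is 1 at or below the
    best value, 0 at or beyond the worst, linear in between."""
    def membership(value, best, worst):
        if value <= best:
            return 1.0
        if value >= worst:
            return 0.0
        return (worst - value) / (worst - best)
    return min(membership(metrics[name], best, worst)
               for name, (best, worst) in targets.items())
```

A GA can then maximize this single desirability score; changing the (best, worst) targets is how different engineers' definitions of "optimum" control enter the optimization.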
Affiliation(s)
- J H van der Lee
- Virtual Materials Group Inc., 657 Hawkside Mews NW, Calgary, Alberta T3G 3S1, Canada
12. Pilpel A. Statistics is not enough: revisiting Ronald A. Fisher's critique (1936) of Mendel's experimental results (1866). Stud Hist Philos Biol Biomed Sci 2007; 38:618-26. [PMID: 17893069] [DOI: 10.1016/j.shpsc.2007.06.009]
Abstract
This paper is concerned with the role of rational belief change theory in the philosophical understanding of experimental error. Today, philosophers seek insight about error in the investigation of specific experiments rather than in general theories. Nevertheless, rational belief change theory adds to our understanding of just such cases, R. A. Fisher's criticism of Mendel's experiments being a case in point. After a historical introduction, the main part of this paper investigates Fisher's paper from the point of view of rational belief change theory: what changes of belief about Mendel's experiments Fisher goes through, and with what justification. This leads to surprising insights about what Fisher got right and wrong and, more generally, about the limits of statistical methods in detecting error.
Affiliation(s)
- Avital Pilpel
- Philosophy Department, University of Haifa, 1901 Eshkol Tower, Mount Carmel, Haifa, Israel.
13. Siepmann P, Martin CP, Vancea I, Moriarty PJ, Krasnogor N. A genetic algorithm approach to probing the evolution of self-organized nanostructured systems. Nano Lett 2007; 7:1985-90. [PMID: 17552572] [DOI: 10.1021/nl070773m]
Abstract
We present a new methodology, based on a combination of genetic algorithms and image morphometry, for matching the outcome of a Monte Carlo simulation to experimental observations of a far-from-equilibrium nanosystem. The Monte Carlo model used simulates a colloidal solution of nanoparticles drying on a solid substrate and has previously been shown to produce patterns very similar to those observed experimentally. Our approach enables the broad parameter space associated with simulated nanoparticle self-organization to be searched effectively for a given experimental target morphology.
Affiliation(s)
- Peter Siepmann
- School of Computer Science & IT, The University of Nottingham, Nottingham NG8 1BB, UK
14. Cakmak A, Ozsoyoglu G. Annotating genes using textual patterns. Pac Symp Biocomput 2007:221-232. [PMID: 17990494]
Abstract
Annotating genes with Gene Ontology (GO) terms is crucial for biologists to characterize the traits of genes in a standardized way. However, manual curation of textual data, the most reliable form of gene annotation by GO terms, requires significant amounts of human effort, is very costly, and cannot catch up with the rate of increase in biomedical publications. In this paper, we present GEANN, a system to automatically infer new GO annotations for genes from biomedical papers based on the evidence support linked to PubMed, a biological literature database of 14 million papers. GEANN (i) extracts from text significant terms and phrases associated with a GO term, (ii) based on the extracted terms, constructs textual extraction patterns with reliability scores for GO terms, (iii) expands the pattern set through "pattern crosswalks", (iv) employs semantic rather than syntactic pattern matching, which allows the recognition of phrases with close meanings, and (v) annotates genes based on the "quality" of the pattern matched to the genomic entity occurring in the text. On average, in our experiments, GEANN reached a precision of 78% at a recall of 57%.
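The precision and recall figures GEANN is evaluated on reduce to set overlap between predicted and curated annotations; a minimal sketch (the GO identifiers in the usage test are made up for illustration):

```python
def precision_recall(predicted, curated):
    """Precision and recall of a set of predicted GO annotations
    against the curated gold-standard annotations."""
    tp = len(predicted & curated)  # correctly inferred annotations
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(curated) if curated else 0.0
    return precision, recall
```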
Affiliation(s)
- Ali Cakmak
- Department of Electrical Engineering and Computer Science, Case Western Reserve University, Cleveland, OH 44106, USA.
15.
16.
Abstract
In this paper, a hybrid Taguchi-genetic algorithm (HTGA) is applied to solve the problem of tuning both network structure and parameters of a feedforward neural network. The HTGA approach is a method of combining the traditional genetic algorithm (TGA), which has a powerful global exploration capability, with the Taguchi method, which can exploit the optimum offspring. The Taguchi method is inserted between crossover and mutation operations of a TGA. Then, the systematic reasoning ability of the Taguchi method is incorporated in the crossover operations to select the better genes to achieve crossover, and consequently enhance the genetic algorithms. Therefore, the HTGA approach can be more robust, statistically sound, and quickly convergent. First, the authors evaluate the performance of the presented HTGA approach by studying some global numerical optimization problems. Then, the presented HTGA approach is effectively applied to solve three examples on forecasting the sunspot numbers, tuning the associative memory, and solving the XOR problem. The numbers of hidden nodes and the links of the feedforward neural network are chosen by increasing them from small numbers until the learning performance is good enough. As a result, a partially connected feedforward neural network can be obtained after tuning. This implies that the cost of implementation of the neural network can be reduced. In these studied problems of tuning both network structure and parameters of a feedforward neural network, there are many parameters and numerous local optima so that these studied problems are challenging enough for evaluating the performances of any proposed GA-based approaches. The computational experiments show that the presented HTGA approach can obtain better results than the existing method reported recently in the literature.
Affiliation(s)
- Jinn-Tsong Tsai
- Department of Medical Information Management, Kaohsiung Medical University, Kaohsiung 807, Taiwan, ROC
17. Blumenthal D, Campbell EG, Gokhale M, Yucel R, Clarridge B, Hilgartner S, Holtzman NA. Data withholding in genetics and the other life sciences: prevalences and predictors. Acad Med 2006; 81:137-45. [PMID: 16436574] [DOI: 10.1097/00001888-200602000-00006]
Abstract
PURPOSE To better understand the variety and prevalence of data withholding in genetics and the other life sciences and to explore factors associated with these behaviors. METHOD In 2000, a sample of 2,893 geneticists and other life scientists (OLS) at the 100 most research-intensive universities in the United States were surveyed concerning data withholding and sharing. The instrument was developed and pretested in 1999. The two primary outcome measures were withholding in verbal exchanges with colleagues about unpublished research (verbal withholding) and withholding as part of the publishing process (publishing withholding). The independent variables related to the personal characteristics, research characteristics of faculty, and previous experience with data withholding. RESULTS A total of 1,849 faculty responded (64%): 1,240 geneticists and 600 OLS. Forty-four percent of geneticists and 32% of OLS reported participating in any one of 13 forms of data withholding in the three previous years. Publishing withholding (geneticists 35%, OLS 25%) was more frequent than verbal withholding (geneticists 23%, OLS 12%). In multivariate analyses, male gender, participation in relationships with industry, mentors' discouraging data sharing, receipt of formal instruction in data sharing, and negative past experience with sharing were significantly associated with either verbal or publishing withholding among either geneticists or OLS. CONCLUSIONS Data withholding is common in biomedical science, takes multiple forms, is influenced by a variety of characteristics of investigators and their training, and varies by field of science. Encouraging openness during the formative experiences of young investigators may be critical to increased data sharing, but the effects of formal training do not appear straightforward.
Affiliation(s)
- David Blumenthal
- Institute for Health Policy, Massachusetts General Hospital/Partners HealthCare System, 50 Staniford St., Boston, MA 02114, USA.
18.
Abstract
The generation of quantitative structure-activity relationships (QSARs) under the supervision of a genetic algorithm (GA) is a QSAR modeling approach that has been used for more than a decade. In this paper we present McQSAR, an extension of the traditional GA approach to deriving QSARs. McQSAR is able to use descriptors for multiple representations per compound, such as different conformers, tautomers, or protonation forms. Test runs show that the algorithm converges to a set of representations that describe the binding mode of the input molecules to a reasonable resolution, provided that suitable descriptors based on the three-dimensional structure are used. Furthermore, the frequency of chance correlation was measured during multiple runs on a real-life data set using simulated linear relationship functions. The observed frequency of chance correlation, on average 0.3 +/- 0.5%, was found to be independent of the size of the calibration set and the number of terms in the underlying relationship function.
Affiliation(s)
- Mikko J Vainio
- Structural Bioinformatics Laboratory, Department of Biochemistry and Pharmacy, Abo Akademi University, Tykistökatu 6A, FIN-20520 Turku, Finland.
19. Hegab AE, Sakamoto T, Sekizawa K. Assessing the validity of genetic association studies. Thorax 2005; 60:882-3; author reply 883. [PMID: 16192369] [PMCID: PMC1747198]
20.
21.
22. Pedrycz W, Breuer A, Pizzi NJ. Genetic design of feature spaces for pattern classifiers. Artif Intell Med 2004; 32:115-25. [PMID: 15364095] [DOI: 10.1016/j.artmed.2004.01.005]
Abstract
Functional piecewise approximation seeks a data representation that is compact, highly simplified, and meaningful. This study presents a genetic algorithm (GA)-based approach for computing a piecewise polynomial representation of functions, with the focus on piecewise linear approximation in an application to biomedical spectral data. Piecewise linear approximation has been researched for roughly four decades, and the method presented here is compared with another well-known approach. The expansion of this method to piecewise polynomial representation is shown to be straightforward. Finally, the application of this method as a feature extraction step for the classification of a dataset of feature vectors, specifically biomedical spectra, is demonstrated.
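The quantity a GA chromosome of candidate breakpoints would be scored on can be sketched as the squared error of the segment-wise linear fit; the paper's exact fitness function is not reproduced here:

```python
def piecewise_linear_error(y, breakpoints):
    """Sum of squared errors when the sampled signal y is approximated
    by straight segments joining its values at the chosen breakpoint
    indices (the endpoints are always included as knots)."""
    knots = sorted(set([0, len(y) - 1] + list(breakpoints)))
    sse = 0.0
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            approx = (1.0 - t) * y[a] + t * y[b]  # linear interpolation
            sse += (y[i] - approx) ** 2
    return sse
```

A GA would evolve the breakpoint set to minimize this error subject to a penalty on the number of segments, trading compactness against fidelity as the abstract describes.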
Affiliation(s)
- Witold Pedrycz
- Department of Electrical & Computer Engineering, University of Alberta, Edmonton, Alta., Canada.
23.
Affiliation(s)
- Mark A Beaumont
- School of Animal and Microbial Sciences, University of Reading, Whiteknights, P.O. Box 228, Reading RG6 6AJ, UK.
24.
Abstract
Genetic maps are used routinely in family-based linkage studies to identify the rough location of genes that influence human traits and diseases. Unlike physical maps, genetic maps are based on the amount of recombination occurring between adjacent loci rather than the actual number of bases separating them. Genetic maps are constructed by statistically characterizing the number of crossovers observed in parental meioses leading to the transmission of alleles to their offspring. Considerations such as the number of meioses observed, the heterozygosity and physical distance between the loci studied, and the statistical methods used can impact the construction and reliability of a genetic map. As is well known, poorly constructed genetic maps can have adverse effects on linkage mapping studies. With the availability of sequence-based maps, as well as genetic maps generated by different researchers (such as those generated by the Marshfield and deCODE groups), one can investigate the compatibility and properties of different maps. We have integrated information from the most current human genome sequence data (UCSC genome assembly Human July 2003) as well as 8399 microsatellite markers used in the Marshfield and deCODE maps to reconcile these maps. Our efforts resulted in updated sex-specific genetic maps.
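Genetic map distances are derived from recombination fractions via map functions; the two standard ones (Haldane and Kosambi, not specific to this paper's pipeline) can be stated directly:

```python
import math

def haldane_cm(theta):
    """Haldane map function: recombination fraction -> distance in
    centimorgans, assuming no crossover interference."""
    return -50.0 * math.log(1.0 - 2.0 * theta)

def kosambi_cm(theta):
    """Kosambi map function, which allows for moderate crossover
    interference and so gives shorter distances at large fractions."""
    return 25.0 * math.log((1.0 + 2.0 * theta) / (1.0 - 2.0 * theta))
```

For small recombination fractions both reduce to 100 * theta; they diverge as theta approaches its 0.5 ceiling, which is one reason maps built with different assumptions need reconciling.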
Affiliation(s)
- Caroline M Nievergelt
- Polymorphism Research Laboratory, Department of Psychiatry, University of California at San Diego, La Jolla, California 92093-0603, USA
25. Huang QR, Trevena L, McIntosh J. GPs' experience and attitudes toward new genetics: barriers and needs. Aust Fam Physician 2004; 33:379-80. [PMID: 15227873]
Affiliation(s)
- Qi Rong Huang
- School of Health Information Management, University of Sydney, New South Wales.
26. Radwan E, Tazaki E. Rough sets and genetic algorithms in learning cellular neural networks cloning template for decision making system. Int J Neural Syst 2004; 14:57-68. [PMID: 15034947] [DOI: 10.1142/s0129065704001851]
Abstract
We propose a new method for accelerating decision-making and classifier support applied to imprecise data. The acceleration is achieved by integrating rough set theory, which yields a minimal set of decision rules, with cellular neural networks. Our method relies on genetic algorithms to design the cloning template for greater accuracy. Illustrative examples demonstrate the effectiveness of the proposed method, whose advantages and limitations are also discussed.
Affiliation(s)
- Elsayed Radwan
- Department of Control and System Engineering, Toin University of Yokohama, 1614 Kurogane-cho, Aoba-ku, Yokohama 225-8502, Japan.
27
Affiliation(s)
- L Adrienne Cupples
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
- Qiong Yang
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, USA
- Serkalem Demissie
- Department of Biostatistics, Boston University School of Public Health, Boston, Massachusetts, USA
- Donna Copenhafer
- Department of Neurology, Boston University School of Medicine, Boston, Massachusetts, USA
- Daniel Levy
- Framingham Heart Study, National Heart, Lung, and Blood Institute, Bethesda, Maryland, USA
28
Yang Q, Chazaro I, Cui J, Guo CY, Demissie S, Larson M, Atwood LD, Cupples LA, DeStefano AL. Genetic analyses of longitudinal phenotype data: a comparison of univariate methods and a multivariate approach. BMC Genet 2003; 4 Suppl 1:S29. [PMID: 14975097 PMCID: PMC1866464 DOI: 10.1186/1471-2156-4-s1-s29] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/03/2022] Open
Abstract
Background We explored three approaches to heritability and linkage analyses of longitudinal total cholesterol levels (CHOL) in the Genetic Analysis Workshop 13 simulated data without knowing the answers. The first two were univariate approaches and used 1) the baseline measure at exam one or 2) summary measures such as the mean and slope from multiple exams. The third method was a multivariate approach that directly models multiple measurements on a subject. A variance components model (SOLAR) was employed in the univariate approaches. A mixed regression model with polynomials was employed in the multivariate approach and implemented in SAS/IML. Results Using the baseline measure at exam 1, we detected all baseline or slope genes contributing a substantial amount (0.08) of variance (LOD > 3). Compared to the baseline measure, the mean measure yielded a slightly higher LOD at the slope genes and a lower LOD at the baseline genes. The slope measure produced a somewhat lower LOD for the slope gene than did the mean measure. Descriptive information on the pattern of changes in gene effects with age was estimated for three linked loci by the third approach. Conclusion We found that simple univariate methods may be effective for detecting genes affecting longitudinal phenotypes but may not fully reveal temporal trends in gene effects. The relative efficiency of the univariate methods in detecting genes depends heavily on the underlying model. Compared with the univariate approaches, the multivariate approach provided more information on temporal trends in gene effects, at the cost of more complicated modelling and more intense computations.
Affiliation(s)
- Qiong Yang
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Department of Neurology, Boston University, Boston, Massachusetts, USA
- Irmarie Chazaro
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Department of Mathematics and Statistics, Boston University, Boston, Massachusetts, USA
- Jing Cui
- Department of Medicine, Boston University, Boston, Massachusetts, USA
- Chao-Yu Guo
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Serkalem Demissie
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Martin Larson
- Department of Mathematics and Statistics, Boston University, Boston, Massachusetts, USA
- Larry D Atwood
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Department of Neurology, Boston University, Boston, Massachusetts, USA
- L Adrienne Cupples
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Anita L DeStefano
- Department of Biostatistics, Boston University, Boston, Massachusetts, USA
- Department of Neurology, Boston University, Boston, Massachusetts, USA
29
Abstract
There has been a lack of consistency in detecting chromosomal loci that are linked to obesity-related traits. This may be due, in part, to the phenotype definition. Many studies use a one-time, single measurement as a phenotype while one's weight often fluctuates considerably throughout adulthood. Longitudinal data from the Framingham Heart Study were used to derive alternative phenotypes that may lead to more consistent findings. Body mass index (BMI), a measurement for obesity, is known to increase with age and then plateau or decline slightly; the decline phase may represent a threshold or survivor effect. We propose to use the weight gain phase of BMI to derive phenotypes useful for linkage analysis of obesity. Two phenotypes considered in the present study are the average of and the slope of the BMI measurements in the gain phase (gain mean and gain slope). For comparison, we also considered the average of all BMI measurements available (overall mean). Linkage analysis using the gain mean phenotype exhibited two markers with LOD scores greater than 3, with the largest score of 3.52 on chromosome 4 at ATA2A03. In contrast, no LOD scores greater than 3 were observed when overall mean was used. The gain slope produced weak evidence for linkage on chromosome 4 with a multipoint LOD score of 1.77 at GATA8A05. Our analysis shows how omitting the decline phase of BMI in the definition of obesity phenotypes can result in evidence for linkage which might have been otherwise overlooked.
Affiliation(s)
- Lisa Strug
- University of Toronto, Public Health Sciences, 12 Queen's Park Crescent West, Toronto, Ontario, Canada
- The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario, Canada
- Lei Sun
- University of Toronto, Public Health Sciences, 12 Queen's Park Crescent West, Toronto, Ontario, Canada
- The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario, Canada
- Mary Corey
- University of Toronto, Public Health Sciences, 12 Queen's Park Crescent West, Toronto, Ontario, Canada
- The Hospital for Sick Children, 555 University Avenue, Toronto, Ontario, Canada
30
Abstract
In the search for genes underlying complex traits, there is a tendency to impose increasingly stringent criteria to avoid false discoveries. These stringent criteria make it hard to find true effects, and we argue that it might be better to optimize our procedures for eliminating and controlling false discoveries. Focusing on achieving an acceptable ratio of true- and false-positives, we show that false discoveries could be eliminated much more efficiently using a stepwise approach. To avoid a relatively high false discovery rate, corrections for 'multiple testing' might also be needed in candidate gene studies. If the appropriate methods are used, detecting the proportion of true effects appears to be a more important determinant of the genotyping burden than the desired false discovery rate. This raises the question of whether current models for gene discovery are shaped excessively by a fear of false discoveries.
Affiliation(s)
- Edwin J C G van den Oord
- Virginia Institute for Psychiatric and Behavioral Genetics, Medical College of Virginia, Virginia Commonwealth University, Richmond, VA 23298-0126, USA.
31
Yuan A, Bonney GE. Two new recursive likelihood calculation methods for genetic analysis. Hum Hered 2003; 54:82-98. [PMID: 12566740 DOI: 10.1159/000067664] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/08/2002] [Accepted: 09/06/2002] [Indexed: 11/19/2022] Open
Abstract
Recursive likelihood calculations for genetic analysis with ungenotyped pedigree data employ variations of the Elston-Stewart (ES) or the Lander-Green (LG) algorithms. With the ES algorithm, the number of loci may be limited but not the pedigree size; with the LG algorithm, the reverse is the case. We introduce two new algorithms for the computation of regressive likelihoods for pedigrees with multivariate traits. The first is an alternative formulation of our existing model, which leads to a simpler form in the binary trait, polygenic and mixed model cases. The second is an approximation model, which is computationally efficient. These methods apply to both continuous and binary traits, in the oligogenic and polygenic cases, and both methods coincide in the binary case. We considered these methods for cases in which all the traits are controlled by a single locus, with each trait controlled by one locus independently of the others. Simulation studies and analysis of real data are presented for segregation analysis as illustrations. These methods can also be used in other model-based analyses, and are implemented in G.E.M.S., the genetic epidemiology models software.
Affiliation(s)
- Ao Yuan
- National Human Genome Center, Howard University, Statistical Genetics and Bioinformatics Unit, Washington DC, USA.
32
Shah PK, Perez-Iratxeta C, Bork P, Andrade MA. Information extraction from full text scientific articles: where are the keywords? BMC Bioinformatics 2003; 4:20. [PMID: 12775220 PMCID: PMC166134 DOI: 10.1186/1471-2105-4-20] [Citation(s) in RCA: 114] [Impact Index Per Article: 5.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/05/2003] [Accepted: 05/29/2003] [Indexed: 11/12/2022] Open
Abstract
BACKGROUND To date, many of the methods for extraction of biological information from scientific articles are restricted to the abstract of the article. However, full text articles in electronic form, which offer larger sources of data, are now available. Several questions arise as to whether the effort of scanning full text articles is worthwhile, and whether the information that can be extracted from the different sections of an article is relevant. RESULTS In this work we addressed those questions, showing that the keyword content of the different sections of a standard scientific article (abstract, introduction, methods, results, and discussion) is very heterogeneous. CONCLUSIONS Although the abstract contains the best ratio of keywords per total of words, other sections of the article may be a better source of biologically relevant data.
Affiliation(s)
- Parantu K Shah
- Biocomputing, European Molecular Biology Laboratory, Heidelberg, Germany
- Department of Bioinformatics, Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany
- Carolina Perez-Iratxeta
- Biocomputing, European Molecular Biology Laboratory, Heidelberg, Germany
- Department of Bioinformatics, Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany
- Peer Bork
- Biocomputing, European Molecular Biology Laboratory, Heidelberg, Germany
- Department of Bioinformatics, Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany
- Miguel A Andrade
- Biocomputing, European Molecular Biology Laboratory, Heidelberg, Germany
- Department of Bioinformatics, Max Delbrück Center for Molecular Medicine, Berlin-Buch, Germany
- Present address: Bioinformatics group, Ottawa Health Research Institute, Ottawa, Canada
33
34
Abstract
We previously presented an algorithm for extracting Boolean functions (propositions, rules) from the units in trained neural networks. The extracted Boolean functions make the hidden units understandable. However, in some cases the extracted Boolean functions are complicated, and so are not understandable, which means that the hidden units are not functionally localized. This paper presents an algorithm for the functional localization of (the hidden units of) neural networks. When a hidden unit is well approximated by a low-order Boolean function, the unit can be regarded as functionally localized. The functional localization of a hidden unit is evaluated by the error between the hidden unit and the low-order Boolean function extracted from it. The optimization is executed by genetic algorithms. We applied the algorithm to vote data, mushroom data and chess data. Experimental results show that it works well.
Affiliation(s)
- Hiroshi Tsukimoto
- Tokyo Denki University, 2-2, Kanda-Nishiki-cho, Chiyoda-ku, 101-8457, Tokyo, Japan.
35
Yamashita F, Wanchana S, Hashida M. Quantitative structure/property relationship analysis of Caco-2 permeability using a genetic algorithm-based partial least squares method. J Pharm Sci 2002; 91:2230-9. [PMID: 12226850 DOI: 10.1002/jps.10214] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
Caco-2 cell monolayers are widely used systems for predicting human intestinal absorption. This study was carried out to develop a quantitative structure-property relationship (QSPR) model of Caco-2 permeability using a novel genetic algorithm-based partial least squares (GA-PLS) method. The Caco-2 permeability data for 73 compounds were taken from the literature. Molconn-Z descriptors of these compounds were calculated as molecular descriptors, and the optimal subset of the descriptors was explored by GA-PLS analysis. A fitness function considering both goodness-of-fit to the training data and predictability of the testing data was adopted throughout the genetic algorithm-driven optimization procedure. The final PLS model consisting of 24 descriptors gave a correlation coefficient (r) of 0.886 for the entire dataset and a predictive correlation coefficient (r(pred)) of 0.825 that was evaluated by a leave-some-out cross-validation procedure. Thus, the GA-PLS analysis proved to be a reasonable QSPR modeling approach for predicting Caco-2 permeability.
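The GA-driven descriptor selection described above can be sketched roughly as follows: a GA evolves binary masks over the descriptor columns, scoring each mask by goodness-of-fit of a regression on the selected columns. Ordinary least squares stands in here for the paper's PLS fitness, and all data, names, and parameter settings are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def ga_select_features(X, y, pop_size=20, generations=40, p_mut=0.1, seed=1):
    """Binary-mask GA for descriptor-subset selection. Fitness is the R^2 of
    an ordinary least-squares fit on the selected columns (a simple stand-in
    for the PLS fitness used in the paper)."""
    rng = np.random.default_rng(seed)
    n_feat = X.shape[1]

    def fitness(mask):
        if mask.sum() == 0:
            return -np.inf
        Xs = X[:, mask.astype(bool)]
        coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
        resid = y - Xs @ coef
        return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

    pop = (rng.random((pop_size, n_feat)) < 0.5).astype(int)
    for _ in range(generations):
        scores = np.array([fitness(m) for m in pop])
        order = np.argsort(scores)[::-1]
        parents = pop[order[: pop_size // 2]]        # truncation selection
        children = []
        while len(children) < pop_size - len(parents):
            p1, p2 = parents[rng.integers(len(parents), size=2)]
            child = np.where(rng.random(n_feat) < 0.5, p1, p2)  # uniform crossover
            flip = rng.random(n_feat) < p_mut                   # bit-flip mutation
            children.append(np.where(flip, 1 - child, child))
        pop = np.vstack([parents, np.array(children)])
    scores = np.array([fitness(m) for m in pop])
    return pop[int(scores.argmax())]

# Illustrative data: y depends only on descriptors 1 and 4 of six.
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
y = 3.0 * X[:, 1] - 2.0 * X[:, 4]
mask = ga_select_features(X, y)
```

With noise-free synthetic data, masks containing the truly informative columns fit exactly, so selection concentrates on them; the paper's fitness additionally rewards predictivity on held-out test data.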
Affiliation(s)
- Fumiyoshi Yamashita
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, Yoshidashimoadachi-cho, Sakyo-ku, Kyoto 606-8501, Japan
36
Wanchana S, Yamashita F, Hashida M. Quantitative structure/property relationship analysis on aqueous solubility using genetic algorithm-combined partial least squares method. Pharmazie 2002; 57:127-9. [PMID: 11878188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 02/24/2023]
Abstract
The present study was initiated to generate a model for predicting the aqueous solubility of substances from their molecular structure. For 211 drugs or drug-like compounds, topological indices were calculated with the Molconn-Z software. The optimal subset of descriptors for predicting aqueous solubility was determined by a genetic algorithm in combination with the partial least squares (PLS) method, which selected thirty-four descriptors. Using the 29 selected descriptors whose scaled PLS coefficients were significant, the cross-validated predictive q2 was 0.785 with the optimal 19 principal components, and the standard error of prediction was 0.676. Thus, it is suggested that the model obtained would perform well in predicting the aqueous solubility of compounds.
Affiliation(s)
- S Wanchana
- Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, Kyoto, Japan
37
Leung CH, Poon WS, Yu LM. Is retrospective study reliable in genetic studies? Stroke 2001; 32:2441. [PMID: 11588342] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/21/2023]
38
Sierra B, Serrano N, Larrañaga P, Plasencia EJ, Inza I, Jiménez JJ, Revuelta P, Mora ML. Using Bayesian networks in the construction of a bi-level multi-classifier. A case study using intensive care unit patients data. Artif Intell Med 2001; 22:233-48. [PMID: 11377149 DOI: 10.1016/s0933-3657(00)00111-1] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
Combining the predictions of a set of classifiers has been shown to be an effective way to create composite classifiers that are more accurate than any of the component classifiers, and there are many methods for combining such predictions. We introduce a new method that combines a number of component classifiers by using a Bayesian network as the classifier system over the component classifiers' predictions. The component classifiers are standard machine learning classification algorithms, and the Bayesian network structure is learned using a genetic algorithm that searches for the structure that maximises the classification accuracy given the predictions of the component classifiers. Experimental results were obtained on a datafile of cases containing information about ICU patients at the Canary Islands University Hospital. The accuracy obtained using the presented approach statistically improves on that obtained using standard machine learning methods.
Affiliation(s)
- B Sierra
- Department of Computer Science and Artificial Intelligence, University of the Basque Country, P.O. Box 649, E-20080, San Sebastián, Spain.
39
Liu F, Wang J. [Genetic algorithms and its application to spectral analysis]. Guang Pu Xue Yu Guang Pu Fen Xi 2001; 21:331-335. [PMID: 12947660] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Subscribe] [Scholar Register] [Indexed: 05/24/2023]
Abstract
The genetic algorithm, derived from the principle of natural selection and the concepts of genetics, is a global search method that is not only highly effective but also inherently parallel. Its essential theory, operating method, applications to spectral analysis and trends of development are reviewed with 66 references.
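The basic operating method the review covers (selection, crossover, and mutation over a population of bit strings) can be sketched minimally as below; the fitness function and all parameter values are purely illustrative.

```python
import random

def genetic_algorithm(fitness, n_bits=10, pop_size=30, generations=60,
                      p_cross=0.9, p_mut=0.02, seed=0):
    """Minimal binary GA: tournament selection, one-point crossover,
    bit-flip mutation; the best individual seen so far is retained."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)

    def select():
        # Binary tournament: the fitter of two random individuals wins.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = select(), select()
            if rng.random() < p_cross:
                cut = rng.randrange(1, n_bits)          # one-point crossover
                p1, p2 = p1[:cut] + p2[cut:], p2[:cut] + p1[cut:]
            for c in (p1, p2):
                children.append([b ^ (rng.random() < p_mut) for b in c])
        pop = children
        best = max(pop + [best], key=fitness)
    return best

# Illustrative fitness: decode the bits as an integer x, maximize -(x - 300)^2.
def fitness(bits):
    x = int("".join(map(str, bits)), 2)
    return -(x - 300) ** 2

best = genetic_algorithm(fitness)
x_best = int("".join(map(str, best)), 2)
```

With only ten bits the search space is tiny; the point is the loop structure (evaluate, select, recombine, mutate), not the toy problem.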
Affiliation(s)
- F Liu
- Laboratory of Advanced Spectroscopy, Nanjing University of Science and Technology, 210014 Nanjing
40
Abstract
In this paper, a program implementing a genetic algorithm is used to optimise fed-batch culture of hybridoma cells for the highest yield over a given time period. Optimal feed rate trajectories for a single feed stream containing both glucose and glutamine, and for separate feed streams of glucose and glutamine, are determined via the genetic algorithm. Compared to the optimal constant feed rate regime, the optimal varying feed rate trajectories improve the final monoclonal antibody concentration by 10% for the single feed stream case and by 39% for the multiple feed stream case in this simulation. In comparison with dynamic programming, the GA-calculated feed trajectories yield a much higher level of monoclonal antibody concentration.
Affiliation(s)
- S K Nguang
- Systems and Control Research Cluster, Department of Electrical and Electronic Engineering, University of Auckland, New Zealand.
41
Abstract
A technique to forecast spatiotemporal time series is presented. It uses a proper orthogonal or Karhunen-Loève decomposition to encode large spatiotemporal data sets in a few time series, and genetic algorithms to efficiently extract dynamical rules from the data. The method works very well for confined systems displaying spatiotemporal chaos, as exemplified here by forecasting the evolution of the one-dimensional complex Ginzburg-Landau equation in a finite domain.
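The encoding step described above can be sketched with an SVD-based proper orthogonal decomposition; the GA-driven extraction of dynamical rules for forecasting is omitted, and the toy field and mode count below are assumptions for illustration only.

```python
import numpy as np

# Toy spatiotemporal data set: two traveling-wave components plus weak noise,
# sampled on a (time x space) grid. All values here are illustrative.
rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)[:, None]
x = np.linspace(0, 2 * np.pi, 64)[None, :]
field = (np.sin(x - t) + 0.3 * np.cos(2 * x + 0.5 * t)
         + 0.01 * rng.standard_normal((200, 64)))

# Proper orthogonal (Karhunen-Loève) decomposition via SVD: a few spatial
# modes, each paired with one amplitude time series, encode the whole field.
n_modes = 4
mean = field.mean(axis=0)
U, s, Vt = np.linalg.svd(field - mean, full_matrices=False)
amplitudes = U[:, :n_modes] * s[:n_modes]   # n_modes time series
modes = Vt[:n_modes]                        # corresponding spatial modes
reconstruction = mean + amplitudes @ modes

rel_err = np.linalg.norm(field - reconstruction) / np.linalg.norm(field)
```

Because the signal here is a sum of four separable space-time products, four modes suffice and the reconstruction error is at the noise level; forecasting would then operate on the few amplitude time series instead of the full field.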
Affiliation(s)
- C López
- Instituto Mediterráneo de Estudios Avanzados, IMEDEA (CSIC-Universitat de les Illes Balears), 07071 Palma de Mallorca, Spain
42
Abstract
Sequential projection pursuit (SPP) is proposed to detect inhomogeneities (clusters) in high-dimensional analytical data. Such inhomogeneities indicate that there are groups of objects (samples) with different chemical characteristics. The method is compared with principal component analysis (PCA). PCA is generally applied to visually explore structure in high-dimensional data, but is not specifically used to find clustering tendency. Projection pursuit (PP) is specifically designed to find inhomogeneities, but the original method is computationally very intensive. SPP combines the advantages of both methods and overcomes most of their weak points. In this method, latent variables are obtained sequentially according to their importance measured by the entropy index. This involves an optimization step, which is achieved by using a genetic algorithm. The performance of the method is demonstrated and evaluated, first on simulated data sets, and then on near-infrared and gas chromatography data sets. It is shown that SPP indeed reveals more easily information about inhomogeneities than PCA.
Affiliation(s)
- Q Guo
- ChemoAC, Pharmaceutical Institute, Vrije Universiteit Brussel, Belgium
43
Abstract
Conventional spectral analysis methods use a fast Fourier transform (FFT) on consecutive or overlapping windowed data segments. For Doppler ultrasound signals, this approach suffers from inadequate frequency resolution due to the time segment duration and the non-stationary characteristics of the signals. Parametric or model-based estimators can give significant improvements in time-frequency resolution at the expense of higher computational complexity. This work describes an approach which implements, in real time, a parametric spectral estimation method using genetic algorithms (GAs) to find the optimum set of parameters for the adaptive filter that minimises the error function. The aim is to reduce the computational complexity of the conventional algorithm by using the simplicity associated with GAs and exploiting their parallel characteristics. This allows the implementation of higher order filters, increasing the spectrum resolution and opening a greater scope for using more complex methods.
Affiliation(s)
- J Solano González
- Departamento de Ingeniería de Sistemas Computacionales y Automatización, Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas, Universidad Nacional Autónoma de México, P.O. Box 20-726, Del. A.Obregón, Mexico
44
Lavine BK, Ritter J, Moores AJ, Wilson M, Faruque A, Mayfield HT. Source identification of underground fuel spills by solid-phase microextraction/high-resolution gas chromatography/genetic algorithms. Anal Chem 2000; 72:423-31. [PMID: 10658340 DOI: 10.1021/ac9904967] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Solid-phase microextraction (SPME), capillary column gas chromatography, and pattern recognition methods were used to develop a potential method for typing jet fuels so that a spill sample in the environment can be traced to its source. The test data consisted of gas chromatograms from 180 neat jet fuel samples representing common aviation turbine fuels found in the United States (JP-4, Jet-A, JP-7, JPTS, JP-5, JP-8). SPME sampling of the fuel's headspace afforded well-resolved, reproducible profiles, which were standardized using special peak-matching software. The peak-matching procedure yielded 84 standardized retention time windows, though not all peaks were present in all gas chromatograms. A genetic algorithm (GA) was employed to identify features in the standardized chromatograms of the neat jet fuels suitable for pattern recognition analysis. The GA selected peaks whose two largest principal components showed clustering of the chromatograms on the basis of fuel type. The principal component analysis routine in the fitness function of the GA acted as an information filter, significantly reducing the size of the search space, since it restricted the search to feature subsets whose variance is primarily about differences between the various fuel types in the training set. In addition, the GA focused on those classes and/or samples that were difficult to classify as it trained, using a form of boosting: samples that consistently classified correctly were not weighted as heavily as samples that were difficult to classify. Over time, the GA learned its optimal parameters in a manner similar to a perceptron. The pattern recognition GA integrated aspects of strong and weak learning to yield a "smart" one-pass procedure for feature selection.
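A much simplified version of the PCA-based fitness idea (scoring a peak subset by class separation in the plane of the two largest principal components, without the boosting-style sample weighting) might look like the sketch below; the separation score, data, and class structure are all illustrative assumptions.

```python
import numpy as np

def pc_cluster_fitness(X, labels, mask):
    """Score a peak subset by how well classes separate in the plane of the
    two largest principal components of the selected peaks. A simplified
    stand-in for the paper's PCA-based fitness; the boosting-style sample
    weighting is omitted."""
    if mask.sum() < 2:
        return -np.inf
    Xs = X[:, mask.astype(bool)]
    Xc = Xs - Xs.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Xc @ Vt[:2].T                      # scores on the two largest PCs
    grand = P.mean(axis=0)
    between = within = 0.0
    for g in np.unique(labels):
        Pg = P[labels == g]
        between += len(Pg) * np.sum((Pg.mean(axis=0) - grand) ** 2)
        within += np.sum((Pg - Pg.mean(axis=0)) ** 2)
    return between / (within + 1e-12)      # large when classes cluster apart

# Illustrative data: two fuel "types"; peaks 0 and 1 differ between types,
# peaks 2-5 carry no class information.
rng = np.random.default_rng(3)
labels = np.repeat([0, 1], 30)
X = rng.standard_normal((60, 6))
X[labels == 1, :2] += 3.0

informative = np.array([1, 1, 0, 0, 0, 0])
uninformative = np.array([0, 0, 1, 1, 0, 0])
```

A GA evolving such masks would rank the informative subset far above the uninformative one, which is what restricts the search to fuel-type-discriminating peaks.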
Affiliation(s)
- B K Lavine
- Department of Chemistry, Clarkson University, Potsdam, New York 13699-5810, USA
45
Abstract
Unlike monogenic diseases for which considerable progress has been made in past years, the identification of susceptibility genes involved in multifactorial diseases still poses numerous challenges, including the development of new statistical methodologies. Recently, several authors have advocated the use of the estimating equations (EE) approach as an alternative to standard maximum likelihood methods for analysing correlated data. Since most genetic studies rely on family data, the EE found a natural field of application in genetic epidemiology. The objective of this review is to give a brief description of the EE principles, and to outline its applications in the main areas of genetic epidemiology, including familial aggregation analysis, segregation analysis, linkage analysis and association studies.
46
Albert I, Jais JP. [Methodology for analyzing censored correlated data: application of marginal and frailty approaches in human genetics. The European Community Alport Syndrome Concerted Action Group (ECASCA)]. Rev Epidemiol Sante Publique 1999; 47:545-54. [PMID: 10673588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [MESH Headings] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 02/15/2023] Open
Abstract
BACKGROUND Statistical analysis of correlated censored data allows the study of censored events in clustered designs. When failure times within the same group may be correlated, standard methodology is no longer applicable. We investigated models proposed in this context to study familial data on a genetic disease, Alport syndrome. Alport syndrome is a severe hereditary disease caused by abnormal collagen chains; renal failure is its main symptom and progresses toward end-stage renal failure (ESRF) with highly variable timing. As genetic studies have shown, mutations of the COL4A5 gene are involved in X-linked Alport syndrome. Given the large range of mutation types, the aim of this study was to search for a possible genetic origin of the heterogeneity in disease severity. METHODS Marginal survival models and mixed-effects survival models (so-called frailty models) were used to take into account the possible non-independence of the observations. In this study, time until end-stage renal failure is a right-censored end point, and intra-familial correlations due to shared environmental and/or genetic factors could induce dependence among familial failure times. We fit marginal and frailty proportional hazards models to evaluate the effect of mutation type on the risk of ESRF and the interfamilial heterogeneity of failure times. RESULTS The use of these models demonstrates interfamilial heterogeneity in the failure times to ESRF. Moreover, the results suggest that some mutation types are linked to a higher risk of fast progression to ESRF, which partially explains the interfamilial heterogeneity of the failure times. CONCLUSIONS This paper shows the value of marginal and frailty models for evaluating the heterogeneity of censored responses and for studying relationships between a censored criterion and covariates. This study underscores the importance of characterizing the mutation at the molecular level to understand the relationship between genotype and phenotype.
Affiliation(s)
- I Albert
- Service de Biostatistique et d'Informatique Médicale, CHU Necker-Enfants Malades, 149, rue de Sèvres, 75743 Paris Cedex 15.
47
Abstract
Kinetic parameters were estimated from a three-compartment fluorodeoxyglucose model with three rate constants using a genetic algorithm. The performance of the genetic algorithm was investigated by simulation studies, in which brain time-activity data (TAD) were generated using cited mean values of rate constants and the plasma TAD obtained from positron emission tomographic studies. The accuracy of kinetic parameter estimation using the genetic algorithm was compared with that using the non-linear least-squares (NLSQ) method. The margin of error in the parameters estimated using the genetic algorithm tended to be smaller than that obtained by the NLSQ method. Although not statistically significant at a noise level of 5% in the brain TAD, the difference between the two methods became significant for all parameters at a noise level of 15% or higher. Our results suggest that the genetic algorithm is a promising means of estimating kinetic parameters from compartment models, because it is more robust against statistical noise than the NLSQ method and it can be rendered highly parallel for processing.
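As a rough sketch of GA-based parameter estimation in this spirit: a real-coded GA fits rate constants to a synthetic tissue curve generated from a one-compartment simplification with constant plasma input, not the paper's three-compartment FDG model. The model form, true constants, and GA settings below are all assumptions for illustration.

```python
import math
import random

# Synthetic "measured" tissue curve from a one-compartment simplification:
#   C(t) = (k1 / k2) * (1 - exp(-k2 * t)),  true k1 = 0.1, k2 = 0.05
TIMES = list(range(0, 61, 5))

def model(k1, k2, t):
    return (k1 / k2) * (1.0 - math.exp(-k2 * t))

DATA = [model(0.1, 0.05, t) for t in TIMES]

def sse(params):
    """Sum of squared errors between the model curve and the data."""
    k1, k2 = params
    return sum((model(k1, k2, t) - c) ** 2 for t, c in zip(TIMES, DATA))

def ga_fit(pop_size=40, generations=120, sigma=0.02, seed=2):
    """Real-coded elitist GA: truncation selection, blend crossover and
    Gaussian mutation, minimizing the sum of squared errors."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.0, 1.0), rng.uniform(0.01, 1.0))
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=sse)
        parents = pop[: pop_size // 4]         # keep the best quarter
        children = list(parents)               # elitism
        while len(children) < pop_size:
            (a1, a2), (b1, b2) = rng.choice(parents), rng.choice(parents)
            w = rng.random()                   # blend crossover + mutation
            children.append((w * a1 + (1 - w) * b1 + rng.gauss(0, sigma),
                             max(1e-3, w * a2 + (1 - w) * b2 + rng.gauss(0, sigma))))
        pop = children
    return min(pop, key=sse)

k1_hat, k2_hat = ga_fit()
```

Because the GA only ranks candidate parameter sets by their fit error, it needs no gradients and degrades gracefully as noise is added to the curve, which is the robustness property the abstract reports relative to non-linear least squares.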
Affiliation(s)
- K Murase
- Department of Radiology, Ehime University School of Medicine, Shitsukawa, Shigenobu-cho, Onsen-gun, Japan.
48
Abstract
Statistical analyses are used in many fields of genetic research. Most geneticists are taught classical statistics, which includes hypothesis testing, estimation and the construction of confidence intervals; this framework has proved more than satisfactory in many ways. What does a Bayesian framework have to offer geneticists? Its utility lies in offering a more direct approach to some questions and the incorporation of prior information. It can also provide a more straightforward interpretation of results. The utility of a Bayesian perspective, especially for complex problems, is becoming increasingly clear to the statistics community; geneticists are also finding this framework useful and are increasingly utilizing the power of this approach.
Affiliation(s)
- J S Shoemaker
- The Cancer Prevention, Detection, Control Research Program, Duke Medical Center, Box 2949, Durham, NC 27710, USA.
49
Abstract
Family-based procedures such as the transmission disequilibrium test (TDT) were motivated by concern that sample-based methods to map disease genes by allelic association are not robust to population stratification, migration, and admixture. Other factors to consider in designing a study of allelic association are specification of gene action in a weakly parametric model, efficiency, diagnostic reliability for hypernormal individuals, interest in linkage and imprinting, and sibship composition. Family-based samples lend themselves to the TDT despite its inefficiency compared with cases and unrelated normal controls. The TDT has an efficiency of 1/2 for parent-offspring pairs and 2/3 for father-mother-child trios. Against cases and hypernormal controls, the efficiency is only 1/6 on the null hypothesis. Although dependent on marker gene frequency and other factors, efficiency for hypernormal controls is always greater than for random controls. Efficiency of the TDT is increased in multiplex families and by inclusion of normal sibs, approaching a case-control design with normal but not hypernormal controls. Isolated cases favor unrelated controls, and only in exceptional populations would avoidance of stratification justify a family-based design to map disease genes by allelic association.
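In its basic form, the TDT discussed above reduces to a McNemar-type statistic comparing transmissions and non-transmissions of a candidate allele from heterozygous parents to affected offspring. A minimal sketch with illustrative counts:

```python
import math

def tdt(transmitted, untransmitted):
    """Basic TDT: among heterozygous parents, b counts transmissions of the
    candidate allele to affected offspring and c non-transmissions; under the
    null of no linkage or association, b and c are exchangeable, giving a
    McNemar-type chi-square statistic with 1 degree of freedom."""
    b, c = transmitted, untransmitted
    chi2 = (b - c) ** 2 / (b + c)
    p = math.erfc(math.sqrt(chi2 / 2.0))   # chi-square(1 df) survival function
    return chi2, p

chi2, p = tdt(60, 35)   # hypothetical counts: 60 transmissions vs 35 non-transmissions
```

Because only heterozygous parents contribute, homozygous transmissions are discarded, which is one source of the efficiency losses relative to case-control designs quantified in the abstract.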
Affiliation(s)
- N E Morton
- Human Genetics, University of Southampton, Level G, Princess Anne Hospital, Coxford Road, Southampton SO16 5YA, United Kingdom.
50
Baumgart-Schmitt R, Herrmann WM, Eilers R, Bes F. On the use of neural network techniques to analyse sleep EEG data. First communication: application of evolutionary and genetic algorithms to reduce the feature space and to develop classification rules. Neuropsychobiology 1997; 36:194-210. [PMID: 9396019 DOI: 10.1159/000119412] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 02/05/2023]
Abstract
To automate sleep stage scoring, the sleep analysis system to challenge innovative artificial networks (SASCIA) was developed and implemented. Our investigation had two aims. In addition to automatic sleep stage scoring, we tested the hypothesis that a single EEG channel (C4-A2) is sufficient to generate sleep profiles comparable with those produced by sleep experts from at least three channels (EEG (C4-A2), EOG and EMG): since EOG and EMG are seen as epiphenomena during sleep, the full information about the sleep stage should, according to our hypothesis, be available in the EEG alone. The main components of SASCIA are designed to adapt flexibly to interindividual differences in the sleep EEG. Its core consists of neural networks trained by supervised learning, with the experts' scorings included in the learning and test sets. Feature selection from a large candidate set (118 features) is performed by genetic algorithms, and the network topologies are optimized by evolutionary algorithms. Different mathematical procedures were used to evaluate and optimize the efficiency of the system. The profiles generated by SASCIA are in reasonable agreement with the sleep stages scored by experts according to RKR. The development of the system is reported in three parts: this first communication deals with the application of neural network techniques using evolutionary and genetic algorithms and with the selection of the feature space; the second communication describes training of these evolutionarily optimized networks with multiple subjects and the application of context rules; the third communication shows an improvement in robustness through the simultaneous application of 9 different networks, obtained from 9 subject types, used in combination with context rules.
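The genetic-algorithm feature selection described above can be sketched as a binary-mask GA. The per-feature scores and cost term below are illustrative stand-ins for the real fitness, which in the paper would come from network classification performance on the 118-dimensional feature space:

```python
import numpy as np

# Hypothetical usefulness scores for 20 candidate features; a real run would
# evaluate each mask by training and testing a classifier on the selected set.
scores = np.array([0.9, 0.1, 0.8, 0.05, 0.7, 0.2, 0.85, 0.15, 0.6, 0.1,
                   0.75, 0.05, 0.65, 0.2, 0.9, 0.1, 0.7, 0.05, 0.8, 0.15])
COST = 0.3   # penalty per selected feature, favouring compact feature sets

def fitness(mask):
    return float(scores[mask.astype(bool)].sum()) - COST * int(mask.sum())

def ga_select(n_bits=20, pop=40, gens=80, seed=2):
    """Binary-mask GA: truncation selection, one-point crossover,
    bit-flip mutation, and elitism."""
    rng = np.random.default_rng(seed)
    population = rng.integers(0, 2, size=(pop, n_bits))
    for _ in range(gens):
        fit = np.array([fitness(m) for m in population])
        order = np.argsort(fit)[::-1]
        parents = population[order[: pop // 2]]            # truncation selection
        picks = rng.integers(0, len(parents), size=(pop, 2))
        cut = rng.integers(1, n_bits, size=pop)
        keep_left = np.arange(n_bits) < cut[:, None]       # one-point crossover
        children = np.where(keep_left, parents[picks[:, 0]], parents[picks[:, 1]])
        flips = rng.random((pop, n_bits)) < 1.0 / n_bits   # bit-flip mutation
        children = np.where(flips, 1 - children, children)
        children[0] = population[order[0]]                 # elitism: keep current best
        population = children
    fit = np.array([fitness(m) for m in population])
    return population[np.argmax(fit)]

best_mask = ga_select()
```

The cost term plays the role of the paper's pressure toward a reduced feature space: the optimum keeps only features whose contribution exceeds the per-feature penalty.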
Affiliation(s)
- R Baumgart-Schmitt
- Department of Electrical Engineering, Schmalkalden Institute of Technology, Free University of Berlin, Germany