1
Arias-Garcia J, de Souza AC, Gade L, Yudi J, Coelho F, Castro CL, Torres LCB, Braga AP. Improved Design for Hardware Implementation of Graph-Based Large Margin Classifiers for Embedded Edge Computing. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2022; PP:1320-1329. [PMID: 35737604 DOI: 10.1109/tnnls.2022.3183236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/15/2023]
Abstract
The number of connected embedded edge computing Internet of Things (IoT) devices has been increasing over the years, contributing to the significant growth of available data in different scenarios. Machine learning algorithms have therefore emerged to enable task automation and process optimization based on those data. However, because of the computational complexity of some learning methods that implement geometric classifiers, it is challenging to map them onto embedded systems or devices with limited size, processing, memory, and power resources while meeting the desired requirements. This hampers the applicability of these methods to complex industrial embedded edge applications. This work evaluates strategies to reduce classifiers' implementation costs based on the CHIP-clas model, independent of hyperparameter tuning and optimization algorithms. The proposal aims to evaluate the tradeoff between numerical precision and model performance and to analyze the hardware implementations of a distance-based classifier. Two 16-bit floating-point formats were compared to the 32-bit floating-point implementation. Also, a new hardware architecture was developed and compared to the state-of-the-art reference. The results indicate that the model is robust to low-precision computation, providing statistically equivalent results compared to the baseline model, with a global speed-up factor of approximately 4.39 in processing time.
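The precision tradeoff described in this abstract can be illustrated with a minimal Python sketch. It is not the CHIP-clas model or the authors' hardware design: it assumes a plain nearest-reference classifier, emulates only one 16-bit format (IEEE 754 binary16, via the standard `struct` module), and uses hypothetical reference points and samples chosen for illustration.

```python
import struct

def to_half(x: float) -> float:
    """Round x to IEEE 754 binary16 (16-bit float) and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_single(x: float) -> float:
    """Round x to IEEE 754 binary32 (32-bit float) and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def squared_distance(a, b, quantize):
    """Squared Euclidean distance with every intermediate value rounded by `quantize`."""
    acc = quantize(0.0)
    for ai, bi in zip(a, b):
        diff = quantize(quantize(ai) - quantize(bi))
        acc = quantize(acc + quantize(diff * diff))
    return acc

def nearest_label(sample, references, quantize):
    """Return the label of the reference point closest to `sample`."""
    closest = min(references, key=lambda ref: squared_distance(sample, ref[0], quantize))
    return closest[1]

# Hypothetical labeled reference points and test samples (illustration only).
REFERENCES = [((0.0, 0.0), "A"), ((1.0, 1.0), "B"), ((0.0, 2.0), "C")]
SAMPLES = [(0.1, 0.2), (0.9, 0.8), (0.2, 1.7), (0.6, 0.7)]

labels_fp32 = [nearest_label(s, REFERENCES, to_single) for s in SAMPLES]
labels_fp16 = [nearest_label(s, REFERENCES, to_half) for s in SAMPLES]
print(labels_fp32, labels_fp16)
```

On well-separated samples like these, both precisions pick the same labels, because the rounding error of binary16 is far smaller than the distance margins. Samples lying very close to a decision boundary are where the two could disagree.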
2
Approximate Computing Circuits for Embedded Tactile Data Processing. ELECTRONICS 2022. [DOI: 10.3390/electronics11020190] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/04/2023]
Abstract
In this paper, we demonstrate the feasibility and efficiency of approximate computing techniques (ACTs) in the embedded Support Vector Machine (SVM) tensorial kernel circuit implementation in tactile sensing systems. The performance of the embedded SVM in terms of power, area, and delay can be improved by implementing approximate multipliers in the Singular Value Decomposition (SVD). SVD is the main computational bottleneck of the tensorial kernel approach; since digital multipliers are extensively used in SVD implementation, we aim to optimize the implementation of the multiplier circuit. We present the implementation of the approximate SVD circuit based on the Approximate Baugh-Wooley (Approx-BW) multiplier. The approximate SVD achieves an energy consumption reduction of up to 16% at the cost of a Mean Relative Error (MRE) decrease of less than 5%. We assess the impact of the approximate SVD on classification accuracy, showing that it increases the error rate (Err) by between one and eight percent. In addition, we propose a hybrid evaluation test approach that consists of implementing three different approximate SVD circuits with different numbers of approximated Least Significant Bits (LSBs). The results show that energy consumption is reduced by more than five percent with the same accuracy loss.
3
Cardarilli GC, Di Nunzio L, Fazzolari R, Giardino D, Nannarelli A, Re M, Spanò S. A pseudo-softmax function for hardware-based high speed image classification. Sci Rep 2021; 11:15307. [PMID: 34321514 PMCID: PMC8319144 DOI: 10.1038/s41598-021-94691-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/30/2021] [Accepted: 07/08/2021] [Indexed: 11/20/2022]
Abstract
In this work a novel architecture, named pseudo-softmax, to compute an approximated form of the softmax function is presented. This architecture can be fruitfully used in the last layer of Neural Networks and Convolutional Neural Networks for classification tasks, and in Reinforcement Learning hardware accelerators to compute the Boltzmann action-selection policy. The proposed pseudo-softmax design, intended for efficient hardware implementation, exploits the typical integer quantization of hardware-based Neural Networks obtaining an accurate approximation of the result. In the paper, a detailed description of the architecture is given and an extensive analysis of the approximation error is performed by using both custom stimuli and real-world Convolutional Neural Networks inputs. The implementation results, based on CMOS standard-cell technology, compared to state-of-the-art architectures show reduced approximation errors.
Affiliation(s)
- Gian Carlo Cardarilli
- Department of Electronic Engineering, University of Rome "Tor Vergata", 00133, Rome, Italy
- Luca Di Nunzio
- Department of Electronic Engineering, University of Rome "Tor Vergata", 00133, Rome, Italy
- Rocco Fazzolari
- Department of Electronic Engineering, University of Rome "Tor Vergata", 00133, Rome, Italy
- Daniele Giardino
- Department of Electronic Engineering, University of Rome "Tor Vergata", 00133, Rome, Italy
- Alberto Nannarelli
- Department of Applied Mathematics and Computer Science, Danmarks Tekniske Universitet, 2800, Kongens Lyngby, Denmark
- Marco Re
- Department of Electronic Engineering, University of Rome "Tor Vergata", 00133, Rome, Italy
- Sergio Spanò
- Department of Electronic Engineering, University of Rome "Tor Vergata", 00133, Rome, Italy.
4
An Efficient FPGA-Based Hardware Accelerator for Convex Optimization-Based SVM Classifier for Machine Learning on Embedded Platforms. ELECTRONICS 2021. [DOI: 10.3390/electronics10111323] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Indexed: 11/16/2022]
Abstract
Machine learning is becoming a cornerstone of smart and autonomous systems. Machine learning algorithms can be categorized into supervised learning (classification) and unsupervised learning (clustering). Among many classification algorithms, the Support Vector Machine (SVM) classifier is one of the most commonly used machine learning algorithms. By incorporating convex optimization techniques into the SVM classifier, we can further enhance the accuracy and classification process of the SVM by finding the optimal solution. Many machine learning algorithms, including SVM classification, are compute-intensive and data-intensive, requiring significant processing power. Furthermore, many machine learning algorithms have found their way into portable and embedded devices, which have stringent requirements. In this research work, we introduce a novel, unique, and efficient Field Programmable Gate Array (FPGA)-based hardware accelerator for a convex optimization-based SVM classifier for embedded platforms, considering the constraints associated with these platforms and the requirements of the applications running on these devices. We incorporate suitable mathematical kernels and decomposition methods to systematically solve the convex optimization for machine learning applications with a large volume of data. Our proposed architectures are generic, parameterized, and scalable; hence, without changing internal architectures, our designs can be used to process different datasets with varying sizes, can be executed on different platforms, and can be utilized for various machine learning applications. We also introduce system-level architectures and techniques to facilitate real-time processing. Experiments are performed using two different benchmark datasets to evaluate the feasibility and efficiency of our hardware architecture, in terms of timing, speedup, area, and accuracy. Our embedded hardware design achieves up to 79 times speedup compared to its embedded software counterpart, and can also achieve up to 100% classification accuracy.
5
Adaptive Detection of Islanding and Power Quality Disturbances in a Grid-Integrated Photovoltaic System. ARABIAN JOURNAL FOR SCIENCE AND ENGINEERING 2020. [DOI: 10.1007/s13369-020-04378-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 10/24/2022]
6
Xia W, Mita Y, Shibata T. A Nearest Neighbor Classifier Employing Critical Boundary Vectors for Efficient On-Chip Template Reduction. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2016; 27:1094-1107. [PMID: 26080388 DOI: 10.1109/tnnls.2015.2437901] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Indexed: 06/04/2023]
Abstract
Aiming at efficient data condensation and improving accuracy, this paper presents a hardware-friendly template reduction (TR) method for the nearest neighbor (NN) classifiers by introducing the concept of critical boundary vectors. A hardware system is also implemented to demonstrate the feasibility of using a field-programmable gate array (FPGA) to accelerate the proposed method. Initially, k-means centers are used as substitutes for the entire template set. Then, to enhance the classification performance, critical boundary vectors are selected by a novel learning algorithm, which is completed within a single iteration. Moreover, to remove noisy boundary vectors that can mislead the classification in a generalized manner, a global categorization scheme has been explored and applied to the algorithm. The global characterization automatically categorizes each classification problem and rapidly selects the boundary vectors according to the nature of the problem. Finally, only critical boundary vectors and k-means centers are used as the new template set for classification. Experimental results for 24 data sets show that the proposed algorithm can effectively reduce the number of template vectors for classification with a high learning speed. At the same time, it improves the accuracy by an average of 2.17% compared with the traditional NN classifiers and also shows greater accuracy than seven other TR methods. We have shown the feasibility of using a proof-of-concept FPGA system of 256 64-D vectors to accelerate the proposed method on hardware. At a 50-MHz clock frequency, the proposed system achieves a 3.86 times higher learning speed than on a 3.4-GHz PC, while consuming only 1% of the power used by the PC.
7
Canals V, Morro A, Oliver A, Alomar ML, Rosselló JL. A New Stochastic Computing Methodology for Efficient Neural Network Implementation. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2016; 27:551-564. [PMID: 25915963 DOI: 10.1109/tnnls.2015.2413754] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 06/04/2023]
Abstract
This paper presents a new methodology for the hardware implementation of neural networks (NNs) based on probabilistic laws. The proposed encoding scheme circumvents the limitations of classical stochastic computing (based on unipolar or bipolar encoding) extending the representation range to any real number using the ratio of two bipolar-encoded pulsed signals. Furthermore, the novel approach presents practically a total noise-immunity capability due to its specific codification. We introduce different designs for building the fundamental blocks needed to implement NNs. The validity of the present approach is demonstrated through a regression and a pattern recognition task. The low cost of the methodology in terms of hardware, along with its capacity to implement complex mathematical functions (such as the hyperbolic tangent), allows its use for building highly reliable systems and parallel computing.
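The encoding ideas named in this abstract can be sketched in software. The Python sketch below shows classical bipolar stochastic encoding, where a value in [-1, 1] becomes a random bit stream and an XNOR gate multiplies two independent streams. It then shows how a ratio of two bipolar values can represent a number outside [-1, 1]. The function names, stream length, and seed are assumptions made for illustration; the paper's actual circuits and pulsed-signal details are not reproduced here.

```python
import random

def bipolar_stream(value, length, rng):
    """Encode value in [-1, 1] as bits with P(bit = 1) = (value + 1) / 2."""
    p_one = (value + 1.0) / 2.0
    return [1 if rng.random() < p_one else 0 for _ in range(length)]

def bipolar_decode(bits):
    """Recover the encoded value from the fraction of ones in the stream."""
    return 2.0 * sum(bits) / len(bits) - 1.0

def xnor_multiply(a_bits, b_bits):
    """Bitwise XNOR of two independent bipolar streams encodes their product."""
    return [1 - (a ^ b) for a, b in zip(a_bits, b_bits)]

def ratio_decode(numerator_bits, denominator_bits):
    """Interpret a pair of bipolar streams as numerator / denominator."""
    return bipolar_decode(numerator_bits) / bipolar_decode(denominator_bits)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
N = 100_000              # assumed stream length

a = bipolar_stream(0.5, N, rng)
b = bipolar_stream(-0.4, N, rng)
product = bipolar_decode(xnor_multiply(a, b))   # estimate of 0.5 * -0.4

num = bipolar_stream(0.75, N, rng)
den = bipolar_stream(0.5, N, rng)
extended = ratio_decode(num, den)               # estimate of 0.75 / 0.5, outside [-1, 1]
print(product, extended)
```

The estimates are statistical: their error shrinks roughly with the square root of the stream length, and a ratio becomes noisy when the denominator stream encodes a value near zero.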
8
Kyrkou C, Bouganis CS, Theocharides T, Polycarpou MM. Embedded Hardware-Efficient Real-Time Classification With Cascade Support Vector Machines. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2016; 27:99-112. [PMID: 26011869 DOI: 10.1109/tnnls.2015.2428738] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 06/04/2023]
Abstract
Cascade support vector machines (SVMs) are optimized to efficiently handle problems, where the majority of the data belong to one of the two classes, such as image object classification, and hence can provide speedups over monolithic (single) SVM classifiers. However, SVM classification is a computationally demanding task and existing hardware architectures for SVMs only consider monolithic classifiers. This paper proposes the acceleration of cascade SVMs through a hybrid processing hardware architecture optimized for the cascade SVM classification flow, accompanied by a method to reduce the required hardware resources for its implementation, and a method to improve the classification speed utilizing cascade information to further discard data samples. The proposed SVM cascade architecture is implemented on a Spartan-6 field-programmable gate array (FPGA) platform and evaluated for object detection on 800×600 (Super Video Graphics Array) resolution images. The proposed architecture, boosted by a neural network that processes cascade information, achieves a real-time processing rate of 40 frames/s for the benchmark face detection application. Furthermore, the hardware-reduction method results in the utilization of 25% less FPGA custom-logic resources and 20% peak power reduction compared with a baseline implementation.
9
Source selection for real-time user intent recognition toward volitional control of artificial legs. IEEE J Biomed Health Inform 2015; 17:907-14. [PMID: 25055369 DOI: 10.1109/jbhi.2012.2236563] [Citation(s) in RCA: 68] [Impact Index Per Article: 7.6] [Indexed: 11/08/2022]
Abstract
Various types of data sources have been used to recognize user intent for volitional control of powered artificial legs. However, there is still a debate on exactly which data sources are necessary for accurately and responsively recognizing the user's intended tasks. Motivated by this question of wide interest, in this study we aimed to 1) investigate the usefulness of different data sources commonly suggested for user intent recognition and 2) determine an informative set of data sources for volitional control of prosthetic legs. The studied data sources included eight surface electromyography (EMG) signals from the residual thigh muscles of transfemoral (TF) amputees, ground reaction forces/moments from a prosthetic pylon, and kinematic measurements from the residual thigh and prosthetic knee. We then ranked and included data sources based on their usefulness for user intent recognition and, using three source selection algorithms, selected a reduced number of data sources that ensured accurate recognition of the user's intended task. The results showed that EMG signals and ground reaction forces/moments were more informative than prosthesis kinematics. Nine to eleven of all the initial data sources were sufficient to maintain 95% accuracy for recognizing the studied seven tasks without missing additional task transitions in real time. The selected data sources produced consistent system performance across two experimental days for four recruited TF amputee subjects, indicating the potential robustness of the selected data sources. Finally, based on the study results, we suggested a protocol for determining the informative data sources and sensor configurations for future development of volitional control of powered artificial legs.
10
Groleat T, Arzel M, Vaton S. Stretching the Edges of SVM Traffic Classification With FPGA Acceleration. IEEE TRANSACTIONS ON NETWORK AND SERVICE MANAGEMENT 2014. [DOI: 10.1109/tnsm.2014.2346075] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 11/05/2022]
11
12
Perez-Ilzarbe MJ. New discrete-time recurrent neural network proposal for quadratic optimization with general linear constraints. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2013; 24:322-328. [PMID: 24808285 DOI: 10.1109/tnnls.2012.2223484] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 06/03/2023]
Abstract
In this brief, the quadratic problem with general linear constraints is reformulated using the Wolfe dual theory, and a very simple discrete-time recurrent neural network is proved to be able to solve it. Conditions that guarantee global convergence of this network to the constrained minimum are developed. The computational complexity of the method is analyzed, and experimental work is presented that shows its high efficiency.
13
Hussain HM, Benkrid K, Seker H. Reconfiguration-based implementation of SVM classifier on FPGA for Classifying Microarray data. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2013; 2013:3058-3061. [PMID: 24110373 DOI: 10.1109/embc.2013.6610186] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 06/02/2023]
Abstract
Classifying Microarray data, which are high-dimensional in nature, requires high computational power. The Support Vector Machine (SVM) classifier is among the most common and successful classifiers used in the analysis of Microarray data, but it also requires high computational power due to its complex mathematical architecture. Implementing SVM on hardware exploits the parallelism available within the algorithm kernels to accelerate the classification of Microarray data. In this work, a flexible, dynamically and partially reconfigurable implementation of the SVM classifier on a Field Programmable Gate Array (FPGA) is presented. The SVM architecture achieved up to 85× speed-up over an equivalent general-purpose processor (GPP), showing the capability of FPGAs in enhancing the performance of SVM-based analysis of Microarray data as well as future bioinformatics applications.
14
Papadonikolakis M, Bouganis CS. Novel cascade FPGA accelerator for support vector machines classification. IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS 2012; 23:1040-1052. [PMID: 24807131 DOI: 10.1109/tnnls.2012.2196446] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 06/03/2023]
Abstract
Support vector machines (SVMs) are a powerful machine learning tool, providing state-of-the-art accuracy to many classification problems. However, SVM classification is a computationally complex task, suffering from linear dependencies on the number of support vectors and the problem's dimensionality. This paper presents a fully scalable field programmable gate array (FPGA) architecture for the acceleration of SVM classification, which exploits the device heterogeneity and the dynamic range diversities among the dataset attributes. An adaptive and fully customized processing unit is proposed, which utilizes the available heterogeneous resources of a modern FPGA device in an efficient way with respect to the problem's characteristics. The implementation results demonstrate the efficiency of the heterogeneous architecture, presenting a speed-up factor of 2-3 orders of magnitude compared to the CPU implementation. The proposed architecture outperforms other proposed FPGA and graphic processor unit approaches by more than seven times. Furthermore, based on the special properties of the heterogeneous architecture, this paper introduces the first FPGA-oriented cascade SVM classifier scheme, which exploits the FPGA reconfigurability and intensifies the custom-arithmetic properties of the heterogeneous architecture. The results show that the proposed cascade scheme is able to increase the heterogeneous classifier throughput even further, without introducing any penalty on the resource utilization.
15
Manikandan J, Venkataramani B. System-on-programmable-chip implementation of diminishing learning based pattern recognition system. INT J MACH LEARN CYB 2012. [DOI: 10.1007/s13042-012-0102-z] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
16
Basham EJ, Parent DW. Compact digital implementation of a quadratic integrate-and-fire neuron. ANNUAL INTERNATIONAL CONFERENCE OF THE IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. IEEE ENGINEERING IN MEDICINE AND BIOLOGY SOCIETY. ANNUAL INTERNATIONAL CONFERENCE 2012; 2012:3543-3548. [PMID: 23366692 DOI: 10.1109/embc.2012.6346731] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 06/01/2023]
Abstract
A compact fixed-point digital implementation of a quadratic integrate-and-fire (QIF) neural model was developed. Equations were derived to determine the minimum number of bits the digital QIF model requires to represent all four states of the QIF model and control the switching threshold of the output voltage. In addition, the equations were used to minimize the size of the multiplier used for the nonlinear squaring function, V^2. These design equations were used to develop test vectors that could unambiguously show all four states of a digital QIF model. The FPGA implementation of the QIF model was shown to be computationally efficient, requiring only two fixed-point adders and one fixed-point multiplier.
Affiliation(s)
- Eric J Basham
- Electrical Engineering Department, San Jose State University, One Washington Square, San Jose, CA 95192, USA.
17
Mahmoodi D, Soleimani A, Khosravi H, Taghizadeh M. FPGA Simulation of Linear and Nonlinear Support Vector Machine. 2011. [DOI: 10.4236/jsea.2011.45036] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Indexed: 11/20/2022]
18
Zhang Y, Ma W, Li XD, Tan HZ, Chen K. MATLAB Simulink modeling and simulation of LVI-based primal–dual neural network for solving linear and quadratic programs. Neurocomputing 2009. [DOI: 10.1016/j.neucom.2008.07.008] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 10/21/2022]
19
FPGA Implementation of Support Vector Machines for 3D Object Identification. ARTIFICIAL NEURAL NETWORKS – ICANN 2009 2009. [DOI: 10.1007/978-3-642-04274-4_49] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 12/31/2022]
20
Anguita D, Ghio A, Pischiutta S, Ridella S. A support vector machine with integer parameters. Neurocomputing 2008. [DOI: 10.1016/j.neucom.2007.12.006] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Indexed: 11/28/2022]
21
Lamela H, Ruiz-Llata M. Image identification system based on an optical broadcast neural network and a pulse coupled neural network preprocessor stage. APPLIED OPTICS 2008; 47:B52-B63. [PMID: 18382551 DOI: 10.1364/ao.47.000b52] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/26/2023]
Abstract
We describe the concept of a vision system based on an optoelectronic hardware neural processor. The proposed system is composed of a pulse coupled neural network (PCNN) preprocessor stage that converts an input image into a temporal pulsed pattern. These pulses are inputs to the optical broadcast neural network (OBNN) processor, which classifies the input pattern between a set of reference patterns based on a pattern matching strategy. The PCNN is to provide immunity to the scale, rotation, and translation of objects in the image. The OBNN provides high parallelism and a high speed hardware neural processor.
Affiliation(s)
- Horacio Lamela
- Grupo de Optoelectrónica y Tecnología Láser, Universidad Carlos III de Madrid, Madrid, Spain.
22
23
Anguita D, Ghio A, Pischiutta S, Ridella S. A Hardware-friendly Support Vector Machine for Embedded Automotive Applications. 2007. [DOI: 10.1109/ijcnn.2007.4371156] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 11/07/2022]
24
Duren RW, Marks RJ, Reynolds PD, Trumbo ML. Real-time neural network inversion on the SRC-6e reconfigurable computer. IEEE TRANSACTIONS ON NEURAL NETWORKS 2007; 18:889-901. [PMID: 17526353 DOI: 10.1109/tnn.2007.891679] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/15/2023]
Abstract
Implementation of real-time neural network inversion on the SRC-6e, a computer that uses multiple field-programmable gate arrays (FPGAs) as reconfigurable computing elements, is examined using a sonar application as a specific case study. A feedforward multilayer perceptron neural network is used to estimate the performance of the sonar system (Jung et al., 2001). A particle swarm algorithm uses the trained network to perform a search for the control parameters required to optimize the output performance of the sonar system in the presence of imposed environmental constraints (Fox et al., 2002). The particle swarm optimization (PSO) requires repetitive queries of the neural network. Alternatives for implementing neural networks and particle swarm algorithms in reconfigurable hardware are contrasted. The final implementation provides nearly two orders of magnitude of speed increase over a state-of-the-art personal computer (PC), providing a real-time solution.
Affiliation(s)
- Russell W Duren
- Department of Electrical and Computer Engineering, Baylor University, Waco, TX 76798, USA.
25
Massively distributed digital implementation of an integrate-and-fire LEGION network for visual scene segmentation. Neurocomputing 2007. [DOI: 10.1016/j.neucom.2006.11.009] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Indexed: 11/17/2022]
26
Savich AW, Moussa M, Areibi S. The Impact of Arithmetic Representation on Implementing MLP-BP on FPGAs: A Study. IEEE TRANSACTIONS ON NEURAL NETWORKS 2007; 18:240-52. [PMID: 17278475 DOI: 10.1109/tnn.2006.883002] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
In this paper, arithmetic representations for implementing multilayer perceptron neural networks trained using the error backpropagation algorithm (MLP-BP) on field-programmable gate arrays (FPGAs) are examined in detail. Both floating-point (FLP) and fixed-point (FXP) formats are studied, and the effects of representation precision and FPGA area requirements are considered. A generic very high-speed integrated circuit hardware description language (VHDL) program was developed to help experiment with a large number of formats and designs. The results show that an MLP-BP network uses fewer clock cycles and consumes less real estate when compiled in an FXP format, compared with a larger and slower compilation in an FLP format with a similar data representation width, in bits, or a similar precision and range.
Affiliation(s)
- Antony W Savich
- School of Engineering, University of Guelph, Guelph, ON N1G 2W1, Canada
27
Anguita D, Pischiutta S, Ridella S, Sterpi D. Feed-forward support vector machine without multipliers. IEEE TRANSACTIONS ON NEURAL NETWORKS 2006; 17:1328-31. [PMID: 17001991 DOI: 10.1109/tnn.2006.877537] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
Abstract
In this letter, we propose a coordinate rotation digital computer (CORDIC)-like algorithm for computing the feed-forward phase of a support vector machine (SVM) in fixed-point arithmetic, using only shift and add operations and avoiding resource-consuming multiplications. This result is obtained thanks to a hardware-friendly kernel, which greatly simplifies the SVM feed-forward phase computation and, at the same time, maintains good classification performance with respect to the conventional Gaussian kernel.
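To show what "only shift and add operations" means in fixed-point arithmetic, the Python sketch below multiplies two fixed-point numbers without using the multiplication operator in the core routine. It is a generic illustration, not the CORDIC-like algorithm or the hardware-friendly kernel proposed in the letter. The Q-format with 8 fractional bits and all function names are assumptions.

```python
FRACTION_BITS = 8  # assumed fixed-point format: 8 fractional bits

def to_fixed(x: float) -> int:
    """Convert a real number to its fixed-point integer representation."""
    return round(x * (1 << FRACTION_BITS))

def to_float(q: int) -> float:
    """Convert a fixed-point integer back to a real number."""
    return q / (1 << FRACTION_BITS)

def shift_add_multiply(a: int, b: int) -> int:
    """Multiply two fixed-point values using only shifts, adds, and sign handling."""
    negative = (a < 0) != (b < 0)
    a, b = abs(a), abs(b)
    acc = 0
    while b:
        if b & 1:        # add the shifted multiplicand for each set bit of b
            acc += a
        a <<= 1          # next power-of-two weight
        b >>= 1
    acc >>= FRACTION_BITS  # rescale: the raw product carries 2 * FRACTION_BITS
    return -acc if negative else acc

product = to_float(shift_add_multiply(to_fixed(1.5), to_fixed(2.25)))
signed_product = to_float(shift_add_multiply(to_fixed(-0.5), to_fixed(3.0)))
print(product, signed_product)
```

The loop performs one conditional add per bit of the multiplier, which is why designs that need multiplications only by powers of two, where a single shift suffices, are much cheaper in hardware.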
28
Ferreira LV, Kaszkurewicz E, Bhaya A. Support vector classifiers via gradient systems with discontinuous righthand sides. Neural Netw 2006; 19:1612-23. [PMID: 17011165 DOI: 10.1016/j.neunet.2006.07.004] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 11/04/2004] [Accepted: 07/07/2006] [Indexed: 11/16/2022]
Abstract
Gradient dynamical systems with discontinuous righthand sides are designed using Persidskii-type nonsmooth Lyapunov functions to work as support vector machines (SVMs) for the discrimination of nonseparable classes. The gradient systems are obtained from an exact penalty method applied to the constrained quadratic optimization problems, which are formulations of two well known SVMs. Global convergence of the trajectories of the gradient dynamical systems to the solution of the corresponding constrained problems is shown to be independent of the penalty parameters and of the parameters of the SVMs. The proposed gradient systems can be implemented as simple analog circuits as well as using standard software for integration of ODEs, and in order to use efficient integration methods with adaptive stepsize selection, the discontinuous terms are smoothed around a neighborhood of the discontinuity surface by means of the boundary layer technique. The scalability of the proposed gradient systems is also shown by means of an implementation using parallel computers, resulting in smaller processing times when compared with traditional SVM packages.
Affiliation(s)
- Leonardo V Ferreira
- Department of Electrical Engineering, NACAD-COPPE/Federal University of Rio de Janeiro, Rio de Janeiro, RJ, Brazil.
29
Iakovidis DK, Maroulis DE, Karkanis SA. An intelligent system for automatic detection of gastrointestinal adenomas in video endoscopy. Comput Biol Med 2005; 36:1084-103. [PMID: 16293240 DOI: 10.1016/j.compbiomed.2005.09.008] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.5] [Indexed: 12/15/2022]
Abstract
Today 95% of all gastrointestinal carcinomas are believed to arise from adenomas. The early detection of adenomas could prevent their evolution to cancer. A novel system for the support of the detection of adenomas in gastrointestinal video endoscopy is presented. Unlike other systems, it accepts standard low-resolution video input, thus requiring fewer computational resources and facilitating both portability and potential use in telemedicine applications. It combines intelligent processing techniques of SVMs and color-texture analysis methodologies into a sound pattern recognition framework. The system's accuracy was measured using ROC analysis and found to exceed 94%.
Affiliation(s)
- Dimitris K Iakovidis
- Department of Informatics and Telecommunications, University of Athens, Panepistimiopolis, Illisia, 15784 Athens, Greece.
|
30
|
Maeda Y, Wakamura M. Simultaneous Perturbation Learning Rule for Recurrent Neural Networks and Its FPGA Implementation. IEEE Trans Neural Netw 2005; 16:1664-72. [PMID: 16342505 DOI: 10.1109/tnn.2005.852237] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.2] [Indexed: 11/06/2022]
Abstract
Recurrent neural networks have interesting properties and can handle dynamic information processing unlike ordinary feedforward neural networks. However, they are generally difficult to use because there is no convenient learning scheme. In this paper, a recursive learning scheme for recurrent neural networks using the simultaneous perturbation method is described. The detailed procedure of the scheme for recurrent neural networks is explained. Unlike ordinary correlation learning, this method is applicable to analog learning and the learning of oscillatory solutions of recurrent neural networks. Moreover, as a typical example of recurrent neural networks, we consider the hardware implementation of Hopfield neural networks using a field-programmable gate array (FPGA). The details of the implementation are described. Two examples of a Hopfield neural network system for analog and oscillatory targets are shown. These results show that the learning scheme proposed here is feasible.
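As a rough illustration of the simultaneous perturbation idea, the sketch below estimates a gradient from only two loss evaluations by perturbing every parameter at once with random signs. It is a generic form of the estimator applied to a hypothetical quadratic loss, not the paper's recurrent-network procedure or its FPGA design; the function names, the step size `lr`, and the perturbation size `c` are assumptions chosen for the example.

```python
import random

def sp_gradient(loss, w, c, rng):
    # Draw one random sign (+1 or -1) per parameter.
    s = [rng.choice((-1.0, 1.0)) for _ in w]
    # Perturb all parameters simultaneously; only two loss evaluations are needed.
    w_pert = [wi + c * si for wi, si in zip(w, s)]
    diff = loss(w_pert) - loss(w)
    # Per-parameter estimate diff / (c * s_i); for +/-1 signs, 1/s_i equals s_i.
    return [diff / c * si for si in s]

def sp_train(loss, w, steps=2000, lr=0.05, c=0.01, seed=0):
    # Plain descent using the simultaneous perturbation estimate at each step.
    rng = random.Random(seed)
    w = list(w)
    for _ in range(steps):
        g = sp_gradient(loss, w, c, rng)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

def quadratic(w):
    # Hypothetical test loss with its minimum at (1, -2).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

w_final = sp_train(quadratic, [0.0, 0.0])
```

Because the estimator needs only loss values and no explicit derivative, the cost per update does not grow with the number of parameters, which is one reason this family of rules is often considered for hardware implementations.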
Affiliation(s)
- Yutaka Maeda
- Department of Electrical Engineering and Computer Science, Faculty of Engineering, Kansai University, Osaka 564-8680, Japan.
|
31
|
Xie N, Leung H. Blind equalization using a predictive radial basis function neural network. IEEE Trans Neural Netw 2005; 16:709-20. [PMID: 15940998 DOI: 10.1109/tnn.2005.845145] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.6] [Indexed: 11/09/2022]
Abstract
In this paper, we propose a novel blind equalization approach based on radial basis function (RBF) neural networks. By exploiting the short-term predictability of the system input, an RBF neural net is used to predict the inverse filter output. It is shown here that when the prediction error of the RBF neural net is minimized, the coefficients of the inverse system are identical to those of the unknown system. To enhance the identification performance in noisy environments, the improved least square (ILS) method, based on the concept of orthogonal distance to reduce the estimation bias caused by additive measurement noise, is proposed here to perform the training. The convergence rate of the ILS learning is analyzed, and the asymptotic mean square error (MSE) of the proposed predictive RBF identification method is derived theoretically. Monte Carlo simulations show that the proposed method is effective for blind system identification. The new blind technique is then applied to two practical applications: equalization of real-life radar sea clutter collected at the east coast of Canada and deconvolution of real speech signals. In both cases, the proposed blind equalization technique is found to perform satisfactorily even when the channel effects and measurement noise are strong.
Affiliation(s)
- Nan Xie
- Department of Electrical and Computer Engineering, University of Calgary, Calgary, AB T2N 1N4, Canada.
|
32
|
Ferrer MA, Alonso JB, Travieso CM. Offline geometric parameters for automatic signature verification using fixed-point arithmetic. IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE 2005; 27:993-7. [PMID: 15943430 DOI: 10.1109/tpami.2005.125] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 05/02/2023]
Abstract
This paper presents a set of geometric signature features for offline automatic signature verification based on the description of the signature envelope and the interior stroke distribution in polar and Cartesian coordinates. The features have been calculated using 16-bit fixed-point arithmetic and tested with different classifiers, such as hidden Markov models, support vector machines, and a Euclidean distance classifier. The experiments have shown promising results in the task of discriminating random and simple forgeries.
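To make the notion of 16-bit fixed-point computation concrete, the sketch below computes a squared Euclidean distance using signed 16-bit values with saturation. The abstract does not state how the 16 bits are split, so the 8 integer / 8 fractional bit layout (`FRAC_BITS = 8`), the saturation policy, and all function names here are assumptions for illustration only, not the paper's feature extraction.

```python
FRAC_BITS = 8                      # assumed split: 8 integer bits, 8 fractional bits
SCALE = 1 << FRAC_BITS
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1

def saturate(v):
    # Clamp to the signed 16-bit range instead of wrapping around.
    return max(INT16_MIN, min(INT16_MAX, v))

def to_fixed(x):
    # Convert a float to the nearest representable fixed-point value.
    return saturate(round(x * SCALE))

def to_float(q):
    return q / SCALE

def fx_mul(a, b):
    # The raw product carries 2 * FRAC_BITS fractional bits; shift back down.
    # Python's >> floors toward negative infinity, like an arithmetic shift.
    return saturate((a * b) >> FRAC_BITS)

def fx_sq_dist(p, q):
    # Squared Euclidean distance accumulated entirely in fixed point.
    acc = 0
    for a, b in zip(p, q):
        d = saturate(a - b)
        acc = saturate(acc + fx_mul(d, d))
    return acc

p = [to_fixed(v) for v in (1.5, -0.25)]
q = [to_fixed(v) for v in (0.5, 0.75)]
d2 = to_float(fx_sq_dist(p, q))
```

A nearest-prototype classifier can compare such squared distances directly, since skipping the square root does not change which prototype is closest.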
Affiliation(s)
- Miguel A Ferrer
- Departamento de Señales y Comunicaciones, Universidad de Las Palmas de Gran Canaria, Campus de Tafira, E35017 Las Palmas de Gran Canaria, Spain.
|