1. Determination of minimum inhibitory concentrations using machine-learning-assisted agar dilution. Microbiol Spectr 2024; 12:e0420923. [PMID: 38517194; PMCID: PMC11064640; DOI: 10.1128/spectrum.04209-23]
Abstract
Effective policy to address the global threat of antimicrobial resistance requires robust antimicrobial susceptibility data. Traditional methods for measuring minimum inhibitory concentration (MIC) are resource intensive, subject to human error, and require considerable infrastructure. AIgarMIC streamlines and standardizes MIC measurement and is especially valuable for large-scale surveillance activities. MICs were measured using agar dilution for n = 10 antibiotics against clinical Enterobacterales isolates (n = 1,086) obtained from a large tertiary hospital microbiology laboratory. Escherichia coli (n = 827, 76%) was the most common organism. Photographs of agar plates were divided into smaller images covering one inoculation site. A labeled data set of colony images was created and used to train a convolutional neural network to classify images based on whether a bacterial colony was present (first-step model). If growth was present, a second-step model determined whether colony morphology suggested antimicrobial growth inhibition. The ability of the AI to determine MIC was then compared with standard visual determination. The first-step model classified bacterial growth as present/absent with 94.3% accuracy. The second-step model classified colonies as "inhibited" or "good growth" with 88.6% accuracy. For the determination of MIC, the rate of essential agreement was 98.9% (644/651), with a bias of -7.8%, compared with manual annotation. AIgarMIC uses artificial intelligence to automate endpoint assessments for agar dilution and potentially increases throughput without bespoke equipment. AIgarMIC reduces laboratory barriers to generating high-quality MIC data that can be used for large-scale surveillance programs. IMPORTANCE This research uses modern artificial intelligence and machine-learning approaches to standardize and automate the interpretation of agar dilution minimum inhibitory concentration testing. 
Artificial intelligence is currently of significant topical interest to researchers and clinicians. In our manuscript, we demonstrate a use-case in the microbiology laboratory and present validation data for the model's performance against manual interpretation.
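The essential agreement reported for AIgarMIC means the model's MIC falls within one two-fold dilution of the reference value. A minimal sketch of that metric, with an illustrative function name and made-up MIC values (not data from the study):

```python
def essential_agreement_rate(test_mics, reference_mics):
    """Fraction of isolates whose test MIC is within +/- one
    two-fold dilution of the reference MIC (the essential-agreement
    criterion commonly used to validate MIC methods)."""
    in_agreement = 0
    for test, ref in zip(test_mics, reference_mics):
        # One doubling dilution each way: ref/2 <= test <= ref*2
        if ref / 2 <= test <= ref * 2:
            in_agreement += 1
    return in_agreement / len(test_mics)

# Illustrative MICs in mg/L (invented, not data from the study)
test = [0.5, 1.0, 4.0, 8.0]
ref = [0.5, 2.0, 1.0, 8.0]
print(essential_agreement_rate(test, ref))  # 3 of 4 within one dilution -> 0.75
```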
2. Intrinsic Defect-Driven Synergistic Synaptic Heterostructures for Gate-Free Neuromorphic Phototransistors. Adv Mater 2024; 36:e2309940. [PMID: 38373410; DOI: 10.1002/adma.202309940]
Abstract
Optoelectronic synaptic devices based on two-dimensional (2D) materials promise future neuromorphic visual systems with dramatically improved integration density and power efficiency. Effective charge capture and retention are considered a vital prerequisite for realizing the synaptic memory function. However, current 2D synaptic devices predominantly rely on materials with artificially engineered defects or on intricate gate-controlled architectures to realize the charge-trapping process. These approaches, unfortunately, suffer from degradation of the pristine materials, rapid device failure, and unnecessary complication of the device structure. To address these challenges, an innovative gate-free heterostructure paradigm is introduced herein. The heterostructure presents a distinctive dome-like morphology wherein a defect-rich Fe7S8 core is enveloped snugly by a curved MoS2 dome shell (Fe7S8@MoS2), allowing effective photocarrier trapping through the intrinsic defects in the adjacent Fe7S8 core. The resultant neuromorphic devices exhibit remarkable light-tunable synaptic behaviors, with memory times up to ≈800 s under a single optical pulse, demonstrating great promise for simulating a visual recognition system with significantly improved image-recognition efficiency. The emergence of such heterostructures charts a promising trajectory for future synaptic devices, catalyzing the realization of high-efficiency and intricate visual processing applications.
3. Reduced Retinal Vascular Density and Skeleton Length in Amblyopia. Transl Vis Sci Technol 2024; 13:21. [PMID: 38780954; PMCID: PMC11127489; DOI: 10.1167/tvst.13.5.21]
Abstract
Purpose This study aimed to investigate the possible relationship between retinal vascular abnormalities and amblyopia by analyzing vascular structures of fundus images. Methods In this observational study, retinal fundus images were collected from 36 patients with unilateral amblyopia, 33 patients with bilateral amblyopia, and 36 healthy control volunteers. We developed a customized training algorithm based on U-Net to digitalize the vasculature in the fundus images and quantify vascular density (area and fractal dimension), skeleton length, and number of bifurcation points. For statistical comparisons, this study divided participants into two groups. The amblyopic eyes and the fellow eyes of patients with unilateral amblyopia formed the paired group, while bilateral amblyopic patients and healthy controls formed the independent group. Results In the paired group, the vascular area (P = 0.007), vascular fractal dimension (P = 0.007), and vascular skeleton length (P = 0.002) of the amblyopic eyes were significantly smaller than those of the fellow eyes. In the independent group, significant decreases in the vascular fractal dimension (P = 0.006) and skeleton length (P = 0.048) were observed in bilateral amblyopia compared with controls. The vascular area was also significantly correlated with best-corrected visual acuity in amblyopic eyes. Conclusions This study demonstrated that retinal vascular density and skeleton length in amblyopic eyes were significantly smaller than in controls, indicating an association between changes in retinal vascular features and the state of amblyopia. Translational Relevance Our algorithm presents amblyopic retinal vascular changes that are more biologically interpretable for both clinicians and researchers.
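Vascular area and fractal dimension of the kind quantified here can be computed from a binary vessel mask; a minimal sketch using box counting, assuming a NumPy boolean mask (the study's U-Net-based pipeline is not reproduced, and the function names are illustrative):

```python
import numpy as np

def vascular_area_fraction(mask):
    """Fraction of image pixels labeled as vessel in a binary mask."""
    return float(mask.mean())

def box_counting_dimension(mask, box_sizes=(2, 4, 8, 16)):
    """Estimate the fractal dimension of a binary mask by box counting:
    slope of log(number of occupied boxes) versus log(1 / box size)."""
    counts = []
    for s in box_sizes:
        h, w = mask.shape
        trimmed = mask[:h - h % s, :w - w % s]  # grid must divide evenly
        th, tw = trimmed.shape
        blocks = trimmed.reshape(th // s, s, tw // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())  # occupied boxes
    sizes = np.array(box_sizes, dtype=float)
    slope, _ = np.polyfit(np.log(1.0 / sizes), np.log(counts), 1)
    return slope

# A filled square is 2-dimensional; a real vessel tree would fall
# somewhere between 1 and 2
mask = np.ones((64, 64), dtype=bool)
print(vascular_area_fraction(mask), box_counting_dimension(mask))
```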
4. Differentiating Epileptic and Psychogenic Non-Epileptic Seizures Using Machine Learning Analysis of EEG Plot Images. Sensors (Basel) 2024; 24:2823. [PMID: 38732929; PMCID: PMC11086151; DOI: 10.3390/s24092823]
Abstract
The treatment of epilepsy, the second most common chronic neurological disorder, is often complicated by the failure of patients to respond to medication. Treatment failure with anti-seizure medications is often due to the presence of non-epileptic seizures. Distinguishing non-epileptic from epileptic seizures requires an expensive and time-consuming analysis of electroencephalograms (EEGs) recorded in an epilepsy monitoring unit. Machine learning algorithms have been used to detect seizures from EEG, typically using EEG waveform analysis. We employed an alternative approach, using a convolutional neural network (CNN) with transfer learning using MobileNetV2 to emulate the real-world visual analysis of EEG images by epileptologists. A total of 5359 EEG waveform plot images from 107 adult subjects across two epilepsy monitoring units in separate medical facilities were divided into epileptic and non-epileptic groups for training and cross-validation of the CNN. The model achieved an accuracy of 86.9% (Area Under the Curve, AUC 0.92) at the site where training data were extracted and an accuracy of 87.3% (AUC 0.94) at the other site whose data were only used for validation. This investigation demonstrates the high accuracy achievable with CNN analysis of EEG plot images and the robustness of this approach across EEG visualization software, laying the groundwork for further subclassification of seizures using similar approaches in a clinical setting.
5. The meta-learning method for the ensemble model based on situational meta-task. Front Neurorobot 2024; 18:1391247. [PMID: 38736985; PMCID: PMC11082275; DOI: 10.3389/fnbot.2024.1391247]
Abstract
Introduction Meta-learning methods have been widely used to solve the problem of few-shot learning. Generally, meta-learners are trained on a variety of tasks and then generalized to novel tasks. Methods However, existing meta-learning methods do not consider the relationship between meta-tasks and novel tasks during the meta-training period, so the meta-learner's initial models provide little useful meta-knowledge for the novel tasks. This leads to weak generalization on novel tasks. Meanwhile, different initial models contain different meta-knowledge, which leads to differences in how well novel tasks are learned during the meta-testing period. Therefore, this article puts forward a meta-optimization method based on situational meta-task construction and the cooperation of multiple initial models. First, during the meta-training period, a method of constructing situational meta-tasks is proposed, in which the selected candidate task sets provide more effective meta-knowledge for novel tasks. Then, during the meta-testing period, an ensemble-model method based on meta-optimization is proposed to minimize the loss of inter-model cooperation in prediction, so that multiple cooperating models can learn novel tasks. Results These methods were applied to popular few-shot character and image recognition datasets, and the experimental results indicate that the proposed method performs well on few-shot classification tasks. Discussion In future work, we will extend our methods to provide more generalized and useful meta-knowledge to the model during the meta-training period when the novel few-shot tasks are completely unseen.
6. Recognition of 3D Images by Fusing Fractional-Order Chebyshev Moments and Deep Neural Networks. Sensors (Basel) 2024; 24:2352. [PMID: 38610564; PMCID: PMC11014064; DOI: 10.3390/s24072352]
Abstract
To achieve efficient recognition of 3D images and reduce the complexity of network parameters, we proposed a novel 3D image recognition method combining deep neural networks with fractional-order Chebyshev moments. Firstly, the fractional-order Chebyshev moment (FrCM) unit, consisting of Chebyshev moments and the three-term recurrence relation method, is calculated separately using successive integrals. Next, moment invariants based on fractional order and Chebyshev moments are utilized to achieve invariance to image scaling, rotation, and translation, a design aimed at enhancing computational efficiency. Finally, the fused network embedding the FrCM unit (FrCMs-DNNs) extracts depth features, and its effectiveness is analyzed in terms of parameter quantity, computing resources, and identification capability. The Princeton Shape Benchmark dataset and a medical image dataset are used for experimental validation. Compared with other deep neural networks, FrCMs-DNNs has the highest accuracy in image recognition and classification. We used two evaluation indices, mean square error (MSE) and peak signal-to-noise ratio (PSNR), to measure the reconstruction quality of FrCMs after 3D image reconstruction. The accuracy of the FrCMs-DNNs model in 3D object recognition was assessed through an ablation experiment, considering the four evaluation indices of accuracy, precision, recall, and F1-score.
7. Advancing maxillofacial prosthodontics by using pre-trained convolutional neural networks: Image-based classification of the maxilla. J Prosthodont 2024. [PMID: 38566564; DOI: 10.1111/jopr.13853]
Abstract
PURPOSE The study aimed to compare the performance of four pre-trained convolutional neural networks in recognizing seven distinct prosthodontic scenarios involving the maxilla, as a preliminary step toward developing an artificial intelligence (AI)-powered prosthesis design system. MATERIALS AND METHODS Seven classes were considered for recognition: cleft palate, dentulous maxillectomy, edentulous maxillectomy, reconstructed maxillectomy, completely dentulous, partially edentulous, and completely edentulous. Utilizing transfer learning and fine-tuned hyperparameters, four AI models (VGG16, Inception-ResNet-V2, DenseNet-201, and Xception) were employed. The dataset, consisting of 3541 preprocessed intraoral occlusal images, was divided into training, validation, and test sets. Model performance metrics encompassed accuracy, precision, recall, F1 score, area under the receiver operating characteristic curve (AUC), and the confusion matrix. RESULTS VGG16, Inception-ResNet-V2, DenseNet-201, and Xception demonstrated comparable performance, with maximum test accuracies of 0.92, 0.90, 0.94, and 0.95, respectively. Xception and DenseNet-201 slightly outperformed the other models, particularly Inception-ResNet-V2. Precision, recall, and F1 scores exceeded 90% for most classes in Xception and DenseNet-201, and average AUC values for all models ranged between 0.98 and 1.00. CONCLUSIONS While DenseNet-201 and Xception demonstrated superior performance, all models consistently achieved diagnostic accuracy exceeding 90%, highlighting their potential in dental image analysis. This AI application could help assign work based on difficulty level and enable an automated diagnosis system at patient admission. It also facilitates prosthesis design by integrating the necessary prosthesis morphology, oral function, and treatment difficulty. Furthermore, it tackles dataset-size challenges in model optimization, providing valuable insights for future research.
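The per-class precision, recall, and F1 reported above all derive from the confusion matrix; a minimal sketch with an invented 3-class matrix (the study used seven classes, and the values here are illustrative only):

```python
import numpy as np

def per_class_metrics(cm):
    """Precision, recall, and F1 for each class, given a confusion
    matrix cm where cm[i, j] = count of true class i predicted as j."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)                 # correct predictions per class
    precision = tp / cm.sum(axis=0)  # column sums = predicted counts
    recall = tp / cm.sum(axis=1)     # row sums = true counts
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Invented 3-class confusion matrix for illustration
cm = [[8, 1, 1],
      [0, 9, 1],
      [2, 0, 8]]
p, r, f = per_class_metrics(cm)
print(p, r, f)
```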
8. A Bidirectional Feedforward Neural Network Architecture Using the Discretized Neural Memory Ordinary Differential Equation. Int J Neural Syst 2024; 34:2450015. [PMID: 38318709; DOI: 10.1142/s0129065724500151]
Abstract
Deep Feedforward Neural Networks (FNNs) with skip connections have revolutionized various image recognition tasks. In this paper, we propose a novel architecture called bidirectional FNN (BiFNN), which utilizes skip connections to aggregate features between its forward and backward paths. The BiFNN accepts any FNN as a plugin that can incorporate any general FNN model into its forward path, introducing only a few additional parameters in the cross-path connections. The backward path is implemented as a nonparameter layer, utilizing a discretized form of the neural memory Ordinary Differential Equation (nmODE), which is named [Formula: see text]-net. We provide a proof of convergence for the [Formula: see text]-net and evaluate its initial value problem. Our proposed architecture is evaluated on diverse image recognition datasets, including Fashion-MNIST, SVHN, CIFAR-10, CIFAR-100, and Tiny-ImageNet. The results demonstrate that BiFNNs offer significant improvements compared to embedded models such as ConvMixer, ResNet, ResNeXt, and Vision Transformer. Furthermore, BiFNNs can be fine-tuned to achieve comparable performance with embedded models on Tiny-ImageNet and ImageNet-1K datasets by loading the same pretrained parameters.
9. A rapid method for identification of Lanxangia tsaoko origin and fruit shape: FT-NIR combined with chemometrics and image recognition. J Food Sci 2024; 89:2316-2331. [PMID: 38369957; DOI: 10.1111/1750-3841.16989]
Abstract
Accurate classification of Lanxangia tsaoko by origin and fruit shape is significant for research on differences among L. tsaoko origins and varieties, as well as for variety breeding, cultivation, and market management. In this work, Fourier transform-near infrared (FT-NIR) spectra were transformed into two-dimensional and three-dimensional correlation spectra to further investigate the spectral characteristics of L. tsaoko. Before building the classification model, the raw FT-NIR spectra were preprocessed using multiplicative scatter correction and the second derivative, whereas principal component analysis, the successive projections algorithm, and competitive adaptive reweighted sampling were used to extract spectral feature variables. These were then combined with partial least squares-discriminant analysis (PLS-DA), support vector machine (SVM), decision tree, and residual network (ResNet) models to discriminate the origin and fruit shape of L. tsaoko. The PLS-DA and SVM models achieved 100% accuracy in origin classification but required a complex model-optimization process. The ResNet image recognition model classifies the origin and shape of L. tsaoko with 100% accuracy without complex preprocessing and feature extraction, facilitating fast, accurate, and efficient identification.
10. Unlocking ground-based imagery for habitat mapping. Trends Ecol Evol 2024; 39:349-358. [PMID: 38087707; DOI: 10.1016/j.tree.2023.11.005]
Abstract
Fine-grained environmental data across large extents are needed to resolve the processes that impact species communities from local to global scales. Ground-based images (GBIs) have the potential to capture habitat complexity at biologically relevant spatial and temporal resolutions. Moving beyond existing applications of GBIs for species identification and monitoring ecological change from repeat photography, we describe promising approaches to habitat mapping, leveraging multimodal data and computer vision. We illustrate empirically how GBIs can be applied to predict distributions of species at fine scales along Street View routes, or to automatically classify and quantify habitat features. Further, we outline future research avenues using GBIs that can bring a leap forward in analyses for ecology and conservation with this underused resource.
11. Resilience-aware MLOps for AI-based medical diagnostic system. Front Public Health 2024; 12:1342937. [PMID: 38601490; PMCID: PMC11004236; DOI: 10.3389/fpubh.2024.1342937]
Abstract
Background The healthcare sector demands a higher degree of responsibility, trustworthiness, and accountability when implementing Artificial Intelligence (AI) systems. Machine learning operations (MLOps) for AI-based medical diagnostic systems are primarily focused on aspects such as data quality and confidentiality, bias reduction, model deployment, performance monitoring, and continuous improvement. However, so far, MLOps techniques do not take into account the need to provide resilience to disturbances such as adversarial attacks, including fault injections, and drift, including out-of-distribution data. This article is concerned with an MLOps methodology that incorporates the steps necessary to increase the resilience of an AI-based medical diagnostic system against various kinds of disruptive influences. Methods Post-hoc resilience optimization, post-hoc predictive uncertainty calibration, uncertainty monitoring, and graceful degradation are incorporated as additional stages in MLOps. To optimize the resilience of the AI-based medical diagnostic system, additional components in the form of adapters and meta-adapters are utilized. These components are fine-tuned during meta-training based on the results of adaptation to synthetic disturbances. Furthermore, an additional model is introduced for post-hoc calibration of predictive uncertainty. This model is trained using both in-distribution and out-of-distribution data to refine predictive confidence during inference. Results A structure of resilience-aware MLOps for medical diagnostic systems is proposed. Experiments confirmed increased robustness and faster adaptation of a medical image recognition system during several intervals of the system's life cycle, owing to the resilience optimization and uncertainty calibration stages. The experiments were performed on the DermaMNIST, BloodMNIST, and PathMNIST datasets. ResNet-18 was considered as a representative of convolutional networks and MedViT-T as a representative of vision transformers. It is worth noting that the transformers exhibited lower resilience than the convolutional networks, although this observation may be attributed to imperfections in the architecture of the adapters and meta-adapters. Conclusion The main novelty of the suggested resilience-aware MLOps methodology and structure lies in separating the activities of creating a basic model for normal operating conditions from those of ensuring its resilience and trustworthiness. This is significant for medical applications, as the developer of the basic model should devote more time to understanding the medical field and the diagnostic task at hand rather than specializing in system resilience. Resilience optimization increases robustness to disturbances and the speed of adaptation. Calibrated confidences ensure the recognition of a portion of unabsorbed disturbances to mitigate their impact, thereby enhancing trustworthiness.
12. AA-RGTCN: reciprocal global temporal convolution network with adaptive alignment for video-based person re-identification. Front Neurosci 2024; 18:1329884. [PMID: 38591067; PMCID: PMC10999627; DOI: 10.3389/fnins.2024.1329884]
Abstract
Person re-identification (Re-ID) aims to retrieve pedestrians across different cameras. Compared with image-based Re-ID, video-based Re-ID extracts features from video sequences that contain both spatial and temporal information. Existing methods usually focus on the most salient image parts, which leads to redundant spatial description and insufficient temporal description. Other methods that take temporal cues into consideration usually ignore misalignment between frames and only consider a fixed length of a given sequence. In this study, we proposed a Reciprocal Global Temporal Convolution Network with Adaptive Alignment (AA-RGTCN). The structure addresses the drawback of misalignment between frames and models a discriminative temporal representation. Specifically, the Adaptive Alignment block is designed to shift each frame adaptively to its best position for temporal modeling. We then proposed the Reciprocal Global Temporal Convolution Network to model robust temporal features across different time intervals along both normal and inverted time order. The experimental results show that AA-RGTCN achieves 85.9% mAP and 91.0% Rank-1 on MARS, 90.6% Rank-1 on iLIDS-VID, and 96.6% Rank-1 on PRID-2011, outperforming other state-of-the-art approaches.
13. An Artificial Neural Network Based on Oxide Synaptic Transistor for Accurate and Robust Image Recognition. Micromachines (Basel) 2024; 15:433. [PMID: 38675245; PMCID: PMC11052312; DOI: 10.3390/mi15040433]
Abstract
Synaptic transistors with low-temperature, solution-processed dielectric films have demonstrated programmable conductance, and therefore potential applications in hardware artificial neural networks for recognizing noisy images. Here, we engineered AlOx/InOx synaptic transistors via a solution process to instantiate neural networks. The transistors show long-term potentiation under appropriate gate voltage pulses. The artificial neural network, consisting of one input layer and one output layer, was constructed using 9 × 3 synaptic transistors. By programming the calculated weight, the hardware network can recognize 3 × 3 pixel images of characters z, v and n with a high accuracy of 85%, even with 40% noise. This work demonstrates that metal-oxide transistors, which exhibit significant long-term potentiation of conductance, can be used for the accurate recognition of noisy images.
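The 9 × 3 synaptic-transistor array maps naturally onto a single-layer network with nine inputs and three outputs. A sketch of such a template-matching classifier, with invented 3 × 3 pixel patterns and weights (the paper's actual patterns and programmed conductances are not reproduced):

```python
import numpy as np

# Invented 3x3 binary templates for z, v, and n (illustrative only)
patterns = {
    "z": [1, 1, 1, 0, 1, 0, 1, 1, 1],
    "v": [1, 0, 1, 1, 0, 1, 0, 1, 0],
    "n": [1, 1, 0, 1, 0, 1, 1, 0, 1],
}
X = np.array(list(patterns.values()), dtype=float)

# A 9x3 weight matrix, one column per class, mirroring the 9x3
# synaptic-transistor array; weights are the mean-centered templates,
# a minimal template-matching scheme
W = (X - X.mean(axis=0)).T

def classify(pixels):
    """Return the index of the class with the largest weighted sum."""
    return int(np.argmax(np.asarray(pixels, dtype=float) @ W))

noisy_z = [1, 1, 1, 0, 1, 0, 1, 1, 0]  # "z" with one flipped pixel
print(["z", "v", "n"][classify(noisy_z)])  # still recognized as z
```

Even this toy scheme tolerates a single flipped pixel, which is the kind of noise robustness the hardware network demonstrates at larger noise levels.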
14. Mechanical Metamaterials for Handwritten Digits Recognition. Adv Sci (Weinh) 2024; 11:e2308137. [PMID: 38145964; PMCID: PMC10933649; DOI: 10.1002/advs.202308137]
Abstract
The increasing need for new types of computing stems in part from the requirements of harsh environments. In this study, the successful development of a non-electrical neural network that functions through mechanical computing is presented. By overcoming the challenges of low mechanical signal transmission efficiency and intricate layout design, a mechanical neural network based on bistable kirigami mechanical metamaterials has been designed. In preliminary tests, the system exhibits high reliability in recognizing handwritten digits and proves operable in low-temperature environments. This work paves the way for a new, alternative computing system with broad applications in areas where electricity is not accessible. By integrating with traditional electronic computers, the present system lays the foundation for a more diversified form of computing.
15. Rethinking skip connections in Spiking Neural Networks with Time-To-First-Spike coding. Front Neurosci 2024; 18:1346805. [PMID: 38419664; PMCID: PMC10899405; DOI: 10.3389/fnins.2024.1346805]
Abstract
Time-To-First-Spike (TTFS) coding in Spiking Neural Networks (SNNs) offers significant advantages in terms of energy efficiency, closely mimicking the behavior of biological neurons. In this work, we delve into the role of skip connections, a widely used concept in Artificial Neural Networks (ANNs), within the domain of SNNs with TTFS coding. Our focus is on two distinct types of skip connection architectures: (1) addition-based skip connections, and (2) concatenation-based skip connections. We find that addition-based skip connections introduce an additional delay in terms of spike timing. On the other hand, concatenation-based skip connections circumvent this delay but produce time gaps between after-convolution and skip connection paths, thereby restricting the effective mixing of information from these two paths. To mitigate these issues, we propose a novel approach involving a learnable delay for skip connections in the concatenation-based skip connection architecture. This approach successfully bridges the time gap between the convolutional and skip branches, facilitating improved information mixing. We conduct experiments on public datasets including MNIST and Fashion-MNIST, illustrating the advantage of the skip connection in TTFS coding architectures. Additionally, we demonstrate the applicability of TTFS coding on beyond image recognition tasks and extend it to scientific machine-learning tasks, broadening the potential uses of SNNs.
16. A tree species classification model based on improved YOLOv7 for shelterbelts. Front Plant Sci 2024; 14:1265025. [PMID: 38304457; PMCID: PMC10832270; DOI: 10.3389/fpls.2023.1265025]
Abstract
Tree species classification within shelterbelts is crucial for shelterbelt management. The large-scale satellite-based and low-altitude drone-based approaches serve as powerful tools for forest monitoring, especially in tree species classification. However, these methods face challenges in distinguishing individual tree species within complex backgrounds. Additionally, the mixed growth of trees within protective forest suffers from similar crown size among different tree species. The complex background of the shelterbelts negatively impacts the accuracy of tree species classification. The You Only Look Once (YOLO) algorithm is widely used in the field of agriculture and forestry, ie., plant and fruit identification, pest and disease detection, and tree species classification in forestry. We proposed a YOLOv7-Kmeans++_CoordConv_CBAM (YOLOv7-KCC) model for tree species classification based on drone RGB remote sensing images. Firstly, we constructed a dataset for tree species in shelterbelts and adopted data augmentation methods to mitigate overfitting due to limited training data. Secondly, the K-means++ algorithm was employed to cluster anchor boxes in the dataset. Furthermore, to enhance the YOLOv7 backbone network's Efficient Layer Aggregation Network (ELAN) module, we used Coordinate Convolution (CoordConv) replaced the ordinary 1×1 convolution. The Convolutional Block Attention Module (CBAM) was integrated into the Path Aggregation Network (PANet) structure to facilitate multiscale feature extraction and fusion, allowing the network to better capture and utilize crucial feature information. Experimental results showed that the YOLOv7-KCC model achieves a mean average precision@0.5 of 98.91%, outperforming the Faster RCNN-VGG16, Faster RCNN-Resnet50, SSD, YOLOv4, and YOLOv7 models by 5.71%, 11.75%, 5.97%, 7.86%, and 3.69%, respectively. 
The YOLOv7-KCC model requires 105.07 GFLOPs and 143.7 MB of parameters, while improving the F1 score by almost 5.6% compared with YOLOv7. The proposed YOLOv7-KCC model can therefore effectively classify shelterbelt tree species, providing a scientific basis for shelterbelt management in Northwest China, particularly Xinjiang.
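The anchor-clustering step described in this abstract can be sketched in plain Python. The snippet below is an illustrative k-means++ clustering of ground-truth (width, height) pairs, not the authors' code; note that YOLO pipelines typically cluster with a 1 − IoU distance rather than the Euclidean distance used here for brevity:

```python
import random

def kmeanspp_init(points, k, rng):
    """k-means++ seeding: later centers are drawn with probability
    proportional to the squared distance to the nearest chosen center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min((px - cx) ** 2 + (py - cy) ** 2 for cx, cy in centers)
              for px, py in points]
        r = rng.random() * sum(d2)
        acc = 0.0
        for p, d in zip(points, d2):
            acc += d
            if acc >= r:
                centers.append(p)
                break
        else:
            centers.append(points[-1])  # float-rounding fallback
    return centers

def cluster_anchors(boxes, k=9, iters=25, seed=0):
    """Cluster ground-truth (width, height) pairs into k anchor shapes
    with Lloyd iterations after k-means++ initialization."""
    rng = random.Random(seed)
    centers = kmeanspp_init(boxes, k, rng)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for w, h in boxes:
            j = min(range(k),
                    key=lambda i: (w - centers[i][0]) ** 2 + (h - centers[i][1]) ** 2)
            clusters[j].append((w, h))
        centers = [
            (sum(w for w, _ in c) / len(c), sum(h for _, h in c) / len(c))
            if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    return sorted(centers)
```

In a detection pipeline the returned centers would become the model's anchor priors, ordered from small to large.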
|
17
|
Non-invasive medical imaging technology for the diagnosis of burn depth. Int Wound J 2024; 21:e14681. [PMID: 38272799 PMCID: PMC10805628 DOI: 10.1111/iwj.14681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2023] [Accepted: 01/03/2024] [Indexed: 01/27/2024] Open
Abstract
Currently, the clinical diagnosis of burn depth relies primarily on physicians' judgement of patients' symptoms and physical signs, particularly the morphological characteristics of the wound. This approach depends heavily on the individual doctor's clinical experience, proving challenging for less experienced or primary care physicians, and results often vary from one practitioner to another. Scholars have therefore been exploring objective, quantitative auxiliary examination techniques to improve the accuracy and consistency of burn depth diagnosis. Non-invasive medical imaging technology, with its significant advantages in examining tissue surface morphology, blood flow in deep tissue, and changes in tissue structure and composition, has become a hot topic in burn diagnostic research in recent years. This paper reviews non-invasive medical imaging technologies that have shown potential for burn depth diagnosis, summarizing their imaging principles, current research status, advantages, and limitations to provide a reference for clinical application and research by burn specialists.
|
18
|
In-Line Detection of Clinical Mastitis by Identifying Clots in Milk Using Images and a Neural Network Approach. Animals (Basel) 2023; 13:3783. [PMID: 38136819 PMCID: PMC10740463 DOI: 10.3390/ani13243783] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/24/2023] [Revised: 11/30/2023] [Accepted: 12/06/2023] [Indexed: 12/24/2023] Open
Abstract
Automated milking systems (AMSs) already incorporate a variety of milk monitoring and sensing equipment, but the sensitivity, specificity, and positive predictive value of clinical mastitis (CM) detection remain low. A typical symptom of CM is the presence of clots in the milk during fore-stripping. The objective of this study was the development and evaluation of a deep learning model with image recognition capabilities, specifically a convolutional neural network (NN), capable of detecting such clots in pictures of the milk filter socks of the milking system, taken after the phase in which the first streams of milk have been discarded. In total, 696 pictures were taken with clots and 586 pictures without. These were randomly divided into training, validation, and testing datasets in a 60/20/20 split for the training and validation of the NN. A convolutional NN with residual connections was trained, and its hyperparameters were optimized on the validation dataset using a genetic algorithm. Integrated gradients were calculated to explain the NN's predictions. The accuracy of the NN on the testing dataset was 100%, and the integrated gradients showed that the NN indeed identified the clots. Further field validation through integration into AMSs is necessary, but the proposed deep learning method is very promising for the in-line detection of CM on AMS farms.
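The 60/20/20 partitioning used in this study is straightforward to reproduce; the sketch below is illustrative and not the authors' pipeline:

```python
import random

def split_dataset(items, fracs=(0.6, 0.2, 0.2), seed=42):
    """Shuffle once, then cut into training/validation/testing
    partitions in the given proportions."""
    assert abs(sum(fracs) - 1.0) < 1e-9
    pool = list(items)
    random.Random(seed).shuffle(pool)
    n_train = int(fracs[0] * len(pool))
    n_val = int(fracs[1] * len(pool))
    return (pool[:n_train],
            pool[n_train:n_train + n_val],
            pool[n_train + n_val:])
```

With the study's 696 + 586 = 1,282 pictures this yields 769/256/257 images; in practice one would also stratify so that clot and no-clot pictures appear in each partition at similar rates.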
|
19
|
Go-Game Image Recognition Based on Improved Pix2pix. J Imaging 2023; 9:273. [PMID: 38132691 PMCID: PMC10871096 DOI: 10.3390/jimaging9120273] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/09/2023] [Revised: 11/30/2023] [Accepted: 12/03/2023] [Indexed: 12/23/2023] Open
Abstract
Go is a game won or lost based on the number of intersections surrounded by black or white pieces. The traditional counting method is manual, which is time-consuming and error-prone. In addition, current Go-image-recognition methods generalize poorly, and their accuracy needs further improvement. To solve these problems, a Go-game image-recognition method based on an improved pix2pix was proposed. Firstly, a channel-coordinate mixed-attention (CCMA) mechanism was designed by effectively combining channel attention and coordinate attention, so that the model could learn the target feature information. Secondly, to capture long-distance contextual information, a deep dilated-convolution (DDC) module was proposed, which densely links dilated convolutions with different dilation rates. The experimental results showed that, compared with existing Go-image-recognition methods such as DenseNet, VGG-16, and YOLOv5, the proposed method effectively improved the generalization ability and accuracy of the Go-image-recognition model, with an average accuracy rate over 99.99%.
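The DDC idea of densely linking dilated convolutions with growing rates can be illustrated in one dimension. This is a hedged sketch only: the paper's module is a 2-D convolutional block with learned weights, whereas the code below just shows how dilation spaces out the kernel taps and how dense skips accumulate context:

```python
def dilated_conv1d(x, kernel, dilation):
    """'Same'-padded 1-D convolution whose taps are spaced `dilation`
    samples apart, so the receptive field grows without extra weights."""
    k = len(kernel)
    out = []
    for i in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            idx = i + (j - k // 2) * dilation
            if 0 <= idx < len(x):
                acc += w * x[idx]
        out.append(acc)
    return out

def ddc_block(x, kernel, rates=(1, 2, 4)):
    """Densely linked dilated convolutions: each stage sees the input
    plus all previous outputs (accumulated here by summation)."""
    feat = list(x)
    for r in rates:
        y = dilated_conv1d(feat, kernel, r)
        feat = [a + b for a, b in zip(feat, y)]  # dense skip connection
    return feat
```

With an identity kernel the block simply rescales its input, which makes the dense-skip accumulation easy to verify by hand.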
|
20
|
Superficial vessel-based near infrared-assisted patient position recognition and real-time monitoring system (VIPS) for radiotherapy: A proof-of-concept study. Med Phys 2023; 50:7967-7979. [PMID: 37727130 DOI: 10.1002/mp.16690] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/14/2022] [Revised: 07/26/2023] [Accepted: 07/28/2023] [Indexed: 09/21/2023] Open
Abstract
BACKGROUND The accuracy and precision of patient positioning in radiotherapy have dramatic impacts on local tumor control and therapy-related side effects, so there is demand for effective positioning solutions, particularly given recent progress in image recognition and matching. PURPOSE A superficial vessel-based near infrared-assisted patient position recognition and real-time monitoring system (VIPS) was proposed to provide an automated, operator-independent, and skin marker-free imaging system for improving patient setup and intrafractional motion monitoring. METHODS VIPS comprises two components, the imaging module and the image alignment software. Using a simulated blood vessel model, multiple NIR sources with various wavelengths and boluses (pseudo-skin) were evaluated in terms of imaging quality to determine the optimal light source and the upper limit of superficial fatty tissue thickness. The performance of VIPS with reference to either CBCT or a laser setup system was then evaluated using a 3D phantom and clinical cases enrolled in the registered clinical trial. The position displacements from VIPS and the laser system were compared, as were the systematic and random errors of the VIPS setup procedure. RESULTS The NIR light source combining wavelengths of 760 nm and 940 nm (S760+940nm) provided the best performance among the tested light sources. A bolus (superficial fatty layer) thickness over 5 mm could dramatically compromise NIR detection of the vessels beneath. In the phantom study, the translational positional displacements under VIPS guidance were within the submillimeter level with reference to CBCT, indicating high setup accuracy. The clinical trial showed that the prototype VIPS could effectively detect and control patient position displacement in the translational and rotational directions within an acceptable range, non-inferior to the conventional laser/skin-marker system. 
CONCLUSION This proof-of-concept study validated the feasibility and reliability of VIPS in guiding radiotherapy setup. However, limitations and technical challenges should be resolved prior to further clinical evaluation, including isocenter alignment, potential NIR image distortion, and the impact of superficial tissues on vessel recognition.
|
21
|
Endometrial receptivity evaluation using hysteroscopic endometrial gland image recognition. HUM FERTIL 2023; 26:1347-1353. [PMID: 36942487 DOI: 10.1080/14647273.2023.2191345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/27/2022] [Accepted: 12/09/2022] [Indexed: 03/23/2023]
Abstract
We investigated the endometrial gland count on hysteroscopic endometrial images in patients undergoing in vitro fertilization and embryo transfer (IVF-ET) to evaluate endometrial receptivity and predict pregnancy outcomes. Since endometrial receptivity and endometrial gland density are strongly influenced by numerous factors, we selected 98 patients who underwent frozen-thawed embryo transfer (FET) in a natural cycle. Within 1-3 menstrual cycles before embryo transfer, hysteroscopic exploration was performed 3-7 days after ovulation. Uterine cavity morphological data were measured, and hysteroscopic endometrial imaging was performed. An endometrial gland opening labelling algorithm was used to recognize and count the endometrial glands. Patients were divided into pregnancy and non-pregnancy groups based on ET outcomes. No significant differences were noted between the two groups in the patients' general information and laboratory parameters, including age, years of infertility, body mass index, anti-Müllerian hormone, endometrial thickness, and number of embryos transferred. The number of endometrial glands in the pregnancy group was higher than that in the non-pregnancy group (p < 0.05). Hysteroscopic examination of the uterine cavity combined with gland counting using image recognition software can better indicate endometrial receptivity and may help improve pregnancy outcomes.
|
22
|
Evaluation of computed tomography metal artifact and CyberKnife fiducial recognition for novel size fiducial markers. J Appl Clin Med Phys 2023; 24:e14142. [PMID: 37672211 PMCID: PMC10691645 DOI: 10.1002/acm2.14142] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/06/2022] [Revised: 06/28/2023] [Accepted: 08/19/2023] [Indexed: 09/07/2023] Open
Abstract
PURPOSE This study aimed to compare fiducial markers used in CyberKnife treatment in terms of the metal artifact intensity observed in CT images and of fiducial recognition in the CyberKnife system as affected by patient body thickness and marker type. METHODS Five markers were evaluated: ACCULOC 0.9 mm × 3 mm; ball-type Gold Anchor (GA) 0.28 mm × 10 mm and 0.28 mm × 20 mm; and novel-size GA 0.4 mm × 10 mm and 0.4 mm × 20 mm. To evaluate metal artifacts, two types of CT images of water-equivalent gels containing each marker were acquired with an Aquilion LB CT scanner, one with SEMAR applied (SEMAR-on) and one without (SEMAR-off). The artifact-intensity metric (MSD), which represents the variation of CT values, was compared for each marker. Next, 5, 15, and 20 cm thicknesses of Tough Water (TW) were placed on the gel, overlapping the vertebral phantom in the Target Locating System, and live images of each marker were acquired to compare fiducial recognition. RESULTS The mean MSD for SEMAR-off was 78.80, 74.50, 97.25, 83.29, and 149.64 HU for ACCULOC, GA 0.28 mm × 10 mm and × 20 mm, and GA 0.40 mm × 10 mm and × 20 mm, respectively; with SEMAR-on, the corresponding values were 23.52, 20.26, 26.76, 24.89, and 33.96 HU. Fiducial recognition decreased with increasing thickness from 5 to 15 to 20 cm, and GA 0.4 mm × 20 mm showed the best recognition at a thickness of 20 cm of TW. CONCLUSIONS We demonstrated that applying SEMAR can reduce metal artifacts in the CT image to a similar level for all the markers evaluated. Additionally, the fiducial recognition of each marker may vary with the patient's body thickness; in particular, GA 0.40 mm × 20 mm may offer better recognition for CyberKnife treatment in patients with high body thickness compared with the other markers.
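As a rough illustration of an artifact-intensity metric of this kind, one can average per-ROI standard deviations of CT numbers. This is a hypothetical reading of MSD; the paper's exact definition may differ:

```python
import statistics

def mean_roi_sd(rois):
    """Hypothetical artifact metric (illustrative, not the paper's
    formula): the mean of the per-ROI population standard deviations
    of CT numbers (HU). A uniform image scores 0, while streaky
    metal artifacts raise the score."""
    return statistics.mean(statistics.pstdev(r) for r in rois)
```

On this reading, the SEMAR-on values in the abstract being several times smaller than the SEMAR-off values corresponds to the HU fluctuation around each marker shrinking by roughly that factor.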
|
23
|
Cellular Automata Inspired Multistable Origami Metamaterials for Mechanical Learning. ADVANCED SCIENCE (WEINHEIM, BADEN-WURTTEMBERG, GERMANY) 2023; 10:e2305146. [PMID: 37870201 DOI: 10.1002/advs.202305146] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/26/2023] [Revised: 08/31/2023] [Indexed: 10/24/2023]
Abstract
Recent advances in multistable metamaterials reveal a link between structural configuration transitions and Boolean logic, heralding a new generation of computationally capable intelligent materials. To enable higher-level computation, existing computational frameworks require the integration of large-scale networked logic gates, which places demanding requirements on the fabrication of the material counterparts and on signal propagation. Inspired by cellular automata, a novel computational framework based on multistable origami metamaterials incorporating reservoir computing is proposed, which can accomplish high-level computation tasks without constructing a logic-gate network. This approach thus eliminates the demanding fabrication and signal-propagation requirements of building large-scale networks for high-level computation in conventional mechanical logic. Using the multistable stacked Miura-origami metamaterial as a validation platform, digit recognition is experimentally implemented with a single actuator. Moreover, complex tasks, such as handwriting recognition and 5-bit memory tasks, are also shown to be feasible within the new computational framework. The research represents a significant advance toward a new generation of intelligent materials with advanced computational capabilities. With continued research and development, these materials could have a transformative impact on a wide range of fields, from computational science to material mechano-intelligence technology and beyond.
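The reservoir-computing principle invoked above separates a fixed nonlinear dynamical system from a trained linear readout. A minimal software analogue is sketched below; this is illustrative only — in the paper the "reservoir" is the origami metamaterial's stateful mechanics, not code, and all constants here are arbitrary assumptions:

```python
import math
import random

def reservoir_states(inputs, n_res=20, seed=1):
    """Fixed random recurrent network: its weights are never trained;
    it just maps the input history into a rich nonlinear state."""
    rng = random.Random(seed)
    w_in = [rng.uniform(-1, 1) for _ in range(n_res)]
    w_res = [[rng.uniform(-0.3, 0.3) for _ in range(n_res)]
             for _ in range(n_res)]
    x = [0.0] * n_res
    states = []
    for u in inputs:
        x = [math.tanh(w_in[i] * u
                       + sum(w_res[i][j] * x[j] for j in range(n_res)))
             for i in range(n_res)]
        states.append(x)
    return states

def train_readout(states, targets, lr=0.05, epochs=100):
    """Only this linear readout is trained (simple delta rule)."""
    w, b = [0.0] * len(states[0]), 0.0
    for _ in range(epochs):
        for s, t in zip(states, targets):
            err = t - (b + sum(wi * si for wi, si in zip(w, s)))
            b += lr * err
            w = [wi + lr * err * si for wi, si in zip(w, s)]
    return w, b
```

A short-term memory task of the kind mentioned in the abstract — recalling a recent input from the current reservoir state — reduces to fitting this readout against delayed targets.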
|
24
|
An AI Dietitian for Type 2 Diabetes Mellitus Management Based on Large Language and Image Recognition Models: Preclinical Concept Validation Study. J Med Internet Res 2023; 25:e51300. [PMID: 37943581 PMCID: PMC10667983 DOI: 10.2196/51300] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/27/2023] [Revised: 09/18/2023] [Accepted: 10/06/2023] [Indexed: 11/10/2023] Open
Abstract
BACKGROUND Nutritional management for patients with diabetes in China is a significant challenge due to the low supply of registered clinical dietitians. To address this, an artificial intelligence (AI)-based nutritionist program that uses advanced language and image recognition models was created. This program can identify ingredients from images of a patient's meal and offer nutritional guidance and dietary recommendations. OBJECTIVE The primary objective of this study is to evaluate the competence of the models that support this program. METHODS The potential of an AI nutritionist program for patients with type 2 diabetes mellitus (T2DM) was evaluated through a multistep process. First, a survey was conducted among patients with T2DM and endocrinologists to identify knowledge gaps in dietary practices. ChatGPT and GPT 4.0 were then tested through the Chinese Registered Dietitian Examination to assess their proficiency in providing evidence-based dietary advice. ChatGPT's responses to common questions about medical nutrition therapy were compared with expert responses by professional dietitians to evaluate its proficiency. The model's food recommendations were scrutinized for consistency with expert advice. A deep learning-based image recognition model was developed for food identification at the ingredient level, and its performance was compared with existing models. Finally, a user-friendly app was developed, integrating the capabilities of language and image recognition models to potentially improve care for patients with T2DM. RESULTS Most patients (182/206, 88.4%) demanded more immediate and comprehensive nutritional management and education. Both ChatGPT and GPT 4.0 passed the Chinese Registered Dietitian examination. ChatGPT's food recommendations were mainly in line with best practices, except for certain foods like root vegetables and dry beans. 
Professional dietitians' reviews of ChatGPT's responses to common questions were largely positive, with 162 out of 168 providing favorable reviews. The multilabel image recognition model evaluation showed that the Dino V2 model achieved an average F1 score of 0.825, indicating high accuracy in recognizing ingredients. CONCLUSIONS The model evaluations were promising. The AI-based nutritionist program is now ready for a supervised pilot study.
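An F1 score averaged over ingredient labels, as reported for the image model above, can be computed as follows. This is illustrative only — whether the paper macro- or micro-averages is not stated here:

```python
def macro_f1(true_sets, pred_sets, labels):
    """Macro-averaged F1 over ingredient labels for multilabel
    predictions: per-label precision/recall, then the unweighted
    mean of the per-label F1 scores."""
    scores = []
    for lab in labels:
        tp = sum(1 for t, p in zip(true_sets, pred_sets) if lab in t and lab in p)
        fp = sum(1 for t, p in zip(true_sets, pred_sets) if lab not in t and lab in p)
        fn = sum(1 for t, p in zip(true_sets, pred_sets) if lab in t and lab not in p)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / len(scores)
```

Here each prediction is a set of ingredient labels per meal photo, which matches the ingredient-level recognition task described in the abstract.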
|
25
|
Low-Pass Image Filtering to Achieve Adversarial Robustness. SENSORS (BASEL, SWITZERLAND) 2023; 23:9032. [PMID: 38005420 PMCID: PMC10675189 DOI: 10.3390/s23229032] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 09/08/2023] [Revised: 11/01/2023] [Accepted: 11/01/2023] [Indexed: 11/26/2023]
Abstract
In this paper, we continue our research cycle on the properties of convolutional neural network-based image recognition systems and on ways to improve their noise immunity and robustness. Adversarial attacks are currently a popular research area related to artificial neural networks: adversarial perturbations of an image are barely perceptible to the human eye, yet they drastically reduce a neural network's accuracy. Image perception by a machine is highly dependent on the propagation of high-frequency distortions throughout the network, whereas a human efficiently ignores such distortions, perceiving the shape of objects as a whole. We propose a technique to reduce the influence of high-frequency noise on CNNs. We show that low-pass image filtering can improve image recognition accuracy in the presence of high-frequency distortions, in particular those caused by adversarial attacks. This technique is resource efficient and easy to implement. The proposed technique brings the logic of an artificial neural network closer to that of a human, for whom high-frequency distortions are not decisive in object recognition.
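The defense described above amounts to blurring the input before classification. A minimal sketch of such a low-pass preprocessing step is given below (a separable Gaussian filter with replicate padding; the paper's exact filter parameters are not specified here):

```python
import math

def gaussian_kernel1d(sigma, radius):
    """Normalized 1-D Gaussian kernel of the given radius."""
    k = [math.exp(-(i * i) / (2.0 * sigma * sigma))
         for i in range(-radius, radius + 1)]
    s = sum(k)
    return [v / s for v in k]

def lowpass2d(img, sigma=1.0):
    """Separable Gaussian low-pass filter applied to an image (a list
    of rows) before it is fed to the CNN; it attenuates the
    high-frequency components that adversarial perturbations occupy."""
    radius = max(1, int(3 * sigma))
    k = gaussian_kernel1d(sigma, radius)

    def filter_rows(m):
        out = []
        for row in m:
            n = len(row)
            out.append([sum(k[j + radius] * row[min(max(i + j, 0), n - 1)]
                            for j in range(-radius, radius + 1))
                        for i in range(n)])
        return out

    blurred = filter_rows(img)                # horizontal pass
    blurred = list(map(list, zip(*blurred)))  # transpose
    blurred = filter_rows(blurred)            # vertical pass
    return list(map(list, zip(*blurred)))     # transpose back
```

Because the kernel is normalized, smooth regions pass through unchanged while pixel-level (high-frequency) oscillations are strongly damped — the property the defense relies on.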
|
26
|
Deep Neural Network-Based Electron Microscopy Image Recognition for Source Distinguishing of Anthropogenic and Natural Magnetic Particles. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2023; 57:16465-16476. [PMID: 37801812 DOI: 10.1021/acs.est.3c05252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 10/08/2023]
Abstract
Deep learning models excel at image recognition of macroscopic objects, but their applications to nanoscale particles are limited. Here, we explored their potential for distinguishing the sources of environmental particles. Transmission electron microscopy (TEM) images can reveal distinguishable morphological features of particles from various sources, but cluttered foreground objects and scale variations pose challenges to visual recognition models. In this proof-of-concept work, we proposed a novel instance segmentation model named CoMask to tackle these issues with atmospheric magnetic particles, a key species of PM2.5. CoMask features a densely connected feature-extraction module that excavates multiscale spatial cues at the single-particle level and enlarges the receptive field for improved representation capability. We also employed a collaborative learning strategy to further improve performance. Compared with other state-of-the-art models, CoMask was competitive on benchmark and TEM datasets. The application of CoMask not only enables source distinguishing of magnetic particles but also opens up a new vista for machine-learning applications.
|
27
|
Artificial Visual Synaptic Architecture with High-Linearity Light-Modulated Weight for Optoelectronic Neuromorphic Computing. ACS APPLIED MATERIALS & INTERFACES 2023. [PMID: 37885218 DOI: 10.1021/acsami.3c11495] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 10/28/2023]
Abstract
A brain-like neuromorphic computing system, as compared with the traditional von Neumann architecture, has broad application prospects in emerging artificial intelligence (AI) fields due to its high fault tolerance, excellent plasticity, and parallel computing capability. A neuromorphic visuosensory and memory system, an important branch of neuromorphic computing, is the basis for AI to perceive, process, and memorize optical information, but it still suffers from nonlinearity of synaptic weight, crosstalk issues, and integration incompatibility, hindering high training and inference accuracy. In this work, we propose a new optoelectronic neuromorphic architecture that integrates an electrochromic device and a perovskite photodetector. Owing to the superior memory characteristics of the electrochromic device and the sensitive light response of the perovskite photodetector, the neuromorphic device shows typical visual synaptic functionalities such as light triggering, neural facilitation, and long-term potentiation and depression (LTP and LTD). Furthermore, by adjusting the intensity and wavelength of external light signals, the visual synaptic function of the device can be modulated, enabling high weight linearity over the full current output range and improving information processing capability and image recognition accuracy. Moreover, both the electrochromic and perovskite layers lend themselves to large-area fabrication and integration, enabling large device arrays with high integration compatibility and scalability. In this study, 10 × 10 device arrays are demonstrated, and each device shows uniform light response, memory behavior, and synaptic performance. MNIST and CIFAR-10 tasks are used to simulate the image recognition properties of the synaptic architecture, and the calculated recognition accuracies are 97.94% and 91.04%, respectively, with an error of less than 2.5%. 
The proposed artificial visual neuromorphic architecture provides a potential device prototype for efficient visual neuromorphic systems.
|
28
|
Rapid Detection and Analysis of Raman Spectra of Bacteria in Multiple Fields of View Based on Image Stitching Technique. FRONT BIOSCI-LANDMRK 2023; 28:249. [PMID: 37919069 DOI: 10.31083/j.fbl2810249] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2022] [Revised: 04/17/2023] [Accepted: 06/14/2023] [Indexed: 11/04/2023]
Abstract
BACKGROUND Due to antibiotic abuse, bacterial resistance is becoming an increasingly serious problem, and rapid detection of bacterial resistance has become an urgent issue. Because bacteria that remain metabolically active under antibiotic treatment metabolize heavy water differently from inactive ones, antibiotic resistance can be identified by the presence of a C-D peak in the 2030-2400 cm-1 range of the Raman spectrum. METHODS To ensure data veracity, a large number of bacteria need to be detected; however, because of the limited field of view of a high-magnification objective, the number of single cells in a single field of view is very small. By combining an image-stitching algorithm, an image-recognition algorithm, and Raman spectrum processing with a peak-seeking algorithm, the method can identify and locate single cells across multiple fields of view at once and determine whether they are antimicrobial resistant. RESULTS In experiments 1 and 2, 2706 bacteria in 9 × 11 fields of view and 2048 bacteria in 11 × 11 fields of view were detected. In experiment 1, there were 1137 antibiotic-resistant bacteria (42%) and 1569 sensitive bacteria (58%); in experiment 2, there were 1087 antibiotic-resistant bacteria (53%) and 961 sensitive bacteria (47%). The method showed excellent speed and recognition accuracy compared with traditional manual detection approaches, and it solves the problems of low data accuracy, laborious manual experiments, and low efficiency caused by the small number of single cells in the high-magnification field of view and the differing peak-seeking parameters of different Raman spectra. CONCLUSIONS The detection and analysis of bacterial Raman spectra based on image stitching enables unattended, automatic, rapid, and accurate detection of single cells at high magnification across multiple fields of view. 
Being automatic, high-throughput, rapid, and accurate, it can serve as an unattended, universal, and non-invasive means of measuring antibiotic-resistant bacteria and screening for effective antibiotics, which is of great importance for studying the persistence and spread of antibiotic resistance in bacterial pathogens.
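The resistance call described in this abstract reduces to testing for signal in the C-D spectral window. The toy sketch below flags a spectrum when the band intensity rises well above the rest of the spectrum; the threshold and baseline handling are illustrative assumptions, not the paper's algorithm:

```python
def has_cd_peak(wavenumbers, intensities, band=(2030.0, 2400.0), ratio=1.5):
    """Flag a spectrum as containing a C-D band when the mean intensity
    inside 2030-2400 cm-1 exceeds `ratio` times the mean intensity of
    the rest of the spectrum (illustrative threshold)."""
    inside = [y for w, y in zip(wavenumbers, intensities)
              if band[0] <= w <= band[1]]
    outside = [y for w, y in zip(wavenumbers, intensities)
               if not band[0] <= w <= band[1]]
    if not inside or not outside:
        return False
    return sum(inside) / len(inside) > ratio * sum(outside) / len(outside)
```

Applied per located cell, a True result would mark the cell as metabolically active under the antibiotic, i.e., resistant.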
|
29
|
Effects of Mg Doping to a LiCoO 2 Channel on the Synaptic Plasticity of Li Ion-Gated Transistors. ACS APPLIED MATERIALS & INTERFACES 2023; 15:47184-47195. [PMID: 37768881 DOI: 10.1021/acsami.3c07833] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 09/30/2023]
Abstract
Artificial synapses with ideal functionalities are essential in hardware neural networks for energy-efficient analog computing. However, realizing linear and symmetric weight updates in real synaptic devices has proven challenging and ultimately limits the online training capabilities of neural network systems. Herein, we investigate the effect of Mg doping of a LiCoO2 (LCO) channel in a Li ion-gated synaptic transistor on improving long-term and short-term plasticity. Two transistor structures based on a lithium phosphorus oxynitride electrolyte were examined, using undoped LCO and Mg-doped LCO as the channel material between the source and drain electrodes. Mg doping increased the initial channel conductance by three orders of magnitude, probably due to the substitution of Co3+ by Mg2+ and the compensation of hole creation. The doped-channel transistor also showed good retention characteristics and better linearity of long-term potentiation and depression when voltage pulses were applied to the gate electrode. The improved retention and linearity are attributed to an extended range of the insulator-to-conductor transition produced by Mg doping acting together with Li-ion extraction/insertion in the LCO channel. Using the measured synaptic weight updates, artificial neural network simulations demonstrated that the doped-channel transistor achieves an image recognition accuracy of ∼80% for handwritten digits, higher than the ∼65% of the undoped-channel transistor. Mg doping also improved short-term plasticity such as paired-pulse facilitation/depression and Hebbian spike timing-dependent plasticity. These results indicate that elemental doping of the channel of Li ion-gated synaptic transistors could be a useful route to robust neuromorphic systems based on analog computing.
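Linearity of the weight update, the property improved by Mg doping here, is often described with a phenomenological model in which each pulse adds a conductance step that shrinks as the device approaches its maximum. A sketch of that model follows; the constants are illustrative and not fitted to the paper's devices:

```python
import math

def potentiation_curve(n_pulses, g_min=0.0, g_max=1.0, beta=3.0):
    """Phenomenological LTP model: each gate pulse adds a conductance
    step that shrinks exponentially as the device approaches g_max.
    beta = 0 recovers the ideal, fully linear weight update; larger
    beta gives the early saturation seen in nonlinear devices."""
    g = g_min
    curve = [g]
    for _ in range(n_pulses):
        step = ((g_max - g_min) / n_pulses
                * math.exp(-beta * (g - g_min) / (g_max - g_min)))
        g = min(g + step, g_max)
        curve.append(g)
    return curve
```

In neural network simulations of the kind reported above, a smaller effective beta (more linear pulse response) is what translates into higher training and recognition accuracy.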
|
30
|
Using Crowdsourced Food Image Data for Assessing Restaurant Nutrition Environment: A Validation Study. Nutrients 2023; 15:4287. [PMID: 37836570 PMCID: PMC10574450 DOI: 10.3390/nu15194287] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2023] [Revised: 09/18/2023] [Accepted: 10/05/2023] [Indexed: 10/15/2023] Open
Abstract
Crowdsourced online food images, when combined with food image recognition technologies, have the potential to offer a cost-effective and scalable solution for the assessment of the restaurant nutrition environment. While previous research has explored this approach and validated the accuracy of food image recognition technologies, much remains unknown about the validity of crowdsourced food images as the primary data source for large-scale assessments. In this paper, we collect data from multiple sources and comprehensively examine the validity of using crowdsourced food images for assessing the restaurant nutrition environment in the Greater Hartford region. Our results indicate that while crowdsourced food images are useful in terms of the initial assessment of restaurant nutrition quality and the identification of popular food items, they are subject to selection bias on multiple levels and do not fully represent the restaurant nutrition quality or customers' dietary behaviors. If employed, the food image data must be supplemented with alternative data sources, such as field surveys, store audits, and commercial data, to offer a more representative assessment of the restaurant nutrition environment.
|
31
|
Healthcare's New Horizon With ChatGPT's Voice and Vision Capabilities: A Leap Beyond Text. Cureus 2023; 15:e47469. [PMID: 37873042 PMCID: PMC10590619 DOI: 10.7759/cureus.47469] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 10/22/2023] [Indexed: 10/25/2023] Open
Abstract
The integration of artificial intelligence (AI) in healthcare is driving a paradigm shift in medicine. OpenAI's recent augmentation of their Generative Pre-trained Transformer (ChatGPT) large language model (LLM) with voice and image recognition capabilities (OpenAI, Delaware) presents another potentially transformative tool for healthcare. Envision a healthcare setting where professionals engage in dynamic interactions with ChatGPT to navigate the complexities of atypical medical scenarios. In this innovative landscape, practitioners could solicit ChatGPT's expertise for concise summarization of, and insightful extrapolation from, a myriad of web-based resources on similar medical conditions. Furthermore, imagine patients using ChatGPT to identify abnormalities in medical images or skin lesions. While the prospects are diverse, challenges such as suboptimal audio quality and ensuring data security necessitate cautious integration into medical practice. Drawing insights from previous ChatGPT iterations could provide a prudent roadmap for navigating these challenges. This editorial explores possible horizons and potential hurdles of ChatGPT's enhanced functionalities in healthcare, emphasizing the importance of continued refinement and vigilance to maximize benefits while minimizing risks. Through collaborative efforts between AI developers and healthcare professionals, this fusion of AI and healthcare can evolve into enriched patient care and an enhanced medical experience.
|
32
|
Realization of Artificial Neurons and Synapses Based on STDP Designed by an MTJ Device. MICROMACHINES 2023; 14:1820. [PMID: 37893257 PMCID: PMC10609371 DOI: 10.3390/mi14101820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/07/2023] [Revised: 09/22/2023] [Accepted: 09/22/2023] [Indexed: 10/29/2023]
Abstract
As the third-generation neural network, the spiking neural network (SNN) has become one of the most promising neuromorphic computing paradigms for mimicking brain neural networks over the past decade. The SNN shows many advantages in performing classification and recognition tasks in the artificial intelligence field. In the SNN, communication between the pre-synaptic neuron (PRE) and the post-synaptic neuron (POST) is conducted by the synapse. The corresponding synaptic weights depend on the spiking patterns of both the PRE and the POST, and are updated by spike-timing-dependent plasticity (STDP) rules. The emergence and growing maturity of spintronic devices present a new approach for constructing the SNN. In this paper, a novel SNN is proposed, in which both the synapse and the neuron are mimicked with the spin transfer torque magnetic tunnel junction (STT-MTJ) device. The synaptic weight is represented by the conductance of the MTJ device. The mapping of the probabilistic spiking nature of the neuron to the stochastic switching behavior of the MTJ with thermal noise is presented based on the stochastic Landau-Lifshitz-Gilbert (LLG) equation. In this way, a simplified SNN is mimicked with the MTJ device. The function of the mimicked SNN is verified by a handwritten digit recognition task based on the MNIST database.
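The STDP rule referenced in this abstract is usually expressed in its standard pair-based exponential form, sketched below. This is a generic textbook version, not the paper's MTJ-specific implementation; the parameters `a_plus`, `a_minus`, and `tau` are illustrative values.

```python
import math

def stdp_delta_w(dt, a_plus=0.1, a_minus=0.12, tau=20.0):
    """Weight change for a pre/post spike pair separated by dt = t_post - t_pre (ms).

    Pre-before-post (dt > 0) potentiates the synapse; post-before-pre
    (dt < 0) depresses it, each with exponential decay over the time gap.
    """
    if dt > 0:
        return a_plus * math.exp(-dt / tau)   # long-term potentiation
    elif dt < 0:
        return -a_minus * math.exp(dt / tau)  # long-term depression
    return 0.0

# Causal pairing strengthens the synapse, anti-causal pairing weakens it.
ltp = stdp_delta_w(5.0)    # pre fires 5 ms before post -> positive change
ltd = stdp_delta_w(-5.0)   # post fires 5 ms before pre -> negative change
```

In the paper's hardware realization, such a weight change would map onto a conductance change of the MTJ device rather than a floating-point update.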
|
33
|
Application of Deep Learning Techniques for Detection of Pneumothorax in Chest Radiographs. SENSORS (BASEL, SWITZERLAND) 2023; 23:7369. [PMID: 37687825 PMCID: PMC10490570 DOI: 10.3390/s23177369] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/30/2023] [Revised: 08/21/2023] [Accepted: 08/22/2023] [Indexed: 09/10/2023]
Abstract
With the advent of Artificial Intelligence (AI), and even more so recently in the field of Machine Learning (ML), there has been rapid progress across the field. One of the prominent examples is image recognition in the medical category, such as X-ray imaging, Computed Tomography (CT), and Magnetic Resonance Imaging (MRI). It has the potential to alleviate a doctor's heavy workload of sifting through large quantities of images. Due to the rising attention to lung-related diseases, such as pneumothorax and nodules, ML is being incorporated into the field in the hope of alleviating the already strained medical resources. In this study, we proposed a system that can reliably detect pneumothorax. By comparing multiple models and hyperparameter configurations, we recommend a model for hospitals, as its focus on minimizing false positives aligns with the precision required by medical professionals. Through our cooperation with Poh-Ai Hospital, we acquired a total of over 8000 X-ray images, with more than 1000 of them from pneumothorax patients. We hope that by integrating AI systems into the automated process of scanning chest X-ray images for various diseases, more resources will be available in the already strained medical systems. Our proposed system showed that the best model used for transfer learning on our dataset performed with an AP of 51.57 and an AP75 of 61.40, with an accuracy of 93.89%, a false positive rate of 1.12%, and a false negative rate of 4.99%. Based on feedback from practicing doctors, they are more wary of false positives. For their use case, we recommend another model due to its lower false positive rate and higher accuracy compared with other models, which in our tests were 0.88% and 95.68%, respectively, demonstrating the feasibility of the research. This promising result suggests the approach could be applied to other types of diseases and expanded to more hospitals and medical organizations, potentially benefitting more people.
|
34
|
WSCNet: Biomedical Image Recognition for Cell Encapsulated Microfluidic Droplets. BIOSENSORS 2023; 13:821. [PMID: 37622907 PMCID: PMC10452702 DOI: 10.3390/bios13080821] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/19/2023] [Revised: 08/08/2023] [Accepted: 08/12/2023] [Indexed: 08/26/2023]
Abstract
Microfluidic droplets accommodating a single cell as independent microreactors are frequently demanded for single-cell analysis of phenotype and genotype. However, challenges exist in identifying and reducing the co-encapsulation probability (which follows a Poisson distribution) of two or more cells in one droplet. It is of great significance to monitor and control the quantity of encapsulated content inside each droplet. We demonstrated a microfluidic system embedded with a weakly supervised cell counting network (WSCNet) to generate microfluidic droplets, evaluate their quality, and further recognize the locations of encapsulated cells. Here, we systematically verified our approach using encapsulated droplets from three different microfluidic structures. Quantitative experimental results showed that our approach can not only distinguish droplet encapsulations (F1 score > 0.88) but also locate each cell without any supervised location information (accuracy > 89%). The probability of a "single cell in one droplet" encapsulation was systematically verified under different parameters, showing good agreement with the distribution of the passive method (Residual Sum of Squares, RSS < 0.5). This study offers a comprehensive platform for the quantitative assessment of encapsulated microfluidic droplets.
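The Poisson loading statistics the abstract refers to can be made concrete with a short sketch. This is the generic Poisson model for passive encapsulation, not the paper's measured distribution; the loading rate `lam` is an illustrative value.

```python
import math

def poisson_pmf(k, lam):
    """P(exactly k cells in a droplet) when cell loading follows Poisson(lam)."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

def multi_cell_probability(lam):
    """Probability that a droplet co-encapsulates two or more cells."""
    return 1.0 - poisson_pmf(0, lam) - poisson_pmf(1, lam)

# Dilute loading (lam = 0.1): most droplets are empty, single-cell droplets
# are uncommon, and multi-cell droplets are rarer still.
p_single = poisson_pmf(1, 0.1)
p_multi = multi_cell_probability(0.1)
```

This trade-off (dilute loading suppresses multi-cell droplets at the cost of many empty ones) is exactly why counting cells per droplet, as WSCNet does, is useful.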
|
35
|
Experimental Research into the Uniaxial Compressive Strength of Low-Density Reef Limestone Based on Image Recognition. MATERIALS (BASEL, SWITZERLAND) 2023; 16:5465. [PMID: 37570169 PMCID: PMC10420266 DOI: 10.3390/ma16155465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 07/03/2023] [Revised: 07/27/2023] [Accepted: 08/02/2023] [Indexed: 08/13/2023]
Abstract
Low-density reef limestone is widely distributed in tropical oceans; exploring its mechanical properties is of significance to practices in marine foundation engineering. In this research, laboratory experiments on low-density reef limestones with two different types of porous structures were conducted using image recognition methods to study the special mechanical properties of low-density reef limestone. S¯ was defined as the parameter quantifying the pore geometry, and the calculation method of S¯ was optimized based on image recognition data. Finally, the influencing factors of the uniaxial compressive strength (UCS) of low-density reef limestone were analyzed, and a modified formula considering pore structure was proposed. The results indicate the following: image recognition methods proved feasible and convenient for capturing 2D pore geometric information from specimens. The optimization method of S¯ is conducive to improving automatic image recognition accuracy. Low-density reef limestones with different porous structures show small differences in porosity and density, while they exhibit large differences in pore sizes and UCS. The UCS of low-density reef limestone is found to be jointly influenced by pore structure and density (it increases with a decrease in the parameter S¯ and an increase in density). The results may help those investigating the mechanical properties of reef limestone and practices in marine foundation engineering.
|
36
|
Sharing leaky-integrate-and-fire neurons for memory-efficient spiking neural networks. Front Neurosci 2023; 17:1230002. [PMID: 37583415 PMCID: PMC10423932 DOI: 10.3389/fnins.2023.1230002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/27/2023] [Accepted: 07/13/2023] [Indexed: 08/17/2023] Open
Abstract
Spiking Neural Networks (SNNs) have gained increasing attention as energy-efficient neural networks owing to their binary and asynchronous computation. However, their non-linear activation, that is, the Leaky-Integrate-and-Fire (LIF) neuron, requires additional memory to store a membrane voltage to capture the temporal dynamics of spikes. Although the required memory cost for LIF neurons significantly increases as the input dimension grows larger, a technique to reduce memory for LIF neurons has not been explored so far. To address this, we propose a simple and effective solution, EfficientLIF-Net, which shares the LIF neurons across different layers and channels. Our EfficientLIF-Net achieves comparable accuracy with standard SNNs while bringing up to ~4.3× forward memory efficiency and ~21.9× backward memory efficiency for LIF neurons. We conduct experiments on various datasets including CIFAR10, CIFAR100, TinyImageNet, ImageNet-100, and N-Caltech101. Furthermore, we show that our approach also offers advantages on Human Activity Recognition (HAR) datasets, which heavily rely on temporal information. The code has been released at https://github.com/Intelligent-Computing-Lab-Yale/EfficientLIF-Net.
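The membrane voltage that EfficientLIF-Net economizes is the per-neuron state of the LIF dynamics, which a minimal scalar sketch makes explicit. This is the textbook LIF update with a hard reset, not the paper's network; `leak` and `threshold` are illustrative values.

```python
def lif_neuron(inputs, leak=0.9, threshold=1.0):
    """Simulate a leaky-integrate-and-fire neuron over an input sequence.

    The membrane voltage v is the per-neuron state whose storage
    EfficientLIF-Net amortizes by sharing neurons across layers/channels.
    """
    v = 0.0
    spikes = []
    for x in inputs:
        v = leak * v + x          # leaky integration of input current
        if v >= threshold:        # fire when the membrane crosses threshold
            spikes.append(1)
            v = 0.0               # hard reset after a spike
        else:
            spikes.append(0)
    return spikes
```

Because `v` must persist between time steps for every neuron, memory scales with the number of neurons; sharing neurons is what cuts that cost.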
|
37
|
Image recognition of traditional Chinese medicine based on deep learning. Front Bioeng Biotechnol 2023; 11:1199803. [PMID: 37545883 PMCID: PMC10402920 DOI: 10.3389/fbioe.2023.1199803] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/04/2023] [Accepted: 05/17/2023] [Indexed: 08/08/2023] Open
Abstract
Chinese herbal medicine is an essential part of traditional Chinese medicine and herbalism, and is of great significance when combined with modern medicine in treatment. The correct use of Chinese herbal medicine, including its identification and classification, is crucial to the life safety of patients. Recently, deep learning has achieved advanced performance in image classification, and researchers have applied this technology to classify traditional Chinese medicine and its products. Therefore, this paper uses an improved ConvNeXt network to extract features and classify traditional Chinese medicine. The architecture fuses ConvNeXt with the ACMix network to improve ConvNeXt's feature extraction performance. Through data processing and data augmentation techniques, the sample size is indirectly expanded, generalization ability is enhanced, and feature extraction ability is improved. A traditional Chinese medicine classification model is established, and good recognition results are achieved. Finally, the effectiveness of traditional Chinese medicine identification is verified through the established classification model, and network models of different depths are compared to improve the efficiency and accuracy of the model.
|
38
|
A New Target Detection Method of Ferrography Wear Particle Images Based on ECAM-YOLOv5-BiFPN Network. SENSORS (BASEL, SWITZERLAND) 2023; 23:6477. [PMID: 37514771 PMCID: PMC10385517 DOI: 10.3390/s23146477] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 06/17/2023] [Revised: 07/04/2023] [Accepted: 07/14/2023] [Indexed: 07/30/2023]
Abstract
For mechanical equipment, wear particles in the lubrication system during equipment operation can reflect the lubrication condition, wear mechanism, and severity of wear between equipment friction pairs. To solve the problems of false detections and missed detections of small, dense, and overlapping wear particles in current ferrography wear particle detection models in complex oil background environments, a new ferrography wear particle detection network, EYBNet, is proposed. Firstly, the MSRCR algorithm is used to enhance the contrast of wear particle images and reduce the interference of complex lubricant backgrounds. Secondly, under the framework of YOLOv5s, the accuracy of network detection is improved by introducing DWConv, and the accuracy of the entire network is improved by optimizing the loss function of the detection network. Then, by adding an ECAM to the backbone network of YOLOv5s, the saliency of wear particles in the images is enhanced, and the feature expression ability of wear particles in the detection network is strengthened. Finally, the path aggregation network structure in YOLOv5s is replaced with a weighted BiFPN structure to achieve efficient bidirectional cross-scale connections and weighted feature fusion. The experimental results show that the average accuracy is increased by 4.46%, up to 91.3%, compared with YOLOv5s, and the detection speed is 50.5 FPS.
|
39
|
Deep learning for the rapid automatic segmentation of forearm muscle boundaries from ultrasound datasets. Front Physiol 2023; 14:1166061. [PMID: 37520832 PMCID: PMC10374344 DOI: 10.3389/fphys.2023.1166061] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/18/2023] [Accepted: 06/28/2023] [Indexed: 08/01/2023] Open
Abstract
Ultrasound (US) is widely used in the clinical diagnosis and treatment of musculoskeletal diseases. However, the low efficiency and non-uniformity of manual recognition hinder the application and popularization of US for this purpose. Herein, we developed an automatic muscle boundary segmentation tool for US image recognition and tested its accuracy and clinical applicability. Our dataset was constructed from a total of 465 US images of the flexor digitorum superficialis (FDS) from 19 participants (10 men and 9 women, age 27.4 ± 6.3 years). We used the U-net model for US image segmentation. The U-net output often includes several disconnected regions. Anatomically, the target muscle usually has only one connected region. Based on this principle, we designed an algorithm written in C++ to eliminate redundant connected regions in the outputs. The muscle boundary images generated by the tool were compared with those obtained by professionals and junior physicians to analyze their accuracy and clinical applicability. The dataset was divided into five groups for experimentation, and the average Dice coefficient, recall, and accuracy, as well as the intersection over union (IoU) of the prediction set in each group, were all about 90%. Furthermore, we propose a new standard for judging segmentation results. Under this standard, 99% of the 150 images predicted by U-net are rated excellent, which is very close to the segmentation results obtained by professional doctors. In this study, we developed an automatic muscle segmentation tool for US-guided muscle injections. The accuracy of muscle boundary recognition was similar to that of manual labeling by a specialist sonographer, providing a reliable auxiliary tool for clinicians to shorten the US learning cycle, reduce the clinical workload, and improve injection safety.
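The post-processing step described above (keep the single anatomically plausible region, discard the rest) is a standard largest-connected-component filter. The authors implemented theirs in C++; the sketch below is a Python illustration of the same idea using a 4-connected flood fill on a binary mask.

```python
def keep_largest_region(mask):
    """Zero out all but the largest 4-connected foreground region in a binary mask.

    Mirrors the post-processing idea above: a segmentation network may emit
    several disconnected blobs, but anatomically the target muscle is one region.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                # Iterative flood fill collecting one connected region.
                region, stack = [], [(i, j)]
                seen[i][j] = True
                while stack:
                    y, x = stack.pop()
                    region.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                if len(region) > len(best):
                    best = region
    keep = set(best)
    return [[1 if (i, j) in keep else 0 for j in range(w)] for i in range(h)]
```

In practice a library routine (e.g. connected-component labeling in an image-processing toolkit) would replace the hand-rolled flood fill, but the logic is the same.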
|
40
|
A transformer-based approach for early prediction of soybean yield using time-series images. FRONTIERS IN PLANT SCIENCE 2023; 14:1173036. [PMID: 37409295 PMCID: PMC10319415 DOI: 10.3389/fpls.2023.1173036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 02/24/2023] [Accepted: 05/29/2023] [Indexed: 07/07/2023]
Abstract
Crop yield prediction, which provides critical information for management decision-making, is of significant importance in precision agriculture. Traditional manual inspection and calculation are often laborious and time-consuming. For yield prediction using high-resolution images, existing methods, e.g., convolutional neural networks, struggle to model long-range, multi-level dependencies across image regions. This paper proposes a transformer-based approach for yield prediction using early-stage images and seed information. First, each original image is segmented into plant and soil categories. Two vision transformer (ViT) modules are designed to extract features from each category. Then a transformer module is established to deal with the time-series features. Finally, the image features and seed features are combined to estimate the yield. A case study has been conducted using a dataset that was collected during the 2020 soybean-growing season in Canadian fields. Compared with other baseline models, the proposed method can reduce the prediction error by more than 40%. The impact of seed information on predictions is studied both between models and within a single model. The results show that the influence of seed information varies among different plots, but it is particularly important for the prediction of low yields.
|
41
|
STNet: shape and texture joint learning through two-stream network for knowledge-guided image recognition. Front Neurosci 2023; 17:1212049. [PMID: 37397450 PMCID: PMC10309034 DOI: 10.3389/fnins.2023.1212049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2023] [Accepted: 05/31/2023] [Indexed: 07/04/2023] Open
Abstract
Introduction The human brain processes shape and texture information separately through different neurons in the visual system. In intelligent computer-aided imaging diagnosis, pre-trained feature extractors are commonly used in various medical image recognition methods; however, common pre-training datasets such as ImageNet tend to improve the texture representation of a model while causing it to ignore many shape features. Weak shape feature representation is disadvantageous for tasks that focus on shape features in medical image analysis. Methods Inspired by the function of neurons in the human brain, in this paper we propose a shape-and-texture-biased two-stream network to enhance shape feature representation in knowledge-guided medical image analysis. First, the two-stream network's shape-biased stream and texture-biased stream are constructed through classification and segmentation multi-task joint learning. Second, we propose pyramid-grouped convolution to enhance texture feature representation and introduce deformable convolution to enhance shape feature extraction. Third, we use a channel-attention-based feature selection module in shape and texture feature fusion to focus on the key features and eliminate the information redundancy caused by feature fusion. Finally, to address the difficulty of model optimization caused by the imbalance between the numbers of benign and malignant samples in medical images, an asymmetric loss function is introduced to improve the robustness of the model. Results and conclusion We applied our method to the melanoma recognition task on the ISIC-2019 and XJTU-MM datasets, which focus on both the texture and shape of lesions. The experimental results on dermoscopic and pathological image recognition datasets show that the proposed method outperforms the compared algorithms, proving the effectiveness of our method.
|
42
|
Classification of Citrus Huanglongbing Degree Based on CBAM-MobileNetV2 and Transfer Learning. SENSORS (BASEL, SWITZERLAND) 2023; 23:5587. [PMID: 37420753 DOI: 10.3390/s23125587] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/18/2023] [Revised: 06/11/2023] [Accepted: 06/13/2023] [Indexed: 07/09/2023]
Abstract
Citrus has become a pivotal industry for the rapid development of agriculture and increasing farmers' incomes in the main production areas of southern China. Diagnosing and controlling citrus huanglongbing has always been a challenge for fruit farmers. To enable prompt diagnosis of citrus huanglongbing, a new classification model was established based on MobileNetV2 with a convolutional block attention module (CBAM-MobileNetV2) and transfer learning. First, convolution features were extracted using convolution modules to capture high-level object-based information. Second, an attention module was utilized to capture interesting semantic information. Third, the convolution module and attention module were combined to fuse these two types of information. Last, a new fully connected layer and a softmax layer were established. The collected 751 citrus huanglongbing images, with sizes of 3648 × 2736, were divided into early, middle, and late leaf images with different disease degrees, and were augmented to 6008 leaf images with sizes of 512 × 512, including 2360 early, 2024 middle, and 1624 late citrus huanglongbing images. In total, 80% and 20% of the collected citrus huanglongbing images were assigned to the training set and the test set, respectively. The effects of different transfer learning methods, different model training settings, and initial learning rates on model performance were analyzed. The results show that, with the same model and initial learning rate, the transfer learning method of parameter fine-tuning was clearly better than that of parameter freezing, with the recognition accuracy on the test set improving by 1.02~13.6%. The recognition accuracy of the citrus huanglongbing image recognition model based on CBAM-MobileNetV2 and transfer learning was 98.75% at an initial learning rate of 0.001, and the loss value was 0.0748.
The accuracy rates of the MobileNetV2, Xception, and InceptionV3 network models were 98.14%, 96.96%, and 97.55%, respectively, and the effect was not as significant as that of CBAM-MobileNetV2. Therefore, based on CBAM-MobileNetV2 and transfer learning, an image recognition model of citrus huanglongbing images with high recognition accuracy could be constructed.
|
43
|
Recognition of Ellipsoid-like Herbaceous Tibetan Medicinal Materials Using DenseNet with Attention and ILBP-Encoded Gabor Features. ENTROPY (BASEL, SWITZERLAND) 2023; 25:847. [PMID: 37372191 DOI: 10.3390/e25060847] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 04/11/2023] [Revised: 05/08/2023] [Accepted: 05/23/2023] [Indexed: 06/29/2023]
Abstract
Tibetan medicinal materials play a significant role in Tibetan culture. However, some types of Tibetan medicinal materials share similar shapes and colors but possess different medicinal properties and functions. The incorrect use of such medicinal materials may lead to poisoning, delayed treatment, and potentially severe consequences for patients. Historically, the identification of ellipsoid-like herbaceous Tibetan medicinal materials has relied on manual methods, including observation, touch, taste, and smell, which depend heavily on technicians' accumulated experience and are prone to errors. In this paper, we propose an image-recognition method for ellipsoid-like herbaceous Tibetan medicinal materials that combines texture feature extraction and a deep-learning network. We created an image dataset consisting of 3200 images of 18 types of ellipsoid-like Tibetan medicinal materials. Due to the complex backgrounds and the high similarity in the shape and color of the ellipsoid-like herbaceous Tibetan medicinal materials in the images, we conducted a multi-feature fusion experiment on the shape, color, and texture features of these materials. To leverage the importance of texture features, we utilized an improved LBP (local binary pattern) algorithm to encode the texture features extracted by the Gabor algorithm. We fed the final features into the DenseNet network to recognize the images of the ellipsoid-like herbaceous Tibetan medicinal materials. Our approach focuses on extracting important texture information while ignoring irrelevant information such as background clutter to eliminate interference and improve recognition performance. The experimental results show that our proposed method achieved a recognition accuracy of 93.67% on the original dataset and 95.11% on the augmented dataset.
In conclusion, our proposed method could aid in the identification and authentication of ellipsoid-like herbaceous Tibetan medicinal materials, reducing errors and ensuring the safe use of Tibetan medicinal materials in healthcare.
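The improved LBP encoding mentioned above builds on the classic 8-neighbour local binary pattern, which can be sketched as follows. This is the basic textbook operator only, not the paper's improved variant, and the neighbour ordering is one common convention.

```python
def lbp_code(patch):
    """Standard 8-neighbour local binary pattern code for a 3x3 grayscale patch.

    Each neighbour >= the centre pixel contributes one bit; the resulting
    0-255 code summarizes local texture, the kind of descriptor the paper
    refines and applies to Gabor-filtered responses.
    """
    center = patch[1][1]
    # Clockwise neighbour order starting at the top-left pixel.
    coords = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (y, x) in enumerate(coords):
        if patch[y][x] >= center:
            code |= 1 << bit
    return code
```

A histogram of these codes over an image (or over Gabor filter responses, as in the paper) then serves as the texture feature vector.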
|
44
|
Interpretation of EKG with Image Recognition and Convolutional Neural Networks. Curr Probl Cardiol 2023; 48:101744. [PMID: 37084992 DOI: 10.1016/j.cpcardiol.2023.101744] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2023] [Accepted: 04/14/2023] [Indexed: 04/23/2023]
Abstract
Electrocardiograms (EKGs) form the backbone of all cardiovascular diagnosis, treatment, and follow-up. Given the pivotal role they play in modern medicine, there have been multiple efforts to computerize EKG interpretation with algorithms to improve efficiency and accuracy. Unfortunately, many of these algorithms are machine-specific and run on proprietary signals generated by that machine, and hence are not generalizable. We propose the development of an image recognition model that can be used to read standard EKG strips. A convolutional neural network (CNN) was trained to classify 12-lead EKGs among seven clinically important diagnostic classes. An austere variation of the MobileNetV3 model was trained from the ground up on a publicly available labeled training set. The precision per class varies from 52% to 91%. This is a novel approach that frames EKG interpretation as an image recognition problem.
|
45
|
Geriatric Care Management System Powered by the IoT and Computer Vision Techniques. Healthcare (Basel) 2023; 11:healthcare11081152. [PMID: 37107987 PMCID: PMC10138364 DOI: 10.3390/healthcare11081152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/22/2023] [Revised: 04/03/2023] [Accepted: 04/13/2023] [Indexed: 04/29/2023] Open
Abstract
The digitalisation of geriatric care refers to the use of emerging technologies to manage and provide person-centered care to the elderly by collecting patients' data electronically and using them to streamline the care process, which improves the overall quality, accuracy, and efficiency of healthcare. In many countries, healthcare providers still rely on the manual measurement of bioparameters, inconsistent monitoring, and paper-based care plans to manage and deliver care to elderly patients. This can lead to a number of problems, including incomplete and inaccurate record-keeping, errors, and delays in identifying and resolving health problems. The purpose of this study is to develop a geriatric care management system that combines signals from various wearable sensors, noncontact measurement devices, and image recognition techniques to monitor and detect changes in a person's health status. The system relies on deep learning algorithms and the Internet of Things (IoT) to identify the patient and their six most pertinent poses. In addition, an algorithm has been developed to monitor changes in the patient's position over longer periods of time, which could be important for detecting health problems in a timely manner and taking appropriate measures. Finally, based on expert knowledge and a priori rules integrated into a decision-tree-based model, the automated final decision on the status of the nursing care plan is generated to support nursing staff.
|
46
|
Cloud Based Fault Diagnosis by Convolutional Neural Network as Time-Frequency RGB Image Recognition of Industrial Machine Vibration with Internet of Things Connectivity. SENSORS (BASEL, SWITZERLAND) 2023; 23:3755. [PMID: 37050816 PMCID: PMC10099050 DOI: 10.3390/s23073755] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 03/05/2023] [Revised: 03/30/2023] [Accepted: 03/31/2023] [Indexed: 06/19/2023]
Abstract
The human-centric and resilient European industry called Industry 5.0 requires long machine lifetimes to reduce electronic waste. An appropriate way to handle this problem is to apply a diagnostic system capable of remotely detecting, isolating, and identifying faults. The authors present the usage of the HTTP/1.1 protocol for batch processing as a fault diagnosis server. Data are sent by a microcontroller HTTP client in JSON format to the diagnosis server. Moreover, the MQTT protocol was used for stream (micro-batch) processing from the microcontroller client to two fault diagnosis clients. The first fault diagnosis MQTT client uses only frequency data for evaluation. The authors' enhancement to the standard fast Fourier transform (FFT) was their usage of sliding discrete Fourier transforms (rSDFT, mSDFT, gSDFT, and oSDFT), which allow the spectrum to be updated recursively from a new sample in the time domain and the previous results in the frequency domain. This approach reduces the computational cost. The second MQTT client approach for fault diagnosis uses the short-time Fourier transform (STFT) to transform IMU 6-DOF sensor data into six spectrograms that are combined into an RGB image. All three-axis accelerometer and three-axis gyroscope data are used to obtain a time-frequency RGB image. The diagnosis of the machine is performed by a trained convolutional neural network suitable for RGB image recognition. The prediction result is returned as a JSON object with the predicted state and the probability of each state. For HTTP, the fault diagnosis result is sent in the response, and for MQTT, it is sent to a prediction topic. Both protocols and both proposed approaches are suitable for fault diagnosis based on the mechanical vibration of rotary machines and were tested in a demonstration.
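The recursive spectrum update the abstract describes is, in its simplest form, the textbook sliding-DFT recurrence X_k <- (X_k - x_old + x_new) * W^k, where x_old is the sample leaving the window and x_new the sample entering it. The sketch below illustrates that basic recurrence; the paper's rSDFT/mSDFT/gSDFT/oSDFT variants add stability and efficiency refinements on top of it.

```python
import cmath

def sliding_dft_update(spectrum, x_old, x_new, n_bins):
    """One sliding-DFT step: each bin is updated from the sample entering
    the window (x_new) and the sample leaving it (x_old), avoiding a full
    FFT recomputation per sample."""
    return [
        (spectrum[k] - x_old + x_new) * cmath.exp(2j * cmath.pi * k / n_bins)
        for k in range(n_bins)
    ]

def sdft(signal, n_bins):
    """Run the recurrence over a signal; returns the DFT of the last window."""
    window = [0.0] * n_bins      # the samples currently inside the window
    spectrum = [0j] * n_bins     # DFT of an all-zero window is all zeros
    for x in signal:
        spectrum = sliding_dft_update(spectrum, window[0], x, n_bins)
        window = window[1:] + [x]
    return spectrum
```

Each step costs O(N) instead of the O(N log N) of a full FFT, which is the computational saving the authors exploit on a microcontroller client.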
|
47
|
Equity evaluation of urban green space in the main urban area of Wuhan based on green view index. YING YONG SHENG TAI XUE BAO = THE JOURNAL OF APPLIED ECOLOGY 2023; 34:1083-1090. [PMID: 37078328 DOI: 10.13287/j.1001-9332.202304.025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Subscribe] [Scholar Register] [Indexed: 04/21/2023]
Abstract
Green space is a form of public resource welfare. Evaluating green space equity based on the green view index (GVI) is important for ensuring the equitable distribution of green resources. Taking the central urban area of Wuhan as the research object, and based on multi-source data such as Baidu Street View Map, Baidu Thermal Map, and satellite remote sensing images, we evaluated the equity of the spatial distribution of GVI in Wuhan using locational entropy, the Gini coefficient, and the Lorenz curve. The results showed that 87.6% of the sampled points in the central urban area of Wuhan fell below the "poor" green view level, concentrated mainly in the Wuhan Iron and Steel industrial base of Qingshan District and south of Yandong Lake. Points reaching an excellent level were the fewest (0.4%), concentrated mainly around the East Lake. The overall Gini coefficient of GVI in the central urban area of Wuhan was 0.49, indicating that the distribution of GVI was heterogeneous. The Gini coefficient of Hongshan District was the largest at 0.64, indicating a huge gap in the distribution of GVI, while that of Jianghan District was the smallest at 0.47, still indicating a large gap. Low-entropy areas were the most common in the central urban area of Wuhan (29.7%) and high-entropy areas the least common (15.4%). Two-tier differences in entropy distribution existed within Hongshan District, Qingshan District, and Wuchang District. The nature of land use and the role of linear greenery were the main factors affecting green space equity in the study area. Our results could provide a theoretical basis and planning reference for optimizing urban green space layouts.
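The Gini coefficient used in this analysis can be computed directly from the Lorenz-curve definition. This is the standard rank-weighted closed form for sample data, shown as a generic sketch rather than the study's own implementation.

```python
def gini(values):
    """Gini coefficient of a non-negative distribution.

    0 means perfect equality (everyone has the same value); values approach 1
    as the distribution concentrates in few observations. Uses the standard
    closed form G = sum_i (2i - n - 1) * x_i / (n * sum(x)) with x sorted
    ascending and i running from 1 to n.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if total == 0:
        return 0.0
    cum = sum((2 * (i + 1) - n - 1) * x for i, x in enumerate(xs))
    return cum / (n * total)
```

Applied per district to point-level GVI values, this yields figures directly comparable to the 0.47-0.64 range reported above.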
|
48
|
A grid management system for COVID-19 antigen detection based on image recognition. JOURNAL OF RADIATION RESEARCH AND APPLIED SCIENCES 2023:100563. [PMCID: PMC10027961 DOI: 10.1016/j.jrras.2023.100563] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 03/24/2023]
Abstract
Objective To develop a SARS-CoV-2 antigen detection management system for Chinese residents under community grid management, supported by "health information technology" and "neural network image recognition", so as to fully exploit the advantages of "grid management". The system is applied to the normalized prevention and control of the COVID-19 epidemic. Methods The image recognition model was built on deep learning and convolutional neural network (CNN) algorithms. An improved Canny edge detection algorithm was used to detect and locate image edges, after which image segmentation and judgment-value calculation were completed using the projection method. System construction was completed in combination with the grid number design. Results Testing confirmed the accuracy of the algorithm, which proved robust with small error. Based on the image recognition model, a SARS-CoV-2 antigen detection management system was developed covering user login, test-strip image upload, test-strip management, grid management, grid warning, and regional traffic management. Conclusions Antigen detection is an important supplementary means of COVID-19 epidemic prevention and control in the new stage. The image-recognition-based system for Chinese residents under community grid management enables mobile communication devices to recognize images of SARS-CoV-2 antigen test results, which helps to form a grid management mode for the epidemic and improve the management framework for epidemic monitoring, detection, early warning, prevention, and control.
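The projection step in the Methods can be illustrated as follows: after edge detection and binarization, foreground pixels are summed per row, and contiguous rows whose sums exceed a threshold are taken as candidate control/test lines. A simplified sketch (the threshold and the toy 0/1 image are assumptions for illustration, not the paper's algorithm verbatim):

```python
def horizontal_projection(binary):
    """Sum foreground (1) pixels in each row of a binarized strip image."""
    return [sum(row) for row in binary]

def find_bands(projection, threshold):
    """Contiguous row ranges whose projection reaches the threshold --
    candidate control/test lines on the antigen test strip."""
    bands, start = [], None
    for i, value in enumerate(projection):
        if value >= threshold and start is None:
            start = i
        elif value < threshold and start is not None:
            bands.append((start, i - 1))
            start = None
    if start is not None:
        bands.append((start, len(projection) - 1))
    return bands

# toy binarized strip: dark bands at rows 2-3 (control) and row 6 (test)
strip = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 0],
         [0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 1], [0, 0, 0, 0]]
print(find_bands(horizontal_projection(strip), 2))  # → [(2, 3), (6, 6)]
```

Two detected bands (control plus test line) would be read as a positive result; a single band (control only) as negative.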
|
49
|
Image Adversarial Example Generation Method Based on Adaptive Parameter Adjustable Differential Evolution. ENTROPY (BASEL, SWITZERLAND) 2023; 25:487. [PMID: 36981373 PMCID: PMC10047979 DOI: 10.3390/e25030487] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 12/19/2022] [Revised: 03/01/2023] [Accepted: 03/08/2023] [Indexed: 06/18/2023]
Abstract
Adversarial example generation techniques for neural network models have proliferated in recent years. In adversarial attack schemes for image recognition models, it is challenging to achieve a high attack success rate with very few pixel modifications. To address this issue, this paper proposes an adversarial example generation method based on differential evolution with adaptively adjustable parameters. The method dynamically adjusts algorithm performance by tuning the control parameters and operation strategies of the adaptive differential evolution algorithm while searching for the optimal perturbation. As a result, it generates adversarial examples with a high success rate while modifying only a very few pixels. The attack effectiveness of the method is confirmed on the CIFAR-10 and MNIST datasets. The experimental results show that our method achieves a higher attack success rate than the One Pixel Attack based on conventional differential evolution. In addition, it requires significantly less perturbation to succeed than global or local perturbation attacks, and is more resistant to perception and detection.
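The underlying optimizer is differential evolution. A minimal DE/rand/1/bin loop with per-generation dithered control parameters is sketched below (a crude stand-in for the paper's adaptive parameter adjustment, demonstrated on a toy objective rather than an actual pixel-perturbation attack):

```python
import random

def differential_evolution(objective, bounds, pop_size=20, iters=200, seed=0):
    """Minimal DE/rand/1/bin with per-generation dithered F and CR
    (a simplified stand-in for adaptive parameter control)."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [objective(x) for x in pop]
    for _ in range(iters):
        f = rng.uniform(0.4, 0.9)   # dithered mutation factor
        cr = rng.uniform(0.5, 1.0)  # dithered crossover rate
        for i in range(pop_size):
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)  # ensure at least one mutated gene
            trial = []
            for j in range(dim):
                if j == jrand or rng.random() < cr:
                    v = pop[a][j] + f * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    trial.append(min(max(v, lo), hi))
                else:
                    trial.append(pop[i][j])
            tf = objective(trial)
            if tf <= fit[i]:  # greedy selection
                pop[i], fit[i] = trial, tf
    best = min(range(pop_size), key=lambda i: fit[i])
    return pop[best], fit[best]

# toy objective: the 3-D sphere function, minimized at the origin
best, val = differential_evolution(lambda x: sum(v * v for v in x),
                                   [(-5.0, 5.0)] * 3)
```

In an actual few-pixel attack, each candidate vector would encode the coordinates and color of the perturbed pixels, and the objective would be the target model's confidence in the true class.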
|
50
|
Artificial Intelligence-Based Smart Quality Inspection for Manufacturing. MICROMACHINES 2023; 14:570. [PMID: 36984977 PMCID: PMC10058274 DOI: 10.3390/mi14030570] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Figures] [Subscribe] [Scholar Register] [Received: 01/20/2023] [Revised: 02/18/2023] [Accepted: 02/24/2023] [Indexed: 06/18/2023]
Abstract
In today's era, monitoring the health of the manufacturing environment has become essential to prevent unforeseen repairs and shutdowns and to detect defective products that could incur large losses. Data-driven techniques and advances in sensor technology with the Internet of Things (IoT) have made real-time tracking of systems a reality. The health of a product can also be continuously assessed throughout the manufacturing lifecycle using Quality Control (QC) measures. Quality inspection is one of the critical processes in which the product is evaluated and deemed acceptable or rejected. The visual, or final, inspection process involves a human operator examining the product by sight to ascertain its status. However, several factors impact the visual inspection process, resulting in an overall inspection accuracy of around 80% in industry. Given the goal of 100% inspection in advanced manufacturing systems, manual visual inspection is both time-consuming and costly. Computer Vision (CV) based algorithms have helped automate parts of the visual inspection process, but challenges remain. This paper presents an Artificial Intelligence (AI) based approach to visual inspection using Deep Learning (DL). The approach includes a custom Convolutional Neural Network (CNN) for inspection and a computer application that can be deployed on the shop floor to make the inspection process user-friendly. The inspection accuracy of the proposed model is 99.86% on image data of casting products.
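The core building block of the custom CNN above is the 2-D convolution (implemented as cross-correlation in most deep-learning frameworks). A minimal pure-Python sketch of that single operation (illustrative only; the paper's actual model is a trained multi-layer network):

```python
def conv2d(image, kernel):
    """Valid-mode 2-D cross-correlation of a single-channel image."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(ih - kh + 1):
        row = []
        for c in range(iw - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

# a horizontal-difference kernel responds strongly at a vertical edge,
# the kind of local feature a defect-detection CNN learns to extract
print(conv2d([[0, 0, 9, 9],
              [0, 0, 9, 9]], [[1, -1]]))  # → [[0, -9, 0], [0, -9, 0]]
```

Stacking many such learned kernels, with nonlinearities and pooling between layers, is what lets the inspection network separate sound castings from defective ones.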
|