1
Lyu Z, Rodrigues MRD. Exploring the Impact of Additive Shortcuts in Neural Networks via Information Bottleneck-like Dynamics: From ResNet to Transformer. ENTROPY (BASEL, SWITZERLAND) 2024; 26:974. [PMID: 39593918 PMCID: PMC11592645 DOI: 10.3390/e26110974] [Received: 10/30/2024] [Revised: 11/10/2024] [Accepted: 11/12/2024] [Indexed: 11/28/2024]
Abstract
Deep learning has made significant strides, driving advances in areas like computer vision, natural language processing, and autonomous systems. In this paper, we further investigate the implications of the role of additive shortcut connections, focusing on models such as ResNet, Vision Transformers (ViTs), and MLP-Mixers, given that they are essential in enabling efficient information flow and mitigating optimization challenges such as vanishing gradients. In particular, capitalizing on our recent information bottleneck approach, we analyze how additive shortcuts influence the fitting and compression phases of training, crucial for generalization. We leverage Z-X and Z-Y measures as practical alternatives to mutual information for observing these dynamics in high-dimensional spaces. Our empirical results demonstrate that models with identity shortcuts (ISs) often skip the initial fitting phase and move directly into the compression phase, while non-identity shortcut (NIS) models follow the conventional two-phase process. Furthermore, we explore how IS models are still able to compress effectively, maintaining their generalization capacity despite bypassing the early fitting stages. These findings offer new insights into the dynamics of shortcut connections in neural networks, contributing to the optimization of modern deep learning architectures.
Affiliation(s)
- Zhaoyan Lyu
- Department of Electronic and Electrical Engineering, University College London, London WC1E 6BT, UK
2
Shen C, Zhan W, Tang J, Wu Z, Xu B, Zhao C, Wang Z. Universal Deoxidation of Semiconductor Substrates Assisted by Machine Learning and Real-Time Feedback Control. ACS APPLIED MATERIALS & INTERFACES 2024; 16:18213-18221. [PMID: 38554077 DOI: 10.1021/acsami.4c01765] [Indexed: 04/01/2024]
Abstract
Substrate oxidation is inevitable when exposed to ambient atmosphere during semiconductor manufacturing, which is detrimental to the fabrication of state-of-the-art devices. Optimizing the deoxidation process in molecular beam epitaxy (MBE) for random substrates poses a multidimensional challenge and is sometimes controversial. Due to variations in substrates and growth processes, the determination of the deoxidation condition heavily relies on the individual's expertise, yielding inconsistent results. This study employs a machine learning model that integrates interpolation and vision transformer (Interpolation-ViT) techniques. The model utilizes reflection high-energy electron diffraction videos as input to predict the status of the substrate, enabling automated deoxidation within a controlled architecture for various substrates. Furthermore, we highlight the potential of models trained on data from specific MBE equipment to achieve high-accuracy deployment on different pieces of equipment. In contrast to traditional methods, our approach holds exceptional value, as it standardizes deoxidation temperatures across diverse equipment and substrates. This significantly advances the standardization of the semiconductor process. The concepts and methods presented are expected to revolutionize semiconductor manufacturing processes in the optoelectronic and microelectronic industries.
Affiliation(s)
- Chao Shen
- School of Physics Science and Technology, Xinjiang University, Urumqi, Xinjiang 830046, China
- Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
- Wenkang Zhan
- College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 101804, China
- Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
- Jian Tang
- School of Physical and Electronic Engineering, Yancheng Teachers University, Yancheng 224002, China
- Zhaofeng Wu
- School of Physics Science and Technology, Xinjiang University, Urumqi, Xinjiang 830046, China
- Bo Xu
- College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 101804, China
- Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
- Chao Zhao
- College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 101804, China
- Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China
- Zhanguo Wang
- College of Materials Science and Opto-Electronic Technology, University of Chinese Academy of Sciences, Beijing 101804, China
- Laboratory of Solid State Optoelectronics Information Technology, Institute of Semiconductors, Chinese Academy of Sciences, Beijing 100083, China