Authors
- Lu Wei
(School of Software, Northwestern Polytechnical University, Xi’an 710072, China;
Third Technical Department, Xi’an Microelectronics Technology Institute, Xi’an 710065, China)
- Zhong Ma
(Third Technical Department, Xi’an Microelectronics Technology Institute, Xi’an 710065, China)
- Chaojie Yang
(Third Technical Department, Xi’an Microelectronics Technology Institute, Xi’an 710065, China)
- Qin Yao
(School of Software, Northwestern Polytechnical University, Xi’an 710072, China;
Third Technical Department, Xi’an Microelectronics Technology Institute, Xi’an 710065, China)
- Wei Zheng
(School of Software, Northwestern Polytechnical University, Xi’an 710072, China)
Abstract
Quantization plays a crucial role in deploying neural network models on resource-limited hardware. However, current quantization methods suffer from large accuracy loss and poor generalization on complex tasks, which hinders the practical application of deep learning and large language models in smart systems. The root problem is a limited understanding of how quantization affects accuracy, compounded by a lack of effective approaches for evaluating the performance of quantized models. To address these concerns, we develop a novel method that leverages the self-attention mechanism: using a transformer encoder and decoder, it predicts a quantized model’s accuracy from a single representative image drawn from the test set. Across three types of neural network models, the prediction error of the quantization accuracy is 2.44%. The proposed method enables rapid performance assessment of quantized models during the development stage, thereby facilitating the optimization of quantization parameters and promoting the practical application of neural network models.
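To make the abstract's core idea concrete, the sketch below shows one plausible shape for such a predictor: a transformer encoder-decoder that consumes patch features from a single representative image and decodes a scalar accuracy estimate through a learned query token. This is a minimal illustration assuming a standard PyTorch `nn.Transformer`; the patch embedding, query-token design, sigmoid head, and all dimensions are assumptions for exposition, not the authors' published architecture.

```python
import torch
import torch.nn as nn

class AccuracyPredictor(nn.Module):
    """Hypothetical sketch: predict a quantized model's accuracy
    from one representative test image (as flattened patches)."""

    def __init__(self, d_model=128, nhead=4, num_layers=2,
                 patch_dim=48, num_patches=64):
        super().__init__()
        # Embed flattened image patches into the transformer's input space.
        self.patch_embed = nn.Linear(patch_dim, d_model)
        self.pos_embed = nn.Parameter(torch.zeros(1, num_patches, d_model))
        self.transformer = nn.Transformer(
            d_model=d_model, nhead=nhead,
            num_encoder_layers=num_layers, num_decoder_layers=num_layers,
            batch_first=True,
        )
        # A single learned query token decodes into the accuracy estimate.
        self.query = nn.Parameter(torch.zeros(1, 1, d_model))
        self.head = nn.Sequential(nn.Linear(d_model, 1), nn.Sigmoid())

    def forward(self, patches):
        # patches: (batch, num_patches, patch_dim), taken from one image
        x = self.patch_embed(patches) + self.pos_embed
        q = self.query.expand(x.size(0), -1, -1)
        out = self.transformer(src=x, tgt=q)  # (batch, 1, d_model)
        return self.head(out[:, 0])           # predicted accuracy in [0, 1]

# Usage: dummy patches standing in for a representative test image.
predictor = AccuracyPredictor()
patches = torch.randn(1, 64, 48)
print(f"Predicted quantized-model accuracy: {predictor(patches).item():.3f}")
```

In such a setup the model would be trained by regressing the head's output against measured accuracies of quantized models, so that at development time a single forward pass replaces a full test-set evaluation.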
Suggested Citation
Lu Wei & Zhong Ma & Chaojie Yang & Qin Yao & Wei Zheng, 2025.
"Utilizing the Attention Mechanism for Accuracy Prediction in Quantized Neural Networks,"
Mathematics, MDPI, vol. 13(5), pages 1-20, February.
Handle:
RePEc:gam:jmathe:v:13:y:2025:i:5:p:732-:d:1598563