Volume 33 Issue 1
Jan.  2024
Jian ZHOU, Yuwen JIANG, Lijie XU, et al., “Echo State Network Based on Improved Knowledge Distillation for Edge Intelligence,” Chinese Journal of Electronics, vol. 33, no. 1, pp. 101–111, 2024, doi: 10.23919/cje.2022.00.292

Echo State Network Based on Improved Knowledge Distillation for Edge Intelligence

doi: 10.23919/cje.2022.00.292
  • Author Bio:

    Jian ZHOU was born in 1984. He received the Ph.D. degree from Nanjing University of Science and Technology, Nanjing, China, in 2012. He is currently a Professor at Nanjing University of Posts and Telecommunications, Nanjing, China. His recent research interests include edge intelligence, edge computing and time-series prediction. (Email: zhoujian@njupt.edu.cn)

    Yuwen JIANG was born in 1998. He received the B.S. degree from Nanjing University of Posts and Telecommunications, Nanjing, China, in 2020. He is currently pursuing the M.S. degree with the College of Computer, Nanjing University of Posts and Telecommunications, Nanjing, China. His recent research interests include edge intelligence, edge computing and wireless sensor networks. (Email: 1220045023@njupt.edu.cn)

    Lijie XU was born in 1983. He received the Ph.D. degree from Nanjing University, Nanjing, China, in 2014. He is currently an Associate Professor at Nanjing University of Posts and Telecommunications, Nanjing, China. His research interests include wireless rechargeable sensor networks, edge computing, and mobile and distributed computing. (Email: ljxu@njupt.edu.cn)

    Lu ZHAO was born in 1990. He received the Ph.D. degree from Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 2021. He is currently a Lecturer at Nanjing University of Posts and Telecommunications, Nanjing, China. His recent research interests include service computing, crowdsensing and edge computing. (Email: luzhao@njupt.edu.cn)

    Fu XIAO was born in 1980. He received the Ph.D. degree from Nanjing University of Science and Technology, Nanjing, China, in 2007. He is currently a Professor at Nanjing University of Posts and Telecommunications, Nanjing, China. His recent research interests include mobile computing, edge computing and Internet of Things. (Email: xiaof@njupt.edu.cn)

  • Corresponding author: Email: zhoujian@njupt.edu.cn
  • Received Date: 2022-08-28
  • Accepted Date: 2023-02-14
  • Available Online: 2023-07-13
  • Publish Date: 2024-01-05
  • Echo state network (ESN), a novel artificial neural network, has drawn much attention for time series prediction in edge intelligence. However, ESN has limited long-term memory, which degrades its prediction performance, and it incurs a high computational overhead when deployed on edge devices. We first introduce knowledge distillation into reservoir structure optimization, and then propose an echo state network based on improved knowledge distillation (ESN-IKD) for edge intelligence to improve prediction performance and reduce computational overhead. The ESN-IKD model is constructed with a classic ESN as the student network, a long short-term memory (LSTM) network as the teacher network, and an ESN with a double-loop reservoir structure as the assistant network; the student network learns the long-term memory capability of the teacher network with the help of the assistant network. The proposed training algorithm corrects the learning direction through the assistant network and eliminates redundant knowledge through iterative pruning, which solves the error-learning and redundant-learning problems of the traditional knowledge distillation process. Extensive simulations show that ESN-IKD achieves good time series prediction performance in both long-term and short-term memory, with a lower computational overhead.
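To make the student side of the architecture concrete, the following is a minimal sketch of a leaky-integrator ESN with a closed-form ridge-regression readout, plus a soft-target blend in the spirit of output-level distillation for regression. All hyperparameters (reservoir size, spectral radius, leak rate, the mixing weight `alpha`) and the helper `distill_targets` are illustrative assumptions, not values or components taken from the paper; the assistant network, learning-direction correction, and iterative pruning are not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

class ESN:
    """Minimal leaky-integrator echo state network (the 'student' role)."""

    def __init__(self, n_in, n_res, n_out, spectral_radius=0.9, leak=0.3, ridge=1e-6):
        self.W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
        W = rng.uniform(-0.5, 0.5, (n_res, n_res))
        # Rescale to the target spectral radius (common heuristic for the echo state property).
        W *= spectral_radius / max(abs(np.linalg.eigvals(W)))
        self.W, self.leak, self.ridge = W, leak, ridge
        self.W_out = np.zeros((n_out, n_res))

    def _states(self, U):
        # Drive the fixed reservoir with the input sequence and collect states.
        x = np.zeros(self.W.shape[0])
        states = []
        for u in U:
            pre = self.W_in @ np.atleast_1d(u) + self.W @ x
            x = (1 - self.leak) * x + self.leak * np.tanh(pre)
            states.append(x.copy())
        return np.array(states)

    def fit(self, U, Y):
        # Only the readout is trained, via closed-form ridge regression.
        X = self._states(U)
        Y = np.asarray(Y, dtype=float).reshape(len(U), -1)
        A = X.T @ X + self.ridge * np.eye(X.shape[1])
        self.W_out = np.linalg.solve(A, X.T @ Y).T
        return self

    def predict(self, U):
        return self._states(U) @ self.W_out.T


def distill_targets(y_true, y_teacher, alpha=0.7):
    """Soft regression targets: blend ground truth with a teacher's predictions.
    `alpha` is an assumed mixing weight for illustration only."""
    return alpha * np.asarray(y_true) + (1 - alpha) * np.asarray(y_teacher)


# One-step-ahead prediction of a sine wave as a smoke test.
t = np.arange(400)
u, y = np.sin(0.1 * t), np.sin(0.1 * (t + 1))
esn = ESN(1, 100, 1).fit(u[:300], y[:300])
pred = esn.predict(u[300:]).ravel()
mse = np.mean((pred[50:] - y[350:]) ** 2)  # skip a 50-step washout transient
```

In a distillation setup, `fit` would be called with `distill_targets(y, y_lstm)` in place of `y`, where `y_lstm` are the teacher LSTM's predictions; a pruning step between rounds could, for instance, zero the smallest-magnitude reservoir weights.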
  • [1]
    X. F. Wang, Y. W. Han, V. C. M. Leung, et al., “Convergence of edge computing and deep learning: A comprehensive survey,” IEEE Communications Surveys & Tutorials, vol. 22, no. 2, pp. 869–904, 2020. doi: 10.1109/COMST.2020.2970550
    [2]
    R. Gu, Y. Q. Chen, S. Liu, et al., “Liquid: Intelligent resource estimation and network-efficient scheduling for deep learning jobs on distributed GPU clusters,” IEEE Transactions on Parallel and Distributed Systems, vol. 33, no. 11, pp. 2808–2820, 2022. doi: 10.1109/TPDS.2021.3138825
    [3]
    T. Wang, Y. Li, W. W. Fang, et al., “A comprehensive trustworthy data collection approach in sensor-cloud systems,” IEEE Transactions on Big Data, vol. 8, no. 1, pp. 140–151, 2022. doi: 10.1109/TBDATA.2018.2811501
    [4]
    H. Jaeger, “Reservoir riddles: Suggestions for echo state network research,” in Proceedings of 2005 IEEE International Joint Conference on Neural Networks, Montreal, QC, Canada, pp. 1460–1462, 2005.
    [5]
    H. G. Zhang, Z. S. Wang, and D. R. Liu, “A comprehensive review of stability analysis of continuous-time recurrent neural networks,” IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 7, pp. 1229–1262, 2014. doi: 10.1109/TNNLS.2014.2317880
    [6]
    Y. Bengio, P. Simard, and P. Frasconi, “Learning long-term dependencies with gradient descent is difficult,” IEEE Transactions on Neural Networks, vol. 5, no. 2, pp. 157–166, 1994. doi: 10.1109/72.279181
    [7]
    Q. Y. An, K. J. Bai, L. J. Liu, et al., “A unified information perceptron using deep reservoir computing,” Computers & Electrical Engineering, vol. 85, article no. 106705, 2020. doi: 10.1016/j.compeleceng.2020.106705
    [8]
    O. Orang, P. C. de Lima e Silva, R. Silva, et al., “Randomized high order fuzzy cognitive maps as reservoir computing models: A first introduction and applications,” Neurocomputing, vol. 512, pp. 153–177, 2022. doi: 10.1016/j.neucom.2022.09.030
    [9]
    M. L. Xu, M. Han, and H. F. Lin, “Wavelet-denoising multiple echo state networks for multivariate time series prediction,” Information Sciences, vol. 465, pp. 439–458, 2018. doi: 10.1016/j.ins.2018.07.015
    [10]
    H. C. Chen and D. Q. Wei, “Chaotic time series prediction using echo state network based on selective opposition grey wolf optimizer,” Nonlinear Dynamics, vol. 104, no. 4, pp. 3925–3935, 2021. doi: 10.1007/s11071-021-06452-w
    [11]
    G. Hinton, O. Vinyals, and J. Dean, “Distilling the knowledge in a neural network,” Computer Science, vol. 14, no. 7, pp. 38–39, 2015. doi: 10.1155/2021/4319074
    [12]
    J. P. Gou, B. S. Yu, S. J. Maybank, et al., “Knowledge distillation: A survey,” International Journal of Computer Vision, vol. 129, no. 6, pp. 1789–1819, 2021. doi: 10.1007/s11263-021-01453-z
    [13]
    T. Furlanello, Z. C. Lipton, M. Tschannen, et al., “Born-again neural networks,” in Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, pp. 1602–1611, 2018.
    [14]
    A. Romero, N. Ballas, S. E. Kahou, et al., “FitNets: Hints for thin deep nets,” in Proceedings of the 3rd Conference on Learning Representations, San Diego, CA, USA, pp. 1–13, 2015.
    [15]
    J. Yim, D. Joo, J. Bae, et al., “A gift from knowledge distillation: Fast optimization, network minimization and transfer learning,” in Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, pp. 7130–7138, 2017.
    [16]
    S. G. Deng, H. L. Zhao, W. J. Fang, et al., “Edge intelligence: The confluence of edge computing and artificial intelligence,” IEEE Internet of Things Journal, vol. 7, no. 8, pp. 7457–7469, 2020. doi: 10.1109/JIOT.2020.2984887
    [17]
    Y. F. Shen, Y. M. Shi, J. Zhang, et al., “Graph neural networks for scalable radio resource management: Architecture design and theoretical analysis,” IEEE Journal on Selected Areas in Communications, vol. 39, no. 1, pp. 101–115, 2021. doi: 10.1109/JSAC.2020.3036965
    [18]
    T. Wang, Y. C. Lu, J. H. Wang, et al., “EIHDP: Edge-intelligent hierarchical dynamic pricing based on cloud-edge-client collaboration for IoT systems,” IEEE Transactions on Computers, vol. 70, no. 8, pp. 1285–1298, 2021. doi: 10.1109/TC.2021.3060484
    [19]
    J. F. Chen and X. Wang, “Non-intrusive load monitoring using gramian angular field color encoding in edge computing,” Chinese Journal of Electronics, vol. 31, no. 4, pp. 595–603, 2022. doi: 10.1049/cje.2020.00.268
    [20]
    U. Thakker, J. Beu, D. Gope, et al., “Run-time efficient RNN compression for inference on edge devices,” in Proceedings of the 2nd Workshop on Energy Efficient Machine Learning and Cognitive Computing for Embedded Applications, Washington, DC, USA, pp. 26–30, 2019.
    [21]
    L. B. Ma, X. Y. Wang, X. W. Wang, et al., “TCDA: Truthful combinatorial double auctions for mobile edge computing in industrial internet of things,” IEEE Transactions on Mobile Computing, vol. 21, no. 11, pp. 4125–4138, 2022. doi: 10.1109/TMC.2021.3064314
    [22]
    X. Y. Peng, J. X. Yu, B. W. Yao, et al., “A review of FPGA-based custom computing architecture for convolutional neural network inference,” Chinese Journal of Electronics, vol. 30, no. 1, pp. 1–17, 2021. doi: 10.1049/cje.2020.11.002
    [23]
    Z. Zhou, X. Chen, E. Li, et al., “Edge intelligence: Paving the last mile of artificial intelligence with edge computing,” Proceedings of the IEEE, vol. 107, no. 8, pp. 1738–1762, 2019. doi: 10.1109/JPROC.2019.2918951
    [24]
    L. Wang, J. F. Qiao, C. L. Yang, et al., “Pruning algorithm for modular echo state network based on sensitivity analysis,” Acta Automatica Sinica, vol. 45, no. 6, pp. 1136–1145, 2019. doi: 10.16383/j.aas.c180288
    [25]
    A. Rodan and P. Tino, “Minimum complexity echo state network,” IEEE Transactions on Neural Networks, vol. 22, no. 1, pp. 131–144, 2011. doi: 10.1109/TNN.2010.2089641
    [26]
    X. C. Sun, H. Y. Cui, R. P. Liu, et al., “Multistep ahead prediction for real-time VBR video traffic using deterministic echo state network,” in Proceedings of IEEE 2nd International Conference on Cloud Computing and Intelligence Systems, Hangzhou, China, pp. 928–931, 2012.
    [27]
    J. Zhou, X. Y. Yang, L. J. Sun, et al., “Network traffic prediction method based on improved echo state network,” IEEE Access, vol. 6, pp. 70625–70632, 2018. doi: 10.1109/ACCESS.2018.2880272
    [28]
    J. F. Qiao, F. J. Li, H. G. Han, et al., “Growing echo-state network with multiple subreservoirs,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 2, pp. 391–404, 2017. doi: 10.1109/TNNLS.2016.2514275
    [29]
    Y. Kawai, J. Park, and M. Asada, “A small-world topology enhances the echo state property and signal propagation in reservoir computing,” Neural Networks, vol. 112, pp. 15–23, 2019. doi: 10.1016/j.neunet.2019.01.002
    [30]
    H. S. Wang and X. F. Yan, “Improved simple deterministically constructed cycle reservoir network with sensitive iterative pruning algorithm,” Neurocomputing, vol. 145, pp. 353–362, 2014. doi: 10.1016/j.neucom.2014.05.024
    [31]
    S. Scardapane, G. Nocco, D. Comminiello, et al., “An effective criterion for pruning reservoir’s connections in echo state networks,” in Proceedings of 2014 International Joint Conference on Neural Networks, Beijing, China, pp. 1205–1212, 2014.
    [32]
    D. Y. Li, F. Liu, J. F. Qiao, et al., “Structure optimization for echo state network based on contribution,” Tsinghua Science and Technology, vol. 24, no. 1, pp. 97–105, 2019. doi: 10.26599/TST.2018.9010049
    [33]
    H. S. Wang, Y. X. Liu, P. Lu, et al., “Echo state network with logistic mapping and bias dropout for time series prediction,” Neurocomputing, vol. 489, pp. 196–210, 2022. doi: 10.1016/j.neucom.2022.03.018
    [34]
    K. Greff, R. K. Srivastava, J. Koutník, et al., “LSTM: A search space odyssey,” IEEE Transactions on Neural Networks and Learning Systems, vol. 28, no. 10, pp. 2222–2232, 2017. doi: 10.1109/TNNLS.2016.2582924
    [35]
    H. Jaeger, M. Lukoševičius, D. Popovici, et al., “Optimization and applications of echo state networks with leaky-integrator neurons,” Neural Networks, vol. 20, no. 3, pp. 335–352, 2007. doi: 10.1016/j.neunet.2007.04.016
    [36]
    S. Hochreiter and J. Schmidhuber, “Long short-term memory,” Neural Computation, vol. 9, no. 8, pp. 1735–1780, 1997. doi: 10.1162/neco.1997.9.8.1735
    [37]
    J. Chen, J. C. Li, Y. Li, et al., “Multiply accumulate operations in memristor crossbar arrays for analog computing,” Journal of Semiconductors, vol. 42, no. 1, article no. 013104, 2021. doi: 10.1088/1674-4926/42/1/013104
    [38]
    T. X. Shu, J. H. Chen, V. K. Bhargava, et al., “An energy-efficient dual prediction scheme using LMS filter and LSTM in wireless sensor networks for environment monitoring,” IEEE Internet of Things Journal, vol. 6, no. 4, pp. 6736–6747, 2019. doi: 10.1109/JIOT.2019.2911295
    [39]
    J. Zhou, T. T. Han, F. Xiao, et al., “Multiscale network traffic prediction method based on deep echo-state network for internet of things,” IEEE Internet of Things Journal, vol. 9, no. 21, pp. 21862–21874, 2022. doi: 10.1109/JIOT.2022.3181807
    [40]
    A. Jain, K. Nandakumar, and A. Ross, “Score normalization in multimodal biometric systems,” Pattern Recognition, vol. 38, no. 12, pp. 2270–2285, 2005. doi: 10.1016/j.patcog.2005.01.012

    Figures (5) / Tables (3)
