Performance Evaluation of Convolutional Networks on Heterogeneous Architectures for Applications in Autonomous Robotics
| dc.creator | Guajo, Joaquín | |
| dc.creator | Alzate-Anzola, Cristian | |
| dc.creator | Castaño-Londoño, Luis | |
| dc.creator | Márquez-Viloria, David | |
| dc.date | 2022-04-29 | |
| dc.date.accessioned | 2025-10-01T23:52:46Z | |
| dc.description | Humanoid robots find application in human-robot interaction tasks. However, despite their capabilities, their sequential computing system limits the execution of computationally expensive algorithms such as convolutional neural networks, which have demonstrated good performance in recognition tasks. As an alternative to sequential computing units, Field-Programmable Gate Arrays and Graphics Processing Units offer a high degree of parallelism and low power consumption. This study aims to improve the visual perception of a humanoid robot called NAO using these embedded systems running a convolutional neural network. The methodology adopted here is based on image acquisition and transmission using simulation software: Webots and Choregraphe. In each embedded system, an object recognition stage is performed using commercial convolutional neural network acceleration frameworks. Xilinx® Ultra96™, Intel® Cyclone® V-SoC and NVIDIA® Jetson™ TX2 boards were used, and the Tinier-YOLO, AlexNet, Inception-V1 and Inception-V3 transfer-learning networks were executed. Real-time metrics were obtained when Inception-V1, Inception-V3 transfer-learning and AlexNet were run on the Ultra96 and Jetson TX2 boards, with frame rates between 28 and 30 frames per second. The results demonstrated that the use of these embedded systems and convolutional neural networks can provide humanoid robots such as NAO with greater visual recognition in tasks that require high accuracy and autonomy. | en-US |
| dc.description | Los robots humanoides encuentran aplicación en tareas de interacción humano-robot. A pesar de sus capacidades, su sistema de computación secuencial limita la ejecución de algoritmos computacionalmente costosos, como las redes neuronales convolucionales, que han demostrado buen rendimiento en tareas de reconocimiento. Como alternativa a unidades de cómputo secuencial se encuentran los Field Programmable Gate Arrays y las Graphics Processing Units, que tienen un alto grado de paralelismo y bajo consumo de energía. Este trabajo tuvo como objetivo mejorar la percepción visual del robot humanoide NAO utilizando estos sistemas embebidos que ejecutan una red neuronal convolucional. El trabajo se basó en la adquisición y transmisión de la imagen usando herramientas de simulación como Webots y Choregraphe. Posteriormente, en cada sistema embebido, se realizó una etapa de reconocimiento del objeto utilizando frameworks de aceleración comerciales de redes neuronales convolucionales. Luego se utilizaron las tarjetas Xilinx Ultra96, Intel Cyclone V-SoC y NVIDIA Jetson TX2; después fueron ejecutadas las redes Tinier-YOLO, AlexNet, Inception-V1 e Inception-V3 transfer-learning. Se obtuvieron métricas en tiempo real cuando Inception-V1, Inception-V3 transfer-learning y AlexNet fueron ejecutadas sobre la Ultra96 y la Jetson TX2, con un intervalo entre 28 y 30 cuadros por segundo. Los resultados demostraron que el uso de estos sistemas embebidos y redes neuronales convolucionales puede otorgarles a robots humanoides, como NAO, mayor reconocimiento visual en tareas que requieren alta precisión y autonomía. | es-ES |
| dc.format | application/pdf | |
| dc.format | application/zip | |
| dc.format | text/xml | |
| dc.format | text/html | |
| dc.identifier | https://revistas.itm.edu.co/index.php/tecnologicas/article/view/2170 | |
| dc.identifier | 10.22430/22565337.2170 | |
| dc.identifier.uri | https://hdl.handle.net/20.500.12622/7810 | |
| dc.language | eng | |
| dc.publisher | Instituto Tecnológico Metropolitano (ITM) | es-ES |
| dc.relation | https://revistas.itm.edu.co/index.php/tecnologicas/article/view/2170/2385 | |
| dc.relation | https://revistas.itm.edu.co/index.php/tecnologicas/article/view/2170/2386 | |
| dc.relation | https://revistas.itm.edu.co/index.php/tecnologicas/article/view/2170/2387 | |
| dc.relation | https://revistas.itm.edu.co/index.php/tecnologicas/article/view/2170/2388 | |
| dc.relation | /*ref*/S. R. Fanello; C. Ciliberto; N. Noceti; G. Metta; F. Odone, “Visual recognition for humanoid robots”, Rob. Auton. Syst., vol. 91, pp. 151–168, May 2017. https://doi.org/10.1016/j.robot.2016.10.001 | |
| dc.relation | /*ref*/E. Cha; M. Mataric; T. Fong, “Nonverbal signaling for non-humanoid robots during human-robot collaboration”, in 2016 11th ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2016, pp. 601–602. https://doi.org/10.1109/HRI.2016.7451876 | |
| dc.relation | /*ref*/S. Shamsuddin et al., “Initial response of autistic children in human-robot interaction therapy with humanoid robot NAO”, in 2012 IEEE 8th International Colloquium on Signal Processing and its Applications, 2012, pp. 188–193. https://doi.org/10.1109/CSPA.2012.6194716 | |
| dc.relation | /*ref*/J. G. Hoyos-Gutiérrez; C. A. Peña-Solórzano; C. L. Garzón-Castro; F. A. Prieto-Ortiz; J. G. Ayala-Garzón, “Hacia el manejo de una herramienta por un robot NAO usando programación por demostración”, TecnoLógicas, vol. 17, no. 33, pp. 65-76, Aug. 2014. https://doi.org/10.22430/22565337.555 | |
| dc.relation | /*ref*/P. Vadakkepat; N. B. Sin; D. Goswami; R. X. Zhang; L. Y. Tan, “Soccer playing humanoid robots: Processing architecture, gait generation and vision system”, Rob. Auton. Syst., vol. 57, no. 8, pp. 776–785, Jul. 2009. https://doi.org/10.1016/j.robot.2009.03.012 | |
| dc.relation | /*ref*/A. Härtl; U. Visser; T. Röfer, “Robust and Efficient Object Recognition for a Humanoid Soccer Robot”, Springer, Berlin, Heidelberg, 2014, pp. 396–407. https://doi.org/10.1007/978-3-662-44468-9_35 | |
| dc.relation | /*ref*/D. Budden; S. Fenn; J. Walker; A. Mendes, “A Novel Approach to Ball Detection for Humanoid Robot Soccer”, Springer, Berlin, Heidelberg, 2012, pp. 827–838. https://doi.org/10.1007/978-3-642-35101-3_70 | |
| dc.relation | /*ref*/P. Sermanet; D. Eigen; X. Zhang; M. Mathieu; R. Fergus; Y. Le Cun, “Integrated recognition, localization and detection using convolutional networks”, 2013. https://arxiv.org/abs/1312.6229 | |
| dc.relation | /*ref*/A. Krizhevsky; I. Sutskever; G. E. Hinton, “ImageNet classification with deep convolutional neural networks”, Commun. ACM, vol. 60, no. 6, pp. 84–90, Jun. 2017. https://doi.org/10.1145/3065386 | |
| dc.relation | /*ref*/K. Simonyan; A. Zisserman, “Very deep convolutional networks for large-scale image recognition”, arXiv preprint, Sep. 2015. https://arxiv.org/abs/1409.1556 | |
| dc.relation | /*ref*/K. He; X. Zhang; S. Ren; J. Sun, “Deep Residual Learning for Image Recognition”, in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778. https://doi.org/10.1109/CVPR.2016.90 | |
| dc.relation | /*ref*/H. V. Nguyen; H. T. Ho; V. M. Patel; R. Chellappa, “DASH-N: Joint Hierarchical Domain Adaptation and Feature Learning,” IEEE Trans. Image Process., vol. 24, no. 12, pp. 5479–5491, Dec. 2015. https://doi.org/10.1109/TIP.2015.2479405 | |
| dc.relation | /*ref*/M. Podpora; A. Gardecki, “Extending vision understanding capabilities of NAO robot by connecting it to a remote computational resource,” in 2016 Progress in Applied Electrical Engineering (PAEE), 2016, pp. 1–5. https://doi.org/10.1109/PAEE.2016.7605119 | |
| dc.relation | /*ref*/M. Puheim; M. Bundzel; L. Madarasz, “Forward control of robotic arm using the information from stereo-vision tracking system”, in 2013 IEEE 14th International Symposium on Computational Intelligence and Informatics (CINTI), 2013, pp. 57–62. https://doi.org/10.1109/CINTI.2013.6705259 | |
| dc.relation | /*ref*/K. Noda; H. Arie; Y. Suga; T. Ogata, “Multimodal integration learning of robot behavior using deep neural networks”, Rob. Auton. Syst., vol. 62, no. 6, pp. 721–736, Jun. 2014. https://doi.org/10.1016/j.robot.2014.03.003 | |
| dc.relation | /*ref*/A. Biddulph; T. Houliston; A. Mendes; S. K. Chalup, “Comparing Computing Platforms for Deep Learning on a Humanoid Robot”, Springer, Cham, 2018. https://doi.org/10.1007/978-3-030-04239-4_11 | |
| dc.relation | /*ref*/A. Dundar; J. Jin; B. Martini; E. Culurciello, “Embedded Streaming Deep Neural Networks Accelerator With Applications,” IEEE Trans. Neural Networks Learn. Syst., vol. 28, no. 7, pp. 1572–1583, Jul. 2017. https://doi.org/10.1109/TNNLS.2016.2545298 | |
| dc.relation | /*ref*/H. Park et al., “Optimizing DCNN FPGA accelerator design for handwritten hangul character recognition”, in Proceedings of the 2017 International Conference on Compilers, Architectures and Synthesis for Embedded Systems Companion, 2017, pp. 1–2. https://doi.org/10.1145/3125501.3125522 | |
| dc.relation | /*ref*/C. Zhang; P. Li; G. Sun; Y. Guan; B. Xiao; J. Cong, “Optimizing FPGA-based Accelerator Design for Deep Convolutional Neural Networks”, in Proceedings of the 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 2015, pp. 161–170. https://doi.org/10.1145/2684746.2689060 | |
| dc.relation | /*ref*/Q. Xiao; Y. Liang; L. Lu; S. Yan; Y.-W. Tai, “Exploring Heterogeneous Algorithms for Accelerating Deep Convolutional Neural Networks on FPGAs”, in Proceedings of the 54th Annual Design Automation Conference 2017, 2017, pp. 1–6. https://doi.org/10.1145/3061639.3062244 | |
| dc.relation | /*ref*/E. Del Sozzo; A. Solazzo; A. Miele; M. D. Santambrogio, “On the Automation of High Level Synthesis of Convolutional Neural Networks”, in 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 2016, pp. 217–224. https://doi.org/10.1109/IPDPSW.2016.153 | |
| dc.relation | /*ref*/C. Zhang; G. Sun; Z. Fang; P. Zhou; P. Pan; J. Cong, “Caffeine: Toward Uniformed Representation and Acceleration for Deep Convolutional Neural Networks”, IEEE Trans. Comput. Des. Integr. Circuits Syst., vol. 38, no. 11, pp. 2072–2085, Nov. 2019. https://doi.org/10.1109/TCAD.2017.2785257 | |
| dc.relation | /*ref*/R. Andri; L. Cavigelli; D. Rossi; L. Benini, “YodaNN: An Ultra-Low Power Convolutional Neural Network Accelerator Based on Binary Weights”, in 2016 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 2016, pp. 236–241. https://doi.org/10.1109/ISVLSI.2016.111 | |
| dc.relation | /*ref*/L. Ni; Z. Liu; H. Yu; R. V. Joshi, “An Energy-Efficient Digital ReRAM-Crossbar-Based CNN With Bitwise Parallelism”, IEEE J. Explor. Solid-State Comput. Devices Circuits, vol. 3, pp. 37–46, Dec. 2017. https://doi.org/10.1109/JXCDC.2017.2697910 | |
| dc.relation | /*ref*/A. Kulkarni; T. Abtahi; C. Shea; A. Kulkarni; T. Mohsenin, “PACENet: Energy efficient acceleration for convolutional network on embedded platform”, in 2017 IEEE International Symposium on Circuits and Systems (ISCAS), 2017, pp. 1–4. https://doi.org/10.1109/ISCAS.2017.8050342 | |
| dc.relation | /*ref*/T. Gong; T. Fan; J. Guo; Z. Cai, “GPU-based parallel optimization of immune convolutional neural network and embedded system”, Eng. Appl. Artif. Intell., vol. 62, pp. 384–395, Jun. 2017. https://doi.org/10.1016/j.engappai.2016.08.019 | |
| dc.relation | /*ref*/D. Strigl; K. Kofler; S. Podlipnig, “Performance and Scalability of GPU-Based Convolutional Neural Networks”, in 2010 18th Euromicro Conference on Parallel, Distributed and Network-based Processing, 2010, pp. 317–324. https://doi.org/10.1109/PDP.2010.43 | |
| dc.relation | /*ref*/O. Michel, “Cyberbotics Ltd. Webots™: Professional Mobile Robot Simulation”, Int. J. Adv. Robot. Syst., vol. 1, no. 1, pp. 39–42, Mar. 2004. https://doi.org/10.5772/5618 | |
| dc.relation | /*ref*/E. Pot; J. Monceaux; R. Gelin; B. Maisonnier, “Choregraphe: a graphical tool for humanoid robot programming”, in RO-MAN 2009 - The 18th IEEE International Symposium on Robot and Human Interactive Communication, 2009, pp. 46–51. https://doi.org/10.1109/ROMAN.2009.5326209 | |
| dc.relation | /*ref*/M. Blott et al., “FINN- R: An End-to-End Deep-Learning Framework for Fast Exploration of Quantized Neural Networks”, ACM Trans. Reconfigurable Technol. Syst., vol. 11, no. 3, pp. 1–23, Sep. 2018. https://doi.org/10.1145/3242897 | |
| dc.relation | /*ref*/S. Wang; S. Jiang, “INSTRE: A New Benchmark for Instance-Level Object Retrieval and Recognition”, ACM Trans. Multimed. Comput. Commun. Appl., vol. 11, no. 3, pp. 1–21, Feb. 2015. https://doi.org/10.1145/2700292 | |
| dc.relation | /*ref*/M. Mattamala; G. Olave; C. González; N. Hasbún; J. Ruiz-del-Solar, “The NAO Backpack: An Open-Hardware Add-on for Fast Software Development with the NAO Robot”, 2018, pp. 302–311. https://doi.org/10.1007/978-3-030-00308-1_25 | |
| dc.relation | /*ref*/D. Wang; K. Xu; D. Jiang, “PipeCNN: An OpenCL-based open-source FPGA accelerator for convolution neural networks”, in 2017 International Conference on Field Programmable Technology (ICFPT), 2017, pp. 279–282. https://doi.org/10.1109/FPT.2017.8280160 | |
| dc.relation | /*ref*/S. Xu; A. Savvaris; S. He; H. Shin; A. Tsourdos, “Real-time Implementation of YOLO+JPDA for Small Scale UAV Multiple Object Tracking”, in 2018 International Conference on Unmanned Aircraft Systems (ICUAS), 2018, pp. 1336–1341. https://doi.org/10.1109/ICUAS.2018.8453398 | |
| dc.relation | /*ref*/J. Ma; L. Chen; Z. Gao, “Hardware Implementation and Optimization of Tiny-YOLO Network”, Springer, Singapore, 2018, pp. 224–234. https://doi.org/10.1007/978-981-10-8108-8_21 | |
| dc.rights | Derechos de autor 2022 TecnoLógicas | es-ES |
| dc.source | TecnoLógicas; Vol. 25 No. 53 (2022); e2170 | en-US |
| dc.source | TecnoLógicas; Vol. 25 Núm. 53 (2022); e2170 | es-ES |
| dc.source | 2256-5337 | |
| dc.source | 0123-7799 | |
| dc.subject | Convolutional neural networks | en-US |
| dc.subject | field programmable gate array | en-US |
| dc.subject | system-on-a-chip | en-US |
| dc.subject | high-level synthesis | en-US |
| dc.subject | humanoid robot | en-US |
| dc.subject | Redes neuronales convolucionales | es-ES |
| dc.subject | matriz de puertas lógicas programable en campo | es-ES |
| dc.subject | sistema en chip | es-ES |
| dc.subject | síntesis de alto nivel | es-ES |
| dc.subject | robot humanoide | es-ES |
| dc.title | Performance Evaluation of Convolutional Networks on Heterogeneous Architectures for Applications in Autonomous Robotics | en-US |
| dc.title | Evaluación de desempeño de redes convolucionales sobre arquitecturas heterogéneas para aplicaciones en robótica autónoma | es-ES |
| dc.type | info:eu-repo/semantics/article | |
| dc.type | info:eu-repo/semantics/publishedVersion | |
| dc.type | Research Papers | en-US |
| dc.type | Artículos de investigación | es-ES |
Files
Original bundle (items 1-4 of 4; two listed)
- revistatecnologicas_2170-MUB-VF_LCCV.pdf (896.61 KB, Adobe Portable Document Format)
- ojsitm_344270031007.xml (121.92 KB, Extensible Markup Language)