2021 8th NAFOSTED Conference on Information and Computer Science (NICS)

ODLIE: On-Demand Deep Learning Framework for Edge Intelligence in Industrial Internet of Things

Khanh-Hoi Le Minh∗, Kim-Hung Le†
University of Information Technology, Vietnam National University Ho Chi Minh City, Ho Chi Minh, Vietnam
Email: ∗hoilmk@uit.edu.vn, †hunglk@uit.edu.vn

Abstract—Recently, we have witnessed the evolution of Edge Computing (EC) and Deep Learning (DL) serving Industrial Internet of Things (IIoT) applications, in which the execution of DL models is shifted from cloud servers to edge devices to reduce latency. However, achieving low latency for IoT applications is still a critical challenge because of the massive time consumption needed to deploy and operate complex DL models on constrained edge devices. In addition, the heterogeneity of IoT data and device types raises edge-cloud collaboration issues. To address these challenges, in this paper we introduce ODLIE, an on-demand deep learning framework for IoT edge devices. ODLIE employs DL right-selecting and DL right-sharing features to reduce inference time while maintaining high accuracy and edge collaboration. In detail, DL right-selecting chooses the appropriate DL model adapting to various deployment contexts and user-desired qualities, while DL right-sharing exploits W3C semantic descriptions to mitigate the heterogeneity in IoT data and devices. To prove the applicability of our proposal, we present and analyze latency requirements of IIoT applications that are thoroughly satisfied by ODLIE.

Index Terms—Industry 4.0, Edge Intelligence, Deep Learning framework, Industrial Internet of Things

I. INTRODUCTION

In the open world economy, industrial companies are under pressure to enhance product quality and to integrate information technologies into traditional industry [1]. This raises the prevalence of the smart industry, also known as Industry 4.0, a revolution in manufacturing technologies enabling automation and data analysis across machines through connected smart objects (such as sensors and actuators). As a result, an enormous amount of data from various sources is generated by several manufacturing processes [2]. This data demands AI models to process it and infer valuable knowledge, promoting automation in smart factories. For example, Figure 1 presents the process of detecting anomalies on a product surface using scanned images from a camera; these images are then processed by an AI model on edge servers to decide whether a product is qualified.

Fig. 1: The example scenario of surface inspection for Smart factory

Among AI approaches, deep learning is a breakthrough technology adopted in several scenarios because of its ability to derive high-level knowledge from a complex input space by using DL models [3]. Training these models may need a large amount of computational resources and input data; thus, the cloud-centric paradigm, constituted of powerful servers, is leveraged to perform these heavy tasks. In more detail, IoT data is transmitted from devices to cloud servers, where it is processed and analyzed by DL models; the model output is then returned to the devices. However, transmitting data over a long network distance from devices to the cloud may cause high end-to-end latency and various security issues. These drawbacks bring out the edge computing paradigm, which achieves low latency and energy efficiency by leveraging the computation of network edge devices [4]. These devices act as cloud servers to collect IoT data, perform inference tasks, and respond to IoT devices.
This shortens the data transmission route and thus reduces end-to-end latency. Therefore, converging IoT and AI on the edge has received substantial interest from both the research and the industrial community [5].

Fig. 2: Overview of Edge Intelligence

Despite the several advantages of running DL models on edge devices, doing so is a non-trivial challenge because: (1) DL tasks demand excessive computational resources (storing models, inferring knowledge) compared with edge device capacity; (2) DL models are trained on a general dataset in the cloud and are thus inefficient in specific edge contexts; (3) the heterogeneity of hardware specifications and collected data limits the collaboration between edge devices.

To overcome these challenges, we introduce an on-demand deep learning framework for edge devices, namely ODLIE. The superiority of our proposal comes from two key features. The first one is DL right-selecting, which selects suitable DL models for various deployment contexts, minimizing the end-to-end latency while guaranteeing the user-desired performance. To propose an appropriate selection, DL right-selecting examines the model running information formed as a tuple <Accuracy, Latency, Computation, Memory>. The second one is DL right-sharing, which exploits the semantic description provided by W3C to describe the edge device resources (performance capability, DL models, collected data). This enables direct access to these resources via a uniform RESTful service, and thus the heterogeneity and complexity of edge systems are removed. In summary, our contributions are listed below:

• A systematic overview of edge intelligence (EI) is presented. Based on this knowledge, fundamental requirements of EI are also identified.
• We propose an on-demand deep learning framework for edge devices, namely ODLIE. Our framework addresses the current EI limitations (latency, computational power, and data sharing and collaboration) by employing DL right-selecting and DL right-sharing features. The applicability of the proposal in various IoT deployment contexts is also discussed.

The remainder of the paper is organized as follows. In Section II, we present our motivation and a systematic overview of edge intelligence. The ODLIE framework and its deployment contexts are depicted in Section III. The conclusion is reported in Section IV.

II. EDGE INTELLIGENCE OVERVIEW

Edge intelligence has been emerging as a consequence of the exponential growth of IoT devices (smart objects, sensors, actuators) and stringent requirements on latency, accuracy, and security. Applying EI to IoT is expected to not only satisfy these requirements but also derive more business value from sensory data [6]. Because it is closer to data sources than the core cloud, EI owns advantages regarding end-to-end latency, efficiency, and security. Instead of sending data to the central cloud, the analysis is located close to the data sources; this reduces the end-to-end latency and the security risks relating to data interception and violation during transfer. On the other hand, IoT data is processed by well-trained models and converted into valuable information, enabling end-users to react in real time to observed events. In summary, the purpose of EI is to deal with massive data at edge nodes in an intelligent way [7].

A. Definition

Recently, several research groups have been working on the EI definition and related concepts. The International Electrotechnical Commission (IEC) presents EI as a process to collect and store raw data from IoT devices before performing machine learning tasks at edge nodes [8]. IEC also discusses the shift of information technologies from the cloud to the edge level to deal with the challenges of latency, security, and connectivity. In our vision, we define EI as the capability to perform AI operations on edge devices. Depending on the edge hardware and AI algorithms, each edge device has a different capacity, which is represented via four key factors: accuracy, latency, computation, and memory.

• Accuracy (higher is better) represents the correctness of AI outputs, indicated by different metrics. For example, accuracy in object detection is measured by the mean average precision, whereas the F-score is widely used in anomaly detection tasks. Since the computational resources of edge devices are constrained (such as computational capacity and memory), most AI mechanisms deployed on the edge must be optimized by compression, quantization, or other methods. However, these optimizations may reduce inference accuracy.
• Latency (lower is better) is the end-to-end delay representing the total time from sending a request to receiving a response from the edge device.
• Computation and Memory (lower is better) refer to the increase in CPU and RAM usage when performing the AI inference tasks. These two factors represent the computational requirements of each AI model. An illustrative sketch of how these four factors can be measured on a device follows this list.
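As a minimal illustration (not part of ODLIE itself), the sketch below profiles one inference call on a device: it times the call and samples CPU and RAM usage with the psutil library. The `run_inference` callable and the `accuracy` value are placeholders that a real deployment would supply from its own model and validation data, and the RAM delta is a deliberately crude estimate.

```python
import time
import psutil  # assumed to be installed on the edge device


def profile_model(run_inference, sample_input, accuracy):
    """Collect the <Accuracy, Latency, Computation, Memory> tuple for one call.

    Accuracy is passed in (measured offline); latency, CPU and RAM are sampled here.
    """
    psutil.cpu_percent(interval=None)            # reset the CPU usage counter
    ram_before = psutil.virtual_memory().percent

    start = time.perf_counter()
    run_inference(sample_input)                  # one forward pass of the DL model
    latency = time.perf_counter() - start        # seconds per inference

    computation = psutil.cpu_percent(interval=None)   # CPU % since the reset above
    memory = max(psutil.virtual_memory().percent - ram_before, 0.0)

    return {"accuracy": accuracy, "latency": latency,
            "computation": computation, "memory": memory}
```

A tuple collected this way, or shared by a neighboring edge, is what the model selector of Section III consumes.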
B. Overview of EI

Fig. 3: Collaboration of Edge Intelligence

1) EI Collaboration: Figure 2 shows the overview of EI, which includes several technologies, such as algorithms, software, and hardware platforms. These technologies cooperate on providing intelligent services (e.g., defect detection, vehicle counting). There are two types of collaboration [9]:

• Cloud to Edge: The AI model is trained on the cloud before being transferred to edge devices, where the inference tasks are performed. If these tasks exceed the device capacity, offloading between edge devices may be employed. In some cases, edges retrain the received model with incoming data and then synchronize with the original model on the cloud. This process is also known as transfer learning [10].
• Edge to Edge: A set of edges collaborates on intensive tasks demanding high computational capability. For example, to train a complex model, the model and training data are separated and allocated to edges based on their computational capability. After successful training, the models are synchronized together [11].

2) EI Dataflow: Depending on the EI collaboration type, the EI data flows operate in different ways. As shown in Figure 3, the collected data has three major flows:

• The edge devices send the data to cloud servers, which store the well-trained models. These servers then perform inference tasks on the received data and return the outputs to the edges. However, the drawback of this flow is high latency, since the data has to be transmitted over the network backbone.
• The inference tasks are executed directly on edge devices using well-trained models downloaded from the cloud. In some cases, the model must be optimized to fit the edge capacity. The advantages of this data flow are low latency and reduced security risks.
• The general AI model loaded from the cloud to the edge is locally retrained with the collected data. This processing makes the model more specific to each edge, increasing the model accuracy. However, training complex AI models on a single edge may drain the device battery or even crash the device. In this case, the edge-to-edge collaboration is required. A simplified sketch of how an edge might choose between these flows is given after this list.
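The flows above imply a placement decision for each workload. The fragment below is a purely illustrative sketch, not part of ODLIE: it chooses a flow by comparing a model's profiled CPU and RAM demands (e.g., from `profile_model` above) against the resources the device currently has free. The field names and the fallback order are assumptions made for this example.

```python
def choose_dataflow(profile, free_cpu_pct, free_ram_pct, peer_edges_available):
    """Pick one of the three EI data flows for a profiled model."""
    fits_locally = (profile["computation"] <= free_cpu_pct
                    and profile["memory"] <= free_ram_pct)
    if fits_locally:
        return "edge-local inference"       # second flow: lowest latency
    if peer_edges_available:
        return "offload to a peer edge"     # edge-to-edge collaboration
    return "send data to the cloud"         # first flow: fallback with higher latency
```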
III. ODLIE: ON-DEMAND DEEP LEARNING FRAMEWORK FOR EDGE INTELLIGENCE

In this section, we present an on-demand deep learning framework for edge intelligence (ODLIE) supporting AI model selection and data sharing. The goal of ODLIE is to reduce the runtime of inference tasks while maximizing accuracy and edge collaboration. Applying our proposed framework could make EI suitable for various edge devices (e.g., Raspberry Pi 3, NVIDIA Jetson Nano).

A. Requirements

In general, the main goal of ODLIE is to turn a single-board computer (such as a Raspberry Pi, Jetson Nano, or BeagleBone) into an edge intelligence node that is able to run complex AI models or algorithms. For example, a Raspberry Pi equipped with ODLIE could perform real-time object detection based on an on-board camera. In this case, several challenges may emerge: (1) How does a Raspberry Pi device meet the real-time requirement? (2) How do edges collaborate through ODLIE? Following the mentioned example, we define three key requirements of an EI framework as follows:

• Facility: Currently, deploying and running an AI model on an edge device is a complicated process. By wrapping this process behind uniform RESTful web services, ODLIE is easy to use even for non-technical users.
• Adaptation: In the AI world, there is a large number of AI models with different properties (size, accuracy, format) and purposes. Thus, selecting a suitable AI model for edges based on the deployment context is essential in EI.
• Interoperability: To collaborate and share data (output results, collected data) with heterogeneous edges, the EI platform has to semantically describe their resources by using uniform descriptions and access methods.

Fig. 4: The overview of ODLIE

Our proposed EI framework could fulfill the above requirements. As shown in Figure 4, the ODLIE architecture has three components: (1) DL sharing is used to interact with other edge devices and end-users in a semantic manner; (2) DL right-selecting takes responsibility for selecting the most suitable model for edges; (3) the Package manager is used to run or train the chosen model directly on edge devices.

B. DL right-selecting

Along with the development of AI applications, deep learning models based on neural networks have grown significantly in quality and variety. Each model is designed for a dedicated purpose; for example, a generic object detection model is insufficient for detecting defects on product surfaces. Thus, a method for selecting the most suitable model for different edge capabilities in different deployment scenarios is necessary. Aware of this demand, ODLIE is armed with the DL right-selecting feature, which includes the model selector and model optimizer components.

Fig. 5: Model Selector

After receiving a deployment request from the end-user, the model selector extracts the user-desired configurations, such as the desired accuracy and latency. Model selection can be considered a multi-dimensional space selection problem. As the example in Figure 5 shows, deploying ODLIE for object detection takes at least three dimensions into account: AI models (ResNet, MobileNets), DL software platforms (TensorFlow, PyTorch, MXNet), and edge hardware (NVIDIA Jetson Nano, Raspberry Pi 3 and 4). In more detail, the model selector first evaluates the capacity of the hardware platform via four main factors formed as a tuple <Accuracy, Latency, Computation, Memory>. This information is obtained by running the AI models or by sharing it between edges. Then, the most suitable model is selected by solving the optimization problem

\[
\operatorname*{arg\,min}_{m \in \mathrm{Models}} \langle A_m, L_m, C_m, M_m \rangle
\quad \text{s.t.} \quad A_m \ge A_{req},\ L_m \le L_{req},\ C_m \le C_{pro},\ M_m \le M_{pro}
\tag{1}
\]

where <A, L, C, M> represents the tuple <Accuracy, Latency, Computation, Memory>, A_req and L_req refer to the user-desired accuracy and latency, and C_pro and M_pro are the computation (CPU) and memory (RAM) capacities available while running the model on the edge device. As shown in Equation 1, the goal of the model selector is to select the best-fit model, which meets not only the accuracy and latency requirements but also the CPU and RAM constraints. A reinforcement learning algorithm could be exploited to enhance the selection performance.
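A minimal sketch of this selection step is shown below, assuming every candidate model has already been profiled into the four-factor tuple (for instance with a routine like `profile_model` in Section II). Because Equation 1 minimizes a vector, the sketch uses a concrete tie-breaking rule of our own choosing: among feasible candidates, the one with the lowest latency wins. The example profiles are placeholders loosely based on the measurements in Section III-E, with invented accuracy values.

```python
def select_model(profiles, acc_req, lat_req, cpu_budget, ram_budget):
    """Solve Eq. (1): keep candidates meeting the accuracy/latency requirements
    and fitting the CPU/RAM budgets, then return the lowest-latency one."""
    feasible = [
        (name, p) for name, p in profiles.items()
        if p["accuracy"] >= acc_req and p["latency"] <= lat_req
        and p["computation"] <= cpu_budget and p["memory"] <= ram_budget
    ]
    if not feasible:
        return None  # nothing fits: offload the task or consider a smaller model family
    return min(feasible, key=lambda item: item[1]["latency"])[0]


# Hypothetical profiles (accuracy values are invented; latency/CPU/RAM echo Sec. III-E):
profiles = {
    "mobilenet_v2": {"accuracy": 0.71, "latency": 0.687, "computation": 7, "memory": 37},
    "rcnn_resnet":  {"accuracy": 0.83, "latency": 16.3, "computation": 100, "memory": 50},
}
print(select_model(profiles, acc_req=0.70, lat_req=2.0, cpu_budget=80, ram_budget=60))
# -> mobilenet_v2
```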
C. DL sharing

DL sharing supports the interaction with end-users via the cloud, other edges, and IoT devices. Based on the semantic description framework provided by W3C [12], all edge resources, such as the processed data, the running AI model, and device information, are described using a semantic language. These resources are encoded in the JSON-LD format (https://json-ld.org/) and accessed directly via uniform URIs. Figure 6 shows a simple example of an edge resource description, which has three main sections. The first section presents the general information of the edge device, such as its name, id, model, and security method. The next section, "properties", describes the available resources of the edge device and their access methods; in the example, we can access and capture an image from the on-board camera of the edge device by calling an HTTP GET request. The last section, named "actions", describes the supported actions of services provided by the edge; as shown in the example, the edge device supports an image detection service via the HTTP POST method. A minimal sketch of such a description is given at the end of this subsection.

Fig. 6: The example of edge description

The other components in DL sharing are:

• Lab Notebook provides a programming environment to create and evaluate AI models before importing them into DL right-selecting. Its interface is similar to Jupyter Notebook (https://jupyter.org/).
• Service Editor is built on the Node-RED platform (https://nodered.org/) and supports end-users in creating simple applications from the edge resources described in the "properties" and "actions" sections of the edge resource description.
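The listing below sketches such an edge resource description as a Python dictionary, loosely following the W3C Web of Things Thing Description layout; the identifiers, URLs, and security setting are placeholders, and the exact vocabulary used by ODLIE (via WoT-AD [12]) may differ. The commented lines illustrate how a client could consume the uniform RESTful interface.

```python
import json

# Minimal JSON-LD-style description of an edge node's resources (placeholder values).
edge_description = {
    "@context": "https://www.w3.org/2019/wot/td/v1",
    "id": "urn:example:edge-camera-01",
    "title": "Surface inspection edge node",
    "securityDefinitions": {"nosec_sc": {"scheme": "nosec"}},
    "security": ["nosec_sc"],
    "properties": {
        "image": {  # readable resource: latest frame from the on-board camera
            "type": "string",
            "forms": [{"href": "http://edge-01.local:8080/properties/image",
                       "htv:methodName": "GET"}],
        },
    },
    "actions": {
        "detect": {  # service: run defect detection on a posted image
            "forms": [{"href": "http://edge-01.local:8080/actions/detect",
                       "htv:methodName": "POST"}],
        },
    },
}

print(json.dumps(edge_description, indent=2))

# A client discovering this description could then, for example (using requests):
#   frame  = requests.get(edge_description["properties"]["image"]["forms"][0]["href"]).content
#   result = requests.post(edge_description["actions"]["detect"]["forms"][0]["href"], data=frame)
```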
D. Package manager

Similar to TensorFlow Lite (https://www.tensorflow.org/lite) for mobile phones, the Package manager of ODLIE aims to provide a lightweight DL software environment for edge devices. It supports inference and training of the AI model in an optimized way (low computational and memory footprint). There are three key components in the Package manager: (1) Model training is used to retrain the model on the collected data; it makes the model fit the specific data features of each edge device, so the model achieves better performance than a general model. (2) Model inference aims to perform near real-time inference tasks by marking all AI tasks as high-priority operations to the system; thus, these tasks are granted the maximum computational resources of the device. (3) The hardware connection is used to connect ODLIE to the hardware components of the edge device; through this connection, our framework can effectively manage the generated training and inference tasks. The optimization of both the models and the DL running environment is capable of accelerating the EI framework's performance. As a result, a single-board computer with limited memory and computational capability can run a demanding DL task smoothly.
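As an illustration of the kind of lightweight inference loop the Package manager targets, the sketch below loads a converted model with the TensorFlow Lite interpreter and runs a single forward pass. The model file name and input are placeholders, and ODLIE's actual runtime, scheduling, and priority handling are not shown here.

```python
import numpy as np
import tflite_runtime.interpreter as tflite  # lightweight runtime suitable for edge boards

# Placeholder model file; in ODLIE this would be the model chosen by DL right-selecting.
interpreter = tflite.Interpreter(model_path="mobilenet_v2.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Dummy input matching the model's expected shape (e.g., one 224x224 RGB image).
frame = np.zeros(input_details[0]["shape"], dtype=input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()                                   # one near real-time inference call
scores = interpreter.get_tensor(output_details[0]["index"])
print("predicted class:", int(np.argmax(scores)))
```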
E. Deployment scenarios

With the significant growth of IoT and AI, several intelligent applications have been applied in different aspects of life, from home safety and smart transportation to health care services. We describe some typical EI scenarios where our proposal could be deployed.

1) Surface inspection for Smart factory: EI is expected to significantly reduce the latency of industrial applications compared with the central cloud paradigm. Consider the surface inspection application in the automated assembly line shown in Figure 1: the products are conveyed to the right position, and an inspection camera captures their surfaces. The output images are processed at an edge server, which validates whether the product passes or not. Even if processing each product incurs only a small delay, the cumulative delay becomes considerable. Moreover, the product types in the factory are various, and there is no general model fitting all of them. We set up an illustration with a Raspberry Pi as the edge server. The latency and resource consumption demanded by the inspection tasks of various AI models are reported in Figure 7 and Figure 8, respectively. The differences between models are notable: the slowest model, "RCNN ResNet", takes 16.3 seconds to process an image, while the fastest one, "MobileNet v2", needs only 0.687 seconds. Similar results are found for resource consumption: "RCNN ResNet" consumes around 100% CPU and 50% RAM when performing the inference tasks, whereas the figures for "MobileNet v2" are significantly lower, at about 7% CPU and 37% RAM. All these results demonstrate the need for an on-demand deep learning framework supporting the selection of the most suitable DL models and the sharing of edge resources.

Fig. 7: Surface inspection latency breakdown

Fig. 8: Surface inspection resource consumption breakdown

2) Real-time camera monitoring system for large warehouses: A large number of cameras is deployed to enhance the safety of large warehouses (such as access control by face recognition and theft detection). These monitoring applications require real-time video analytics to detect abnormal situations, which generates several challenges. Firstly, the edge devices are not powerful enough to execute an extensive convolutional neural network, which would provide high detection accuracy. Secondly, vibration and noise in the captured video make the analysis more difficult; pre-processing the model input is necessary to achieve high efficiency, but this additional step may increase the end-to-end latency, so the balance between the latency and accuracy of the system must be considered. Finally, interoperation between a large camera monitoring system and the other management systems in the whole smart factory context is also a considerable challenge. To mitigate all the mentioned issues, ODLIE could be deployed directly on cameras or edge servers to support complex detection tasks, such as face recognition and people counting.

IV. CONCLUSION

With the explosion of AI and edge computing, along with the strict requirements of IIoT applications, edge intelligence is a potential solution to reduce the end-to-end latency while maintaining the service quality offered by the cloud. Addressing the challenges relating to limited computational capability as well as collaboration, we introduce ODLIE, an on-demand deep learning framework that supports selecting the deep learning model based on the deployment context and the user-desired quality. Besides, ODLIE could enhance data sharing capability by leveraging the semantic description concept. We hope that ODLIE could serve as a model when developing applications and frameworks for EI.

ACKNOWLEDGEMENT

This research is funded by Vietnam National University Ho Chi Minh City (VNU-HCM) under grant number DSC2021-26-04.

REFERENCES

[1] Y. Lu, "Industry 4.0: A survey on technologies, applications and open research issues," Journal of Industrial Information Integration, vol. 6, pp. 1–10, 2017.
[2] L. D. Xu, E. L. Xu, and L. Li, "Industry 4.0: state of the art and future trends," International Journal of Production Research, vol. 56, no. 8, pp. 2941–2962, 2018.
[3] W. Liu, Z. Wang, X. Liu, N. Zeng, Y. Liu, and F. E. Alsaadi, "A survey of deep neural network architectures and their applications," Neurocomputing, vol. 234, pp. 11–26, 2017.
[4] J. Wang, Y. Ma, L. Zhang, R. X. Gao, and D. Wu, "Deep learning for smart manufacturing: Methods and applications," Journal of Manufacturing Systems, vol. 48, pp. 144–156, 2018.
[5] W. Yu, F. Liang, X. He, W. G. Hatcher, C. Lu, J. Lin, and X. Yang, "A survey on the edge computing for the internet of things," IEEE Access, vol. 6, pp. 6900–6919, 2017.
[6] A. Yousefpour, C. Fung, T. Nguyen, K. Kadiyala, F. Jalali, A. Niakanlahiji, J. Kong, and J. P. Jue, "All one needs to know about fog computing and related edge computing paradigms: A complete survey," Journal of Systems Architecture, 2019.
[7] H. El-Sayed, S. Sankar, M. Prasad, D. Puthal, A. Gupta, M. Mohanty, and C.-T. Lin, "Edge of things: The big picture on the integration of edge, IoT and the cloud in a distributed computing environment," IEEE Access, vol. 6, pp. 1706–1717, 2017.
[8] IEC, "Edge intelligence (white paper)," 2018.
[9] X. Wang, Y. Han, V. C. Leung, D. Niyato, X. Yan, and X. Chen, "Convergence of edge computing and deep learning: A comprehensive survey," IEEE Communications Surveys & Tutorials, 2020.
[10] H. Li, K. Ota, and M. Dong, "Learning IoT in edge: Deep learning for the internet of things with edge computing," IEEE Network, vol. 32, no. 1, pp. 96–101, 2018.
[11] W. Z. Khan, E. Ahmed, S. Hakak, I. Yaqoob, and A. Ahmed, "Edge computing: A survey," Future Generation Computer Systems, vol. 97, pp. 219–235, 2019.
[12] K.-H. Le, S. K. Datta, C. Bonnet, and F. Hamon, "WoT-AD: A descriptive language for group of things in massive IoT," in 2019 IEEE 5th World Forum on Internet of Things (WF-IoT), pp. 257–262, IEEE, 2019.