IoT Malware Classification Based on System Calls

Dang Kien Hoang, Dai Tho Nguyen, Duy Loi Vu
University of Engineering and Technology, Vietnam National University, Hanoi
University of Engineering and Technology, Vietnam National University, Hanoi
UMI UMMISCO 209 (IRD/UPMC), Hanoi, Vietnam
University of Engineering and Technology, Vietnam National University, Hanoi
Email: nguyendaitho@vnu.edu.vn

Abstract

IoT devices play an important role in the fourth industrial revolution (Industry 4.0). However, this type of device may exhibit specific security vulnerabilities that can easily be exploited to launch botnet attacks and other malicious activities. In this paper, we introduce a new method for classification and clustering of IoT malware behaviors through system call monitoring. Our method is constructed from multiple one-class SVM classifiers. It classifies known malware with an F1-score over 98% and detects unknown malware with a probability of up to 97%. Unknown malware instances with similar behaviors can also be grouped together, so that new classes of malware can be discovered.

Keywords: IoT malware, MIPS malware, n-gram system call, unknown malware family detection

I. INTRODUCTION

The IoT (Internet of Things) is an important factor in the fourth industrial revolution. Its mission is to connect all kinds of devices to the Internet for the collection and exchange of data. IoT devices can run on various CPU architectures such as x86, MIPS, MIPSEL, ARM, and PowerPC, but Linux has been identified as the operating system of choice for most of them [25]. One serious problem with IoT devices is that manufacturers often neglect security measures in the rush to get their products to market as soon as possible, leaving them wide open to cyber-attacks [1, 2]. Malware attacks, a major threat to traditional computer systems, increasingly target IoT devices, attempting to infect and control them. Therefore, effective methods for automatic analysis of a large number of IoT malware samples in order to
properly handle malware outbreaks become all the more critical.

Malware on IoT devices is still a young research topic that has drawn attention mainly since the Mirai attacks. HaddadPajouh et al. [5] use OpCode features and recurrent neural networks to detect IoT malware, with a detection rate of up to 98.18%. Su et al. [6] convert program binaries to gray-scale images and use a convolutional neural network to detect and classify malware, achieving 94.0% accuracy when separating malware from benign programs. Dovom et al. [7] embed the programs' OpCodes into a vector space and apply fuzzy and fast fuzzy pattern tree methods for malware detection and classification, with an accuracy of 99.834% when detecting IoT malware. An et al. [8] extract n-gram system calls on the ARM CPU architecture and detect malware with principal component analysis (PCA) based statistical anomaly detection and one-class SVM. Their approach achieves a 100% detection rate with close to zero false alarms on two malware classes, Mirai and MrBlack. However, no research on IoT malware so far automatically detects unknown malware, analyzes the relationships among the unknown samples, and adds the resulting unknown malware classes to the classifier.

The problem we address in this paper is the analysis of malware samples collected from honeypots: identifying the family of known malware, detecting unknown malware, and automatically learning the behavior of newly discovered malware classes. We use two machine learning concepts to analyze malware: classification and clustering. A combination of classification and clustering can be used to discover unseen classes [10]. Rieck et al. [11] propose a method that automatically detects unknown malware classes and adds them to the classifier. Both their classification and their clustering algorithms are based on Euclidean distance, and the classifier is able to reject unknown malware. The EC2 method [12] combines classification and clustering to analyze Android malware, with high performance in different
datasets. Clustering helps us detect an unseen class with higher confidence than a single unknown sample flagged by the classifier. To make the combination work, the classifier must be able to detect unknown malware and must be able to learn its behavior easily. One-class classification meets both requirements.

Combinations of one-class models have been used to solve the multi-class classification problem in several studies [18, 20]. For each malware class, we can build a model that detects malware of that class, and then combine the models into a multi-class classifier. If all models give a negative label for a malware sample, the sample may be unknown malware. When a new class appears, we only need to build a one-class model for that class, without rebuilding the whole classifier. The difficulty lies in deciding the label when two or more one-class models give positive results. Comar et al. [20] build a hypersphere for each malware class based on the one-class SVM algorithm proposed by Tax and Duin. To decide between two or more positive models, the authors assume, without proof, that the distances from the samples of a class to the center of its hypersphere follow a Gaussian distribution. For each incoming malware sample, each model gives a p-value, and the label of the sample is decided by the maximum value.

Authorized licensed use limited to: Cornell University Library. Downloaded on August 17, 2020 at 02:32:17 UTC from IEEE Xplore. Restrictions apply.

Hadjadji et al. [18] use a logarithmic function instead of the sign function to calibrate the outputs of one-class SVMs, and propose a method that gives a probability for each class; the decision is based on the maximum probability. However, one-class classification used in multi-class tasks usually performs worse than an ordinary multi-class classifier [24]. If one or more models give a poor probability for a malware sample, then the final result can be
affected and we can misclassify malware.

In this paper, we introduce a method that resolves label conflicts between one-class models, and we build a complete method for automatic malware analysis. As a first step, we focus on malware for IoT devices with the MIPS CPU architecture running Linux. MIPS is a popular CPU architecture in routers and is still rarely addressed in research on IoT malware. We collected more than 3900 malware samples from several sources and executed them in our sandbox environment to monitor their system calls; details of the sandbox can be found in [16]. Each malware sample is represented by a point in a vector space whose coordinates are constructed from 2-grams of system calls.

Our method has three main components: (1) classification, (2) clustering, and (3) modeling of unknown families. Classification lets us classify known malware and detect unknown malware. Clustering discovers novel clusters of unknown malware with similar behavior; clusters with enough samples are considered unknown malware classes. Modeling of unknown families builds a model for such a class and adds it to the classifier.

The classification model is constructed from multiple one-class SVM models, one per malware class, each predicting whether a malware sample belongs to its class. The problem with this classifier is conflicting results between one-class models: two or more models can give a positive result. We resolve this by assigning a priority to each model. Each model is evaluated independently and assigned a priority based on its AUC-ROC score, and the label of a sample is decided by the highest-priority model that gives a positive result. If all models give negative results, the sample is considered unknown malware and is put in a separate dataset, on which clustering is performed with several popular algorithms. Clusters that have enough data samples are marked as a new malware family, and we construct a one-class model
for that family with component (3), modeling of unknown families. The new one-class model is added to the classifier so that it can detect the malware family we have discovered. The classification component classifies known malware samples with an F1-score over 98% and detects unknown malware with a probability of up to 97%. The clustering component groups malware into clusters with an F-score of up to 95.73%.

The rest of this paper is organized as follows. Section II gives background. Section III presents our method with its three components: classification, clustering, and modeling of unknown malware families. Section IV introduces the malware dataset, the evaluation metrics, and the results.

II. BACKGROUND

A. One-class classification

One-class classification is a technique designed for learning the patterns of a single class, called the target class. Its main goal is to detect outliers and anomalies, that is, to decide whether a data sample belongs to the target class or not [17, 23]. Several one-class algorithms have been introduced, such as the One-Class Nearest Neighbor Classifier, the Auto-Associative Neural Network Classifier, and the One-Class Support Vector Machine Classifier [18]. In the following, we describe the one-class SVM algorithm proposed by Schölkopf et al. in [9], which we use to build multiple one-class models and combine them. One-class SVM has been successfully applied in various studies on anomaly detection and malware analysis [19, 20, 21, 22]. A combination of multiple one-class classifiers can be used to construct a multi-class classifier that is able to detect unknown classes and can easily be extended with a new class. In addition, a one-class classifier is trained only on the data of its target class, which lets us avoid the unbalanced data problem [18].

B. One-class SVM

There are many versions of one-class SVM, but the most popular is the one-class SVM classifier proposed by Schölkopf et al. in [9]. That version is supported by many well-known machine learning libraries such as LibSVM and scikit-learn. It finds a
hyperplane that best separates the training set from the origin, and learns a decision function that classifies new data as similar or different to the training set. The problem is formulated as follows:

$$\min_{w,\,\xi,\,\rho} \left( \frac{1}{2}\|w\|^2 - \rho + \frac{1}{hC}\sum_{i} \xi_i \right) \quad \text{subject to} \quad w \cdot \Phi(x_i) \ge \rho - \xi_i, \quad \xi_i \ge 0$$

where $x_i$ is a sample in the training set, $\Phi(x_i)$ is a mapping from $x_i$ in the original feature space to an inner product space, $w$ is a vector orthogonal to the hyperplane, $C$ poses an upper bound on the fraction of training errors and a lower bound on the fraction of support vectors, $h$ is the total number of training patterns, $\xi = [\xi_1, \ldots, \xi_h]$ are penalty terms for errors, and $\rho$ is the offset that determines the distance of the hyperplane from the origin.

One-class SVM can be used for anomaly detection, because only normal data is needed to construct a model. The one-class approach is also well suited to detecting a new class and adding it to the classifier: when we detect a new class, we only need to build a one-class model for that class, without rebuilding the whole classifier.

III. OUR PROPOSED METHOD

The schematic overview of our analysis method is depicted in Figure 1. The proposed system has three components:

(1) Classification using multiple one-class SVM models: assign a known malware sample to its family and reject unknown malware.
(2) Clustering: assign unknown malware samples that have the same behavior to clusters. Clusters that have enough data samples are considered unknown malware families.
(3) Modeling unknown malware family: build a one-class SVM model for each unknown malware family detected by the clustering component.

Figure 1: Schematic overview of the analysis method

A. Feature extraction and reduction

We represent a system call log by a point in a vector space, using the 2-gram method used by Rieck et al. [11]. Let the set A contain all possible system calls, let the set S contain all possible 2-grams of system calls ($|S| = |A|^2$), and let L be the system call sequence. The point belongs to a $|S|$-dimensional vector space, and its coordinates are constructed as follows:

    for each x ∈ S:
        if L contains x then v[x] = 1.0 else v[x] = 0.0
    return v / ||v||

With this feature extraction method, the point that represents a malware sample lies on a hypersphere with a radius of 1. If we then build a hyperplane with one-class SVM for each class in the original space, the region of each class is bounded.

$|S|$ can be a large number and depends on the number of system calls in the sandbox environment. The system calls of a system depend on the OS kernel and the ABI (Application Binary Interface). In the sandbox environment we use, the kernel is Linux 3.2.0 and the OS is Debian MIPS, so there is a total of 347 system calls and the dimension of the 2-gram space is $347^2 = 120{,}409$. Because the high dimension of the vector space slows down our components, we apply PCA to transform the points that represent malware into a lower-dimensional vector space. PCA is a common dimensionality reduction method that finds a sequence of linear combinations of the variables that have maximal variance and are mutually uncorrelated. We can choose the dimension of the vector space that results from applying PCA to a dataset. PCA is a lossy compression method: if the number of remaining dimensions is too low, the information loss is very high.

B. Classification using multiple one-class SVM models

We construct a one-class SVM model for each malware class that we have. Each model detects whether a malware sample belongs to its target class or not. If all one-class classifiers predict that a sample does not belong to their class, the sample is rejected as unknown malware. All unknown malware samples are pushed to a dataset that serves the clustering component. However, two or more one-class classifiers may each detect that a malware sample belongs to their class, and in that case we cannot directly assign a label. To solve that problem, we train and evaluate each model independently with the AUC (area under the curve) of the ROC (receiver operating characteristic) curve. The ROC curve is a curve in a 2D space whose horizontal axis is the FPR and whose vertical axis is the TPR. The AUC is the area under the ROC curve and can be used to evaluate the performance of a binary classifier, that is, a classifier for data samples that have only two labels, positive and negative. An example of a ROC curve and its AUC is shown in Figure 2; in that case the AUC = 0.95104.

Figure 2: Example of ROC curve and area under the curve

For each one-class model, we only consider whether a label belongs to the target class or not. To determine the ROC curve, the one-class model must be able to give a score for each data sample, so that once we choose a threshold we can predict whether a sample is positive or not. The score that we use to calculate the AUC of a one-class model is computed as follows:

$$\text{score} = \frac{w^T x + b}{\|w\|}$$

where $x = [x_1, x_2, \ldots, x_n]^T$ is the coordinate vector of the data sample to be scored, and $w$ and $b$ are the parameters of the hyperplane found by the one-class SVM algorithm. The score can be positive or negative, and its absolute value equals the distance from the data sample $x$ to the hyperplane.

Figure 3: Classification using multiple one-class SVM models

We calculate the AUC of each one-class model from the score defined above and sort the models by AUC. The models are then connected like a chain: for each incoming malware sample, we start with the best model, then the second-best, the third-best, and so on. If any model gives a positive result, we stop and assign the class label of that model. If all models give a negative result, the malware will be rejected as an
unknown malware sample. This method also speeds up the classification of new data, because we can stop as soon as a model gives a positive result. An overview of the classification method is shown in Figure 3.

C. Clustering unknown malware

When a single unknown malware sample is detected, it is premature to conclude that a new malware family has appeared: a single data point is not convincing enough and may be a false alarm of the classifier. However, if many malware samples are rejected as unknown and they show the same behavior, the appearance of a new malware family is more certain. We apply clustering to the dataset of unknown malware to explore the relationships between unknown samples. If any cluster has enough data samples (more than a threshold), we consider it an unknown malware family. The algorithm and the parameters of the clustering component are chosen using labeled data: we run the clustering algorithms on known malware data and compare the result with the real labels. In the perfect case, all data samples with the same label belong to one cluster, and all data in each cluster have only one label. We use algorithms that do not need to know the number of clusters or the cluster shapes, such as hierarchical clustering, DBSCAN, and mean-shift, and choose the best one; the results are discussed in Section IV.

D. Modeling unknown malware family

When an unknown malware family is detected, we only need to train a one-class model for that family, without rebuilding the whole classifier. The one-class model is built with the new malware family as the target class and all other malware families as the second class. Its AUC is calculated and compared with those of the one-class models constructed before. From then on, the multi-class classifier can detect malware of the newly discovered class.

IV. EXPERIMENT

A. Dataset

We have 3987 malware samples, collected
from IoTPOT [13], VirusShare [14], and Detux [15]. The malware samples are executed in a sandbox environment; the system call logs are collected with Strace [16], and INetSim is used to simulate the Internet environment (see [16] for more details). The malware samples were labeled by well-known antivirus vendors on VirusTotal: Kaspersky, Symantec, Avast, and Avira. The malware classes we obtained include Gafgyt, Mirai, MrBlack, Tsunami, Hajime, and Downloader-Mirai; the biggest family is Gafgyt, with more than 1300 samples. All malware classes with fewer than 10 samples were discarded, namely Moose, Persirai, Luabot, Pilkah, and Wifatch. To prevent a large skew between the numbers of samples per class, we randomly discard samples from classes that have too much data. After data reduction, we have 942 malware samples with system call logs. The number of samples for each class is given in Table 1.

Malware class       Samples
Mirai               300
Gafgyt              300
MrBlack             255
Tsunami             47
Downloader-Mirai    30
Hajime              13
Table 1: Dataset of malware classes

B. Result

Constructing One-class model

We construct a one-class model for each malware class that we have, and again each time we detect an unknown malware family by clustering. We use the AUC of the ROC curve to evaluate the performance of the one-class models. The ROC curve is a curve in a 2D space (called ROC space) whose horizontal axis is the FPR and whose vertical axis is the TPR. To determine the ROC curve, the model must be able to give a score for each data sample (for example, a positive probability), so that once we choose a threshold we can predict whether a sample is positive or not. For the model of a target class at a given threshold, TPR and FPR are defined as follows:

$$TPR = \frac{TP}{TP + FN}, \qquad FPR = \frac{FP}{FP + TN}$$

TP is the number of malware samples that belong to the target class and were labeled as positive by the one-class model. FP is the number of malware samples that do not belong to the target class but were labeled as positive by the one-class model. TN is the number of malware samples that do not belong to the target class and were labeled as negative by the one-class model. FN is the number of malware samples that belong to the target class but were labeled as negative by the
one-class model. When we vary the value of the threshold, we obtain many different (TPR, FPR) pairs and therefore many points in ROC space. Putting those points together gives the ROC curve. The AUC is the area under the ROC curve; the bigger the AUC, the better the classifier. In each run, we choose a target class and build a one-class SVM model from the data of that class. Data of the target class have the positive label, and all data of the other classes are considered negative. The AUC of the ROC curve is then calculated for each classifier. The result is shown in Table 2.

Model for malware class   AUC
Gafgyt                    0.9996
Mirai                     0.9985
Downloader-Mirai          0.9983
MrBlack                   0.9999
Tsunami                   0.9997
Hajime                    0.9999
Table 2: AUC (ROC) of one-class SVM models

The result shows that the models for Hajime and MrBlack have the best result, with AUC = 0.9999. These two malware families have distinctive traits that set them apart from the others: MrBlack was developed to survive a device reboot, and Hajime uses a P2P architecture to control zombies, a new architectural model that was recently reported [26]. All classifiers have a good AUC score, which means we can easily find a threshold that separates the data of the target class from the other classes.

Classification

After training and evaluating the one-class classifiers, we arrange them by AUC in descending order: MrBlack, Hajime, Tsunami, Gafgyt, Mirai, Downloader-Mirai. The six one-class models, chained in that order, form the multi-class classifier, whose performance we then evaluate. The metric that we use to evaluate the combination of one-class models is the $F1_{micro}$ score. $F1_{micro}$ can be used to evaluate the performance of a multi-class classification model and is defined as follows:

$$P_{micro} = \frac{\sum_i TP_i}{\sum_i (TP_i + FP_i)}, \qquad R_{micro} = \frac{\sum_i TP_i}{\sum_i (TP_i + FN_i)}, \qquad F1_{micro} = 2\,\frac{P_{micro}\,R_{micro}}{P_{micro} + R_{micro}}$$

$TP_i$ is the number of malware samples that belong to the i-th class and were labeled as the i-th class by the classification model. $FP_i$ is the number of malware samples that do not belong to the i-th class but were labeled as the i-th class by the classification model. $FN_i$ is the number of malware samples that belong to the i-th class but were labeled as another class by the classification model.

We consider two goals for the multi-class classifier: the ability to classify known malware and the ability to detect unknown malware. We randomly split the data into a training and a testing partition 30 times. Each time, one randomly chosen class is treated as unknown and contributes no samples to the training partition, so the testing partition contains a known part and an unknown part. We compare the result with the SVM algorithm and with the method based on Euclidean distance used by Rieck et al., which we introduced in the introduction section. We consider two variants of $F1_{micro}$, $F_k$ and $F_u$:

$F_k$: $F1_{micro}$ calculated on the known part of the testing partition.
$F_u$: $F1_{micro}$ calculated on the whole testing partition, where only two labels are considered, known or unknown. That is, whether the classifier labels a sample as class A or class B (A and B being known malware classes) does not matter for $F_u$; we only care whether the malware is known or unknown.

The result is shown in Table 3. It shows that the method we propose, using one-class SVM, has the best result, with $F_k$ = 0.9848 and $F_u$ = 0.9778.

Algorithm                F_k      F_u
SVM                      0.9772   0.9759
Method of Rieck et al.   0.9837   0.9650
Our method               0.9848   0.9778
Table 3: Classification result

Clustering

We apply clustering to our dataset to evaluate its performance. First, we set the labels of all data aside; we then cluster the unlabeled data and compare the result with the real labels. In the perfect case, all data samples in a cluster have only one label, and all data samples in the dataset that have the same label belong to only one cluster. We use common algorithms that do not need to know the number or the shape of the clusters: DBSCAN, mean-shift, and hierarchical clustering (single linkage, average linkage, complete linkage). We use the F-measure to evaluate the clustering result; it is calculated as follows [11]:

$$P = \frac{1}{n}\sum_{c \in C} \#c, \qquad R = \frac{1}{n}\sum_{l \in L} \#l, \qquad F = 2\,\frac{P \cdot R}{P + R}$$

where $C$ is the set of clusters, $L$ is the set of labels, $n$ is the number of data samples, $\#c$ is the largest number of data samples in cluster $c$ that have the same label, and $\#l$ is the largest number of data samples labeled $l$ that belong to one cluster. The distance metric that we use is Euclidean distance. The results of the algorithms are shown in Table 4. Table 4 shows that the mean-shift algorithm has the best performance, with F-score = 0.9573.

Algorithm                F-score
HAC (single linkage)     0.8897
HAC (complete linkage)   0.9159
HAC (average linkage)    0.9440
DBSCAN                   0.9359
Mean-shift               0.9573
Table 4: Clustering result

CONCLUSION

In this paper, we have introduced a method to analyze IoT malware automatically, with three components. The classification component assigns malware to known families and detects unknown malware. The clustering component helps us discover relationships between unknown malware samples and detect new malware families. The component that constructs one-class models for unknown malware adds the newly found families to the multi-class classifier. In the future, we will continue collecting system calls of IoT malware on other CPU architectures. In parallel, we will apply this method to malware collected from honeypots, to detect new patterns of infection and thereby assess the practical effectiveness of the method.

REFERENCES

[1] J. Granjal, E. Monteiro, and J. Sa Silva, “Security for the internet of things: a survey of existing
protocols and open research issues,” IEEE Communications Surveys & Tutorials, vol. 17, no. 3, pp. 1294–1312, 2015.
[2] O. Arias, J. Wurm, K. Hoang, and Y. Jin, “Privacy and security in internet of things and wearable devices,” IEEE Transactions on Multi-Scale Computing Systems, vol. 1, no. 2, pp. 99–109, 2015.
[3] https://securelist.com/new-trends-in-the-world-of-iot-threats/87991/
[4] https://securelist.com/iot-a-malware-story/94451/
[5] H. HaddadPajouh, A. Dehghantanha, R. Khayami, and K.-K. R. Choo, “A deep recurrent neural network based approach for internet of things malware threat hunting,” Future Generation Computer Systems, vol. 85, pp. 88–96, 2018.
[6] J. Su, V. D. Vasconcellos, S. Prasad, D. Sgandurra, Y. Feng, and K. Sakurai, “Lightweight classification of IoT malware based on image recognition,” in 2018 IEEE 42nd Annual Computer Software and Applications Conference (COMPSAC), vol. 02, 2018, pp. 664–669.
[7] E. M. Dovom, A. Azmoodeh, A. Dehghantanha, D. E. Newton, R. M. Parizi, and H. Karimipour, “Fuzzy Pattern Tree for Edge Malware Detection and Categorization in IoT,” pp. 1–7, 2019.
[8] N. An, A. Duff, G. Naik, M. Faloutsos, S. Weber, and S. Mancoridis, “Behavioral anomaly detection of malware on home routers,” in 2017 12th International Conference on Malicious and Unwanted Software (MALWARE), pp. 47–54, 2017.
[9] B. Schölkopf, J. C. Platt, J. Shawe-Taylor, A. J. Smola, and R. C. Williamson, “Estimating the support of a high-dimensional distribution,” Neural Computation, vol. 13, no. 7, pp. 1443–1471, 2001.
[10] L. Shu, H. Xu, and B. Liu, “Unseen class discovery in open-world classification,” arXiv preprint arXiv:1801.05609, 2018.
[11] K. Rieck, P. Trinius, C. Willems, and T. Holz, “Automatic Analysis of Malware Behavior using Machine Learning,” Journal of Computer Security (JCS), vol. 19, no. 4, pp. 639–668, IOS Press, June 2011.
[12] T. Chakraborty, F. Pierazzi, and V. S. Subrahmanian, “EC2: Ensemble Clustering and Classification for Predicting Android Malware Families,” IEEE Transactions on Dependable and Secure Computing, 2017, doi:
10.1109/TDSC.2017.2739145.
[13] Y. M. P. Pa, S. Suzuki, K. Yoshioka, T. Matsumoto, T. Kasama, and C. Rossow, “IoTPOT: A Novel Honeypot for Revealing Current IoT Threats,” Journal of Information Processing, vol. 24, no. 3, pp. 522–533, 2016.
[14] https://virusshare.com/
[15] https://detux.com
[16] Tran Nghi Phu, Kien Hoang Dang, Dung Ngo Quoc, Nguyen Tho Dai, and Nguyen Ngoc Binh, “A Novel Framework to Classify Malware in MIPS Architecture-Based IoT Devices,” Security and Communication Networks, pp. 1–13, 2019.
[17] D. M. J. Tax and R. P. W. Duin, “Characterizing one-class datasets,” in Proceedings of the Sixteenth Annual Symposium of the Pattern Recognition Association of South Africa, pp. 21–26, 2005.
[18] B. Hadjadji, Y. Chibani, and Y. Guerbai, “Multiple One-Class Classifier Combination for Multiclass Classification,” in 22nd International Conference on Pattern Recognition, 2014.
[19] R. Perdisci, D. Ariu, P. Fogla, G. Giacinto, and W. Lee, “McPAD: A multiple classifier system for accurate payload-based anomaly detection,” Computer Networks, vol. 53, no. 6, pp. 864–881, 2009.
[20] P. M. Comar, L. Liu, S. Saha, P.-N. Tan, and A. Nucci, “Combining supervised and unsupervised learning for zero-day malware detection,” in 2013 Proceedings IEEE INFOCOM, 2013.
[21] E. Burnaev and D. Smolyakov, “One-Class SVM with Privileged Information and Its Application to Malware Detection,” in 2016 IEEE 16th International Conference on Data Mining Workshops (ICDMW), 2016.
[22] K. A. Heller, K. M. Svore, A. D. Keromytis, and S. J. Stolfo, “One class support vector machines for detecting anomalous windows registry accesses,” in Proceedings of the workshop on data mining for computer security, vol., 2003.
[23] B. Krawczyk and M. Wozniak, “Experiments on distance measures for combining one-class classifiers,” in 2012 Federated Conference on Computer Science and Information Systems (FedCSIS 2012), pp. 89–92.
[24] O. Boehm, D. R. Hardoon, and L. M. Manevitz, “Classifying cognitive states of brain activity via one-class neural networks with feature selection by genetic
algorithms,” Int. J. Mach. Learn. & Cyber., pp. 125–134, 2011.
[25] https://www.itprotoday.com/iot/survey-shows-linux-top-operatingsystem-internet-things-devices
[26] M. De Donno, N. Dragoni, A. Giaretta, and A. Spognardi, “DDoS-Capable IoT Malwares: Comparative Analysis and Mirai Investigation,” Security and Communication Networks, 2018.
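
As an illustration of the feature extraction in Section III.A, the AUC-ordered chain of one-class models in Section III.B, and the clustering F-measure in Section IV, the following Python sketch may help. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the sparse dictionary representation of the 2-gram vector, the hypothetical `(family, auc, score_fn)` triples, and the decision threshold of 0 are all assumptions made here for illustration. Training of the one-class SVM models and the PCA step are left out.

```python
from collections import Counter
from math import sqrt


def two_gram_vector(syscalls):
    """Binary 2-gram presence vector of a system call sequence, L2-normalised.

    Stored sparsely as {2-gram: value}; 2-grams that do not occur are
    implicitly 0.0 (assumption: a dense |S|-dimensional array is not needed).
    """
    grams = set(zip(syscalls, syscalls[1:]))
    if not grams:
        return {}
    norm = sqrt(len(grams))  # every present 2-gram has value 1.0 before scaling
    return {g: 1.0 / norm for g in grams}


def classify_chain(models, x):
    """Try one-class models in descending AUC order; the first positive wins.

    models: list of (family, auc, score_fn) triples (hypothetical layout).
    score_fn(x) > 0 is treated as a positive result (assumed threshold).
    Returns the family name, or "unknown" if every model rejects x.
    """
    for family, _auc, score_fn in sorted(models, key=lambda m: m[1], reverse=True):
        if score_fn(x) > 0:
            return family
    return "unknown"


def clustering_f_measure(labels, clusters):
    """F-measure from P = (1/n) sum_c #c and R = (1/n) sum_l #l.

    labels and clusters are equal-length, non-empty sequences giving the
    true label and the assigned cluster id of each sample.
    """
    n = len(labels)
    by_cluster, by_label = {}, {}
    for label, cluster in zip(labels, clusters):
        by_cluster.setdefault(cluster, Counter())[label] += 1
        by_label.setdefault(label, Counter())[cluster] += 1
    p = sum(max(c.values()) for c in by_cluster.values()) / n
    r = sum(max(c.values()) for c in by_label.values()) / n
    return 2 * p * r / (p + r)
```

In practice each `score_fn` would wrap the signed distance of a trained one-class SVM to its hyperplane, and the AUC values would come from the per-family evaluation of Table 2. Sorting inside `classify_chain` keeps the sketch short; a real pipeline would likely sort the models once, after training.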