Big Data Analytics for Internet of Things

Edited by

Tausifa Jan Saleem
National Institute of Technology Srinagar, India

Mohammad Ahsan Chishti
Central University of Kashmir
Ganderbal, Kashmir, India

This edition first published 2021
© 2021 John Wiley & Sons, Inc.

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by law. Advice on how to obtain permission to reuse material from this title is available at http://www.wiley.com/go/permissions.

The right of Tausifa Jan Saleem and Mohammad Ahsan Chishti to be identified as the author(s) of the editorial material in this work has been asserted in accordance with law.

Registered Office
John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, USA

Editorial Office
111 River Street, Hoboken, NJ 07030, USA

For details of our global editorial offices, customer services, and more information about Wiley products visit us at www.wiley.com.

Wiley also publishes its books in a variety of electronic formats and by print-on-demand. Some content that appears in standard print versions of this book may not be available in other formats.

Limit of Liability/Disclaimer of Warranty
While the publisher and authors have used their best efforts in preparing this work, they make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation any implied warranties of merchantability or fitness for a particular purpose. No warranty may be created or extended by sales representatives, written sales materials or promotional statements for this work. The fact that an organization, website, or product is referred to in this work as a citation and/or potential source of further information does not mean that the publisher and
authors endorse the information or services the organization, website, or product may provide or recommendations it may make. This work is sold with the understanding that the publisher is not engaged in rendering professional services. The advice and strategies contained herein may not be suitable for your situation. You should consult with a specialist where appropriate. Further, readers should be aware that websites listed in this work may have changed or disappeared between when this work was written and when it is read. Neither the publisher nor authors shall be liable for any loss of profit or any other commercial damages, including but not limited to special, incidental, consequential, or other damages.

Library of Congress Cataloging-in-Publication Data

Names: Saleem, Tausifa Jan, editor | Chishti, Mohammad Ahsan, editor
Title: Big data analytics for Internet of things / edited by Tausifa Jan Saleem, Mohammad Ahsan Chishti
Description: First edition | Hoboken, NJ : Wiley, 2021 | Includes bibliographical references and index
Identifiers: LCCN 2020049761 (print) | LCCN 2020049762 (ebook) | ISBN 9781119740759 (hardback) | ISBN 9781119740766 (adobe pdf) | ISBN 9781119740773 (epub)
Subjects: LCSH: Big data | Internet of things
Classification: LCC QA76.9.B45 B4995 2021 (print) | LCC QA76.9.B45 (ebook) | DDC 005.7–dc23
LC record available at https://lccn.loc.gov/2020049761
LC ebook record available at https://lccn.loc.gov/2020049762

Cover Design: Wiley
Cover Image: © Blue Planet Studio/iStock/Getty Images Plus/Getty Images

Set in 9.5/12.5pt STIXTwoText by SPi Global, Pondicherry, India

10 9 8 7 6 5 4 3 2

Contents

List of Contributors xv
List of Abbreviations xix

Big Data Analytics for the Internet of Things: An Overview
Tausifa Jan Saleem and Mohammad Ahsan Chishti

Data, Analytics and Interoperability Between Systems (IoT) is Incongruous with the Economics of Technology: Evolution of Porous Pareto Partition (P3)
Shoumen Palit Austin Datta, Tausifa Jan Saleem, Molood Barati, María Victoria López López, Marie-Laure Furgala, Diana C. Vanegas, Gérald Santucci, Pramod P. Khargonekar, and Eric S. McLamore
2.1 Context
2.2 Models in the Background 12
2.3 Problem Space: Are We Asking the Correct Questions? 14
2.4 Solutions Approach: The Elusive Quest to Build Bridges Between Data and Decisions 15
2.5 Avoid This Space: The Deception Space 17
2.6 Explore the Solution Space: Necessary to Ask Questions That May Not Have Answers, Yet 17
2.7 Solution Economy: Will We Ever Get There? 19
2.8 Is This Faux Naïveté in Its Purest Distillate? 21
2.9 Reality Check: Data Fusion 22
2.10 “Double A” Perspective of Data and Tools vs The Hypothetical Porous Pareto (80/20) Partition 28
2.11 Conundrums 29
2.12 Stigma of Partition vs Astigmatism of Vision 38
2.13 The Illusion of Data, Delusion of Big Data, and the Absence of Intelligence in AI 40
2.14 In Service of Society 50
2.15 Data Science in Service of Society: Knowledge and Performance from PEAS 52
2.16 Temporary Conclusion 60
Acknowledgements 63
References 63

Machine Learning Techniques for IoT Data Analytics 89
Nailah Afshan and Ranjeet Kumar Rout
3.1 Introduction 89
3.2 Taxonomy of Machine Learning Techniques 94
3.2.1 Supervised ML Algorithm 95
3.2.1.1 Classification 96
3.2.1.2 Regression Analysis 98
3.2.1.3 Classification and Regression Tasks 99
3.2.2 Unsupervised Machine Learning Algorithms 103
3.2.2.1 Clustering 103
3.2.2.2 Feature Extraction 106
3.2.3 Conclusion 107
References 107

IoT Data Analytics Using Cloud Computing 115
Anjum Sheikh, Sunil Kumar, and Asha Ambhaikar
4.1 Introduction 115
4.2 IoT Data Analytics 117
4.2.1 Process of IoT Analytics 117
4.2.2 Types of Analytics 118
4.3 Cloud Computing for IoT 118
4.3.1 Deployment Models for Cloud 120
4.3.1.1 Private Cloud 120
4.3.1.2 Public Cloud 120
4.3.1.3 Hybrid Cloud 121
4.3.1.4 Community Cloud 121
4.3.2 Service Models for Cloud Computing 122
4.3.2.1 Software as a Service (SaaS) 122
4.3.2.2 Platform as a Service (PaaS) 122
4.3.2.3 Infrastructure as a Service (IaaS) 122
4.3.3 Data Analytics on Cloud 123
4.4 Cloud-Based IoT Data Analytics Platform 123
4.4.1 Atos Codex 125
4.4.2 AWS IoT 125
4.4.3 IBM Watson IoT 126
4.4.4 Hitachi Vantara Pentaho, Lumada 127
4.4.5 Microsoft Azure IoT 128
4.4.6 Oracle IoT Cloud Services 129
4.5 Machine Learning for IoT Analytics in Cloud 132
4.5.1 ML Algorithms for Data Analytics 132
4.5.2 Types of Predictions Supported by ML and Cloud 136
4.6 Challenges for Analytics Using Cloud 137
4.7 Conclusion 139
References 139

Deep Learning Architectures for IoT Data Analytics 143
Snowber Mushtaq and Omkar Singh
5.1 Introduction 143
5.1.1 Types of Learning Algorithms 146
5.1.1.1 Supervised Learning 146
5.1.1.2 Unsupervised Learning 146
5.1.1.3 Semi-Supervised Learning 146
5.1.1.4 Reinforcement Learning 146
5.1.2 Steps Involved in Solving a Problem 146
5.1.2.1 Basic Terminology 147
5.1.2.2 Training Process 147
5.1.3 Modeling in Data Science 147
5.1.3.1 Generative 148
5.1.3.2 Discriminative 148
5.1.4 Why DL and IoT? 148
5.2 DL Architectures 149
5.2.1 Restricted Boltzmann Machine 149
5.2.1.1 Training Boltzmann Machine 150
5.2.1.2 Applications of RBM 151
5.2.2 Deep Belief Networks (DBN) 151
5.2.2.1 Training DBN 152
5.2.2.2 Applications of DBN 153
5.2.3 Autoencoders 153
5.2.3.1 Training of AE 153
5.2.3.2 Applications of AE 154
5.2.4 Convolutional Neural Networks (CNN) 154
5.2.4.1 Layers of CNN 155
5.2.4.2 Activation Functions Used in CNN 156
5.2.4.3 Applications of CNN 158
5.2.5 Generative Adversarial Network (GANs) 158
5.2.5.1 Training of GANs 158
5.2.5.2 Variants of GANs 159
5.2.5.3 Applications of GANs 159
5.2.6 Recurrent Neural Networks (RNN) 159
5.2.6.1 Training of RNN 160
5.2.6.2 Applications of RNN 161
5.2.7 Long Short-Term Memory (LSTM) 161
5.2.7.1 Training of LSTM 161
5.2.7.2 Applications of LSTM 162
5.3 Conclusion 162
References 163

Adding Personal Touches to IoT: A User-Centric IoT Architecture 167
Sarabjeet Kaur Kochhar
6.1 Introduction
167
6.2 Enabling Technologies for BDA of IoT Systems 169
6.3 Personalizing the IoT 171
6.3.1 Personalization for Business 172
6.3.2 Personalization for Marketing 172
6.3.3 Personalization for Product Improvement and Service Optimization 173
6.3.4 Personalization for Automated Recommendations 174
6.3.5 Personalization for Improved User Experience 174
6.4 Related Work 175
6.5 User Sensitized IoT Architecture 176
6.6 The Tweaked Data Layer 178
6.7 The Personalization Layer 180
6.7.1 The Characterization Engine 180
6.7.2 The Sentiment Analyzer 182
6.8 Concerns and Future Directions 183
6.9 Conclusions 184
References 185

Smart Cities and the Internet of Things 187
Hemant Garg, Sushil Gupta, and Basant Garg
7.1 Introduction 187
7.2 Development of Smart Cities and the IoT 188
7.3 The Combination of the IoT with Development of City Architecture to Form Smart Cities 189
7.3.1 Unification of the IoT 190
7.3.2 Security of Smart Cities 190
7.3.3 Management of Water and Related Amenities 190
7.3.4 Power Distribution and Management 191

Index

a
analysis, IoT 117
Apache HBase 222
Apache Spark 221–222
Apache Storm 222
artificial intelligence (AI) 43, 48, 50, 144
artificial neural network (ANN) 48, 49, 106
artificial reasoning tools (ART) 48–50
association rule mining 324
Atos Codex 125
attribute‐based encryption (ABE) 274
autoencoders (AE) 153–154
automatic license plate recognition (ALPR) 232–233
AWS IoT analytics 125–126, 130

b
backpropagation through time (BPTT) 160
bagging see bootstrap aggregating
BERT 45–46
big data
  data journalism
    BBC big data 340, 342
    guardian data blog 342–343
    Indian scenario 345–346
    Internet of Things 346–347
    media impact on 347–348
    wikileaks 344
    World Economic Forum 344–345
  finance
    financial markets 358–359
    financial services 359–360
    internet finance 359
    other financial issues 360
big data analytics (BDA) 351
  business intelligence analytics 322–323
  challenges in 324–325
  IoT systems 169–171
  massive analytics 323
  memory‐level analytics systems 322
  methods 324
  off‐line analytics systems 322
  personalization 168–169
  real‐time analytical systems 322
binary prediction technique 136–137
bootstrap aggregating 102–103
business intelligence (BI) analytics 322–323
business, personalization for 172

c
canonical correlation analysis (CCA) 107
canonical hyperplane 100
category prediction technique 137
Character Generator Protocol (CHARGEN) 296
city architecture, smart cities
  city assets and human resources 192
  environmental pollution management 192
  power distribution and management
191
  revenue collection and administration 191–192
  security of 190
  unification of the IoT 190
  water and related amenities 190–191
classification and regression trees (CART) 101–102
cloud‐based defense framework 310
cloud‐based integrated water management system
  data distribution, Wi‐Fi IOT communicator app 261
  experimental setup module 259–260
  flow rate vs bill generated 258
  literature survey 248–250
  six‐tier data framework
    contact unit (FC‐37) 253
    GSM‐based ARM and control system 253
    internet of things communicator (ESP8266) 253
    methodology 253–256
    primary components 251
    proposed algorithm 256–257
  time vs water flow rate 258
  water report of both house 262
cloud‐based IoT data analytics platform
  Atos Codex 125
  AWS IoT 125–126, 130
  Hitachi Vantara Pentaho, Lumada 127–128, 131
  IBM Watson IoT 126–127
  Microsoft Azure IoT 128–129, 134
  Oracle IoT cloud services 129, 132, 135
cloud‐based platforms, PMU 223–224
cloud computing
  analytics challenges 137–139
  analytics types 118
  benefits of 116, 119
  data analytics on 123
  DDoS attacks 293
    application level attacks 296, 297
    community cloud 292–293
    hybrid cloud 293
    infrastructure level attacks 294–296
    private cloud 292–294
    probable impact 297–298
    public cloud 290, 292–294
    taxonomy 291
  deployment models for
    community cloud 121
    hybrid cloud 121
    private cloud 120
    public cloud 120–121
  infrastructure storage 119
  IoT analytics process 117–118
  machine learning
    binary prediction 136–137
    category prediction 137
    for data analytics 132, 133, 135
    value prediction 137
  service models for 122–123
cloud node 292
Cloud Service Provider (CSP) 297
Cluster Communication Protocol (CCP) 301
cognition 43
collection of data 117
communication channel, data security 276–277
community cloud 121, 292–293
conditional restricted Boltzmann machine (CRBM) 151
control station 209
ConvNet see convolutional neural networks (CNN)
convolutional neural network (CNN) 233–234
  activation functions 156–157
  applications of 158
  convolution layer 155
  fully connected layer 156
  pooling layer 155–156
  ReLU 156, 157
  sigmoid function 156, 157
  Tanh() 156, 157
core vector machine (CVM) 212
COVID‐19 pandemic 40
Crossfire attack 296
cybernetics 9, 13
cyberphysical systems (CPS) 12
cybersecurity 216–217

d
data analytics
  cloud computing 123
  ML algorithms, cloud computing 132, 133, 135
data and/or information‐informed decision support (DIDAS) 55–56
database management system (DBMS) 15–16
data democratization 28
data fusion 22–26
  challenges 326
  for IoT security 327–329
  levels of 326–327
  mathematical methods for 326–327
  opportunities provided by 326
data‐informed decision support (DIDAS) 22
data journalism
  accessing data for 337–338
  big data
    BBC big data 340, 342
    guardian data blog 342–343
    Indian scenario 345–346
    Internet of Things 346–347
    media impact on 347–348
    wikileaks 344
    World Economic Forum 344–345
  data analytics 338–340
  defined 333
  next big thing 334–336
  overview 336–337
data mining techniques 218
data models 15–17
data privacy 279–280
data science model 147–148
data security 121
  application domain
    authentication 272–274
    authorization 274
    depletion of resources 274–275
    establishment of trust 275
  architectural domain 275–276
  challenges 267–268
  common attacks 271–272
  communication channel 276–277
  confidentiality, integrity, and authentication 278–279
  data privacy 279–280
  interface layer 271
  IoT 266–267
  network layer 269–271
  research directions 280
  sensing layer 268–269
datawrapper 340
DDoS attack mitigation
  big data analytics 305–306
  divide and conquer 300
  dynamic resource allocation 302–303
  dynamic resource pricing 301
  intelligent fast‐flux swarm network 301
  proactive approach 298
  push‐back 299
  random flow network modeling 300
  reactive approach 299
  roaming honeypot 302
  router throttling 299
  SDN‐based DDoS defense 303–305
  self‐cleansing intrusion tolerance 300–301
  target defense moving 302
DDoS attacks see distributed denial of service (DDoS) attacks
decisions, IoT 117
decision support systems (DSS) 13
deep belief networks (DBN) 151–153
deep learning (DL) 44, 48
  autoencoders 153–154
  convolutional neural networks
    activation functions 156–157
    applications of 158
    layers of 155–156
  data science model 147–148
  deep belief networks 151–153
  deep neural networks 144–145
  generative adversarial network 158–159
  IoT 148–149
  learning algorithms types 146–147
  long short‐term memory 161–162
  recurrent neural networks 159–161
  restricted Boltzmann machine 149–151
  solve a problem 146–147
deep neural learning 144–145
deep neural networks 144–145
delivering services 139
density‐based spatial clustering of applications with noise (DBSCAN) 104, 133
deployment cloud computing models
  community cloud 121
  hybrid cloud 121
  private cloud 120
  public cloud 120–121
descriptive analytics 118
Device Provisioning System (DPS) 129
device‐to‐device communication 143
diagnostic analytics 118
digital transformation 55, 60
discriminative restricted Boltzmann machine (DRBM) 151
distributed denial of service (DDoS) attacks
  challenges and issues 309–310
  cloud deployment models
    application level attacks 296
    community cloud 292–293
    hybrid cloud 293
    infrastructure level attacks 294–296
    private cloud 292–294
    probable impact 297–298
    public cloud 290, 292–294
    taxonomy 291
  contribution 288–290
  define 285
  future work 312
  generic framework 310–311
  intrusion scenario in 2020 286
  mitigation approaches (see DDoS attack mitigation)
  organization 290
  research on 287–288
  selected approaches 307–308
  statistics on 2020 286
“dollar‐sign‐dangling”
Domain Name Service (DNS) 295
domain‐specific denominator models (DSDM) 16
DoS attack, IoT 271
dynamic resource allocation strategy 302–303
Dynamic State Estimation (DSE) 213–214

e
elliptic curve cryptography (ECC) 273–274
Elliptic Curve Diffie–Hellman (ECDH) 276
Elliptic Curve Digital Signature Algorithm (ECDSA) 276
encoding‐transformation‐decoding network 235
energy efficiency 200
Environmental Impact Assessment (EIA) 197
environmental pollution management 192
environmental sustainability 199–202

f
feed forward neural networks (FFNN) 105–106
finance, big data
  article identification and selection 353–354
  articles published year wise 355–356
  big data analytics 351
  citation analysis 356–357
  content analysis
    financial markets 358–359
    financial services 359–360
    internet finance 359
    other financial issues 360
  journal of publication 356
  methodology 353
  research employed method 354–355
  future research 362
  literature review 361
  systematic literature review 353
financial markets, big data 358–359
financial services, big data 359–360
flexibility of business 139
Forecast Aided State Estimation 213–214

g
Gartner’s Hype Cycle 92–93
generation of data 117
generative adversarial network (GANs) 158–159
global positioning system (GPS) 209
GSM
  ARM and water flow sensor 257
  based water meter 256

h
Hadoop 220–221
HELLO flood attacks, IoT 272
heterogeneous measurement integration 215–216
Hitachi Vantara Pentaho, Lumada 127–128, 131
HTTP flood attacks 296
hybrid cloud 121, 123, 293
hybrid restricted Boltzmann machine (HRBM) 151

i
IBM Watson IoT 126–127
ICMP flood 296
information age 14
information technology (IT) 14–15
Infrastructure as a Service (IaaS) 122–124
intelligent autonomous vehicles 232
intelligent enterprise
  bounding box, license plate 233–234
  data role 236
  invariances 237
  mean intersection over union 240
  model framework 234–236
  number of classes 238
  reducing number of features 237
  segmentation objective 234
  smart city, big data analytics 240–244
  Softmax loss model 239–240
  spatial invariances 234
  synthesizing samples 236–237
intelligent fast‐flux swarm network 301
interface layer, data security 271
internet
  finance, big data 359
  infrastructure 204
  and LLN 270
Internet of Robotic Things (IoRT) 232
Internet of Things (IoT)
  aspect of cybernetics
  defined
  goal of
  principles of 8–11
  trillion market
interoperability, IOT
  artificial intelligence 17
  big data
  concept of 8–12
  conundrums 29–38
  data fusion 22–26
  data‐informed decision support 22
  data models 15–18
  economy 19–21
  information technology 14–15
  models 12–14
  partition vs astigmatism of vision 38–40
  PEAS 52–60
  security mandates 17, 19
  service of society 50–51
  small data 28–29
intrusion responsive autonomic system (IRAS) 305
IoT see Internet of Things (IoT)
  big data analytics 323
  data characteristics 92
  data security 266–267
  security classification 273
  security layered architecture 268

j
journalism see data journalism

k
key performance indicators (KPI) 10, 37
k‐means clustering technique 103–104
k nearest neighbors (KNN) 96, 133–134
knowledge graph networks (KGN) 56
knowledge graphs (KG) 56–59
knowledge‐informed decision support (KIDS) 55

l
labeled property graph (LPG) 25
linear regression 98–99
long short‐term memory (LSTM) 161–162
loss of main (LOM) detection 214
Lumada IoT 127–128, 131

m
machine‐generated data 231
machine learning (ML) 17
  cloud computing
    anomaly detection 135
    binary prediction 136–137
    category prediction 137
    classification 132–133
    clustering 133
    feature extraction 133, 135
    regression 133
    value prediction 137
  Gartner’s Hype Cycle 92–93
  knowledge hierarchy/pyramid 90–91
  methods 233
  supervised algorithm
    bootstrap aggregating 102–103
    classification and regression trees 101–102
    k nearest neighbors 96
    linear regression 98–99
    Naïve Bayes classifier 96–98
    random forest 102
    support vector machine 99–101
  taxonomy of 94–95
  unsupervised algorithm
    canonical correlation analysis 107
    DBSCAN 104
    k‐means clustering 103–104
    multilayer perceptrons 105–106
    neural networks 104–105
    principal component analysis 106–107
Malware as a Service 303
masked authenticated messaging (MAM) 22
massive analytics 323
mean intersection over union (M‐IoU) 240
memory‐level analytics systems 322
message passing neural network (MPNN) 48–49
metropolis 231
micro‐electro‐mechanical systems (MEMS) 89
Microsoft Azure IoT 128–129, 134
mode of disturbance (MOD) 212
multilayer perceptrons (MLP) 105–106

n
Naïve Bayes 96–98, 133–134
network layer, data security 269–271
Network Time Protocol (NTP) 296
neural networks 43, 104–105

o
off‐line analytics systems 322
Oracle IoT cloud services 129, 132, 135
oscillation detection 215

p
pandas 340
pay‐a‐penny‐per‐use (PAPPU) 20
pay‐a‐price‐per‐unit (PAPPU) 20
PEAS 24, 52–60
“penny” 20–21
personalization
  for automated recommendations 174
  BDA 169–171
  for business 172
  concerns and future directions 183–184
  defined 168
  disadvantage of 184
  IoT systems 171
  layer
    characterization engine 180–181
    sentiment analyzer 182–183
  for marketing 172
  product improvement and service optimization 173–174
  related work 175–176
  tweaked data layer 178–179
  user‐centric IoT architecture 176–178
  user experience 174–175
personalize 174
phasor data concentrator (PDC) 209
phasor measurement unit (PMU)
  data processing
    Apache HBase 222
    Apache Spark 221–222
    Apache Storm 222
    cloud‐based platforms 223–224
    Hadoop 220–221
  driven applications
    data quality and security 216–217
    heterogeneous measurement integration 215–216
    utilization and analytics 217–218
    variety and interoperability 216
    visualization of data 218–219
    volume and velocity 216
  thirty measurements 210
Platform as a Service (PaaS) 122, 124
PMU see phasor measurement unit (PMU)
power generation system 209
prediction, IoT 117
predictive analytics 118, 324
prescriptive analytics 118
principal component analysis (PCA) 106–107
privacy, IoT 324–325
private cloud
  computing 120
  DDoS attacks 292–293
  vs public cloud 293–294
processing of data 138
programming languages 340
public cloud
  computing 120–121
  DDoS attacks 290, 292
  vs private cloud 293–294
Python 340

q
quality of data 138

r
radio frequency identification (RFID)
random flow network modeling 300
random forest 102
real‐time analytical systems 322
recurrent neural networks (RNN) 159–161
regiopolis 231
reinforcement learning (RL) 49, 146
Relational DataBase Management System (RDBMS) 215–216
ReLU functions 156, 157
resource description framework (RDF) 24–25
restricted Boltzmann machine (RBM)
  applications of 151
  defined 149
  training of 150–151
return on investment (ROI)
roadmap
  background 197–198
  big data in sustainability 198–199
  environmental sustainability 199–202
  high hardware and software cost 204
  internet infrastructure 204
  less qualified workforce 204–205
  proposed 202–204
roaming honeypot 302
robust Boltzmann machine (RoBM) 151
router throttling model 299–300

s
SARA see Sense, Analyze, Respond, Actuate (SARA)
SDN‐based DDoS defense 303–305
security of data 138–139
selective forwarding attack, IoT 272
self‐cleansing intrusion tolerance (SCIT) 300–301
semi‐supervised learning algorithms 146
Sense, Analyze, Respond, Actuate (SARA) 8, 10
sensing layer
  data security 268–269
  vulnerabilities 269
sensor‐based economy 22
sentiment analyzer, personalization 182–183
service cloud computing models 122–124
service level agreement (SLA) 120
sigmoid function 156, 157
Simple Network Management Protocol (SNMP) 296
Simple Object Access Protocol (SOAP) 295
Simple Service Discovery Protocol (SSDP) 295–296
Sinkhole attack, IoT 272
six‐tier data, water management system
  contact unit (FC‐37) 253
  GSM‐based ARM and control system 253
  internet of things communicator (ESP8266) 253
  methodology 253–256
  primary components 251
  proposed algorithm 256–257
smart cities
  air pollution hotspots in Dublin 243
  city architecture to
    city assets and human resources 192
    environmental pollution management 192
    power distribution and management 191
    revenue collection and administration 191–192
    security of 190
    unification of the IoT 190
    water and related amenities 190–191
  clustering results, traffic in Dublin 243
  development of 188–189
  forwarder and Indexer architecture 241
  Splunk front end and back end 242
  wearable tech 193–194
Smart Water Management (SWM) 249
smart water meter 252
SNAPS 23, 55, 61
Softmax loss model 239–240
Software as a Service (SaaS) 122, 124
software‐defined IoT model (SDIoT) 276
software‐defined networking (SDN) 232
Spatial MapReduce 242
storage of data 137–138
supervised learning algorithms 146
supervised machine learning algorithm
  bootstrap aggregating 102–103
  classification and regression trees 101–102
  k nearest neighbors 96
  linear regression 98–99
  Naïve Bayes classifier 96–98
  random forest 102
  support vector machine 99–101
  support vector regression 101
supervisory control and data acquisition (SCADA) systems 209
support vector machine (SVMs) 99–101
support vector regression (SVR) 101, 133
sustainability, big data 198–199
synchrophasor data management
  application types
    fault detection 214
    loss of main detection 214
    multiple event detection 213
    oscillation detection 215
    out of step splitting protection 213
    state estimation 213–214
    topology update detection 214
    transient stability 212–213
    voltage stability analysis 211–212
  PMU‐data processing
    Apache HBase 222
    Apache Spark 221–222
    Apache Storm 222
    cloud‐based platforms 223–224
    Hadoop 220–221
  PMU‐driven applications
    data quality and security 216–217
    heterogeneous measurement integration 215–216
    utilization and analytics 217–218
    variety and interoperability 216
    visualization of data 218–219
    volume and velocity 216

t
Tanh() functions 156, 157
target defense moving 302
TCP SYN flood 296
Temporal MapReduce 242
tensor processing unit (TPU) 44
topology update detection 214
transient stability 212–213
tweaked data layer 178–179

u
UDP flood 296
unsupervised learning algorithms 146
  canonical correlation analysis 107
  DBSCAN 104
  k‐means clustering 103–104
  multilayer perceptrons 105–106
  neural networks 104–105
  principal component analysis 106–107

v
value prediction technique 137
variational auto‐encoder (VAE) 234
visualization of data 218–219
voltage stability analysis 211–212

w
water management system
  data distribution, Wi‐Fi IOT communicator app 261
  experimental setup module 259–260
  flow rate vs bill generated 258
  literature survey 248–250
  six‐tier data framework
    contact unit (FC‐37) 253
    GSM‐based ARM and control system 253
    internet of things communicator (ESP8266) 253
    methodology 253–256
    primary components 251
    proposed algorithm 256–257
  water report of both house 262
wide area monitoring system (WAMS) 209
Wikileaks, big data 344
witch attack, IoT 272
World Economic Forum (WEF) 344–345

WILEY END USER LICENSE AGREEMENT
Go to www.wiley.com/go/eula to access Wiley’s ebook EULA.

... directions of work in Big Data Analytics of IoT systems. The seventh chapter entitled “Smart Cities and the Internet of Things” investigates the development of smart cities from a perspective of the IoT ...

... Better Decisions using Data Fusion” gives an idea of the problems that arise in the defense related IoT-big data analytics with special attention to