
Big Data and Computational Intelligence in Networking

DOCUMENT INFORMATION

Basic information

Format
Pages: 548
Size: 15.84 MB

Content

Big Data and Computational Intelligence in Networking

Edited by
Yulei Wu, Fei Hu, Geyong Min, and Albert Y. Zomaya

T&F Cat #K30172 — K30172 C000 — page i — 10/31/2017 — 6:59

CRC Press
Taylor & Francis Group
6000 Broken Sound Parkway NW, Suite 300
Boca Raton, FL 33487-2742

© 2018 by Taylor & Francis Group, LLC
CRC Press is an imprint of Taylor & Francis Group, an Informa business

No claim to original U.S. Government works

Printed on acid-free paper

International Standard Book Number-13: 978-1-4987-8486-3 (Hardback)

This book contains information obtained from authentic and highly regarded sources. Reasonable efforts have been made to publish reliable data and information, but the author and publisher cannot assume responsibility for the validity of all materials or the consequences of their use. The authors and publishers have attempted to trace the copyright holders of all material reproduced in this publication and apologize to copyright holders if permission to publish in this form has not been obtained. If any copyright material has not been acknowledged please write and let us know so we may rectify in any future reprint.

Except as permitted under U.S. Copyright Law, no part of this book may be reprinted, reproduced, transmitted, or utilized in any form by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying, microfilming, and recording, or in any information storage or retrieval system, without written permission from the publishers.

For permission to photocopy or use material electronically from this work, please access www.copyright.com (http://www.copyright.com/) or contact the Copyright Clearance Center, Inc. (CCC), 222 Rosewood Drive, Danvers, MA 01923, 978-750-8400. CCC is a not-for-profit organization that provides licenses and registration for a variety of users. For organizations that have been granted a photocopy license by the CCC, a separate system of payment has been arranged.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging-in-Publication Data

Names: Wu, Yulei, editor.
Title: Big data and computational intelligence in networking / Yulei Wu, Fei Hu, Geyong Min, Albert Y. Zomaya.
Description: Boca Raton, FL : CRC Press, [2018] | Includes bibliographical references and index.
Identifiers: LCCN 2017028407 | ISBN 9781498784863 (hardback : acid-free paper) | ISBN 9781315155678 (e-book) | ISBN 9781498784870 (e-book) | ISBN 9781351651721 (e-book) | ISBN 9781351642200 (e-book)
Subjects: LCSH: Big data. | Cloud computing. | Computer networks--Management. | Computational intelligence.
Classification: LCC QA76.9.B45 W824 2018 | DDC 005.7--dc23
LC record available at https://lccn.loc.gov/2017028407

Visit the Taylor & Francis Web site at http://www.taylorandfrancis.com
and the CRC Press Web site at http://www.crcpress.com

Contents

Preface  ix
Contributors  xiii

PART I: BASICS OF NETWORKED BIG DATA

1 A Survey of Big Data and Computational Intelligence in Networking  3
  Yujia Zhu, Yulei Wu, Geyong Min, Albert Zomaya, and Fei Hu
2 Some Mathematical Properties of Networks for Big Data  27
  Marcello Trovati
3 Big Geospatial Data and the Geospatial Semantic Web: Current State and Future Opportunities  43
  Chuanrong Zhang, Tian Zhao, and Weidong Li
4 Big Data over Wireless Networks (WiBi)  65
  Immanuel Manohar and Fei Hu

PART II: NETWORK ARCHITECTURE FOR BIG DATA TRANSMISSIONS  85

5 Efficient Big Data Transfer Using Bandwidth Reservation Service in High-Performance Networks  87
  Liudong Zuo and Michelle Mengxia Zhu
6 A Dynamic Cloud Computing Architecture for Cloud-Assisted Internet of Things in the Era of Big Data  107
  Mehdi Bahrami and Mukesh Singhal
7 Bicriteria Task Scheduling and Resource Allocation for Streaming Big Data Processing in Geo-Distributed Clouds  125
  Deze Zeng, Chengyu Hu, Guo Ren, and Lin Gu

PART III: ANALYSIS AND PROCESSING OF NETWORKED BIG DATA  149

8 The ADMM and Its Application to Network Big Data  151
  Nan Lin and Liqun Yu
9 Hyperbolic Big Data Analytics for Dynamic Network Management and Optimization  177
  Vasileios Karyotis and Eleni Stai
10 Predictive Analytics for Network Big Data Using Knowledge-Based Reasoning for Smart Retrieval of Data, Information, Knowledge, and Wisdom (DIKW)  209
  Aziyati Yusoff, Norashidah Md Din, Salman Yussof, Assad Abbas, and Samee U. Khan
11 Recommendation Systems  227
  Joonseok Lee
12 Coordinate Gradient Descent Methods  265
  Ion Necoara
13 Data Locality and Dependency for MapReduce  293
  Xiaoqiang Ma, Xiaoyi Fan, and Jiangchuan Liu
14 Distributed Machine Learning for Network Big Data  331
  Seunghak Lee
15 Big Data Security: Toward a Hashed Big Graph  351
  Yu Lu and Fei Hu

PART IV: EMERGING APPLICATIONS OF NETWORKED BIG DATA  371

16 Mobile Augmented Reality to Enable Intelligent Mall Shopping by Network Data  373
  Vincent W. Zheng and Hong Cao
17 Toward Practical Anomaly Detection in Network Big Data  411
  Chengqiang Huang, Yulei Wu, Zuo Yuan, and Geyong Min
18 Emerging Applications of Spatial Network Big Data in Transportation  433
  Reem Y. Ali, Venkata M.V. Gunturi, Zhe Jiang, and Shashi Shekhar
19 On Emerging Use Cases and Techniques in Large Networked Data in Biomedical and Social Media Domain  453
  Vishrawas Gopalakrishnan and Aidong Zhang
20 Big Data Analysis for Smart Manufacturing  499
  Z. Y. Liu and Y. B. Guo

Index  515

Preface

Recent years have witnessed a deluge of network data propelled by the emerging online social media, user-generated video contents, and global-scale communications, bringing people into the era of big data. Such network big data holds much critical and valuable information including customer experiences, user behaviors, service levels, and other contents, which could significantly improve the efficiency, effectiveness, and intelligence on the optimization of the current Internet, facilitate the smart network operation and management, and help service providers and content providers reduce capital expenditure (CapEx) and operational expenditure (OpEx) while maintaining a relatively high-level quality of service (QoS) and quality of experience (QoE). Typical examples of network intelligence received from network big data include rapid QoE impairment detection and mitigation, optimization of network asset utilization, proactive maintenance, rapid outage restoration, and graceful disaster recovery. These aims can be achieved from high-level computational intelligence based on emerging analytical techniques such as big data processing, Web analytics, and network analytics employing software tools from advanced analytics disciplines such as machine learning, data mining, and predictive analytics. The computational intelligence for big data analysis is playing an ever-increasingly important role in supporting the evolution of the current Internet toward the next-generation intelligent Internet.

However, the unstructured, heterogeneous, sheer volume and complex nature of network big data pose great challenges on the computational intelligence of these emerging analytical techniques due to high computational overhead and communication cost, non-real-time response, sparse matrix-vector multiplications, and high convergence time. It is therefore of critical importance to understand network big data and design novel solutions of computational intelligence, scaling up for big data analytics of large-scale networks to automatically discover the hidden and valuable information available for smart network operations, management, and optimization. This has been established as ...
Index

Bigtable, 54
Binary hash tree (BHT), 356
Biomedical and social media domain, emerging use cases and techniques in, 453–497
  background on big data, 454–456
  bioinformatics and biomedical, graph-based models in, 474–489
  collective entity resolution, graph-based approach for, 469–474
  commonality between social and biomedical network analyses, 474–475
  entity resolution, graph-based models in social text mining for, 456–474
  entity resolution, similarity measures for, 477
  example applications of graph-based analyses, 455–456
  graph as representation schema for big data, 454–455
  hypotheses generation in biomedical texts, 475–489
  information retrieval, 475
  online pair-wise entity resolution, graph-based approach for, 459–469
Birthday paradox, 358
Block-greedy coordinate descent (BCD), 344
BNs, see Bayesian networks
Bounded delay condition, 164
BP neural networks, see Back-propagation neural networks
Breadth-first tree (BFT), 360
BTaaS, see Back-end-template-as-a-Service
Bulk synchronous parallel (BSP) model, 339

C
Cassandra, 54
C-CGD, see Cyclic coordinate gradient descent method
CDAM, see Community detection for approximate matching
CDN, see Content distribution network
Central processing unit (CPU), 46
Challenges for networked big data, 6–8
  distributed and decentralized data collection and storage, 6–7
  distributed and parallel data processing,
  heterogeneous data representation,
  more complex and evolving relationships among data, 7–8
Channel impulse response (CIR) environment, 13
Circuit-switched high-speed end-to-end transport architecture (CHEETAH), 89
Clock-bounded asynchronous parallel (CAP), 339
Clock-value-bounded asynchronous parallel (CVAP), 339
Cloud-assisted Internet of Things (IoT), dynamic Cloud computing architecture for, 107–124
  big data, 110
  challenges, 110–111
  Cloud computing paradigm, 108
  convergence of IoT and Cloud, 109
  customization of architecture, 118
  data privacy, 119–120
  data security, 118–119
  DCCSOA, big data processing on, 114–116
  eHealth-Template, 120
  infrastructure-as-a-service, 112
  IoT paradigm, 109
  platform-as-a-service, 112
  related work, 121
  service-oriented architecture, 109
  software-as-a-service, 112
  solution, 111–114
  standardization, 116–120
  value-added services, 112
Cloud Computing Interoperability Forum (CCIF), 121
Cloud reference architecture (CRA), 121
Cloud Security Alliance (CSA), 11
Clustering coefficient (CC), 186, 187
Collaborative filtering (CF), 236–240
Community detection for approximate matching (CDAM), 460
Consensus problem, 162
Content distribution network (CDN), 187
Context-Aware Networks for the Design of Connected Things (CANthings), 16
Coordinate gradient descent methods, 265–291
  computational complexity, 270–271
  coordinate minimization versus coordinate gradient descent, 269–270
  cyclic coordinate gradient descent method, 276–280
  decision variable, 267
  Gauss–Southwell rule, 273
  greedy coordinate gradient descent method, 273–276
  motivation, 269–271
  previous work and extensions, 286–288
  problem formulation, 271–273
  randomized coordinate gradient descent method (R-CGD), 280–286
CPU, see Central processing unit
Critical-timepoint ALSP solver (CTAS), 443
Cross-title enrichment (CTE), 460, 465
CSA, see Cloud Security Alliance
Cyclic coordinate gradient descent method (C-CGD), 276–280

D
DAG, see Directed acyclic graph
DALM, see Dependency-aware locality for MapReduce
Data, information, knowledge, and wisdom (DIKW), smart retrieval of, 209–226
  challenges in big data network, 211–212
  correlation and regression analysis, 214–215
  decision tree, 222
  description of DIKW, 211
  DIKW hierarchy in network big data, 211–214
  geographical information system, 216
  Hadoop MapReduce, 216
  hypothesis testing for big data, 214
  keyhole markup language, 216
  knowledge-based reasoning of data prediction in network big data, 217–221
  network big data framework and architecture, 212–214
  occupational analysis, 215
  OWL design for online prediction, 217–219
  OWL performance for DIKW and beyond, 219–221
  predictive analytics, 216–217
  resource description framework, 214
  semantic network and ontology of big data, 217
  smart retrieval prediction engine, 222
  statistical inferences and analytics in network big data, 214–217
  uniform resource identifiers, 214
  web ontology language, 211
DCCSOA, see Dynamic Cloud computing architecture
DCG, see Discounted cumulative gain
Decision tree, 222
Deep neural networks (DNNs),
Deltacloud, 121
Dependency-aware locality for MapReduce (DALM), 294, 296–315
  design and optimization, 299–300
  Hadoop distributed file system, 296
  hot spots, 296
  implementation and deployment issues, 305–308
  minimizing cross-server traffic (problem formulation), 300–304
  performance evaluation, 308–314
  rack-PM-VM, 307
  sequential minimal optimization, 303
Depth-first tree (DFT), 360
DIKW, see Data, information, knowledge, and wisdom (DIKW), smart retrieval of
Directed acyclic graph (DAG), 17, 132, 353
Discounted cumulative gain (DCG), 258, 465
Distributed machine learning, 331–349
  block-greedy coordinate descent, 334
  consistency models of parameter servers, 341–342
  coordinate descent algorithm, 336–338
  data-model-parallel optimization, 345–346
  data-parallel optimization, 342–343
  latent Dirichlet allocation, 332
  machine learning models for network inference, 332–334
  model-parallel optimization, 343–345
  parameter servers, 339–341
  proximal gradient descent algorithm, 335–336
  regularized regression, 334–338
  semidefinite programming, 333
DNNs, see Deep neural networks
Dynamic Cloud computing architecture (DCCSOA), 111, 114–116; see also Cloud-assisted Internet of Things (IoT), dynamic Cloud computing architecture for
Dynamic Network System (DYNES), 89
Dynamic resource allocation via generalized multi-protocol label switching optical networks (DRAGON), 89
Dynamic template service layer (DTSL), 113, 118
Dynamic tensor analysis (DTA), 11

E
Earliest data transfer completion time (ECT), 89
Earth Science Data and Information System (ESDIS), 44
Edge-order number (EON), 361
eHealth-Template, 120
Electronic control units (ECUs), 446
Entity resolution (ER), 456
Extended ADMM, 166

F
Facebook, 235
False Alarm Rate (FAR), 18
False positive rate (FPR), 17
Fast nonparametric matrix factorization, 246
First in, first out (FIFO) strategy, 91
Foursquare Venue, 46
Freebase, 397
Front-end-template-as-a-Service (FTaaS), 113

G
Game theory, 13
GAS (gather, apply, and scatter) model, 354
Gauss–Southwell rule, 273
GBA, see Guilt by association
G-CGD, see Greedy coordinate gradient descent method
Generalized multi-protocol label switching (GMPLS) optical networks, 89
General regression neural networks (GRNN), 19
Geographic information system (GIS), 7, 44, 216
Geolocated Twitter “tweet” data sets, 46
GeoSpark, 58
GeoSPARQL protocol, 52
Geospatial semantic web (GSW), big geospatial data and, 43–64
  big geospatial data, 44–47
  challenges and future directions, 49–59
  distributed geospatial computing, 57–59
  GeoSPARQL queries, 52–54
  geospatial indexing, 54–56
  geospatial semantic web, 47–49
  hierarchical traversal algorithms, 56
  KD-tree, 55
  key value store systems, 54
  nested loop, 56
  ontology, 49–52
  partition-based spatial merge-join algorithm, 57
  plane sweep algorithm, 57
  quadtree, 55
  spatial join, 56–57
GIS, see Geographic information system
Global consensus, 162
Global positioning system (GPS), 44
GMPLS optical networks, see Generalized multi-protocol label switching optical networks
Google, 126
Google file system (GFS),
GPS, see Global positioning system
Graphics processing unit (GPU), 115
Graph signal processing, 68
Greedy coordinate gradient descent method (G-CGD), 273–276
Greedy embedding, 185
GRNN, see General regression neural networks
GSW, see Geospatial semantic web, big geospatial data and
Guilt by association (GBA), 474

H
Hadoop distributed file system (HDFS), 7, 296
Hadoop MapReduce, 216
Half-life utility score (HLU), 257
Hashed big graph, see Big graph (hashed)
HBase, 54
HDA, see Hyperbolic data analytics
Health Insurance Portability and Accountability (HIPPA) Act, 120
Hidden Markov model (HMM), 17
Hierarchical traversal algorithms, 56
Higher order singular value decomposition (HOSVD), 82
High-order pairwise (HOP) features, 380–381
High-performance computing (HPC), 20, 115
High-performance networks (HPNs), efficient big data transfer using bandwidth reservation service in, 87–106
  bandwidth reservation algorithm, introduction and illustration of, 98–102
  bandwidth reservation request, 89
  control plane, 89
  first in, first out strategy, 91
  local area network, 89
  mathematical models and bandwidth reservation concepts, 92–97
  qualified reservation, 96
  related work, 90–92
HLU, see Half-life utility score
HMM, see Hidden Markov model
HOP features, see High-order pairwise features
HOSVD, see Higher order singular value decomposition
HPC, see High-performance computing
HPNs, see High-performance networks, efficient big data transfer using bandwidth reservation service in
Hyperbolic data analytics (HDA), 177–207
  average path length, 187
  big data analytics as network analysis, 180–186
  complex network metrics, 186–189
  computational intelligence, 195–203
  content distribution network, 187
  cyber-physical networked systems, evolution of, 195–198
  efficient computation of complex network management metrics, 190–194
  greedy embedding, 185
  hyperbolic traffic load centrality, 190
  HyperMap, 185–186
  landmarks, 185
  network embedding in hyperbolic space, 184–186
  network evolution, network management under, 198–203
  network feature vector, 180, 197
  network management, 186–194
  node closeness centrality, 188
  node degree distribution, 187
  Rigel embedding, 184–185
  traffic load centrality, 188
HyperMap, 185–186
Hypothesis testing, 211, 214

I
IBM Cloud, 111
Incremental high-order singular value decomposition method (IHOSVD), 11
Information retrieval (IR), 475
Infrastructure-as-a-Service (IaaS), 112
Intelligent mall shopping, see Mobile augmented reality, intelligent mall shopping enabled by (IntelligShop)
Internet service providers (ISPs), 127
Internet of Things (IoT), see Cloud-assisted Internet of Things, dynamic Cloud computing architecture for
Internet of Vehicles (IoV), 13

J
Jena, 56

K
KD-tree, 55
Keyhole markup language (KML), 216
Key performance indicators (KPIs), 8, 413
Key value store (KVS) systems, 54
K-nearest neighbors (K-NN) classifier,
  queries, 55

L
Lagrangian Xgraphs, 439
LAN, see Local area network
Large Hadron Collider (LHC), 88
Latent Dirichlet allocation (LDA), 332
Learning using hidden information (LUHI), 425
LinkedIn, 126
Literature-based discovery (LBD), 475
Local area network (LAN), 89
Local low-rank matrix approximation (LLORMA), 248–250
Local sensitive hashing (LSH), 419
Logical volume management (LVM) system, 312
Log-normal shadowing model, 380
Log tensor factorization (LTF),
LUHI, see Learning using hidden information

M
Machine learning, see Distributed machine learning
Machine learning and data mining (MLDM) problems, 353
MAE, see Mean absolute error
Mall shopping (intelligent), see Mobile augmented reality, intelligent mall shopping enabled by (IntelligShop)
MANSD, see Maximal allowable number of successive data dropouts
Manufacturing, see Smart manufacturing
MapReduce, data locality and dependency for, 293–330
  dependency-aware data locality for MapReduce, 294, 296–315
  Hadoop distributed file system, 296
  hot spots, 296
  minimizing cross-server traffic (problem formulation), 300–304
  network interface card, 312
  test-bed experiments, 312
  virtualized Clouds, data locality for MapReduce in, 315–326
  vLocality (architecture design), 320–322
  workflow phases, 294
MAS, see Microsoft Academic Search
Mathematical properties of networks for big data, 27–42
  automatic extraction from text, 35–37
  Bayesian networks, extraction of, 32–41
  boundary map, 30
  homology theory, 28–30
  network theory, 31–32
  parsing tree, 34
  probabilistic information, extraction of, 37–39
  scale-free and small-world networks, 31–32
  text mining, 34–35
  tokenization of sentences, 34
  topological properties of big data, 28–32
Maximal allowable number of successive data dropouts (MANSD), 15
Maximum margin matrix factorization (MMMF), 246
MBR, see Minimum bounding rectangle
MDC, see Mobile data collectors
Mean absolute error (MAE), 255
Merkle hash tree (MHT), 355–356
Microsoft Academic Search (MAS), 397
Microsoft Azure IoT Suite, 109
Microsoft Cloud Azure, 109
Microsoft Virtual Earth, 46
MIMO system, see Multiple-input multiple-output system
Minimum bounding rectangle (MBR), 55
MLDM problems, see Machine learning and data mining problems
MMMF, see Maximum margin matrix factorization
Mobile augmented reality, intelligent mall shopping enabled by (IntelligShop), 373–410
  cold-start challenge, 379–380, 385–386
  cold-start review crawling, learning to query for, 388–394
  device heterogeneity, 379
  HOP feature learning and robustness, 382–385
  HOP features, 380–381
  localization, leverage wireless network for, 379–385
  log-normal shadowing model, 380
  query, 386–387
  real-world test-bed, evaluation of IntelligShop system in, 402–406
  review crawling, leverage the web for, 385–394
  robust feature learning problem, 379
  system architecture, 377
  system evaluation, 394–406
  web-based review crawling, evaluation with, 397–402
  wireless network-based localization, evaluation with, 394–395
Mobile data collectors (MDC), 14
Multiple-input multiple-output (MIMO) system, 12
Multiple objective integer programming (MOIP) problem, 128, 135

N
National Aeronautics and Space Administration (NASA) Open Government Initiative, 44
NDCG, see Normalized discounted cumulative gain
Network architecture for big data transmissions, 13–16
  context-aware networked big data processing, 15–16
  efficient management of networked big data, 14–15
  novel collection protocols, 13–14
Network failure log (NFL), 18
Network feature vector (NFV), 180, 197
Network interface card (NIC), 312
Network lasso, 170
Noncompliant window co-occurrence (NWC) pattern discovery problem, 443
Nonlinear probabilistic matrix factorization (NLPMF), 246
Nonnegative matrix factorization (NMF), 20, 245, 418
Nonparametric principal component analysis (NPCA), 247
Normalized discounted cumulative gain (NDCG), 258
Normalized mean absolute error (NMAE), 255

O
OCCI, see Open cloud computing interface
Occupational analysis (OA), 215
Offline distributed training (ODT), 18
On-demand secure circuits and advance reservation system (OSCARS), 89
Online ADMM (OADM) algorithm, 156
Online parallel prediction (OPP), 18
Open cloud computing interface (OCCI), 121
Open geospatial consortium (OGC), 44
OpenStreetMap, 46
OWL, see Web ontology language

P
PaaS, see Platform-as-a-Service
Parsing tree, 34
Partition-based spatial merge-join algorithm, 57
Personal digital assistants (PDAs), 45
PHM, see Prognostics and health management
Plane sweep algorithm, 57
Platform-as-a-Service (PaaS), 112
Privacy issues (networked big data), 11–12
Probabilistic matrix factorization (PMF), 246
Probabilistic polynomial-time (PPT) adversary, 357
Probabilistic relationship measure, 38
Prognostics and health management (PHM), 508–509
Pseudorandom permutation (PRP), 120

Q
Quadratic programming (QP) optimization problem, 303
Quadtree, 55
Qualified reservation (QR), 96
Quality of Experience (QoE), 18
Quality of Service (QoS), 89, 127

R
Radial basis function (RBF)-based classifier,
Random edge-order number (REON), 364
Randomized coordinate gradient descent method (R-CGD), 280–286
RDBMS, see Relational database management systems
RDD, see Resilient distributed data set
RDF, see Resource description framework
Real-time event analysis and monitoring system (REAMS), 20
Recommendation systems, 227–264
  basic collaborative filtering approaches, 236–240
  Bayesian PMF, 246
  challenges, 234–236
  cold-start problem, 235–236
  collaborative filtering, 232–234
  content-based filtering, 231–232
  evaluation, 253–259
  extensions to memory-based CF, 239
  extreme sparsity, 234–235
  fast nonparametric matrix factorization, 246
  goals, 230–231
  how to recommend for fresh users/items, 250–253
  hybrid approach, 234
  item-based collaborative filtering, 238–239
  large scale, 235
  local low-rank matrix approximation, 248–250
  matrix factorization, 240–247
  maximum margin matrix factorization, 246
  metrics based on ranks, 257–259
  metrics for classification accuracy, 256–257
  metrics for prediction accuracy, 253–256
  nonlinear probabilistic matrix factorization, 246
  nonnegative matrix factorization, 245–246
  nonparametric principal component analysis, 247
  optimizing a utility function, 230–231
  parallelism, local approach with (for scalability), 247–250
  predicting ratings, 231
  probabilistic matrix factorization, 246
  recommending good items, 230
  regularized SVD, 243–245
  support vector machines, 246
  types, 231–234
  user-based collaborative filtering, 236–238
Relational database management systems (RDBMS), 353
Representation and modeling, 9–11
  dynamic representation, 10
  graph representation, 9–10
  tensor, 10–11
Resilient distributed data set (RDD), 58
Resource description framework (RDF), 214
Rigel embedding, 184–185
Robust feature learning problem, 379
Robust principle component analysis (RPCA), 418
Root mean squared error (RMSE), 255
Round trip time (RTT), 75
R-tree, 55

S
SaaS, see Software-as-a-Service
SBDP, see Streaming big data processing, bicriteria task scheduling and resource allocation for (in geo-distributed Clouds)
SDIs, see Spatial data infrastructures
Security issues (networked big data), 11–12; see also Big graph (hashed)
Semidefinite programming (SDP), 333
Sequential minimal optimization (SMO), 303
Service-oriented architecture (SOA), 48, 109
SimHash, 419
Singular value decomposition (SVD), 11, 82, 243
Small-world networks, 31–32
Smart manufacturing, 499–513
  applications of big data in manufacturing, 505–511
  benefits of using big data in manufacturing, 501
  big data characteristics in manufacturing, 500–501
  big data collection in manufacturing, 501–503
  data mining in manufacturing, 504–505
  data types, 501–502
  improving manufacturing processes, 505–507
  prognostics and health management, 508–509
  quality control, 508
  real-time data collection, 502–503
  steel industry, 509–511
  supply chain management, 508
SMO, see Sequential minimal optimization
SNBD, see Transportation, emerging applications of spatial network big data in
SOA, see Service-oriented architecture
Social media domain, see Biomedical and social media domain, emerging use cases and techniques in
Software-as-a-Service (SaaS), 112
Sparse matrix vector multiplication (SpMV), 16
Spatial data infrastructures (SDIs), 47
SpatialSpark, 58
Stale synchronous parallel (SSP) model, 339
Statistical template extraction (STE),
Streaming big data processing (SBDP), bicriteria task scheduling and resource allocation for (in geo-distributed Clouds), 125–147
  algorithm design, 136–141
  background and preliminaries, 128–129
  data and experiment settings, 141
  extended SBDP graph construction, 132–133
  geo-distributed computing environment, 131–132
  multiobjective integer programming problem formulation, 132–135
  NSGA-II, 136–141
  Pareto optimization, 136
  related work, 129–131
  results and discussion, 141–142
  SBDP graph, 132
  system model, 131–132
  Union population, 137–138
  VM placement constraints, 133–135
Supply chain management data science, 508
Support vector data description (SVDD), 416, 420–425
  feature space, 422
  with heterogeneous network dataset, 425
  with high-volume network dataset, 423
  learning using hidden information, 425
  with streaming network dataset, 423–425
  support vectors, 422
Support vector machines (SVM), 246
Survey of big data and computational intelligence in networking, 3–26
  analysis and processing of networked big data, 16–20
  back-propagation neural networks,
  challenges for networked big data, 6–8
  channel impulse response environment, 13
  comprehensive understanding of networked big data, 5–13
  context-aware networked big data processing, 15–16
  distributed data mining with networked big data, 17–18
  distributed machine learning algorithms, 16–17
  dynamic representation, 10
  efficient management of networked big data, 14–15
  game theory, 13
  graph representation, 9–10
  high-performance analytics platform, 20
  Internet of Vehicles, 13
  K-nearest neighbors classifier,
  multiple-input multiple-output system, 12
  network architecture for big data transmissions, 13–16
  nonnegative matrix factorization, 20
  novel collection protocols, 13–14
  offline distributed training, 18
  online parallel prediction, 18
  online prediction with networked big data, 18–19
  radial basis function-based classifier,
  representation and modeling, 9–11
  requirement engineering for computational intelligence, 8–9
  security and privacy issues, 11–12
  sparse matrix vector multiplication on distributed architectures, 19–20
  tensor, 10–11
  unsupervised learning, 17
  variety,
  velocity, 5–6
  volume,
  wireless big data, 12–13
SVD, see Singular value decomposition
SVDD, see Support vector data description
SVM, see Support vector machines

T
Text mining (TM), 34–35
Think like a vertex (TLAV) frameworks, 353
Time-aggregated graph (TAG), 438
TLC, see Traffic load centrality
Tokenization of sentences, 34
Topological data analysis (TDA), 28
Topological properties of big data, 28–32
  boundary map, 30
  homology theory, 28–30
  network theory, 31–32
  scale-free and small-world networks, 31–32
TPR, see True positive rate
Traffic load centrality (TLC), 188
Transportation, emerging applications of spatial network big data (SNBD) in, 433–451
  all start-time Lagrangian shortest path problem, 440
  challenges posed by spatial network big data, 436–437
  computational accomplishments, 437–443
  description of spatial network big data, 435–436
  forecast-critical-timepoint, 442
  GPS track data, 435
  Lagrangian Xgraphs, 439
  logical model, 438–440
  noncompliant window co-occurrence pattern discovery problem, 443
  physical model, 440
  research needs, 445–447
  routing queries, 440–443
  spatial network big data engineering, 446–447
  spatial network big data science, 446
  spatial network query processing, 445–446
  TD priority queue, 442
  TD road maps, 436
  time-aggregated graph, 438
  trends, 443–445
True positive rate (TPR), 17
Twitter, 126

U
UltraScience Net (USN), 89
Unified modeling language (UML), 50
Uniform resource identifiers (URI), 214
United States Geological Survey (USGS), 50
Unsupervised learning, 17
User controlled lightpaths (UCLP), 89
User topic participant (UTP) model, 19

V
Value-added services, 112
Value-bounded asynchronous parallel (VAP), 339
Vector processing units (VPUs), 58
VGI, see Volunteered geographic information
Virtualized Clouds, data locality for MapReduce in, 315–326
  DataNode placement, 317–318
  guest OS, 317
  host OS, 317
  performance evaluation, 322–325
  vLocality (architecture design), 320–322
  when data locality meets virtualization, 316–320
Virtual machines (VMs), 127, 305
Virtual private network (VPN), 118
Volunteered geographic information (VGI), 44
VPN, see Virtual private network
VPUs, see Vector processing units

W
Web 2.0, 45
Web ontology language (OWL), 50, 211
Weka, 17
Wikimapia, 46
Wireless big data, 12–13
Wireless sensor networks (WSNs), 12
Wireless systems for big data (WiBi), 65–84
  challenges, 73–74
  channel conditions, 76–77
  distributed processing of data, 78
  distributed source coding, 78–79
  distributed tensor decomposition, 79–83
  graph-based representation, 68
  intelligence gathering, 73
  Internet of Things and massive sensing, 73
  land surveying, 72
  pipelining parallelism and concurrency, 75–76
  scenarios, 71–73
  solutions, 74–83
  tensor-based representation, 68–70
WSNs, see Wireless sensor networks

X
Xen Hypervisor, 312

Y
Yelp, 315, 378

Z
Zero touch provisioning, operations, and management (ZTPOM), 20

Date posted: 02/03/2019, 11:36
