Big data analytics methods and applications


Editors: Saumyadipta Pyne · B.L.S. Prakasa Rao · S.B. Rao

Saumyadipta Pyne, Indian Institute of Public Health, Hyderabad, India
S.B. Rao, CRRao AIMSCS, University of Hyderabad Campus, Hyderabad, India
B.L.S. Prakasa Rao, CRRao AIMSCS, University of Hyderabad Campus, Hyderabad, India

ISBN 978-81-322-3626-9
ISBN 978-81-322-3628-3 (eBook)
DOI 10.1007/978-81-322-3628-3
Library of Congress Control Number: 2016946007
© Springer India 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper.

This Springer imprint is published by Springer Nature. The registered company is Springer (India) Pvt. Ltd. The registered company address is: 7th Floor, Vijaya Building, 17 Barakhamba Road, New Delhi 110 001, India.

Foreword

Big data is transforming the traditional ways of
handling data to make sense of the world from which it is collected. Statisticians, for instance, are used to developing methods for analysis of data collected for a specific purpose in a planned way. Sample surveys and design of experiments are typical examples. Big data, in contrast, refers to massive amounts of very high-dimensional and even unstructured data which are continuously produced and stored at a much lower cost than they used to be. High dimensionality combined with large sample size creates unprecedented issues such as heavy computational cost and algorithmic instability.

The massive samples in big data are typically aggregated from multiple sources at different time points using different technologies. This can create issues of heterogeneity, experimental variations, and statistical biases, and would therefore require researchers and practitioners to develop more adaptive and robust procedures.

Toward this, I am extremely happy to see in this title not just a compilation of chapters written by international experts who work in diverse disciplines involving Big Data, but also a rare combination, within a single volume, of cutting-edge work in methodology, applications, architectures, benchmarks, and data standards. I am certain that the title, edited by three distinguished experts in their fields, will inform and engage the mind of the reader while exploring an exciting new territory in science and technology.

Calyampudi Radhakrishna Rao
C.R. Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad, India

Preface

The emergence of the field of Big Data Analytics has prompted practitioners and leaders in academia, industry, and governments across the world to address and decide on different issues in an increasingly data-driven manner. Yet Big Data can often be too complex to be handled by traditional analytical frameworks. The varied collection of themes covered in this title introduces the reader to the richness of the
emerging field of Big Data Analytics in terms of both technical methods and useful applications.

The idea of this title originated when we were organizing the “Statistics 2013, International Conference on Socio-Economic Challenges and Sustainable Solutions (STAT2013)” at the C.R. Rao Advanced Institute of Mathematics, Statistics and Computer Science (AIMSCS) in Hyderabad to mark the “International Year of Statistics” in December 2013. As the convener, Prof. Saumyadipta Pyne organized a special session dedicated to lectures by several international experts working on large data problems, which ended with a panel discussion on the research challenges and directions in this area. Statisticians, computer scientists, and data analysts from academia, industry and government administration participated in a lively exchange.

Following the success of that event, we felt the need to bring together a collection of chapters written by Big Data experts in the form of a title that can combine new algorithmic methods, Big Data benchmarks, and various relevant applications from this rapidly emerging area of interdisciplinary scientific pursuit. The present title combines some of the key technical aspects with case studies and domain applications, which makes the materials more accessible to the readers. In fact, when Prof. Pyne taught his materials in a Master’s course on “Big and High-dimensional Data Analytics” at the University of Hyderabad in 2013 and 2014, they were well received.

We thank all the authors of the chapters for their valuable contributions to this title. We also sincerely thank all the reviewers for their valuable time and detailed comments, and Prof. C.R. Rao for writing the foreword to the title.

Hyderabad, India, June 2016
Saumyadipta Pyne
B.L.S. Prakasa Rao
S.B. Rao

Contents

  • Big Data Analytics: Views from Statistical and Computational Perspectives (Saumyadipta Pyne, B.L.S. Prakasa Rao and S.B. Rao)

  • Massive Data Analysis: Tasks, Tools, Applications,
    and Challenges (Murali K. Pusala, Mohsen Amini Salehi, Jayasimha R. Katukuri, Ying Xie and Vijay Raghavan)

  • Statistical Challenges with Big Data in Management Science (Arnab Laha)

  • Application of Mixture Models to Large Datasets (Sharon X. Lee, Geoffrey McLachlan and Saumyadipta Pyne)

  • An Efficient Partition-Repetition Approach in Clustering of Big Data (Bikram Karmakar and Indranil Mukhopadhayay)

  • Online Graph Partitioning with an Affine Message Combining Cost Function (Xiang Chen and Jun Huan)

  • Big Data Analytics Platforms for Real-Time Applications in IoT (Yogesh Simmhan and Srinath Perera)

  • Complex Event Processing in Big Data Systems (Dinkar Sitaram and K.V. Subramaniam)

  • Unwanted Traffic Identification in Large-Scale University Networks: A Case Study (Chittaranjan Hota, Pratik Narang and Jagan Mohan Reddy)

  • Application-Level Benchmarking of Big Data Systems (Chaitanya Baru and Tilmann Rabl)

  • Managing Large-Scale Standardized Electronic Health Records (Shivani Batra and Shelly Sachdeva)

  • Microbiome Data Mining for Microbial Interactions and Relationships (Xingpeng Jiang and Xiaohua Hu)

  • A Nonlinear Technique for Analysis of Big Data in Neuroscience (Koel Das and Zoran Nenadic)

  • Big Data and Cancer Research (Binay Panda)

About the Editors

Saumyadipta Pyne is Professor at the Public Health Foundation of India, at the Indian Institute of Public Health, Hyderabad, India. Formerly, he was P.C. Mahalanobis Chair Professor and head of Bioinformatics at the C.R. Rao Advanced Institute of Mathematics, Statistics and Computer Science. He is also Ramalingaswami Fellow of the Department of Biotechnology, Government of India, and the founder chairman of the Computer Society of India’s Special Interest Group on Big Data Analytics. Professor Pyne has promoted research and training in Big Data Analytics globally, including as the workshop co-chair of IEEE Big Data in 2014 and 2015, held in the U.S.A. His research interests include Big Data problems in life
sciences and health informatics, computational statistics and high-dimensional data modeling.

B.L.S. Prakasa Rao is the Ramanujan Chair Professor at the C.R. Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad, India. Formerly, he was director at the Indian Statistical Institute, Kolkata, and the Homi Bhabha Chair Professor at the University of Hyderabad. He is a Bhatnagar awardee from the Government of India, fellow of all the three science academies in India, fellow of the Institute of Mathematical Statistics, U.S.A., and a recipient of the national award in statistics in memory of P.V. Sukhatme from the Government of India. He has also received the Outstanding Alumni award from Michigan State University. With over 240 papers published in several national and international journals of repute, Prof. Prakasa Rao is the author or editor of 13 books, and a member of the editorial boards of several national and international journals. He was, most recently, the editor-in-chief of the journals Sankhya A and Sankhya B. His research interests include asymptotic theory of statistical inference, limit theorems in probability theory and inference for stochastic processes.

S.B. Rao was formerly director of the Indian Statistical Institute, Kolkata, and director of the C.R. Rao Advanced Institute of Mathematics, Statistics and Computer Science, Hyderabad. His research interests include theory and algorithms in graph theory, networks and discrete mathematics with applications in social, ...

Traditionally, most of the available data is structured and stored in conventional databases and data warehouses to support all kinds of data analytics. With Big data, data is no longer necessarily...

Big data analytic projects. Section discusses open challenges in Big data analytics. Finally, we summarize and conclude the main contributions of the chapter in Sect.

Big Data Analytics

Big data analytics...
process of exploring Big data, to extract hidden and valuable information and patterns [48]. Big data analytics helps organizations in more informed decision-making. Big data analytics applications can

Date posted: 04/03/2019, 11:48

Table of contents

  • Foreword

  • Preface

  • Contents

  • About the Editors

  • Big Data Analytics: Views from Statistical and Computational Perspectives

    • 1 Some Unique Characteristics of Big Data

    • 2 Computational versus Statistical Complexity

    • 3 Techniques to Cope with Big Data

    • 4 Conclusion

    • References

  • Massive Data Analysis: Tasks, Tools, Applications, and Challenges

    • 1 Introduction

      • 1.1 Motivation

      • 1.2 Big Data Overview

      • 1.3 Big Data Adoption

      • 1.4 The Chapter Structure

    • 2 Big Data Analytics

      • 2.1 Descriptive Analytics

      • 2.2 Predictive Analytics

      • 2.3 Prescriptive Analytics

    • 3 Big Data Analytics Platforms

      • 3.1 MapReduce

      • 3.2 Apache Hadoop

      • 3.3 Spark

      • 3.4 High Performance Computing Cluster
