Implementing Splunk 7: Effective operational intelligence to transform machine-generated data into valuable business insight, 3rd edition

DOCUMENT INFORMATION

Basic information

Format
Pages	701
File size	23.77 MB

Content

Implementing Splunk
Third Edition

Effective operational intelligence to transform machine-generated data into valuable business insight

James D Miller

BIRMINGHAM - MUMBAI

Copyright © 2018 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

Commissioning Editor: Sunith Shetty
Acquisition Editor: Tushar Gupta
Content Development Editor: Mayur Pawanikar
Technical Editor: Prasad Ramesh
Copy Editor: Vikrant Phadke
Project Coordinator: Nidhi Joshi
Proofreader: Safis Editing
Indexer: Mariammal Chettiyar
Graphics: Tania Dutta
Production Coordinator: Nilesh Mohite

First published: January 2013
Second edition: July 2015
Third edition: March 2018

Production reference: 1280318

Published by Packt Publishing Ltd
Livery Place
35 Livery Street
Birmingham B3 2PB, UK

ISBN 978-1-78883-628-9

www.packtpub.com

mapt.io

Mapt is an online digital library that gives you full access to over 5,000 books and videos, as well as industry leading tools to help you plan your personal development and advance your career. For more information, please visit our website.

Why subscribe?
Spend less time learning and more time coding with practical eBooks and Videos from over 4,000 industry professionals
Improve your learning with Skill Plans built especially for you
Get a free eBook or video every month
Mapt is fully searchable
Copy and paste, print, and bookmark content

PacktPub.com

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and, as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at service@packtpub.com for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters, and receive exclusive discounts and offers on Packt books and eBooks.

Contributors

About the author

James D Miller is an IBM-certified expert, creative innovator, director, senior project leader, and application/system architect with 35+ years of extensive application, system design, and development experience. He has introduced customers to new and sometimes disruptive technologies and platforms, integrating with IBM Watson Analytics, Cognos BI, TM1, web architecture design, systems analysis, GUI design and testing, database modeling and systems analysis. He has done design and development of OLAP, client/server, web, and mainframe applications.

I would like to thank Nanette, Shelby and Paige, who continually amaze me with their support and love.

Using Splunk

As we mentioned earlier in this chapter, Splunk proposes a three-tier architecture for implementing machine learning solutions, defined as:

Tier 1: Core platform searching features
Tier 2: Packaged solutions and apps offered on Splunkbase
Tier 3: Using the Splunk Machine Learning Toolkit

Since we are focusing on the Splunk Machine Learning Toolkit in this chapter, we can skip going over the first two tiers and jump right into tier 3, using the toolkit.

Launching the toolkit

The first step is to launch the
toolkit. Once you're logged in to Splunk, you can locate the Splunk Machine Learning Toolkit app on the left of the main page. Once you click on the Splunk Machine Learning Toolkit app, the Showcase page is displayed. For this use case example, we are interested in time series forecasts, so if we scroll down, we will see that section. You can see the main link, Forecast Time Series, as well as the sub-links offered for that type of modeling below it. In our use case, we are interested in forecasting sales by month, so we can click on that link now, which displays the Forecast Time Series page, made up of multiple sections or output panels. The Splunk Machine Learning Toolkit provides a Forecast Monthly Sales showcase using a sample file.

The top section/panel contains two tabs, Create New Forecast and Load Existing Sections. In this output panel, you'll notice the familiar Splunk Enter a search input line (preloaded with the toolkit's supplied sample historic sales file), and under that, there are selectors to set the configuration parameters required to create our time series forecast machine learning model. The configuration parameters are:

Algorithm: Allows you to select the methodology the toolkit will use to build the model. The options listed are based on the time series type we selected (forecast monthly sales).
Field to forecast: This is where you select the field that you want your model to predict (forecast) based on the data source. In the sample file, there are only two fields: Month (time) and Sales.
Method: This is where you indicate the method that the algorithm will implement. Based on the selected Method, additional selections will be displayed and offered.

Caution: Trend forecasting is scientific, but it can also be ambiguous. The further into the future a forecast is applied, the more uncertain the results can be. Unexpected events can occur, and they can disrupt any pattern or trend. Additionally, the more complicated the
pattern seems to be, the more uncertain the trend forecast usually will be.

You can configure your machine learning forecast model with the aforementioned selection parameters. The Splunk Machine Learning Toolkit makes it quite easy to change the method for the model to use, re-forecast, view, and evaluate the results. Note that if you are not familiar with the various algorithm methods, it is worth spending some time searching online to learn about each; another approach is experimentation, which is always recommended as part of any model configuration evaluation. In a time series forecast model, reasonable advice is to select the LLP5 method, as it combines both the LLP and LLT methods to create predictions (historical sales data over time will most likely include seasonal and trend patterns).

Other configurable forecast inputs include the number of future periods to forecast and a confidence interval.

Note: Confidence intervals are a very common way of presenting forecast certainty. These intervals are expected to cover the likely outcomes some percentage of the time, such as 67%, 90%, or 95%. Again, reasonable advice is to set this value to 95%.

Using the sample data supplied with the showcase, we can modify (or enter) the search as the first step in configuring the forecast model, and the data will be previewed. For example, if we are not happy with the timeframe interval, we can select a relative time search of Previous year (although this is perhaps not a realistic time selection). After updating the search, you can click on the button labeled Forecast to have the model re-forecast and then see the visual effect of the change in time interval on the output panel.

The next section or output panel on the page is the Forecast (visualization). In this output panel, the Machine Learning Toolkit has automatically created a visualization of the input data, the generated forecast, and the corresponding confidence
intervals, designed to help you evaluate the quality of the current model results. Helpful tips are available if you hover your mouse over the panel titles.

When you're happy with all of your configuration settings, you can click on the button corresponding to the action you'd like to take. For example, you can view the underlying SPL or set up an alert to trigger when a forecast value falls outside a particular range.

The Splunk Machine Learning Toolkit gives you the opportunity to view and/or edit (and save versions of) the SPL for each of the output panels on the page. For example, you can click on the button labeled Show SPL to access the commands used to generate the visualization.

The final output panels, located across the bottom of the page, show various detailed data points that can be used to score or evaluate the performance of the machine learning predictive model, including:

R2 Statistic: The square of the correlation coefficient between the forecasted and actual values
Root Mean Squared Error (RMSE): The quadratic mean of the prediction errors
Forecast Outliers: The number of values in the test period that fall outside the confidence interval

Validation

Validation of a forecast model involves training the model with a portion of data (referred to as your training data) and then testing the model with a different portion (referred to as your test data). For forecasting tasks, the training data is a prefix of the data and the test data is a suffix of the data that is withheld to compare against the forecasts. Validating a trained model with the test set can be performed in several ways, depending on the type of model. Each assistant provides methods in the Validate section, which is displayed after you train a model.

Deployment

A model is ready to be deployed after you have validated it and are comfortable with its performance. Deployment actions are usually categorized as:

Generate a forecast to use directly or as input to other
analytics applications
Detect outliers and anomalies to help improve the overall process
Trigger an action or an alert when a decision is needed

The Splunk Machine Learning Toolkit makes it easy to deploy and share the generated results through Splunk's inherent ability to create dashboards, alerts, and reports. In addition, once a forecast is created, it's easy to export the generated data.

Saving a report

From the Visualization output panel, you can select the visualization type you'd like to use to show your data. Once you have selected your desired visualization type, you can use the Save As feature to save your visualization as a Splunk Report or Dashboard Panel. You can use and save all of the visualizations you create (referred to as applying a custom visualization to your data) on any Splunk platform instance on which the Splunk Machine Learning Toolkit is installed.

Exporting data

As we mentioned earlier, you can also export the data generated by the machine learning model. To do this, in the visualization output panel:

Click on the button labeled Open in Search
Click on Export
Fill out the Export Results details and then click on Export

The output file is created and ready for use directly and/or in other analytical systems or tools.

Summary

In this chapter, we provided a brief definition of what machine learning is. We reviewed the fundamentals of Splunk's Machine Learning Toolkit and explained how it can be used to create a machine learning model.
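The three scoring values described for the bottom output panels (R2 statistic, RMSE, and forecast outliers) can be illustrated with a small standalone calculation. The following is only a sketch in plain Python that follows the definitions given in the text; it is not the toolkit's own implementation, and the function names and sample numbers are made up for illustration.

```python
import math


def r_squared(actual, forecast):
    """Square of the Pearson correlation between actual and forecasted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_f = sum((f - mean_f) ** 2 for f in forecast)
    return cov ** 2 / (var_a * var_f)


def rmse(actual, forecast):
    """Quadratic mean (root mean square) of the prediction errors."""
    errors = [a - f for a, f in zip(actual, forecast)]
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


def count_outliers(actual, lower, upper):
    """Number of actual values that fall outside their [lower, upper] interval."""
    return sum(1 for a, lo, hi in zip(actual, lower, upper) if a < lo or a > hi)


# Hypothetical test-period values, purely for illustration.
actual = [100.0, 120.0, 90.0, 150.0]
forecast = [110.0, 115.0, 95.0, 140.0]
lower = [95.0, 100.0, 92.0, 120.0]   # assumed lower confidence bounds
upper = [125.0, 130.0, 110.0, 160.0]  # assumed upper confidence bounds

print("R2:", r_squared(actual, forecast))
print("RMSE:", rmse(actual, forecast))
print("Outliers:", count_outliers(actual, lower, upper))
```

A higher R2 and a lower RMSE generally indicate a closer fit, while the outlier count shows how often the actual values escaped the chosen confidence interval.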
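The prefix/suffix split described in the Validation section can be sketched as follows. This is a minimal illustration in plain Python, assuming a time-ordered list of monthly values. The `holdback` parameter name, the sample numbers, and the naive "repeat the last value" forecast are all hypothetical stand-ins, not the toolkit's algorithms.

```python
def split_prefix_suffix(series, holdback):
    """Split a time-ordered series into a training prefix and a test suffix."""
    if not 0 < holdback < len(series):
        raise ValueError("holdback must be between 1 and len(series) - 1")
    return series[:-holdback], series[-holdback:]


def naive_forecast(train, periods):
    """Placeholder model: repeat the last observed training value."""
    return [train[-1]] * periods


# Hypothetical monthly sales, oldest first.
monthly_sales = [200, 210, 250, 240, 260, 300, 310, 330]

train, test = split_prefix_suffix(monthly_sales, holdback=3)
predicted = naive_forecast(train, len(test))

# Compare the withheld suffix against the forecasts.
errors = [actual - pred for actual, pred in zip(test, predicted)]

print("train:", train)
print("test:", test)
print("errors:", errors)
```

Because the test values are withheld from training, the errors give an honest picture of how the model behaves on data it has not seen.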

Date posted: 02/03/2019, 10:45