RAPIDS Pitch Deck: 6 Questions Facing Every AI Enterprise

DOCUMENT INFORMATION

Content


RAPIDS Pitch Deck

6 QUESTIONS FACING EVERY AI ENTERPRISE
Top Challenges for AI, Big Data, and Enterprise Transformation
• DATA DELUGE: Is your data doubling each year?
• PROLONGED TRAINING TIME: Is ML training prohibitively long, delaying time-to-predictions?
• COMPLEX WORKLOADS: Are Spark workloads creating relentless infrastructure sprawl?
• DELAYED INTELLIGENCE: Are you an intelligent enterprise needing real-time predictive analytics?
• TEDIOUS DATA PREP: Do you have oceans of data that take lifetimes to wrangle?
• SHRINKING BUDGET: Is your CAPEX budget shrinking amidst escalating infrastructure demand?

MACHINE LEARNING CHALLENGES
• SLOW PROCESSES: 30+ Hours to Build GBDT (Gradient Boosted Tree Regression)
• MODEL COMPLEXITY: Days for Data Transformation, Weeks for Feature Engineering, Months for Scoring Pipelines
• ESCALATING TCO: $3M+, More Servers and Infrastructure Yielding Diminishing Returns

GPU-ACCELERATED DATA SCIENCE
Use Cases in Every Industry
• CONSUMER INTERNET: Ad Personalization; Click Through Rate Optimization; Churn Reduction
• OIL & GAS: Sensor Data Tag Mapping; Anomaly Detection; Robust Fault Prediction
• FINANCIAL SERVICES: Claim fraud; Customer service chatbots/routing; Risk evaluation
• MANUFACTURING: Remaining Useful Life Estimation; Failure Prediction; Demand Forecasting
• HEALTHCARE: Improve Clinical Care; Drive Operational Efficiency; Speed Up Drug Discovery
• TELCO: Detect Network/Security Anomalies; Forecasting Network Performance; Network Resource Optimization (SON)
• RETAIL: Supply Chain & Inventory Management; Price Management / Markdown Optimization; Promotion Prioritization And Ad Targeting
• AUTOMOTIVE: Personalization & Intelligent Customer Interactions; Connected Vehicle Predictive Maintenance; Forecasting, Demand, & Capacity Planning

ML WORKFLOW STIFLES INNOVATION
• Wrangle Data: Data Sources, ETL, Data Lake, Data Preparation
• Train: Train, Evaluate
• Deploy: Predictions
Time-consuming, inefficient workflow that wastes data science productivity

DAY IN THE LIFE OF A DATA SCIENTIST
[Figure: two clock diagrams, CPU POWERED WORKFLOW and GPU POWERED WORKFLOW. Labels on the diagrams: Dataset Downloads Overnight; Start Data Prep Workflow; Configure Data Prep Workflow; GET A COFFEE; Find Unexpected Null Values Stored as String…; Restart Data Prep Workflow; ANOTHER… @*#! Forgot to Add a Feature; Restart Data Prep Workflow Again; Switch to Decaf; Train Model; Validate; Test Model; Experiment with Optimizations and Repeat; Stay Late; Go Home on Time. Legend: Dataset Collection, Analysis, Data Prep, Train, Inference.]

DATA SCIENCE WORKFLOW WITH RAPIDS
Open Source, End-to-end GPU-accelerated Workflow Built On CUDA (from DATA to PREDICTIONS)

DATA PREPARATION
• GPU-accelerated compute for in-memory data preparation
• Simplified implementation using familiar data science tools
• Python drop-in Pandas replacement built on CUDA C++
• GPU-accelerated Spark (in development)

MODEL TRAINING
• GPU-acceleration of today’s most popular ML algorithms
• XGBoost, PCA, Kalman, K-means, k-NN, DBScan, tSVD …

VISUALIZATION
• Effortless exploration of datasets, billions of records in milliseconds
• Dynamic interaction with data = faster ML model development
• Data visualization ecosystem (Graphistry & OmniSci), integrated with RAPIDS

TRADITIONAL DATA SCIENCE CLUSTER
Workload Profile: Fannie Mae Mortgage Data:
• 192GB data set
• 16 years, 68 quarters
• 34.7 Million single family mortgage loans
• 1.85 Billion performance records
• XGBoost training set: 50 features
300 Servers | $3M | 180 kW

GPU-ACCELERATED MACHINE LEARNING CLUSTER
DGX-2 and RAPIDS for Predictive Analytics
DGX-2 | 10 kW
1/8 the Cost | 1/15 the Space | 1/18 the Power
[Chart: End-to-End time for 20, 30, 50, and 100 CPU nodes, DGX-2, and 5x DGX-1.]

RAPIDS: DELIVERING DATA SCIENCE VALUE
• Maximized Productivity: Oak Ridge National Labs, 215x Speedup Using RAPIDS with XGBoost
• Top Model Accuracy: Global Retail Giant, $1B Potential Saving with 4% Error Rate Reduction
• Lowest TCO: Streaming Media Company, $1.5M Infrastructure Cost Saving

PILLARS OF RAPIDS PERFORMANCE
• CUDA Architecture: Massively Parallel Processing
• NVLink/NVSwitch (6x NVLink, NVSwitch): High-Speed Connections between GPUs for Distributed Algorithms
• Integrated Software (Python, Dask, DL frameworks, RAPIDS, cuDF, cuML, cuDNN, CUDA, Apache Arrow on GPU Memory): Fully Integrated Software and Hardware for Instant Productivity

FASTER SPEEDS, REAL WORLD BENEFITS
[Chart: three bar-chart panels, cuIO/cuDF (Load and Data Preparation), cuML (XGBoost), and End-to-End, each comparing 20, 30, 50, and 100 CPU nodes, DGX-2, and 5x DGX-1. Time in seconds; shorter is better. Legend: cuIO / cuDF (Load and Data Preparation), Data Conversion, XGBoost. Bar labels for the CPU-node configurations: 2,741; 715; 2,290; 1,675; 1,956; 379; 1,999; 1,948. Bar labels for DGX-2: 42; 169. Bar labels for 5x DGX-1: 19; 157.]
• Benchmark: 200GB CSV dataset; data preparation includes joins, variable transformations
• CPU Cluster Configuration: CPU nodes (61 GiB of memory, vCPUs, 64-bit platform), Apache Spark
• DGX Cluster Configuration: 5x DGX-1 on InfiniBand network

SELECTING THE RIGHT RAPIDS SOLUTION
Unparalleled Data Science Performance and Productivity
Segments: ML Enthusiast; Machine Learning Developer; Data Center Machine Learning; Data Science Workstations; Shared infrastructure for Data Science Teams

TITAN RTX
• Benefit: PC solution, easy to acquire, deploy and get started experimenting
• GPU Memory: 48GB
• GPU Fabric: 2-way NVLINK

Quadro Workstation
• Benefit: Enterprise workstation for experienced data scientists
• GPU Memory: 64GB
• GPU Fabric: 2-way NVLINK

DGX Station
• Benefit: Enterprise ML workgroups, largest memory on a workstation
• GPU Memory: 128GB
• GPU Fabric: 4-way NVLINK

DGX-1 / HGX-1 / OEM
• Benefit: Enterprise server, proven 8-way configuration, modular approach for scale, multi-node training
• GPU Memory: 256GB
• GPU Fabric: 8-way NVLINK

DGX-2 / HGX-2 / OEM
• Benefit: Largest compute and memory capacity in single node, fastest training solution
• GPU Memory: 512GB
• GPU Fabric: 16-way NVSWITCH

End-to-end portfolio optimized for RAPIDS

WIDESPREAD SUPPORT FOR RAPIDS
• Open Source Community
• Enterprise Data Science Platforms
• Deep Learning Integration
• Startups
• GPU Servers
• Storage Partners
* Spark and Hadoop support coming soon

TRANSFORMING RETAIL WITH RAPIDS
Inventory Forecast: 180x speedup using RAPIDS with cuDF
10 stores, million rows | 600 stores, 60 million rows
“My previous bottleneck was I/O …15 seconds to pull in data for 10 stores (about Million rows). With RAPIDS, we can pull in data for about 600 stores (60 Million rows) in less than seconds … just plain awesome.”
- A mid-market specialty retailer with 4800 stores

TRANSFORM STREAMING MEDIA RECOMMENDATION SYSTEM WITH RAPIDS
$1.5M Infrastructure Cost Saving with 24x Speed-up on XGBoost
Hundreds of CPUs | GPU
Increase customer retention | Higher customer satisfaction | Increase revenue
“I got 24x speedup using RAPIDS XGBOOST and can now replace hundreds of CPU nodes running my biggest ML workload on a single node with GPUs. You made XGBOOST too fast!?”
- Streaming Media Company

PREDICT EPIDEMIC DISEASE IN HEALTHCARE WITH RAPIDS
80x speedup on GPU-accelerated XGBoost
Days on CPUs | Hours on GPU
“Early precaution of epidemic disease is now possible with 80x faster training time on RAPIDS.”
- Dr Jian Zong Wang, Vice Chief Engineer and Senior AI Director (from the Largest Insurance and Internet Finance Company in China)

FOR MORE INFORMATION
www.nvidia.com/datascience
www.rapids.ai
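
The cluster comparison quoted earlier in the deck (300 servers, $3M, 180 kW against a DGX-2 at 10 kW, described as "1/8 the Cost" and "1/18 the Power") can be sanity-checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the figures exactly as quoted on the slides, the variable and function names are made up for this example, and the dollar amount it prints is an inference from the "1/8" ratio, not a price stated in the deck.

```python
from fractions import Fraction

# Figures as quoted on the two cluster slides (not independently measured).
CPU_CLUSTER_COST_USD = 3_000_000          # "300 Servers | $3M | 180 kW"
CPU_CLUSTER_POWER_KW = 180
DGX2_POWER_KW = 10                        # "DGX-2 | 10 kW"
CLAIMED_COST_FRACTION = Fraction(1, 8)    # "1/8 the Cost"
CLAIMED_POWER_FRACTION = Fraction(1, 18)  # "1/18 the Power"


def power_fraction(gpu_kw: int, cpu_kw: int) -> Fraction:
    """Return GPU power draw as an exact fraction of CPU-cluster power draw."""
    return Fraction(gpu_kw, cpu_kw)


def implied_gpu_cost(cpu_cost_usd: int, cost_fraction: Fraction) -> Fraction:
    """Return the GPU system cost implied by a claimed cost fraction."""
    return cpu_cost_usd * cost_fraction


# Hypothetical names for this example only.
observed = power_fraction(DGX2_POWER_KW, CPU_CLUSTER_POWER_KW)
implied_cost = implied_gpu_cost(CPU_CLUSTER_COST_USD, CLAIMED_COST_FRACTION)

print(f"Power fraction from quoted kW figures: {observed}")
print(f"Matches the '1/18 the Power' claim: {observed == CLAIMED_POWER_FRACTION}")
print(f"DGX-2 cost implied by '1/8 the Cost': ${int(implied_cost):,}")
```

Using exact fractions keeps the check free of floating-point rounding, so the comparison with the quoted "1/18" is an equality test, not an approximation. The "1/15 the Space" claim cannot be checked this way because the deck gives no rack-space figures.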

Posted: 30/08/2022, 07:06