Basic Business Analytics Using Excel: BI 348 Chapter 02



Document information

Highline Class, BI 348 Basic Business Analytics using Excel, Chapter 02 Descriptive Statistics

Topics
• Data Types & Default Alignment in Excel
• Raw Data, Data
• Variable, Element, Observation
• Proper Data Set: Proper Table of Data
• Population and Sample
• Categorical and Quantitative Data
• Cross Sectional and Time Series Data
• Sources of Data
• Sort & Filter to Organize Data
• Conditional Formatting to Visualize Data

Topics
• Frequency Distributions for Categorical Data, Charts: Column
• Frequency Distributions for Quantitative Data, Charts: Histogram
• Skew of Histograms
• Cumulative Distributions

Topics
• Measures of Location
  • Mean
  • Median
  • Mode
  • Geometric Mean
• Measures of Variability
  • Range
  • Variance
  • Standard Deviation
  • Coefficient of Variation
  • Z-score: Number of Standard Deviations

Topics
• The Normal Distribution & the Empirical Rule
• Identifying Outliers
• Percentiles and Quartiles
• Box Plots

Raw Data: Data stored in its smallest size
• Why? Because it is easier to analyze data when it is stored in its smallest parts

Data:
• Textbook: Facts or figures collected, analyzed and summarized for presentation and interpretation
• Data = all the unorganized raw data in a Proper Data Set

Data Types & Default Alignment in Excel
• Empty Cells: Not really a Data Type, but it is a "thing" in Excel that can sometimes cause problems
• Note: Refer to Empty Cells as "Empty Cells", not blanks
• Why Default Alignment?
Because Left means Excel thinks it is Text and Right means Excel thinks it is a Number. This is important when dealing with data because some systems will mistakenly import numbers as text. Numbers stored as text do not always behave as you expect (for example, they are not added by the SUM function). The Default Alignment is a visual cue that informs us about how Excel "sees" the data.

Proper Data Set: Proper Table of Data
• A structure for your data set, necessary so that Excel Data Analysis features like Sort, Filter and PivotTables will work correctly:
  • Fields in first row (no empty cells)
  • Records or Observations in rows
  • Empty cells or Excel Row/Column Headers all the way around Data Set
  • Try not to have empty cells in data set

Terms for Proper Data Set
• Primary Key / List of Unique Elements
• Variables
• Element = Entities on which data are collected. We are collecting data for each Transaction Number, so Transaction Number is the Element
• Each row is a Record / Observation
• All are called Fields (Column Headers)

Coefficient of Variation
• Formula = SD/Mean
• Coefficient of Variation converts the SD to SD per unit of Mean
  • For every one unit of mean, what is the SD?
• If you add Percent Number Formatting, it shows SD as a percentage of Mean
  • What percentage is SD in relation to the Mean?
• Use Coefficient of Variation to compare:
  • Data in different units
  • Data in the same units, but the means are far apart

Z-score: Number of Standard Deviations
• Formula for z-score = Deviation/SD = (Xi - Xbar)/SD
• Excel Function: STANDARDIZE(X,Mean,SD)
• z-score = How many standard deviations is a particular value away from the mean?
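The Coefficient of Variation and z-score formulas can also be checked outside Excel. The following is a minimal Python sketch, not part of the course material: the function names and the `sales` values are made up for illustration, and it assumes the sample standard deviation (the kind Excel's STDEV.S calculates), since the formulas above do not say which SD is meant.

```python
import statistics


def coefficient_of_variation(values):
    """SD / Mean. Assumption: SD is the sample standard deviation."""
    return statistics.stdev(values) / statistics.mean(values)


def z_score(x, mean, sd):
    """Same arithmetic as STANDARDIZE(X, Mean, SD): (x - mean) / sd."""
    return (x - mean) / sd


# Hypothetical data, chosen only to keep the arithmetic easy to follow.
sales = [10, 20, 30, 40, 50]

mean = statistics.mean(sales)    # 30
sd = statistics.stdev(sales)     # sample SD of the five values
cv = coefficient_of_variation(sales)
z_values = [z_score(x, mean, sd) for x in sales]

print(f"Mean = {mean}, SD = {sd:.4f}")
print(f"Coefficient of Variation = {cv:.2%}")  # SD as a percentage of Mean
for x, z in zip(sales, z_values):
    print(f"x = {x}: z = {z:+.4f}")
```

The value equal to the mean gets z = 0, and values below and above the mean get negative and positive z-scores respectively.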
Interpreting the z-score:
• z < 0, value below mean
• z > 0, value above mean
• z = 0, value is equal to mean
• Z-score measures the relative location of a particular x in the data set (as compared to the mean), in units of standard deviation
• Relative Location in terms of "Number of Standard Deviations"
• z-score = Standardized Value
• Observations in different data sets that have the same z-score are said to have the same relative location, or the same number of standard deviations away from the mean

Uses of z-score:
• Used in the Standard Normal Bell Curve or "Empirical Rule"
• One way to measure Outliers (extreme values) is to consider any value that has a z-score greater than [number missing in source] to be an Outlier

[Figure: Example of Bell Shaped "Normal" Distribution]

[Figure: Empirical Rule]

[Figure: Example of Empirical Rule]

Identifying Outliers: Z Rule
• One way to measure Outliers (extreme values) is to consider any value that has a z-score greater than [number missing in source] to be an Outlier
• In Sep and Oct of 1981 the 10-year Government Bond Yield was above 15%
• This was a value more than [number missing in source] standard deviations away from the mean and is therefore considered an outlier

Measures for Location: Percentiles
Percentiles:
• Percentile: Create Marker in sorted data set that divides set into Parts with about P% Below the Marker and (100 - P)% Above
Excel Functions:
• PERCENTILE.EXC
  • EXC = Exclusive: Excludes 0% & 100%, so Min and Max values cannot be found (0% and 100% are not allowed)
• PERCENTILE.INC
  • INC = Inclusive: Includes 0% & 100%, so Min and Max values CAN be found (0% = Min & 100% = Max)
• For Large Data Sets the two functions calculate similar answers

Measures for Location: Quartiles
Quartiles:
• Create Marker in sorted data set that divides set into four equal parts:
  • Each part contains approximately 25% of the observations
• The three Markers are referred to as quartiles:
  • Q1 = first quartile, or 25th percentile
  • Q2 = second quartile, or 50th percentile (also the median)
  • Q3 = third quartile, or 75th percentile
Excel Functions:
• QUARTILE.EXC
  • EXC = Exclusive: Min and Max values cannot be found; can only enter 1, 2, 3 in second argument
• QUARTILE.INC
  • INC = Inclusive: 0 = Min, 1 = Quartile 1, 2 = Quartile 2, 3 = Quartile 3, 4 = Max
• For Large Data Sets the two functions calculate similar answers

Percentile & Quartile Are Markers That Divide A Set Of Sorted Numbers Into Two Sets

[Figure: Box Plots by hand]

Box Plots
• No easy way to create Box Plots in Excel
• Reference video for how to do it in Excel: Excel 2010 Statistics #28: Box & Whisker Plot: Stacked Bar with Mean Point Plotted and Outlier Lines https://www.youtube.com/watch?v=bgaN446TQXo
• XLMiner Add-in makes it easy to create Box Plots for single and multiple variable data sets
• Must have a Proper Data Set

[Figure: Box Plots in XLMiner]

[Figure: Box Plots in Excel 2016]

Don't Forget:
• Q: Why MUST we have a Proper Data Set?
• A: So we can ask questions of each Field (Variable)!
• Like in a PivotTable when we drag a field like "Sales Rep" to ask the question: "What are the total sales for each Sales Rep?"
• Q: Why do Histograms have "No Gap Width"?
• A: Continuous Quantitative Data that is grouped has no gaps between categories, so columns must touch (have no gap width) to visually indicate that no numbers can fit between the categories or columns
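The difference between the Exclusive and Inclusive percentile functions is easiest to see on a small data set. The following is a rough Python sketch under stated assumptions: the function names `percentile_inc` and `percentile_exc` and the sample data are hypothetical, and the position rules used (`p*(n-1)` counted from zero for INC, `p*(n+1)` counted from one for EXC, with linear interpolation between neighbours) are the commonly documented conventions for these two methods rather than anything stated in the slides.

```python
def percentile_inc(values, p):
    """Inclusive method: p = 0 gives the Min, p = 1 gives the Max."""
    if not 0 <= p <= 1:
        raise ValueError("p must be between 0 and 1")
    data = sorted(values)
    n = len(data)
    rank = p * (n - 1)              # zero-based position in the sorted data
    lo = int(rank)                  # index of the value at or below the marker
    hi = min(lo + 1, n - 1)         # neighbour above, clamped at the last value
    return data[lo] + (rank - lo) * (data[hi] - data[lo])


def percentile_exc(values, p):
    """Exclusive method: percentiles at or near 0% and 100% are not allowed."""
    data = sorted(values)
    n = len(data)
    rank = p * (n + 1)              # one-based position in the sorted data
    if rank < 1 or rank > n:
        raise ValueError("p is outside the range the exclusive method allows")
    lo = int(rank)                  # one-based index at or below the marker
    hi = min(lo, n - 1)             # zero-based index of the neighbour above
    return data[lo - 1] + (rank - lo) * (data[hi] - data[lo - 1])


# Hypothetical sorted data set of nine values.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9]

for p in (0.25, 0.50, 0.75):
    print(p, percentile_inc(data, p), percentile_exc(data, p))

print("INC at 0% and 100%:", percentile_inc(data, 0), percentile_inc(data, 1))
```

With only nine values the two methods disagree on the first and third quartiles but agree on the median; as the data set grows, the gap between them shrinks, which is the point made in the last bullet of each slide.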

Date posted: 31/10/2020, 15:55

Table of contents

    Raw Data: Data stored in its smallest size

    Data Types & Default Alignment in Excel

    Proper Data Set: Proper Table of Data

    Terms for Proper Data Set

    Proper Data Set with a Primary Key / List of Unique Elements:

    Proper Data Set with NO Primary Key / List of Unique Elements:

    Categorical and Quantitative Data

    Sort & Filter to Organize Data

    How to create PivotTable:

    Conditional Formatting to Visualize Data