Monday, 11 November 2019

Descriptive Statistics for Data Analytics/Science


Descriptive statistics summarizes the data that is available; it does not tell us anything beyond that data.

So how do we describe (numerical or categorical) data?

- graphical representation 
- tabular representation
- summary statistics 
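As a small illustration of the tabular representation, a frequency table for one categorical variable can be built with the Python standard library. The `blood_groups` sample and the `frequency_table` helper are made up for this sketch.

```python
from collections import Counter

# Hypothetical sample of a categorical variable (values invented for illustration).
blood_groups = ["A", "O", "B", "O", "A", "O", "AB", "O"]

def frequency_table(values):
    """Return (category, count, relative frequency) rows, most frequent first."""
    counts = Counter(values)
    total = len(values)
    return [(cat, n, n / total) for cat, n in counts.most_common()]

for category, count, share in frequency_table(blood_groups):
    print(f"{category:<3} {count:>2} {share:.1%}")
```

The same counts are what a bar graph or pie chart of this variable would display.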

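Types of variables in statistics:

1) Numerical or Quantitative (continuous or discrete)
2) Categorical or Qualitative (always discrete)

Categorical variables are further divided into:

- nominal (categories with no natural order)
- ordinal (categories with a natural order)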

Variables can be analysed one at a time or several together. Because descriptive statistics deals with data, it is important to establish the relationships between these variables in order to summarize them.

  • Graphical Representation: Single variable

- For Categorical (Qualitative) variables:

-> Bar Graph: suited for ordinal variables
-> Pie Chart: suited for nominal variables

- For Quantitative (Numerical) variables:

-> Box-plot (suited for continuous data)
-> Histogram (a first step for seeing the shape of the distribution)
- groups values into bins, so it suits continuous data as well as discrete data with many distinct values
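To make the two quantitative charts concrete, the sketch below computes the numbers they draw: bin counts for a histogram and the five-number summary for a box plot. The `heights_cm` data and both helper functions are hypothetical. Quartiles come from `statistics.quantiles` (Python 3.8+) with its default method; other tools may use a slightly different quartile convention.

```python
import statistics

# Hypothetical continuous measurements (values invented for illustration).
heights_cm = [150.5, 152.0, 158.3, 160.1, 161.7, 163.4, 165.0, 168.2, 171.9, 180.6]

def histogram_counts(values, bin_width, start):
    """Count how many values fall into each [lower, lower + bin_width) bin."""
    counts = {}
    for v in values:
        lower = start + bin_width * int((v - start) // bin_width)
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))

def five_number_summary(values):
    """Minimum, Q1, median, Q3 and maximum: the numbers a box plot draws."""
    q1, q2, q3 = statistics.quantiles(values, n=4)
    return min(values), q1, q2, q3, max(values)

print(histogram_counts(heights_cm, bin_width=10, start=150))
print(five_number_summary(heights_cm))
```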

  • Graphical Representation: Multiple variables

- Scatter plots -> two quantitative (numerical) variables
- Box-plots -> one categorical (qualitative) and one quantitative variable (suited for continuous data)
- Contingency table -> two categorical variables, with the frequency of each combination of categories
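A contingency table for two categorical variables can be sketched by counting pairs of categories. The survey rows and the `contingency_table` helper below are invented for illustration.

```python
from collections import Counter

# Hypothetical survey rows: (gender, preferred_drink); values are made up.
rows = [
    ("F", "tea"), ("M", "coffee"), ("F", "coffee"), ("M", "coffee"),
    ("F", "tea"), ("M", "tea"), ("F", "coffee"), ("F", "tea"),
]

def contingency_table(pairs):
    """Map each (row_category, column_category) pair to its frequency."""
    return Counter(pairs)

table = contingency_table(rows)
row_labels = sorted({r for r, _ in rows})
col_labels = sorted({c for _, c in rows})

# Print the table with one row per first variable and one column per second.
print(" " * 4 + "".join(f"{c:>8}" for c in col_labels))
for r in row_labels:
    print(f"{r:<4}" + "".join(f"{table[(r, c)]:>8}" for c in col_labels))
```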

  • Measures of Central Tendency: 

How do we summarize data through numbers? (summary statistics)

- Measures of central tendency (mean, median, mode)
- Dispersion
- Skewness and kurtosis

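Choosing when to use the mean or the median:

- bad outlier (an erroneous value, such as a data-entry mistake): prefer the median, which is barely affected by extreme values
- good outlier (a genuine extreme value): the mean takes it into account, whereas the median largely ignores it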
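A quick sketch of how an outlier affects the two measures, using Python's `statistics` module. The salary figures are hypothetical.

```python
import statistics

# Hypothetical salaries in thousands (values invented for illustration).
salaries = [30, 32, 35, 36, 38]
with_outlier = salaries + [400]  # one extreme value added

mean_clean = statistics.fmean(salaries)
median_clean = statistics.median(salaries)
mean_outlier = statistics.fmean(with_outlier)
median_outlier = statistics.median(with_outlier)

print(f"without outlier: mean={mean_clean:.1f}, median={median_clean}")
print(f"with outlier:    mean={mean_outlier:.1f}, median={median_outlier}")
```

The mean moves a long way when the extreme value is added, while the median barely changes.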

When to use the mode:

- useful with nominal variables
- useful for multimodal distributions
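Both cases can be sketched with the standard library (`statistics.multimode` needs Python 3.8+). The `colours` and `scores` samples are made up.

```python
import statistics

# Nominal variable: the mode is the most frequent category.
colours = ["red", "blue", "red", "green", "blue", "red"]
most_common_colour = statistics.mode(colours)

# Multimodal data: multimode returns every value tied for the highest count.
scores = [1, 2, 2, 3, 4, 4, 5]
all_modes = statistics.multimode(scores)

print(most_common_colour)
print(all_modes)
```

A mean or median is not meaningful for `colours`, which is why the mode is the usual summary for nominal variables.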

  • Measures of dispersion:

- Range (Max - Min)
- Interquartile Range (IQR) = 3rd Quartile - 1st Quartile (75th percentile - 25th percentile)
- Sample Standard Deviation
- Variance
- Mean Absolute Deviation
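The measures listed above can be computed with Python's `statistics` module, as in the sketch below. The `data` values and the `dispersion_summary` helper are hypothetical. The quartiles use the default method of `statistics.quantiles` (Python 3.8+), so the IQR may differ slightly from tools that use another quartile convention. Mean absolute deviation is taken here as the mean of absolute deviations from the mean.

```python
import statistics

# Hypothetical sample (values invented for illustration).
data = [2, 4, 4, 4, 5, 5, 7, 9]

def dispersion_summary(values):
    """Return the common dispersion measures for a sample of numbers."""
    mean = statistics.fmean(values)
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "range": max(values) - min(values),
        "iqr": q3 - q1,
        "sample_variance": statistics.variance(values),  # divides by n - 1
        "sample_std_dev": statistics.stdev(values),      # square root of the above
        "mean_abs_deviation": statistics.fmean([abs(x - mean) for x in values]),
    }

for name, value in dispersion_summary(data).items():
    print(f"{name}: {value:.3f}")
```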

 All Formulas:

[Image: DS formula]
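For reference, the standard textbook definitions of the measures named in this post can be written as follows. This is a sketch in the usual notation, which is an assumption and may differ from the original formula sheet: $x_i$ are the observations, $n$ is the sample size, $Q_1$ and $Q_3$ are the first and third quartiles, and $m_k$ is the $k$-th central moment. The skewness and kurtosis shown are the simple moment-based forms; bias-corrected variants also exist.

```latex
\begin{align*}
\text{Mean:} \quad & \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \\
\text{Range:} \quad & R = x_{\max} - x_{\min} \\
\text{Interquartile range:} \quad & \mathrm{IQR} = Q_3 - Q_1 \\
\text{Sample variance:} \quad & s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2 \\
\text{Sample standard deviation:} \quad & s = \sqrt{s^2} \\
\text{Mean absolute deviation:} \quad & \mathrm{MAD} = \frac{1}{n}\sum_{i=1}^{n} \lvert x_i - \bar{x} \rvert \\
\text{Central moments:} \quad & m_k = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^k \\
\text{Skewness:} \quad & g_1 = \frac{m_3}{m_2^{3/2}} \\
\text{Excess kurtosis:} \quad & g_2 = \frac{m_4}{m_2^{2}} - 3
\end{align*}
```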
