**What is Mean?**

At its core, the Mean is just **the average** of a set of numbers.

**Example**

Imagine you and four friends went out and bought ice creams.

The prices were €2, €3, €3, €4, and €5.

To find the mean price, **you’d add up all the prices and divide by the number of ice creams**.

(2 + 3 + 3 + 4 + 5) / 5 = 17 / 5 = 3.4

So, on average, you each spent €3.40 on ice cream.
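The same calculation can be sketched in a few lines of Python. The variable names are only illustrative; the prices are the ones from the example above.

```python
from statistics import mean

# Ice cream prices in euros, taken from the example above
prices = [2, 3, 3, 4, 5]

total = sum(prices)       # add up all the prices
count = len(prices)       # number of ice creams
average = total / count   # divide the sum by the count

print(f"Mean price: €{average:.2f}")

# The standard library's statistics.mean performs the same computation
library_average = mean(prices)
print(f"statistics.mean: €{library_average:.2f}")
```

Both lines print the same value, since `statistics.mean` is simply the sum divided by the count.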

**Formula for the Mean**

The formula for the mean is given by:

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_i$$

where:

- $\mu$ is the mean
- $n$ is the number of values
- $x_i$ represents each value in the dataset

**This formula might look complex,** but remember that **all you need to do is add up all the numbers and divide by the count.**

**Types of Mean**

There are 3 common types of mean: Arithmetic Mean, Geometric Mean and Harmonic Mean.

**1. Arithmetic Mean**

This is the most common type and what we generally refer to as the “mean”.

It’s the sum of all numbers divided by the count.

**2. Geometric Mean**

The Geometric Mean is used mainly in **finance**. It’s the nth root of the product of n numbers.

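Formula:

$$\text{Geometric Mean} = \left(\prod_{i=1}^{n} x_i\right)^{1/n} = \sqrt[n]{x_1 \cdot x_2 \cdots x_n}$$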
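As a rough illustration of the finance use case, the sketch below averages three yearly growth factors. The figures are made up for the example, and `statistics.geometric_mean` requires Python 3.8 or later.

```python
from statistics import geometric_mean

# Hypothetical yearly growth factors: +10%, +20%, -5% (illustrative data)
growth_factors = [1.10, 1.20, 0.95]

# Manual version: nth root of the product of n numbers
product = 1.0
for factor in growth_factors:
    product *= factor
manual_gm = product ** (1 / len(growth_factors))

# Standard-library version
library_gm = geometric_mean(growth_factors)

print(f"Manual geometric mean:  {manual_gm:.4f}")
print(f"Library geometric mean: {library_gm:.4f}")

# Growing by the geometric mean every year reproduces the total growth
compounded = manual_gm ** len(growth_factors)
print(f"Total growth: {product:.4f}, compounded average: {compounded:.4f}")
```

The last line shows why this mean suits growth rates: applying the average factor once per year gives the same end result as the actual sequence of factors.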

**3. Harmonic Mean**

The Harmonic Mean is often used in speed or rate calculations. It’s the reciprocal of the arithmetic mean of the reciprocals.

*Formula*:

$$\text{Harmonic Mean} = \frac{n}{\sum_{i=1}^{n} \frac{1}{x_i}}$$
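A small sketch of the speed use case, assuming a hypothetical round trip driven at 60 km/h one way and 40 km/h back over the same distance. The leg length of 120 km is an arbitrary choice for illustration.

```python
from statistics import harmonic_mean, mean

# Hypothetical round trip: same distance each way, different speeds
speeds_kmh = [60, 40]
leg_km = 120  # assumed length of each leg, chosen only for illustration

# True average speed = total distance / total time
total_distance = leg_km * len(speeds_kmh)
total_time = sum(leg_km / speed for speed in speeds_kmh)
true_average = total_distance / total_time

hm = harmonic_mean(speeds_kmh)  # reciprocal of the mean of reciprocals
am = mean(speeds_kmh)           # plain arithmetic mean, for comparison

print(f"True average speed: {true_average} km/h")
print(f"Harmonic mean:      {hm} km/h")
print(f"Arithmetic mean:    {am} km/h")
```

The harmonic mean matches the true average speed, while the arithmetic mean overstates it because more time is spent driving at the slower speed.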

**Mean vs Median**

Beginners often mistakenly believe that the Mean (or average) accurately represents the center of any statistical data.

**However, the Mean isn’t universally applicable.** In some cases, using the Median as a central measure might be more appropriate.

**Mean: When to Apply?**

The Mean works well **when data is symmetrically distributed without outliers.**

**Inappropriate Scenario 1**

In the presence of extreme outliers.

For instance, when determining the **average income** in a neighborhood, **if a billionaire lives there**, the mean is pulled far upward and no longer represents the typical resident’s income.

Outliers can heavily skew the mean, making it less representative of the majority of data points.
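To see the effect in numbers, here is a minimal sketch with invented incomes: nine ordinary residents plus one billionaire. All figures are hypothetical.

```python
from statistics import mean, median

# Hypothetical yearly incomes of nine ordinary residents (illustrative data)
incomes = [30_000, 35_000, 40_000, 42_000, 45_000,
           48_000, 50_000, 55_000, 60_000]

mean_before = mean(incomes)
median_before = median(incomes)

# A billionaire moves in
incomes_with_outlier = incomes + [1_000_000_000]

mean_after = mean(incomes_with_outlier)
median_after = median(incomes_with_outlier)

print(f"Without outlier: mean = {mean_before:,.0f}, median = {median_before:,.0f}")
print(f"With outlier:    mean = {mean_after:,.0f}, median = {median_after:,.0f}")
```

A single extreme value moves the mean above 100 million, while the median barely changes.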

**Inappropriate Scenario 2**

When data is bimodal or has multiple peaks.

For example, if a class has **two distinct groups of high and low scorers**, the mean might not accurately represent either group.

With two peaks, the mean tends to land in the gap between them, where few or no data points actually lie.
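The class example can be sketched with made-up scores for the two groups. The numbers are purely illustrative.

```python
from statistics import mean

# Hypothetical test scores for two distinct groups (illustrative data)
low_scorers = [35, 38, 40, 42, 45]
high_scorers = [85, 88, 90, 92, 95]
scores = low_scorers + high_scorers

class_mean = mean(scores)

# Distance from the class mean to the nearest actual score
nearest_gap = min(abs(score - class_mean) for score in scores)

print(f"Class mean: {class_mean}")
print(f"Low group mean: {mean(low_scorers)}, high group mean: {mean(high_scorers)}")
print(f"Closest score is {nearest_gap} points away from the class mean")
```

In this sketch no student scored anywhere near the class mean, so it describes neither group.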

**Median: When to Apply?**

The Median works well **when data has outliers or is skewed**.

**Inappropriate Scenario 1**

When data has gaps or is open-ended. **For instance, age categories like “60 and above” can make it challenging to determine a precise median.**

Without specific data points, finding an exact middle value becomes difficult.

**Inappropriate Scenario 2**

When data is ordinal **but** **the intervals between categories aren’t consistent**.

For example, a survey with responses like “not at all”, “somewhat”, “very much” might not have equidistant intervals.

**Exercises: Which to Apply, Mean or Median?**

To help you grasp these concepts, here are three statistical cases. Decide whether the Mean or Median would be more appropriate:

**Case 1**

**A shoe company wants to determine the average shoe size of its customers.**

The sizes range from 5 to 12, but there’s a promotional event where customers with size 11 shoes get a significant discount.

**Case 2**

A city wants to determine the central age of its residents.

**However,** there’s a renowned university in the city, leading to a higher population of people aged 18-22.

**Case 3**

A survey asks people how many times they eat out in a week.

Most responses are between 1-3 times, but **a few respondents claim they eat out every meal, totaling 21 times a week**.

**Answers**

**Case 1: Median.** Due to the promotional event, there might be a disproportionate number of size 11 shoe sales, which could skew the mean.

**Case 2: Median.** The influx of university students can create a peak in the data, making the median a better representative of the central age.

**Case 3: Median.** Most responses fall between 1 and 3, so the few responses of 21 are outliers that pull the mean upward. The median better reflects how often a typical respondent eats out.