**What is Median?**

The median is a measure of central tendency that represents **the middle value** in a dataset when it’s ordered from least to greatest.

If there’s an odd number of data points, the median is the middle number. If there’s an even number, it’s the average of the two middle numbers.

**Example**

Consider the set of numbers: 3, 7, 8, 5, and 12.

When arranged in ascending order (3, 5, 7, 8, 12), the median is 7.

But for the set: 3, 7, 8, 5, 12, and 15, the median is (7+8)/2 = 7.5.
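As a quick check, both examples can be reproduced with Python's standard `statistics` module. This is a minimal sketch; the variable names are illustrative only.

```python
from statistics import median

odd_set = [3, 7, 8, 5, 12]
even_set = [3, 7, 8, 5, 12, 15]

# median() sorts the data internally, so the input order does not matter.
odd_median = median(odd_set)    # middle value of 3, 5, 7, 8, 12
even_median = median(even_set)  # average of the two middle values, 7 and 8

print(odd_median)   # 7
print(even_median)  # 7.5
```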

**3 Types of Median**

Generally, we use 3 types of Median:

- Population Median
- Sample Median
- Weighted Median

**1. Population Median**

This is the median calculated using data from an entire population.

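It’s often denoted by \( \tilde{\mu} \) or \( M \). The plain Greek letter **µ** conventionally denotes the population mean, not the median.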

**2. Sample Median**

This is the median calculated using data from **a sample of a population**.

It’s often used in statistics when it’s impractical to collect data from an entire population.

It’s commonly denoted by \( \tilde{x} \) or \( m \).

**3. Weighted Median**

In some datasets, certain values have more importance (or weight) than others. **The weighted median takes these weights into account.**

It’s found by ordering the data points and their weights and then finding the value where half the total weight is on each side.

**Mathematical Notation**:

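Given data points \( x_1 \le x_2 \le \dots \le x_n \) sorted in ascending order, with positive weights \( w_1, w_2, \dots, w_n \), the weighted median is the value \( x_k \) for which:

\[ \sum_{i=1}^{k-1} w_i \le \frac{1}{2} \sum_{i=1}^{n} w_i \]

and

\[ \sum_{i=k+1}^{n} w_i \le \frac{1}{2} \sum_{i=1}^{n} w_i \]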
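The procedure described above (sort, then walk the cumulative weight until half the total is reached) might be sketched as follows. The function name `weighted_median` is hypothetical, and the sketch assumes all weights are positive. When the cumulative weight lands exactly on half, it returns the lower of the two candidate values.

```python
def weighted_median(values, weights):
    """Return the (lower) weighted median, assuming positive weights."""
    if not values or len(values) != len(weights):
        raise ValueError("values and weights must be non-empty and equal in length")

    # Order the data points together with their weights.
    pairs = sorted(zip(values, weights))
    half = sum(weights) / 2

    cumulative = 0
    for value, weight in pairs:
        cumulative += weight
        # First value at which at least half the total weight is covered:
        # the weight before it is below half, and the weight after it
        # is at most half.
        if cumulative >= half:
            return value


print(weighted_median([1, 2, 3, 4], [1, 1, 1, 5]))  # 4
print(weighted_median([10, 20, 30], [1, 1, 1]))     # 20
```

With equal weights the result matches the ordinary median for an odd number of points, which is a useful sanity check.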

**Mean vs Median**

Beginners often mistakenly believe that the Mean (or average) accurately represents the center of any statistical data.

**However, the Mean isn’t universally applicable.** In some cases, using the Median as a central measure might be more appropriate.

**Mean: When to Apply?**

The mean works well **when data is symmetrically distributed without outliers.**

**Inappropriate Scenario 1**

In the presence of extreme outliers.

For instance, when determining the **average income** in a neighborhood, **if a billionaire lives there**, the mean is pulled far upward and no longer represents the typical resident’s income.

Outliers can heavily skew the mean, making it less representative of the majority of data points.
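To see the size of the effect, here is a small sketch with made-up incomes (the figures are purely illustrative, not real data). Adding one billionaire moves the mean by several orders of magnitude while the median barely shifts.

```python
from statistics import mean, median

# Hypothetical annual incomes for five residents.
incomes = [42_000, 48_000, 51_000, 55_000, 60_000]
mean_before = mean(incomes)
median_before = median(incomes)

# The same neighborhood after a billionaire moves in.
incomes_with_outlier = incomes + [1_000_000_000]
mean_after = mean(incomes_with_outlier)
median_after = median(incomes_with_outlier)

print(mean_before, median_before)  # 51200 51000
print(median_after)                # 53000.0
print(mean_after > 100_000_000)    # True
```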

**Inappropriate Scenario 2**

When data is bimodal or has multiple peaks.

For example, if a class has **two distinct groups of high and low scorers**, the mean might not accurately represent either group.

The mean can be influenced by the distribution of data and might not capture the essence of bimodal distributions.
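A small sketch with invented test scores illustrates the point: the overall mean falls in the gap between the two groups, at a score nobody came close to. The scores and group names are assumptions made for the example.

```python
from statistics import mean

# Hypothetical scores for two distinct groups in one class.
low_scorers = [35, 38, 40, 42, 45]
high_scorers = [85, 88, 90, 92, 95]
all_scores = low_scorers + high_scorers

overall_mean = mean(all_scores)

# Distance from the overall mean to the closest actual score.
closest_gap = min(abs(score - overall_mean) for score in all_scores)

print(overall_mean)                          # 65
print(mean(low_scorers), mean(high_scorers)) # 40 90
print(closest_gap)                           # 20
```

Reporting the two group means separately describes this class far better than the single overall figure.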

**Median: When to Apply?**

The median works well **when data has outliers or is skewed**.

**Inappropriate Scenario 1**

When data has gaps or is open-ended. **For instance, age categories like “60 and above” can make it challenging to determine a precise median.**

Without specific data points, finding an exact middle value becomes difficult.

**Inappropriate Scenario 2**

When data is ordinal **but the intervals between categories aren’t consistent**.

For example, a survey with responses like “not at all”, “somewhat”, “very much” might not have equidistant intervals.

**Exercises: Which to Apply, Mean or Median?**

To help you grasp these concepts, here are three statistical cases. Decide whether the Mean or Median would be more appropriate:

**Case 1**

**A shoe company wants to determine the average shoe size of its customers.**

The sizes range from 5 to 12, but there’s a promotional event where customers with size 11 shoes get a significant discount.

**Case 2**

A city wants to determine the central age of its residents.

**However,** there’s a renowned university in the city, leading to a higher population of people aged 18-22.

**Case 3**

A survey asks people how many times they eat out in a week.

Most responses are between 1-3 times, but **a few respondents claim they eat out every meal, totaling 21 times a week**.

**Answers**

**Case 1: Median.** Due to the promotional event, there might be a disproportionate number of size 11 shoe sales, which could skew the mean.

**Case 2: Median.** The influx of university students creates a peak in the data around ages 18-22, making the median a better representative of the central age.

**Case 3: Median.** The few respondents reporting 21 meals a week are extreme outliers in data that otherwise sits between 1 and 3, so the distribution is right-skewed and the mean is pulled upward. The median better reflects the typical respondent.