**What is a Population in Statistics?**

In statistics, the term “population” refers to the entire set of individuals or items that you wish to study. It encompasses every single unit of interest.

For instance, if you’re studying the heights of all students in a school, **the entire student body is your population.**

The mean (average) of the population is written **μ**, and the standard deviation of the population is written **σ**.

**Mathematical Notations**

**μ (Mu)**: The mean (average) of the population.

**σ (Sigma)**: The standard deviation of the population, which measures the spread of data around the mean.

**N**: The total number of individuals or items in the population.
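As a rough illustration of this notation, the sketch below computes N, μ, and σ for a tiny made-up population. The heights are invented for the example, and the variable names are only illustrative.

```python
import math

# Hypothetical population: heights (cm) of every student in a very small school.
heights = [150.0, 160.0, 165.0, 170.0, 155.0]

N = len(heights)        # population size
mu = sum(heights) / N   # population mean

# The population standard deviation divides by N because every member is included.
sigma = math.sqrt(sum((x - mu) ** 2 for x in heights) / N)

print(f"N = {N}, mu = {mu:.1f}, sigma = {sigma:.2f}")
```

Python's standard library offers `statistics.pstdev` for the same population calculation, if you prefer not to write the formula by hand.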

**What is a Sample?**

A sample is a subset of the population.

It’s a smaller group, chosen from the population, which is used to infer or draw conclusions about the population as a whole.

Using the previous example, if you measure the heights of 50 students out of 1000 in a school, those 50 students constitute your sample.

**Mathematical Notations**

**x̄ (X-bar)**: The mean (average) of the sample.

**s**: The standard deviation of the sample, indicating the spread of sample data around the sample mean.

**n**: The number of individuals or items in the sample.
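The sample notation can be sketched the same way. One assumption to flag: the text above does not give a formula for s, so this example uses the common convention of dividing by n - 1 (Bessel's correction), which is what `statistics.stdev` does. The sample values are made up.

```python
import statistics

# Hypothetical sample: heights (cm) of four students drawn from a larger school.
sample = [152.0, 158.0, 163.0, 167.0]

n = len(sample)                  # sample size
x_bar = statistics.mean(sample)  # sample mean
s = statistics.stdev(sample)     # sample standard deviation (divides by n - 1)

print(f"n = {n}, x_bar = {x_bar:.1f}, s = {s:.2f}")
```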

**Population vs Sample**

To better understand the difference:

**Population**: Imagine you want to know the average age of every citizen in Japan. That’s a massive number of people! If you could somehow get the age of every single person, you’d be dealing with the population.

**Sample**: Now, gathering data from every citizen is challenging. So, instead, you randomly select 1,000 citizens and calculate their average age. This group of 1,000 is your sample, and the average age from this group is the sample mean.

The key takeaway is that while the population encompasses the whole, a sample represents a part of that whole, chosen to make data collection more feasible.
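To make the population/sample relationship concrete, here is a minimal simulation. It is only a sketch: the "population" is 100,000 synthetic ages generated uniformly at random, not real census data, and the seed and sizes are arbitrary choices.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable

# Synthetic stand-in for a population: 100,000 made-up ages between 0 and 99.
population = [rng.randint(0, 99) for _ in range(100_000)]

# Randomly select 1,000 members without replacement.
sample = rng.sample(population, 1_000)

mu = statistics.mean(population)  # population mean
x_bar = statistics.mean(sample)   # sample mean, an estimate of mu

print(f"population mean: {mu:.2f}, sample mean: {x_bar:.2f}")
```

With a random sample of this size, the sample mean typically lands close to the population mean, which is why measuring a part can stand in for measuring the whole.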

**Avoiding 4 Common Biases**

When dealing with populations and samples, it’s crucial to avoid biases that can skew your results:

**1. Selection Bias**

**Cause**: Not selecting a sample that’s representative of the entire population.

**Solution**: Use random sampling techniques to ensure every member of the population has an equal chance of being selected.
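The effect of an unrepresentative sample can be sketched with a small simulation. Everything here is hypothetical: the heights are generated from a normal distribution, and the "convenience" sample simply takes the first 50 entries of a sorted list to mimic a non-random selection.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical population: 1,000 heights (cm), sorted so the shortest come first.
population = sorted(rng.gauss(165, 10) for _ in range(1000))

convenience = population[:50]               # first 50 on the list: not representative
random_sample = rng.sample(population, 50)  # every member equally likely to be picked

mu = statistics.mean(population)
err_conv = abs(statistics.mean(convenience) - mu)
err_rand = abs(statistics.mean(random_sample) - mu)

print(f"convenience sample error: {err_conv:.2f} cm")
print(f"random sample error:      {err_rand:.2f} cm")
```

In this setup the convenience sample misses the population mean by a wide margin, while the random sample stays much closer.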

**2. Non-response Bias**

**Cause**: When certain groups are less likely to respond to surveys or studies.

**Solution**: Follow up with non-respondents, or weight the responses to account for the differences.
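One simple way to weight responses is sketched below. It assumes you already know each group's share of the population; the two groups, their shares, and the survey scores are all invented for illustration.

```python
# Hypothetical survey: groups A and B are equally common in the population,
# but group B responded far less often.
population_share = {"A": 0.5, "B": 0.5}
responses = {
    "A": [4, 5, 4, 5, 4, 5, 4, 5],  # 8 respondents
    "B": [2, 1],                    # 2 respondents
}

# Unweighted mean: group A dominates simply because more of its members replied.
all_scores = [score for scores in responses.values() for score in scores]
unweighted_mean = sum(all_scores) / len(all_scores)

# Weighted mean: each group's average counts in proportion to its population share.
group_means = {g: sum(v) / len(v) for g, v in responses.items()}
weighted_mean = sum(population_share[g] * group_means[g] for g in responses)

print(f"unweighted: {unweighted_mean:.2f}, weighted: {weighted_mean:.2f}")
```

Weighting only helps if the people who did respond in a group resemble those who did not, so following up with non-respondents remains valuable.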

**3. Sampling Bias**

**Cause**: When some members of the population are more likely to be included in the sample than others.

**Solution**: Again, random sampling is key. Ensure that every member of the population has an equal chance of being chosen.

**4. Measurement Bias**

**Cause**: When data is consistently mismeasured in the same direction.

**Solution**: Regularly calibrate measurement tools and train data collectors.
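A consistent error in one direction shifts the mean by the same amount, and calibrating against a known reference can remove it. The sketch below assumes a hypothetical tape measure that reads 2 cm too high and a reference object of known length; all of the numbers are made up.

```python
import statistics

# Hypothetical true heights (cm) and a tape measure that reads 2 cm too high.
true_heights = [150.0, 160.0, 170.0]
OFFSET = 2.0
measured = [h + OFFSET for h in true_heights]

# The systematic error shifts the mean by the full offset.
bias = statistics.mean(measured) - statistics.mean(true_heights)

# Calibration: measure a reference object of known length to estimate the offset.
reference_true = 100.0
reference_measured = reference_true + OFFSET
estimated_offset = reference_measured - reference_true

# Subtract the estimated offset from every measurement.
corrected = [m - estimated_offset for m in measured]

print(f"bias before calibration: {bias:.1f} cm")
print(f"corrected heights: {corrected}")
```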