How a Box Plot Works
A Box Plot (or Box-and-Whisker Plot) is used to summarize the distribution of numerical data and detect outliers.
Components of a Box Plot
| Component | Meaning |
|---|---|
| Minimum | Smallest value (excluding outliers) |
| Q1 (First Quartile) | 25% of the data lies below this value |
| Median (Q2) | Middle value (50th percentile) |
| Q3 (Third Quartile) | 75% of the data lies below this value |
| Maximum | Largest value (excluding outliers) |
| Outliers | Unusually high or low values shown as separate points |
Example
Dataset:
10, 12, 15, 18, 20, 22, 25, 28, 30, 60
| Statistic | Value |
|---|---|
| Minimum | 10 |
| Q1 | 15 |
| Median | 21 |
| Q3 | 28 |
| Maximum | 30 |
| Outlier | 60 |
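The values in this table can be reproduced with a few lines of Python. This is a minimal sketch, not the only way to do it: it assumes the "median of each half" convention for quartiles and the common 1.5 × IQR rule for outliers, and the function name `box_plot_stats` is made up for illustration. Libraries that interpolate between data points may report slightly different quartiles for the same data (for example 15.75 and 27.25 with linear interpolation).

```python
from statistics import median

data = [10, 12, 15, 18, 20, 22, 25, 28, 30, 60]

def box_plot_stats(values):
    """Five-number summary plus outliers (hypothetical helper).

    Assumes at least two values, median-of-halves quartiles,
    and the 1.5 * IQR rule for outliers.
    """
    xs = sorted(values)
    n = len(xs)
    half = n // 2
    lower = xs[:half]            # values below the median position
    upper = xs[half + n % 2:]    # values above it (middle value skipped when n is odd)

    q1, q2, q3 = median(lower), median(xs), median(upper)
    iqr = q3 - q1                # interquartile range
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr

    inside = [x for x in xs if low_fence <= x <= high_fence]
    outliers = [x for x in xs if x < low_fence or x > high_fence]
    return {
        "min": inside[0],        # smallest non-outlier value
        "q1": q1,
        "median": q2,
        "q3": q3,
        "max": inside[-1],       # largest non-outlier value
        "outliers": outliers,
    }

stats = box_plot_stats(data)
print(stats)
```

Here IQR = 28 - 15 = 13, so the outlier limits are 15 - 19.5 = -4.5 and 28 + 19.5 = 47.5. The value 60 lies above 47.5, which is why it is reported as an outlier and 30 becomes the maximum.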
How it works
- Sort the data.
- Find the median.
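- Find Q1 and Q3, then compute the interquartile range (IQR = Q3 - Q1).
- Draw a box from Q1 to Q3.
- Draw a line inside the box for the median.
- Flag values below Q1 - 1.5 × IQR or above Q3 + 1.5 × IQR as outliers (the most common convention).
- Extend “whiskers” to the minimum and maximum non-outlier values.
- Plot any outliers as separate points.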
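To make the drawing steps concrete, here is a rough text-only sketch that places each part of the plot on a single line of characters. It is only a stand-in for a real plotting library: the summary values are copied from the example table, and the function name `render_box_plot`, the `width` parameter, and the characters used for each part are all assumptions made for this illustration.

```python
# Summary values copied from the example table above.
stats = {"min": 10, "q1": 15, "median": 21, "q3": 28, "max": 30, "outliers": [60]}

def render_box_plot(s, width=61):
    """Return a one-line text box plot scaled to `width` characters.

    Hypothetical helper; assumes the data spans more than one distinct value.
    """
    lo = min([s["min"]] + s["outliers"])
    hi = max([s["max"]] + s["outliers"])

    def col(value):
        # Map a data value to a column index between 0 and width - 1.
        return round((value - lo) / (hi - lo) * (width - 1))

    row = [" "] * width
    for c in range(col(s["min"]), col(s["max"]) + 1):
        row[c] = "-"                 # whisker line from minimum to maximum
    for c in range(col(s["q1"]), col(s["q3"]) + 1):
        row[c] = "="                 # box from Q1 to Q3, drawn over the whisker line
    row[col(s["q1"])] = "["          # left edge of the box (Q1)
    row[col(s["q3"])] = "]"          # right edge of the box (Q3)
    row[col(s["median"])] = "|"      # median line inside the box
    for x in s["outliers"]:
        row[col(x)] = "o"            # each outlier as a separate point
    return "".join(row)

line = render_box_plot(stats)
print(line)
```

The box sits toward the left of the line and the single `o` sits far to the right, which is how a box plot makes an outlier such as 60 stand out at a glance.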
Uses:
- Detecting outliers
- Comparing distributions
- Understanding spread and central tendency
How a Heatmap Works
A Heatmap displays the values of a matrix as colors. Each value is mapped to a position on a color scale, so larger values appear at one end of the scale (for example, darker or more saturated) and smaller values at the other end.
Example: Correlation Matrix
| Variable | Age | Salary | Experience |
|---|---|---|---|
| Age | 1.00 | 0.72 | 0.81 |
| Salary | 0.72 | 1.00 | 0.89 |
| Experience | 0.81 | 0.89 | 1.00 |
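In a heatmap that uses a light-to-dark color scale:
- Dark/strong color → High value or strong correlation
- Light color → Low value or weak correlation
The direction depends on the color scale chosen, so always check the color bar (legend) before reading the plot.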
How it works
- Arrange data into a matrix (rows and columns).
- Each cell contains a numerical value.
- The plotting library (Seaborn, in this case) maps each value to a color using a color scale (colormap).
- Similar values appear with similar colors, making patterns easy to spot.
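The value-to-color step can be sketched in plain Python. This is not how Seaborn is implemented internally; it is a simplified illustration that assumes a linear scale between two colors. The names `LIGHT`, `DARK`, and `value_to_hex`, and the chosen colors, are hypothetical. The matrix is the correlation example from above.

```python
labels = ["Age", "Salary", "Experience"]
matrix = [
    [1.00, 0.72, 0.81],
    [0.72, 1.00, 0.89],
    [0.81, 0.89, 1.00],
]

LIGHT = (255, 255, 255)   # color for the low end of the scale (assumed: white)
DARK = (0, 0, 139)        # color for the high end of the scale (assumed: dark blue)

def value_to_hex(value, vmin=0.0, vmax=1.0):
    """Linearly interpolate between LIGHT and DARK and return a hex color."""
    t = (value - vmin) / (vmax - vmin)   # position of the value on the scale, 0 to 1
    t = max(0.0, min(1.0, t))            # clamp values that fall outside the scale
    rgb = [round(a + (b - a) * t) for a, b in zip(LIGHT, DARK)]
    return "#{:02x}{:02x}{:02x}".format(*rgb)

# One color per cell, in the same row/column layout as the matrix.
colors = [[value_to_hex(v) for v in row] for row in matrix]

for label, row in zip(labels, colors):
    print(f"{label:<10}", " ".join(row))
```

Equal values always receive the same color, so the symmetric cells of the correlation matrix match, and the diagonal (1.00) gets the darkest color. For data that can be negative, such as correlations between -1 and 1, a wider range (for example `vmin=-1.0`) and a two-sided color scale would be a more natural choice.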
Real-life uses
| Field | Example |
|---|---|
| Data Science | Feature correlation analysis |
| Finance | Stock correlation matrix |
| Education | Student performance analysis |
| Healthcare | Disease occurrence by region |
| Business | Sales performance across products and months |
Difference between a Box Plot and a Heatmap
| Box Plot | Heatmap |
|---|---|
| Shows the distribution of a numerical variable (optionally split by group) | Shows values in a matrix using colors |
| Highlights outliers | Reveals patterns and correlations |
| Uses quartiles and median | Uses color intensity |
| Best for comparing distributions | Best for comparing many variables simultaneously |