Data Visualization Plot Types Guide

The document provides a comprehensive reference table for various data visualization plots, detailing their types, appropriate use cases, and required data types. It includes descriptions of single variable plots like dot plots and histograms, as well as two-variable and multivariate plots such as scatter plots and parallel coordinate plots. Each plot type is accompanied by its pros, cons, and examples of usage to aid in selecting the appropriate visualization method for different data scenarios.

Uploaded by

iron pump


Data Visualization Plots - Reference Table

Plot Type | When to Use | Datatype Needed
--- | --- | ---
Dot Plot | Display individual data points for small datasets; show distribution and exact values | Numerical (continuous or discrete)
Jitter Plot | Show distribution of data points with added random noise to avoid overplotting | Numerical (continuous or discrete)
Error Bar Plot | Display mean/median values with uncertainty or variability (standard deviation, confidence intervals) | Numerical (continuous) + error values
Box Plot | Show distribution summary (median, quartiles, outliers); compare distributions across categories | Numerical (continuous)
Histogram | Show frequency distribution of continuous data; understand data shape and spread | Numerical (continuous)
Pie Plot | Show proportions of categorical data as parts of a whole | Categorical + numerical counts/percentages
Scatter Plot | Show relationship between two continuous variables; identify correlations and patterns | Two numerical (continuous) variables
Bar Plot | Compare values across categories; show means or counts for different groups | Categorical (x-axis) + Numerical (y-axis)
Log-Log Plot | Visualize relationships spanning multiple orders of magnitude; identify power-law relationships | Two numerical (positive values only)
Line Plot | Show trends over time or ordered categories; display continuous relationships | Numerical (both axes), often time-series
Parallel Coordinate Plot | Compare multiple numerical variables across observations; identify patterns in high-dimensional data | Multiple numerical variables + optional categorical
Pair Plot | Visualize pairwise relationships between all numerical variables; show distributions and correlations | Multiple numerical variables + optional categorical
Stacked Plot | Show cumulative totals over time or categories; display part-to-whole relationships | Time/categorical (x-axis) + multiple numerical series
Heatmap | Display matrix data with color intensity; show correlations or values across two categorical dimensions | Matrix of numerical values or 2D categorical grid
Violin Plot | Combine box plot with kernel density plot; show full distribution shape across categories | Categorical (x-axis) + Numerical (y-axis)

Single variable plots [ UNIVARIATE PLOTS ]

1. Dot plot
2. Jitter Plot
3. Box and Whisker Plot
4. Histogram

Dot Plot
➔ A Dot Plot is a simple chart that displays individual data points along a single axis
(usually the x-axis).
➔ Each dot represents one observation.
➔ If multiple observations have the same value, the dots are stacked vertically.
➔ It’s one of the simplest and most direct ways to visualize small datasets, showing
both frequency and distribution clearly.

➔ When To Use :
◆ You have a small to medium-sized dataset (e.g., < 100 data points).
◆ You want to see each individual value and how often it occurs.
◆ You need to show the distribution of a single variable (univariate data).
◆ You want to easily identify clusters, gaps, and outliers.

➔ Pros
- Easy to read and interpret — shows exact values.
- Reveals clusters, gaps, and outliers clearly.
- Simple and quick to make for small datasets.

➔ Cons
- Becomes cluttered with large datasets.
- Hard to use for continuous or wide-range data.

Example :
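The stacking rule described above can be sketched as follows. This is a minimal illustration, assuming Matplotlib is installed; the sample values and the helper name `stack_positions` are made up for the example and do not come from the original guide. Each repeated value is drawn one level higher than its previous occurrence, so the height of a column equals the frequency of that value.

```python
from collections import Counter

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

# Hypothetical sample (illustration only)
values = [2, 3, 3, 4, 4, 4, 5, 5, 7]


def stack_positions(values):
    """Return (value, level) pairs; the k-th occurrence of a value gets level k."""
    seen = Counter()
    points = []
    for v in values:
        seen[v] += 1
        points.append((v, seen[v]))
    return points


points = stack_positions(values)
xs, ys = zip(*points)

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(xs, ys, "o", markersize=10)      # one dot per observation
ax.set_ylim(0, max(ys) + 1)
ax.set_yticks(range(1, max(ys) + 1))     # integer counts only
ax.set_xlabel("Value")
ax.set_ylabel("Count")
ax.set_title("Dot Plot")
fig.savefig("dot_plot.png")              # use plt.show() instead in an interactive session
```

In this sample the value 4 occurs three times, so it forms the tallest column, and the gap at 6 and the isolated point at 7 are visible at a glance.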
Jitter Plot
- A Jitter Plot is a variation of the Dot Plot where random noise (jitter) is added to one
axis (usually the y-axis) to prevent overlapping of data points.
- It helps visualize individual data points when many values are identical, which would
otherwise stack up and overlap in a regular dot plot.

- In short: “A jitter plot = dot plot + tiny random displacement.”

- Use a jitter plot when:

- You have many identical or repeated data points that overlap.
- You want to see all individual observations instead of overlapping dots.
- You are comparing categorical groups with continuous values.

- Pros
- Avoids overplotting by separating overlapping points
- Shows true density and spread of data
- Easy to see outliers even in overlapping data
- Good for categorical vs numeric comparisons

- Cons
- The random noise (jitter) can slightly distort exact values

Example :
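A minimal sketch of the "dot plot + tiny random displacement" idea, assuming NumPy and Matplotlib are available. The data, the seed, and the jitter width of 0.1 are arbitrary choices for illustration. The values sit on the x-axis and a small uniform random offset is added on the y-axis so that identical values no longer overlap.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

rng = np.random.default_rng(seed=0)  # fixed seed so the jitter is reproducible

# Hypothetical data with many repeated values (illustration only)
values = np.array([1, 1, 1, 2, 2, 2, 2, 3, 3, 5])

# Jitter: a small uniform displacement on the axis that carries no data
jitter_width = 0.1
y_jitter = rng.uniform(-jitter_width, jitter_width, size=values.size)

fig, ax = plt.subplots(figsize=(6, 2.5))
ax.scatter(values, y_jitter, alpha=0.6)
ax.set_ylim(-0.5, 0.5)
ax.set_yticks([])                    # vertical position is noise, so hide its ticks
ax.set_xlabel("Value")
ax.set_title("Jitter Plot")
fig.savefig("jitter_plot.png")       # use plt.show() instead in an interactive session
```

Because the noise is applied to the axis that carries no data, the plotted values stay exact. Jittering the value axis itself is what introduces the slight distortion listed under Cons.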
Error Bar Plot
An error bar plot is a type of graph that displays data with an indication of the uncertainty
or variability associated with each measurement or estimate.

Basic Structure
- Data points are plotted as symbols (dots, squares, etc.) showing the mean or central
value
- Error bars extend vertically and/or horizontally from each point, showing the range
of uncertainty
- The bars typically consist of a line with caps (small perpendicular lines) at each end

What Error Bars Represent

Error bars can show different types of variability:
- Standard Deviation (SD) - Shows the spread of individual data points around the
mean
- Standard Error (SE or SEM) - Shows the precision of the mean estimate
- Confidence Intervals (CI) - Shows the range where the true population mean likely
falls (e.g., 95% CI)
- Range - Shows minimum and maximum values
- Interquartile Range (IQR) - Shows the middle 50% of the data
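The measures listed above can be computed with the Python standard library, as in the rough sketch below. The sample values are invented for illustration, and the 95% interval uses the normal critical value 1.96, which is an assumption; for a sample this small a t critical value would give a wider interval.

```python
import math
import statistics

# Hypothetical measurements (illustration only)
sample = [18.2, 19.5, 20.1, 20.4, 21.0, 21.3, 22.5]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)      # sample standard deviation (n - 1 denominator)
se = sd / math.sqrt(n)             # standard error of the mean

# Approximate 95% confidence interval (normal approximation, assumed here)
ci_low = mean - 1.96 * se
ci_high = mean + 1.96 * se

value_range = (min(sample), max(sample))
q1, _, q3 = statistics.quantiles(sample, n=4)  # quartile cut points
iqr = q3 - q1

print(f"mean={mean:.2f}  SD={sd:.2f}  SE={se:.2f}")
print(f"95% CI=({ci_low:.2f}, {ci_high:.2f})  range={value_range}  IQR={iqr:.2f}")
```

The standard error is always smaller than the standard deviation for n > 1, which is one reason the same data can look very different depending on which measure the bars show.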

How to Read Them

- The center point represents the average or mean value
- The length of the error bar indicates the amount of uncertainty
- Longer bars = more variability or less precision
- Shorter bars = less variability or more precision
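- Overlapping error bars between groups hint that the difference may not be
statistically significant, but the conclusion depends on what the bars represent
(SD, SE, or CI) and is not a substitute for a formal test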

Common Uses
- Scientific research - displaying experimental results with measurement uncertainty
- Clinical trials - comparing treatment effects across groups
- Business analytics - showing forecasts with confidence ranges
- Quality control - monitoring process variation
- Survey data - presenting poll results with margins of error

Example
Imagine testing three different fertilizers on plant growth:
- Fertilizer A: mean height = 20 cm ± 2 cm (error bar from 18-22 cm)
- Fertilizer B: mean height = 25 cm ± 1 cm (error bar from 24-26 cm)
- Fertilizer C: mean height = 22 cm ± 4 cm (error bar from 18-26 cm)
The plot would show Fertilizer B has the tallest plants with the most consistent results,
while Fertilizer C shows the most variability.
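The fertilizer example can be drawn with Matplotlib's `errorbar` function, as in the sketch below. The means and the ± values are taken from the example above; the example does not say whether ± is an SD, SE, or CI, so the axis label leaves that open.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt

labels = ["Fertilizer A", "Fertilizer B", "Fertilizer C"]
means = [20, 25, 22]   # mean plant height in cm
errors = [2, 1, 4]     # half-length of each error bar in cm

# End points of each bar, matching the ranges quoted in the text
lower = [m - e for m, e in zip(means, errors)]
upper = [m + e for m, e in zip(means, errors)]

positions = list(range(len(labels)))

fig, ax = plt.subplots(figsize=(6, 4))
ax.errorbar(positions, means, yerr=errors, fmt="o", capsize=5)
ax.set_xticks(positions)
ax.set_xticklabels(labels)
ax.set_ylabel("Mean height (cm), bars = ± error")
ax.set_title("Plant Height by Fertilizer")
fig.savefig("fertilizer_error_bars.png")  # use plt.show() instead in an interactive session
```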

Important Considerations
- Always label what the error bars represent (SD, SE, CI, etc.) - different measures
have different interpretations
- Error bars are crucial for understanding whether observed differences are
meaningful or just due to random variation
- They describe precision (reproducibility of the measurement); they do not by
themselves show accuracy (closeness to the true value)
Box Plot
A box plot (also called a box-and-whisker plot) is a graphical representation of the
distribution of a dataset that shows its central tendency, spread, and outliers in a simple
way. It is commonly used in statistics to visualize the summary of numerical data.

Components of a Box Plot

1. Minimum (Lower Whisker): The smallest data point excluding outliers.
2. First Quartile (Q1, Lower Edge of Box): The 25th percentile. 25% of data lies below
this value.
3. Median (Q2, Middle Line of Box): The 50th percentile. Divides the data into two
halves.
4. Third Quartile (Q3, Upper Edge of Box): The 75th percentile. 75% of data lies below
this value.
5. Maximum (Upper Whisker): The largest data point excluding outliers.
6. Interquartile Range (IQR): IQR = Q3 - Q1. Measures the middle 50% spread.
7. Outliers: Points that lie below Q1 - 1.5*IQR or above Q3 + 1.5*IQR.
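The components above can be computed directly, as in this small sketch assuming NumPy. The dataset is hypothetical, and `np.percentile` uses linear interpolation by default; other quartile conventions give slightly different Q1 and Q3 values.

```python
import numpy as np

# Hypothetical dataset with one extreme value (illustration only)
data = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])

q1, median, q3 = np.percentile(data, [25, 50, 75])
iqr = q3 - q1

# Outlier fences from the 1.5 * IQR rule
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

is_outlier = (data < lower_fence) | (data > upper_fence)
outliers = data[is_outlier]
inliers = data[~is_outlier]

# Whiskers end at the most extreme points that are not outliers
whisker_low, whisker_high = inliers.min(), inliers.max()

print(f"Q1={q1}, median={median}, Q3={q3}, IQR={iqr}")
print(f"fences=({lower_fence}, {upper_fence}), outliers={outliers.tolist()}")
print(f"whiskers=({whisker_low}, {whisker_high})")
```

Here Q1 = 3.25, the median is 5.5, Q3 = 7.75, and IQR = 4.5, so the fences are -3.5 and 14.5. The value 30 lies above the upper fence and is drawn as an outlier, while the whiskers stop at 1 and 9.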

When to Use
- To see the spread of data.
- To identify skewness (if median is closer to Q1 or Q3).
- To detect outliers.
- To compare distributions across different groups.

Pros
- Simple visual summary of data.
- Highlights median, quartiles, and outliers clearly.
- Good for comparing multiple datasets side by side.

Cons
- Does not show exact data distribution (like how data clusters within quartiles).
- Can hide multimodal distributions.

Histogram
- A histogram is a type of bar chart that groups numeric data into bins (intervals) and
displays how many data points fall into each bin.
- It’s commonly used to visualize the frequency distribution (or probability
distribution) of continuous or discrete numerical data.

Structure of a Histogram
- X-axis (horizontal): Represents the range of values divided into bins (intervals).
- Y-axis (vertical): Represents the frequency (count) or density (probability) of data
points within each bin.
- Bars: Each bar’s height corresponds to how many data points fall within that bin.
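The binning step can be sketched with `np.histogram`, which returns the bar heights without drawing anything. The data and the bin edges below are made up for illustration.

```python
import numpy as np

# Hypothetical data (illustration only)
data = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
bin_edges = [0, 2, 4, 6, 8, 10]   # five bins of width 2

# Frequency: number of points per bin (each bin includes its left edge;
# only the last bin also includes its right edge)
counts, edges = np.histogram(data, bins=bin_edges)

# Density: counts rescaled so that the bar areas sum to 1
density, _ = np.histogram(data, bins=bin_edges, density=True)
widths = np.diff(edges)

print(counts.tolist())   # [1, 5, 3, 0, 1]
print(float(np.sum(density * widths)))
```

The empty bin between 6 and 8 shows up as a gap in the histogram, which separates the point at 9 from the rest of the data.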

Scatter Plot
A scatter plot is a type of graph used to visualize the relationship between two numerical
variables. Each point on the plot represents a pair of values (x, y) from your dataset.

Key Features:
● X-axis: Represents one variable.
● Y-axis: Represents the other variable.
● Points: Each point shows a single observation in the dataset.

Purpose:
1. Show correlation: You can quickly see if two variables are related.
○ Positive correlation → points trend upward.
○ Negative correlation → points trend downward.
○ No correlation → points are scattered randomly.
2. Detect patterns or clusters: Groups of points might indicate natural clusters.
3. Identify outliers: Points that are far from others are easy to spot.

import matplotlib.pyplot as plt

# Heights (cm) and weights (kg)
heights = [150, 160, 165, 155, 170, 175, 180, 160, 165, 170]
weights = [50, 55, 60, 58, 65, 70, 75, 60, 62, 68]

# Category of each observation: 'M' (male) or 'F' (female)
genders = ['F', 'M', 'M', 'F', 'M', 'M', 'M', 'F', 'F', 'M']

# Assign a marker color to each category
colors = ['blue' if gender == 'M' else 'pink' for gender in genders]

plt.scatter(heights, weights, c=colors, s=100, edgecolor='black')

plt.xlabel("Height (cm)")
plt.ylabel("Weight (kg)")
plt.title("Scatter Plot: Heights vs Weights by Gender")
plt.show()
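The visual impression of correlation can be backed by a number. The sketch below, assuming NumPy, computes the Pearson correlation coefficient for the same heights and weights used in the scatter plot example; values near +1 correspond to points trending upward, values near -1 to points trending downward, and values near 0 to no linear trend.

```python
import numpy as np

# Same data as the scatter plot example
heights = [150, 160, 165, 155, 170, 175, 180, 160, 165, 170]
weights = [50, 55, 60, 58, 65, 70, 75, 60, 62, 68]

# np.corrcoef returns the 2x2 correlation matrix; the off-diagonal entry is r
r = np.corrcoef(heights, weights)[0, 1]

print(f"Pearson r = {r:.2f}")
```

For this dataset r is strongly positive, which matches the upward trend of the points.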
Log-Log Plot
A log-log plot is a graph where both the x-axis and y-axis use logarithmic scales
instead of linear scales. It is widely used in science, engineering, and data analysis.

Why Use Log-Log Plots?
1. Visualizing Wide-Ranging Data
When your data spans several orders of magnitude (e.g., from 1 to 1,000,000), a linear plot
becomes unusable. Log scales compress large values and expand small ones, making all data
visible.
2. Identifying Power Laws
This is the most important application. If two variables follow a power law relationship:

y = a × x^b
Taking logarithms of both sides:
log(y) = log(a) + b × log(x)
This is a linear equation! On a log-log plot, power laws appear as straight lines, where the slope
equals the exponent b.
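The derivation above suggests a simple way to estimate a and b from data: fit a straight line to log(y) against log(x). The sketch below assumes NumPy and uses synthetic, noise-free data generated from y = 3 × x^2, so the chosen values a = 3 and b = 2 are illustrative only.

```python
import numpy as np

# Synthetic power-law data y = a * x^b (illustration only)
a_true, b_true = 3.0, 2.0
x = np.logspace(0, 3, 20)          # 20 points from 1 to 1000
y = a_true * x**b_true

# Straight-line fit in log space: log10(y) = log10(a) + b * log10(x)
slope, intercept = np.polyfit(np.log10(x), np.log10(y), deg=1)

b_est = slope                      # the slope is the exponent b
a_est = 10**intercept              # the intercept is log10(a)

print(f"estimated b = {b_est:.3f}, estimated a = {a_est:.3f}")
```

With real, noisy measurements the fit only approximates the exponent, and a straight line on the log-log plot over a narrow range is weak evidence for a power law.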

How to Read a Log-Log Plot

Scale Interpretation:
Each tick mark represents a power of 10 (or another base)
Equal distances on the axis represent equal ratios, not differences
Moving from 1 to 10 covers the same distance as 10 to 100

Slope Interpretation:

The slope of a line tells you the power law exponent:

Slope = 1: Linear relationship (y ∝ x)
Slope = 2: Quadratic relationship (y ∝ x²)
Slope = -1: Inverse relationship (y ∝ 1/x)
Slope = 0.5: Square root relationship (y ∝ √x)

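import numpy as np
import matplotlib.pyplot as plt

# X values (strictly positive: zero cannot be shown on a log scale)
x = np.linspace(1, 10, 100)

# Power laws y = x^b with different exponents b
y1 = x**1     # Slope = 1 → linear
y2 = x**2     # Slope = 2 → quadratic
y3 = x**-1    # Slope = -1 → inverse
y4 = x**0.5   # Slope = 0.5 → square root

# Plot on log-log axes: each power law appears as a straight line
plt.figure(figsize=(8, 6))
plt.loglog(x, y1, label='Slope = 1 (y ∝ x)')
plt.loglog(x, y2, label='Slope = 2 (y ∝ x²)')
plt.loglog(x, y3, label='Slope = -1 (y ∝ 1/x)')
plt.loglog(x, y4, label='Slope = 0.5 (y ∝ √x)')
plt.xlabel('x (log scale)')
plt.ylabel('y (log scale)')
plt.legend()
plt.grid(True, which='both', alpha=0.3)
plt.show()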

Stacked Plot
A stacked plot (often stacked area plot or stacked bar plot) is a way to visualize multiple
variables over a common axis (usually time or categories), where the values are “stacked”
on top of each other.
- Shows the total of all variables at each point
- Helps to see both individual contributions and overall trend.
Types of Stacked Plots
- Stacked Area Plot – continuous data (like time series)
- Stacked Bar Plot – categorical data

Example

import matplotlib.pyplot as plt
import pandas as pd

# Sample data
data = {
    'Month': ['Jan', 'Feb', 'Mar', 'Apr', 'May'],
    'Product_A': [3, 4, 5, 2, 6],
    'Product_B': [2, 3, 4, 3, 2],
    'Product_C': [1, 2, 1, 4, 3]
}

df = pd.DataFrame(data)
df.set_index('Month', inplace=True)

# Plot stacked area plot
df.plot(kind='area', stacked=True, alpha=0.7)
plt.title("Stacked Area Plot of Products")
plt.ylabel("Sales")
plt.show()

# Plot stacked bar plot
df.plot(kind='bar', stacked=True, alpha=0.7)
plt.title("Stacked Bar Plot of Products")
plt.ylabel("Sales")
plt.show()

Parallel Coordinate Plot

A parallel coordinate plot is a visualization technique for displaying multivariate data.

Instead of plotting variables on perpendicular axes (like in scatter plots), it uses multiple
parallel vertical (or horizontal) axes, one for each variable. Each data point is represented
as a polyline that connects its values across all axes.
When to Use Parallel Coordinate Plots

Best for:

● Visualizing multivariate data (4+ variables)

● Detecting patterns, trends, and correlations across multiple dimensions
● Identifying clusters or groups in high-dimensional data
● Spotting outliers across multiple variables
● Comparing observations across many variables simultaneously

Use cases:

● Comparing performance metrics across multiple dimensions

● Analyzing product specifications with multiple features
● Quality control with multiple measured parameters
● Exploring relationships in high-dimensional datasets

Avoid when:

● You have only 2-3 variables (use scatter plots instead)

● You have too many data points (lines become overlapping and messy)
● Variables are on vastly different scales (though normalization can help)
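The normalization mentioned in the last point can be sketched with pandas as a min-max rescaling of every column to the range 0 to 1. The column names and values below are hypothetical and chosen only to show variables on very different scales.

```python
import pandas as pd

# Hypothetical product data on very different scales (illustration only)
df = pd.DataFrame({
    "price_usd": [12000, 25000, 18000, 40000],
    "weight_kg": [950, 1400, 1200, 1800],
    "rating": [3.5, 4.2, 3.9, 4.8],
})

# Min-max normalization: each column is rescaled independently to [0, 1]
normalized = (df - df.min()) / (df.max() - df.min())

print(normalized.round(2))
```

After rescaling, every axis of the parallel coordinate plot spans the same range, so no single variable dominates the picture. A column in which all values are equal would produce a division by zero (NaN) and needs separate handling.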

Key Characteristics

● Each vertical line = one variable/dimension

● Each colored line connecting across axes = one observation/data point
● Lines that are parallel between two axes = positive correlation
● Lines that cross between two axes = negative correlation
● Bundled lines = similar observations

import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import parallel_coordinates

data = {
    'Student': ['Alice', 'Bob', 'Charlie', 'Diana', 'Eve'],
    'Math': [85, 72, 90, 68, 95],
    'Science': [88, 75, 85, 70, 92],
    'English': [90, 80, 75, 85, 88],
    'History': [82, 85, 80, 90, 86]
}

df = pd.DataFrame(data)

# Create a categorical variable for coloring (binned by Math score)
df['Performance'] = pd.cut(df['Math'], bins=[0, 75, 85, 100],
                           labels=['Low', 'Medium', 'High'])

plt.figure(figsize=(10, 6))
parallel_coordinates(df, 'Performance',
                     cols=['Math', 'Science', 'English', 'History'],
                     color=['red', 'orange', 'green'])
plt.title('Student Performance - Parallel Coordinates')
plt.xlabel('Subject')
plt.ylabel('Score')
plt.legend(loc='best')
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

Pair Plot (Scatter Plot Matrix)

Definition
A pair plot (also called a scatter plot matrix or SPLOM) is a grid of plots that shows:

● Scatter plots for every pair of numerical variables (off-diagonal)

● Distribution plots (histograms or KDE) for each individual variable (diagonal)

It displays all pairwise relationships in a dataset simultaneously, making it a powerful tool
for exploring multivariate data.

When to Use Pair Plots

Best for:

● Initial exploratory data analysis (EDA) of multivariate datasets

● Identifying correlations and relationships between variables
● Detecting clusters or groupings in data
● Spotting outliers across multiple dimensions
● Understanding distributions of individual variables
● Comparing patterns across different categories/classes

Use cases:

● Machine learning: feature analysis before modeling

● Statistical analysis: checking assumptions and relationships
● Scientific data: exploring experimental results
● Business analytics: customer segmentation analysis

Avoid when:

● You have too many variables (>6-7 becomes cluttered)

● You have categorical variables only (use other plots)
● You need precise measurements (pair plots show patterns, not exact values)
● You have extremely large datasets (too slow to render)

Key Characteristics
● Grid layout: n×n grid for n variables
● Diagonal: Shows distribution of each single variable
● Off-diagonal: Shows scatter plots between pairs of variables
● Symmetry: Upper and lower triangles are mirror images
● Color coding: Often colored by a categorical variable to reveal clusters
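The n×n grid layout can be seen in a small sketch. Note that this uses `pandas.plotting.scatter_matrix` rather than Seaborn's pair plot, as a lighter-weight stand-in that builds the same kind of grid; the three columns are synthetic random data generated only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas.plotting import scatter_matrix

rng = np.random.default_rng(seed=0)
n = 50

# Synthetic data: b depends on a, c is independent (illustration only)
a = rng.normal(size=n)
df = pd.DataFrame({
    "a": a,
    "b": 2 * a + rng.normal(scale=0.5, size=n),
    "c": rng.normal(size=n),
})

# Off-diagonal cells are scatter plots, diagonal cells are histograms
axes = scatter_matrix(df, diagonal="hist", figsize=(6, 6), alpha=0.7)

plt.suptitle("Scatter Plot Matrix")
plt.savefig("pair_plot.png")   # use plt.show() instead in an interactive session

print(axes.shape)  # one row and one column of subplots per variable
```

With three variables the function returns a 3×3 array of subplots. The a-b cells show a clear upward trend, while the cells involving c show no pattern.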

SYNTAX
sns.pairplot(
    data,
    vars=['var1', 'var2', 'var3'],  # Which variables to plot
    hue='category',                 # Color by category
    diag_kind='hist',               # 'auto', 'hist', 'kde', or None
    kind='scatter',                 # 'scatter', 'kde', 'hist', or 'reg'
    palette='Set1',                 # Color palette
    markers=['o', 's', 'D'],        # Different markers per group
    height=2.5,                     # Height of each subplot (inches)
    aspect=1                        # Width/height ratio
)

IRIS DATASET

Heatmap
sns.heatmap(
    data,
    annot=True,        # Show values in cells
    fmt='.2f',         # Number format
    cmap='coolwarm',   # Color scheme
    vmin=0, vmax=100,  # Value range
    center=50,         # Center point for diverging colors
    linewidths=1,      # Cell border width
    square=True        # Make cells square-shaped
)
Visualizing multiple distributions at the same time
- Violin Plot

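Plot Type, When to Use, Detailed Syntax, Datatype Needed
The snippets in this section assume import matplotlib.pyplot as plt, import seaborn as sns, and a pandas DataFrame named df.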

Dot Plot
When to use: Display individual data points for small datasets; show distribution and exact values
Datatype needed: Numerical (continuous or discrete)
Matplotlib:
plt.plot(x_values, y_values, 'o', markersize=8, color='blue')
plt.xlabel('X Label')
plt.ylabel('Y Label')
plt.title('Dot Plot')
plt.show()
Seaborn:
sns.stripplot(data=df, y='column_name', color='blue', size=8)
plt.title('Dot Plot')
plt.show()

Jitter Plot
When to use: Show distribution of data points with added random noise to avoid overplotting
Datatype needed: Numerical (continuous or discrete)
Seaborn:
sns.stripplot(data=df, x='category', y='value', jitter=True, alpha=0.6, size=5)
plt.title('Jitter Plot')
plt.xlabel('Category')
plt.ylabel('Value')
plt.show()
Alternative:
sns.swarmplot(data=df, x='category', y='value')
plt.show()

Error Bar Plot
When to use: Display mean/median values with uncertainty or variability (standard deviation, confidence intervals)
Datatype needed: Numerical (continuous) + error values
Matplotlib:
x = [1, 2, 3, 4, 5]
y = [10, 15, 13, 18, 20]
errors = [1, 2, 1.5, 2, 1.8]
plt.errorbar(x, y, yerr=errors, fmt='o-', capsize=5, capthick=2, ecolor='red', markersize=8)
plt.xlabel('X Label')
plt.ylabel('Y Label')
plt.title('Error Bar Plot')
plt.grid(True, alpha=0.3)
plt.show()
With horizontal errors:
plt.errorbar(x, y, xerr=x_errors, yerr=y_errors, fmt='s', capsize=5)
Box Plot
When to use: Show distribution summary (median, quartiles, outliers); compare distributions across categories
Datatype needed: Numerical (continuous)
Matplotlib:
data = [list1, list2, list3]
plt.boxplot(data, labels=['Group1', 'Group2', 'Group3'], patch_artist=True, notch=True)
plt.ylabel('Values')
plt.title('Box Plot')
plt.grid(True, alpha=0.3)
plt.show()
Seaborn:
sns.boxplot(data=df, x='category', y='value', palette='Set2', width=0.5)
plt.title('Box Plot')
plt.show()
Single variable:
sns.boxplot(y=df['column'])

Histogram
When to use: Show frequency distribution of continuous data; understand data shape and spread
Datatype needed: Numerical (continuous)
Matplotlib:
plt.hist(data, bins=30, edgecolor='black', color='skyblue', alpha=0.7)
plt.xlabel('Value')
plt.ylabel('Frequency')
plt.title('Histogram')
plt.grid(True, alpha=0.3)
plt.show()
Seaborn:
sns.histplot(data=df, x='column', bins=30, kde=True, color='blue')
plt.title('Histogram with KDE')
plt.show()
With custom bins:
plt.hist(data, bins=[0, 10, 20, 30, 40, 50], density=True)
Pie Plot
When to use: Show proportions of categorical data as parts of a whole
Datatype needed: Categorical + numerical counts/percentages
Matplotlib:
labels = ['Category A', 'Category B', 'Category C', 'Category D']
sizes = [30, 25, 20, 25]
colors = ['gold', 'lightcoral', 'lightskyblue', 'lightgreen']
explode = (0.1, 0, 0, 0)
plt.pie(sizes, labels=labels, colors=colors, autopct='%1.1f%%', startangle=90, explode=explode, shadow=True)
plt.title('Pie Chart')
plt.axis('equal')
plt.show()
From DataFrame:
df['category'].value_counts().plot.pie(autopct='%1.1f%%', figsize=(8, 8))
Bivariate Plots
Plot Type When to Use Detailed Syntax Datatype Needed

Scatter Plot
When to use: Show relationship between two continuous variables; identify correlations and patterns
Datatype needed: Two numerical (continuous) variables
Matplotlib:
plt.scatter(x, y, s=50, c='blue', alpha=0.6, edgecolors='black', linewidth=0.5)
plt.xlabel('X Variable')
plt.ylabel('Y Variable')
plt.title('Scatter Plot')
plt.grid(True, alpha=0.3)
plt.show()
Seaborn:
sns.scatterplot(data=df, x='col1', y='col2', hue='category', size='size_col', palette='viridis', alpha=0.7)
plt.title('Scatter Plot')
plt.legend(bbox_to_anchor=(1.05, 1), loc='upper left')
plt.show()
With regression line:
sns.regplot(data=df, x='col1', y='col2', scatter_kws={'alpha': 0.5})

Bar Plot
When to use: Compare values across categories; show means or counts for different groups
Datatype needed: Categorical (x-axis) + Numerical (y-axis)
Matplotlib:
categories = ['A', 'B', 'C', 'D']
values = [25, 40, 30, 55]
plt.bar(categories, values, color='steelblue', edgecolor='black', width=0.6)
plt.xlabel('Categories')
plt.ylabel('Values')
plt.title('Bar Plot')
plt.grid(axis='y', alpha=0.3)
plt.show()
Seaborn:
sns.barplot(data=df, x='category', y='value', hue='subcategory', palette='Set2', ci='sd', capsize=0.1)
plt.title('Bar Plot')
plt.legend(title='Subcategory')
plt.show()
Horizontal:
plt.barh(categories, values)
Log-Log Plot
When to use: Visualize relationships spanning multiple orders of magnitude; identify power-law relationships
Datatype needed: Two numerical variables (positive values only)
Matplotlib:
plt.loglog(x, y, 'o-', linewidth=2, markersize=6, color='blue')
plt.xlabel('X (log scale)')
plt.ylabel('Y (log scale)')
plt.title('Log-Log Plot')
plt.grid(True, which='both', alpha=0.3)
plt.show()
Alternative method:
plt.plot(x, y, 'o-')
plt.xscale('log')
plt.yscale('log')
plt.xlabel('X (log scale)')
plt.ylabel('Y (log scale)')
plt.grid(True, which='both', ls='--', alpha=0.5)
plt.show()

Line Plot
When to use: Show trends over time or ordered categories; display continuous relationships
Datatype needed: Numerical (both axes), often time-series data
Matplotlib:
plt.plot(x, y, color='blue', linewidth=2, linestyle='-', marker='o', markersize=6, label='Series 1')
plt.xlabel('Time/X Variable')
plt.ylabel('Y Variable')
plt.title('Line Plot')
plt.legend(loc='best')
plt.grid(True, alpha=0.3)
plt.show()
Seaborn:
sns.lineplot(data=df, x='time', y='value', hue='category', style='category', markers=True, dashes=False)
plt.title('Line Plot')
plt.show()
Multiple lines:
plt.plot(x, y1, label='Line 1')
plt.plot(x, y2, label='Line 2')
plt.legend()
Multivariate Plots
Plot Type When to Use Detailed Syntax Datatype Needed

Parallel Coordinate Plot
When to use: Compare multiple numerical variables across observations; identify patterns in high-dimensional data
Datatype needed: Multiple numerical variables + optional categorical class variable
Pandas:
from pandas.plotting import parallel_coordinates
parallel_coordinates(df, 'class_column', color=['blue', 'red', 'green'], alpha=0.5)
plt.title('Parallel Coordinates Plot')
plt.xlabel('Variables')
plt.ylabel('Values')
plt.legend(loc='best')
plt.grid(True, alpha=0.3)
plt.show()
Plotly (interactive):
import plotly.express as px
fig = px.parallel_coordinates(df, color='class_column', dimensions=['var1', 'var2', 'var3', 'var4'], color_continuous_scale=px.colors.diverging.Tealrose)
fig.show()

Pair Plot
When to use: Visualize pairwise relationships between all numerical variables; show distributions and correlations
Datatype needed: Multiple numerical variables + optional categorical variable for coloring
Seaborn:
sns.pairplot(data=df, hue='category', palette='husl', diag_kind='kde', plot_kws={'alpha': 0.6, 's': 50, 'edgecolor': 'k'}, height=2.5)
plt.suptitle('Pair Plot', y=1.02)
plt.show()
Specific columns:
sns.pairplot(data=df, vars=['col1', 'col2', 'col3'], hue='category', markers=['o', 's', 'D'], diag_kind='hist')
plt.show()
With regression:
sns.pairplot(df, kind='reg', diag_kind='kde')
Stacked Plot
When to use: Show cumulative totals over time or categories; display part-to-whole relationships
Datatype needed: Time/categorical (x-axis) + multiple numerical series
Matplotlib:
x = [1, 2, 3, 4, 5]
y1 = [1, 2, 3, 4, 5]
y2 = [1, 1, 2, 2, 3]
y3 = [2, 2, 2, 3, 3]
plt.stackplot(x, y1, y2, y3, labels=['Series 1', 'Series 2', 'Series 3'], colors=['#1f77b4', '#ff7f0e', '#2ca02c'], alpha=0.8)
plt.xlabel('X Variable')
plt.ylabel('Cumulative Value')
plt.title('Stacked Area Plot')
plt.legend(loc='upper left')
plt.grid(True, alpha=0.3)
plt.show()
Pandas:
df.plot.area(stacked=True, alpha=0.7, figsize=(10, 6))
plt.title('Stacked Area Chart')
plt.show()

Heatmap
When to use: Display matrix data with color intensity; show correlations or values across two categorical dimensions
Datatype needed: Matrix of numerical values or 2D grid with categorical dimensions
Seaborn:
correlation_matrix = df.corr()
sns.heatmap(correlation_matrix, annot=True, fmt='.2f', cmap='coolwarm', center=0, square=True, linewidths=1, cbar_kws={'shrink': 0.8})
plt.title('Correlation Heatmap')
plt.tight_layout()
plt.show()
Custom data:
sns.heatmap(data=pivot_table, annot=True, cmap='YlGnBu', linecolor='white', linewidths=0.5)
plt.xlabel('X Categories')
plt.ylabel('Y Categories')
plt.title('Heatmap')
plt.show()
With clustering:
sns.clustermap(df.corr(), cmap='viridis', annot=True)
Violin Plot
When to use: Combine box plot with kernel density plot; show full distribution shape across categories
Datatype needed: Categorical (x-axis) + Numerical (y-axis), optional secondary categorical for hue
Seaborn:
sns.violinplot(data=df, x='category', y='value', hue='subcategory', split=False, palette='muted', inner='quartile', scale='width')
plt.title('Violin Plot')
plt.xlabel('Category')
plt.ylabel('Value')
plt.legend(title='Subcategory', bbox_to_anchor=(1.05, 1))
plt.show()
With split violins:
sns.violinplot(data=df, x='category', y='value', hue='binary_var', split=True, palette='Set2')
plt.show()
Horizontal:
sns.violinplot(data=df, x='value', y='category', orient='h')

No ratings yet
Six Bar Crank Mechanism in Forging
6 pages
R23 B.Tech Course Structure Overview
No ratings yet
R23 B.Tech Course Structure Overview
41 pages
Conditional Statements in Java Explained
No ratings yet
Conditional Statements in Java Explained
43 pages
Tally Charts and Picture Graphs Unit
No ratings yet
Tally Charts and Picture Graphs Unit
11 pages
Research Methodology Essentials Guide
100% (1)
Research Methodology Essentials Guide
28 pages
16-bit Digital Adders Performance Study
No ratings yet
16-bit Digital Adders Performance Study
6 pages
Biostatistical Analysis 5th Edition Jerrold H. Zar Ebook Improved PDF Quality
100% (1)
Biostatistical Analysis 5th Edition Jerrold H. Zar Ebook Improved PDF Quality
41 pages
Data Cleaning for FMCG Ad Analysis
No ratings yet
Data Cleaning for FMCG Ad Analysis
2 pages
Normal Distribution and Hypothesis Testing
No ratings yet
Normal Distribution and Hypothesis Testing
29 pages
Invariance of Maxwell's Equations Under LT
No ratings yet
Invariance of Maxwell's Equations Under LT
5 pages
Assume X, Y, Z, W and P Are Matrices of Order 2 X N, 3 X K, 2 X P (MCQ
No ratings yet
Assume X, Y, Z, W and P Are Matrices of Order 2 X N, 3 X K, 2 X P (MCQ
1 page
Understanding Correlation in Statistics
No ratings yet
Understanding Correlation in Statistics
31 pages
AMC 10 Test Instructions and Guidelines
No ratings yet
AMC 10 Test Instructions and Guidelines
4 pages
Understanding N-Gram Models in NLP
No ratings yet
Understanding N-Gram Models in NLP
6 pages
Improved Analytical Fit of Gold Dispersion: Application To The Modeling of Extinction Spectra With A Finite-Difference Time-Domain Method
No ratings yet
Improved Analytical Fit of Gold Dispersion: Application To The Modeling of Extinction Spectra With A Finite-Difference Time-Domain Method
7 pages
Pre-Test Mathematics Tingkatan 5 SEEP
No ratings yet
Pre-Test Mathematics Tingkatan 5 SEEP
14 pages
Cointegration and Error Correction Models
No ratings yet
Cointegration and Error Correction Models
25 pages
Arithmetic and Geometric Series Guide
No ratings yet
Arithmetic and Geometric Series Guide
7 pages

Data Visualization Plot Types Guide

Uploaded by

Data Visualization Plot Types Guide

Uploaded by

Data Visualization Plots - Reference Table

Plot Type When to Use Datatype Needed

Box Plot Show distribution summary (median, quartiles, Numerical (continuous)

Histogram Show frequency distribution of continuous Numerical (continuous)

Pie Plot Show proportions of categorical data as parts Categorical + numerical

Parallel Compare multiple numerical variables across Multiple numerical variables +

Stacked Plot Show cumulative totals over time or Time/categorical (x-axis) +

Single variable plots [ UNIVARIATE PLOTS ]

- In short: “A jitter plot = dot plot + tiny random displacement.”

- Use a jitter plot when:
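The "dot plot + tiny random displacement" idea can be illustrated with a minimal sketch. The function names, the `width` parameter, and the sample values are assumptions made for this example and do not come from the original guide. The displacement is applied only along the axis that carries no information, so the data values themselves are unchanged.

```python
import random

def jitter_positions(values, width=0.1, seed=0):
    """Return one (x, y) pair per value, with x nudged randomly around 0."""
    rng = random.Random(seed)  # fixed seed so the layout is repeatable
    return [(rng.uniform(-width, width), v) for v in values]

def plot_jitter(values, width=0.1, seed=0):
    """Draw the jittered points; matplotlib is imported only when plotting."""
    import matplotlib.pyplot as plt
    points = jitter_positions(values, width, seed)
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    plt.scatter(xs, ys, alpha=0.6)
    plt.xlim(-0.5, 0.5)
    plt.xticks([])  # the x-axis only spreads points apart, it has no scale
    plt.ylabel("Value")
    plt.show()

# Hypothetical sample with repeated values that would overlap in a plain dot plot
sample = [5, 5, 5, 6, 6, 7, 7, 7, 7, 8]
points = jitter_positions(sample)
```

Calling `plot_jitter(sample)` would draw the result. A plotting library's built-in jitter option may be more convenient in practice; this version only shows what the displacement does.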

What Error Bars Represent

How to Read Them

Components of a Box Plot

import matplotlib.pyplot as plt

import matplotlib.pyplot as plt

# Heights and weights

# Categories: 'Male' or 'Female'

# Assign colors to categories

plt.scatter(heights, weights, c=colors, s=100, edgecolor='black')
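The fragments above appear to outline a scatter plot colored by category. A complete version might look like the following sketch. The data values, the color choices, and the `plot_scatter` wrapper are hypothetical, since the original data is not visible here.

```python
# Hypothetical heights (cm) and weights (kg); not taken from the original guide
heights = [150, 160, 165, 170, 175, 180]
weights = [50, 55, 62, 68, 72, 80]

# Categories: 'Male' or 'Female'
categories = ['Female', 'Female', 'Male', 'Female', 'Male', 'Male']

# Assign colors to categories (the specific colors are an arbitrary choice)
color_map = {'Male': 'blue', 'Female': 'red'}
colors = [color_map[c] for c in categories]

def plot_scatter():
    """Draw the colored scatter plot; matplotlib is imported only here."""
    import matplotlib.pyplot as plt
    plt.scatter(heights, weights, c=colors, s=100, edgecolor='black')
    plt.xlabel("Height")
    plt.ylabel("Weight")
    plt.show()
```

Passing a list of color names through `c=` gives one color per point, which is why `colors` must have the same length as `heights` and `weights`.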

How to Read a Log-Log Plot

The slope of a line tells you the power law exponent:
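To make the slope rule concrete: if y = a·x^k, then log y = log a + k·log x, so the data fall on a straight line of slope k in log-log space. The sketch below recovers k with an ordinary least-squares fit on the logged values. The function name and the sample data are assumptions for illustration only.

```python
import math

def loglog_slope(xs, ys):
    """Least-squares slope of log10(y) against log10(x)."""
    lx = [math.log10(x) for x in xs]
    ly = [math.log10(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    num = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    den = sum((a - mean_x) ** 2 for a in lx)
    return num / den

# Hypothetical data following y = 3 * x**2 (x values avoid zero for the log scale)
xs = [1, 2, 4, 8, 16, 32]
ys = [3 * x ** 2 for x in xs]
slope = loglog_slope(xs, ys)  # close to 2, the exponent used to build ys
```

The constant factor 3 only shifts the line up or down. It changes the intercept, not the slope.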

# X values (avoid zero for log scale)

plt.xlabel("X (log scale)")

import matplotlib.pyplot as plt

Parallel Coordinate Plot

A parallel coordinate plot is a visualization technique for displaying multivariate data.

● Visualizing multivariate data (4+ variables)

● Comparing performance metrics across multiple dimensions

● You have only 2-3 variables (use scatter plots instead)

● Each vertical line = one variable/dimension
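The "one vertical line per variable" layout can be sketched by turning every record into a polyline that crosses each axis at that record's value. Scaling each column to the range 0 to 1 is an assumption added here so that variables with different units share one vertical scale. The function names, column names, and data are hypothetical.

```python
def to_polylines(rows, columns):
    """Scale each column to [0, 1] and return one polyline per row.

    Each polyline is a list of (axis_index, scaled_value) points, where
    axis_index selects the vertical axis for that variable.
    """
    lows = {c: min(r[c] for r in rows) for c in columns}
    highs = {c: max(r[c] for r in rows) for c in columns}
    lines = []
    for r in rows:
        line = []
        for i, c in enumerate(columns):
            span = highs[c] - lows[c]
            # A constant column has no spread, so place it mid-axis
            scaled = (r[c] - lows[c]) / span if span else 0.5
            line.append((i, scaled))
        lines.append(line)
    return lines

def plot_parallel(rows, columns, label_key):
    """Draw the polylines, one color per category; matplotlib imported lazily."""
    import matplotlib.pyplot as plt
    palette = {}
    for r, line in zip(rows, to_polylines(rows, columns)):
        color = palette.setdefault(r[label_key], "C%d" % len(palette))
        plt.plot([x for x, _ in line], [y for _, y in line], color=color)
    plt.xticks(range(len(columns)), columns)
    plt.ylabel("Scaled value (0 = column min, 1 = column max)")
    plt.show()

# Hypothetical records with four numerical variables and one category
rows = [
    {"speed": 120, "cost": 20, "range": 400, "weight": 1200, "kind": "A"},
    {"speed": 180, "cost": 50, "range": 300, "weight": 1500, "kind": "B"},
    {"speed": 150, "cost": 35, "range": 500, "weight": 1350, "kind": "A"},
]
columns = ["speed", "cost", "range", "weight"]
lines = to_polylines(rows, columns)
```

`plot_parallel(rows, columns, "kind")` would draw the figure. pandas also provides `pandas.plotting.parallel_coordinates`, which draws the same kind of chart directly from a DataFrame without rescaling the columns.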

# Create categorical variable for coloring

Pair Plot (Scatter Plot Matrix)

● Scatter plots for every pair of numerical variables (off-diagonal)

It displays all pairwise relationships in a dataset simultaneously, making it a powerful tool

When to Use Pair Plots

● Initial exploratory data analysis (EDA) of multivariate datasets

● Machine learning: feature analysis before modeling

● You have too many variables (>6-7 becomes cluttered)
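The grid structure of a pair plot, and why it gets cluttered as variables are added, can be seen in a small sketch. The function names and column names are hypothetical. Drawing a histogram on the diagonal is an assumption, since the diagonal's content is not visible in this excerpt.

```python
def pair_grid_layout(columns):
    """Map each (row, col) panel to what a pair plot draws there."""
    layout = {}
    for i, y_name in enumerate(columns):
        for j, x_name in enumerate(columns):
            if i == j:
                # Diagonal: distribution of a single variable (assumed histogram)
                layout[(i, j)] = ("hist", x_name)
            else:
                # Off-diagonal: scatter plot of one variable against another
                layout[(i, j)] = ("scatter", x_name, y_name)
    return layout

def plot_pairs(data, columns):
    """data maps column name -> list of numbers; matplotlib imported lazily."""
    import matplotlib.pyplot as plt
    n = len(columns)
    fig, axes = plt.subplots(n, n, figsize=(2.5 * n, 2.5 * n), squeeze=False)
    for (i, j), spec in pair_grid_layout(columns).items():
        ax = axes[i][j]
        if spec[0] == "hist":
            ax.hist(data[spec[1]])
        else:
            ax.scatter(data[spec[1]], data[spec[2]], s=10)
        if i == n - 1:
            ax.set_xlabel(columns[j])
        if j == 0:
            ax.set_ylabel(columns[i])
    plt.tight_layout()
    plt.show()

columns = ["height", "weight", "age"]
layout = pair_grid_layout(columns)
```

With n variables the grid has n × n panels, of which n × (n − 1) are scatter plots. Three variables give 9 panels, while seven give 49, which is consistent with the advice to avoid pair plots beyond roughly 6-7 variables.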

Plot Type When to Use Detailed Syntax Datatype Needed

Dot Display individual data points Matplotlib: Numerical

Jitter Show distribution of data Seaborn: Numerical

Error Display mean/median values Matplotlib: Numerical

Histogram Show frequency distribution Matplotlib: Numerical

Scatter Show relationship Matplotlib: Two numerical

Bar Plot Compare values Matplotlib: Categorical

Line Plot Show trends over Matplotlib: Numerical (both

Parallel Compare Pandas: Multiple

Pair Plot Visualize Seaborn: Multiple

Heatmap Display Seaborn: Matrix of
