Lesson 2
1. Indicator:
Use when you want to display the volume of the data you have.
Use when comparing data across more than one time period.
Avoid when you need to compare multiple categories or examine specific data values.
3. Pivot Table:
A pivot table is a data processing tool used in spreadsheet programs or business intelligence
software.
It allows users to summarize and analyze data interactively by reorganizing and
summarizing selected columns and rows.
Pivot tables can also be enhanced with conditional formatting to provide color scales that
make performance trends more visible.
Data bars can also be added to cells, colored red or green to distinguish negative values
from positive ones (typically green for positive and red for negative).
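The summarizing step of a pivot table can be sketched in a few lines of plain Python. This is a minimal illustration only: the sales records, the field order (region, quarter, amount), and the function name pivot_sum are all hypothetical, and real spreadsheet or BI tools offer many more aggregation options.

```python
from collections import defaultdict

# Hypothetical records: (region, quarter, amount). Not taken from any real data set.
records = [
    ("North", "Q1", 100),
    ("North", "Q2", 150),
    ("South", "Q1", 80),
    ("South", "Q2", 120),
    ("North", "Q1", 50),
]

def pivot_sum(rows):
    """Summarize rows into {row_key: {column_key: total}}."""
    table = defaultdict(lambda: defaultdict(int))
    for row_key, col_key, value in rows:
        # Rows become regions, columns become quarters, cells hold the sum.
        table[row_key][col_key] += value
    # Convert the nested defaultdicts to plain dicts for easier display.
    return {row: dict(cols) for row, cols in table.items()}

pivot = pivot_sum(records)
for region, cols in pivot.items():
    print(region, cols)
```

Swapping the order of the first two fields in each record would pivot the same data the other way, with quarters as rows and regions as columns.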
5. Scatter Maps:
o Similar to scatter charts, but data points are plotted on a map, using geographic
coordinates.
o Each point represents a location on the map.
Fig 5.1 Scatter Plots
6. Tree Maps:
A tree map is a hierarchical visualization that displays nested rectangles to represent
hierarchical data structures.
Each branch of the tree is given a colored rectangle, and the size of the rectangle
corresponds to a quantitative value.
The size and the color of each rectangle can each represent a different variable.
Tree maps are best used when analyzing data that has a hierarchical structure.
1. Space-Filling Methods:
In space-filling methods, visual elements occupy the entire display area, and the
visualization effectively fills the space available.
i. Treemaps:
Treemaps represent hierarchical data as nested rectangles.
Each level of the hierarchy is depicted by a different color or shading, and the size of
each rectangle corresponds to a quantitative value, such as the number of observations
or a numerical attribute.
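How rectangle size can follow a quantitative value is easiest to see with a single level of the hierarchy. The sketch below uses a simple "slice" layout, which is only one of several treemap layout strategies and is chosen here as an assumption for brevity; the function name slice_layout and the sample values are hypothetical.

```python
def slice_layout(values, x, y, width, height, vertical=True):
    """Split one rectangle into strips whose areas are proportional to values.

    Returns a list of (x, y, width, height) tuples, one per value.
    """
    total = sum(values)
    rects = []
    offset = 0.0
    for v in values:
        if vertical:
            # Side-by-side strips: each strip takes a share of the width.
            w = width * v / total
            rects.append((x + offset, y, w, height))
            offset += w
        else:
            # Stacked strips: each strip takes a share of the height.
            h = height * v / total
            rects.append((x, y + offset, width, h))
            offset += h
    return rects

# Three sibling nodes with values 50, 30, and 20 inside a 100 x 60 canvas.
rects = slice_layout([50, 30, 20], 0, 0, 100, 60)
for r in rects:
    print(r)
```

For nested levels, the same function could be applied again inside each returned rectangle, usually alternating the vertical flag so that child rectangles are split in the other direction.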
v. Hexagonal Binning:
Hexagonal binning is a method used in scatter plots where the 2D space is divided
into hexagons, and the color or intensity of each hexagon represents the density of
data points within that region.
This helps in visualizing the distribution of points in a dense scatter plot.
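The binning step can be sketched as follows. This is a minimal illustration under stated assumptions: hexagon centers are treated as two interleaved rectangular grids and each point is assigned to the nearest center, which is one common way to implement hexagonal binning. The function name hex_bin, the size parameter (horizontal distance between neighboring centers), and the sample points are hypothetical.

```python
import math
from collections import Counter

def hex_bin(points, size=1.0):
    """Count points per hexagon; keys are the hexagon centers (cx, cy)."""
    row_height = size * math.sqrt(3)  # vertical spacing within one grid
    counts = Counter()
    for x, y in points:
        # Work in grid units so that centers fall on simple coordinates.
        u, v = x / size, y / row_height
        # Nearest center on the first grid (integer coordinates).
        ax, ay = round(u), round(v)
        # Nearest center on the second grid (shifted by half a cell).
        bx, by = math.floor(u) + 0.5, math.floor(v) + 0.5
        # Squared distances; the factor 3 undoes the sqrt(3) scaling of v.
        da = (u - ax) ** 2 + 3 * (v - ay) ** 2
        db = (u - bx) ** 2 + 3 * (v - by) ** 2
        cx, cy = (ax, ay) if da <= db else (bx, by)
        counts[(cx * size, cy * row_height)] += 1
    return counts

counts = hex_bin([(0.1, 0.1), (-0.1, 0.05), (0.5, 0.9)], size=1.0)
for center, n in counts.items():
    print(center, n)
```

In a plot, each count would then be mapped to a color or intensity for its hexagon.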
viii. Cartograms:
Cartograms distort the geographic space to represent statistical information.
The size or shape of regions is altered based on a particular variable, allowing for
the visualization of spatial patterns.
2. Non-Space-Filling Methods:
In non-space-filling methods, visual elements may not necessarily fill the entire display
space, and the focus is often on the arrangement and relationships between individual
elements.
i. Scatter Plots:
Scatter plots are simple yet powerful visualizations where individual data points are
plotted on a two-dimensional plane.
Each point's position represents the values of two variables, and additional dimensions
can be encoded using color, size, or shape of the markers.
ii. Line Charts:
Line charts connect data points with lines, making them useful for visualizing trends or
patterns over a continuous variable (e.g., time).
They are effective for showing the relationship between two variables.
iv. Histograms:
Histograms display the distribution of a single variable by dividing the data into
bins and representing the frequency of observations in each bin.
They are useful for understanding the shape of the data distribution.
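The binning behind a histogram can be sketched in plain Python. The function name histogram, the equal-width bins, and the sample values are assumptions made for illustration; plotting libraries offer other binning rules.

```python
def histogram(values, bins, lo, hi):
    """Count how many values fall into each of `bins` equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # The last bin is closed on the right so that v == hi is counted.
            index = min(int((v - lo) / width), bins - 1)
            counts[index] += 1
    return counts

counts = histogram([1, 2, 2, 3, 5, 7, 8, 9, 10], bins=5, lo=0, hi=10)

# A rough text rendering: one '#' per observation in each bin.
for i, c in enumerate(counts):
    print(f"[{i * 2:>2}, {i * 2 + 2:>2}) {'#' * c}")
```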
Network Analysis:
Centrality Measures: Identify the most central nodes in a network, such as degree
centrality, betweenness centrality, and closeness centrality.
Community Detection: Identify groups of nodes with strong internal connections using
algorithms like modularity-based methods or spectral clustering.
Path Analysis: Understand the shortest paths or routes between nodes.
Fig: Graph
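Two of the analyses above, degree centrality and shortest paths, can be sketched without any graph library. The small undirected graph, the node names, and the function names are hypothetical; the shortest path is found with breadth-first search, which assumes unweighted edges.

```python
from collections import deque

# Hypothetical undirected graph stored as {node: [neighbors]}.
graph = {
    "A": ["B", "C"],
    "B": ["A", "C", "D"],
    "C": ["A", "B"],
    "D": ["B", "E"],
    "E": ["D"],
}

def degree_centrality(adj):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adj)
    return {node: len(neighbors) / (n - 1) for node, neighbors in adj.items()}

def shortest_path(adj, start, goal):
    """Breadth-first search; returns a list of nodes, or None if unreachable."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the recorded predecessors to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbor in adj[node]:
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None

centrality = degree_centrality(graph)
path = shortest_path(graph, "A", "E")
print(centrality)
print(path)
```

Here node B has the highest degree centrality because it touches three of the four other nodes.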
2. Matrix Representation for Graphs:
The matrix representation is one way to represent the adjacency structure of a graph using a
matrix, typically referred to as an adjacency matrix.
For an undirected graph, the matrix is symmetric; for a directed graph, it may not be
symmetric.
Adjacency Matrix:
o In an adjacency matrix, rows and columns represent nodes in the graph.
o The presence or absence of an edge between two nodes is indicated by a 1 or 0 in
the corresponding matrix cell.
o Weighted graphs can have numerical values in the matrix cells to represent edge
weights.
Adjacency List:
o In an adjacency list, each node has a list of its neighbors.
o It's often implemented using a dictionary or an array of linked lists in programming.
o Suitable for sparse graphs (graphs with relatively few edges).
Edge List:
o Another representation is the edge list, where each row represents an edge in the
graph.
o Each row contains the nodes involved in the edge, and possibly a weight.
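The three representations describe the same graph and can be converted into one another. The sketch below starts from a hypothetical edge list for an undirected, unweighted graph with nodes numbered 0 to 3; the function names are illustrative only.

```python
# Hypothetical undirected graph as an edge list: each row is one edge (u, v).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
num_nodes = 4

def to_adjacency_matrix(edge_list, n):
    """Build an n x n matrix with 1 where an edge exists and 0 elsewhere."""
    matrix = [[0] * n for _ in range(n)]
    for u, v in edge_list:
        matrix[u][v] = 1
        matrix[v][u] = 1  # mirror the entry because the graph is undirected
    return matrix

def to_adjacency_list(edge_list, n):
    """Build {node: [neighbors]} using a dictionary of lists."""
    adjacency = {node: [] for node in range(n)}
    for u, v in edge_list:
        adjacency[u].append(v)
        adjacency[v].append(u)
    return adjacency

matrix = to_adjacency_matrix(edges, num_nodes)
adjacency = to_adjacency_list(edges, num_nodes)

for row in matrix:
    print(row)
print(adjacency)
```

The matrix always stores n x n cells, while the adjacency list stores only the edges that exist, which is why the list form suits sparse graphs. For a directed graph, the mirrored assignment would be dropped; for a weighted graph, the 1 would be replaced by the edge weight.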
3. Infographics:
Infographics, short for information graphics, are visual representations of information, data,
or knowledge designed to present complex information quickly and clearly.
Infographics combine text, images, and graphics to convey information in a visually
engaging and digestible format.
They are widely used in data visualization to make data-driven insights more accessible to a
broader audience.
1. Mapping:
In data visualization, the term "mapping" refers to the representation of data points or
locations on a map.
It involves the use of graphical elements to convey spatial information in a visual format.
Mapping is a powerful technique in data visualization as it allows users to understand
patterns, relationships, and trends in geographic data.
Here are some key concepts related to mapping in data visualization:
Geospatial Data: Mapping typically involves the use of geospatial data, which includes
information related to the Earth's surface such as latitude, longitude, and elevation. Geospatial
data can represent physical features, locations of events, or any other information with a spatial
context.
Coordinate Systems: Maps use coordinate systems to represent locations on the Earth's surface.
The most common coordinate system is the latitude and longitude system, but other systems like
UTM (Universal Transverse Mercator) may also be used.
Markers and Symbols: Data points on a map are often represented using markers or symbols.
These markers can vary in size, color, or shape to convey additional information about the data,
such as the magnitude of a variable or the category of a location.
Choropleth Maps: Choropleth maps use color variations to represent spatial variations in a
particular variable. Different shades or colors are used to indicate different levels or categories of
the variable across regions.
Heatmaps: Heatmaps visualize the density or intensity of data points in a particular area.
Hotspots with a higher concentration of data are represented with warmer colors, while cooler
colors indicate lower density.
Interactive Maps: With advancements in technology, interactive maps have become popular.
Users can interact with the map, zoom in, pan, and click on specific data points to access
additional information.
GIS (Geographic Information System): GIS is a powerful tool that integrates mapping and
spatial analysis capabilities. It allows users to overlay different layers of information and perform
complex spatial analyses.
Fig: Mapping
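For a choropleth map, each region's value has to be turned into a level before a shade can be assigned. The sketch below uses equal-interval classes, which is only one of several possible classification schemes and is an assumption here; the region names, values, and function name are hypothetical.

```python
def equal_interval_classes(values_by_region, n_classes):
    """Assign each region a class index from 0 (lowest) to n_classes - 1 (highest)."""
    lo = min(values_by_region.values())
    hi = max(values_by_region.values())
    width = (hi - lo) / n_classes
    classes = {}
    for region, value in values_by_region.items():
        if width == 0:
            # All regions share the same value, so they share one class.
            classes[region] = 0
        else:
            # The top class is closed on the right so the maximum is included.
            classes[region] = min(int((value - lo) / width), n_classes - 1)
    return classes

# Hypothetical values for five regions.
values = {"A": 10, "B": 25, "C": 40, "D": 55, "E": 70}
classes = equal_interval_classes(values, 3)

# Each class index would then be mapped to a shade, light for low and dark for high.
shades = ["light", "medium", "dark"]
for region, index in classes.items():
    print(region, shades[index])
```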
2. Time Series:
Time series refers to a sequence of data points collected or recorded over a period
of time. Time series data is characterized by the temporal ordering of observations,
where each observation is associated with a specific timestamp.
Analyzing and visualizing time series data is essential for understanding patterns,
trends, and changes over time.
Here are some key aspects of time series in data visualization:
Temporal Axis: Time series data is plotted on a temporal axis, typically along the x-axis.
The x-axis represents time, and the y-axis represents the variable being measured. This
allows viewers to observe how the variable changes over different time intervals.
Line Charts: Line charts are commonly used to visualize time series data. In a line chart,
data points are connected by straight lines, providing a continuous view of how the
variable evolves over time. Line charts are effective for displaying trends, cycles, and
fluctuations.
Time Series Plots: A time series plot is a specific type of graph that displays data points in
the order in which they occur. It helps in understanding the behavior of a variable over
time and is particularly useful for identifying seasonality, trends, and outliers.
Seasonal Patterns: Time series data often exhibits seasonal patterns, where similar
fluctuations repeat at regular intervals. Seasonal decomposition techniques, which split a
series into trend, seasonal, and residual components (e.g., using moving averages), can help
visualize and analyze these patterns.
Bar Charts and Histograms: Bar charts can be used to represent time series data when
discrete observations are recorded at specific time points, with each bar representing the
value of the variable at a particular time. Histograms, by contrast, show how often the
values in the series fall into each range, without regard to their order in time.
Heatmaps: Heatmaps are useful for visualizing time series data when there are multiple
variables or dimensions involved. Time can be represented on one axis, and different
variables on the other, with color indicating the intensity or magnitude of the values.
Annotations and Events: Annotations on time series charts can highlight significant
events or milestones, aiding in the interpretation of the data. Events like product launches,
policy changes, or external factors impacting the data can be visually marked.
Forecasting and Prediction: Time series visualizations are crucial for forecasting and
predicting future trends. Techniques like trend lines, moving averages, or advanced time
series models can be employed to make predictions.
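The moving-average idea mentioned under seasonal patterns and forecasting can be sketched as follows. This is a minimal illustration, assuming an odd window length equal to the seasonal period; the series values and the function name are hypothetical, and full decomposition methods involve further steps.

```python
def centered_moving_average(series, window):
    """Average each point with its neighbors; window must be odd.

    Positions too close to either end get None because the window does not fit.
    """
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        if i < half or i >= len(series) - half:
            smoothed.append(None)
        else:
            chunk = series[i - half:i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed

# Hypothetical series: a pattern repeating every 3 steps on top of a rising trend.
series = [10, 12, 14, 13, 15, 17, 16, 18, 20]

trend = centered_moving_average(series, window=3)

# Subtracting the trend leaves the repeating (seasonal) part plus any residual.
detrended = [
    None if t is None else value - t
    for value, t in zip(series, trend)
]

print(trend)
print(detrended)
```

Plotting the original series, the trend, and the detrended values on separate panels is a common way to show the components side by side.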
3.5 Heatmaps:
Purpose: Display the intensity of relationships or correlations.
Representation: A grid of colored cells where colors indicate the strength and direction of
correlations, often used for correlation matrices.
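The grid of values behind such a heatmap can be sketched with the Pearson correlation coefficient. Pearson is assumed here as the correlation measure since the text does not name one; the variable names and data are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return covariance / (spread_x * spread_y)

# Hypothetical variables: b rises with a, c falls as a rises.
data = {
    "a": [1, 2, 3, 4, 5],
    "b": [2, 4, 6, 8, 10],
    "c": [5, 4, 3, 2, 1],
}

names = list(data)
# One cell per pair of variables; this grid is what the heatmap colors.
matrix = [[pearson(data[r], data[c]) for c in names] for r in names]

for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

In the heatmap, values near +1 and -1 would get the strongest colors at opposite ends of the color scale, and values near 0 would appear neutral.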
3.6 Bubble Charts:
Purpose: Extend scatter plots to include a third variable.
Representation: Similar to scatter plots, but the size of each point represents a third
variable, adding another layer of information.
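One practical detail is how the third variable is converted to a marker size. A common design choice, assumed here rather than stated in the text, is to make the bubble's area proportional to the value, which means the radius grows with the square root. The function name, parameters, and sample values are hypothetical.

```python
import math

def bubble_radius(value, max_value, max_radius):
    """Radius such that bubble area is proportional to value."""
    return max_radius * math.sqrt(value / max_value)

# Hypothetical points: (x, y, third_variable).
points = [(1, 2, 25), (2, 3, 50), (3, 1, 100)]
largest = max(z for _, _, z in points)

radii = [bubble_radius(z, largest, max_radius=20) for _, _, z in points]

for (x, y, z), r in zip(points, radii):
    print(f"x={x}, y={y}, value={z}, radius={r:.1f}")
```

Scaling the radius directly by the value would make large values look far bigger than they are, since the visible area would grow with the square of the value.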