Visual Analytics Fundamentals
Lindy Ryan
Figures 3.2-3.15, 4.4-4.8, 5.1, 5.4a-b, 5.5, 5.6, 5.8, 5.10a-b, 5.12a-b, 5.13,
5.15a-b, 5.16-5.19, 5.22, 5.23, 5.25-5.27, 5.30a-b, 5.31, 5.34a-b, 5.35, 5.37-
5.39, 6.1-6.10, 6.12a-6.18a, 6.19, 6.20, 6.21, 7.3, 7.4, 7.5-7.28, 7.30, 7.32-
7.42, 8.1, 8.2, 8.5a-b, 8.7-8.11, 8.14-8.16, 8.18, 8.19, 9.1-9.14, 9.18, 9.19,
9.21, 9.23, 9.26, 9.28-9.33, 9.35-9.46, FIGA-01, FIGA-02: Tableau Software, Inc.
The author and publisher have taken care in the preparation of this book,
but make no expressed or implied warranty of any kind and assume no
responsibility for errors or omissions. No liability is assumed for incidental
or consequential damages in connection with or arising out of the use of the
information or programs contained herein.
For information about buying this title in bulk quantities, or for special sales
opportunities (which may include electronic versions; custom cover
designs; and content particular to your business, training goals, marketing
focus, or branding interests), please contact our corporate sales department
at [email protected] or (800) 382-3419.
ISBN-13: 978-0-13-795682-1
ISBN-10: 0-13-795682-7
Pearson’s Commitment to Diversity, Equity, and Inclusion
Education is a powerful force for equity and change in our world. It has the
potential to deliver opportunities that improve lives and enable economic
mobility. As we work with authors to create content for every product and
service, we acknowledge our responsibility to demonstrate inclusivity and
incorporate diverse scholarship so that everyone can achieve their potential
through learning. As the world’s leading learning company, we have a duty
to help drive change and live up to our purpose to help more people create a
better life for themselves and to create a better world.
While we work hard to present unbiased content, we want to hear from you
about any concerns or needs with this Pearson product so that we can
investigate and address them.
Preface
Acknowledgments
About the Author
1 Welcome to Visual Analytics
2 The Power of Visual Analytics
3 Getting Started with Tableau
4 Keeping Visual Analytics in Context
5 Fundamental Data Visualizations
6 Fundamental Maps
7 Design Tips for Curating Visual Analytics
8 Structuring Analytics for Storytelling: Prep, Dashboards, and Stories
9 Beyond Fundamentals: Advanced Visualizations
10 Closing Thoughts
Appendix A Tableau Services
Index
Contents
Preface
Acknowledgments
About the Author
1 Welcome to Visual Analytics
A Visual Revolution
The Evolution from Data Visualization to Visual Data Storytelling
A Brief Look at the State of the Industry
From Visual to Story: Bridging the Gap
Summary
2 The Power of Visual Analytics
The Science of Storytelling
The Brain on Stories
The Human on Stories
The Power of Stories
The Classic Visualization Example
Using Small Personal Data for Big Stories
The Two-or-Four Season Debate
Napoleon’s March
Stories Outside of the Box
Summary
3 Getting Started with Tableau
Using Tableau
Why Tableau?
The Tableau Product Portfolio
Tableau Desktop
Tableau Server
Tableau Cloud
Tableau Prep
Tableau Public
Tableau Reader and Tableau Viewer
Getting Started
Installing Tableau Desktop
Connecting to Data
Connecting to Tables
Live Versus Extract
Connecting to Multiple Tables with Relationships and Joins
Adding and Replacing Data Sources
Basic Data Prep with Data Interpreter
Navigating the Tableau Interface
Menus and Toolbar
Data Pane
Shelves and Cards
Legends
Understanding Dimensions and Measures
Dimensions
Measures
Continuous and Discrete
Summary
4 Keeping Visual Analytics in Context
Context in Action
Harry Potter: Hero or Menace?
Ensuring Relevant Context
Exploratory Versus Explanatory Analysis
Structuring Visual Analytic Stories
Story Plot
Story Genre
Audience Analysis for Storytelling
Who
What
Why
How
Summary
5 Fundamental Data Visualizations
The Bar Chart
Tableau How-To: Bar Chart
The Line Chart
Tableau How-To: Line Chart
Pie and Doughnut Charts
Tableau How-To: Pie and Doughnut Charts
The Scatter Plot
Tableau How-To: Scatter Plots
The Packed Bubble Chart
Tableau How-To: Packed Bubble Charts
The Tree Map
Tableau How-To: Tree Maps
The Heat Map
Tableau How-To: Heat Maps
Summary
6 Fundamental Maps
Connecting to Geographic Data
Assigning Geographic Roles
Creating Geographic Hierarchies
Proportional Symbol Maps
Choropleth Map
Summary
7 Design Tips for Curating Visual Analytics
Visual Design Building Blocks
Color
Sequential Color
Diverging Color
Categorical Color
Color Effects
Opacity
Mark Borders
Mark Halos
Pre-attentive Colors
Important Color Considerations
The Truth About Red and Green
Lines
Formatting Grid Lines, Zero Lines, and Drop Lines
Formatting Borders
Formatting, Shading, and Banding
Shapes
Shape Marks Card
Custom Shapes
Summary
8 Structuring Analytics for Storytelling: Prep, Dashboards, and Stories
Basic Data Prep in Tableau: Data Interpreter
Data Interpreter in Action
Handling Nulls in Tableau
Pivoting Data from Wide to Tall
A Note on Preparing Survey Data for Visual Analysis
Storyboarding Your Visual Analytics
Understanding Stories in Tableau
The Storyboarding Process
Summary
9 Beyond Fundamentals: Advanced Visualizations
Timelines
Bar-in-Bar Charts
Likert-Scale Visualizations
A 100% Stacked Bar Chart
Divergent Stacked Bar Chart
Lollipop Charts
Labeled Lollipops
Word Clouds
Summary
10 Closing Thoughts
Five Steps to Visual Data Storytelling
Step 1. Find Data That Supports Your Story
Step 2. Layer Information for Understanding
Step 3. Design to Reveal
Step 4. Beware the False Reveal
Step 5. Tell It Fast
The Important Role of Feedback
Ongoing Learning
Teach Yourself: External Resources
Companion Materials to This Text
Appendix A Tableau Services
Index
Preface
For as long as I can remember, I have always been fascinated by the power
of a good story.
Like taste buds, our taste for stories evolves over time, both in terms of
format and content. Our appetite changes alongside age, experiences, and
interests, yet still the desire for a good story persists. We crave stories; it’s
part of our design. Humans are intrinsically hungry for a good story. Stories
entertain us, educate us, and provide mechanisms to transmit knowledge,
information, and experiences. We’re rather indiscriminate about how we
receive stories, too. In fact, according to scientific evidence, we might even
prefer stories that move us and touch our senses.
Today, visual analytics and data storytelling are reshaping how we see,
interpret, and communicate data insights. Students from grade school to
graduate school are working hands-on with data and changing the way they
learn about and communicate about information. Business analysts,
managers, and executives are moving away from static, statistic-laden
reports and toward interactive, visual data dashboards. Journalists and news
editors are using data storyboards and engaging, often interactive,
infographics to share information with society-at-large.
Visual analytics helps us visually explore and uncover insights in our data.
Data visualization provides a way to showcase our findings by harnessing
our brains’ visual processing horsepower. Data storytelling gives us a very
human way to communicate. This approach pushes beyond the boundaries
of simply analyzing information to providing the capacity to communicate
it in ways that leave a meaningful, lasting impact.
Together, these converge into what I’ve termed the “visual imperative,” a
paradigm shift that has radically reshaped how we work with and seek to
understand our data, big and small. This visual imperative is reforming our
expectations of information, changing the question from “what can we do
with our data” to “what can our data do for us.” It’s making its mark on
every aspect of a progressively data-driven culture, too. From traditional
business intelligence and data discovery to the personal analytics on our
smart devices to how creators use data to cook up our new favorite
television shows, we are becoming more data-dependent and data-driven—
and we’re doing it visually.
While visual analytics and data storytelling leverage some of our innate
human communication and knowledge-sharing capabilities, it isn’t always
an intuitive and obvious process. It takes work, it takes understanding, and
it takes a lot of practice. This book is a steppingstone in your journey to
becoming a visual analyst, providing the fundamental knowledge and
hands-on training needed to help you take your first steps as a visual data
storyteller.
This book is for anyone who has data and wants to use it to visually
communicate its insights to someone else in engaging and memorable ways.
This includes, but is not limited to
Don’t let the topic overwhelm you: Whether you have been practicing data
visualization and visual analysis for some time or are just taking your first
steps in visual analytics, you don’t have to be a statistician or a computer
scientist, a graphic designer, or even a well-trained writer to learn how to
navigate the art and science that is data visualization or to become a master
data storyteller. Likewise, you do not need to be a data visualization expert
or come armed with deep technical expertise in visualization software
packages. Although this book utilizes Tableau as the primary mechanism
for data visualization, you need not be a power user or expert prior to
getting started—you don’t even have to have a purchased license on your
machine! You can simply download a free trial of Tableau Desktop to get
started. Finally, you don’t even need to bring your own data to play just yet
(although you certainly can). The sample datasets used in this text, as well
as information on where you can find other free datasets, are available to
you through the resources listed at the end of this book. Tableau also
provides a large selection of sample datasets that you can use for practice,
too.
I realize that the idea of visual analysis or data storytelling might sound
intimidating to many and that learning new software is always a challenge.
Therefore, this book is designed in a way that a professor or a tutor might
teach visual data storytelling (and in fact is the approach I take in my
classrooms for graduate-level students) using what I call the 1-2-3 Method
(see Figure P.2). This breaks down like this: (1) grounding in easy-to-
understand principles, (2) reinforcement through real-world examples, and
(3) guided hands-on work to incrementally develop skills. By the time
you have worked your way through each of the chapters and exercises in
this book, you will walk away with something tangible: competency as a
visual data storyteller using your own data in your own dashboards and
presentations. And you’ll have some great visuals on your Tableau Public
profile that you can add to your resume!
Figure P.2 The 1-2-3 Method
Assumptions
Last, but not least, this book assumes that the reader has access to Tableau
Desktop 2022, which is currently available to install on either Windows or
Mac operating systems. Free trials are available for business users or
general audiences, while students and educators can take advantage of the
Tableau for Teaching program, which offers free licenses to the full desktop
version.
Why Tableau?
If you browse the shelves of your local bookstore, you’ll find a bevy of
wonderful books available that teach data visualization and data storytelling
in a tool-agnostic manner. There is a good reason for this. To borrow the
words of Cole Nussbaumer Knaflic, author of Storytelling with Data, “No
matter how good the tool, it will never know your data or its story like you
do.” With any software, there will always be weak points to balance out the
strong ones. However, my goal in this book is to give you not only the
information you need but also the practical ability to apply it. For that, we need a tool.
Many software packages are available on the market that would serve as
capable platforms to support this book, including Excel, which is still the
most ubiquitous, if unexciting, analysis tool, with the capability to create
functional if problematic basic charts and graphs. However, although many
of the more advanced available technologies meet the rigors of building
beautiful data visualizations, few provide the end-to-end capabilities that
Tableau does. What we’re looking for is a best-of-breed tool that delivers an
approachable, intuitive environment for self-service users of all levels to
prepare, analyze, and visualize data, as well as delivery platforms like
dashboards and story preparation—and one at the top of employers’ wish lists
for incoming visual analysts (we’ll look at this data in a future chapter). All
of these are native to Tableau.
If you’re new to the software, the following are the current technical
specifications needed to install Tableau Desktop:
Windows
Microsoft Windows 8/8.1, Windows 10 (x64)
2 GB memory
1.5 GB minimum free disk space
CPUs must support SSE4.2 and POPCNT instruction sets
Mac
macOS Mojave 10.14, macOS Catalina 10.15, and Big Sur 11.4+
Intel processors
M1 processors under Rosetta 2 emulation mode
1.5 GB minimum free disk space
CPUs must support SSE4.2 and POPCNT instruction sets
Although you are not limited to working through this book cover to cover, it
is recommended that you do so for incremental development of learning
and reinforcement of skills. Each module builds on concepts and skills
discussed in the preceding one and may advance an end-to-end data project
in ways that are necessary before taking the next steps forward.
Note
With a very few exceptions, all visualizations and screenshots in this book
are created using Tableau 2022 for Mac. Differences in operating system
versions are negligible.
Supporting Materials
Beyond the modules of this text, there are several companion materials to
support ongoing skills development and learning in fundamental visual
analytics. These are intended to go beyond the confines of these chapters
and to attempt to keep pace with innovations in Tableau functionality as
well as review some of its more nuanced advanced features that are out of
scope for this book. These resources are suitable for the workplace,
although special attention has been given to classroom use.
The following provides a brief overview of what you can expect in each
chapter.
This chapter introduces the fundamental charts and graphs used to visually
communicate data that are offered on the Tableau Show Me Card. We
discuss appropriate use cases for each and get hands-on to create examples
in Tableau. You will learn techniques to help you assess when to use each
visualization type according to the data, how to generate these according to
best practices, and helpful considerations for when to avoid certain types of
charts. We’ll also explore some of the special features available in Tableau
to help you get the most from your visual.
This chapter dives into human cognition and visual perception to frame how
pre-attentive attributes like size, color, shape, and position affect the
usability and efficacy of visual analytics. We will explore best practices for
how the design elements can be employed to direct an audience’s attention
and create a visual hierarchy of components to communicate effectively.
This chapter moves beyond the basics of visual analytics to take our first
steps in architecting outputs of analysis in visual data dashboards and
stories. We’ll begin by taking a closer look at how to prepare data in
Tableau, utilizing some messy survey data—a common experience for data
storytellers—before building data dashboards and stories that incorporate
features like filters, annotations, and highlights to present compelling,
meaningful, and actionable outcomes of visual analytics.
This final chapter recaps the main lessons covered throughout the text. It
also serves as a resource kit for life beyond the book by providing
checklists of best practices and practical suggestions for continuing to
master outputs of visual analytics and discusses additional resources
available to support you on this journey.
Appendix
Certain figures in the print edition may not be as distinct as they are in the
digital version. To ensure an optimal reading experience, color PDFs of
figures are available at https://www.informit.com/store/visual-analytics-
fundamentals-creating-compelling-data-9780137956821.
Acknowledgments
About the Author
Lindy Ryan is passionate about telling stories with data. She specializes in
translating raw data into insightful stories through carefully curated visuals
and engaging narrative frameworks.
A Visual Revolution
1. https://actu.epfl.ch/news/the-world-s-largest-data-visualization/
2. www.gapminder.org/tag/trendalyzer/
Note
Speaking of Star Trek, check out how our youngest generation of visual
analysts are using the power of data visualization and Tableau to craft
engaging new data stories of their own in this “data kids” blog on Tableau:
www.tableau.com/blog/viz-long-and-prosper-how-one-young-trekkie-
telling-stories-his-data-55767.
However, from the most dynamic to the most static, data visualizations need
more than just data to make the leap from information representation to
resonation. They need a story—something to show or, more aptly, to “tell”
visually—and finding this isn’t always obvious when digging through a
dataset. It takes exploration, curiosity, and a shift in mindset to move from
creating a data visualization to scripting a data narrative. They are similar,
but not identical, skill sets. Both require a strong foundation in data
analytics and statistics, and both require skills development in the processes
of crafting, curating, and sharing data visualizations and stories. And, like
any other skill set, ongoing learning—whether at home, through an
academic program, or via industry or workplace training—will contribute to
continuous enhancement and efficacy in your holistic visual analytics skill
set.
Note
Data visualization is the practice of graphically representing data to help
people see and understand patterns, insights, and other discoveries hidden
inside information. Data storytelling translates seeing into meaning by
weaving a narrative around data to answer questions and support decision
making.
Data visualization and data storytelling are interrelated concepts within the
broader framework of visual analytics, but they are not the same thing. A
true data story utilizes data visualizations as a literary endeavor would use
illustrations—proof points to support the narrative. However, there’s a bit of
a role reversal here: Whereas data visualizations provide the “what” in the
story, the narrative itself answers the “why.” As such, the two work
in tandem to translate raw data into something meaningful for an audience.
So, to be a proper data storyteller you need to know how to both curate
effective data visualizations and frame a storyboard around them. This
starts with learning how to properly and effectively visualize data, and how
to do so in the best way for presentation rather than for purely analytical
purposes. As discussed later in this book, visualizations for analysis versus
presentation are not always the same thing.
Data visualization is a place where science meets art, although the jury is
still out on whether the practice is more of a scientific endeavor or an
artistic one. Although experts agree that a compelling visual requires proper
application of both science and art, in practice it tends to be more of a
chicken-and-egg scenario. We haven’t quite come to a consensus as to
whether science comes before design or we design for the science, and the
decision changes depending on who you ask, who is creating the
visualization, and who the intended audience is. That said, whichever side
of the argument you land on, the result is the same: We need statistical
understanding of the data, its context, and how to measure it; otherwise, we
run the risk of faulty analysis and skewed decision making that eventually
lead to more risk and complications. Likewise, our very visual cognition
system demands a way to encode numbers with meaning, so we rely on
colors and shapes to help automate these processes for us. Improper
application and encoding of visual information can also lead to
faulty analysis, skewed decision making, and risk. An effective visual must
strike the right balance of both to accurately and astutely deliver on its goal:
intuitive insight at a glance.
This might sound like an easy task, but learning to properly construct
correct and effective data visualization isn’t something you can accomplish
overnight. It takes as much time to master this craft as it does any other, as
well as a certain dedication to patience, practice, and keeping abreast of
changes in software. In fact, in a recent study, new visual analytics
practitioners noted that a top frustration is a lack of technical skill, both in
data analytics and in the tools used to accomplish data visualization tasks.
Like so many other aspects of data science, data visualization and
storytelling tend to evolve over time, so an inherent need exists for
continuous learning, adaptation, and ongoing skills development. In a
survey that we will explore in detail in a
subsequent section, visual analysts expressed a need to dedicate more time
to gaining an in-depth understanding of their visualization tools, as well as
structured training in design, analysis, statistics, and user experience,
among other facets, to improve their visual analytics skills. The lessons in
this book are intended to support that need and guide you as you begin your
first adventures using visual analytics in Tableau.
With all the current focus on data visualization as the best (and sometimes
only) way to see and understand today’s biggest and most diverse data, it’s
easy to think of this practice as a relatively new way of representing data
and other statistical information. In reality, the practice of graphing
data to visually communicate information stretches back all the way
to some of the earliest prehistoric cave drawings where our forebears
charted the minutiae of early human life. From there, we turned to
visualization through initial mapmaking and continuing through to more
modern advances in graphic design and statistical graphics. Along the way,
the practice of data visualization has been aided by advancements in both
visual design and cognitive science as well as in technology and business
intelligence (BI), and these developments have led to our current state of
data analytics.
To visualize the data storytelling process, consider Figure 1.3. It depicts the
visual analytics process we’ll follow throughout this book. However, this
process isn’t always as straightforward or linear as it might initially appear.
In reality, this process is, like all discovery processes, iterative. For
example, as a result of analysis we might need to revisit data wrangling—
we might uncover a missing attribute that we need for our proposed model
or need to incorporate supplementary data to complete an insight. Finally,
as the story unfolds, we might need to revisit previous steps to support
claims we did not originally plan to make.
3. www.datavisualizationsociety.org/report-2021
Note
We’ll look more closely at the DVS SOTI survey findings in later chapters,
and the job market for visual analytics in Chapter 3. However, it’s worth
noting now that the number of job postings for data visualization–related
jobs has increased dramatically over the past decade (Figure 1.4). From
fewer than 2,000 jobs in 2010 to more than 100,000 jobs in 2021, the
demand for analysts skilled in data visualization has seen tremendous
growth, with the future in data viz looking bright indeed. Note that jobs
reflected in the data represent the total number of hiring openings posted
across the United States that required a minimum of a bachelor’s degree.
Figure 1.4 The total number of data visualization–related job openings posted nationwide between
2010 and 2021.
While these studies provide an optimistic glimpse into the future of the
visual analytics job market, this upward momentum is not without its
challenges, particularly in regard to the amount of training and educational
resources to support this tremendous skills need. In early 2022, Tableau
tasked Forrester Consulting with researching the role data skills play in
business outcomes. The resulting study, titled “Building Data Literacy: The
Key to Better Decisions, Greater Productivity, and Data-Driven
Organizations,” included a survey of more than 2,000 executives, decision
makers, and data contributors working at global companies with 500 or
more employees in 10 countries. The quick takeaway: Despite the
increasing job demand for data skills, there is not yet enough training
available.4
4. www.tableau.com/about/press-releases/2022/new-data-literacy-research
As part of its study, Forrester found that data skills are increasingly crucial
—not only as in-demand skills that are seen as most important for today’s
analysts to succeed in their day-to-day work, but also as the skills that have
increased most in importance over the last three years. According to the
study, 82% of today’s analytic decision makers expect basic data literacy
from employees across departments (IT, human resources, and so on).
However, while close to 70% of employees are expected to use data heavily
in their job by 2025 (a 40% increase from 2018), currently only 39% of
organizations make data training available to their employees.
Note
Read the full report at www.tableau.com/sites/default/files/2022-
03/Forrester_Building_Data_Literacy_Tableau_Mar2022.pdf.
It’s important to recognize that technical competencies are not the only
skills in demand by analytic employers today. Data will continue to grow,
technologies to adapt and innovate, and analytical approaches to chart new
territory to evolve the way we work with and uncover meaning and value
hidden within our data. The real value in becoming a data storyteller is to
amass the ability to share—to communicate—about our data.
The BI Congress survey isn’t the only piece of data, or the most recent, to
point out the importance of communication skills in analytics. A recent
study from data research and advisory firm Gartner sought to determine
why big data projects fail—that is, what percentage of big data projects fail
due to organizational problems, such as communication, versus what
percentage fail due to technical problems, such as programming or
hardware.6 Only about 1% of companies responded that technical issues
alone were the fail point of their data analytics problems. The other 99% of
companies said that at least half of the reason their data analytics projects
failed was due to poor organizational skills—specifically, communication—
and not technical skills.
6. www.gartner.com/newsroom/id/2593815
Perhaps most conclusively, we can look directly to the job market to
snapshot which communication skills are most in demand by today’s
employers specifically hiring analysts skilled in visual analytics and data
visualization. These top communication skills are depicted in Figure 1.5
and include, in order, teamwork/collaboration, research, written
communication, problem-solving, planning, detail-oriented, creativity, and
organizational skills. We will look further at technical and software skills
prized by today’s analytic employers in Chapter 3.
Figure 1.5 In more than 146,000 data visualization–related jobs posted between March 2021 and
March 2022 in the United States, these communication skills ranked among the most in demand by
today’s analytics employers.
As you might expect, these changes have a significant effect on how people
work across analytics functions, be they executives and leadership, data
scientists, analysts, or even data storytellers. There are a lot of skills
available and a very big toolbox to choose tools from, and we are all
learning together. Adding to that, over the past few years we’ve been
reminded that data workers are in high demand, and we’ve seen firsthand
how limited the current supply is. This means we have to start thinking
about cultivating talent rather than recruiting it, and training an incoming
workforce isn’t something that an industry can do alone, no matter how
many specialized software training programs, massive open online courses
(MOOCs), conferences, and excellent publications we produce. In fact, one
finding uncovered in the 2021 DVS SOTI survey was a lack of the structured
programs and stronger foundational knowledge bases that respondents said
are needed to nurture visual analytic skills development for both new
and established analysts. According to the survey, data visualizers are
continuously looking to learn new skills, with 26.8% prioritizing new skills,
24.3% wanting to improve skills with an existing tool, and 25.9% interested
in learning a new tool.
Universities are listening to campus recruiters and market research that
demonstrate the need for qualified, educated people with more data skills
and knowledge, and they’re working hard to fill that gap. The top programs
are focused on real-world applications of data problems and are doing their
best to keep pace with fluid changes in technology adoption, new
programming languages, and on-the-market software packages. They’re
also putting a premium on visual analytics. Vendors like Tableau, with its
Tableau for Teaching program, are helping, too.
So just how big is data science education? Over the past couple of years, the
number of new business analytics program offerings has significantly
increased. In 2010, there were a total of 131 confirmed, full-time BI/BA
university degree programs, including 47 undergraduate-level programs. By
2018, that number had tripled. As of this writing, there are now at least 69
undergraduate programs, 438 master’s programs, 24 doctorate programs, and
more than 100 certificates available at U.S.-based academic institutions
(Figure 1.6). So, while we might not have access to all this new data talent
yet, if academia has anything to say about it, help is on the way.
Figure 1.6 Business analytics degree programs in the United States.
Note
This dataset is regularly updated and maintained by Ryan Swanstrom and is
available via GitHub at https://github.com/ryanswanstrom/awesome-
datascience-colleges.
Summary
Memory, retention, and emotion form our understanding of experience, all
of which work in tandem with our brains’ visual processing horsepower.
This chapter leverages real-life examples to showcase the power of visual
analytics and visual data stories to communicate discoveries and insights
hidden in data. We will review the role of human cognition in visual
analytics; consider how the brain reacts to inputs of data, stories, and data
stories as unique entities; and explore how we can leverage this power to
tell impactful data narratives and influence action.
1. www.marketplace.org/2016/09/26/sustainability/corner-office-
marketplace/dont-call-national-geographic-stodgy
2. www.vox.com/2014/9/4/11630542/time-inc-to-take-page-from-national-
geographic-playbook
Media and journalists aren’t the only ones putting an emphasis on data
storytelling, although they have certainly been a particularly imaginative
bunch of communicators. Today we’ve seen the power of storytelling used
to color in conversations on just about every type of data imaginable—from
challenging astronomical principles to visualizing the tenure pipeline at
Harvard Business School to quantifying the fairytale of Little Red Riding
Hood. Speaking of social media, it’s clear how visual interfaces like
Instagram, Snapchat, and TikTok have reshaped digital content and
engagement. A new app, Lucid, even uses the power of visualization as its
core differentiator from its main competitor, Twitter, by noting in its
mission statement, “By visualizing and clarifying complex insights from the
world’s greatest thinkers, we’re helping people around the world master
essential topics and learn new skills, quickly and easily.”3 In every
organization and every industry, visual data stories are becoming the next
script for how we share information, and we are harnessing the power of
visual analytics to tell them.
3. https://lucid.fyi/about/
But as diverse as data stories can be, they all have one thing in common:
They give us something to connect to in a very literal sense. Now that we
have a firm grasp of the market for visual analytics skills, let’s delve into
the power of stories, first by looking behind the curtain at the science of
storytelling and then by examining some incredible existing data stories to
see how they have capitalized on the secret sauce of visual analytics.
Moving forward, let’s pretend we get to the store, only to discover it’s
closed. So, instead of cooking, we decide to go to our favorite Italian
restaurant for our pasta fix. Suddenly the image changes: We’re no longer
visualizing individual items on a grocery list but turning toward imagining
an immersive visual experience—a waiter setting down a big, beautiful dish
of steaming, flavorful, scent-rich spaghetti. Perhaps we also hear the
buzzing backdrop of restaurant sounds—low-volume music, water glasses
being filled, the clink of silverware, and so on. If we think about it long
enough (or if we’re hungry enough), we can almost taste the food.
Now, imagine your favorite dish served at a holiday feast and pay attention
to what other senses engage. These might include memories and other
auditory information more personally meaningful than a typical restaurant
experience.
These extra storytelling details have a profound effect on the brain (Figure
2.2). Beyond the two areas of the brain that become activated when
presented with data, five additional areas respond when presented with a
story:
Fitness
As much as we might try to argue otherwise, human beings simply did not
evolve to find truth. Rather, we evolved to defend positions and obtain
resources—often regardless of the physical or mental cost—so that we
could survive. These concepts lie at the heart of the Darwinian theory of natural
selection: survival of the fittest as the mechanism, and our ability to
overcome (or, biologically, to reproduce), known as fitness.
Closure
Aside from being bent on survival, humans tend to require closure. A few
philosophical exceptions notwithstanding, in general we don’t enjoy
ongoing questions and curiosities with no resolution. We need endings,
even unhappy ones. We simply can’t abide cliffhangers; they’re sticky in
the worst of ways, bouncing around in our brains until we can finally
“finish” them and put them to rest (just read any book review website to see
this in action). There’s actually a term for this phenomenon—the Zeigarnik
effect, named for Soviet psychologist Bluma Zeigarnik. Zeigarnik
demonstrated that people have a better memory for unfinished tasks than
they do for finished ones. Today, the Zeigarnik effect is recognized as a
psychological device that creates dissonance and uneasiness in the target
audience.
In essence, the Zeigarnik effect speaks to our human need for endings. No
matter the story’s goal—to focus, align, teach, or inspire—we build
narratives to foster imagination, excitement, and even speculation.
Successful narratives are those that grab the audience’s attention, work
through a message, and then resolve our anxiety with a satisfactory ending.
Thus, stories are therapeutic. They give us closure.
We’ve established that data stories are powerful, and that they are powerful
because of their ability to communicate information, generate
understanding and knowledge, and stick in our brains over the long term.
However, as information assets, visual data stories have a few other
noteworthy qualities.
But first, let’s set the record straight. There is much to be said about how
visual data stories create meaning in a time of digital data deluge, but it
would be careless to relegate data storytelling to the role of “a fun new way
to talk about data.” Visual data storytelling has radically changed the way
we talk about data (though certainly not invented the concept). The
traditional charts and graphs we’ve long used to represent data are still
helpful because they are cemented as one of the ways in which we visually
organize and understand information. They’ve just become a little static.
With today’s technology, fueled by today’s innovation, we’ve moved
beyond the mentality of gathering, analyzing, and reporting data to
collecting, exploring, and sharing information. Instead of simply rendering
data visually, we are now focused on using these mechanisms to engage,
communicate, inspire, and make data memorable. No longer resigned to the
tasks of beautifying reports or dashboards, data visualizations are lifting out
of paper, coming out of the screen, and moving into our hearts, minds, and
emotions. In fact, the ability to stir emotion is the secret ingredient of visual
data storytelling, and what sets it apart from the aforementioned static
information visualization renderings.
Instead of talking about the power of visual data stories, let’s see them in
action within the context of visual analytics. As we do, we’ll be looking for
the following key takeaways:
One of the core tenets of a visual data story is that it uses different forms of
data analytics and visualization (e.g., charts, graphs, infographics) to bring
data to life. Perhaps one of the most archetypal examples of the power of
visual analytics to help people see and understand data in ways they never
would by looking at a simple data table—rows and columns of raw black
and white data—comes from Anscombe’s Quartet (Figure 2.3).
Even though the individual values are different, if the statistical outputs
are the same, we would expect these datasets, when graphed, to look the
same. The “story” for each of these datasets should be the same—right?
Wrong.
Graphing these datasets (Figure 2.4) allows us to see beyond the limitations
of basic statistical properties for describing data. We can then appreciate the
bigger picture presented by the datasets and the relationships within them.
Figure 2.4 Anscombe’s Quartet, visualized.
Anscombe’s example might be a classic in terms of putting some support
behind visual horsepower, but it only scratches the surface in terms
of visual data storytelling. Although we might not yet have everything we
need to tell a story, we can start to see that the datasets are not so similar as
they might appear, and there is something worth talking about in these
datasets. We know there is a story there, and we know we need to visualize
it to see it, but we are still left wanting. This isn’t quite a visual data story,
but it’s definitely a first step.
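If you would like to check the numbers behind Anscombe’s point for yourself, the short Python sketch below does so outside of Tableau. It is illustrative only and assumes the pandas and seaborn libraries are installed; seaborn ships a copy of the quartet as a sample dataset (downloaded the first time it is loaded).

import seaborn as sns

# Anscombe's Quartet as a tidy table with columns "dataset" (I-IV), "x", and "y".
anscombe = sns.load_dataset("anscombe")

# The four datasets share nearly identical means, variances, and x-y correlations...
summary = anscombe.groupby("dataset").agg(
    x_mean=("x", "mean"),
    y_mean=("y", "mean"),
    x_var=("x", "var"),
    y_var=("y", "var"),
)
summary["xy_corr"] = anscombe.groupby("dataset").apply(
    lambda d: d["x"].corr(d["y"])
)
print(summary.round(2))

# ...yet plotting each dataset reveals four very different shapes, as in Figure 2.4.

Plotting the same frame, for example with sns.lmplot(data=anscombe, x="x", y="y", col="dataset"), reproduces the contrast shown in Figure 2.4.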
Story Takeaway
Consider a visit to the eye doctor, when your vision is tested by the ability to
spot a flash of color in a sea of darkness. As another example, take a look
at Figure 2.7.
Figure 2.7 A table showing companies with respective annual gross profits, 2013–2016.
This is a simple data table with only three companies. Now suppose I asked
you to tell me, in each year, which company had the highest gross profit.
You are tasked with analyzing each box of the table, line by line, to assess
each year independently and select the highest number. You might even
have to write it down or mark it in some way to help you remember the
winner. Go ahead and give it a try. It should take you roughly one minute to
complete the exercise.
Figure 2.8 A highlight table showing companies with respective annual gross profits, replaced by
color, 2013–2016.
This time, we’ve replaced the numerical data with a visual cue—and, in doing
so, transformed this representation from a data table to a data
visualization type called a highlight table. The simple power of this
visualization is that, rather than reading the table, the perceptual pop-out of
this design makes completing this exercise a near-instant feat. We don’t
have to actually “look” for answers; we simply “see” them instead.
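If you want to experiment with the same pop-out effect outside of Tableau, a highlight table can be approximated in a few lines of Python with pandas. This is an illustrative sketch only: the company names and profit figures are invented, and rendering the styled table requires the jinja2 package.

import pandas as pd

# Hypothetical gross-profit table: rows are companies, columns are years.
profits = pd.DataFrame(
    {
        "2013": [120, 95, 143],
        "2014": [130, 150, 101],
        "2015": [90, 160, 155],
        "2016": [175, 140, 120],
    },
    index=["Company A", "Company B", "Company C"],
)

# Styler.highlight_max shades the largest value in each column (year), so the
# "winner" per year pops out instead of requiring a line-by-line scan.
styled = profits.style.highlight_max(axis=0, color="lightgreen")

# A Jupyter notebook renders the styled table inline; elsewhere, export it to HTML.
styled.to_html("highlight_table.html")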
4. http://mindingourway.com/there-are-only-two-seasons/
The item up for debate in this story is a simple one: Is it fair to qualify
“waxing summer” (also known as spring) and “waning summer” (also
known as autumn) as full seasons? Sure, it’s familiar, and if you live in the
northern hemisphere you can likely distinguish the seasons according to
their observable natural phenomena, such as their colorful transitions
(flowers blooming or leaves changing color) or other sensory ones (weather
temperatures) rather than their actual astronomical dates—and this doesn’t
even begin to open the conversation on astronomical versus meteorological
dates of change.5
5. www.ncdc.noaa.gov/news/meteorological-versus-astronomical-seasons
Let’s begin to build a story around this question and see where we end up.
First, let’s agree on a foundation: The year follows a seasonal cycle that
starts cold and gets progressively warmer until it peaks and begins to cool
again. Repeat. This is a pretty basic assumption. More important, it’s one
that we can successfully chart loosely and without requiring any more
specific data or numbers. Rather, we’ll use points from the basic story
premise we laid out earlier to graph a seasonal continuum for the year,
using length of daylight as our curve (Figure 2.9). From there, we can try to
decide just how many seasons are really in a year.
How many curves does the orange line trace? The answer, obviously, is two
—hence the two-season viewpoint (Figure 2.10).
Figure 2.10 Two seasons.
From here, we could break this down further with more information. For
example, we could add in astronomical dates or mull over geographic
differences in weather or meteorology using maps and topography.
However, whether you agree with Nate and me (and others!) on the number
of qualifying seasons that occur over the course of one year, the preceding
two graphs represent a powerful data story without requiring the type of
“hard data” (rows and columns of numbers) that we would typically expect.
They show us, quite literally, that to tell a great story doesn’t necessarily
require a ton of data—or much at all. It just requires a few points, a goal,
and the creativity to visualize it for your audience in a way that affects their
opinion.
Story Takeaway
Obviously, this was not a successful war, and as a visual analytics piece
Minard’s map is not a particularly successful one. However, as a visual data
story around human drama, it has earned the distinction of becoming known
as one of the best storytelling examples in history. You would be hard
pressed to take a data visualization class today and not experience
Napoleon’s March. It’s also fair to note that several analysts have tried to
re-create it using more common statistical methods, but all fall short of the
original’s storytelling appeal.
Story Takeaway
Stories have an inherent amount of entropy, and some we tell only once.
Story Takeaway
Don’t be afraid to try something new.
Summary
Now, let’s get ready to put this information to work in Tableau. In the next
chapter, we’ll begin exploring the Tableau ecosystem and take a journey
through its freshly redesigned user interface. This will form the basis for
later hands-on practice as we get to work with visual analytics and start
exploring and analyzing data to build complete data visualizations and
visual data stories.
Chapter 3
This text should not be approached as a user manual for Tableau. If you are
already an intermediate Tableau user familiar with the 2022 interface and
Tableau terminology, you might want to skip this chapter and move on to
Chapter 4, “Keeping Visual Analytics in Context.”
Using Tableau
1. https://investor.salesforce.com/press-releases/press-release-
details/2019/Salesforce-Signs-Definitive-Agreement-to-Acquire-
Tableau/default.aspx
Figure 3.1 Tableau remains a Leader in the 2022 Gartner Magic Quadrant™.
Note
2. Gartner.com.
Why Tableau?
Tableau’s stated mission is to help analysts “see and understand” their data,
focusing on enabling “powerful analytics for everyone.” To facilitate this, the
company offers a suite of software products designed to meet the needs of a
diverse group of users, from experienced analysts at enterprise-level
organizations to academic users, data journalists, and other visual data
storytellers who want to visualize data. All Tableau products excel at
displaying data visually, using Tableau’s proprietary VizQL technology to
enable an intuitive, drag-and-drop canvas on top of embedded and augmented
analytics.
Our primary tool to apply the lessons learned in this book, Tableau Desktop can
connect to a wide variety of data, stored in a variety of places on-premises or in
the cloud—from SQL databases to local spreadsheets, and even many cloud
database sources, such as those offered by Google Analytics, Amazon Redshift,
and Salesforce. Although Tableau can mimic Excel by providing the capability
to analyze rows and columns of numbers, its focus is on interactive, visual data
exploration through its out-of-the-box analytic capabilities as well as
dashboarding and storytelling features—no programming required. For more
advanced users, Tableau offers augmented analytics powered by artificial
intelligence (AI) and machine learning (ML) to streamline time to insight
through statistics, natural language, and smart data prep, as well as predictive
modeling and other data science techniques. Data scientists can integrate and
visualize results from R, Python, Einstein Discovery, MATLAB, and other
extensions to scale models. Content created in Tableau Desktop can be shared
in a variety of methods, depending on the user’s specific needs.
Note
Tableau maintains an expansive, ongoing program to rapidly develop product
features. This text attempts to capture product updates that affect fundamental
learners in versions 2022.3 and 2022.4.
3. www.tableau.com/about
4. Ryan, L.; Silver, D.; Laramee, R.; & Purdue, D. “Teaching Data
Visualization as a Skill.” IEEE Computer Graphics and Applications 39, no. 2
(2019): 95–103.
Figure 3.2 From March 8, 2021, to March 7, 2022, of the 146,083 jobs posted that required applicants to
have skills in data visualization, approximately 48% specifically asked for Tableau.
Tableau in Demand
In a recent analysis of Labor Insight (Burning Glass), approximately 146,038
jobs posted across the United States between March 1, 2021, and March 31,
2022, listed “data visualization” as a desirable skill. Among the most popular
job titles were data analyst (6.4%), data scientist (5.9%), and BI analyst (2.6%).
These findings are consistent with previous years’ research, although the data
scientist job title has now edged above BI analyst. All postings required, at a
minimum, a bachelor’s degree.
These analysis results are consistent with recent surveys that have looked at
which tools are most in demand and in use by today’s visual analytics
practitioners. One such resource is the Data Visualization Society’s State of the
Industry (SOTI) report, discussed in Chapter 1. Its 2021 survey marked the
first year that respondents were asked which technologies they use often—
rather than the more general question used previously, which simply asked
about tools used, but did not differentiate between which tools were used on a
regular basis versus used at all. Regardless, the top tools mentioned by
respondents have remained consistent over the years the DVS has conducted
this survey, with Tableau coming in second only to Excel. Interestingly, 2021
also marked the emergence of PowerPoint as a data visualization tool, based on
2020 write-ins for “other” in regard to tools used to share results with
stakeholders and other visualization audiences.
If you’re wondering where these data visualization jobs are located, nearly 15%
were in California, just shy of 10% in the New York/New Jersey metro area,
and another 8% in Texas. Washington, home to Tableau’s Seattle headquarters,
accounted for fewer than 4% of the jobs. Note that these percentages are based
on the count of total jobs posted and are not normalized for population. Top
employers included PricewaterhouseCoopers, Deloitte, Humana, and Amazon.
It’s important to think of Tableau as a brand, rather than a product. The Tableau
ecosystem includes several different products, and although Tableau Desktop is
its cornerstone data visualization software product and the focus of this book,
other environments are also offered in the Tableau application suite to support
various levels of user needs. What unites all of these products is VizQL,
Tableau’s proprietary visual technology that enables simple drag-and-drop
functions to create sophisticated visualizations. The primary differences
between these core products are the different data sources users can connect to
(connectivity), how visualizations can be shared with others (distribution), the
ability to automatically update or refresh analysis (automation), and the level of
governance required by the user and/or organization (security).
Note
VizQL is Tableau’s proprietary analysis technology. You can read more about
VizQL at www.tableau.com/products/technology.
Note
The Tableau pricing model is based on users and designed to scale as your
organization’s needs grow. Free software trials are also available, as well as
free licenses for students and teachers. For more information, see
www.tableau.com/pricing.
Tableau Desktop
Note
Work in this book, in terms of both hands-on work and the figures presented
throughout, relies on Tableau Desktop version 2022, unless otherwise stated. A
number of features are consistent across Tableau Desktop, Server, and Cloud,
but their functionality does differ. If you are not already an active Tableau user,
it’s recommended that you download Tableau Desktop to follow along with the
discussion. Information on Tableau’s free trial offerings can be found in the
Introduction.
Tableau Server
From small businesses to Fortune 500 companies, Tableau Server extends the
value of data across the entire organization for enterprise-wide deployments
and is intended for organization-wide provision of visual analytics outputs
through a central repository for Tableau work. It provides organizations with
centralized governance, visibility, and control, while allowing users to curate,
publish, and share data sources as well as collaborate, engage, and explore data.
Data visualizations and dashboards are typically stored within the organization.
Tableau Cloud
Formerly called Tableau Online, Tableau Cloud is a fully hosted, cloud-based,
enterprise-grade solution. It provides similar functionality as Tableau Server,
but has the advantages of cloud distribution and automation and is hosted off
premises.
Tableau Prep
Tableau Public
One part data visualization hosting service, one part social networking, Tableau
Public is a free service that allows users to publish interactive data
visualizations online. These visualizations can be embedded into websites and
blogs, shared via social media or email, or made available for download to
other users, but must remain in Tableau’s public cloud.
Note
You can follow me and see many of the visualizations included in this book on
Tableau Public at https://public.tableau.com/app/profile/lindy.ryan.
Tableau Reader and Tableau Viewer
Finally, Tableau offers two products that allow anyone, from experienced users
to casual users, to access and interact with content created by Tableau creators.
While both products are similar, they do have some notable differences.
Tableau Reader
Tableau Reader is a free desktop application that allows users to open and
interact with data visualizations built in Tableau Desktop in a view-only mode
(or, conversely, to distribute content built in Tableau Desktop). Tableau Reader
lacks governance, security, and administration capabilities, so it is not possible
for users to make any changes to visualizations provided in Tableau Reader.
Thus, the application is essentially a distribution platform with no analytic
capabilities.
Tableau Viewer
Note
See Appendix A, “Tableau Services,” to learn about the range of services that
Tableau offers to support your ongoing learning related to visual data analytics
as well as the Tableau ecosystem.
Getting Started
To begin working in Tableau Desktop, the first thing you need to do is get your
hands on a license. If you have not done so already, refer to the Introduction for
guidance on how to start a free trial of Tableau Desktop. You can also visit the
Tableau website to explore trial and purchase options. As previously noted, all
exercises and tutorials contained in this book use Tableau Desktop 2022.
Before installing Tableau Desktop, make sure your machine meets the
necessary requirements for the application. Tableau Desktop is available for
Windows and Mac, and the minimum requirements as of this book’s writing
were as follows:
Note
Research these and other minimum system requirements at
www.tableau.com/products/techspecs.
Connecting to Data
When you first open Tableau Desktop, the Connect to Data screen appears
(Figure 3.3). This “starting screen” is the first thing users see after launching
the Tableau software.
Figure 3.3 The Tableau Connect to Data screen.
Note
This book focuses on the fundamentals of practicing visual analytics within
the scope of data visualization and visual data storytelling; it is not a user
manual for Tableau. For more granular learning on Tableau, review the
Training videos provided by Tableau in the Discover pane or the resources
listed in Appendix A of this book.
Connecting to Tables
Selecting a saved data source from the Connect pane will immediately take you
to the Tableau worksheet canvas, bypassing the data preparation stage. We’ll
review this interface later in this chapter, but first let’s take a look at how to
prepare your data for analysis in Tableau.
Tip
For many exercises in this book, we’ll use the Sample-Superstore training file
provided in the Saved Data Sources section of the Connect to Data screen
shown in Figure 3.3. It is a well-prepared, simple dataset for a global retailer
that sells furniture, office supplies, and technology goods. You can connect to
this data source simply by double-clicking on it on the Connect pane. We will
also use an Excel-based dataset called Global Superstore, which contains
similar but not the same data. This dataset, and others, can be downloaded by
visiting the resources provided in the Introduction and Appendix A of this
book.
For our purposes, we’ll connect to a very common file format—an Excel
spreadsheet. You can connect to any Excel spreadsheet by clicking the Excel
option in the Connect pane and navigating to the file’s location on your
machine. Once connected to your data file (or any other file or database
connection), Tableau opens the Data Source page (Figure 3.4).
Figure 3.4 The Data Source page.
Before we look at options to make changes to the data at this point, it’s
important to note that any changes made here create only metadata and have no
impact on the underlying data source. This means you can make specific
changes and prepare data directly in Tableau without affecting the data’s
existing infrastructure.
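A rough analogy from outside Tableau may help make the point. In the illustrative Python sketch below (the file name, sheet name, and column name are placeholders, and reading .xlsx files with pandas requires the openpyxl package), loading an Excel sheet produces an in-memory copy that you can rename and retype freely while the workbook on disk stays untouched.

import pandas as pd

# Read one sheet of a workbook into memory; "orders.xlsx" and "Orders" are
# placeholders for whatever file and sheet you are working with.
orders = pd.read_excel("orders.xlsx", sheet_name="Orders")

# Renaming a column or changing its data type affects only this in-memory copy,
# much like Tableau's metadata layer; the underlying Excel file is unchanged.
orders = orders.rename(columns={"Order Date": "order_date"})
orders["order_date"] = pd.to_datetime(orders["order_date"])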
The Data Source page provides several options to help you prepare this file for
analysis in Tableau.
Connections: You can add additional data sources by clicking Add. You can
also edit the name of the connection or remove it as desired by clicking the
drop-down arrow to the right of the filename. (You can also rename the
connection by clicking its title on the canvas to the right.)
Sheets: This pane displays all the sheets in the Excel file, corresponding to
the names of individual worksheet tabs. Sheets in Excel are treated the same
as tables in a database, and you can choose to connect to a single table or
join multiple tables. To connect to a sheet, simply drag and drop it into the
data connection canvas to the right (you will notice a “Drag tables here”
prompt) or double-click the desired sheet. After you connect to a
sheet, three things happen (Figure 3.5), which allow you to further explore
the capabilities of this screen:
The sheet name appears in the data connection canvas.
The data is displayed in the preview pane below the data connection
canvas.
A Go To Worksheet icon is displayed.
Figure 3.5 You have connected to the Orders table of the Excel file, populating the data preview pane.
Tableau also provides the prompt to Go To Worksheet, which allows you to begin visually exploring the
data if you are ready.
Before moving on, there are a few more things to take note of on this screen.
First, if you aren’t satisfied with any individual column name, you can click the
drop-down arrow to the right of the name and select Rename. Additionally,
clicking the data type icon allows you to change the default data type for that
column (Figure 3.6).
Figure 3.6 Clicking the data type icon allows you to change the default data type for that column. This
determines how the fields are displayed on your worksheet in the next step.
If you would like to further refine your columnar data, you can find a few more
options to prepare this data by clicking the drop-down arrow on the column
(Figure 3.7):
Figure 3.7 Clicking the drop-down arrow provides more options to prepare data.
Depending on the data type contained within the field, some options may not
be available. For example, for quantitative fields the string functions are not
available, but an additional option, Create Bins, allows you to create equally
sized bins—an ability that is useful for making histograms.
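Tableau handles this through its Create Bins dialog, but if it helps to see the underlying idea outside the tool, here is a minimal Python/pandas sketch (the values are invented for illustration): equal-width bins plus a count per bin is essentially a histogram.

```python
import pandas as pd

# Hypothetical sales values; in Tableau these would come from a measure.
sales = pd.Series([12, 48, 95, 150, 220, 310, 475, 640, 810, 990])

# Create equally sized (equal-width) bins, the same idea as Tableau's
# Create Bins option. Here we ask for 5 bins spanning the value range.
bins = pd.cut(sales, bins=5)

# Counting rows per bin is effectively a histogram.
print(bins.value_counts().sort_index())
```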
You might have noticed the option for a Live or Extract connection on the sheet
canvas (Figure 3.8).
By default, most data sources will connect live with no filters. However, before
you begin analyzing data, this decision might be something you want to
consider. Be sure to understand the benefits and drawbacks of Live versus
Extract connection options as described in Table 3.1.
Table 3.1 Pros and cons of Live versus Extract connection options
Live
Pros: Leverage a high-performance database's capabilities. See real-time changes in data. Offers the best security in most organizations.
Cons: Can result in a slower experience. Some cloud-based data sources must be extracted.
Extract
Pros: Can deter latency in a slow database. Could reduce the query load on critical systems.
Cons: Most Online Analytical Processing (OLAP) data sources cannot be extracted. Must be refreshed periodically.
Finally, you have the ability to filter the entire data source before working with
it in Tableau. These filters can be created with any combination of fields by
clicking the blue Add text under Filters. Filtering may help eliminate clutter or
extraneous fields by streamlining your view of the data and removing data that
is not needed for analysis.
Relationships are a dynamic, flexible way to combine data for analysis, and
generally a simpler process than working with joins. Therefore, Tableau
recommends using relationships as a first approach to combining data.
Relationships aside, there may be times when you still opt to create a join. To
create a join, double-click on the first table, then drag out the second table. This
will generate the same join experience as in previous versions of Tableau via
the join canvas (Figure 3.10).
While Tableau will automatically join your tables, it does so by guessing your
matching ID. You can change this default by clicking on the fields, which
shows a drop-down menu of all data fields available to join.
Tableau provides four types of joins that you can use to combine your data:
inner, left, right, and outer. Inner and left joins are the most common types of
joins; a short code sketch after this list illustrates how each type behaves.
Inner join: Joins records where there is a matching field in both datasets.
Using an inner join to combine tables produces a new virtual table that
contains values that have matches in both tables.
Left join: Joins all the records from the data on the left side of your equation
and any matching records from the right side. Using a left join to combine
tables produces a new virtual table that contains all values from the left table
and corresponding matches from the right table. When there is no
corresponding match from left to right, you will see a null value.
Right join: Joins all the records from the data on the right side of your
equation and any matching records from the left side. The opposite of a left
join, using a right join to combine tables produces a table that contains all
values from the right table and corresponding matches from the left.
Likewise, when a value in the right table doesn’t have a corresponding
match in the left table, you will see a null value.
Outer join: Joins all the records from each dataset together, even when
there is no match (this option is rarely used). Using a full outer join to
combine tables produces a table that contains all values from both tables. If
a value from either table doesn’t have a match with the other table, you will
see a null value.
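If it helps to see how these four join types behave outside Tableau, the following is a minimal pandas sketch; the tables and column names are invented for illustration and are not part of the Superstore data.

```python
import pandas as pd

orders = pd.DataFrame({"OrderID": [1, 2, 3], "CustomerID": ["A", "B", "C"]})
returns = pd.DataFrame({"OrderID": [2, 3, 4], "Reason": ["Damaged", "Late", "Wrong item"]})

# Inner join: only OrderIDs present in both tables (2 and 3).
inner = orders.merge(returns, on="OrderID", how="inner")

# Left join: every order, with return details where they exist;
# order 1 gets a null (NaN) Reason.
left = orders.merge(returns, on="OrderID", how="left")

# Right join: every return, with order details where they exist;
# return 4 gets a null CustomerID.
right = orders.merge(returns, on="OrderID", how="right")

# Full outer join: all rows from both tables, nulls where there is no match.
outer = orders.merge(returns, on="OrderID", how="outer")

print(inner, left, right, outer, sep="\n\n")
```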
Occasionally as you work with data, you will come across values labeled Null.
What are they?
Null means that your data contains some empty cells and Tableau is,
essentially, letting you know about their presence. These fields could have been
left blank—either intentionally or unintentionally—or they may signify
missing or unknown values. Checking fields and formatting for extraneous
information is always important when doing data analysis because you want to
ensure these blank fields do not skew the results. A null field might indicate an
error in the data or some other inaccuracy. When null values exist in a
connected dataset, Tableau displays an indicator that provides options to filter
out these unknown values. We’ll take a closer look at working with nulls in a
later chapter.
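The same idea is easy to demonstrate outside Tableau. This hedged pandas sketch (with made-up values) counts null cells and shows the two most common remedies—filtering them out or filling them explicitly—so they cannot silently skew results.

```python
import pandas as pd

df = pd.DataFrame({
    "Region": ["East", "West", None, "South"],
    "Sales": [120.0, None, 87.5, 43.0],
})

# Count nulls per column -- the equivalent of Tableau's null indicator.
print(df.isna().sum())

# Option 1: filter out rows with any null values.
filtered = df.dropna()

# Option 2: fill nulls explicitly so they can't silently skew results.
filled = df.fillna({"Region": "Unknown", "Sales": 0.0})
```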
Sometimes, you might want to work with multiple data sources in the same
Tableau workbook without joining or otherwise blending the data. Luckily,
Tableau offers some options for performing this type of analysis—including
recent feature updates in Tableau 2022.4 that make this process more seamless
and flexible than ever before.
Once you’ve connected to your primary data source as usual, navigate to your
worksheet. This worksheet will automatically utilize data from your first data
source.
From here, use the top menu bar to select Data > New Data Source (Figure
3.11). You can then use the familiar Connect to Data screen to select and
connect to your additional data source and, if needed, navigate to the Data
Connection canvas to see and select fields for use as the new data source.
Figure 3.11 To add or replace data sources, first use the top menu bar to select Data > New Data Source.
Once this process is complete, you will be able to see all connected data
sources in the Data pane on your Tableau worksheet (Figure 3.12).
Figure 3.12 Review all connected data sources in the Data pane.
From here, as you begin to explore and analyze your data, you can selectively
replace data sources at the worksheet level (in the past, replacing data sources
would apply to all worksheets in the Tableau workbook). To replace the data
set used, return to the Data menu and select Data > Replace Data Source. This
option enables you to replace the data source with any other connected data
source, at either the worksheet or workbook level (Figure 3.13).
Figure 3.13 Data sources can be changed at either the workbook or worksheet level.
When you connect to an Excel sheet in Tableau, the software can recognize
issues such as missing column names, null values, and so on. To remedy these
problems and clean the file for use in analysis, Tableau will suggest using Data
Interpreter. (Refer to Figure 3.4 to locate the Data Interpreter option on the left
pane of Data Source screen, directly between the list of data connections and
the resultant tables.)
We will take a closer look at Data Interpreter in a later chapter, but for now it’s
sufficient to know that to use Data Interpreter, you simply click the check box
to activate this tool. Tableau then queries the Excel file, runs its automated
prep tasks, and refreshes the data preview pane to address the issues
Data Interpreter has identified. To get more specifics on what Data Interpreter
has adjusted in the file, including a before-and-after view and an explanation
table, click the link that is provided following Data Interpreter’s action to
“Review the results.” This opens an Excel file describing the changes. You can
also clear the check box to undo these changes and revert to your original
sheet.
After verifying the changes made to your data, you can go to your worksheet
and begin exploring the Tableau interface and your data. You are ready to begin
your analysis!
Now that you have connected to some data in Tableau, you can click the
prompt to Go To Worksheet and start getting to know the Tableau user interface
(UI) in a more meaningful way. Like the Data Source page, the Tableau UI is a
drag-and-drop interface that fosters rich interactivity between sheets,
dashboards, and stories, allowing for in-depth visual exploration and powerful
visual communication. Tableau is similar to Excel in that its files are called
workbooks and the sheets inside the workbook are called sheets. Every Tableau
workbook contains three elements: sheets, dashboards, and stories.
We’ll cover dashboards and stories, and the differences between them, in more
depth in later chapters. For now, let’s focus on sheets and take a high-level
view of the various areas of the Tableau worksheet canvas. As we begin to
work directly with data to perform visual analytics and build visualizations,
dashboards, and stories throughout this book, we’ll explore these areas—and
more—in detail through hands-on exercises. For now, this high-level overview
is intended to orient you to the various aspects of the user interface.
As shown in Figure 3.14, the Tableau interface includes five basic elements:
1. Menus and toolbar
2. Data pane
3. Shelves and cards
4. The canvas workspace
5. Legends
At the top of the Tableau sheet is the toolbar, which is similar in concept to the
ribbon in Microsoft Office products. The toolbar contains many powerful
buttons that give you control over your Tableau experience and enable you to
navigate from the data source all the way to story presentation mode. A few
items of special note are highlighted here:
Logo: The Tableau logo button brings you back to the original Connect to
Data screen (clicking the icon from this screen returns you to your sheet).
Undo: There is no limit to how much you can undo in Tableau, which is an
important feature for exploration and discovery. The icon is grayed out until
there is an action to undo.
Save: There is no automatic save in Tableau. Be sure to save your work
incrementally.
Another menu appears along the bottom of the sheet. This menu, similar in
concept to the sheet tabs in an Excel workbook, enables you to return to the
Data Source screen; create new sheets, dashboards, or stories; and rename,
rearrange, duplicate, or delete sheets.
Data Pane
The pane on the left of the sheet is called the Data pane. It has two tabs: a Data
tab and an Analytics tab.
Data
At the top of the Data tab is a list of all open data connections and the fields
from that data source categorized as either dimensions or measures (discussed
shortly).
Analytics
The Analytics tab enables you to bring out pieces of your analysis—
summaries, models, and more—as drag-and-drop elements. We will review
these functions later.
Shelves and cards are some of the most dynamic and useful features of the
Tableau UI. You build views by dragging fields from the Data pane onto shelves
(such as Columns, Rows, and Filters) and cards (such as the Marks card), which
control what is displayed and how it is encoded.
Legends
Legends will be created and automatically appear when you place a field on the
Color, Size, or Shape card. To change the order (or appearance) of fields in a
visualization, drag them around in the legend. Hide a legend by clicking its
drop-down menu and selecting Hide Card. Likewise, bring them back by selecting the
Legend option on the appropriate space in the Marks card or by using the
Analysis menu.
When you bring a data source into Tableau, the software automatically
classifies each field as either a dimension or a measure. The differences
between these two are important, though they can be tricky to understand for
those who are new to analytics. Perhaps the best way to differentiate these two
classifications is to think about them this way: Dimensions are categories,
whereas measures are fields you can do math with.
Dimensions
Dimensions are things that you can use to group data by or drill down by. They
are usually, but not always, categories (e.g., City, Product Name, or Color), and
they can be logically grouped into strings, dates, or geographic fields.
Dimensions can also be organized into Tableau groups and hierarchies, which
we’ll discuss shortly.
Measures
Measures are quantitative, numeric fields—values you can aggregate by summing,
averaging, counting, and so on (e.g., Sales, Profit, or Quantity). When you bring
a measure into a view, Tableau aggregates it by default.
Colorful Pills
When a field is brought from the Data pane and dropped into the Rows and
Columns shelves, Tableau creates a “pill.” These pills are color coded: Blue
pills represent discrete variables, whereas green pills are continuous. The data
type icons also reflect these color codes (Figure 3.15).
Figure 3.15 Color-coded pills reflect continuous (green) measures and discrete (blue) dimensions.
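Tableau makes this classification automatically when you connect to data. As a rough analogy only—not how Tableau is implemented—the sketch below (with invented columns) sorts fields by data type, treating text and dates as dimensions and numeric columns as measures, which mirrors Tableau's default behavior.

```python
import pandas as pd

df = pd.DataFrame({
    "City": ["Austin", "Boston"],                                  # text -> dimension
    "Order Date": pd.to_datetime(["2022-01-05", "2022-02-11"]),    # date -> dimension
    "Sales": [250.0, 410.0],                                       # numeric -> measure
    "Quantity": [3, 7],                                            # numeric -> measure
})

dimensions = [c for c in df.columns if not pd.api.types.is_numeric_dtype(df[c])]
measures = [c for c in df.columns if pd.api.types.is_numeric_dtype(df[c])]

print("Dimensions:", dimensions)  # ['City', 'Order Date']
print("Measures:", measures)      # ['Sales', 'Quantity']
```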
Summary
This chapter introduced the Tableau product ecosystem. It then took a high-
level view of the Tableau user interface, including connecting and preparing
data and the core functionality of the Sheets canvas. In future chapters, you will
put this knowledge into practice as you begin working hands-on with this
functionality.
Chapter 4
More than 20 years ago, Bill Gates coined the iconic phrase, “Content is
king.” Gates was, of course, referring to the importance of content on
demand in the early days (circa 1996) of the World Wide Web. His words
were prophetic, however, and over the past two-plus decades this mantra
has been applied to everything from Internet marketing to journalism to
viral content creation online—suddenly everyone is a media company.
The never-ending quest for bigger, bolder, better online content has
radically changed the way people acquire and share information and how
we interact and communicate with others. However, although Gates might
have been right when he made his proclamation, his mantra is missing a
critical ingredient: context.
If you type Gates’s “content is king” into your Google search bar, you
might notice that a “but” is coming right behind it (Figure 4.1). Content is
king, but context is god.
Figure 4.1 Sorry, Bill. Content might be king, but context is god.
Context of data
Context of structure
Context of audience
Context of presentation
Context Caution
There’s more to context than just the data. Context can also be created by
the storyteller or by the audience, based on knowledge, biases, and
expectations. A successful storyteller needs to learn how to anticipate these
issues and ensure they do not improperly affect analysis or design. A good
way to ensure your context works for you, rather than against you, is to ask
and then answer a series of logically thought-out and connected questions
about the project, the data, and the audience. The answers to these questions
help provide a framework for your data story.
Context in Action
1. www.21stclub.com/2013/08/11/contextual-intelligence-a-definition/
Let’s take a quick look at a fun example of a story where context makes all
the difference.
By now, most of us are familiar with The Boy Who Lived and the Harry
Potter author. To date, the franchise book series has been distributed in
more than 200 territories, translated into 68 languages, and has sold more
than 400 million copies worldwide.2 There are companion volumes, a spin-
off series, and a bevy of critical discourse on everything from the author’s
controversial tweets to the series’ ongoing literary and cultural impact.
There are even conferences, like the Harry Potter Academic Conference, a
nonprofit annual academic conference hosted by Chestnut Hill College.
However, although most of us are familiar with Harry’s story through film
and media, it’s unlikely that we’ve taken a concentrated look at the data
inside the story.
2. http://harrypotter.scholastic.com/jk_rowling/
Note
If you’re a Potterhead, you’re in luck! A later chapter explores several
stories hidden within the Wizarding World’s data—working through the
entire storytelling process from collecting and preparing data to presenting
a complete data narrative.
Even if you don’t know all the nuances of the Wizarding World, you likely
do know that the story follows the journey of young wizard Harry Potter as
he fights against the dark wizard, Lord Voldemort, and his minions (known,
attractively, as Death Eaters). With that minimal amount of context, we can
assume that Harry is the good guy and Voldemort the bad.
To aid in visualizing the story of good versus evil in Harry Potter, we can
use a visualization of all of the instances where characters act aggressively.
When we visualize this data at the most superficial level—a count of
aggressive acts enacted by Harry and Voldemort in each of the books, in
order of release date—these “lightning bolts” seem to show that Harry
committed significantly more aggressive behaviors than did his nemesis,
Lord Voldemort, over the course of the series (Figure 4.2). In this version of
the story, our good wizard suddenly looks a little more sociopathic than we
might have expected. Yikes!
Figure 4.2 A modified bar chart of aggressive actions committed by Harry Potter and Lord
Voldemort.
The good news for Potter fans is that a view like this fails to put the count
of aggressive behaviors into context, and by overlooking that context it
showcases a faulty story. We don’t know, for example, whether these
“aggressive behaviors” were inciting or whether they were reactionary,
defensive, or protective—actions that could themselves be labeled aggressive.
Remember, the danger of a story told wrong is the prospect of making a bad
decision based on inaccurate or otherwise faulty information. This logic
applies to any story—even Harry’s. If we had presented this visual to, for
example, Rowling’s publisher prior to the series’ publication, we might
never have been introduced to the Wizarding World. Who would want to
publish a children’s book that condones violence? Or, more aptly, who
would want their children reading about a malevolent hero? In this case,
telling an incorrect story could have resulted in a wrong decision (no
Potter), rather than introducing a pop-culture phenomenon that swept the
world.
Fortunately, we can remedy this by putting context back into the narrative.
Let’s try again.
This still isn’t a perfect solution, as it doesn’t account for what the
aggressive action was—for example, if it was offensive or defensive.
Nevertheless, putting the data into a relevant context will enable us to see
something very different when we visualize the data again (Figure 4.3).
Figure 4.3 With a little bit of context added back into the data, we see a different story.
With more context added into the narrative, we see Voldemort’s true colors
emerge and a much more significant insight becomes apparent: While Harry
rarely acts aggressively when mentioned, Voldemort usually acts
aggressively when he appears on the page. This completely changes the
story takeaway that we presented earlier.
Note
Be especially careful with counts, as they can influence your data and
present a distorted version of the truth. To prevent this, “normalize” your
data with a calculated field.
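As a hedged illustration of what such a calculated field might do—using invented column names and numbers rather than the actual Harry Potter dataset—the sketch below converts raw counts of aggressive acts into a rate per mention, the same kind of normalization that changes the story in Figure 4.3.

```python
import pandas as pd

df = pd.DataFrame({
    "Character": ["Harry Potter", "Lord Voldemort"],
    "Aggressive Acts": [458, 128],   # invented numbers for illustration
    "Mentions": [18000, 700],        # invented numbers for illustration
})

# Normalized rate: aggressive acts per mention, analogous to a Tableau
# calculated field that divides one aggregated measure by another.
df["Aggression Rate"] = df["Aggressive Acts"] / df["Mentions"]
print(df)
```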
Recommended Reading
Context in Tableau
Annotations
Annotations are explanatory text associated with a particular point in a
visualization. They are used to call out specific marks, points, or areas in a
visualization, and can be accessed by right-clicking on the relevant mark on
the canvas. There are three types of annotations (Figure 4.4):
Mark: Select this option to add an annotation that is associated with the
selected mark. This option is available only if a data point (mark) is
selected.
Point: Select this option to annotate a specific point in the view.
Area: Select this option to annotate an area in the viz, such as a cluster
of outliers or a targeted region.
Figure 4.4 Annotations pop-up menu.
Tooltips
The option to add a tooltip can be found on the Marks card (Figure 4.5).
Clicking it will create a hover box that displays secondary information
when you point at individual marks in the visualization (Figure 4.6). Users
can interact with visualizations to learn more by accessing tooltips that
include formatted text, dynamic fields in use in the visualization, and even a
newer function called Viz-in-Tooltip, in which the tooltip displays a second
visualization layer (we’ll cover this in a later chapter).
Figure 4.5 Selecting Tooltip from the Marks card provides a dialog box to add other information.
Figure 4.6 Hovering over a mark, when tooltips are activated, is a great way to increase interactivity
and layer in additional information.
The Show Me card (Figure 4.7), which you first met in Chapter 3, can guide
you in building visualizations that best represent your data. As you choose
measures and dimensions and bring them to the shelves, the Show Me card
will display which chart types are available based on the fields you’ve
selected. Likewise, if you’re unsure of which data to bring over to the
shelves, you can use the Show Me card as a tool to select the right measures
and dimensions.
Figure 4.7 The Tableau Show Me card, opened on symbol maps.
Filters are a great way to help cut out the noise and focus only on the
variables or parameters you wish to explore (Figure 4.8). With filters, you
can strip out unnecessary information, or you can home in on specific fields
or elements critical to your data story. Like many things in Tableau, filtering
can be done in several ways and in several places. You’ll explore some of these
options when working with real datasets later in this book.
Figure 4.8 A selection of filtering options in Tableau.
Story Plot
The events of a story (or the main part of a story) form its plot, also called
its storyline. These events generally relate to each other in a pattern or a
sequence, and the storyteller (or author) is responsible for arranging these
actions in a meaningful way to shape the story.
As in other forms of storytelling, the plot of a data story may or may not be
organized into a linear sequence (Figure 4.9). Not all data stories are told in
order, but they all have one thing in common: They must be true. Data
stories are not the place to practice fiction.
Figure 4.9 The basic plot diagram.
Note
The plot diagram is an organizational tool to help map events in a story. Its
familiar triangle shape (representing the beginning, middle, and end of a
story) was described by Aristotle and later modified by Gustav Freytag,
who added rising and falling action to the diagram. Though it was originally
designed for traditional stories, data stories can be built using this same
framework.
For the purposes of data storytelling, there are eight basic “plots” to help
shape your visual data story. Can you identify the plot used in the Harry
Potter example (Hint: We were telling a story of aggressive acts over the
series)?
Change over time: See a visual history as told through a simple metric
or trend.
Drill down: Start big, and get more and more granular to find meaning.
Zoom out: Reverse the particular, from the individual to a larger group.
Contrast: The “this” or “that.”
Spread: Help people see the light and the dark, or the reach of the data
(dispersion).
Intersections: Things that cross over, or progress (“less than” to “more
than”).
Factors: Things that work together to build up to a higher-level effect.
Outliers: A powerful way to show something outside the realm of
normal.
Story Genre
The other half of story structure is its genre. Like the diversity in plot, there
is more than one genre to choose from. In fact, there are seven genres of
narrative visualization: the magazine style, the annotated chart, the
partitioned poster, the flow chart, the comic strip, the slideshow, and the
conglomerate film/video/animation (Figure 4.10). Developed by Segel and
Heer,3 these genres vary primarily in the number of frames and the ordering
of visual elements.
3. Segel, E.; and Heer, J. “Narrative Visualization: Telling Stories with
Data.” IEEE Transactions on Visualization and Computer Graphics 16, no.
6 (2010): 1139–1148. https://doi.org/10.1109/TVCG.2010.179
In Tableau, you can use dashboards and story points in each of these genres,
and we will explore how to build them in a later chapter. For now, keep in
mind that visual data stories are most effective when they have constrained
interactions at various checkpoints and allow the user to explore and engage
with the story without veering too far away from the intended narrative.
Stories unfold, and each visualization should highlight one story point at a
time (whether within the same visualization or within multiple
visualizations) as storytellers layer points to build a complete data narrative.
Curiosity is a learned skill. It takes time to develop a palate for asking the
right research questions and plucking out the relevant details from the
noise. Remember, visual data storytelling is fact, not fiction, and as such
involves a requisite degree of research as you move through visual analysis.
As you practice molding yourself into a thoughtful questioner, however,
you can use some of the same journalistic questions that help to parse out
the correct context for a story—particularly who, what, why, and how—to
make sure you build a presentation that will resonate with your audience
and give them the information they need to take action. Let’s look more
closely at these questions.
Who
It’s also important to consider the effects of your relationship with the
audience when creating your data story. Do they know you? Do they trust
you? Do they believe that you are a credible and reliable source of
information and insights? The answers to these questions are important
because they might influence how you structure your presentation as well as
any pre- or post-presentation communication. Your audience must believe
in you as an analyst and a storyteller before they will listen to your story
and be open to taking any actions you might suggest.
What
Analytics begins with understanding data—what you have, what you need,
its capabilities and its limitations. Additionally, you should have a realistic
view of your data’s quality and validity: Those characteristics determine the
data’s ability to help you answer business questions or explore a hypothesis,
and they suggest whether you should seek additional or external data to
complete your dataset for analysis. Understanding your data also requires
you to have a good grasp of how you might visually represent this data
compellingly and accurately, so that you are practicing “no harm” data
visualization as you design your narrative.
In addition to knowing the ins and outs of your data, be sure you’ve asked
enough questions to work out what your audience is asking of you, or what
story they are asking you to tell with the information you have at your
disposal. Aim to have a solid alignment of ideas between which questions
can be answered with your data and which insights or information your
audience needs or wants; otherwise, your data story will fall flat, unable to
satisfy the audience’s expectations.
Why
Every good story should prompt an action, whether you are building a story
intended to help your audience to make a decision; to cause them to change
their opinion; or to convince, persuade, or educate them. Ultimately, you
should be crystal clear about your goal in telling the story, and why your
audience should care about what you are saying. This understanding helps
both to ensure your story is meaningful and necessary and to give you a
clear target for building logical arguments toward a salient end goal.
To help crystallize the answers to the “why” part of the equation, be able to
articulate clear and concise answers to the following questions:
If you cannot readily answer most (if not all) of these questions, you might
need to revisit your purpose.
How
Finally, the communication medium and channel you use to present your
story matter. In fact, they have a number of implications for how you
deliver your story, as well as how much influence you have as a storyteller
and how your audience can interact with you as well as with the story itself.
Earlier in this chapter, we looked at some options in Tableau for keeping
context locked inside a visualization that help support narration. Although
there are many facets to explore in this step, one of the most constructive is
to consider the differences between data stories delivered as narrated, live
presentations and those that are non-narrated or otherwise “static” presentations.
Narrated
Non-Narrated
To ensure the integrity of the visuals and the story, a highly curated and
detailed view of the information is necessary. In the case of Tableau
dashboards or story points, this translates into not just well-crafted
visualizations, but cohesive, logical storylines and appropriate filters,
highlights, and other avenues to let the audience explore visuals without
degrading the story or the underlying data’s integrity. Pay attention to the
device form factor, too, as you will need to be aware of how your story is
presented across multiple devices (laptop screens, tablets, smartphones, and
so on).
Of course, public speaking in any form isn’t a prospect that excites many
people. As a statistic made humorous by comedian Jerry Seinfeld notes,
according to most studies, people’s number one fear is public speaking;
number two is death. Hence Seinfeld’s joke: The average person at a funeral
would rather be the one in the casket than the one giving the eulogy.
Here are a few tips to help you become more comfortable about going “on
stage”:
Summary
This chapter looked closely at the importance of understanding data’s
context and its role in helping visual analysts ask the right questions to
build a story framework. You learned about exploratory and explanatory
analysis and strategies for successful storytelling, including narrative flow,
considerations for spoken versus written narratives that support visuals, and
structures that can support stories for maximum impact.
The next chapter looks at the importance of choosing the right visual—or
combinations of visuals—to support your data story, as well as how to build
fundamental visualizations in Tableau.
Chapter 5
This chapter introduces the types of charts and graphs most commonly used
to visually communicate data. We discuss appropriate use cases for each
and get hands-on to create examples from the catalog of charts available in
Tableau. You will learn techniques to help you assess when to use
fundamental data visualizations and how to generate them according to best
practices, as well as helpful considerations for when to avoid certain types
of charts. We’ll also explore some of the special features available in
Tableau to help you get the most from your visuals.
All visualizations created in this chapter were made using real datasets that
are included in Tableau’s library of sample datasets available at
https://public.tableau.com/app/resources/sample-data. Additionally, datasets
are provided via the companion website for this text. We’ll primarily use
the “Netflix Movies and TV” dataset. I encourage you to download the data
so that you can follow along with the step-by-step tutorials to create each of
the visualizations discussed in this chapter.
A traditional favorite, the bar chart is one of the most common ways to
visualize data. It is best suited for numerical data that can be divided into
distinct categories to compare information and reveal trends at a glance
(Figure 5.2).
Figure 5.2 This simple, classic bar chart with highlight color shows the percentage of film ratings
across available Netflix film and television titles. For a full look at the formatting of this
visualization, see Figure 5.8.
Bar charts can be combined with maps or line charts to act as filters that
correspond to different data points as they are selected.
Multiple bar charts could be set on a dashboard to help viewers quickly
compare information without navigating several charts.
Tip
Instead of manually rearranging pills on the shelves, you can use the Swap
Rows and Columns button on the toolbar to rearrange rows and columns
and toggle between views (Figure 5.5).
There are many ways to sort data in Tableau, depending on the type of
visualization you are working with, or the area of the viz you are
formatting. When viewing a visualization, data can be sorted using single-
click options from an axis, header, or field label. Additional sorting options
include sorting manually in headers and legends, using the toolbar sort
icons, and sorting from the Sort menu.
It’s also important to pay attention to how many bars you have on a
horizontal bar chart to avoid the moiré effect (pronounced “mwa-ray”), a
type of visual interference that can create a “shimmer” when too many bars
are grouped too closely together (Figure 5.7). To reduce the moiré effect,
consider limiting your visualization to a “top 10” or equivalent
arrangement, or use highlight or alerting colors to reduce visual clutter.
Figure 5.7 Whether vertical or horizontal, too many bars in a bar chart can result in a moiré effect, a
visual illusion in which the bars appear to shimmer.
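If your data source has many categories, one way to enforce a “top 10” cut is to rank and filter before charting. The pandas sketch below uses hypothetical category names; in Tableau you would typically accomplish the same thing with a Top N filter on the dimension.

```python
import pandas as pd

df = pd.DataFrame({
    "Category": [f"Category {i}" for i in range(1, 31)],
    "Count": range(30, 0, -1),
})

# Keep only the ten largest categories so the bar chart stays readable.
top10 = df.nlargest(10, "Count")
print(top10)
```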
Once you have your bars on the canvas, you can add additional fields to
these shelves and further modify your bar chart as desired. For example,
you can adjust the color, axis and field labels, annotations, titles, and more
(Figure 5.8).
Figure 5.8 A few additional curating steps can help the data presented in this bar chart shine.
The Line Chart
Like the bar chart, the line chart (or time series chart) is one of the most
frequently used visualization types. These charts connect individual
numeric data points to visualize a sequence of values. As such, they are
most commonly used when an element of time is present—hence their
alternative title. In fact, the best use case for line charts involves displaying
trends over a period of time (Figure 5.9), when your data are ordered, or
when interpolation makes sense.
Figure 5.9 This line chart shows film and television title adaptions, by release date, at Netflix from
2000 to 2019.
Dual-axis line charts can be created by bringing two measures to the Rows
shelf, and then right-clicking on the second measure and selecting Dual-axis
from the drop-down menu. You’ll also need to synchronize the axes to
ensure that the data is not skewed. Our Netflix dataset does not include the
appropriate data to generate a proper dual axis, but you can see an example
of a dual-axis line chart in Figure 5.10.
Figure 5.10 Create a dual-axis line chart by combining two measures. This produces a line chart
with multiple lines.
Figure 5.11 Adding multiple layers to a visualization allows you to slice and dice data to derive
additional insights that may be hidden in the aggregated data.
Tip
When two or more lines are present, you can transform line charts by
adding additional chart types to deepen the insights. For example, a line
chart can be combined with a bar chart (Figure 5.12) to provide visual cues
for further investigation. Alternatively, the area under lines can be shaded
by filling the space under each respective line; the resulting area chart can
extend the analysis and illuminate each line’s relative contribution to the
whole. Trend lines, such as linear regression and forecasting, can be added
to offer even more insights.
Figure 5.12 Adjust the Marks card to help you combine chart types. This work-in-progress line chart
has been combined with a bar chart. It also includes annotations, trend lines, and a color gradient
shade element on the line to enhance insight.
We all love to hate “dessert charts,” particularly the pie chart and its cousin
the doughnut. While there are a lot of opinions about why dessert charts
make for poor analytic tools, it’s more effective to point out the substantial
amount of empirical research that provides concrete, evidence-based
reasons not to use these charts, as well as the research that has established
best practices for when we simply must use them.
While circular charts are not new (Florence Nightingale’s 1858 Coxcomb
plot serves as a prime example), and many do offer strong impact in terms
of storytelling, the truth is that humans are just not very good at reading and
understanding angles. Further, the many distortion effects caused by too
many slices (which occur with both pie and doughnut charts) create
additional comprehension issues that diminish pie charts’ likelihood of
being considered a useful analytic visualization. Even so, these charts remain
among the most misused and overused of chart types. Nevertheless, with a
few tweaks, both of these notorious chart types can be used, with discretion,
as viable options to visualize parts of a whole, or percentages.
In both types of charts, the circle represents the 100% whole, and the size of
each wedge (or slice) represents a percentage. The trick to properly reading
pie or doughnut charts is to not rely on the angle, but rather to look at area
or arc length. To avoid a bad pie chart, focus on comparing only a few
values (fewer than six is preferable, and two is best, if possible) and use
distinct color separation (or borders or white space) for maximum
readability. Doughnut charts can help clarify your data story by including a
key takeaway in the center white space (Figure 5.14).
Figure 5.14 A side-by-side comparison of an unlabeled pie chart and a doughnut chart displaying
percentages of America’s favorite pizza toppings.
The first rule of dessert charts is similar to the first rule of Fight Club:
Don’t use dessert charts. Other visualization types—including bar charts
and tree maps—are typically better suited to showcase the information
contained within a pie or doughnut chart. However, if you must use a
dessert chart, consider the following points:
Note
These pie and doughnut chart visualizations were created using a tiny
dataset on pizza toppings that is available on the companion website for this
text.
Tip
You can increase the size by holding down Ctrl+Shift (or holding down
Command+Shift on a Mac) and pressing B several times. You can also
change your view to Entire View to automatically resize the viz.
Begin by building a basic bar chart and then use the Show Me card to select
the pie chart option (Figure 5.15). You’ll note that the Marks card has
likewise changed from automatic (or bar) to pie.
Figure 5.15 Begin building a pie or doughnut chart by starting with a basic bar chart, then selecting
the pie chart option from the Show Me card.
Note
To keep this data visualization aligned with best practices, I’ve filtered
down to movies only, then further narrowed the data to show only the four
rating categories most commonly associated with films (G, PG, PG-13, and
R).
Before we go on, enlarge your pie chart to a more appropriate size. Next,
sort your wedges by first using the Sort option on the Color Marks card
menu, then sorting by Field and descending (Figure 5.16).
Figure 5.16 A few quick steps can help ensure the pie chart aligns to best practices.
From here, we could further refine this basic pie chart by sorting, adjusting
color, and so forth. However, while there is no one-click or Show Me option
to change a pie chart into a doughnut chart, a few additional steps will
transform your view:
1. First, we must duplicate the pie chart. This is achieved by creating a fake
axis on our Rows/Columns shelves. Double-click in the Rows shelf to
type there; then input a 1. This will create a SUM(1) field. From here,
click on the newly formed pill’s drop-down menu and change the
measure from Sum to Minimum.
2. Duplicate this pill either by performing the same exercise again in the
Rows shelf, or by dragging the pill with your mouse while holding the
Command key to create a copy.
3. Now that we have two identical pie charts, we will essentially use the
second pie as the middle part of the doughnut. On the second MIN mark
on your Marks card, remove any fields from Color (this should render
the viz gray). If the Size marks card includes any pills, remove them, and
then reduce the size by roughly half. When finished, your canvas should
look similar to Figure 5.17.
Figure 5.17 Transform a pie chart into a doughnut chart by first
duplicating the pie chart and then manually adjusting the color and size.
4. Right-click the second instance of the MIN pill on the Rows shelf and
select Dual Axis. This will allow the two charts to lie on top of each
other (Figure 5.18).
Figure 5.18 By using the dual axis function, we begin to layer our two
pies to form an early doughnut chart.
5. From here, we need to begin formatting the doughnut chart. First, right-
click the top axis to synchronize the axes. We can clean the view by
further removing axis headers.
6. On the Marks card for the second pie, change the color to gray, and bring
any value you’d like to display in the doughnut hole onto the Labels card
(you can also experiment with more customized text-box additions to fill
this space by using the doughnut chart in a dashboard, which we’ll cover
in a later chapter).
After a little bit of curation, your final doughnut chart should resemble the one
shown in Figure 5.19.
Figure 5.19 A complete doughnut chart with additional color work and worksheet formatting.
Sketching ideas for your graphics can facilitate the artistic process of
storyboarding the output of your visual analysis (Figure 5.20). If you can
create a vision of your story, you can use it as a guide to curate meaningful
charts and graphs. While Tableau doesn’t support sketching, these guides
can be helpful as you work to curate your visual in Tableau to tell its best
story.
Figure 5.20 Sketching stories can facilitate the artistic process of visualization and help you see your
end goal as you work to curate it in Tableau.
You create simple scatter plots by dragging a measure to the Columns shelf
and a measure to the Rows shelf. When you plot one number against
another, the result is a Cartesian chart—a one-mark scatter plot with a
single x, y coordinate (Figure 5.22).
Figure 5.22 Simple scatter plots begin with aggregated measures, showing only one mark.
To view all of your measures, deselect the Aggregate Measures option from
the Analysis menu (Figure 5.23).
Figure 5.23 Deselect Aggregate Measures to view all of your data points on a scatter plot.
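The effect of the Aggregate Measures setting is easier to see with a tiny, hedged sketch (invented values, with matplotlib standing in for the Tableau canvas): with aggregation on, the view collapses to a single mark at the summed coordinates; with it off, every underlying row becomes its own mark.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "Sales": [100, 250, 80, 300, 150],
    "Profit": [20, 60, -5, 90, 30],
})

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(8, 3))

# Aggregated: a single mark at (SUM(Sales), SUM(Profit)).
ax1.scatter(df["Sales"].sum(), df["Profit"].sum())
ax1.set_title("Aggregated (one mark)")

# Disaggregated: one mark per row, like deselecting Aggregate Measures.
ax2.scatter(df["Sales"], df["Profit"])
ax2.set_title("Disaggregated (one mark per row)")

plt.tight_layout()
plt.show()
```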
You can add depth and visual richness to a scatterplot in the following
ways:
Bring over dimensions from the Data pane onto the Marks card and use
them to add color or additional shapes onto the scatter plot.
Change the shape of the data via the Marks card to provide additional
relevance and visual cues. You can choose these shapes from a set of
sample default shapes as well as a selection of shape palettes included in
Tableau (Figure 5.25). You can also consider custom shapes, which we’ll
discuss in a later chapter.
Figure 5.25 Choose shapes from the Marks card to add depth to your
scatter plot.
Incorporate filters to reduce noise and help limit the investigation to the
factors that matter most to your analysis.
If the Rows and Columns shelves contain both dimensions and measures, Tableau will create
a matrix of scatter plots and place the measures as the innermost fields.
Thus, these measures are always to the right of any dimensions that you
have also placed on the shelves. The word innermost in this case refers to
the table structure (Figure 5.27).
Figure 5.27 A matrix scatter plot.
The bubble chart is a variation of the scatter plot that replaces data points
with a cluster of circles (or bubbles), a technique that further emphasizes
data that would be rendered on a pie chart, scatter plot, or map. This method
shows relational values without regard to axes and is used to display three
dimensions of data: two dimensions through the bubble’s location and a
third dimension through size.
These charts allow for the comparison of entities in terms of their relative
positions with respect to each numeric axis and size. The sizes of the
bubbles provide details about the data, and colors can be used as an
additional encoding cue to answer many questions about the data at once
(Figure 5.28). As a technique for adding richness to bubble charts, consider
overlaying them on a map to put geographic data quickly into context.
Proper labels and annotations can also ensure valuable quantitative
information stays firmly contextualized within the visualization.
Figure 5.28 A packed bubble chart displays data in a cluster of circles, using size and color to
encode the bubbles with meaning.
Note
Figure 5.29 A simple data table showing America’s favorite pie flavors as of 2008.
In this example, the size of the bubble represents the number of survey
responses, whereas the color of the bubble represents the flavor of pie
chosen. The circle is also labeled with the flavor.
As with most chart types, there are ways to add more insights and
quantitative context into a packed bubble chart or embellish the chart with
storytelling techniques. For example, you can use different dimensions to
encode color or adjust labels to add more information (Figure 5.31).
Figure 5.31 In-worksheet formatting for a packed bubble chart.
Note
You might recognize this image from Wake’s Pis: A Kid’s Guide to
Delicious Data Stories. For more of Wake’s work, check out
www.tableau.com/about/blog/2016/6/viz-long-and-prosper-how-one-young-
trekkie-telling-stories-his-data-55767.
The tree map also provides a much more effective way to see this
relationship when working with large amounts of data because it makes
efficient use of space. It is ideal for legibly showing hundreds (or perhaps even
thousands) of items simultaneously within a single visualization.
Note
The tree map data used in this section is from a Rutgers University cyber-
bullying research study, whose findings were presented in part at the 80th
Annual Meeting of the Association for Information Science and Technology
in 2017. Further presentations of this data were given at the 2017 Tableau
Tapestry Conference, and at the 2018 conference of the International
Bullying Prevention Association.
Use dimensions to define the structure of a tree map and measures to define
the size (or color) of the rectangles.
In this example, we are using survey data to create the tree map and
examine how many respondents selected each of the options presented.
Both the size of the rectangles and their color are determined by the value
of Response ID—the greater the sum of unique responses for each category,
the darker and larger its box (this is further clarified by the color legend at
right). Although this dataset does not include any “negative” values,
Tableau has automatically selected a diverging color palette, rather than a
sequential palette. Here, the color cues still make it easy to quickly
distinguish the top responses from the lowest.
Size and color are crucial elements in tree maps. You can modify a tree map
by adjusting how color is utilized. For example, in Figure 5.35, the count of
Response ID has been removed from the Color shelf and replaced with
Grade (6–12); now, rather than seeing total responses, we can review the
responses by grade level to see how student opinions differ by age. In the
revised tree map, Grade determines the color of the rectangles and the count
of Responses still determines the size of rectangles, allowing us to see top
responses per grade. This refinement of the view would help cyberbullying
researchers identify which responses may be most appropriate by grade
level.
Figure 5.35 Modify elements on the Marks card to adjust the elements of color and shape in a tree
map.
Here are some tips for tailoring this type of visualization to your audience:
Note
The heat map visualizations in this chapter were created from a large,
collected dataset on character aggression in the Harry Potter series, the
culmination of which was presented at the 2017 Harry Potter Academic
Conference hosted by Chestnut Hill University. The full dataset is available
at the companion website for this text.
Building a heat map in Tableau takes a few more clicks than with some of
the other charts discussed (Figure 5.37):
Place one (or more) dimensions onto the Columns shelf and one (or
more) dimensions on the Rows shelf.
Select Square as the mark type on the Marks card.
Place a measure on the Color shelf.
Figure 5.37 Building a heat map.
Note
Since this is quite a large dataset, in Figure 5.37 I have already manually
sorted the order of books (you can see the Sort icon on the Book pill) and
filtered the number of characters (you can see the Name pill on the
Filters shelf).
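Under the hood, a heat map is just a crosstab whose cells are colored by a measure. As a rough analogy—with invented field names standing in for the actual Harry Potter dataset—the pandas sketch below shapes row-level records into the character-by-book grid that the Color card then shades.

```python
import pandas as pd

events = pd.DataFrame({
    "Book": ["Book 1", "Book 1", "Book 2", "Book 2", "Book 2"],
    "Name": ["Harry", "Voldemort", "Harry", "Harry", "Voldemort"],
    "Aggression": [1, 1, 1, 1, 1],
})

# One row per character, one column per book, cell = count of aggressive
# acts. This is the grid that Color on the Marks card shades in Tableau.
grid = events.pivot_table(
    index="Name", columns="Book", values="Aggression",
    aggfunc="sum", fill_value=0,
)
print(grid)
```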
There are a few more steps to curate this heat map. The preceding example
uses the default blue gradient color palette. However, other color palettes
might be more appropriate, depending on your data. For example, Figure
5.35 shows the use of a red-gold gradient scheme to progressively darken
the cell color in line with characters’ aggressive action counts. You can
access the Colors box in the Marks card, and then select Edit Colors to open
the Edit Colors dialog box (Figure 5.38). From here you can select another
color palette from the drop-down menu—either a gradient palette or a
diverging palette.
If you select the Use Full Color Range check box for a diverging option,
Tableau will assign the starting number a full intensity and the ending
number a full intensity.
If you don’t select the Use Full Color Range check box, Tableau will
automatically assign the color intensity as if the range were from –100 to
+100, maximizing the color contrast as much as possible.
Figure 5.38 Use the Edit Colors dialog box to select an appropriate color scheme for a heat map.
Figure 5.39 Adding borders to colored cells helps to distinguish individual cells in the view.
Recommended Reading
Check out the Tableau white paper Which Chart or Graph for additional
information: www.tableau.com/learn/whitepapers/which-chart-or-graph-is-
right-for-you.
Summary
These exercises have walked us through most, but not all, of the
visualization types available natively on the Tableau Show Me card. There
are still two more important visualization types to discuss. We’ll cover two
types of fundamental maps in the next chapter.
Chapter 6
Fundamental Maps
Note
All visualizations created in this chapter were created using a Centers for
Disease Control and Prevention (CDC) collected dataset on Lyme disease
case counts by county from 2000 to 2015. This Lyme disease dataset is
publicly available from the CDC at www.cdc.gov/lyme/stats/index.html.
While maps can be a great way to tell a story about your data, remember
that they are a type of visualization and do have an appropriate use case.
Depending on the question you are trying to answer or the insight you are
trying to communicate, another chart type might be a more appropriate fit.
Before you begin building a map, take a careful look at your data, your
analysis, and your story. Maps, as Tableau explains, should answer
questions with both an appropriate data representation and an attractive
data representation. As a storytelling device, maps can be particularly tricky
in their tendency to mislead or inadvertently cause people to misinterpret
the data or to dictate a not-quite-true story.
Note
Although you are already familiar with connecting to data in Tableau at this
point, geographic data comes in many shapes and formats. For this reason,
it is useful to walk through this step of the process again within the context
of mapping and discuss where geodata nuances might affect the process as
you prepare to work with geographic data.
Note
Newer versions of Tableau Desktop can connect directly to spatial files (such
as shapefiles and geoJSON files). However, following the precedent
established in this book, the examples in this chapter demonstrate
connecting to data in Excel.
After connecting to your data source, you might need to take a few more
steps before your geographic data is fully prepared for analysis in Tableau.
These steps will not always be necessary to create a map and might differ
depending on your data and the type of map you intend to create. In almost
all cases, geographic fields should have a data type of string, have a data
role of dimension, and be assigned the appropriate geographic roles. There
is one exception: Latitude/longitude should have a data type of number
(decimal), have a data role of measure, and be assigned the Latitude and
Longitude geographic roles.
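If you were preparing the same kinds of fields in code rather than in Tableau, the type assignments would look roughly like the hedged pandas sketch below; the latitude/longitude values are invented for illustration, since the CDC file supplies locations as State and County names.

```python
import pandas as pd

geo = pd.DataFrame({
    "State": ["New Jersey", "Connecticut"],
    "County": ["Morris", "Fairfield"],
    "Latitude": ["40.86", "41.27"],     # invented coordinates
    "Longitude": ["-74.55", "-73.39"],  # invented coordinates
})

# Geographic names stay as strings (dimensions in Tableau terms)...
geo["State"] = geo["State"].astype("string")
geo["County"] = geo["County"].astype("string")

# ...while latitude/longitude become decimal numbers (measures).
geo["Latitude"] = geo["Latitude"].astype(float)
geo["Longitude"] = geo["Longitude"].astype(float)

print(geo.dtypes)
```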
Let’s practice adjusting data types for geographic data in the CDC dataset.
This simple dataset has two geographic fields: State and County. Tableau
has correctly identified these data types as string; however, clicking on the
field and looking at the geographic roles reveals that none has been
assigned (Figure 6.3). You might need to assign or edit the geographic role
assigned by Tableau. In this example, two things must be done: assign the
State geographic role to the State field, and assign the County geographic
role to the County field.
After you make this adjustment, the data type icon will change to a globe,
indicating that the field now has a geographic role assigned (Figure 6.4).
Further, the icon designated in blue indicates that Tableau has assigned this
field as a dimension. This is correct.
Figure 6.4 The globe icon reflects the geodata field assignment in Tableau.
As one more data preparation step, notice that the field for County Name
includes both the name of the county and the word “County.” This extra
information will prevent Tableau from recognizing the county names, but it
provides a great opportunity to use the “split” feature available when right-
clicking this data field (Figure 6.5).
Figure 6.5 Data can be “split” in the preparation process using the split feature.
Since we need to split the data using a consistent space between the county
name and the word “County,” we can simply use Split. For the sake of
illustration, we can also choose Custom Split, which enables you to input
the separator as well as the desired number of splits (Figure 6.6). Note that
for this function to work as expected, all separators must be consistent.
Figure 6.6 For custom splits, choose the separator and the number of splits as desired.
Splitting the data will result in two new columns at the end of the dataset.
For clarity’s sake, hide your original column, then rename and verify your
data type and the geographic roles of the new columns. Since this is data
we’ve effectively created within Tableau, you’ll see an = in front of the new
column, similar to newly created calculated fields on the Data pane (Figure
6.7).
Figure 6.7 Splitting data results in new columns that should be reviewed to ensure they are properly
prepared for analysis.
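For comparison, here is what the same split might look like outside Tableau, assuming a column literally named County Name with values such as “Morris County” (a hedged sketch, not the book’s workflow):

```python
import pandas as pd

df = pd.DataFrame({"County Name": ["Morris County", "Fairfield County"]})

# Mirror Tableau's Split: break on the space, which yields one column per
# piece ("Morris" / "County"). Keeping the first piece assumes single-word
# county names.
pieces = df["County Name"].str.split(" ", expand=True)
df["County"] = pieces[0]

# More robust alternative for multi-word counties (e.g., "Los Angeles County"):
df["County Clean"] = df["County Name"].str.replace(r"\s+County$", "", regex=True)
print(df)
```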
When you assign the correct geographic role to a field in Tableau, the
software will also assign a latitude and longitude to each location. It does so
by finding a match that is already built into the geocoding database installed
with Tableau Desktop. These latitude and longitude fields will be displayed
on the Data pane as measures and are how Tableau knows where to plot
your data locations as you begin building a map (Figure 6.8). Note that in
some advanced maps, you might elect to have your latitude and longitude
coordinates be dimensions; these should be considered special uses and are
not covered here.
Figure 6.8 When Tableau recognizes geodata, latitude and longitude fields are automatically
displayed as measures on the Data pane.
In the Tableau worksheet space, if you have more than one level of
geographic data in your dataset, you can create geographic hierarchies.
While these are not critical to creating a map, geographic hierarchies allow
you to quickly drill into the levels of detail your data contains. Because this
dataset has both State and County, you can create a hierarchy using these
two fields. State is the larger field in the hierarchy, so let’s begin there.
Right-click the State field in the Data pane and select Hierarchy > Create
Hierarchy. A dialog box appears that prompts you to name the hierarchy schema, such
as Location. Enter a name and click OK.
A new field now appears in the Dimensions pane with the name of the
hierarchy just created. The highest-level geographic data used to create the
hierarchy—in this example, state—appears as the first rung in the hierarchy.
To add additional fields, simply drag and drop them into the hierarchy, placing
them in the correct order. Repeat as necessary until all geographic fields are
included in the hierarchy. Figure 6.10 shows that County has been added to the
hierarchy below State.
Tip
Any data that follows a hierarchical structure (e.g., dates: year, quarter,
month, day; product: category, product, sub-product) can be grouped into
hierarchies to enable drill-down and drill-up functionality.
Proportional Symbol Map
Proportional symbol maps are useful ways to show quantitative values for
individual locations. They can show one or two quantitative values per
location and can be encoded with visual cues like size and color. The
proportional symbol map displaying the number of analytics academic
programs across the United States shown in Chapter 1 is a great example of
a symbol map (Figure 6.11).
Figure 6.11 This symbol map shows the number and type of academic analytics programs available
in the United States.
Note
You can download this public, and constantly updated, dataset from
https://github.com/ryanswanstrom/awesome-datascience-colleges.
Let’s create a new map together using the Lyme disease dataset. The first
step is to give Tableau geographic coordinates to work with and lay the
foundation of the map. Double-click the Latitude and Longitude generated
fields under Measures. Latitude is added to the Rows shelf and Longitude to
the Columns shelf. Initially, a blank map view is created (Figure 6.12).
Figure 6.12 The first step in building a map visualization is to display the Latitude and Longitude
coordinates to generate a blank map.
Tip
Navigating maps on the Tableau canvas can be a little tricky. Use the
controls shown in Figure 6.13 to help.
Next, drag out the dimension that represents the location you want to plot
your map by and drop it on the Details card. From the hierarchy group in
this dataset, I’ve brought over County to look at Lyme disease cases at a
more granular level. Thus, a lower level of detail is added to the view
(Figure 6.14).
Figure 6.14 Add dimensions to the Detail Marks card to begin populating the data displayed on the
map.
With a level of detail now on the map, the next step is to bring over the
Measure to encode size. In this example, I am interested in seeing the
number of Lyme disease cases per location, so I can simply bring the Total
Number of Cases to the Size Marks card. With the size of the bubbles
representing the number of Lyme disease cases in each county, we can
visualize the range of values more clearly (Figure 6.15).
Figure 6.15 Adding detail to the Size Marks card can enhance the ways symbols appear on the map
and encode additional data.
This is the basis of a proportional symbol map. The larger data points
represent the locations with a larger total number of Lyme disease cases,
and the smaller data points represent the locations with fewer cases.
At this point, your map should look similar to the one displayed in Figure
6.16. However, depending on which additional data you may have in your
dataset, a few more tweaks can help to make the data in your map shine.
Try the following:
For data with categories: Sort your categories in an order that makes
logical sense.
Color: Adjust color and opacity, and add borders/halos as appropriate to
clarify overlapping marks.
Tip
Though we’ve accurately represented the data in terms of the total count of
Lyme disease cases by county, we have failed to consider an important
piece of context that’s critical to mapping data. Context is everything. With
maps, keeping population sizes in context is especially important. You
might need to “normalize” your data with a calculated field to ensure you
are looking at populations in the context of their geographic locations.
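As a minimal sketch of that normalization, assuming you have added a hypothetical Population field for each county (for example, from census data), a calculated field for cases per 100,000 residents might look like this:

SUM([Total Number of Cases]) / SUM([Population]) * 100000

Sizing or coloring the map by this calculated field, rather than by the raw count, keeps heavily populated counties from dominating the view simply because more people live there.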
Choropleth Map
A choropleth (or filled) map is a great tool for showing ratio or aggregated
data. These maps use shading and coloring within geographic areas to
encode values for quantities in those areas. A dataset for choropleth maps
should include both quantitative and qualitative values, along with location
information recognizable by Tableau.
Let’s take our current symbol map and transform the data into a choropleth
map. From here, we can simply use the Show Me card and select the
choropleth map option to reimagine our data (Figure 6.17).
Figure 6.17 This is a nice example of why a choropleth can be a better alternative than a symbol
map. There is simply too much data to show in individual points, but all of the data is necessary for
accurate analysis.
Notice that the default aggregation type is SUM; however, this might not be
the best fit depending on your data. Take a moment to verify that the field
should be aggregated as a sum (because this is a count of disease cases
reported, a sum is appropriate).
For a clearer view, I’ve adjusted the map to show just the contiguous 48 states
in Figure 6.18.
Figure 6.18 Color choice on a choropleth map is important and should follow the color practices
described earlier in this text.
Note
The level of detail specified in the map as well as the color distribution
specified for the polygons affects how the data is represented and how
people will interpret the data. In some cases, stepped color might be more
appropriate.
Map Layers
Of the many customization features for maps in Tableau, one of the most
interesting is the choice of the built-in map background style to adjust the
background of your map. The three background options offered in Tableau
are Normal, Light (the default), and Dark. Figure 6.19 shows each
background option. Tableau also offers street, outdoor, and satellite views
for more detail, depending on the drill-down level represented in the map.
Figure 6.19 Three standard map backgrounds available in Tableau.
To select a Tableau map background style, choose Map > Background Maps
or use the formatting pane to adjust the background style (Figure 6.20).
Figure 6.20 You can adjust map backgrounds and other formatting stylistics in the Background
Layers pane.
You can also experiment with importing your own background map, adding
a static background map image, and adding or subtracting map layers by
data layers. Learn more at https://help.tableau.com/current/pro/desktop/en-
us/maps_marks_layers.htm.
Additional map layers are available depending on the zoom level of the
map (Figure 6.21).
Figure 6.21 You can add additional levels of map detail depending on the zoom level of your map.
Visualizations are not neutral. Maps, like any storytelling device, can
mislead audiences if they aren’t designed correctly and honestly, and they
can be customized for the audience. Google Maps does this by adjusting
border lines and views for disputed territories. For example, Russian users
see Crimea marked off with a solid line indicating that the area belongs to
Russia, while for Ukrainian users the solid line is replaced with a dashed
stroke indicating that the peninsula belongs to Ukraine. Everyone else,
including people in the United States, sees a hybrid line that reflects
Crimea’s disputed status (Figure 6.22).
Figure 6.22 This Google Maps version of the Crimea border is intended for a U.S.-based audience
and shows a hybrid line that reflects the border’s disputed status.
Additionally, the manner in which we use shapes and colors to encode data
that represents people can be tricky on a map. One map of poverty in
Minnesota was recently changed from representing people as red dots, which
made the map look like it was covered by an angry red swarm, to a purple
gradient that reads as less aggressive (Figure 6.23).
Figure 6.23 The initial, unfortunate design choice to represent population was later adjusted to a
more neutral, and less offensive, approach.
These examples, and many more, speak to the importance of paying special
attention to how our assumptions, intuitions, and biases—or even the things
we might not consider—affect how we build visualizations to tell stories
about people and places. Check out this article for more information:
https://source.opennews.org/articles/when-designer-shows-design/.
Summary
Note
Read more on our shared visual human history, including a study of visual
communication from cave drawings through advanced visualizations, in
The Visual Imperative: Creating a Visual Culture of Data Discovery (Ryan,
2016).
With that kind of retention power, one can easily understand how our visual
capacity demands careful attention when working with data. We want to
make sure that we are curating visual analytics as we apply design elements
that leverage our cognitive abilities and help us better “see” our data
insights.
As you might have noticed when we were building basic charts and maps
presented on the Show Me card, you can format just about every visual
element you see in Tableau. Choosing the right formatting is important to
your analysis and your presentation, and Tableau allows analysts to
customize the formatting for almost everything on a worksheet, including fonts, alignment, shading, borders, and lines.
All of these changes can be accomplished by using the Format pane and the
Marks card in combination. Likewise, you can specify format settings for
the entire worksheet, all rows, or all fields, or you can format individual
parts of the view. Further, formatting choices can be applied at the
worksheet or workbook level. This chapter takes a closer look at a few of
the most important visual design building blocks and explores how you can
use Tableau functionality to embed these visual cues purposefully and
intuitively as you format your visualizations:
Color
Lines
Shapes
Note
Before curating your visualization, you must first understand and explore
the data, and represent it using the chart or graph best suited for your data’s
story. From there, you can apply visual cues that enable you to intuitively
and meaningfully communicate the desired insights to your audience.
Chapter 5, “Fundamental Data Visualizations,” and Chapter 9, “Beyond
Fundamentals: Advanced Visualizations,” cover selecting and building the
best charts.
Color
Color is one of the most important, most complicated, and most frequently
misused elements of data visualization. When used well, color can enhance
and clarify a visualization; when used poorly, it can confuse, misrepresent,
or obstruct clear communication. Color is such a critical element in
representing data visually that Tableau employs color scientists to help
design the best color palettes and to provide deep education on the
appropriate use of color. This section covers best practices for properly
applying color to a visual that aligns to its data and your story. While we
can only scratch the surface of color application within the scope of this
book, this is an area in which you should invest more time learning.
All marks in Tableau have a default color, even if no fields are placed on
the Color Marks card. For most marks, blue is the default color. For text,
gray is the default color. We’ll explore how to use the Color Marks card
when looking at how it is used in different types of visualization.
Note
The visuals in this section use the Global Superstore dataset, which is
available from the Connect to Data screen in Tableau.
Tableau applies color depending on the field’s values. For discrete values,
or dimensions, Tableau typically uses a categorical palette; for continuous
values, or measures, it assigns a quantitative palette. These translate into the
three primary ways to encode data with color in data visualization:
Sequential for continuous, quantitative values
Diverging for continuous, quantitative values
Categorical for discrete values
Sequential Color
Sequential color encodes a quantitative value from low to high using
gradients of a single color and is applied when either all values are positive
or all values are negative. A great example of this is sales data, which go
from zero to infinity. The map in Figure 7.4 uses a sequential color scale to
encode positive sales amounts into each U.S. state.
Figure 7.4 This map uses sequential color to show the sum of sales from least to greatest. The darker
the blue, the higher the sales are.
Diverging Color
Figure 7.6 uses a diverging color palette to display profit by state (top) and
profit by product category and subcategory (bottom). Positive profit is
colored blue, with darker blue reflecting higher profit. Profit could also be
negative, and in this visual, negative values are encoded in orange, with the
darker orange reflecting a bigger loss.
Figure 7.6 Diverging color palettes display positive and negative values using gradient shading of
two contrasting colors to encode values.
Diverging color palettes are clearly designated in the Tableau color library.
Beyond changing the colors themselves, you can adjust the midpoint.
Midpoints do not have to be zero. They could be the average, such that
color values are then above or below average, or a target, such that color
values then exceed or fall below that point.
Stepped Color
In addition to changing the range of colors, you can group values into color-
coded bins using stepped colors. Use the up and down arrows to specify
how many bins to create (Figure 7.8).
Figure 7.8 Rather than use gradient shading, you can use stepped color palettes to distinguish colors.
Reversed Color
If it makes sense to do so, you can select the Reversed option to reverse the
order of colors in the range. For sequential colors, this means darkening the
intensity for the lower values rather than for the higher values. Likewise,
for diverging colors, this means swapping the two colors in the palette in
addition to reversing the shades within each range.
Categorical Color
Color Effects
Opacity
Mark Borders
Tableau automatically displays all marks without borders; however, you can
turn borders on for all mark types except text, line, and shapes.
Mark Halos
Mark halos can assist in making marks more visible, particularly on maps.
They surround each mark with a ring of contrasting color (Figure 7.13).
Figure 7.13 Mark halos create a “ring” around a data point and increase its visibility and separation
from other marks on the viz.
Pre-attentive Colors
Color should be used not just strategically, but sparingly. Research says that
the human brain can differentiate approximately eight colors at a time, but
best practices suggest that using simple color palettes of five or fewer
colors reduces the stress on a user to decipher the meaning of color in a
visualization. Many data visualizations never require more than two to three
colors, including gray.
Similarly to highlighting, you can use alerting colors to draw the audience’s
attention to a particular data point. Using the same line chart as in Figure
7.14, rather than highlighting the high profits of the technology category,
suppose the goal is to alert the audience to the low profits in the furniture
category (Figure 7.15). In this use case, alerting is done with an alarming or
alerting color, such as red or orange, to indicate to the audience that
something is wrong.
Figure 7.15 Using an alerting color, such as red, drives attention to one mark on a viz without the
need for additional labeling.
It’s important to note that in Western culture, red is often associated with
negative values or connotations. However, this is not always consistent with
color culture in other countries, such as China. Bright alerting colors could
be red, orange, or yellow.
Colors hold meaning, even when they are not attached to data. A certain
amount of psychology is involved in understanding color, as well as cultural
connotations, tone, and differences in how people see and interpret color.
Designing with the potential for color vision deficiencies in mind is such a
large consideration that it demands its own section. Most of us are familiar
with the “traffic light” palette, where red is stop (or bad), green is go (or
positive), and yellow or orange (or white as a midpoint) means to proceed
with caution.
Figure 7.16 A red–green image is virtually unreadable to someone with deuteranopia, a red–green color
vision deficiency.
Tip
To experiment with how your own visuals might appear to someone with
various types of color vision deficiencies, visit www.vischeck.com.
Figure 7.17 An orange–blue color palette is best suited to mitigating any color deficiency issues.
Lines
Lines have several purposes in data visualizations. They act as guides, they
reinforce patterns, they provide direction, and they create shapes. As with
any visual element, too many lines—or lines given too much emphasis—
can cause distraction or confusion. However, used wisely, they can be
transformative. Like color, lines should be used sparingly to reduce the
amount of ink onscreen so that the data can lead the story.
This section covers ways to format lines within individual visualizations in
Tableau and make effective use of them as view lines (axis lines, reference
lines, and so on), borders, and shaded bands. It also looks at how lines can
affect visualizations in the form of gridlines, axis rulers, and panes.
To format lines in Tableau worksheets, select the Format menu, then the
part of the view that you want to format (Figure 7.19).
Figure 7.19 Access the line formatting options via the Format menu.
You can also right-click on your sheet and select Format (Figure 7.20).
Figure 7.20 You can also right-click the worksheet and select the Format option to access the line
formatting options.
Either of these methods opens the Format pane as a new tab in place of the
Data pane, with icons to help direct formatting of individual elements in the
worksheet (Figure 7.21).
Figure 7.21 Once activated, the Format Borders pane opens in place of the Data pane.
Note
By default, most chart types in Tableau include gray axis lines, zero lines,
drop lines, and borders. If added, reference lines that help you analyze
statistical information in the data are also gray by default.
Grid lines, zero lines, and drop lines connect marks to the axis. You format
them using the lines icon on the Format pane, and you can adjust them by
sheet, row, or column (Figure 7.22).
Figure 7.22 Format grid lines, zero lines, and other reference lines using the Lines icon on the
Format pane.
Figure 7.23 is the default view of a sorted bar chart of the top 10 girls’ baby
names in the United States from 1910 to 2012. (Titles/subtitles are
formatted by double-clicking on the default title line.) We will use this
visualization to experiment with formatting lines.
Figure 7.23 A simple bar chart of top baby names, without line formatting.
To create this chart, follow these steps:
In this example, I removed all grid lines, zero lines, reference lines, and drop
lines. I also removed axis ticks and row axis rulers, but kept the column
axis rulers for reference and reformatted them as dotted lines (Figure 7.24).
Figure 7.24 You can format lines at the sheet, row, and column levels.
The resulting visualization is much cleaner, with only one line at the bottom
of the x-axis, as shown in Figure 7.25.
Figure 7.25 A simple bar chart of top baby names, with line formatting.
We could take this visual a step further, based on our previous discussion of
color, and eliminate unnecessary color, reserving it to highlight the most
common girl’s name of the time period under review (Figure 7.26). In this
slightly modified (and updated) visualization, I’ve also adjusted the column
zero line to be a thin gray line so as to better separate the variables.
Figure 7.26 Taking the formatting a bit further provides a clearer message with the visualization.
Formatting Borders
Borders are the lines that surround visualizations, demarcating the table,
pane, cells, and headers. You can specify the border style, width, and color
to format the cell, pane, and header areas using the grid icon on the Format
pane (Figure 7.27).
Figure 7.27 Format borders via the Grid icon in the Format pane.
Returning to the bar chart, I have added orange row dividers as borders to
show how they appear when formatted (Figure 7.28). Notice that because I
changed the format of the axis line to a dotted line, it now appears as
colored dots.
Figure 7.28 This figure shows an example of border formatting for row dividers.
Row and column dividers are most commonly used in nested data tables,
because they serve to visually break up a view and separate data fields,
especially when several levels of data exist. Figure 7.29 is the default view
of a nested data table reflecting the top 10 girls’ and boys’ baby names from
1910 to 2012. (Titles/subtitles are formatted by double-clicking on the
default title line.)
Figure 7.29 A basic nested data table, without additional formatting.
Using the Format Borders function, you can modify the style, width, color,
and level of the borders that divide each row or each column by using the
row and column divider drop-down menus. The level refers to the header
level you want to divide by. At the highest level, all fields are divided (as
shown in Figure 7.29, which is divided by Top Name at the highest level).
The resulting table, shown in Figure 7.31, is both cleaner and easier to read.
Figure 7.31 A simple data table, with some considerate line formatting.
At the intersection of lines and color, you can use shading to set a
background color for the entire visualization or for various areas of
importance, such as headers, panes, or totals.
Shaded areas are a technique commonly used to create banding, where the
background color alternates from row to row or from column to column.
This technique is useful for tables (as shown in Figure 7.31) because it
helps the eye distinguish rows or columns more intuitively than added
superfluous lines. To format shading and banding, you use the paint can
icon on the Format pane (Figure 7.32).
Figure 7.32 Format shading via the paint can icon on the Format pane.
The nested table in Figure 7.31 has row banding by default. If desired, you
could change this choice by sliding the band size to zero (Figure 7.33).
Figure 7.33 You can adjust row banding manually on the Format pane.
You can explore additional banding options by interacting with the various
settings on the Format pane; as a guide, practice adjusting the pane and header
shading, band size, and level for both rows and columns.
As an example, reconsider Figure 7.25. You can see the field labels for Top
Name rows, as well as the axis header for Occurrences directly below the
count. Does your audience need this duplicated information, or can we trust
them to infer the fields without an additional header? If the latter is true,
consider right-clicking on the field label for Top Name and removing it.
Next, to remove the axis header, right-click to Edit Axis and remove the
title by erasing the text in the field. The resulting visualization is much
simpler (Figure 7.34).
Figure 7.34 A modified view of a simple bar chart after eliminating redundant headers.
For additional simplification, you can remove the x-axis entirely, and label
the individual bars instead (Figure 7.35).
Figure 7.35 With some final polish and curation, this bar chart is data-rich and ink-minimal.
Figure 7.36 shows before and after views of the original and finished
versions of this chart.
Figure 7.36 Before and after views of the Top 10 Girls’ Names bar chart.
Tip
Reinstating a previously removed header can be a bit of a trick in Tableau.
To unhide a header, select Analysis > Table Layout. You can also unhide
any header from the rows or columns by right-clicking the pill and using its
Show Header check box to toggle the header’s display on or off.
Shapes
Recognizing shapes is one of the time-saving techniques our brains use to
find patterns: we immediately group similar objects and separate them
from those that look different. Some chart types, such as packed bubble
charts, use shapes (along with size and color) to encode meaning.
Additionally, we can use shapes in interesting ways to personalize data
stories in Tableau. The two ways to use shapes in Tableau are with the
Shape Marks card and with custom shapes.
The Shape Marks card feature allows you to assign different shapes to data
marks. Dropping a dimension on the Shape Marks card prompts Tableau to
assign a unique shape to each member in the field, as well as display a
shape legend (Figure 7.37). Using the Size Marks card allows you to
enlarge or reduce the size of each shape mark.
Figure 7.37 The Shape Marks card allows you to use shapes to encode categories, a helpful technique
on a crowded scatter plot.
You can edit this default palette and assign a different palette from the
library of shape options within Tableau. Choices include a variety of shape
palettes, arrows, weather symbols, and KPI metrics. To edit the shape
palette assigned to your data, click the Shape Marks card and select Edit
Shape. A dialog box, similar to the Color dialog box, appears that allows
you to select a new palette as well as manually assign shapes to each data
item (Figure 7.38).
Figure 7.38 The Edit Shape dialog box.
Custom Shapes
If none of the palettes in the Tableau library appeals to you or is suitable for
your dataset, you can add custom shapes into your Tableau environment for
use in your workbooks. Custom shapes can add a nice design touch to your
visualization, particularly when you are building a narrative or working to
create engagement or visual appeal.
This function requires accessing the Tableau Repository on your machine.
To add custom shape palettes into the Tableau library, follow these steps:
1. Create your image files. Each shape should be its own file, and most
image formats (including .png, .gif, .jpg, .bmp, and .tiff) are acceptable.
(Tableau does not support symbols in .emf format.)
2. Copy the shape files to a new folder in the My Tableau Repository >
Shapes folder on your computer. The name of the folder will be the name
of the new palette in Tableau.
Note
If you plan to use color to encode shapes, use a transparent background
in your image file (.png). Otherwise, the entire square of the symbol
thumbnail will be colored, rather than just the symbol itself.
Figure 7.39 shows that I have added two new palettes, Harry Potter and
Hogwarts House Crests, to my shape library.
Figure 7.39 You can manually add shape palettes to Tableau’s shape library by dropping them into
the Shapes folder in your Tableau Repository.
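On a default Windows installation, the resulting folder structure might look something like the following (the folder and file names here are purely illustrative):

Documents\
  My Tableau Repository\
    Shapes\
      Hogwarts House Crests\
        gryffindor.png
        hufflepuff.png
        ravenclaw.png
        slytherin.png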
When you return to Tableau, you will see the new palettes included in the
Shape Palette library in the Edit Shape dialog box. If you modified the
shapes while Tableau was running, you might need to click Reload Shapes
(Figure 7.40).
Figure 7.40 Manually added shapes now appear in the Edit Shape dialog box.
You can assign these new shapes in the same manner as you do any shape
palette within Tableau.
Tip
For tips on creating custom shapes best suited for use in Tableau, see the
helpful article at www.tableau.com/drive/custom-shapes.
Image Roles
URLs must navigate to image files with .png, .jpeg, or .jpg file
extensions.
Each URL should begin with http or https as the transfer protocol. (If
this information isn’t included, Tableau will assume https.)
As many as 500 images can be loaded per field.
Each image file should be smaller than 128 kilobytes.
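A source field that meets these requirements might look something like this (the values are purely illustrative):

Character            Image URL
Harry Potter         https://example.com/images/harry.png
Hermione Granger     https://example.com/images/hermione.png

Assigning the Image Role to the Image URL field tells Tableau to render the linked image rather than the URL text.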
With this data role selected, you can drop the corresponding pill onto your
worksheet canvas to see the images along with their associated data in your
visualization (Figure 7.42).
Figure 7.42 A visualization that now includes images associated with data using image roles.
Note
All data lacking a URL has been excluded for illustration purposes.
Summary
This chapter moves beyond the basics of visual analytics to take our first steps
in architecting outputs of analysis in visual data dashboards and stories. We’ll
begin by taking a closer look at how to prepare data in Tableau and what to do
about messy survey data—a common experience for data storytellers. Then
we’ll start building data dashboards and stories that incorporate features such
as filters, annotations, and highlights to present compelling, meaningful, and
actionable outcomes of visual analytics.
Like its predecessor versions, Tableau 2022 includes built-in data preparation
capabilities that help make reshaping data a smoother and less labor-intensive
experience than doing it by hand (or using the no-longer-supported Tableau
Excel add-in, which worked only for Windows-based licenses). While a full
course in data preparation is beyond the scope of this book, before getting into
a specific data prep exercise using survey data, we’ll review some of the basic
data preparation tools included in Tableau. We took a superficial look at Data
Interpreter in Chapter 3, “Getting Started with Tableau,” but this section takes a
more in-depth look at how it can be used with a real dataset.
The Significant Volcanic Eruptions dataset is a relatively clean dataset that can
be found in Tableau’s library of sample datasets at
https://public.tableau.com/app/resources/sample-data. For instructional
purposes in this text, I’ve introduced some common formatting errors into the
dataset by combining the two tables included in the Excel workbook.
Note
To “turn on” Data Interpreter, you need only click its check box. Doing so
prompts Tableau to run the interpreter tool, and the contents of the preview
pane update accordingly. You can see that the extraneous headers have been
stripped out, and the columns are now properly identified (Figure 8.2).
Figure 8.2 With one click, Data Interpreter has helped prepare data for analysis.
There are still several issues to resolve in this dataset, but we can already see
some improvement, as some blank rows have been removed from the preview
and column names identified. To explore the specifics of what Data Interpreter
has done to the data, you can click the blue Review the Results hyperlink. This
will open an Excel file that includes a key describing the changes and that
reflects the changes in individual sheets so you can efficiently identify work
performed by Data Interpreter (Figure 8.3).
Figure 8.3 Data Interpreter provides a “marked” Excel workbook that details the changes made in the
data.
Note
Data Interpreter is not available if the data contains more than 2,000 columns
or more than 3,000 rows and 150 columns. Likewise, this tool supports only
Excel (.xls or .xlsx), Text (.csv), PDF, and Google Sheets files. If Tableau
Desktop does not identify unique formatting issues or extraneous information
that Data Interpreter is designed to adjust, the option will not be available.
If you click through the sheets, you can see which fields are being used as
headers, in orange, and which are considered data, in green (Figure 8.4).
Figure 8.4 The data is color coded in the marked file.
For this sample dataset, Data Interpreter has handled a good portion of the
basic clean-up for us. However, before moving forward with this dataset, we
need to resolve the nulls shown in the preview pane. The handling of nulls in
Tableau is a dynamic and important step in the analysis process because null
values—and how they are or are not used—can have a significant effect on the
quality of your analytical work and visual outputs.
A null value is a field that is blank and signifies missing or unknown values.
When a measure contains null values, they are usually plotted as zero.
However, doing so can affect the analysis if these values are meant to be blank,
or not applicable, rather than numerically quantified at zero. Depending on the
reason for the null values, you may want to count them as zero, or perhaps you
would rather suppress (hide) null values altogether. A variety of functions in
Tableau work with null values:
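ZN, IFNULL, and ISNULL are three of the most commonly used. As a minimal sketch, reusing field names from the Lyme disease example purely for illustration:

ZN(SUM([Total Number of Cases]))   // returns 0 when the summed value is null
IFNULL([County], "Unknown")        // substitutes a label when the string field is null
ISNULL([County])                   // returns TRUE for null values, which is useful in filters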
Table 8.1 displays some of the different ways you can handle null values in
Tableau. More training on handling nulls is available on the Tableau website.
Table 8.1 Key of Tableau functions for handling nulls.
Numeric Values
String Data
Workbook Optimizer
The Workbook Optimizer breaks its guidelines down into three categories: take
action, needs review, and passed.
Take action items identify employable best practices that have minimal to
no impact on workbook functionality and should usually be addressed.
Needs review items may involve modifying the workbook in ways that
affect workbook functionality (e.g., restructuring a data source).
Passed indicates that guidelines are met, and best practices observed. If any
guidelines have been ignored, this category is renamed Passed and
Ignored.
Note
A good example of wide versus tall data is survey data, which is, by nature,
very wide. As an example, Figure 8.6 shows a raw export of survey data we’ll
return to later in this book.
Figure 8.6 This raw survey data has been cleaned to remove extraneous formatting, but retains its
original “wide” format when presented in Excel.
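To make the distinction concrete, here is a minimal sketch (with made-up respondents and questions) of the same responses in wide and tall form:

Wide (one row per respondent):
RespondentID   Q1        Q2
101            Agree     Never
102            Neutral   Often

Tall (one row per respondent-question pair):
RespondentID   Question   Answer
101            Q1         Agree
101            Q2         Never
102            Q1         Neutral
102            Q2         Often

The Pivot function described later in this section produces exactly this kind of tall structure.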
Note
This data was collected for a presentation at the Sixth Annual Harry Potter
Conference hosted by Chestnut Hill College in Philadelphia. We’ll return to
this dataset to build data visualization and stories in later chapters. You can
view the original presentation or learn more at HarryPotterConference.com.
To analyze this survey data, we need to reshape it from “wide” to “tall” before
we can look at it meaningfully within Tableau. This can be achieved by using
the Pivot function.
To pivot data in Tableau, select the columns that you would like to pivot, then
use the drop-down menu to select the Pivot function. In this example, I would
like to pivot each respondent’s question answers against their basic
demographic data, so I have selected all Question columns in Figure 8.8. You
may select multiple fields to pivot by using the Shift key to highlight each
column to be included in the function.
Figure 8.8 Using the Pivot function to reshape data from “wide” to “tall.”
The result has given us “tall” data with two new columns—Pivot Field Names
and Pivot Field Values, which contain the former columns and their respective
data (Figure 8.9).
Figure 8.9 The Pivot function reshapes data, resulting in two new columns.
From here, we can simply use the Rename function (also available on the
columns’ drop-down menu) to rename our columns to something more
appropriate (Figure 8.10). Now we are ready to begin our analysis.
Figure 8.10 Rename pivot fields to make the new columns more meaningful.
Working with survey data is a common task in analytics. Preparing this type of
data for analysis typically requires a hefty amount of manual work, in addition
to more robust preparation tasks than can be achieved with Data Interpreter.
Without extensive cleanup, analyzing raw survey data exported from tools such
as SurveyMonkey or Qualtrics can be nearly impossible, both because of the
usual formatting issues and because of the need to translate textual data into a
format usable in analysis while preserving its metadata.
Survey data includes four elements you need to organize and fit together:
Demographic information
Responses in text form
Responses in numeric form
A “metafile” that acts as a legend to describe the survey
The goal is to combine all of these elements into a single, comprehensive view
of the data—a tedious task.
While we won’t embark on this prep process within the scope of this book, a
word of caution is necessary here that applies to any data being used for
analysis. There is truth to the old adage “Garbage in, garbage out.” Before you
begin preparing your data for analysis, you should spend time reviewing the
data, even in its messiest raw form, and tidying up any errors or issues you see
before you take further steps. In particular, look over and confirm date and
geographic data formats, remove duplicate records, and change or correct any
identifiers to the format that you need (e.g., capitalization). Fields that allow for
manual text entry are especially prone to the latter issues, which might require
your attention before you begin working with your data. This is also a perfect
time to assess the presence of nulls in your data and determine when a field
should be a null or a zero.
Among its many additions and improvements, Tableau v10 brought analysts
the ability to prepare survey data without having to use external tools or spend
countless hours engaged in a manual cleaning process. Keeping this process
bundled within Tableau provides many benefits:
Note
Tableau Zen Master and Iron Viz Champion Steve Wexler maintains an
excellent blog on reshaping survey data for Tableau, including using Excel,
Tableau 10+, and Alteryx. His work includes many clearly delivered
tutorials and presentations specifically on survey data and is an excellent
resource. Visit Steve’s blog at www.datarevelations.com/surveyjustso.html.
Whether you’re working with survey data or any other dataset, once your data
is properly cleaned and shaped, you’re ready to begin visually exploring the
data and creating visuals to tell its story. We’ve already worked together to
build some fundamental data visualizations. Next, let’s look at how to leverage
these visuals to share with your audience in Tableau dashboards and stories.
Once your data has been thoroughly prepared for analysis and you’ve had the
opportunity to spend some time visually experimenting with ways to explore
and explain your insights through the basic visualizations, you’re ready to
construct a data narrative. This involves organizing the results of your analysis
either by arranging them into visual data dashboards or by sequencing them to
unfold into a compelling data story.
Note
Best Practice
Although all three of these mechanisms can support great visual data stories, be
strategic in how you use them. Be aware of the impact of sizing, layout, and
positioning, and make sure the view you’ve curated on your screen is
consistent with your audience’s. The Device Preview functionality in Tableau
supports this practice.
Individual Visualizations (Sheets)
Sometimes one visual is all you need, particularly if it is layered with rich
tooltips or annotations. Or, you might want to export simple, static visuals for
use in other presentation software. So far, we’ve spent our time applying
fundamental visual analysis principles within the sheets—or canvas—
environment in Tableau. From here, you can bring individual visualizations
into dashboards or story points to build a storytelling framework. A sheet is the
building block for any story, as it is the place where visualizations are built.
Note
Dashboards
You can create a dashboard in Tableau by clicking the Dashboard button at the
bottom of the Tableau workspace (Figure 8.11).
Have more descriptive titles and lead-in paragraphs, often including legends
(based on color or size) within their design.
Have simplified and streamlined views of a smaller number of
visualizations.
Include prominent legends, simplified color schemes, and limited views of
data, including only those that support the narrative (that is, filters or
parameters).
Omit interactive elements that might affect the narrative, such as quick
filters or other actions (this often depends on whether the presentation will
be narrated or left to the audience).
Include explanatory annotations to point out specific “story points” the
narrator deems of interest to the audience.
Figure 8.12 shows an example of a simple dashboard using some of the visuals
we created earlier in this book.
Figure 8.12 A simple dashboard.
Visual Hierarchy
You might have noticed that both Tableau’s Dashboard and the dashboard
presented in Figure 8.12 follow a four-quadrant approach. This format creates a
visual hierarchy—that is, an arrangement or presentation of elements in a way
that implies importance. In other words, visual hierarchy influences the order in
which the human eye perceives what it sees—left to right, top to bottom. This
order is created by the visual contrast between forms in a field of perception.
Top left: This should be the most important visualization on the dashboard.
Top right: This space is reserved for the second most important
visualization. If there’s no obvious hierarchy between the second and third
visualizations, time series graphs are a good option to fill this space.
Bottom left: This is neutral territory, for visualizations of lesser importance.
Bottom right: This is the area of least emphasis. Maps are great candidates
for this space: their location encoding makes them easy to read without
much energy expenditure, but they typically provide less analytical value.
The overlay presented in Figure 8.13 shows how the dashboard in Figure 8.12
aligns to the four-quadrant best-practice approach.
Figure 8.13 A simple dashboard with a four-quadrant overlay for visual hierarchy.
The Dashboard Workspace
1. Device Preview: This option allows you to see your dashboard as it will
appear on the form factor selected in Size.
2. Size: This is an important aspect to think about before you start building a
dashboard, and these options let you select from a preprogrammed list of
fixed display sizes (that is, desktop browser, laptop browser, tablet) while
the canvas adjusts accordingly. The Automatic option allows the dashboard
display to dynamically resize to any display on which it is presented, but
this choice has certain ramifications for things like floating legends, which
will move around as the screen resizes.
3. Sheets: This area lists all sheets in the workbook.
4. Objects: This area lists additional elements, such as logos and images, that
you may elect to add into your dashboard from outside of Tableau.
Note
This section provides only a very basic overview of dashboard functionality.
You can find more information on creating a dashboard, including adding
views, objects, and interactivity, at www.tableau.com/learn/get-
started/dashboards.
Story Points
The benefit of using Tableau Story Points, rather than PowerPoint or a similar
tool, is that interactivity remains in the story. Story points are not
static images: Presenters or audiences can explore or expand on the data using
actions such as quick filters within the narrative. Additionally, Tableau Story
Points updates in real time as the underlying visualizations or data are updated,
reducing the need for reworking or re-exporting worksheets.
You can create a story in Tableau by clicking the Story button (Figure 8.15).
Figure 8.15 Story button.
You present stories by clicking the Presentation Mode button (Figure 8.16) on
the toolbar.
Note
1. New story point: You may insert a blank or duplicate a story point.
2. Story pane: This box contains all the worksheets and dashboards in your
workbook that can be added as a story point.
3. Size: Again, size is important. This option lets you resize your storytelling
canvas.
4. Captions/navigation: Captions, similar in concept to annotations, provide a
tool to “narrate” your story. These can be formatted to include text, be
numbered, or utilize advancement dots or arrows (Figure 8.19).
The Layout pane provides further options intended to help you format the style
of navigation for the text boxes. Navigation can be formatted as caption boxes
(default), numbered, arrows, or dots (Figure 8.19).
Figure 8.19 Format navigation in the Layout pane within Tableau Story Points.
Before building a story, make sure to complete this Storytelling Checklist and
answer as many questions as possible before moving into the storyboarding
process to ensure the story aligns to your goals. You can then start sketching
the story outline for guidance during development of the narrative. We’ll take a
look at some data storytelling best practices in Chapter 10.
Who: The Data’s Audience
Who is your audience?
What do they want?
What do they need?
How might they be feeling?
What action do they need to take?
What type of communication do they prefer?
How well do they know the data?
You
How well does the audience know you?
Does the audience trust you? Do they find you credible?
How well do you know the data?
Do you have any preconceived notions or bias about the data?
What: The Data’s Context
What is the data?
What is it about? Is the data complete?
Do you have enough data to tell a complete story?
Is the subject matter general or specialized?
Why: The Goal
Why are you telling this story?
What action do you want the audience to take?
How: The Data’s Presentation Mode
Is this story static or interactive?
Will you be narrating the story?
Do you want to explore or expand the story while narrating?
Will it be presented in a small group or a large setting?
Will the audience be live or virtual?
Like all aspects of data visualization, crafting a perfect visual data story is a
process of ongoing iteration and refinement. Luckily, crafting a data story is
much like crafting any other narrative. After taking the time to clean and
explore your data, you can take advantage of a rather linear process to build a
complete visual data story (Figure 8.20).
Figure 8.20 This linear process can help you craft the best visual data narrative.
Whichever mechanism you are using, as you begin planning your story, taking
some time to reflect on the purpose of the story and the narrative you want to
share with your captivated audience is critical. Defining this purpose is the
single most important step of the planning stage in the storytelling process.
We discussed the seven types of data stories in an earlier chapter. Pick one.
ONE. Regardless of the type of story you are telling, your story should have
one goal, and one goal only. The clarity of your story is just as important as the
clarity of each visualization within it. Audiences shouldn’t have to guess to
understand the salient point you are trying to make with your presentation.
In addition to the Storytelling Checklist, think like an author and consider the
following aspects of your narrative:
Plot: What story are you trying to tell? What is its purpose, and what is your
goal or the action you want the audience to take when you roll the credits,
figuratively speaking, at the end of your story?
Characters: Think of your data as your main character, and its context as
the story’s setting. Your filters, parameters, limitations, and even external
data sources are all supporting characters in your story, and each contributes
to the overall narrative. It’s your job to identify how.
Audience: Consider each of the questions in the Storytelling Checklist. Like
any good storyteller, you should know who is going to be listening to or
reading your story.
As you begin constructing your story, considering order and flow and how the
pieces of your story fit together is important. You’ve already seen the
importance of understanding the data’s context and of audience analysis and
how the story’s purpose is supported by understanding its plot, characters, and
audience. You’ve also reviewed several common types of data visualizations
and know how to curate them to make them more intuitive, focused, and
compelling. Your story will, when complete, tie together all of these elements
into a cohesive whole.
Although many different types of narrative structures exist and are worthy of
consideration, the one I have found most effective for consistently telling
compelling, meaningful visual data stories regardless of the type of story is the
three-act structure. Let’s review how this structure functions:
Act One—The Setup (or, the Exposition): This is where you, as the
narrator, lay the groundwork for your story. It’s where you explain the
purpose, introduce the plot, establish the characters and their relationships,
and finish with your dramatic question or inciting incident. This is the
catalyst; it’s the type of story you are telling. Are you going to explore a
change over time? Zoom into detail on something that happened? Act One
ends with your hypothesis, or what question you are exploring or answering
within your story.
Act Two—The Rising Action: This is the meat of your story. It’s where
you combine plot, characters, and audience as you guide them through the
data. In a film or play, Act Two typically depicts a protagonist’s attempt to
resolve a problem or the escalation of an issue, and who experiences
character development along the way. In a visual data narrative, this is your
opportunity to perform a curated version of your own analysis, capitalizing
on the insights you have found that support your narrative’s purpose—and
making your audience believe in your discoveries as they experience the
story for themselves.
Act Three—The Climax: Finally, you’ve reached your grand reveal. Your
audience has absorbed the plot point and is ready for their call to action via
a final viz that ends the story.
Note
You can use many techniques to build your storyboard. One of the most
common (and the one I use in my classroom) is the sticky note approach. This
process involves using color-coded sticky notes (yellow for Act One, pink for
Act Two, green for Act Three) and treating each sticky note as a view in your
story. You can easily adjust and rearrange your storyboard based on audience
feedback and as you fine-tune the visuals and supporting narrative.
Summary
This chapter goes beyond these basic options and takes a look at how to
create several advanced charts, step by step, in Tableau. These charts, which
are only a fraction of the advanced visualizations that can be created by the
savvy visual analyst in Tableau, are readily accessible to emerging learners,
as they require little more than the use of calculated fields and additional
formatting to build. However, these advanced visualizations offer deeper,
more dynamic views into data and can be beneficial in supporting more
complex analytics. The advanced visualizations covered in this chapter are
Timelines
Bar-in-bar charts
Likert-scale visualizations
Lollipop charts
Word clouds
Timelines
A timeline can be a useful way to depict events that occur over time,
whether the goal is analyzing patterns in notable events or showing dates of
interest. Although a timeline isn’t a graph that can be built out of the box in
Tableau, you can create one by following a few simple steps. The resulting
visual can support storytelling when you’re discussing important events
over time (Figure 9.1).
Figure 9.1 A snapshot of a finished visual timeline.
Note
For this timeline, we’re using a dataset of “final girls” depicted in horror
movies over time. The dataset is available for download on the companion
website to this text.
Before beginning, in your sheet, make sure that your date is recognized by
Tableau as a continuous date. If it is not, you can change it by right-clicking
the field on the Dimensions pane, and selecting Convert to Continuous on
the list of options (Figure 9.3).
Figure 9.3 Adjust your date to continuous if it is not already. This allows you to view an event over a
span of time, rather than in isolation.
1. Right-click the Data pane and create a new calculated field called
Anchor. The field should contain the input MIN(0), as shown in Figure
9.4.
2. Drag the newly created Anchor calculated field to the Rows shelf to
provide a starting point for your timeline. At this point, your
visualization is simply a horizontal axis line with a zero line (Figure 9.5).
3. Drag your Date field to the Columns shelf. Right-click the Date pill and
select Exact Date. This prompts Tableau to recognize each of the exact
dates listed in your dataset of events and lays the foundation of the
timeline by displaying a flat, solid, colored line (Figure 9.6). Because
there is no additional data, this is correct.
Figure 9.6 Adding the Date field to the Columns shelf provides the
foundation of a timeline.
Note
Calculated fields enable you to extend your analysis by creating a new field
(or column) that is not already contained in your data source. To create a
new calculated field, in a Tableau worksheet, select Analysis > Create
Calculated Field. The Calculation Editor dialog box will open, which
prompts you to give the calculated field a name and provide a formula.
Formulas can be created using a combination of functions, fields, and
operators. Once created, the new calculated field will appear as a new
measure in the Data pane and be designated with an equal (=) sign that
precedes the field name.
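For instance, a simple calculated field built from the Global Superstore fields used elsewhere in this book (shown here only as an illustration) might be named Profit Ratio and defined as:

SUM([Profit]) / SUM([Sales])

Once saved, Profit Ratio behaves like any other measure and can be dragged onto shelves and Marks cards.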
With the baseline set, you can now add dated events and begin formatting
the timeline to look more traditional. To do so, drag the Date dimension
from the Data pane again, and this time drop it on the Details Marks card.
Initially, the timeline may continue to appear flat. This is because Tableau
automatically looks at the largest segment of the date—in this case, Year.
Depending on the level of detail in your date data hierarchy, you may need
to prompt Tableau to look at a more granular view of the date. Do this by
clicking the + icon on the pill to expand the date to the level you’d prefer to
see on the timeline. Your timeline should now display each of the events in
your dataset as individual dots (Figure 9.7).
Figure 9.7 With event dates added to the baseline, the timeline begins to take form.
A bit of additional formatting can enable this timeline to tell a more detailed
story about the events displayed. You might use color to distinguish event
types, for example. The shape and size of each event point can be enlarged,
and a tooltip added to provide more information (Figure 9.8). You can also
adjust or delete zero lines, axis rules, and axis ticks as desired.
Figure 9.8 A little bit of additional formatting can add more detail and visual impact to your
timeline.
Here are a few additional things you can do to spice up a timeline and make
it more visually appealing:
Add a time frame. When you have a large number of events to display,
adding a relative date filter to show only the events within a certain span
of time might be helpful. To do this, drag the Date dimension to the
Filters shelf and choose Relative Dates. You can set the logic to be any
subset of dates that you want to dynamically display. In this example, I
have limited the range to a mere 30 days (Figure 9.9).
Figure 9.9 Filter the dates in a view to limit the number of events
displayed.
Add a reference line for “Today.” A reference line for the current date
can give audiences a visual checkpoint on the timeline. To create a
reference line, follow these steps:
1. Create a new calculated field, Today: TODAY().
2. Place this newly created calculated field on the Details Marks card.
This allows the field to be used as a reference line. Adjust the field
from a discrete date by right-clicking the date pill and selecting Exact
Date.
3. To add the reference line onto the timeline, select Reference Line
from the Analytics pane. Choose the Today calculated field as the
Line Value for the line and set it at Minimum (Figure 9.10).
Figure 9.10 Use the Today calculated field to add a reference line to
the timeline.
The Line Label option Custom enables you to specify how the line will be
labeled on the canvas. Further formatting can also be completed in this box
to define how you want the line to visually appear.
Note
Bar-in-Bar Charts
At this point, the two dimensions are stacked together along the x-axis of
the measure, rather than laid atop each other with both starting at the zero
point. This stacking is an automatic function of Tableau that you need to
turn off to manually build your bar-in-bar chart. To disable this feature,
navigate to the Analysis menu and choose Stack Marks > Off (Figure 9.13).
Figure 9.13 Turn off the automatic mark-stacking feature on the Analysis menu to overlay
components of stacked bars.
After this step, you should have a raw version of your bar-in-bar chart. At
this point, you can adjust which dimensions are in the foreground and the
background by dragging and dropping to sort the measures on the Color
Marks card filter, if desired. You can also edit the width of the bars by
clicking the drop-down menu on the Size filter, selecting Edit Sizes, and
then adjusting the Mark Size Range slider as desired (Figure 9.14).
Complete your visualization by editing or removing axis headers and titles,
sorting bars, and so on.
Figure 9.14 Adjust the width of the bars by editing the size range on the Size filter menu.
Continue to format your viz as desired to clean up and curate the bar-in-bar
chart.
Likert-Scale Visualizations
Likert scales are the most widely used approach to scaling responses to
gauge sentiment and tendencies, and they are a staple of surveys and other
types of data collection methodologies. Likert-scale questions can be asked
in several ways and, in turn, the data collected can be visualized in multiple
ways. This section takes a closer look at the two most common Likert
scales and the best ways to visualize them: a 100% stacked bar chart and a
divergent bar chart.
Note
To create the visualization in Figure 9.17, drag your first dimension to the
Rows shelf (this example uses survey data, so the field is Wording) and a
measure to the Columns shelf. A simple horizontal bar chart with solid-
color bars of equal length, representing the total count of responses for each
dimension, appears.
Next, drag your second dimension that represents the Answer value (or the
value representing survey responses) to the Color Marks card (Figure 9.18).
Figure 9.18 A rough stacked bar chart begins to visualize Likert-scale data; however, it requires
more curation to be a truly useful visualization.
A number of things need to be done to improve this basic stacked bar chart
so that it properly visualizes the Likert-scale data as a 100% stacked bar:
Color: The default color scheme in Tableau does little to visually relate
response categories that sit next to each other on the scale (for example,
sometimes/often and just once/never). Using the Color Marks card, adjust
these to a more suitable color palette.
Sort: Tendencies are sorted in alphabetical order rather than by the
frequency they represent. Manually sort these values into the correct order.
Totals: A raw count of responses is acceptable, but a better option
(particularly for survey data) is often percentage of total. Add a Percent of
Total table calculation to make this change (a sketch follows the note below).
Curate: Remove unnecessary headers to clean up your canvas.
Note
You can apply several types of calculations to transform the values for a
measure in Tableau, including custom calculations, table calculations,
level of detail (LOD) expressions, and more. For more information on
the various types of calculations and how to use them, visit
https://help.tableau.com/current/pro/desktop/en-
us/calculations_calculatedfields_understand_types.htm.
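As a concrete example of the Totals adjustment above, the percentage-of-total
change can be made with Tableau's built-in Percent of Total quick table
calculation, or written by hand along these lines (a sketch using this
example's Number of Records field, computed across the Answer values):
// Percent of total responses within each question (sketch).
SUM([Number of Records]) / TOTAL(SUM([Number of Records]))
Formatted as a percentage, this converts the raw response counts into the
100% stacked view shown in Figure 9.19.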
With a bit of tweaking, this 100% stacked bar chart can be a decent way to
display Likert-scale responses (Figure 9.19). You could also
add labels for each category to see the percentage of responses per
tendency.
Figure 9.19 With better colors and sorting, this 100% stacked bar chart does a better job of
visualizing the Likert-scale data.
Although the 100% stacked bar chart will work to represent Likert-scale
data, a better approach is a divergent bar chart, which is not actually a bar
chart but rather a modified version of a Gantt chart. Rather than stacking
tendencies or sentiment ratings on a scale of 0 to 100, this approach shows
the spread of negative and positive sentiment values (such as “strongly
disagree” to “strongly agree”) aligned around the neutral midpoint
(Figure 9.20).
Figure 9.20 Completed divergent stacked bar chart representing five-point Likert-scale data.
This approach requires you to create several calculated fields. To begin
building this visualization, you must first create a table, or crosstab, in
Tableau. This enables you to see the output of each of the calculations you
build and to troubleshoot any calculation errors before moving into
visualizing them.
Note
For this challenging chart, you will use the Harry Potter dataset used in
previous chapters. Download the dataset to follow along with the steps.
In this table, first drag the QuestionID and Text Value dimensions to the
Rows shelf. (In this example, the QuestionID has been renamed to match the
character name for readability.) Because we included both the text and the
numeric coding for each Answer field, we can avoid writing any
calculations at this point (the Numeric Value for each response will become
useful in the next steps). However, notice that when you add this
dimension, the Text Values are not in the sequence they should be. Click the
drop-down menu on the Text Value dimension on the Rows shelf, select
Sort, and manually adjust these values so that they display in the correct
ranking order (minimum to maximum, or 1 to 5). Then, drop the Number of
Records measure (aggregated as SUM) onto the Text Marks card. Your table should
appear similar to Figure 9.21.
Figure 9.21 After this step, you can see for each question asked (in this case, each character ranked)
how many respondents chose each option on the Likert scale.
The next step is to create a series of calculated fields and add them onto the
table.
The first calculated field, Negative Sentiment, counts how many negative
sentiment responses were received for each question (or item ranked); these
responses will sit below (or to the left of) the dividing line of 0 in the
divergent stacked bar chart. To do this, we need
to count the number of responses received for the two lowest selections on
the scale (in this case 1—extremely non-aggressive and 2—non-aggressive)
as well as half of the neutral selection (in this case, 3—neither non-
aggressive nor aggressive). Because neutral responses in a survey are
neither positive nor negative, we want to split them in half to distribute
them across the bars in the chart so as to not unfairly weight one side of the
data.
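A minimal sketch of this Negative Sentiment calculation, assuming the
Numeric Value field holds the 1-to-5 coding described above and Number of
Records counts individual responses (the exact formula may differ):
// Negative Sentiment (sketch): count the two lowest ranks in full and
// half of the neutral rank; positive ranks contribute nothing.
IF [Numeric Value] < 3 THEN [Number of Records]
ELSEIF [Numeric Value] = 3 THEN [Number of Records] / 2
ELSE 0
END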
Add this calculation onto the canvas. Your screen should match that shown
in Figure 9.22; the Number of Responses in the two negative sentiment
ranks should match the count in the Negative Sentiment column. The
neutral response count for Negative Sentiment should be half the count of
Number of Responses, and the two positive sentiment ranks should appear
with a count of 0 in the Negative Sentiment column.
Figure 9.22 This calculated field counts the number of negative sentiment responses that will appear
on the negative side of the dividing 0.
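The next calculated field, Total Negative Sentiment, totals those negative
counts across every response option for each question (Figure 9.24). A
minimal sketch, assuming it is a TOTAL table calculation over the Negative
Sentiment field and is computed using Text Value (the exact formula may
differ):
// Total Negative Sentiment (sketch): the sum of all negative-weighted
// responses for each question, computed across Text Value.
TOTAL(SUM([Negative Sentiment]))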
Figure 9.24 The Total Negative Sentiment function sums the individual count of negative responses
per question scored.
The calculated field after that, Total Sentiment Scores, counts the total
number of responses each question received:
TOTAL(SUM([Number of Records]))
You will need to change the default table calculation to Text Values.
When added to the crosstab, this calculated field will sum the number of
responses per question. If your dataset is nice and clean and all questions
were answered, the value in this column should be the same all the way
down. For datasets in which not every question was answered, such as this
one, you will see variations in the count of responses in this column (Figure
9.25).
Figure 9.25 This calculated field counts the total number of scores for each question so as to
calculate the length of the entire bar.
The next step is to create a calculated field that will determine the
percentage offset, or how far into the negative to begin building the bar
chart. Because what we are creating is a modified Gantt chart, this
calculated field is really intended to be the first data point in the Gantt chart.
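A minimal sketch of the Gantt Start calculation, assuming it is built
directly from the Total Negative Sentiment and Total Sentiment Scores fields
created above (the exact formula may differ):
// Gantt Start (sketch): the negative share of each question's total
// responses, which sets the left edge of the leftmost segment.
-([Total Negative Sentiment] / [Total Sentiment Scores])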
You can spot-check the Gantt Start calculated field after it’s added into the
crosstab by comparing it against the number of positive and negative
responses. The higher the count of negative responses, the further into the
negative the Gantt Start percentage will be (Figure 9.27).
Figure 9.27 The Gantt Start calculated field tells each bar in the Gantt chart where on the axis to
begin.
Again, this is a percentage, so you must adjust the default number format
for this calculated field, too.
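The Gantt segments also need a sizing field, referred to later in this
chapter as Percent of Total Sizing (and Percent of Gantt Sizing when it is
placed on the Size Marks card). A plausible sketch, assuming it simply
represents each answer's share of its question's total responses:
// Percent of Total Sizing (sketch): each answer's share of the
// question's total responses; this sizes each Gantt segment.
SUM([Number of Records]) / TOTAL(SUM([Number of Records]))
As with the other table calculations here, compute it using Text Value and
format it as a percentage.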
The last calculated field tells Tableau where to draw each subsequent line
after the original Gantt Start data point, separating the sentiment value
categories. Create this calculated field and name it Gantt Percent Line.
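A minimal sketch of the formula, consistent with the plain-English
explanation that follows and assuming the Gantt Start and Percent of Total
Sizing fields created earlier (the exact formula in your workbook may differ
slightly):
// Gantt Percent Line (sketch): start from the previous row's running
// position, then step right by the previous row's segment size;
// ZN() treats the missing previous value on the first row as zero.
PREVIOUS_VALUE([Gantt Start]) + ZN(LOOKUP([Percent of Total Sizing], -1))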
You need to change the default table calculation to Text Values and adjust
the default number format to be a percentage.
The Gantt Percent Line is the trickiest of all the calculated fields needed to
create the divergent stacked bar chart. Essentially, in plain English, the
calculation begins with the PREVIOUS_VALUE table calculation, which tells
Tableau to look at the previous row of the calculation we've just made.
However, there is no previous row for the first line in the table, so Tableau
falls back to Gantt Start instead (−12.04%). We then tell Tableau to add
Percent of Total Sizing from the previous row, looking back one row. Again,
because there is no previous value on the first row, we've told Tableau to
zero out the null (ZN), so the first value in this column is −12.04%. In the
next row, we can see the formula begin to work more smoothly (Figure
9.28).
Figure 9.28 The Gantt Percent Line calculated field creates a new calculation using the values
generated from previously created calculated fields.
After all five of these new calculated fields have been created and added
into the view, the crosstab for the divergent stacked bar chart is complete
(Figure 9.29). We are now ready to begin building the visualization in a
new sheet.
Figure 9.29 Although it’s a long process, this crosstab creates the foundation for our eventual Likert-
scale visualization.
In a new sheet, drag the Question dimension to the Rows shelf and the
Gantt Percent Line measure to the Columns shelf. Tableau will break
immediately, flagging the measure in red and giving the error message that
a critical field used to create this calculation is missing from the view
(Figure 9.30).
Figure 9.30 The first step in creating this Likert-scale visualization throws an error—but that’s okay!
The missing field is Text Value, the field across which all of the table
calculations in the crosstab are computed. Bring this dimension into the
view and drop it on the Color Marks card. Depending on how many members it
has, you might need to filter the dimension before adding it.
1. Change the mark from Automatic to Gantt Bar (Figure 9.31). This
adjusts the view from bars to lines that separate each section of the Gantt
chart.
Figure 9.31 Changing the mark from the automatic bar to a Gantt bar
begins the Gantt chart transformation.
2. You need to manually re-sort your text value options in the same way we
discussed when making the crosstab table. This time, however, click the
Sort option on the Color Marks card and manually adjust it so that text
values display in the correct ranking order (minimum to maximum, or 1
to 5).
3. Drag and drop the Percent of Gantt Sizing calculated field on the Size
Marks card. Now, the visualization is beginning to take shape.
4. Now to address color: Tableau has applied its automatic color palette,
which is designed to make each category look as distinct as possible. For
this example, we'll make the colors resemble a standard blue–orange
diverging palette by switching to a colorblind-safe palette and manually
selecting better color choices (Figure 9.32).
Figure 9.32 With a few quick clicks, and by leveraging the calculated
fields already made and making smart color choices, the divergent chart
is beginning to take shape.
A few more clicks to simplify the view, remove headers, and clean up the
visualization deliver a stunning divergent bar chart that displays the Likert-
scale sentiment data nicely. (The final result will look like Figure 9.20,
shown earlier in the chapter.)
Lollipop Charts
While not native to Tableau, the lollipop chart is a hybrid that combines a
traditional bar chart and a Cleveland dot plot. It is simply a dual-axis chart
that superimposes a circle on top of a very thin bar chart (Figure 9.34). It's
a fun way to spice up a bar chart and give it more visual appeal without
reducing its analytical integrity.
Figure 9.34 Completed lollipop chart.
Note
Lollipop charts are a helpful way to visualize many bars of the same length
while avoiding the moiré effect. This exercise uses the Baby Names dataset,
which is available in the Tableau dataset library at
https://public.tableau.com/app/resources/sample-data. This dataset contains
the most popular male and female names in each state for each year from
1910 to 2012, based on data collected by the Social Security
Administration.
To begin:
1. Build a basic bar chart in Tableau (Figure 9.35). Here, I have filtered the
dataset to include only girls’ names from 2000 to 2010.
2. Duplicate your measure on the shelf it currently occupies (in this
example, the Columns shelf). This creates a side-by-side view of two
identical bar charts (Figure 9.36).
3. Right-click the second occurrence of the measure and select Dual Axis to
layer the two copies on top of each other. (Tableau typically switches
both marks to circles at this point.)
4. Using the Marks card, change the mark type for the first occurrence of
your measure to Bar. Use the Size Marks card to slim down the bar and the
Color Marks card to adjust its color as appropriate. I typically use a
lighter gray (Figure 9.38).
Figure 9.38 With your first marks adjustment, the lollipop charts begins
to take shape.
5. For the second occurrence of the measure, use the Size Marks card to
enlarge the circles as appropriate (Figure 9.39).
Figure 9.39 The bars and circles of the lollipop chart can be changed
individually in terms of their size and color to curate your chart.
With the basis of the lollipop chart built, it’s time to clean up the
visualization.
6. Make sure your axes line up correctly. Right-click the second measure
axis (the one on top) and choose Synchronize Axis to make the axes
equal. Right-click the axis again and uncheck Show Header.
7. Tidy up the visual by sorting bars and excluding any data that might not
be pertinent to your story. I have sorted in Descending order and
excluded everything but the top 5 names. (You will need to readjust the
Marks Sizing after this step.)
8. Continue removing headers and axis titles as well as adjusting titles as
appropriate until you are happy with the visualization.
Labeled Lollipops
You might elect to remove the bottom axis header and use the circles
themselves to display their values.
To do this, drag the measure to the canvas a third time, this time dropping
it on the Label Marks card for the second occurrence. Set the label
alignment to centered and Automatic, and make sure the check box that
allows labels to overlap other marks is selected (Figure 9.40).
Figure 9.40 Carefully formatting mark labels can embed additional data in your lollipop chart and
eliminate the need for axis headers.
Right-click the measure on the Label Marks card to format the number and
text color, and then remove axis headers and tweak as necessary (Figure
9.41).
Figure 9.41 With proper size, color, and label adjustments, a lollipop chart can be a richer visual
alternative to a classic bar chart.
Word Clouds
For this word cloud, I’m using a unigram list of the entire manuscript of
Harry Potter and the Philosopher’s Stone. It’s a tremendously large dataset,
so I’ve manually curated a list of keywords to use in this exercise. To build
the initial view, drag the keyword dimension onto both the Text and Size
Marks cards.
At this point, you are ready to transform this view into something that better
resembles your expectation of a word cloud. To do this, right-click on the
dimension on the Size card and select Measure > Count (Figure 9.44).
Figure 9.44 Adjust the dimension to a measure to resize the words based on their count.
This step converts your initial word cloud structure to something that looks
like a tree map of a single color. Change the Mark type from Automatic to
Text, and your word cloud will re-form (Figure 9.45).
Figure 9.45 Changing the Mark type from Automatic to Text reshapes the tree map into something
more akin to a word cloud.
You might need to do some additional work to clean up your word cloud,
including removing extraneous words, performing deduplication, or
streamlining the words included. To add color to the word cloud, drag the
text measure once more to the Color Marks card. Now, with a little bit of
formatting cleanup, your word cloud is complete (Figure 9.46).
Summary
This chapter described ways to create a few advanced visualizations beyond
the fundamental data visualizations available on the Show Me card in
Tableau. These charts require a little more hands-on manipulation but can
be excellent candidates to add some variety to your dashboards and visual
presentations. Aside from the types explored here, many other advanced
charts can be built in Tableau. Some fun ones to try on your own might be
waffle charts or hexbin map charts, or you can explore the use of
sparklines. There are always more charts to learn—so explore and have fun!
Chapter 10
Closing Thoughts
The lessons in this book are a huge step in the right direction, but they’re
just the first step. Becoming a seasoned visual analyst takes time. As Neil
deGrasse Tyson says, “As the area of your knowledge grows, so too does
the perimeter of your ignorance.” We never know everything; we can only
keep learning more.
The best visual data analysts recognize that learning how to apply
fundamental data visualization best practices and tell meaningful, effective
data stories is a skill learned over time with practice, experience, and a good
amount of trial and error. These practices are ever-evolving, spurred onward
by the advent of technologies and by inspiring, innovative analysts and
artists who have the courage and curiosity to experiment with new
approaches and new ideas. It’s important that we, as visual analysts, data
explorers, and storytellers, commit to continuous learning, both with
visualization and with tools, in Tableau and beyond, as new capabilities are
introduced that facilitate deeper visual analysis.
In that spirit, this final chapter recaps the main lessons covered throughout
the text. It also serves as a resource kit for life beyond the book by
providing checklists of best practices and practical suggestions for
continuing to master outputs of visual analytics. In addition, it discusses the
myriad resources available to support you on this journey.
Here are five steps to guide you as you work to build a perfect data story.
Whether you’re crafting a data visualization or a data story, the first step in
creating your data narrative is to find (or collect) data that supports the story
you want to tell. The storytelling process is, in many ways, more similar to
the scientific process than to any literary one. After all, as a visual analyst
and storyteller, you are tasked with asking questions, performing
background research, constructing and testing one or many hypotheses, and
analyzing information to draw a conclusion.
As we’ve seen throughout this text, finding data to support a story doesn’t
necessarily require researching scientific data. Ultimately, the data chosen
should support the story it is telling in context, complexity, and depth. In
other words, find a story you’re interested in telling. Then, make sure you
understand your data and respect its limitations, knowing which story your
data can logically support and where you might need to incorporate
additional data to fill gaps or answer important questions.
After you have the goals of your data in hand and its story clearly in mind,
script your story by layering information to build a framework around a
narrative. Plotting a clear beginning, middle, and end, as well as a clear
message, will ensure the narrative fits your audience’s needs. In writing
terms, think of this as constructing your story’s outline and plot.
A false reveal can be a dangerous thing. It can incite the audience to draw
the wrong conclusions or take an incorrect action. It can also damage the
effect of the data itself and your credibility as a storyteller. As visual data
documentaries, data stories should be engaging and entertaining, but they
should focus foremost on sharing truths.
Whether we do it intentionally or inadvertently, we can force the data to tell
the story we want it to, even if it’s the wrong—or an inaccurate—one. With
visual narratives, we are tasked not only with telling a story, but also
making it interesting, engaging, and inspiring. Think of a visual story as a
documentary: a nonfiction work, based on a collection of data, told in a
visually compelling way.
Stories have a natural entropy: they are most potent while they are
happening. Data journalists take this point to heart, building models that
track events in real time (for example, political elections or disaster
scenarios). The timing of when data is reported, or when a visualization or
visual story is released, can make a big difference in how the story is
interpreted and the impact it makes.
One way to tell a data story fast is by sharing it through mobile channels.
Mobile devices have been a game changer for data visualization in many
ways and will likely become even more important in the years ahead. That
said, mobile presentations require wise editing. Be aware of form factor
limitations and rethink the way storytelling via mobile devices happens.
One of the most important tenets of the DVCC is the need for feedback. As
with any type of analytics, visual analytics and visualization outputs that are
created and used in isolation can become their own version of data silos. We
should not overlook the need to collaborate and engage in group critiques
before publishing new visuals or presenting new data stories.
The old phrase “Everyone’s a critic” is alive and well in visual analytics.
I’ve created many visualizations that have not been as brilliant as I thought
they were. Tableau guru Steve Wexler, in his blog Data Revelations,
expressed a similar sentiment, even describing the moments of anger or
depression that can accompany less than enthusiastic feedback about a new
visualization. Being resistant to or wary of feedback is normal. Instead of
dwelling on negative feedback, use it as an impetus to improve your
visualization, engage in corrective learning, and seek out new information
to add into your visual data storytelling skillset for your next project.
Note
For more on the Data Visualization Competency Center, check out The
Visual Imperative or the whitepaper from Radiant Advisors, available at
https://radiantadvisors.com/our-research/new-research-the-data-
visualization-competency-center.
Ongoing Learning
A bevy of incredible information assets beyond this book can expand and
deepen your knowledge and understanding of the concepts we approached
from a pragmatic stance in this text. The following are some of the
resources I recommend.
Blogs
Blogs have two great qualities: There are a lot of them, and they are
constantly adding new material and new ideas. Here are a few of my
favorites that offer unique and compelling galleries of visual data
storytelling in action, created by some of the most prominent voices in data
visualization and storytelling today.
Books
Many books, both old and new, take deep dives into many of the concepts
covered in this book, as well as provide valuable insights from other leading
voices in the field. Several of these books are Tableau specific, providing
more practicable learning to apply concepts and test your skills. Here are a
few of the titles I use as references, both in and out of my classroom:
The Visual Imperative: Creating a Visual Culture of Data Discovery by
Lindy Ryan
Storytelling with Data: A Data Visualization Guide for Business
Professionals by Cole Nussbaumer Knaflic
The Functional Art: An Introduction to Information Graphics and
Visualization (Voices That Matter) by Alberto Cairo
Data Points: Visualization That Means Something by Nathan Yau
Communicating Data with Tableau: Designing, Developing, and
Delivering Data Visualizations by Ben Jones
Tableau Your Data!: Fast and Easy Visual Analysis with Tableau
Software by Daniel Murray
Visual Analytics with Tableau by Alexander Loth
DataStory: Explain Data and Inspire Action Through Story by Nancy
Duarte
Practical Tableau: 100 Tips, Tutorials, and Strategies from a Tableau
Zen Master by Ryan Sleeper
Visualization Analysis and Design by Tamara Munzner
Tableau Resources
Tableau Services
Tableau offers several solutions designed to help users continue to grow and
improve as visual data analysts within the Tableau ecosystem. While there
are additional learning paths available for organizations, we’ll focus on
those offerings that support individual learners.
eLearning
Instructor-Led Training
Tableau Help
Tableau Community
Numerics
adding
data sources, 55–57
reference line for “Today”, 209–210
time frame to timeline, 208
Affinity Map, 2
AI (artificial intelligence), 39–40
alerting color, 149
analysis
audience, 76
how, 78
what, 77
who, 77
why, 77–78
exploratory versus explanatory, 69–70
annotations, 70–71
Anscombe’s Quartet, 24–25
anthropology, 21
app, Lucid, 18
area annotation, 70
audience analysis, 76
how, 78
what, 77
who, 77
why, 77–78
D
dashboards, 58, 188, 190–191
visual hierarchy, 191–192
workspace, 192–193
data
context, 65
hard, 32
pivoting, 182–185
preparing
Data Interpreter, 176–179
handling nulls, 179–181
survey data, 186–188
Data Interpreter, 57–58, 176–179
Data pane, 60
data sources
adding and replacing, 55–57
filtering, 52
geodata
assigning geographic roles, 117–121
connecting to, 116–117
supported, 115–116
join/s, 54
tables
connecting to, 48–51
relationships, 52–54
data visualization, 2, 7, 14. See also storytelling; visualizations
Affinity Map, 2
communication skills, 11–13
context, 13
“dessert charts”, 13
DVS SOTI (State of the Industry) survey, 9–10
evolution to visual data storytelling, 7–8
insight, 6, 8
“old” versus modern, 2
skill sets, 4, 14–15
statistics, 6–7
storytelling, 4–6, 7–8
Trendalyzer, 2
Data Visualization Society, SOTI (State of the Industry) report, 42
dataset/s
Anscombe’s Quartet, 24–25
Baby Names, 226
Harry Potter and the Philosopher’s Stone manuscript, 231
Lyme disease, 123–126
sample, 82
Significant Volcanic Eruptions, 176–177
in-demand skill sets, 41–42
“dessert charts”, 13–14. See also doughnut chart; pie chart
deuteranope color deficiency, 151–152
diagram, plot, 74
dimensions, 61
divergent stacked bar chart
calculated fields
Gantt percent line, 222–226
Gantt Start, 220–222
negative sentiment, 217–218
percent of Gantt sizing, 222
total negative sentiment, 218–219
total sentiment scores, 220
creating, 216–217
diverging color, 140–142
doughnut chart, 91–97
drill down, 74
drop line, formatting, 153–159
dual-axis line chart, 88–89
DVCC (Data Visualization Competency Center), 238
DVS (Data Visualization Society), SOTI (State of the Industry) survey,
9–10
factors, 75
false reveal, 238
Federal Institute of Technology, 2
feedback, 238–239
fields
calculated, 207
continuous, 62
geographic, 117–121
null, 55
removing labels and headers, 165–168
filter/s, 52, 73
fitness, 22
FlowingData, 240
formatting
borders, 160–163
lines, 153–159
text, 137–138
timeline, 207–208
Forrester Consulting, 10–11
functions
PROPER(), 137–138
working with NULL values, 179–181
ZN(), 180
Gapminder Foundation, 2
Gartner’s Magic Quadrant™, Leaders, 38–39
Gates, B., 64
genre, 75–76
geodata
assigning geographic roles, 117–121
connecting to, 116–117
creating geographic hierarchies, 120–122
proportional symbol maps, 123–126
supported, 115–116
GitHub, 15, 123
Global Superstore, 48
Goldberg, S., 18
Google Maps, 131
grid line, formatting, 153–159
Hannibal, 34
hard data, 32
Harry Potter, 232
context, 66–67
dataset, 216
headers, removing, 165–168
heat map, 109–112
Heer, J., 75
highlight color, 148
icons, Sort, 86
IEEE Computer Graphics and Applications, 40
IFNULL functions, 180
IIF functions, 180
illustration, 6
image roles, 171–173
Info We Trust, 240
Information is Beautiful, 241
inner join, 54
insight, 6, 8
installing, Tableau Desktop, 45–46
instructor-led training, 244
interface, 58–59
intersections, 75
ISNULL functions, 180
join
inner, 54
left, 54
outer, 54
right, 54
K-L
opacity, 146–147
organizational skills, 12–13
outer join, 54
outliers, 75
relationships, 52–54
relevant context, 67–69
replacing, data sources, 55–57
reversed color, 143
right join, 54
Rosling, H., 2
Rowling, J. K., 66
Ryan, L., The Visual Imperative: Creating a Visual Culture of Data
Discovery, 69, 135
Ryssdal, K., 18
U-V
Undo button, 59
usability testing, visualizations, 239
vision, red–green color deficiency, 151–152
visual analytics, 2
communication skills, 11–13
context, 13, 64, 65
education and training, 14–15
perceptual pop-out, 29
pre-attentive features, 29–30
skill sets, 10, 11, 14–15
visual design
banding, 164–165
borders, formatting, 160–163
building blocks, 136–137
color, 138–139
“brand”, 150
categorical, 143–145
consistency, 150–151
diverging, 140–142
opacity, 146–147
pre-attentive, 148–150
reversed, 143
sequential, 139–140
shading, 163–165
stepped, 142–143
image roles, 171–173
lines, 153–159
mark borders, 147
mark halos, 147–148
shapes
custom, 169–171
Shape Marks card, 168–169
visual storytelling, 7–8, 23
Anscombe’s Quartet, 24–25
emotion, 23
Hannibal’s march, 34
Harry Potter, context, 68–69
“Napoleon’s March by Minard”, 32–34, 134
Netflix viewing data, 26–28
Nigel Holmes’s Monstrous Costs, 34–35
photojournalism, 18
picture superiority effect, 135
seasonal cycle, 30–32
visualizations, 2
chart
100% stacked bar, 214–215
bar, 83–86
bar-in-bar, creating, 210–212
bubble, 102–103
divergent stacked bar, creating, 216–225
doughnut, 91–97
labeled lollipop, 230–231
line, 88–90
lollipop, 226–230
packed bubble, 103–106
pie, 91–97
feedback, 238–239
Likert-scale, 213–214
map/s, 114–115
choropleth, 127–128
heat, 109–112
keeping neutral, 130–132
layers, 128–130
tree, 106–109
scatter plot, 98–101
sheets, 189
versus storytelling, 19–21
timeline
adding a reference line for “Today”, 209–210
adding a time frame, 208
creating, 202–207
formatting, 207–208
usability testing, 239
word cloud, creating, 231–234
VizQL, 39, 42
Vonnegut, K., 8
Wernicke’s area, 19
Wexler, S., Data Revelations, 239
Which Chart or Graph, 112
wide data, 182–185
Windows, installing Tableau Desktop, 46
word cloud, creating, 231–234
workbook
dashboards, 58
sheets, 58
stories, 58
Workbook Optimizer, 181–182
worksheet, dashboards, 190–191
workspace
dashboard, 192–193
story point, 195–196
X-Y-Z