Learn Power BI: A comprehensive, step-by-step guide for beginners to learn real-world business intelligence
4/5
()
About this ebook
To succeed in today's transforming business world, organizations need business intelligence capabilities to make smarter decisions faster than ever before. This updated second edition of Learn Power BI takes you on a journey of data exploration and discovery, using Microsoft Power BI to ingest, cleanse, and organize data in order to unlock key business insights that can then be shared with others.
This newly revised and expanded edition of Learn Power BI covers all of the latest features and interface changes and takes you through the fundamentals of business intelligence projects, how to deploy, adopt, and govern Power BI within your organization, and how to leverage your knowledge in the marketplace and broader ecosystem that is Power BI. As you progress, you will learn how to ingest, cleanse, and transform your data into stunning visualizations, reports, and dashboards that speak to business decision-makers.
By the end of this Power BI book, you will be fully prepared to be the data analysis hero of your organization – or even start a new career as a business intelligence professional.
Read more from Gregory Deckler
The Definitive Guide to Power Query (M): Mastering complex data transformation with Power Query Rating: 5 out of 5 stars5/5
Related to Learn Power BI
Related ebooks
Extreme DAX: Take your Power BI and Microsoft data analytics skills to the next level Rating: 4 out of 5 stars4/5Data Engineering with dbt: A practical guide to building a cloud-based, pragmatic, and dependable data platform with SQL Rating: 0 out of 5 stars0 ratingsLearning Azure DocumentDB Rating: 0 out of 5 stars0 ratingsLaw and Digital Technologies - The Way Forward Rating: 0 out of 5 stars0 ratingsModern Data Architecture on AWS: A Practical Guide for Building Next-Gen Data Platforms on AWS Rating: 0 out of 5 stars0 ratingsThe ChatGPT Coaching Millionaire Blueprint (GPT-4o 2024 Edition): ChatGPT Millionaire Blueprint, #6 Rating: 0 out of 5 stars0 ratingsBlazor WebAssembly by Example: A project-based guide to building web apps with .NET, Blazor WebAssembly, and C# Rating: 0 out of 5 stars0 ratingsMaster Algorithm: Fundamentals and Applications Rating: 0 out of 5 stars0 ratingsAWS Cloud Automation: Harnessing Terraform For AWS Infrastructure As Code Rating: 0 out of 5 stars0 ratingsBuilding REST APIs with Flask: Create Python Web Services with MySQL Rating: 0 out of 5 stars0 ratingsCorporate Information Factory Rating: 1 out of 5 stars1/5Modern Web Development with Go Rating: 0 out of 5 stars0 ratingsCryptocurrencies: Money, Trust and Regulation Rating: 0 out of 5 stars0 ratingsData Privacy Fintech A Complete Guide - 2021 Edition Rating: 0 out of 5 stars0 ratingsWhat on Earth is Going On?: A Crash Course in Current Affairs Rating: 4 out of 5 stars4/5Hyperledger Fabric A Complete Guide - 2021 Edition Rating: 0 out of 5 stars0 ratingsIT infrastructure deployment Standard Requirements Rating: 0 out of 5 stars0 ratingsHow to Create and Manage a Mutual Fund or Exchange-Traded Fund: A Professional's Guide Rating: 0 out of 5 stars0 ratingsRegulatory Technology A Complete Guide - 2021 Edition Rating: 0 out of 5 stars0 ratingsJob Ready Java Rating: 0 out of 5 stars0 ratingsProgramming the Network with Perl Rating: 0 out of 5 stars0 ratingsDeep Learning for Data Architects: Unleash the power of Python's deep learning algorithms (English Edition) Rating: 0 out of 5 stars0 ratingsBlockchain & Decentralized Finance #1 Guide To Invest In Blockchain Technology, Cryptocurrencies, Altcoins, Smart Contracts and NFTs Rating: 0 out of 5 stars0 ratingsBAD MONEY: FinTech as an Instrument in the Battle for Global Dominance Rating: 0 out of 5 stars0 ratingsThe Ultimate Docker Container Book: Build, test, ship, and run containers with Docker and Kubernetes Rating: 0 out of 5 stars0 ratingsEthereum Bible Rating: 0 out of 5 stars0 ratings
Enterprise Applications For You
Excel 101: A Beginner's & Intermediate's Guide for Mastering the Quintessence of Microsoft Excel (2010-2019 & 365) in no time! Rating: 0 out of 5 stars0 ratingsExcel 2016 For Dummies Rating: 4 out of 5 stars4/5QuickBooks 2023 All-in-One For Dummies Rating: 0 out of 5 stars0 ratingsQuickBooks 2024 All-in-One For Dummies Rating: 0 out of 5 stars0 ratingsChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology Rating: 1 out of 5 stars1/5Excel : The Ultimate Comprehensive Step-By-Step Guide to the Basics of Excel Programming: 1 Rating: 5 out of 5 stars5/5Excel Formulas That Automate Tasks You No Longer Have Time For Rating: 5 out of 5 stars5/5Bitcoin For Dummies Rating: 4 out of 5 stars4/5Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates Rating: 4 out of 5 stars4/5Excel All-in-One For Dummies Rating: 0 out of 5 stars0 ratingsQuickBooks Online For Dummies Rating: 0 out of 5 stars0 ratings50 Useful Excel Functions: Excel Essentials, #3 Rating: 5 out of 5 stars5/5Excel Tables: A Complete Guide for Creating, Using and Automating Lists and Tables Rating: 5 out of 5 stars5/5Scrivener For Dummies Rating: 4 out of 5 stars4/5Enterprise AI For Dummies Rating: 3 out of 5 stars3/5Excel Workbook For Dummies Rating: 4 out of 5 stars4/5Learning Python Rating: 5 out of 5 stars5/5Access 2019 For Dummies Rating: 0 out of 5 stars0 ratingsM Is for (Data) Monkey: A Guide to the M Language in Excel Power Query Rating: 4 out of 5 stars4/5Experts' Guide to OneNote Rating: 5 out of 5 stars5/5Notion for Beginners: Notion for Work, Play, and Productivity Rating: 4 out of 5 stars4/5Excel Tips and Tricks Rating: 0 out of 5 stars0 ratings
Reviews for Learn Power BI
1 rating0 reviews
Book preview
Learn Power BI - Gregory Deckler
BIRMINGHAM—MUMBAI
Learn Power BI Second Edition
Copyright © 2021 Packt Publishing
All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.
Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing or its dealers and distributors, will be held liable for any damages caused or alleged to have been caused directly or indirectly by this book.
Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.
Publishing Product Manager: Reshma Raman
Senior Editor: David Sugarman
Content Development Editor: Nathanya Dias
Technical Editor: Sonam Pandey
Copy Editor: Safis Editing
Project Coordinator: Aparna Ravikumar Nair
Proofreader: Safis Editing
Indexer: Subalakshmi Govindhan
Production Designer: Ponraj Dhandapani
First published: September 2019
Second edition: December 2021
Production reference: 2220222
Published by Packt Publishing Ltd.
Livery Place
35 Livery Street
Birmingham
B3 2PB, UK.
ISBN 978-1-80181-195-8
www.packt.com
Contributors
About the author
Greg Deckler is a Microsoft MVP for Data Platform and an active member of the Columbus Ohio IT community, having founded the Columbus Azure ML and Power BI User Group (CAMLPUG) and presented at many conferences and events throughout the country. An active blogger and community member interested in helping new users of Power BI, Greg actively participates in the Power BI community, having authored over 180 Power BI Quick Measures Gallery submissions and over 5,000 authored solutions to community questions. Greg is Vice President of cloud services at Fusion Alliance, a regional consulting firm, and assists customers in gaining competitive advantage from the cloud and cloud first technologies such as Power BI.
Thanks to my son Rocket, mom Sandy, the real-life Pam, Pam Hagely, and the entire Power BI community for their support, in particular, Pragati Jain, Pat Mahoney, Miguel Felix, Imke Feldman, Parvinder Chana, Paul Brown, Marco Russo, Ed Hansberry, Allison Kennedy, Tom Martens, Gilbert Quevauvilliers, Konstantinos Ioannou, Fowmy Abdulmuttalib, Matt Allington, Mike Carlo, Seth Bauer, and Phil Seamark. Special thanks to Doug Brown, Tom Wood, Tom Campbell, Rick Mariotti, David LeVine, Megan Koontz, Kishan Wanigasingha, Ishara Guruge, Sajith Wanigasingha, Paul Moore, Rob Watkins, Jon Tokash, John Dages, David Schroeder, and my entire Fusion Alliance family, including work friend
Ashley Titus. This book is dedicated to my dad, Carl, who passed away in 2021.
About the reviewer
Peter Ter Braake started working as a developer in 1996 after studying physics in Utrecht, the Netherlands. Databases and business intelligence piqued his interest the most, leading to him specializing in SQL Server and its business intelligence components. He has worked with Power BI from the tool's very beginnings. Peter started working as an independent contractor in 2008. This has enabled him to divide his time between teaching data-related classes, consulting with customers, and writing articles and books. Peter has also authored Data Modeling for Azure Data Services, Packt.
Table of Contents
Preface
Section 1: The Basics
Chapter 1: Understanding Business Intelligence and Power BI
Exploring key concepts of business intelligence
Domain
Data
Model
Analysis
Visualization
Discovering the Power BI ecosystem
Core and Power BI-specific
Core and non-Power BI-specific
Non-core and Power BI-specific
Natively integrated Microsoft technologies
The extended Power BI ecosystem
Choosing the right Power BI license
Shared capacity
Dedicated capacity
Introducing Power BI Desktop and the Power BI service
Power BI Desktop
The Power BI service
Summary
Questions
Further reading
Chapter 2: Planning Projects with Power BI
Planning Power BI business intelligence projects
Identifying stakeholders, goals, and requirements
Procuring the required resources
Discovering the required data sources
Designing a data model
Planning reports and dashboards
Explaining the example scenario
Background
Identifying stakeholders, goals, and requirements
Procuring the required resources
Discovering the required data sources
Designing a data model
Planning reports and dashboards
Summary
Questions
Further reading
Section 2: The Desktop
Chapter 3: Up and Running with Power BI Desktop
Technical requirements
Downloading and running Power BI Desktop
Downloading Power BI Desktop
Running Power BI Desktop
Touring the desktop
Header
Views
Panes
Canvas
Wallpaper
Pages
Footer
Ribbon
The Formula Bar
Generating data
Creating a calculated table
Creating calculated columns
Formatting columns
Creating visualizations
Creating your first visualization
Formatting your visualization
Adding analytics to your visualization
Creating and using a slicer
Creating more visualizations
Editing visual interactions
Summary
Questions
Further reading
Chapter 4: Connecting to and Transforming Data
Technical requirements
Getting data
Creating your first query
Getting additional data
Transforming data
Touring the Power Query Editor
Transforming budget and forecast data
Transforming People, Tasks, and January data
Merging, copying, and appending queries
Merging queries
Expanding tables
Disabling queries from being loaded
Copying queries
Changing sources
Appending queries
Verifying and loading data
Organizing queries
Checking column quality, distribution, and profiles
Loading the data
Summary
Questions
Further reading
Chapter 5: Creating Data Models and Calculations
Technical requirements
Creating a data model
Touring the Model view
Modifying the layout
Creating and understanding relationships
Exploring the data model
Creating calculations
Calculated columns
Measures
Checking and troubleshooting calculations
Boundary cases
Slicing
Grouping
Summary
Questions
Further reading
Chapter 6: Unlocking Insights
Technical requirements
Segmenting data
Creating groups
Creating hierarchies
Understanding RLS
Buttons
Question and answer (Q&A)
Bookmarks
Advanced analysis techniques
The Analyze and Summarize features
Top-N filtering
Gauges and KPIs
What-if parameters
Conditional formatting
Quick measures
Report tooltip pages
Key influencers
Summary
Questions
Further reading
Chapter 7: Creating the Final Report
Technical requirements
Preparing the final report
Planning the final report
Cleaning up
Using a theme
Creating a page template
Using Sync slicers
Adjusting the calendar
Adding report filters
Creating the final report pages
Creating the Executive Summary page
Creating the Division Management page
Creating the Branch Management page
Creating the Hours Detail page
Creating the Employee Details page
Creating the Introduction page
Finishing up
Testing
Cleaning up
Summary
Questions
Further reading
Section 3: The Service
Chapter 8: Publishing and Sharing
Technical requirements
Getting an account
Office 365
Power BI trial
Introducing the service
Touring the service
Header
Navigation pane
Canvas
Publishing and sharing
Creating a workspace
Publishing
Sharing
Summary
Questions
Further reading
Chapter 9: Using Reports in the Power BI Service
Technical requirements
Viewing and using reports
File menu
Export menu
Share
Chat in Teams
Subscribe
Ellipsis (…)
Reset
Bookmark
View
Refresh visuals
Comment
Add to Favorites
Editing and creating reports
Editing reports
Creating a report
Summary
Questions
Further reading
Chapter 10: Understanding Dashboards, Apps, Goals, and Security
Technical requirements
Understanding dashboards
Creating a dashboard
Working with dashboards
Working with tiles
Creating and using apps
Creating an app
Getting and using apps
Working with goals
Creating scorecards and goals
Using scorecards and goals
Understanding security and permissions
Workspace permissions
App permissions
Object permissions
RLS
Summary
Questions
Further reading
Chapter 11: Refreshing Content
Technical requirements
Installing and using data gateways
Downloading and installing a data gateway
Running a data gateway
Configuring a data gateway
Managing a data gateway
Refreshing datasets
Scheduling a refresh
Summary
Questions
Further reading
Section 4: The Future
Chapter 12: Deploying, Governing, and Adopting Power BI
Technical requirements
Understanding usage models
Anarchy
Centralized
Distributed
Golden datasets
Hybrid
Governing and administering Power BI
Tenant settings
Deploying Power BI content
Adopting Power BI
Adoption strategies
Summary
Questions
Further reading
Chapter 13: Putting Your Knowledge to Use
Technical requirements
Understanding the business intelligence opportunity
Understanding the types of business intelligence jobs and roles
Growing your job and career
Understanding the employment and career opportunities
Job search strategies
Interviewing tips
Negotiating benefits and compensation
Continuing your journey
Summary
Questions
Further reading
Why subscribe?
Other Books You May Enjoy
Preface
To succeed in today's fast-paced business world, organizations need Business Intelligence (BI) capabilities more than ever in order to make smarter decisions that allow those organizations to be more efficient, effective, and profitable. This book is an entry-level guide specifically designed to get you up and running quickly with Power BI, including data import and transformation, data modeling, visualization, and analytical techniques without any prior knowledge of BI or Power BI.
You will find this book useful if you want to become knowledgeable about the extensive Power BI ecosystem. You'll start by understanding basic BI concepts and how BI projects are conducted. In short order, you will have Power BI Desktop installed and understand its major components. As you progress, step-by-step instructions are provided for using Power Query Editor to ingest, cleanse, and transform your data, creating simple and complex DAX calculations and visualizing your data in ways that truly bring your data to life. Additionally, you'll gain hands-on experience in creating visually stunning reports that speak to business decision makers and understand how to share and collaborate with others. Finally, you will understand how Power BI is deployed, governed, and adopted within organizations, the job and career opportunities available to BI professionals, and how to continue your learning.
By the end of this book, you'll be ready to create effective reports and dashboards using the latest features of Power BI.
Who this book is for
If you are new to BI or you are a business analyst or other technical or non-technical user who is new to Power BI, then this book is for you. No prior experience in BI or Power BI is required in order to proceed.
What this book covers
Chapter 1, Understanding Business Intelligence and Power BI, provides an introduction to key concepts of business intelligence, an overview of the Power BI ecosystem, licensing options for Power BI, and introduces the Power BI Desktop and Power BI Service.
Chapter 2, Planning Projects with Power BI, explains how business intelligence projects are planned and executed, including identifying stakeholders, goals, and requirements, required resources and data sources, and introduces the example scenario used throughout the rest of the book.
Chapter 3, Up and Running with Power BI Desktop, provides instructions for downloading and installing Power BI Desktop and an overview of the major components of the Desktop including Report, Data and Model views, the menu tabs, the Filters, and the Visualizations and Fields panes. It introduces the creation of tables and visualizations.
Chapter 4, Connecting to and Transforming Data, introduces the Power Query Editor for importing and transforming data, including transposing data, creating custom columns, adding index columns, splitting columns, referencing queries, appending and merging queries, additional transformation functions and importing data.
Chapter 5, Creating Data Models and Calculations, demonstrates how to create a data model by using the model view to create relationships between tables, and how to create and troubleshoot data analysis calculations.
Chapter 6, Unlocking Insights, introduces analysis concepts such as groups and hierarchies, row level security, report navigation using drill through and buttons, question and answer, bookmarks and advanced analysis techniques such as analyze, summarization, filtering, gauges, key performance indicators, What if parameters, conditional formatting, quick measures, report page tooltips, and advanced visuals such as, the Key Influencer's visual.
Chapter 7, Creating the Final Report, provides step-by-step instructions for creating a professional, multi-page report that provides data insights to business decision makers.
Chapter 8, Publishing and Sharing, demonstrates how to publish the final report to the Power BI Service and share the report with a larger audience.
Chapter 9, Using Reports in the Power BI Service, focuses on using reports in the Power BI Service including all of the various report functions such as editing reports, embedding, exporting, bookmarks, lineage view, comments, subscriptions and Microsoft Teams integration.
Chapter 10, Understanding Dashboards, Apps, Goals, and Security, provides information on creating and working with dashboards, including pinning and managing tiles, the creation and distribution of apps, the creation of scorecards and goals and an overview of permissions and security.
Chapter 11, Refreshing Content, demonstrates how to install, configure, and manage a data gateway, and how to schedule automatic refreshes for datasets within the Power BI Service.
Chapter 12, Deploying, Governing, and Adopting Power BI, introduces different deployment usage models for Power BI within organizations, the concept of governance of Power BI systems including all of the various Power BI Service tenant settings, and how to drive the adoption of Power BI within an organization.
Chapter 13, Putting Your Knowledge to Use, describes the overall opportunity available in business intelligence, the various types of business intelligence jobs, roles, and responsibilities, the differences between consulting and internal employees, job search strategies, interviewing and compensation negotiation tips, and finally includes information on blogs and other websites to continue your journey of learning Power BI.
To get the most out of this book
No prior experience in BI or Power BI is necessary. A keen interest in data and data analytics is helpful as well as prior experience with other BI tools.
Chapter 10, Understanding Dashboards, Apps, Goals, and Security, includes material that requires Premium or Premium Per User (PPU) licensing.
Important note
The existing Power BI UI will be updated soon to look as shown in this book.
If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book's GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.
Join the Power BI Community at https://community.powerbi.com!
Download the example code files
You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Learn-Power-BI-second-edition. If there's an update to the code, it will be updated in the GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Code in Action
The Code in Action videos for this book can be viewed at https://bit.ly/3F2HfnI.
Download the color images
We also provide a PDF file that has color images of the screenshots and diagrams used in this book. You can download it here: https://static.packt-cdn.com/downloads/9781801811958_ColorImages.pdf.
Conventions used
There are a number of text conventions used throughout this book.
Code in text: Indicates code words in the text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: The first parameter is the 'Hours' table, on line 4, and a filter, on line 5.
A block of code is set as follows:
Column 3 =
SUMX(
FILTER(
ALL('Hours'),
[Category] = Billable
&& [EmployeeID] = EARLIER([EmployeeID])
),
[Hours]
)
Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: Power Platform includes Power BI datasets and dataflows, as well as the Dataverse
Tips or Important notes
Enter data queries support up to 3,000 cells of information. If you run into a limitation, you can always copy the table in Power BI and then paste it into Excel. Once you've done this, you can add the required information in Excel, save it, and then import this Excel file into Power BI.
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, email us at [email protected] and mention the book title in the subject of your message.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.
Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at [email protected] with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Share Your Thoughts
Once you've read Learn Power BI, we'd love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.
Your review is important to us and the tech community and will help us make sure we're delivering excellent quality content.
Section 1:The Basics
The objective of this section is to introduce you to the key concepts of business intelligence and Power BI, understand how Power BI projects are conducted, and introduce you to the example scenario used throughout the rest of the book.
This section comprises the following chapters:
Chapter 1, Understanding Business Intelligence and Power BI
Chapter 2, Planning Projects with Power BI
Chapter 1: Understanding Business Intelligence and Power BI
Power BI is a powerful ecosystem of business intelligence tools and technologies from Microsoft. But what exactly is business intelligence, anyway? Simply stated, business intelligence is all about leveraging data to make better decisions. This can take many forms and is not necessarily restricted to just business. We use data in our personal lives to make better decisions as well. For example, if we are remodeling a bathroom, we get multiple quotes from different firms. The prices and details in these quotes are pieces of data that allow us to make an informed decision in terms of which company to choose. We may also research these firms online. This is more data that ultimately supports our decision.
In this chapter, we will explore the fundamental concepts of business intelligence, as well as why business intelligence is important to organizations. In addition, we will take a high-level tour of the Power BI ecosystem, licensing, and core tools, such as Power BI Desktop and the Power BI service.
The following topics will be covered in this chapter:
Exploring key concepts of business intelligence
Discovering the Power BI ecosystem
Choosing the right Power BI license
Introducing Power BI Desktop and the Power BI service
Exploring key concepts of business intelligence
In the context of organizations, business intelligence is about making better decisions for your business. Unlike the example in the introduction, organizations are not generally concerned with bathrooms but rather with what can make their business more effective, efficient, and profitable. The businesses that provided those quotes on bathroom remodeling need to answer questions such as the following:
How can the business attract new customers?
How can the business retain more customers?
Who are the competitors and how do they compare?
What is driving profitability?
Where can expenses be diminished?
There are endless questions that businesses need to answer every day, and these businesses need data coupled with business intelligence tools and techniques to answer such questions and make effective operational and strategic decisions.
While business intelligence is a vast subject in and of itself, the key concepts of business intelligence can be broken down into five areas:
Domain
Data
Model
Analysis
Visualization
Domain
A domain is simply the context where business intelligence is applied. Most businesses are composed of relatively standard business functions or departments, such as the following:
Sales
Marketing
Manufacturing/production
Supply chain/operations
Research and development
Human resources
Accounting/finance
Each of these business functions or departments represents a domain within which business intelligence can be used to answer questions that can assist us in making better decisions.
The domain helps in narrowing down the focus regarding which questions can be answered and what decisions need to be made. For example, within the context of sales, a business might want to know which sales personnel are performing better or worse, or which customers are the most profitable. Business intelligence can provide such insights as well as help to determine which activities enable certain sales professionals to outperform others, or why certain customers are more profitable than others. This information can then be used to train and mentor sales personnel who are performing less effectively or to focus sales efforts.
Within the context of marketing, a business can use business intelligence to determine which types of marketing campaigns, such as email, radio, print, TV, and the web, are most effective in attracting new customers. This then informs the business where they should spend their marketing budget.
Within the context of manufacturing, a business can use business intelligence to determine the Mean Time Between Failure (MTBF) for machines that are used in the production of goods. This information can be used by the business to determine whether preventative maintenance would be beneficial and how often such preventative maintenance should occur.
Clearly, there are endless examples of where business intelligence can make an organization more efficient, effective, and profitable. Deciding on a domain in which to employ business intelligence techniques is a key step in enabling business intelligence undertakings within organizations, since the domain dictates which key questions can be answered, the possible benefits, as well as what data is required in order to answer those questions.
Data
Once a domain has been decided upon, the next step is identifying and acquiring the data that's pertinent to that domain. This means identifying the sources of relevant data. These sources may be internal or external to an organization and may be structured, unstructured, or semi-structured in nature.
Internal and external data
Internal data is data that is generated within an organization by its business processes and operations. These business processes can generate large volumes of data that is specific to that organization's operations. This data can take the form of net revenues, sales to customers, new customer acquisitions, employee turnover, units produced, cost of raw materials, and time series or transactional information. This historical and current data is valuable to organizations if they wish to identify patterns and trends, as well as for forecasting and future planning. Importantly, all the relevant data to a domain and question is almost never housed within a single data source; organizations inevitably have multiple sources of relevant data.
In addition to internal data, business intelligence is most effective when internal data is combined with external data. Crucially, external data is data that is generated outside the boundaries of an organization's operations. Such external data includes things such as overall global economic trends, census information, customer demographics, household salaries, and the cost of raw materials. All this data exists irrespective of any single organization.
Each domain and question will have internal and external data that is relevant and irrelevant to answering the question at hand. However, do not be fooled into believing that simply because you have chosen manufacturing/production as the domain, other domains, such as sales and marketing, do not have relevant sources of data. If you are trying to forecast the required production levels, sales data in terms of pipelines can be very relevant. Similarly, external data that points toward overall economic growth may also be extremely relevant, while data such as the cost of raw materials may very well be irrelevant.
Structured, unstructured, and semi-structured data
Structured data is data that conforms to a rather formal specification of tables with rows and columns. Think of a spreadsheet where you might have columns for the transaction ID, customer, units purchased, and price per unit. Each row represents a sales transaction. Structured data sources are the easiest sources for business intelligence tools to consume and analyze. These sources are most often relational databases, which include technologies such as Microsoft SQL Server, Microsoft Access, Azure Table storage, Azure SQL Database, Oracle, MySQL, IBM Db2, Teradata, PostgreSQL, Informix, and Sybase. In addition, this category of data sources includes relational database standards such as Open Database Connectivity (ODBC) and Object Linking and Embedding Database (OLE DB).
Unstructured data is effectively the opposite of structured data. Unstructured data cannot be organized into simple tables with rows and columns. Such data includes things such as video, audio, images, and text. Text documents, social media posts, and online reviews are also examples of largely unstructured data. Unstructured data sources are the most difficult types of sources for business intelligence tools to consume and analyze. This type of data is either stored as Binary Large Objects (BLOBSs), online files or posts, or as files in a filesystem, such as the New Technology File System (NTFS) or the Hadoop Distributed File System (HDFS).
Semi-structured data has a structure but does not conform to the formal definition of structured data, that is, tables with rows and columns. Examples of semi-structured data include tab and delimited text files, XML, other markup languages such as HTML and XSL, JavaScript Object Notation (JSON), and Electronic Data Interchange (EDI). Semi-structured data sources have a self-defining structure that makes them easier to consume and analyze than unstructured data sources but require more work than true, structured data sources.
Semi-structured data also includes so-called NoSQL databases, which include data stores such as document databases, graph databases, and key-value stores. These databases are specifically designed to store structured and unstructured data. Document databases include Microsoft Azure Cosmos DB, MongoDB, Cloudant (IBM), Couchbase, and MarkLogic. Graph databases include Neo4j and HyperGraphDB. Key-value stores include Basho Technologies' Riak, Redis, Aerospike, Amazon Web Services' DynamoDB, Couchbase, DataStax's Cassandra, and MapR Technologies. Wide-column stores include Cassandra and HBase.
Finally, semi-structured data also includes data access protocols, such as Open Data Protocol (OData) and other Representational State Transfer (REST) Application Programming Interfaces (APIs). These protocols provide interfaces to data sources such as Microsoft SharePoint, Microsoft Exchange, Microsoft Active Directory, and Microsoft Dynamics; social media systems such as Twitter and Facebook; as well as other online systems such as Mailchimp, Salesforce, Smartsheet, Twilio, Google Analytics, and GitHub, to name a few. These data protocols abstract how the data is stored, whether that is a relational database, NoSQL database, or simply a bunch of files.
Most business intelligence tools, such as Power BI, are optimized for handling structured and semi-structured data. Structured data sources integrate natively with how business intelligence tools are designed. In addition, business intelligence tools are designed to ingest semi-structured data sources and transform them into structured data. Unstructured data is more difficult but not impossible to analyze with business intelligence tools. In fact, Power BI has some features that are designed to ease the ingestion and analysis of unstructured data sources. However, analyzing such unstructured data has its limitations.
Model
A model, or data model, refers to the way in which one or more data sources are organized to support analysis and visualization. Models are built by transforming and cleansing data, helping to define the types of data within those sources, as well as the definition of data categories for specific data types. Building a model generally involves three elements:
Organizing
Transforming and cleansing
Defining and categorizing
Organizing
Models can be extremely simple, such as a single table with columns and rows. However, business intelligence almost always involves multiple tables of data, and often involves multiple tables of data coming from multiple sources. Thus, the model becomes more complex as the various sources and tables of data must be combined into a cohesive whole. This is done by defining how each of the disparate sources of data relates to one another. As an example, let's say you have one data source that represents a customer's name, contact information, and perhaps the size of the business by revenue and/or the number of employees. This information might come from an organization's Customer Relationship Management (CRM) system. The second source of data might be order information, which includes the customer's name, units purchased, and the price that was paid. This second source of data comes from the organization's Enterprise Resource Planning (ERP) system. These two sources of data can be related to one another based on the unique name or ID of the customer.
Some sources of data have prebuilt models. This includes traditional data warehouse technologies for structured data as well as analogous systems for performing analytics over unstructured data. The traditional data warehouse technology is generally built upon the Online Analytical Processing (OLAP) technology and includes systems such as Microsoft's Analysis Services, Snowflake, Oracle's Essbase, AtScale cubes, SAP HANA and Business Warehouse servers, and Azure Synapse. With respect to unstructured data analysis, technologies such as Apache Spark, Databricks, and Azure Data Lake Storage are used.
Transforming and cleansing
When building a data model, it is often (read: always) necessary to clean and transform the source data. Data is never clean – it must always be massaged for bad data to be removed or resolved. For example, when dealing with customer data from a CRM system, it is not uncommon to have the same customer entered with multiple spellings. The format of data in spreadsheets may make data entry easy for humans but can be unsuitable for business intelligence purposes. In addition, data may have errors, missing data, inconsistent formatting, or even have something as seemingly simple as trailing spaces. These types of situations can cause problems when performing business intelligence analysis. Luckily, business intelligence tools such as Power BI provide mechanisms for cleansing and reshaping the data to support analysis. This might involve replacing or removing errors in the data, pivoting, unpivoting, or transposing rows and columns, removing trailing spaces, or other types of transformation operations.
Transforming and cleansing technologies are often referred to as Extract, Transform, Load (ETL) tools and include products such as Microsoft's SQL Server Integration Services (SSIS), Azure Data Factory, Alteryx, Informatica, Dell Boomi, Salesforce's MuleSoft, Skyvia, IBM's InfoSphere Information Server, Oracle Data Integrator, Talend, Pentaho Data Integration, SAS's Data Integration Studio, Sybase ETL, and QlikView Expressor.
Defining and categorizing
Data models also formally define the types of data within each table. Data types generally include formats such as text, decimal number, whole number, percentage, date, time, date and time, duration, true/false, and binary. The definition of these data types is important as it defines what kind of analysis can be performed on the data. For example, it does not make sense to create a sum or average of text data types; instead, you would use aggregations such as count, first, or last.
Finally, data models also define the data category of data types. While a data type such as a postal code might be numeric or text, it is important for the model to define that the numeric data type represents a postal code. This further defines the type of analysis that can be performed upon this data, such as plotting the data on a map. Similarly, it might be important for the data model to define that a text data type represents a web or image Uniform Resource Locator (URL). Typical data categories include such things as address, city, state, province, continent, country, region, place, county, longitude, latitude, postal code, web URL, image URL, and barcode.
Analysis
Once a domain has been selected and data sources have been combined into a model, the next step is to perform an analysis of the data. This is a key process within business intelligence as this is when you attempt to answer questions that are relevant to the business using internal and external data. Simply having data about sales is not immediately useful to a business. For example, to predict future sales revenue, it is important that such data is aggregated and analyzed. This analysis can determine the average sales for a product, the frequency of