Data Preparation using PowerExcel
INTRO TO POWER QUERY, POWER PIVOT
GETTING STARTED
COURSE OUTLINE
2 Power Query
• Types of data connectors, query editing tools, loading options, etc.
For a full, current list of compatible versions, visit support.office.com (or Google “Where is Power Pivot?”):
https://support.office.com/en-us/article/Where-is-Power-Pivot-aa64e217-4b6e-410b-8337-20b87e1c2a4b (or use: bit.ly/2yd80rd)
Other considerations:
• Power Pivot works best with 64-bit Excel, which can access more processing power and memory (not critical)
• Note: make sure you’re running a 64-bit operating system and that you’ve updated Office to the 64-bit version
• Power Pivot menus, features and tools have evolved over time; what you see on your screen may differ from
what you see on mine, but the fundamental skills and concepts covered are universally applicable
• Even if you have a compatible version of Excel, you may need to enable the Power Pivot or Power Query
plug-ins to access the tools in this course (File > Options > Add-Ins > Manage: COM Add-Ins)
GETTING TO KNOW THE FOODMART DATABASE
• Throughout the course, we’ll be using sample data from a fictional supermarket chain
called “FoodMart”*
• In addition to daily transactional records from 1997-1998, our data set includes
information about products, customers, stores, and regions
• All files are available for download in the course resources section of your course
dashboard (Course Dashboard > Course Content > All Resources)
• Transactions: transaction_date, stock_date, product_id, customer_id, store_id, quantity
• Returns: return_date, product_id, store_id, quantity
• Customer Lookup: customer_id, customer_acct_num, first_name, last_name, customer_address, etc.
• Calendar Lookup: date, month_num, quarter, year, weekday_num, etc.
• Product Lookup: product_id, product_brand, product_name, product_sku, product_retail_price, etc.
• Store Lookup: store_id, region_id, store_type, store_name, store_street_address, etc.
• Region Lookup: region_id, sales_district, sales_region
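To make the structure concrete, here is a minimal Python sketch (not part of the course files) of how a data table such as Transactions connects to a lookup table through a shared key, in this case product_id. The sample rows, brand names, and prices are invented for illustration; only the column names come from the FoodMart tables.

```python
# Hypothetical sample rows; only the column names come from the FoodMart tables.
transactions = [
    {"transaction_date": "1997-01-01", "product_id": 1, "store_id": 10, "quantity": 3},
    {"transaction_date": "1997-01-02", "product_id": 2, "store_id": 10, "quantity": 1},
    {"transaction_date": "1997-01-02", "product_id": 1, "store_id": 11, "quantity": 2},
]

product_lookup = [
    {"product_id": 1, "product_brand": "BrandA", "product_retail_price": 2.50},
    {"product_id": 2, "product_brand": "BrandB", "product_retail_price": 4.00},
]

# Index the lookup table by its key so each transaction can find its product.
products_by_id = {row["product_id"]: row for row in product_lookup}


def enrich(transaction):
    """Return a copy of the transaction with product attributes attached."""
    product = products_by_id[transaction["product_id"]]
    return {
        **transaction,
        "product_brand": product["product_brand"],
        "revenue": transaction["quantity"] * product["product_retail_price"],
    }


enriched = [enrich(t) for t in transactions]
total_revenue = sum(row["revenue"] for row in enriched)
```

Each lookup table holds one row per key, while the data tables can repeat that key many times. This is the same one-to-many pattern that table relationships rely on.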
This course is designed to get you up & running with Excel’s BI tools
• The goal is to provide a solid foundational understanding of Power Query, Power Pivot and DAX; we may
simplify some concepts to make them easier to grasp, and will not cover some of the more advanced tools
LET’S DO THIS.
INTRO TO “POWER EXCEL”
THE “POWER EXCEL” WORKFLOW
These are Excel’s Business Intelligence tools, all of which are available directly in Excel
(provided you have a compatible version); no additional software is required!
• RAW DATA: Flat files (csv, txt), Excel tables, databases (SQL, Azure), folders, streaming sources, web data, etc.
• POWER QUERY (aka “Get & Transform”): Connect to sources, import data, and apply shaping and transformation tools (ETL)
• DATA MODEL: Create table relationships, add calculated columns, define hierarchies and perspectives, etc.
• POWER PIVOT & DAX: Explore and analyze the entire data model, and create powerful measures using Data Analysis Expressions (DAX)
“THE BEST THING TO HAPPEN TO EXCEL IN 20 YEARS”
Use Power Query and Power Pivot when you want to…
Analyze more data than can fit into a worksheet
Data connector types: From File, From Database, From Azure, From Online Services, From Other Sources
THE QUERY EDITOR
• Query Editing Tools
• Formula Bar (this is “M” code)
• Name your table!
• Data Preview
• Applied Steps
Access the Query Editor by creating a new query and choosing the “Edit” option, or by launching
the Workbook Queries pane (Data > Show Queries) and right-clicking an existing query to edit
QUERY EDITOR TOOLS
The HOME tab includes general settings and common table transformation tools
The TRANSFORM tab includes tools to modify existing columns (splitting/grouping, transposing, extracting text, etc.)
The ADD COLUMN tools create new columns based on conditional rules, text operations, calculations, dates, etc.
DATA LOADING OPTIONS
When you load data from Power Query, you have several options:
• Table
• Stores the data in a new or existing worksheet
• Requires relatively small data sets (<1 million rows)
• Connection Only
• Saves the data connection settings and applied steps
• Data does not load to a worksheet
Date & Time tools are relatively straightforward, and include the following options:
• Age: Difference between the current time and the date in each row
• Date Only: Removes the time component of a date/time field
• Year/Month/Quarter/Week/Day: Extracts individual components from a date field
(Time-specific options include Hour, Minute, Second, etc.)
• Earliest/Latest: Evaluates the earliest or latest date from a column as a single value (can
only be accessed from the “Transform” menu)
Note: You will almost always want to perform these operations from the “Add Column” menu to
build out new fields, rather than transforming an individual date/time column
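The logic behind these options can be sketched in plain Python. This is an illustration of what each tool computes, not Power Query’s own “M” code; the sample values, the fixed “current time”, and the field names are assumptions made for the example, and Age is simplified to whole days.

```python
from datetime import datetime

# Hypothetical date/time values standing in for a column.
timestamps = [
    datetime(1997, 3, 15, 14, 30),
    datetime(1998, 11, 2, 9, 5),
    datetime(1997, 7, 4, 18, 45),
]


def date_fields(ts, now):
    """Derive new fields from one date/time value, leaving the original intact."""
    return {
        "original": ts,
        "age_days": (now - ts).days,         # Age: current time minus the row's date
        "date_only": ts.date(),              # Date Only: drop the time component
        "year": ts.year,
        "month": ts.month,
        "quarter": (ts.month - 1) // 3 + 1,  # months 1-3 -> 1, 4-6 -> 2, and so on
        "day": ts.day,
        "hour": ts.hour,
    }


# A fixed "current time" keeps the example reproducible.
now = datetime(1999, 1, 1)
rows = [date_fields(ts, now) for ts in timestamps]

# Earliest/Latest collapse the whole column to a single value.
earliest = min(timestamps)
latest = max(timestamps)
```

Note that date_fields keeps the original value and adds fields next to it, which mirrors the “Add Column” approach, whereas earliest and latest return one value for the entire column.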
PRO TIP:
Load up a table containing a single date column and use Date tools to build out an entire calendar table
CREATING A BASIC CALENDAR TABLE
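As a rough illustration of the idea, the Python sketch below starts from nothing but a run of dates and derives the remaining calendar fields from each one. It is not the Power Query procedure itself; the function name build_calendar and the Monday=1 weekday numbering are choices made for this example, while the column names follow the Calendar Lookup table.

```python
from datetime import date, timedelta


def build_calendar(start, end):
    """Return one row per day from start to end (inclusive) with derived fields."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date": current,
            "year": current.year,
            "month_num": current.month,
            "quarter": (current.month - 1) // 3 + 1,
            # isoweekday(): Monday=1 ... Sunday=7 (a numbering choice for this sketch)
            "weekday_num": current.isoweekday(),
        })
        current += timedelta(days=1)
    return rows


# The FoodMart transactions span 1997-1998, so the calendar covers both years.
calendar = build_calendar(date(1997, 1, 1), date(1998, 12, 31))
```

Covering every day in the range, including days with no transactions, keeps the table contiguous, so each transaction date has a matching calendar row.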
Give your queries clear and intuitive names, before loading the data
• Define names immediately; updating query & table names later can be a headache,
especially if you’ve already referenced them in calculated measures
• Don’t use spaces in table names (otherwise you have to surround them with single quotes)
When working with large tables, only load the data you need
• Don’t include hourly data when you only need daily, or product-level transactions when
you only care about store-level performance; extra data will only slow you down
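One way to picture this is to summarize the data to the level you need before loading it. The Python sketch below collapses product-level rows into store-level totals; the sample rows and the function name summarize_by_store are hypothetical, and in Power Query a grouping step would play this role.

```python
from collections import defaultdict

# Hypothetical product-level transaction rows.
transactions = [
    {"store_id": 10, "product_id": 1, "quantity": 3},
    {"store_id": 10, "product_id": 2, "quantity": 1},
    {"store_id": 11, "product_id": 1, "quantity": 2},
    {"store_id": 11, "product_id": 3, "quantity": 5},
]


def summarize_by_store(rows):
    """Collapse product-level rows into one total-quantity row per store."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["store_id"]] += row["quantity"]
    return [{"store_id": s, "quantity": q} for s, q in sorted(totals.items())]


store_level = summarize_by_store(transactions)
```

Here four rows shrink to two; on a real table the reduction is far larger, which is where the performance benefit comes from.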
THANK YOU!