Importing-Data in Excel
Importing data
Importing data from various databases, CRM systems, websites, and other sources is a huge
part of what many people do in Excel. Here you will find our tips on how to go about
importing data into Excel.
Import dates and numbers from text files
    This section expands on best practices as related to importing dates and numbers.
    Tip: Convert data to dates; those that align right are seen as dates by Excel.

Set up imports properly (Power Query & Text Import)
    Here we elaborate on setting up imports in Power Query and the old school Text Import Wizard.
    Tip: Check your data before you set up your query to make sure your formats match the source data.

Importing from other sources
    The final section is a quick dive into different databases and sources which can be called up by a query.
    Tip: With any major database system, filter out the information you need in the system before going to Excel.
Dates
For any values Excel has successfully converted to dates, the filter drop-down will
automatically group them in periods like years and months. So if it doesn’t do that, beware!
Worked out example:
If you handle the transformation of those textual dates wrongly, Excel creates this mess:
As you can see, column D does contain a calculated month for some of the dates, but those
months are all wrong! Column F does contain true dates (all right-aligned and no left-aligned
text values left) and all of Column G returns the correct months, indicating the
transformation we did to get column F was indeed the right one.
So don’t let seemingly correctly converted dates fool you: check and double-check.
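The same pitfall can be reproduced outside Excel. The following is a minimal Python sketch, not a description of how Excel works internally. It assumes hypothetical sample values written day/month/year and compares a month-first reading with a day-first reading. The helper name try_parse and the sample data are made up for illustration.

```python
from datetime import datetime

# Hypothetical sample: dates written day/month/year in the source file.
raw_dates = ["03/04/2021", "11/12/2021", "25/12/2021"]


def try_parse(text, fmt):
    """Return a date if the text matches the format, else None (it stays text)."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


# Wrong assumption: month first. Ambiguous values convert, but to the wrong month.
month_first = [try_parse(d, "%m/%d/%Y") for d in raw_dates]
# Right assumption: day first. Every value converts, with the intended month.
day_first = [try_parse(d, "%d/%m/%Y") for d in raw_dates]

for raw, wrong, right in zip(raw_dates, month_first, day_first):
    wrong_month = wrong.month if wrong else "not converted"
    print(raw, "| month-first:", wrong_month, "| day-first:", right.month)
```

Under the wrong reading, the first two values do become dates, only with the wrong month, while the third stays unconverted. A column that is partly converted is therefore a warning sign, not a partial success.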
Numbers
As simple as it may seem, even importing numbers has challenges. For example, you may
be using different decimal and thousand separators than your source data carries. This is
especially true for information downloaded from websites or from browser-based
applications like SAP. Especially if you’re in an international organization, formatting of
numbers in text-files can vary depending on their source and upon local (browser) settings.
Let’s assume our own settings use a period as the decimal separator and a comma as the
thousands grouping symbol, while the source data uses a comma as the decimal separator.
Then this bit of data can be a challenge to convert:
Note how all the numbers which happened to have three decimals are off by a factor of one
thousand! Excel assumed the comma wasn’t the decimal separator, but the thousands
grouping symbol.
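To make the factor-of-one-thousand effect concrete, here is a small Python sketch. It is a simplified model, not Excel's actual conversion logic: the hypothetical function parse_number accepts a comma or period as grouping symbol only when it is followed by exactly three digits, and returns None otherwise (the value would remain text).

```python
import re


def parse_number(text, decimal_sep, thousands_sep):
    """Parse text as a float; return None when the separators do not fit."""
    dec = re.escape(decimal_sep)
    grp = re.escape(thousands_sep)
    # Digits in groups of three after each grouping symbol, optional decimals.
    grouped = r"^-?\d{1,3}(?:" + grp + r"\d{3})*(?:" + dec + r"\d+)?$"
    # Plain digits without grouping, optional decimals.
    plain = r"^-?\d+(?:" + dec + r"\d+)?$"
    if not (re.match(grouped, text) or re.match(plain, text)):
        return None
    # Drop grouping symbols first, then normalise the decimal separator.
    cleaned = text.replace(thousands_sep, "").replace(decimal_sep, ".")
    return float(cleaned)


# Hypothetical source values written with a comma as the decimal separator.
source_values = ["12,5", "3,75", "1,234"]

for value in source_values:
    wrong = parse_number(value, decimal_sep=".", thousands_sep=",")
    right = parse_number(value, decimal_sep=",", thousands_sep=".")
    print(value, "| comma as grouping:", wrong, "| comma as decimal:", right)
```

Only the value with three decimals passes the wrong reading, and it comes out one thousand times too large, which matches the pattern described above.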
These problems are very common when importing data from text files, typically files with
a .csv or .txt extension. For Excel to convert dates and numbers properly, you have to use
the right settings.
PowerQuery
To use PowerQuery to import text files, and we highly recommend you do so, you must click
the New Query button (in newer Excel versions this button has been renamed to Get Data).
Before you do so, it is best if you check out your raw text file, using Notepad for instance.
PowerQuery expects data in text files to have the same formatting as the country (Region)
you selected in your Windows settings:
Now your task is to select the correct Locale. It isn’t very easy to find the one which happens
to correspond with your combination, but luckily, someone sorted this out and there is a list
available here: Listing Windows Language Code Identifiers And Their Associated Date And
Number Formats With M In Power BI/Power Query
PowerQuery also allows you to set the locale during the import itself.
In the Query editor pane, right-click the column in question and choose “Change Type…”,
then “Using Locale…”. This lets you set both the data type and the locale to use during the
conversion to that data type.
In the next window you simply browse to your file and open it:
Rather than clicking Load, the best next step is to click Transform Data. This will
open up the PowerQuery window, with a preview of your current data. It is important to know
that it is here that you should be filtering out (or simply deleting) any content you do not
need for your analysis.
Also important: Make sure each column is showing the correct data type as indicated by the
small icons next to the column headings:
PowerQuery will try to convert the dates according to the locale you chose. If your result
contains a lot of errors, double-check the locale you chose; it was probably one with a
different date syntax or order than the one your data is written in.
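The idea that many conversion errors point to a wrong locale can be sketched in Python. This is an illustration only, not what PowerQuery does internally. The column values, the candidate format names, and the helper count_errors are all hypothetical.

```python
from datetime import datetime


def count_errors(values, fmt):
    """Count how many values fail to parse with the given date format."""
    errors = 0
    for value in values:
        try:
            datetime.strptime(value, fmt)
        except ValueError:
            errors += 1
    return errors


# Hypothetical column of text dates, written day.month.year.
column = ["31.01.2022", "15.02.2022", "03.04.2022", "28.12.2022"]

# Two candidate readings of the same text, standing in for two locales.
candidates = {
    "day.month.year": "%d.%m.%Y",
    "month.day.year": "%m.%d.%Y",
}

error_counts = {name: count_errors(column, fmt) for name, fmt in candidates.items()}
best = min(error_counts, key=error_counts.get)
print(error_counts, "-> fewest errors:", best)
```

A low error count is necessary but not sufficient: a value such as 03.04.2022 parses under both readings, so the resulting months still need a visual check.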
You can click each column individually and set the data format. Note the drop-down button
next to Date, it allows you to choose the order of the elements of a date. Also, note the
“Advanced…” button. This button allows you to set the decimal and thousands separators:
Another good reason to contact IT when it comes to these large databases is that the data
you need might not be easy to find. Corporate databases can contain hundreds of tables
and it is very likely that your data needs to come from a combination of them, using the right
relationships. IT can usually predefine all of this for you.
Also, in this phase it is very important to think about which data you need precisely and in
what format. For instance, if your goal is to simply report a summary of sales per
department, it makes no sense to have the query return all individual entries of items sold.
You need an aggregated dataset with just the information you wish to be able to report (and
filter) on. If there are 10 departments, all you need is 10 rows of data per period your report
is supposed to display.
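As a rough illustration of the shape of such an aggregated dataset, the Python sketch below collapses individual sales entries into one row per period and department. The sample rows and column meanings are hypothetical. In practice this aggregation would ideally happen in the database query itself, not after the import.

```python
from collections import defaultdict

# Hypothetical detail rows: (period, department, amount).
sales = [
    ("2023-01", "Toys", 120.0),
    ("2023-01", "Toys", 80.0),
    ("2023-01", "Books", 45.5),
    ("2023-02", "Toys", 60.0),
    ("2023-02", "Books", 30.0),
    ("2023-02", "Books", 20.0),
]

# Sum the amounts per (period, department) combination.
totals = defaultdict(float)
for period, department, amount in sales:
    totals[(period, department)] += amount

# One summary row per department per period, sorted for readability.
summary = sorted((p, d, t) for (p, d), t in totals.items())

for row in summary:
    print(row)
```

Six detail rows become four summary rows here: two departments times two periods, which is all the report needs.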
While you’re at it, try to define which data type you expect to get for each column (Field) of
your data. Getting the correct data type from the query will save you a lot of hassle later on.
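One way to write such expectations down is as a small mapping from column to type, which can then be checked against incoming rows. The Python sketch below assumes hypothetical column names (order_date, department, amount) and a made-up helper type_problems. It is only meant to show the idea.

```python
from datetime import date

# Hypothetical expectation: which type each column should arrive as.
expected_types = {"order_date": date, "department": str, "amount": float}


def type_problems(row, expected):
    """List the columns whose value is not of the expected type."""
    return [
        column
        for column, expected_type in expected.items()
        if not isinstance(row.get(column), expected_type)
    ]


good_row = {"order_date": date(2023, 1, 31), "department": "Toys", "amount": 120.0}
# Dates and numbers that arrived as text instead of real dates and numbers.
bad_row = {"order_date": "31/01/2023", "department": "Toys", "amount": "120,0"}

print(type_problems(good_row, expected_types))
print(type_problems(bad_row, expected_types))
```

The second row is flagged on order_date and amount, the two columns that came through as text.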
All that being said, getting database data is fairly simple. Click New Query (Get Data), From
Database and choose the database type:
Example: PowerQuery opens a File browse window from which you need to select the
correct Access database file:
When you’ve chosen the correct tables, click the Transform Data button to proceed.
Note that setting up a direct connection to tables in the database isn’t recommended: a
table may contain hundreds of millions of records! It is better to first set up a query (or View
or Stored procedure; the name depends on the database brand) in the database application
itself, so that the data is filtered there and only the data you actually need for the report
reaches Excel. Setting up such a query is often a step that must be done by the Database
Administrator or owner.
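The principle of filtering in the database first can be sketched with Python's built-in sqlite3 module. This is a stand-in for whatever database system is actually in use. The table sales, the view sales_2023_by_department, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory database standing in for a large corporate database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, department TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("2022-12-30", "Toys", 10.0),
        ("2023-01-05", "Toys", 120.0),
        ("2023-01-09", "Books", 45.5),
        ("2023-02-11", "Toys", 60.0),
    ],
)

# The view filters and aggregates inside the database, so the client
# (Excel, in the scenario above) only ever sees the rows it needs.
conn.execute(
    """
    CREATE VIEW sales_2023_by_department AS
    SELECT department, SUM(amount) AS total
    FROM sales
    WHERE sale_date >= '2023-01-01' AND sale_date < '2024-01-01'
    GROUP BY department
    """
)

report_rows = conn.execute(
    "SELECT department, total FROM sales_2023_by_department ORDER BY department"
).fetchall()
conn.close()

print(report_rows)
```

The import would then point at the view and not at the sales table, so the row count stays small no matter how large the underlying table grows.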