DATA WAREHOUSE
• Data warehouses are central repositories of
integrated data from one or more disparate
sources.
• They store current and historical data in a single
place, where it is used for creating reports.
• This is beneficial for companies as it enables them
to interrogate and draw insights from their data and
make decisions.
DATA MART
• A data mart is a simple form of data warehouse
focused on a single subject or line of business.
• A data mart is a data storage system that contains
information specific to an organization's business
unit.
• It contains a small and selected part of the data
that the company stores in a larger storage
system.
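As a minimal sketch of the idea, assuming the larger storage system is just a list of rows, a data mart for one business unit is the selected subset of those rows. The sample rows and the `build_data_mart` helper are hypothetical and only illustrate the relationship.

```python
# Hypothetical warehouse rows: (business_unit, subject, amount).
warehouse = [
    ("sales", "Car", 120),
    ("finance", "Invoice", 75),
    ("sales", "Bus", 40),
    ("hr", "Payroll", 300),
]


def build_data_mart(rows, business_unit):
    """Return only the rows that belong to one business unit."""
    return [row for row in rows if row[0] == business_unit]


# The sales data mart holds a small, selected part of the warehouse.
sales_mart = build_data_mart(warehouse, "sales")
print(sales_mart)
```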
Data Mart Vs Data Warehouse
OLTP
• OLTP, or Online Transaction Processing, is a type of
data processing that consists of executing a
number of transactions occurring concurrently,
for example in online banking, shopping, order
entry, or sending text messages.
• Examples of systems that use OLTP include ATMs,
financial transaction systems, and online banking
applications, as well as online booking, ticketing,
and reservation systems.
Characteristics of OLTP systems
• Process a large number of relatively simple
transactions: insertions, updates, and
deletions to data, as well as simple data
queries (for example, a balance check at an
ATM).
• Enable multi-user access to the same data,
while ensuring data integrity
• Emphasize very rapid processing, with
response times measured in milliseconds
• Provide indexed data sets: These are used for
rapid searching, retrieval, and querying.
• Are available 24/7/365
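The characteristics above can be sketched with Python's built-in `sqlite3` module. This is only an illustration, not a production OLTP system: the `accounts` table, its index, and the `transfer` and `balance` helpers are assumptions made for the example. It shows a simple update transaction that preserves data integrity, an index for fast lookup, and a simple balance query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint protects integrity: a balance can never go negative.
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, owner TEXT, "
    "balance INTEGER CHECK (balance >= 0))"
)
# An index supports rapid searching by owner name.
conn.execute("CREATE INDEX idx_accounts_owner ON accounts (owner)")
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "alice", 100), (2, "bob", 50)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two accounts atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balance(conn, account_id):
    """Simple query, comparable to a balance check at an ATM."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]


ok = transfer(conn, 1, 2, 30)        # succeeds
rejected = transfer(conn, 2, 1, 500)  # would overdraw, so it is rolled back
print(ok, rejected, balance(conn, 1), balance(conn, 2))
```

The rejected transfer leaves both balances unchanged, which is the integrity guarantee the list above describes.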
OLAP
• OLAP (online analytical processing) is a computing
method that enables users to easily and selectively extract
and query data in order to analyze it from different points
of view.
• Online analytical processing (OLAP) is software technology
you can use to analyze business data from different points of
view.
• OLAP queries often aid in trend analysis, financial reporting,
sales forecasting, budgeting, and other planning purposes.
Why is OLAP important?
• Faster decision making
• Non-technical user support
• Integrated data view
How does OLAP work?
• An online analytical processing (OLAP) system works
by collecting, organizing, aggregating, and analyzing
data using the following steps:
– The OLAP server collects data from multiple data sources,
including relational databases and data warehouses.
– Then, the extract, transform, and load (ETL) tools clean,
aggregate, precalculate, and store data in an OLAP cube
according to the number of dimensions specified.
– Business analysts use OLAP tools to query and generate
reports from the multidimensional data in the OLAP cube.
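The three steps above can be sketched in plain Python. This is a simplified illustration under stated assumptions: the two source lists, the `extract`, `transform`, and `load` helpers, and the three dimensions (location, quarter, item) are all hypothetical, and a real OLAP server does far more.

```python
from collections import defaultdict

# Hypothetical raw rows from two sources: (location, quarter, item, units).
source_a = [("Delhi", "Q1", "Car", 5), ("Delhi", "Q1", "Car", 3)]
source_b = [("Kolkata", "Q2", "Bus", 4), (" delhi ", "Q2", "Car", 2)]


def extract(*sources):
    """Collect rows from every data source."""
    for source in sources:
        yield from source


def transform(rows):
    """Clean the location text so each city has a single spelling."""
    for location, quarter, item, units in rows:
        yield location.strip().title(), quarter, item, units


def load(rows):
    """Aggregate units into a cube keyed by (location, quarter, item)."""
    cube = defaultdict(int)
    for location, quarter, item, units in rows:
        cube[(location, quarter, item)] += units
    return dict(cube)


cube = load(transform(extract(source_a, source_b)))

# An analyst query against the precalculated cube.
print(cube[("Delhi", "Q1", "Car")])
```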
OLAP OPERATIONS/FUNCTIONS
• There are five basic analytical operations that
can be performed on an OLAP cube:
– Roll-up
– Drill-down
– Slice
– Dice
– Pivot (rotate)
• Drill-down: In the drill-down operation, less detailed data is
converted into more detailed data. It can be done by:
– Moving down in the concept hierarchy
– Adding a new dimension
• In the cube given, the drill-down operation is performed by
moving down in the concept hierarchy of the Time dimension
(Quarter -> Month).
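A minimal sketch of drill-down along the Time hierarchy (Quarter -> Month), assuming the detailed monthly facts are still available. The sample figures, the `MONTH_TO_QUARTER` mapping, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical detailed facts: (month, units_sold).
facts = [("Jan", 10), ("Feb", 20), ("Mar", 30),
         ("Apr", 15), ("May", 25), ("Jun", 5)]

# Concept hierarchy for the Time dimension: Month -> Quarter.
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1",
                    "Apr": "Q2", "May": "Q2", "Jun": "Q2"}


def by_quarter(facts):
    """Less detailed view: one total per quarter."""
    totals = defaultdict(int)
    for month, units in facts:
        totals[MONTH_TO_QUARTER[month]] += units
    return dict(totals)


def drill_down(facts, quarter):
    """More detailed view: break one quarter into its months."""
    totals = defaultdict(int)
    for month, units in facts:
        if MONTH_TO_QUARTER[month] == quarter:
            totals[month] += units
    return dict(totals)


quarterly = by_quarter(facts)
q1_months = drill_down(facts, "Q1")
print(quarterly, q1_months)
```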
• Roll-up: It is the opposite of the drill-down operation. It
performs aggregation on the OLAP cube. It can be done
by:
– Climbing up in the concept hierarchy
– Reducing the dimensions
• In the cube given, the roll-up operation is performed by
climbing up in the concept hierarchy
of the Location dimension (City -> Country).
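A minimal sketch of roll-up along the Location hierarchy (City -> Country). The cell values, the `CITY_TO_COUNTRY` mapping, and the `roll_up_location` helper are hypothetical and stand in for the cube shown on the slide.

```python
from collections import defaultdict

# Hypothetical cube cells: (city, quarter) -> units sold.
city_cube = {
    ("Delhi", "Q1"): 10,
    ("Kolkata", "Q1"): 20,
    ("Toronto", "Q1"): 5,
    ("Delhi", "Q2"): 7,
    ("Toronto", "Q2"): 8,
}

# Concept hierarchy for the Location dimension: City -> Country.
CITY_TO_COUNTRY = {"Delhi": "India", "Kolkata": "India", "Toronto": "Canada"}


def roll_up_location(cube):
    """Aggregate city-level cells into country-level cells."""
    rolled = defaultdict(int)
    for (city, quarter), units in cube.items():
        rolled[(CITY_TO_COUNTRY[city], quarter)] += units
    return dict(rolled)


country_cube = roll_up_location(city_cube)
print(country_cube)
```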
• Dice: It selects a sub-cube from the OLAP cube
by selecting values on two or more dimensions.
• Here, a sub-cube is selected by applying the
following criteria to three dimensions:
– Location = “Delhi” or “Kolkata”
– Time = “Q1” or “Q2”
– Item = “Car” or “Bus”
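The dice criteria above can be sketched as a filter over cube cells. The cell values and the `dice` helper are hypothetical; only the three criteria come from the slide.

```python
# Hypothetical cube cells: (location, quarter, item) -> units sold.
cube = {
    ("Delhi", "Q1", "Car"): 5,
    ("Delhi", "Q3", "Car"): 9,
    ("Kolkata", "Q2", "Bus"): 4,
    ("Mumbai", "Q1", "Car"): 6,
    ("Kolkata", "Q1", "Phone"): 2,
}


def dice(cube, locations, quarters, items):
    """Keep only the cells whose value on every dimension is allowed."""
    return {
        key: units
        for key, units in cube.items()
        if key[0] in locations and key[1] in quarters and key[2] in items
    }


sub_cube = dice(
    cube,
    locations={"Delhi", "Kolkata"},
    quarters={"Q1", "Q2"},
    items={"Car", "Bus"},
)
print(sub_cube)
```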
• Slice: It fixes a single dimension of the OLAP cube
at one value, which results in a new sub-cube.
• Here, Slice is performed on the dimension
Time = “Q1”.
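A minimal sketch of the slice Time = "Q1", assuming a cube stored as a dictionary of cells. The cell values and the `slice_time` helper are hypothetical. Because the Time dimension is fixed, it is dropped from the keys of the resulting sub-cube.

```python
# Hypothetical cube cells: (location, quarter, item) -> units sold.
cube = {
    ("Delhi", "Q1", "Car"): 5,
    ("Delhi", "Q2", "Car"): 9,
    ("Kolkata", "Q1", "Bus"): 4,
}


def slice_time(cube, quarter):
    """Fix the Time dimension at one value and drop it from the keys."""
    return {
        (location, item): units
        for (location, q, item), units in cube.items()
        if q == quarter
    }


q1_slice = slice_time(cube, "Q1")
print(q1_slice)
```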
• Pivot: It is also known as rotation operation as it
rotates the current view to get a new view of the
representation.
• In the sub-cube obtained after the slice operation,
performing pivot operation gives a new view of it.
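A minimal sketch of pivot on a two-dimensional view, assuming the view is stored as nested dictionaries with locations as rows and items as columns. The values and the `pivot` helper are hypothetical. The data is unchanged; only the axes are swapped.

```python
# Hypothetical 2-D view: rows are locations, columns are items.
view = {
    "Delhi": {"Car": 5, "Bus": 1},
    "Kolkata": {"Car": 2, "Bus": 4},
}


def pivot(view):
    """Rotate the view so that columns become rows and rows become columns."""
    rotated = {}
    for row_key, columns in view.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated


rotated = pivot(view)
print(rotated)
```

Pivoting twice returns the original view, which is a quick way to check that no values were lost.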
OLAP MULTIDIMENSIONAL MODELS
1. Data Cube
2. Star
3. Snowflake
4. Fact constellation
1. DATA CUBE
• Data cube is a structure that enables OLAP to achieve the
multidimensional functionality.
• The data cube is used to represent data along some measure
of interest.
• Data Cubes are an easy way to look at the data (allow us to
look at complex data in a simple format).
• Although called a "cube", it can be 2-dimensional, 3-
dimensional, or higher dimensional.
• Data cube design is for efficiency in data retrieval.
• The cube is comparable to a table in a relational database.
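One simple way to picture a data cube, assuming three dimensions and a single measure (units sold), is a dictionary whose keys are dimension values. The `DIMENSIONS` tuple, the cell values, and the `measure` helper are hypothetical.

```python
# Names of the cube's dimensions, in key order.
DIMENSIONS = ("location", "quarter", "item")

# Hypothetical cells: (location, quarter, item) -> units sold (the measure).
cube = {
    ("Delhi", "Q1", "Car"): 5,
    ("Delhi", "Q2", "Car"): 9,
    ("Kolkata", "Q1", "Bus"): 4,
    ("Kolkata", "Q2", "Car"): 2,
}


def measure(cube, **fixed):
    """Sum the measure over all cells that match the fixed dimension values."""
    total = 0
    for key, units in cube.items():
        cell = dict(zip(DIMENSIONS, key))
        if all(cell[name] == value for name, value in fixed.items()):
            total += units
    return total


print(measure(cube))                            # all cells
print(measure(cube, location="Delhi"))          # one dimension fixed
print(measure(cube, quarter="Q1", item="Bus"))  # two dimensions fixed
```

Adding or removing an entry in `DIMENSIONS` (and in the keys) gives a cube with more or fewer dimensions, which matches the point that a "cube" need not be three-dimensional.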
2. STAR SCHEMA
• A star schema is the elementary form of a dimensional
model, in which data are organized
into facts and dimensions.
• In a data warehouse star schema, the center of the
star is a single fact table, surrounded by a number of
associated dimension tables.
• It is known as star schema as its structure resembles a star.
• The Star Schema data model is the simplest type of Data
Warehouse schema.
• It is also known as Star Join Schema and is optimized for
querying large data sets.
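A minimal sketch of a star schema using Python's built-in `sqlite3` module. The table names (`fact_sales`, `dim_location`, `dim_item`), their columns, and the sample rows are assumptions made for illustration. One fact table sits at the center and references each dimension table directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_location (
    location_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE dim_item (
    item_id INTEGER PRIMARY KEY, name TEXT);
-- The fact table holds the measure plus one key per dimension.
CREATE TABLE fact_sales (
    location_id INTEGER REFERENCES dim_location(location_id),
    item_id INTEGER REFERENCES dim_item(item_id),
    units INTEGER);
INSERT INTO dim_location VALUES (1, 'Delhi', 'India'), (2, 'Kolkata', 'India');
INSERT INTO dim_item VALUES (1, 'Car'), (2, 'Bus');
INSERT INTO fact_sales VALUES (1, 1, 5), (1, 2, 1), (2, 1, 2), (2, 2, 4);
""")

# A typical star join: one hop from the fact table to a dimension table.
rows = conn.execute("""
SELECT l.city, SUM(f.units)
FROM fact_sales AS f
JOIN dim_location AS l ON l.location_id = f.location_id
GROUP BY l.city
ORDER BY l.city
""").fetchall()
print(rows)
```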
3. SNOWFLAKE SCHEMA
• The snowflake schema is an expansion of the
star schema where each point of the star
explodes into more points.
• It is called snowflake schema because the
diagram of snowflake schema resembles a
snowflake.
• The dimension tables are normalized, which
splits their data into additional tables.
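A minimal sketch of a snowflake schema using Python's built-in `sqlite3` module. The table names, columns, and rows are hypothetical. The location dimension is normalized into a city table and a country table, so a query by country needs one extra join compared with a single flat dimension table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The location dimension is split into two normalized tables.
CREATE TABLE dim_country (
    country_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dim_city (
    city_id INTEGER PRIMARY KEY, name TEXT,
    country_id INTEGER REFERENCES dim_country(country_id));
CREATE TABLE fact_sales (
    city_id INTEGER REFERENCES dim_city(city_id),
    units INTEGER);
INSERT INTO dim_country VALUES (1, 'India'), (2, 'Canada');
INSERT INTO dim_city VALUES (1, 'Delhi', 1), (2, 'Kolkata', 1), (3, 'Toronto', 2);
INSERT INTO fact_sales VALUES (1, 5), (2, 4), (3, 7);
""")

# Fact table -> city -> country: two hops instead of one.
rows = conn.execute("""
SELECT co.name, SUM(f.units)
FROM fact_sales AS f
JOIN dim_city AS ci ON ci.city_id = f.city_id
JOIN dim_country AS co ON co.country_id = ci.country_id
GROUP BY co.name
ORDER BY co.name
""").fetchall()
print(rows)
```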
4. FACT CONSTELLATION
• A Fact constellation means two or more fact tables
sharing one or more dimensions. It is also called
Galaxy schema.
• A fact constellation schema describes the logical
structure of a data warehouse or data mart.
• A fact constellation schema can be built from
aggregate fact tables, or by decomposing a complex fact
table into independent, simpler fact tables.
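A minimal sketch of a fact constellation using Python's built-in `sqlite3` module. The two fact tables (`fact_sales`, `fact_shipments`), the shared `dim_item` dimension, and the sample rows are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One dimension table shared by two fact tables.
CREATE TABLE dim_item (
    item_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE fact_sales (
    item_id INTEGER REFERENCES dim_item(item_id), units INTEGER);
CREATE TABLE fact_shipments (
    item_id INTEGER REFERENCES dim_item(item_id), units INTEGER);
INSERT INTO dim_item VALUES (1, 'Car'), (2, 'Bus');
INSERT INTO fact_sales VALUES (1, 5), (1, 3), (2, 4);
INSERT INTO fact_shipments VALUES (1, 6), (2, 4);
""")

# The shared dimension lets both fact tables be compared item by item.
rows = conn.execute("""
SELECT i.name,
       (SELECT SUM(s.units) FROM fact_sales AS s
         WHERE s.item_id = i.item_id) AS sold,
       (SELECT SUM(sh.units) FROM fact_shipments AS sh
         WHERE sh.item_id = i.item_id) AS shipped
FROM dim_item AS i
ORDER BY i.name
""").fetchall()
print(rows)
```

Each fact table is aggregated in its own subquery, which avoids the double counting that a direct join between the two fact tables would cause.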