
Mod1 Data Warehouse

Data warehouses are centralized repositories for integrated data from various sources, facilitating reporting and decision-making. Data marts are simplified data warehouses focused on specific business units, while OLTP systems handle concurrent transactions, and OLAP systems enable complex data analysis. OLAP operations include roll-up, drill-down, slice, dice, and pivot, with various multidimensional models like data cubes, star schemas, snowflake schemas, and fact constellations used for data organization.

Uploaded by

donmathew666666

DATA WAREHOUSE

• Data warehouses are central repositories of
integrated data from one or more disparate
sources.
• They store current and historical data in a single
place, where it is used for creating reports.
• This benefits companies because it lets them
interrogate their data, draw insights from it, and
make decisions.
DATA MART
• A data mart is a simple form of data warehouse
focused on a single subject or line of business.
• A data mart is a data storage system that contains
information specific to an organization's business
unit.
• It contains a small, selected part of the data
that the company stores in a larger storage
system.
Data Mart Vs Data Warehouse
OLTP
• OLTP, or Online Transaction Processing, is a type of
data processing that consists of executing a
number of transactions occurring concurrently,
for example in online banking, shopping, order
entry, or sending text messages.
• Examples of systems that use OLTP include ATMs,
financial transaction systems, and online banking
applications, as well as online booking, ticketing,
and reservation systems.
Characteristics of OLTP systems
• Process a large number of relatively simple
transactions: insertions, updates, and
deletions to data, as well as simple data
queries (for example, a balance check at an
ATM).
• Enable multi-user access to the same data,
while ensuring data integrity
• Emphasize very rapid processing, with
response times measured in milliseconds
• Provide indexed data sets: These are used for
rapid searching, retrieval, and querying.
• Are available 24/7/365
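The characteristics above (simple transactions, shared data, integrity) can be illustrated with a minimal sketch using Python's built-in sqlite3 module. The accounts table, its columns, and the transfer and balance helpers are hypothetical names chosen for illustration; a production OLTP system would typically run on a multi-user database server rather than an in-memory database.

```python
import sqlite3

# Hypothetical in-memory bank database; table and column names are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; nothing was changed.
        return False


def balance(conn, account_id):
    """Simple data query, comparable to a balance check at an ATM."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE id = ?", (account_id,)
    ).fetchone()
    return row[0]
```

Calling transfer(conn, 1, 2, 200) moves the money and commits, while a transfer larger than the source balance is rolled back as a whole, so the data stays consistent.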
OLAP
• OLAP (online analytical processing) is a computing
method that enables users to easily and selectively extract
and query data in order to analyze it from different points
of view.
• OLAP is a software technology you can use to analyze
business data from different points of view.
• OLAP queries often aid in trend analysis, financial reporting,
sales forecasting, budgeting, and other planning purposes.
Why is OLAP important?
• Faster decision making
• Non-technical user support
• Integrated data view
How does OLAP work?

• An online analytical processing (OLAP) system works
by collecting, organizing, aggregating, and analyzing
data using the following steps:
– The OLAP server collects data from multiple data sources,
including relational databases and data warehouses.
– Then, the extract, transform, and load (ETL) tools clean,
aggregate, precalculate, and store data in an OLAP cube
according to the number of dimensions specified.
– Business analysts use OLAP tools to query and generate
reports from the multidimensional data in the OLAP cube.
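The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions: the two source lists, the field names (city, quarter, item, sales), and the extract, transform, and load functions are hypothetical, and the cube is modeled as a dictionary keyed by (location, time, item) rather than a real OLAP server structure.

```python
from collections import defaultdict

# Two hypothetical sources standing in for a relational database
# and a data warehouse extract.
source_a = [
    {"city": "Delhi", "quarter": "Q1", "item": "Car", "sales": 10},
    {"city": "delhi ", "quarter": "Q1", "item": "Car", "sales": 5},
]
source_b = [
    {"city": "Kolkata", "quarter": "Q2", "item": "Bus", "sales": 7},
    {"city": "Delhi", "quarter": "Q2", "item": "Bus", "sales": None},  # dirty row
]


def extract(*sources):
    """Step 1: collect rows from every source."""
    for source in sources:
        yield from source


def transform(rows):
    """Step 2a: clean rows and map each one to a cube cell."""
    for row in rows:
        if row["sales"] is None:
            continue  # drop rows with a missing measure
        city = row["city"].strip().title()  # normalize spelling
        yield (city, row["quarter"], row["item"]), row["sales"]


def load(cells):
    """Step 2b: precalculate the aggregate for each cell of the cube."""
    cube = defaultdict(int)
    for key, value in cells:
        cube[key] += value
    return dict(cube)


cube = load(transform(extract(source_a, source_b)))

# Step 3: an analyst queries the cube by its dimension values.
delhi_q1_cars = cube[("Delhi", "Q1", "Car")]
```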
OLAP OPERATIONS/FUNCTIONS
• There are five basic analytical operations that
can be performed on an OLAP cube:

– Roll-up
– Drill-down
– Slice
– Dice
– Pivot (rotate)
• Drill-down: In the drill-down operation, less detailed data is
converted into more detailed data. It can be done by:
– Moving down in the concept hierarchy
– Adding a new dimension
• In the cube given, the drill-down operation is performed by
moving down in the concept hierarchy of the Time dimension
(Quarter -> Month).
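A drill-down from Quarter to Month can be sketched as follows. The base facts, the month-to-quarter mapping, and the aggregate function are hypothetical; the sketch assumes the data is stored at month level, since drill-down can only reveal detail that the underlying data already holds.

```python
from collections import defaultdict

# Hypothetical base facts at the finest grain: (city, month, sales).
facts = [
    ("Delhi", "Jan", 4),
    ("Delhi", "Feb", 6),
    ("Delhi", "Mar", 5),
    ("Delhi", "Apr", 7),
]
# Concept hierarchy for the Time dimension (only the months used here).
MONTH_TO_QUARTER = {"Jan": "Q1", "Feb": "Q1", "Mar": "Q1", "Apr": "Q2"}


def aggregate(facts, time_level):
    """Sum sales per (city, time) at the 'quarter' or 'month' level."""
    totals = defaultdict(int)
    for city, month, sales in facts:
        time_key = MONTH_TO_QUARTER[month] if time_level == "quarter" else month
        totals[(city, time_key)] += sales
    return dict(totals)


by_quarter = aggregate(facts, "quarter")  # less detailed view
by_month = aggregate(facts, "month")      # drill-down: Quarter -> Month
```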
• Roll-up: This is the opposite of the drill-down operation. It
performs aggregation on the OLAP cube. It can be done
by:
– Climbing up in the concept hierarchy
– Reducing the number of dimensions
• In the cube given, the roll-up operation is performed by
climbing up in the concept hierarchy
of the Location dimension (City -> Country).
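A roll-up from City to Country can be sketched in a few lines. The cube contents, the city-to-country mapping (including Toronto, which is not in the original example), and the roll_up_location function are assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical 2-D cube keyed by (city, quarter) with a sales measure.
cube = {
    ("Delhi", "Q1"): 15,
    ("Kolkata", "Q1"): 9,
    ("Toronto", "Q1"): 12,
    ("Delhi", "Q2"): 7,
}
# Concept hierarchy for the Location dimension.
CITY_TO_COUNTRY = {"Delhi": "India", "Kolkata": "India", "Toronto": "Canada"}


def roll_up_location(cube):
    """Climb the Location hierarchy (City -> Country) and re-aggregate."""
    rolled = defaultdict(int)
    for (city, quarter), sales in cube.items():
        rolled[(CITY_TO_COUNTRY[city], quarter)] += sales
    return dict(rolled)


by_country = roll_up_location(cube)
```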
• Dice: It selects a sub-cube from the OLAP cube
by applying selection criteria on two or more dimensions.
• Here, a sub-cube is selected using the
following criteria on three dimensions:
– Location = “Delhi” or “Kolkata”
– Time = “Q1” or “Q2”
– Item = “Car” or “Bus”
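The dice with these three criteria can be sketched as a filter over cube cells. The cube contents and the dice function are hypothetical; only the selection criteria come from the example above.

```python
# Hypothetical 3-D cube keyed by (location, time, item).
cube = {
    ("Delhi", "Q1", "Car"): 10,
    ("Delhi", "Q3", "Car"): 4,
    ("Kolkata", "Q2", "Bus"): 7,
    ("Mumbai", "Q1", "Car"): 8,
    ("Delhi", "Q2", "Phone"): 3,
}


def dice(cube, locations, times, items):
    """Keep only cells whose value on every dimension is in the allowed set."""
    return {
        key: value
        for key, value in cube.items()
        if key[0] in locations and key[1] in times and key[2] in items
    }


sub_cube = dice(
    cube,
    locations={"Delhi", "Kolkata"},
    times={"Q1", "Q2"},
    items={"Car", "Bus"},
)
```

The result is still a 3-dimensional cube, only smaller along each dimension.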
• Slice: It fixes a single value on one dimension of the OLAP
cube, which results in a new sub-cube with one fewer dimension.
• Here, the slice is performed on the Time dimension
with the criterion Time = “Q1”.
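The slice on Time = “Q1” can be sketched as follows. The cube contents and the slice_time function are hypothetical; the sketch drops the Time coordinate from the result because it is fixed to one value.

```python
# Hypothetical 3-D cube keyed by (location, time, item).
cube = {
    ("Delhi", "Q1", "Car"): 10,
    ("Kolkata", "Q1", "Bus"): 6,
    ("Delhi", "Q2", "Car"): 4,
}


def slice_time(cube, quarter):
    """Fix the Time dimension to one value and return a 2-D sub-cube."""
    return {
        (location, item): value
        for (location, time, item), value in cube.items()
        if time == quarter
    }


q1_slice = slice_time(cube, "Q1")
```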
• Pivot: It is also known as the rotation operation, as it
rotates the axes of the current view to give a new
presentation of the same data.
• In the sub-cube obtained after the slice operation,
performing the pivot operation gives a new view of it.
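A pivot of a 2-dimensional view can be sketched as swapping its row and column axes. The view contents and the pivot function are hypothetical; the view is modeled as a nested dictionary with locations as rows and items as columns.

```python
# Hypothetical 2-D view: rows are locations, columns are items.
view = {
    "Delhi": {"Car": 10, "Bus": 2},
    "Kolkata": {"Car": 3, "Bus": 6},
}


def pivot(view):
    """Rotate the view so that columns become rows and rows become columns."""
    rotated = {}
    for row_key, columns in view.items():
        for col_key, value in columns.items():
            rotated.setdefault(col_key, {})[row_key] = value
    return rotated


rotated_view = pivot(view)
```

The values are unchanged; only the orientation of the presentation differs, and pivoting twice returns the original view.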
OLAP MULTIDIMENSIONAL MODELS

1. Data Cube
2. Star schema
3. Snowflake schema
4. Fact constellation
1. DATA CUBE
• A data cube is a structure that enables OLAP to achieve
multidimensional functionality.
• The data cube is used to represent data along some measure
of interest.
• Data cubes are an easy way to look at the data (they allow us
to look at complex data in a simple format).
• Although called a "cube", it can be 2-dimensional,
3-dimensional, or higher-dimensional.
• Data cubes are designed for efficient data retrieval.
• The cube is comparable to a table in a relational database.
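One way to picture a data cube in code is as a mapping from dimension coordinates to a measure. The sketch below is an assumption-laden illustration, not how an OLAP engine stores cubes: the dimension names, the cube contents, and the project function are hypothetical. It shows that the same data can be viewed with three, one, or zero dimensions.

```python
from collections import defaultdict

DIMENSIONS = ("location", "time", "item")  # hypothetical dimension names

# Each cell maps one coordinate per dimension to the measure (sales).
cube = {
    ("Delhi", "Q1", "Car"): 10,
    ("Delhi", "Q2", "Car"): 4,
    ("Kolkata", "Q1", "Bus"): 6,
}


def project(cube, keep):
    """Aggregate the measure over every dimension not listed in `keep`."""
    indexes = [DIMENSIONS.index(name) for name in keep]
    result = defaultdict(int)
    for coords, measure in cube.items():
        result[tuple(coords[i] for i in indexes)] += measure
    return dict(result)


by_location = project(cube, ("location",))  # 1-dimensional view
grand_total = project(cube, ())             # 0-dimensional view (one cell)
```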
2. STAR SCHEMA
• A star schema is the elementary form of a dimensional
model, in which data are organized
into facts and dimensions.
• In a data warehouse star schema, the center of the
star holds one fact table, which is linked to a number of
associated dimension tables.
• It is known as a star schema because its structure resembles a star.
• The star schema data model is the simplest type of data
warehouse schema.
• It is also known as the star join schema and is optimized for
querying large data sets.
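A star schema can be sketched with Python's built-in sqlite3 module. All table names, column names, and rows below are hypothetical and chosen only for illustration: one fact table (fact_sales) sits at the center and references three dimension tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: one per point of the star.
CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT, country TEXT);
CREATE TABLE dim_time     (time_id INTEGER PRIMARY KEY, quarter TEXT, year INTEGER);
CREATE TABLE dim_item     (item_id INTEGER PRIMARY KEY, item_name TEXT);

-- Fact table: the center of the star, holding keys and the measure.
CREATE TABLE fact_sales (
    location_id INTEGER REFERENCES dim_location(location_id),
    time_id     INTEGER REFERENCES dim_time(time_id),
    item_id     INTEGER REFERENCES dim_item(item_id),
    units_sold  INTEGER
);

INSERT INTO dim_location VALUES (1, 'Delhi', 'India'), (2, 'Kolkata', 'India');
INSERT INTO dim_time     VALUES (1, 'Q1', 2024), (2, 'Q2', 2024);
INSERT INTO dim_item     VALUES (1, 'Car'), (2, 'Bus');
INSERT INTO fact_sales   VALUES (1, 1, 1, 10), (2, 1, 2, 6), (1, 2, 1, 4);
""")

# A typical star join: the fact table joined directly to its dimensions.
rows = conn.execute("""
    SELECT l.city, t.quarter, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_location AS l ON l.location_id = f.location_id
    JOIN dim_time     AS t ON t.time_id = f.time_id
    GROUP BY l.city, t.quarter
    ORDER BY l.city, t.quarter
""").fetchall()
```

Every dimension is exactly one join away from the fact table, which is what keeps queries against this layout simple.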
3. SNOWFLAKE SCHEMA
• The snowflake schema is an expansion of the
star schema where each point of the star
explodes into more points.
• It is called snowflake schema because the
diagram of snowflake schema resembles a
snowflake.
• The dimension tables are normalized which
splits data into additional tables.
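The normalization described above can be sketched with sqlite3. The table names, column names, and rows are hypothetical: a single location dimension is split into dim_city and dim_country, so reaching the country from the fact table takes two joins instead of one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- The location dimension is normalized into two tables.
CREATE TABLE dim_country (country_id INTEGER PRIMARY KEY, country_name TEXT);
CREATE TABLE dim_city (
    city_id    INTEGER PRIMARY KEY,
    city_name  TEXT,
    country_id INTEGER REFERENCES dim_country(country_id)
);
CREATE TABLE fact_sales (
    city_id    INTEGER REFERENCES dim_city(city_id),
    units_sold INTEGER
);

INSERT INTO dim_country VALUES (1, 'India'), (2, 'Canada');
INSERT INTO dim_city    VALUES (1, 'Delhi', 1), (2, 'Kolkata', 1), (3, 'Toronto', 2);
INSERT INTO fact_sales  VALUES (1, 10), (2, 6), (3, 12), (1, 4);
""")

# Fact -> city -> country: the extra join is the cost of normalization.
rows = conn.execute("""
    SELECT co.country_name, SUM(f.units_sold)
    FROM fact_sales AS f
    JOIN dim_city    AS ci ON ci.city_id = f.city_id
    JOIN dim_country AS co ON co.country_id = ci.country_id
    GROUP BY co.country_name
    ORDER BY co.country_name
""").fetchall()
```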
4. FACT CONSTELLATION
• A fact constellation means two or more fact tables
sharing one or more dimensions. It is also called a
galaxy schema.
• A fact constellation schema describes a logical
structure of a data warehouse or data mart.
• A fact constellation schema can be built from
aggregate fact tables, or by decomposing a complex fact
table into independent, simpler fact tables.
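A fact constellation can be sketched with sqlite3 as two fact tables that share one dimension. The table names (fact_sales, fact_shipping, dim_item), columns, and rows are hypothetical and serve only to show the shape of the schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Shared dimension.
CREATE TABLE dim_item (item_id INTEGER PRIMARY KEY, item_name TEXT);

-- Two fact tables that both reference the shared dimension.
CREATE TABLE fact_sales (
    item_id    INTEGER REFERENCES dim_item(item_id),
    units_sold INTEGER
);
CREATE TABLE fact_shipping (
    item_id       INTEGER REFERENCES dim_item(item_id),
    units_shipped INTEGER
);

INSERT INTO dim_item      VALUES (1, 'Car'), (2, 'Bus');
INSERT INTO fact_sales    VALUES (1, 10), (1, 4), (2, 6);
INSERT INTO fact_shipping VALUES (1, 12), (2, 5);
""")

# Each fact table is aggregated separately per item, then lined up through
# the shared dimension. Joining both fact tables in one step would multiply
# rows and inflate the sums.
rows = conn.execute("""
    SELECT i.item_name,
           (SELECT SUM(s.units_sold) FROM fact_sales AS s
             WHERE s.item_id = i.item_id),
           (SELECT SUM(sh.units_shipped) FROM fact_shipping AS sh
             WHERE sh.item_id = i.item_id)
    FROM dim_item AS i
    ORDER BY i.item_name
""").fetchall()
```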
