Assignment 3 DM
MIT 7033
Assignment 3
By:
33.2
Discuss the relationship between Data Warehouse and Online Analytical Processing
(OLAP)
A data warehouse is a database containing data that usually represents the business
history of an organization. This historical data is used for analysis. Data in a data warehouse is
organized to support analysis rather than to process real-time transactions as in online transaction
processing systems (OLTP).
OLAP technology enables data warehouses to be used effectively for online analysis, providing
rapid responses to iterative complex analytical queries. OLAP's multidimensional data model and
data aggregation techniques organize and summarize large amounts of data so it can be evaluated
quickly using online analysis and graphical tools.
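As a rough illustration of the aggregation idea described above, the following minimal Python sketch rolls a small set of fact rows up along chosen dimensions. The dimension names, the sample rows, and the `roll_up` helper are all hypothetical and are not taken from any particular OLAP product; a real OLAP server would precompute and store such summaries rather than scan the rows on every query.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales_amount)
FACTS = [
    (2023, "North", "Widget", 100.0),
    (2023, "North", "Gadget", 150.0),
    (2023, "South", "Widget", 80.0),
    (2024, "North", "Widget", 120.0),
    (2024, "South", "Gadget", 200.0),
]

# Assumed dimension order; the last column of each row is the measure.
DIMENSIONS = ("year", "region", "product")


def roll_up(facts, keep):
    """Sum the measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(float)
    for row in facts:
        key = tuple(row[i] for i in idx)
        totals[key] += row[-1]
    return dict(totals)


by_year = roll_up(FACTS, ["year"])                   # coarse summary
by_year_region = roll_up(FACTS, ["year", "region"])  # drill down one level
grand_total = roll_up(FACTS, [])                     # fully aggregated

print(by_year)
print(by_year_region)
print(grand_total)
```

Keeping fewer dimensions gives a coarser summary (roll-up), and keeping more gives a finer one (drill-down), which is the kind of iterative query an analyst issues against a multidimensional model.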
33.8
OLAP implementation faces many challenges. Discuss critical OLAP implementation issues
and how they are addressed at present
OLAP Implementations
MOLAP: OLAP implemented with a multi-dimensional database.
ROLAP: OLAP implemented with a relational database.
HOLAP: OLAP implemented with a hybrid of multi-dimensional and relational database technologies.
DOLAP: OLAP implemented for desktop decision support environments.
Before implementing OLAP in our data warehouse, we have to take into
account two key issues with the MOLAP model running under an MDDBMS. The first
is the lack of standardization: each vendor tool has its own client interface. The second
is scalability: OLAP generally handles summary data well, but not large
volumes of detailed data. On the other hand, highly normalized data in the data warehouse
can give rise to processing overhead when we perform complex analysis. We may reduce
this overhead by using a STAR schema multidimensional design. In fact, for some ROLAP tools, the
multidimensional representation of data in a STAR schema arrangement is a prerequisite.
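A STAR schema arrangement can be sketched with Python's built-in sqlite3 module. The table names (`fact_sales`, `dim_date`, `dim_product`), the columns, and the sample rows below are assumptions made only for illustration; the point is that one central fact table joins directly to each denormalized dimension table, so an analytical query needs only one join per dimension.

```python
import sqlite3

# In-memory database holding a tiny, hypothetical star schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT);
CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
CREATE TABLE fact_sales (
    date_id INTEGER REFERENCES dim_date(date_id),
    product_id INTEGER REFERENCES dim_product(product_id),
    amount REAL
);
INSERT INTO dim_date VALUES (1, 2023, 'Q1'), (2, 2023, 'Q2'), (3, 2024, 'Q1');
INSERT INTO dim_product VALUES (1, 'Widget', 'Hardware'), (2, 'Manual', 'Books');
INSERT INTO fact_sales VALUES
    (1, 1, 100.0), (1, 2, 20.0), (2, 1, 150.0), (3, 1, 120.0), (3, 2, 30.0);
""")

# One join per dimension, then aggregate the measure by the chosen attributes.
rows = conn.execute("""
SELECT d.year, p.category, SUM(f.amount)
FROM fact_sales AS f
JOIN dim_date AS d ON d.date_id = f.date_id
JOIN dim_product AS p ON p.product_id = f.product_id
GROUP BY d.year, p.category
ORDER BY d.year, p.category
""").fetchall()

for row in rows:
    print(row)
```

In a highly normalized design the same question would typically require longer join chains, which is the processing overhead the paragraph above refers to.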
Consider a few choices of architecture. They are various implementation options for providing
OLAP functionality in our data warehouse. These are important choices. Remember, without
OLAP, our users have very limited means for analysing data.
First we need to recognize that the presence of an OLAP system in our data warehouse
environment shifts the workload. Some of the queries that usually must run against the data
warehouse will now be redistributed to the OLAP system. The types of queries that need OLAP
are complex and filled with involved calculations.
At this point, perhaps our project team has been given the mandate to build and implement an
OLAP system. We know the features and functions, their significance, and the
important considerations. How do we go about implementing OLAP? Let us
summarize the key steps at a very high level. Each step consists
of several tasks that accomplish the objectives of that step; we will have to define those tasks
based on the requirements of our environment. Here are the major steps:
Dimensional modeling
Design and building of the MDDB
Selection of the data to be moved into the OLAP system
Data acquisition or extraction for the OLAP system
Data loading into the OLAP server
Computation of data aggregation and derived data
Implementation of application on the desktop
Provision of user training
34.1
Discuss what data mining represents.
Data mining is the process of extracting valid, previously unknown, comprehensible, and
actionable information from large databases and using it to make crucial business decisions.
Data mining is also concerned with the analysis of data and the use of software techniques for
finding hidden and unexpected patterns and relationships in sets of data. Data mining
focuses on revealing information that is hidden and unexpected, as there is less value in finding
patterns and relationships that are already intuitive. The patterns and relationships are identified by
examining the underlying rules and features in the data. Data mining analysis tends to
work from the data up, and the techniques that produce the most accurate results normally
require large volumes of data to deliver reliable conclusions. The process of analysis starts by
developing an optimal representation of the structure of sample data, during which time
knowledge is acquired. This knowledge is then extended to larger sets of data, working on
the assumption that the larger data set has a structure similar to the sample data. Data mining
can provide huge paybacks for companies who have made a significant investment in data
warehousing. Data mining is used in a wide range of industries. Data mining has also been
popularly treated as a synonym for knowledge discovery in databases, although some view data
mining as an essential step in knowledge discovery.
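The idea of learning a pattern from sample data and then extending it to a larger data set can be illustrated with a very small Python sketch. It uses pair support (the fraction of transactions that contain a given pair of items), which is only one simple pattern measure chosen here for illustration; the baskets and the `pair_support` helper are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pair_support(transactions):
    """Return the fraction of transactions containing each unordered item pair."""
    counts = Counter()
    for basket in transactions:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    n = len(transactions)
    return {pair: c / n for pair, c in counts.items()}


# Hypothetical shopping baskets standing in for a large database.
FULL = [
    ["bread", "butter"],
    ["bread", "butter", "milk"],
    ["milk", "tea"],
    ["bread", "butter", "tea"],
    ["bread", "milk"],
    ["bread", "butter"],
    ["tea"],
    ["bread", "butter", "milk"],
]
SAMPLE = FULL[:4]  # the smaller sample used to discover a candidate pattern

# Step 1: find the most frequent pair in the sample.
sample_support = pair_support(SAMPLE)
candidate = max(sample_support, key=sample_support.get)

# Step 2: check how well that pattern holds in the larger data set.
full_support = pair_support(FULL)[candidate]

print(candidate, sample_support[candidate], full_support)
```

The pattern found in the sample still holds in the larger set, though less strongly, which shows why the assumption that both sets share a similar structure has to be checked rather than taken for granted.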