Oracle Data Integrator
Questions
1) What components make up Oracle Data Integrator?
"Oracle Data Integrator" comprises of the following
1. Oracle Data Integrator + Topology Manager + Designer + Operator + Agent.
2. Oracle Data Quality for Data Integrator.
3. Oracle Data Profiling.
2) What is a model in ODI ?
A data model corresponds to a group of tabular data structures stored in a data server.
It is based on a Logical Schema defined in the topology and contains only metadata.
An ODI model stores the metadata about the source and target tables. It is a set of ODI
datastores (tables); a model can contain all of the tables, or a subset of the tables, from
a database schema. ODI calls them datastores rather than tables because they may not
come from a database: XML files, LDAP trees, flat files, etc. are also datastores.
Unlike Informatica, which requires you to know whether a database table is a source
or a target at the time you import its definition, ODI lets you keep the model definition
independent of how the model is used. The data warehouse staging tables, for example,
are the source of the target tables and the target of the OLTP tables; in Informatica you
have to import and copy them when you do ETL development.
ODI also allows you to group datastores into model folders and sub-models.
Partitioning In ODI:
Range Partitioning
Range partitioning maps data to partitions based on ranges of partition key values that
you establish for each partition. It is the most common type of partitioning and is often
used with dates. For example, you might want to partition sales data into monthly
partitions.
Range partitioning maps rows to partitions based on ranges of column values. Range
partitioning is defined by the partitioning specification for a table or index
in PARTITION BY RANGE(column_list) and by the partitioning specification for each
individual partition in VALUES LESS THAN(value_list), where column_list is an ordered
list of columns that determines the partition to which a row or an index entry belongs.
These columns are called the partitioning columns. The values in the partitioning
columns of a particular row constitute that row's partitioning key.
An ordered list of values for the columns in the column list is called a value_list. Each
value must be either a literal or a TO_DATE or RPAD function with constant arguments.
Only the VALUES LESS THAN clause is allowed. This clause specifies a non-inclusive upper
bound for the partition. All partitions except the first have an implicit lower bound,
given by the VALUES LESS THAN literal of the previous partition. Any value of the
partitioning key equal to or higher than a partition's literal is placed in the next higher
partition. The highest partition is the one whose bound is the keyword MAXVALUE, which
represents a virtual infinite value that sorts higher than any other value for the data
type, including the null value.
The following statement creates a table sales_range that is range partitioned on
the sales_date field:
CREATE TABLE sales_range (
  salesman_id    NUMBER(5),
  salesman_name  VARCHAR2(30),
  sales_amount   NUMBER(10),
  sales_date     DATE)
COMPRESS
PARTITION BY RANGE(sales_date)
 (PARTITION sales_jan2000 VALUES LESS THAN (TO_DATE('02/01/2000','DD/MM/YYYY')),
  PARTITION sales_feb2000 VALUES LESS THAN (TO_DATE('03/01/2000','DD/MM/YYYY')),
  PARTITION sales_mar2000 VALUES LESS THAN (TO_DATE('04/01/2000','DD/MM/YYYY')),
  PARTITION sales_apr2000 VALUES LESS THAN (TO_DATE('05/01/2000','DD/MM/YYYY')));
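As a quick usage sketch against the sales_range table above (the salesman values are made up for illustration), a row's sales_date decides its partition, and a single partition can be queried directly:

```sql
-- This date falls below the first upper bound, so the row lands in
-- the first partition, sales_jan2000.
INSERT INTO sales_range
VALUES (100, 'SMITH', 1500, TO_DATE('01/01/2000','DD/MM/YYYY'));

-- Read rows from that one partition only.
SELECT salesman_id, sales_date
FROM   sales_range PARTITION (sales_jan2000);
```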
Hash Partitioning
Hash partitioning maps data to partitions based on a hashing algorithm that Oracle
applies to a partitioning key that you identify. The hashing algorithm evenly distributes
rows among partitions, giving partitions approximately the same size. Hash partitioning is
the ideal method for distributing data evenly across devices. Hash partitioning is also an
easy-to-use alternative to range partitioning, especially when the data to be partitioned
is not historical.
Oracle Database uses a linear hashing algorithm; to prevent data from clustering
within specific partitions, you should choose a number of partitions that is a power of
two (for example, 2, 4, 8).
The following statement creates a table sales_hash, which is hash partitioned on
the salesman_id field:
CREATE TABLE sales_hash (
  salesman_id    NUMBER(5),
  salesman_name  VARCHAR2(30),
  sales_amount   NUMBER(10),
  week_no        NUMBER(2))
PARTITION BY HASH(salesman_id)
PARTITIONS 4;
List Partitioning
List partitioning enables you to explicitly control how rows map to partitions. You do this
by specifying a list of discrete values for the partitioning column in the description for
each partition. This is different from range partitioning, where a range of values is
associated with a partition, and from hash partitioning, where you have no control over
the row-to-partition mapping. The advantage of list partitioning is that you can group and
organize unordered and unrelated sets of data in a natural way. The following example
creates a list partitioned table grouping states according to their sales regions:
CREATE TABLE sales_list (
  salesman_id    NUMBER(5),
  salesman_name  VARCHAR2(30),
  sales_state    VARCHAR2(20),
  sales_ ...
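The statement above is truncated in this copy. A complete sketch of a list-partitioned table along the same lines might look as follows; the sales_amount column and the state-to-region groupings are illustrative assumptions, not the original text:

```sql
-- Hypothetical completion of the truncated example: the last column
-- and the state groupings are assumptions for illustration.
CREATE TABLE sales_list (
  salesman_id    NUMBER(5),
  salesman_name  VARCHAR2(30),
  sales_state    VARCHAR2(20),
  sales_amount   NUMBER(10))
PARTITION BY LIST(sales_state)
 (PARTITION sales_west  VALUES ('California', 'Hawaii'),
  PARTITION sales_east  VALUES ('New York', 'Virginia'),
  PARTITION sales_other VALUES (DEFAULT));
```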
7) What is An Interface?
An interface is an object in ODI that maps sources to target datastores.
8) What is a temporary Interface (Yellow Interface) ?
The advantage of using a yellow interface is to avoid creating a Model each time we
need one in an interface. Since they are temporary, they are not part of the data
model and hence don't need to be in the Model.
9) Explain some differences between ODI 10g and ODI 11g?
1. ODI 11g provides a Java API to manipulate both the design-time and run-time
artifacts of the product. This API allows you, for example, to create or modify
interfaces programmatically, create your topology, perform import or export
operations, and launch or monitor sessions. It can be used in any Java SE
and Java EE application, or in the context of Java-based scripting languages
such as Groovy or Jython.
2. External Password Storage, to have source/target data server (and context)
passwords stored in an enterprise credential store, and External Authentication, to
authenticate ODI users against an enterprise identity store.
28) What are the profiles available in ODI?
DESIGNER: Generic profile for users who design projects and interfaces. This profile
gives access to any project and project sub-components (folders, interfaces,
knowledge modules, etc.) stored in the repository. It also authorizes users to perform
journalizing actions (start journal, create subscriber, etc.) and to run static controls
over models and datastores.
METADATA ADMIN: Generic profile for users responsible for managing models and
reverse-engineering. Users having this profile are allowed to browse any project in
order to select a CKM, RKM or JKM and to attach it to a specific model.
OPERATOR: Generic profile for operators. It allows users to browse execution logs.
REPOSITORY EXPLORER: Generic profile for meta-data browsing through
Metadata Navigator. It also allows scenario launching from Metadata Navigator.
SECURITY ADMIN: Generic profile for administrators of user accounts and profiles.
TOPOLOGY ADMIN: Generic profile for users responsible for managing the
information system topology. Users granted with this profile are allowed to perform
any action through the Topology Manager module.
VERSION ADMIN: Generic profile for managing component versions as well as
solutions. This profile must be coupled with DESIGNER and METADATA ADMIN.
NG DESIGNER: Non-Generic profile for DESIGNER.
NG METADATA ADMIN: Non-Generic profile for METADATA ADMIN.
NG REPOSITORY EXPLORER: Non-generic profile for meta-data browsing through
Metadata Navigator.
NG VERSION ADMIN: Non-generic profile for VERSION ADMIN. It is recommended
that you use this profile with NG DESIGNER and NG METADATA ADMIN.
29) Difference between generic and non-generic profiles ?
If an administrator wants a user to have the rights on no instance by default, but
wishes to grant the rights by instance, he must grant the user with a non-generic
profile.
If an administrator wants a user to have the rights on all instances of an object type
by default, he must grant the user with a generic profile.
30) What are the types of generic profiles in ODI ?
Connect, Designer, Metadata Admin, Operator, Repository Explorer, Security Admin,
Topology Admin, Version Admin.
31) What are the types of non-generic profiles in ODI ?
NG DESIGNER, NG METADATA ADMIN, NG REPOSITORY EXPLORER, NG
VERSION ADMIN.
48) Suppose the source has unique and duplicate records, and I want to load the
unique records into one table and the duplicates into another?
Create two interfaces (or one procedure) and use two queries: one for the unique
values and one for the duplicate values.
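A sketch of the two queries that such interfaces (or a procedure) could be built on; the src table, the id key and the target table names are hypothetical, not from the original question:

```sql
-- Keys occurring exactly once go to the uniques table.
INSERT INTO target_unique
SELECT *
FROM   src s
WHERE  (SELECT COUNT(*) FROM src x WHERE x.id = s.id) = 1;

-- Keys occurring more than once go to the duplicates table.
INSERT INTO target_dup
SELECT *
FROM   src s
WHERE  (SELECT COUNT(*) FROM src x WHERE x.id = s.id) > 1;
```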
60) What is a procedure and how to write the procedures in ODI ?
A Procedure is a reusable component that allows you to group actions that do not fit
in the Interface framework. (That is load a target datastore from one or more
sources). A Procedure is a sequence of commands launched on logical schemas. It
has a group of associated options. These options parameterize whether or not a
command should be executed as well as the code of the commands.
61) What are the prime responsibilities of a Data Integration Administrator?
The latest version, Oracle Data Integrator Enterprise Edition (ODI-EE 12c) brings together
"Oracle Data Integrator" and "Oracle Warehouse Builder" as separate components of a
single product with a single licence.
What is E-LT? Or What is the difference between ODI and other ETL Tools?
E-LT is an innovative approach to extracting, loading and transforming data. Typically,
ETL application vendors have relied on costly, heavyweight mid-tier servers to perform
the transformations required when moving large volumes of data around the enterprise.
ODI delivers unique next-generation, Extract Load and Transform (E-LT) technology that
improves performance and reduces data integration costs, even across heterogeneous
systems by pushing the processing required down to the typically large and powerful
database servers already in place within the enterprise.
What are load plans, and what types of load plans exist?
A load plan is a process to run or execute multiple scenarios as a sequential, parallel
or condition-based execution. Accordingly, there are three types of load plans:
sequential, parallel and condition-based.
What is a profile in ODI?
A profile is a set of object-wise privileges. We can assign profiles to users; users get
their privileges from the profile.
How to write sub-queries in ODI?
Use a yellow interface with the sub-queries option, use a VIEW, or use an ODI
procedure to call database queries directly.
How to remove duplicates in ODI?
Use DISTINCT at the IKM level; it will remove duplicate rows while loading into the
target.
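The effect of enabling the DISTINCT option at the IKM level is essentially the following shape of generated SQL (table and column names are illustrative, not ODI's actual work-table names):

```sql
-- With the IKM's DISTINCT option set, the generated INSERT
-- de-duplicates the flow before it reaches the target.
INSERT INTO target_table (id, name)
SELECT DISTINCT id, name
FROM   staging_table;
```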
How to implement data validations?
Use filters and mappings in the mapping area; for data quality related to constraints,
use a CKM with flow control.
How to handle exceptions?
Exceptions can be handled in a package's Advanced tab and in a load plan's
Exception tab.
In a package, one interface failed; how do we know which interface failed if we have
no access to Operator?
Set up a mail alert, or check the SNP_SESS_LOG table for the session log details.
How to implement logic in procedures so that data deleted on the source side is also
removed from the target side table?
Use this query in the Command on Target:
Delete from Target_table where not exists (Select 'X' From Source_table Where
Source_table.ID = Target_table.ID)
If the source has 15 records in total, with 2 records updated and 3 records newly
inserted, and at the target we have to load only the changed and inserted records?
Use the IKM Incremental Update knowledge module for both insert and update
operations.
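What an incremental-update IKM does can be pictured as a MERGE between the staging flow and the target; the table and column names below are illustrative, and the exact code the IKM generates depends on the KM and the database:

```sql
-- Updated rows overwrite their target versions, new rows are
-- inserted, and untouched rows are left alone.
MERGE INTO target_table t
USING staging_table s
ON (t.id = s.id)
WHEN MATCHED THEN
  UPDATE SET t.name = s.name, t.amount = s.amount
WHEN NOT MATCHED THEN
  INSERT (id, name, amount) VALUES (s.id, s.name, s.amount);
```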
Can we implement a package in a package?
Yes, we can call one package from another package.
How to load data with one flat file and one RDBMS table using joins?
Drag and drop both the file and the table into the source area, and join them in the
staging area.
If the source and target are both Oracle technology, describe the process to achieve
this requirement (interfaces, KMs, models).
Use LKM SQL to SQL or LKM SQL to Oracle, with IKM Oracle Incremental Update or
IKM Oracle Control Append.
What do we specify in an XML data server, and what parameters are used to connect
to an XML file?
- The ability to generate code on source and target systems alike, in the same
transformation
- The ability to generate native SQL for any database on the market; most ETL tools
generate code for their own engines and then translate that code for the databases,
hence limiting their generation capacities to their ability to convert proprietary
concepts
- The ability to generate DML and DDL, and to orchestrate sequences of operations
on the heterogeneous systems
1. Explain what ODI is. Why is it different from the other ETL tools?
ANS: ODI stands for Oracle Data Integrator. It is different from other ETL tools in
that it uses an E-LT approach as opposed to an ETL approach. This approach eliminates
the need for an exclusive transformation server between the source and target data
servers: the power of the target data server is used to transform the data, i.e. the
target data server acts as the staging area in addition to its role of target database.
The transformation logic is implemented while loading the data into the target
database (from the staging area). Also, an appropriate CKM (Check Knowledge Module)
can be used at this point to implement data quality requirements.
2. Explain what you have done using ODI.
3. How will you bring in the different source data into ODI?
ANS: You will have to create data servers in the Topology Manager for the different
sources that you want.
4. How will you bulk load data?
ANS: In ODI there are IKMs that are designed for bulk loading of data.
5. How will you install ODI?
ANS: http://subbus.blogspot.in/2014/03/oracle-data-integrator-odisunopsis.html
6. How will you bring in files from remote locations?
ANS: We will invoke the Service knowledge module in ODI; this will help us to access
data through a web service.
7. How will you handle data quality in ODI?
ANS: There are two ways of handling data quality in ODI: the first method deals with
handling incorrect data using the CKM; the second method uses the Oracle Data
Quality tool (for advanced quality options).
What is Oracle Data Integrator (ODI)?
Oracle acquired Sunopsis, with its ETL tool called Sunopsis Data Integrator, and
renamed it Oracle Data Integrator (ODI). It is an E-LT (Extract, Load and Transform)
tool used for high-speed data movement between disparate systems.
What systems can ODI extract and load data into?
ODI brings true heterogeneous connectivity out of the box; it can connect natively to
Oracle, Sybase, MS SQL Server, MySQL, LDAP, DB2, PostgreSQL and Netezza.
It can also connect to any data source supporting JDBC; it is even possible to use the
Oracle BI Server as a data source, using the JDBC driver that ships with BI Publisher.
What are Knowledge Modules?
Knowledge Modules form the basis of plug-ins that allow ODI to generate the relevant
execution code, across technologies, to perform tasks in one of six areas. The six types
of knowledge module are:
Reverse-engineering knowledge modules are used for reading the table and other object
metadata from source databases.
Journalizing knowledge modules record the new and changed data within either a single table
or view or a consistent set of tables or views
Loading knowledge modules are used for efficient extraction of data from source databases
for loading into a staging area (database-specific bulk unload utilities can be used where
available)
Check knowledge modules are used for detecting errors in source data
Integration knowledge modules are used for efficiently transforming data from staging area to
the target tables, generating the optimized native SQL for the given database
Service knowledge modules provide the ability to expose data as Web services
ODI ships with many knowledge modules out of the box; these are also extendable and
can be modified within the ODI Designer module.
Does my ODI infrastructure require an Oracle database?
No; the ODI modular repositories (one Master plus one or more Work repositories) can
be installed on any database engine that supports ANSI/ISO-89 SQL syntax, such as
Oracle, Microsoft SQL Server, Sybase ASE, IBM DB2 UDB and IBM DB2/400.
Does ODI support web services?
Yes, ODI is SOA-enabled and its web services can be used in three ways:
The Oracle Data Integrator Public Web Service, which lets you execute a scenario (a
published package) from a web service call
Data Services, which provide a web service over an ODI datastore (i.e. a table, view
or other data source registered in ODI)
The ODIInvokeWebService tool, which you can add to a package to request a response
from a web service
What is the ODI Console?
The ODI Console is a web-based navigator used to access the Designer, Operator and
Topology components through a browser.
Suppose I have 6 interfaces and, while running, the 3rd one failed; how do I run the
remaining interfaces?
If you are running a sequential load, the failure stops the following interfaces: go to
Operator, right-click on the failed interface and click Restart. If you are running all
the interfaces in parallel, only the one interface will fail and the other interfaces
will finish.
How is a connection to a work repository defined?
Each work repository is attached to a master repository, therefore, information about the
physical connection to a work repository is stored in the master repository it is attached to.
Defining a connection to a work repository consists of defining a connection to a master
repository, then selecting one of the work repositories attached to this master repository.
What is the Master Repository?
The Master Repository is a data structure containing information on the topology of a
company's IT resources, on security and on version management of projects and data
models. This repository is stored on a relational database accessible in client/server
mode from the different modules.
Generally, only one master repository is necessary.
However, in exceptional circumstances, it may be necessary to create several master
repositories in one of the following cases:
- Project construction over several sites not linked by a high-speed network (off-site
development, for example).
- The need to clearly separate the operating environments (development, test,
production), including on the database containing the master repository. This may be
the case if these environments are on several sites.
What is a Model?
An Oracle model is a set of datastores corresponding to views and tables contained in
an Oracle schema. A model is always based on a Logical Schema. In a given context,
the Logical Schema corresponds to a Physical Schema, and the Data Schema of this
Physical Schema contains the model's tables and views.
What is a Package ?
The package is the biggest execution unit in Oracle Data Integrator. A package is made of a
sequence of steps organized in an execution diagram.
What is a Context?
A context is a set of resources allowing the operation or simulation of one or more
data processing applications. Contexts allow the same jobs (reverse-engineering, data
quality control, packages, etc.) to be executed on different databases and/or schemas.
In Oracle Data Integrator, a context allows logical objects (logical agents, logical
schemas) to be linked with physical objects (physical agents, physical schemas).
What are Sequences?
A sequence is a variable that increments itself each time it is used. Between two uses,
the value can be stored in the repository or managed within an external RDBMS table.
Oracle Data Integrator supports two types of sequences:
- Standard sequences, whose last value is stored in the Repository.
- Specific sequences, whose last value is stored in an RDBMS table cell. Oracle Data
Integrator undertakes to read the value, to lock the row (against concurrent updates)
and to update the row after the last increment.
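The read-lock-update cycle for a specific sequence can be sketched in SQL roughly as follows; the table and column names are illustrative, not ODI's actual generated code, and :last_value stands for a bind variable holding the final incremented value:

```sql
-- Lock the row holding the sequence value so concurrent sessions wait.
SELECT seq_value
FROM   odi_seq_table
WHERE  seq_name = 'MY_SEQ'
FOR UPDATE;

-- After the session has consumed its increments, write back the new value
-- and release the lock.
UPDATE odi_seq_table
SET    seq_value = :last_value
WHERE  seq_name = 'MY_SEQ';

COMMIT;
```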
What is a Session?
A session is an execution (of a scenario, an interface, a package or a procedure, etc.)
undertaken by an execution agent. A session is made up of steps, which are made up
of tasks.
What are Session Tasks?
The task is the smallest execution unit. It corresponds to a procedure command in a
KM or a procedure, an assignment of a variable, etc.