|
What
Is Cognos Data Manager?
Cognos Data Manager uses Jobstreams to build an automated
data mart.
Cognos Data Manager is an ETL (extract, transformation
and load) product. Its transforms your data, typically
from a transactional database to a data warehouse. Data
Manager is particularly optimised to produce ‘star
schema’ dimensional data marts, although it can
quite happily handle other types.
Data Manager makes it easy to merge data from a variety
of different data sources, for example Microsoft SQL
Server, Oracle 9i and 'comma delimited' files can happily
be merged in a single build.
Data Manager Make-Up
With Data Manager you create jobstreams to control your
data warehouse load, which consists of fact builds (which
build fact tables) and dimension builds, which are based
on hierarchies. All of these elements can be extensively
monitored with comprehensive log files.
Fact Builds
A fact build transforms one or more data sources into
a single output, say a database table or text file.
In the process you can rename items, construct complex
rules to determine which records are merged and which
are ignored, add your own items using functions and
calculations and so on. Fact tables are usually related
to dimensions, which contain categories such as Customer
Name, Address, Product Type, whereas the fact tables
typically contain numbers such as Order Amount or Quantity
Ordered.
Dimension Builds
This is where Data Manager gets clever! Data Manager
checks all the 'links' to dimensions before allowing
you to add a record to a fact table. Records where the
keys are not in sync are rejected.
Cognos Data Manager creates surrogate keys for you.
These are numeric references which have no relationship
with the data but make data warehouse maintenance easier.
Jobstreams
Jobstreams allow you to schedule and organize your data
warehouse builds. A jobstream is a sequence of 'nodes',
which can include other jobstreams, fact or dimension
builds, emails, procedures, your own SQL and so on.
There is a lot you can put into a Jobstream. Nodes can
be in sequence or set to run concurrently.
Slowly Changing Dimensions
Dimensional data which changes slowly, for example a
customer address, can be difficult to maintain because
a new dimension record is required whenever a change
is made. However, all the dimension records for the
customer need to be related to the same fact records.
Slowly changing dimensions usually contain 'start' and
'end' dates, which indicate, for example, which address
is the current one for the customer. Cognos Data Manager
is very good at handling these, creating start and end
dates for you whenever the data changes.
Logging
Logging in Cognos Data Manager is extensive. Log files
are written for both jobstreams and builds. However,
if you use SQL nodes (ie write your own SQL) you will
need to log progress yourself.
Documenting Data Manager Builds
Cognos Data Manager can produce a detailed catalog cross
reference in HTML, so you can open it in a web browser.
By putting your own notes in the 'general' tab found
in the property sheet of most objects you can make this
a lot more meaningful as your descriptions will appear
in the cross reference.
In my own work I have found the following to be useful:
- build cross reference
- hierarchy diagrams (I use Microsoft Visio)
- data flow diagrams for each business process
I could go on! But this is a good start, especially
if you do this to a detailed level.
Summary
Cognos Data Manager is a 'heavyweight' ETL tool, well
suited to an environment with many different data sources.
Data transformation can be highly automated by Jobstreams
which provide detailed logging. Data Manager can maintain
integrity between fact and dimension tables, create
and maintain surrogate keys and take the hassle out
of dealing with slowly-changing dimensions.
|