DataUnited

Factless Facts

A factless fact is where the fact does not store an actual numerical measure, the mere existance of a fact record indicates that an event has happened that you wish to track. The classic example of this would be an "Attendance" fact. If you had dimensions to record date, scheduled_course, instructor and delegate then you could create a fact table that held the permutations of these dimensions. From this you could evaluate the number of courses you run, the number of delegates, the number of courses by instructor etc.

I would never simply leave a factess fact as a bare collection of foreign key columns I would always add a dummy measure column which would be set to 1 which you would then sum.

Conformed dimensions

Conformed dimensions can be used to analyze facts from two or more data marts. Suppose you have a “shipping” data mart (telling you what you’ve shipped to whom and when) and a “sales” data mart (telling you who has purchased what and when). Both marts require a “customer” dimension and a “time” dimension. If they’re the samedimension, then you have conforming dimensions, allowing you to extract and manipulate facts relating to a particular customer from both marts, answering questions such as whether late shipments have affected sales to that customer.

Suppose now that you add a “marketing” data mart to help you analyze product promotions. Again, with conformed customer and time dimensions, you’re able to analyze the effects of a particular product promotion on sales. (Analyzing facts from more than one fact table in this way is termed “drilling across.” My previous article,“Thinking dimensionally aids business intelligence design and use,” explains the function of facts and dimensions.)

As this example shows, the very same conformed dimensions—in this case, time and customer dimensions—have meaning in the context of three independentlydevelopeddata marts. These dimensions become enterprise property and can be used later in other marts as you evolve the enterprise data warehouse.

What is Junk Dimension?

A "junk" dimension is a collection of random transactional codes, flags and/or text attributes that are unrelated to any particular dimension. The junk dimension is simply a structure that provides a convenient place to store the junk attributes. A good example would be a trade fact in a company that brokers equity trades.

When you consolidate lots of small dimensions and instead of having 100s of small dimensions, that will have few records in them, cluttering your database with these mini ‘identifier’ tables, all records from all these small dimension tables are loaded into ONE dimension table and we call this dimension table Junk dimension table. (Since we are storing all the junk in this one table) For example: a company might have handful of manufacture plants, handful of order types, and so on, so forth, and we can consolidate them in one dimension table called junked dimension table

It’s a dimension table which is used to keep junk attributes

Informatica PowerCenter Repository tables

every PowerCenter developer either has an intention or necessity to know about the Informatica metadata tables and where information is stored etc. For the starters, all the objects that we create in Informatica PowerCenter - let them be sources, targets, mappings, workflows, sessions, expressions, be it anything related to PowerCenter, will get stored in a set of database tables (call them as metadata tables or OPB tables or repository tables).

* I want to know all the sessions in my folder that are calling some shell script/command in the Post-Session command task.
* I want to know how many mappings have transformations that contain "STOCK_CODE" defined as a port.
* I want to know all unused ports in my repository of 100 folders.

In repositories where you have many number of sessions or workflows or mappings, it gets difficult to achieve this with the help of Informatica PowerCenter client tools. After all, whole of this data is stored in some form in the metadata tables. So if you know the data model of these repository tables, you will be in a better position to answer these questions.

Before we proceed further, let me clearly urge for something very important. Data in the repository/metadata/OPB tables is very sensitive and that the modifications like insert or updates are to be made using the PowerCenter tools ONLY. DO NOT DIRECTLY USE UPDATE OR INSERT COMMANDS AGAINST THESE TABLES.

Please also note that there is no official documentation from Informatica Corporation on how these tables act. It is purely based on my assumption, research and experience that I am providing these details. I will not be responsible to any of the damages caused if you use any statement other than the SELECT, knowing the details from this blog article. This is my disclaimer. Let us move on to the contents now.
There around a couple of hundred OPB tables in 7.x version of PowerCenter, but in 8.x, this number crosses 400. In this regard, I am going to talk about few important tables in this articles. As such, this is not a small topic to cover in one article. I shall write few more to cover other important tables like OPB_TDS, OPB_SESSLOG etc.
We shall start with OPB_SUBJECT now.

OPB_SUBJECT - PowerCenter folders table

This table stores the name of each PowerCenter repository folder.

Usage: Join any of the repository tables that have SUBJECT_ID as column with that of SUBJ_ID in this table to know the folder name.

OPB_MAPPING - Mappings table

This table stores the name and ID of each mapping and its corresponding folder.

Usage: Join any of the repository tables that have MAPPING_ID as column with that of MAPPING_ID in this table to know the mapping name.

OPB_TASK - Tasks table like sessions, workflow etc

This table stores the name and ID of each task like session, workflow and its corresponding folder.

Usage: Join any of the repository tables that have TASK_ID as column with that of TASK_ID/SESSION_ID in this table to know the task name. Observe that the session and also workflow are stored as tasks in the repository. TASK_TYPE for session is 68 and that of the workflow is 71.

OPB_SESSION - Session & Mapping linkage table

This table stores the linkage between the session and the corresponding mapping. As informed in the earlier paragraph, you can use the SESSION_ID in this table to join with TASK_ID of OPB_TASK table.

OPB_TASK_ATTR - Task attributes tables

This is the table that stores the attribute values (like Session log name etc) for tasks.

Usage: Use the ATTR_ID of this table to that of the ATTR_ID of OPB_ATTR table to find what each attribute in this table means. You can know more about OPB_ATTR table in the next paragraphs.

OPB_WIDGET - Transformations table

This table stores the names and IDs of all the transformations with their folder details.

Usage: Use WIDGET_ID from this table to that of the WIDGET_ID of any of the tables to know the transformation name and the folder details. Use this table in conjunction with OPB_WIDGET_ATTR or OPB_WIDGET_EXPR to know more about each transformation etc.

OPB_WIDGET_FIELD - Transformation ports table

This table stores the names and IDs of all the transformation fields for each of the transformations.

Usage: Take the FIELD_ID from this table and match it against the FIELD_ID of any of the tables like OPB_WIDGET_DEP and you can get the corresponding information.

OPB_WIDGET_ATTR - Transformation properties table

This table stores all the properties details about each of the transformations.

Usage: Use the ATTR_ID of this table to that of the ATTR_ID of OPB_ATTR table to find what each attribute in this transformation means.

OPB_EXPRESSION - Expressions table

This table stores the details of the expressions used anywhere in PowerCenter.

Usage: Use this table in conjunction with OPB_WIDGET/OPB_WIDGET_INST and OPB_WIDGET_EXPR to get the expressions in the Expression transformation for a particular, mapping or a set.

OPB_ATTR - Attributes

This table has a list of attributes and their default values if any. You can get the ATTR_ID from this table and look it up against any of the tables where you can get the attribute value. You should also make a note of the ATTR_TYPE, OBJECT_TYPE_ID before you pick up the ATTR_ID. You can find the same ATTR_ID in the table, but with different ATTR_TYPE or OBJECT_TYPE_ID.

OPB_COMPONENT - Session Component

This table stores the component details like Post-Session-Success-Email, commands in Post-Session/pre-Session etc.

Usage: Match the TASK_ID with that of the SESSION_ID in OPB_SESSION table to get the SESSION_NAME and to get the shell command or batch command that is there for the session, join this table with OPB_TASK_VAL_LIST table on TASK_ID.

OPB_CFG_ATTR - Session Configuration Attributes

This table stores the attribute values for Session Object configuration like "Save Session log by", Session log path etc.

Optimizing Oracle Optimizer Statistics

Top 5 MS Power Point Tips

Power point has become so common a tool that every Windows user clicks on MS Power point icon when asked to research, create a presentation, draw a flowchart or even just take notes.

However, there is so much of power under the hood that hardly gets utilized. So here are the top 5 power point tips to upgrade your power point experience

1. Stop Ctrl+C (Copy) and Ctrl+V (Paste). Select an object and do a Ctrl+D (Duplicate). Not only are you reducing the number of key presses, the Duplicate shortcut also aligns every newly duplicated object to create a pattern of equidistant objects

2. Pasted a image and it has an irritating white background?. Relax. You can clear the white background by double-clicking on the image, go to Colors on the menu bar, select "Set Transparent Color" and click anywhere on the white background of the image

3. Do you have a table that does not fit into a single slide? Do not fret. You can divide the table across two slides and use a cool slide transition to let the audience know that the table is continuing over two slides. Select the first slide and Go to Transitions on the menu bar and select "Push" as the transition.

4. Need to fit the letters of a word onto a curve? Type your word. Select the text box. Go to Format on the menu bar. Select Text Effects --> Transform --> Follow Path and the needed layout.

5. You are projecting your slides and you suddenly need the audience to stop looking at the slides and focus on you the speaker. You can press "B" or "W" on the keyboard to produce a Black or White background respectively. This works only in Presentation mode.

What are theTypes of Triggers ?

This section describes the different types of triggers:

Row Triggers

A row trigger is fired each time the table is affected by the triggering statement. For example, if an UPDATE statement updates multiple rows of a table, a row trigger is fired once for each row affected by the UPDATE statement. If a triggering statement affects no rows, a row trigger is not run.

BEFORE and AFTER Triggers

When defining a trigger, you can specify the trigger timing--whether the trigger action is to be run before or after the triggering statement. BEFORE and AFTER apply to both statement and row triggers.

BEFORE and AFTER triggers fired by DML statements can be defined only on tables, not on views.

Difference between Trigger and Procedure

Triggers	Stored Procedures
In trigger no need to execute manually. Triggers will be fired automatically. Triggers that run implicitly when an INSERT, UPDATE, or DELETE statement is issued against the associated table.	Where as in procedure we need to execute manually.

Differences between stored procedure and functions

Stored Procedure	Functions
Stored procedure may or may not return values.	Function should return at least one output parameter. Can return more than one parameter using OUT argument.
Stored procedure can be used to solve the business logic.	Function can be used to calculations
Stored procedure is a pre-compiled statement.	But function is not a pre-compiled statement.
Stored procedure accepts more than one argument.	Whereas function does not accept arguments.
Stored procedures are mainly used to process the tasks.	Functions are mainly used to compute values
Cannot be invoked from SQL statements. E.g. SELECT	Can be invoked form SQL statements e.g. SELECT
Can affect the state of database using commit.	Cannot affect the state of database.
Stored as a pseudo-code in database i.e. compiled form.	Parsed and compiled at runtime.