Performance Tuning of a DataStage Job

Active Stages

Eliminate unused columns.

Eliminate unused references.

In derivations, if possible, instead of calling routines, move the code into the derivation. This eliminates the overhead of the procedure call.

Input Links

Use ODBC stage to access relational tables.

Move constraints from Transform stages to input stage WHERE clauses, if possible, to reduce the number of rows the job has to process.

Output links

Use OCI plugin stages to access relational tables if available.

Adjust the rows per transaction setting. Try 1000, 5000, or 10,000.

Adjust the array size setting. Try 10, 100, or 1000.

If output rows are INSERTs or APPENDs, not UPDATEs, consider using a native bulk loader. Direct output to a sequential file compatible with the bulk loader, then invoke the bulk loader using an after-job subroutine. The bulk loader for Oracle is SQLLDR.

Reference Lookups

Compare the number of input rows with the number of rows in the referenced table. If the referenced table is smaller than the number of input rows, pre-load the reference table into a hashed file and then reference the hashed file.

Consider moving reference lookups to a join within the input stage. All columns used to join the tables should be indexed to maximize performance.

If the number of rows in a hashed file is small, consider pre-loading the file into memory by checking the pre-load into memory checkbox in the Hashed File stage.

Copyright 2016 Another IT Co