A Global View of Reference Lookups

This document represents work in progress.

Reference lookups are typically considered on a per-job basis. This assumes that good DataStage development practices are already in use: hash files serve the reference lookups or, where possible, joins are performed in the input stage of a job to offload this burden onto the database server.

In most cases, this is satisfactory. However, large or time-critical ETL applications may benefit from revisiting reference lookups with a more global view, one that looks across jobs rather than at each job in isolation. In this global view, four factors should be considered:

  1. Data Source
  2. Number of rows in source
  3. Total number of rows retrieved
  4. Total number of unique rows retrieved
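
To illustrate how factors 3 and 4 differ, the following is a minimal Python sketch that summarises the four factors for one reference source. It is not DataStage code. The function name lookup_profile, the source name CUSTOMER_DIM, and all figures are hypothetical, and the sketch assumes every requested key finds exactly one row in the source.

```python
def lookup_profile(source_name, source_row_count, keys_requested):
    """Summarise the four factors for one reference source.

    keys_requested holds every lookup key requested by any job,
    repeats included. Assumption: each key matches exactly one row.
    """
    return {
        "data_source": source_name,                        # factor 1
        "rows_in_source": source_row_count,                # factor 2
        "total_rows_retrieved": len(keys_requested),       # factor 3
        "unique_rows_retrieved": len(set(keys_requested)), # factor 4
    }


# Hypothetical figures: six lookups that touch only three distinct keys.
keys = [101, 102, 101, 103, 101, 102]
profile = lookup_profile("CUSTOMER_DIM", 1000, keys)
print(profile)
```

In this made-up case the total number of rows retrieved (6) is twice the number of unique rows retrieved (3), and both are small next to the 1000 rows in the source.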

Copyright 2016 Another IT Co