SAS DI Studio (the DI stands for Data
Integration) is the SAS tool that allows for the creation and
maintenance of Extract, Transform and Load (ETL) jobs in a data
warehousing environment. The idea is that companies have lots of
operational data that needs to be manipulated in a batch fashion and
then loaded into a read-only data warehouse repository. Once loaded,
the data can be reported on by a wide variety of visualization
tools. SAS DI Studio allows one to point-and-click through
the process of getting data out of those operational systems and
into the warehouse.
The concept behind DI Studio is solid
– provide the user with a set of canned widgets that can be pieced
together to create a data flow. Those widgets might be things like
SQL Joins, Appends or Lookups. One can also create custom code
modules to implement non-standard transformation logic. For example,
calculate a 20% commission for senior sales people and a 10%
Under the covers, DI Studio is a code
generator. Each time that you deploy one of the transformation
widgets, the (mostly) appropriate SAS code is generated. After you
piece together several of the widgets, you end up with a flow of SAS
code that can be deployed as part of a batch process and run nightly
(or as frequently as needed).
The graphical interface for DI studio
really promotes understanding of how the data travels from start to
finish. It’s easy to see and thus maintain where each piece of code
exists and what might need to be modified if something goes amiss.
Here is a screen shot of the workspace area where you build
your flow. Each orange circle represents a dataset or a table while
each blue square is some type of extract, transformation or loading
The DI studio tool is not without its labor-intensive tasks. Before
you even begin to piece together your transformations you have to
clearly define the source and target data sets for your flow.
Although this can be done through a series of import wizards it can
be a bit tedious and requires you to have thoroughly thought through
your design. You can make changes to your flow after you have it in
place but any substantial shifts like moving a dataset from one join
to another sometimes gets the tool confused and requires close
inspection of each of the transformation widgets to get everything
back the way it should be.
Overall the SAS DI Studio tool is an effective mechanism for building and maintaining your ETL flows. Its positives far out-weigh its negatives and it is an essential ingredient of any SAS Intelligence platform.
Blue Chip Solutions has lots of experience with SAS DI Studio. As a matter of fact, we worked as a subcontractor and did most of the work for this project.
If you would like to use us and gain similar success, check out our consulting page and give us a call. We're the best.