ETL tools: What you do and don't want

Ask the Expert

ETL tools: What you do and don't want

What should I look for in extract, transform and load (ETL) tools? What should I avoid?
Ahh. ETL tools. Can't you just do it all with COBOL? (Insert laugh track here.)

Seriously, there are some core functions that most ETL tool vendors offer today. These include:

  • The ability to deal with multiple input data formats (such as flat files, SQL databases, SAS data sets, etc.)
  • Support of a GUI interface that allows the "drag and drop" of data elements from source to destination
  • Built-in data profiling and analysis functionality allowing the examination of source data
  • Built-in data transformation functions (such as data type conversion, date reformatting, etc.)
  • Support for data quality/cleansing functionality (either built-in, or via interfaces to popular data quality tools like Group 1, DataFlux, or Business Objects)
  • Metadata support, allowing for the capture and identification of data transformation and business rules
  • Job scheduling functionality
  • Version control of the ETL maps
  • The ability to manipulate data either inside or outside the target database (aka: ETL versus ELT)

What you should avoid

My ETL guys suggest staying away from code-generating ETL tools, which are difficult to troubleshoot and de-bug. They also recommend steering clear of ETL tools that don't provide error tracking and logging functionality.

And a final thought: ETL isn't just for breakfast anymore. What I mean by this is that ETL isn't exclusive to data warehouse projects, but can solve a host of data movement and transformation problems. Best of luck with your tool selection!

For more information on ETL tools

 

This was first published in August 2010