Recently I was able to solve a really tricky problem in SSIS using the TF Surrogate key transform. Let’s discuss the business problem:
The client was using some complicated .net code to loop through each line in a file and look for a very specific string of code. Once the string was found they then wanted to return all the values found six rows below that row. This process continues through the file. A picture is worth a thousand words:
So now that we understand the business problem we can solve this with SSIS. The solution is actually quite simple. Essentially we need to get the string we are searching for (5909045743) and the row with the values on the same row so we can let SSIS work its’ magic. This is where the Surrogate Key transform comes in.
To solve this problem I used not one but two Flat File source components in my data flow and I brought the same file in twice. I then used the surrogate key transform to assign row numbers to the rows coming from each flat file source component.
- For the first flat file source I started the row number at 6 and then incremented by one.
- For the second flat file source I started the row number at 1 and incremented by one.
- Setting up the surrogate key transform is extremely easy, 3 easy steps, see below:
Now I use the merge join component in SSIS and join the two data sets on the newly created SK_Columns. This will put our search string on the same row with the values we were looking for.
Below is a picture of the data flow thus far:
And here is a screenshot of the new result set after the Merge Join transform, notice how the two rows are now on the same row. From here we can do some basic SSIS operations to split out all the records we don’t need and then we can use a derived column transform to parse out the values in “Column 2”.
Thanks for looking!