SSIS 2008 Flat File Source Issue
Ok, this might be my last post for this year, 2009. This is more of a note to myself than a blog post. Anyway, I want to document the issue that I have encountered and the solution that I have found through the help of Todd McDermid (Blog | Twitter) and Andy Leonard (Blog | Twitter), so that it might help those who might be in the same situation.
A portion of a DTS package that I migrated to SSIS 2008 has some discrepancy. I was not getting the exact count of rows from my Flat File Source (Tab Delimited Text File in this case). I checked the DTS, and the Row Count pumped into the SQL Server 2000 matched that of the text file source. The Row Count pumped into the SQL Server 2008 from SSIS 2008 is short of about 10%. I could have fixed this by “correcting” the source text file but that wouldn’t work. I was tasked to bring solution to the current process. The process itself is to remain intact.
What was the issue?
Some records from the source text file do not have all the required columns. In DTS, this is fine. DTS knows how to deal with that by assumption. DTS can simply handle it. It is not the case with SSIS 2008 (and SSIS 2005, I guess). What is going on in SSIS 2008 in the situation just described is, and I am quoting Todd McDermid in this SSIS Forum (similar case was also documented in this Forum, also see this for other links to blogs discussing the same issue):
The Flat File Source in SSIS does not respect the “end-of-line” delimiter to indicate that the line is complete. It reads in data until it sees a column delimiter, then proceeds to fill the next column. It will continue until it has read in enough data (and column delimiters) to fill all of the columns of a row – regardless of whether it encountered an end-of-line delimiter in the data. Once it has filled all of the columns that it expects, I think it then searches for the end-of-line.
That is exactly what was happening in my case. I tried to isolate the rows that did not make to the target table and I could see “combined columns” in a field. SSIS won’t give you errror message on this. Everything will run successfully. That’s makes it harder to investigate the issue. Thank God for Twitter.
What is the Solution?
Simply put, I needed a better Data Flow Source Component. I cannot run the same old DTS from within SSIS. That would only defeat the purpose of migrating the old system (DTS & SQL Server 2000) into the new one (SSIS & SQL Server 2008).