Wednesday, 5 December 2012

Talend downloading data from a big table into multiple flat files and reloading it back to DB using tFile





Problem: Downloading data from a big table into multiple flat files and reloading back to DB.

Solution: Several approaches are possible. I used the following components:
One DB input component, tJDBCInput, to read from the source table.
tFileOutputDelimited to write to file, with the option to split the output into multiple files, each containing a fixed number of rows (x).
tFileInputDelimited and tJDBCOutput to then read these flat files and insert the data back into the DB.

Talend can generate multiple files from the same tFileOutputDelimited component by appending 0, 1, ... to the file name. For example, if your file name is input_file.csv, Talend will generate the output files input_file0.csv, input_file1.csv and so on. This happens when you check the split into several files option.
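To make the naming scheme concrete, here is a minimal Java sketch of the same idea: rows are written in chunks, and the chunk index is inserted before the file extension. This is only an illustration and not the code Talend generates; the class SplitFileSketch, its methods and the sample rows are hypothetical names chosen for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of Talend: mimics the split-file naming described above.
class SplitFileSketch {

    // Inserts the index before the extension: "input_file.csv", 0 -> "input_file0.csv".
    static String splitFileName(String fileName, int index) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0) {
            return fileName + index; // no extension: just append the index
        }
        return fileName.substring(0, dot) + index + fileName.substring(dot);
    }

    // Writes the rows into consecutive files holding at most rowsPerFile rows each.
    static List<Path> writeSplit(List<String> rows, Path dir, String fileName, int rowsPerFile) {
        if (rowsPerFile <= 0) {
            throw new IllegalArgumentException("rowsPerFile must be positive");
        }
        List<Path> written = new ArrayList<>();
        try {
            int fileIndex = 0;
            for (int start = 0; start < rows.size(); start += rowsPerFile) {
                int end = Math.min(start + rowsPerFile, rows.size());
                Path target = dir.resolve(splitFileName(fileName, fileIndex++));
                Files.write(target, rows.subList(start, end), StandardCharsets.UTF_8);
                written.add(target);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("split_demo");
        List<String> rows = Arrays.asList("1;a", "2;b", "3;c", "4;d", "5;e");
        // Five rows with two rows per file give three files.
        for (Path p : writeSplit(rows, dir, "input_file.csv", 2)) {
            System.out.println(p.getFileName());
        }
    }
}
```

With five rows and two rows per file, the sketch produces input_file0.csv, input_file1.csv and input_file2.csv, the last one holding the single remaining row.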

When loading these files back into the DB, we need to loop over them in a directory using a wildcard search. For that I used tFileList with a file mask to get the list of files whose names start with a particular prefix.
Other important settings:
Case sensitive (for file names) - YES/NO
Filemask - input_file* - to get the list of all files whose names start with input_file.
File name - for the tFileInputDelimited component - this should come from the tFileList variable tFileList*_CURRENT_FILEPATH, where * stands for the component's number, for example ((String)globalMap.get("tFileList_1_CURRENT_FILEPATH")) if the component is named tFileList_1.
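Roughly, the tFileList loop behaves like the plain Java sketch below: list a directory, keep the files whose names match the mask, and hand each path in turn to the reader. This is an approximation for illustration, not Talend's own implementation; the class FileMaskSketch and its methods are hypothetical, and only the wildcards * and ? are handled.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Talend: approximates a file-mask loop over a directory.
class FileMaskSketch {

    // Translates a simple mask ('*' = any characters, '?' = one character) into a regex.
    static Pattern maskToPattern(String mask, boolean caseSensitive) {
        StringBuilder regex = new StringBuilder();
        for (char c : mask.toCharArray()) {
            if (c == '*') {
                regex.append(".*");
            } else if (c == '?') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return Pattern.compile(regex.toString(), caseSensitive ? 0 : Pattern.CASE_INSENSITIVE);
    }

    // True if the whole file name matches the mask.
    static boolean matches(String fileName, String mask, boolean caseSensitive) {
        return maskToPattern(mask, caseSensitive).matcher(fileName).matches();
    }

    // Returns the regular files in dir whose names match the mask, sorted by path.
    static List<Path> listFiles(Path dir, String mask, boolean caseSensitive) {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                if (Files.isRegularFile(p) && matches(p.getFileName().toString(), mask, caseSensitive)) {
                    result.add(p);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        // Each iteration plays the role of one CURRENT_FILEPATH value.
        for (Path current : listFiles(dir, "input_file*", true)) {
            System.out.println(current.toAbsolutePath());
        }
    }
}
```

Note that with the case sensitive setting on, a file called INPUT_FILE0.csv would not match the mask input_file*, which is why that setting matters on file systems with mixed-case names.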

Part of the job for reading data from the DB and writing to flat files

 
Complete job for inserting data back to DB
tFileList settings


 

tFileInputDelimited settings

