Talend - How to get Last Modified File from a directory using tFileList
We often need to get the last modified file from a list of files in a directory. The tFileList component can return the files of a directory sorted (ASC/DESC) by modified date, but for now it has no option to restrict the result to the last modified file only.
One possible solution is shown below.
tFileList(sorted DESC by file modified date) ------> tFixedFlowInput (schema - filename, filenumber) ----->tHashOutput
Here, in tFixedFlowInput, filename = (String)globalMap.get("tFileList_1_CURRENT_FILEPATH") (CURRENT_FILEPATH already holds the full path, including the file name, so CURRENT_FILE does not need to be appended)
filenumber = (Integer)globalMap.get("tFileList_1_NB_FILE")
This collects every file in the directory together with its number/rank. Because the list is sorted DESC by modified date, the last modified file gets file number 1, the next one gets 2, and so on.
Then, on SubJobOK of the tFileList above, add a tHashInput that reads from the tHashOutput and keep only the row where filenumber == 1, which is the last modified file.
tHashInput (linked to tHashOutput) ------> tFilterRow (filenumber == 1) ------> tLogRow
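For readers who want to see the same idea outside the Talend designer, here is a minimal plain-Java sketch of the logic: list the files, sort them newest first, number them from 1, and keep the one numbered 1. The class and method names (LatestFileSketch, filesNewestFirst) are made up for illustration and are not part of Talend.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class LatestFileSketch {

    // Mirrors tFileList sorted DESC by modified date: index 0 is the newest file.
    static List<File> filesNewestFirst(File dir) {
        File[] files = dir.listFiles(File::isFile); // null if dir is not a readable directory
        List<File> result = new ArrayList<>();
        if (files != null) {
            result.addAll(Arrays.asList(files));
        }
        result.sort(Comparator.comparingLong(File::lastModified).reversed());
        return result;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        List<File> ranked = filesNewestFirst(dir);
        for (int i = 0; i < ranked.size(); i++) {
            int fileNumber = i + 1;   // plays the role of tFileList_1_NB_FILE
            if (fileNumber == 1) {    // same condition as the tFilterRow
                System.out.println(ranked.get(i).getAbsolutePath());
            }
        }
    }
}
```

Run it with a directory path as the first argument; with no argument it looks at the current directory. If the directory has no files, nothing is printed.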
How can I get the paths of the filtered data in tFileCopy?
Can you explain what you are looking for? Which paths do you need in tFileCopy, and what are you filtering?
I wanted to copy the filtered files to a different location. I finally figured out that we can use row7.file_name. Thanks for the really helpful tutorial.
This way you read all the files. It seems better to me to set a global variable that stays null unless NB_FILE == 1, in which case it holds the file path, and then use a RunIf that fires only when your global variable is not null.
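A rough sketch of that suggestion, written as plain Java so it can run on its own: in a real job the body of rememberLatest would sit in a tJava on the iterate link and globalMap would be the job's own map. The key "latestFilePath" and the names GlobalVarSketch and rememberLatest are made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;

public class GlobalVarSketch {

    // Stores the current path under a custom key only for the first iterated file.
    // With tFileList sorted DESC by modified date, that first file is the newest one.
    static void rememberLatest(Map<String, Object> globalMap) {
        Integer fileNumber = (Integer) globalMap.get("tFileList_1_NB_FILE");
        if (fileNumber != null && fileNumber == 1) {
            // "latestFilePath" is a hypothetical key chosen for this sketch
            globalMap.put("latestFilePath", (String) globalMap.get("tFileList_1_CURRENT_FILEPATH"));
        }
    }

    public static void main(String[] args) {
        Map<String, Object> globalMap = new HashMap<>(); // stand-in for Talend's globalMap
        globalMap.put("tFileList_1_NB_FILE", 1);
        globalMap.put("tFileList_1_CURRENT_FILEPATH", "/data/in/newest.csv");
        rememberLatest(globalMap);
        // A RunIf link would use a condition like this one
        if (globalMap.get("latestFilePath") != null) {
            System.out.println(globalMap.get("latestFilePath"));
        }
    }
}
```

The RunIf condition would then be globalMap.get("latestFilePath") != null. Note that tFileList still iterates over every file; this only avoids storing all of them in a hash.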
How to get Last Modified File from a directory using tFTPGet
How to get Latest File from a directory using tFTPGet