Wednesday 13 November 2013

Talend code review utility





At times you may need to make sure that all Talend jobs developed in your repository conform to some basic standards set by your organization. These can be as simple as guidelines that every developer follows when creating a new Talend job: naming conventions for jobs, naming conventions for components within a job, advanced component properties such as the “use cursor” option for tJDBCInput, commit size, batch size, and so on.
There could be many more such standards.

The problem comes when you have to review a job against these standards. The more components a job uses, the more time-consuming and difficult the review becomes.

To save time and effort, it is better to write a utility that carries out the review checks against Talend jobs. Since a Talend job's entire metadata is available in its *.item, *.properties and *.screenshot files, such a utility is not difficult to build. Of these three files, the *.item file is the most important: it contains XML that describes everything in your Talend job (all components, comments, settings, properties, etc.). All you need to do is read this XML with XPath down to the level at which you want to retrieve information. For example, you can read it at the element parameter level ("/talendfile:ProcessType/node/elementParameter") to fetch the properties of each component, validate each property against your review checks, and build your review report.
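
To make the idea concrete, here is a minimal standalone sketch in Python rather than a Talend job. It assumes that each component appears as a node element carrying a componentName attribute, and that each elementParameter has name and value attributes. The sample XML, the USE_CURSOR parameter name and the REQUIRED rule table are illustrative assumptions, so check them against a real *.item file before relying on them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed stand-in for a *.item file; real files hold far more.
SAMPLE_ITEM = """<talendfile:ProcessType xmlns:talendfile="platform:/resource/org.talend.model/model/TalendFile.xsd">
  <node componentName="tJDBCInput">
    <elementParameter field="TEXT" name="UNIQUE_NAME" value="tJDBCInput_1"/>
    <elementParameter field="CHECK" name="USE_CURSOR" value="false"/>
  </node>
  <node componentName="tLogRow">
    <elementParameter field="TEXT" name="UNIQUE_NAME" value="tLogRow_1"/>
  </node>
</talendfile:ProcessType>"""

# Illustrative review rules: component type -> {parameter name: required value}.
REQUIRED = {"tJDBCInput": {"USE_CURSOR": "true"}}


def read_components(item_xml):
    """Collect the elementParameter name/value pairs of every node (component)."""
    root = ET.fromstring(item_xml)
    components = []
    # The node children carry no namespace prefix, so a plain tag match works here.
    for node in root.findall("node"):
        params = {p.get("name"): p.get("value")
                  for p in node.findall("elementParameter")}
        components.append({"component": node.get("componentName"),
                           "params": params})
    return components


def review(components, required):
    """Return (unique name, parameter, expected, actual) for each failed check."""
    findings = []
    for comp in components:
        rules = required.get(comp["component"], {})
        for name, expected in rules.items():
            actual = comp["params"].get(name)
            if actual != expected:
                findings.append((comp["params"].get("UNIQUE_NAME"),
                                 name, expected, actual))
    return findings


if __name__ == "__main__":
    for finding in review(read_components(SAMPLE_ITEM), REQUIRED):
        print(finding)
```

Keeping the rules in a plain table means that adding a new standard is a one-line change, and the list of findings can be written out directly as the review report.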

What is more interesting is that you can build this utility in Talend Studio itself. You simply loop through all the *.item files in your workspace folder, read each file, parse its XML, fetch the component properties, and perform your checks.
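
The loop itself can be sketched outside Talend Studio as well. The following Python example is only an illustration of the idea, not the Talend job described above: the function names are hypothetical, and it assumes that every component is a node element directly under the root of the *.item file.

```python
from pathlib import Path
import xml.etree.ElementTree as ET


def find_item_files(workspace):
    """Return every *.item file under the workspace folder, in sorted order."""
    return sorted(Path(workspace).rglob("*.item"))


def count_components(item_path):
    """Parse one *.item file and count its node (component) elements."""
    root = ET.parse(str(item_path)).getroot()
    return len(root.findall("node"))


def scan_workspace(workspace):
    """Map each *.item file (path relative to the workspace) to its component count."""
    base = Path(workspace)
    report = {}
    for item_path in find_item_files(base):
        key = item_path.relative_to(base).as_posix()
        try:
            report[key] = count_components(item_path)
        except ET.ParseError:
            # Not well-formed XML: record it so it can be checked by hand.
            report[key] = None
    return report
```

Calling scan_workspace on your workspace folder gives one entry per *.item file. A count of 0 usually means the file is not a job at all, since other repository objects are also stored as *.item files. Inside Talend Studio the same loop would typically be a tFileList component feeding an XML input component such as tFileInputXML.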