Saturday 10 December 2011

Talend ETL how to concatenate strings during group by / aggregate operations taggreraterow

Problem - you have multiple rows for same group key but these rows has some string column which you want to get into your processing but want to concatenate this column values for same group key rows into one. example you have data in your table that tells a product is shippable at what locations
ProductName ShipLocation
Prod_01       IND
Prod_01       US
Prod_01      CA
Prod_02    UK
.....

You need a result like
Prod_01 IND;US;CA
Prod_02 UK

Solution - input the rows into tAggregateRow and group by ProductName. In the output Column select shipLocation and in function use LIST and input column position again select shiplocation.
(make sure you increase size of your output shiplocation column in schema)
in the advance settings for tAggregateRow you can use delimeter as ;.
when you take output from this tAggregateRow it returns the output as desired (semicolon separated and concatenated for each group key)

talend - etl - one way to split single row to multiple row and give output in same output group in tmap

Problem - you have following data as given below in a single row and you want to generate two rows out of this..
CountryName1 CountryCode1 CountryName2 CountryCode2
INDIA                    IND                  EGYPT             EG


Solution - though there are many solutions, one quick solution incase you have requirement like above, where you know number of rows you want to split -
Take this row as input to tmap component and in tmap create one output group say out_1.
Now in this out_1 drag and link countryName1 and countryCode1 columns from input.
Now create another output group out_02 in this tmap and when "add a output" dialog comes
select "create join table from" and in the dropdown select out_1 group, so that our output rows from this out_02 group will also go to out_01 group. So our tmap will have only one output group out_01 containing rows from both out_01 and out_02. now in out_02 drag and link countryName2 and CountryCode2 columns.