Talend integration with Hive on Hadoop – Part#2 (Read data from Hive)
In my previous post, Talend integration with Hive on Hadoop – Part#1,
we created the external table customers_ext in the Hive database and loaded data into it.
In this example we will read data from that table; once the data is in the Talend server's memory, we can transform or move it as needed using other Talend components.
Pre-requisites –
1) Talend integration with Hive on Hadoop – Part#1
The job below uses the tHiveInput component to run the following SQL against the Hive database:
"select country,count(1) from arpitdb.customers_ext group by country"
The output flow from tHiveInput is then printed with tLogRow.
See the screenshots below for more details.
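Under the hood a Talend job compiles to Java, and tHiveInput essentially opens a JDBC connection to Hive and executes the configured query. For reference, here is a minimal sketch of roughly the same read done in plain Java against HiveServer2; the class name, host, port, user and password below are assumptions, so substitute your own cluster's values.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveReadSketch {
    public static void main(String[] args) throws Exception {
        // HiveServer2 JDBC driver (hive-jdbc and its dependencies must be on the classpath)
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Connection details are placeholders -- replace with your own host/port/database
        String url = "jdbc:hive2://hadoop-host:10000/arpitdb";

        try (Connection con = DriverManager.getConnection(url, "hiveuser", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "select country, count(1) as countofrows "
                   + "from arpitdb.customers_ext group by country")) {
            // Print each aggregated row -- roughly what tLogRow does with the incoming flow
            while (rs.next()) {
                System.out.println(rs.getString(1) + " | " + rs.getLong(2));
            }
        }
    }
}

In the Talend job itself none of this code is written by hand; tHiveInput generates the equivalent logic from the connection settings and the query entered in its Basic settings.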
The output from executing the job is shown below. Hive first runs the SQL, internally launching a MapReduce job, and finally returns the results to Talend. (The extra "country | 1" row at the bottom is most likely the header line of the source file, which the external table reads as ordinary data.)
Starting job job_for_blog at 18:06 30/12/2013.
[statistics] connecting to socket on port 3803
[statistics] connected
.------------+-----------.
|       tLogRow_1        |
|=-----------+----------=|
|country     |countofrows|
|=-----------+----------=|
|Australia   |1034       |
|Canada      |1004       |
|Chile       |1047       |
|China       |1002       |
|France      |971        |
|Germany     |1004       |
|Japan       |989        |
|Russia      |1012       |
|South Africa|935        |
|UK          |1002       |
|US          |10000      |
|country     |1          |
'------------+-----------'
[statistics] disconnected
Job job_for_blog ended at 18:07 30/12/2013. [exit code=0]