Talend integration with Hive on hadoop – Part#2 (Read data from Hive)
In my previous example, Talend integration with Hive on hadoop – Part#1, we created the external table customers_ext in my Hive database and loaded data into it.
In this example we will read data from that table. Once the data is in Talend server memory, we can transform or move it as needed using other Talend components.
Pre-requisites –
1) Talend integration with Hive on hadoop – Part#1
The job below uses the tHiveInput component to run the SQL query
"select country,count(1) from arpitdb.customers_ext group by country" against my Hive database,
and the output from tHiveInput is printed using tLogRow.
See the screenshots below for more details.
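For reference, here is a minimal plain-JDBC sketch of roughly what tHiveInput does when it runs this query. The class name HiveGroupByExample, the host name hive-host, port 10000 and the empty credentials are placeholders I have made up, not values taken from the job above; use the connection details configured in your own tHiveInput component. The driver class shown is the older HiveServer1 driver (the same one that appears in the stack trace in the comment below); for HiveServer2 you would use org.apache.hive.jdbc.HiveDriver with a jdbc:hive2:// URL.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveGroupByExample {
    public static void main(String[] args) throws Exception {
        // Register the Hive JDBC driver (HiveServer1 style; for HiveServer2
        // use org.apache.hive.jdbc.HiveDriver and a jdbc:hive2:// URL).
        Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");

        // Placeholder connection details -- replace with the host/port/database
        // configured in the tHiveInput component.
        String url = "jdbc:hive://hive-host:10000/arpitdb";

        Connection conn = DriverManager.getConnection(url, "", "");
        Statement stmt = conn.createStatement();

        // The same query the tHiveInput component runs in the job.
        ResultSet rs = stmt.executeQuery(
            "select country, count(1) from arpitdb.customers_ext group by country");

        // Print each row, roughly what tLogRow does with the incoming flow.
        while (rs.next()) {
            System.out.println(rs.getString(1) + " | " + rs.getLong(2));
        }

        rs.close();
        stmt.close();
        conn.close();
    }
}

In the actual job you do not write this by hand; Talend generates the equivalent Java code from the component settings.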
The output of the job execution is shown below: Hive first runs the SQL, internally launches a map/reduce job, and finally returns the results to Talend.
Starting job job_for_blog at 18:06 30/12/2013.
[statistics] connecting to socket on port 3803
[statistics] connected
.------------+-----------.
| tLogRow_1 |
|=-----------+----------=|
|country |countofrows|
|=-----------+----------=|
|Australia |1034 |
|Canada |1004 |
|Chile |1047 |
|China |1002 |
|France |971 |
|Germany |1004 |
|Japan |989 |
|Russia |1012 |
|South Africa|935 |
|UK |1002 |
|US |10000 |
|country |1 |
'------------+-----------'
[statistics] disconnected
Job job_for_blog ended at 18:07 30/12/2013. [exit code=0]
Hi,
I have done the same as you said above, but I got this error:
Exception in component tHiveRow_1
java.sql.SQLException: org.apache.thrift.TApplicationException: Invalid method name: 'execute'
at org.apache.hadoop.hive.jdbc.HiveStatement.executeQuery(HiveStatement.java:191)
at org.apache.hadoop.hive.jdbc.HiveStatement.execute(HiveStatement.java:127)
at org.apache.hadoop.hive.jdbc.HiveConnection.configureConnection(HiveConnection.java:126)
at org.apache.hadoop.hive.jdbc.HiveConnection.<init>(HiveConnection.java:121)
at org.apache.hadoop.hive.jdbc.HiveDriver.connect(HiveDriver.java:104)
at java.sql.DriverManager.getConnection(DriverManager.java:582)
at java.sql.DriverManager.getConnection(DriverManager.java:185)
at mirth.sample_hive_0_1.Sample_Hive.tHiveRow_1Process(Sample_Hive.java:468)
at mirth.sample_hive_0_1.Sample_Hive.runJobInTOS(Sample_Hive.java:797)
at mirth.sample_hive_0_1.Sample_Hive.main(Sample_Hive.java:663)
Please help me to resolve this issue.
Thanks in advance,
Thirupathi