Sunday, August 16, 2015

Hive on Amazon EMR

Hive on Amazon EMR-4.0.0 is pretty handy, as EMR comes with Hadoop 2.6.0, Hive 1.0.0, Pig 0.14.0, and optionally other tools pre-configured.

A JDBC connection to Hive on EMR can be established with:

Connection con = DriverManager.getConnection("jdbc:hive2://ec2XXXX.compute-1.amazonaws.com:10000/default", "hadoop", "");
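For context, here is a minimal sketch of how that call might sit in a complete program. The class name HiveEmrConnect, the buildUrl helper, and the SHOW TABLES query are illustrative assumptions, not part of the original setup. The driver class name is the one that appears in the stack trace further down, and the hive-jdbc jar is assumed to be on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class HiveEmrConnect {

    // Hypothetical helper: builds a HiveServer2 JDBC URL of the same form as above.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:hive2://" + host + ":" + port + "/" + database;
    }

    public static void main(String[] args) {
        // Placeholder host: pass the master node's public DNS name as the first argument.
        String host = args.length > 0 ? args[0] : "ec2XXXX.compute-1.amazonaws.com";
        String url = buildUrl(host, 10000, "default");

        try {
            // Load the Hive JDBC driver explicitly in case it is not auto-registered.
            Class.forName("org.apache.hive.jdbc.HiveDriver");
        } catch (ClassNotFoundException e) {
            System.err.println("hive-jdbc is not on the classpath: " + e.getMessage());
            return;
        }

        // try-with-resources closes the result set, statement, and connection in reverse order.
        try (Connection con = DriverManager.getConnection(url, "hadoop", "");
             Statement stmt = con.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW TABLES")) {
            while (rs.next()) {
                System.out.println(rs.getString(1));
            }
        } catch (SQLException e) {
            System.err.println("Could not query Hive at " + url + ": " + e.getMessage());
        }
    }
}
```

The user name "hadoop" with an empty password matches the call above. Port 10000 must be reachable from the client machine, for example through the cluster's security group or an SSH tunnel.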


The error below is common, and is caused by a version mismatch between the Hive client and the Hive server. Note that the Hive server in EMR-4.0.0 is 1.0.0. Make sure your client libraries are the same version, not a later one (the trace below comes from a hive-jdbc 1.2.1 client), to avoid this error.

17:59:22.298 [main] ERROR edu.emory.bmi.datacafe.hdfs.HiveConnector - SQL Exception in writing to Hive Table: patients
java.sql.SQLException: Could not establish connection to jdbc:hive2://ec2-54-82-17-142.compute-1.amazonaws.com:10000/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:594) ~[hive-jdbc-1.2.1.jar:1.2.1]
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:192) ~[hive-jdbc-1.2.1.jar:1.2.1]
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) ~[hive-jdbc-1.2.1.jar:1.2.1]
    at java.sql.DriverManager.getConnection(DriverManager.java:571) ~[?:1.7.0_40]
    at java.sql.DriverManager.getConnection(DriverManager.java:215) ~[?:1.7.0_40]
    at edu.emory.bmi.datacafe.hdfs.HiveConnector.writeToHive(HiveConnector.java:68) ~[datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.hdfs.HiveConnector.writeToWarehouse(HiveConnector.java:88) [datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.hdfs.HiveConnector.writeDataSourcesToWarehouse(HiveConnector.java:50) [datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.impl.main.Initiator.initiate(Initiator.java:71) [datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.impl.main.Initiator.main(Initiator.java:39) [datacafe-server-1.0-SNAPSHOT.jar:?]
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[libthrift-0.9.2.jar:0.9.2]
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71) ~[libthrift-0.9.2.jar:0.9.2]
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:156) ~[hive-service-1.2.1.jar:1.2.1]
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:143) ~[hive-service-1.2.1.jar:1.2.1]
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:583) ~[hive-jdbc-1.2.1.jar:1.2.1]
    ... 9 more
17:59:22.318 [main] INFO edu.emory.bmi.datacafe.hdfs.HiveConnector - Successfully written the output to the file, slices
17:59:22.331 [main] ERROR edu.emory.bmi.datacafe.hdfs.HiveConnector - SQL Exception in writing to Hive Table: slices
java.sql.SQLException: Could not establish connection to jdbc:hive2://ec2-54-82-17-142.compute-1.amazonaws.com:10000/default: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:594) ~[hive-jdbc-1.2.1.jar:1.2.1]
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:192) ~[hive-jdbc-1.2.1.jar:1.2.1]
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105) ~[hive-jdbc-1.2.1.jar:1.2.1]
    at java.sql.DriverManager.getConnection(DriverManager.java:571) ~[?:1.7.0_40]
    at java.sql.DriverManager.getConnection(DriverManager.java:215) ~[?:1.7.0_40]
    at edu.emory.bmi.datacafe.hdfs.HiveConnector.writeToHive(HiveConnector.java:68) ~[datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.hdfs.HiveConnector.writeToWarehouse(HiveConnector.java:88) [datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.hdfs.HiveConnector.writeDataSourcesToWarehouse(HiveConnector.java:50) [datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.impl.main.Initiator.initiate(Initiator.java:71) [datacafe-server-1.0-SNAPSHOT.jar:?]
    at edu.emory.bmi.datacafe.impl.main.Initiator.main(Initiator.java:39) [datacafe-server-1.0-SNAPSHOT.jar:?]
Caused by: org.apache.thrift.TApplicationException: Required field 'client_protocol' is unset! Struct:TOpenSessionReq(client_protocol:null, configuration:{use:database=default})
    at org.apache.thrift.TApplicationException.read(TApplicationException.java:111) ~[libthrift-0.9.2.jar:0.9.2]
    at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:71) ~[libthrift-0.9.2.jar:0.9.2]
    at org.apache.hive.service.cli.thrift.TCLIService$Client.recv_OpenSession(TCLIService.java:156) ~[hive-service-1.2.1.jar:1.2.1]
    at org.apache.hive.service.cli.thrift.TCLIService$Client.OpenSession(TCLIService.java:143) ~[hive-service-1.2.1.jar:1.2.1]
    at org.apache.hive.jdbc.HiveConnection.openSession(HiveConnection.java:583) ~[hive-jdbc-1.2.1.jar:1.2.1]
    ... 9 more
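
Since the message is buried in a long trace, it may help to detect it in code and log a more direct hint. The following is a rough sketch only: the class name HiveVersionMismatch and the helper isClientProtocolMismatch are made up for illustration, and the check simply looks for the message text seen in the trace above anywhere in the exception's cause chain.

```java
import java.sql.SQLException;

public class HiveVersionMismatch {

    // Message fragment taken from the stack trace above.
    static final String MARKER = "Required field 'client_protocol' is unset";

    // Hypothetical helper: walks the cause chain looking for the marker text.
    // The depth cap guards against unusually long or cyclic cause chains.
    static boolean isClientProtocolMismatch(Throwable t) {
        int depth = 0;
        for (Throwable cur = t; cur != null && depth < 50; cur = cur.getCause(), depth++) {
            String msg = cur.getMessage();
            if (msg != null && msg.contains(MARKER)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated exception with the same shape as the one in the trace.
        SQLException sample = new SQLException(
                "Could not establish connection",
                new RuntimeException(MARKER + "!"));

        if (isClientProtocolMismatch(sample)) {
            System.err.println("Hive client and server versions probably differ; "
                    + "use a hive-jdbc version that matches the server.");
        }
    }
}
```

In a real connector, the same check could go in the catch block around DriverManager.getConnection, so the log states the likely cause next to the raw trace.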
