Monday, March 25, 2013

How to schedule a Hive job with Java?

Our project runs on Cassandra, bundled in the DataStax Enterprise platform.
To schedule a Hive job with Java and DataStax Enterprise, I used the scheduler embedded in JBoss 7 (the EJB timer service) and Hive JDBC.

The first step is to include the following libraries in the project:

- ~/dse-3.0/resources/hive/lib/hive-jdbc-0.9.0.1.jar
- ~/dse-3.0/resources/hive/lib/hive-metastore-0.9.0.1.jar
- ~/dse-3.0/resources/hive/lib/hive-service-0.9.0.1.jar
- ~/dse-3.0/resources/hive/lib/libfb303-0.7.0.jar
- ~/dse-3.0/resources/hadoop/hadoop-core-1.0.4.2.jar
- ~/dse-3.0/resources/hive/lib/hive-serde-0.9.0.1.jar
- ~/dse-3.0/resources/hive/lib/commons-logging-1.0.4.jar
- ~/dse-3.0/resources/hive/lib/hive-exec-0.9.0.1.jar
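
A missing jar usually shows up only when the job first runs, so it can help to check early that the driver class is loadable. Here is a minimal sketch of such a check. The class name DriverCheck and the method isOnClasspath are hypothetical names I made up for illustration; only the driver class name comes from the job in this post.

```java
// Sketch: check whether a class (for example the Hive JDBC driver) can be loaded.
// DriverCheck and isOnClasspath are hypothetical names, not part of Hive or DSE.
final class DriverCheck {

    // Driver class name used by the Hive JDBC job in this post.
    static final String HIVE_DRIVER = "org.apache.hadoop.hive.jdbc.HiveDriver";

    /** Returns true if the named class can be loaded, false otherwise. */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or one of its dependencies is missing.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("Hive driver available: " + isOnClasspath(HIVE_DRIVER));
    }
}
```

If this prints false inside the application, one of the jars above is probably not packaged with the deployment.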


Then start the Hive server:
dse-3.0/bin/dse hive --service hiveserver
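
Before the job tries to connect, it can be useful to confirm that the Hive server is actually listening. The following is only a sketch: HiveServerProbe and isListening are hypothetical names, and the host and port (localhost, 10000) are taken from the JDBC URL used in the job in this post, so adjust them if your server listens elsewhere.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Sketch: test whether something accepts TCP connections on a host and port.
// HiveServerProbe is a hypothetical helper, not part of Hive or DSE.
final class HiveServerProbe {

    /** Returns true if a TCP connection to host:port succeeds within the timeout. */
    static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Port 10000 is assumed from the JDBC URL jdbc:hive://localhost:10000/default.
        System.out.println("Hive server reachable: " + isListening("localhost", 10000, 2000));
    }
}
```

This only proves that the port is open, not that Hive answers queries, but it gives a clearer error than a JDBC stack trace when the server was simply not started.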


Then I followed the Hive JDBC example to write my job:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

import javax.ejb.Schedule;
import javax.ejb.Stateless;

import org.apache.log4j.Logger;

@Stateless(name = "AutomaticSchedulerBean")
public class HiveJob {

    private static final Logger mLogger = Logger.getLogger(HiveJob.class);
    private static final String driverName = "org.apache.hadoop.hive.jdbc.HiveDriver";

    // Fires on every minute divisible by 5; the timer is not persisted across restarts.
    @Schedule(dayOfWeek = "*", hour = "*", minute = "*/5", year = "*", persistent = false)
    public void execute() throws SQLException {
        mLogger.info("Start HiveJob");
        try {
            // Load the Hive JDBC driver so that DriverManager can find it.
            Class.forName(driverName);
        } catch (ClassNotFoundException e) {
            // Do not call System.exit() here: it would shut down the whole JBoss server.
            mLogger.error("Hive JDBC driver not found on the classpath", e);
            return;
        }
        Connection con = DriverManager.getConnection("jdbc:hive://localhost:10000/default", "", "");
        try {
            Statement stmt = con.createStatement();
            // regular hive query
            String sql = "select count(*) from universe where status = 'GO'";
            System.out.println("Running: " + sql);
            ResultSet res = stmt.executeQuery(sql);
            while (res.next()) {
                System.out.println(res.getString(1));
            }
        } finally {
            // Always release the connection, even if the query fails.
            con.close();
        }
        mLogger.info("HiveJob executed!");
    }
}

This job runs every 5 minutes.
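
To make "every 5 minutes" concrete: with minute = "*/5" and the default second = "0", the timer fires on clock minutes divisible by 5 (:00, :05, :10, ...), not 5 minutes after deployment. The sketch below computes the next firing time for a given instant. EveryFiveMinutes and nextRun are hypothetical names, time zones are ignored, and this is only an illustration of the schedule, not how JBoss computes it internally.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Sketch: next firing time of a minute="*/5", second="0" schedule.
// EveryFiveMinutes is a hypothetical helper for illustration only.
final class EveryFiveMinutes {

    /** Returns the first time strictly after 'now' whose minute is divisible by 5 and whose second is 0. */
    static LocalDateTime nextRun(LocalDateTime now) {
        // Drop seconds and nanoseconds, then move forward to the next multiple of 5 minutes.
        LocalDateTime base = now.truncatedTo(ChronoUnit.MINUTES);
        int minutesToAdd = 5 - (base.getMinute() % 5);
        return base.plusMinutes(minutesToAdd);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2013, 3, 25, 10, 3, 20);
        System.out.println(nextRun(now)); // prints 2013-03-25T10:05
    }
}
```

So a job deployed at 10:03:20 first runs at 10:05:00, then at 10:10:00, and so on.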

Enjoy :)
