Amazon EMRのJDBCメタデータとサポート状況を確認した

この記事は公開されてから1年以上経過しています。情報が古い可能性がありますので、ご注意ください。

EMRにJDBC接続

Amazon EMR4.0.0には、Hiveサーバが起動しています。そして、このHiveサーバ接続用にJDBCドライバが用意されています。そこで、JDBC接続をしてメタデータとサポート状況を確認したいと思います。

EMRのセットアップ

EMR4.0.0を選択してクラスターを起動します。起動が完了しましたら、マスターノードのセキュリティグループで10000番ポートを開放してください。基本的にはこれだけでOKです。

JDBCドライバのダウンロード

EMR用にAWSからJDBCドライバが提供されていますのでダウンロードします。後でコンパイルするときにクラスパスを通して置いてください。

Hive 1.0 JDBC drivers

JDBCドライバURL

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class HiveJDBC {

	public static void main(String[] args) throws Exception {
		Class.forName("com.amazon.hive.jdbc4.HS2Driver").newInstance();
		String url = "jdbc:hive2://XXX.XXX.XXX.XXX:10000/default";
		Connection conn = DriverManager.getConnection(url);

		Statement stmt = conn.createStatement();
		stmt.execute("drop table foo");
		stmt.execute("create table foo (key int, value string)");
		String sql = "show tables";
		ResultSet res = stmt.executeQuery(sql);
		while (res.next()) {
			System.out.println(res.getString(1));
		}
	}
}

以下結果です。

foo

Hiveのメタデータ

Hiveが対応するSQL等のメタデータを収集します。

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;

public class HiveMetaData {

	public static void main(String[] args) throws Exception {
		Class.forName("com.amazon.hive.jdbc4.HS2Driver").newInstance();
		String url = "jdbc:hive2://XXX.XXX.XXX.XXX:10000/default";
		Connection conn = DriverManager.getConnection(url);
		DatabaseMetaData dbmd = conn.getMetaData();

		System.out.println("※※※※※ METADATA ※※※※※");
		System.out.println("getCatalogSeparator : " + dbmd.getCatalogSeparator());
		System.out.println("getCatalogTerm : " + dbmd.getCatalogTerm());
		System.out.println("getDatabaseProductName : " + dbmd.getDatabaseProductName());
		System.out.println("getDatabaseProductVersion : " + dbmd.getDatabaseProductVersion());
		System.out.println("getDatabaseMajorVersion : " + dbmd.getDatabaseMajorVersion());
		System.out.println("getDatabaseMinorVersion : " + dbmd.getDatabaseMinorVersion());
		System.out.println("getDefaultTransactionIsolation : " + dbmd.getDefaultTransactionIsolation());
		System.out.println("getDriverMajorVersion : " + dbmd.getDriverMajorVersion());
		System.out.println("getDriverMinorVersion : " + dbmd.getDriverMinorVersion());
		System.out.println("getDriverName : " + dbmd.getDriverName());
		System.out.println("getDriverVersion : " + dbmd.getDriverVersion());
		System.out.println("getExtraNameCharacters : " + dbmd.getExtraNameCharacters());
		System.out.println("getIdentifierQuoteString : " + dbmd.getIdentifierQuoteString());
		System.out.println("getJDBCMajorVersion : " + dbmd.getJDBCMajorVersion());
		System.out.println("getJDBCMinorVersion : " + dbmd.getJDBCMinorVersion());
		System.out.println("getMaxBinaryLiteralLength : " + dbmd.getMaxBinaryLiteralLength());
		System.out.println("getMaxCatalogNameLength : " + dbmd.getMaxCatalogNameLength());
		System.out.println("getMaxCharLiteralLength : " + dbmd.getMaxCharLiteralLength());
		System.out.println("getMaxColumnNameLength : " + dbmd.getMaxColumnNameLength());
		System.out.println("getMaxColumnsInGroupBy : " + dbmd.getMaxColumnsInGroupBy());
		System.out.println("getMaxColumnsInIndex : " + dbmd.getMaxColumnsInIndex());
		System.out.println("getMaxColumnsInOrderBy : " + dbmd.getMaxColumnsInOrderBy());
		System.out.println("getMaxColumnsInSelect : " + dbmd.getMaxColumnsInSelect());
		System.out.println("getMaxColumnsInTable : " + dbmd.getMaxColumnsInTable());
		System.out.println("getMaxConnections : " + dbmd.getMaxConnections());
		System.out.println("getMaxCursorNameLength : " + dbmd.getMaxCursorNameLength());
		System.out.println("getMaxIndexLength : " + dbmd.getMaxIndexLength());
		System.out.println("getMaxProcedureNameLength : " + dbmd.getMaxProcedureNameLength());
		System.out.println("getMaxRowSize : " + dbmd.getMaxRowSize());
		System.out.println("getMaxSchemaNameLength : " + dbmd.getMaxSchemaNameLength());
		System.out.println("getMaxStatementLength : " + dbmd.getMaxStatementLength());
		System.out.println("getMaxStatements : " + dbmd.getMaxStatements());
		System.out.println("getMaxTableNameLength : " + dbmd.getMaxTableNameLength());
		System.out.println("getMaxTablesInSelect : " + dbmd.getMaxTablesInSelect());
		System.out.println("getMaxUserNameLength : " + dbmd.getMaxUserNameLength());
		System.out.println("getNumericFunctions : " + dbmd.getNumericFunctions());
		System.out.println("getProcedureTerm : " + dbmd.getProcedureTerm());
		System.out.println("getResultSetHoldability : " + dbmd.getResultSetHoldability());
		System.out.println("getRowIdLifetime : " + dbmd.getRowIdLifetime());
		System.out.println("getSchemaTerm : " + dbmd.getSchemaTerm());
		System.out.println("getSearchStringEscape : " + dbmd.getSearchStringEscape());
		System.out.println("getSQLKeywords : " + dbmd.getSQLKeywords());
		System.out.println("getSQLStateType : " + dbmd.getSQLStateType());
		System.out.println("getStringFunctions : " + dbmd.getStringFunctions());
		System.out.println("getSystemFunctions : " + dbmd.getSystemFunctions());
		System.out.println("getSchemas : " + dbmd.getSchemas());
		System.out.println("getTimeDateFunctions : " + dbmd.getTimeDateFunctions());
		System.out.println("getURL : " + dbmd.getURL());
		System.out.println("getUserName : " + dbmd.getUserName());
	}
}

以下結果です。

※※※※※ METADATA ※※※※※
getCatalogSeparator : .
getCatalogTerm : catalog
getDatabaseProductName : Apache Hive
getDatabaseProductVersion : 1.0.0-amzn-0
getDatabaseMajorVersion : 1
getDatabaseMinorVersion : 0
getDefaultTransactionIsolation : 1
getDriverMajorVersion : 1
getDriverMinorVersion : 0
getDriverName : HiveJDBC
getDriverVersion : 01.00.00.1000
getExtraNameCharacters : 
getIdentifierQuoteString : `
getJDBCMajorVersion : 4
getJDBCMinorVersion : 0
getMaxBinaryLiteralLength : 0
getMaxCatalogNameLength : 128
getMaxCharLiteralLength : 0
getMaxColumnNameLength : 128
getMaxColumnsInGroupBy : 0
getMaxColumnsInIndex : 0
getMaxColumnsInOrderBy : 0
getMaxColumnsInSelect : 0
getMaxColumnsInTable : 0
getMaxConnections : 0
getMaxCursorNameLength : 0
getMaxIndexLength : 0
getMaxProcedureNameLength : 0
getMaxRowSize : 0
getMaxSchemaNameLength : 128
getMaxStatementLength : 0
getMaxStatements : 0
getMaxTableNameLength : 128
getMaxTablesInSelect : 0
getMaxUserNameLength : 0
getNumericFunctions : ABS,ACOS,ASIN,ATAN,ATAN2,CEILING,COS,COT,DEGREES,EXP,FLOOR,LOG,LOG10,MOD,PI,POWER,RADIANS,RAND,ROUND,SIGN,SIN,SQRT,TAN,TRUNCATE
getProcedureTerm : procedure
getResultSetHoldability : 2
getRowIdLifetime : ROWID_UNSUPPORTED
getSchemaTerm : schema
getSearchStringEscape : \
getSQLKeywords : 
getSQLStateType : 2
getStringFunctions : ASCII,CHAR,CHAR_LENGTH,CHARACTER_LENGTH,CONCAT,INSERT,LCASE,LEFT,LENGTH,LOCATE,LOCATE2,LTRIM,OCTET_LENGTH,POSITION,REPEAT,REPLACE,RIGHT,RTRIM,SOUNDEX,SPACE,SUBSTRING,UCASE
getSystemFunctions : DATABASE,IFNULL,USER
getSchemas : com.amazon.hive.jdbc.jdbc4.S4MetaDataProxy@222370ba
getTimeDateFunctions : CURDATE,CURRENT_DATE,CURRENT_TIME,CURRENT_TIMESTAMP,CURTIME,DAYNAME,DAYOFMONTH,DAYOFWEEK,DAYOFYEAR,HOUR,MINUTE,MONTH,MONTHNAME,NOW,QUARTER,SECOND,TIMESTAMPADD,TIMESTAMPDIFF,WEEK,YEAR
getURL : jdbc:hive2://XXX.XXX.XXX.XXX:10000/default
getUserName : User

Hiveのサポート状況

HiveがサポートするトランザクションやJOIN、UNIONなどについて対応可否について収集します。

import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.ResultSet;

public class HiveSupports {

	public static void main(String[] args) throws Exception {
		Class.forName("com.amazon.hive.jdbc4.HS2Driver").newInstance();
		String url = "jdbc:hive2://XXX.XXX.XXX.XXX:10000/default";
		Connection conn = DriverManager.getConnection(url);

		DatabaseMetaData dbmd = conn.getMetaData();
		System.out.println("※※※※※ SUPPORTS ※※※※※");
		System.out.println("supportsAlterTableWithAddColumn : " + dbmd.supportsAlterTableWithAddColumn());
		System.out.println("supportsAlterTableWithDropColumn : " + dbmd.supportsAlterTableWithDropColumn());
		System.out.println("supportsANSI92EntryLevelSQL : " + dbmd.supportsANSI92EntryLevelSQL());
		System.out.println("supportsANSI92FullSQL : " + dbmd.supportsANSI92FullSQL());
		System.out.println("supportsANSI92IntermediateSQL : " + dbmd.supportsANSI92IntermediateSQL());
		System.out.println("supportsBatchUpdates : " + dbmd.supportsBatchUpdates());
		System.out.println("supportsCatalogsInDataManipulation : " + dbmd.supportsCatalogsInDataManipulation());
		System.out.println("supportsCatalogsInIndexDefinitions : " + dbmd.supportsCatalogsInIndexDefinitions());
		System.out.println("supportsCatalogsInPrivilegeDefinitions : " + dbmd.supportsCatalogsInPrivilegeDefinitions());
		System.out.println("supportsCatalogsInProcedureCalls : " + dbmd.supportsCatalogsInProcedureCalls());
		System.out.println("supportsCatalogsInTableDefinitions : " + dbmd.supportsCatalogsInTableDefinitions());
		System.out.println("supportsColumnAliasing : " + dbmd.supportsColumnAliasing());
		System.out.println("supportsConvert : " + dbmd.supportsConvert());
		System.out.println("supportsCoreSQLGrammar : " + dbmd.supportsCoreSQLGrammar());
		System.out.println("supportsCorrelatedSubqueries : " + dbmd.supportsCorrelatedSubqueries());
		System.out.println("supportsDataDefinitionAndDataManipulationTransactions : " + dbmd.supportsDataDefinitionAndDataManipulationTransactions());
		System.out.println("supportsDataManipulationTransactionsOnly : " + dbmd.supportsDataManipulationTransactionsOnly());
		System.out.println("supportsDifferentTableCorrelationNames : " + dbmd.supportsDifferentTableCorrelationNames());
		System.out.println("supportsExpressionsInOrderBy : " + dbmd.supportsExpressionsInOrderBy());
		System.out.println("supportsExtendedSQLGrammar : " + dbmd.supportsExtendedSQLGrammar());
		System.out.println("supportsFullOuterJoins : " + dbmd.supportsFullOuterJoins());
		System.out.println("supportsGetGeneratedKeys : " + dbmd.supportsGetGeneratedKeys());
		System.out.println("supportsGroupBy : " + dbmd.supportsGroupBy());
		System.out.println("supportsGroupByBeyondSelect : " + dbmd.supportsGroupByBeyondSelect());
		System.out.println("supportsGroupByUnrelated : " + dbmd.supportsGroupByUnrelated());
		System.out.println("supportsIntegrityEnhancementFacility : " + dbmd.supportsIntegrityEnhancementFacility());
		System.out.println("supportsLikeEscapeClause : " + dbmd.supportsLikeEscapeClause());
		System.out.println("supportsLimitedOuterJoins : " + dbmd.supportsLimitedOuterJoins());
		System.out.println("supportsMinimumSQLGrammar : " + dbmd.supportsMinimumSQLGrammar());
		System.out.println("supportsMixedCaseIdentifiers : " + dbmd.supportsMixedCaseIdentifiers());
		System.out.println("supportsMixedCaseQuotedIdentifiers : " + dbmd.supportsMixedCaseQuotedIdentifiers());
		System.out.println("supportsMultipleOpenResults : " + dbmd.supportsMultipleOpenResults());
		System.out.println("supportsMultipleResultSets : " + dbmd.supportsMultipleResultSets());
		System.out.println("supportsMultipleTransactions : " + dbmd.supportsMultipleTransactions());
		System.out.println("supportsNamedParameters : " + dbmd.supportsNamedParameters());
		System.out.println("supportsNonNullableColumns : " + dbmd.supportsNonNullableColumns());
		System.out.println("supportsOpenCursorsAcrossCommit : " + dbmd.supportsOpenCursorsAcrossCommit());
		System.out.println("supportsOpenCursorsAcrossRollback : " + dbmd.supportsOpenCursorsAcrossRollback());
		System.out.println("supportsOpenStatementsAcrossCommit : " + dbmd.supportsOpenStatementsAcrossCommit());
		System.out.println("supportsOpenStatementsAcrossRollback : " + dbmd.supportsOpenStatementsAcrossRollback());
		System.out.println("supportsOrderByUnrelated : " + dbmd.supportsOrderByUnrelated());
		System.out.println("supportsOuterJoins : " + dbmd.supportsOuterJoins());
		System.out.println("supportsPositionedDelete : " + dbmd.supportsPositionedDelete());
		System.out.println("supportsPositionedUpdate : " + dbmd.supportsPositionedUpdate());
		System.out.println("supportsResultSetHoldability(ResultSet.HOLD_CURSORS_OVER_COMMIT) : " + dbmd.supportsResultSetHoldability(ResultSet.HOLD_CURSORS_OVER_COMMIT));
		System.out.println("supportsResultSetHoldability(ResultSet.CLOSE_CURSORS_AT_COMMIT) : " + dbmd.supportsResultSetHoldability(ResultSet.CLOSE_CURSORS_AT_COMMIT));
		System.out.println("supportsResultSetType(ResultSet.TYPE_FORWARD_ONLY) : " + dbmd.supportsResultSetType(ResultSet.TYPE_FORWARD_ONLY));
		System.out.println("supportsResultSetType(ResultSet.TYPE_SCROLL_INSENSITIVE) : " + dbmd.supportsResultSetType(ResultSet.TYPE_SCROLL_INSENSITIVE));
		System.out.println("supportsResultSetType(ResultSet.TYPE_SCROLL_SENSITIVE) : " + dbmd.supportsResultSetType(ResultSet.TYPE_SCROLL_SENSITIVE));
		System.out.println("supportsResultSetConcurrency(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE) : "
				+ dbmd.supportsResultSetConcurrency(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE));
		System.out.println("supportsResultSetConcurrency(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY) : "
				+ dbmd.supportsResultSetConcurrency(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY));
		System.out.println("supportsSavepoints : " + dbmd.supportsSavepoints());
		System.out.println("supportsSchemasInDataManipulation : " + dbmd.supportsSchemasInDataManipulation());
		System.out.println("supportsSchemasInIndexDefinitions : " + dbmd.supportsSchemasInIndexDefinitions());
		System.out.println("supportsSchemasInPrivilegeDefinitions : " + dbmd.supportsSchemasInPrivilegeDefinitions());
		System.out.println("supportsSchemasInProcedureCalls : " + dbmd.supportsSchemasInProcedureCalls());
		System.out.println("supportsSchemasInTableDefinitions : " + dbmd.supportsSchemasInTableDefinitions());
		System.out.println("supportsSelectForUpdate : " + dbmd.supportsSelectForUpdate());
		System.out.println("supportsStatementPooling : " + dbmd.supportsStatementPooling());
		System.out.println("supportsStoredFunctionsUsingCallSyntax : " + dbmd.supportsStoredFunctionsUsingCallSyntax());
		System.out.println("supportsStoredProcedures : " + dbmd.supportsStoredProcedures());
		System.out.println("supportsSubqueriesInComparisons : " + dbmd.supportsSubqueriesInComparisons());
		System.out.println("supportsSubqueriesInExists : " + dbmd.supportsSubqueriesInExists());
		System.out.println("supportsSubqueriesInIns : " + dbmd.supportsSubqueriesInIns());
		System.out.println("supportsSubqueriesInQuantifieds : " + dbmd.supportsSubqueriesInQuantifieds());
		System.out.println("supportsTableCorrelationNames : " + dbmd.supportsTableCorrelationNames());
		System.out.println("supportsTransactionIsolationLevel(Connection.TRANSACTION_NONE) : " + dbmd.supportsTransactionIsolationLevel(Connection.TRANSACTION_NONE));
		System.out.println("supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_COMMITTED) : " + dbmd.supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_COMMITTED));
		System.out.println("supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_UNCOMMITTED) : " + dbmd.supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_UNCOMMITTED));
		System.out.println("supportsTransactionIsolationLevel(Connection.TRANSACTION_REPEATABLE_READ) : " + dbmd.supportsTransactionIsolationLevel(Connection.TRANSACTION_REPEATABLE_READ));
		System.out.println("supportsTransactionIsolationLevel(Connection.TRANSACTION_SERIALIZABLE) : " + dbmd.supportsTransactionIsolationLevel(Connection.TRANSACTION_SERIALIZABLE));
		System.out.println("supportsTransactions : " + dbmd.supportsTransactions());
		System.out.println("supportsUnion : " + dbmd.supportsUnion());
		System.out.println("supportsUnionAll : " + dbmd.supportsUnionAll());
	}
}

以下結果です。

※※※※※ SUPPORTS ※※※※※
supportsAlterTableWithAddColumn : false
supportsAlterTableWithDropColumn : false
supportsANSI92EntryLevelSQL : true
supportsANSI92FullSQL : false
supportsANSI92IntermediateSQL : false
supportsBatchUpdates : false
supportsCatalogsInDataManipulation : true
supportsCatalogsInIndexDefinitions : true
supportsCatalogsInPrivilegeDefinitions : true
supportsCatalogsInProcedureCalls : true
supportsCatalogsInTableDefinitions : true
supportsColumnAliasing : true
supportsConvert : true
supportsCoreSQLGrammar : true
supportsCorrelatedSubqueries : true
supportsDataDefinitionAndDataManipulationTransactions : false
supportsDataManipulationTransactionsOnly : false
supportsDifferentTableCorrelationNames : false
supportsExpressionsInOrderBy : true
supportsExtendedSQLGrammar : false
supportsFullOuterJoins : true
supportsGetGeneratedKeys : false
supportsGroupBy : true
supportsGroupByBeyondSelect : true
supportsGroupByUnrelated : false
supportsIntegrityEnhancementFacility : false
supportsLikeEscapeClause : true
supportsLimitedOuterJoins : false
supportsMinimumSQLGrammar : true
supportsMixedCaseIdentifiers : false
supportsMixedCaseQuotedIdentifiers : true
supportsMultipleOpenResults : false
supportsMultipleResultSets : false
supportsMultipleTransactions : true
supportsNamedParameters : false
supportsNonNullableColumns : false
supportsOpenCursorsAcrossCommit : false
supportsOpenCursorsAcrossRollback : false
supportsOpenStatementsAcrossCommit : true
supportsOpenStatementsAcrossRollback : true
supportsOrderByUnrelated : false
supportsOuterJoins : false
supportsPositionedDelete : false
supportsPositionedUpdate : false
supportsResultSetHoldability(ResultSet.HOLD_CURSORS_OVER_COMMIT) : false
supportsResultSetHoldability(ResultSet.CLOSE_CURSORS_AT_COMMIT) : true
supportsResultSetType(ResultSet.TYPE_FORWARD_ONLY) : true
supportsResultSetType(ResultSet.TYPE_SCROLL_INSENSITIVE) : false
supportsResultSetType(ResultSet.TYPE_SCROLL_SENSITIVE) : false
supportsResultSetConcurrency(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_UPDATABLE) : false
supportsResultSetConcurrency(ResultSet.TYPE_FORWARD_ONLY, ResultSet.CONCUR_READ_ONLY) : true
supportsSavepoints : false
supportsSchemasInDataManipulation : true
supportsSchemasInIndexDefinitions : true
supportsSchemasInPrivilegeDefinitions : true
supportsSchemasInProcedureCalls : false
supportsSchemasInTableDefinitions : true
supportsSelectForUpdate : false
supportsStatementPooling : false
supportsStoredFunctionsUsingCallSyntax : false
supportsStoredProcedures : true
supportsSubqueriesInComparisons : true
supportsSubqueriesInExists : true
supportsSubqueriesInIns : true
supportsSubqueriesInQuantifieds : true
supportsTableCorrelationNames : true
supportsTransactionIsolationLevel(Connection.TRANSACTION_NONE) : false
supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_COMMITTED) : false
supportsTransactionIsolationLevel(Connection.TRANSACTION_READ_UNCOMMITTED) : true
supportsTransactionIsolationLevel(Connection.TRANSACTION_REPEATABLE_READ) : false
supportsTransactionIsolationLevel(Connection.TRANSACTION_SERIALIZABLE) : false
supportsTransactions : false
supportsUnion : true
supportsUnionAll : true

まとめ

EMRにJDBC接続できるのは最近知りました。トランザクションに対応してないことや、使えるSQLが限定的だったりしますが、基本的な操作をすることができますので、TableauなどのBIツールからも簡単に接続できますね。

参考資料

Use the Hive JDBC Driver

HiveClient