-
15. Re: Hive integration
jietao Oct 16, 2015 5:54 AM (in response to shawkins)
I have all of these jars. I keep getting errors:
11:45:51,281 INFO [org.jboss.as.server.deployment] (MSC service thread 1-3) JBAS015876: Starte Deployment von "importVDB-vdb.xml" (runtime-name: "importVDB-vdb.xml")
11:45:51,302 INFO [org.teiid.RUNTIME.VDBLifeCycleListener] (MSC service thread 1-2) TEIID40118 VDB importVDB.1 added to the repository - is reloading false
11:45:51,303 INFO [org.teiid.RUNTIME] (MSC service thread 1-2) TEIID50029 VDB importVDB.1 model "importVDBSrcModel" metadata is currently being loaded. Start Time: 16.10.15 11:45
11:45:51,378 INFO [org.jboss.as.server] (management-handler-thread - 11) JBAS015859: "importVDB-vdb.xml" deployed (runtime-name: "importVDB-vdb.xml")
11:45:51,442 INFO [org.teiid.CONNECTOR] (teiid-async-threads - 2) TEIID11002 Failed to report the JDBC driver and connection information
11:45:51,638 WARN [org.teiid.RUNTIME] (teiid-async-threads - 2) TEIID50036 VDB importVDB.1 model "importVDBSrcModel" metadata failed to load. Reason:java.lang.NullPointerException: java.lang.NullPointerException
at org.teiid.translator.hive.HiveMetadataProcessor.getRuntimeType(HiveMetadataProcessor.java:75)
at org.teiid.translator.hive.HiveMetadataProcessor.addTable(HiveMetadataProcessor.java:133)
at org.teiid.translator.hive.HiveMetadataProcessor.getConnectorMetadata(HiveMetadataProcessor.java:59)
at org.teiid.translator.jdbc.JDBCExecutionFactory.getMetadata(JDBCExecutionFactory.java:294)
at org.teiid.translator.jdbc.JDBCExecutionFactory.getMetadata(JDBCExecutionFactory.java:68)
at org.teiid.query.metadata.NativeMetadataRepository.getMetadata(NativeMetadataRepository.java:83) [teiid-engine-8.11.4.jar:8.11.4]
at org.teiid.query.metadata.NativeMetadataRepository.loadMetadata(NativeMetadataRepository.java:60) [teiid-engine-8.11.4.jar:8.11.4]
at org.teiid.query.metadata.ChainingMetadataRepository.loadMetadata(ChainingMetadataRepository.java:55) [teiid-engine-8.11.4.jar:8.11.4]
at org.teiid.jboss.VDBService$6.run(VDBService.java:393) [teiid-jboss-integration-8.11.4.jar:8.11.4]
at org.teiid.jboss.VDBService$7.run(VDBService.java:444) [teiid-jboss-integration-8.11.4.jar:8.11.4]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [rt.jar:1.8.0_60]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [rt.jar:1.8.0_60]
at java.lang.Thread.run(Thread.java:745) [rt.jar:1.8.0_60]
at org.jboss.threads.JBossThread.run(JBossThread.java:122)
This is my module.xml for the hive driver:
<?xml version="1.0" encoding="UTF-8"?>
<module xmlns="urn:jboss:module:1.0" name="org.apache.hadoop.hive12">
<resources>
<resource-root path="commons-logging-1.1.3.jar"/>
<resource-root path="hadoop-common-2.4.0.jar"/>
<resource-root path="hive-exec-0.13.1.jar"/>
<resource-root path="hive-jdbc-0.13.1.jar"/>
<resource-root path="hive-service-0.13.1.jar"/>
<resource-root path="httpclient-4.2.5.jar"/>
<resource-root path="httpcore-4.2.5.jar"/>
<resource-root path="libfb303-0.9.0.jar"/>
<resource-root path="libthrift-0.9.0.jar"/>
<resource-root path="log4j-1.2.16.jar"/>
<resource-root path="slf4j-api-1.7.5.jar"/>
<resource-root path="hive-metastore-0.13.1.jar"/>
</resources>
<dependencies>
<module name="org.slf4j"/>
<module name="org.apache.commons.logging"/>
<module name="javax.api"/>
<module name="javax.resource.api"/>
</dependencies>
</module>
-
16. Re: Hive integration
rareddy Oct 16, 2015 9:06 AM (in response to jietao)
That is a bug. Can you log a JIRA for it?
-
17. Re: Hive integration
shawkins Oct 16, 2015 9:08 AM (in response to jietao)
Although I couldn't find a reference, I believe this has been seen before when using a newer Hive client with an older Hive server. For some reason Hive can then report null type names, which is obviously problematic. Try using the import option importer.useDatabaseMetaData set to true and/or see if you are running mismatched versions.
-
18. Re: Hive integration
jietao Oct 16, 2015 10:49 AM (in response to shawkins)
We looked at the Hive server log and saw that an operation on a partitioned table does not return "OK". This may be the reason that Teiid shows an error. We tried the same operation locally on the server. The operation runs correctly, but the server log shows the same output as when it is run from Teiid.
We created a database with a simple table. This time I connected to Hive successfully.
Thanks a lot.
-
19. Re: Hive integration
jietao Oct 16, 2015 11:34 AM (in response to rareddy)
The following is our server log. See the last several lines, where the error occurs. Can Teiid do something about such issues?
2015-10-16 16:11:20,607 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,607 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1445004680607 end=1445004680607 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,609 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=compile from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,609 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=parse from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,609 INFO [pool-7-thread-10]: parse.ParseDriver (ParseDriver.java:parse(185)) - Parsing command: DESCRIBE event_log
2015-10-16 16:11:20,610 INFO [pool-7-thread-10]: parse.ParseDriver (ParseDriver.java:parse(206)) - Parse Completed
2015-10-16 16:11:20,610 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=parse start=1445004680609 end=1445004680610 duration=1 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,610 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=semanticAnalyze from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,642 INFO [pool-7-thread-10]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(623)) - 78: get_table : db=test tbl=event_log
2015-10-16 16:11:20,642 INFO [pool-7-thread-10]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(305)) - ugi=hive ip=unknown-ip-addr cmd=get_table : db=test tbl=event_log
2015-10-16 16:11:20,660 INFO [pool-7-thread-10]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(623)) - 78: get_table : db=test tbl=event_log
2015-10-16 16:11:20,660 INFO [pool-7-thread-10]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(305)) - ugi=hive ip=unknown-ip-addr cmd=get_table : db=test tbl=event_log
2015-10-16 16:11:20,673 INFO [pool-7-thread-10]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(623)) - 78: get_table : db=test tbl=event_log
2015-10-16 16:11:20,673 INFO [pool-7-thread-10]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(305)) - ugi=hive ip=unknown-ip-addr cmd=get_table : db=test tbl=event_log
2015-10-16 16:11:20,684 INFO [pool-7-thread-10]: parse.DDLSemanticAnalyzer (DDLSemanticAnalyzer.java:analyzeDescribeTable(1984)) - analyzeDescribeTable done
2015-10-16 16:11:20,684 INFO [pool-7-thread-10]: ql.Driver (Driver.java:compile(431)) - Semantic Analysis Completed
2015-10-16 16:11:20,684 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=semanticAnalyze start=1445004680610 end=1445004680684 duration=74 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,685 INFO [pool-7-thread-10]: exec.ListSinkOperator (Operator.java:initialize(337)) - Initializing Self 8076 OP
2015-10-16 16:11:20,685 INFO [pool-7-thread-10]: exec.ListSinkOperator (Operator.java:initializeChildren(410)) - Operator 8076 OP initialized
2015-10-16 16:11:20,685 INFO [pool-7-thread-10]: exec.ListSinkOperator (Operator.java:initialize(385)) - Initialization Done 8076 OP
2015-10-16 16:11:20,685 INFO [pool-7-thread-10]: ql.Driver (Driver.java:getSchema(238)) - Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:col_name, type:string, comment:from deserializer), FieldSchema(name:data_type, type:string, comment:from deserializer), FieldSchema(name:comment, type:string, comment:from deserializer)], properties:null)
2015-10-16 16:11:20,685 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=compile start=1445004680609 end=1445004680685 duration=76 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,686 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=Driver.run from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,686 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=TimeToSubmit from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,686 INFO [pool-6-thread-816]: ql.Driver (Driver.java:checkConcurrency(158)) - Concurrency mode is disabled, not creating a lock manager
2015-10-16 16:11:20,686 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=Driver.execute from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,686 INFO [pool-6-thread-816]: ql.Driver (Driver.java:execute(1192)) - Starting command: DESCRIBE event_log
2015-10-16 16:11:20,687 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=TimeToSubmit start=1445004680686 end=1445004680686 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,687 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=runTasks from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,687 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=task.DDL.Stage-0 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,687 INFO [pool-6-thread-816]: metastore.HiveMetaStore (HiveMetaStore.java:logInfo(623)) - 502: get_table : db=test tbl=event_log
2015-10-16 16:11:20,688 INFO [pool-6-thread-816]: HiveMetaStore.audit (HiveMetaStore.java:logAuditEvent(305)) - ugi=hive ip=unknown-ip-addr cmd=get_table : db=test tbl=event_log
2015-10-16 16:11:20,688 INFO [pool-6-thread-816]: metastore.HiveMetaStore (HiveMetaStore.java:newRawStore(493)) - 502: Opening raw store with implemenation class:org.apache.hadoop.hive.metastore.ObjectStore
2015-10-16 16:11:20,689 INFO [pool-6-thread-816]: metastore.ObjectStore (ObjectStore.java:initialize(246)) - ObjectStore, initialize called
2015-10-16 16:11:20,693 INFO [pool-6-thread-816]: metastore.ObjectStore (ObjectStore.java:setConf(229)) - Initialized ObjectStore
2015-10-16 16:11:20,705 INFO [pool-6-thread-816]: exec.DDLTask (DDLTask.java:describeTable(3334)) - DDLTask: got data for event_log
2015-10-16 16:11:20,706 INFO [pool-6-thread-816]: exec.DDLTask (DDLTask.java:describeTable(3359)) - DDLTask: written data for event_log
2015-10-16 16:11:20,706 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=runTasks start=1445004680687 end=1445004680706 duration=19 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,706 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=Driver.execute start=1445004680686 end=1445004680706 duration=20 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,707 INFO [pool-6-thread-816]: ql.Driver (SessionState.java:printInfo(536)) - OK
2015-10-16 16:11:20,707 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,707 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1445004680707 end=1445004680707 duration=0 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,707 INFO [pool-6-thread-816]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=Driver.run start=1445004680686 end=1445004680707 duration=21 from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,709 INFO [pool-7-thread-10]: mapred.FileInputFormat (FileInputFormat.java:listStatus(247)) - Total input paths to process : 1
2015-10-16 16:11:20,710 INFO [pool-7-thread-10]: lazy.LazyStruct (LazyStruct.java:parse(167)) - Missing fields! Expected 3 fields but only got 1! Ignoring similar problems.
2015-10-16 16:11:20,710 WARN [pool-7-thread-10]: lazy.LazyStruct (LazyStruct.java:parse(160)) - Extra bytes detected at the end of the row! Ignoring similar problems.
2015-10-16 16:11:20,712 INFO [pool-7-thread-10]: exec.ListSinkOperator (Operator.java:close(574)) - 8076 finished. closing...
2015-10-16 16:11:20,712 INFO [pool-7-thread-10]: exec.ListSinkOperator (Operator.java:close(591)) - 8076 Close done
2015-10-16 16:11:20,712 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogBegin(108)) - <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2015-10-16 16:11:20,712 INFO [pool-7-thread-10]: log.PerfLogger (PerfLogger.java:PerfLogEnd(135)) - </PERFLOG method=releaseLocks start=1445004680712 end=1445004680712 duration=0 from=org.apache.hadoop.hive.ql.Driver>
-
20. Re: Hive integration
rareddy Oct 16, 2015 12:50 PM (in response to jietao)
That looks like it is all logging from Hive. You can change the default level of the "ROOT" logger to "WARNING" or "ERROR" in the "logging" subsystem section of the standalone-teiid.xml file. That will keep these messages from being printed to the console or written to the log file.
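As a rough sketch of that suggestion, the fragment below shows what the relevant part of the logging subsystem in standalone-teiid.xml might look like after the change. The handler names CONSOLE and FILE are assumed to match the stock configuration and may differ in your file. The separate logger entry for the org.apache.hadoop.hive category is not part of the suggestion above; it is an assumed alternative that quiets only the Hive classes instead of the whole server.

```xml
<!-- Fragment of the logging subsystem in standalone-teiid.xml (sketch only). -->

<!-- Assumed alternative: raise the level for the Hive client classes only. -->
<logger category="org.apache.hadoop.hive">
    <level name="WARN"/>
</logger>

<!-- Root logger raised from the usual INFO to WARN. -->
<!-- The handler names are assumed defaults; keep whatever your file lists. -->
<root-logger>
    <level name="WARN"/>
    <handlers>
        <handler name="CONSOLE"/>
        <handler name="FILE"/>
    </handlers>
</root-logger>
```

Note that this only reduces log output. It does not affect whether the metadata import itself succeeds.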
-
21. Re: Hive integration
jietao Oct 19, 2015 3:34 AM (in response to rareddy)
Do you mean that I cannot get through the last step, "export DDL", while our Hive keeps showing the above log? Changing the root logger does not remove the error: "importVDBSrcModel" metadata failed to load. Reason:java.lang.NullPointerException
-
22. Re: Hive integration
rareddy Oct 19, 2015 9:19 AM (in response to jietao)
Have you followed Steve's suggestion?
>> Try using the import option importer.useDatabaseMetaData set to true and/or see if you are running mismatched versions.
Please try that and let us know what you find out.
Ramesh..
-
23. Re: Hive integration
jietao Oct 19, 2015 12:03 PM (in response to rareddy)
I tried that, but I still get the error: "importVDBSrcModel" metadata failed to load. Reason:java.lang.NullPointerException. I added this property in the Optional Source Import Properties.
-
24. Re: Hive integration
shawkins Oct 19, 2015 1:09 PM (in response to jietao)
If you are getting the same stack trace, then the property was not set appropriately.
-
25. Re: Hive integration
jietao Oct 20, 2015 2:25 AM (in response to shawkins)
This is the VDB definition created by Teiid Designer:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vdb name="importVDB" version="1">
<description>Importer VDB</description>
<property name="UseConnectorMetadata" value="true" />
<property name="deployment-name" value="importVDB-vdb.xml" />
<model name="importVDBSrcModel" type="PHYSICAL" visible="true">
<property name="trimColumnNames" value="true" />
<property name="importer.useDatabaseMetaData" value="true" />
<source name="importVDBSrcModel" translator-name="hive" connection-jndi-name="java:/HadoopDB" />
</model>
</vdb>
Is the property correct?
-
26. Re: Hive integration
rareddy Oct 20, 2015 8:57 AM (in response to jietao)
It looks like this is not an importer property; it is defined as a translator property instead (maybe we should open a JIRA to switch this around). Anyway, the way to use it is to add a translator override property. The VDB definition will look something like this:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<vdb name="importVDB" version="1">
    <description>Importer VDB</description>
    <property name="UseConnectorMetadata" value="true" />
    <property name="deployment-name" value="importVDB-vdb.xml" />
    <model name="importVDBSrcModel" type="PHYSICAL" visible="true">
        <property name="importer.trimColumnNames" value="true" />
        <source name="importVDBSrcModel" translator-name="hive-override" connection-jndi-name="java:/HadoopDB" />
    </model>
    <translator name="hive-override" type="hive">
        <property name="UseDatabaseMetaData" value="true"/>
    </translator>
</vdb>
Sorry for the misdirection before.
Ramesh..
-
27. Re: Hive integration
jietao Oct 22, 2015 3:06 AM (in response to rareddy)
Thanks again. I use the Teiid Designer "import source model" option, where I cannot find a way to specify the translator property. I tried adding the property in standalone-teiid.xml, but that is not allowed. Where does Teiid Designer store the importvdb.xml? Maybe I can change it there. Are there other options?
-
28. Re: Hive integration
rareddy Oct 22, 2015 10:34 AM (in response to jietao)
Jie,
You are correct; unfortunately, there is no way to define the translator override during the import process. So https://issues.jboss.org/browse/TEIID-3780 needs to be fixed before you can use the Teiid Connection Importer.
One alternative is to take the above XML, save it as a file named "hive-vdb.xml", and deploy it to the server; this is called a Dynamic VDB. The server will then gather the metadata and build the VDB. You can then use either the web console or jboss-cli to retrieve the DDL of the Hive model. Once you have the DDL, import it into Teiid Designer and continue with further modeling.
Using the jboss-cli to get the DDL:
cd <jboss-eap>/bin
./jboss-cli.bat --connect
[standalone@localhost:9999 /] /subsystem=teiid:get-schema(vdb-name=importvdb, vdb-version=1, model-name=importVDBSrcModel)
That should print the DDL if metadata import worked correctly.
Ramesh..
-
29. Re: Hive integration
jietao Oct 23, 2015 3:37 AM (in response to rareddy)
It works! I successfully created the DDL and imported it into Teiid Designer. Thanks.