Hive3.x get_table_meta null pointer exception #288

Open
flaming-archer opened this issue Sep 5, 2023 · 6 comments
@flaming-archer
Contributor

Describe the bug
WaggleDanceIntegrationTest.getTableMeta fails most of the time at the assertion assertThat(tableMeta.size(), is(1));

To Reproduce
Run WaggleDanceIntegrationTest.getTableMeta. The test usually fails, although it sometimes passes.

Expected behavior
WaggleDanceIntegrationTest.getTableMeta passes consistently.

Logs
I changed the code in PanopticConcurrentOperationExecutor.getResultFromFuture from
} catch (ExecutionException e) { log.warn(errorMessage, e.getCause().getMessage());
to
} catch (ExecutionException e) { log.warn(errorMessage, e);
so that the full stack trace is logged, and got the log below. It looks like it may be an HMS bug.
2023-09-05T22:43:04,765 WARN com.hotels.bdp.waggledance.mapping.service.PanopticConcurrentOperationExecutor:78 - Got exception fetching get_table_meta: {}
java.util.concurrent.ExecutionException: MetaException(message:java.lang.NullPointerException)
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_301]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_301]
at com.hotels.bdp.waggledance.mapping.service.PanopticConcurrentOperationExecutor.getResultFromFuture(PanopticConcurrentOperationExecutor.java:74) ~[classes/:?]
at com.hotels.bdp.waggledance.mapping.service.PanopticConcurrentOperationExecutor.executeRequests(PanopticConcurrentOperationExecutor.java:63) ~[classes/:?]
at com.hotels.bdp.waggledance.mapping.service.PanopticOperationHandler.getTableMeta(PanopticOperationHandler.java:110) ~[classes/:?]
at com.hotels.bdp.waggledance.mapping.service.impl.PrefixBasedDatabaseMappingService$1.getTableMeta(PrefixBasedDatabaseMappingService.java:359) ~[classes/:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler.get_table_meta_aroundBody830(FederatedHMSHandler.java:2163) ~[classes/:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler$AjcClosure831.run(FederatedHMSHandler.java:1) ~[classes/:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:170) ~[aspectjweaver-1.9.7.jar:?]
at com.hotels.bdp.waggledance.metrics.MonitoredAspect.monitor(MonitoredAspect.java:58) ~[classes/:?]
at com.hotels.bdp.waggledance.metrics.MonitoredAspect.monitor(MonitoredAspect.java:47) ~[classes/:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler.get_table_meta_aroundBody832(FederatedHMSHandler.java:2162) ~[classes/:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler$AjcClosure833.run(FederatedHMSHandler.java:1) ~[classes/:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:170) ~[aspectjweaver-1.9.7.jar:?]
at com.jcabi.aspects.aj.MethodLogger.wrap(MethodLogger.java:218) ~[jcabi-aspects-0.25.1.jar:?]
at com.jcabi.aspects.aj.MethodLogger.ajc$inlineAccessMethod$com_jcabi_aspects_aj_MethodLogger$com_jcabi_aspects_aj_MethodLogger$wrap(MethodLogger.java:1) ~[jcabi-aspects-0.25.1.jar:?]
at com.jcabi.aspects.aj.MethodLogger.wrapMethod(MethodLogger.java:169) ~[jcabi-aspects-0.25.1.jar:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler.get_table_meta(FederatedHMSHandler.java:2162) ~[classes/:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_301]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_301]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_301]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_301]
at com.hotels.bdp.waggledance.server.ExceptionWrappingHMSHandler.invoke(ExceptionWrappingHMSHandler.java:49) ~[classes/:?]
at com.sun.proxy.$Proxy29.get_table_meta(Unknown Source) ~[?:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_301]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_301]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_301]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_301]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) ~[hive-exec-3.1.3.jar:3.1.3]
at com.sun.proxy.$Proxy29.get_table_meta(Unknown Source) ~[?:?]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_meta.getResult(ThriftHiveMetastore.java:15190) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_meta.getResult(ThriftHiveMetastore.java:15174) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-3.1.3.jar:3.1.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_301]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_301]
at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_301]
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: java.lang.NullPointerException
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_meta_result$get_table_meta_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_meta_result$get_table_meta_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_meta_result.read(ThriftHiveMetastore.java) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_meta(ThriftHiveMetastore.java:1973) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_meta(ThriftHiveMetastore.java:1958) ~[hive-exec-3.1.3.jar:3.1.3]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_301]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_301]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_301]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_301]
at com.hotels.bdp.waggledance.client.compatibility.HiveCompatibleThriftHiveMetastoreIfaceFactory$ThriftMetaStoreClientInvocationHandler.invoke(HiveCompatibleThriftHiveMetastoreIfaceFactory.java:43) ~[classes/:?]
at com.sun.proxy.$Proxy150.get_table_meta(Unknown Source) ~[?:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_301]
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[?:1.8.0_301]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_301]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_301]
at com.hotels.bdp.waggledance.client.DefaultMetaStoreClientFactory$ReconnectingMetastoreClientInvocationHandler.doRealCall(DefaultMetaStoreClientFactory.java:105) ~[classes/:?]
at com.hotels.bdp.waggledance.client.DefaultMetaStoreClientFactory$ReconnectingMetastoreClientInvocationHandler.invoke(DefaultMetaStoreClientFactory.java:98) ~[classes/:?]
at com.sun.proxy.$Proxy150.get_table_meta(Unknown Source) ~[?:?]
at com.hotels.bdp.waggledance.mapping.service.requests.GetTableMetaRequest.call(GetTableMetaRequest.java:41) ~[classes/:?]
at com.hotels.bdp.waggledance.mapping.service.requests.GetTableMetaRequest.call(GetTableMetaRequest.java:29) ~[classes/:?]
at java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:266) ~[?:1.8.0_301]
at java.util.concurrent.FutureTask.run(FutureTask.java) ~[?:1.8.0_301]
... 3 more
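
To illustrate why the logging change matters, here is a minimal, self-contained sketch using only the JDK. It is not Waggle Dance code: the class and method names are made up for illustration, and an IllegalStateException stands in for the MetaException. It shows that the cause's message alone says only "java.lang.NullPointerException", while the full exception carries the "Caused by" chain needed to locate the failure.

```java
import java.io.PrintWriter;
import java.io.StringWriter;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical demo class; not part of Waggle Dance.
public class FutureErrorLoggingSketch {

  /** Returns only the cause's message, which is all the original log call printed. */
  static String causeMessageOnly(ExecutionException e) {
    return String.valueOf(e.getCause().getMessage());
  }

  /** Returns the full stack trace text, including the "Caused by" chain. */
  static String fullStackTrace(Throwable t) {
    StringWriter out = new StringWriter();
    t.printStackTrace(new PrintWriter(out, true));
    return out.toString();
  }

  /** Runs a task that always fails and returns the ExecutionException thrown by Future.get(). */
  static ExecutionException runFailingTask() {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    try {
      // Stand-in for the remote call; the message mimics the one seen in the log.
      Callable<String> task = () -> {
        throw new IllegalStateException("java.lang.NullPointerException");
      };
      executor.submit(task).get();
      throw new AssertionError("the task was expected to fail");
    } catch (ExecutionException e) {
      return e;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while waiting for the task", e);
    } finally {
      executor.shutdownNow();
    }
  }

  public static void main(String[] args) {
    ExecutionException e = runFailingTask();
    System.out.println("message only: " + causeMessageOnly(e));
    System.out.println(fullStackTrace(e));
  }
}
```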

The NPE comes from

public void rollbackTransaction() {
  if (this.openTrasactionCalls < 1) {
    this.debugLog("rolling back transaction: no open transactions: " + this.openTrasactionCalls);
  } else {
    this.debugLog("Rollback transaction, isActive: " + this.currentTransaction.isActive());
  }

specifically from the line this.debugLog("Rollback transaction, isActive: " + this.currentTransaction.isActive());

At that point this.currentTransaction is null.
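
As a rough illustration of the failure mode (not Hive's actual code or fix), the sketch below models the log call with a hypothetical Tx class. The unguarded version dereferences a null transaction while building the log message and throws; the guarded version reports the missing transaction instead.

```java
// Hypothetical demo class; Tx is a stand-in for the transaction object in the snippet above.
public class RollbackLogSketch {

  static class Tx {
    private final boolean active;

    Tx(boolean active) {
      this.active = active;
    }

    boolean isActive() {
      return active;
    }
  }

  /** Mirrors the quoted log call: dereferences the transaction without a null check. */
  static String unguardedMessage(Tx currentTransaction) {
    return "Rollback transaction, isActive: " + currentTransaction.isActive();
  }

  /** Null-safe variant: reports the missing transaction instead of throwing. */
  static String guardedMessage(Tx currentTransaction) {
    String state = (currentTransaction == null)
        ? "no current transaction"
        : String.valueOf(currentTransaction.isActive());
    return "Rollback transaction, isActive: " + state;
  }

  public static void main(String[] args) {
    System.out.println(guardedMessage(new Tx(true)));
    System.out.println(guardedMessage(null));
    try {
      unguardedMessage(null);
    } catch (NullPointerException e) {
      System.out.println("unguarded call threw NullPointerException");
    }
  }
}
```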

Versions (please complete the following information):

  • Waggle Dance Version: latest hive-3.x branch
  • Hive Versions: none installed; reproduced only by running the integration test

Additional context
This bug looks serious because get_table_meta is called frequently.

@patduin
Contributor

patduin commented Sep 6, 2023

I just did a clean checkout of branch hive-3.x
ran: mvn clean install -DskipTests
cd waggle-dance-integration-tests
mvn test -Dtest=WaggleDanceIntegrationTest#getTableMeta

and tests succeed.
Running from IDE (eclipse) also succeeds.
Also your PR got tested and the test also was ok: https://github.com/ExpediaGroup/waggle-dance/actions/runs/6086596725 (The deployment fails but that's not relevant).

Not sure what would be different to make it fail.

@flaming-archer
Contributor Author

> I just did a clean checkout of branch hive-3.x
> ran: mvn clean install -DskipTests
> cd waggle-dance-integration-tests
> mvn test -Dtest=WaggleDanceIntegrationTest#getTableMeta
>
> and tests succeed.
> Running from IDE (eclipse) also succeeds.
> Also your PR got tested and the test also was ok: https://github.com/ExpediaGroup/waggle-dance/actions/runs/6086596725 (The deployment fails but that's not relevant).
>
> Not sure what would be different to make it fail.

I tested it again following your steps, and WaggleDanceIntegrationTest#getTableMeta failed again.

Running it from the IDE (IDEA) also failed.

`Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 75.938 sec <<< FAILURE!
getTableMeta(com.hotels.bdp.waggledance.WaggleDanceIntegrationTest) Time elapsed: 75.359 sec <<< FAILURE!
java.lang.AssertionError:
Expected: is <1>
but: was <0>
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:6)
at com.hotels.bdp.waggledance.WaggleDanceIntegrationTest.getTableMeta(WaggleDanceIntegrationTest.java:1118)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
at org.junit.internal.runners.statements.RunBefores.evaluate(RunBefores.java:26)
at org.junit.internal.runners.statements.RunAfters.evaluate(RunAfters.java:27)
at fm.last.commons.test.file.ClassDataFolder$1.evaluate(ClassDataFolder.java:48)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:61)
at org.junit.rules.ExternalResource$1.evaluate(ExternalResource.java:54)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.BlockJUnit4ClassRunner$1.evaluate(BlockJUnit4ClassRunner.java:100)
at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:366)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:103)
at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:63)
at org.junit.runners.ParentRunner$4.run(ParentRunner.java:331)
at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:79)
at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:329)
at org.junit.runners.ParentRunner.access$100(ParentRunner.java:66)
at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:293)
at org.junit.runners.ParentRunner$3.evaluate(ParentRunner.java:306)
at org.junit.runners.ParentRunner.run(ParentRunner.java:413)
at org.apache.maven.surefire.junit4.JUnit4Provider.execute(JUnit4Provider.java:242)
at org.apache.maven.surefire.junit4.JUnit4Provider.executeTestSet(JUnit4Provider.java:137)
at org.apache.maven.surefire.junit4.JUnit4Provider.invoke(JUnit4Provider.java:112)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.maven.surefire.util.ReflectionUtils.invokeMethodWithArray(ReflectionUtils.java:189)
at org.apache.maven.surefire.booter.ProviderFactory$ProviderProxy.invoke(ProviderFactory.java:165)
at org.apache.maven.surefire.booter.ProviderFactory.invokeProvider(ProviderFactory.java:85)
at org.apache.maven.surefire.booter.ForkedBooter.runSuitesInProcess(ForkedBooter.java:115)
at org.apache.maven.surefire.booter.ForkedBooter.main(ForkedBooter.java:75)

2023-09-06T18:01:50,041 INFO com.hotels.bdp.waggledance.server.MetaStoreProxyServer:128 - Shutting down WaggleDance.

Results :

Failed tests: getTableMeta(com.hotels.bdp.waggledance.WaggleDanceIntegrationTest): (..)

Tests run: 1, Failures: 1, Errors: 0, Skipped: 0
`

@flaming-archer
Contributor Author

> I just did a clean checkout of branch hive-3.x
> ran: mvn clean install -DskipTests
> cd waggle-dance-integration-tests
> mvn test -Dtest=WaggleDanceIntegrationTest#getTableMeta
>
> and tests succeed.
> Running from IDE (eclipse) also succeeds.
> Also your PR got tested and the test also was ok: https://github.com/ExpediaGroup/waggle-dance/actions/runs/6086596725 (The deployment fails but that's not relevant).
>
> Not sure what would be different to make it fail.

Could you please merge my PR first? ^-^

@Milkkker

Milkkker commented Sep 7, 2023

In my testing, the getTableMeta() test fails intermittently. To reproduce it, I ran the test body in a loop by modifying the getTableMeta() code as follows:

@Test
  public void retryGetTableMeta() throws Exception {
    boolean testPassed = true;
    int attempt = 1;
    while (testPassed) {
      try {
        System.out.println("Attempt " + attempt);
        runner = WaggleDanceRunner
                .builder(configLocation)
                .databaseResolution(DatabaseResolution.PREFIXED)
                .primary("primary", localServer.getThriftConnectionUri(), READ_ONLY)
                .federate(SECONDARY_METASTORE_NAME, remoteServer.getThriftConnectionUri(), REMOTE_DATABASE)
                .build();
        runWaggleDance(runner);
        HiveMetaStoreClient proxy = getWaggleDanceClient();
        List<TableMeta> tableMeta = proxy
                .getTableMeta("waggle_remote_remote_database", "*", Lists.newArrayList("EXTERNAL_TABLE"));
        assertThat(tableMeta.size(), is(1));
        assertThat(tableMeta.get(0).getDbName(), is("waggle_remote_remote_database"));
        assertThat(tableMeta.get(0).getTableName(), is(REMOTE_TABLE));
        // use wildcards: '.'
        tableMeta = proxy.getTableMeta("waggle_remote.remote_database", "*", Lists.newArrayList("EXTERNAL_TABLE"));
        assertThat(tableMeta.size(), is(1));
        assertThat(tableMeta.get(0).getDbName(), is("waggle_remote_remote_database"));
        assertThat(tableMeta.get(0).getTableName(), is(REMOTE_TABLE));
        tableMeta = proxy.getTableMeta("waggle.remote_remote_database", "*", Lists.newArrayList("EXTERNAL_TABLE"));
        assertThat(tableMeta.size(), is(1));
        assertThat(tableMeta.get(0).getDbName(), is("waggle_remote_remote_database"));
        assertThat(tableMeta.get(0).getTableName(), is(REMOTE_TABLE));
        proxy.close();
        runner.stop();
      } catch (Throwable e) {
        // Test failed, stop the loop
        testPassed = false;
        System.out.println("Test failed on attempt " + attempt);
        e.printStackTrace();
        Assert.fail("Test failed on attempt " + attempt);
      } finally {
        attempt++;
      }
    }
  } 

The error appears after about 15 iterations, with the same error message:

2023-09-07T10:17:09,792 WARN com.hotels.bdp.waggledance.mapping.service.PanopticConcurrentOperationExecutor:79 - Got exception fetching get_table_meta: {}
java.util.concurrent.ExecutionException: MetaException(message:java.lang.NullPointerException)
at java.util.concurrent.FutureTask.report(FutureTask.java:122) ~[?:1.8.0_381]
at java.util.concurrent.FutureTask.get(FutureTask.java:192) ~[?:1.8.0_381]
at com.hotels.bdp.waggledance.mapping.service.PanopticConcurrentOperationExecutor.getResultFromFuture(PanopticConcurrentOperationExecutor.java:74) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.mapping.service.PanopticConcurrentOperationExecutor.executeRequests(PanopticConcurrentOperationExecutor.java:63) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.mapping.service.PanopticOperationHandler.getTableMeta(PanopticOperationHandler.java:110) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.mapping.service.impl.PrefixBasedDatabaseMappingService$1.getTableMeta(PrefixBasedDatabaseMappingService.java:359) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler.get_table_meta_aroundBody830(FederatedHMSHandler.java:2163) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler$AjcClosure831.run(FederatedHMSHandler.java:1) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:170) ~[aspectjweaver-1.9.7.jar:?]
at com.hotels.bdp.waggledance.metrics.MonitoredAspect.monitor(MonitoredAspect.java:58) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.metrics.MonitoredAspect.monitor(MonitoredAspect.java:47) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler.get_table_meta_aroundBody832(FederatedHMSHandler.java:2162) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler$AjcClosure833.run(FederatedHMSHandler.java:1) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at org.aspectj.runtime.reflect.JoinPointImpl.proceed(JoinPointImpl.java:170) ~[aspectjweaver-1.9.7.jar:?]
at com.jcabi.aspects.aj.MethodLogger.wrap(MethodLogger.java:218) ~[jcabi-aspects-0.25.1.jar:?]
at com.jcabi.aspects.aj.MethodLogger.ajc$inlineAccessMethod$com_jcabi_aspects_aj_MethodLogger$com_jcabi_aspects_aj_MethodLogger$wrap(MethodLogger.java:1) ~[jcabi-aspects-0.25.1.jar:?]
at com.jcabi.aspects.aj.MethodLogger.wrapMethod(MethodLogger.java:169) ~[jcabi-aspects-0.25.1.jar:?]
at com.hotels.bdp.waggledance.server.FederatedHMSHandler.get_table_meta(FederatedHMSHandler.java:2162) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_381]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_381]
at com.hotels.bdp.waggledance.server.ExceptionWrappingHMSHandler.invoke(ExceptionWrappingHMSHandler.java:49) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.sun.proxy.$Proxy29.get_table_meta(Unknown Source) ~[?:?]
at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_381]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_381]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:147) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:108) ~[hive-exec-3.1.3.jar:3.1.3]
at com.sun.proxy.$Proxy29.get_table_meta(Unknown Source) ~[?:?]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_meta.getResult(ThriftHiveMetastore.java:15190) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Processor$get_table_meta.getResult(ThriftHiveMetastore.java:15174) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.TSetIpAddressProcessor.process(TSetIpAddressProcessor.java:48) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) ~[hive-exec-3.1.3.jar:3.1.3]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) ~[?:1.8.0_381]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) ~[?:1.8.0_381]
at java.lang.Thread.run(Thread.java:750) ~[?:1.8.0_381]
Caused by: org.apache.hadoop.hive.metastore.api.MetaException: java.lang.NullPointerException
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_meta_result$get_table_meta_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_meta_result$get_table_meta_resultStandardScheme.read(ThriftHiveMetastore.java) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_table_meta_result.read(ThriftHiveMetastore.java) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_table_meta(ThriftHiveMetastore.java:1973) ~[hive-exec-3.1.3.jar:3.1.3]
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_table_meta(ThriftHiveMetastore.java:1958) ~[hive-exec-3.1.3.jar:3.1.3]
at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_381]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_381]
at com.hotels.bdp.waggledance.client.compatibility.HiveCompatibleThriftHiveMetastoreIfaceFactory$ThriftMetaStoreClientInvocationHandler.invoke(HiveCompatibleThriftHiveMetastoreIfaceFactory.java:43) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.sun.proxy.$Proxy150.get_table_meta(Unknown Source) ~[?:?]
at sun.reflect.GeneratedMethodAccessor78.invoke(Unknown Source) ~[?:?]
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[?:1.8.0_381]
at java.lang.reflect.Method.invoke(Method.java:498) ~[?:1.8.0_381]
at com.hotels.bdp.waggledance.client.DefaultMetaStoreClientFactory$ReconnectingMetastoreClientInvocationHandler.doRealCall(DefaultMetaStoreClientFactory.java:105) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.client.DefaultMetaStoreClientFactory$ReconnectingMetastoreClientInvocationHandler.invoke(DefaultMetaStoreClientFactory.java:98) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.sun.proxy.$Proxy150.get_table_meta(Unknown Source) ~[?:?]
at com.hotels.bdp.waggledance.mapping.service.requests.GetTableMetaRequest.call(GetTableMetaRequest.java:42) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at com.hotels.bdp.waggledance.mapping.service.requests.GetTableMetaRequest.call(GetTableMetaRequest.java:30) ~[waggle-dance-core-4.0.0-SNAPSHOT.jar:?]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[?:1.8.0_381]
... 3 more

@patduin
Contributor

patduin commented Sep 7, 2023

Strange, I got to 224 attempts and then stopped the test; all were OK.
I suggest moving the following:

    runner = WaggleDanceRunner
        .builder(configLocation)
        .databaseResolution(DatabaseResolution.PREFIXED)
        .primary("primary", localServer.getThriftConnectionUri(), READ_ONLY)
        .federate(SECONDARY_METASTORE_NAME, remoteServer.getThriftConnectionUri(), REMOTE_DATABASE)
        .build();
    runWaggleDance(runner);
    HiveMetaStoreClient proxy = getWaggleDanceClient();

outside the loop and just repeating the getTableMeta call in a loop.
Maybe there is a timing issue in the runner/proxy setup, so that something has not completed correctly before the first call is made.

Can you reproduce this in a real environment as well, or is this only happening in the IT?
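
The "set up once, repeat only the call" idea can be sketched as a small generic harness. This is only an illustration: firstFailingAttempt is a hypothetical helper, and the stand-in check below replaces the real getTableMeta call and its assertions, which would need the runner and proxy created once before the loop.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical demo class; not part of the Waggle Dance test suite.
public class RepeatCallSketch {

  /**
   * Repeats a check up to maxAttempts times.
   * Returns the 1-based number of the first failing attempt, or -1 if every attempt passed.
   */
  static int firstFailingAttempt(Supplier<Boolean> check, int maxAttempts) {
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      boolean passed;
      try {
        passed = check.get();
      } catch (RuntimeException | AssertionError e) {
        // A thrown exception or failed assertion counts as a failed attempt.
        passed = false;
      }
      if (!passed) {
        return attempt;
      }
    }
    return -1;
  }

  public static void main(String[] args) {
    // Setup (runner, proxy client) would happen once here, outside the loop.
    AtomicInteger calls = new AtomicInteger();
    // Stand-in check that fails on its 15th call.
    int failedAt = firstFailingAttempt(() -> calls.incrementAndGet() != 15, 224);
    System.out.println("first failing attempt: " + failedAt);
  }
}
```

A bounded attempt count keeps the test from running forever when the failure does not reproduce.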

@flaming-archer
Contributor Author

> strange I got to 224 attempts and the stopped the test all were ok.
> I suggest to move the:
>
>     runner = WaggleDanceRunner
>         .builder(configLocation)
>         .databaseResolution(DatabaseResolution.PREFIXED)
>         .primary("primary", localServer.getThriftConnectionUri(), READ_ONLY)
>         .federate(SECONDARY_METASTORE_NAME, remoteServer.getThriftConnectionUri(), REMOTE_DATABASE)
>         .build();
>     runWaggleDance(runner);
>     HiveMetaStoreClient proxy = getWaggleDanceClient();
>
> outside the loop and just repeat the getTableMeta call in a loop.
> Maybe there is some timing issue with the runner/proxy setup that something is not completed correctly before the first call is being made.
>
> Can you reproduce this in a a real environment as well or is this only happening in the IT?

This test case may not be a reliable indicator, because it only checks whether the result size is 1. WD queries both metastores at the same time, so if the error occurs on the one that would have returned no results anyway, the merged result is still 1 and the test passes even though an error happened. Perhaps you can change the log statement so the full stack trace is printed, and check whether any exceptions show up on your side.
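
Here is a small sketch of that effect using only the JDK. It assumes the executor runs one request per metastore, drops the result of any request that fails, and merges the rest; that is my reading of the log and test output, not a copy of PanopticConcurrentOperationExecutor, and all names below are made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical demo class; not part of Waggle Dance.
public class MergedResultsSketch {

  /** A request that returns the given table names (none means an empty result). */
  static Callable<List<String>> returning(String... tables) {
    return () -> Arrays.asList(tables);
  }

  /** A request that always fails, standing in for the metastore that hits the NPE. */
  static Callable<List<String>> failing() {
    return () -> {
      throw new IllegalStateException("java.lang.NullPointerException");
    };
  }

  /** Runs all requests concurrently and merges the results; failed requests are skipped. */
  static List<String> mergeIgnoringFailures(List<Callable<List<String>>> requests) {
    ExecutorService executor = Executors.newFixedThreadPool(Math.max(1, requests.size()));
    List<String> merged = new ArrayList<>();
    try {
      List<Future<List<String>>> futures = new ArrayList<>();
      for (Callable<List<String>> request : requests) {
        futures.add(executor.submit(request));
      }
      for (Future<List<String>> future : futures) {
        try {
          merged.addAll(future.get());
        } catch (ExecutionException e) {
          // A real executor would log a warning here; this sketch just drops the failed result.
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          break;
        }
      }
    } finally {
      executor.shutdownNow();
    }
    return merged;
  }

  public static void main(String[] args) {
    // The failing metastore had nothing to contribute: size is still 1, so the assertion passes.
    System.out.println(mergeIgnoringFailures(Arrays.asList(returning("remote_table"), failing())).size());
    // The failing metastore was the one holding the table: size is 0, so the assertion fails.
    System.out.println(mergeIgnoringFailures(Arrays.asList(failing(), returning())).size());
  }
}
```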

> Maybe there is some timing issue with the runner/proxy setup that something is not completed correctly before the first call is being made.

Yes, I agree with you. Perhaps some code needs to be fixed before the test can run successfully every time.

So far, the feedback we have is that the test environment is behaving relatively normally.
