[WIP] Add CLI Support for Catalyst #337

Open
wants to merge 21 commits into
base: sparkSql

chenghao-intel
Contributor

  • Support reloading the cached RDD on startup
  • Support switching the CLI between Hive and Catalyst execution modes
$ bin/shark
catalyst> show tables;
Execution Mode: catalyst
OK
shark_test1
shark_test1_cached
Time taken: 0.011 seconds

catalyst> explain select * from shark_test1;
Execution Mode: catalyst
== Logical Plan ==
Project [key#0,val#1]
 MetastoreRelation default, shark_test1, None

== Optimized Logical Plan ==
MetastoreRelation default, shark_test1, None

== Physical Plan ==
HiveTableScan [key#0,val#1], (MetastoreRelation default, shark_test1, None), None
Time taken: 0.172 seconds

catalyst> set shark.exec.mode=hive;
hive> explain select * from shark_test1;
Execution Mode: hive
OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_TABREF (TOK_TABNAME shark_test1))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR TOK_ALLCOLREF))))

STAGE DEPENDENCIES:
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-0
    Fetch Operator
      limit: -1
      Processor Tree:
        TableScan
          alias: shark_test1
          Select Operator
            expressions:
                  expr: key
                  type: int
                  expr: val
                  type: string
            outputColumnNames: _col0, _col1
            ListSink


Time taken: 0.107 seconds
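
The session above suggests that the CLI keeps a current execution mode, changes it when it sees `set shark.exec.mode=...`, and reflects it in the prompt and in the `Execution Mode:` line. The following is a minimal sketch of that behavior, not the PR's actual implementation: `ExecModeSketch`, `ModeSwitchingCli`, `handle`, and `prompt` are hypothetical names, and the default of Catalyst is an assumption based only on the transcript.

```scala
// Hypothetical sketch: none of these names come from the Shark code base.
object ExecModeSketch {
  sealed trait ExecMode { def name: String }
  case object Catalyst extends ExecMode { val name = "catalyst" }
  case object Hive extends ExecMode { val name = "hive" }

  // Matches `set shark.exec.mode=<value>` with an optional trailing semicolon.
  private val SetMode =
    """(?i)\s*set\s+shark\.exec\.mode\s*=\s*(\w+)\s*;?\s*""".r

  /** Holds the current mode and routes each statement accordingly. */
  final class ModeSwitchingCli(initial: ExecMode = Catalyst) {
    private var mode: ExecMode = initial

    /** Prompt text, e.g. "catalyst> " or "hive> ". */
    def prompt: String = s"${mode.name}> "

    /** Returns the lines this sketch would print for one input statement. */
    def handle(line: String): Seq[String] = line match {
      case SetMode(value) =>
        value.toLowerCase match {
          case "hive"     => mode = Hive; Seq.empty
          case "catalyst" => mode = Catalyst; Seq.empty
          case other      => Seq(s"Unknown execution mode: $other")
        }
      case _ =>
        // A real CLI would hand the statement to the selected engine here;
        // the sketch only reports which mode would run it.
        Seq(s"Execution Mode: ${mode.name}")
    }
  }
}
```

Keeping the mode as state inside the CLI object, rather than parsing it per statement, is one simple way to let the prompt and the dispatch stay in sync after a `set` command.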

@chenghao-intel
Contributor Author

@marmbrus Can you review this for me? Sorry, it's a lot of code, but most of it is copied from Shark.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12203/

@chenghao-intel
Contributor Author

I am still finding some jar conflict issues; I will keep updating.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@chenghao-intel
Contributor Author

SharkServer2Suite failed in my local test. It seems to be a namespace conflict involving the rewritten classes CliService.java / HiveServer2.java; I will figure out how to fix that soon.

Also, I removed the cached RDD reload code; it will go into the next PR.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12204/

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12205/

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12206/

@rxin
Member

rxin commented Jun 5, 2014

@chenghao-intel thanks for working on this. I think it is ok to not have the other features for now. We just need a CLI that we can use to query.

@chenghao-intel
Contributor Author

The CLI is ready now, and it passes the unit tests locally (SharkServer2 still doesn't work on my machine). However, Jenkins failed to retrieve the httpclient jar. @rxin, can you also check that locally if possible? I am not sure whether some env setting works only on my machine.

@rxin
Member

rxin commented Jun 6, 2014

Jenkins, retest this please.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12207/

@chenghao-intel
Contributor Author

It still failed to retrieve the httpclient jar.

@rxin
Member

rxin commented Jun 6, 2014

Could it be missing a repository?

@chenghao-intel
Contributor Author

Actually, I've added 3 more repositories.

@rxin
Member

rxin commented Jun 6, 2014

I confirm that I can build this locally.

@pwendell can we clear the .m2 / .ivy2 cache on the Jenkins machine?


import shark.LogHelper

//TODO work around for HiveContext, need to update that in Spark project (sql/hive), not here.
Member

I think at least some of the issues that necessitate this class's existence have been fixed (e.g. EXPLAIN throwing exceptions). I'm fine with leaving these other fixes here for now, but can you file some JIRAs for the ones that aren't fixed in Spark?

@chenghao-intel
Contributor Author

@marmbrus, thanks for the comments. I will update the code accordingly.

The HiveContext issue has been filed at https://issues.apache.org/jira/browse/SPARK-2106.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12208/

@chenghao-intel
Contributor Author

Hi @rxin @pwendell, the build error seems very weird. Is anything wrong with the Jenkins env settings?

@rxin
Member

rxin commented Jun 18, 2014

Hi @chenghao-intel thanks for working on this. Due to the failures in Jenkins, I've asked @liancheng to do more testing and update Shark based on this PR. He has started a new branch based on your work and will post an update soon. He might have some questions for you too.


4 participants