Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Spark SQL based Shark CLI #341

Closed
wants to merge 27 commits into from
Closed

Conversation

liancheng
Copy link
Contributor

Merged PR #337 authored by @chenghao-intel, and then did some refactoring and update, mainly include:

  • Minimized CatalystContext as we don't care about response code for now, should update after SPARK-2106 is addressed.
  • CatalystDriver.destroy should call super.destroy.
  • A bunch of trivial coding style fix (including original Shark code).

Review on Reviewable

chenghao-intel and others added 25 commits June 3, 2014 16:11
Conflicts:
	project/SharkBuild.scala
	src/main/scala/shark/SharkServer2.scala
- Moved `CatalystContext` into the right package folder
- Moved `getResultSetSchema` out of `CatalystContext`
- Made `getResultSetSchema` to return a `Schema` rather than a `TableSchema`
- `CatalystDriver.destroy` should call `super.destroy`
- Coding style issues
@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12211/


import shark.LogHelper

class CatalystContext(sc: SparkContext) extends HiveContext(sc) with LogHelper {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SparkSQLContext?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or JDBCContext?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... I'd rather just call it SharkContext. Catalyst is a query optimization framework, Spark SQL is more than that, but neither of them is a concept parallel to Hive. And this class is actually not very much related to JDBC.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM

override def getSchema: Schema = tableSchema

override def getResults(res: JArrayList[String]): Boolean = {
if(hiveResponse == null) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

space after if

@AmplabJenkins
Copy link

Merged build triggered.

@AmplabJenkins
Copy link

Merged build started.

@AmplabJenkins
Copy link

Merged build finished.

@AmplabJenkins
Copy link

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/Shark-Pull-Request-Builder/12212/

@liancheng
Copy link
Contributor Author

@rxin @chenghao-intel, thanks for your comments :) I've removed the alternative Hive execution mode, and addressed issues you brought up. I cleaned up some code that doesn't follow reasonable coding conventions along the way until I realized that most of it was copied from the original Hive codebase. Maybe I should just left them unchanged...

@liancheng liancheng closed this Feb 2, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants