
Apache Spark - Docs refactoring #2789

Merged: 12 commits merged into ClickHouse:main on Nov 21, 2024

Conversation

BentsiLeviav (Contributor):

As part of our effort to improve Spark's documentation, this PR includes:

  • Split the big file we currently have into two separate documentation pages.
  • Add code examples in Java, Scala, PySpark, and Spark SQL to the native connector page.
  • Reorganize the native connector doc page.

@BentsiLeviav BentsiLeviav requested a review from a team as a code owner November 13, 2024 17:06
@BentsiLeviav BentsiLeviav requested review from kitop and a team and removed request for a team November 13, 2024 17:06
@mshustov mshustov requested review from mzitnik and laeg and removed request for kitop and a team November 14, 2024 13:53
import TOCInline from '@theme/TOCInline';

# Spark JDBC
One of the most commonly used data sources supported by Spark is JDBC.
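As a minimal, illustrative sketch (not part of this PR) of what such a JDBC read can look like in Scala: it assumes the clickhouse-jdbc driver is on the classpath, and the URL, table name, and credentials below are placeholders, not recommendations.

```scala
import org.apache.spark.sql.SparkSession

object ClickHouseJdbcReadSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("clickhouse-jdbc-read")
      .master("local[*]")
      .getOrCreate()

    // Read a ClickHouse table through Spark's generic JDBC data source.
    // Connection details are placeholders for a locally running ClickHouse.
    val df = spark.read
      .format("jdbc")
      .option("driver", "com.clickhouse.jdbc.ClickHouseDriver")
      .option("url", "jdbc:clickhouse://localhost:8123/default")
      .option("dbtable", "my_table")
      .option("user", "default")
      .option("password", "")
      .load()

    df.show()
    spark.stop()
  }
}
```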
mshustov (Member):

Do we need to provide any specific recommendations for the JDBC driver version?

BentsiLeviav (Contributor, Author):

Do you have a specific version in mind?
@mzitnik, is there a specific version we recommend?

mzitnik (Contributor):

@mshustov, did you mean which version of the ClickHouse JDBC driver we should recommend?

mshustov (Member), Nov 20, 2024:

> did you mean which version of the ClickHouse JDBC driver we should recommend?

Yes.

The above examples demonstrate SparkSQL queries, which you can run within your application using any API—Java, Scala, PySpark, or shell.
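As an illustration of that point (a sketch only, with a placeholder table identifier), the same statements can be submitted programmatically through `spark.sql`:

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("spark-sql-from-code")
  .master("local[*]")
  .getOrCreate()

// Any Spark SQL statement shown above can be passed to spark.sql(...).
// "my_table" is a placeholder for whatever table the surrounding examples define.
val result = spark.sql("SELECT * FROM my_table LIMIT 10")
result.show()
```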


## Supported Data Types
Member:

Could you extend the docs with the following sections:

:::important
It's essential to include the [clickhouse-jdbc JAR](https://mvnrepository.com/artifact/com.clickhouse/clickhouse-jdbc) with the "all" classifier,
as the connector relies on [clickhouse-http](https://mvnrepository.com/artifact/com.clickhouse/clickhouse-http-client) and [clickhouse-client](https://mvnrepository.com/artifact/com.clickhouse/clickhouse-client) —both of which are bundled in clickhouse-jdbc:all.
Alternatively, you can add the [clickhouse-client JAR](https://mvnrepository.com/artifact/com.clickhouse/clickhouse-client) and [clickhouse-http](https://mvnrepository.com/artifact/com.clickhouse/clickhouse-http-client) individually if you prefer not to use the full JDBC package.
:::
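For instance, in an sbt build the dependency with the "all" classifier might be declared as sketched below; the version string is a placeholder, so check Maven Central for the current release:

```scala
// build.sbt sketch: pull in clickhouse-jdbc with the "all" classifier.
// "<version>" is a placeholder, not a recommended version.
libraryDependencies += "com.clickhouse" % "clickhouse-jdbc" % "<version>" classifier "all"
```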
Contributor:

IMO, giving too many alternatives can cause confusion.

@BentsiLeviav merged commit c2dd4a0 into ClickHouse:main on Nov 21, 2024
2 of 3 checks passed