remove references to Jupyter notebooks within the Druid repo #15143

Merged
merged 4 commits into from
Nov 1, 2023
Changes from 1 commit
2 changes: 0 additions & 2 deletions docs/operations/security-overview.md
@@ -183,8 +183,6 @@ the extension used in the examples above.
* [Kerberos](../development/extensions-core/druid-kerberos.md) for Kerberos authentication.
* [User authentication and authorization](security-user-auth.md) for details about permissions.
* [SQL permissions](security-user-auth.md#sql-permissions) for permissions on SQL system tables.
* [The `druidapi` Python library](../tutorials/tutorial-jupyter-index.md),
provided as part of the Druid tutorials, to set up users and roles for learning how security works.

## Enable authorizers

7 changes: 5 additions & 2 deletions docs/querying/tips-good-queries.md
@@ -23,7 +23,9 @@ sidebar_label: "Tips for writing good queries"
~ under the License.
-->

This topic includes tips and examples that can help you investigate and improve query performance and accuracy using [Apache Druid SQL](./sql.md). Use this topic as a companion to the Jupyter Notebook tutorial [Learn the basics of Druid SQL](https://github.com/apache/druid/blob/master/examples/quickstart/jupyter-notebooks/notebooks/03-query/00-using-sql-with-druidapi.ipynb).
This topic includes tips and examples that can help you investigate and improve query performance and accuracy using [Apache Druid SQL](./sql.md).

For an interactive tutorial on Druid SQL, see [Learn the basics of Druid SQL](https://github.com/implydata/learn-druid/tree/main/notebooks) within the [Learn Druid repo](https://github.com/implydata/learn-druid).

Your ability to effectively query your data depends in large part on the way you've ingested and stored the data in Apache Druid. This document assumes that you've followed the best practices described in [Schema design tips and best practices](../ingestion/schema-design.md#general-tips-and-best-practices) when modeling your data.

@@ -68,7 +70,8 @@ When possible, design your SQL queries in such a way that they match the rules f

Note that TopN queries are approximate in that each data process ranks its top K results and only returns those top K results to the Broker.

You can follow the tutorial [Using TopN approximation in Druid queries](https://github.com/apache/druid/blob/master/examples/quickstart/jupyter-notebooks/notebooks/03-query/02-approxRanking.ipynb) to work through some examples with approximation turned on and off. The tutorial [Get to know Query view](../tutorials/tutorial-sql-query-view.md) demonstrates running aggregate queries in the Druid console.
You can follow the tutorial [Using TopN approximation in Druid queries](https://github.com/implydata/learn-druid/tree/main/notebooks) within the [Learn Druid repo](https://github.com/implydata/learn-druid) to work through some examples with approximation turned on and off.
The tutorial [Get to know Query view](../tutorials/tutorial-sql-query-view.md) demonstrates running aggregate queries in the Druid console.

### Manually tune your queries

252 changes: 0 additions & 252 deletions docs/tutorials/tutorial-jupyter-docker.md

This file was deleted.

51 changes: 4 additions & 47 deletions docs/tutorials/tutorial-jupyter-index.md
@@ -23,53 +23,10 @@ sidebar_label: Jupyter Notebook tutorials
~ under the License.
-->

<!-- tutorial-jupyter-index.md and examples/quickstart/juptyer-notebooks/README.md
share a lot of the same content. If you make a change in one place, update the other
too. -->
You can try out the Druid APIs using interactive Jupyter Notebook tutorials.
These tutorials provide snippets of Python code that you can use to run calls against the Druid API to complete the tutorial.

You can try out the Druid APIs using the Jupyter Notebook-based tutorials. These
tutorials provide snippets of Python code that you can use to run calls against
the Druid API to complete the tutorial.
For ease of use, the tutorials are contained within their own open source [repo](https://github.com/implydata/learn-druid).
See the [notebook index](https://github.com/implydata/learn-druid/tree/main/notebooks) for a list of available tutorials.

## Prerequisites

The simplest way to get started is to use Docker. In this case, you only need to set up Docker Desktop.
For more information, see [Docker for Jupyter Notebook tutorials](tutorial-jupyter-docker.md).

Otherwise, you can install the prerequisites on your own. Here's what you need:

- An available Druid instance.
- Python 3.7 or later
- JupyterLab (recommended) or Jupyter Notebook running on a non-default port.
By default, Druid and Jupyter both try to use port `8888`, so start Jupyter on a different port.
- The `requests` Python package
- The `druidapi` Python package

For setup instructions, see [Tutorial setup without using Docker](tutorial-jupyter-docker.md#tutorial-setup-without-using-docker).
Individual tutorials may require additional Python packages, such as for visualization or streaming ingestion.

## Python API for Druid

The `druidapi` Python package is a REST API for Druid.
One of the notebooks shows how to use the Druid REST API. The others focus on other
topics and use a simple set of Python wrappers around the underlying REST API. The
wrappers reside in the `druidapi` package within the notebooks directory. While the package
can be used in any Python program, the key purpose, at present, is to support these
notebooks. See
[Introduction to the Druid Python API](https://raw.githubusercontent.com/apache/druid/master/examples/quickstart/jupyter-notebooks/notebooks/01-introduction/01-druidapi-package-intro.ipynb)
for an overview of the Python API.

The `druidapi` package is already installed in the custom Jupyter Docker container for Druid tutorials.

## Tutorials

The notebooks are located in the [apache/druid repo](https://github.com/apache/druid/tree/master/examples/quickstart/jupyter-notebooks/). You can either clone the repo or download the notebooks you want individually.

The links that follow are the raw GitHub URLs, so you can use them to download the notebook directly, such as with `wget`, or manually through your web browser. Note that if you save the file from your web browser, make sure to remove the `.txt` extension.

- [Introduction to the Druid REST API](https://raw.githubusercontent.com/apache/druid/master/examples/quickstart/jupyter-notebooks/notebooks/04-api/00-getting-started.ipynb) walks you through some of the
basics related to the Druid REST API and several endpoints.
- [Introduction to the Druid Python API](https://raw.githubusercontent.com/apache/druid/master/examples/quickstart/jupyter-notebooks/notebooks/01-introduction/01-druidapi-package-intro.ipynb) walks you through some of the
basics related to the Druid API using the Python wrapper API.
- [Learn the basics of Druid SQL](https://raw.githubusercontent.com/apache/druid/master/examples/quickstart/jupyter-notebooks/notebooks/03-query/00-using-sql-with-druidapi.ipynb) introduces you to the unique aspects of Druid SQL with the primary focus on the SELECT statement.
- [Ingest and query data from Apache Kafka](https://raw.githubusercontent.com/apache/druid/master/examples/quickstart/jupyter-notebooks/notebooks/02-ingestion/01-streaming-from-kafka.ipynb) walks you through ingesting an event stream from Kafka.
4 changes: 3 additions & 1 deletion docs/tutorials/tutorial-sql-query-view.md
@@ -26,7 +26,7 @@ sidebar_label: Get to know Query view

This tutorial demonstrates some useful features built into Query view in Apache Druid.

Query view lets you run [Druid SQL queries](../querying/sql.md) and [native (JSON-based) queries](../querying/querying.md) against ingested data. Try out the [Introduction to Druid SQL](./tutorial-jupyter-index.md#tutorials) tutorial to learn more about Druid SQL.
Query view lets you run [Druid SQL queries](../querying/sql.md) and [native (JSON-based) queries](../querying/querying.md) against ingested data.

You can use Query view to test and tune queries before you use them in API requests&mdash;for example, to perform [SQL-based ingestion](../api-reference/sql-ingestion-api.md). You can also ingest data directly in Query view.

@@ -193,3 +193,5 @@ For more information on ingestion and querying data, see the following topics:
- [Ingestion](../ingestion/index.md) for an overview of ingestion and the ingestion methods available in Druid.
- [SQL-based ingestion](../multi-stage-query/index.md) for an overview of SQL-based ingestion.
- [SQL-based ingestion query examples](../multi-stage-query/examples.md) for examples of SQL-based ingestion for various use cases.
- Try out the interactive [Introduction to Druid SQL](https://github.com/implydata/learn-druid/tree/main/notebooks) notebook to learn more about Druid SQL.

1 change: 0 additions & 1 deletion website/sidebars.json
@@ -26,7 +26,6 @@
"tutorials/tutorial-unnest-arrays",
"tutorials/tutorial-query-deep-storage",
"tutorials/tutorial-jupyter-index",
"tutorials/tutorial-jupyter-docker",
"tutorials/tutorial-jdbc"
],
"Design": [