diff --git a/docs/README.md b/docs/README.md
index e4ea30e53..e484592c3 100644
--- a/docs/README.md
+++ b/docs/README.md
@@ -47,16 +47,6 @@ inside a Python virtualenv.
## Release
-The documentation is served through the [arrow-site](https://github.com/apache/arrow-site/) repository. To release
-a new version of the documentation, follow these steps:
+The documentation is published from the `asf-site` branch of this repository.
-1. Download the release source tarball (we can only publish documentation from official releases)
-2. Run `./build.sh` inside `docs` folder to generate the docs website inside the `build/html` folder.
-3. Clone the arrow-site repo
-4. Checkout to the `asf-site` branch (NOT `master`)
-5. Copy build artifacts into `arrow-site` repo's `ballista` folder with a command such as
-
-- `cp -rT ./build/html/ ../../arrow-site/ballista/` (doesn't work on mac)
-- `rsync -avzr ./build/html/ ../../arrow-site/ballista/`
-
-6. Commit changes in `arrow-site` and send a PR.
+The documentation is published automatically when changes to it are pushed to the `main` branch.
diff --git a/docs/source/community/communication.md b/docs/source/community/communication.md
index 4a2cf20ff..295bf46fe 100644
--- a/docs/source/community/communication.md
+++ b/docs/source/community/communication.md
@@ -26,55 +26,15 @@ All participation in the Apache DataFusion Ballista project is governed by the
Apache Software Foundation's [code of
conduct](https://www.apache.org/foundation/policies/conduct.html).
-## Questions?
+We use the same communication channels as the main DataFusion project:
-### Mailing list
-
-We use datafusion.apache.org's `dev@` mailing list for project management, release
-coorindation and design discussions
-([subscribe](mailto:dev-subscribe@datafusion.apache.org),
-[unsubscribe](mailto:dev-unsubscribe@datafusion.apache.org),
-[archives](https://lists.apache.org/list.html?dev@datafusion.apache.org)).
-
-When emailing the dev list, please make sure to prefix the subject line with a
-`[Ballista]` tag, e.g. `"[Ballista] New API for remote data sources"`, so
-that the appropriate people in the Apache DataFusion community notice the message.
-
-### Slack and Discord
-
-We use the official [ASF](https://s.apache.org/slack-invite) Slack workspace
-for informal discussions and coordination. This is a great place to meet other
-contributors and get guidance on where to contribute. Join us in the
-`#arrow-rust` channel.
-
-We also have a backup Arrow Rust Discord
-server ([invite link](https://discord.gg/Qw5gKqHxUM)) in case you are not able
-to join the Slack workspace. If you need an invite to the Slack workspace, you
-can also ask for one in our Discord server.
-
-### Sync up video calls
-
-We have biweekly sync calls every other Thursdays at both 04:00 UTC
-and 16:00 UTC (starting September 30, 2021) depending on if there are
-items on the agenda to discuss and someone being willing to host.
-
-Please see the [agenda](https://docs.google.com/document/d/1atCVnoff5SR4eM4Lwf2M1BBJTY6g3_HUNR6qswYJW_U/edit)
-for the video call link, add topics and to see what others plan to discuss.
-
-The goals of these calls are:
-
-1. Help "put a face to the name" of some of other contributors we are working with
-2. Discuss / synchronize on the goals and major initiatives from different stakeholders to identify areas where more alignment is needed
-
-No decisions are made on the call and anything of substance will be discussed on this mailing list or in github issues / google docs.
-
-We will send a summary of all sync ups to the dev@datafusion.apache.org mailing list.
+[https://datafusion.apache.org/contributor-guide/communication.html](https://datafusion.apache.org/contributor-guide/communication.html)
## Contributing
Our source code is hosted on
-[GitHub](https://github.com/apache/arrow-datafusion). More information on contributing is in
-the [Contribution Guide](https://github.com/apache/arrow-datafusion/blob/master/CONTRIBUTING.md)
+[GitHub](https://github.com/apache/datafusion-ballista). More information on contributing is in
+the [Contribution Guide](https://github.com/apache/datafusion-ballista/blob/main/CONTRIBUTING.md)
, and we have curated a [good-first-issue](https://github.com/apache/datafusion-ballista/contribute)
list to help you get started. You can find datafusion's major designs in docs/source/specification.
diff --git a/docs/source/conf.py b/docs/source/conf.py
index 7a3477f80..eab94b1a9 100644
--- a/docs/source/conf.py
+++ b/docs/source/conf.py
@@ -90,7 +90,7 @@
html_context = {
"github_user": "apache",
- "github_repo": "arrow-ballista",
+ "github_repo": "datafusion-ballista",
"github_version": "main",
"doc_path": "docs/source",
}
diff --git a/docs/source/contributors-guide/architecture.md b/docs/source/contributors-guide/architecture.md
index 4541e4d0a..6cec186fd 100644
--- a/docs/source/contributors-guide/architecture.md
+++ b/docs/source/contributors-guide/architecture.md
@@ -94,9 +94,9 @@ can execute multiple partitions of the same plan in parallel.
There are multiple clients available for submitting jobs to a Ballista cluster:
-- The [Ballista CLI](https://github.com/apache/arrow-ballista/tree/main/ballista-cli) provides a SQL command-line
+- The [Ballista CLI](https://github.com/apache/datafusion-ballista/tree/main/ballista-cli) provides a SQL command-line
interface.
-- The Python bindings ([PyBallista](https://github.com/apache/arrow-ballista/tree/main/python)) provide a session
+- The Python bindings ([PyBallista](https://github.com/apache/datafusion-ballista/tree/main/python)) provide a session
context with support for SQL and DataFrame operations.
- The [ballista crate](https://crates.io/crates/ballista) provides a native Rust session context with support for
SQL and DataFrame operations.
@@ -201,5 +201,5 @@ Each executor will re-partition the output of the stage it is running so that it
stage. This mechanism is known as an Exchange or a Shuffle. The logic for this can be found in the [ShuffleWriterExec]
and [ShuffleReaderExec] operators.
-[shufflewriterexec]: https://github.com/apache/arrow-ballista/blob/main/ballista/core/src/execution_plans/shuffle_writer.rs
-[shufflereaderexec]: https://github.com/apache/arrow-ballista/blob/main/ballista/core/src/execution_plans/shuffle_reader.rs
+[shufflewriterexec]: https://github.com/apache/datafusion-ballista/blob/main/ballista/core/src/execution_plans/shuffle_writer.rs
+[shufflereaderexec]: https://github.com/apache/datafusion-ballista/blob/main/ballista/core/src/execution_plans/shuffle_reader.rs
diff --git a/docs/source/contributors-guide/code-organization.md b/docs/source/contributors-guide/code-organization.md
index 6b830589d..e1f3e4706 100644
--- a/docs/source/contributors-guide/code-organization.md
+++ b/docs/source/contributors-guide/code-organization.md
@@ -23,33 +23,33 @@ This section provides links to the source code for major areas of functionality.
### ballista-core crate
-- [Crate Source](https://github.com/apache/arrow-ballista/blob/main/ballista/core)
-- [Protocol Buffer Definition](https://github.com/apache/arrow-ballista/blob/main/ballista/core/proto/ballista.proto)
-- [Execution Plans](https://github.com/apache/arrow-ballista/tree/main/ballista/core/src/execution_plans)
-- [Ballista Client](https://github.com/apache/arrow-ballista/blob/main/ballista/core/src/client.rs)
+- [Crate Source](https://github.com/apache/datafusion-ballista/blob/main/ballista/core)
+- [Protocol Buffer Definition](https://github.com/apache/datafusion-ballista/blob/main/ballista/core/proto/ballista.proto)
+- [Execution Plans](https://github.com/apache/datafusion-ballista/tree/main/ballista/core/src/execution_plans)
+- [Ballista Client](https://github.com/apache/datafusion-ballista/blob/main/ballista/core/src/client.rs)
### ballista-scheduler crate
-- [Crate Source](https://github.com/apache/arrow-ballista/tree/main/ballista/scheduler)
-- [Distributed Query Planner](https://github.com/apache/arrow-ballista/blob/main/ballista/scheduler/src/planner.rs)
-- [gRPC Service](https://github.com/apache/arrow-ballista/blob/main/ballista/scheduler/src/scheduler_server/grpc.rs)
-- [Flight SQL Service](https://github.com/apache/arrow-ballista/blob/main/ballista/scheduler/src/flight_sql.rs)
-- [REST API](https://github.com/apache/arrow-ballista/tree/main/ballista/scheduler/src/api)
-- [Web UI](https://github.com/apache/arrow-ballista/tree/main/ballista/scheduler/ui)
-- [Prometheus Integration](https://github.com/apache/arrow-ballista/blob/main/ballista/scheduler/src/metrics/prometheus.rs)
+- [Crate Source](https://github.com/apache/datafusion-ballista/tree/main/ballista/scheduler)
+- [Distributed Query Planner](https://github.com/apache/datafusion-ballista/blob/main/ballista/scheduler/src/planner.rs)
+- [gRPC Service](https://github.com/apache/datafusion-ballista/blob/main/ballista/scheduler/src/scheduler_server/grpc.rs)
+- [Flight SQL Service](https://github.com/apache/datafusion-ballista/blob/main/ballista/scheduler/src/flight_sql.rs)
+- [REST API](https://github.com/apache/datafusion-ballista/tree/main/ballista/scheduler/src/api)
+- [Web UI](https://github.com/apache/datafusion-ballista/tree/main/ballista/scheduler/ui)
+- [Prometheus Integration](https://github.com/apache/datafusion-ballista/blob/main/ballista/scheduler/src/metrics/prometheus.rs)
### ballista-executor crate
-- [Crate Source](https://github.com/apache/arrow-ballista/tree/main/ballista/executor)
-- [Flight Service](https://github.com/apache/arrow-ballista/blob/main/ballista/executor/src/flight_service.rs)
-- [Executor Server](https://github.com/apache/arrow-ballista/blob/main/ballista/executor/src/executor_server.rs)
+- [Crate Source](https://github.com/apache/datafusion-ballista/tree/main/ballista/executor)
+- [Flight Service](https://github.com/apache/datafusion-ballista/blob/main/ballista/executor/src/flight_service.rs)
+- [Executor Server](https://github.com/apache/datafusion-ballista/blob/main/ballista/executor/src/executor_server.rs)
### ballista crate
-- [Crate Source](https://github.com/apache/arrow-ballista/tree/main/ballista/client)
-- [Context](https://github.com/apache/arrow-ballista/blob/main/ballista/client/src/context.rs)
+- [Crate Source](https://github.com/apache/datafusion-ballista/tree/main/ballista/client)
+- [Context](https://github.com/apache/datafusion-ballista/blob/main/ballista/client/src/context.rs)
### PyBallista
-- [Source](https://github.com/apache/arrow-ballista/tree/main/python)
-- [Context](https://github.com/apache/arrow-ballista/blob/main/python/src/context.rs)
+- [Source](https://github.com/apache/datafusion-ballista/tree/main/python)
+- [Context](https://github.com/apache/datafusion-ballista/blob/main/python/src/context.rs)
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 9491eccf5..959d5844b 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -65,7 +65,7 @@ Table of content
contributors-guide/architecture
contributors-guide/code-organization
contributors-guide/development
- Source code
+ Source code
.. _toc.community:
@@ -75,5 +75,5 @@ Table of content
community/communication
- Issue tracker
- Code of conduct
+ Issue tracker
+ Code of conduct
diff --git a/docs/source/user-guide/deployment/docker-compose.md b/docs/source/user-guide/deployment/docker-compose.md
index f09490f9c..53501c781 100644
--- a/docs/source/user-guide/deployment/docker-compose.md
+++ b/docs/source/user-guide/deployment/docker-compose.md
@@ -23,31 +23,31 @@ Docker Compose is a convenient way to launch a cluster when testing locally.
## Build Docker Images
-Run the following commands to download the [official Docker image](https://github.com/apache/arrow-ballista/pkgs/container/arrow-ballista-standalone):
+Run the following command to download the [official Docker image](https://github.com/apache/datafusion-ballista/pkgs/container/datafusion-ballista-standalone):
```bash
-docker pull ghcr.io/apache/arrow-ballista-standalone:0.12.0-rc4
+docker pull ghcr.io/apache/datafusion-ballista-standalone:0.12.0-rc4
```
Altenatively run the following commands to clone the source repository and build the Docker images from source:
```bash
-git clone git@github.com:apache/arrow-ballista.git -b 0.12.0
-cd arrow-ballista
+git clone git@github.com:apache/datafusion-ballista.git -b 0.12.0
+cd datafusion-ballista
./dev/build-ballista-docker.sh
```
This will create the following images:
-- `apache/arrow-ballista-benchmarks:0.12.0`
-- `apache/arrow-ballista-cli:0.12.0`
-- `apache/arrow-ballista-executor:0.12.0`
-- `apache/arrow-ballista-scheduler:0.12.0`
-- `apache/arrow-ballista-standalone:0.12.0`
+- `apache/datafusion-ballista-benchmarks:0.12.0`
+- `apache/datafusion-ballista-cli:0.12.0`
+- `apache/datafusion-ballista-executor:0.12.0`
+- `apache/datafusion-ballista-scheduler:0.12.0`
+- `apache/datafusion-ballista-standalone:0.12.0`
## Start a Cluster
-Using the [docker-compose.yml](https://github.com/apache/arrow-ballista/blob/main/docker-compose.yml) from the
+Using the [docker-compose.yml](https://github.com/apache/datafusion-ballista/blob/main/docker-compose.yml) from the
source repository, run the following command to start a cluster:
```bash
@@ -77,5 +77,5 @@ The scheduler web UI is available on port 80 in the scheduler.
## Connect from the Ballista CLI
```shell
-docker run --network=host -it apache/arrow-ballista-cli:0.12.0 --host localhost --port 50050
+docker run --network=host -it apache/datafusion-ballista-cli:0.12.0 --host localhost --port 50050
```
diff --git a/docs/source/user-guide/deployment/docker.md b/docs/source/user-guide/deployment/docker.md
index 291c98e3f..b67e267e0 100644
--- a/docs/source/user-guide/deployment/docker.md
+++ b/docs/source/user-guide/deployment/docker.md
@@ -21,27 +21,27 @@
## Build Docker Images
-Run the following commands to download the [official Docker image](https://github.com/apache/arrow-ballista/pkgs/container/arrow-ballista-standalone):
+Run the following command to download the [official Docker image](https://github.com/apache/datafusion-ballista/pkgs/container/datafusion-ballista-standalone):
```bash
-docker pull ghcr.io/apache/arrow-ballista-standalone:0.12.0-rc4
+docker pull ghcr.io/apache/datafusion-ballista-standalone:0.12.0-rc4
```
Altenatively run the following commands to clone the source repository and build the Docker images from source:
```bash
-git clone git@github.com:apache/arrow-ballista.git -b 0.12.0
-cd arrow-ballista
+git clone git@github.com:apache/datafusion-ballista.git -b 0.12.0
+cd datafusion-ballista
./dev/build-ballista-docker.sh
```
This will create the following images:
-- `apache/arrow-ballista-benchmarks:0.12.0`
-- `apache/arrow-ballista-cli:0.12.0`
-- `apache/arrow-ballista-executor:0.12.0`
-- `apache/arrow-ballista-scheduler:0.12.0`
-- `apache/arrow-ballista-standalone:0.12.0`
+- `apache/datafusion-ballista-benchmarks:0.12.0`
+- `apache/datafusion-ballista-cli:0.12.0`
+- `apache/datafusion-ballista-executor:0.12.0`
+- `apache/datafusion-ballista-scheduler:0.12.0`
+- `apache/datafusion-ballista-standalone:0.12.0`
## Start a Cluster
@@ -51,7 +51,7 @@ Start a scheduler using the following syntax:
```bash
docker run --network=host \
- -d apache/arrow-ballista-scheduler:0.12.0 \
+ -d apache/datafusion-ballista-scheduler:0.12.0 \
--bind-port 50050
```
@@ -60,7 +60,7 @@ Run `docker ps` to check that the process is running:
```
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
-a756055576f3 apache/arrow-ballista-scheduler:0.12.0 "/root/scheduler-ent…" 8 seconds ago Up 8 seconds xenodochial_carson
+a756055576f3 apache/datafusion-ballista-scheduler:0.12.0 "/root/scheduler-ent…" 8 seconds ago Up 8 seconds xenodochial_carson
```
Run `docker logs CONTAINER_ID` to check the output from the process:
@@ -84,7 +84,7 @@ Start one or more executor processes. Each executor process will need to listen
```bash
docker run --network=host \
- -d apache/arrow-ballista-executor:0.12.0 \
+ -d apache/datafusion-ballista-executor:0.12.0 \
--external-host localhost --bind-port 50051
```
@@ -93,8 +93,8 @@ Use `docker ps` to check that both the scheduler and executor(s) are now running
```
$ docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
-fb8b530cee6d apache/arrow-ballista-executor:0.12.0 "/root/executor-entr…" 2 seconds ago Up 1 second gallant_galois
-a756055576f3 apache/arrow-ballista-scheduler:0.12.0 "/root/scheduler-ent…" 8 seconds ago Up 8 seconds xenodochial_carson
+fb8b530cee6d apache/datafusion-ballista-executor:0.12.0 "/root/executor-entr…" 2 seconds ago Up 1 second gallant_galois
+a756055576f3 apache/datafusion-ballista-scheduler:0.12.0 "/root/scheduler-ent…" 8 seconds ago Up 8 seconds xenodochial_carson
```
Use `docker logs CONTAINER_ID` to check the output from the executor(s):
@@ -117,7 +117,7 @@ to launch the scheduler with this option enabled.
```bash
docker run --network=host \
- -d apache/arrow-ballista-scheduler:0.12.0 \
+ -d apache/datafusion-ballista-scheduler:0.12.0 \
--bind-port 50050 \
--config-backend etcd \
--etcd-urls etcd:2379
@@ -129,5 +129,5 @@ recommended.
## Connect from the CLI
```shell
-docker run --network=host -it apache/arrow-ballista-cli:0.12.0 --host localhost --port 50050
+docker run --network=host -it apache/datafusion-ballista-cli:0.12.0 --host localhost --port 50050
```
diff --git a/docs/source/user-guide/deployment/kubernetes.md b/docs/source/user-guide/deployment/kubernetes.md
index eebe6e1c9..2bdb4fb69 100644
--- a/docs/source/user-guide/deployment/kubernetes.md
+++ b/docs/source/user-guide/deployment/kubernetes.md
@@ -41,37 +41,37 @@ microk8s enable dns
## Build Docker Images
-Run the following commands to download the [official Docker image](https://github.com/apache/arrow-ballista/pkgs/container/arrow-ballista-standalone):
+Run the following command to download the [official Docker image](https://github.com/apache/datafusion-ballista/pkgs/container/datafusion-ballista-standalone):
```bash
-docker pull ghcr.io/apache/arrow-ballista-standalone:0.12.0-rc4
+docker pull ghcr.io/apache/datafusion-ballista-standalone:0.12.0-rc4
```
Altenatively run the following commands to clone the source repository and build the Docker images from source:
```bash
-git clone git@github.com:apache/arrow-ballista.git -b 0.12.0
-cd arrow-ballista
+git clone git@github.com:apache/datafusion-ballista.git -b 0.12.0
+cd datafusion-ballista
./dev/build-ballista-docker.sh
```
This will create the following images:
-- `apache/arrow-ballista-benchmarks:0.12.0`
-- `apache/arrow-ballista-cli:0.12.0`
-- `apache/arrow-ballista-executor:0.12.0`
-- `apache/arrow-ballista-scheduler:0.12.0`
-- `apache/arrow-ballista-standalone:0.12.0`
+- `apache/datafusion-ballista-benchmarks:0.12.0`
+- `apache/datafusion-ballista-cli:0.12.0`
+- `apache/datafusion-ballista-executor:0.12.0`
+- `apache/datafusion-ballista-scheduler:0.12.0`
+- `apache/datafusion-ballista-standalone:0.12.0`
## Publishing Docker Images
Once the images have been built, you can retag them and can push them to your favourite Docker registry.
```bash
-docker tag apache/arrow-ballista-scheduler:0.12.0 /arrow-ballista-scheduler:0.12.0
-docker tag apache/arrow-ballista-executor:0.12.0 /arrow-ballista-executor:0.12.0
-docker push /arrow-ballista-scheduler:0.12.0
-docker push /arrow-ballista-executor:0.12.0
+docker tag apache/datafusion-ballista-scheduler:0.12.0 /datafusion-ballista-scheduler:0.12.0
+docker tag apache/datafusion-ballista-executor:0.12.0 /datafusion-ballista-executor:0.12.0
+docker push /datafusion-ballista-scheduler:0.12.0
+docker push /datafusion-ballista-executor:0.12.0
```
## Create Persistent Volume and Persistent Volume Claim
@@ -159,7 +159,7 @@ spec:
spec:
containers:
- name: ballista-scheduler
- image: /arrow-ballista-scheduler:0.12.0
+ image: /datafusion-ballista-scheduler:0.12.0
args: ["--bind-port=50050"]
ports:
- containerPort: 50050
@@ -191,7 +191,7 @@ spec:
spec:
containers:
- name: ballista-executor
- image: /arrow-ballista-executor:0.12.0
+ image: /datafusion-ballista-executor:0.12.0
args:
- "--bind-port=50051"
- "--scheduler-host=ballista-scheduler"
diff --git a/docs/source/user-guide/flightsql.md b/docs/source/user-guide/flightsql.md
index cb420e3de..4572eef91 100644
--- a/docs/source/user-guide/flightsql.md
+++ b/docs/source/user-guide/flightsql.md
@@ -54,7 +54,7 @@ choco install docker-desktop
## Run Docker Container
```shell
-docker run -p 50050:50050 --rm ghcr.io/apache/arrow-ballista-standalone:0.10.0
+docker run -p 50050:50050 --rm ghcr.io/apache/datafusion-ballista-standalone:0.10.0
```
## Download the FlightSQL JDBC Driver
@@ -79,7 +79,7 @@ The important pieces of information:
## Run a "Hello, World!" Query
```sql
-select 'Hello from Arrow Ballista!' as greeting;
+select 'Hello from DataFusion Ballista!' as greeting;
```
## Run a Complex Query
diff --git a/docs/source/user-guide/introduction.md b/docs/source/user-guide/introduction.md
index 65cbe2f7c..fbadf13b5 100644
--- a/docs/source/user-guide/introduction.md
+++ b/docs/source/user-guide/introduction.md
@@ -19,7 +19,7 @@
# Overview
-Ballista is a distributed compute platform primarily implemented in Rust, and powered by Apache Arrow.
+Ballista is a distributed compute platform primarily implemented in Rust, and powered by Apache DataFusion.
Ballista has a scheduler and an executor process that are standard Rust executables and can be executed directly, but
Dockerfiles are provided to build images for use in containerized environments, such as Docker, Docker Compose, and
diff --git a/docs/source/user-guide/python.md b/docs/source/user-guide/python.md
index 80ce8aa5d..674850c70 100644
--- a/docs/source/user-guide/python.md
+++ b/docs/source/user-guide/python.md
@@ -135,4 +135,4 @@ assert result.column(1) == pyarrow.array([-3, -3, -3])
## User Defined Functions
The underlying DataFusion query engine supports Python UDFs but this functionality has not yet been implemented in
-Ballista. It is planned for a future release. The tracking issue is [#173](https://github.com/apache/arrow-ballista/issues/173).
+Ballista. It is planned for a future release. The tracking issue is [#173](https://github.com/apache/datafusion-ballista/issues/173).
diff --git a/docs/source/user-guide/scheduler.md b/docs/source/user-guide/scheduler.md
index 6ac81ed2c..447992cea 100644
--- a/docs/source/user-guide/scheduler.md
+++ b/docs/source/user-guide/scheduler.md
@@ -21,7 +21,7 @@
## Web User Interface
-The scheduler provides a web user interface that allows queries to be monitored. Details on how to start the ui is present [here](https://github.com/apache/arrow-ballista/tree/main/ballista/scheduler/ui)
+The scheduler provides a web user interface that allows queries to be monitored. Details on how to start the UI are available [here](https://github.com/apache/datafusion-ballista/tree/main/ballista/scheduler/ui).
![Ballista Scheduler Web UI](./images/ballista-web-ui.png)