Website fixes (#702)
* docs: link to examples using full URL in README

PyPI otherwise renders these relative to the `datafusion-python` page,
so users currently get a 404 when they click on one of these links.

Fixes #699
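
To illustrate why the relative links break, here is a minimal sketch using Python's standard `urllib.parse.urljoin`, which follows the same URL resolution rules a browser applies. The PyPI page URL and the `resolve` helper are assumptions made for illustration only; the GitHub base URL is the one used in the updated README links.

```python
from urllib.parse import urljoin

# Assumed for illustration: the page on which PyPI renders the README.
PYPI_PAGE = "https://pypi.org/project/datafusion/"
# Base of the absolute links introduced by this commit.
GITHUB_BLOB = "https://github.com/apache/datafusion-python/blob/main/"


def resolve(base: str, link: str) -> str:
    """Resolve a README link against the page it is rendered on (hypothetical helper)."""
    return urljoin(base, link)


relative_link = "./examples/sql-parquet.py"

# Rendered on PyPI, the relative link points below the project page,
# where no such file exists.
print(resolve(PYPI_PAGE, relative_link))

# Resolved against the GitHub tree, the same link reaches the example file.
print(resolve(GITHUB_BLOB, relative_link))
```

Writing the full URL in the README removes the dependency on where the file happens to be rendered.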

* docs: update project.urls in pyproject.toml

* docs: update README with apache TLP URLs

* docs: update docs/README.md with apache TLP URLs

* docs: update index.rst with TLP URLs

* docs: update to new branded logos
Michael-J-Ward authored May 15, 2024
1 parent d6c42b4 commit 856b310
Showing 16 changed files with 63 additions and 35 deletions.
34 changes: 17 additions & 17 deletions README.md
@@ -19,16 +19,16 @@

# DataFusion in Python

[![Python test](https://github.com/apache/arrow-datafusion-python/actions/workflows/test.yaml/badge.svg)](https://github.com/apache/arrow-datafusion-python/actions/workflows/test.yaml)
[![Python Release Build](https://github.com/apache/arrow-datafusion-python/actions/workflows/build.yml/badge.svg)](https://github.com/apache/arrow-datafusion-python/actions/workflows/build.yml)
[![Python test](https://github.com/apache/datafusion-python/actions/workflows/test.yaml/badge.svg)](https://github.com/apache/datafusion-python/actions/workflows/test.yaml)
[![Python Release Build](https://github.com/apache/datafusion-python/actions/workflows/build.yml/badge.svg)](https://github.com/apache/datafusion-python/actions/workflows/build.yml)

This is a Python library that binds to [Apache Arrow](https://arrow.apache.org/) in-memory query engine [DataFusion](https://github.com/apache/arrow-datafusion).
This is a Python library that binds to [Apache Arrow](https://arrow.apache.org/) in-memory query engine [DataFusion](https://github.com/apache/datafusion).

DataFusion's Python bindings can be used as a foundation for building new data systems in Python. Here are some examples:

- [Dask SQL](https://github.com/dask-contrib/dask-sql) uses DataFusion's Python bindings for SQL parsing, query
planning, and logical plan optimizations, and then transpiles the logical plan to Dask operations for execution.
- [DataFusion Ballista](https://github.com/apache/arrow-ballista) is a distributed SQL query engine that extends
- [DataFusion Ballista](https://github.com/apache/datafusion-ballista) is a distributed SQL query engine that extends
DataFusion's Python bindings for distributed use cases.

It is also possible to use these Python bindings directly for DataFrame and SQL operations, but you may find that
@@ -120,23 +120,23 @@ See [examples](examples/README.md) for more information.

### Executing Queries with DataFusion

- [Query a Parquet file using SQL](./examples/sql-parquet.py)
- [Query a Parquet file using the DataFrame API](./examples/dataframe-parquet.py)
- [Run a SQL query and store the results in a Pandas DataFrame](./examples/sql-to-pandas.py)
- [Run a SQL query with a Python user-defined function (UDF)](./examples/sql-using-python-udf.py)
- [Run a SQL query with a Python user-defined aggregation function (UDAF)](./examples/sql-using-python-udaf.py)
- [Query PyArrow Data](./examples/query-pyarrow-data.py)
- [Create dataframe](./examples/import.py)
- [Export dataframe](./examples/export.py)
- [Query a Parquet file using SQL](https://github.com/apache/datafusion-python/blob/main/examples/sql-parquet.py)
- [Query a Parquet file using the DataFrame API](https://github.com/apache/datafusion-python/blob/main/examples/dataframe-parquet.py)
- [Run a SQL query and store the results in a Pandas DataFrame](https://github.com/apache/datafusion-python/blob/main/examples/sql-to-pandas.py)
- [Run a SQL query with a Python user-defined function (UDF)](https://github.com/apache/datafusion-python/blob/main/examples/sql-using-python-udf.py)
- [Run a SQL query with a Python user-defined aggregation function (UDAF)](https://github.com/apache/datafusion-python/blob/main/examples/sql-using-python-udaf.py)
- [Query PyArrow Data](https://github.com/apache/datafusion-python/blob/main/examples/query-pyarrow-data.py)
- [Create dataframe](https://github.com/apache/datafusion-python/blob/main/examples/import.py)
- [Export dataframe](https://github.com/apache/datafusion-python/blob/main/examples/export.py)

### Running User-Defined Python Code

- [Register a Python UDF with DataFusion](./examples/python-udf.py)
- [Register a Python UDAF with DataFusion](./examples/python-udaf.py)
- [Register a Python UDF with DataFusion](https://github.com/apache/datafusion-python/blob/main/examples/python-udf.py)
- [Register a Python UDAF with DataFusion](https://github.com/apache/datafusion-python/blob/main/examples/python-udaf.py)

### Substrait Support

- [Serialize query plans using Substrait](./examples/substrait.py)
- [Serialize query plans using Substrait](https://github.com/apache/datafusion-python/blob/main/examples/substrait.py)
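
The link rewrite in this hunk could also be scripted. The following is only a sketch under stated assumptions, not the method used in this commit: the `absolutize` helper and its regular expression are hypothetical, and it handles only Markdown links whose target starts with `./`.

```python
import re

# Base URL taken from the new links in this diff.
BASE = "https://github.com/apache/datafusion-python/blob/main/"

# Matches the target of a Markdown link that starts with "./", e.g. "](./examples/x.py)".
RELATIVE_LINK = re.compile(r"\]\(\./([^)\s]+)\)")


def absolutize(markdown: str, base: str = BASE) -> str:
    """Rewrite './'-relative Markdown link targets to absolute URLs (hypothetical helper)."""
    return RELATIVE_LINK.sub(lambda m: f"]({base}{m.group(1)})", markdown)


line = "- [Query a Parquet file using SQL](./examples/sql-parquet.py)"
print(absolutize(line))
```

Links that are already absolute, or relative without the `./` prefix, are left unchanged by this pattern.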

## How to install (from pip)

@@ -172,7 +172,7 @@ Bootstrap (Conda):

```bash
# fetch this repo
git clone git@github.com:apache/arrow-datafusion-python.git
git clone git@github.com:apache/datafusion-python.git
# create the conda environment for dev
conda env create -f ./conda/environments/datafusion-dev.yaml -n datafusion-dev
# activate the conda environment
@@ -183,7 +183,7 @@ Bootstrap (Pip):

```bash
# fetch this repo
git clone git@github.com:apache/arrow-datafusion-python.git
git clone git@github.com:apache/datafusion-python.git
# prepare development environment (used to build wheel / install in development)
python3 -m venv venv
# activate the venv
12 changes: 6 additions & 6 deletions docs/README.md
@@ -20,7 +20,7 @@
# DataFusion Documentation

This folder contains the source content of the [Python API](./source/api).
This is published to https://arrow.apache.org/datafusion-python/ by a GitHub action
This is published to https://datafusion.apache.org/python by a GitHub action
when changes are merged to the main branch.

## Dependencies
@@ -66,15 +66,15 @@ firefox build/html/index.html
## Release Process
This documentation is hosted at https://arrow.apache.org/datafusion-python/
This documentation is hosted at https://datafusion.apache.org/python
When the PR is merged to the `main` branch of the DataFusion
repository, a [github workflow](https://github.com/apache/arrow-datafusion-python/blob/main/.github/workflows/docs.yaml) which:
repository, a [github workflow](https://github.com/apache/datafusion-python/blob/main/.github/workflows/docs.yaml) which:
1. Builds the html content
2. Pushes the html content to the [`asf-site`](https://github.com/apache/arrow-datafusion-python/tree/asf-site) branch in this repository.
2. Pushes the html content to the [`asf-site`](https://github.com/apache/datafusion-python/tree/asf-site) branch in this repository.
The Apache Software Foundation provides https://arrow.apache.org/,
which serves content based on the configuration in
[.asf.yaml](https://github.com/apache/arrow-datafusion-python/blob/main/.asf.yaml),
which specifies the target as https://arrow.apache.org/datafusion-python/.
[.asf.yaml](https://github.com/apache/datafusion-python/blob/main/.asf.yaml),
which specifies the target as https://datafusion.apache.org/python.
Binary file not shown.

This file was deleted.

Binary file removed docs/source/_static/images/DataFusion-Logo-Dark.png
Binary file not shown.
1 change: 0 additions & 1 deletion docs/source/_static/images/DataFusion-Logo-Dark.svg

This file was deleted.

Binary file removed docs/source/_static/images/DataFusion-Logo-Light.png
Binary file not shown.