add structure for updated onboarding docs
Signed-off-by: nikki everett <[email protected]>
nikki everett committed Nov 17, 2023
1 parent a3ae0e4 commit 443f2ca
Showing 7 changed files with 267 additions and 0 deletions.
43 changes: 43 additions & 0 deletions docs/onboarding_revamp/about_flyte.md
@@ -0,0 +1,43 @@
---
jupytext:
formats: md:myst
text_representation:
extension: .md
format_name: myst
kernelspec:
display_name: Python 3
language: python
name: python3

# override the toc-determined page navigation order
next-page: getting_started/quickstart-guide
next-page-title: Quickstart guide
---

(getting_started_index)=

# About Flyte

Flyte is a workflow orchestrator that seamlessly unifies data engineering, machine learning, and data analytics stacks for building robust and reliable applications. Flyte features:
* Reproducible, repeatable workflows
* Strongly typed interfaces
* Structured datasets to enable easy conversion of dataframes between types, and column-level type checking
* Easy movement of data between local and cloud storage
* Easy tracking of data lineages
* Built-in data and artifact visualization

For a full list of features, see the [Flyte features page](https://flyte.org/features).

[TK - decide where to put link to hosted sandbox https://sandbox.union.ai/]

## Basic Flyte components

Flyte is made up of a User Plane, Control Plane, and Data Plane.
* The **User Plane** consists of Flytekit, FlyteConsole, and Flytectl, which you use to interact with the core Flyte API. Tasks, workflows, and launch plans are part of the User Plane.
* The **Control Plane** implements the core Flyte API and serves all client requests coming from the User Plane. The Control Plane stores information such as current and past running workflows, and provides that information upon request. It also accepts requests to execute workflows, but offloads the work to the Data Plane.
* The **Data Plane** accepts workflow requests from the Control Plane and guides the workflow to completion, launching tasks on a cluster of machines as necessary based on the workflow graph. The Data Plane sends status events back to the Control Plane so that information can be stored and surfaced to end users.
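The division of labor between the Control Plane and the Data Plane can be pictured with a toy model. This is only an illustrative sketch, not Flyte's implementation: the class names, method names, and status strings below are all hypothetical and do not correspond to the Flyte API.

```python
from dataclasses import dataclass, field


class DataPlane:
    """Toy stand-in for the Data Plane: runs a workflow and reports status events."""

    def run(self, execution_id, workflow_name, report):
        report(execution_id, "RUNNING")
        # A real Data Plane would launch tasks on a cluster of machines here.
        report(execution_id, "SUCCEEDED")


@dataclass
class ControlPlane:
    """Toy stand-in for the Control Plane: stores executions and offloads the work."""

    data_plane: DataPlane
    executions: dict = field(default_factory=dict)

    def create_execution(self, workflow_name):
        execution_id = f"exec-{len(self.executions) + 1}"
        self.executions[execution_id] = ["QUEUED"]
        # Offload the work; the Data Plane sends events back via record_event.
        self.data_plane.run(execution_id, workflow_name, self.record_event)
        return execution_id

    def record_event(self, execution_id, status):
        self.executions[execution_id].append(status)

    def get_status(self, execution_id):
        # Serve stored information to clients in the User Plane.
        return self.executions[execution_id][-1]


control_plane = ControlPlane(data_plane=DataPlane())
execution_id = control_plane.create_execution("hello_world_wf")
print(execution_id, control_plane.get_status(execution_id))
```

The point of the sketch is the direction of the calls: clients only talk to the Control Plane, and the Data Plane reports back rather than being queried directly.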

## Next steps

* To quickly create and run a Flyte workflow, follow the [Quickstart guide](TK-link), then read "[Getting started with Flyte development](TK-link)".
* To create a Flyte Project with a lightweight directory structure and configuration files, go to "[Getting started with Flyte development](TK-link)".
@@ -0,0 +1,68 @@
---
jupytext:
formats: md:myst
text_representation:
extension: .md
format_name: myst
kernelspec:
display_name: Python 3
language: python
name: python3
---

(getting_started_creating_a_flyte_project)=

# Creating a Flyte Project

## About Flyte Projects

[TK - link to repo with project templates]

## Prerequisites

* Follow the steps in "[Installing development tools](link-TK)"
* Install Git

## Steps

1. Create a virtual environment with conda (or another tool) to manage dependencies. [TK - if we want people to install flytekit after creating a virtual env, they need to do that after this step]
2. Initialize your Flyte project [TK - slope/intercept example]
3. Install additional requirements with `pip install -r requirements.txt`.
4. Initialize a Git repository in your Flyte project directory.
5. Create at least one commit so you can later register the workflow to the local Flyte cluster.

```{note}
TK - benefits of versioning your project.
```
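As a rough sketch, the steps above might look like the following in a terminal. The environment name `flyte-example`, the project name `my_project`, and the Python version are placeholders chosen for illustration, and the sketch assumes that Flytekit is installed inside the new environment and that the project template includes a `requirements.txt` file, as step 3 implies.

```shell
# 1. Create and activate a virtual environment (names are placeholders).
conda create -n flyte-example python=3.10 -y
conda activate flyte-example

# pyflyte is installed with Flytekit, so install Flytekit before initializing.
pip install flytekit

# 2. Initialize the Flyte project.
pyflyte init my_project
cd my_project

# 3. Install additional requirements.
pip install -r requirements.txt

# 4. Initialize a Git repository and 5. create a first commit.
git init
git add .
git commit -m "Initial commit"
```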

## Flyte Project components

### Directory structure and configuration files

[TK - dir structure and config files]

### Workflow code

In this example, the workflow file [TK - name of file] contains tasks and a workflow, decorated with the `@task` and `@workflow` decorators, respectively. You can invoke tasks and workflows like regular Python functions, and even import and use them in other Python modules or scripts.

[TK - example workflow code]

#### @task

The `@task` decorator indicates functions that define tasks:

* A task is a Python function that takes some inputs and produces an output.
* When deployed to a Flyte cluster, each task runs in its own Kubernetes pod.
* Tasks are assembled into workflows.

For more information on tasks, see "[TK - task feature/concept doc](link-TK)".

#### @workflow

The `@workflow` decorator indicates a function-like construct that defines a workflow:

* Workflows specify the flow of data between tasks, and the dependencies between tasks.
* A workflow appears to be a Python function but is actually a DSL that only supports a subset of Python syntax and semantics.
* When deployed to a Flyte cluster, the workflow function is "compiled" to construct the directed acyclic graph (DAG) of tasks, defining the order of execution of task pods and the data flow dependencies between them. [TK - what part of the data plane does the compiling?]

For more information on workflows, see "[TK - workflow feature/concept doc](link-TK)".
@@ -0,0 +1,5 @@
# Creating and running a Flyte LaunchPlan

## About Flyte LaunchPlans

## Steps
@@ -0,0 +1,9 @@
---
# override the toc-determined page navigation order
prev-page: index
prev-page-title: Getting Started
---

TK - let users know this section is about working in the user plane (creating and running projects, workflows, and tasks)

NOTE: "Getting started with Flyte development" could be read as a guide for getting started as a Flyte contributor, not as a Flyte user. We may want to think about different terminology.
@@ -0,0 +1,51 @@
---
jupytext:
formats: md:myst
text_representation:
extension: .md
format_name: myst
kernelspec:
display_name: Python 3
language: python
name: python3
---

(getting_started_installing_development_tools)=

# Installing development tools

To create and run workflows in Flyte, you must install Python, flytekit, flytectl, and Docker. Installing conda or another Python virtual environment manager is optional, but strongly recommended.

### Install Python

Python versions 3.8 through 3.10 are supported. [TK - note what version is used in Flyte docs and therefore recommended]

If you already have Python installed, you can use conda or pyenv to install the recommended version.

### Install Conda (or another Python virtual environment manager)

We strongly recommend installing [Conda](https://docs.conda.io/projects/conda/en/stable/) via miniconda to manage Python versions and virtual environments. Conda is used throughout the Flyte documentation.

You can also use another virtual environment manager, such as [`pyenv`](https://github.com/pyenv/pyenv) or [`venv`](https://docs.python.org/3/library/venv.html).

### Install Flytekit

[TK - do we want people to set up a virtual environment before installing Flytekit? If so, the venv will be tied to a specific project]

### Install `flytectl`

You must install `flytectl` to start and configure a local Flyte cluster, as well as register workflows to a Flyte cluster.

[TK - Union docs have switcher for different OSes]

### Install Docker

[Install Docker](https://docs.docker.com/get-docker/) and ensure that you
have the Docker daemon running. [TK - link to docs to help folks get Docker daemon running, if need be]

Flyte supports any [OCI-compatible](https://opencontainers.org/) container
technology (like [Podman](https://podman.io/),
[LXD](https://linuxcontainers.org/lxd/introduction/), and
[Containerd](https://containerd.io/)), but
for the purpose of this documentation, `flytectl` uses Docker to spin up a local
Kubernetes cluster so that you can interact with it on your machine. [TK - be more specific here -- `flytectl demo start` uses Docker]
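Once everything is installed, a short script can confirm that the tools listed on this page are visible from your environment. The following is a minimal sketch using only the Python standard library; the `check_tools` helper is a hypothetical name, and the check only verifies that the tools can be found, not that the Docker daemon is running.

```python
import importlib.util
import shutil
import sys


def check_tools():
    """Return a dict mapping each prerequisite to True if it appears to be available."""
    return {
        # Supported Python range stated on this page: 3.8 through 3.10.
        "python": (3, 8) <= sys.version_info[:2] <= (3, 10),
        # Flytekit is a Python package, so look it up as an importable module.
        "flytekit": importlib.util.find_spec("flytekit") is not None,
        # flytectl and docker are command-line tools, so look them up on PATH.
        "flytectl": shutil.which("flytectl") is not None,
        "docker": shutil.which("docker") is not None,
    }


if __name__ == "__main__":
    for tool, available in check_tools().items():
        print(f"{tool}: {'found' if available else 'missing'}")
```

Run the script inside the virtual environment you plan to use for Flyte, since Flytekit is installed per environment.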
@@ -0,0 +1,36 @@
---
jupytext:
formats: md:myst
text_representation:
extension: .md
format_name: myst
kernelspec:
display_name: Python 3
language: python
name: python3
---

(getting_started_running_workflows_and_tasks_locally)=

# Running workflows and tasks locally

## Running a workflow locally (not in a local cluster)

[TK - Intro - why run locally + not in a local cluster]

### Prerequisites

### Steps (with example)

## Running a workflow in a local cluster

[TK - Intro - why run in a local cluster]

### Prerequisites

### Steps (with example)

## Next steps

* Continue iterating on and testing the DAG (next section)
* Set up cloud cluster (link to deploy docs)
55 changes: 55 additions & 0 deletions docs/onboarding_revamp/quickstart_guide.md
@@ -0,0 +1,55 @@
---
jupytext:
formats: md:myst
text_representation:
extension: .md
format_name: myst
kernelspec:
display_name: Python 3
language: python
name: python3
---

(getting_started_quickstart_guide)=

# Quickstart guide

In this guide, you will create and run a Flyte workflow composed of Flyte tasks to generate the output “Hello, World!”.

## Prerequisites

* Install Python
* Install Flytekit

## Steps

1. Create the "Hello, World!" example Python file. [TK]
2. Run the workflow with `pyflyte run`.

## The @task and @workflow decorators

In this example, the workflow file [TK - name of file] contains tasks and a workflow, decorated with the `@task` and `@workflow` decorators, respectively. You can invoke tasks and workflows like regular Python functions, and even import and use them in other Python modules or scripts.

### @task

The `@task` decorator indicates functions that define tasks:

* A task is a Python function that takes some inputs and produces an output.
* When deployed to a Flyte cluster, each task runs in its own Kubernetes pod.
* Tasks are assembled into workflows.

For more information on tasks, see "[TK - task feature/concept doc](link-TK)".

### @workflow

The `@workflow` decorator indicates a function-like construct that defines a workflow:

* Workflows specify the flow of data between tasks, and the dependencies between tasks.
* A workflow appears to be a Python function but is actually a DSL that only supports a subset of Python syntax and semantics.
* When deployed to a Flyte cluster, the workflow function is "compiled" to construct the directed acyclic graph (DAG) of tasks, defining the order of execution of task pods and the data flow dependencies between them. [TK - what part of the data plane does the compiling?]

For more information on workflows, see "[TK - workflow feature/concept doc](link-TK)".
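The idea of deriving an execution order from data dependencies can be sketched in plain Python. This is a conceptual illustration only, not how Flyte compiles workflows: the `execution_order` helper and the task names are hypothetical, and the DAG is written out by hand as a mapping from each task to the tasks whose outputs it consumes.

```python
def execution_order(dependencies):
    """Return task names ordered so that each task follows the tasks it depends on."""
    order = []
    done = set()
    remaining = dict(dependencies)
    while remaining:
        # A task is ready once all of its upstream tasks have finished.
        ready = sorted(
            task for task, upstream in remaining.items() if set(upstream) <= done
        )
        if not ready:
            raise ValueError("cycle detected: the graph is not a DAG")
        for task in ready:
            order.append(task)
            done.add(task)
            del remaining[task]
    return order


# Each key is a task; each value lists the tasks whose outputs it needs.
dag = {
    "load_data": [],
    "slope": ["load_data"],
    "intercept": ["load_data", "slope"],
}
print(execution_order(dag))
```

Because the order is derived from data flow alone, tasks with no dependency between them could run in parallel, and a graph with a cycle is rejected.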

## Next steps

To create a Flyte Project and run the workflow in a local Flyte cluster, see "[Getting started with Flyte development](link-TK)".
