diff --git a/docs/en/images/Kubeflow-small.png b/docs/en/images/Kubeflow-small.png
new file mode 100644
index 000000000..8e871707c
Binary files /dev/null and b/docs/en/images/Kubeflow-small.png differ
diff --git a/docs/en/images/Kubeflow.PNG b/docs/en/images/Kubeflow.PNG
deleted file mode 100644
index 1ecb693b6..000000000
Binary files a/docs/en/images/Kubeflow.PNG and /dev/null differ
diff --git a/docs/en/images/kubeflow-favourite.png b/docs/en/images/kubeflow-favourite.png
new file mode 100644
index 000000000..a90408416
Binary files /dev/null and b/docs/en/images/kubeflow-favourite.png differ
diff --git a/docs/en/images/kubeflow-stack.png b/docs/en/images/kubeflow-stack.png
new file mode 100644
index 000000000..f358b2c0c
Binary files /dev/null and b/docs/en/images/kubeflow-stack.png differ
diff --git a/docs/en/images/kubeflow-stack_384x384.png b/docs/en/images/kubeflow-stack_384x384.png
new file mode 100644
index 000000000..483b835d4
Binary files /dev/null and b/docs/en/images/kubeflow-stack_384x384.png differ
diff --git a/docs/en/images/statistics-on-the-moon-small.jpg b/docs/en/images/statistics-on-the-moon-small.jpg
new file mode 100644
index 000000000..1be455f7c
Binary files /dev/null and b/docs/en/images/statistics-on-the-moon-small.jpg differ
diff --git a/docs/en/index.md b/docs/en/index.md
index fdf7d091a..fa9c68d33 100644
--- a/docs/en/index.md
+++ b/docs/en/index.md
@@ -1,143 +1,136 @@
-# Welcome to the Advanced Analytics Workspace
+# The Advanced Analytics Workspace
-![Statistics](images/statistics-on-the-moon.jpg)
+![Statistics](images/statistics-on-the-moon-small.jpg)
-## The Advanced Analytics Workspace Documentation
+!!! Info "Open source and made for you!"
+ The AAW is an open-source platform specifically crafted for data scientists, analysts, and researchers proficient in open-source tools and coding.
-_Welcome to the world of data science and machine learning!_
+The [Advanced Analytics Workspace](https://www.statcan.gc.ca/data-analytics-services/aaw) (AAW) stands as a comprehensive and open-source solution designed to cater to the diverse needs of data scientists. It offers a flexible environment that empowers practitioners to seamlessly conduct their work. More information about the AAW and Data Analytics Services (DAS) can be found on [the DAS Portal](https://www.statcan.gc.ca/data-analytics-services/aaw).
-
-!!! info "What is the AAW?"
- **[Advanced Analytics Workspace](https://www.statcan.gc.ca/data-analytics-services/aaw)** is an open source platform designed for data scientists, data stewards, analysts and researchers familiar with open source tools and coding. Developed by data scientists for data scientists, AAW provides a flexible environment that enables advanced practitioners to get their work done with ease.
+!!! Warning "Warning"
+ Many of the links on https://www.statcan.gc.ca/data-analytics-services/aaw are broken.
-The AAW is a comprehensive solution for data science and data analytics. With the AAW, you can customize notebook server deployments to suit your specific data science needs. We have a small number of custom Docker images made by our team.
+## Getting Started
-
-!!! info "What is Kubeflow?"
- The AAW is based on [Kubeflow](https://www.kubeflow.org/), an open source comprehensive solution for deploying and managing end-to-end ML workflows.
+- **StatCan Users**: Access the [Kubeflow Dashboard](https://kubeflow.aaw.cloud.statcan.ca/) to get started.
+- **External Users and Collaborators**: Fill out [the DAS Onboarding Form](https://forms.office.com/r/RPrgDDkU9T) to tell us about your project needs. Once completed, a DAS representative will contact you to discuss the next steps and begin the onboarding process. Note: External users need a StatCan Cloud account granted by the business sponsor.
-Whether you're just getting started or already knee-deep in data analysis, the Advanced Analytics Workspace has everything you need to take your work to the next level. From powerful tools for data pipelines to cloud storage for your datasets, our platform has it all. Need to collaborate with colleagues or publish your results? No problem. We offer seamless collaboration features that make it easy to work together and share your work with others.
+## Creating Kubeflow Notebook Servers
-No matter what stage of your data science journey you're at, the Advanced Analytics Workspace has the resources you need to succeed.
+Follow these steps to create your first notebook server:
-## Getting Started with the AAW
+1. Log in to [Kubeflow](https://kubeflow.aaw.cloud.statcan.ca/);
+2. Click **Notebooks** from the sidebar on the left (you may need to select a namespace from the **Select namespace** dropdown menu in the upper left-hand corner);
+3. Click the **+ New Notebook** button (upper right-hand corner);
+4. Follow [the instructions here](https://statcan.github.io/aaw/en/1-Experiments/Kubeflow.html#setup) to configure the notebook server.
-
-![AAW icon](https://user-images.githubusercontent.com/8212170/158243976-0ee25082-f3dc-4724-b8c3-1430c7f2a461.png)
-
+
+!!! Hint "Need help creating a notebook server?"
+ We have [a Slideshow](https://docs.google.com/presentation/d/12yTDlbMCmbg0ccdea2h0vwhs5YTa_GHm_3DieG5A-k8/edit?usp=sharing) with instructions on how to create a notebook server.
-### The AAW Portal
+### Kubeflow Documentation
-The AAW portal homepage is available for internal users only. However, external users with a cloud account granted access by the business sponsor can access the platform through the analytics-platform URL.
+The AAW is based on [Kubeflow](https://statcan.github.io/aaw/en/1-Experiments/Kubeflow.html), an open source comprehensive solution for deploying and managing end-to-end ML workflows. Kubeflow simplifies the creation and management of customizable compute environments with user-controlled resource provisioning (custom CPU, GPU, RAM and storage). For more information on Kubeflow, please visit:
-
-!!! info annotate "AAW Portal Homepage"
- - [**Portal Homepage for Statistics Canada Employees**](https://www.statcan.gc.ca/data-analytics-service/aaw)
- - [**Portal Homepage for External Users**](https://www.statcan.gc.ca/data-analytics-services/overview)
+- [AAW Kubeflow Documentation](https://statcan.github.io/aaw/en/1-Experiments/Kubeflow.html)
+- [Official Kubeflow Documentation](https://www.kubeflow.org/docs/started/introduction/)
+
+#### Kubeflow Videos
-### Kubeflow Account
+Videos on Kubeflow have been developed by Google:
-
-!!! important "Attention External Users!"
- Users external to Statistics Canada will require a cloud account granted access by the business sponsor.
+- [Kubeflow 101](https://www.youtube.com/playlist?list=PLIivdWyY5sqLS4lN75RPDEyBgTro_YX7x) by Google Cloud Tech
-
-!!! important "Attention Statistics Canada Employees!"
- Users internal to Statistics Canada can get started right away without any additional sign up procedures, just head to [https://kubeflow.aaw.cloud.statcan.ca/](https://kubeflow.aaw.cloud.statcan.ca/).
+## Working with Your Data
-
-!!! note ""
-
- [![Kubeflow is the Heart of the AAW!](./images/Kubeflow.PNG)](https://kubeflow.aaw.cloud.statcan.ca/)
- **[👉 Click here to setup your Kubeflow account! 👈](https://kubeflow.aaw.cloud.statcan.ca/)**
-
+Once your notebook server has been created, you may want to import data or access shared data from cloud storage. Instructions on how to add storage to your notebook server can be found on [the documentation page for storage](https://statcan.github.io/aaw/en/5-Storage/Disks.html).
-**[Kubeflow](1-Experiments/Kubeflow/)** is a powerful and flexible open source platform that allows for dynamic leverage of cloud compute, with users having the ability to control compute, memory, and storage resources used.
+### Protected Data
-Kubeflow simplifies the following tasks:
+If your project requires protected data:
-- Creating customizable environments to work with data with user-controlled resource provisioning (custom CPU, GPU, RAM and storage).
-- Managing notebook servers including Ubuntu Desktop (via noVNC), R Studio, JupyterLab with Python, R, Julia and SAS for Statistics Canada employees.
+- Cloud storage buckets will be created for you at the time of your project's onboarding.
+- Access protected data by opening the buckets folder; see [the documentation on Azure Blob Storage](https://statcan.github.io/aaw/en/5-Storage/AzureBlobStorage.html).
-
-!!! info "Kubeflow Dashboard"
- - [**Kubeflow Dashboard**](https://kubeflow.aaw.cloud.statcan.ca/) Use this link once you have your cloud account!
+### Unprotected Data
-Getting started with the Advanced Analytics Workspace (AAW) is easy and quick. First, you'll want to login to Kubeflow to create your first notebook server running JupyterLab, RStudio or Ubuntu Desktop. We encourage you to join our Slack channel to connect with other data scientists and analysts, ask questions, and share your experiences with the AAW platform.
+If you want to upload data into your notebook server ([on a Data Volume](https://statcan.github.io/aaw/en/5-Storage/Disks.html#setup), for instance), you can upload data into JupyterLab by following [the official JupyterLab documentation](https://jupyterlab.readthedocs.io/en/stable/user/files.html#uploading-and-downloading), which has a section on uploading and downloading files from the JupyterLab web interface.
-### Slack
+## Working in JupyterLab
-
-[![Ask Platform Related Questions on Slack!](images/SlackAAW.PNG)](https://statcan-aaw.slack.com/)
-
+Kubeflow creates and manages notebook servers running JupyterLab, which is the main interface in which you'll be doing your data science work.
-- **[Click here sign in to our Slack Support Workspace](https://statcan-aaw.slack.com/)**
+### Virtual Environments
-- **Use the _General_ Channel!**
+When conducting data science experiments, it's a best practice to utilize Python and/or conda virtual environments to manage your project dependencies. It is common to create a dedicated environment for each project or, in some cases, separate environments for different features or aspects of your work (for instance, one environment for general projects and an additional environment tailored for GPU-accelerated deep learning tasks).
-At StatCan, we understand that embarking on a new project can be overwhelming, and you're likely to have many platform-related questions along the way. That's why we've created a dedicated **[Slack channel](https://statcan-aaw.slack.com/)** to provide you with the support you need. Our team of experts is standing by to answer your questions, address any concerns, and guide you through every step of the process.
+!!! Info "Virtual Environments and the Launcher"
+ If you find yourself frequently switching between environments and desire a more convenient way to access them within JupyterLab, you can follow [these instructions](https://statcan.github.io/aaw/en/1-Experiments/Virtual-Environments.html#creating-and-adding-environments-to-the-jupyterlab-launcher).
-To join our **[Slack channel](https://statcan-aaw.slack.com/)**, simply click on the link provided and follow the instructions. You'll be prompted to create an account in the upper right-hand corner of the page. If you have an `@statcan.gc.ca` email address, use it when signing up as this will ensure that you are automatically approved and can start engaging with our community right away.
+### JupyterLab Documentation
-Once you've created your account, you'll have access to a wealth of resources and information, as well as the opportunity to connect with other users who are working on similar projects. Our **[Slack channel](https://statcan-aaw.slack.com/)** is the perfect place to ask questions, share insights, and collaborate with your peers in real-time. Whether you're just getting started with a new project or you're looking for expert advice on a complex issue, our team is here to help.
+- [Official Getting Started with JupyterLab Docs](https://jupyterlab.readthedocs.io/en/stable/getting_started/overview.html)
-So don't hesitate - join our **[Slack channel](https://statcan-aaw.slack.com/)** today and start getting the answers you need to succeed. We look forward to welcoming you to our community!
+### Example IPython Notebooks
-Click on the link, then choose "Create an account" in the upper right-hand corner.
+You can download these notebooks and upload them to your notebook server. These notebooks can also be run from Visual Studio Code if you prefer.
-
-!!! note ""
-
- ![Use your @statcan.gc.ca email!](images/SlackAAW2.png)
- Use your @statcan.gc.ca email address so that you will be automatically approved.
-
+1. [Visual Python: Simplifying Data Analysis for Python Learners](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/VisualPython_EN.html)
+2. [YData Profiling: Streamlining Data Analysis](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/YData-Profiling_EN.html)
+3. [Draw Data: Creating Synthetic Datasets with Ease](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/DrawData_EN.html)
+4. [D-Tale: A Seamless Data Exploration Tool for Python](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/DTale_EN.html)
+5. [Mito Sheet: Excel-Like Spreadsheets in JupyterLab](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/MitoSheet_EN.html)
+6. [PyGWalker: Simplifying Exploratory Data Analysis with Python](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/PyGWalker_EN.html)
+7. [ReRun: Fast and Powerful Multimodal Data Visualization](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/ReRun_EN.html)
+8. [SweetViz: Streamlining EDA with Elegant Visualizations](https://statcan.github.io/aaw/en/1-Experiments/Notebooks/SweetViz_EN.html)
-## 🧭 Getting Started
+## Need Help?
-To access AAW services, you need to log in to Kubeflow with your StatCan guest cloud account. Once logged in, select Notebook Servers and click the "New Server" button to get started.
+Join our vibrant community! Connect with AAW developers and fellow users, ask questions, and share experiences, all on the [Slack Support Channel](https://statcan-aaw.slack.com/).
-1. Login to [Kubeflow](https://kubeflow.aaw.cloud.statcan.ca/) with your StatCan guest cloud account. You will be prompted to authenticate the account.
-2. Select Notebook Servers.
-3. Click the "➕ New Server" button.
+For comprehensive documentation and guidance, refer to the:
-## 🧰 Tools Offered
+- [AAW Documentation](https://statcan.github.io/aaw/)
+- [Official Kubeflow Documentation](https://www.kubeflow.org/docs/)
+- [Official JupyterLab Documentation](https://jupyterlab.readthedocs.io/en/stable/user/index.html)
-AAW is a flexible platform for data analysis and machine learning. It offers a range of languages, including Python, R, and Julia. AAW also supports development environments such as VS Code, R Studio, and Jupyter Notebooks. Additionally, Linux virtual desktops are available for users who require additional tools such as OpenM++ and QGIS.
+!!! Info "Do you need help?"
+ **Need real-time assistance?** Join our [Slack Support Channel](https://statcan-aaw.slack.com).
-Here's a list of tools we offer:
+### Demos and Contributions
-- 📜 Languages:
- - 🐍 Python
- - 📈 R
- - 👩🔬 Julia
-- 🧮 Development environments:
- - VS Code
- - R Studio
- - Jupyter Notebooks
-- 🐧 Linux virtual desktops for additional tools (🧫 OpenM++, 🌏 QGIS etc.)
+For in-depth demos, personalized assistance, or to contribute to the AAW community, reach out to us on the [Slack Support Channel](https://statcan-aaw.slack.com). You can contribute to the platform's development and report issues or feature requests on [GitHub](https://github.com/StatCan/aaw).
-Sharing code, disks, and workspaces (e.g.: two people sharing the same virtual machine) is described in more detail in the [Collaboration](4-Collaboration/Overview.md) section. Sharing data through buckets is described in more detail in the **[Azure Blob Storage](./5-Storage/AzureBlobStorage.md)**
-section.
+## External Learning Resources
-### 💡 Help
+Some of the AAW developers are also data scientists, so we have a lot of material to share on data science tooling and best practices. Below are some useful and interesting data science learning resources:
-- Disk (also called Volumes on the Notebook Server creation screen)
-- Containers (Blob Storage)
-- Data Lakes (coming soon)
+### Data Science Resources (R and Python)
-- 📗 AAW Portal Documentation
- - [https://statcan.github.io/aaw/](https://statcan.github.io/aaw/)
-- 📘 Kubeflow Documentation
- - [https://www.kubeflow.org/docs/](https://www.kubeflow.org/docs/)
-- 🤝 Slack Support Channel
- - [https://statcan-aaw.slack.com](https://statcan-aaw.slack.com)
+- [Machine Learning Mastery's Data Preparation Course](https://machinelearningmastery.com/start-here/#dataprep)
+- [A Gentle Introduction to SciKit Learn (Python)](https://machinelearningmastery.com/a-gentle-introduction-to-scikit-learn-a-python-machine-learning-library/)
+- [Official SciKit Learn Tutorials](https://scikit-learn.org/stable/tutorial/index.html)
+- [How to Handle Imbalanced Datasets](https://machinelearningmastery.com/start-here/#imbalanced)
+- [Quarto Themes](https://quarto.org/docs/output-formats/html-themes.html)
+- [Tidy Models Resampling Techniques](https://www.tidymodels.org/start/resampling/)
+- [EasyStats for R](https://github.com/easystats)
+- [EasyStats Model Performance Evaluation Package](https://easystats.github.io/performance/)
+- [Tidy Modelling with R](https://www.tmwr.org/)
+- [Model evaluation and analysis: the modEvA R package in a nutshell](https://modeva.r-forge.r-project.org/modEvA-tutorial.html)
+- [Metrics and scoring: quantifying the quality of predictions](https://scikit-learn.org/stable/modules/model_evaluation.html)
-## 🐱 Demos
+### Python Language Resources
-If you require a quick onboarding demo session, need help, or have any questions, please reach out to us through our [🤝 Slack Support Channel](https://statcan-aaw.slack.com).
+- [Real Python's Introduction to Python](https://realpython.com/learning-paths/python3-introduction/)
+- [W3School's Introduction to Python](https://www.w3schools.com/python/python_intro.asp)
+- [Google Developers' Introduction to Python](https://developers.google.com/edu/python)
+- [Machine Learning Mastery's Python Skills](https://machinelearningmastery.com/start-here/#pythonskills)
+- [TechWorld with Nana's Python Tutorial for Beginners](https://www.youtube.com/watch?v=t8pPdKYpowI)
-## Contributing
+### R Language Resources
-If you have any bugs to report or features to request please do so via https://github.com/StatCan/aaw.
+- [Videos on R](https://www.youtube.com/playlist?list=PLLOxZwkBK52C6_Nkmp0nFCreLfnfJgUL7)
+- [Introduction to R](https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf)
+- [R Data Import/Export](https://cran.r-project.org/doc/manuals/r-release/R-data.pdf)
diff --git a/package.json b/package.json
index d45dfaf3d..e91b9f6d5 100644
--- a/package.json
+++ b/package.json
@@ -15,6 +15,6 @@
"devDependencies": {
"husky": "^4.2.5",
"markdown-spellcheck": "https://github.com/brendangadd/node-markdown-spellcheck.git#45cf81bfb56f298d0928461133aa9f264047dd49",
- "prettier": "3.1.0"
+ "prettier": "3.1.1"
}
}
diff --git a/yarn.lock b/yarn.lock
index 3144efdd7..f5f0df1b3 100644
--- a/yarn.lock
+++ b/yarn.lock
@@ -573,10 +573,10 @@ please-upgrade-node@^3.2.0:
dependencies:
semver-compare "^1.0.0"
-prettier@3.1.0:
- version "3.1.0"
- resolved "https://registry.yarnpkg.com/prettier/-/prettier-3.1.0.tgz#c6d16474a5f764ea1a4a373c593b779697744d5e"
- integrity sha512-TQLvXjq5IAibjh8EpBIkNKxO749UEWABoiIZehEPiY4GNpVdhaFKqSTu+QrlU6D2dPAfubRmtJTi4K4YkQ5eXw==
+prettier@3.1.1:
+ version "3.1.1"
+ resolved "https://registry.yarnpkg.com/prettier/-/prettier-3.1.1.tgz#6ba9f23165d690b6cbdaa88cb0807278f7019848"
+ integrity sha512-22UbSzg8luF4UuZtzgiUOfcGM8s4tjBv6dJRT7j275NXsy2jb4aJa4NNveul5x4eqlF1wuhuR2RElK71RvmVaw==
queue-microtask@^1.2.2:
version "1.2.3"