update(notebooks): add examples (#1903)

* update(notebooks): add examples

  add example data analysis tools notebooks

* update(translation): notebooks/cahier --> bloc-notes

---------

Co-authored-by: Bryan Paget <[email protected]>
StatCan · Nov 28, 2023 · f3f7064
1 parent 7922da5
commit f3f7064
Showing 98 changed files with 586,779 additions and 31 deletions.
diff --git a/docs/en/1-Experiments/Notebooks/DTale_EN.ipynb b/docs/en/1-Experiments/Notebooks/DTale_EN.ipynb
@@ -0,0 +1,104 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "76784523-1230-4c92-8cac-753cc0f8613e",
+   "metadata": {},
+   "source": [
+    "# D-Tale: A Seamless Data Exploration Tool for Python\n",
+    "\n",
+    "D-Tale grew out of a SAS to Python conversion. It began as a Perl script wrapper around SAS's insight function and is now a lightweight web client built on top of Pandas data structures.\n",
+    "\n",
+    "D-Tale combines a Flask back-end with a React front-end to give you a straightforward way to view and analyze Pandas data structures. It integrates with Jupyter notebooks and Python/IPython terminals, and currently supports the Pandas objects DataFrame, Series, MultiIndex, DatetimeIndex, and RangeIndex.\n",
+    "\n",
+    "Its user interface lets you perform common data exploration tasks without writing any code.\n",
+    "\n",
+    "\n",
+    "![](dtale.png)\n",
+    "\n",
+    "![](dtale-menu.png)\n",
+    "\n",
+    "![](dtale-2.png)\n",
+    "\n",
+    "## Installation:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c1a92ee3-40b7-4a8c-b51b-4e007efe6593",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%capture\n",
+    "! pip install -r requirements.txt\n",
+    "! pip install -U dtale"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7b7fffba-5e2e-4381-973b-6da1a9dc1bb2",
+   "metadata": {},
+   "source": [
+    "## Usage:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "31bec97e-6100-43ba-b41e-68fb3f837116",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import dtale\n",
+    "import pandas as pd\n",
+    "from ydata_profiling.utils.cache import cache_file\n",
+    "\n",
+    "# Fetching Pokemon dataset\n",
+    "file_name = cache_file(\n",
+    "    \"pokemon.csv\",\n",
+    "    \"https://raw.githubusercontent.com/bryanpaget/html/main/pokemon.csv\"\n",
+    ")\n",
+    "\n",
+    "# Reading dataset using Pandas\n",
+    "pokemon_df = pd.read_csv(file_name)\n",
+    "\n",
+    "# Displaying dataset with D-Tale\n",
+    "dtale.show(pokemon_df)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "863aff9f-8be9-4231-ade4-21377856b1a0",
+   "metadata": {},
+   "source": [
+    "D-Tale provides a point-and-click interface for common data exploration tasks, which cuts down on repetitive code and saves time.\n"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
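The D-Tale notebook above fetches its dataset through `cache_file` before calling `pd.read_csv`. The sketch below illustrates the general idea the notebook relies on: download a file once and reuse the local copy on later runs. It is a standard-library approximation, not the `ydata_profiling` implementation; the function name `cache_download` and the default `data` directory are assumptions made for illustration.

```python
import shutil
from pathlib import Path
from urllib.request import urlopen


def cache_download(file_name, url, cache_dir="data"):
    """Download url to cache_dir/file_name unless a local copy already exists.

    Hypothetical helper: the name and the default directory are illustrative.
    """
    target = Path(cache_dir) / file_name
    if not target.exists():
        # First call: create the cache directory and stream the file to disk.
        target.parent.mkdir(parents=True, exist_ok=True)
        with urlopen(url) as response, open(target, "wb") as out:
            shutil.copyfileobj(response, out)
    # Later calls skip the download and return the cached path.
    return target
```

The returned path could then be passed to `pd.read_csv` in the same way the notebook passes `file_name`.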
diff --git a/docs/en/1-Experiments/Notebooks/DrawData_EN.ipynb b/docs/en/1-Experiments/Notebooks/DrawData_EN.ipynb
@@ -0,0 +1,159 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "5665986c-f484-437d-9c69-6e2fb9ca2741",
+   "metadata": {},
+   "source": [
+    "# Draw Data: Creating Synthetic Datasets with Ease\n",
+    "\n",
+    "Draw Data is a Python app for Jupyter notebooks that lets you create toy or synthetic datasets by drawing points directly on a Cartesian plane. It is particularly useful when teaching machine learning algorithms.\n",
+    "\n",
+    "![](drawdata-1.png)\n",
+    "![](drawdata-2.png)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7e22fff3-f538-4ea2-b81c-26a1aaa058b9",
+   "metadata": {},
+   "source": [
+    "## Installation:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "edb64c8b-59f1-42d3-87e0-97aca11a8ac7",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%capture\n",
+    "! pip install -r requirements.txt\n",
+    "! pip install -U drawdata\n",
+    "# On Linux you'll need to:\n",
+    "# sudo apt-get install xsel xclip -y"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "97f8e6e4-8be5-49c2-9803-cb505aa87499",
+   "metadata": {},
+   "source": [
+    "## Getting Started:\n",
+    "\n",
+    "To draw a dataset, execute the following cell. You can sketch up to four classes of points. Afterward, click \"Copy CSV\" to copy your data points to the clipboard as comma-separated x, y, z values. The next section shows how to load them into a Pandas DataFrame."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c20c0812-a0a9-491c-9c0c-4144eec21b0b",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "from drawdata import draw_scatter\n",
+    "\n",
+    "draw_scatter()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3b92e81d-f46b-4411-9ee4-734838bcb9be",
+   "metadata": {},
+   "source": [
+    "## Viewing the Data Table:\n",
+    "\n",
+    "Once you've completed your drawing, click \"Copy CSV\" to copy the data to the clipboard. The following cell uses Pandas to read the clipboard into a DataFrame and display it:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "eec428d3-5bb0-4c02-ab70-1e91da46291e",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "# Reading the clipboard into a DataFrame\n",
+    "df = pd.read_clipboard(sep=\",\")\n",
+    "df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9161b704-2af1-4e65-a07c-ec2d6a53551c",
+   "metadata": {},
+   "source": [
+    "## Plotting the Drawn Data:\n",
+    "\n",
+    "Plotly can render the drawn points as an interactive chart. The following cell plots x against y and colors each point by z:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "dd70f514-2ff4-42a3-b3a1-2d7dff468d3b",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import plotly.express as px\n",
+    "import plotly\n",
+    "\n",
+    "plotly.offline.init_notebook_mode(connected=True)\n",
+    "\n",
+    "# Creating an interactive scatter plot\n",
+    "fig = px.scatter(df, x='x', y='y', color='z')\n",
+    "fig.update_layout(\n",
+    "    autosize=False,\n",
+    "    width=800,\n",
+    "    height=800,\n",
+    ")\n",
+    "fig.show()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "3976a320-0acb-4e36-a34f-6d317b0ef642",
+   "metadata": {},
+   "source": [
+    "With these steps you can draw, load, and visualize a synthetic dataset, which can make teaching machine learning concepts more intuitive and engaging."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
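The Draw Data notebook above reads the copied points with `pd.read_clipboard`, which needs a working system clipboard. As a minimal sketch of the x, y, z layout the notebook describes, the block below parses the same kind of CSV text from a string using only the standard library. The sample coordinates and class labels are invented for illustration, and `parse_points` is a hypothetical helper, not part of `drawdata`.

```python
import csv
import io
from collections import Counter

# Hypothetical sample in the x,y,z layout described in the notebook;
# the coordinates and class labels below are invented for illustration.
SAMPLE_CSV = """x,y,z
101.5,220.0,a
130.2,240.8,a
410.0,95.3,b
"""


def parse_points(csv_text):
    """Parse x,y,z CSV text into a list of (x, y, label) tuples."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [(float(row["x"]), float(row["y"]), row["z"]) for row in reader]


points = parse_points(SAMPLE_CSV)

# Count how many points were drawn for each class label.
class_counts = Counter(label for _, _, label in points)
print(len(points), dict(class_counts))
```

In a notebook with Pandas available, `pd.read_csv(io.StringIO(SAMPLE_CSV))` should give the same columns as a DataFrame without going through the clipboard.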
diff --git a/docs/en/1-Experiments/Notebooks/MitoSheet_EN.ipynb b/docs/en/1-Experiments/Notebooks/MitoSheet_EN.ipynb
@@ -0,0 +1,128 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "1bd46748-42b9-46ea-9fea-fee20634a793",
+   "metadata": {},
+   "source": [
+    "# Mito Sheet: Excel-Like Spreadsheets in JupyterLab\n",
+    "\n",
+    "Mito Sheet offers a seamlessly integrated spreadsheet program within JupyterLab, bringing a familiar Excel-like experience to your data analysis workflow.\n",
+    "\n",
+    "![](mito-fullscreen.png)\n",
+    "\n",
+    "![](mito.png)\n",
+    "\n",
+    "## Installation:\n",
+    "\n",
+    "To install Mito Sheet, execute the following cell in your JupyterLab environment:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "d9f41c51-e09f-4a37-b5e6-3ff4766001ea",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "%%capture\n",
+    "! pip install -r requirements.txt\n",
+    "! pip install -U mitosheet\n",
+    "! python -m jupyter nbextension install --py --user mitosheet\n",
+    "! python -m jupyter nbextension enable --py --user mitosheet"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b421be69-2741-4d5c-a3fa-f60f8d9187aa",
+   "metadata": {
+    "tags": []
+   },
+   "source": [
+    "## Getting Started:\n",
+    "\n",
+    "Begin your Mito Sheet journey by running the following code snippet:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "a2fa0eae-fceb-49a9-9c0f-4daf35d200f7",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import mitosheet\n",
+    "mitosheet.sheet()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a1d58f56-87f6-4ebc-9a6a-0502773d0be7",
+   "metadata": {},
+   "source": [
+    "## Working with Excel Data:\n",
+    "\n",
+    "For those accustomed to Excel, Mito Sheet should feel familiar. Let's load a tabular dataset for illustration. We'll use a Pokémon dataset that is stored as a CSV file and available online:"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "5b1b4720-3beb-4bf8-b104-3ac6dde49d08",
+   "metadata": {
+    "tags": []
+   },
+   "outputs": [],
+   "source": [
+    "import mitosheet\n",
+    "import pandas as pd\n",
+    "from ydata_profiling.utils.cache import cache_file\n",
+    "\n",
+    "# Caching the Pokémon dataset from a remote source\n",
+    "file_name = cache_file(\n",
+    "    \"pokemon.csv\",\n",
+    "    \"https://raw.githubusercontent.com/bryanpaget/html/main/pokemon.csv\"\n",
+    ")\n",
+    "\n",
+    "# Reading the dataset into a Pandas DataFrame\n",
+    "pokemon_df = pd.read_csv(file_name)\n",
+    "\n",
+    "# Launching Mito Sheet with the Pokémon dataset\n",
+    "mitosheet.sheet(pokemon_df)"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "00603023-c5c1-4606-8f9b-1748e4f1c1f9",
+   "metadata": {},
+   "source": [
+    "Mito Sheet lets you apply your Excel skills within JupyterLab: you can analyze datasets, perform computations, and visualize results in a familiar spreadsheet interface."
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.10.12"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}