From cddf82e63962618ad1fcfb0972c78fc092474560 Mon Sep 17 00:00:00 2001
From: miqueasmd <miqueasmd@gmail.com>
Date: Tue, 10 Dec 2024 13:09:41 +0100
Subject: [PATCH] Update lab-dw-aggregating.ipynb

---
 lab-dw-aggregating.ipynb | 1320 ++++++++++++++++++++++++++++++++++----
 1 file changed, 1180 insertions(+), 140 deletions(-)
diff --git a/lab-dw-aggregating.ipynb b/lab-dw-aggregating.ipynb
index fff3ae5..ac396c4 100644
--- a/lab-dw-aggregating.ipynb
+++ b/lab-dw-aggregating.ipynb
@@ -1,161 +1,1201 @@
 {
-  "cells": [
-    {
-      "cell_type": "markdown",
-      "id": "31969215-2a90-4d8b-ac36-646a7ae13744",
-      "metadata": {
-        "id": "31969215-2a90-4d8b-ac36-646a7ae13744"
-      },
-      "source": [
-        "# Lab | Data Aggregation and Filtering"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d",
-      "metadata": {
-        "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d"
-      },
-      "source": [
-        "In this challenge, we will continue to work with customer data from an insurance company. We will use the dataset called marketing_customer_analysis.csv, which can be found at the following link:\n",
-        "\n",
-        "https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\n",
-        "\n",
-        "This dataset contains information such as customer demographics, policy details, vehicle information, and the customer's response to the last marketing campaign. Our goal is to explore and analyze this data by first performing data cleaning, formatting, and structuring."
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50",
-      "metadata": {
-        "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50"
-      },
-      "source": [
-        "1. Create a new DataFrame that only includes customers who have a total_claim_amount greater than $1,000 and have a response of \"Yes\" to the last marketing campaign."
-      ]
-    },
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "31969215-2a90-4d8b-ac36-646a7ae13744",
+   "metadata": {
+    "id": "31969215-2a90-4d8b-ac36-646a7ae13744"
+   },
+   "source": [
+    "# Lab | Data Aggregation and Filtering"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d",
+   "metadata": {
+    "id": "a8f08a52-bec0-439b-99cc-11d3809d8b5d"
+   },
+   "source": [
+    "In this challenge, we will continue to work with customer data from an insurance company. We will use the dataset called marketing_customer_analysis.csv, which can be found at the following link:\n",
+    "\n",
+    "https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\n",
+    "\n",
+    "This dataset contains information such as customer demographics, policy details, vehicle information, and the customer's response to the last marketing campaign. Our goal is to explore and analyze this data by first performing data cleaning, formatting, and structuring."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50",
+   "metadata": {
+    "id": "9c98ddc5-b041-4c94-ada1-4dfee5c98e50"
+   },
+   "source": [
+    "1. Create a new DataFrame that only includes customers who have a total_claim_amount greater than $1,000 and have a response of \"Yes\" to the last marketing campaign."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 1,
+   "id": "8c71d31e",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "b9be383e-5165-436e-80c8-57d4c757c8c3",
-      "metadata": {
-        "id": "b9be383e-5165-436e-80c8-57d4c757c8c3"
-      },
-      "source": [
-        "2. Using the original Dataframe, analyze the average total_claim_amount by each policy type and gender for customers who have responded \"Yes\" to the last marketing campaign. Write your conclusions."
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Unnamed: 0</th>\n",
+       "      <th>Customer</th>\n",
+       "      <th>State</th>\n",
+       "      <th>Customer Lifetime Value</th>\n",
+       "      <th>Response</th>\n",
+       "      <th>Coverage</th>\n",
+       "      <th>Education</th>\n",
+       "      <th>Effective To Date</th>\n",
+       "      <th>EmploymentStatus</th>\n",
+       "      <th>Gender</th>\n",
+       "      <th>...</th>\n",
+       "      <th>Number of Open Complaints</th>\n",
+       "      <th>Number of Policies</th>\n",
+       "      <th>Policy Type</th>\n",
+       "      <th>Policy</th>\n",
+       "      <th>Renew Offer Type</th>\n",
+       "      <th>Sales Channel</th>\n",
+       "      <th>Total Claim Amount</th>\n",
+       "      <th>Vehicle Class</th>\n",
+       "      <th>Vehicle Size</th>\n",
+       "      <th>Vehicle Type</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>189</th>\n",
+       "      <td>189</td>\n",
+       "      <td>OK31456</td>\n",
+       "      <td>California</td>\n",
+       "      <td>11009.130490</td>\n",
+       "      <td>Yes</td>\n",
+       "      <td>Premium</td>\n",
+       "      <td>Bachelor</td>\n",
+       "      <td>1/24/11</td>\n",
+       "      <td>Employed</td>\n",
+       "      <td>F</td>\n",
+       "      <td>...</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Corporate Auto</td>\n",
+       "      <td>Corporate L3</td>\n",
+       "      <td>Offer2</td>\n",
+       "      <td>Agent</td>\n",
+       "      <td>1358.400000</td>\n",
+       "      <td>Luxury Car</td>\n",
+       "      <td>Medsize</td>\n",
+       "      <td>NaN</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>236</th>\n",
+       "      <td>236</td>\n",
+       "      <td>YJ16163</td>\n",
+       "      <td>Oregon</td>\n",
+       "      <td>11009.130490</td>\n",
+       "      <td>Yes</td>\n",
+       "      <td>Premium</td>\n",
+       "      <td>Bachelor</td>\n",
+       "      <td>1/24/11</td>\n",
+       "      <td>Employed</td>\n",
+       "      <td>F</td>\n",
+       "      <td>...</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Special Auto</td>\n",
+       "      <td>Special L3</td>\n",
+       "      <td>Offer2</td>\n",
+       "      <td>Agent</td>\n",
+       "      <td>1358.400000</td>\n",
+       "      <td>Luxury Car</td>\n",
+       "      <td>Medsize</td>\n",
+       "      <td>A</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>419</th>\n",
+       "      <td>419</td>\n",
+       "      <td>GW43195</td>\n",
+       "      <td>Oregon</td>\n",
+       "      <td>25807.063000</td>\n",
+       "      <td>Yes</td>\n",
+       "      <td>Extended</td>\n",
+       "      <td>College</td>\n",
+       "      <td>2/13/11</td>\n",
+       "      <td>Employed</td>\n",
+       "      <td>F</td>\n",
+       "      <td>...</td>\n",
+       "      <td>1.0</td>\n",
+       "      <td>2</td>\n",
+       "      <td>Personal Auto</td>\n",
+       "      <td>Personal L2</td>\n",
+       "      <td>Offer1</td>\n",
+       "      <td>Branch</td>\n",
+       "      <td>1027.200000</td>\n",
+       "      <td>Luxury Car</td>\n",
+       "      <td>Small</td>\n",
+       "      <td>A</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>442</th>\n",
+       "      <td>442</td>\n",
+       "      <td>IP94270</td>\n",
+       "      <td>Arizona</td>\n",
+       "      <td>13736.132500</td>\n",
+       "      <td>Yes</td>\n",
+       "      <td>Premium</td>\n",
+       "      <td>Master</td>\n",
+       "      <td>2/13/11</td>\n",
+       "      <td>Disabled</td>\n",
+       "      <td>F</td>\n",
+       "      <td>...</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>8</td>\n",
+       "      <td>Personal Auto</td>\n",
+       "      <td>Personal L2</td>\n",
+       "      <td>Offer1</td>\n",
+       "      <td>Web</td>\n",
+       "      <td>1261.319869</td>\n",
+       "      <td>SUV</td>\n",
+       "      <td>Medsize</td>\n",
+       "      <td>A</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>587</th>\n",
+       "      <td>587</td>\n",
+       "      <td>FJ28407</td>\n",
+       "      <td>California</td>\n",
+       "      <td>5619.689084</td>\n",
+       "      <td>Yes</td>\n",
+       "      <td>Premium</td>\n",
+       "      <td>High School or Below</td>\n",
+       "      <td>1/26/11</td>\n",
+       "      <td>Unemployed</td>\n",
+       "      <td>M</td>\n",
+       "      <td>...</td>\n",
+       "      <td>0.0</td>\n",
+       "      <td>1</td>\n",
+       "      <td>Personal Auto</td>\n",
+       "      <td>Personal L1</td>\n",
+       "      <td>Offer2</td>\n",
+       "      <td>Web</td>\n",
+       "      <td>1027.000029</td>\n",
+       "      <td>SUV</td>\n",
+       "      <td>Medsize</td>\n",
+       "      <td>A</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "<p>5 rows × 26 columns</p>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "     Unnamed: 0 Customer       State  Customer Lifetime Value Response  \\\n",
+       "189         189  OK31456  California             11009.130490      Yes   \n",
+       "236         236  YJ16163      Oregon             11009.130490      Yes   \n",
+       "419         419  GW43195      Oregon             25807.063000      Yes   \n",
+       "442         442  IP94270     Arizona             13736.132500      Yes   \n",
+       "587         587  FJ28407  California              5619.689084      Yes   \n",
+       "\n",
+       "     Coverage             Education Effective To Date EmploymentStatus Gender  \\\n",
+       "189   Premium              Bachelor           1/24/11         Employed      F   \n",
+       "236   Premium              Bachelor           1/24/11         Employed      F   \n",
+       "419  Extended               College           2/13/11         Employed      F   \n",
+       "442   Premium                Master           2/13/11         Disabled      F   \n",
+       "587   Premium  High School or Below           1/26/11       Unemployed      M   \n",
+       "\n",
+       "     ...  Number of Open Complaints Number of Policies     Policy Type  \\\n",
+       "189  ...                        0.0                  1  Corporate Auto   \n",
+       "236  ...                        0.0                  1    Special Auto   \n",
+       "419  ...                        1.0                  2   Personal Auto   \n",
+       "442  ...                        0.0                  8   Personal Auto   \n",
+       "587  ...                        0.0                  1   Personal Auto   \n",
+       "\n",
+       "           Policy  Renew Offer Type  Sales Channel  Total Claim Amount  \\\n",
+       "189  Corporate L3            Offer2          Agent         1358.400000   \n",
+       "236    Special L3            Offer2          Agent         1358.400000   \n",
+       "419   Personal L2            Offer1         Branch         1027.200000   \n",
+       "442   Personal L2            Offer1            Web         1261.319869   \n",
+       "587   Personal L1            Offer2            Web         1027.000029   \n",
+       "\n",
+       "     Vehicle Class Vehicle Size Vehicle Type  \n",
+       "189     Luxury Car      Medsize          NaN  \n",
+       "236     Luxury Car      Medsize            A  \n",
+       "419     Luxury Car        Small            A  \n",
+       "442            SUV      Medsize            A  \n",
+       "587            SUV      Medsize            A  \n",
+       "\n",
+       "[5 rows x 26 columns]"
       ]
-    },
+     },
+     "execution_count": 1,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "import pandas as pd\n",
+    "\n",
+    "# Load the dataset\n",
+    "url = \"https://raw.githubusercontent.com/data-bootcamp-v4/data/main/marketing_customer_analysis.csv\"\n",
+    "df = pd.read_csv(url)\n",
+    "\n",
+    "# Filter the DataFrame\n",
+    "filtered_df = df[(df['Total Claim Amount'] > 1000) & (df['Response'] == 'Yes')]\n",
+    "\n",
+    "# Display the filtered DataFrame\n",
+    "filtered_df.head()"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b9be383e-5165-436e-80c8-57d4c757c8c3",
+   "metadata": {
+    "id": "b9be383e-5165-436e-80c8-57d4c757c8c3"
+   },
+   "source": [
+    "2. Using the original DataFrame, analyze the average total_claim_amount by each policy type and gender for customers who have responded \"Yes\" to the last marketing campaign. Write your conclusions."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 2,
+   "id": "f9e17188",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0",
-      "metadata": {
-        "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0"
-      },
-      "source": [
-        "3. Analyze the total number of customers who have policies in each state, and then filter the results to only include states where there are more than 500 customers."
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Policy Type</th>\n",
+       "      <th>Gender</th>\n",
+       "      <th>Total Claim Amount</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Corporate Auto</td>\n",
+       "      <td>F</td>\n",
+       "      <td>433.738499</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Corporate Auto</td>\n",
+       "      <td>M</td>\n",
+       "      <td>408.582459</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Personal Auto</td>\n",
+       "      <td>F</td>\n",
+       "      <td>452.965929</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Personal Auto</td>\n",
+       "      <td>M</td>\n",
+       "      <td>457.010178</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>Special Auto</td>\n",
+       "      <td>F</td>\n",
+       "      <td>453.280164</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>Special Auto</td>\n",
+       "      <td>M</td>\n",
+       "      <td>429.527942</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "      Policy Type Gender  Total Claim Amount\n",
+       "0  Corporate Auto      F          433.738499\n",
+       "1  Corporate Auto      M          408.582459\n",
+       "2   Personal Auto      F          452.965929\n",
+       "3   Personal Auto      M          457.010178\n",
+       "4    Special Auto      F          453.280164\n",
+       "5    Special Auto      M          429.527942"
       ]
-    },
+     },
+     "execution_count": 2,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Filter the DataFrame for customers who responded \"Yes\"\n",
+    "responded_yes_df = df[df['Response'] == 'Yes']\n",
+    "\n",
+    "# Group by 'Policy Type' and 'Gender' and calculate the average 'Total Claim Amount'\n",
+    "average_claims = responded_yes_df.groupby(['Policy Type', 'Gender'])['Total Claim Amount'].mean().reset_index()\n",
+    "\n",
+    "# Display the result\n",
+    "average_claims"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7efbabc6",
+   "metadata": {},
+   "source": [
+    "Among customers who responded \"Yes\" to the last marketing campaign, the average total claim amounts by policy type and gender suggest the following:\n",
+    "\n",
+    "1. **Corporate Auto**:\n",
+    "   - Female customers have a slightly higher average total claim amount ($433.74) compared to male customers ($408.58).\n",
+    "\n",
+    "2. **Personal Auto**:\n",
+    "   - Male customers have a slightly higher average total claim amount ($457.01) compared to female customers ($452.97).\n",
+    "\n",
+    "3. **Special Auto**:\n",
+    "   - Female customers have a slightly higher average total claim amount ($453.28) compared to male customers ($429.53).\n",
+    "\n",
+    "### General Observations:\n",
+    "- For **Corporate Auto** policies, the female average exceeds the male average by about $25 (roughly 6%).\n",
+    "- For **Personal Auto** policies, the male average exceeds the female average by about $4 (under 1%), so the two are effectively equal.\n",
+    "- For **Special Auto** policies, the female average exceeds the male average by about $24 (roughly 5.5%).\n",
+    "\n",
+    "### Marketing Implications:\n",
+    "- The insurance company might consider tailoring its marketing strategies based on these insights, but with caution: the gaps are small (at most about 6%), group sizes are not shown, and a higher average claim does not by itself indicate a more responsive or more profitable segment.\n",
+    "- Understanding these differences can help in designing more effective marketing campaigns and customer service strategies to address the specific needs and behaviors of different customer segments."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0",
+   "metadata": {
+    "id": "7050f4ac-53c5-4193-a3c0-8699b87196f0"
+   },
+   "source": [
+    "3. Analyze the total number of customers who have policies in each state, and then filter the results to only include states where there are more than 500 customers."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 3,
+   "id": "c98ca385",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d",
-      "metadata": {
-        "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d"
-      },
-      "source": [
-        "4. Find the maximum, minimum, and median customer lifetime value by education level and gender. Write your conclusions."
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>State</th>\n",
+       "      <th>Customer Count</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Arizona</td>\n",
+       "      <td>1937</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>California</td>\n",
+       "      <td>3552</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Nevada</td>\n",
+       "      <td>993</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Oregon</td>\n",
+       "      <td>2909</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>Washington</td>\n",
+       "      <td>888</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "        State  Customer Count\n",
+       "0     Arizona            1937\n",
+       "1  California            3552\n",
+       "2      Nevada             993\n",
+       "3      Oregon            2909\n",
+       "4  Washington             888"
       ]
-    },
+     },
+     "execution_count": 3,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Group by 'State' and count the number of customers\n",
+    "state_counts = df.groupby('State').size().reset_index(name='Customer Count')\n",
+    "\n",
+    "# Filter the results to include only states with more than 500 customers\n",
+    "filtered_states = state_counts[state_counts['Customer Count'] > 500]\n",
+    "\n",
+    "# Display the result\n",
+    "filtered_states"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d",
+   "metadata": {
+    "id": "b60a4443-a1a7-4bbf-b78e-9ccdf9895e0d"
+   },
+   "source": [
+    "4. Find the maximum, minimum, and median customer lifetime value by education level and gender. Write your conclusions."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 4,
+   "id": "82a6611a",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "b42999f9-311f-481e-ae63-40a5577072c5",
-      "metadata": {
-        "id": "b42999f9-311f-481e-ae63-40a5577072c5"
-      },
-      "source": [
-        "## Bonus"
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Education</th>\n",
+       "      <th>Gender</th>\n",
+       "      <th>max</th>\n",
+       "      <th>min</th>\n",
+       "      <th>median</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Bachelor</td>\n",
+       "      <td>F</td>\n",
+       "      <td>73225.95652</td>\n",
+       "      <td>1904.000852</td>\n",
+       "      <td>5640.505303</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Bachelor</td>\n",
+       "      <td>M</td>\n",
+       "      <td>67907.27050</td>\n",
+       "      <td>1898.007675</td>\n",
+       "      <td>5548.031892</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>College</td>\n",
+       "      <td>F</td>\n",
+       "      <td>61850.18803</td>\n",
+       "      <td>1898.683686</td>\n",
+       "      <td>5623.611187</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>College</td>\n",
+       "      <td>M</td>\n",
+       "      <td>61134.68307</td>\n",
+       "      <td>1918.119700</td>\n",
+       "      <td>6005.847375</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>4</th>\n",
+       "      <td>Doctor</td>\n",
+       "      <td>F</td>\n",
+       "      <td>44856.11397</td>\n",
+       "      <td>2395.570000</td>\n",
+       "      <td>5332.462694</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>5</th>\n",
+       "      <td>Doctor</td>\n",
+       "      <td>M</td>\n",
+       "      <td>32677.34284</td>\n",
+       "      <td>2267.604038</td>\n",
+       "      <td>5577.669457</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>6</th>\n",
+       "      <td>High School or Below</td>\n",
+       "      <td>F</td>\n",
+       "      <td>55277.44589</td>\n",
+       "      <td>2144.921535</td>\n",
+       "      <td>6039.553187</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>7</th>\n",
+       "      <td>High School or Below</td>\n",
+       "      <td>M</td>\n",
+       "      <td>83325.38119</td>\n",
+       "      <td>1940.981221</td>\n",
+       "      <td>6286.731006</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>8</th>\n",
+       "      <td>Master</td>\n",
+       "      <td>F</td>\n",
+       "      <td>51016.06704</td>\n",
+       "      <td>2417.777032</td>\n",
+       "      <td>5729.855012</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>9</th>\n",
+       "      <td>Master</td>\n",
+       "      <td>M</td>\n",
+       "      <td>50568.25912</td>\n",
+       "      <td>2272.307310</td>\n",
+       "      <td>5579.099207</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "              Education Gender          max          min       median\n",
+       "0              Bachelor      F  73225.95652  1904.000852  5640.505303\n",
+       "1              Bachelor      M  67907.27050  1898.007675  5548.031892\n",
+       "2               College      F  61850.18803  1898.683686  5623.611187\n",
+       "3               College      M  61134.68307  1918.119700  6005.847375\n",
+       "4                Doctor      F  44856.11397  2395.570000  5332.462694\n",
+       "5                Doctor      M  32677.34284  2267.604038  5577.669457\n",
+       "6  High School or Below      F  55277.44589  2144.921535  6039.553187\n",
+       "7  High School or Below      M  83325.38119  1940.981221  6286.731006\n",
+       "8                Master      F  51016.06704  2417.777032  5729.855012\n",
+       "9                Master      M  50568.25912  2272.307310  5579.099207"
       ]
-    },
+     },
+     "execution_count": 4,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Group by 'Education' and 'Gender' and calculate the max, min, and median 'Customer Lifetime Value'\n",
+    "clv_stats = df.groupby(['Education', 'Gender'])['Customer Lifetime Value'].agg(['max', 'min', 'median']).reset_index()\n",
+    "\n",
+    "# Display the result\n",
+    "clv_stats"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "f38647e8",
+   "metadata": {},
+   "source": [
+    "Based on the provided data, here are some conclusions regarding the customer lifetime value (CLV) by education level and gender:\n",
+    "\n",
+    "### General Observations:\n",
+    "1. **Bachelor's Degree**:\n",
+    "   - Female customers have a higher maximum CLV ($73,225.96) compared to male customers ($67,907.27).\n",
+    "   - The median CLV is slightly higher for female customers ($5,640.51) compared to male customers ($5,548.03).\n",
+    "\n",
+    "2. **College**:\n",
+    "   - Female customers have a higher maximum CLV ($61,850.19) compared to male customers ($61,134.68).\n",
+    "   - The median CLV is higher for male customers ($6,005.85) compared to female customers ($5,623.61).\n",
+    "\n",
+    "3. **Doctorate**:\n",
+    "   - Female customers have a higher maximum CLV ($44,856.11) compared to male customers ($32,677.34).\n",
+    "   - The median CLV is slightly higher for male customers ($5,577.67) compared to female customers ($5,332.46).\n",
+    "\n",
+    "4. **High School or Below**:\n",
+    "   - Male customers have a significantly higher maximum CLV ($83,325.38) compared to female customers ($55,277.45).\n",
+    "   - The median CLV is higher for male customers ($6,286.73) compared to female customers ($6,039.55).\n",
+    "\n",
+    "5. **Master's Degree**:\n",
+    "   - Female customers have a slightly higher maximum CLV ($51,016.07) compared to male customers ($50,568.26).\n",
+    "   - The median CLV is slightly higher for female customers ($5,729.86) compared to male customers ($5,579.10).\n",
+    "\n",
+    "### Conclusions:\n",
+    "- **High School or Below**: Male customers in this education category have the highest maximum CLV ($83,325.38) among all groups, indicating that this segment might include some highly valuable customers.\n",
+    "- **Bachelor's Degree**: Female customers with a Bachelor's degree have the highest maximum CLV ($73,225.96) among female customers, suggesting that this segment is particularly valuable.\n",
+    "- **Doctorate**: Female customers with a Doctorate degree have a higher maximum CLV compared to their male counterparts, but the median CLV is higher for males.\n",
+    "- **College and Master's Degree**: The differences in maximum CLV between genders are small (under $1,000), but the median gap for College customers (about $382 in favor of males) is the largest of any education level.\n",
+    "\n",
+    "### Marketing Implications:\n",
+    "- The insurance company might consider focusing marketing efforts on male customers with a high school education or below, as they have both the highest maximum CLV ($83,325.38) and the highest median CLV ($6,286.73) of any group.\n",
+    "- Female customers with a Bachelor's degree also represent a valuable segment and could be targeted with tailored marketing campaigns.\n",
+    "- Understanding these differences can help in designing more effective marketing strategies and customer retention programs to maximize the lifetime value of different customer segments."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b42999f9-311f-481e-ae63-40a5577072c5",
+   "metadata": {
+    "id": "b42999f9-311f-481e-ae63-40a5577072c5"
+   },
+   "source": [
+    "## Bonus"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "81ff02c5-6584-4f21-a358-b918697c6432",
+   "metadata": {
+    "id": "81ff02c5-6584-4f21-a358-b918697c6432"
+   },
+   "source": [
+    "5. The marketing team wants to analyze the number of policies sold by state and month. Present the data in a table where the months are arranged as columns and the states are arranged as rows."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 6,
+   "id": "4f175f4e",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "81ff02c5-6584-4f21-a358-b918697c6432",
-      "metadata": {
-        "id": "81ff02c5-6584-4f21-a358-b918697c6432"
-      },
-      "source": [
-        "5. The marketing team wants to analyze the number of policies sold by state and month. Present the data in a table where the months are arranged as columns and the states are arranged as rows."
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th>Month</th>\n",
+       "      <th>1</th>\n",
+       "      <th>2</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>State</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Arizona</th>\n",
+       "      <td>1008</td>\n",
+       "      <td>929</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>California</th>\n",
+       "      <td>1918</td>\n",
+       "      <td>1634</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Nevada</th>\n",
+       "      <td>551</td>\n",
+       "      <td>442</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Oregon</th>\n",
+       "      <td>1565</td>\n",
+       "      <td>1344</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Washington</th>\n",
+       "      <td>463</td>\n",
+       "      <td>425</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "Month          1     2\n",
+       "State                 \n",
+       "Arizona     1008   929\n",
+       "California  1918  1634\n",
+       "Nevada       551   442\n",
+       "Oregon      1565  1344\n",
+       "Washington   463   425"
       ]
-    },
+     },
+     "execution_count": 6,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Parse the date column as datetime; the CSV stores dates like 1/24/11 (two-digit year)\n",
+    "df['Effective To Date'] = pd.to_datetime(df['Effective To Date'], format='%m/%d/%y')\n",
+    "\n",
+    "# Extract the month from the date column\n",
+    "df['Month'] = df['Effective To Date'].dt.month\n",
+    "\n",
+    "# Group by 'State' and 'Month' and count the number of policies sold\n",
+    "policies_by_state_month = df.groupby(['State', 'Month']).size().reset_index(name='Policy Count')\n",
+    "\n",
+    "# Pivot the table to get months as columns and states as rows\n",
+    "pivot_table = policies_by_state_month.pivot(index='State', columns='Month', values='Policy Count').fillna(0)\n",
+    "\n",
+    "# Display the result\n",
+    "pivot_table"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "b6aec097-c633-4017-a125-e77a97259cda",
+   "metadata": {
+    "id": "b6aec097-c633-4017-a125-e77a97259cda"
+   },
+   "source": [
+    "6.  Display a new DataFrame that contains the number of policies sold by month, by state, for the top 3 states with the highest number of policies sold.\n",
+    "\n",
+    "*Hint:*\n",
+    "- *To accomplish this, you will first need to group the data by state and month, then count the number of policies sold for each group. Afterwards, you will need to sort the data by the count of policies sold in descending order.*\n",
+    "- *Next, you will select the top 3 states with the highest number of policies sold.*\n",
+    "- *Finally, you will create a new DataFrame that contains the number of policies sold by month for each of the top 3 states.*"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 7,
+   "id": "327c08b5",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "b6aec097-c633-4017-a125-e77a97259cda",
-      "metadata": {
-        "id": "b6aec097-c633-4017-a125-e77a97259cda"
-      },
-      "source": [
-        "6.  Display a new DataFrame that contains the number of policies sold by month, by state, for the top 3 states with the highest number of policies sold.\n",
-        "\n",
-        "*Hint:*\n",
-        "- *To accomplish this, you will first need to group the data by state and month, then count the number of policies sold for each group. Afterwards, you will need to sort the data by the count of policies sold in descending order.*\n",
-        "- *Next, you will select the top 3 states with the highest number of policies sold.*\n",
-        "- *Finally, you will create a new DataFrame that contains the number of policies sold by month for each of the top 3 states.*"
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th>Month</th>\n",
+       "      <th>1</th>\n",
+       "      <th>2</th>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>State</th>\n",
+       "      <th></th>\n",
+       "      <th></th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>Arizona</th>\n",
+       "      <td>1008</td>\n",
+       "      <td>929</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>California</th>\n",
+       "      <td>1918</td>\n",
+       "      <td>1634</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>Oregon</th>\n",
+       "      <td>1565</td>\n",
+       "      <td>1344</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "Month          1     2\n",
+       "State                 \n",
+       "Arizona     1008   929\n",
+       "California  1918  1634\n",
+       "Oregon      1565  1344"
       ]
-    },
+     },
+     "execution_count": 7,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "# Extract the month from the date column\n",
+    "df['Month'] = df['Effective To Date'].dt.month\n",
+    "\n",
+    "# Group by 'State' and 'Month' and count the number of policies sold\n",
+    "policies_by_state_month = df.groupby(['State', 'Month']).size().reset_index(name='Policy Count')\n",
+    "\n",
+    "# Sum the total number of policies sold by state\n",
+    "total_policies_by_state = policies_by_state_month.groupby('State')['Policy Count'].sum().reset_index()\n",
+    "\n",
+    "# Sort the states by the total number of policies sold in descending order\n",
+    "top_states = total_policies_by_state.sort_values(by='Policy Count', ascending=False).head(3)\n",
+    "\n",
+    "# Filter the original grouped data to include only the top 3 states\n",
+    "top_states_policies = policies_by_state_month[policies_by_state_month['State'].isin(top_states['State'])]\n",
+    "\n",
+    "# Pivot the table to get months as columns and states as rows\n",
+    "pivot_table_top_states = top_states_policies.pivot(index='State', columns='Month', values='Policy Count').fillna(0)\n",
+    "\n",
+    "# Display the result\n",
+    "pivot_table_top_states"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009",
+   "metadata": {
+    "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009"
+   },
+   "source": [
+    "7. The marketing team wants to analyze the effect of different marketing channels on the customer response rate.\n",
+    "\n",
+    "Hint: You can use melt to unpivot the data and create a table that shows the customer response rate (those who responded \"Yes\") by marketing channel."
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "id": "933238fd",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009",
-      "metadata": {
-        "id": "ba975b8a-a2cf-4fbf-9f59-ebc381767009"
-      },
-      "source": [
-        "7. The marketing team wants to analyze the effect of different marketing channels on the customer response rate.\n",
-        "\n",
-        "Hint: You can use melt to unpivot the data and create a table that shows the customer response rate (those who responded \"Yes\") by marketing channel."
+     "data": {
+      "text/plain": [
+       "array(['Agent', 'Call Center', 'Branch', 'Web'], dtype=object)"
       ]
-    },
+     },
+     "execution_count": 8,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df['Sales Channel'].unique()"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 15,
+   "id": "262da406",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "markdown",
-      "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d",
-      "metadata": {
-        "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d"
-      },
-      "source": [
-        "External Resources for Data Filtering: https://towardsdatascience.com/filtering-data-frames-in-pandas-b570b1f834b9"
+     "data": {
+      "text/plain": [
+       "Unnamed: 0                                int64\n",
+       "Customer                                 object\n",
+       "State                                    object\n",
+       "Customer Lifetime Value                 float64\n",
+       "Response                                 object\n",
+       "Coverage                                 object\n",
+       "Education                                object\n",
+       "Effective To Date                datetime64[ns]\n",
+       "EmploymentStatus                         object\n",
+       "Gender                                   object\n",
+       "Income                                    int64\n",
+       "Location Code                            object\n",
+       "Marital Status                           object\n",
+       "Monthly Premium Auto                      int64\n",
+       "Months Since Last Claim                 float64\n",
+       "Months Since Policy Inception             int64\n",
+       "Number of Open Complaints               float64\n",
+       "Number of Policies                        int64\n",
+       "Policy Type                              object\n",
+       "Policy                                   object\n",
+       "Renew Offer Type                         object\n",
+       "Sales Channel                            object\n",
+       "Total Claim Amount                      float64\n",
+       "Vehicle Class                            object\n",
+       "Vehicle Size                             object\n",
+       "Vehicle Type                             object\n",
+       "Month                                     int32\n",
+       "dtype: object"
       ]
-    },
+     },
+     "execution_count": 15,
+     "metadata": {},
+     "output_type": "execute_result"
+    }
+   ],
+   "source": [
+    "df.dtypes"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "id": "88eea386",
+   "metadata": {},
+   "outputs": [
     {
-      "cell_type": "code",
-      "execution_count": null,
-      "id": "449513f4-0459-46a0-a18d-9398d974c9ad",
-      "metadata": {
-        "id": "449513f4-0459-46a0-a18d-9398d974c9ad"
-      },
-      "outputs": [],
-      "source": [
-        "# your code goes here"
+     "data": {
+      "text/html": [
+       "<div>\n",
+       "<style scoped>\n",
+       "    .dataframe tbody tr th:only-of-type {\n",
+       "        vertical-align: middle;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe tbody tr th {\n",
+       "        vertical-align: top;\n",
+       "    }\n",
+       "\n",
+       "    .dataframe thead th {\n",
+       "        text-align: right;\n",
+       "    }\n",
+       "</style>\n",
+       "<table border=\"1\" class=\"dataframe\">\n",
+       "  <thead>\n",
+       "    <tr style=\"text-align: right;\">\n",
+       "      <th></th>\n",
+       "      <th>Sales Channel</th>\n",
+       "      <th>Response Count</th>\n",
+       "      <th>Total Count</th>\n",
+       "      <th>Response Rate</th>\n",
+       "    </tr>\n",
+       "  </thead>\n",
+       "  <tbody>\n",
+       "    <tr>\n",
+       "      <th>0</th>\n",
+       "      <td>Agent</td>\n",
+       "      <td>742</td>\n",
+       "      <td>4121</td>\n",
+       "      <td>0.180053</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>1</th>\n",
+       "      <td>Branch</td>\n",
+       "      <td>326</td>\n",
+       "      <td>3022</td>\n",
+       "      <td>0.107876</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>2</th>\n",
+       "      <td>Call Center</td>\n",
+       "      <td>221</td>\n",
+       "      <td>2141</td>\n",
+       "      <td>0.103223</td>\n",
+       "    </tr>\n",
+       "    <tr>\n",
+       "      <th>3</th>\n",
+       "      <td>Web</td>\n",
+       "      <td>177</td>\n",
+       "      <td>1626</td>\n",
+       "      <td>0.108856</td>\n",
+       "    </tr>\n",
+       "  </tbody>\n",
+       "</table>\n",
+       "</div>"
+      ],
+      "text/plain": [
+       "  Sales Channel  Response Count  Total Count  Response Rate\n",
+       "0         Agent             742         4121       0.180053\n",
+       "1        Branch             326         3022       0.107876\n",
+       "2   Call Center             221         2141       0.103223\n",
+       "3           Web             177         1626       0.108856"
       ]
+     },
+     "execution_count": 14,
+     "metadata": {},
+     "output_type": "execute_result"
     }
-  ],
-  "metadata": {
-    "kernelspec": {
-      "display_name": "Python 3 (ipykernel)",
-      "language": "python",
-      "name": "python3"
-    },
-    "language_info": {
-      "codemirror_mode": {
-        "name": "ipython",
-        "version": 3
-      },
-      "file_extension": ".py",
-      "mimetype": "text/x-python",
-      "name": "python",
-      "nbconvert_exporter": "python",
-      "pygments_lexer": "ipython3",
-      "version": "3.9.13"
-    },
-    "colab": {
-      "provenance": []
-    }
+   ],
+   "source": [
+    "# Cast 'Response' to str so the comparison with 'Yes' below is a plain string comparison\n",
+    "df['Response'] = df['Response'].astype(str)\n",
+    "\n",
+    "# Identify the marketing channel column\n",
+    "marketing_channel_column = 'Sales Channel'\n",
+    "\n",
+    "# Filter the data to include only customers who responded \"Yes\"\n",
+    "responded_yes_df = df[df['Response'] == 'Yes']\n",
+    "\n",
+    "# Calculate the response count for each marketing channel\n",
+    "response_count = responded_yes_df.groupby(marketing_channel_column).size().reset_index(name='Response Count')\n",
+    "\n",
+    "# Calculate the total number of customers for each marketing channel\n",
+    "total_customers = df.groupby(marketing_channel_column).size().reset_index(name='Total Count')\n",
+    "\n",
+    "# Merge the response count and total count dataframes\n",
+    "merged_df = pd.merge(response_count, total_customers, on=marketing_channel_column)\n",
+    "\n",
+    "# Calculate the response rate\n",
+    "merged_df['Response Rate'] = merged_df['Response Count'] / merged_df['Total Count']\n",
+    "\n",
+    "# Display the result\n",
+    "merged_df"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "ee462336",
+   "metadata": {},
+   "source": [
+    "Conclusions on how the marketing channel relates to the customer response rate:\n",
+    "\n",
+    "### Data Summary:\n",
+    "- **Agent**:\n",
+    "  - Response Count: 742\n",
+    "  - Total Count: 4121\n",
+    "  - Response Rate: 18.01%\n",
+    "\n",
+    "- **Branch**:\n",
+    "  - Response Count: 326\n",
+    "  - Total Count: 3022\n",
+    "  - Response Rate: 10.79%\n",
+    "\n",
+    "- **Call Center**:\n",
+    "  - Response Count: 221\n",
+    "  - Total Count: 2141\n",
+    "  - Response Rate: 10.32%\n",
+    "\n",
+    "- **Web**:\n",
+    "  - Response Count: 177\n",
+    "  - Total Count: 1626\n",
+    "  - Response Rate: 10.89%\n",
+    "\n",
+    "### Conclusions:\n",
+    "1. **Agent Channel**:\n",
+    "   - The Agent channel has the highest response rate at 18.01%.\n",
+    "   - This indicates that customers are more likely to respond positively when contacted through an agent compared to other channels.\n",
+    "   - The high response rate suggests that personal interaction through agents is effective in engaging customers.\n",
+    "\n",
+    "2. **Branch Channel**:\n",
+    "   - The Branch channel has a response rate of 10.79%.\n",
+    "   - This is well below the Agent channel and almost identical to the Web channel (10.89%).\n",
+    "   - In-person interactions at branches therefore do engage customers, but noticeably less often than agents do.\n",
+    "\n",
+    "3. **Call Center Channel**:\n",
+    "   - The Call Center channel has a response rate of 10.32%.\n",
+    "   - This is the lowest rate of the four channels, slightly below Branch.\n",
+    "   - The gap to Branch and Web is only about half a percentage point, so call centers perform roughly on par with them and clearly behind agents.\n",
+    "\n",
+    "4. **Web Channel**:\n",
+    "   - The Web channel has a response rate of 10.89%.\n",
+    "   - This is comparable to the Branch and Call Center channels and marginally the highest of the three.\n",
+    "   - It suggests that online interactions are about as effective as branch and call center interactions, but still less effective than agent interactions.\n",
+    "\n",
+    "### Marketing Implications:\n",
+    "- **Focus on Agent Channel**: Because it has the highest response rate, the marketing team should consider investing more in the Agent channel. Training and expanding the agent network could yield higher customer engagement and response rates.\n",
+    "- **Enhance Branch and Web Channels**: Since the Branch and Web channels have similar response rates, efforts to improve customer experience in these channels could help increase their effectiveness. This could include better online tools, more personalized in-branch services, and targeted marketing campaigns.\n",
+    "- **Optimize Call Center Operations**: While the Call Center channel has the lowest response rate, it is still a significant channel. Improving call scripts, training call center staff, and using data analytics to target calls more effectively could help improve response rates.\n",
+    "\n",
+    "Overall, the data suggests that personal interactions through agents are the most effective way to engage customers (18.01%), while the Web, Branch, and Call Center channels follow in a tight group between 10.32% and 10.89%."
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d",
+   "metadata": {
+    "id": "e4378d94-48fb-4850-a802-b1bc8f427b2d"
+   },
+   "source": [
+    "External Resources for Data Filtering: https://towardsdatascience.com/filtering-data-frames-in-pandas-b570b1f834b9"
+   ]
+  }
+ ],
+ "metadata": {
+  "colab": {
+   "provenance": []
+  },
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
   },
-  "nbformat": 4,
-  "nbformat_minor": 5
-}
\ No newline at end of file
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.12.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
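
The response-rate calculation in question 7 can probably be written more compactly, and the `melt` hint from the exercise can be followed as well. The sketch below is only an illustration under stated assumptions: it does not load the real CSV but rebuilds a small stand-in DataFrame from the counts shown in the notebook output, and the names `counts`, `response_rate`, `wide`, `long` and `yes_rates` are made up for this example.

```python
import pandas as pd

# (Yes count, total count) per channel, copied from the notebook's output table.
# The rows are rebuilt from these counts purely for illustration; the lab itself
# works on marketing_customer_analysis.csv.
counts = {
    "Agent": (742, 4121),
    "Branch": (326, 3022),
    "Call Center": (221, 2141),
    "Web": (177, 1626),
}
rows = []
for channel, (yes, total) in counts.items():
    rows += [(channel, "Yes")] * yes + [(channel, "No")] * (total - yes)
df = pd.DataFrame(rows, columns=["Sales Channel", "Response"])

# Route 1: the mean of a boolean column is the share of True values,
# so one groupby gives the response rate per channel.
response_rate = (
    df["Response"].eq("Yes")
    .groupby(df["Sales Channel"])
    .mean()
    .rename("Response Rate")
    .reset_index()
)

# Route 2 (the melt hint): build a wide table of row-normalised shares,
# then unpivot it into long form and keep only the "Yes" rows.
wide = pd.crosstab(df["Sales Channel"], df["Response"], normalize="index").reset_index()
long = wide.melt(id_vars="Sales Channel", var_name="Response", value_name="Rate")
yes_rates = long[long["Response"] == "Yes"].reset_index(drop=True)

print(response_rate)
print(yes_rates)
```

Both routes divide by every row in a channel, which matches the notebook's `Total Count` denominator; if rows with a missing `Response` should be excluded instead, they would need to be dropped before grouping.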
