A project using Python Pandas library to clean-up a small dataset. Created to practice data cleanup with Python/Pandas with Jupyter Notebook IDE.
Generated by exporting Jupyter file as Markdown. Recommend to use Jupyter Notebook to view uploaded file.
##Goals
- Cleanup a small customer dataset from duplicates/normalize data.
- Provide the Call Centre a table only with customers that they can call.
import pandas as pd
df = pd.read_excel(r"C:\Users\Filip\Desktop\JupyterProjects\datasets\Input.xlsx")
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Not_Useful_Column | |
---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Yes | No | True |
1 | 1002 | Abed | Nadir | 123/643/9775 | 93 West Main Street | No | Yes | False |
2 | 1003 | Walter | /White | 7066950392 | 298 Drugs Driveway | N | NaN | True |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y | True |
4 | 1005 | Jon | Snow | 876|678|3469 | 123 Dragons Road | Y | No | True |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Yes | Yes | True |
6 | 1007 | Jeff | Winger | NaN | 1209 South Street | No | No | False |
7 | 1008 | Sherlock | Holmes | 876|678|3469 | 98 Clue Drive | N | No | False |
8 | 1009 | Gandalf | NaN | N/a | 123 Middle Earth | Yes | NaN | False |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Yes | No | True |
10 | 1011 | Samwise | Gamgee | NaN | 612 Shire Lane, Shire | Yes | No | True |
11 | 1012 | Harry | ...Potter | 7066950392 | 2394 Hogwarts Avenue | Y | NaN | True |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Yes | N | False |
13 | 1014 | Leslie | Knope | 876|678|3469 | 343 City Parkway | Yes | No | False |
14 | 1015 | Toby | Flenderson_ | 304-762-2467 | 214 HR Avenue | N | No | False |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | No | N | False |
16 | 1017 | Michael | Scott | 123/643/9775 | 121 Paper Avenue, Pennsylvania | Yes | No | False |
17 | 1018 | Clark | Kent | 7066950392 | 3498 Super Lane | Y | NaN | True |
18 | 1019 | Creed | Braton | N/a | N/a | N/a | Yes | True |
19 | 1020 | Anakin | Skywalker | 876|678|3469 | 910 Tatooine Road, Tatooine | Yes | N | True |
20 | 1020 | Anakin | Skywalker | 876|678|3469 | 910 Tatooine Road, Tatooine | Yes | N | True |
df = df.drop_duplicates()
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Not_Useful_Column | |
---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Yes | No | True |
1 | 1002 | Abed | Nadir | 123/643/9775 | 93 West Main Street | No | Yes | False |
2 | 1003 | Walter | /White | 7066950392 | 298 Drugs Driveway | N | NaN | True |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y | True |
4 | 1005 | Jon | Snow | 876|678|3469 | 123 Dragons Road | Y | No | True |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Yes | Yes | True |
6 | 1007 | Jeff | Winger | NaN | 1209 South Street | No | No | False |
7 | 1008 | Sherlock | Holmes | 876|678|3469 | 98 Clue Drive | N | No | False |
8 | 1009 | Gandalf | NaN | N/a | 123 Middle Earth | Yes | NaN | False |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Yes | No | True |
10 | 1011 | Samwise | Gamgee | NaN | 612 Shire Lane, Shire | Yes | No | True |
11 | 1012 | Harry | ...Potter | 7066950392 | 2394 Hogwarts Avenue | Y | NaN | True |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Yes | N | False |
13 | 1014 | Leslie | Knope | 876|678|3469 | 343 City Parkway | Yes | No | False |
14 | 1015 | Toby | Flenderson_ | 304-762-2467 | 214 HR Avenue | N | No | False |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | No | N | False |
16 | 1017 | Michael | Scott | 123/643/9775 | 121 Paper Avenue, Pennsylvania | Yes | No | False |
17 | 1018 | Clark | Kent | 7066950392 | 3498 Super Lane | Y | NaN | True |
18 | 1019 | Creed | Braton | N/a | N/a | N/a | Yes | True |
19 | 1020 | Anakin | Skywalker | 876|678|3469 | 910 Tatooine Road, Tatooine | Yes | N | True |
df = df.drop(columns = "Not_Useful_Column")
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | |
---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Yes | No |
1 | 1002 | Abed | Nadir | 123/643/9775 | 93 West Main Street | No | Yes |
2 | 1003 | Walter | /White | 7066950392 | 298 Drugs Driveway | N | NaN |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y |
4 | 1005 | Jon | Snow | 876|678|3469 | 123 Dragons Road | Y | No |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Yes | Yes |
6 | 1007 | Jeff | Winger | NaN | 1209 South Street | No | No |
7 | 1008 | Sherlock | Holmes | 876|678|3469 | 98 Clue Drive | N | No |
8 | 1009 | Gandalf | NaN | N/a | 123 Middle Earth | Yes | NaN |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Yes | No |
10 | 1011 | Samwise | Gamgee | NaN | 612 Shire Lane, Shire | Yes | No |
11 | 1012 | Harry | ...Potter | 7066950392 | 2394 Hogwarts Avenue | Y | NaN |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Yes | N |
13 | 1014 | Leslie | Knope | 876|678|3469 | 343 City Parkway | Yes | No |
14 | 1015 | Toby | Flenderson_ | 304-762-2467 | 214 HR Avenue | N | No |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | No | N |
16 | 1017 | Michael | Scott | 123/643/9775 | 121 Paper Avenue, Pennsylvania | Yes | No |
17 | 1018 | Clark | Kent | 7066950392 | 3498 Super Lane | Y | NaN |
18 | 1019 | Creed | Braton | N/a | N/a | N/a | Yes |
19 | 1020 | Anakin | Skywalker | 876|678|3469 | 910 Tatooine Road, Tatooine | Yes | N |
df["Last_Name"] = df["Last_Name"].str.strip("./_")
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | |
---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Yes | No |
1 | 1002 | Abed | Nadir | 123/643/9775 | 93 West Main Street | No | Yes |
2 | 1003 | Walter | White | 7066950392 | 298 Drugs Driveway | N | NaN |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y |
4 | 1005 | Jon | Snow | 876|678|3469 | 123 Dragons Road | Y | No |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Yes | Yes |
6 | 1007 | Jeff | Winger | NaN | 1209 South Street | No | No |
7 | 1008 | Sherlock | Holmes | 876|678|3469 | 98 Clue Drive | N | No |
8 | 1009 | Gandalf | NaN | N/a | 123 Middle Earth | Yes | NaN |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Yes | No |
10 | 1011 | Samwise | Gamgee | NaN | 612 Shire Lane, Shire | Yes | No |
11 | 1012 | Harry | Potter | 7066950392 | 2394 Hogwarts Avenue | Y | NaN |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Yes | N |
13 | 1014 | Leslie | Knope | 876|678|3469 | 343 City Parkway | Yes | No |
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | No |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | No | N |
16 | 1017 | Michael | Scott | 123/643/9775 | 121 Paper Avenue, Pennsylvania | Yes | No |
17 | 1018 | Clark | Kent | 7066950392 | 3498 Super Lane | Y | NaN |
18 | 1019 | Creed | Braton | N/a | N/a | N/a | Yes |
19 | 1020 | Anakin | Skywalker | 876|678|3469 | 910 Tatooine Road, Tatooine | Yes | N |
df["Phone_Number"] = df["Phone_Number"].str.replace("[^a-zA-Z0-9]", "", regex=True)
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | |
---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 1235455421 | 123 Shire Lane, Shire | Yes | No |
1 | 1002 | Abed | Nadir | 1236439775 | 93 West Main Street | No | Yes |
2 | 1003 | Walter | White | NaN | 298 Drugs Driveway | N | NaN |
3 | 1004 | Dwight | Schrute | 1235432345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y |
4 | 1005 | Jon | Snow | 8766783469 | 123 Dragons Road | Y | No |
5 | 1006 | Ron | Swanson | 3047622467 | 768 City Parkway | Yes | Yes |
6 | 1007 | Jeff | Winger | NaN | 1209 South Street | No | No |
7 | 1008 | Sherlock | Holmes | 8766783469 | 98 Clue Drive | N | No |
8 | 1009 | Gandalf | NaN | Na | 123 Middle Earth | Yes | NaN |
9 | 1010 | Peter | Parker | 1235455421 | 25th Main Street, New York | Yes | No |
10 | 1011 | Samwise | Gamgee | NaN | 612 Shire Lane, Shire | Yes | No |
11 | 1012 | Harry | Potter | NaN | 2394 Hogwarts Avenue | Y | NaN |
12 | 1013 | Don | Draper | 1235432345 | 2039 Main Street | Yes | N |
13 | 1014 | Leslie | Knope | 8766783469 | 343 City Parkway | Yes | No |
14 | 1015 | Toby | Flenderson | 3047622467 | 214 HR Avenue | N | No |
15 | 1016 | Ron | Weasley | 1235455421 | 2395 Hogwarts Avenue | No | N |
16 | 1017 | Michael | Scott | 1236439775 | 121 Paper Avenue, Pennsylvania | Yes | No |
17 | 1018 | Clark | Kent | NaN | 3498 Super Lane | Y | NaN |
18 | 1019 | Creed | Braton | Na | N/a | N/a | Yes |
19 | 1020 | Anakin | Skywalker | 8766783469 | 910 Tatooine Road, Tatooine | Yes | N |
#df["Phone_Number"] = df["Phone_Number"].apply(lambda x : x[0:3] + "-" + x[3:6] + x[6:10]) Doesnt work because phone numbers need to be strings first
df["Phone_Number"] = df["Phone_Number"].apply(lambda x : str(x))
df["Phone_Number"] = df["Phone_Number"].apply(lambda x : x[0:3] + "-" + x[3:6] + "-" + x[6:10])
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | |
---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Yes | No |
1 | 1002 | Abed | Nadir | 123-643-9775 | 93 West Main Street | No | Yes |
2 | 1003 | Walter | White | nan-- | 298 Drugs Driveway | N | NaN |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y |
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | No |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Yes | Yes |
6 | 1007 | Jeff | Winger | nan-- | 1209 South Street | No | No |
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | No |
8 | 1009 | Gandalf | NaN | Na-- | 123 Middle Earth | Yes | NaN |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Yes | No |
10 | 1011 | Samwise | Gamgee | nan-- | 612 Shire Lane, Shire | Yes | No |
11 | 1012 | Harry | Potter | nan-- | 2394 Hogwarts Avenue | Y | NaN |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Yes | N |
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Yes | No |
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | No |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | No | N |
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Yes | No |
17 | 1018 | Clark | Kent | nan-- | 3498 Super Lane | Y | NaN |
18 | 1019 | Creed | Braton | Na-- | N/a | N/a | Yes |
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Yes | N |
df[['Street_Name', 'State', 'Zip_Code']] = df['Address'].str.split(',',n=2,expand=True)
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Street_Name | State | Zip_Code | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Yes | No | 123 Shire Lane | Shire | None |
1 | 1002 | Abed | Nadir | 123-643-9775 | 93 West Main Street | No | Yes | 93 West Main Street | None | None |
2 | 1003 | Walter | White | nan-- | 298 Drugs Driveway | N | NaN | 298 Drugs Driveway | None | None |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Yes | Y | 980 Paper Avenue | Pennsylvania | 18503 |
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | No | 123 Dragons Road | None | None |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Yes | Yes | 768 City Parkway | None | None |
6 | 1007 | Jeff | Winger | nan-- | 1209 South Street | No | No | 1209 South Street | None | None |
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | No | 98 Clue Drive | None | None |
8 | 1009 | Gandalf | NaN | Na-- | 123 Middle Earth | Yes | NaN | 123 Middle Earth | None | None |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Yes | No | 25th Main Street | New York | None |
10 | 1011 | Samwise | Gamgee | nan-- | 612 Shire Lane, Shire | Yes | No | 612 Shire Lane | Shire | None |
11 | 1012 | Harry | Potter | nan-- | 2394 Hogwarts Avenue | Y | NaN | 2394 Hogwarts Avenue | None | None |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Yes | N | 2039 Main Street | None | None |
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Yes | No | 343 City Parkway | None | None |
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | No | 214 HR Avenue | None | None |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | No | N | 2395 Hogwarts Avenue | None | None |
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Yes | No | 121 Paper Avenue | Pennsylvania | None |
17 | 1018 | Clark | Kent | nan-- | 3498 Super Lane | Y | NaN | 3498 Super Lane | None | None |
18 | 1019 | Creed | Braton | Na-- | N/a | N/a | Yes | N/a | None | None |
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Yes | N | 910 Tatooine Road | Tatooine | None |
df["Do_Not_Contact"] = df["Do_Not_Contact"].str.replace("Yes","Y")
df["Do_Not_Contact"] = df["Do_Not_Contact"].str.replace("No","N")
df["Paying Customer"] = df["Paying Customer"].str.replace("No","N")
df["Paying Customer"] = df["Paying Customer"].str.replace("Yes","Y")
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Street_Name | State | Zip_Code | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Y | N | 123 Shire Lane | Shire | None |
1 | 1002 | Abed | Nadir | 123-643-9775 | 93 West Main Street | N | Y | 93 West Main Street | None | None |
2 | 1003 | Walter | White | nan-- | 298 Drugs Driveway | N | NaN | 298 Drugs Driveway | None | None |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Y | Y | 980 Paper Avenue | Pennsylvania | 18503 |
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | N | 123 Dragons Road | None | None |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Y | Y | 768 City Parkway | None | None |
6 | 1007 | Jeff | Winger | nan-- | 1209 South Street | N | N | 1209 South Street | None | None |
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | N | 98 Clue Drive | None | None |
8 | 1009 | Gandalf | NaN | Na-- | 123 Middle Earth | Y | NaN | 123 Middle Earth | None | None |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Y | N | 25th Main Street | New York | None |
10 | 1011 | Samwise | Gamgee | nan-- | 612 Shire Lane, Shire | Y | N | 612 Shire Lane | Shire | None |
11 | 1012 | Harry | Potter | nan-- | 2394 Hogwarts Avenue | Y | NaN | 2394 Hogwarts Avenue | None | None |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Y | N | 2039 Main Street | None | None |
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Y | N | 343 City Parkway | None | None |
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | N | 214 HR Avenue | None | None |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | N | N | 2395 Hogwarts Avenue | None | None |
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Y | N | 121 Paper Avenue | Pennsylvania | None |
17 | 1018 | Clark | Kent | nan-- | 3498 Super Lane | Y | NaN | 3498 Super Lane | None | None |
18 | 1019 | Creed | Braton | Na-- | N/a | N/a | Y | N/a | None | None |
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Y | N | 910 Tatooine Road | Tatooine | None |
df = df.replace('N/a','')
df = df.replace('nan--','')
df = df.replace('Na--','')
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Street_Name | State | Zip_Code | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Y | N | 123 Shire Lane | Shire | None |
1 | 1002 | Abed | Nadir | 123-643-9775 | 93 West Main Street | N | Y | 93 West Main Street | None | None |
2 | 1003 | Walter | White | 298 Drugs Driveway | N | NaN | 298 Drugs Driveway | None | None | |
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Y | Y | 980 Paper Avenue | Pennsylvania | 18503 |
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | N | 123 Dragons Road | None | None |
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Y | Y | 768 City Parkway | None | None |
6 | 1007 | Jeff | Winger | 1209 South Street | N | N | 1209 South Street | None | None | |
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | N | 98 Clue Drive | None | None |
8 | 1009 | Gandalf | NaN | 123 Middle Earth | Y | NaN | 123 Middle Earth | None | None | |
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Y | N | 25th Main Street | New York | None |
10 | 1011 | Samwise | Gamgee | 612 Shire Lane, Shire | Y | N | 612 Shire Lane | Shire | None | |
11 | 1012 | Harry | Potter | 2394 Hogwarts Avenue | Y | NaN | 2394 Hogwarts Avenue | None | None | |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Y | N | 2039 Main Street | None | None |
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Y | N | 343 City Parkway | None | None |
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | N | 214 HR Avenue | None | None |
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | N | N | 2395 Hogwarts Avenue | None | None |
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Y | N | 121 Paper Avenue | Pennsylvania | None |
17 | 1018 | Clark | Kent | 3498 Super Lane | Y | NaN | 3498 Super Lane | None | None | |
18 | 1019 | Creed | Braton | Y | None | None | ||||
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Y | N | 910 Tatooine Road | Tatooine | None |
df = df.fillna('')
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Street_Name | State | Zip_Code | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Y | N | 123 Shire Lane | Shire | |
1 | 1002 | Abed | Nadir | 123-643-9775 | 93 West Main Street | N | Y | 93 West Main Street | ||
2 | 1003 | Walter | White | 298 Drugs Driveway | N | 298 Drugs Driveway | ||||
3 | 1004 | Dwight | Schrute | 123-543-2345 | 980 Paper Avenue, Pennsylvania, 18503 | Y | Y | 980 Paper Avenue | Pennsylvania | 18503 |
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | N | 123 Dragons Road | ||
5 | 1006 | Ron | Swanson | 304-762-2467 | 768 City Parkway | Y | Y | 768 City Parkway | ||
6 | 1007 | Jeff | Winger | 1209 South Street | N | N | 1209 South Street | |||
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | N | 98 Clue Drive | ||
8 | 1009 | Gandalf | 123 Middle Earth | Y | 123 Middle Earth | |||||
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Y | N | 25th Main Street | New York | |
10 | 1011 | Samwise | Gamgee | 612 Shire Lane, Shire | Y | N | 612 Shire Lane | Shire | ||
11 | 1012 | Harry | Potter | 2394 Hogwarts Avenue | Y | 2394 Hogwarts Avenue | ||||
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Y | N | 2039 Main Street | ||
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Y | N | 343 City Parkway | ||
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | N | 214 HR Avenue | ||
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | N | N | 2395 Hogwarts Avenue | ||
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Y | N | 121 Paper Avenue | Pennsylvania | |
17 | 1018 | Clark | Kent | 3498 Super Lane | Y | 3498 Super Lane | ||||
18 | 1019 | Creed | Braton | Y | ||||||
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Y | N | 910 Tatooine Road | Tatooine |
for x in df.index:
if df.loc[x, 'Do_Not_Contact'] == 'Y':
df.drop(x, inplace=True)
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Street_Name | State | Zip_Code | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Y | N | 123 Shire Lane | Shire | |
2 | 1003 | Walter | White | 298 Drugs Driveway | N | 298 Drugs Driveway | ||||
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | N | 123 Dragons Road | ||
6 | 1007 | Jeff | Winger | 1209 South Street | N | N | 1209 South Street | |||
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | N | 98 Clue Drive | ||
8 | 1009 | Gandalf | 123 Middle Earth | Y | 123 Middle Earth | |||||
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Y | N | 25th Main Street | New York | |
10 | 1011 | Samwise | Gamgee | 612 Shire Lane, Shire | Y | N | 612 Shire Lane | Shire | ||
11 | 1012 | Harry | Potter | 2394 Hogwarts Avenue | Y | 2394 Hogwarts Avenue | ||||
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Y | N | 2039 Main Street | ||
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Y | N | 343 City Parkway | ||
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | N | 214 HR Avenue | ||
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | N | N | 2395 Hogwarts Avenue | ||
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Y | N | 121 Paper Avenue | Pennsylvania | |
17 | 1018 | Clark | Kent | 3498 Super Lane | Y | 3498 Super Lane | ||||
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Y | N | 910 Tatooine Road | Tatooine |
#df.dropna(subset='Phone_Number', inplace=True) Could have done this if didn't replace NaN cells with empty string earlier
for x in df.index:
if df.loc[x, 'Phone_Number'] == '':
df.drop(x, inplace=True)
df
<style scoped>
.dataframe tbody tr th:only-of-type {
vertical-align: middle;
}
</style>
.dataframe tbody tr th {
vertical-align: top;
}
.dataframe thead th {
text-align: right;
}
CustomerID | First_Name | Last_Name | Phone_Number | Address | Paying Customer | Do_Not_Contact | Street_Name | State | Zip_Code | |
---|---|---|---|---|---|---|---|---|---|---|
0 | 1001 | Frodo | Baggins | 123-545-5421 | 123 Shire Lane, Shire | Y | N | 123 Shire Lane | Shire | |
4 | 1005 | Jon | Snow | 876-678-3469 | 123 Dragons Road | Y | N | 123 Dragons Road | ||
7 | 1008 | Sherlock | Holmes | 876-678-3469 | 98 Clue Drive | N | N | 98 Clue Drive | ||
9 | 1010 | Peter | Parker | 123-545-5421 | 25th Main Street, New York | Y | N | 25th Main Street | New York | |
12 | 1013 | Don | Draper | 123-543-2345 | 2039 Main Street | Y | N | 2039 Main Street | ||
13 | 1014 | Leslie | Knope | 876-678-3469 | 343 City Parkway | Y | N | 343 City Parkway | ||
14 | 1015 | Toby | Flenderson | 304-762-2467 | 214 HR Avenue | N | N | 214 HR Avenue | ||
15 | 1016 | Ron | Weasley | 123-545-5421 | 2395 Hogwarts Avenue | N | N | 2395 Hogwarts Avenue | ||
16 | 1017 | Michael | Scott | 123-643-9775 | 121 Paper Avenue, Pennsylvania | Y | N | 121 Paper Avenue | Pennsylvania | |
19 | 1020 | Anakin | Skywalker | 876-678-3469 | 910 Tatooine Road, Tatooine | Y | N | 910 Tatooine Road | Tatooine |
#df.reset_index(inplace=True) No need to do this since we dont export Index = false
df.to_excel('output.xlsx', index=False)