diff --git a/README.es.md b/README.es.md
index d85876a9..72ea33ea 100644
--- a/README.es.md
+++ b/README.es.md
@@ -3,14 +3,14 @@
- Comprender un dataset nuevo.
-- Utilizar los conocimientos adquiridos en el prework para resolver las cuestiones planteadas.
+- Utiliza los conocimientos adquiridos en el prework para resolver las cuestiones planteadas.
- Analizar, si fuera necesario, otras cuestiones.
## 🌱 Cómo iniciar este proyecto
Sigue las siguientes instrucciones:
-1. Crear un nuevo repositorio haciendo fork en el [proyecto de Git](https://github.com/4GeeksAcademy/realestate-datacleanup-exercise) o [haciendo clic aquí](https://github.com/4GeeksAcademy/realestate-datacleanup-exercise/fork).
+1. Crea un nuevo repositorio haciendo fork en el [proyecto de Git](https://github.com/4GeeksAcademy/realestate-datacleanup-exercise) o [haciendo clic aquí](https://github.com/4GeeksAcademy/realestate-datacleanup-exercise/fork).
2. Abre el repositorio creado recientemente en Codespace usando la [extensión del botón de Codespace](https://docs.github.com/en/codespaces/developing-in-codespaces/creating-a-codespace-for-a-repository#creating-a-codespace-for-a-repository).
3. Una vez que el VSCode del Codespace haya terminado de abrirse, comienza tu proyecto siguiendo las instrucciones a continuación.
diff --git a/project.es.ipynb b/project.es.ipynb
index da1f12ef..215eef21 100644
--- a/project.es.ipynb
+++ b/project.es.ipynb
@@ -1,963 +1,2349 @@
{
- "cells": [
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "innocent-university",
- "metadata": {},
- "source": [
- "# Limpieza de bienes raíces\n",
- "\n",
- "Este es un conjunto de datos (dataset) reales que fue descargado usando técnicas de web scraping. La data contiene registros de **Fotocasa**, el cual es uno de los sitios más populares de bienes raíces en España. Por favor no hagas esto (web scraping) a no ser que sea para propósitos académicos.\n",
- "\n",
- "El dataset fue descargado hace algunos años por Henry Navarro y en ningún caso se obtuvo beneficio económico de ello.\n",
- "\n",
- "Contiene miles de datos de casas reales publicadas en la web www.fotocasa.com. Tu objetivo es extraer tanta información como sea posible con el conocimiento que tienes hasta ahora de ciencia de datos, por ejemplo ¿cuál es la casa más cara en todo el dataset?\n",
- "\n",
- "Empecemos precisamente con esa pregunta... ¡Buena suerte!"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "multiple-glass",
- "metadata": {},
- "source": [
- "#### Ejercicio 00. Lee el dataset assets/real_estate.csv e intenta visualizar la tabla (★☆☆)"
- ]
- },
+ "cells": [
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "innocent-university",
+ "metadata": {},
+ "source": [
+ "# Limpieza de bienes raíces\n",
+ "\n",
+ "Este es un conjunto de datos (dataset) reales que fue descargado usando técnicas de web scraping. La data contiene registros de **Fotocasa**, el cual es uno de los sitios más populares de bienes raíces en España. Por favor no hagas esto (web scraping) a no ser que sea para propósitos académicos.\n",
+ "\n",
+ "El dataset fue descargado hace algunos años por Henry Navarro y en ningún caso se obtuvo beneficio económico de ello.\n",
+ "\n",
+ "Contiene miles de datos de casas reales publicadas en la web www.fotocasa.com. Tu objetivo es extraer tanta información como sea posible con el conocimiento que tienes hasta ahora de ciencia de datos, por ejemplo ¿cuál es la casa más cara en todo el dataset?\n",
+ "\n",
+ "Empecemos precisamente con esa pregunta... ¡Buena suerte!"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "multiple-glass",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 00. Lee el dataset assets/real_estate.csv e intenta visualizar la tabla (★☆☆)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 382,
+ "id": "frank-heath",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": 1,
- "id": "frank-heath",
- "metadata": {},
- "outputs": [
- {
- "data": {
- "text/html": [
- "
\n",
- "\n",
- "
\n",
- " \n",
- " \n",
- " \n",
- " Unnamed: 0 \n",
- " id_realEstates \n",
- " isNew \n",
- " realEstate_name \n",
- " phone_realEstate \n",
- " url_inmueble \n",
- " rooms \n",
- " bathrooms \n",
- " surface \n",
- " price \n",
- " ... \n",
- " level4Id \n",
- " level5Id \n",
- " level6Id \n",
- " level7Id \n",
- " level8Id \n",
- " accuracy \n",
- " latitude \n",
- " longitude \n",
- " zipCode \n",
- " customZone \n",
- " \n",
- " \n",
- " \n",
- " \n",
- " 0 \n",
- " 1 \n",
- " 153771986 \n",
- " False \n",
- " ferrari 57 inmobiliaria \n",
- " 912177526.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 3.0 \n",
- " 2.0 \n",
- " 103.0 \n",
- " 195000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,2948276786438 \n",
- " -3,44402412135624 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 1 \n",
- " 2 \n",
- " 153867863 \n",
- " False \n",
- " tecnocasa fuenlabrada ferrocarril \n",
- " 916358736.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 3.0 \n",
- " 1.0 \n",
- " NaN \n",
- " 89000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 1 \n",
- " 40,28674 \n",
- " -3,79351 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 2 \n",
- " 3 \n",
- " 153430440 \n",
- " False \n",
- " look find boadilla \n",
- " 916350408.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 2.0 \n",
- " 2.0 \n",
- " 99.0 \n",
- " 390000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,4115646786438 \n",
- " -3,90662252135624 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 3 \n",
- " 4 \n",
- " 152776331 \n",
- " False \n",
- " tecnocasa fuenlabrada ferrocarril \n",
- " 916358736.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 3.0 \n",
- " 1.0 \n",
- " 86.0 \n",
- " 89000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,2853785786438 \n",
- " -3,79508142135624 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 4 \n",
- " 5 \n",
- " 153180188 \n",
- " False \n",
- " ferrari 57 inmobiliaria \n",
- " 912177526.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 2.0 \n",
- " 2.0 \n",
- " 106.0 \n",
- " 172000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,2998774864376 \n",
- " -3,45226301356237 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " ... \n",
- " \n",
- " \n",
- " 15330 \n",
- " 15331 \n",
- " 153901377 \n",
- " False \n",
- " infocasa consulting \n",
- " 911360461.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 2.0 \n",
- " 1.0 \n",
- " 96.0 \n",
- " 259470 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,45416 \n",
- " -3,70286 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 15331 \n",
- " 15332 \n",
- " 150394373 \n",
- " False \n",
- " inmobiliaria pulpon \n",
- " 912788039.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 3.0 \n",
- " 1.0 \n",
- " 150.0 \n",
- " 165000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,36652 \n",
- " -3,48951 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 15332 \n",
- " 15333 \n",
- " 153901397 \n",
- " False \n",
- " tecnocasa torrelodones \n",
- " 912780348.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 4.0 \n",
- " 2.0 \n",
- " 175.0 \n",
- " 495000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,57444 \n",
- " -3,92124 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 15333 \n",
- " 15334 \n",
- " 152607440 \n",
- " False \n",
- " inmobiliaria pulpon \n",
- " 912788039.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 3.0 \n",
- " 2.0 \n",
- " 101.0 \n",
- " 195000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,36967 \n",
- " -3,48105 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- " 15334 \n",
- " 15335 \n",
- " 153901356 \n",
- " False \n",
- " infocasa consulting \n",
- " 911360461.0 \n",
- " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- " 3.0 \n",
- " 2.0 \n",
- " 152.0 \n",
- " 765000 \n",
- " ... \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 0 \n",
- " 40,45773 \n",
- " -3,69068 \n",
- " NaN \n",
- " NaN \n",
- " \n",
- " \n",
- "
\n",
- "
15335 rows × 37 columns
\n",
- "
"
- ],
- "text/plain": [
- " Unnamed: 0 id_realEstates isNew realEstate_name \\\n",
- "0 1 153771986 False ferrari 57 inmobiliaria \n",
- "1 2 153867863 False tecnocasa fuenlabrada ferrocarril \n",
- "2 3 153430440 False look find boadilla \n",
- "3 4 152776331 False tecnocasa fuenlabrada ferrocarril \n",
- "4 5 153180188 False ferrari 57 inmobiliaria \n",
- "... ... ... ... ... \n",
- "15330 15331 153901377 False infocasa consulting \n",
- "15331 15332 150394373 False inmobiliaria pulpon \n",
- "15332 15333 153901397 False tecnocasa torrelodones \n",
- "15333 15334 152607440 False inmobiliaria pulpon \n",
- "15334 15335 153901356 False infocasa consulting \n",
- "\n",
- " phone_realEstate url_inmueble \\\n",
- "0 912177526.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "1 916358736.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "2 916350408.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "3 916358736.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "4 912177526.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "... ... ... \n",
- "15330 911360461.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "15331 912788039.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "15332 912780348.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "15333 912788039.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "15334 911360461.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
- "\n",
- " rooms bathrooms surface price ... level4Id level5Id level6Id \\\n",
- "0 3.0 2.0 103.0 195000 ... 0 0 0 \n",
- "1 3.0 1.0 NaN 89000 ... 0 0 0 \n",
- "2 2.0 2.0 99.0 390000 ... 0 0 0 \n",
- "3 3.0 1.0 86.0 89000 ... 0 0 0 \n",
- "4 2.0 2.0 106.0 172000 ... 0 0 0 \n",
- "... ... ... ... ... ... ... ... ... \n",
- "15330 2.0 1.0 96.0 259470 ... 0 0 0 \n",
- "15331 3.0 1.0 150.0 165000 ... 0 0 0 \n",
- "15332 4.0 2.0 175.0 495000 ... 0 0 0 \n",
- "15333 3.0 2.0 101.0 195000 ... 0 0 0 \n",
- "15334 3.0 2.0 152.0 765000 ... 0 0 0 \n",
- "\n",
- " level7Id level8Id accuracy latitude longitude zipCode \\\n",
- "0 0 0 0 40,2948276786438 -3,44402412135624 NaN \n",
- "1 0 0 1 40,28674 -3,79351 NaN \n",
- "2 0 0 0 40,4115646786438 -3,90662252135624 NaN \n",
- "3 0 0 0 40,2853785786438 -3,79508142135624 NaN \n",
- "4 0 0 0 40,2998774864376 -3,45226301356237 NaN \n",
- "... ... ... ... ... ... ... \n",
- "15330 0 0 0 40,45416 -3,70286 NaN \n",
- "15331 0 0 0 40,36652 -3,48951 NaN \n",
- "15332 0 0 0 40,57444 -3,92124 NaN \n",
- "15333 0 0 0 40,36967 -3,48105 NaN \n",
- "15334 0 0 0 40,45773 -3,69068 NaN \n",
- "\n",
- " customZone \n",
- "0 NaN \n",
- "1 NaN \n",
- "2 NaN \n",
- "3 NaN \n",
- "4 NaN \n",
- "... ... \n",
- "15330 NaN \n",
- "15331 NaN \n",
- "15332 NaN \n",
- "15333 NaN \n",
- "15334 NaN \n",
- "\n",
- "[15335 rows x 37 columns]"
- ]
- },
- "execution_count": 1,
- "metadata": {},
- "output_type": "execute_result"
- }
+ "data": {
+ "text/html": [
+ "\n",
+ "\n",
+ "
\n",
+ " \n",
+ " \n",
+ " \n",
+ " Unnamed: 0 \n",
+ " id_realEstates \n",
+ " isNew \n",
+ " realEstate_name \n",
+ " phone_realEstate \n",
+ " url_inmueble \n",
+ " rooms \n",
+ " bathrooms \n",
+ " surface \n",
+ " price \n",
+ " ... \n",
+ " level4Id \n",
+ " level5Id \n",
+ " level6Id \n",
+ " level7Id \n",
+ " level8Id \n",
+ " accuracy \n",
+ " latitude \n",
+ " longitude \n",
+ " zipCode \n",
+ " customZone \n",
+ " \n",
+ " \n",
+ " \n",
+ " \n",
+ " 0 \n",
+ " 1 \n",
+ " 153771986 \n",
+ " False \n",
+ " ferrari 57 inmobiliaria \n",
+ " 912177526.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 3.0 \n",
+ " 2.0 \n",
+ " 103.0 \n",
+ " 195000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,2948276786438 \n",
+ " -3,44402412135624 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 1 \n",
+ " 2 \n",
+ " 153867863 \n",
+ " False \n",
+ " tecnocasa fuenlabrada ferrocarril \n",
+ " 916358736.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 3.0 \n",
+ " 1.0 \n",
+ " NaN \n",
+ " 89000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 1 \n",
+ " 40,28674 \n",
+ " -3,79351 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 2 \n",
+ " 3 \n",
+ " 153430440 \n",
+ " False \n",
+ " look find boadilla \n",
+ " 916350408.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 2.0 \n",
+ " 2.0 \n",
+ " 99.0 \n",
+ " 390000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,4115646786438 \n",
+ " -3,90662252135624 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 3 \n",
+ " 4 \n",
+ " 152776331 \n",
+ " False \n",
+ " tecnocasa fuenlabrada ferrocarril \n",
+ " 916358736.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 3.0 \n",
+ " 1.0 \n",
+ " 86.0 \n",
+ " 89000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,2853785786438 \n",
+ " -3,79508142135624 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 4 \n",
+ " 5 \n",
+ " 153180188 \n",
+ " False \n",
+ " ferrari 57 inmobiliaria \n",
+ " 912177526.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 2.0 \n",
+ " 2.0 \n",
+ " 106.0 \n",
+ " 172000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,2998774864376 \n",
+ " -3,45226301356237 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " ... \n",
+ " \n",
+ " \n",
+ " 15330 \n",
+ " 15331 \n",
+ " 153901377 \n",
+ " False \n",
+ " infocasa consulting \n",
+ " 911360461.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 2.0 \n",
+ " 1.0 \n",
+ " 96.0 \n",
+ " 259470 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,45416 \n",
+ " -3,70286 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 15331 \n",
+ " 15332 \n",
+ " 150394373 \n",
+ " False \n",
+ " inmobiliaria pulpon \n",
+ " 912788039.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 3.0 \n",
+ " 1.0 \n",
+ " 150.0 \n",
+ " 165000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,36652 \n",
+ " -3,48951 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 15332 \n",
+ " 15333 \n",
+ " 153901397 \n",
+ " False \n",
+ " tecnocasa torrelodones \n",
+ " 912780348.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 4.0 \n",
+ " 2.0 \n",
+ " 175.0 \n",
+ " 495000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,57444 \n",
+ " -3,92124 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 15333 \n",
+ " 15334 \n",
+ " 152607440 \n",
+ " False \n",
+ " inmobiliaria pulpon \n",
+ " 912788039.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 3.0 \n",
+ " 2.0 \n",
+ " 101.0 \n",
+ " 195000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,36967 \n",
+ " -3,48105 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ " 15334 \n",
+ " 15335 \n",
+ " 153901356 \n",
+ " False \n",
+ " infocasa consulting \n",
+ " 911360461.0 \n",
+ " https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ " 3.0 \n",
+ " 2.0 \n",
+ " 152.0 \n",
+ " 765000 \n",
+ " ... \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 0 \n",
+ " 40,45773 \n",
+ " -3,69068 \n",
+ " NaN \n",
+ " NaN \n",
+ " \n",
+ " \n",
+ "
\n",
+ "
15335 rows × 37 columns
\n",
+ "
"
],
- "source": [
- "import pandas as pd\n",
- "\n",
- "# Este archivo CSV contiene puntos y comas en lugar de comas como separadores\n",
- "ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
- "ds"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "latin-guest",
- "metadata": {},
- "source": [
- "#### Ejercicio 01. ¿Cuál es la casa más cara en todo el dataset? (★☆☆)\n",
- "\n",
- "Imprime la dirección y el precio de la casa seleccionada. Por ejemplo:\n",
- "\n",
- "`La casa con dirección en Calle del Prado, Nº20 es la más cara y su precio es de 5000000 USD`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "developing-optimum",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "lesser-cosmetic",
- "metadata": {},
- "source": [
- "#### Ejercicio 02. ¿Cuál es la casa más barata del dataset? (★☆☆)\n",
- "\n",
- "Imprime la dirección y el precio de la casa seleccionada. Por ejemplo:\n",
- "\n",
- "`La casa con dirección en Calle Alcalá, Nº58 es la más barata y su precio es de 12000 USD`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "lovely-oasis",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "compliant-fellowship",
- "metadata": {},
- "source": [
- "#### Ejercicio 03. ¿Cuál es la casa más grande y la más pequeña del dataset? (★☆☆)\n",
- "\n",
- "Imprime la dirección y el área de las casas seleccionadas. Por ejemplo:\n",
- "\n",
- "`La casa más grande está ubicada en Calle Gran Vía, Nº38 y su superficie es de 5000 metros`\n",
- "\n",
- "`La casa más pequeña está ubicada en Calle Mayor, Nº12 y su superficie es de 200 metros`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "every-tiffany",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "danish-spirit",
- "metadata": {},
- "source": [
- "#### Ejercicio 04. ¿Cuantas poblaciones (columna level5) contiene el dataset? (★☆☆)\n",
- "\n",
- "Imprime el nombre de las poblaciones separadas por coma. Por ejemplo:\n",
- "\n",
- "`> print(populations)`\n",
- "\n",
- "`population1, population2, population3, ...`"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "exciting-accreditation",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "crazy-blame",
- "metadata": {},
- "source": [
- "#### Ejercicio 05. ¿El dataset contiene valores no admitidos (NAs)? (★☆☆)\n",
- "\n",
- "Imprima un booleano (`True` o `False`) seguido de la fila/columna que contiene el NAs."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "transparent-poetry",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "italic-hydrogen",
- "metadata": {},
- "source": [
- "#### Ejercicio 06. Elimina los NAs del dataset, si aplica (★★☆)\n",
- "\n",
- "Imprima una comparación entre las dimensiones del DataFrame original versus el DataFrame después de las eliminaciones.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "id": "administrative-roads",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
- {
- "attachments": {},
- "cell_type": "markdown",
- "id": "middle-china",
- "metadata": {},
- "source": [
- "#### Ejercicio 07. ¿Cuál la media de precios en la población (columna level5) de \"Arroyomolinos (Madrid)\"? (★★☆)\n",
- "\n",
- "Imprima el valor obtenido."
+ "text/plain": [
+ " Unnamed: 0 id_realEstates isNew realEstate_name \\\n",
+ "0 1 153771986 False ferrari 57 inmobiliaria \n",
+ "1 2 153867863 False tecnocasa fuenlabrada ferrocarril \n",
+ "2 3 153430440 False look find boadilla \n",
+ "3 4 152776331 False tecnocasa fuenlabrada ferrocarril \n",
+ "4 5 153180188 False ferrari 57 inmobiliaria \n",
+ "... ... ... ... ... \n",
+ "15330 15331 153901377 False infocasa consulting \n",
+ "15331 15332 150394373 False inmobiliaria pulpon \n",
+ "15332 15333 153901397 False tecnocasa torrelodones \n",
+ "15333 15334 152607440 False inmobiliaria pulpon \n",
+ "15334 15335 153901356 False infocasa consulting \n",
+ "\n",
+ " phone_realEstate url_inmueble \\\n",
+ "0 912177526.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "1 916358736.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "2 916350408.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "3 916358736.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "4 912177526.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "... ... ... \n",
+ "15330 911360461.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15331 912788039.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15332 912780348.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15333 912788039.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15334 911360461.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "\n",
+ " rooms bathrooms surface price ... level4Id level5Id level6Id \\\n",
+ "0 3.0 2.0 103.0 195000 ... 0 0 0 \n",
+ "1 3.0 1.0 NaN 89000 ... 0 0 0 \n",
+ "2 2.0 2.0 99.0 390000 ... 0 0 0 \n",
+ "3 3.0 1.0 86.0 89000 ... 0 0 0 \n",
+ "4 2.0 2.0 106.0 172000 ... 0 0 0 \n",
+ "... ... ... ... ... ... ... ... ... \n",
+ "15330 2.0 1.0 96.0 259470 ... 0 0 0 \n",
+ "15331 3.0 1.0 150.0 165000 ... 0 0 0 \n",
+ "15332 4.0 2.0 175.0 495000 ... 0 0 0 \n",
+ "15333 3.0 2.0 101.0 195000 ... 0 0 0 \n",
+ "15334 3.0 2.0 152.0 765000 ... 0 0 0 \n",
+ "\n",
+ " level7Id level8Id accuracy latitude longitude zipCode \\\n",
+ "0 0 0 0 40,2948276786438 -3,44402412135624 NaN \n",
+ "1 0 0 1 40,28674 -3,79351 NaN \n",
+ "2 0 0 0 40,4115646786438 -3,90662252135624 NaN \n",
+ "3 0 0 0 40,2853785786438 -3,79508142135624 NaN \n",
+ "4 0 0 0 40,2998774864376 -3,45226301356237 NaN \n",
+ "... ... ... ... ... ... ... \n",
+ "15330 0 0 0 40,45416 -3,70286 NaN \n",
+ "15331 0 0 0 40,36652 -3,48951 NaN \n",
+ "15332 0 0 0 40,57444 -3,92124 NaN \n",
+ "15333 0 0 0 40,36967 -3,48105 NaN \n",
+ "15334 0 0 0 40,45773 -3,69068 NaN \n",
+ "\n",
+ " customZone \n",
+ "0 NaN \n",
+ "1 NaN \n",
+ "2 NaN \n",
+ "3 NaN \n",
+ "4 NaN \n",
+ "... ... \n",
+ "15330 NaN \n",
+ "15331 NaN \n",
+ "15332 NaN \n",
+ "15333 NaN \n",
+ "15334 NaN \n",
+ "\n",
+ "[15335 rows x 37 columns]"
]
- },
+ },
+ "execution_count": 382,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import pandas as pd\n",
+ "\n",
+ "# Este archivo CSV contiene puntos y comas en lugar de comas como separadores\n",
+ "ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "ds"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 383,
+ "id": "00d47043",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "nuclear-belief",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "\n",
+ "RangeIndex: 15335 entries, 0 to 15334\n",
+ "Data columns (total 37 columns):\n",
+ " # Column Non-Null Count Dtype \n",
+ "--- ------ -------------- ----- \n",
+ " 0 Unnamed: 0 15335 non-null int64 \n",
+ " 1 id_realEstates 15335 non-null int64 \n",
+ " 2 isNew 15335 non-null bool \n",
+ " 3 realEstate_name 15325 non-null object \n",
+ " 4 phone_realEstate 14541 non-null float64\n",
+ " 5 url_inmueble 15335 non-null object \n",
+ " 6 rooms 14982 non-null float64\n",
+ " 7 bathrooms 14990 non-null float64\n",
+ " 8 surface 14085 non-null float64\n",
+ " 9 price 15335 non-null int64 \n",
+ " 10 date 15335 non-null object \n",
+ " 11 description 15193 non-null object \n",
+ " 12 address 15335 non-null object \n",
+ " 13 country 15335 non-null object \n",
+ " 14 level1 15335 non-null object \n",
+ " 15 level2 15335 non-null object \n",
+ " 16 level3 15335 non-null object \n",
+ " 17 level4 8692 non-null object \n",
+ " 18 level5 15335 non-null object \n",
+ " 19 level6 708 non-null object \n",
+ " 20 level7 13058 non-null object \n",
+ " 21 level8 6756 non-null object \n",
+ " 22 upperLevel 15335 non-null object \n",
+ " 23 countryId 15335 non-null int64 \n",
+ " 24 level1Id 15335 non-null int64 \n",
+ " 25 level2Id 15335 non-null int64 \n",
+ " 26 level3Id 15335 non-null int64 \n",
+ " 27 level4Id 15335 non-null int64 \n",
+ " 28 level5Id 15335 non-null int64 \n",
+ " 29 level6Id 15335 non-null int64 \n",
+ " 30 level7Id 15335 non-null int64 \n",
+ " 31 level8Id 15335 non-null int64 \n",
+ " 32 accuracy 15335 non-null int64 \n",
+ " 33 latitude 15335 non-null object \n",
+ " 34 longitude 15335 non-null object \n",
+ " 35 zipCode 0 non-null float64\n",
+ " 36 customZone 0 non-null float64\n",
+ "dtypes: bool(1), float64(6), int64(13), object(17)\n",
+ "memory usage: 4.2+ MB\n"
+ ]
+ }
+ ],
+ "source": [
+ "ds.info()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 384,
+ "id": "94195880",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "concerned-radical",
- "metadata": {},
- "source": [
- "#### Ejercicio 08. Trazar el histograma de los precios para la población (level5 column) de \"Arroyomolinos (Madrid)\" y explica qué observas (★★☆)\n",
- "\n",
- "Imprime el histograma de los precios y escribe en la celda del Markdown un breve análisis del trazado.\n"
+ "data": {
+ "text/plain": [
+ "Index(['Unnamed: 0', 'id_realEstates', 'isNew', 'realEstate_name',\n",
+ " 'phone_realEstate', 'url_inmueble', 'rooms', 'bathrooms', 'surface',\n",
+ " 'price', 'date', 'description', 'address', 'country', 'level1',\n",
+ " 'level2', 'level3', 'level4', 'level5', 'level6', 'level7', 'level8',\n",
+ " 'upperLevel', 'countryId', 'level1Id', 'level2Id', 'level3Id',\n",
+ " 'level4Id', 'level5Id', 'level6Id', 'level7Id', 'level8Id', 'accuracy',\n",
+ " 'latitude', 'longitude', 'zipCode', 'customZone'],\n",
+ " dtype='object')"
]
- },
+ },
+ "execution_count": 384,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "\n",
+ "ds.columns"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "latin-guest",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 01. ¿Cuál es la casa más cara en todo el dataset? (★☆☆)\n",
+ "\n",
+ "Imprime la dirección y el precio de la casa seleccionada. Por ejemplo:\n",
+ "\n",
+ "`La casa con dirección en Calle del Prado, Nº20 es la más cara y su precio es de 5000000 USD`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 385,
+ "id": "developing-optimum",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "sudden-message",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO: Code"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "La casa con dirección en El Escorial es la más cara y su precio es de 8500000 USD\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Localiza la fila con el precio máximo y lee su dirección\n",
+ "\n",
+ "direccion_maximo = ds.loc[ds['price'].idxmax(), 'address']\n",
+ "precio_maximo = ds['price'].max()\n",
+ "\n",
+ "print(f\"La casa con dirección en {direccion_maximo} es la más cara y su precio es de {precio_maximo} USD\")\n",
+ "\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "lesser-cosmetic",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 02. ¿Cuál es la casa más barata del dataset? (★☆☆)\n",
+ "\n",
+ "Imprime la dirección y el precio de la casa seleccionada. Por ejemplo:\n",
+ "\n",
+ "`La casa con dirección en Calle Alcalá, Nº58 es la más barata y su precio es de 12000 USD`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 386,
+ "id": "lovely-oasis",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "impressed-combination",
- "metadata": {},
- "source": [
- "**TODO: Markdown**. Para escribir aquí, haz doble clic en esta celda, elimina este contenido y coloca lo que quieras escribir. Luego ejecuta la celda."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "La casa con dirección en Parla es la más barata y su precio es de 0 USD\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Nota: un precio de 0 probablemente indica un precio no informado en el anuncio\n",
+ "direccion_min = ds.loc[ds['price'].idxmin(), 'address']\n",
+ "precio_minimo = ds['price'].min()\n",
+ "print(f\"La casa con dirección en {direccion_min} es la más barata y su precio es de {precio_minimo} USD\")"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "compliant-fellowship",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 03. ¿Cuál es la casa más grande y la más pequeña del dataset? (★☆☆)\n",
+ "\n",
+ "Imprime la dirección y el área de las casas seleccionadas. Por ejemplo:\n",
+ "\n",
+ "`La casa más grande está ubicada en Calle Gran Vía, Nº38 y su superficie es de 5000 metros`\n",
+ "\n",
+ "`La casa más pequeña está ubicada en Calle Mayor, Nº12 y su superficie es de 200 metros`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 387,
+ "id": "every-tiffany",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "actual-edinburgh",
- "metadata": {},
- "source": [
- "#### Ejercicio 09. ¿Son los precios promedios de \"Valdemorillo\" y \"Galapagar\" los mismos? (★★☆)\n",
- "\n",
- "Imprime ambos promedios y escribe una conclusión sobre ellos."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "La casa más pequeña está ubicada en Calle Amparo, Madrid Capital y su superficie es de 15.0 metros\n",
+ "La casa más grande está ubicada en Sevilla la Nueva y su superficie es de 249000.0 metros\n"
+ ]
+ }
+ ],
+ "source": [
+ "# min/max e idxmin/idxmax ignoran los NaN de 'surface' por defecto\n",
+ "pequeña = ds['surface'].min()\n",
+ "pequeña_address = ds.loc[ds['surface'].idxmin(), 'address']\n",
+ "grande = ds['surface'].max()\n",
+ "grande_address = ds.loc[ds['surface'].idxmax(), 'address']\n",
+ "\n",
+ "print(f'La casa más pequeña está ubicada en {pequeña_address} y su superficie es de {pequeña} metros')\n",
+ "print(f'La casa más grande está ubicada en {grande_address} y su superficie es de {grande} metros')"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "danish-spirit",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 04. ¿Cuántas poblaciones (columna level5) contiene el dataset? (★☆☆)\n",
+ "\n",
+ "Imprime el nombre de las poblaciones separadas por coma. Por ejemplo:\n",
+ "\n",
+ "`> print(populations)`\n",
+ "\n",
+ "`population1, population2, population3, ...`"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 388,
+ "id": "d9752a75",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "numeric-commerce",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
+ "data": {
+ "text/plain": [
+ "Index(['Unnamed: 0', 'id_realEstates', 'isNew', 'realEstate_name',\n",
+ " 'phone_realEstate', 'url_inmueble', 'rooms', 'bathrooms', 'surface',\n",
+ " 'price', 'date', 'description', 'address', 'country', 'level1',\n",
+ " 'level2', 'level3', 'level4', 'level5', 'level6', 'level7', 'level8',\n",
+ " 'upperLevel', 'countryId', 'level1Id', 'level2Id', 'level3Id',\n",
+ " 'level4Id', 'level5Id', 'level6Id', 'level7Id', 'level8Id', 'accuracy',\n",
+ " 'latitude', 'longitude', 'zipCode', 'customZone'],\n",
+ " dtype='object')"
]
- },
+ },
+ "execution_count": 388,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ds.columns\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 389,
+ "id": "exciting-accreditation",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "lonely-article",
- "metadata": {},
- "source": [
- "#### Ejercicio 10. ¿Son los promedios de precio por metro cuadrado (precio/m2) de \"Valdemorillo\" y \"Galapagar\" los mismos? (★★☆)\n",
- "\n",
- "Imprime ambos promedios de precio por metro cuadrado y escribe una conclusión sobre ellos.\n",
- "\n",
- "Pista: Crea una nueva columna llamada `pps` (*price per square* o precio por metro cuadrado) y luego analiza los valores."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "numero de poblaciones 168\n",
+ "level5\n",
+ " Madrid Capital 6643\n",
+ "Alcalá de Henares 525\n",
+ "Las Rozas de Madrid 383\n",
+ "Móstoles 325\n",
+ "Getafe 290\n",
+ "San Sebastián de los Reyes 280\n",
+ "Boadilla del Monte 275\n",
+ "Parla 272\n",
+ "Valdemoro 262\n",
+ "Torrejón de Ardoz 261\n",
+ "Name: count, dtype: int64\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Count the listings per population; the index holds the population names\n",
+ "#import pandas as pd\n",
+ "#ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "poblaciones = ds.value_counts(\"level5\")\n",
+ "num_poblaciones = len(poblaciones)\n",
+ "\n",
+ "print(\"numero de poblaciones\",num_poblaciones)\n",
+ "# join() needs strings: iterate the index (names), not the Series values (integer counts)\n",
+ "#print(\"Nombres de las poblaciones:\", \", \".join(poblaciones.index))\n",
+ "print(poblaciones.head(10))"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "crazy-blame",
+ "metadata": {},
+ "source": [
+ "#### Exercise 05. Does the dataset contain NAs? (★☆☆)\n",
+ "\n",
+ "Print a boolean value (`True` or `False`) followed by the rows/columns that contain NAs."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 390,
+ "id": "e6cf819c",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "hourly-globe",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
+ "data": {
+ "text/plain": [
+ "Unnamed: 0 False\n",
+ "id_realEstates False\n",
+ "isNew False\n",
+ "realEstate_name True\n",
+ "phone_realEstate True\n",
+ "url_inmueble False\n",
+ "rooms True\n",
+ "bathrooms True\n",
+ "surface True\n",
+ "price False\n",
+ "date False\n",
+ "description True\n",
+ "address False\n",
+ "country False\n",
+ "level1 False\n",
+ "level2 False\n",
+ "level3 False\n",
+ "level4 True\n",
+ "level5 False\n",
+ "level6 True\n",
+ "level7 True\n",
+ "level8 True\n",
+ "upperLevel False\n",
+ "countryId False\n",
+ "level1Id False\n",
+ "level2Id False\n",
+ "level3Id False\n",
+ "level4Id False\n",
+ "level5Id False\n",
+ "level6Id False\n",
+ "level7Id False\n",
+ "level8Id False\n",
+ "accuracy False\n",
+ "latitude False\n",
+ "longitude False\n",
+ "zipCode True\n",
+ "customZone True\n",
+ "dtype: bool"
]
- },
+ },
+ "execution_count": 390,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ds.isnull().any()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 391,
+ "id": "transparent-poetry",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "pleasant-invite",
- "metadata": {},
- "source": [
- "#### Ejercicio 11. Analiza la relación entre la superficie y el precio de las casas. (★★☆)\n",
- "\n",
- "Pista: Puedes hacer un `scatter plot` y luego escribir una conclusión al respecto."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " Unnamed: 0 id_realEstates isNew realEstate_name phone_realEstate \\\n",
+ "0 False False False False False \n",
+ "1 False False False False False \n",
+ "2 False False False False False \n",
+ "3 False False False False False \n",
+ "4 False False False False False \n",
+ "... ... ... ... ... ... \n",
+ "15330 False False False False False \n",
+ "15331 False False False False False \n",
+ "15332 False False False False False \n",
+ "15333 False False False False False \n",
+ "15334 False False False False False \n",
+ "\n",
+ " url_inmueble rooms bathrooms surface price ... level4Id \\\n",
+ "0 False False False False False ... False \n",
+ "1 False False False True False ... False \n",
+ "2 False False False False False ... False \n",
+ "3 False False False False False ... False \n",
+ "4 False False False False False ... False \n",
+ "... ... ... ... ... ... ... ... \n",
+ "15330 False False False False False ... False \n",
+ "15331 False False False False False ... False \n",
+ "15332 False False False False False ... False \n",
+ "15333 False False False False False ... False \n",
+ "15334 False False False False False ... False \n",
+ "\n",
+ " level5Id level6Id level7Id level8Id accuracy latitude longitude \\\n",
+ "0 False False False False False False False \n",
+ "1 False False False False False False False \n",
+ "2 False False False False False False False \n",
+ "3 False False False False False False False \n",
+ "4 False False False False False False False \n",
+ "... ... ... ... ... ... ... ... \n",
+ "15330 False False False False False False False \n",
+ "15331 False False False False False False False \n",
+ "15332 False False False False False False False \n",
+ "15333 False False False False False False False \n",
+ "15334 False False False False False False False \n",
+ "\n",
+ " zipCode customZone \n",
+ "0 True True \n",
+ "1 True True \n",
+ "2 True True \n",
+ "3 True True \n",
+ "4 True True \n",
+ "... ... ... \n",
+ "15330 True True \n",
+ "15331 True True \n",
+ "15332 True True \n",
+ "15333 True True \n",
+ "15334 True True \n",
+ "\n",
+ "[15335 rows x 37 columns]\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Boolean mask with True wherever a value is missing\n",
+ "#import pandas as pd\n",
+ "#ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "encontras_nas = ds.isnull()\n",
+ "\n",
+ "# ds.isnull().sum() would give the number of NAs per column instead\n",
+ "print(encontras_nas)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 392,
+ "id": "9515def3",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "common-drilling",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO: Código"
+ "data": {
+ "text/plain": [
+ "np.int64(0)"
]
- },
+ },
+ "execution_count": 392,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "# Count the fully duplicated rows\n",
+ "ds.duplicated().sum()"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "italic-hydrogen",
+ "metadata": {},
+ "source": [
+ "#### Exercise 06. Delete the NAs of the dataset, if applicable (★★☆)\n",
+ "\n",
+ "Print a comparison between the dimensions of the original DataFrame versus the DataFrame after the deletions.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 393,
+ "id": "administrative-roads",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "ahead-liquid",
- "metadata": {},
- "source": [
- "**TODO: Markdown**. Para escribir aquí, haz doble clic en esta celda, elimina este contenido y coloca lo que quieras escribir. Luego ejecuta la celda."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Empty DataFrame\n",
+ "Columns: [Unnamed: 0, id_realEstates, isNew, realEstate_name, phone_realEstate, url_inmueble, rooms, bathrooms, surface, price, date, description, address, country, level1, level2, level3, level4, level5, level6, level7, level8, upperLevel, countryId, level1Id, level2Id, level3Id, level4Id, level5Id, level6Id, level7Id, level8Id, accuracy, latitude, longitude, zipCode, customZone]\n",
+ "Index: []\n",
+ "\n",
+ "[0 rows x 37 columns]\n"
+ ]
+ }
+ ],
+ "source": [
+ "# dropna returns a new DataFrame; inplace=True would modify ds itself\n",
+ "# Every row has at least one NA, so dropping by row leaves no rows\n",
+ "ds_limpieza = ds.dropna(axis=0)\n",
+ "print(ds_limpieza)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 394,
+ "id": "efe42c12",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "coordinate-sunrise",
- "metadata": {},
- "source": [
- "#### Ejercicio 12. ¿Cuántas agencia de bienes raíces contiene el dataset? (★★☆)\n",
- "\n",
- "Imprime el valor obtenido."
+ "data": {
+ "text/plain": [
+ " Unnamed: 0 id_realEstates isNew realEstate_name \\\n",
+ "0 1 153771986 False ferrari 57 inmobiliaria \n",
+ "1 2 153867863 False tecnocasa fuenlabrada ferrocarril \n",
+ "2 3 153430440 False look find boadilla \n",
+ "3 4 152776331 False tecnocasa fuenlabrada ferrocarril \n",
+ "4 5 153180188 False ferrari 57 inmobiliaria \n",
+ "... ... ... ... ... \n",
+ "15330 15331 153901377 False infocasa consulting \n",
+ "15331 15332 150394373 False inmobiliaria pulpon \n",
+ "15332 15333 153901397 False tecnocasa torrelodones \n",
+ "15333 15334 152607440 False inmobiliaria pulpon \n",
+ "15334 15335 153901356 False infocasa consulting \n",
+ "\n",
+ " phone_realEstate url_inmueble \\\n",
+ "0 912177526.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "1 916358736.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "2 916350408.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "3 916358736.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "4 912177526.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "... ... ... \n",
+ "15330 911360461.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15331 912788039.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15332 912780348.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15333 912788039.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "15334 911360461.0 https://www.fotocasa.es/es/comprar/vivienda/ma... \n",
+ "\n",
+ " rooms bathrooms surface price ... level4Id level5Id level6Id \\\n",
+ "0 3.0 2.0 103.0 195000 ... 0 0 0 \n",
+ "1 3.0 1.0 0.0 89000 ... 0 0 0 \n",
+ "2 2.0 2.0 99.0 390000 ... 0 0 0 \n",
+ "3 3.0 1.0 86.0 89000 ... 0 0 0 \n",
+ "4 2.0 2.0 106.0 172000 ... 0 0 0 \n",
+ "... ... ... ... ... ... ... ... ... \n",
+ "15330 2.0 1.0 96.0 259470 ... 0 0 0 \n",
+ "15331 3.0 1.0 150.0 165000 ... 0 0 0 \n",
+ "15332 4.0 2.0 175.0 495000 ... 0 0 0 \n",
+ "15333 3.0 2.0 101.0 195000 ... 0 0 0 \n",
+ "15334 3.0 2.0 152.0 765000 ... 0 0 0 \n",
+ "\n",
+ " level7Id level8Id accuracy latitude longitude zipCode \\\n",
+ "0 0 0 0 40,2948276786438 -3,44402412135624 0.0 \n",
+ "1 0 0 1 40,28674 -3,79351 0.0 \n",
+ "2 0 0 0 40,4115646786438 -3,90662252135624 0.0 \n",
+ "3 0 0 0 40,2853785786438 -3,79508142135624 0.0 \n",
+ "4 0 0 0 40,2998774864376 -3,45226301356237 0.0 \n",
+ "... ... ... ... ... ... ... \n",
+ "15330 0 0 0 40,45416 -3,70286 0.0 \n",
+ "15331 0 0 0 40,36652 -3,48951 0.0 \n",
+ "15332 0 0 0 40,57444 -3,92124 0.0 \n",
+ "15333 0 0 0 40,36967 -3,48105 0.0 \n",
+ "15334 0 0 0 40,45773 -3,69068 0.0 \n",
+ "\n",
+ " customZone \n",
+ "0 0.0 \n",
+ "1 0.0 \n",
+ "2 0.0 \n",
+ "3 0.0 \n",
+ "4 0.0 \n",
+ "... ... \n",
+ "15330 0.0 \n",
+ "15331 0.0 \n",
+ "15332 0.0 \n",
+ "15333 0.0 \n",
+ "15334 0.0 \n",
+ "\n",
+ "[15335 rows x 37 columns]"
]
- },
+ },
+ "execution_count": 394,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "ds.fillna(value=0)"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "middle-china",
+ "metadata": {},
+ "source": [
+ "#### Exercise 07. What is the mean of the prices in the population (level5 column) of \"Arroyomolinos (Madrid)\"? (★★☆)\n",
+ "\n",
+ "Print the obtained value."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 395,
+ "id": "nuclear-belief",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "valid-honolulu",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "la media de Arroyomolinos es de $ 294541\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Mean listing price for Arroyomolinos (Madrid), truncated to an integer\n",
+ "ds_filtrado = ds[ds['level5'] == 'Arroyomolinos (Madrid)']\n",
+ "media_price = int(ds_filtrado['price'].mean())\n",
+ "print(f'la media de Arroyomolinos es de $ {media_price}')\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "concerned-radical",
+ "metadata": {},
+ "source": [
+ "#### Exercise 08. Plot the histogram of prices for the population (level5 column) of \"Arroyomolinos (Madrid)\" and explain what you observe (★★☆)\n",
+ "\n",
+ "Print the histogram of the prices and write in the Markdown cell a brief analysis about the plot.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 396,
+ "id": "sudden-message",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "binding-ebony",
- "metadata": {},
- "source": [
- "#### Ejercicio 13. ¿Cuál es la población (columna level5) que contiene la mayor cantidad de casas?(★★☆)\n",
- "\n",
- "Imprima la población y el número de casas."
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
- },
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Histogram of listing prices for Arroyomolinos (Madrid)\n",
+ "import pandas as pd\n",
+ "import matplotlib.pyplot as plt\n",
+ "ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "\n",
+ "ds_filtrado = ds[ds['level5'] == 'Arroyomolinos (Madrid)']\n",
+ "# Plot the filtered prices, not the prices of the whole dataset\n",
+ "plt.hist(ds_filtrado['price'], bins=100, edgecolor='black')\n",
+ "plt.xlabel('Price')\n",
+ "plt.ylabel('Number of houses')\n",
+ "plt.title('Price histogram for Arroyomolinos (Madrid)')\n",
+ "plt.show()\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "impressed-combination",
+ "metadata": {},
+ "source": [
+ "Confirmed."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "actual-edinburgh",
+ "metadata": {},
+ "source": [
+ "#### Exercise 09. Are the average prices of \"Valdemorillo\" and \"Galapagar\" the same? (★★☆)\n",
+ "\n",
+ "Print both average prices and then write a conclusion about them."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 397,
+ "id": "numeric-commerce",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "static-perry",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "El promedio de Valdemorillo 363860\n",
+ "El promedio de Galapagar 360063\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Mean price per population, truncated to an integer\n",
+ "val = ds[ds['level5'] == 'Valdemorillo']\n",
+ "promedio_val = int(val[\"price\"].mean())\n",
+ "gal = ds[ds[\"level5\"] == \"Galapagar\"]\n",
+ "promedio_gal = int(gal[\"price\"].mean())\n",
+ "\n",
+ "print(\"El promedio de Valdemorillo\", promedio_val)\n",
+ "print(\"El promedio de Galapagar \", promedio_gal)\n",
+ "# The two means differ by a few thousand; they only match after coarse rounding\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "id": "9057c1fa",
+ "metadata": {},
+ "source": [
+ "They are not the same: Valdemorillo averages 363860 and Galapagar averages 360063."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "lonely-article",
+ "metadata": {},
+ "source": [
+ "#### Exercise 10. Are the average prices per square meter (price/m2) of \"Valdemorillo\" and \"Galapagar\" the same? (★★☆)\n",
+ "\n",
+ "Print both average prices per square meter and then write a conclusion about them.\n",
+ "\n",
+ "Hint: Create a new column called `pps` (price per square meter) and then analyze the values."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 398,
+ "id": "hourly-globe",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Price per square meter for every listing; rows with a missing surface give NaN\n",
+ "ds[\"pps\"] = ds[\"price\"] / ds[\"surface\"]\n",
+ "\n",
+ "val = ds[ds['level5'] == 'Valdemorillo']\n",
+ "gal = ds[ds[\"level5\"] == \"Galapagar\"]\n",
+ "\n",
+ "# mean() skips NaN values by default\n",
+ "pps_val = val[\"pps\"].mean()\n",
+ "pps_gal = gal[\"pps\"].mean()\n",
+ "\n",
+ "print(\"El promedio m2 de Valdemorillo\", pps_val)\n",
+ "print(\"El promedio m2 de Galapagar \", pps_gal)\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "pleasant-invite",
+ "metadata": {},
+ "source": [
+ "#### Exercise 11. Analyze the relation between the surface and the price of the houses (★★☆)\n",
+ "\n",
+ "Hint: You can make a `scatter plot`, then write a conclusion about it."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 399,
+ "id": "common-drilling",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "binary-input",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
- },
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Scatter plot of surface against price\n",
+ "import pandas as pd\n",
+ "import matplotlib.pyplot as plt\n",
+ "ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "\n",
+ "plt.scatter(ds[\"surface\"], ds[\"price\"])\n",
+ "plt.xlabel('Surface (m2)')\n",
+ "plt.ylabel('Price')\n",
+ "plt.title('Relation between surface and price of the houses')\n",
+ "# Limit both axes so that outliers do not flatten the plot\n",
+ "plt.xlim(0, 300)\n",
+ "plt.ylim(0, 300000)\n",
+ "plt.show()\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "ahead-liquid",
+ "metadata": {},
+ "source": [
+ "**TODO: Markdown**. To write here, double-click this cell, remove this content and write your text. Then run the cell."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "coordinate-sunrise",
+ "metadata": {},
+ "source": [
+ "#### Exercise 12. How many real estate agencies does the dataset contain? (★★☆)\n",
+ "\n",
+ "Print the obtained value."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 400,
+ "id": "valid-honolulu",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# nunique() counts distinct agency names; count() would count non-null rows\n",
+ "agencias_br = ds['realEstate_name'].nunique()\n",
+ "\n",
+ "print(f'la cantidad de agencias de bienes raices es de {agencias_br}')"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "binding-ebony",
+ "metadata": {},
+ "source": [
+ "#### Exercise 13. Which is the population (level5 column) that contains the most houses? (★★☆)\n",
+ "\n",
+ "Print both the population and the number of houses."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 401,
+ "id": "static-perry",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# max() on level5 returns the alphabetically last name, so count the houses per population instead\n",
+ "casas_por_poblacion = ds['level5'].value_counts()\n",
+ "poblacion_max = casas_por_poblacion.idxmax()\n",
+ "num_casas = casas_por_poblacion.max()\n",
+ "\n",
+ "print(f'la poblacion que contiene mas casas en la columna level5 es {poblacion_max} con {num_casas} casas')"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "entire-classification",
+ "metadata": {},
+ "source": [
+ "#### Exercise 14. Now let's work with the \"south belt\" of Madrid. Make a subset of the original DataFrame that contains the following populations (level5 column): \"Fuenlabrada\", \"Leganés\", \"Getafe\", \"Alcorcón\" (★★☆)\n",
+ "\n",
+ "Hint: Filter the original DataFrame using the column `level5` and the function `isin`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 402,
+ "id": "binary-input",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "sublime-newspaper",
- "metadata": {},
- "source": [
- "**TODO: Markdown**. Para escribir aquí, haz doble clic en esta celda, elimina este contenido y coloca lo que quieras escribir. Luego ejecuta la celda."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " Unnamed: 0 id_realEstates isNew \\\n",
+ "1 2 153867863 False \n",
+ "3 4 152776331 False \n",
+ "85 86 153152077 False \n",
+ "94 95 153995577 False \n",
+ "109 110 153586414 False \n",
+ "... ... ... ... \n",
+ "15275 15276 153903887 False \n",
+ "15291 15292 151697757 False \n",
+ "15305 15306 153902389 False \n",
+ "15322 15323 153871864 False \n",
+ "15325 15326 153901467 False \n",
+ "\n",
+ " realEstate_name phone_realEstate \\\n",
+ "1 tecnocasa fuenlabrada ferrocarril 916358736.0 \n",
+ "3 tecnocasa fuenlabrada ferrocarril 916358736.0 \n",
+ "85 sinergical inmobiliaria NaN \n",
+ "94 viviendas365com 911226014.0 \n",
+ "109 area uno asesores inmobiliarios 912664081.0 \n",
+ "... ... ... \n",
+ "15275 aliseda servicios de gestion inmobiliaria 911368198.0 \n",
+ "15291 unipiso 912788631.0 \n",
+ "15305 jadein ferrero 914871639.0 \n",
+ "15322 gestion comercial 911220662.0 \n",
+ "15325 montehogar 68 911790675.0 \n",
+ "\n",
+ " url_inmueble rooms bathrooms \\\n",
+ "1 https://www.fotocasa.es/es/comprar/vivienda/ma... 3.0 1.0 \n",
+ "3 https://www.fotocasa.es/es/comprar/vivienda/ma... 3.0 1.0 \n",
+ "85 https://www.fotocasa.es/es/comprar/vivienda/le... 1.0 1.0 \n",
+ "94 https://www.fotocasa.es/es/comprar/vivienda/le... 3.0 2.0 \n",
+ "109 https://www.fotocasa.es/es/comprar/vivienda/ma... 3.0 3.0 \n",
+ "... ... ... ... \n",
+ "15275 https://www.fotocasa.es/es/comprar/vivienda/al... 3.0 1.0 \n",
+ "15291 https://www.fotocasa.es/es/comprar/vivienda/al... 3.0 2.0 \n",
+ "15305 https://www.fotocasa.es/es/comprar/vivienda/ma... 3.0 2.0 \n",
+ "15322 https://www.fotocasa.es/es/comprar/vivienda/ma... 3.0 1.0 \n",
+ "15325 https://www.fotocasa.es/es/comprar/vivienda/ma... 2.0 2.0 \n",
+ "\n",
+ " surface price ... level4Id level5Id level6Id level7Id level8Id \\\n",
+ "1 NaN 89000 ... 0 0 0 0 0 \n",
+ "3 86.0 89000 ... 0 0 0 0 0 \n",
+ "85 50.0 107000 ... 0 0 0 0 0 \n",
+ "94 120.0 320000 ... 0 0 0 0 0 \n",
+ "109 142.0 425000 ... 0 0 0 0 0 \n",
+ "... ... ... ... ... ... ... ... ... \n",
+ "15275 78.0 138000 ... 0 0 0 0 0 \n",
+ "15291 110.0 279000 ... 0 0 0 0 0 \n",
+ "15305 85.0 170000 ... 0 0 0 0 0 \n",
+ "15322 91.0 112000 ... 0 0 0 0 0 \n",
+ "15325 99.0 215000 ... 0 0 0 0 0 \n",
+ "\n",
+ " accuracy latitude longitude zipCode customZone \n",
+ "1 1 40,28674 -3,79351 NaN NaN \n",
+ "3 0 40,2853785786438 -3,79508142135624 NaN NaN \n",
+ "85 1 40,35059 -3,82693 NaN NaN \n",
+ "94 0 40,31933 -3,77574 NaN NaN \n",
+ "109 0 40,3313411 -3,8313868 NaN NaN \n",
+ "... ... ... ... ... ... \n",
+ "15275 1 40,31381 -3,83733 NaN NaN \n",
+ "15291 0 40,3259051 -3,76318 NaN NaN \n",
+ "15305 0 40,2882193 -3,8098617 NaN NaN \n",
+ "15322 0 40,28282 -3,78892 NaN NaN \n",
+ "15325 1 40,28062 -3,79869 NaN NaN \n",
+ "\n",
+ "[907 rows x 37 columns]\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Populations of the south belt of Madrid\n",
+ "ciudades = ['Getafe', 'Leganés', 'Fuenlabrada', 'Alcorcón']\n",
+ "# Keep only the rows whose level5 is one of those populations\n",
+ "cinturon_sur = ds[ds['level5'].isin(ciudades)]\n",
+ "\n",
+ "print(cinturon_sur)\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "severe-fisher",
+ "metadata": {},
+ "source": [
+ "#### Exercise 15. Make a bar plot of the median of the prices and explain what you observe (you must use the subset obtained in Exercise 14) (★★★)\n",
+ "\n",
+ "Print a bar plot of the median of prices and write in the Markdown cell a brief analysis about the plot."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 403,
+ "id": "d1d1495f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "\n",
+ "median_prices = cinturon_sur.groupby('level5')['price'].median()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 404,
+ "id": "lyric-bunch",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "speaking-diamond",
- "metadata": {},
- "source": [
- "#### Ejercicio 16. Calcula la media y la varianza de muestra para las siguientes variables: precio, habitaciones, superficie y baños (debes usar el subconjunto obtenido del Ejercicio 14) (★★★)\n",
- "\n",
- "Imprime ambos valores por cada variable."
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
- },
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Bar chart of the median price per population (median_prices comes from the previous cell)\n",
+ "import pandas as pd\n",
+ "import matplotlib.pyplot as plt\n",
+ "ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "\n",
+ "plt.bar(median_prices.index, median_prices.values)\n",
+ "plt.xlabel('Population')\n",
+ "plt.ylabel('Median price')\n",
+ "plt.show()\n"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "sublime-newspaper",
+ "metadata": {},
+ "source": [
+ "**TODO: Markdown**. To write here, double-click this cell, remove this content and write your text. Then run the cell."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "speaking-diamond",
+ "metadata": {},
+ "source": [
+ "#### Exercise 16. Calculate the sample mean and variance of the following variables: price, rooms, surface and bathrooms (you must use the subset obtained in Exercise 14) (★★★)\n",
+ "\n",
+ "Print both values for each variable."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 405,
+ "id": "0425cda0",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "ciudades=['Getafe', 'Leganés','Fuenlabrada','Alcorcón']\n",
+ "cinturon_sur = ds[ds['level5'].isin(ciudades)]"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 406,
+ "id": "random-feeling",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "random-feeling",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "                     price                   rooms               surface  \\\n",
+ "                      mean           var      mean       var        mean   \n",
+ "level5                                                                     \n",
+ "Alcorcón     230071.052632  1.594783e+10  2.914894  0.933895  105.913295   \n",
+ "Fuenlabrada  177198.021459  4.701021e+09  3.025974  0.355844  103.624365   \n",
+ "Getafe       265040.500000  2.098267e+10  3.151724  0.772748  126.896266   \n",
+ "Leganés      208682.010309  1.191394e+10  2.906736  0.824590  105.852273   \n",
+ "\n",
+ "                          bathrooms            \n",
+ "                     var       mean       var  \n",
+ "level5                                         \n",
+ "Alcorcón     4244.323834   1.623656  0.592735  \n",
+ "Fuenlabrada  2264.643893   1.445415  0.353367  \n",
+ "Getafe       5828.110028   1.865052  0.658809  \n",
+ "Leganés      3987.475195   1.518135  0.553055  \n"
+ ]
+ }
+ ],
+ "source": [
+ "# Media y varianza muestral (ddof=1, el valor por defecto de pandas) de cada variable, por población\n",
+ "estadisticas = cinturon_sur.groupby('level5')[['price', 'rooms', 'surface', 'bathrooms']].agg(['mean', 'var'])\n",
+ "print(estadisticas)"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "revolutionary-matrix",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 17. ¿Cuál es la casa más cara de cada población? Debes usar el subset obtenido en la pregunta 14 (★★☆)\n",
+ "\n",
+ "Imprime tanto la dirección como el precio de la casa seleccionada de cada población. Puedes imprimir un DataFrame o una sola línea para cada población."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 407,
+ "id": "fifteen-browse",
+ "metadata": {},
+ "outputs": [],
+ "source": [
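+ "# Precio máximo por población\n",
+ "ciudadmax_price = cinturon_sur.groupby('level5')['price'].max()\n",
+ "# idxmax() por grupo devuelve la fila de la casa más cara de cada población (dirección y precio)\n",
+ "ciudad_price = cinturon_sur.loc[cinturon_sur.groupby('level5')['price'].idxmax(), ['level5', 'address', 'price']]"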
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 408,
+ "id": "2fd97081",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "revolutionary-matrix",
- "metadata": {},
- "source": [
- "#### Ejercicio 17. ¿Cuál es la casa más cara de cada población? Debes usar el subset obtenido en la pregunta 14 (★★☆)\n",
- "\n",
- "Imprime tanto la dirección como el precio de la casa seleccionada de cada población. Puedes imprimir un DataFrame o una sola línea para cada población."
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "level5\n",
+ "Alcorcón 950000\n",
+ "Fuenlabrada 490000\n",
+ "Getafe 1050000\n",
+ "Leganés 650000\n",
+ "Name: price, dtype: int64\n",
+ "la casa mas cara de cada poblacion es Getafe\n"
+ ]
+ }
+ ],
+ "source": [
+ "print(ciudadmax_price)\n",
+ "print('Casa más cara por población:')\n",
+ "print(ciudad_price)"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "activated-knight",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 18. Normaliza la variable de precios para cada población y traza los 4 histogramas en el mismo gráfico (debes usar el subconjunto obtenido en la pregunta 14) (★★★)\n",
+ "\n",
+ "Para el método de normalización, puedes usar el que consideres adecuado, no hay una única respuesta correcta para esta pregunta. Imprime el gráfico y escribe en la celda de Markdown un breve análisis sobre el gráfico.\n",
+ "\n",
+ "Pista: Puedes ayudarte revisando la demostración multihist de Matplotlib."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 409,
+ "id": "civic-meditation",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "fifteen-browse",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/tmp/ipykernel_605/3375357450.py:1: SettingWithCopyWarning: \n",
+ "A value is trying to be set on a copy of a slice from a DataFrame.\n",
+ "Try using .loc[row_indexer,col_indexer] = value instead\n",
+ "\n",
+ "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
+ " cinturon_sur['normalized_price'] = cinturon_sur.groupby('level5')['price'].transform(lambda x: (x - x.mean()) / x.std())\n"
+ ]
},
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "activated-knight",
- "metadata": {},
- "source": [
- "#### Ejercicio 18. Normaliza la variable de precios para cada población y traza los 4 histogramas en el mismo gráfico (debes usar el subconjunto obtenido en la pregunta 14) (★★★)\n",
- "\n",
- "Para el método de normalización, puedes usar el que consideres adecuado, no hay una única respuesta correcta para esta pregunta. Imprime el gráfico y escribe en la celda de Markdown un breve análisis sobre el gráfico.\n",
- "\n",
- "Pista: Puedes ayudarte revisando la demostración multihist de Matplotlib."
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
- },
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
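+ "# Normalización z-score por población; assign() devuelve un DataFrame nuevo y evita SettingWithCopyWarning\n",
+ "cinturon_sur = cinturon_sur.assign(normalized_price=cinturon_sur.groupby('level5')['price'].transform(lambda x: (x - x.mean()) / x.std()))\n",
+ "plt.figure(figsize=(10, 6))\n",
+ "# El histograma se dibuja dentro del bucle para superponer las 4 poblaciones\n",
+ "for ciudad in ciudades:\n",
+ "    subset = cinturon_sur[cinturon_sur['level5'] == ciudad]\n",
+ "    plt.hist(subset['normalized_price'].dropna(), bins=30, alpha=0.5, label=ciudad)\n",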
+ "plt.xlabel('Precio Normalizado')\n",
+ "plt.ylabel('Frecuencia')\n",
+ "plt.title('Histogramas de Precios Normalizados por Ciudad')\n",
+ "plt.legend()\n",
+ "\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "precise-heavy",
+ "metadata": {},
+ "source": [
+ "**TODO: Markdown**. Para escribir aquí, haz doble clic en esta celda, elimina este contenido y coloca lo que quieras escribir. Luego ejecuta la celda."
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "patent-jonathan",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 19. ¿Qué puedes decir sobre el precio por metro cuadrado (precio/m2) entre los municipios de 'Getafe' y 'Alcorcón'? Debes usar el subconjunto obtenido en la pregunta 14 (★★☆)\n",
+ "\n",
+ "Pista: Crea una nueva columna llamada `pps` (price per square en inglés) y luego analiza los valores"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 410,
+ "id": "c9d3847f",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "# Definir las ciudades\n",
+ "ciudades = ['Getafe', 'Leganés', 'Fuenlabrada', 'Alcorcón']\n",
+ "# Filtrar el DataFrame; .copy() evita SettingWithCopyWarning al añadir columnas después\n",
+ "cinturon_sur = ds[ds['level5'].isin(ciudades)].copy()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 411,
+ "id": "initial-liverpool",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "civic-meditation",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
- ]
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "count 241.000000\n",
+ "mean 2066.314949\n",
+ "std 741.872702\n",
+ "min 0.000000\n",
+ "25% 1684.285714\n",
+ "50% 1973.333333\n",
+ "75% 2628.787879\n",
+ "max 3827.160494\n",
+ "Name: pps, dtype: float64\n",
+ "count 173.000000\n",
+ "mean 2239.302480\n",
+ "std 539.951527\n",
+ "min 604.761905\n",
+ "25% 1904.081633\n",
+ "50% 2207.792208\n",
+ "75% 2472.727273\n",
+ "max 3698.159509\n",
+ "Name: pps, dtype: float64\n"
+ ]
},
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "precise-heavy",
- "metadata": {},
- "source": [
- "**TODO: Markdown**. Para escribir aquí, haz doble clic en esta celda, elimina este contenido y coloca lo que quieras escribir. Luego ejecuta la celda."
- ]
- },
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/tmp/ipykernel_605/500706172.py:5: SettingWithCopyWarning: \n",
+ "A value is trying to be set on a copy of a slice from a DataFrame.\n",
+ "Try using .loc[row_indexer,col_indexer] = value instead\n",
+ "\n",
+ "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
+ " getafe['pps'] = getafe['price'] / getafe['surface']\n",
+ "/tmp/ipykernel_605/500706172.py:6: SettingWithCopyWarning: \n",
+ "A value is trying to be set on a copy of a slice from a DataFrame.\n",
+ "Try using .loc[row_indexer,col_indexer] = value instead\n",
+ "\n",
+ "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
+ " alcorcon['pps'] = alcorcon['price'] / alcorcon['surface']\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Filtrar cada municipio; .copy() evita SettingWithCopyWarning al añadir la columna\n",
+ "getafe = cinturon_sur[cinturon_sur['level5'] == 'Getafe'].copy()\n",
+ "alcorcon = cinturon_sur[cinturon_sur['level5'] == 'Alcorcón'].copy()\n",
+ "# pps: precio por metro cuadrado\n",
+ "getafe['pps'] = getafe['price'] / getafe['surface']\n",
+ "alcorcon['pps'] = alcorcon['price'] / alcorcon['surface']\n",
+ "\n",
+ "pps_getafe = getafe['pps'].describe()\n",
+ "pps_alcorcon = alcorcon['pps'].describe()\n",
+ "\n",
+ "print(\"Getafe:\")\n",
+ "print(pps_getafe)\n",
+ "print(\"Alcorcón:\")\n",
+ "print(pps_alcorcon)"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "enhanced-moscow",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 20. Realiza el mismo gráfico para 4 poblaciones diferentes (columna level5) y colócalos en el mismo gráfico. Debes usar el subconjunto obtenido en la pregunta 14 (★★☆) \n",
+ "Pista: Haz un diagrama de dispersión de cada población usando subgráficos (subplots)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 412,
+ "id": "accepting-airfare",
+ "metadata": {},
+ "outputs": [
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "patent-jonathan",
- "metadata": {},
- "source": [
- "#### Ejercicio 19. ¿Qué puedes decir sobre el precio por metro cuadrado (precio/m2) entre los municipios de 'Getafe' y 'Alcorcón'? Debes usar el subconjunto obtenido en la pregunta 14 (★★☆)\n",
- "\n",
- "Pista: Crea una nueva columna llamada `pps` (price per square en inglés) y luego analiza los valores"
- ]
- },
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "/tmp/ipykernel_605/2202286744.py:2: SettingWithCopyWarning: \n",
+ "A value is trying to be set on a copy of a slice from a DataFrame.\n",
+ "Try using .loc[row_indexer,col_indexer] = value instead\n",
+ "\n",
+ "See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy\n",
+ " cinturon_sur['pps'] = cinturon_sur['price'] / cinturon_sur['surface']\n"
+ ]
+ }
+ ],
+ "source": [
+ "# Precio por metro cuadrado; assign() devuelve un DataFrame nuevo y evita SettingWithCopyWarning\n",
+ "cinturon_sur = cinturon_sur.assign(pps=cinturon_sur['price'] / cinturon_sur['surface'])"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 413,
+ "id": "2de92e87",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "initial-liverpool",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "enhanced-moscow",
- "metadata": {},
- "source": [
- "#### Ejercicio 20. Realiza el mismo gráfico para 4 poblaciones diferentes (columna level5) y colócalos en el mismo gráfico. Debes usar el subconjunto obtenido en la pregunta 14 (★★☆) \n",
- "Pista: Haz un diagrama de dispersión de cada población usando subgráficos (subplots)."
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "cell_type": "code",
- "execution_count": null,
- "id": "accepting-airfare",
- "metadata": {},
- "outputs": [],
- "source": [
- "# TODO"
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
+ },
+ "metadata": {},
+ "output_type": "display_data"
},
{
- "attachments": {},
- "cell_type": "markdown",
- "id": "blocked-effects",
- "metadata": {},
- "source": [
- "#### Ejercicio 21. Realiza un trazado de las coordenadas (columnas latitud y longitud) del cinturón sur de Madrid por color de cada población (debes usar el subconjunto obtenido del Ejercicio 14) (★★★★)\n",
- "\n",
- "Ejecuta la siguiente celda y luego comienza a codear en la siguiente. Debes implementar un código simple que transforme las columnas de coordenadas en un diccionario de Python (agrega más información si es necesario) y agrégala al mapa."
+ "data": {
+ "image/png": "",
+ "text/plain": [
+ ""
]
- },
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "# Un subgráfico por población dentro de la misma figura, con ejes compartidos para poder comparar\n",
+ "fig, axes = plt.subplots(2, 2, figsize=(12, 8), sharex=True, sharey=True)\n",
+ "for ax, ciudad in zip(axes.flat, ciudades):\n",
+ "    subset = cinturon_sur[cinturon_sur['level5'] == ciudad]\n",
+ "    ax.scatter(subset['surface'], subset['pps'], alpha=0.6)\n",
+ "    ax.set_title(ciudad)\n",
+ "    ax.set_xlabel('Área (m²)')\n",
+ "    ax.set_ylabel('Precio por Metro Cuadrado (€/m²)')\n",
+ "    ax.grid(True)\n",
+ "fig.suptitle('Gráfico de Dispersión del Precio por Metro Cuadrado')\n",
+ "fig.tight_layout()\n",
+ "plt.show()"
+ ]
+ },
+ {
+ "attachments": {},
+ "cell_type": "markdown",
+ "id": "blocked-effects",
+ "metadata": {},
+ "source": [
+ "#### Ejercicio 21. Realiza un trazado de las coordenadas (columnas latitud y longitud) del cinturón sur de Madrid por color de cada población (debes usar el subconjunto obtenido del Ejercicio 14) (★★★★)\n",
+ "\n",
+ "Ejecuta la siguiente celda y luego comienza a codear en la siguiente. Debes implementar un código simple que transforme las columnas de coordenadas en un diccionario de Python (agrega más información si es necesario) y agrégala al mapa."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 414,
+ "id": "d6ae7f56",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "headed-privacy",
- "metadata": {},
- "outputs": [],
- "source": [
- "from ipyleaflet import Map, basemaps\n",
- "\n",
- "# Mapa centrado en (60 grados latitud y -2.2 grados longitud)\n",
- "# Latitud, longitud\n",
- "map = Map(center = (60, -2.2), zoom = 2, min_zoom = 1, max_zoom = 20, \n",
- " basemap=basemaps.Stamen.Terrain)\n",
- "map"
- ]
- },
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "Defaulting to user installation because normal site-packages is not writeable\n",
+ "Requirement already satisfied: ipyleaflet in /home/vscode/.local/lib/python3.11/site-packages (0.19.2)\n",
+ "Requirement already satisfied: pandas in /home/vscode/.local/lib/python3.11/site-packages (2.2.3)\n",
+ "Requirement already satisfied: branca>=0.5.0 in /home/vscode/.local/lib/python3.11/site-packages (from ipyleaflet) (0.8.0)\n",
+ "Requirement already satisfied: ipywidgets<9,>=7.6.0 in /home/vscode/.local/lib/python3.11/site-packages (from ipyleaflet) (8.1.5)\n",
+ "Requirement already satisfied: jupyter-leaflet<0.20,>=0.19 in /home/vscode/.local/lib/python3.11/site-packages (from ipyleaflet) (0.19.2)\n",
+ "Requirement already satisfied: traittypes<3,>=0.2.1 in /home/vscode/.local/lib/python3.11/site-packages (from ipyleaflet) (0.2.1)\n",
+ "Requirement already satisfied: xyzservices>=2021.8.1 in /home/vscode/.local/lib/python3.11/site-packages (from ipyleaflet) (2024.9.0)\n",
+ "Requirement already satisfied: numpy>=1.23.2 in /home/vscode/.local/lib/python3.11/site-packages (from pandas) (2.1.2)\n",
+ "Requirement already satisfied: python-dateutil>=2.8.2 in /home/vscode/.local/lib/python3.11/site-packages (from pandas) (2.9.0.post0)\n",
+ "Requirement already satisfied: pytz>=2020.1 in /home/vscode/.local/lib/python3.11/site-packages (from pandas) (2024.2)\n",
+ "Requirement already satisfied: tzdata>=2022.7 in /home/vscode/.local/lib/python3.11/site-packages (from pandas) (2024.2)\n",
+ "Requirement already satisfied: jinja2>=3 in /home/vscode/.local/lib/python3.11/site-packages (from branca>=0.5.0->ipyleaflet) (3.1.4)\n",
+ "Requirement already satisfied: comm>=0.1.3 in /home/vscode/.local/lib/python3.11/site-packages (from ipywidgets<9,>=7.6.0->ipyleaflet) (0.2.2)\n",
+ "Requirement already satisfied: ipython>=6.1.0 in /home/vscode/.local/lib/python3.11/site-packages (from ipywidgets<9,>=7.6.0->ipyleaflet) (8.28.0)\n",
+ "Requirement already satisfied: traitlets>=4.3.1 in /home/vscode/.local/lib/python3.11/site-packages (from ipywidgets<9,>=7.6.0->ipyleaflet) (5.14.3)\n",
+ "Requirement already satisfied: widgetsnbextension~=4.0.12 in /home/vscode/.local/lib/python3.11/site-packages (from ipywidgets<9,>=7.6.0->ipyleaflet) (4.0.13)\n",
+ "Requirement already satisfied: jupyterlab-widgets~=3.0.12 in /home/vscode/.local/lib/python3.11/site-packages (from ipywidgets<9,>=7.6.0->ipyleaflet) (3.0.13)\n",
+ "Requirement already satisfied: six>=1.5 in /home/vscode/.local/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)\n",
+ "Requirement already satisfied: decorator in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (5.1.1)\n",
+ "Requirement already satisfied: jedi>=0.16 in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.19.1)\n",
+ "Requirement already satisfied: matplotlib-inline in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.1.7)\n",
+ "Requirement already satisfied: prompt-toolkit<3.1.0,>=3.0.41 in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (3.0.48)\n",
+ "Requirement already satisfied: pygments>=2.4.0 in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (2.18.0)\n",
+ "Requirement already satisfied: stack-data in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.6.3)\n",
+ "Requirement already satisfied: typing-extensions>=4.6 in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (4.12.2)\n",
+ "Requirement already satisfied: pexpect>4.3 in /home/vscode/.local/lib/python3.11/site-packages (from ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (4.9.0)\n",
+ "Requirement already satisfied: MarkupSafe>=2.0 in /home/vscode/.local/lib/python3.11/site-packages (from jinja2>=3->branca>=0.5.0->ipyleaflet) (2.1.5)\n",
+ "Requirement already satisfied: parso<0.9.0,>=0.8.3 in /home/vscode/.local/lib/python3.11/site-packages (from jedi>=0.16->ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.8.4)\n",
+ "Requirement already satisfied: ptyprocess>=0.5 in /home/vscode/.local/lib/python3.11/site-packages (from pexpect>4.3->ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.7.0)\n",
+ "Requirement already satisfied: wcwidth in /home/vscode/.local/lib/python3.11/site-packages (from prompt-toolkit<3.1.0,>=3.0.41->ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.2.13)\n",
+ "Requirement already satisfied: executing>=1.2.0 in /home/vscode/.local/lib/python3.11/site-packages (from stack-data->ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (2.1.0)\n",
+ "Requirement already satisfied: asttokens>=2.1.0 in /home/vscode/.local/lib/python3.11/site-packages (from stack-data->ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (2.4.1)\n",
+ "Requirement already satisfied: pure-eval in /home/vscode/.local/lib/python3.11/site-packages (from stack-data->ipython>=6.1.0->ipywidgets<9,>=7.6.0->ipyleaflet) (0.2.3)\n",
+ "\n",
+ "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m A new release of pip is available: \u001b[0m\u001b[31;49m23.1.2\u001b[0m\u001b[39;49m -> \u001b[0m\u001b[32;49m24.3.1\u001b[0m\n",
+ "\u001b[1m[\u001b[0m\u001b[34;49mnotice\u001b[0m\u001b[1;39;49m]\u001b[0m\u001b[39;49m To update, run: \u001b[0m\u001b[32;49mpip install --upgrade pip\u001b[0m\n",
+ "Note: you may need to restart the kernel to use updated packages.\n"
+ ]
+ }
+ ],
+ "source": [
+ "%pip install ipyleaflet pandas"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 415,
+ "id": "86899b72",
+ "metadata": {},
+ "outputs": [
{
- "cell_type": "code",
- "execution_count": null,
- "id": "present-mistress",
- "metadata": {},
- "outputs": [],
- "source": [
- "## Aquí: traza la coordenadas de los estados\n",
- "\n",
- "## PON TU CÓDIGO AQUÍ:\n"
+ "data": {
+ "text/plain": [
+ "Index(['Unnamed: 0', 'id_realEstates', 'isNew', 'realEstate_name',\n",
+ " 'phone_realEstate', 'url_inmueble', 'rooms', 'bathrooms', 'surface',\n",
+ " 'price', 'date', 'description', 'address', 'country', 'level1',\n",
+ " 'level2', 'level3', 'level4', 'level5', 'level6', 'level7', 'level8',\n",
+ " 'upperLevel', 'countryId', 'level1Id', 'level2Id', 'level3Id',\n",
+ " 'level4Id', 'level5Id', 'level6Id', 'level7Id', 'level8Id', 'accuracy',\n",
+ " 'latitude', 'longitude', 'zipCode', 'customZone'],\n",
+ " dtype='object')"
]
+ },
+ "execution_count": 415,
+ "metadata": {},
+ "output_type": "execute_result"
}
- ],
- "metadata": {
- "kernelspec": {
- "display_name": "Python 3",
- "language": "python",
- "name": "python3"
- },
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
+ ],
+ "source": [
+ "ds.columns"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 416,
+ "id": "headed-privacy",
+ "metadata": {},
+ "outputs": [],
+ "source": [
+ "from ipyleaflet import Map, basemaps, Marker, LayerGroup\n",
+ "import pandas as pd\n",
+ "\n",
+ "ds = pd.read_csv('assets/real_estate.csv', sep=';')\n",
+ "\n",
+ "ciudades = ['Getafe', 'Leganés', 'Fuenlabrada', 'Alcorcón']\n",
+ "cinturon_sur = ds[ds['level5'].isin(ciudades)]\n",
+ "\n",
+ "# Diccionario {población: [(latitud, longitud), ...]}, descartando filas sin coordenadas\n",
+ "coordenadas = {}\n",
+ "for ciudad in ciudades:\n",
+ "    subset = cinturon_sur[cinturon_sur['level5'] == ciudad].dropna(subset=['latitude', 'longitude'])\n",
+ "    coordenadas[ciudad] = list(zip(subset['latitude'], subset['longitude']))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 417,
+ "id": "present-mistress",
+ "metadata": {},
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.jupyter.widget-view+json": {
+ "model_id": "ffd9fbe7c11b4852baf47a7bad6f0fa5",
+ "version_major": 2,
+ "version_minor": 0
},
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.11.3"
+ "text/plain": [
+ "Map(center=[40.35, -3.75], controls=(ZoomControl(options=['position', 'zoom_in_text', 'zoom_in_title', 'zoom_o…"
+ ]
+ },
+ "execution_count": 417,
+ "metadata": {},
+ "output_type": "execute_result"
}
+ ],
+ "source": [
+ "## Aquí: traza las coordenadas de las poblaciones\n",
+ "from ipyleaflet import CircleMarker\n",
+ "\n",
+ "map = Map(center=(40.35, -3.75), zoom=12, min_zoom=1, max_zoom=20, basemap=basemaps.OpenStreetMap.Mapnik)\n",
+ "colores = {'Getafe': 'blue',\n",
+ "           'Leganés': 'green',\n",
+ "           'Fuenlabrada': 'red',\n",
+ "           'Alcorcón': 'purple'}\n",
+ "\n",
+ "## PON TU CÓDIGO AQUÍ:\n",
+ "for ciudad, coords in coordenadas.items():\n",
+ "    # Un CircleMarker por inmueble, coloreado según su población\n",
+ "    markers = [CircleMarker(location=coord, radius=4, color=colores[ciudad],\n",
+ "                            fill_color=colores[ciudad], fill_opacity=0.6)\n",
+ "               for coord in coords]\n",
+ "    map.add(LayerGroup(layers=markers))\n",
+ "map\n"
+ ]
+ }
+ ],
+ "metadata": {
+ "kernelspec": {
+ "display_name": "Python 3",
+ "language": "python",
+ "name": "python3"
},
- "nbformat": 4,
- "nbformat_minor": 5
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.4"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
}