This project was built with numerous tools and technologies, and this document is only a summary. For more statistical and computational detail, see the Google Colaboratory notebook; for the project's findings, see the Report; and for more information about the datasets, see Stooq and the Data Reader API. Note that the time series forecasting model was not deployed to the cloud: the model must be retrained every time new data is added to the dataset, which is good practice for time series methods, and that is why it was not deployed.
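The retraining requirement mentioned above can be illustrated with a minimal walk-forward sketch. It is not the model used in the notebook: it assumes a simple AR(1) least-squares fit as a stand-in, and the function names `fit_ar1` and `walk_forward` as well as the prices are hypothetical, made up for illustration only.

```python
from statistics import mean


def fit_ar1(series):
    """Least-squares fit of y[t] = a + b * y[t-1]; returns (a, b).

    Stand-in model for illustration only, needs at least 2 points.
    """
    x, y = series[:-1], series[1:]
    mx, my = mean(x), mean(y)
    var = sum((xi - mx) ** 2 for xi in x)
    if var == 0:
        return my, 0.0  # constant history: fall back to predicting the mean
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / var
    return my - b * mx, b


def walk_forward(series, min_train=5):
    """Refit on all data seen so far before each one-step-ahead forecast."""
    forecasts = []
    for t in range(min_train, len(series)):
        a, b = fit_ar1(series[:t])               # retrain on data available at time t
        forecasts.append(a + b * series[t - 1])  # forecast for series[t]
    return forecasts


# Made-up closing prices, purely illustrative (not real Apple data).
closes = [150.0, 151.2, 150.8, 152.5, 153.1, 152.9, 154.0, 155.3]
preds = walk_forward(closes)
for actual, pred in zip(closes[5:], preds):
    print(f"actual={actual:.2f} forecast={pred:.2f}")
```

Each new observation triggers a full refit, which is cheap for this toy model but can be costly for a larger one; that cost is the trade-off described above.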
The purpose of this project is to forecast the closing price of Apple stock. In addition, we answer some business questions, which are in the Jupyter notebook. This is a complete project, as it goes through the usual steps of a data science project: data collection, feature engineering, data cleansing, data transformation, data visualization, data analysis, and modeling.
This project can help users, companies, and shareholders with investments, decision making, and understanding market behavior. With it, shareholders can decide whether to invest in Apple stock or in other stocks. Although this program is part of my personal portfolio, please feel free to use it for study, fixes, and improvements. 🤙
This project was developed as part of my personal portfolio, both to test my skills and to learn, since many technologies could be used in it. Although it is an end-to-end project, it still needs some future improvements: I initially intended to build a program that predicts several different investment markets and compares them, so that I could choose the best among them. I believe it would be possible to create several models for this. However, all of the data analysis, accuracy tests, and performance tests would have to be repeated for each market, and the execution time would be very long. 😃
The application is already running, so you do not need to install anything on your machine. If you want to run the application locally, however, you must install Python and the libraries listed below.
- pandas 1.3.5;
- imbalanced-learn (imblearn) 0.8.1;
- seaborn 0.11.2;
- statsmodels 0.10.2;
- numpy 1.21.6;
- tensorflow 2.8;
- pandas-datareader 0.9.0;
- matplotlib 3.2.2;
- scikit-learn 1.0.2.
Installation of each library is explained in the links above. For every library except TensorFlow and Keras, run the commands below. For TensorFlow and Keras, follow the links recommended above and use the same versions as I did. For the other libraries, run:
pip install scikit-learn==1.0.2
pip install numpy==1.21.6
pip install pandas==1.3.5
pip install imbalanced-learn==0.8.1
pip install statsmodels==0.10.2
pip install pandas_datareader==0.9.0
pip install matplotlib==3.2.2
pip install seaborn==0.11.2
Once that is done, for information about Jupyter Notebook see:
and see the application run on your machine. 😮
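After installing, the pinned versions can be compared against what is actually installed. The following is a minimal sketch, assuming Python 3.8 or later (for `importlib.metadata`); the `PINNED` table and the `check_versions` helper are hypothetical names introduced here, and `imbalanced-learn` is assumed to be the distribution that provides `imblearn`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the install commands in this README;
# tensorflow 2.8 comes from the library list.
PINNED = {
    "scikit-learn": "1.0.2",
    "numpy": "1.21.6",
    "pandas": "1.3.5",
    "imbalanced-learn": "0.8.1",  # assumption: distribution name for imblearn
    "statsmodels": "0.10.2",
    "pandas-datareader": "0.9.0",
    "matplotlib": "3.2.2",
    "seaborn": "0.11.2",
    "tensorflow": "2.8",
}


def check_versions(pinned, get_version=version):
    """Return {name: (wanted, found, ok)}; found is None if not installed."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = get_version(name)
        except PackageNotFoundError:
            found = None
        # Accept an exact match, or a longer version such as 2.8.0 for 2.8.
        ok = found is not None and (
            found == wanted or found.startswith(wanted + ".")
        )
        report[name] = (wanted, found, ok)
    return report


for name, (wanted, found, ok) in check_versions(PINNED).items():
    status = "ok" if ok else "MISMATCH"
    print(f"{name:20s} wanted={wanted:8s} found={found} [{status}]")
```

A `MISMATCH` line does not necessarily mean the notebook will fail, only that the environment differs from the one the project was developed in.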
Feel free to send me criticism, questions, and suggestions:
e-mail: [email protected]
LinkedIn: https://www.linkedin.com/in/marcos-matheus-silva-089699b3/ 🤗
Marcos Matheus de Paiva Silva
The code written in Google Colaboratory was based on the steps in (GÉRON, 2019), (ZAREMBA; SUTSKEVER; VINYALS, 2014), and (SAK; SENIOR; BEAUFAYS, 2014). In addition, this code builds on everything I learned from Jesse E. Agbe, Siddhardhan, Lucas Grassano Lattari, Shashank Kalanithi, Walisson Silva, Israel Dryer, Fernando Nakamuta, Alex Freberg, and Jason Brownlee.
This project is licensed under the MIT License - see the file LICENSE for more details.