Home
Ane edited this page Aug 2, 2022 · 6 revisions
Welcome to the boulder wiki!
This wiki contains how-to steps to develop the project, for internal use.
# list virtual environments
pyenv virtualenvs
# activate virtualenv
pyenv activate boulderenv
# deactivate virtualenv
pyenv deactivate
Set up Node.js and the AWS CDK. Docker must be running to deploy the Python Lambda function.
export AWS_PROFILE=XXXX
cdk deploy
pip install -r requirements.txt
Run Streamlit locally. Set up the AWS profile to access the dataset in AWS:
export AWS_PROFILE=my_profile
streamlit run app.py
Front-end deployment: Heroku
- Create an app in Heroku and set a name. In this case, bouldern.
- Create a Dockerfile with the specific streamlit commands
- Set web and backend in heroku.yml
- Create environment variables in Heroku: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, OWM_API from PyOWM
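Because the app depends on the environment variables listed above, a start-up check can make a missing one obvious in the Heroku logs. The following is a minimal sketch, not code from the repo: the helper name `missing_env_vars` is hypothetical, and only the three variable names come from this page.

```python
import os

# Variable names taken from the Heroku setup list on this page.
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "OWM_API")


def missing_env_vars(env=None):
    """Return the required variable names that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes the check
    easy to try out. Hypothetical helper, not part of the boulder repo.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

One possible use is to call `missing_env_vars()` near the top of app.py and stop with a clear message when the returned list is not empty.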
Log in to Heroku from the terminal
# attach project to heroku app
heroku git:remote -a bouldern
# log in with the CLI. Docker must be running
heroku container:login
Push and release project in Heroku
# push changes to heroku
heroku container:push web
# release app
heroku container:release web
# check logs
heroku logs --tail
Alternatively, and this is how it is set up now, you can connect Heroku to GitHub so that commits to the main branch trigger a Heroku deployment.
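For the manual route, the login, push and release commands can be chained so that a release never runs after a failed push. This is only a sketch under assumptions: the Heroku CLI is on the PATH, Docker is running, and the git remote is already attached to the app. The names `DEPLOY_COMMANDS` and `deploy` are hypothetical.

```python
import subprocess

APP_PROCESS = "web"  # process type used in the commands on this page

# Same commands as the manual steps, in the order they must run.
DEPLOY_COMMANDS = [
    ["heroku", "container:login"],
    ["heroku", "container:push", APP_PROCESS],
    ["heroku", "container:release", APP_PROCESS],
]


def deploy(runner=subprocess.run):
    """Run login, push and release in order, stopping at the first failure.

    `runner` is injectable so the sequence can be dry-run without the
    Heroku CLI. With the default runner, check=True raises
    subprocess.CalledProcessError when a command exits non-zero.
    """
    for command in DEPLOY_COMMANDS:
        runner(command, check=True)
```

Calling `deploy()` runs the real commands. Passing a function that only records its arguments gives a dry run.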
Legal info about scraping/crawling
- Web Scraping and Crawling Are Perfectly Legal, Right?
- The robots.txt file doesn't prohibit scraping the main webpage
- No prohibitions in the AGB (general terms and conditions) or the Datenschutzerklärung (privacy policy). There are no Nutzungsbedingungen (terms of use)
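The robots.txt check can be repeated programmatically with Python's standard `urllib.robotparser`. The sketch below assumes nothing about the real site: the rules and the example.com URLs are made up for illustration, and `is_allowed` is a hypothetical helper name.

```python
from urllib.robotparser import RobotFileParser


def is_allowed(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text permits fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Made-up rules for illustration; NOT the real site's robots.txt.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /admin/
"""

main_page_ok = is_allowed(EXAMPLE_ROBOTS, "https://example.com/")
admin_page_ok = is_allowed(EXAMPLE_ROBOTS, "https://example.com/admin/x")
```

To check a live site instead of a pasted string, `RobotFileParser` also offers `set_url()` followed by `read()`, which download the file before `can_fetch()` is called. A permissive robots.txt is only one input to the legal question and does not replace reading the site's terms.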
Issue 23 of the repo was fixed with this script:
import pandas as pd

df = pd.read_csv('boulderdata.csv')
# shift the times: :15 -> :20 and :45 -> :40
df['current_time'] = df['current_time'].apply(lambda x: x.replace(':15', ':20').replace(':45', ':40'))
# remove the rows whose time contains :30
df = df[~df['current_time'].str.contains(':30')]
# write current_time as the index column instead of a numeric index
df = df.set_index('current_time')
# note: this overwrites the input file
df.to_csv('boulderdata.csv')
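Since the script overwrites boulderdata.csv, it can help to try the same mapping on a few sample values first. The sketch below mirrors the replace-then-filter logic in plain Python. It assumes the `current_time` values end in an `HH:MM` time with no seconds field, which this page does not confirm; the helper name `normalize_times` is hypothetical.

```python
def normalize_times(times):
    """Apply the script's mapping: :15 -> :20, :45 -> :40, drop values with :30.

    Works on plain strings with substring matching, exactly like the pandas
    version, so a seconds field such as ':30' would also match.
    """
    kept = []
    for value in times:
        value = value.replace(":15", ":20").replace(":45", ":40")
        if ":30" not in value:
            kept.append(value)
    return kept


# Sample values in the assumed HH:MM format.
sample = ["10:00", "10:15", "10:30", "10:45"]
result = normalize_times(sample)
```

If the real values carry seconds or a date with matching digits, the substring match is too broad and the mapping would need to target the minutes field only.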