Training: Writing Python scripts in Snowflake directly for data analysis #466
Comments
Notes from Mintu on the goals for what this session can look like:
Benefits:
The training would be helpful if it includes, but is not limited to, the following:
Note from @ian-r-rose: this training should make clear that even when doing this analysis directly in Python, they will still want guardrails around the size of the data being analyzed. Their data is so large that unbounded queries could incur high costs. We should also include guidance on when it makes sense to do this type of analysis in Snowflake directly vs. another option.
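One way to illustrate the guardrail idea in the training is a small helper that refuses to build a query without a date window, a sample rate, and a row cap. This is only a minimal sketch: the function name `build_guarded_query`, the table `MY_DB.MY_SCHEMA.STATION_DATA`, and the column names `SAMPLE_DATE`, `STATION_ID`, and `FLOW` are hypothetical placeholders, not names from the Caltrans project.

```python
from datetime import date


def build_guarded_query(
    table: str,
    columns: list[str],
    start: date,
    end: date,
    date_column: str = "SAMPLE_DATE",  # hypothetical column name
    sample_pct: float = 1.0,
    max_rows: int = 100_000,
) -> str:
    """Build a Snowflake SELECT that is bounded in three ways.

    Identifiers are interpolated directly into the SQL text, so this
    sketch is only suitable for trusted, analyst-supplied names.
    """
    if not columns:
        raise ValueError("list the columns explicitly instead of using SELECT *")
    if not 0 < sample_pct <= 100:
        raise ValueError("sample_pct must be in the range (0, 100]")
    if max_rows <= 0:
        raise ValueError("max_rows must be positive")
    if end <= start:
        raise ValueError("end must be after start")

    column_list = ", ".join(columns)
    return (
        f"SELECT {column_list}\n"
        # SAMPLE (p) keeps roughly p percent of the table's rows.
        f"FROM {table} SAMPLE ({sample_pct})\n"
        # Half-open date window: start inclusive, end exclusive.
        f"WHERE {date_column} >= '{start.isoformat()}'"
        f" AND {date_column} < '{end.isoformat()}'\n"
        # Hard cap on the number of rows returned to the notebook.
        f"LIMIT {max_rows}"
    )


query = build_guarded_query(
    table="MY_DB.MY_SCHEMA.STATION_DATA",  # hypothetical table
    columns=["STATION_ID", "FLOW"],
    start=date(2023, 11, 1),
    end=date(2023, 11, 8),
    max_rows=1000,
)
print(query)
```

In a Snowflake notebook, the resulting string could then be passed to something like `session.sql(query).to_pandas()`, assuming an active Snowpark session. The date filter is likely the guardrail that matters most for cost, since `LIMIT` mainly bounds what is returned rather than what is scanned.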
We want to do more fact-finding before starting this, so that we better understand the problem and why they currently feel the need to download data for analysis. We will bring this up in the modeling session on 11/27.
Notes from discussion with Mintu and team:
Focus more on the notebook style, since notebooks support in-line visualizations, which may be closer to what the Caltrans team needs.
@JamesSLogan Here's a notebook I used to explore some data (this was for trying to figure out if the clearinghouse and data relay server were at parity or not). https://app.snowflake.com/vsb79059/dse_caltrans_pems/#/notebooks/TRANSFORM_DEV.PUBLIC.DATA_RELAY_UNION_TEST_STATIONS_SAMPLE (I did not see a 'share notebook' option anywhere, so lmk if that link doesn't work!)
This will be a training on doing data analysis within Snowflake using Python scripts, to avoid having to download the data and do the analysis locally.