Create plan for Infrastructure Improvements Step 1: Create Roadmap #378

Open
2 of 4 tasks
jkarpen opened this issue Sep 17, 2024 · 5 comments

@jkarpen
Collaborator

jkarpen commented Sep 17, 2024

This can wait until Ian returns, since he can help guide what to prioritize here. The main pain points have been cost management and the SCIM rollout (see the Okta-related issues), so possibly focus there first, but confirm with Ian. The goals are:

  • Identify any issues that are outdated/no longer needed and can be closed
  • Create issues for topics that should be addressed and do not have one already
  • Identify which issues to tackle first in the near term
  • Create a roadmap for when to tackle remaining issues
@ram-kishore-odi
Contributor

Hello Everyone,

I reviewed the backlog items under each of the sections listed below and classified them into groups, so that we can discuss them as a team and plan the next steps.

#Infrastructure

Future work items

  1. Configure and set up fivetran and dbt
  2. Set up Zoom connector on Fivetran to RAW_PRD with schema odi_zoom
  3. draft future security policy for dbt, snowflake, fivetran
  4. Investigate options for Azure for future projects - Caltrans in particular
  5. Review documentation for new project setup
  6. Document new project setup for Fivetran
  7. Add Project to Sprint Summary table
  8. Capture dbt audit logs in IT-Ops audit platform

Already WIP

  1. Test Asana for Project Management
  2. Investigate notifications to a group when nightly job fails
  3. Investigate options for dbt failure notifications
  4. added Snowflake OAuth instructions and fixed many case, spelling, and grammars errors

#Infrastructure - Okta Rollout

Future work items

  1. Document Okta-related processes
  2. Update dbt set up docs to include Snowflake OAuth
  3. Walk through onboarding/offboarding Okta process with Kevin (an initial discussion has taken place)

Need to confirm if these are still relevant

  1. Investigate using SSO for AWS CLI and SDK access
  2. Consider using IAM roles for user access
  3. Troubleshoot python errors encountered passing Okta url to snowflake connector function

Already WIP

  1. Consider best practices for Snowflake "break glass" account
  2. Implement Okta SCIM provisioning for our Snowflake accounts

#Infrastructure - Cost Management

Future work items

  1. Dashboard Snowflake Query Costs

Need to confirm if these are still relevant

  1. Set up Fivetran Platform Connector & Dashboard

#Infrastructure - Develop Platform Management Processes

Future work items

  1. Document approach to service accounts in Snowflake
  2. Obtain/review existing documentation template

Need to confirm if these are still relevant

  1. Create incident management system in airtable

#Infrastructure - Orchestration

Need to confirm if these are still relevant

  1. Investigate orchestration options. (This also may require redefining current objectives)

#Infrastructure - Project Templates

Future work items

  1. Validate dbt Cloud CI with Azure DevOps
  2. Create ODI Azure DevOps org
  3. Migrate usage of snowflake_user to snowflake_service_user and snowflake_legacy_service_user

Need to confirm if these are still relevant

  1. Create milestone/issue template

Already WIP

  1. Evaluate options for static docs within Azure DevOps
  2. Connect Azure DevOps MDSA project repo to dbt Cloud
  3. Convert pre-commit GitHub action to Azure Pipelines
  4. Create narrative description of handoff steps in project docs

Completed

  1. Implement workaround for sharing views in terraform configuration
  2. Create more warehouse size options in "ELT" terraform module

#Infrastructure - AWS

Future work items

  1. Improve AWS logging practices
  2. Set up production AWS account

Need to confirm if these are still relevant

  1. Astronomer-on-AWS proof-of-concept

#Infrastructure - Pain Points

Future work items

  1. Improve Batch CI/CD
  2. Figure out how to test template actions
  3. Add ability to customize "ELT" terraform module

Need to confirm if these are still relevant

  1. Should we create a terraform package for our Snowflake ELT setup?
  2. Should we rename our GCP project?

@ram-kishore-odi
Contributor

In order to complete the individual tasks in this story, a follow-up discussion with Ian and the team will be necessary.

@ram-kishore-odi
Contributor

Hi @ian-r-rose,

Can you please look at the classification of the stories when you get a chance, especially the ones under the "Need to confirm if these are still relevant" sections? With your feedback, I think it will be easy to clean up the backlog and prioritize the remaining items that can be part of the near-term roadmap.

It is fine if you want to discuss these in our sprint planning meeting tomorrow. I hope there will be time to discuss these.

@ian-r-rose
Member

Let's discuss in more detail during sprint planning! Some initial thoughts below:

#Infrastructure - Okta Rollout

Need to confirm if these are still relevant

1. [Investigate using SSO for AWS CLI and SDK access](https://github.com/cagov/data-infrastructure/issues/118)

2. [Consider using IAM roles for user access](https://github.com/cagov/data-infrastructure/issues/128)

3. [Troubleshoot python errors encountered passing Okta url to snowflake connector function](https://github.com/cagov/data-infrastructure/issues/289)

Yes, I think these are still relevant. I think Kevin is actually interested in working on Okta+AWS in the new year; you might ask what his plans are there.

#Infrastructure - Cost Management

Need to confirm if these are still relevant

1. [Set up Fivetran Platform Connector & Dashboard](https://github.com/cagov/data-infrastructure/issues/167)

I think this one is superseded by #430 and other issues in the cost tracking milestone, and can be closed.

#Infrastructure - Develop Platform Management Processes

Future work items

1. [Document approach to service accounts in Snowflake](https://github.com/cagov/data-infrastructure/issues/201)

2. [Obtain/review existing documentation template](https://github.com/cagov/data-infrastructure/issues/281)

Need to confirm if these are still relevant

1. [Create incident management system in airtable](https://github.com/cagov/data-infrastructure/issues/26)

I think we can close this as not planned right now. We may revisit incident response at a later date, but for now we are not the long-term holder of critical infrastructure.

#Infrastructure - Orchestration

Need to confirm if these are still relevant

1. [Investigate orchestration options](https://github.com/cagov/data-infrastructure/issues/4). (This also may require redefining current objectives)

Let's close this as complete for now. We may revisit orchestration options at some point, but would probably start with a new set of tasks.

#Infrastructure - Project Templates

Need to confirm if these are still relevant

1. [Create milestone/issue template](https://github.com/cagov/data-infrastructure/issues/240)

We can keep this in the backlog for now.

#Infrastructure - AWS

Need to confirm if these are still relevant

1. [Astronomer-on-AWS proof-of-concept](https://github.com/cagov/data-infrastructure/issues/51)

I think we can close this as not planned for the time being.

#Infrastructure - Pain Points

Need to confirm if these are still relevant

1. [Should we create a terraform package for our Snowflake ELT setup?](https://github.com/cagov/data-infrastructure/issues/109)

Curious what you think of this one @ram-kishore-odi. We've been using our module's URL in GitHub (plus a commit hash) as an ersatz package for a bit. Do you feel that's working well enough? Or is it worth the effort to publish a more "official" package?

2. [Should we rename our GCP project?](https://github.com/cagov/data-infrastructure/issues/74)

Let's close this one as not planned. We don't have much in GCP anymore.

@ram-kishore-odi
Contributor

Thank you so much for your feedback, @ian-r-rose! I will act on the tickets as suggested.

Hi @jkarpen, we may not need the separate meeting to discuss these that I requested in our sprint planning meeting. I now have sufficient information to proceed.

Hi @ian-r-rose, with respect to "Should we create a terraform package for our Snowflake ELT setup?", I think using the module's URL in GitHub (plus a commit hash) is working well enough for now. We can certainly think about creating an official package after the Terraform Snowflake provider becomes more stable, or after its next major release (1.0 or later).
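
For anyone reading along, the "module URL plus a commit hash" approach mentioned above might look roughly like the sketch below. The module name, the subdirectory path, and the `<commit-hash>` placeholder are assumptions for illustration only, not values taken from our actual configuration.

```hcl
# Hypothetical sketch: the module name, subdirectory path, and commit hash
# are placeholders, not copied from the project's real Terraform files.
module "elt" {
  # "//" separates the repository from the subdirectory that holds the module.
  # "ref" pins the module to one exact commit, so upgrades happen only when
  # the hash is changed deliberately.
  source = "github.com/cagov/data-infrastructure//path/to/elt-module?ref=<commit-hash>"

  # Module input variables would be set here.
}
```

Pinning to a commit this way gives reproducible plans without publishing anything, at the cost of having no version numbers or changelog to upgrade against.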

3 participants