Hydrofab2ngen tools #46

Open
wants to merge 108 commits into
base: main
108 commits
d08274c
Tool to generate list of number of upstream catchments
JordanLaserGit Apr 18, 2023
174b2a0
Made arguments positional
JordanLaserGit Apr 18, 2023
ca21308
Tool for isolating forcing files based on the catchments within a geo…
JordanLaserGit Apr 18, 2023
f80b5ca
Removed unnecessary copy
JordanLaserGit Apr 19, 2023
c86455b
configurable python script to generate catchment forcing files for ngen
JordanLaserGit Apr 25, 2023
890e698
Speed up:implemented new data -> data frame and write functions
JordanLaserGit Apr 26, 2023
83ac0b2
Included preop_rate (which is broken)
JordanLaserGit Apr 26, 2023
b8c95e2
user input markdown
JordanLaserGit May 1, 2023
c0770d7
added hydrofab field
JordanLaserGit May 1, 2023
6d0eb21
gitignore initial commit
JordanLaserGit May 1, 2023
84ac5a2
Added options and explanations
JordanLaserGit May 1, 2023
3bd2d8e
Updated example
JordanLaserGit May 1, 2023
c984dfd
Made many of the operations conditional and added options
JordanLaserGit May 1, 2023
fe90310
Added options to write to S3 bucket in either csv or parquet
JordanLaserGit May 3, 2023
5940ea2
removed precip_rate functions
JordanLaserGit May 3, 2023
4757216
updated config and readme
JordanLaserGit May 3, 2023
1ba8ba2
Indent for code block
JordanLaserGit May 3, 2023
5816e09
moved files and implemented threaded download
JordanLaserGit May 4, 2023
367cc3a
weights write to json, removed functions
JordanLaserGit May 4, 2023
82a67d3
Applied black formatting
JordanLaserGit May 4, 2023
72066ab
block pycache in ngen_forcing
JordanLaserGit May 4, 2023
c9630ce
removed pycache
JordanLaserGit May 4, 2023
106fb08
removed raw script
JordanLaserGit May 4, 2023
3d77837
Removed alternates for now
JordanLaserGit May 4, 2023
ded1a98
blacked subsetting and added new line to end of file
JordanLaserGit May 4, 2023
ca06fe1
Added hydrofabric version to config
JordanLaserGit May 4, 2023
a3ac55e
Fixed wget gpkg bug
JordanLaserGit May 4, 2023
ef427f5
Default to short range
JordanLaserGit May 4, 2023
33ad1a8
Fixed pathing issues
JordanLaserGit May 4, 2023
f7592a0
Added remote indexing. Moved cache into forcing.
JordanLaserGit May 5, 2023
093320e
Removed indexing (bug)
JordanLaserGit May 5, 2023
37488b5
Removed bug #12412
JordanLaserGit May 5, 2023
1254086
Removed bug #12413
JordanLaserGit May 5, 2023
53ac8d5
blacked and print statements fixes
JordanLaserGit May 5, 2023
4aeca19
moved file names template
JordanLaserGit May 8, 2023
3fcc1fc
fixed threading, organized config
JordanLaserGit May 8, 2023
7ffb9cf
blacked
JordanLaserGit May 8, 2023
ebd7d32
Fixed threading and added local file check
JordanLaserGit May 9, 2023
c789cd9
removed print statements
JordanLaserGit May 11, 2023
aa6c0c0
Threaded local data processing and updated user_inputs
JordanLaserGit May 11, 2023
b710cff
Retrospective file names
JordanLaserGit May 12, 2023
1d0190e
Removed import shield
JordanLaserGit May 12, 2023
e5ae8f1
Tool to generate list of number of upstream catchments
JordanLaserGit Apr 18, 2023
801d89d
Made arguments positional
JordanLaserGit Apr 18, 2023
6831f11
Tool for isolating forcing files based on the catchments within a geo…
JordanLaserGit Apr 18, 2023
72a6e0e
Removed unnecessary copy
JordanLaserGit Apr 19, 2023
0b7d411
configurable python script to generate catchment forcing files for ngen
JordanLaserGit Apr 25, 2023
cd5346d
Speed up:implemented new data -> data frame and write functions
JordanLaserGit Apr 26, 2023
6a8cab6
Included preop_rate (which is broken)
JordanLaserGit Apr 26, 2023
79b21d7
user input markdown
JordanLaserGit May 1, 2023
23e4ec8
added hydrofab field
JordanLaserGit May 1, 2023
6297d64
gitignore initial commit
JordanLaserGit May 1, 2023
2475605
Added options and explanations
JordanLaserGit May 1, 2023
5bf52f1
Updated example
JordanLaserGit May 1, 2023
634a3c5
Made many of the operations conditional and added options
JordanLaserGit May 1, 2023
aa00098
Added options to write to S3 bucket in either csv or parquet
JordanLaserGit May 3, 2023
4106e76
removed precip_rate functions
JordanLaserGit May 3, 2023
b5cb3a7
updated config and readme
JordanLaserGit May 3, 2023
30e0e29
Indent for code block
JordanLaserGit May 3, 2023
cf8639a
moved files and implemented threaded download
JordanLaserGit May 4, 2023
45fe379
weights write to json, removed functions
JordanLaserGit May 4, 2023
3fdb6fa
Applied black formatting
JordanLaserGit May 4, 2023
3c0c7f5
block pycache in ngen_forcing
JordanLaserGit May 4, 2023
b4dca39
removed pycache
JordanLaserGit May 4, 2023
66e40e3
removed raw script
JordanLaserGit May 4, 2023
007c1aa
Removed alternates for now
JordanLaserGit May 4, 2023
ee82e79
blacked subsetting and added new line to end of file
JordanLaserGit May 4, 2023
1d6254d
Added hydrofabric version to config
JordanLaserGit May 4, 2023
1a0b492
Fixed wget gpkg bug
JordanLaserGit May 4, 2023
2403ba0
Default to short range
JordanLaserGit May 4, 2023
c8354ed
Fixed pathing issues
JordanLaserGit May 4, 2023
715002d
Added remote indexing. Moved cache into forcing.
JordanLaserGit May 5, 2023
6117014
Removed indexing (bug)
JordanLaserGit May 5, 2023
b917592
Removed bug #12412
JordanLaserGit May 5, 2023
4ddd40c
Removed bug #12413
JordanLaserGit May 5, 2023
52da0d7
blacked and print statements fixes
JordanLaserGit May 5, 2023
5e14877
moved file names template
JordanLaserGit May 8, 2023
4eddf71
fixed threading, organized config
JordanLaserGit May 8, 2023
dd7a721
blacked
JordanLaserGit May 8, 2023
3461a7c
Fixed threading and added local file check
JordanLaserGit May 9, 2023
bf96bd3
removed print statements
JordanLaserGit May 11, 2023
3710014
Threaded local data processing and updated user_inputs
JordanLaserGit May 11, 2023
f14faa6
Retrospective file names
JordanLaserGit May 12, 2023
159d33c
Removed import shield
JordanLaserGit May 12, 2023
d813d7f
Shielded subset function
JordanLaserGit May 17, 2023
c81852c
Merge remote-tracking branch 'origin/hydrofab2ngen_tools' into hydrof…
JordanLaserGit May 18, 2023
7b584e6
Removed unnecessary imports
JordanLaserGit May 22, 2023
9fe6912
Removed tony's subset import
JordanLaserGit May 22, 2023
3c79bdb
find bug fix
JordanLaserGit May 22, 2023
82db7a3
pytests for listofnwmfilenames and prep data script
JordanLaserGit May 30, 2023
e465a49
split the tests up
JordanLaserGit May 31, 2023
4d61218
Garbage collected and updated storage variables
JordanLaserGit Jul 6, 2023
b41bd2d
Initial lambda function files WIP
JordanLaserGit Jul 11, 2023
267acaf
implemented metadata outputs, lambda function works though still WIP
JordanLaserGit Jul 14, 2023
9fe5d3d
Added security manager
JordanLaserGit Jul 14, 2023
66adf2e
Added lambda function versioning
JordanLaserGit Jul 24, 2023
6ba50db
Implemented hashing in metadata
JordanLaserGit Aug 16, 2023
c4d0f1a
removed hard coded variables and made testing optional
JordanLaserGit Aug 16, 2023
daabf3c
moved conf into proper metadata folder
JordanLaserGit Aug 16, 2023
652aea2
personal -> ciroh changes
JordanLaserGit Aug 25, 2023
e2de303
Merge branch 'AlabamaWaterInstitute:main' into hydrofab2ngen_tools
JordanLaserGit Aug 25, 2023
3ae2290
fix path
JordanLaserGit Aug 30, 2023
b9c150e
Merge remote-tracking branch 'refs/remotes/origin/hydrofab2ngen_tools…
JordanLaserGit Aug 30, 2023
2a06395
config file
JordanLaserGit Aug 30, 2023
a6f39b7
Allows VPU to be given as env variable
JordanLaserGit Sep 18, 2023
3e86ad8
aws updates
JordanLaserGit Sep 19, 2023
3dccd0b
debug
JordanLaserGit Sep 19, 2023
a1e20ea
removing file cap for production
JordanLaserGit Sep 19, 2023
3 changes: 3 additions & 0 deletions .gitignore
@@ -0,0 +1,3 @@
data/*
venv9/
tests/data/ngen_inputs/forcing/*
45 changes: 45 additions & 0 deletions Dockerfile
@@ -0,0 +1,45 @@
# Define custom function directory
ARG FUNCTION_DIR="/function"

FROM python:3.9 as build-image

# Include global arg in this stage of the build
ARG FUNCTION_DIR

# Copy function code
RUN mkdir -p ${FUNCTION_DIR}
RUN mkdir -p ${FUNCTION_DIR}/ngen_forcing
RUN mkdir -p ${FUNCTION_DIR}/subsetting
RUN mkdir -p ${FUNCTION_DIR}/nwm_filenames
COPY ./ngen_forcing ${FUNCTION_DIR}/ngen_forcing
COPY ./subsetting ${FUNCTION_DIR}/subsetting
COPY ./nwm_filenames ${FUNCTION_DIR}/nwm_filenames
COPY requirements.txt ${FUNCTION_DIR}
COPY lambda_function.py ${FUNCTION_DIR}

# Install the function's dependencies
RUN pip3 install --upgrade pip
RUN pip3 install --no-cache-dir \
    --target ${FUNCTION_DIR} \
    awslambdaric

# Use a slim version of the base Python image to reduce the final image size
FROM python:3.9-slim

# Include global arg in this stage of the build
ARG FUNCTION_DIR

# Set working directory to function root directory
WORKDIR ${FUNCTION_DIR}

# Copy in the built dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}

RUN pip3 install --upgrade pip
RUN pip3 install --no-cache-dir -r "${FUNCTION_DIR}/requirements.txt"

# Set runtime interface client as default command for the container runtime
ENTRYPOINT [ "/usr/local/bin/python", "-m", "awslambdaric" ]

# Pass the name of the function handler as an argument to the runtime
CMD [ "lambda_function.handler" ]
151 changes: 151 additions & 0 deletions forcing_processor_lambda.tf
@@ -0,0 +1,151 @@
terraform {
  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 4.0"
    }
  }
}

# Variable declarations
variable "region" {
  type = string
}

variable "trigger_bucket" {
  type = string
}

variable "ecr_repo" {
  type = string
}

variable "function_name" {
  type = string
}

variable "trigger_file_prefix" {
  type = string
}

variable "trigger_file_suffix" {
  type = string
}

variable "image_tag" {
  type = string
}

variable "memory_size" {
  type = number
}

provider "aws" {
  region = var.region
}

data "aws_ecr_repository" "image_repo" {
  name = var.ecr_repo
}

# Create function and set role
resource "aws_lambda_function" "forcing_processor_function" {
  function_name = var.function_name
  timeout       = 900 # 900 is max
  image_uri     = "${data.aws_ecr_repository.image_repo.repository_url}:${var.image_tag}"
  package_type  = "Image"

  memory_size = var.memory_size

  role = aws_iam_role.forcing_processor_function_role.arn
}

resource "aws_iam_role" "forcing_processor_function_role" {
  name = "forcing-processor"

  assume_role_policy = jsonencode({
    Statement = [
      {
        Action = "sts:AssumeRole"
        Effect = "Allow"
        Principal = {
          Service = "lambda.amazonaws.com"
        }
      },
    ]
  })
}

# Set up the trigger
resource "aws_s3_bucket" "trigger_bucket" {
  bucket = var.trigger_bucket
}

resource "aws_s3_bucket_notification" "bucket_notification" {
  bucket = aws_s3_bucket.trigger_bucket.id

  lambda_function {
    lambda_function_arn = aws_lambda_function.forcing_processor_function.arn
    events              = ["s3:ObjectCreated:*"]
    filter_prefix       = var.trigger_file_prefix
    filter_suffix       = var.trigger_file_suffix
  }

  depends_on = [aws_lambda_permission.allow_bucket]
}

resource "aws_lambda_permission" "allow_bucket" {
  statement_id  = "AllowExecutionFromS3Bucket"
  action        = "lambda:InvokeFunction"
  function_name = aws_lambda_function.forcing_processor_function.arn
  principal     = "s3.amazonaws.com"
  source_arn    = aws_s3_bucket.trigger_bucket.arn
}

resource "aws_iam_policy" "function_logging_policy" {
  name = "function-logging-policy"
  policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        Action : [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ],
        Effect : "Allow",
        Resource : "arn:aws:logs:*:*:*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "function_logging_policy_attachment" {
  role       = aws_iam_role.forcing_processor_function_role.id
  policy_arn = aws_iam_policy.function_logging_policy.arn
}

# Add secret access to lambda function
resource "aws_iam_policy" "secrets_manager_policy" {
  name        = "secrets_manager_access_policy"
  description = "Allows access to Secrets Manager"

  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Sid      = "AllowSecretsManagerAccess"
        Effect   = "Allow"
        Action   = ["secretsmanager:GetSecretValue"]
        Resource = "*"
      }
    ]
  })
}

resource "aws_iam_role_policy_attachment" "secrets_manager_attachment" {
  role       = aws_iam_role.forcing_processor_function_role.name
  policy_arn = aws_iam_policy.secrets_manager_policy.arn
}
8 changes: 8 additions & 0 deletions jl_dev.tfvars
@@ -0,0 +1,8 @@
region = "us-east-2"
trigger_bucket = "ngenresources"
ecr_repo = "nextgenforcing"
function_name = "forcingprocessor"
trigger_file_prefix = ""
trigger_file_suffix = "02.conus.nc.txt"
image_tag = "forcingprocessor"
memory_size = 4096
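
Reviewer sketch (not part of the diff): `trigger_file_prefix` and `trigger_file_suffix` feed the `filter_prefix` and `filter_suffix` of the S3 bucket notification in `forcing_processor_lambda.tf`. The snippet below is a minimal illustration of that matching, assuming plain prefix/suffix comparison on the object key. The function name `matches_trigger` and the example keys are hypothetical.

```python
def matches_trigger(key: str, prefix: str = "", suffix: str = "02.conus.nc.txt") -> bool:
    """Return True if an object key passes a prefix/suffix filter.

    The defaults mirror trigger_file_prefix and trigger_file_suffix in
    jl_dev.tfvars. An empty prefix matches every key.
    """
    return key.startswith(prefix) and key.endswith(suffix)


# Hypothetical object keys, for illustration only.
print(matches_trigger("forcing.f002.conus.nc.txt"))  # True
print(matches_trigger("forcing.f003.conus.nc.txt"))  # False
```

With an empty prefix, any key whose name ends in `02.conus.nc.txt` would match, including lead times other than `f002` that happen to end in `02`.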
15 changes: 15 additions & 0 deletions lambda_function.py
@@ -0,0 +1,15 @@
import json
# from aws_lambda_powertools.utilities import parameters


def handler(event, context):
    # Load the template config; the context manager closes the file handle
    with open('/data_access_examples/ngen_forcing/ngen_forcings_lambda.json') as fp:
        conf = json.load(fp)

    # TODO: get date from event

    # Call the forcing processor with the loaded config
    from ngen_forcing import prep_hydrofab_forcings_ngen
    prep_hydrofab_forcings_ngen.prep_ngen_data(conf)

    return 'Done!'
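
Reviewer sketch (not part of the diff): the handler leaves "get date from event" unimplemented. One possible approach is shown below, assuming the triggering object key contains a segment such as `nwm.20230705`. That key layout is a guess, and `date_from_s3_event` and the sample event are hypothetical. Only the `Records[0].s3.object.key` path follows the standard S3 notification structure.

```python
import re
from urllib.parse import unquote_plus


def date_from_s3_event(event: dict) -> str:
    """Pull a YYYYMMDD date out of the first S3 record's object key.

    Assumes (hypothetically) that the key contains a segment like 'nwm.20230705'.
    """
    record = event["Records"][0]
    # S3 URL-encodes object keys in event notifications, so decode first.
    key = unquote_plus(record["s3"]["object"]["key"])
    match = re.search(r"nwm\.(\d{8})", key)
    if match is None:
        raise ValueError(f"no date found in key: {key}")
    return match.group(1)


# Hypothetical event, trimmed to the fields used above.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "ngenresources"},
                "object": {"key": "forcing/nwm.20230705/example.f002.conus.nc.txt"},
            }
        }
    ]
}
print(date_from_s3_event(sample_event))  # 20230705
```

The returned string could then be written to `conf["forcing"]["start_date"]` and `conf["forcing"]["end_date"]` before `prep_ngen_data(conf)` is called.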
3 changes: 3 additions & 0 deletions ngen_forcing/filenames.txt
@@ -0,0 +1,3 @@
https://storage.googleapis.com/national-water-model/nwm.20220822/forcing_medium_range/nwm.t00z.medium_range.forcing.f006.conus.nc
https://storage.googleapis.com/national-water-model/nwm.20220822/forcing_medium_range/nwm.t00z.medium_range.forcing.f007.conus.nc
https://storage.googleapis.com/national-water-model/nwm.20220822/forcing_medium_range/nwm.t00z.medium_range.forcing.f008.conus.nc
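
Reviewer sketch (not part of the diff): the three URLs above share one pattern in date, forecast cycle, and lead time. The snippet below rebuilds them from those parts. It is a hand-rolled illustration, not the repository's `nwm_filenames` module, and the function name `medium_range_forcing_urls` is hypothetical.

```python
from typing import List

BASE_URL = "https://storage.googleapis.com/national-water-model"


def medium_range_forcing_urls(date: str, cycle: int, lead_times: List[int]) -> List[str]:
    """Build medium-range forcing URLs following the pattern in filenames.txt."""
    return [
        f"{BASE_URL}/nwm.{date}/forcing_medium_range/"
        f"nwm.t{cycle:02d}z.medium_range.forcing.f{lead:03d}.conus.nc"
        for lead in lead_times
    ]


# Reproduces the three entries listed in filenames.txt.
for url in medium_range_forcing_urls("20220822", 0, [6, 7, 8]):
    print(url)
```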
42 changes: 42 additions & 0 deletions ngen_forcing/ngen_forcings_lambda.json
@@ -0,0 +1,42 @@
{
    "forcing" : {
        "forcing_type" : "operational_archive",
        "start_date"   : "20230705",
        "end_date"     : "20230705",
        "cache"        : false,
        "nwm_file"     : "",
        "path_override": "",
        "runinput"     : 1,
        "varinput"     : 5,
        "geoinput"     : 1,
        "meminput"     : 0,
        "urlbaseinput" : 3,
        "fcst_cycle"   : [0],
        "lead_time"    : [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18]
    },

    "hydrofab" : {
        "version"      : "v1.2",
        "vpu"          : "ENV",
        "geopkg_file"  : "",
        "catch_subset" : "",
        "weights_only" : false
    },

    "storage" : {
        "storage_type"       : "S3",
        "output_bucket"      : "ngenforcingdev1",
        "output_bucket_path" : "",
        "cache_bucket"       : "ngenforcingresources",
        "cache_bucket_path"  : "",
        "output_file_type"   : "csv"
    },

    "run" : {
        "verbose"       : true,
        "dl_threads"    : 1,
        "collect_stats" : true,
        "secret_name"   : "jlaser_creds",
        "region_name"   : "us-west-2"
    }
}
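
Reviewer sketch (not part of the diff): the config sets `"vpu" : "ENV"`, and commit a6f39b7 notes that the VPU can be given as an environment variable. The snippet below shows one way such a placeholder could be resolved. The variable name `VPU`, the example value, and the helper `resolve_vpu` are assumptions, not taken from the PR.

```python
import copy
import os


def resolve_vpu(conf: dict, env_var: str = "VPU") -> dict:
    """Return a copy of conf with hydrofab.vpu read from the environment.

    Only the placeholder value 'ENV' is replaced; any other value is kept.
    The environment variable name is an assumption for illustration.
    """
    resolved = copy.deepcopy(conf)
    if resolved["hydrofab"]["vpu"] == "ENV":
        value = os.environ.get(env_var)
        if not value:
            raise KeyError(f"environment variable {env_var} is not set")
        resolved["hydrofab"]["vpu"] = value
    return resolved


# Hypothetical usage with a trimmed-down config and an example value.
os.environ["VPU"] = "03W"
conf = {"hydrofab": {"version": "v1.2", "vpu": "ENV"}}
print(resolve_vpu(conf)["hydrofab"]["vpu"])  # 03W
```

Returning a copy keeps the loaded template untouched, which may help if the same template is reused across invocations.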