This module creates:
- A local file: A Google Cloud Batch job template file is created. See the instructions output for the location of the file and instructions on how to submit it to Batch.
- An instance template: This instance template defines the compute settings to be used for the Batch job, such as network, machine type, image, and startup script. This instance template is automatically referenced from the Batch job template described above.
When this module is used with the batch-login-node module, the generated job template will be placed on the login node.
In some cases the job template can be submitted to the Google Cloud Batch API without modification, but for more complex workloads it is expected that the user will modify the template after running the HPC Toolkit.
- id: batch-job
  source: modules/scheduler/batch-job-template
  use: [network1]
  settings:
    runnable: "echo 'hello world'"
    machine_type: n2-standard-4
  outputs: [instructions]
See the Google Cloud Batch Example for how to use the batch-job-template module with other HPC Toolkit modules such as filestore and startup-script.
This module supports using a shared VPC with a Batch job. To accomplish this, include a pre-existing-vpc module that references an existing shared VPC and then have the batch-job-template module use the pre-existing-vpc module.
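The shared VPC setup described above might look like the following blueprint fragment. This is a minimal sketch, not a tested configuration: it assumes the pre-existing-vpc module accepts `project_id`, `network_name`, and `subnetwork_name` settings (check that module's own README), and the project and network names are hypothetical placeholders.

```yaml
# Sketch only: all names under `settings` for network1 are placeholders.
- id: network1
  source: modules/network/pre-existing-vpc
  settings:
    project_id: my-host-project        # hypothetical shared VPC host project
    network_name: my-shared-network    # hypothetical shared VPC network
    subnetwork_name: my-shared-subnet  # hypothetical shared subnetwork

- id: batch-job
  source: modules/scheduler/batch-job-template
  use: [network1]                      # Batch job runs on the shared VPC
  settings:
    runnable: "echo 'hello world'"
  outputs: [instructions]
```

The service project that runs the Batch job must already have permission to use the shared subnetwork; nothing in this fragment grants that access.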
Many of the settings for a Google Cloud Batch job, machine_type for example, are set using an instance template. The batch-job-template module accomplishes this by creating an instance template within the module, which is supplied to the Google Cloud Batch job.
Alternatively, one can supply an instance template to the batch-job-template module using the instance_template setting. This supplied instance template could be generated outside of the HPC Toolkit (via the Cloud Console UI, for example) or by a separate module within the blueprint. To define an instance template within a blueprint, one can use the Cloud Foundation Toolkit instance template module as shown in the following example. This can be useful when trying to set a property not natively supported in the batch-job-template module.
deployment_groups:
- group: primary
  modules:
  - id: network1
    source: modules/network/pre-existing-vpc
  - id: appfs
    source: modules/file-system/filestore
    use: [network1]
  - id: batch-startup-script
    source: modules/scripts/startup-script
    settings:
      runners: ...
  - id: batch-compute-template
    source: github.com/terraform-google-modules/terraform-google-vm//modules/instance_template?ref=v7.8.0
    use: [batch-startup-script]
    settings:
      # Boiler plate to work with Cloud Foundation Toolkit
      network: $(network1.network_self_link)
      service_account: {email: null, scopes: ["https://www.googleapis.com/auth/cloud-platform"]}
      access_config: [{nat_ip: null, network_tier: null}]
      # Batch customization
      machine_type: n2-standard-4
      metadata:
        network_storage: ((jsonencode([module.appfs.network_storage])))
      source_image_family: hpc-centos-7
      source_image_project: cloud-hpc-image-public
  - id: batch-job
    source: ./modules/scheduler/batch-job-template
    settings:
      instance_template: $(batch-compute-template.self_link)
    outputs: [instructions]
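For the other case mentioned above, an instance template created outside of the HPC Toolkit, the self-link can be passed directly as a string. The fragment below is a sketch under that assumption; the project and template names are hypothetical placeholders, and the path follows the usual Compute Engine self-link pattern for instance templates.

```yaml
# Sketch only: replace the placeholder project and template names.
- id: batch-job
  source: modules/scheduler/batch-job-template
  settings:
    # Self-link of a template created elsewhere, e.g. in the Cloud Console
    instance_template: https://www.googleapis.com/compute/v1/projects/my-project/global/instanceTemplates/my-template
    runnable: "echo 'hello world'"
  outputs: [instructions]
```

When `instance_template` is set this way, the settings marked "Ignored if instance_template is provided" in the inputs table (machine_type, image, and so on) have no effect, so those properties need to be set on the template itself.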
Copyright 2022 Google LLC
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
Name | Version |
---|---|
terraform | >= 0.13.0 |
local | >= 2.0.0 |
Name | Version |
---|---|
local | >= 2.0.0 |
Name | Source | Version |
---|---|---|
instance_template | terraform-google-modules/vm/google//modules/instance_template | > 7.6.0 |
netstorage_startup_script | github.com/GoogleCloudPlatform/hpc-toolkit//modules/scripts/startup-script | 64bc890 |
Name | Type |
---|---|
local_file.job_template | resource |
Name | Description | Type | Default | Required |
---|---|---|---|---|
deployment_name | Name of the deployment, used for the job_id | string | n/a | yes |
enable_public_ips | If set to true, instances will have public IPs | bool | true | no |
gcloud_version | The version of the gcloud CLI being used. Used for output instructions. Valid inputs are "alpha", "beta" and "" (empty string for default version) | string | "alpha" | no |
image | Google Cloud Batch compute node image. Ignored if instance_template is provided. | object({ | { | no |
instance_template | Compute VM instance template self-link to be used for Google Cloud Batch compute node. If provided, a number of other variables will be ignored, as noted by "Ignored if instance_template is provided" in their descriptions. | string | null | no |
job_filename | The filename of the generated job template file. Will default to cloud-batch-<job_id>.json if not specified | string | null | no |
job_id | An id for the Google Cloud Batch job. Used for output instructions and file naming. Defaults to deployment name. | string | null | no |
labels | Labels to add to the Google Cloud Batch compute nodes. List key, value pairs. Ignored if instance_template is provided. | any | n/a | yes |
log_policy | Create a block to define log policy. When set to CLOUD_LOGGING, logs will be sent to Cloud Logging. When set to PATH, path must be added to generated template. When set to DESTINATION_UNSPECIFIED, logs will not be preserved. | string | "CLOUD_LOGGING" | no |
machine_type | Machine type to use for Google Cloud Batch compute nodes. Ignored if instance_template is provided. | string | "n2-standard-4" | no |
mpi_mode | Sets up barriers before and after runnable. In addition, sets permissiveSsh=true, requireHostsFile=true, and taskCountPerNode=1. taskCountPerNode can be overridden by task_count_per_node. | bool | false | no |
native_batch_mounting | Batch can mount some fs_type natively using the 'volumes' block in the job file. If set to false, all mounting will happen through HPC Toolkit startup scripts. | bool | true | no |
network_storage | An array of network attached storage mounts to be configured. Ignored if instance_template is provided. | list(object({ | [] | no |
on_host_maintenance | Describes maintenance behavior for the instance. If left blank, this defaults to MIGRATE, except when GPUs are used, which requires TERMINATE. | string | null | no |
project_id | Project in which the HPC deployment will be created | string | n/a | yes |
region | The region in which to run the Google Cloud Batch job | string | n/a | yes |
runnable | A string to be executed as the main workload of the Google Cloud Batch job. This will be used to populate the generated template. | string | "## Add your workload here" | no |
service_account | Service account to attach to the Google Cloud Batch compute node. Ignored if instance_template is provided. | object({ | { | no |
startup_script | Startup script run before Google Cloud Batch job starts. Ignored if instance_template is provided. | string | null | no |
subnetwork | The subnetwork that the Batch job should run on. Defaults to 'default' subnet. Ignored if instance_template is provided. | any | null | no |
task_count | Number of parallel tasks | number | 1 | no |
task_count_per_node | Max number of tasks that can be run on a VM at the same time. If not specified, Batch will decide a value. | number | null | no |
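To show how several of the inputs above combine, here is a hedged sketch of a module entry for a multi-task, MPI-style job. It uses only setting names from the table; the module ids `batch-job-mpi` and `network1` and the runnable string are illustrative assumptions.

```yaml
# Sketch only: values are illustrative, not recommendations.
- id: batch-job-mpi                  # hypothetical module id
  source: modules/scheduler/batch-job-template
  use: [network1]                    # assumes a network module with this id
  settings:
    runnable: "echo 'hello from task'"
    machine_type: n2-standard-4
    task_count: 4                    # four parallel tasks
    mpi_mode: true                   # adds barriers; taskCountPerNode becomes 1
    log_policy: CLOUD_LOGGING        # send logs to Cloud Logging
  outputs: [instructions]
```

Because `mpi_mode` sets taskCountPerNode=1, each task in this sketch would be placed on its own VM unless `task_count_per_node` is set explicitly.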
Name | Description |
---|---|
gcloud_version | The version of gcloud to be used. |
instance_template | Instance template used by the Batch job. |
instructions | Instructions for submitting the Batch job. |
job_data | All data associated with the defined job, typically provided as input to cloud-batch-login-node. |
network_storage | An array of network attached storage mounts used by the Batch job. |
startup_script | Startup script run before Google Cloud Batch job starts. |