@batch
The @batch decorator sends a step for execution on the AWS Batch compute layer. For more information, see Executing Tasks Remotely.
Note that while @batch doesn't allow mounting arbitrary disk volumes on the fly, you can create in-memory filesystems easily with tmpfs options. For more details, see using metaflow.S3 for in-memory processing.
from metaflow import batch
Specifies that this step should execute on AWS Batch.
cpu: int, default 1
Number of CPUs required for this step. If @resources is also present, the maximum value from all decorators is used.
gpu: int, default 0
Number of GPUs required for this step. If @resources is also present, the maximum value from all decorators is used.
memory: int, default 4096
Memory size (in MB) required for this step. If @resources is also present, the maximum value from all decorators is used.
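The "maximum value from all decorators" rule for cpu, gpu, and memory can be sketched in plain Python. This is a minimal illustration, not Metaflow's implementation: the helper name resolve_resources and the dictionary inputs are assumptions made for this example, and the defaults are the ones listed above.

```python
# Hypothetical helper: illustrates the documented rule only; not part of Metaflow.
BATCH_DEFAULTS = {"cpu": 1, "gpu": 0, "memory": 4096}  # defaults listed above


def resolve_resources(batch_opts, resources_opts):
    """Return, per key, the larger of the values given to @batch and @resources."""
    resolved = {}
    for key, default in BATCH_DEFAULTS.items():
        batch_value = batch_opts.get(key, default)
        # A key that @resources does not set simply does not compete.
        resources_value = resources_opts.get(key, batch_value)
        resolved[key] = max(batch_value, resources_value)
    return resolved


# @batch(cpu=2, memory=8192) combined with @resources(cpu=4, memory=4096)
print(resolve_resources({"cpu": 2, "memory": 8192}, {"cpu": 4, "memory": 4096}))
```

In this example the step would request 4 CPUs (from @resources), 8192 MB of memory (from @batch), and no GPUs.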
image: str, optional, default None
Docker image to use when launching on AWS Batch. If not specified, and METAFLOW_BATCH_CONTAINER_IMAGE is specified, that image is used. If not, a default Docker image mapping to the current version of Python is used.
queue: str, default METAFLOW_BATCH_JOB_QUEUE
AWS Batch Job Queue to submit the job to.
iam_role: str, default METAFLOW_ECS_S3_ACCESS_IAM_ROLE
AWS IAM role that the AWS Batch container uses to access AWS cloud resources.
execution_role: str, default METAFLOW_ECS_FARGATE_EXECUTION_ROLE
AWS IAM role that AWS Batch can use [to trigger AWS Fargate tasks](https://docs.aws.amazon.com/batch/latest/userguide/execution-IAM-role.html).
shared_memory: int, optional, default None
The value for the size (in MiB) of the /dev/shm volume for this step. This parameter maps to the --shm-size option in Docker.
max_swap: int, optional, default None
The total amount of swap memory (in MiB) a container can use for this step. This parameter is translated to the --memory-swap option in Docker where the value is the sum of the container memory plus the max_swap value.
swappiness: int, optional, default None
This allows you to tune memory swappiness behavior for this step. A swappiness value of 0 causes swapping not to happen unless absolutely necessary. A swappiness value of 100 causes pages to be swapped very aggressively. Accepted values are whole numbers between 0 and 100.
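The arithmetic behind max_swap and the accepted range for swappiness can be made concrete with a short sketch. The helper docker_swap_settings is hypothetical and written only for this page; it does not call Docker or AWS Batch, and it assumes memory and max_swap are expressed in the same unit.

```python
def docker_swap_settings(memory, max_swap=None, swappiness=None):
    """Hypothetical helper: derive Docker-style swap settings from @batch-like values."""
    settings = {}
    if max_swap is not None:
        # --memory-swap is the container memory plus the swap allowance.
        settings["memory_swap"] = memory + max_swap
    if swappiness is not None:
        # Documented range: whole numbers between 0 and 100.
        if not 0 <= swappiness <= 100:
            raise ValueError("swappiness must be a whole number between 0 and 100")
        settings["swappiness"] = swappiness
    return settings


print(docker_swap_settings(memory=4096, max_swap=1024, swappiness=10))
```

With memory=4096 and max_swap=1024, the value passed as --memory-swap would be 5120.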
use_tmpfs: bool, default False
This enables an explicit tmpfs mount for this step. Note that tmpfs is not available on Fargate compute environments.
tmpfs_tempdir: bool, default True
Sets METAFLOW_TEMPDIR to tmpfs_path if set for this step.
tmpfs_size: int, optional, default None
The value for the size (in MiB) of the tmpfs mount for this step. This parameter maps to the --tmpfs option in Docker. Defaults to 50% of the memory allocated for this step.
tmpfs_path: str, optional, default None
Path to tmpfs mount for this step. Defaults to /metaflow_temp.
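How the tmpfs options fall back to their defaults can be sketched as follows. The helper tmpfs_settings is hypothetical; only the default values (50% of the step's memory, /metaflow_temp, and tmpfs_tempdir=True) come from the descriptions above. The sketch considers only the explicit use_tmpfs flag and rounds the default size down to a whole number, both of which are simplifying assumptions.

```python
def tmpfs_settings(memory, use_tmpfs=False, tmpfs_tempdir=True,
                   tmpfs_size=None, tmpfs_path=None):
    """Hypothetical helper: resolve the effective tmpfs mount for a step."""
    if not use_tmpfs:
        return None  # no tmpfs mount requested
    # Documented defaults: half of the step's memory, mounted at /metaflow_temp.
    size = tmpfs_size if tmpfs_size is not None else memory // 2
    path = tmpfs_path if tmpfs_path is not None else "/metaflow_temp"
    settings = {"size": size, "path": path, "env": {}}
    if tmpfs_tempdir:
        # tmpfs_tempdir points METAFLOW_TEMPDIR at the tmpfs mount.
        settings["env"]["METAFLOW_TEMPDIR"] = path
    return settings


print(tmpfs_settings(memory=4096, use_tmpfs=True))
```

For a step with memory=4096 and use_tmpfs=True, this resolves to a 2048 MiB mount at /metaflow_temp with METAFLOW_TEMPDIR pointing at it.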
inferentia: int, default 0
Number of Inferentia chips required for this step.
trainium: int, default None
Alias for inferentia. Use only one of the two.
efa: int, default 0
Number of Elastic Fabric Adapter (EFA) network devices to attach to the container.
ephemeral_storage: int, default None
The total amount, in GiB, of ephemeral storage to set for the task, between 21 and 200 GiB. This is only relevant for Fargate compute environments.
log_driver: str, optional, default None
The log driver to use for the Amazon ECS container.
log_options: List[str], optional, default None
List of strings containing options for the chosen log driver. The configurable values depend on the log driver chosen. Validation of these options is not supported yet.
Example: [awslogs-group:aws/batch/job]
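The example above suggests that each log_options entry is a key:value string. Assuming that shape, the following sketch shows how such a list could be turned into an options mapping. The helper parse_log_options is hypothetical and is not Metaflow's own parsing code; awslogs-region is used here only as a second illustrative option for the awslogs driver.

```python
def parse_log_options(log_options):
    """Hypothetical helper: split "key:value" strings into an options dict."""
    options = {}
    for entry in log_options:
        # Split on the first colon only, so values may themselves contain colons.
        key, separator, value = entry.partition(":")
        if not separator or not key:
            raise ValueError(f"expected 'key:value', got {entry!r}")
        options[key] = value
    return options


print(parse_log_options(["awslogs-group:aws/batch/job", "awslogs-region:us-east-1"]))
```

Because the documentation notes that validation of these options is not supported yet, a check like this on the caller's side can catch malformed entries before a job is submitted.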