Defining Custom Images
By default, Metaflow uses a stock Python image
that contains no libraries besides Python itself. When additional libraries
are needed, an easy option is to use the
@pypi and @conda decorators, which install libraries on the fly
on top of the base image.
Alternatively, you can use any other image of your choosing. Many off-the-shelf images work with Metaflow without modifications, or you can build a custom image.
Building a custom image
The main requirement is that the image has a
Python interpreter installed.
For more information about building a custom image, see this external howto page.
Configuring a custom image
There are three ways to pick an image, depending on how broadly you want to use it. The options are listed from the broadest to the most specific:
1. Define a default image
If you want to use an alternative image by default in all remote tasks, specify two variables in the Metaflow configuration files:
METAFLOW_DEFAULT_CONTAINER_REGISTRY controls which registry Metaflow uses to pick the image. This defaults to DockerHub but could also be a URL to a public or private ECR repository on AWS.
METAFLOW_DEFAULT_CONTAINER_IMAGE dictates the default container image that Metaflow should use.
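As a rough sketch of what these two entries might look like, the snippet below builds the corresponding JSON fragment. It assumes the configuration is stored as JSON key-value pairs (commonly in ~/.metaflowconfig/config.json, though your deployment may differ). The registry URL and image name are placeholders, not real resources.

```python
import json

# Placeholder values: substitute your own registry and image name.
config_fragment = {
    "METAFLOW_DEFAULT_CONTAINER_REGISTRY": "registry.example.com",
    "METAFLOW_DEFAULT_CONTAINER_IMAGE": "my-team/metaflow-base:1.0",
}

# Render the entries as they would appear inside a JSON configuration file.
rendered = json.dumps(config_fragment, indent=2)
print(rendered)
```

The same names can typically be set as environment variables instead, which is convenient for trying out an image without editing a shared configuration file.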
2. Define a step-specific image
3. Execute a run with a custom image
You can test a specific image in a single run without changing anything in the
configuration or the code. Simply add
--with batch or --with kubernetes, together with an image attribute, to the run command.
For instance, you can run with the latest Ubuntu image like this:
python helloflow.py run --with kubernetes:image=hub.docker.com/ubuntu:latest
Custom image with @pypi and @conda
You can use a custom image and apply
@pypi or @conda on top of it.
@pypi and @conda guarantee isolated environments, meaning that packages
installed in the image won't be visible in steps unless you explicitly disable
the environment for a step.
This combination is beneficial if the image contains assets other than packages that steps need to access. Steps can read all files in the image (for example, configuration files), background processes work as usual, and you can launch image-specific subprocesses.
This way, you can design a base image containing non-library assets, and let developers
handle libraries independently with @pypi and @conda.