Deploying Variants of Event-Triggered Flows

Consider this advanced scenario: You have deployed two flows linked together via @trigger_on_finish. The flows run happily in production. At some point, you want to experiment with a new modeling approach. To find out whether the new approach works better than the current production version, you'd like to run both concurrently on the same data, for example to power an A/B test.

It is critical that the experimental variant doesn't interfere with the production version. Conceptually, you want two isolated deployments, one for production and one for the experiment, each with its own chain of triggered flows.

Fortunately, you can achieve such isolated deployments by using the @project decorator in conjunction with @trigger_on_finish.

Using @project and @trigger_on_finish together

from metaflow import FlowSpec, step, current, project

@project(name='variant_demo')
class FirstProjectFlow(FlowSpec):

    @step
    def start(self):
        print("This deployment is called", current.project_flow_name)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == '__main__':
    FirstProjectFlow()

and

from metaflow import FlowSpec, step, trigger_on_finish, current, project

@trigger_on_finish(flow='FirstProjectFlow')
@project(name='variant_demo')
class SecondProjectFlow(FlowSpec):

    @step
    def start(self):
        print("This deployment is called", current.project_flow_name)
        print("This run was triggered by", current.trigger.event)
        self.next(self.end)

    @step
    def end(self):
        pass

if __name__ == '__main__':
    SecondProjectFlow()

Assuming the flows are saved as firstproject.py and secondproject.py, deploy both on Argo Workflows:

python firstproject.py argo-workflows create
python secondproject.py argo-workflows create

and trigger the first one manually:

python firstproject.py argo-workflows trigger

Thanks to @project, the flows are deployed with a special name that includes the project name and a branch prefix. By default, each user gets their own branch, so the output of the start step of FirstProjectFlow should look like this:

This deployment is called variant_demo.user.alice.FirstProjectFlow

SecondProjectFlow starts automatically when FirstProjectFlow completes. Its start step should print output like this:

This deployment is called variant_demo.user.alice.SecondProjectFlow
This run was triggered by MetaflowEvent(name='metaflow.variant_demo.user.alice.FirstProjectFlow.end', ...)

which indicates that the event triggering the run is specific to Alice.
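
The names in this output follow a simple dotted pattern. As a minimal sketch, the helpers below rebuild the deployment name and the finish event name from their parts. The function names deployment_name and finish_event_name are hypothetical and not part of Metaflow's API; the patterns themselves are taken from the output shown above.

```python
def deployment_name(project: str, branch: str, flow: str) -> str:
    """Join project, branch, and flow class name with dots (hypothetical helper)."""
    return f"{project}.{branch}.{flow}"


def finish_event_name(deployment: str) -> str:
    """Name of the event seen in the output above when a deployment finishes."""
    return f"metaflow.{deployment}.end"


# Alice's default branch is user.alice, as shown in the output above.
name = deployment_name("variant_demo", "user.alice", "FirstProjectFlow")
print(name)
print(finish_event_name(name))
```

Because the branch is part of both names, a flow deployed under a different branch listens for a different event, which is what keeps deployments isolated.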

Deploying a parallel branch

To deploy a parallel variant, or branch (in the sense of Git branches), run the following commands:

python firstproject.py --branch new_model argo-workflows create
python secondproject.py --branch new_model argo-workflows create

and trigger the branch:

python firstproject.py --branch new_model argo-workflows trigger

These runs should print the same kind of output as before, with new_model appearing in the branch part of the deployment names. Importantly, triggering the new_model branch doesn't have any effect on Alice's deployment, which is fully isolated in its own namespace.
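
As a rough sketch of how the branch part of the name is chosen: assuming Metaflow's usual @project conventions, a deployment without --branch uses user.<username>, while --branch new_model yields test.new_model. The helper branch_name below is hypothetical and only illustrates that assumed rule; check the names printed by your own runs to confirm.

```python
from typing import Optional


def branch_name(user: str, branch: Optional[str] = None) -> str:
    """Assumed @project rule: custom branches get a 'test.' prefix,
    otherwise the branch defaults to 'user.<username>'."""
    if branch:
        return f"test.{branch}"
    return f"user.{user}"


# Alice's default deployment versus the new_model variant.
print("variant_demo." + branch_name("alice") + ".FirstProjectFlow")
print("variant_demo." + branch_name("alice", "new_model") + ".FirstProjectFlow")
```

Under this assumption, the two deployments never share a name, so creating or triggering one leaves the other untouched.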

Triggering across branches

As shown above, @project guarantees that all flows linked together within the same project and branch are isolated from other deployments. However, sometimes you may want to depend on an upstream flow that is deployed outside of your branch. For instance, you may want to deploy a variant of a downstream workflow, like SecondProjectFlow above, while reusing results from an existing upstream flow, like FirstProjectFlow.

You can accomplish this by explicitly specifying the full name of the deployment you want to depend on, including its project and branch. For instance, this line makes a flow depend on Alice's deployment regardless of the branch the flow is deployed on:

@trigger_on_finish(flow='variant_demo.user.alice.FirstProjectFlow')
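
The difference between the two forms of the flow argument can be sketched as follows. This is an illustration of the behavior described above, not Metaflow's actual implementation: it assumes a dotted name is used as given, while a bare class name is resolved within the deploying flow's own project and branch. The function resolve_trigger_target is hypothetical.

```python
def resolve_trigger_target(flow: str, project: str, branch: str) -> str:
    """Return the upstream deployment a downstream flow would wait for.

    Assumption: a name containing dots is already fully qualified
    (project.branch.flow); a bare name inherits the current project and branch.
    """
    if "." in flow:
        return flow
    return f"{project}.{branch}.{flow}"


# A bare name follows whatever branch the downstream flow is deployed on.
print(resolve_trigger_target("FirstProjectFlow", "variant_demo", "test.new_model"))

# A fully qualified name pins the dependency to Alice's deployment.
print(resolve_trigger_target(
    "variant_demo.user.alice.FirstProjectFlow", "variant_demo", "test.new_model"
))
```

With the pinned form, a variant of SecondProjectFlow deployed on any branch would react to runs of Alice's FirstProjectFlow, so the upstream flow does not need to be redeployed for each experiment.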