Basic Operations

FlowSpec(use_cli)

from metaflow import FlowSpec

Main class from which all Flows should inherit.

Attributes 

script_name

index

input

FlowSpec.next(self, *dsts, **kwargs)

from metaflow import FlowSpec

Indicates the next step to execute once this step finishes.

This statement should appear once and only once in each and every step (except the end step). Furthermore, it should be the last statement in the step.

There are several valid formats to specify the next step:

- Straight-line connection: self.next(self.next_step), where next_step is a method in the current class decorated with the @step decorator.
- Static fan-out connection: self.next(self.step1, self.step2, ...), where each stepX is a method in the current class decorated with the @step decorator.
- Foreach branch: self.next(self.foreach_step, foreach='foreach_iterator'), where foreach_step is a method in the current class decorated with the @step decorator and foreach_iterator is the name of an attribute of the current class that evaluates to an iterator. One task is launched for each value in the iterator, and each task executes the code of foreach_step.

Raises 

InvalidNextException

Raised if the format of the arguments does not match one of the ones given above.
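
As a rough illustration of the three formats above, the following sketch models which tasks each kind of transition would launch. It is plain Python and does not import Metaflow: plan_tasks, its parameters, and the use of ValueError in place of InvalidNextException are all hypothetical stand-ins chosen for this example, and step methods are represented by their names.

```python
from typing import Any, Dict, List, Optional, Tuple


def plan_tasks(
    dsts: List[str],
    foreach: Optional[str],
    attrs: Dict[str, Any],
) -> List[Tuple[str, Any]]:
    """Return the (step_name, input_value) pairs a transition would launch.

    Hypothetical helper for illustration only; this is not how Metaflow
    schedules tasks internally.
    """
    if not dsts:
        raise ValueError("next() needs at least one destination step")
    if foreach is not None:
        # Foreach branch: a single destination, one task per value.
        if len(dsts) != 1:
            raise ValueError("a foreach transition takes exactly one step")
        return [(dsts[0], value) for value in attrs[foreach]]
    # Straight-line (one destination) or static fan-out (several):
    # one task per destination, with no foreach input value.
    return [(name, None) for name in dsts]


linear = plan_tasks(["train"], None, {})
fan_out = plan_tasks(["fit_a", "fit_b"], None, {})
per_item = plan_tasks(["fit"], "params", {"params": [1, 2, 3]})
print(linear)
print(fan_out)
print(per_item)
```

In this model a straight-line connection is simply a fan-out with one destination, while a foreach branch multiplies a single destination by the number of values in the named attribute.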

FlowSpec.foreach_stack(self)

from metaflow import FlowSpec

Returns the current stack of foreach steps for the current step.

This effectively corresponds to the indexes and values at the various levels of nesting. For example, considering the following code:

@step
def root(self):
    self.split_1 = ['a', 'b', 'c']
    self.next(self.nest_1, foreach='split_1')

@step
def nest_1(self):
    self.split_2 = ['d', 'e', 'f', 'g']
    self.next(self.nest_2, foreach='split_2')

@step
def nest_2(self):
    foo = self.foreach_stack()

foo will take the following values in the various tasks for nest_2:

    [(0, 3, 'a'), (0, 4, 'd')]
    [(0, 3, 'a'), (1, 4, 'e')]
    ...
    [(0, 3, 'a'), (3, 4, 'g')]
    [(1, 3, 'b'), (0, 4, 'd')]
    ...

where each tuple corresponds to:

- the index of the task for that level of the loop
- the number of splits for that level of the loop
- the value for that level of the loop

Note that the last tuple returned in a task corresponds to:

- first element: value returned by self.index
- third element: value returned by self.input

Returns 

List[Tuple[int, int, object]]

A list describing the current stack of foreach steps, with one tuple per level of nesting.
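
To make the shape of these values concrete, the sketch below enumerates the stacks for the nested example above in plain Python, without running Metaflow. The helper simulated_foreach_stacks is hypothetical, and it assumes every task at a given level iterates over the same list; in a real flow each nest_1 task could set a different split_2, and tasks do not necessarily run in this order.

```python
from itertools import product

split_1 = ['a', 'b', 'c']
split_2 = ['d', 'e', 'f', 'g']


def simulated_foreach_stacks(*levels):
    """Yield one simulated stack per innermost task.

    Each stack is a list of (index, num_splits, value) tuples, one per
    nesting level, outermost level first. Illustration only; this does
    not call Metaflow.
    """
    indexed = [
        [(i, len(level), value) for i, value in enumerate(level)]
        for level in levels
    ]
    for combo in product(*indexed):
        yield list(combo)


stacks = list(simulated_foreach_stacks(split_1, split_2))
print(len(stacks))   # 12 tasks for nest_2 (3 * 4)
print(stacks[0])     # [(0, 3, 'a'), (0, 4, 'd')]
print(stacks[4])     # [(1, 3, 'b'), (0, 4, 'd')]

# The last tuple of a stack mirrors what self.index and self.input
# would report inside that task.
index, _, value = stacks[-1][-1]
print(index, value)  # 3 g
```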