
Why Metaflow

1. Modern businesses are eager to utilize data science and ML

In the past, data scientists and ML engineers had to rely on a medley of point solutions and custom systems to build ML and data science applications.



2. What is common in DS/ML applications?

Applications can be built more quickly and more robustly if they stand on a common, human-friendly foundation. But what should the foundation cover?



3. All DS/ML applications use data

Data may come in different shapes and sizes and may be loaded from various data stores. However, no matter what data is used, accessing and processing it shouldn't be too cumbersome.



4. DS/ML applications need to perform computation

Some applications require a tremendous amount of compute power (think computer vision), while others make do with less. Regardless of the scale, all applications need to perform computation reliably. Thanks to cloud computing, data scientists and ML engineers should be able to utilize elastic compute resources without friction.



5. DS/ML applications consist of multiple interconnected parts

Consider an application that loads data, transforms it, trains a bunch of models, chooses the best-performing one, runs inference, and writes the results to a database. Multi-step workflows like this are the norm in ML. A workflow orchestrator is needed to make sure all steps get executed in order and on time.
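To make the idea of ordered execution concrete, here is a minimal sketch of such a workflow in plain Python. It is not Metaflow's API: the step functions, the `DEPENDENCIES` table, and `run_workflow` are hypothetical names invented for illustration, the "models" are toy constant predictors, and the final step only counts results instead of writing to a real database. The standard-library `graphlib` module works out an order in which every step runs after the steps it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical step functions; each reads from and writes to a shared state dict.
def load(state):
    state["data"] = [1.0, 2.0, 3.0, 4.0]

def transform(state):
    state["features"] = [x * 2 for x in state["data"]]

def train(state):
    # "Train" two toy models; each one is just a constant prediction.
    mean = sum(state["features"]) / len(state["features"])
    state["models"] = {"mean": mean, "zero": 0.0}

def choose_best(state):
    # Pick the model with the lowest squared error on the features.
    def error(prediction):
        return sum((x - prediction) ** 2 for x in state["features"])
    models = state["models"]
    state["best"] = min(models, key=lambda name: error(models[name]))

def infer(state):
    prediction = state["models"][state["best"]]
    state["predictions"] = [prediction for _ in state["features"]]

def write_results(state):
    # Stand-in for a database write: record how many rows would be written.
    state["written"] = len(state["predictions"])

STEPS = {
    "load": load,
    "transform": transform,
    "train": train,
    "choose_best": choose_best,
    "infer": infer,
    "write_results": write_results,
}

# Maps each step to the set of steps that must finish before it starts.
DEPENDENCIES = {
    "load": set(),
    "transform": {"load"},
    "train": {"transform"},
    "choose_best": {"train"},
    "infer": {"choose_best"},
    "write_results": {"infer"},
}

def run_workflow():
    """Run every step after its dependencies; return the order and final state."""
    state = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        STEPS[name](state)
    return order, state

if __name__ == "__main__":
    order, state = run_workflow()
    print(" -> ".join(order))
    print("best model:", state["best"])
```

A production orchestrator does much more than this sketch (scheduling, retries, running steps on remote compute), but the core contract is the same: declare the steps and their dependencies, and let the system decide when each one runs.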



6. DS/ML applications evolve over time incrementally

Rarely is a real-world application built and deployed only once. Instead, a typical application is built gradually, through contributions by many people. The project needs to be tracked, organized, and versioned, which enables systematic and continuous improvement over time.



7. DS/ML applications produce business value in various ways

To produce real business value, DS/ML applications can't live in a walled garden. They must be integrated seamlessly with the surrounding systems: some applications enhance data in a database, some power internal dashboards or microservices, and others power user-facing products. There are many such ways to deploy ML in production. The more valuable the application, the more carefully it needs to be operated and monitored as well.



8. DS/ML applications should leverage the best tools available

For many data scientists and ML engineers, the most rewarding part of the project is modeling. Using their domain knowledge and expertise, modelers should be able to choose the best tool for the job from off-the-shelf libraries such as PyTorch, XGBoost, scikit-learn, and many others. Or, if necessary, they should be able to use a wholly custom approach.



9. Metaflow covers the full stack of DS/ML infrastructure

Metaflow was originally created at Netflix, motivated by the realization that data scientists and ML engineers need help with all these concerns: any gaps or friction in the stack can slow down the project drastically. Thanks to the common foundation provided by Metaflow, data scientists can iterate on ideas quickly and deploy them confidently by relying on a well-defined architecture and best practices shared by everyone on the team.



10. Metaflow takes care of the plumbing, so you can focus on the fun parts

Metaflow provides a robust and user-friendly foundation for a wide spectrum of data-intensive applications, including most data science and ML use cases. Data scientists and ML engineers who know the basics of Python can build their own applications, models, and policies on top of it, while Metaflow takes care of the low-level infrastructure: data, compute, orchestration, and versioning.



11. Metaflow relies on systems that engineers know and trust

Metaflow was designed at Netflix to serve the needs of business-critical ML/DS applications. It relies on proven and scalable infrastructure that works for small and large organizations alike. Metaflow integrates with the major clouds, with Kubernetes, and with the systems around them in a responsible manner. It respects the security and other policies of your company, making engineering teams happy too.



12. Metaflow is used by hundreds of innovative companies

Today, Metaflow powers thousands of ML/DS applications at innovative companies such as Netflix, CNN, SAP, 23andMe, Realtor.com, REA, Coveo, Latana, and hundreds of others across industries. Commercial support for Metaflow is provided by Outerbounds. To hear first-hand experiences from these companies and many others, join the Metaflow Slack.