Lyft open sources data orchestration platform Flyte

Lyft is open-sourcing the secret gear of its machine learning pipeline management.
10 January 2020

Fiat 500 painted pink and carrying a Lyft logo. Source: Shutterstock

The tech powering some of the biggest ‘disruptors’ is increasingly becoming available in a versatile, open-source form for other firms to employ in any way they see fit.

Uber recently open-sourced Manifold, its debugging tool for machine learning models, and has a history of releasing its technology to the public, from platforms for training conversational AI and machine learning models to autonomous vehicle visualization systems.

Open-sourcing encourages a wider culture of knowledge sharing and continuous product innovation; not only is a company ‘donating’ its work for the benefit of technological progression, but it’s also benefiting from the adaptations and improvements brought about by a wider community of developers.

Founded in 2012, ridesharing firm Lyft has grown in less than a decade to 1,000 employees, operates in more than 600 cities, and has raised more than US$2 billion in venture capital. Now, the company is looking to make available some of the technology that powered that growth.

Lyft is open-sourcing the secret gear of its machine learning pipeline management as Flyte, which it describes as a structured programming and distributed processing platform for “highly concurrent, scalable, and maintainable workflows.”

The platform has served production model training and data processing internally for three years, across pricing, logistics, mapping, and autonomous vehicle projects. It is used in up to 7,000 unique workflows, which account for 100,000 executions a month and a total of 1 million tasks.

“Flyte is built to power and accelerate machine learning and data orchestration at the scale required by modern products, companies, and applications,” wrote Lyft. “Together, Lyft and Flyte have grown to see the massive advantage a modern processing platform provides, and we hope that in open sourcing Flyte you too can reap the benefits.”

The artificial intelligence (AI) platform is a multi-tenant system that lets teams work in separate repositories while sharing workflows across tenants. Picture an apartment block in which tenants live in different units but share access to communal facilities.

Like other platforms, Flyte provides a layer of abstraction so that developers don’t have to worry about the underlying infrastructure and can focus on business logic.

Lyft says that Flyte, as a data orchestration platform, is an essential backbone of its operations, and adds that its scalable and maintainable model suits modern products and applications, which are powered by streams of data.

“With data now being a primary asset for companies, executing large-scale compute jobs is critical to the business, but problematic from an operational standpoint. Scaling, monitoring, and managing compute clusters becomes a burden on each product team, slowing down iteration, and subsequently product innovation. Moreover, these workflows often have complex data dependencies,” the company wrote in its blog post.

“Flyte’s mission is to increase development velocity for machine learning and data processing by abstracting this overhead.”
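To make the idea of a workflow with data dependencies more concrete, here is a minimal sketch in plain Python. It does not use Flyte's API and is not how Lyft implements the platform; the task functions, the `WORKFLOW` table, and the `run_workflow` helper are all hypothetical names invented for illustration. It only shows the general pattern an orchestrator automates: run each task once its upstream tasks have produced their outputs.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical tasks standing in for steps of a data pipeline.
def load_rides():
    return [12.0, 8.5, 20.0]


def clean(rides):
    # Keep only fares above an arbitrary illustrative threshold.
    return [r for r in rides if r > 10]


def average(cleaned):
    return sum(cleaned) / len(cleaned)


# Each task name maps to (function, upstream tasks whose outputs it consumes).
WORKFLOW = {
    "load_rides": (load_rides, []),
    "clean": (clean, ["load_rides"]),
    "average": (average, ["clean"]),
}


def run_workflow(workflow):
    """Run every task once, after all the tasks it depends on."""
    # TopologicalSorter expects a mapping of node -> set of predecessors.
    graph = {name: set(deps) for name, (_, deps) in workflow.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, deps = workflow[name]
        # Pass upstream outputs to the task in the declared order.
        results[name] = func(*(results[d] for d in deps))
    return results


results = run_workflow(WORKFLOW)
```

A production orchestrator adds what this sketch leaves out, such as distributing tasks across a cluster, retrying failures, and monitoring executions, which is the overhead Lyft says Flyte is meant to abstract away.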

Flyte’s launch follows AI tools the ridesharing firm released earlier to simulate the results of machine learning algorithms. Last year, the company released a publicly available database for autonomous vehicle development.