Netflix has open-sourced its workflow-as-a-service (WaaS) contraption
> It serves thousands of users, including data scientists, data engineers, machine learning engineers, software engineers, content producers, and business analysts, for various use cases. It schedules hundreds of thousands of workflows, millions of jobs every day and operates with a strict SLO even when there are spikes in the traffic.
Maestro: Netflix’s Workflow Orchestrator
https://github.com/Netflix/maestro
They also published a big post with details of the implementation and how it works (surprisingly, in 2025 they built it on Java 21)
100X Faster: How We Supercharged Netflix Maestro’s Workflow Engine
https://netflixtechblog.com/100x-faster-how-we-supercharged-netflix-maestros-workflow-engine-028e9637f041
The whole thing won't fit in one post, so I'll keep just the concluding takeaways
> Despite these challenges, the migration was a success. We migrated over 60,000 active workflows generating over a million data processing tasks daily with almost no user involvement. By observing the flow engine’s lifecycle management latency, we validated a reduction in step launch overhead from around 5 seconds to 50 milliseconds. Workflow start overhead (incurred once per each workflow execution) also improved from 200 milliseconds to 50 milliseconds. Aggregating this over a million daily step executions translates to saving approximately 57 days of flow engine overhead per day, leading to a snappier user experience, more timely workflow status for data practitioners and greater overall task throughput for the same infrastructure scale.
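A quick sanity check on the quoted "57 days per day" figure: each step saves roughly 5 s − 50 ms ≈ 4.95 s of overhead, so over a million daily step executions that is 10⁶ × 4.95 s ≈ 4.95 × 10⁶ s ≈ 57.3 days of flow-engine overhead eliminated every day. The numbers in the post check out.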
> The architectural evolution of Maestro represents a significant leap in performance, reducing overhead from seconds to milliseconds. This redesign with a stateful actor model not only enhances speed by 100X but also maintains scalability and reliability, ensuring Maestro continues to meet the diverse needs of Netflix’s data and ML workflows.
> Performance matters: Even in a system designed for scale, the speed of individual operations significantly impacts user experience and productivity.
> Simplicity wins: Reducing dependencies and simplifying architecture not only improved performance but also enhanced reliability and maintainability.
> Locality optimizations pay off: Collocating related flows and tasks in the same JVM dramatically reduces overhead from the Maestro engine.
> Modern language features help: Java 21’s virtual threads enabled an elegant actor-based implementation with minimal code complexity and dependencies.
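To make the last point concrete, here is a minimal sketch (my own illustration, not Maestro's actual code) of the pattern the post describes: each flow gets a cheap Java 21 virtual thread that drains a mailbox, so an actor processes its messages strictly in order without explicit locks, and thousands of such actors can share one JVM.

```java
import java.util.*;
import java.util.concurrent.*;

// One actor per flow: a FIFO mailbox drained by a dedicated virtual thread.
final class FlowActor {
    private final BlockingQueue<Runnable> mailbox = new LinkedBlockingQueue<>();
    private final Thread worker;

    FlowActor(String flowId) {
        // Java 21: virtual threads are cheap enough to dedicate one per actor.
        worker = Thread.ofVirtual().name("flow-" + flowId).start(() -> {
            try {
                while (true) mailbox.take().run(); // one message at a time, in order
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // stop() interrupts us
            }
        });
    }

    void tell(Runnable message) { mailbox.add(message); }

    void stop() { worker.interrupt(); }
}

public class Demo {
    // Send three step messages to one actor and return the processing log.
    static List<String> run() throws InterruptedException {
        var log = Collections.synchronizedList(new ArrayList<String>());
        var actor = new FlowActor("wf-1");
        var done = new CountDownLatch(3);
        for (int i = 1; i <= 3; i++) {
            int step = i;
            actor.tell(() -> { log.add("step " + step); done.countDown(); });
        }
        done.await();  // wait until the actor has drained all three messages
        actor.stop();
        return log;    // sequential mailbox processing preserves send order
    }

    public static void main(String[] args) throws Exception {
        System.out.println(Demo.run());
    }
}
```

Because a single virtual thread owns the mailbox, per-flow state needs no synchronization, which is the locality and simplicity win the takeaways describe.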
BY Технологический Болт Генона

