Source: https://netflixtechblog.com/python-at-netflix-bba45dae649e

PYTHON AT NETFLIX

Netflix Technology Blog · Published in Netflix TechBlog · 8 min read · Apr 29, 2019

By Pythonistas at Netflix, coordinated by Amjith Ramanujam and edited by Ellen
Livengood



As many of us prepare to go to PyCon, we wanted to share a sampling of how
Python is used at Netflix. We use Python through the full content lifecycle,
from deciding which content to fund all the way to operating the CDN that serves
the final video to 148 million members. We use and contribute to many
open-source Python packages, some of which are mentioned below. If any of this
interests you, check out the jobs site or find us at PyCon. We have donated a
few Netflix Originals posters to the PyLadies Auction and look forward to seeing
you all there.


OPEN CONNECT

Open Connect is Netflix’s content delivery network (CDN). An easy, though
imprecise, way of thinking about Netflix infrastructure is that everything that
happens before you press Play on your remote control (e.g., are you logged in?
what plan do you have? what have you watched so we can recommend new titles to
you? what do you want to watch?) takes place in Amazon Web Services (AWS),
whereas everything that happens afterwards (i.e., video streaming) takes place
in the Open Connect network. Content is placed on the network of servers in the
Open Connect CDN as close to the end user as possible, improving the streaming
experience for our customers and reducing costs for both Netflix and our
Internet Service Provider (ISP) partners.

Various software systems are needed to design, build, and operate this CDN
infrastructure, and a significant number of them are written in Python. The
network devices that underlie a large portion of the CDN are mostly managed by
Python applications. Such applications track the inventory of our network gear:
what devices, of which models, with which hardware components, located in which
sites. The configuration of these devices is controlled by several other
systems, including a source of truth, the application of configurations to
devices, and backups. Yet another Python application handles device interaction
to collect health and other operational data. Python has long been a popular
programming language in the networking space because it is an intuitive
language that lets engineers solve networking problems quickly. As a result,
many useful libraries have been developed, making the language even more
attractive to learn and use.
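The kind of inventory tracking described above can be sketched in a few lines of Python. This is an illustrative stand-in, not Netflix's actual schema; all names here are hypothetical.

```python
# Hypothetical sketch of a network-device inventory record and two simple
# operations on it; field names are illustrative only.
from dataclasses import dataclass, field

@dataclass
class NetworkDevice:
    hostname: str
    model: str
    site: str
    components: list = field(default_factory=list)  # hardware components

inventory = {}

def register(device: NetworkDevice) -> None:
    """Add or update a device in the inventory, keyed by hostname."""
    inventory[device.hostname] = device

def devices_at(site: str) -> list:
    """Query: all devices located at a given site."""
    return [d for d in inventory.values() if d.site == site]
```

A real system would back this with a database and reconcile it against the devices themselves, but the shape of the problem (devices, models, components, sites) is the same.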


DEMAND ENGINEERING

Demand Engineering is responsible for Regional Failovers, Traffic Distribution,
Capacity Operations, and Fleet Efficiency of the Netflix cloud. We are proud to
say that our team’s tools are built primarily in Python. The service that
orchestrates failover uses numpy and scipy for numerical analysis, boto3 to
make changes to our AWS infrastructure, and rq to run asynchronous workloads,
all wrapped in a thin layer of Flask APIs. The ability to drop into a bpython
shell and improvise has saved the day more than once.
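The pattern of a thin API layer handing work off to asynchronous workers can be sketched with the stdlib standing in for Flask, rq, and Redis. This is a minimal sketch of the pattern, not the failover service itself; all names are illustrative.

```python
import queue
import threading

tasks = queue.Queue()
results = {}

def worker():
    # Drain the queue and record each job's result; in the real pattern,
    # rq workers play this role against a Redis-backed queue.
    while True:
        job_id, fn, args = tasks.get()
        results[job_id] = fn(*args)
        tasks.task_done()

threading.Thread(target=worker, daemon=True).start()

def submit(job_id, fn, *args):
    """The 'API layer': accept a request and enqueue it without blocking."""
    tasks.put((job_id, fn, args))

submit("capacity-check", sum, [1, 2, 3])
tasks.join()  # wait for the worker to drain the queue
```

The API call returns immediately after enqueueing; the heavy lifting happens in the worker, which is what makes the front-end layer safe to keep thin.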

We are heavy users of Jupyter Notebooks and nteract to analyze operational data
and prototype visualization tools that help us detect capacity regressions.


CORE

The CORE team uses Python in our alerting and statistical analytical work. We
lean on many of the statistical and mathematical libraries (numpy, scipy,
ruptures, pandas) to help automate the analysis of thousands of related signals
when our alerting systems indicate problems. We’ve developed a time series
correlation system, used both inside and outside the team, as well as a
distributed worker system that parallelizes large amounts of analytical work to
deliver results quickly.
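The core of a time series correlation system can be sketched in pure Python: compute Pearson correlation between a target signal and each candidate, then rank candidates by correlation strength. This is a toy illustration, not the team's actual system.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length signals (pure Python)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rank_by_correlation(target, signals):
    """Rank candidate signals (a dict of name -> series) by |r| with the target."""
    return sorted(signals, key=lambda name: -abs(pearson(target, signals[name])))
```

At the scale of thousands of signals, this ranking step is exactly the embarrassingly parallel work a distributed worker system fans out.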

Python is also a tool we typically use for automation tasks, data exploration
and cleaning, and as a convenient source for visualization work.


MONITORING, ALERTING AND AUTO-REMEDIATION

The Insight Engineering team is responsible for building and operating the tools
for operational insight, alerting, diagnostics, and auto-remediation. With the
increased popularity of Python, the team now supports Python clients for most of
their services. One example is the Spectator Python client library, a library
for instrumenting code to record dimensional time series metrics. We build
Python libraries to interact with other Netflix platform level services. In
addition to libraries, the Winston and Bolt products are also built using Python
frameworks (Gunicorn + Flask + Flask-RESTPlus).
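Dimensional time series metrics of the kind Spectator records can be sketched as counters keyed by a metric name plus a tag set. This is an illustrative stand-in, not the Spectator Python API.

```python
from collections import Counter

class Registry:
    """Toy dimensional-metrics registry: one counter per (name, tags) identity."""
    def __init__(self):
        self.counters = Counter()

    def counter(self, name, **tags):
        # The metric identity is the name plus the sorted tag set, so
        # server.requests{status=200} and {status=500} are distinct series.
        return Metric(self, (name, tuple(sorted(tags.items()))))

class Metric:
    def __init__(self, registry, key):
        self.registry, self.key = registry, key

    def increment(self, amount=1):
        self.registry.counters[self.key] += amount
```

The tags are what make the metrics "dimensional": the same counter name can be sliced later by status code, region, or any other tag without defining a new metric up front.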


INFORMATION SECURITY

The information security team uses Python to accomplish a number of high
leverage goals for Netflix: security automation, risk classification,
auto-remediation, and vulnerability identification, to name a few. We’ve
released a number of successful open-source Python projects, including Security
Monkey (our team’s most active open-source project). We use Python to protect
our SSH resources with Bless, our Infrastructure Security team uses Python for
IAM permission tuning with Repokid, and we generate TLS certificates with
Lemur.

One of our more recent projects is Prism, a batch framework that helps security
engineers measure paved-road adoption and risk factors and identify
vulnerabilities in source code. We currently provide Python and Ruby libraries
for Prism. The Diffy forensics triage tool is written entirely in Python, and
we use Python to detect sensitive data with Lanius.


PERSONALIZATION ALGORITHMS

We use Python extensively within our broader Personalization Machine Learning
Infrastructure to train some of the Machine Learning models for key aspects of
the Netflix experience: from our recommendation algorithms to artwork
personalization to marketing algorithms. For example, some algorithms use
TensorFlow, Keras, and PyTorch to train deep neural networks, XGBoost and
LightGBM to train gradient-boosted decision trees, or the broader scientific
stack in Python (e.g., numpy, scipy, sklearn, matplotlib, pandas, cvxpy). Because
we’re constantly trying out new approaches, we use Jupyter Notebooks to drive
many of our experiments. We have also developed a number of higher-level
libraries to help integrate these with the rest of our ecosystem (e.g. data
access, fact logging and feature extraction, model evaluation, and publishing).


MACHINE LEARNING INFRASTRUCTURE

Besides personalization, Netflix applies machine learning to hundreds of use
cases across the company. Many of these applications are powered by Metaflow, a
Python framework that makes it easy to execute ML projects from the prototype
stage to production.

Metaflow pushes the limits of Python: we rely on well-parallelized and
optimized Python code to fetch data at 10 Gbps, handle hundreds of millions of
data points in memory, and orchestrate computation over tens of thousands of
CPU cores.
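The fan-out style of parallelism this relies on can be illustrated with the stdlib's concurrent.futures. This is a sketch of the pattern, not Metaflow's implementation; the shard-fetching function is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def fetch_shard(shard_id):
    """Stand-in for fetching one shard of a dataset; real code would hit S3."""
    return [shard_id * 10 + i for i in range(3)]

def fetch_all(shard_ids, workers=8):
    # Fan the fetches out across a thread pool and flatten the results
    # in the original shard order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        shards = pool.map(fetch_shard, shard_ids)
        return [row for shard in shards for row in shard]
```

Because I/O-bound fetches release the GIL while waiting on the network, a thread pool like this is often enough to saturate a fast link; CPU-bound steps would use processes or separate machines instead.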


NOTEBOOKS

We are avid users of Jupyter notebooks at Netflix, and we’ve written about the
reasons and nature of this investment before.

But Python plays a huge role in how we provide those services. Python is our
primary language when we need to develop, debug, explore, and prototype
different interactions with the Jupyter ecosystem. We use Python to build
custom extensions to the Jupyter server that allow us to manage tasks like
logging, archiving, publishing, and cloning notebooks on behalf of our users.
We provide many flavors of Python to our users via different Jupyter kernels,
and we manage the deployment of those kernel specifications using Python.
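Each kernel flavor is described by a small kernel.json spec that tells the Jupyter server how to launch it. A typical spec for a plain Python kernel looks roughly like this; the display name and environment are illustrative, and deployments vary.

```json
{
  "argv": ["python", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
  "display_name": "Python 3 (example env)",
  "language": "python"
}
```

Managing "many flavors of Python" largely comes down to generating and deploying files like this one per environment, which is itself a natural job for Python.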


ORCHESTRATION

The Big Data Orchestration team is responsible for providing all of the
services and tooling needed to schedule and execute ETL and ad hoc pipelines.

Many of the components of the orchestration service are written in Python,
starting with our scheduler, which uses Jupyter Notebooks with papermill to
provide templatized job types (Spark, Presto, …). This gives our users a
standardized, easy way to express work that needs to be executed. You can find
deeper details on the subject here. We have also been using notebooks as real
runbooks for situations where human intervention is required, for example, to
restart everything that has failed in the last hour.

Internally, we also built an event-driven platform that is fully written in
Python. We have created streams of events from a number of systems that get
unified into a single tool. This allows us to define conditions to filter
events, and actions to react or route them. As a result of this, we have been
able to decouple microservices and get visibility into everything that happens
on the data platform.
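The condition/action model described above can be sketched as a small event router. This is an illustrative toy, not the platform's actual design; the rule and event names are made up.

```python
# Illustrative sketch of an event-driven router: conditions filter events,
# actions react to (or route) the ones that match.
class EventRouter:
    def __init__(self):
        self.rules = []  # (condition, action) pairs

    def on(self, condition, action):
        """Register a rule: run `action` on every event where `condition` holds."""
        self.rules.append((condition, action))

    def publish(self, event):
        """Feed one event through every rule; return the actions that fired."""
        return [action(event)
                for condition, action in self.rules
                if condition(event)]

router = EventRouter()
router.on(lambda e: e.get("status") == "FAILED",
          lambda e: f"restart {e['job']}")
```

Because producers only publish events and consumers only register rules, neither side needs to know about the other, which is what decouples the microservices.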

Our team also built the pygenie client which interfaces with Genie, a federated
job execution service. Internally, we have additional extensions to this library
that apply business conventions and integrate with the Netflix platform. These
libraries are the primary way users interface programmatically with work in the
Big Data platform.

Finally, our team is committed to contributing to the papermill and scrapbook
open-source projects, both for our own use cases and external ones. These
efforts have been gaining traction in the open-source community, and we’re glad
to be able to contribute to these shared projects.


EXPERIMENTATION PLATFORM

The scientific computing team for experimentation is creating a platform for
scientists and engineers to analyze A/B tests and other experiments. Scientists
and engineers can contribute new innovations on three fronts: data, statistics,
and visualizations.

The Metrics Repo is a Python framework based on PyPika that allows contributors
to write reusable parameterized SQL queries. It serves as an entry point into
any new analysis.
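The idea of reusable, parameterized queries (which Metrics Repo builds on PyPika) can be sketched with plain string composition. This is a simplified stand-in, not the Metrics Repo API; the table and column names are made up.

```python
def metric_query(table, metric, start, end, group_by=None):
    """Compose a reusable aggregate query; placeholders keep values out of the SQL."""
    select = f"SELECT SUM({metric}) AS value"
    clauses = [f"FROM {table}", "WHERE day BETWEEN %(start)s AND %(end)s"]
    if group_by:
        select += f", {group_by}"
        clauses.append(f"GROUP BY {group_by}")
    sql = " ".join([select] + clauses)
    # Return the SQL and its bound parameters separately, as a DB driver expects.
    return sql, {"start": start, "end": end}
```

The payoff of a framework like this is that one parameterized definition serves every analysis that needs the metric, instead of each analyst rewriting the SQL by hand.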

The Causal Models library is a Python & R framework for scientists to contribute
new models for causal inference. It leverages PyArrow and RPy2 so that
statistics can be calculated seamlessly in either language.

The Visualizations library is based on Plotly. Since Plotly is a widely adopted
visualization spec, there are a variety of tools that allow contributors to
produce an output that is consumable by our platforms.


PARTNER ECOSYSTEM

The Partner Ecosystem group is expanding its use of Python for testing Netflix
applications on devices. Python is forming the core of a new CI infrastructure,
including controlling our orchestration servers, controlling Spinnaker, test
case querying and filtering, and scheduling test runs on devices and containers.
Additional post-run analysis is being done in Python using TensorFlow to
determine which tests are most likely to show problems on which devices.


VIDEO ENCODING AND MEDIA CLOUD ENGINEERING

Our team takes care of encoding (and re-encoding) the Netflix catalog, as well
as leveraging machine learning for insights into that catalog. We use Python
for ~50 projects, such as vmaf and mezzfs, we build computer vision solutions
using a media map-reduce platform called Archer, and we use Python for many
internal projects. We have also open-sourced a few tools that ease the
development and distribution of Python projects, like setupmeta and pickley.


NETFLIX ANIMATION AND NVFX

Python is the industry standard for all of the major applications we use to
create Animated and VFX content, so it goes without saying that we are using it
very heavily. All of our integrations with Maya and Nuke are in Python, and the
bulk of our Shotgun tools are also in Python. We’re just getting started moving
our tooling to the cloud, and we anticipate deploying many of our own custom
Python AMIs/containers.


CONTENT MACHINE LEARNING, SCIENCE & ANALYTICS

The Content Machine Learning team uses Python extensively to develop the
machine learning models that form the core of forecasting audience size,
viewership, and other demand metrics for all content.




