Submitted URL: http://towardsdatascience.com/
Effective URL: https://towardsdatascience.com/?gi=f21780e2b656
Submission: On August 24 via api from US

A Medium publication sharing concepts, ideas and codes.
572K Followers

Omri Kaduri

·Pinned


A couple of robots, desperately waiting for algorithms to advise them how to act
in the world. Image from Unsplash.


FROM A* TO MARL (PART 1 — MAPF)


AN INTUITIVE HIGH-LEVEL OVERVIEW OF THE CONNECTION BETWEEN AI PLANNING THEORY AND
CURRENT REINFORCEMENT LEARNING RESEARCH FOR MULTI-AGENT SYSTEMS

Research on Reinforcement Learning (RL) and Multi-Agent RL (MARL) algorithms has
advanced rapidly over the last decade. One might attribute this to the rise
of deep learning and the use of its architectures for RL tasks. While that is true
to some extent, the foundations of RL, which can be thought of as a planning
problem formulated as a learning system, lie in AI planning theory (which has
been in development for over 50 years). …

Read more · 10 min read








--------------------------------------------------------------------------------

CJ Sullivan

·Pinned


THOUGHTS AND THEORY


BEHIND THE SCENES ON THE FAST RANDOM PROJECTION ALGORITHM FOR GENERATING GRAPH
EMBEDDINGS


A DETAILED LOOK INTO FASTRP AND ITS HYPERPARAMETERS


Photo by Crissy Jarvis on Unsplash

The vast majority of data science and machine learning models rely on creating a
vector, or embedding, of your data. Some of these embeddings arise naturally.
For example, for numerical data organized in columns, we can think of
the values associated with each row as a single vector. In more complicated
cases, such as natural language processing, we have to generate those embeddings
from the words through a variety of approaches, such as one-hot encoding or
skip-gram methods like word2vec. These vectors are then used as the
representation of the data that is to be modeled.
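
As a minimal sketch of the two simplest cases mentioned above, the snippet below treats each row of a small numerical table as a vector and builds one-hot vectors for a short list of words. The function name `one_hot_encode` and the sample data are hypothetical, chosen only for illustration; this is not the FastRP algorithm itself.

```python
def one_hot_encode(tokens):
    """Return the sorted vocabulary and one one-hot vector per input token."""
    vocabulary = sorted(set(tokens))
    index = {token: i for i, token in enumerate(vocabulary)}
    vectors = []
    for token in tokens:
        vec = [0] * len(vocabulary)
        vec[index[token]] = 1  # a single 1 marks the token's position
        vectors.append(vec)
    return vocabulary, vectors


# Numerical data organized in columns: each row is already a vector.
rows = [[5.1, 3.5, 1.4], [4.9, 3.0, 1.4]]
row_vectors = [list(row) for row in rows]

# Words need an explicit encoding step before they become vectors.
vocabulary, vectors = one_hot_encode(["graph", "node", "edge", "node"])
print(vocabulary)
print(vectors)
```

One-hot vectors grow with the vocabulary size and carry no notion of similarity between tokens, which is one reason learned, lower-dimensional embeddings are often preferred.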

It is…

Read more · 10 min read








--------------------------------------------------------------------------------

Nick Jones

·Pinned


MAKING SENSE OF BIG DATA, NOTES FROM INDUSTRY


ON THE GRID: ESTIMATING POPULATION DENSITY FOR ANYWHERE ON EARTH

“How many people live here?” From estimating demand for transport infrastructure
to planning vaccine distribution, this question is an essential starting point
for public policy analysis. And yet it remains surprisingly hard to answer, with
analysts spending hours hunting down datasets that depict population density in
the specific locations that interest them.

Fortunately, there is an easier way. …

Read more · 8 min read








--------------------------------------------------------------------------------

Pedro Madruga

·Just now


GETTING STARTED WITH TASK GROUPS IN AIRFLOW 2.0


A SIMPLE PIPELINE WITH TWO GROUPS OF TASKS, USING THE @TASKGROUP DECORATOR OF
THE TASKFLOW API FROM AIRFLOW 2.


BACKGROUND

This post is part of the ETL tutorial series. It was originally posted on
pedromadruga.com. If you like this post, consider subscribing to the newsletter
or following me on Twitter.

The complete code is available here.


INTRO

Before Task Groups in Airflow 2.0, SubDAGs were the go-to API for grouping tasks.
With Airflow 2.0, SubDAGs are deprecated in favor of the Task
Group feature. The TaskFlow API is simple and allows for a proper code
structure, favoring a clear separation of concerns.

What we’re building today is a simple DAG with two groups of tasks, using the
@taskgroup decorator…

Read more · 3 min read





--------------------------------------------------------------------------------

Zulie Rane

·Just now


THE 7 BEST WAYS TO LEARN PYTHON DEPENDING ON YOUR EXTREMELY SPECIFIC
CIRCUMSTANCE


READ THIS IF YOU’RE SLIGHTLY OVERWHELMED BY THE NUMBER OF PYTHON-LEARNING
OPTIONS OUT THERE


Photo by Kamil Zubrzycki from Pexels.

Everyone wants to know the best way to learn to code in Python nowadays. It’s a
great language, as I’ve written about (extensively) before, with great career
prospects and tons of useful features.

For as many reasons as there are to learn Python, there is probably an
equivalent number of ways to learn Python. You can already tell because this is
a listicle and not a tweet, but the best method to learn Python does not have a
single answer. There’s no one best way — there’s only the best way to learn
Python that’s good for your specific circumstance. …

Read more · 7 min read





--------------------------------------------------------------------------------

Samir Saci

·Just now


CENTRAL LIMIT THEOREM FOR PROCESS IMPROVEMENT WITH PYTHON


ESTIMATE THE WORKLOAD FOR RETURNS MANAGEMENT ASSUMING A NORMAL DISTRIBUTION OF
THE NUMBER OF ITEMS PER CARTON RECEIVED FROM YOUR STORES.


Inbound Area for Returns Management — (Image by Author)

If you are interested in articles related to Data Science for Supply Chain feel
free to have a look at my portfolio: https://samirsaci.com



Returns management, often referred to as reverse logistics, is the management of
returned items from retail locations in your distribution center.

After reception, products are sorted, organized, and inspected for quality.
If they are in good condition, these products can be restocked in the warehouse
and added to the inventory count waiting to be reordered.

In this article, we will see how the Central Limit Theorem can help us to
estimate the workload for the process…
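
As a rough illustration of the idea, the sketch below simulates batches of cartons and compares the simulated batch totals with the normal approximation that the Central Limit Theorem suggests. Everything in it is an assumption made for the example rather than the author's data or method: the batch size of 100 cartons, the uniform 1 to 20 items per carton (deliberately not normal), and the names `total_items` and `workload_95`.

```python
import math
import random
import statistics
from statistics import NormalDist

# Hypothetical assumption: each carton holds between 1 and 20 items,
# uniformly at random (deliberately not a normal distribution).
LOW, HIGH = 1, 20
N_CARTONS = 100


def total_items(rng, n_cartons=N_CARTONS):
    """Total number of items in one simulated batch of cartons."""
    return sum(rng.randint(LOW, HIGH) for _ in range(n_cartons))


# Mean and standard deviation of a discrete uniform distribution on LOW..HIGH.
mu = (LOW + HIGH) / 2
sigma = math.sqrt(((HIGH - LOW + 1) ** 2 - 1) / 12)

# CLT approximation for the batch total: Normal(n * mu, sqrt(n) * sigma).
clt = NormalDist(N_CARTONS * mu, math.sqrt(N_CARTONS) * sigma)

rng = random.Random(0)  # fixed seed so the run is reproducible
totals = [total_items(rng) for _ in range(2000)]

# Workload level that a batch should stay under about 95% of the time.
workload_95 = clt.inv_cdf(0.95)
share_below = sum(t <= workload_95 for t in totals) / len(totals)

print(f"simulated mean {statistics.mean(totals):.1f}, CLT mean {clt.mean:.1f}")
print(f"simulated stdev {statistics.stdev(totals):.1f}, CLT stdev {clt.stdev:.1f}")
print(f"95% workload level {workload_95:.0f}, share of batches below {share_below:.3f}")
```

Even though the per-carton counts are far from normal here, the batch totals are well described by a normal distribution, which is what makes a capacity estimate based on a percentile of that distribution reasonable.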

Read more · 5 min read








--------------------------------------------------------------------------------

Richmond Alake

·Just now


MISTAKES I MADE IN MY MACHINE LEARNING CAREER


AND HOW YOU CAN AVOID THEM

The truth is you will make tons of mistakes in your career as an ML
practitioner. The plus side is that there’s an opportunity to learn and level up
for each mistake you make.


Photo by Brett Jordan on Unsplash

In this article, you’ll come across mistakes that I’ve made so far in my career
as a Computer Vision / Machine Learning Engineer, and how you, as an ML
practitioner, can avoid each of them.


WHY IS THIS IMPORTANT?

The average human spends 50 years of their life employed in a job, and
most of us are just at the start of our careers. Furthermore, I…

Read more · 8 min read








--------------------------------------------------------------------------------

Damian Ejlli, Ph.D

·Just now


FIVE REGRESSION PYTHON MODULES THAT EVERY DATA SCIENTIST MUST KNOW


Fig. 1. Plot of the life satisfaction value versus GDP per capita by using the
seaborn python library (figure created by the author for educational purposes)
as in section 5. The colored region represents the 95% confidence region of the
linear regression line.


INTRODUCTION

Regression is a very important concept in statistical modelling, data science,
and machine learning that helps establish a possible relationship between an
independent variable (or predictor), x, and a dependent variable (or simply
output), y(x), by using specific mathematical minimisation criteria. There are
several types of regression that are used in different situations, and one of the
most common is linear regression. Other types of regression include logistic
regression, non-linear regression, etc.
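
To make the "minimisation criteria" concrete, here is a minimal sketch of simple linear regression using the closed-form ordinary least squares solution, which minimises the sum of squared residuals. The function name `fit_line` and the sample points are hypothetical; it uses plain Python rather than any of the modules the article goes on to cover.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical sample data, roughly following y = 2x + 1 with noise.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.1, 4.9, 7.2, 8.8]

slope, intercept = fit_line(xs, ys)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

Dedicated libraries add what this sketch leaves out, such as multiple predictors, confidence intervals, and diagnostics.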

In Python, there are several libraries and corresponding modules that can be
used to perform regression depending on a specific problem that one encounters
and its complexity. In…

Read more · 10 min read








--------------------------------------------------------------------------------

Rudradeb Mitra

·Just now


BUILDING THE WORLD’S LARGEST AI4GOOD PYTHON LIBRARY


COLLABORATIVELY BUILT AND MAINTAINED BY THE GLOBAL AI COMMUNITY


Image via Canva Pro under license to Omdena

> Imagine an open-source Python library, which allows you to build, within days,
> an end-to-end data science pipeline that is ready for production!
> Additionally, the library is not just a codebase but also a knowledge source
> helping in every stage of development while solving some of the world's most
> challenging problems.
> 
> Sounds imaginary, right? But that is exactly what OmdenaLore is.


OMDENALORE IS DEVELOPED BY THE COMMUNITY

We went on a mission to build OmdenaLore, an open-source data science package
that provides comprehensive and ready-to-use Python classes and functions to
solve almost any machine learning problem in an accelerated manner. We want this
to be a one-stop-shop…

Read more · 6 min read








--------------------------------------------------------------------------------

Dimitris Poulopoulos

·Just now


WHAT A TFRECORD IS AND HOW TO CREATE IT


HOW TO USE THE TFRECORD FORMAT TO TRAIN NEURAL NETWORKS EFFICIENTLY


Photo by Jan Antonin Kolar on Unsplash

TensorFlow is one of the most popular Deep Learning frameworks today. Some
people swear by it, and some think it is a great but bloated tool, one that
carries a heavy burden of legacy code.

Personally, I prefer working with PyTorch, but in my opinion, every Machine
Learning (ML) researcher or engineer should know how to find their way into a
TensorFlow repository. There is a lot of innovation in the field, and almost
half of it is expressed using TensorFlow.

However, for the most part, TensorFlow is an opinionated framework. …

Read more · 4 min read









TOWARDS DATA SCIENCE
