towardsdatascience.com
Submitted URL: https://towardsdatascience.com/what-if-your-data-is-not-normal-d7293f7b8f0
Effective URL: https://towardsdatascience.com/what-if-your-data-is-not-normal-d7293f7b8f0?gi=4f1549069cbe
Submission: On December 05 via manual from US — Scanned from DE
WHAT IF YOUR DATA IS NOT NORMAL?

In this article, we discuss Chebyshev’s bound for statistical data analysis. In the absence of any idea about the Normality of a given data set, this bound can be used to gauge the concentration of data around the mean.

Tirthajyoti Sarkar · Published in Towards Data Science · 7 min read · Nov 2, 2018

INTRODUCTION

This is Halloween week, and in between the tricks and treats, we data geeks are chuckling over this cute meme on social media. You think this is a joke? Let me tell you, this is not a laughing matter. It is scary, true to the spirit of Halloween!

> If we cannot assume that most of our data (of business, social, economic, or
> scientific origin) are at least approximately ‘Normal’ (i.e. they are
> generated by a Gaussian process or by a sum of multiple such processes), then
> we are doomed!

Here is an extremely brief list of things that will not be valid:

* The whole concept of six-sigma
* The famous 68–95–99.7 rule
* The ‘holy’ concept of p=0.05 (which comes from the roughly 2-sigma interval) in statistical analysis

Scary enough? Let’s talk more about it…

THE OMNIPOTENT AND OMNIPRESENT NORMAL DISTRIBUTION

Let’s keep this section short and sweet. The Normal (Gaussian) distribution is the most widely known probability distribution. Here are some links to articles describing its power and wide applicability:

* Why Data Scientists love Gaussian
* How to Dominate the Statistics Portion of Your Data Science Interview
* What’s So Important about the Normal Distribution?
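
To make the contrast between the 68–95–99.7 rule and Chebyshev’s bound concrete, here is a minimal sketch that is not taken from the original article. It relies on the standard statement of Chebyshev’s inequality: for any distribution with finite variance, at least 1 - 1/k² of the data lies within k standard deviations of the mean. The function names and the exponential test data are illustrative assumptions, chosen only to show a clearly non-Normal case.

```python
import math
import random
import statistics


def chebyshev_lower_bound(k: float) -> float:
    """Minimum fraction of data within k standard deviations of the mean.

    Follows from Chebyshev's inequality and holds for any distribution
    with finite variance. (Hypothetical helper name.)
    """
    if k <= 0:
        raise ValueError("k must be positive")
    return max(0.0, 1.0 - 1.0 / k**2)


def normal_fraction(k: float) -> float:
    """Fraction of a Normal distribution within k standard deviations."""
    return math.erf(k / math.sqrt(2))


def empirical_fraction(data, k: float) -> float:
    """Observed fraction of the sample within k population std devs of its mean."""
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return sum(abs(x - mu) <= k * sigma for x in data) / len(data)


# Assumption: exponential data stands in for a skewed, non-Normal data set.
rng = random.Random(0)
skewed = [rng.expovariate(1.0) for _ in range(10_000)]

for k in (1, 2, 3):
    print(
        f"k={k}: Chebyshev >= {chebyshev_lower_bound(k):.3f}, "
        f"Normal = {normal_fraction(k):.3f}, "
        f"skewed sample = {empirical_fraction(skewed, k):.3f}"
    )
```

The Chebyshev column is much weaker than the Normal column (75% versus roughly 95% at two standard deviations), which is the price of making no assumption about the shape of the distribution. The skewed sample always sits at or above the Chebyshev value, but it need not match the Normal percentages.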