zlib.pub
2606:4700:3035::ac43:bd7c  Public Scan

URL: https://zlib.pub/book/practical-python-data-wrangling-and-data-quality-getting-started-with-reading-cleaning-and-...
Submission: On March 29 via manual from DE — Scanned from DE

Form analysis 2 forms found in the DOM

GET https://zlib.pub/search

<form action="https://zlib.pub/search" class="my-auto w-100 d-inline-block order-1" method="GET">
  <div class="input-group">
    <input class="form-control border border-right-0" name="q" placeholder="Search Books / Articles / PDF" type="text">
    <span class="input-group-append">
      <button class="btn btn-outline-light border border-left-0" type="submit">Search</button>
    </span>
  </div>
</form>

POST https://zlib.pub/download/practical-python-data-wrangling-and-data-quality-getting-started-with-reading-cleaning-and-analyzing-data-ufel04klj340

<form action="https://zlib.pub/download/practical-python-data-wrangling-and-data-quality-getting-started-with-reading-cleaning-and-analyzing-data-ufel04klj340" method="post">
  <div class="g-recaptcha" data-sitekey="6LdrFLMdAAAAABIxrK9AISTVGRgF_EOFMt0sAEA9">
    <div style="width: 304px; height: 78px;">
      <div><iframe title="reCAPTCHA" width="304" height="78" role="presentation" name="a-88pfvs1pwhkg" frameborder="0" scrolling="no"
          sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-top-navigation allow-modals allow-popups-to-escape-sandbox allow-storage-access-by-user-activation"
          src="https://www.google.com/recaptcha/api2/anchor?ar=1&amp;k=6LdrFLMdAAAAABIxrK9AISTVGRgF_EOFMt0sAEA9&amp;co=aHR0cHM6Ly96bGliLnB1Yjo0NDM.&amp;hl=de&amp;v=moV1mTgQ6S91nuTnmll4Y9yf&amp;size=normal&amp;cb=z64ysj9hdojc"></iframe></div><textarea
        id="g-recaptcha-response" name="g-recaptcha-response" class="g-recaptcha-response" style="width: 250px; height: 40px; border: 1px solid rgb(193, 193, 193); margin: 10px 25px; padding: 0px; resize: none; display: none;"></textarea>
    </div><iframe style="display: none;"></iframe>
  </div>
  <button class="btn btn-primary btn-block" type="submit" style="width: 80%;"> Download PDF </button>
</form>

Text Content


PRACTICAL PYTHON DATA WRANGLING AND DATA QUALITY: GETTING STARTED WITH READING,
CLEANING, AND ANALYZING DATA PDF

Title: Practical Python Data Wrangling and Data Quality: Getting Started with
Reading, Cleaning, and Analyzing Data
Author: Susan E. McGregor
Language: English
ISBN: 1492091502 / 9781492091509
Year: 2021
Pages: 416
File Size: 7.5 MB
Total Downloads: 2,564
Total Views: 5,859
Edition: 1
Pages In File: 578
Identifier: 1492091502,9781492091509
Org File Size: 7,890,266
Extension: pdf


--------------------------------------------------------------------------------

DESCRIPTION

The world around us is full of data that holds unique insights and valuable
stories, and this book will help you uncover them. Whether you already work with
data or want to learn more about its possibilities, the examples and techniques
in this practical book will help you more easily clean, evaluate, and analyze
data so that you can generate meaningful insights and compelling
visualizations.

Complementing foundational concepts with expert advice, author Susan E. McGregor
provides the resources you need to extract, evaluate, and analyze a wide variety
of data sources and formats, along with the tools to communicate your findings
effectively. This book delivers a methodical, jargon-free way for data
practitioners at any level, from true novices to seasoned professionals, to
harness the power of data.

 * Use Python 3.8+ to read, write, and transform data from a variety of sources
 * Understand and use programming basics in Python to wrangle data at scale
 * Organize, document, and structure your code using best practices
 * Collect data from structured data files, web pages, and APIs
 * Perform basic statistical analyses to make meaning from datasets
 * Visualize and present data in clear and compelling ways...

--------------------------------------------------------------------------------

TABLE OF CONTENTS

Preface
Who Should Read This Book?
Who Shouldn’t Read This Book?
What to Expect from This Volume
Conventions Used in This Book
Using Code Examples
O’Reilly Online Learning
How to Contact Us
Acknowledgments
1. Introduction to Data Wrangling and Data Quality
What Is “Data Wrangling”?
What Is “Data Quality”?
Data Integrity
Data “Fit”
Why Python?
Versatility
Accessibility
Readability
Community
Python Alternatives
Writing and “Running” Python
Working with Python on Your Own Device
Getting Started with the Command Line
Installing Python, Jupyter Notebook, and a Code Editor
Working with Python Online
Hello World!
Using Atom to Create a Standalone Python File
Using Jupyter to Create a New Python Notebook
Using Google Colab to Create a New Python Notebook
Adding the Code
In a Standalone File
In a Notebook
Running the Code
In a Standalone File
In a Notebook
Documenting, Saving, and Versioning Your Work
Documenting
Saving
Versioning
Conclusion
2. Introduction to Python
The Programming “Parts of Speech”
Nouns ≈ Variables
Verbs ≈ Functions
Cooking with Custom Functions
Libraries: Borrowing Custom Functions from Other Coders
Taking Control: Loops and Conditionals
In the Loop
One Condition…
Understanding Errors
Syntax Snafus
Runtime Runaround
Logic Loss
Hitting the Road with Citi Bike Data
Starting with Pseudocode
Seeking Scale
Conclusion
3. Understanding Data Quality
Assessing Data Fit
Validity
Reliability
Representativeness
Assessing Data Integrity
Necessary, but Not Sufficient
Important
Achievable
Improving Data Quality
Data Cleaning
Data Augmentation
Conclusion
4. Working with File-Based and Feed-Based Data in Python
Structured Versus Unstructured Data
Working with Structured Data
File-Based, Table-Type Data—Take It to Delimit
Wrangling Table-Type Data with Python
Real-World Data Wrangling: Understanding Unemployment
XLSX, ODS, and All the Rest
Finally, Fixed-Width
Feed-Based Data—Web-Driven Live Updates
Wrangling Feed-Type Data with Python
Working with Unstructured Data
Image-Based Text: Accessing Data in PDFs
Wrangling PDFs with Python
Accessing PDF Tables with Tabula
Conclusion
5. Accessing Web-Based Data
Accessing Online XML and JSON
Introducing APIs
Basic APIs: A Search Engine Example
Specialized APIs: Adding Basic Authentication
Getting a FRED API Key
Using Your API key to Request Data
Reading API Documentation
Protecting Your API Key When Using Python
Creating Your “Credentials” File
Using Your Credentials in a Separate Script
Getting Started with .gitignore
Specialized APIs: Working With OAuth
Applying for a Twitter Developer Account
Creating Your Twitter “App” and Credentials
Encoding Your API Key and Secret
Requesting an Access Token and Data from the Twitter API
API Ethics
Web Scraping: The Data Source of Last Resort
Carefully Scraping the MTA
Using Browser Inspection Tools
The Python Web Scraping Solution: Beautiful Soup
Conclusion
6. Assessing Data Quality
The Pandemic and the PPP
Assessing Data Integrity
Is It of Known Pedigree?
Is It Timely?
Is It Complete?
Is It Well-Annotated?
Is It High Volume?
Is It Consistent?
Is It Multivariate?
Is It Atomic?
Is It Clear?
Is It Dimensionally Structured?
Assessing Data Fit
Validity
Reliability
Representativeness
Conclusion
7. Cleaning, Transforming, and Augmenting Data
Selecting a Subset of Citi Bike Data
A Simple Split
Regular Expressions: Supercharged String Matching
Making a Date
De-crufting Data Files
Decrypting Excel Dates
Generating True CSVs from Fixed-Width Data
Correcting for Spelling Inconsistencies
The Circuitous Path to “Simple” Solutions
Gotchas That Will Get Ya!
Augmenting Your Data
Conclusion
8. Structuring and Refactoring Your Code
Revisiting Custom Functions
Will You Use It More Than Once?
Is It Ugly and Confusing?
Do You Just Really Hate the Default Functionality?
Understanding Scope
Defining the Parameters for Function “Ingredients”
What Are Your Options?
Getting Into Arguments?
Return Values
Climbing the “Stack”
Refactoring for Fun and Profit
A Function for Identifying Weekdays
Metadata Without the Mess
Documenting Your Custom Scripts and Functions with pydoc
The Case for Command-Line Arguments
Where Scripts and Notebooks Diverge
Conclusion
9. Introduction to Data Analysis
Context Is Everything
Same but Different
What’s Typical? Evaluating Central Tendency
What’s That Mean?
Embrace the Median
Think Different: Identifying Outliers
Visualization for Data Analysis
What’s Our Data’s Shape? Understanding Histograms
The Significance of Symmetry
Counting “Clusters”
The $2 Million Question
Proportional Response
Conclusion
10. Presenting Your Data
Foundations for Visual Eloquence
Making Your Data Statement
Charts, Graphs, and Maps: Oh My!
Pie Charts
Bar and Column Charts
Line Charts
Scatter Charts
Maps
Elements of Eloquent Visuals
The “Finicky” Details Really Do Make a Difference
Trust Your Eyes (and the Experts)
Selecting Scales
Choosing Colors
Above All, Annotate!
From Basic to Beautiful: Customizing a Visualization with seaborn and matplotlib
Beyond the Basics
Conclusion
11. Beyond Python
Additional Tools for Data Review
Spreadsheet Programs
OpenRefine
Additional Tools for Sharing and Presenting Data
Image Editing for JPGs, PNGs, and GIFs
Software for Editing SVGs and Other Vector Formats
Reflecting on Ethics
Conclusion
A. More Python Programming Resources
Official Python Documentation
Installing Python Resources
Where to Look for Libraries
Keeping Your Tools Sharp
Where to Learn More
B. A Bit More About Git
You Run git push/pull and End Up in a Weird Text Editor
Your git push/pull Command Gets Rejected
Run git pull
Git Quick Reference
C. Finding Data
Data Repositories and APIs
Subject Matter Experts
FOIA/L Requests
Custom Data Collection
D. Resources for Visualization and Information Design
Foundational Books on Information Visualization
The Quick Reference You’ll Reach For
Sources of Inspiration
Index
About the Author

--------------------------------------------------------------------------------

SIMILAR FREE PDFS

PRACTICAL PYTHON DATA WRANGLING AND DATA QUALITY: GETTING STARTED WITH READING,
CLEANING, AND ANALYZING DATA

 * 416 Pages
 * 2021

DATA WRANGLING WITH PYTHON

 * 2018

DATA WRANGLING WITH JAVASCRIPT

 * 430 Pages
 * 2019

PRACTICAL DATA ANALYSIS WITH PYTHON

 * 2015

DATA CLEANING

DATA CLEANING

 * 282 Pages
 * 2019

DATA CLEANING

HANDS-ON DATA ANALYSIS WITH PANDAS: EFFICIENTLY PERFORM DATA COLLECTION,
WRANGLING, ANALYSIS, AND VISUALIZATION USING PYTHON

 * 740 Pages
 * 2019

PYTHON FOR DATA ANALYSIS : DATA WRANGLING WITH PANDAS, NUMPY, AND IPYTHON

 * 2013

PYTHON FOR DATA ANALYSIS: DATA WRANGLING WITH PANDAS, NUMPY, AND IPYTHON

 * 550 Pages
 * 2017

ANALYZING SENSORY DATA WITH R

 * 2018

ANALYZING QUALITATIVE DATA WITH MAXQDA

ANALYZING BASEBALL DATA WITH R

 * 361 Pages
 * 2018

PRACTICAL DATA SCIENCE WITH PYTHON 3

 * 468 Pages
 * 2019

DATA SCIENCE WITH PYTHON

 * 1,255 Pages
 * 2016

GETTING STARTED WITH DATA SCIENCE: MAKING SENSE OF DATA WITH ANALYTICS

 * 2015


Disclaimer: ZLIB is a PDF web search tool for freely accessible PDF archives on
the Internet. We do not host any documents on our server. If you have an inquiry
or need content listed here removed, please contact us at
zlibpub[at]protonmail.com.

© ZLIB all rights reserved 2024.