www.packtpub.com Open in urlscan Pro
2606:4700:10::ac43:15a2  Public Scan

Submitted URL: https://packt.omeclk.com/portal/wts/ug^cnN2cyLec6q6AmgEfnDa8eE8zmq4rhkrt7yCrCka
Effective URL: https://www.packtpub.com/en-us/product/python-feature-engineering-cookbook-9781835883587?utm_source=PythonPro+Newsletter&...
Submission: On September 17 via api from SA — Scanned from DE

Form analysis 3 forms found in the DOM

https://www.packtpub.com/en-us/search

<form class="search desktop" data-search="https://www.packtpub.com/api/rebuild/header/search" data-method="POST" action="https://www.packtpub.com/en-us/search">
  <svg role="presentation" class="icon icon-5 search-box-icon">
    <use href="https://www.packtpub.com/rebuild/build/assets/common-CwvaZMrJ.svg#search"></use>
    <desc>Search icon</desc>
  </svg>
  <input type="text" name="query" class="search-input" placeholder="Search..." id="search" autocomplete="off">
  <span class="loader d-none"></span>
  <svg role="presentation" class="search-close d-none">
    <use href="https://www.packtpub.com/rebuild/build/assets/common-CwvaZMrJ.svg#close"></use>
    <desc>Close icon</desc>
  </svg>
  <div class="search-results d-none scrollbar" id="results"></div>
</form>

GET https://www.packtpub.com/checkout/add/ebook/9781835883587

<form action="https://www.packtpub.com/checkout/add/ebook/9781835883587" method="get">
  <button id="product-buy-now" type="submit" class="rebuild-btn rebuild-btn-primary" data-analytics-type="add_to_cart" data-analytics-currency="USD" data-analytics-item-id="US-9781835883594-EBOOK"
    data-analytics-item-title="Python Feature Engineering Cookbook - Third Edition" data-analytics-item-language="" data-analytics-item-framework="" data-analytics-item-concept="" data-analytics-item-publication-year="2024"
    data-analytics-item-quantity="1" data-analytics-item-index="0" data-analytics-item-format="ebook" data-analytics-item-price="24.99" data-analytics-item-discount="11"> Buy Now </button>
</form>

GET https://www.packtpub.com/checkout/add/ebook/9781835883587

<form action="https://www.packtpub.com/checkout/add/ebook/9781835883587" method="get">
  <button id="product-buy-now" type="submit" class="rebuild-btn rebuild-btn-primary" data-analytics-type="add_to_cart" data-analytics-currency="USD" data-analytics-item-id="US-9781835883594-EBOOK"
    data-analytics-item-title="Python Feature Engineering Cookbook - Third Edition" data-analytics-item-language="" data-analytics-item-framework="" data-analytics-item-concept="" data-analytics-item-publication-year="2024"
    data-analytics-item-quantity="1" data-analytics-item-index="0" data-analytics-item-format="ebook" data-analytics-item-price="24.99" data-analytics-item-discount="11"> Buy Now </button>
</form>

Text Content


PYTHON FEATURE ENGINEERING COOKBOOK: A COMPLETE GUIDE TO CRAFTING POWERFUL
FEATURES FOR YOUR MACHINE LEARNING MODELS, THIRD EDITION

By Soledad Galli
$24.99 $35.99
Book Aug 2024 396 pages 3rd Edition
eBook
$24.99 $35.99
Print
$32.99 $44.99
Subscription
Free Trial
Renews at $19.99p/m


WHAT DO YOU GET WITH EBOOK?

 * Instant access to your Digital eBook purchase
 * Download this book in EPUB and PDF formats
 * Access this title in our online reader with advanced features
 * DRM FREE - Read whenever, wherever and however you want



KEY BENEFITS

 * Craft powerful features from tabular, transactional, and time-series data
 * Develop efficient and reproducible real-world feature engineering pipelines
 * Optimize data transformation and save valuable time
 * Purchase of the print or Kindle book includes a free PDF eBook


DESCRIPTION

Streamline data preprocessing and feature engineering in your machine learning
projects with this third edition of the Python Feature Engineering Cookbook. This
guide addresses common challenges, such as imputing missing values and encoding
categorical variables, using practical solutions and open source Python
libraries. You’ll learn
advanced techniques for transforming numerical variables, discretizing
variables, and dealing with outliers. Each chapter offers step-by-step
instructions and real-world examples, helping you understand when and how to
apply various transformations for well-prepared data. The book explores feature
extraction from complex data types such as dates, times, and text. You’ll see
how to create new features through mathematical operations and decision trees
and use advanced tools like Featuretools and tsfresh to extract features from
relational data and time series. By the end, you’ll be ready to build
reproducible feature engineering pipelines that can be easily deployed into
production, optimizing data preprocessing workflows and enhancing machine
learning model performance.


WHAT YOU WILL LEARN

 * Discover multiple methods to impute missing data effectively
 * Encode categorical variables while tackling high cardinality
 * Find out how to properly transform, discretize, and scale your variables
 * Automate feature extraction from date and time data
 * Combine variables strategically to create new and powerful features
 * Extract features from transactional data and time series
 * Learn methods to extract meaningful features from text data
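
As a rough illustration of the first two topics in this list, the sketch below shows median imputation and one-hot encoding in plain Python. It is not taken from the book, which relies on open source Python libraries; the function names `impute_median` and `one_hot` and the sample data are hypothetical and chosen only for this example.

```python
from statistics import median


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def one_hot(values):
    """Turn each category into a dict of 0/1 indicator columns."""
    categories = sorted(set(values))
    return [{c: int(v == c) for c in categories} for v in values]


# Hypothetical sample data: a numerical column with gaps and a categorical column.
ages = [22, None, 35, 41, None]
colors = ["red", "blue", "red"]

imputed_ages = impute_median(ages)
encoded_colors = one_hot(colors)

print(imputed_ages)
print(encoded_colors)
```

In practice the fill value and the category list would be learned from the training data only and then reused on new data, which is the role a reproducible pipeline plays.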


PRODUCT DETAILS

Publication date: Aug 30, 2024
Length: 396 pages
Edition: 3rd Edition
Language: English
ISBN-13: 9781835883587
Category: Data
Languages: Python
Concepts: Data Science



PACKT SUBSCRIPTIONS

See our plans and pricing

$19.99 billed monthly
 * Unlimited access to Packt's library of 7,000+ practical books and videos
 * Constantly refreshed with 50+ new titles a month
 * Exclusive Early access to books as they're written
 * Solve problems while you work with advanced search and reference features
 * Offline reading on the mobile app
 * Simple pricing, no contract

$199.99 billed annually
 * Unlimited access to Packt's library of 7,000+ practical books and videos
 * Constantly refreshed with 50+ new titles a month
 * Exclusive Early access to books as they're written
 * Solve problems while you work with advanced search and reference features
 * Offline reading on the mobile app
 * Choose a DRM-free eBook or Video every month to keep
 * PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
 * Exclusive print discounts

$279.99 billed in 18 months
 * Unlimited access to Packt's library of 7,000+ practical books and videos
 * Constantly refreshed with 50+ new titles a month
 * Exclusive Early access to books as they're written
 * Solve problems while you work with advanced search and reference features
 * Offline reading on the mobile app
 * Choose a DRM-free eBook or Video every month to keep
 * PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
 * Exclusive print discounts


TABLE OF CONTENTS

14 Chapters

Preface
 * Who this book is for
 * What this book covers
 * To get the most out of this book
 * Conventions used
 * Sections
 * Get in touch
 * Share Your Thoughts

Chapter 1: Imputing Missing Data
 * Technical requirements
 * Removing observations with missing data
 * Performing mean or median imputation
 * Imputing categorical variables
 * Replacing missing values with an arbitrary number
 * Finding extreme values for imputation
 * Marking imputed values
 * Implementing forward and backward fill
 * Carrying out interpolation
 * Performing multivariate imputation by chained equations
 * Estimating missing data with nearest neighbors

Chapter 2: Encoding Categorical Variables
 * Technical requirements
 * Creating binary variables through one-hot encoding
 * Performing one-hot encoding of frequent categories
 * Replacing categories with counts or the frequency of observations
 * Replacing categories with ordinal numbers
 * Performing ordinal encoding based on the target value
 * Implementing target mean encoding
 * Encoding with Weight of Evidence
 * Grouping rare or infrequent categories
 * Performing binary encoding

Chapter 3: Transforming Numerical Variables
 * Transforming variables with the logarithm function
 * Transforming variables with the reciprocal function
 * Using the square root to transform variables
 * Using power transformations
 * Performing Box-Cox transformations
 * Performing Yeo-Johnson transformations

Chapter 4: Performing Variable Discretization
 * Technical requirements
 * Performing equal-width discretization
 * Implementing equal-frequency discretization
 * Discretizing the variable into arbitrary intervals
 * Performing discretization with k-means clustering
 * Implementing feature binarization
 * Using decision trees for discretization

Chapter 5: Working with Outliers
 * Technical requirements
 * Visualizing outliers with boxplots and the inter-quartile proximity rule
 * Finding outliers using the mean and standard deviation
 * Using the median absolute deviation to find outliers
 * Removing outliers
 * Bringing outliers back within acceptable limits
 * Applying winsorization

Chapter 6: Extracting Features from Date and Time Variables
 * Technical requirements
 * Extracting features from dates with pandas
 * Extracting features from time with pandas
 * Capturing the elapsed time between datetime variables
 * Working with time in different time zones
 * Automating the datetime feature extraction with Feature-engine

Chapter 7: Performing Feature Scaling
 * Technical requirements
 * Standardizing the features
 * Scaling to the maximum and minimum values
 * Scaling with the median and quantiles
 * Performing mean normalization
 * Implementing maximum absolute scaling
 * Scaling to vector unit length

Chapter 8: Creating New Features
 * Technical requirements
 * Combining features with mathematical functions
 * Comparing features to reference variables
 * Performing polynomial expansion
 * Combining features with decision trees
 * Creating periodic features from cyclical variables
 * Creating spline features

Chapter 9: Extracting Features from Relational Data with Featuretools
 * Technical requirements
 * Setting up an entity set and creating features automatically
 * Creating features with general and cumulative operations
 * Combining numerical features
 * Extracting features from date and time
 * Extracting features from text
 * Creating features with aggregation primitives

Chapter 10: Creating Features from a Time Series with tsfresh
 * Technical requirements
 * Extracting hundreds of features automatically from a time series
 * Automatically creating and selecting predictive features from time-series data
 * Extracting different features from different time series
 * Creating a subset of features identified through feature selection
 * Embedding feature creation into a scikit-learn pipeline

Chapter 11: Extracting Features from Text Variables
 * Technical requirements
 * Counting characters, words, and vocabulary
 * Estimating text complexity by counting sentences
 * Creating features with bag-of-words and n-grams
 * Implementing term frequency-inverse document frequency
 * Cleaning and stemming text variables

Index
 * Why subscribe?

Other Books You May Enjoy
 * Packt is searching for authors like you
 * Share Your Thoughts
 * Download a free PDF copy of this book



ABOUT THE AUTHOR

Soledad Galli is a bestselling data science instructor, author, and open-source
Python developer. As the leading instructor at Train in Data, she teaches
intermediate and advanced courses in machine learning that have enrolled over
64,000 students worldwide and continue to receive positive reviews. Sole is also
the developer and maintainer of the Python open-source library Feature-engine,
which provides an extensive array of methods for feature engineering and
selection. With extensive experience as a data scientist in the finance and
insurance sectors, Sole has developed and deployed machine learning models for
assessing insurance claims, evaluating credit risk, and preventing fraud. She is
a frequent speaker on podcasts and at meetups and webinars, sharing her expertise
with the broader data science community.


FAQS

How do I buy and download an eBook?

Where there is an eBook version of a title available, you can buy it from the
book details for that title. Add either the standalone eBook or the eBook and
print book bundle to your shopping cart. Your eBook will show in your cart as a
product on its own. After completing checkout and payment in the normal way, you
will receive your receipt on the screen containing a link to a personalised PDF
download file. This link will remain active for 30 days. You can download backup
copies of the file by logging in to your account at any time.

If you already have Adobe reader installed, then clicking on the link will
download and open the PDF file directly. If you don't, then save the PDF file on
your machine and download the Reader to view it.

Please Note: Packt eBooks are non-returnable and non-refundable.

Packt eBook and Licensing: When you buy an eBook from Packt Publishing,
completing your purchase means you accept the terms of our licence agreement.
Please read the full text of the agreement. In it we have tried to balance the
need for the eBook to be usable for you, the reader, with our need to protect the
rights of us as Publishers and of our authors. In summary, the agreement says:

 * You may make copies of your eBook for your own use onto any machine
 * You may not pass copies of the eBook on to anyone else

How can I make a purchase on your website?

If you want to purchase a video course, eBook or Bundle (Print+eBook), please
follow the steps below:

 1. Register on our website using your email address and a password.
 2. Search for the title by name or ISBN using the search option.
 3. Select the title you want to purchase.
 4. Choose the format you wish to purchase the title in; if you order the Print
    Book, you get a free eBook copy of the same title.
 5. Proceed with the checkout process (payment can be made using Credit Card,
    Debit Card, or PayPal).

Where can I access support around an eBook?

 * If you experience a problem with using or installing Adobe Reader, then
   contact Adobe directly.
 * To view the errata for the book, see www.packtpub.com/support and view the
   pages for the title you have.
 * To view your account details or to download a new copy of the book, go
   to www.packtpub.com/account
 * To contact us directly if a problem is not resolved,
   use www.packtpub.com/contact-us

What eBook formats does Packt support?

Our eBooks are currently available in a variety of formats such as PDF and
ePub. In the future, this may well change with trends and developments in
technology, but please note that our PDFs are not in Adobe eBook Reader format,
which has greater restrictions on security.

You will need to use Adobe Reader v9 or later in order to read Packt's PDF
eBooks.

What are the benefits of eBooks?

 * You can get the information you need immediately
 * You can easily take them with you on a laptop
 * You can download them an unlimited number of times
 * You can print them out
 * They are copy-paste enabled
 * They are searchable
 * There is no password protection
 * They are priced lower than print
 * They save resources and space

What is an eBook?

Packt eBooks are a complete electronic version of the print edition, available
in PDF and ePub formats. Every piece of content down to the page numbering is
the same. Because we save the costs of printing and shipping the book to you, we
are able to offer eBooks at a lower cost than print editions.

When you have purchased an eBook, simply log in to your account and click on the
link in Your Download Area. We recommend saving the file to your hard drive
before opening it.

For optimal viewing of our eBooks, we recommend you download and install the
free Adobe Reader version 9.

Company Address: Packt Publishing Ltd, Grosvenor House, 11 St Paul's Square,
Birmingham, B3 1RB © 2024 Packt Publishing Limited All Rights Reserved
