dokumen.pub
2606:4700:3036::ac43:8aa9  Public Scan

Submitted URL: https://dokumen.pub/download/the-art-of-seo-mastering-search-engine-optimization-4nbsped-1098102614-9781098102616-v-...
Effective URL: https://dokumen.pub/the-art-of-seo-mastering-search-engine-optimization-4nbsped-1098102614-9781098102616-v-4590168.html
Submission: On October 24 via manual from US — Scanned from DE

Form analysis: 2 forms found in the DOM

POST https://dokumen.pub/report/the-art-of-seo-mastering-search-engine-optimization-4nbsped-1098102614-9781098102616-v-4590168

<form role="form" method="post" action="https://dokumen.pub/report/the-art-of-seo-mastering-search-engine-optimization-4nbsped-1098102614-9781098102616-v-4590168">
  <div class="modal-header">
    <h5 class="modal-title pull-left">Report "The Art of SEO: Mastering Search Engine Optimization [4&nbsp;ed.] 1098102614, 9781098102616"</h5>
    <button type="button" class="btn btn-dark pull-right" data-dismiss="modal" aria-hidden="true">×</button>
  </div>
  <div class="modal-body">
    <div class="form-group">
      <input type="text" name="name" required="required" class="form-control border" placeholder="Enter your name">
    </div>
    <div class="form-group">
      <input type="email" name="email" required="required" class="form-control border" placeholder="Enter your Email">
    </div>
    <div class="form-group">
      <select name="reason" required="required" class="form-control border">
        <option value="">--- Select Reason ---</option>
        <option value="pornographic">Pornographic</option>
        <option value="defamatory">Defamatory</option>
        <option value="illegal">Illegal/Unlawful</option>
        <option value="spam">Spam</option>
        <option value="others">Other Terms Of Service Violation</option>
        <option value="copyright">File a copyright complaint</option>
      </select>
    </div>
    <div class="form-group">
      <textarea name="description" required="required" rows="6" class="form-control border" placeholder="Enter description"></textarea>
    </div>
    <div class="form-group">
      <div class="d-inline-block">
        <div class="g-recaptcha" data-sitekey="6Ld9bsMUAAAAAFUJ3kb3qtJCvEbX7XEDp18HE5iQ">
          <div style="width: 304px; height: 78px;">
            <div><iframe title="reCAPTCHA" width="304" height="78" role="presentation" name="a-y657rsp53y6k" frameborder="0" scrolling="no"
                sandbox="allow-forms allow-popups allow-same-origin allow-scripts allow-top-navigation allow-modals allow-popups-to-escape-sandbox allow-storage-access-by-user-activation"
                src="https://www.google.com/recaptcha/api2/anchor?ar=1&amp;k=6Ld9bsMUAAAAAFUJ3kb3qtJCvEbX7XEDp18HE5iQ&amp;co=aHR0cHM6Ly9kb2t1bWVuLnB1Yjo0NDM.&amp;hl=de&amp;v=lqsTZ5beIbCkK4uGEGv9JmUR&amp;size=normal&amp;cb=1a7c94czji2"></iframe></div>
            <textarea id="g-recaptcha-response" name="g-recaptcha-response" class="g-recaptcha-response"
              style="width: 250px; height: 40px; border: 1px solid rgb(193, 193, 193); margin: 10px 25px; padding: 0px; resize: none; display: none;"></textarea>
          </div><iframe style="display: none;"></iframe>
        </div>
      </div>
    </div>
    <script src="https://www.google.com/recaptcha/api.js"></script>
  </div>
  <div class="modal-footer">
    <button type="button" class="btn btn-danger pull-left" data-dismiss="modal">Close</button>
    <button type="submit" class="btn btn-primary pull-right">Submit</button>
  </div>
</form>
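
For reference, the report form above reduces to a POST of five named fields plus a reCAPTCHA token. Below is a minimal sketch of that request shape, assuming Python with the third-party requests library; the reCAPTCHA value is left as a placeholder, since a real token is only issued by the embedded Google widget after user verification in the browser.

    # Rough sketch of the POST this report form sends (field names taken from the markup above).
    # Assumes the third-party "requests" library; the reCAPTCHA token is a placeholder.
    import requests

    REPORT_URL = ("https://dokumen.pub/report/"
                  "the-art-of-seo-mastering-search-engine-optimization-4nbsped-"
                  "1098102614-9781098102616-v-4590168")

    payload = {
        "name": "Jane Doe",                       # <input name="name">
        "email": "jane@example.com",              # <input name="email">
        "reason": "copyright",                    # one of the <select name="reason"> option values
        "description": "DMCA takedown request.",  # <textarea name="description">
        "g-recaptcha-response": "PLACEHOLDER",    # normally filled in by the reCAPTCHA widget
    }

    response = requests.post(REPORT_URL, data=payload, timeout=30)
    print(response.status_code)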

POST https://dokumen.pub/newsletter

<form action="https://dokumen.pub/newsletter" method="post">
  <div id="newsletter" class="w-100">
    <h3>Subscribe to our newsletter</h3>
    <hr>
    <p>Be the first to receive exclusive offers and the latest news on our products and services directly in your inbox.</p>
    <div class="input-group">
      <input type="text" placeholder="Enter your E-mail" name="newsletter_email" id="newsletter_email" class="form-control" required="required">
      <span class="input-group-btn">
        <button class="btn btn-primary" type="submit"><i class="fa fa-bell-o mr-1"></i>Subscribe</button>
      </span>
    </div>
  </div>
</form>
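
The newsletter form is simpler still: a single-field POST to the /newsletter endpoint. A minimal sketch under the same assumptions:

    # Minimal sketch of the newsletter subscription POST, again assuming the "requests" library.
    import requests

    resp = requests.post(
        "https://dokumen.pub/newsletter",
        data={"newsletter_email": "jane@example.com"},  # <input name="newsletter_email">
        timeout=30,
    )
    print(resp.status_code)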

Text Content

 * Log in
 * Register
 * Deutsch
    * English
    * Español
    * Português
    * Français

 * Home
 * Top categories
   * CAREER & MONEY
   * PERSONAL GROWTH
   * POLITICS & CURRENT AFFAIRS
   * SCIENCE & TECH
   * HEALTH & FITNESS
   * LIFESTYLE
   * ENTERTAINMENT
   * BIOGRAPHIES & HISTORY
   * FICTION
 * Top stories
 * Add story
 * My stories



THE ART OF SEO: MASTERING SEARCH ENGINE OPTIMIZATION [4 ED.] 1098102614,
9781098102616

Three acknowledged experts in search engine optimization share guidelines and
innovative techniques that will help you plan and execute a comprehensive SEO strategy.

9,665 1,895 31MB

Language: English | Pages: 773 [775] | Year: 2023

Report DMCA / Copyright

DOWNLOAD FILE



RECOMMENDED STORIES

--------------------------------------------------------------------------------

THE ART OF SEO: MASTERING SEARCH ENGINE OPTIMIZATION 9781098102616, 1098102614

Three acknowledged experts in search engine optimization share guidelines and
innovative techniques that will help you plan and execute a comprehensive SEO strategy.

7,056 1,512 28MB Read more

THE ART OF SEO: MASTERING SEARCH ENGINE OPTIMIZATION [4 ED.] 1098102614,
9781098102616

Three acknowledged experts in search engine optimization share guidelines and
innovative techniques that will help you plan and execute a comprehensive SEO strategy.

552 193 33MB Read more

TEACH YOURSELF VISUALLY SEARCH ENGINE OPTIMIZATION (SEO) 9781118675373,
9781118470664

The perfect guide to help visual learners maximize website discoverability
Whether promoting yourself, your business, or

280 160 36MB Read more

THE SEO BOOK: SEARCH ENGINE OPTIMIZATION 2020, FREE SEO AUDIT INCL., WAY TO NR.
1, SEO AND SEM

⭐ Learn how it is possible for websites to rank # 1 on Google. ⭐ Easy step by
step instructions to significantly increa

2,276 360 2MB Read more

SEO WORKBOOK: SEARCH ENGINE OPTIMIZATION SUCCESS IN SEVEN STEPS (2021 SEO)



3,072 685 32MB Read more

THE SEO BOOK: SEARCH ENGINE OPTIMIZATION 2020, FREE SEO AUDIT INCL., WAY TO NR.
1, SEO AND SEM

⭐ Learn how it is possible for websites to rank # 1 on Google. ⭐ Easy step by
step instructions to significantly increa

1,304 150 2MB Read more

SEO LIKE I’M 5: THE ULTIMATE BEGINNER’S GUIDE TO SEARCH ENGINE OPTIMIZATION



2,175 235 3MB Read more

SEO 2023: LEARN SEARCH ENGINE OPTIMIZATION WITH SMART INTERNET MARKETING
STRATEGIES

Learn SEO and rank at the top of Google with SEO 2023—beginner to advanced!
Recently updated - EXPANDED & UPDATED AU

1,451 286 2MB Read more

SEO FOR DENTISTS : SEARCH ENGINE OPTIMIZATION FOR DENTIST, ORTHODONTIST &
ENDODONTIST WEBSITES (SEO FOR BUSINESS OWNERS AND WEB DEVELOPERS)

Do you have a dentist or orthodontist website that looks professional, but isn’t
bringing in customers? The third book

285 95 4MB Read more

SEO FOR DENTISTS : SEARCH ENGINE OPTIMIZATION FOR DENTIST, ORTHODONTIST &
ENDODONTIST WEBSITES (SEO FOR BUSINESS OWNERS AND WEB DEVELOPERS)

Do you have a dentist or orthodontist website that looks professional, but isn’t
bringing in customers? The third book

313 144 4MB Read more

 * Author / Uploaded
 * Eric Enge
 * Stephan Spencer
 * Jessie Stricchiola

 * Categories
 * Business
 * Marketing: Advertising






 * Commentary
 * Publisher's PDF | Published: September 2023 | Revision History: 2023-08-30:
   First Release

Table of contents:
Copyright
Table of Contents
Foreword
Preface
Who This Book Is For
How This Book Is Organized
Why Us?
Conventions Used in This Book
O’Reilly Online Learning
How to Contact Us
Acknowledgments
1. Search: Reflecting Consciousness and Connecting Commerce
Is This Book for You?
SEO Myths Versus Reality
The Mission of Search Engines
Goals of Searching: The User’s Perspective
Determining User Intent: A Challenge for Search Marketers and Search Engines
Navigational Queries
Informational Queries
Transactional Queries
Local Queries
Searcher Intent
How Users Search
How Search Engines Drive Ecommerce
Types of Search Traffic
Search Traffic by Device Type
More on the Makeup of SERPs
The Role of AI and Machine Learning
Using Generative AI for Content Generation
SEO as a Career
Conclusion
2. Generative AI and Search
A Brief Overview of Artificial Intelligence
More About Large Language Models
Generative AI Solutions
Generative AI Capabilities
Prompt Generation (a.k.a. Prompt Engineering)
Generative AI Challenges
Conclusion
3. Search Fundamentals
Deconstructing Search
The Language of Search
Word Order and Phrases
Search Operators
Vertical and Local Intent
Crawling
The Index
The Knowledge Graph
Vertical Indexes
Private Indexes
The Search Engine Results Page
Organic Results
Special Features
Query Refinements and Autocomplete
Search Settings, Filters, and Advanced Search
Ranking Factors
Relevance
AI/Machine Learning’s Impact on Relevance
EEAT
Local Signals and Personalization
Timing and Tenure
Legitimacy
Source Diversity
Keywords in Anchor Text
Negative Ranking Factors
User Behavior Data
Conclusion
4. Your SEO Toolbox
Spreadsheets
Traffic Analysis and Telemetry
Google Search Console
Server-Side Log Analysis
JavaScript Trackers
Tag Managers
Search Engine Tools and Features
Autocomplete
Google Ads Keyword Planner
Google Trends
Google News
Related
Search Operators
SEO Platforms
Semrush
Ahrefs
Searchmetrics
Moz Pro
Rank Ranger
Other Platforms
Automation
YouTube Optimization
Conclusion
5. SEO Planning
Strategy Before Tactics
The Business of SEO
Ethical and Moral Considerations
The Escape Clause
Deciding on Accepting Work
Typical Scenarios
Startups (Unlaunched)
Startups (Launched)
Established Small Businesses
Large Corporations
Initial Triage
Document Previous SEO Work
Look for Black Hat SEO Efforts
Watch for Site Changes That Can Affect SEO
Identify Technical Problems
Know Your Client
Take Inventory of the Client’s Relevant Assets
Perform a Competitive Analysis
Information Architecture
SEO Content Strategy
The Long Tail of Search
Examples of Sites That Create Long-Tail Content
Why Content Breadth and Depth Matter
Can Generative AI Solutions Help Create Content?
Measuring Progress
Conclusion
6. Keyword Research
The Words and Phrases That Define Your Business
The Different Phases of Keyword Research
Expanding Your Domain Expertise
Building Your Topics List
Preparing Your Keyword Plan Spreadsheet
Internal Resources for Keyword Research
Web Logs, Search Console Reports, and Analytics
Competitive Analysis
People
External Resources for Keyword Research
Researching Natural Language Questions
Researching Trends, Topics, and Seasonality
Keyword Valuation
Importing Keyword Data
Evaluating Relevance
Priority Ratings for Business Objectives
Filtering Out Low-Traffic Keywords
Breaking Down High-Difficulty Keywords
Trending and Seasonality
Current Rank Data
Finding the Best Opportunities
Acting on Your Keyword Plan
Periodic Keyword Reviews
Conclusion
7. Developing an SEO-Friendly Website
Making Your Site Accessible to Search Engines
Content That Can Be Indexed
Link Structures That Can Be Crawled
XML Sitemaps
Creating an Optimal Information Architecture
The Importance of a Logical, Category-Based Flow
Site Architecture Design Principles
Flat Versus Deep Architecture
Search-Friendly Site Navigation
Root Domains, Subdomains, and Microsites
When to Use a Subfolder
When to Use a Subdomain
When to Use a Separate Root Domain
Microsites
Selecting a TLD
Optimization of Domain Names/URLs
Optimizing Domains
Picking the Right URLs
Keyword Targeting
Title Tags
Meta Description Tags
Heading Tags
Document Text
Image Filenames and alt Attributes
Visual Search
Boldface and Italicized Text
Keyword Cannibalization
Keyword Targeting in CMSs and Automatically Generated Content
Effective Keyword Targeting by Content Creators
Long-Tail Keyword Targeting
Content Optimization
Content Structure
CSS and Semantic Markup
Content Uniqueness and Depth
Content Themes
Duplicate Content Issues
Consequences of Duplicate Content
How Search Engines Identify Duplicate Content
Copyright Infringement
Example Actual Penalty Situations
How to Avoid Duplicate Content on Your Own Site
Controlling Content with Cookies and Session IDs
What’s a Cookie?
What Are Session IDs?
How Do Search Engines Interpret Cookies and Session IDs?
Why Would You Want to Use Cookies or Session IDs to Control Search Engine
Access?
Content Delivery and Search Spider Control
Cloaking and Segmenting Content Delivery
Reasons for Showing Different Content to Search Engines and Visitors
Leveraging the robots.txt File
Using the rel=“nofollow” Attribute
Using the Robots Meta Tag
Using the rel=“canonical” Attribute
Additional Methods for Segmenting Content Delivery
Redirects
Why and When to Redirect
Good and Bad Redirects
Methods for URL Redirecting and Rewriting
How to Redirect a Home Page Index File Without Looping
Using a Content Management System
CMS Selection
Third-Party CMS or Ecommerce Platform Add-ons
CMS and Ecommerce Platform Training
JavaScript Frameworks and Static Site Generators
Types of Rendering
JavaScript Frameworks
Jamstack
Problems That Still Happen with JavaScript
Best Practices for Multilingual/Multicountry Targeting
When to Enable a New Language or Country Version of Your Site
When to Target a Language or Country with a Localized Website Version
Configuring Your Site’s Language or Country Versions to Rank in Different
Markets
The Impact of Natural Language Processing
Entities
Fair Use
Structured Data
Schema.org
Schema.org Markup Overview
How to Use Schema.org
Summarizing Schema.org’s Importance
Google’s EEAT and YMYL
Author Authority and Your Content
Why Did Google End Support for rel=“author”?
Is Author Authority Dead for Google?
Author Authority and EEAT
Author Authority Takeaways
Google Page Experience
Use of Interstitials and Dialogs
Mobile-Friendliness
Secure Web Pages (HTTPS and TLS)
Core Web Vitals
How Much of a Ranking Factor Are Core Web Vitals?
Using Tools to Measure Core Web Vitals
Optimizing Web Pages for Performance
Approach to Rendering Pages
Server Configuration
Ecommerce/CMS Selection and Configuration
Analytics/Trackers
User Location and Device Capabilities
Domain Changes, Content Moves, and Redesigns
The Basics of Moving Content
Large-Scale Content Moves
Mapping Content Moves
Expectations for Content Moves
Maintaining Search Engine Visibility During and After a Site Redesign
Maintaining Search Engine Visibility During and After Domain Name Changes
Changing Servers
Changing URLs to Include Keywords in Your URL
Accelerating Discovery of Large-Scale Site Changes
Conclusion
8. SEO Analytics and Measurement
Why Measurement Is Essential in SEO
Analytics Data Utilization: Baseline, Track, Measure, Refine
Measurement Challenges
Analytics Tools for Measuring Search Traffic
Valuable SEO Data in Web Analytics
Referring Domains, Pages, and Sites
Event Tracking
Connecting SEO and Conversions
Attribution
Segmenting Campaigns and SEO Efforts by Conversion Rate
Increasing Conversions
Calculating SEO Return on Investment
Diagnostic Search Metrics
Site Indexing Data
Index-to-Crawl Ratio
Search Visitors per Crawled Page
Free SEO-Specific Analytics Tools from Google and Bing
Using GA4 and GSC Together
Differences in How GA4 and GSC Handle Data
Differences Between Metrics and Dimensions in GA4 and GSC
First-Party Data and the Cookieless Web
Conclusion
9. Google Algorithm Updates and Manual Actions/Penalties
Google Algorithm Updates
BERT
Passages and Subtopics
MUM
Page Experience and Core Web Vitals
The Link Spam Update
The Helpful Content Update
Broad Core Algorithm Updates
Functionality Changes
Google Bug Fixes
Google Search Console
Google Webmaster Guidelines
Practices to Avoid
Good Hygiene Practices to Follow
Quality Content
Content That Google Considers Lower Quality
The Importance of Content Diversity
The Role of Authority in Ranking Content
The Impact of Weak Content on Rankings
Quality Links
Links Google Does Not Like
Cleaning Up Toxic Backlinks
Sources of Data for Link Cleanup
Using Tools for Link Cleanup
The Disavow Links Tool
Google Manual Actions (Penalties)
Types of Manual Actions/Penalties
Security Issues
Diagnosing the Cause of Traffic/Visibility Losses
Filing Reconsideration Requests to Remediate Manual Actions/Penalties
Recovering from Traffic Losses Not Due to a Manual Action/Penalty
Conclusion
10. Auditing and Troubleshooting
SEO Auditing
Unscheduled Audits
Customized Approaches to Audits
Pre-Audit Preparations
Additional SEO Auditing Tools
Core SEO Audit Process Summary
Sample SEO Audit Checklist
Auditing Backlinks
SEO Content Auditing
Troubleshooting
Pages Not Being Crawled
Page Indexing Problems
Duplicate Content
Broken XML Sitemaps
Validating Structured Data
Validating hreflang Tags
Local Search Problems
Missing Images
Missing alt Attributes for Images
Improper Redirects
Bad or Toxic External Links
Single URL/Section Ranking/Traffic Loss
Whole Site Ranking/Traffic Loss
Page Experience Issues
Thin Content
Poor-Quality Content
Content That Is Not Helpful to Users
Google Altering Your Title or Meta Description
Hidden Content
Conclusion
11. Promoting Your Site and Obtaining Links
Why People Link
Google’s View on Link Building
How Links Affect Traffic
Finding Authoritative, Relevant, Trusted Sites
Link Analysis Tools
Identifying the Influencers
Determining the Value of Target Sites
Creating a Content Marketing Campaign
Discuss Scalability
Audit Existing Content
Seek Organically Obtained Links
Researching Content Ideas and Types
Articles and Blog Posts
Videos
Research Reports, Papers, and Studies
Interactives
Collaborations with Other Organizations
Collaborations with Experts
Quizzes and Polls
Contests
Cause Marketing
Comprehensive Guides and In-Depth Content Libraries
Infographics
Tools, Calculators, or Widgets
Viral Marketing Content
Memes
Content Syndication
Social Media Posts
Creating Remarkable Content
Hiring Writers and Producers
Generating and Developing Ideas for Content Marketing Campaigns
Don’t Be a Troll
Don’t Spam, and Don’t Hire Spammers
Relationships and Outreach
Building Your Public Profile
Link Reclamation
Link Target Research and Outreach Services
Qualifying Potential Link Targets
The Basic Outreach Process
What Not to Do
Outbound Linking
Conclusion
12. Mobile, Local, and Vertical SEO
Defining Vertical, Local, and Mobile Search
Vertical Search
Local and Mobile Search
The Impact of Personalization
Journeys and Collections
How Local Is Local Search?
Vertical Search Overlap
Optimizing for Local Search
Where Are Local Search Results Returned?
Factors That Influence Local Search Results
Optimizing Google Business Profiles
Customer Reviews and Reputation Management
Understanding Your Local Search Performance
The Problem of Fake Competitors
Common Myths of Google Business Profile Listings
Optimizing On-Site Signals for Local
Optimizing Inbound Links for Local
Optimizing Citations for Local
Structured Data for Google Business Profile Landing Pages
Image Search
Image Optimization Tips
Image Sharing Sites
News Search
Google News
Measuring News Search Traffic
Video Search
YouTube
Google Videos and Universal Search
Conclusion
13. Data Privacy and Regulatory Trends
The Consumer Data Privacy Revolution
The IAB Sounds the Alarm
Privacy Legislation Overview
Third-Party Cookie Deprecation
Google’s Privacy Sandbox
Google’s Topics API
Google’s Protected Audience API
Google Analytics 4
Apple’s Consumer-Friendly Privacy Shift
Conclusion
14. SEO Research and Study
SEO Research and Analysis
SEO Forums
Websites
Subreddits and Slack Channels
Resources Published by Search Engines
Interpreting Commentary
SEO Testing
Analysis of Top-Ranking Sites and Pages
Analysis of Algorithmic Differences Across Various Search Platforms
The Importance of Experience
Competitive Analysis
Content Analysis
Internal Link Structure and Site Architecture
External Link Attraction Analysis
SEO Strategy Analysis
Competitive Analysis Summary
Using Competitive Link Analysis Tools
Using Search Engine–Supplied SEO Tools
Google Search Console
Bing Webmaster Tools
The SEO Industry on the Web
Social Networks
Conferences and Organizations
SEO and Search Marketing Training
Conclusion
15. An Evolving Art Form: The Future of SEO
The Ongoing Evolution of Search
The Growth of Search Complexity
Natural Language Processing
Entities and Entity Maps
Meeting Searcher Intent
More Searchable Content and Content Types
Crawling Improvements
New Content Sources
More Personalized and User-Influenced Search
User Experience
Growing Reliance on the Cloud
Increasing Importance of Local, Mobile, and Voice Search
Local Search
Voice Is Just a Form of Input
Increased Market Saturation and Competition
SEO as an Enduring Art Form
The Future of Semantic Search and the Knowledge Graph
Conclusion
Index

CITATION PREVIEW

--------------------------------------------------------------------------------

The Art of SEO
4th Edition

Mastering Search Engine Optimization

Eric Enge, Stephan Spencer & Jessie Stricchiola

/ Theory / In / Practice

The Art of SEO

Three acknowledged experts in search engine optimization share
guidelines and innovative techniques that will help you plan and execute a
comprehensive SEO strategy. Complete with an array of effective tactics from
basic to advanced, this fourth edition prepares digital marketers for 2024 and
beyond with updates on SEO tools, new search engine optimization methods, and
new tools such as generative AI that have reshaped the SEO landscape. Authors
Eric Enge, Stephan Spencer, and Jessie Stricchiola provide novices with a
thorough SEO education, while experienced SEO practitioners get an extensive
reference to support ongoing engagements.

“SEO is surely an art. But this book ups its value by showing that SEO is also a
flat-out science that can be learned and applied to great effect. We’d be
foolish not to benefit from its clear lessons.” —Dr. Robert Cialdini,
bestselling author of Influence

• Learn about the various intricacies and complexities of internet search
• Explore the underlying theory and inner workings of search engines and their algorithms
• Gain insights into the future of search, including the impact of AI
• Discover tools to track results and measure success
• Examine the effects of key Google algorithm updates
• Consider opportunities for visibility in local, video (including YouTube), image, and news search
• Build a competent SEO team with defined roles


Eric Enge is the president of Pilot Holding. He’s founded, built, and sold four
companies with an SEO focus and has worked with many of the world’s largest
brands. Stephan Spencer is founder of the SEO firm Netconcepts. His books include
Google Power Search and Social eCommerce. His clients have included Chanel,
Volvo, Sony, and Zappos. Jessie Stricchiola, founder of Alchemist Media, has been
a retained expert in over 100 litigation matters regarding internet platforms,
search engines, social media, and analytics.

Twitter: @oreillymedia linkedin.com/company/oreilly-media
youtube.com/oreillymedia

US $65.99 CAN $82.99 ISBN: 978-1-098-10261-6


Praise for The Art of SEO, 4th Edition

The Art of SEO is an absolute must-have for anyone wanting to learn and
implement SEO best practices and boost website rankings. Buy this book now!
—Kara Goldin, Founder, Hint and Author, The Wall Street Journal Bestseller
Undaunted

SEO can be a minefield if you’re not armed with the knowledge and resources to
safely navigate this ever-changing landscape. Think of The Art of SEO as your
field survival manual. Simply put, don’t leave home without it.

The Art of SEO is THE must-read reference of the SEO industry. If you’re serious
about learning SEO, it’s a must-have. —Aleyda Solis, SEO Consultant and Founder,
Orainti

I’d highly recommend this comprehensive overview of the SEO world. Written by
veteran practitioners who’ve seen and done it all, it’s essential to have a copy
near your workspace. With the continuing rapid development of the industry,
keeping up-to-date is more important than ever.

—Jamie Salvatori, Founder, Vat19

—Josh Greene, CEO, The Mather Group

The Art of SEO is an in-depth reference to SEO as a discipline. In this 4th
edition, industry veterans Eric Enge, Stephan Spencer, and Jessie Stricchiola
once again explain the importance of SEO and craft a step-by-step manual on
how to take your website to new heights. This is a must-read for anyone in the
marketing and digital space.

I’d recommend The Art of SEO, 4th edition, to anyone looking to improve their
website’s SEO. This comprehensive book covers everything from the basics of
search fundamentals, keyword research, and on-page optimization to more advanced
techniques like link building, analytics, and dealing with manual penalties. The
authors use clear and concise language to explain complex concepts, making it
easy for even beginners to understand and implement SEO strategies effectively.
One of the things I appreciated most about this book was the emphasis on ethical
SEO practices. The authors stress the importance of creating high-quality
content that meets the needs of users rather than relying on black hat tactics
that can ultimately harm your website’s rankings. Overall, this book is a
valuable resource for anyone looking to improve their website’s visibility and
drive more organic traffic.

—Nir Eyal, Bestselling Author, Indistractable and Hooked

The Art of SEO is THE textbook for SEO. In our complex marketing world, SEO
is more important than ever, and this is the book that every SEO practitioner,
executive and technologist must read in order to understand how to build an
effective SEO channel. To many businesses, an SEO channel will be their most
profitable source of revenue and reading this book cover to cover will give a
serious head start in unlocking growth. —Eli Schwartz, Author, Product-Led SEO
and Growth Consultant

—Jason Hennessey, Founder and CEO, Hennessey Digital and Author, Honest SEO

This book is a must-read for anyone as it provides invaluable insights into why
SEO is essential for success and expertise on how to get the results you need.
Written by some of the top SEO experts in the industry, The Art of SEO is the
ultimate resource for anyone looking to maximize their online visibility. —Rand
Fishkin, Cofounder, SparkToro, Author, Lost and Founder and Founder, Moz

The one book that every SEO needs to have. More than just an introductory text,
The Art
of SEO contains detailed insight and tactics to help every website succeed in
organic search. This 4th edition clarifies the latest AI and machine learning
developments that are driving the future of search and SEO. —Barry Adams, SEO
Consultant, Polemic Digital

As I read through the book, I realized I’ve seen
many of the proposed solutions and the same messaging coming from other SEOs
over the years. It’s amazing how impactful this book has been to so many SEOs
and how its teachings have proliferated in our industry. —Patrick Stox, Product
Advisor and Technical SEO, Ahrefs

The Art of SEO is an excellent resource for anyone interested in SEO. The book
covers a wide range of relevant topics, making it a comprehensive guide for SEO
practitioners at all levels. All three authors are seasoned SEO veterans with a
wealth of practical experience. They provide clear explanations and actionable
advice, backed up by research and case studies, that can help readers improve
their website’s visibility in search engine results pages. The Art of SEO is
also regularly updated to reflect changes in search engine algorithms and best
practices, making it a valuable reference for anyone involved in SEO! —Marcus
Tandler, Cofounder and Chief Evangelist, Ryte

SEO is surely an art. But this book ups its value by showing that SEO is also a
flat-out science that can be learned and applied to great effect. We’d be foolish
not to benefit from its clear lessons. —Robert Cialdini, CEO and President,
Influence at Work and Bestselling Author, Influence: Science and Practice

The authors Eric Enge, Stephan Spencer, and Jessie Stricchiola are deep experts
on the whole search and findability space. Their work and advice are clear and
actionable. —Gerry McGovern, Author, Top Tasks and World Wide Waste

Authors Eric, Jessie, and Stephan are deep subject matter experts whose curiosity
and focus show up in this content-rich book. Warm, smart, and in service. —Matt
Church, Founder, Thought Leaders and Author, The Leadership Landscape

The Art of SEO is the definitive book on SEO. Period. The authors, Eric Enge,
Stephan Spencer, and Jessie Stricchiola, are all experts in the field and have
written the most comprehensive guide on SEO that exists. It covers everything
you need to know to improve your website’s ranking in search engines, from basic
concepts like keyword research to advanced techniques like link building. This
4th Edition is a welcome revision in light of the recent advances in Generative
AI and Google’s own revision to their E-A-T framework (E-E-A-T). If you’re
serious about improving your website’s visibility online with the most
comprehensive and up-to-date information, then The Art of SEO is an essential
read. —Neal Schaffer, Author, The Age of Influence

With so many marketing pundits foretelling the end of marketing as we know
it--with the advent of
artificial intelligence (AI) taking over the world, the promise of more
institutional regulations, online marketing imposters making it more difficult
for the good guys to do good--some of the really good guys are doubling down by
releasing the 4th edition of their book on one of those areas where its demise
has been erroneously predicted, SEO. The authors, Eric Enge, Stephan Spencer,
and Jessie Stricchiola are not only doubling down but they are predicting even
more opportunities in this vast and complicated online media choice. They
believe, like most of the “good guys” do, that every challenge is an
opportunity–and with the 4th edition of The Art of SEO, they not only update the
opportunities, they create new avenues and applications available nowhere else.
—Brian Kurtz, Titans Marketing and Author, Overdeliver: Build a Business for a
Lifetime Playing the Long Game in Direct Response Marketing and The Advertising
Solution

The Art of SEO is a game-changing guide for anyone wanting to boost their online
visibility and stay ahead in the ever-changing SEO landscape. —Michael Stelzner,
Founder and CEO, Social Media Examiner

SEO is a mystery for many marketers and entrepreneurs. Eric Enge, Stephan
Spencer, and Jessie Stricchiola are undisputed
masters of the craft. In this book they guide you step-by-step through
understanding and implementing this important way of marketing.

The Art of War isn’t about Chinese pottery, and The Art of SEO isn’t a
paint-by-numbers kit. This 800-page book is a comprehensive guide to SEO
strategies and tactics written by three SEO experts: Eric Enge, Stephan Spencer,
and Jessie Stricchiola. The 4th edition covers the latest changes in the digital
marketing landscape, including the advent of ChatGPT. So, this book continues to
be a must-read for anyone interested in mastering SEO. —Greg Jarboe, President,
SEO-PR, Author, YouTube and Video Marketing, and Coauthor, Digital Marketing
Fundamentals

—Allan Dib, Bestselling Author,

The 1-Page Marketing Plan

Written by top SEO minds and pioneers of the industry, The Art of SEO (4th
edition) is a must-read for anyone in the field. This comprehensive guide covers
everything from fundamentals to advanced concepts like EEAT and generative AI.
Since its first edition in 2009, it has empowered generations of SEO
professionals. A game-changing resource for all skill levels. —Michael Geneles,
Cofounder, Pitchbox

The Art of SEO is the go-to book on how to optimize your website for Google.
Stephan Spencer, Eric Enge, and Jessie Stricchiola crush it with every edition.
Invest your time in this book. You won’t be disappointed. —Marcus Sheridan,
Author,

They Ask You Answer

This comprehensive guide is your road map to digital success! With clear
explanations, real-world examples, and actionable advice, it’s the ultimate SEO
resource. —Roger Dooley, Author, Brainfluence and Friction

What I love about the authors, their approaches, and this book is this: they’re
playing the long game
and showing you how you can do the same. If you want to win the search engine
game, this book is crucial to your victory. —Matt Gallant, CEO and Cofounder,
BIOptimizers

An essential guide to best practices and cuttingedge tactics that belongs on the
desk of all search marketing professionals, especially in these days of nearly
constant change, updates, and new approaches by the search engines. —Chris
Sherman, Founding Editor, Search Engine Land and VP Programming, Search
Marketing Expo

How serious are you about maximizing, multiplying, and monetizing your ability
to harness the dynamic force and power of SEO? I say dynamic
because it is not static. Like game theory, your strategy needs to be
preeminent, preemptive, and ever-evolving. The Art of SEO is deep, serious (yet
eminently, elegantly, and stunningly clear and profoundly actionable)! The
authors examine, explain, explore, and expose the real truths, the real inner
workings, the real ethical means of optimizing SEO in a fast-changing,
ultracompetitive online environment. You will finally and meaningfully grasp how
to gain the most prized outcome imaginable: the sustainable “gift” of your
target market’s fullest attention, presence, and trust! Every chapter is
sincerely a complete short-course primer, masterfully distilled down to its most
actionable, relevant, critical elements. If you’ve struggled to figure out who
to trust to understand and meaningfully manage your SEO opportunities, read this
book—then use it as a reality check against anyone you entrust your most
precious online relationship to. —Jay Abraham, Bestselling Author and Forbes
Magazine “Top 5” Best Executive Coaches in the US

This is the book to read on SEO, packed full of knowledge from beginner to expert
to master. —Ramez Naam,
Former Relevance Group Program Manager, Bing and Author, Nexus

The Art of SEO represents a comprehensive and instructive guide to mastering SEO
for webmasters new and experienced. While the SEO industry continues to evolve,
the primary teachings of this seminal book hold constant—SEO continues to be an
art that can be learned and finessed when armed with the tools and tips recorded
in these writings. —Kristopher B. Jones, Founder, LSEO.com and Coauthor, Search
Engine Optimization

All-in-One For Dummies

If you ever want to sell anything, you can’t overestimate the importance of
understanding search behavior. And there’s no stronger team of A-players to
write about SEO. Everyone should read this book! —Chris Goward, Founder
(exited), Widerfunnel and Author, You Should Test That!

The Art of SEO combines the expertise of the three leading search experts in the
world, making it an invaluable resource. —Gokul Rajaram, Product Engineering
Lead, Square; Former Product Director, Ads, Facebook; Former Product Director,
AdSense, Google

The Art of SEO is hands-down the gold standard of SEO books. It was the first
SEO book I ever owned and the newly updated edition is filled with relevant,
helpful information for anyone who wants to be better at SEO. —Cyrus Shepard,
Owner, Zyppy SEO

The Art of SEO is the most comprehensive guide to SEO available today. Providing
a foundation for beginners through seasoned SEO pros, you’ll find value in the
principles of this book. Anyone who is looking to succeed in today’s digital
landscape can break down the complex world of SEO into manageable pieces. I
highly recommend this book to anyone who wants to stay ahead of the curve when
it comes to SEO. —Jordan Koene, CEO and Cofounder, Previsible

The Art of SEO is an innovative book that can change your company’s fortune and
future forever from the very first page. The book is full of valuable
information that will save you countless hours— and perhaps make you millions of
dollars—when promoting your business online. The concepts and ideas are easy to
understand and follow, which is key for brands or companies that are busy
focusing on their product or service, but need to keep well informed. Authors
Stephan Spencer, Eric Enge, and Jessie Stricchiola bring together decades of
collective experience and share some of their most innovative methods,
research, and strategies to save you valuable time and money in accomplishing
measurable results in your SEO. In its 4th edition, the authors of The Art of
SEO are constantly following the latest changes, and providing the most
up-to-date, comprehensive, tried-and-tested techniques to keep you ahead of the
curve. As I’ve said in many of my talks, if you’re not upgrading your skills,
you’re falling backward. The Art of SEO gives you the latest information to stay
competitive in the field, with all the knowledge readily available at your
fingertips. —Brian Tracy, President, Brian Tracy International

Organic traffic is one of the strongest traffic and sales-driving channels, and
The Art of SEO is the best resource out there that captures how to maximize it.
From planning your strategy and understanding all the technical considerations
to demystifying mobile and local, The Art of SEO covers it all and makes it
clear for both the expert
SEO practitioner and the general digital marketer. Anyone working in ecommerce
needs this book on their shelf. —Erin Everhart, Director, Digital Retail and
Loyalty, Arby’s at Inspire Brands

The Art of SEO is the perfect complement to the science of conversion
optimization. This book is a must-read volume by three highly regarded industry
veterans. —Bryan Eisenberg, New York Times Bestselling Author, Call to Action,
Always Be Testing, Waiting for Your Cat to Bark, and

Be Like Amazon

The 4th edition of The Art of SEO expands and enhances a book that was already
the industry standard for SEO education and strategy. Anyone looking to optimize
their website and get better rankings on the search engines should keep this
book on their desk and refer to it daily. All of the advanced technical SEO
strategies are covered in a straightforward method that is easy to understand
and action-oriented. When you are finished reading this book, you will have a
better grasp on how search engines work and how you can optimize your website
with expert proficiency. If you want to drive more traffic to your website,
engage your audience on a deeper level, generate more sales, and grow your
business—this book lays the plan out for you. —Joseph Kerschbaum, Senior Vice
President, Search and Growth Labs, DEPT

In The Art of SEO, Eric Enge, Stephan
Spencer, and Jessie Stricchiola have taken on the daunting task of compiling a
comprehensive, step-by-step walk-through of what it takes to rank well on
search. They go well beyond the usual tactical aspects, addressing fundamental
challenges like understanding user intent, integrating an SEO culture within
your organization, properly measuring success, and managing an SEO project. This
is a deep, deep dive into the world of organic optimization, and you couldn’t
ask for better guides than Enge, Spencer, and Stricchiola. Clear a place on your
desk, because this is going to be your SEO bible. —Gord Hotchkiss, Founder,
Enquiro Search Solutions

It’s more expensive than ever to get digital advertising right. Cost of
acquisition, search volume, and more are causing many
brands to rethink their spending and strategy. One strategy that has remained
true (and powerful) is the art and science of SEO. It’s not always easy... and
it takes time, but the results always astound. What you are holding in your
hands is the bible when it comes to SEO. Want to truly create a moat and get
ahead of your competition? Dig in! —Mitch Joel, Founder, ThinkersOne and Author,
Six Pixels of Separation and CTRL ALT Delete

An amazingly well-researched, comprehensive, and authoritative guide to SEO from
some of the most well-respected experts in the industry; highly recommended for
anyone involved in online marketing. —Ben Jesson, Cofounder, Conversion Rate
Experts

With the immense changes in the search engine landscape over the past few years,
there couldn’t be a better time for the 4th edition of The Art of SEO. Do your
career, and yourself, a favor and read this book. —Ross Dunn, CEO and Founder,
StepForth Web Marketing, Inc.

As a coauthor of a book people refer to as the “Bible of Search Marketing,” you
might think that I wouldn’t recommend other search books. Not so. But I recommend
only excellent search books written by outstanding search experts. The Art of SEO
easily clears that high standard and is a must-read for anyone serious about
organic search success. —Mike Moran, Coauthor, Search Engine Marketing, Inc. and
Author, Do It Wrong Quickly

Roll up
your sleeves, buckle your seat belt, and take your foot off the brake. You are
about to go on a journey from the very basics to the very high-end, enterprise
level, and then into the future of the art of SEO. These three authors have been
involved in Internet marketing from the very start and have hands-on experience.
These are not pundits in search of an audience but practitioners who have
actually done the work, know how it’s done, and have the scars to prove it. This
is a dynamite primer for the beginner and a valued resource for the expert.
Clear, concise, and to the point, it may not make you laugh or make you cry, but
it will make you smart and make you successful. —Jim Sterne, Producer, eMetrics
Marketing Optimization Summit and Chairman, Web Analytics Association

Regardless of whether you’re a beginner or an expert search marketer, The Art of SEO
delivers! From keyword research and search analytics to SEO tools and more! —Ken
Jurina, President and CEO, Epiar

There are no better guides through the world of SEO—the combined experience of
these authors is unparalleled. I can’t recommend highly enough that you buy this
book. —Will Critchlow, CEO, SearchPilot

Simply put…The Art of SEO is a smart book on SEO. Neatly laid out, comprehensive,
and clear…this edition explains the nuances of cutting-edge tactics for improving
your SEO efforts. I refer to it constantly. —Allen Weiss, Founder and CEO,
MarketingProfs.com

Presenting the
inner mechanics of SEO is a daunting task, and this book has accomplished it
with flair. The book reveals the closely guarded secrets of optimizing websites
in a straightforward, easy-to-understand format. If you ever wanted to unravel
the mysteries of the most enigmatic discipline on the internet, this is the book
you want as your guide. This book is so comprehensive and well written, it just
might put me out of a job. —Christine Churchill, President, KeyRelevance

Integration of SEO into any strategic PR plan represents the evolution of our
industry. Ultimately it’s this combination of SEO and PR that realizes the
greatest message pull-through. With its practical tips, The Art of SEO has been
invaluable to our PR firm and to me as a PR professional, helping us form our
content and social media strategy as well as acquire more valuable backlinks
from top media outlets. —Heidi Krupp, CEO, Krupp Kommunications

The definitive book on SEO just keeps getting better, as the new 4th edition of
The Art of SEO is packed full of helpful new information. —Brett Tabke, Founder
and CEO, Pubcon, the Premier Optimization and New Media Conferences

SEO is still critical to a content creator’s success. Whether you are a neophyte
or an advanced search engine marketer, this book will take your business to the
next level and help you drive real revenue opportunities. —Joe Pulizzi, Author,
Epic Content Marketing and Content Inc.

Since the science of SEO changes daily,
understanding those changes and executing from that understanding is critical to
today’s business. This map in book form can help you navigate the seas of change
and take control of your ship. The essential SEO guide will move you into the
captain’s seat of online marketing. —Toni Sikes, CEO, CODAworx and Founder, The
Guild

The Art of SEO masterfully unravels the intricate world of SEO, blending
marketing and technical expertise to illuminate this often-misunderstood field.
Providing insights that range from SEO fundamentals to advanced techniques, this
book is an indispensable guide for beginners and an invaluable reference for
seasoned practitioners. Stephan Spencer, Eric Enge and Jessie Stricchiola have
written the definitive resource for anyone looking to excel in this hyperdynamic
landscape. —Jenise Uehara, CEO, Search Engine Journal

Written by in-the-trenches practitioners, The Art of SEO is a well-written
step-by-step guide providing sensible and practical advice on how to implement a
successful SEO program. The authors have created a readable and straightforward
guide filled with concise and easily adopted strategies and tactics any online
business can use. I now have a great resource to recommend when people ask,
“Know any good books on SEO?” —Debra Mastaler, President, De9er Media

Mastering is the keyword of this title. Once you’re ready to become an expert
and reap the benefits, this is the ultimate deep dive. —Derek Sivers, Author,
Anything You Want and TED Speaker

The Art of SEO books have helped thousands of people come to grips with the
confusing world of SEO. This 4th edition is no different. With new examples, new
strategies and new approaches, this book gives you the tools needed to get
higher Google rankings using modern day strategies. —Brian Dean, Cofounder,
Exploding Topics

The Art of SEO has always given readers the most comprehensive insights possible
into the industry I have loved for decades. Every SEO should have a reference
copy. I am delighted to see it refreshed and brought up to date. —Dixon Jones,
CEO, inLinks

The Art of SEO not only illustrates the broader landscape of search mechanics,
but also zeroes in on constructing websites and content that harmonize with
users’ journeys. This guide transcends the typical mantra of “create good
content,” instead offering a detailed road map to crafting engaging online
experiences that secure impressive rankings. —Bartosz Góralewicz, Founder and
CEO, Onely

Edition after edition, The Art of SEO continues to bring the top strategies,
research, tactics, and insights into our ever-changing industry. SEO is far from
easy, but having a resource like this makes it just a bit more manageable as we
tackle all that Google and other technologies throw at us. —Sean Kainec, VP,
Strategic Growth and Marketing, Quattro Agency

DO NOT BUY THIS BOOK. Please. I beg of you. If you compete with us or any of our
clients, do not
buy this book. It’s become our go-to source for anything—and everything—we need
to know about successful SEO. —Amy Africa, CEO, Eight By Eight

This updated book is THE BLUEPRINT for essentially uncovering hidden cash in
your business. It has everything from technical SEO to high-level strategies for
on-page and backlinks— everything the modern business needs to know to dominate
organic search now and into the future! Highly recommended! —Greg Merrilees,
Founder, Studio1Design.com

The Art of SEO is a thorough and comprehensive must-have for SEO practitioners,
teachers, students, marketers, and anyone who is interested in learning and
implementing SEO. You can’t just leave it up to chance whether Google directs
the prospects to you or not! —Robert Allen, Author, Multiple Streams of Income
and Coauthor, The One Minute Millionaire

This book is a must-read for anyone because it provides invaluable insights into why
SEO is essential for success and expertise on how to get the results you need.
Written by some of the top SEO experts in the industry, The Art of SEO is the
ultimate resource for anyone looking to maximize their online visibility. —Dan
Gaul, Cofounder and CTO, Digital Trends

Stephan Spencer, Eric Enge, and Jessie Stricchiola, the triumvirate behind ‘Art
of SEO', are industry titans who’ve collectively transformed SEO. Their book is
a global reference point, reflecting over 60 years of their combined expertise.
It’s an honor to endorse The Art of SEO and its authors—true pioneers in the
realm of search. —Christoph Cemper, Founder, AIPRM, LinkResearchTools, and Link
Detox

The Art of SEO FOURTH EDITION

Mastering Search Engine Optimization

Eric Enge, Stephan Spencer, and Jessie Stricchiola

The Art of SEO by Eric Enge, Stephan Spencer, and Jessie Stricchiola Copyright ©
2023 Pilot Holding, Inc., Stephan Spencer, and Alchemist Media, Inc. All rights
reserved. Printed in the United States of America. Published by O’Reilly Media,
Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472. O’Reilly books may
be purchased for educational, business, or sales promotional use. Online
editions are also available for most titles (https://oreilly.com). For more
information, contact our corporate/institutional sales department: 800-998-9938
or corporate@oreilly.com.

Acquisitions Editor: Melissa Duffield
Development Editor: Shira Evans
Production Editor: Katherine Tozer
Copyeditor: Rachel Head
Proofreader: Sonia Saruba
Indexer: Judith McConville
Interior Designer: David Futato
Cover Designer: Randy Comer
Illustrator: Kate Dullea

September 2023: Fourth Edition

Revision History for the First Release
2023-08-30: First Release

See https://oreilly.com/catalog/errata.csp?isbn=9781098102616 for release
details. The O’Reilly logo is a registered trademark of O’Reilly Media, Inc. The
Art of SEO, the cover image, and related trade dress are trademarks of O’Reilly
Media, Inc. The views expressed in this work are those of the authors and do not
represent the publisher’s views. While the publisher and the authors have used
good faith efforts to ensure that the information and instructions contained in
this work are accurate, the publisher and the authors disclaim all
responsibility for errors or omissions, including without limitation
responsibility for damages resulting from the use of or reliance on this work.
Use of the information and instructions contained in this work is at your own
risk. If any code samples or other technology this work contains or describes is
subject to open source licenses or the intellectual property rights of others,
it is your responsibility to ensure that your use thereof complies with such
licenses and/or rights.

978-1-098-10261-6 [LSI]

TABLE OF CONTENTS

Foreword  xxv
Preface  xxix

1. Search: Reflecting Consciousness and Connecting Commerce  1
   Is This Book for You?  2
   SEO Myths Versus Reality  2
   The Mission of Search Engines  5
   Goals of Searching: The User’s Perspective  6
   Determining User Intent: A Challenge for Search Marketers and Search Engines  8
   Navigational Queries  8
   Informational Queries  9
   Transactional Queries  10
   Local Queries  11
   Searcher Intent  13
   How Users Search  17
   How Search Engines Drive Ecommerce  20
   Types of Search Traffic  22
   Search Traffic by Device Type  23
   More on the Makeup of SERPs  27
   The Role of AI and Machine Learning  31
   Using Generative AI for Content Generation  31
   SEO as a Career  33
   Conclusion  36

2. Generative AI and Search  37
   A Brief Overview of Artificial Intelligence  37
   More About Large Language Models  38
   Generative AI Solutions  39
   Generative AI Capabilities  45
   Prompt Generation (a.k.a. Prompt Engineering)  60
   Generative AI Challenges  61
   Conclusion  63

3. Search Fundamentals  65
   Deconstructing Search  66
   The Language of Search  67
   Word Order and Phrases  68
   Search Operators  70
   Vertical and Local Intent  72
   Crawling  73
   The Index  74
   The Knowledge Graph  74
   Vertical Indexes  75
   Private Indexes  76
   The Search Engine Results Page  76
   Organic Results  76
   Special Features  77
   Query Refinements and Autocomplete  83
   Search Settings, Filters, and Advanced Search  85
   Ranking Factors  85
   Relevance  86
   AI/Machine Learning’s Impact on Relevance  86
   EEAT  87
   Local Signals and Personalization  88
   Timing and Tenure  89
   Legitimacy  90
   Source Diversity  91
   Keywords in Anchor Text  91
   Negative Ranking Factors  91
   User Behavior Data  92
   Conclusion  93

4. Your SEO Toolbox  95
   Spreadsheets  96
   Traffic Analysis and Telemetry  96
   Google Search Console  97
   Server-Side Log Analysis  98
   JavaScript Trackers  101
   Tag Managers  105
   Search Engine Tools and Features  106
   Autocomplete  106
   Google Ads Keyword Planner  107
   Google Trends  108
   Google News  108
   Related  109
   Search Operators  109
   SEO Platforms  110
   Semrush  111
   Ahrefs  112
   Searchmetrics  112
   Moz Pro  113
   Rank Ranger  114
   Other Platforms  114
   Automation  115
   YouTube Optimization  116
   Conclusion  116

5. SEO Planning  117
   Strategy Before Tactics  117
   The Business of SEO  118
   Ethical and Moral Considerations  119
   The Escape Clause  119
   Deciding on Accepting Work  120
   Typical Scenarios  120
   Startups (Unlaunched)  121
   Startups (Launched)  121
   Established Small Businesses  121
   Large Corporations  122
   Initial Triage  124
   Document Previous SEO Work  125
   Look for Black Hat SEO Efforts  125
   Watch for Site Changes That Can Affect SEO  126
   Identify Technical Problems  127
   Know Your Client  130
   Take Inventory of the Client’s Relevant Assets  130
   Perform a Competitive Analysis  133
   Information Architecture  133
   SEO Content Strategy  135
   The Long Tail of Search  135
   Examples of Sites That Create Long-Tail Content  137
   Why Content Breadth and Depth Matter  139
   Can Generative AI Solutions Help Create Content?  142
   Measuring Progress  143
   Conclusion  143

6. Keyword Research  145
   The Words and Phrases That Define Your Business  145
   The Different Phases of Keyword Research  146
   Expanding Your Domain Expertise  147
   Building Your Topics List  147
   Preparing Your Keyword Plan Spreadsheet  149
   Internal Resources for Keyword Research  153
   Web Logs, Search Console Reports, and Analytics  153
   Competitive Analysis  154
   People  154
   External Resources for Keyword Research  158
   Researching Natural Language Questions  158
   Researching Trends, Topics, and Seasonality  161
   Keyword Valuation  162
   Importing Keyword Data  163
   Evaluating Relevance  164
   Priority Ratings for Business Objectives  166
   Filtering Out Low-Traffic Keywords  167
   Breaking Down High-Difficulty Keywords  168
   Trending and Seasonality  169
   Current Rank Data  171
   Finding the Best Opportunities  171
   Acting on Your Keyword Plan  176
   Periodic Keyword Reviews  177
   Conclusion  178

7. Developing an SEO-Friendly Website  179
   Making Your Site Accessible to Search Engines  180
   Content That Can Be Indexed  180
   Link Structures That Can Be Crawled  180
   XML Sitemaps  183
   Creating an Optimal Information Architecture  188
   The Importance of a Logical, Category-Based Flow  188
   Site Architecture Design Principles  192
   Flat Versus Deep Architecture  195
   Search-Friendly Site Navigation  199
   Root Domains, Subdomains, and Microsites  206
   When to Use a Subfolder  208
   When to Use a Subdomain  208
   When to Use a Separate Root Domain  209
   Microsites  209
   Selecting a TLD  212
   Optimization of Domain Names/URLs  213
   Optimizing Domains  213
   Picking the Right URLs  216
   Keyword Targeting  218
   Title Tags  220
   Meta Description Tags  222
   Heading Tags  224
   Document Text  225
   Image Filenames and alt Attributes  228
   Visual Search  229
   Boldface and Italicized Text  243
   Keyword Cannibalization  244
   Keyword Targeting in CMSs and Automatically Generated Content  245
   Effective Keyword Targeting by Content Creators  245
   Long-Tail Keyword Targeting  246
   Content Optimization  248
   Content Structure  248
   CSS and Semantic Markup  250
   Content Uniqueness and Depth  253
   Content Themes  256
   Duplicate Content Issues  256
   Consequences of Duplicate Content  258
   How Search Engines Identify Duplicate Content  259
   Copyright Infringement  263
   Example Actual Penalty Situations  263
   How to Avoid Duplicate Content on Your Own Site  264
   Controlling Content with Cookies and Session IDs  266
   What’s a Cookie?  267
   What Are Session IDs?  267
   How Do Search Engines Interpret Cookies and Session IDs?  269
   Why Would You Want to Use Cookies or Session IDs to Control Search Engine Access?  269
   Content Delivery and Search Spider Control  270
   Cloaking and Segmenting Content Delivery  270
   Reasons for Showing Different Content to Search Engines and Visitors  271
   Leveraging the robots.txt File  273
   Using the rel=“nofollow” Attribute  278
   Using the Robots Meta Tag  280
   Using the rel=“canonical” Attribute  283
   Additional Methods for Segmenting Content Delivery  285
   Redirects  288
   Why and When to Redirect  288
   Good and Bad Redirects  288
   Methods for URL Redirecting and Rewriting  289
   How to Redirect a Home Page Index File Without Looping  295
   Using a Content Management System  297
   CMS Selection  303
   Third-Party CMS or Ecommerce Platform Add-ons  303
   CMS and Ecommerce Platform Training  304
   JavaScript Frameworks and Static Site Generators  304
   Types of Rendering  305
   JavaScript Frameworks  308
   Jamstack  308
   Problems That Still Happen with JavaScript  310
   Best Practices for Multilingual/Multicountry Targeting  311
   When to Enable a New Language or Country Version of Your Site  312
   When to Target a Language or Country with a Localized Website Version  313
   Configuring Your Site’s Language or Country Versions to Rank in Different Markets  314
   The Impact of Natural Language Processing  323
   Entities  324
   Fair Use  327
   Structured Data  328
   Schema.org  329
   Schema.org Markup Overview  332
   How to Use Schema.org  334
   Summarizing Schema.org’s Importance  346
   Google’s EEAT and YMYL  346
   Author Authority and Your Content  349
   Why Did Google End Support for rel=“author”?  350
   Is Author Authority Dead for Google?  351
   Author Authority and EEAT  352
   Author Authority Takeaways  352
   Google Page Experience  355
   Use of Interstitials and Dialogs  355
   Mobile-Friendliness  356
   Secure Web Pages (HTTPS and TLS)  358
   Core Web Vitals  358
   How Much of a Ranking Factor Are Core Web Vitals?  362
   Using Tools to Measure Core Web Vitals  366
   Optimizing Web Pages for Performance  367
   Approach to Rendering Pages  368
   Server Configuration  369
   Ecommerce/CMS Selection and Configuration  370
   Analytics/Trackers  370
   User Location and Device Capabilities  371
   Domain Changes, Content Moves, and Redesigns  371
   The Basics of Moving Content  372
   Large-Scale Content Moves  372
   Mapping Content Moves  373
   Expectations for Content Moves  376
   Maintaining Search Engine Visibility During and After a Site Redesign  377
   Maintaining Search Engine Visibility During and After Domain Name Changes  378
   Changing Servers  380
   Changing URLs to Include Keywords in Your URL  382
   Accelerating Discovery of Large-Scale Site Changes  382
   Conclusion  384

8. SEO Analytics and Measurement  385
   Why Measurement Is Essential in SEO  386
   Analytics Data Utilization: Baseline, Track, Measure, Refine  386
   Measurement Challenges  387
   Analytics Tools for Measuring Search Traffic  388
   Valuable SEO Data in Web Analytics  388
   Referring Domains, Pages, and Sites  389
   Event Tracking  389
   Connecting SEO and Conversions  391
   Attribution  393
   Segmenting Campaigns and SEO Efforts by Conversion Rate  394
   Increasing Conversions  394
   Calculating SEO Return on Investment  394
   Diagnostic Search Metrics  396
   Site Indexing Data  397
   Index-to-Crawl Ratio  398
   Search Visitors per Crawled Page  398
   Free SEO-Specific Analytics Tools from Google and Bing  398
   Using GA4 and GSC Together  399
   Differences in How GA4 and GSC Handle Data  399
   Differences Between Metrics and Dimensions in GA4 and GSC  400
   First-Party Data and the Cookieless Web  401
   Conclusion  402

9. Google Algorithm Updates and Manual Actions/Penalties  403
   Google Algorithm Updates  404
   BERT  404
   Passages and Subtopics  406
   MUM  408
   Page Experience and Core Web Vitals  408
   The Link Spam Update  410

10

The Helpful Content Update Broad Core Algorithm Updates Functionality Changes
Google Bug Fixes Google Search Console Google Webmaster Guidelines Practices to
Avoid Good Hygiene Practices to Follow Quality Content Content That Google
Considers Lower Quality The Importance of Content Diversity The Role of
Authority in Ranking Content The Impact of Weak Content on Rankings Quality
Links Links Google Does Not Like Cleaning Up Toxic Backlinks Sources of Data for
Link Cleanup Using Tools for Link Cleanup The Disavow Links Tool Google Manual
Actions (Penalties) Types of Manual Actions/Penalties Security Issues Diagnosing
the Cause of Traffic/Visibility Losses Filing Reconsideration Requests to
Remediate Manual Actions/Penalties Recovering from Traffic Losses Not Due to a
Manual Action/Penalty Conclusion

411 412 413 415 415 416 417 420 420 422 424 427 427 428 431 436 438 439 440 442
443 449 449

452 453

Auditing and Troubleshooting. . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . SEO Auditing Unscheduled Audits Customized Approaches to Audits
Pre-Audit Preparations Additional SEO Auditing Tools Core SEO Audit Process
Summary Sample SEO Audit Checklist Auditing Backlinks

455 455 456 456 458 459 461 462 472

451

CONTENTS

xix

SEO Content Auditing Troubleshooting Pages Not Being Crawled Page Indexing
Problems Duplicate Content Broken XML Sitemaps Validating Structured Data
Validating hreflang Tags Local Search Problems Missing Images Missing alt
Attributes for Images Improper Redirects Bad or Toxic External Links Single
URL/Section Ranking/Traffic Loss Whole Site Ranking/Traffic Loss Page Experience
Issues Thin Content Poor-Quality Content Content That Is Not Helpful to Users
Google Altering Your Title or Meta Description Hidden Content Conclusion 11

xx

CONTENTS

473 475 475 479 480 482 483 486 486 487 488 488 488 490 492 494 496 497 498 499
500 502

Promoting Your Site and Obtaining Links. . . . . . . . . . . . . . . . . . . .
503 Why People Link 504 Google’s View on Link Building 505 How Links Affect
Traffic 508 Finding Authoritative, Relevant, Trusted Sites 510 Link Analysis
Tools 511 Identifying the Influencers 512 Determining the Value of Target Sites
513 Creating a Content Marketing Campaign 514 Discuss Scalability 515 Audit
Existing Content 516 Seek Organically Obtained Links 517 Researching Content
Ideas and Types 517 Articles and Blog Posts 518 Videos 518

12

Research Reports, Papers, and Studies Interactives Collaborations with Other
Organizations Collaborations with Experts Quizzes and Polls Contests Cause
Marketing Comprehensive Guides and In-Depth Content Libraries Infographics
Tools, Calculators, or Widgets Viral Marketing Content Memes Content Syndication
Social Media Posts Creating Remarkable Content Hiring Writers and Producers
Generating and Developing Ideas for Content Marketing Campaigns Don’t Be a Troll
Don’t Spam, and Don’t Hire Spammers Relationships and Outreach Building Your
Public Profile Link Reclamation Link Target Research and Outreach Services
Qualifying Potential Link Targets The Basic Outreach Process What Not to Do
Outbound Linking Conclusion

519 520 522 522 523 524 525 526 527 527 528 529 529 530 530 531 532 535 535 536
536 537 538 540 541 552 554 555

Vertical, Local, and Mobile SEO. . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . Defining Vertical, Local, and Mobile Search Vertical Search Local and
Mobile Search The Impact of Personalization Journeys and Collections How Local
Is Local Search? Vertical Search Overlap

557 558 558 559 560 561 562 563

CONTENTS

xxi

xxii

Optimizing for Local Search Where Are Local Search Results Returned? Factors
That Influence Local Search Results Optimizing Google Business Profiles Customer
Reviews and Reputation Management Understanding Your Local Search Performance
The Problem of Fake Competitors Common Myths of Google Business Profile Listings
Optimizing On-Site Signals for Local Optimizing Inbound Links for Local
Optimizing Citations for Local Structured Data for Google Business Profile
Landing Pages Image Search Image Optimization Tips Image Sharing Sites News
Search Google News Measuring News Search Traffic Video Search YouTube Google
Videos and Universal Search Conclusion

564 566 570 571 574 576 582 586 590 591 592 593 594 596 599 602 603 616 618 619
635 641

13

Data Privacy and Regulatory Trends. . . . . . . . . . . . . . . . . . . . . . .
. . The Consumer Data Privacy Revolution The IAB Sounds the Alarm Privacy
Legislation Overview Third-Party Cookie Deprecation Google’s Privacy Sandbox
Google’s Topics API Google’s Protected Audience API Google Analytics 4 Apple’s
Consumer-Friendly Privacy Shift Conclusion

643 643 644 644 651 652 652 653 653 654 654

14

SEO Research and Study. . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . 655 SEO Research and Analysis 655

CONTENTS

15

SEO Forums Websites Subreddits and Slack Channels Resources Published by Search
Engines Interpreting Commentary SEO Testing Analysis of Top-Ranking Sites and
Pages Analysis of Algorithmic Differences Across Various Search Platforms The
Importance of Experience Competitive Analysis Content Analysis Internal Link
Structure and Site Architecture External Link Attraction Analysis SEO Strategy
Analysis Competitive Analysis Summary Using Competitive Link Analysis Tools
Using Search Engine–Supplied SEO Tools Google Search Console Bing Webmaster
Tools The SEO Industry on the Web Social Networks Conferences and Organizations
SEO and Search Marketing Training Conclusion

656 656 656 656 657 657 660 661 662 663 663 663 664 665 665 666 666 668 674 677
677 678 679 681

An Evolving Art Form: The Future of SEO. . . . . . . . . . . . . . . . . . . . .
The Ongoing Evolution of Search The Growth of Search Complexity Natural Language
Processing Entities and Entity Maps Meeting Searcher Intent More Searchable
Content and Content Types Crawling Improvements New Content Sources More
Personalized and User-Influenced Search User Experience Growing Reliance on the
Cloud

683 687 687 689 690 691 692 694 695 696 696 697

CONTENTS

xxiii

Increasing Importance of Local, Mobile, and Voice Search Local Search Voice Is
Just a Form of Input Increased Market Saturation and Competition SEO as an
Enduring Art Form The Future of Semantic Search and the Knowledge Graph
Conclusion

698 698 699 699 701 702 706

Index. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
. . . . . . . . . . . . . . . . 709

xxiv

CONTENTS

Foreword

Welcome to the 4th edition of The Art of SEO. I’ve always valued this book
through each of its prior editions, and this edition is no different. If you
have an interest in learning about search engine optimization (SEO), this book
is one of the best places you could possibly start. But why does SEO remain such
an important field of endeavor, and what is it that makes it worth learning?
There are many driving factors for this. These include:

• SEO as a discipline predates Google. It has been around for over 25 years. As a career, it offers so many different opportunities for those who become proficient with it. Want to think like a marketer today? SEO offers plenty of opportunities to work with a marketing focus. Want to be a tech geek today? No problem. There are many ways to engage with SEO from a deeply technical mindset.

• In SEO, change is constant. Every two to three years, significant changes take place. The evolution of SEO has been fascinating to watch. In the very beginning, it was all about keywords. Then technical architecture became the next frontier. With the advent of Google, links became the next big thing. Following that, many drove great success by focusing on scaling content to meet user needs. And we may now be entering the era of AI-assisted SEO programs. This constant rate of change makes SEO challenging, and to be successful you must be constantly learning. This can make it one of the most rewarding careers you can pursue.

Some may raise the concern about the longevity of a career in SEO. Will it be
around in five years? Won’t AI spell the end for SEO? Those who worry about how
Generative AI may impact SEO are concerned that perhaps all content will be
auto-generated in the future, or that in some way, humans won’t be needed to
implement websites. The reality is that today’s AI algorithms don’t really
contain material knowledge on anything. They are able to spit back pretty good
answers based on the language analysis of sites that they’ve found across the
web, but the algorithms are highly prone to “hallucinations” (this is what the
industry calls errors from Generative AI tools). For example, ChatGPT is widely
known to make up references and facts alike. The biggest issue with Generative AI is not the error rate, however; it’s the fact that these issues are hard to identify and fix. Finding these problems can be crowdsourced, but the quantity of issues is likely so vast that it will take a very long time for crowdsourcing efforts to find the great majority of them. And finding them is not enough: a human then needs to decide how to adjust that part of the algorithm, and that involves a value judgment, which raises the question of which humans we will assign to make those judgments. AI is only the
latest reason why people are speculating that SEO may be on the verge of
becoming obsolete. This conversation comes up every two or three years. But
those who raise these questions don’t fully understand the underpinning of what
makes SEO so important:

• Users want to search for things online. This is a core need today, and AI, or any other technology, won’t change that.

• Search engines need third parties to make information available to them. In today’s environment, this is done largely by information published by third parties via websites. This basic ecosystem will continue to exist for the foreseeable future, and as long as it does, SEO will be alive and well.

OK, so SEO will be around for a long time, but how valuable is it, really? That’s a great question!
Per a Conductor study based on a review of six hundred enterprise websites, 36%
of all web traffic comes from organic search. The authors of this book also
shared data with me that they received directly from seoClarity, showing that
across 800 enterprise sites, 48% of all traffic comes from SEO.

In addition, SEO is one of the highest ROI channels available to an
organization. A survey by the Search Engine Journal showed that 49% of
respondents saw organic search as their highest ROI channel. That said, SEO
tends to be hard for people to understand. Channels like email, display
advertising, and paid search are direct response channels. You put some money in
today and within a few days you have some money back. This makes measuring
results quite easy. In contrast, investments in SEO can take many months to show
results. By the time you see the results, you may not even be sure which site
changes were responsible. All of this leads to confusion about the value of SEO
and its reliability as a channel. This is what makes this book such a valuable
resource. This updated edition of The Art of SEO addresses all these topics and
more. For example, the book provides you with:

• Background on how search engines work
• A detailed approach to creating your SEO strategy
• The ins and outs of keyword research
• An introduction to the top SEO tools in the market
• The most in-depth coverage of technical SEO that I’ve ever seen in one place
• An in-depth look at how to deal with algorithm updates
• A how-to for auditing and troubleshooting site issues
• Complete coverage of vertical search, including how you should optimize for video search, YouTube, local search, and image search
• A review of how you should approach analytics and ROI analysis
• A vision of where SEO is going in the future

Furthermore, this team of authors is uniquely qualified to bring this resource to you. Eric Enge was for a long
time this mythical figure in SEO for me, always clearly on the forefront of SEO.
His articles were among the first I read to learn SEO, and he always brought a
clarity of vision on the industry that few bring. He continues to write and
publish and offer the clarity that I have admired and respected for so long. I
don’t remember when I met coauthor Stephan Spencer, but it was probably at an
SEO conference in 2008 or 2009. I was impressed with his insights and for a long
time we had a dialogue going about SEO plug-ins for WordPress and potentially
collaborating. While that collaboration didn’t happen, we did keep in touch, as
Stephan was always one of the most clear-headed people in our industry.

While I don’t know coauthor Jessie Stricchiola personally, I do know her by
reputation. Her work as an expert consultant and witness in legal matters is
extremely well known. This is a field that requires not just an understanding of SEO but also keen insight into the workings of the search industry and all the players within it. To summarize, a couple of things stand
out. The first is that despite having seen the difference SEO can make in so
many companies with my own eyes, most companies still don’t really understand
SEO and what it could bring them. When they do want to invest in SEO, what
people think works for SEO is more often than not based on what I can only
describe as nonsense. By reading this book and following its suggestions, you’ll
make better decisions on SEO and, more importantly, produce better results. The
second thing is related, but oh so important: the realization that it’s just so
much easier to measure the impact of PPC, but it’s also so much more expensive
in the long run. As an investor interested in long-term sustainable marketing
tactics, I keep pushing companies to not just do search marketing, but SEO
specifically. This book will be among the things I’ll give to the companies we
invest in, and if you buy it, it’ll help you build a good SEO program for your
website too.

—Joost de Valk
Founder, Yoast

Preface

Search engine optimization (SEO) is a field that remains little understood by
most organizations. To complicate matters more, SEO crosses over between
marketing and technical disciplines, so it’s not even clear where it should sit
in most organizations. Yet, a successful SEO program for your website is often
one of the highest-ROI activities an organization can decide to invest in. We
wrote this book to demystify this ever-changing, ever-evolving landscape. This
latest edition is a complete revamp of the third edition of The Art of SEO,
which was published in 2015. It covers all of the latest aspects of SEO,
including EEAT and the impact of generative AI (ChatGPT, Google Bard, Bing Chat,
Claude, etc.).

Who This Book Is For

You can think of this book as a complete guide to SEO 101,
SEO 102, and so on, all the way through to SEO 500. As such, it can serve both
as an essential manual for those seeking to learn SEO for the first time, and as
an invaluable reference for those who are already experienced with SEO.

How This Book Is Organized

The first few chapters of the book represent a great starting place for those who are just beginning to learn about SEO, and the complexity of the material builds gradually from there. Chapters 5 and 6 provide you with guidance on how to get started with your SEO program, covering SEO strategy and keyword research. This is where
you’ll get oriented with what the situation is in your topical area and where
your best opportunities lie to build market share. In Chapters 7 through 10, we
address technical SEO in depth. While this grounding is essential for SEO
beginners, these chapters are also filled with rich material for advanced SEO.
Chapter 11 focuses on the role of links and strategies for attracting attention
(and links) to your site. Chapter 12 is a comprehensive guide to optimizing for
local and vertical search—news, video, and images—with each section going into a
great deal of depth on how to drive SEO success in those areas. We’ve even
included a detailed review of how to optimize your videos for ranking in YouTube
(in addition to how to rank your videos in Google)! Chapters 13 and 14 cover
other aspects of SEO, including the legal landscape as well as how to continue
your development as a world-class SEO professional. Finally, Chapter 15 presents
a vision of the future as we look into the crystal ball to see what may be
coming our way over the next few years. Visit the book’s website for FAQs and to
post your own burning questions. You’ll have access to special offers and
discounts on various SEO tools and services. You can also get exclusive access
to instructional videos related to the concepts in the book by emailing
bonuses@artofseo.com.

Why Us?

As a group, this book’s authors have over 60 years’ experience working
on SEO, a discipline involving deep proficiency in all aspects of digital
marketing—from website development, information architecture, and user
experience (UX) to market research, content strategy, analytics, conversion
optimization, and data-driven decision making. These technical skills, along
with the ability to merge the analytical with the creative, are essential
elements of the SEO professional’s toolkit. We have seen how SEO works over a
relatively long period of time, and this book is our effort to share that
knowledge with you.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
    Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
    Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

Constant width bold
    Shows commands or other text that should be typed literally by the user.

Constant width italic
    Shows text that should be replaced with user-supplied values or by values determined by context.

O’Reilly Online Learning

NOTE: For more than 40 years, O’Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed.

Our unique network of experts and innovators share their knowledge and expertise
through books, articles, and our online learning platform. O’Reilly’s online
learning platform gives you on-demand access to live training courses, in-depth
learning paths, interactive coding environments, and a vast collection of text
and video from O’Reilly and 200+ other publishers. For more information, visit
https://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O’Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-829-7019 (international or local)
707-829-0104 (fax)
support@oreilly.com
https://www.oreilly.com/about/contact.html

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/the-art-of-seo-4e.

For news and information about our books and courses, visit https://oreilly.com.
Find us on LinkedIn: https://linkedin.com/company/oreilly-media. Follow us on
Twitter: https://twitter.com/oreillymedia. Watch us on YouTube:
https://youtube.com/oreillymedia.

Acknowledgments

We’d like to thank the following people:

• Contributors who helped us with their specialized expertise: Dalton Finney (technical SEO), Jessica Fjeld (generative AI challenges), Greg Gifford (local search), Greg Jarboe (video search), Jeff Martin (video search), Marian Mursa (local search), Ralf Seybold (links and penalties), John Shehata (news search), Chris Silver Smith (image search), Aleyda Solis (international SEO), Mats Tolander (technical SEO), and Mark Traphagen (authorship and EEAT).

• Our technical reviewers: Eli Schwartz, Chris Silver Smith, and Patrick Stox.

• Our O’Reilly copyeditor: thanks to Rachel Head for her countless hours dredging her way through hundreds and hundreds of pages. You helped make this a better book.

CHAPTER ONE

Search: Reflecting Consciousness and Connecting Commerce

The term search engine
optimization (SEO) is a bit of a misnomer. People who work in the SEO field do
not optimize search engines; they optimize web pages so that they are accessible
and appealing to both search engines and people. Therefore, a more appropriate
label would be “search result optimization,” because the immediate goal of SEO
is to improve the rankings of your web pages in search results. Notice we didn’t
say “the ultimate goal,” because ranking higher in search results is not an end
unto itself; it is only a means for increasing traffic to your site, which—if it
is optimized for converting visitors to customers—will help your business be
more successful. Why is this such a critical field? Quite simply, search has
become integrated into the fabric of our society. More than 7.5 billion Google
searches are performed every day, which equates to more than 85,000 queries per
second (with users typically expecting that responses to their search queries
will be returned in less than a second). Further, it is estimated that more than
half of all traffic across the web comes from organic (nonpaid) search, which
means that for many businesses, SEO is the most important digital marketing
investment. Through the power of search, we’re often able to find whatever we
want in just a minute or two, or even a few seconds. People can use search to
conduct many of their research activities and shopping, banking, and social
transactions online—something that has changed the way our global population
lives and interacts. As a result, it’s critical for owners of websites to
increase their visibility in search engine results as much as they can.
Obtaining the desired prime search result real estate is not a simple matter,
but it is one that this book aims to deconstruct and demystify as we examine,
explain, and explore the ever-changing art of search engine optimization.
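
As a quick sanity check, the per-second figure follows directly from the daily total cited above:

    # Back-of-the-envelope arithmetic for the queries-per-second figure cited above
    searches_per_day = 7.5e9
    seconds_per_day = 24 * 60 * 60  # 86,400
    print(f"{searches_per_day / seconds_per_day:,.0f} queries per second")  # ~86,806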

Is This Book for You?

The short answer? Yes, if you’re someone whose work
touches a website. For the advanced SEO, this book is filled with material that
will help you develop your SEO skills even further. For beginners or
intermediates, or for people who work in peripheral fields or roles (such as
product managers, marketers, graphic designers, web developers, or other roles
that have anything to do with creating, managing, or maintaining a website),
this book will help you learn to apply SEO to your work on a day-to-day basis.
Most SEO books are short and focus only on the technical aspects. Understanding
these aspects is critical, as failure on this front can prevent your pages from
being accessible to Google or from ranking as well as they should, regardless of
how great the content is on your site. Unfortunately, most content management
systems (CMSs) and ecommerce platforms are not naturally optimized for SEO, and
the same is true of most JavaScript frameworks and static site generators (if
you’re using one of these). This fact is what makes technical SEO such a
critical area of expertise, and it’s one that demands a large amount of time and
energy to successfully execute. Ultimately, though, technical SEO is just table
stakes, and its main benefit is that it enables your pages to compete for
rankings. It’s the quality and depth of your content and your overall reputation
and visibility online (including the links that your site earns) that drives how
highly you rank. For these reasons, SEO also has a major strategic component.
These aspects of SEO require you to think holistically and creatively. You have
to research trends and personas for the industry or market that your site is
designed for, analyze the sites and companies that compete with it, develop
excellent content and try to get high-value sites to link to it, approach your
topics and keywords (search terms and phrases) from a variety of different
search perspectives and contexts, document your work, and measure your progress,
all while communicating effectively with various managers, engineers, designers,
and administrators who control different aspects of the website. You’ll learn
more about the strategic aspects of SEO in Chapter 4.

SEO Myths Versus Reality

There are few industries or professions that suffer
more from myths and misconceptions than SEO. Here are some of the more common
ones:

Myth: SEO is just a few secret tricks. Reality: SEO is a complex, multifaceted,
iterative process, and it takes time and effort to fully realize its potential.
About the only time you’ll reliably see a substantial “quick win” in legitimate
SEO is when you fix a technical issue that prevents search engines from properly accessing your website content. There may
be some lesser-known tools or techniques, but there aren’t really any secrets
per se. And if there were, then over time, search engines would find them and
fix them, and you’d most likely lose any rankings you’d gained by exploiting
them. Furthermore, optimizing for search engines does not mean cheating or
manipulating them. Optimizing a web page is a lot like optimizing a career
resume or a profile on a dating site; the goal is to enable your site to be
found quickly and easily by the right people, without sabotaging yourself with
lies or misrepresentations.

Myth: All SEO is spammy and unethical. Reality: There are some unethical SEO
practices out there that seek to exploit loopholes in search engine algorithms,
intentionally violate rules or guidelines published by search engines, or
circumvent systems designed to detect or prevent the manipulation of organic
search rankings. We don’t teach such “black hat” tactics in this book; in fact,
we specifically implore you to avoid them. Like pretty much everything else
worth doing in life, you will get the most benefit out of SEO efforts that focus
on mastering the fundamentals instead of exploiting sketchy shortcuts.

Myth: SEO that is not spammy and unethical is a waste of time and money.
Reality: The black hat stuff sometimes does get a page to the top of the search
results, but only for a short period of time before it gets banned from the
index. When unethical SEO practices are detected, search engines are quick to
punish sites that employ them. There are no legitimate SEO tactics that will
instantly rank a new page highly for a high-volume search term; it takes time,
effort, and money to compete for the top 10 spots. It’s a lot like bodybuilding:
steroids may yield big gains very quickly, but at what cost?

Myth: You need to be a web developer or an IT expert to do SEO. Reality: If
you’re not tech-savvy, there are SEO tools (which are covered in Chapter 4) that
can help you discover and diagnose technical SEO issues. You can then rely on
those who are responsible for building or maintaining your site to make the
necessary changes. You’ll benefit from being familiar with HTML, but you likely
won’t have to do any coding or Unix command-line wizardry.

Myth: SEO is only for ecommerce sites or huge corporations. Reality: If you want
to be found via web search, then you will benefit from search engine
optimization. Your site doesn’t have to directly sell anything. Do you have a
mailing list that you want people to sign up for? Do you want to collect
qualified sales leads through a web form or survey? Perhaps you have a
content-centric site such as a blog that generates revenue from ads or affiliate
links, or from paid subscriptions; or maybe you want to promote your book, movie, or band. If more
traffic would help you in any way, then you’ll benefit from SEO.

Myth: Optimizing my site means I’ll have to pay for a complete redesign, or use
a different content management system, or pay for upgraded web hosting. Reality:
In some cases, you may need to make changes to your site or web host in order to
solve technical SEO problems. There are many different options, though, and if
you will benefit from increased traffic, then it’s probably worth the effort and
expense. There is no universal rule of SEO that says you must use Apache or
WordPress or anything like that.

Myth: SEO requires me to be active on social media and schmooze with
influencers. Reality: You don’t necessarily have to do anything like that if it
doesn’t make sense for your organization. It certainly will help you obtain more
incoming links (this is explained in Chapter 11), but if your site is search
friendly and you have great content, then you’ll eventually get some links
without having to solicit for them—it’ll just take more time and may not yield
the same level of improvement in rank.

Myth: Buying Google Ads (or hosting a Google Ads banner on my site) is cheaper
than SEO and works just as well. Reality: Paid search will certainly yield more
visitors than doing nothing, but successful SEO programs usually deliver much
more traffic with a better return on investment (ROI) than paid campaigns, and
will not directly influence organic search rankings. You could pay for ads for
your best “link-worthy” content (this is covered in Chapter 10) and hope that
some of the people who click the ads will then share the links; this could
indirectly increase your search rankings for some keywords, but there is no
direct correlation to buying a Google ad and ranking higher in organic results.
If paying for search ads in Google or Bing is working for you and there is a
decent ROI, then great—keep doing it! But why would you not put in the effort to
improve your organic rankings as well? You don’t have to pay every time someone
clicks on a high-ranking organic result, and as a result, SEO frequently
provides a higher ROI than paid search campaigns. SEO does not have to be
expensive. Certain scenarios are, of course, inherently costly, such as if you
are competing against huge corporations for very difficult keywords, are
attempting to rehabilitate a site that has been banned for previous black hat
shenanigans, or have tens of thousands of pages with messy HTML. Even if you run
into a cost barrier like one of these, though, you can still find creative
alternatives or commit to making smaller incremental improvements over a longer
period.

Hosting Google AdSense units on your pages won’t impact how these pages rank in organic results either. If your goal is to get indexed, then you’re much better off submitting an XML sitemap to Google Search Console (this is covered in Chapter 7) instead. If your site is a content site, then Google AdSense may be worthwhile from a revenue standpoint, but it may show ads from your competitors. And if your goal is to get conversions (sales, sign-ups, leads), then why would you want to show ads for them?

NOTE: Google Ads is Google’s parent company Alphabet Inc.’s largest source of revenue. In general, paid search placement is not within the scope of SEO. However, Google Ads data can be useful for keyword research purposes and for A/B testing of organic landing pages, and many SEO professionals use Google Ads data, including conversion and user engagement data, to assist in SEO strategy development.
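
As a concrete illustration of the sitemap suggestion above, a sitemap is simply an XML file that lists the URLs you want crawled. The short sketch below (the URLs and filename are hypothetical placeholders) writes a minimal sitemap.xml that you could then submit in Google Search Console or reference from a Sitemap: line in robots.txt:

    # A minimal sketch: generate a basic sitemap.xml for a handful of pages.
    # The URLs listed here are hypothetical placeholders; substitute your own.
    from datetime import date

    pages = [
        "https://www.example.com/",
        "https://www.example.com/products/",
        "https://www.example.com/blog/seo-basics/",
    ]

    entries = "\n".join(
        f"  <url>\n    <loc>{url}</loc>\n    <lastmod>{date.today().isoformat()}</lastmod>\n  </url>"
        for url in pages
    )

    sitemap = (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n'
        f"{entries}\n"
        "</urlset>\n"
    )

    with open("sitemap.xml", "w") as f:
        f.write(sitemap)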

The Mission of Search Engines

While those on the internet are free to use any of
the many available search engines to find what they are seeking, Google remains
the dominant player worldwide, with over 90% market share (see Figure 1-1).
Nonetheless, the burden is on Google (and other search engines) to provide a
satisfying search experience. For the most part, search engines aim to
accomplish this by presenting the most relevant results and delivering them as
fast as possible, as users will return to the search engine they believe will
return the results they want in the least amount of time. To this end, search
engines invest a tremendous amount of time, energy, and capital in improving
their speed and relevance. This includes performing extensive studies of user
responses to their search results, continuous testing and experimentation,
analysis of user behavior within the search results (discussed later in this
chapter), and application of advanced machine learning techniques to tune their
search algorithms. Search engines such as Google generate revenue primarily
through paid advertising. The great majority of this revenue comes from a
pay-per-click (or cost-per-click) model, in which the advertisers pay only for
the number of users who click on their ads. Because the search engines’ success
depends so greatly on the relevance of their search results, manipulations of
search engine rankings that result in nonrelevant results (generally referred to
as spam) are dealt with very seriously. Each major search engine employs teams
of people who focus solely on finding and eliminating spam from their search
results (these are referred to as “webspam” teams). Larger search engines such
as Google additionally apply dynamic algorithms that detect and deal with
poor-quality content and/or spam automatically. These efforts to fight spam matter to SEO professionals because they need to be careful that
the tactics they employ will not be considered spammy by the search engines.
Figure 1-1 shows the global market share for search engines as of June 2023,
according to Statcounter. As you can see, Google is the dominant search engine
on the web worldwide, with a 92.5% market share. Bing comes in a
distant second, with a 3.1% share.

Figure 1-1. Search engine market share (source: Statcounter)

However, in some markets, Google is not dominant. In China, for instance, Baidu
is the leading search engine, and Yandex is the leading search engine in Russia.
The fact remains, however, that in most world markets, a heavy focus on Google
is a smart strategy for SEO.

Goals of Searching: The User’s Perspective

The basic goal of a search engine
user is to obtain information relevant to a specific set of search terms entered
into a search box, also known as a query. A searcher may formulate the query as
a question, but the vast majority of searches are performed by users simply
entering word combinations—leaving the search engines to do the work of
determining a query’s “intent.” One of the most important elements of building
an SEO strategy for a website is thus developing a thorough understanding of the
psychology of your target audience and how they use words and concepts to obtain
information about the services and/or products you provide. Once you understand
how the average search engine user—and, more specifically, your target audience—
utilizes query-based search engines, you can more effectively reach and retain
those users.

Search engine usage has evolved over the years, but the primary principles of conducting a search remain largely unchanged. Most search engine use includes the following stages:

1. Users experience the need for information. They may be looking for information on a specific website, and they will search for that website (a navigational query); they might want to learn something (an informational query); or they might want to buy something (a transactional query). We will discuss the challenge of determining the user’s intent in more detail in the following section.

2. Users formulate that need using a string of words and phrases (i.e., search terms), comprising the query. Data provided to us by seoClarity in March 2021 (see Figure 1-2) showed that 58.8% of user search queries are one to three words long, though as users become more web savvy, they may use longer queries to generate more specific results more quickly.

Figure 1-2. Search query lengths (source: seoClarity)

3. Users execute the query, check the results, and, if they seek additional
information, try a refined query.

When this process results in the satisfactory completion of a task, the user,
search engine, and site providing the information or result have a positive
experience.
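
If you want to see how your own query mix compares to the 58.8% figure above, you can compute the same kind of word-count distribution from a query export. A minimal sketch, assuming a hypothetical queries.csv (for example, downloaded from Google Search Console) with a column named "query":

    # A minimal sketch: bucket search queries by word count.
    # "queries.csv" and its "query" column are hypothetical; adapt to your own export.
    import csv
    from collections import Counter

    counts = Counter()
    with open("queries.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            words = len(row["query"].split())
            counts["1-3 words" if words <= 3 else "4+ words"] += 1

    total = sum(counts.values())
    for bucket, n in sorted(counts.items()):
        print(f"{bucket}: {n / total:.1%}")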

Determining User Intent: A Challenge for Search Marketers and Search Engines

Good marketers are empathetic, and smart SEO practitioners share with search
engines the goal of providing searchers with results that are relevant to their
queries. Therefore, a crucial element of building an online marketing strategy
around SEO and organic (nonpaid) search visibility is understanding your
audience and how they think about, discuss, and search for your service,
product, and brand. Search engine marketers need to be aware that search engines
are tools—resources driven by intent toward a content destination. Using the
search box is fundamentally different from entering a URL into the browser’s
address bar, clicking on a bookmark, or clicking a link to go to a website.
Searches are performed with intent: the user wants to find specific information,
rather than just land on it by happenstance. Search is also different from
browsing or clicking various links on a web page. This section provides an
examination of the different types of search queries and their categories,
characteristics, and processes.

Navigational Queries

Users perform navigational searches with the intent of
going directly to a specific website. In some cases, the user may not know the
exact URL, and the search engine serves as the equivalent of an old-fashioned
telephone directory. Figure 1-3 shows an example of a navigational query. You
can use the following two criteria to evaluate whether it’s worth ranking for a
navigational query related to a competing brand:

Opportunities
    Pull searcher away from destination; get ancillary or investigatory traffic. However, a September 2019 study by Eric Enge (one of this book’s authors) published by Perficient shows that nearly 70% of all clicks go to the first search result for branded queries.

Average traffic value
    Very high when searches are for the publisher’s own brand. These types of searches tend to lead to very high conversion rates. However, the searchers are already aware of the company brand; some percentage of these queries will not represent new customers, and for all these queries the user began with an intent to visit the brand site. For brands other than the one being searched for, the click-through rates will tend to be low, but this may represent an opportunity to take a customer away from a competitor.

Figure 1-3. A navigational query

Informational Queries

Informational searches involve an incredibly broad range
of queries. Consider the many types of information people might look for: local
weather, driving directions, a celebrity’s recent interview, disease symptoms,
self-help information, how to train for a specific type of career…the
possibilities are as endless as the human capacity for thought. Informational
searches are primarily nontransaction-oriented (although they can include
researching information about a product or service); the information itself is
the goal, and in many cases no interaction beyond clicking and reading is
required for the searcher’s query to be satisfied. Figure 1-4 shows an example
of an informational query. Informational queries are often lower converting but
good for building brand and attracting links. Here is how you can evaluate
whether a query is worth pursuing:

Opportunities
    Provide searchers who are already aware of your brand with positive impressions of your site, information, company, and so on; attract inbound links; receive attention from journalists/researchers; potentially convert users to sign up or purchase.

Average traffic value
    The searcher may not be ready to make a purchase, or may not even have long-term purchase intent, so the value tends to be “medium” at best. However, many of these searchers will later perform a more refined search using more specific search terms, which represents an opportunity to capture mindshare with those potential customers. For example, informational queries that are focused on researching commercial products or services can have high value.

Figure 1-4. An informational query

Transactional Queries

Transactional searches don’t necessarily have to involve a
credit card or immediate financial transaction. Other examples might include
creating a Pinterest account, signing up for a free trial account at
DomainTools, or finding a good local Japanese restaurant for dinner tonight.
Figure 1-5 shows an example of a transactional query. These queries tend to be
the highest converting. You can use the following two criteria to evaluate the
value of a specific query to you:

Opportunities
    Achieve a transaction (financial or other).

Average traffic value
    Very high. Transactions resulting from these queries may not be immediate, and it’s up to the site receiving the related traffic to provide enough value to the user to convert them on their site or to make enough of an impression that the user comes back and converts later.

Figure 1-5. A transactional query

Local Queries

As the name implies, local searches relate to users seeking information about a specific location, such as where they currently are, or
a location that they reference in the query. Examples include looking for
directions to the nearest park, a place to buy a slice of pizza, or the closest
movie theater (Figure 1-6). Local queries are not an intent in the same way that
navigational, informational, and transactional queries are, but represent a
subclass that cuts across all types of queries. Many local queries are
transactional (though they differ in that they relate to actions or transactions that
will occur in person), but you can have navigational or informational local
queries as well.

Figure 1-6. A local query

Consider the following when evaluating the value of ranking for a local query:

Opportunities
    Drive foot traffic based on the proximity of the searcher. These queries offer strong potential to achieve a transaction (financial or other).

Average traffic value
    Very high. When users search for something near them, the probability that they are interested in direct interaction, and possibly a near-term transaction, is high. We can see that in the way Google has tailored its search engine results pages (SERPs) for local queries to meet this demand.

Searcher Intent

When you are building keyword research charts for clients or for your own sites, it can be incredibly valuable to determine the intent of each of your primary keywords. Table 1-1 shows some examples.

Table 1-1. Examples of query types

Term                       Queries   Intent          Monetary value per visitor
Beijing Airport            5,400     Navigational    Low
Hotels in Xi’an            110       Informational   Mid
7-Day China tour package   30        Transactional   High
Sichuan jellyfish recipe   53        Informational   Low
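
If you keep your keyword list in a spreadsheet, a first pass at this kind of intent tagging can be roughed out programmatically and then refined by hand. The sketch below uses simple, hypothetical trigger words and example terms; it is not a standard taxonomy, and real classification usually requires human review of the actual search results:

    # A minimal sketch: rule-based intent tagging for a keyword list.
    # The trigger words and example terms are illustrative assumptions only.
    TRANSACTIONAL_HINTS = ("buy", "price", "package", "deal", "coupon")
    NAVIGATIONAL_HINTS = ("login", "official site", "website")
    INFORMATIONAL_HINTS = ("how to", "what is", "recipe", "reviews", "hotels in")

    def classify_intent(term: str) -> str:
        t = term.lower()
        if any(hint in t for hint in TRANSACTIONAL_HINTS):
            return "Transactional"
        if any(hint in t for hint in NAVIGATIONAL_HINTS):
            return "Navigational"
        if any(hint in t for hint in INFORMATIONAL_HINTS):
            return "Informational"
        return "Unclassified"  # route these to manual review

    for term in ["7-Day China tour package", "Sichuan jellyfish recipe", "Hotels in Xi'an"]:
        print(f"{term}: {classify_intent(term)}")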

Hopefully, this data can help you to think carefully about how to serve
different kinds of searchers based on their individual intents, and how to
concentrate your efforts in the best possible areas. This type of analysis can
help you determine where to concentrate content and links, as well as where to
place ads. Although informational queries are less likely to immediately convert
into sales, this does not mean you should forgo pursuing rankings on these
queries; getting your informative content in front of users seeking information
can be incredibly valuable and can turn users into potential customers. As you
can see in Figure 1-7, data from Conductor shows that users who find useful
informational content on your site are 131% more likely to come to you to make a
related purchase at a later date. They may also decide to share your information
with others via their own websites, or through social media engagement—an
indirect but potentially more valuable result than converting the single user
into a paying customer. One problem in search is that when most searchers
formulate their search queries, their input is limited to just a handful of
words (per Figure 1-2, 78.8% of queries consist of one to four words). Because
most people don’t have a keen understanding of how search engines work, they
often provide queries that are too general or that are presented in a way that
does not provide the search engine (or the marketer) with what it needs to
determine, with 100% accuracy 100% of the time, their specific intent. Some
search engine users may not have a specific intent behind a query beyond
curiosity about a currently trending topic or a general subject area. While this
can make it challenging for a search engine to deliver relevant results, it
poses a great opportunity for the digital marketer to capture the mind of
someone who may not know exactly what they are looking for, but who is
interested in the variety of results the search engine delivers in response to a
general query. These types of queries are important to most businesses because
they often get the brand and site on the searcher’s radar, which initiates the process of building trust with the user. Over time, the user
will move on to more specific searches that are more transactional or
navigational in nature. If, for instance, companies buying pay-per-click (PPC)
search ads bought only the high-converting navigational and transactional terms
and left the informational ones to competitors, they would lose market share to
those competitors. Over the course of several days, a searcher may start with
digital cameras, home in on Olympus OMD, and then ultimately buy from the store
that showed up in their search for digital cameras and pointed them in the
direction of the Olympus OMD model.

Figure 1-7. How informational content impacts user trust (source: Conductor)

To illustrate further, consider the case where a user searches for the phrase
Ford Focus. They likely have numerous considerations in mind when searching,
even though they only use those two words in the query. Figure 1-8 gives an idea
of what the range of those considerations might be. As we can see, the user’s
needs may have many layers. They may be specifically interested in a hatchback,
a sedan, an electric car, or one of many specific model numbers. If they’re
buying a used car, they may want to specify the year or approximate mileage of
the car. The user may also care about the car having aluminum wheels, Spotify, a
roof rack, front and rear seat warmers, and various other options.

Figure 1-8. The pyramid of user needs

Research published by Think with Google generalizes this concept, referring to
it as the “messy middle.” As Figure 1-9 shows, this is the gap between the
trigger that causes the user to take action and the actual purchase.

Figure 1-9. Typical user journey to a purchase

The exploration/evaluation part of this journey is highly complex and differs
for every user. Whatever desires/needs users bring to this process, it’s
incumbent on the website to do its best to meet them in order to earn the
conversion. Given the general nature of how query sessions start, though,
determining intent is extremely difficult, and it can result in searches being
performed where the user does not find what they want—even after multiple tries.
Research from the American Customer Satisfaction Index (ACSI) found that during
the most recent reporting period, 79% of Google users and 71% of Bing users were
satisfied with their experiences. Figure 1-10 shows the ACSI satisfaction scores
for Google from 2002 through 2020.

Figure 1-10. User satisfaction with Google over time (source: Statista)

While 79% satisfaction is an amazing accomplishment given the complexity of
building a search engine, this study still showed that over 20% of users were
not satisfied with Google’s search results. These numbers could reflect users’
dissatisfaction with the number of ads that increasingly infiltrate the SERPs.
The important takeaway here is that in all instances, determining searcher
intent remains a challenge; and when the searcher’s intent is broad, there is
ample opportunity to leverage your content assets with SEO. As an SEO
practitioner, you should be aware that some of the visitors you attract to your
site may have arrived for the wrong reasons (i.e., they were really looking for
something else), and these visitors are not likely to help you achieve your
digital marketing goals. Part of your task in performing SEO is to maintain a
high level of relevance in the content placed on the pages you manage, to help
minimize this level of waste—while still attempting to maximize your overall presence in the SERPs
and gain brand exposure.

How Users Search

Search engines invest significant resources into understanding
how people use search, enabling them to produce better (faster, fresher, and
more relevant) search engine results. For website publishers, the information
regarding how people use search can be used to help improve the usability of a
site as well as search engine compatibility. User interactions with search
engines can also be multistep processes, as indicated in the user search session
documented by Microsoft and shown in Figure 1-11.

Figure 1-11. “Merrell shoes” user search session

In this sequence, the user performs 5 searches over a 55-minute period before
making a final selection. They are clearly trying to solve a problem and work at
it in a persistent fashion until the task is done. In April 2021, ad management
platform provider Marin Software provided us with data about consumer latency in
completing a purchase based on a review of all their clients. As you can see in
Figure 1-12, for 55% of these companies, 100% of customer conversions happen on the same day as the initial visit. In other words, there is no latency at all for those companies.

Figure 1-12. Latency in completing purchases

However, for 45% of the companies there is some latency in completion of
purchases. This behavior pattern indicates that people are thinking about their
tasks in stages. As in our Merrell shoes example in Figure 1-11, people
frequently begin with a general term and gradually get more specific as they get
closer to their goal. They may also try different flavors of general terms. In
Figure 1-11, it looks like the user did not find what they wanted when they
searched on Merrell shoes, so next they tried discount Merrell shoes. The user
then refined the search until they finally settled on Easy Spirit as the type of
shoe they wanted. This is just one example of a search sequence, and the variety
is endless. Figure 1-13 shows another search session, once again provided
courtesy of Microsoft. This session begins with a navigational search, where the
user simply wants to locate the travel website Orbitz.com. The user’s stay there
is quite short, and they progress to a search on Cancun all inclusive vacation
packages. Following that, the user searches on a few specific resorts and
finally settles on cancun riviera maya hotels, after which it appears they may
have booked a hotel—the final site visited from among that set of search results
is for Occidental Hotels & Resorts, and the direction of the searches changes
after that. At this point, the user begins to look for things to do while in
Cancun. They conduct a search for cancun theme park and then begin to look for
information on xcaret, a well-known eco park in the area.

Figure 1-13. Travel user search session

Users can traverse countless different scenarios when they are searching for
something. The example search sessions illustrated in Figures 1-11 and 1-13
represent traditional desktop interactions. Behavior will differ somewhat on
mobile devices. For example, with respect to local search, data from Google
indicates that “76% of people who conduct a local search on their smartphone
visit a physical place within 24 hours and 28% of those searches result in a
purchase.” Search engines do a lot of modeling of these different types of
scenarios to enable them to provide better results to users. The SEO
practitioner can benefit from a basic understanding of searcher behavior as
well. We will discuss searcher behavior in more detail in Chapter 2.

How Search Engines Drive Ecommerce

People make use of search engines for a wide
variety of purposes, with some of the most popular being to research, locate,
and buy products. Digital Commerce 360 estimates that the total value of US
ecommerce sales in 2022 reached $1.03 trillion. Statista forecasts that
worldwide ecommerce retail sales will reach $7.4 trillion by 2025, as shown in
Figure 1-14.

Figure 1-14. Statista online retail forecast to 2025 (source: Statista)

It is important to note that search and offline behavior have a heavy degree of
interaction, with search playing a growing role in driving offline sales. Figure
1-15 shows data from a May 2019 study by BrightEdge which found that 27% of the
traffic across their client base came from paid search. Note that this dataset
is drawn from BrightEdge’s customer mix of over 1,700 global customers,
including 57 of the Fortune 100. Driving traffic to your ecommerce site isn’t
just about driving conversions of every visitor. As shown in Figure 1-16,
visitors that come to your site from a search engine may be at any stage of the
customer journey. This is why ecommerce sites should consider creating content
for each and every stage—capturing visitors when they are
in the initial stages of discovery and research can significantly increase your
chances of making the sale when they are ready to buy.

Figure 1-15. Sources of traffic to BrightEdge customer sites (source:
BrightEdge)

Figure 1-16. Search delivers traffic across the customer journey


Types of Search Traffic

By now, you should be convinced that you want your site content to be prominently displayed within SERPs. However, data shows that you may not want to be #1 in the paid search results, because the cost incurred to gain the top position in a PPC campaign can reduce the total net margin of your campaign. As shown in Figure 1-17, London-based Hallam Internet published data in 2019 that suggests that the #3 and #4 ad positions may offer the highest ROI.

Figure 1-17. Position 1 in search ads may not be the most profitable

Of course, many advertisers may seek the #1 position in paid search results, as
it offers benefits including branding and maximizing market share. For example,
if an advertiser has a really solid backend on their website and is able to make
money when they are in the top position, they may well choose to pursue it.
Nonetheless, the data from Hallam suggests that, due to the lower ROI, there are
many organizations for which being #1 in paid search does not make sense.


Search Traffic by Device Type

The majority of traffic to websites today comes from mobile devices. In 2021, while working at Perficient, Eric Enge (one of this book's authors) conducted a comprehensive study of mobile versus desktop traffic that showed that over 60% of all traffic in the US and nearly 70% globally comes from mobile devices, with tablets accounting for a further 3% (as shown in Figure 1-18).

Figure 1-18. Mobile versus desktop traffic, US and global views (source:
Perficient)

This does not mean that desktop search has become unimportant; indeed, it
delivers the most total time on site (also known as “dwell time”) from visitors
in aggregate, and
nearly the same number of page views as mobile devices. Figure 1-19, also from
the Perficient study, shows the aggregated (across all visitors) total time on
site for desktop versus mobile users.

Figure 1-19. Mobile versus desktop aggregate time on site, US and global views
(source: Perficient)

Google has long been aware of the growing importance of mobile devices, and
first announced an algorithm update to focus on these devices in April 2015.
This announcement was dubbed “Mobilegeddon” by the industry; many expected the
update to cause a dramatic upheaval of the search landscape. In reality, it’s
almost never going to be in Google’s interests to completely disrupt the
existing search results
in a dramatic fashion because, as we discussed earlier in this chapter, user
satisfaction with Google has stayed at roughly the 80% level every year since
2002. This is reflected in how Google handled its switch to mobile-first indexing, with the initial announcement in November 2016 stating:

Today, most people are searching on Google using a mobile device. However, our ranking systems still typically look at the desktop version of a page's content to evaluate its relevance to the user. This can cause issues when the mobile page has less content than the desktop page because our algorithms are not evaluating the actual page that is seen by a mobile searcher.

Nearly everyone in the
industry expected this switch to unfold far more quickly than it did. The reason
that it didn’t is that Google has to perform extensive amounts of testing on any
change it makes to how it indexes and ranks content in order to minimize
unintended negative impacts on search results. Due to the scale of search, this
is a highly involved and cumbersome process. It was not until March 2020—over
three years later—that Google announced a target date of September 2020 to make
mobile-first indexing universal. Making such a change is incredibly complex,
however, and in November 2022 Google had still not 100% switched over.
Nonetheless, the great majority of sites are indexed mobile-first. Unless your
site has unusually difficult issues, it is likely already being indexed that
way. From an SEO perspective, this means that Google crawls the mobile version
of your site and analyzes its structure and content to determine the types of
queries for which the site is relevant. This means that the majority of your SEO
focus needs to be on the mobile version of your site. The fact that most
visitors to your site are going to come from mobile devices is of critical
importance. Among other things, it means that website design should start with
mobile functionality, design, and layout. Any other approach is likely to result
in a mobile site that is not as optimal as it could be. It also means that for
most queries, you should be studying the structure and format of the mobile
search results from Google. Figure 1-20 shows sample search results for the
query digital cameras. In the first three screens of results on mobile devices,
the ranking opportunities are in Google Shopping, People Also Ask boxes, and
local search. A core part of your SEO strategy is to develop an understanding of
the search landscape at this level of detail, as it can directly impact the
search terms you choose to target. Targeting informational queries is quite
different, as shown in Figure 1-21. Here you still see Google Shopping at the
top of the results, but the next two screens are filled with access to
informational content. As a result, the nature of the ranking opportunities is
not the same.


Figure 1-20. Mobile SERPs for the query “digital cameras”

Figure 1-21. Mobile SERPs for the query “history of prague”


More on the Makeup of SERPs

In 2019, Perficient published a comprehensive study
by coauthor Eric Enge of how Google’s search features—any content that falls
outside the basic list of links—impact click-through rate (CTR), based on data
for both desktop and mobile searches, as well as for branded and unbranded
searches. Figure 1-22 shows the average CTR (percentage of clicks received) by
Google SERP position for both branded and nonbranded queries. As you can see,
the disparity is significant.

Figure 1-22. Search results CTR by ranking position

Google search results are rich in many different kinds of search features.
Figure 1-23 shows the frequency of different search features within the Google
SERPs as of December 2019.


Figure 1-23. Popularity of search features

Each of these features creates different opportunities for placement in the
search results and impacts the potential CTR you might experience. The reason
that CTR is impacted is that users respond to different visual elements and
their eyes get drawn to images and parts of the page that look different. Nearly
two decades ago, research firms Enquiro, EyeTools, and Did-It conducted heat map
testing with search engine users that produced fascinating results related to
what users see and focus on when engaged in search activity. Figure 1-24 depicts
a heat map showing a test performed on Google; the graphic indicates that users
spent the most time focusing their eyes in the upper-left area, where
shading is the darkest. This has historically been referred to in search
marketing as the “Golden Triangle.” Note that while this research was done many
years ago, it still teaches us about how users react to search results.


Figure 1-24. Eye-tracking results, 2005

That said, the search landscape has changed dramatically since 2005 and has
become increasingly complex, evolving toward results that are media-rich and
mobile-centric. As a result, the appearance of the search results is nowhere
near as consistent as it used to be, which causes users to have a much less
consistent approach to how their eyes scan the SERPs. As shown in Figure 1-25,
more recent research performed by Nielsen Norman Group shows that users follow a
pinball-like path through the page.


Figure 1-25. How users view search results today (source: Nielsen Norman Group)

Even though richer content sprinkled throughout search results pages has altered
users’ eye-tracking and click patterns, the general dynamic of users viewing and
clicking upon listings at the top of search results most frequently, with each
subsequent listing receiving less attention and fewer clicks, has been supported
by multiple studies over time. These types of studies illustrate the importance
of the layout of the search engine results pages. And, as the eye-tracking
research demonstrates, as search results continue to evolve, users’ search and
engagement patterns will follow suit. There will be more items on the page for
searchers to focus on, more ways for searchers to remember and access the search
listings, and more interactive, location-based delivery methods and results
layouts—which will continue to change as search environments and platforms
continue to evolve.

NOTE
Over the past few years, Google has introduced
“continuous scroll” across mobile and desktop search, where listings are no
longer limited to 10 or so per page, but instead more listings are loaded as one
scrolls downward. It is not clear as yet what impact this change has had upon
searcher behavior.


The Role of AI and Machine Learning

By now you've got the idea that user
behavior is highly complex, and the challenge of meeting users’ varied needs
with a search engine is enormous. As the seoClarity data in Figure 1-2 showed,
nearly 80% of all search queries consist of 4 words or fewer. These short
phrases are all that a search engine gets in order to determine what results to
return in the SERPs. In addition, March 2021 data from SEO tool provider Ahrefs
shows that nearly 95% of all search queries are searched 10 or fewer times per
month. That does not provide search engines with a lot of prior history in
order to model what the user wants. Google deploys a large array of resources to
try to meet these challenges, including a growing number of machine learning
(ML) algorithms such as RankBrain, BERT, SpamBrain, and MUM (we’ll talk more
about some of these in later chapters). You can expect to see Google continuing
to roll out new ML algorithms—used to supplement the human-generated algorithms
that it has developed over the course of decades, which are also undergoing
continuous improvements—on an ongoing basis. This brings some unique challenges,
as the nature of how these algorithms work is opaque even to those who create
them. As a result, testing them and validating that they work as expected is
incredibly complex. We’ll return to many of these topics later in the book.

Using Generative AI for Content Generation

In late 2022, OpenAI released a
generative AI model called ChatGPT. ChatGPT quickly drew major media and
industry attention because of its ability to provide detailed natural language
responses to relatively complex queries. Soon after, both Microsoft and Google
announced the release of new conversational AI services. Bing Chat, which
leverages ChatGPT as its underlying technology, was released in February 2023,
and Google launched its generative AI solution, called Bard, in March 2023. At
around the same time OpenAI updated the GPT-3 platform (the underlying engine
that drives ChatGPT) to GPT-4, resulting in significant improvements in the
quality of ChatGPT results. ChatGPT can respond to queries such as:

• Write a 600-word article on the life of Albert Einstein.
• Create a detailed outline for an article on quantum mechanics.
• Suggest titles for eight articles on the American Depression.
• Read the following content and provide five potential title tags (then append the content to the prompt).
• Create the Schema.org markup required for the following content (then append the content to the prompt).
• Debug this block of Python code (then append the Python code to the prompt).
• Create a new Python script to do X (where X is a description of the task).

An initial review of the output for queries like these shows impressive results (we'll show some examples in Chapter 2, where we discuss generative AI further). We'll talk more about global concerns about the future of this technology in Chapters 2 and 15, but some of the additional issues specific to ChatGPT's (and Bing Chat's and Bard's) ability to help you create your own content are summarized here:

• Prone to including overtly incorrect information in its responses
• Prone to showing bias (towards ethnicity, gender, religion, nationality, and other areas)
• Frequently omits information that may be considered materially important in a response to the query
• Doesn't provide insights
• Biased toward being neutral (when you may want to take a position)

In March 2023, coauthor Eric Enge did a study designed to see which of the generative AI solutions performed the best. ChatGPT scored the highest for responses that were free of overt inaccuracies, returning fully accurate responses to 81.5% of the tested queries. Figure 1-26 shows more detail on these results. However, a separate test of 105 different queries exposed various limitations with types of queries that ChatGPT doesn't handle well, such as:

• Product pricing
• Sports scores
• Directions
• Your Money or Your Life (YMYL) topics
• Weather (and other real-time topics)
• News

This study also showed that Bing Chat and Google Bard have significant problems with accuracy and completeness, though these platforms don't share ChatGPT's problems with the other types of content listed above (such as product pricing, directions, and YMYL topics).


Figure 1-26. Generative AI study results

Despite these concerns, ChatGPT, Bing Chat, and Bard represent a huge step
forward in language generation, and their capabilities are impressive. As
mentioned at the beginning of this section, generative AI solutions can already
be used to provide responses to a huge range of queries, and as these algorithms
continue to be improved, their use will only grow. In recognition of this,
Google has updated its guidance on using AI for content generation. The blog
post “Google Search’s Guidance About AI-Generated Content” includes a section
called “How automation can create helpful content” that acknowledges that AI has
been used to help generate sports scores, weather forecasts, and transcripts,
and provides other guidance on when it’s OK to use AI to generate content. As
these algorithms evolve, AI will increasingly be able to produce better answers
and higher-quality content. We therefore encourage you to experiment with AI,
but to be careful to not get too far out in front of its capabilities. You still
need to have your own subject matter experts owning the responsibility for all
of your content (no matter how it’s initially generated) and your own brand
voice, positioning, and priorities.

SEO as a Career

The authors of this book have been in the SEO industry for a
long time and have founded and sold successful SEO consultancy businesses. In
the modern web-first world, though, it has become more common for companies to
have one or more full-time SEO employees, or at least employees whose
responsibilities include SEO.


SEO does not align perfectly with any traditional business divisions or
departments, but since its primary goal is to increase conversions (from
visitors into customers, clients, list subscribers, and so on), it is nominally
within the realm of marketing. You don’t need to have a strong background in
marketing to succeed in SEO; however, you must be able to communicate well with
designers, product managers, and marketing managers. They will explain the
organization’s goals and how their projects contribute to them, and you will
inform them of changes that must be made to enhance the site’s effectiveness.
Similarly, while the World Wide Web is inherently technical in nature, you don’t
need to be a software engineer or system administrator in order to be an
effective SEO. There are some technological concepts that you must understand at
a high level, such as the fact that web pages are files stored on web servers or
are dynamically generated by web applications and are rendered in web browsers
and (hopefully!) cataloged by search engines. You must be able to communicate
effectively with system administrators, web developers, IT managers, and
database administrators so that low-level technical issues can be resolved.
Optimizing for search engines also means optimizing for the people who visit
your site; your pages must appeal to visitors without sacrificing search engine
friendliness. You don’t have to be a web designer or UI engineer, but you do
need to be able to help people in those roles to understand the impact that
their decisions may have on search traffic, links from other sites, and
conversion rates. It is often the case that some piece of web technology is nice
from a user perspective, but unintentionally hinders or blocks search engines;
conversely, if you design a site only with search engines in mind, then visitors
won’t find it very appealing. Lastly, you will probably (or eventually) need to
have some skill in dealing with decision-makers. A successful SEO will have a
great deal of organizational influence without needing to have direct control
over site design or web hosting. Depending on your role and the company or
client you’re working for, you may have to collaborate with several
decision-makers, such as the business owner, CEO, VP of marketing, director of
engineering, or a product manager or IT manager. Some of these people may view
SEO as a one-off enhancement, like a rebranding or a style refresh. Regardless
of your role, your first task on any project is to make it clear that SEO is a
process, not an event; it requires a long-term commitment to monitoring a site’s
search rankings and indexing, and regular reviews of keyword lists and marketing
plans. If your boss or client is not versed in SEO best practices, they may
expect unrealistic (immediate) results. You must convince them that traffic
grows like a healthy child—gradually most of the time, but occasionally in
spurts—and will be the result of an accumulation of incremental improvements to
various projects and services associated with the website. Since it’s rare for
those to all be under one person's control, the
better you are at integrating with existing business projects, product
development processes, and budget planning sessions, the more success you’ll
have as a professional SEO. There is more information on this topic in Chapter
4.

Supplemental Skills

While not strictly required, SEOs will greatly benefit from
developing their expertise on the topics listed in this section. The Art of SEO
lightly covers them to varying degrees, but a deep dive on any one of them is
beyond the scope of this book, so we’ve included some suggestions for further
study:

Systems theory
Systems theory is a philosophical approach to analyzing organized structures (systems); it's at the heart of creative problem solving in any field, not just SEO. The internet is a complex system composed of many interconnected components. By developing your systems thinking skills, you'll gain a much better understanding of the impact of any changes you make to your site or server. The Systems Thinker is an excellent resource for learning more.

Data analysis
While the tools and services that SEOs use to analyze search traffic usually have excellent charts, graphs, tables, and dashboards, you may want to be able to format and filter the raw data in a unique way. Data analysis skills allow you to retrieve useful data and format it in such a way that you are able to discover valuable insights about your web pages and visitors. It also helps to be proficient with spreadsheets (pivot tables and so forth), as SEOs spend a lot of time working with large keyword lists in spreadsheet applications; the Excel Easy website is a good place to get started.

HTML, CSS, and XML
The more you know about markup languages and stylesheets, the better enabled you are to optimize web content on your own. Search engines can be stymied by badly formed or missing HTML elements, and you can potentially improve your rankings simply by ensuring that each page has proper tags and attributes. Also, Google is increasingly relying on structured data markup for its services and special features. W3Schools is a great resource for learning about HTML, CSS, and XML.

JavaScript
Many modern websites are dynamically generated for each visitor, based on logic written in JavaScript and data stored in JSON or XML format. Unfortunately, search engines have only a limited ability to execute JavaScript and often have difficulty indexing dynamic web pages. Most of the time you'll have to work with a full-time web developer to address these issues, but if you have some experience with JavaScript, you'll be better equipped to troubleshoot and diagnose them quickly. You can find tutorials, reference materials, and other resources at W3Schools.


Scripting
The ability to automate your most tedious and time-consuming tasks by
writing a script in Python, Bash, or Perl is an SEO superpower. For instance,
our late friend and colleague Hamlet Batista wrote a Python script that used
artificial intelligence to generate alt attributes for every image element on
every page of a website! On the simpler side, you could easily learn to write a
script to retrieve keyword data and add it to a Google Sheets document. To learn
more, check out Jean-Christophe Chouinard’s “Python for SEO”.
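
To make this concrete, here is a minimal sketch of the kind of script described above. It assumes a local CSV file of keyword data (keywords.csv, a hypothetical filename) and the third-party gspread library with a Google service-account key; your own file names, credentials, and sheet layout will differ.

    # Minimal sketch: push rows from a local keyword CSV into a Google Sheets document.
    # Assumes the gspread library and a service-account JSON key; filenames are hypothetical.
    import csv
    import gspread

    gc = gspread.service_account(filename="service_account.json")
    sheet = gc.open("SEO Keyword Research").sheet1

    with open("keywords.csv", newline="") as f:
        rows = [row for row in csv.reader(f)]  # e.g., keyword, monthly volume, difficulty

    sheet.append_rows(rows)  # appends each CSV row to the bottom of the sheet
    print(f"Uploaded {len(rows)} rows")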

Conclusion

Search is an integral part of the fabric of global society. The way
people learn, work, share, play, shop, research, socialize, and interact has
changed forever, and organizations, causes, brands, charities,
individuals—almost all entities—need to view their internet presence as a core
priority. Leveraging search engines and search functionality within all
platforms is essential to generate exposure and facilitate engagement. This book
will cover in detail how search, and therefore SEO, is at the center of the web
ecosystem, and thus how it can play a major role in your success within the
ever-evolving digital economy.


CHAPTER TWO

Generative AI and Search

Search engines, recommendation engines, and related
systems have been actively using artificial intelligence algorithms for decades,
but the landscape changed dramatically with OpenAI’s release of ChatGPT in
November 2022. ChatGPT opened up the possibility of an entirely different way
for humans to interact with machines to get answers to questions. This type of
AI technology is referred to as generative AI. This chapter will explore both
how search engines are applying this technology and what opportunities exist for
organizations to use generative AI as part of their SEO programs. The first few
sections offer some basic background on the current AI landscape to provide
context for the technology and help clarify its broader implications. If
you’d like to jump ahead to applications, you can skip to “Generative AI
Solutions” on page 39.

A Brief Overview of Artificial Intelligence

The concept of AI algorithms was
originally formulated in the 1950s, with Alan Turing’s “Computing Machinery and
Intelligence”, considered to be a cornerstone paper in the AI field. The term
artificial intelligence was coined by computer scientists John McCarthy, Marvin
Minsky, Nathaniel Rochester, and Claude E. Shannon, with early AI research and
development activity taking place at the Massachusetts Institute of Technology
and the University of Oxford. Most of what we see today in terms of AI consists
of neural networks and large language models (LLMs) based on very large neural
networks with many layers, referred to as deep learning models. Neural networks
and deep learning models are like enormously complex equations with many
parameters. These models can be trained to provide the desired output by using
training data to tune the weights assigned to each parameter (supervised
learning) or by running them against the raw data and having them learn the
patterns in the data themselves (unsupervised learning).
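
To illustrate the idea of tuning a weight from training data on a tiny scale, here is a toy sketch (not an actual deep learning model, and entirely invented for illustration): a one-parameter "model" fit with gradient descent so that its weight converges toward the pattern hidden in the training examples.

    # Toy illustration of supervised learning: tune a single weight w so that
    # prediction = w * x matches the training examples (true relationship: y = 3x).
    training_data = [(1, 3), (2, 6), (3, 9), (4, 12)]

    w = 0.0             # the model's single parameter, initially untrained
    learning_rate = 0.01

    for epoch in range(200):
        for x, y in training_data:
            prediction = w * x
            error = prediction - y            # how far off the model currently is
            w -= learning_rate * error * x    # nudge the weight to reduce the error

    print(round(w, 3))  # converges toward 3.0, the pattern in the data

A real neural network does the same kind of adjustment across millions or billions of weights at once, which is why training is so computationally expensive.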


These types of algorithms are designed to create tools that are very good at
analytical tasks. To understand what this means, it’s useful to step back and
consider what makes up human intelligence. One model, called the triarchic
theory of intelligence, was proposed by Robert J. Sternberg. This theory
suggests that there are three types of intelligence:

Componential intelligence
Associated with being gifted at analytical tasks. This is the type of intelligence that is most often tested.

Experiential intelligence
Associated with the ability to be creative and flexible in one's thinking.

Practical intelligence
Associated with the ability to apply a given idea or thought process to the context of a situation (i.e., does it make sense given what we're trying to accomplish?).

Currently, AI algorithms focus primarily on componential (or analytical) intelligence. We are also beginning to see some offerings, like AutoGPT, that can be used to execute complex tasks with many steps; these may represent the beginning of some level of practical intelligence.
Where these algorithms may go and how fast they will progress is not clear at
this time, but their development does give rise to significant risks that need
to be managed. We will delve more into this topic in “Generative AI Challenges”
on page 61.

More About Large Language Models

LLMs are essentially very large neural networks
trained to understand language. These networks may have billions of parameters
or more (OpenAI’s GPT-3 consists of 175 billion parameters; the size of the
GPT-4 network has not been confirmed, but industry speculation suggests that it
has more than 100 trillion parameters). They’re also incredibly complicated and
expensive to train: OpenAI’s CEO Sam Altman has said that training GPT-4 cost
more than $100 million. These vastly complex networks are required to maximize
the models’ ability to understand language to a high degree. Deep learning
models learn language based on the concept of completion models. As an example,
consider the following three-word phrase and fill in the blank: “See Spot
_______.” You may have decided that the third word should be “run”—it’s very
likely that a large number of people will pick that due to “see Spot run” being
a phrase made famous in a highly popular set of children’s books (plus a more
recent, not quite as successful movie). Whatever word you chose, you likely did
so because you considered it to
have the highest probability of being the most popular answer. This is
essentially the approach that LLMs take to learning language, but at a much
larger scale. Another key innovation that makes LLMs possible is the concept of
attention models. Since the training of LLMs involves a great deal of expense
and time (training the full model can take years), having to update the entire
model is a forbidding task. It’s also very difficult to train a model that has
to account for all possible scenarios across all languages. The advent of
attention models made all of this much more manageable because it enabled the
training to happen in much narrower contexts, a piece at a time. For example,
the model can be trained on a specific topic area, such as all the language
related to kayaking, or all the language related to playing golf. Such
innovations, and the sheer scale of these models, are what have made many of the
recent advances possible. In addition, adjusting for specific issues without
having to retrain the whole model becomes feasible. We’ll discuss this more in
the section on generative AI challenges. In their current form, LLMs may be
reaching the limits of their potential capabilities. In fact, Sam Altman
predicted that “the age of giant AI models is already over” and that new ideas
will be required to advance AI further from where it is now.
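
As a toy illustration of the completion-model idea described above (vastly simplified compared to a real LLM, which works on subword tokens with attention across billions of parameters), the sketch below counts which word most often follows a given two-word context in a tiny invented corpus and "completes" the phrase with the most probable continuation.

    # Toy completion model: predict the next word from counts of what followed
    # the same two-word context in a (tiny, invented) training corpus.
    from collections import Counter, defaultdict

    corpus = "see spot run . see spot run fast . see spot jump .".split()

    counts = defaultdict(Counter)
    for i in range(len(corpus) - 2):
        context = (corpus[i], corpus[i + 1])
        counts[context][corpus[i + 2]] += 1   # tally the word that followed this context

    def complete(word1, word2):
        followers = counts[(word1, word2)]
        return followers.most_common(1)[0][0] if followers else None

    print(complete("see", "spot"))  # "run" - the highest-probability continuation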

Generative AI Solutions

OpenAI was the first to offer a generative AI tool with
its release of ChatGPT in November 2022. Market response to this release was
immediate: ChatGPT was the fastest application to reach one hundred million
users in history, hitting this milestone in just two months. Inherent in its
exponential growth was the compelling value proposition provided: ChatGPT
enables users to converse with it in a meaningful way. This includes the option
to pose complex multipart questions (prompts) and have it respond with complex,
long-form responses. In addition to this question-and-answer functionality,
ChatGPT has many other capabilities that include (but are not limited to):

• Writing and debugging code
• Explaining complex topics
• Creating Schema.org markup
• Writing music
• Writing Excel macros
• Translating text
• Creating article outlines
• Summarizing content
• Writing drafts of content

These capabilities are not yet available in traditional search engines, yet they are in high demand by users.


WARNING
All generative AI solutions are prone to making mistakes. As a result,
any and all outputs they provide should be validated (and corrected) before you
rely on them. As a cautionary example, CNET was quick to try publishing content
created by an AI tool, but after examining 71 of the long-form articles the tool
produced, it found that 41 of them contained mistakes. Coauthor Eric Enge also
conducted a small study of 30 queries with ChatGPT and found that 20% of the
content produced contained overtly incorrect information and over 50% of the
content had material omissions. He also tested two versions of Bing Chat, which
performed similarly to ChatGPT, and Google Bard, which performed somewhat worse.

Bing announced its solution, which is based on ChatGPT and called Bing Chat, in
February 2023. This created a great deal of excitement due to its integration
into the Bing search engine. As an example, if you search on how do you make loc
lac (a popular Cambodian dish), you get a standard search result, as shown in
Figure 2-1.

Figure 2-1. Bing search result

However, under the search query you will see an option called CHAT. Clicking
that brings you into the Bing Chat experience, as shown in Figure 2-2.


Figure 2-2. Bing Chat search result

This integration offers an elegant way to switch between the traditional search
experience and Bing Chat. Eric Enge did an interview with Bing’s Fabrice Canel
where they discussed the thinking that went into how this integration should
work. One of the key insights was that searchers are typically in one of two
modes. Figure 2-3 shows those two modes and some sample queries.

Figure 2-3. Bing’s theory on two modes of search

Bing positions traditional search engine models as being built to handle queries
where the user knows exactly what they want. Examples of these types of queries
are navigational queries and straightforward informational queries. However,
many search


sessions involve multiple query iterations, where the user works through a
process to finally get what they want. We shared examples of such complex
sessions in Chapter 1, in Figures 1-11 and 1-13. The interview with Canel also
sheds light on how the underlying infrastructure for the Bing Chat integration
is structured. Figure 2-4 shows what Bing calls the Prometheus model.

Figure 2-4. Bing’s Prometheus model

In this model, the Bing Orchestrator manages much of the interaction between
Bing Search and Bing Chat. However, the real innovation is Bing’s backend,
powered by the LLM. Here is what Canel had to say about it:

When we had a breakthrough in large language models (LLMs), much like other LLMs, it was trained with data through a given point in time, so we thought that we could make the user experience richer, more relevant, and more accurate by combining it with the power of Bing's back-end. To be more concrete, we developed a proprietary technology called Prometheus, a first-of-its-kind AI model that combines the fresh and comprehensive Bing index, ranking, and answers results with the creative reasoning capabilities of most-advanced models.

Bing Chat benefits because the Prometheus model can use data from the Bing index to fact-check responses it generates and reduce inaccuracies. Bing refers to this as grounding.


NOTE
ChatGPT did not have this ability at launch, but a Bing Search integration
was added through a deal with Microsoft announced in May 2023.

One other important aspect to note about Bing’s implementation is that, unlike
ChatGPT, Bing Chat provides attribution to all the sources used in generating an
answer. Here is Canel’s commentary on that: Our goal is to satisfy our Bing
users and drive more traffic to publishers in this new world of search. It is a
top goal for us, and we measure success partly by how much traffic we send from
the new Bing/Edge. Unsurprisingly, Google was also quick to respond to the
release of ChatGPT. In February 2023, Google CEO Sundar Pichai published a post
titled “An Important Next Step on Our AI Journey” that announced the release of
Google’s own ChatGPT-like product, called Bard. It quickly became evident that
Google’s Bard, released the following month, was not as capable as ChatGPT or
Bing Chat, but it was a foot in the game. Unlike Bing Chat, Bard was launched
fully separately from the search experience. It also fails to provide much in
the way of attribution to the third-party websites that were the sources of
information used in generating its responses, an aspect that many were quick to
criticize. Google took the next step at its annual I/O conference on May 10,
2023, announcing plans for what it calls the Search Generative Experience (SGE).
This is Google’s first step toward integrating generative AI directly into its
search results. Per Google’s announcement, SGE will enable users to: • Ask
“entirely new types of questions that you never thought Search could answer.” •
Quickly get the lay of the land on a topic, with links to relevant results to
explore further. • Ask follow-up questions naturally in a new conversational
mode. • Get more done easily, like generating creative ideas and drafts right in
Search. In principle, this sounds quite similar to the higher-level goals
expressed by Bing, but with some specific Google approaches to it. Google also
provided a preview of what this experience might look like, as shown in Figure
2-5.


Figure 2-5. An early preview of generative AI content in Google SERPs

Note the disclaimer at the top left that says, “Generative AI is experimental.”
This caution shows in other aspects of Google’s announcement as well, and the
potential limitations were discussed in more depth in a PDF that Google released
following the announcement. This document also discusses several other aspects
of Google’s approach to SGE. Some of the additional key points include: • SGE is
separate from Bard. They share a common foundation, but Bard is designed to be a
standalone tool, and SGE is not. For that reason, there may be some differences
between them. • SGE is fully integrated with other experiences, such as shopping
and local search. • Creative use of SGE is encouraged, but with a disclaimer:
“While SGE is adept at both informational and creative applications, users will
notice constraints on creative uses to start, as we’ve intentionally placed a
greater emphasis on safety and quality.” • Google is planning to use color to
help differentiate aspects of the AI-powered results, noting that this color may
change based on journey types and user intent. • Google plans to use human
evaluators to review results and accelerate the process of eliminating errors
and weaknesses, noting that they will focus on areas more susceptible to quality
risks.

44

CHAPTER TWO: GENERATIVE AI AND SEARCH

• Google plans to actively use adversarial testing methods to expose other areas
of weakness. • As with search, Google will hold SGE to an even higher standard
with respect to YMYL queries. This includes adding specific disclaimers to these
types of results to advise users to validate the responses with information from
other sources. • In areas where reliable data is scarce (Google calls these data
voids), SGE will decline to provide a response. • SGE will avoid providing
generated responses to queries indicating vulnerable situations, such as those
related to self-harm. • Google plans to be very careful with topics that are
explicit, hateful, violent, or contradictory of consensus on public interest
topics, for example. Also of interest are Google’s observations of behavior that
can cause users to trust the results more than they should. To minimize this
risk, Google has made two intentional choices:

No fluidity
SGE will not behave as a free-flowing brainstorming partner, but will instead be dry and factual, providing links to resources the user can consult for additional information.

No persona
SGE will not respond in the first person, and the results are designed to be objective and neutral.

As you can see, Google has taken great
pains to be careful with how it uses generative AI. As we’ll discuss in
“Generative AI Challenges” on page 61, these technologies have many issues with
accuracy and completeness. Google is showing great concern about how the
information it provides could potentially be misused or cause harm. It is our
belief that this is a substantial part of the reason why Google was behind Bing
in rolling out its own solution—it may simply have held back.

Generative AI Capabilities

Generative AI can be used in many different ways to
support SEO and SEO-related tasks. In this section we’ll share some of those
ways, including sample prompts and results. Please note that the examples shown
in this section all used ChatGPT-4, but most (if not all) of them will work with
Bing Chat or Google’s SGE. As you will see, they span a variety of areas, from
coding to keyword research to content creation:

1. Write code. For a simple
example, try the prompt: “Please write the Python code that will write ‘Hello
World!’ on my PC screen.” Figure 2-6 shows the results.


Figure 2-6. A Python script to print out “Hello World!”
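
For readers who cannot see the figure, the returned script amounts to a single line (the exact response will vary between ChatGPT sessions):

    # A typical response to the prompt: the simplest possible Python "Hello World!" script
    print("Hello World!")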

This is obviously a very simple script, but ChatGPT is pretty good at handling
more complex code creation, including accessing APIs, difficult math problems,
and a lot more. Try it out and see how it works for you, but ensure that you
have an experienced developer who knows the goals of the program using it as a
starting place for their work. Warning: don't use ChatGPT to replace your developer!

2. Debug code. Have a piece of code that's not working? Copy and
paste it into ChatGPT to check it for you. Try a prompt along these lines:
“Please review the following code for syntax errors.” This one can have a
surprising amount of value, as one of the least enjoyable parts of programming
is spending an hour trying to find a problem in code you’re working on, only to
find that it’s a basic syntax error. These types of mistakes are quite common!
3. Determine keyword intent for a group of keywords. Provide a list of keywords
and use a prompt similar to this: “classify the following keyword list in groups
based on their search intent, whether commercial, transactional or
informational.” Figure 2-7 shows an example of the prompt, and Figure 2-8 shows
the results provided by ChatGPT. Note that this result has some errors in it.
Notably, the interpretation of the word “navigational” wasn’t quite what was
expected. “Boston Celtics website” is clearly a navigational query, and “Boston
Celtics Gear” and “Boston Celtics Parking” are clearly not. This illustrates a
limitation of generative AI, but such an issue can be easily fixed by adding to
your prompt clear definitions of what you mean by “informational,”
“transactional,” and “navigational.”


Figure 2-7. A prompt for classifying keywords

Figure 2-8. ChatGPT results for classifying keywords
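
If you want to run this kind of classification at scale rather than pasting keywords into the chat interface, the same prompt can be sent through the API. The sketch below is an assumption-laden example: it uses the openai Python package (v1-style client, reading OPENAI_API_KEY from the environment), and it bakes explicit intent definitions into the prompt, which is the fix suggested above for the "navigational" confusion. The keyword list and model name are illustrative placeholders.

    # Sketch: classify keyword intent via the OpenAI API, with explicit intent definitions
    # included in the prompt (keywords and model name are illustrative placeholders).
    from openai import OpenAI

    client = OpenAI()  # expects OPENAI_API_KEY in the environment

    keywords = ["boston celtics website", "boston celtics gear", "boston celtics parking"]

    prompt = (
        "Classify each keyword as informational (seeking knowledge), "
        "transactional (intending to buy or complete an action), or "
        "navigational (trying to reach a specific website or page).\n\n"
        + "\n".join(keywords)
    )

    response = client.chat.completions.create(
        model="gpt-4",
        messages=[{"role": "user", "content": prompt}],
    )

    print(response.choices[0].message.content)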


4. Group keywords by relevance. Provide a list of keywords, preceded by a
prompt, similar to the following: “Cluster the following keywords into groups
based on their semantic relevance.” Figure 2-9 shows the prompt plus a sample
word list, and Figure 2-10 shows the results.

Figure 2-9. A prompt for classifying keywords

Figure 2-10. Keywords clustered by relevance by ChatGPT

5. Translate text. You might want to do this with a list of keywords. Try a
prompt along these lines: “Translate the following keywords from English to
French and generate the results in a table with two columns, with the keywords
in English in the first one, and their translation to French in the second.”


Figure 2-11. A prompt for translating keywords

Figure 2-12. Keywords translated by ChatGPT (partial list)


6. Generate a list of questions on a topic. Here is a sample prompt for content
ideas related to making furniture: “Generate a list of 10 popular questions
related to building furniture that are relevant for new woodworkers.” Figure
2-13 shows the response to this prompt.

Figure 2-13. Relevant questions for new woodworkers generated by ChatGPT

7. Generate article outlines. We tested the following prompt: “Please generate
an article outline about European riverboat cruises.” Figure 2-14 shows a
portion of the results returned by ChatGPT.


Figure 2-14. A portion of an article outline provided by ChatGPT

8. Create potential title tags. Try this sample prompt: “Generate 10 unique
title tags, of a maximum of 60 characters, for the following text. They should
be descriptive and include the term ‘fine wines’ in them.” Follow this with the
text you want analyzed. Figure 2-15 shows the sample text provided to ChatGPT
for analysis, and Figure 2-16 shows the response.


Figure 2-15. Text used in ChatGPT title tag generation example

Figure 2-16. Example title tag suggestions from ChatGPT

Note that these suggested title tags do not leverage industry keyword research
tools (like those discussed in Chapter 6) and the volume information that they
provide, so they will need to be reviewed against that type of data before
finalizing your choices. Nonetheless, using ChatGPT in this way can be helpful
for the purposes of idea generation. You can then feed some of the key phrases
it provides to your keyword research tools to get data on search volumes.

9. Generate meta descriptions. Provide some text you want to analyze, and follow
that with a prompt similar to this: “Generate 6 unique meta descriptions, of a
maximum of 150 characters, for the following text. They should be catchy with a
call to action, including the term ‘fine wines’ in them.” Figure 2-15 shows the
sample text provided to ChatGPT for analysis, and Figure 2-17 shows the
response.


Figure 2-17. Example meta description suggestions from ChatGPT
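
Because the prompts above specify hard length limits (60 characters for title tags, 150 for meta descriptions) and generative AI tools do not always respect them, a quick check before you use the suggestions can save rework. A minimal sketch follows; the example strings are invented.

    # Sketch: flag AI-suggested title tags and meta descriptions that exceed the limits
    # requested in the prompt (60 and 150 characters respectively).
    def check_lengths(suggestions, max_length):
        for text in suggestions:
            status = "OK" if len(text) <= max_length else f"TOO LONG ({len(text)} chars)"
            print(f"{status}: {text}")

    titles = ["Discover Rare Fine Wines from Family Vineyards"]           # limit: 60
    descriptions = ["Explore our cellar of fine wines and order today."]  # limit: 150

    check_lengths(titles, 60)
    check_lengths(descriptions, 150)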

As with any other content created by a generative AI tool, make sure to review
the output before publishing the content and/or putting it into production. You
can use these ideas as fuel for your brainstorming process, and then craft your
own descriptions that best meet your needs.

10. Generate text descriptions. You
can also get pretty solid content created for you, if all of the material facts
are extracted from a trusted database. For example, if you have a chain of pizza
shops, you might want to have a page for each one on your website. If you have a
database of information about each location storing details such as address,
opening hours, and some of the food they offer, then this can work to scale. As
part of the process, provide the data for each location and a sample of the
desired output. For example, you might provide a prompt such as: “Rephrase the
following paragraph for three different locations, but avoid repetition, while
keeping its meaning. Use the following information for each location.” You can
see how this works by looking at Figure 2-18 (which shows the sample prompt we
used, including the sample data for each location), the template text provided
as part of the prompt (reproduced after the figure), and Figure 2-19. Note that Figure 2-19 shows only one of the
outputs generated from this prompt; the other two locations had similar results,
albeit with very different wording used.


Figure 2-18. Prompt for text generation driven by a database

Here is the draft text blurb that was provided:

Joe's Pizza shop offers some of
the finest pizzas available anywhere in the United States. We use only natural
ingredients from local farms in all of our pies. Popular pizzas in our stores
include: Cheese, Pepperoni, Pineapple, and meat lovers. We always strive to be
price competitive, but it’s the quality of our pizzas and services that matters
most to our customers. We’ll treat you like a long-lost friend when you come to
visit us. We look forward to seeing you soon!

Figure 2-19. Output from the content generation from a database example
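
A sketch of how this can be scaled: store each location's facts as structured data and build one rephrasing prompt per location, each of which is then sent to your generative AI tool. The location records and template below are invented placeholders, not the data used in the figures.

    # Sketch: build one rephrasing prompt per location from structured data.
    # Location records and the template blurb are illustrative placeholders.
    locations = [
        {"city": "Springfield", "address": "12 Main St", "hours": "11am-10pm"},
        {"city": "Riverton", "address": "48 Oak Ave", "hours": "12pm-11pm"},
    ]

    template_blurb = "Joe's Pizza shop offers some of the finest pizzas available anywhere..."

    for loc in locations:
        prompt = (
            "Rephrase the following paragraph for this location, but avoid repetition "
            "while keeping its meaning.\n"
            f"Location: {loc['city']}, {loc['address']}, open {loc['hours']}.\n\n"
            + template_blurb
        )
        # Each prompt would then be sent to your generative AI tool of choice.
        print(prompt[:80], "...")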

11. Generate a list of FAQs for a block of content. Your prompt might be
something along these lines: “Generate a list of X frequently asked questions
based on the following content.” Figure 2-20 shows the content block used in our
example, and Figure 2-21 shows the output from ChatGPT.


Figure 2-20. Sample text used to generate text with ChatGPT

Figure 2-21. Suggested FAQs from ChatGPT


12. Find and source statistics. To test this out, we tried the following prompt:
“Generate a list of the top five facts and statistics related to the battle of
Stiklestad, including their source.” Figure 2-22 shows what we got back.

Figure 2-22. Facts and stats sourced from ChatGPT

This is a great way to use generative AI to speed up the process of enriching
content while providing proper attribution to sources.

13. Summarize content.
Sometimes it’s useful to get shorter summaries of content you’ve written, and
generative AI can help here too. A sample prompt you can use is “Generate a 15-
to 20-word summary of the following content.” Figure 2-23 shows the text we used
in testing this prompt, and Figure 2-24 shows the results that ChatGPT provided.
Note that the response here is 33 words long, a bit more than we were looking
for.


Figure 2-23. Sample text used to generate a summary from ChatGPT

Figure 2-24. Summary generated by ChatGPT

14. Create Schema.org markup for one or more Q&A questions. To illustrate this,
we used the prompt and example Q&A shown in Figure 2-25, and we received the
response shown in Figure 2-26.


Figure 2-25. Example Q&A prompt for ChatGPT

Figure 2-26. ChatGPT Schema.org markup for Q&A
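
Since the figure is an image, here is a rough idea of the shape of the markup a request like this produces, expressed as a Python dictionary serialized to JSON-LD. This sketch uses the FAQPage type with a single invented question; the markup in the figure may differ in type and detail.

    # Sketch of FAQPage-style Schema.org markup for a question/answer pair,
    # built as a Python dict and serialized to JSON-LD (question text is invented).
    import json

    markup = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": "How long should a bottle of red wine breathe before serving?",
                "acceptedAnswer": {
                    "@type": "Answer",
                    "text": "Most young reds benefit from 30 to 60 minutes of aeration.",
                },
            }
        ],
    }

    print(json.dumps(markup, indent=2))  # paste the output into a script tag of type application/ld+json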

Note that you can provide a block of text as well, and ChatGPT will identify
what it thinks good question and answer candidates are out of the text. This
also works pretty well.

15. Generate hreflang tags. Figure 2-27 shows the full
prompt we used, and Figure 2-28 shows the response provided.


Figure 2-27. Sample prompt to generate hreflang tags with ChatGPT

Figure 2-28. Sample hreflang tags produced by ChatGPT
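
If you would rather generate these tags deterministically than rely on a language model, the logic is simple string formatting. A minimal sketch follows; the domain and locales are placeholders.

    # Sketch: generate hreflang link tags for alternate-language versions of a page.
    # The locale-to-URL mapping is a hypothetical example.
    alternates = {
        "en-us": "https://www.example.com/",
        "en-gb": "https://www.example.com/uk/",
        "fr-fr": "https://www.example.com/fr/",
    }

    for lang, url in alternates.items():
        print(f'<link rel="alternate" hreflang="{lang}" href="{url}" />')

    # Google also recommends an x-default entry for unmatched languages:
    print(f'<link rel="alternate" hreflang="x-default" href="{alternates["en-us"]}" />')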

16. Generate .htaccess code. To illustrate this, we used a fairly simple
example, as shown in Figure 2-29, and got the results shown in Figure 2-30.

Figure 2-29. Sample prompt to generate .htaccess code


Figure 2-30. Sample .htaccess code produced by ChatGPT
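
As with the hreflang example, redirect rules of this kind can also be produced with a few lines of scripting once you have a mapping of old to new URLs. Below is a minimal sketch that emits Apache mod_rewrite-style 301 redirects; the paths are placeholders, and any generated rules should be reviewed and tested in a staging environment before deployment.

    # Sketch: emit .htaccess 301 redirect rules from a mapping of old paths to new paths.
    # Paths are illustrative; review and test any generated rules before deploying them.
    redirects = {
        "old-services.html": "/services/",
        "2019-pricing.html": "/pricing/",
    }

    lines = ["RewriteEngine On"]
    for old_path, new_path in redirects.items():
        lines.append(f"RewriteRule ^{old_path}$ {new_path} [R=301,L]")

    print("\n".join(lines))  # copy the output into your .htaccess file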

There are many more use cases for generative AI than the 16 described in this
section. One broad area to explore is how to optimize internal business
processes. For example, maybe your preferred generative AI tool doesn’t work so
well at creating article outlines, as shown in use case #7—but that doesn’t mean
you can’t use it to help with this task. Consider using the tool to create three
different outlines on the topic, then give all of those outlines to your subject
matter expert, who will end up writing the article. Have them create their own
outline, leveraging the AI-generated outlines to speed up their development
efforts. These techniques can be surprisingly effective, as often the hardest
part of creating new content is coming up with the initial concept and deciding
how you want it structured. This can be an excellent way to brainstorm new
content ideas and improve on current ones. Similar approaches can work well with
writing code. Have your generative AI tool create some draft code, and have your
developer use it as a starting point. While coding is not specifically an SEO
task, we offer up this suggestion because anything you can do to increase
operational efficiency is a good thing.

Prompt Generation (a.k.a. Prompt Engineering)

In the preceding section we shared
many examples of prompt/response interactions with ChatGPT. As noted at the
beginning of that section, most of these will also work with Bing Chat or SGE.
You can use those as starting points in your own efforts, but it’s also useful
to understand a general approach to creating successful prompts. Here are a few guidelines:

1. Be clear and direct about what you want the generative AI tool to do.

2. Provide an example of the type of output you're looking for, using a very similar topic. For example, if you're looking to get a brief description of the history of each of the teams in the NBA, provide an example of the output for one of the teams and ask the tool to use it as a template.

3. Define your target audience, including whatever criteria make sense for your purpose (age, gender, financial status, beliefs, etc.). The tools will adapt their output accordingly.

4. Try something and see if it works. If not, modify it and try again. You can re-enter your entire prompt with modifications, or chain prompts together, as discussed in the next point.

5. When you get output that doesn't meet your needs, tweak your prompt by adding a supplemental instruction in a follow-on prompt. For example, if you get a response that is partially correct, provide some corrective input such as: "Thanks for your response, but instead of focusing on the male lead in the play, please focus on the female lead." Chaining prompts together in this way is often an easier way to refine the output than typing in the whole prompt again, with modifications.

6. Describe how you want your output formatted. Do you want paragraphs? Bulleted lists? A two- or three-column table? The more specific you are, the better.

7. Provide context instructing the tool to "act as if." For example, you could start your prompt with "ChatGPT, you're an experienced B2B marketer who deeply understands technical SEO."

8. Expanding on point 7, the general idea is to provide the generative AI tool with as much context as you can. This can include who the response is coming from (as per point 7), who the intended audience is (with as many specifics as you can provide), the desired style of writing, whether you want an expansive or narrowly scoped response, and more. These types of prompts are referred to as super prompts.
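
Pulling several of these guidelines together (persona, audience, and output format from points 3, 6, and 7), here is a sketch of how a super prompt might be assembled in a script before being pasted into ChatGPT or sent through an API. All of the specifics are illustrative.

    # Sketch: assemble a context-rich "super prompt" from its component instructions.
    # Persona, audience, task, and format below are illustrative placeholders.
    persona = "You are an experienced B2B marketer who deeply understands technical SEO."
    audience = "The reader is a CTO at a mid-sized SaaS company with limited SEO knowledge."
    task = "Write a 300-word explanation of why site speed matters for organic rankings."
    output_format = "Respond with three short paragraphs followed by a bulleted action list."

    super_prompt = "\n".join([persona, audience, task, output_format])
    print(super_prompt)  # paste into your generative AI tool, or send via its API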

Generative AI Challenges

As artificial intelligence continues to be deployed at
scale within search and throughout our global data infrastructures, its rapid
adoption and significant impact to society are driving urgent questions about
the trustworthiness, challenges, and risks of AI, and the data upon which it
relies. Conversations about AI within the realm of law, policy, design,
professional standards, and other areas of technology governance are shifting
rapidly toward the concept of responsible technology design and focusing on the
need for values, principles, and


ethics to be considered in AI deployment. A 2020 report released by the Berkman
Klein Center for Internet & Society at Harvard University describes eight key
“themes” gleaned from an analysis of selected prominent AI principles documents:
• Privacy

• Fairness and nondiscrimination

• Accountability

• Human control of technology

• Safety and security

• Professional responsibility

• Transparency and explainability

• Promotion of human values

While a thorough treatment of AI challenges in the context of search and general
information discovery is beyond the scope of this book, it is necessary to
acknowledge that these issues do exist, and will need to be addressed, in the
context of search and the various ways users seek and obtain information.
Exactly how (and how effectively) Google and others will approach these many
challenges remains to be seen—and will involve navigating ethical, legal,
regulatory, commercial, and societal concerns. As we outlined in “Generative AI
Solutions” on page 39, Google has made clear that it is aware of these concerns
and is taking steps to address them in SGE, such as by prioritizing safety and
quality in its responses, using human reviewers, avoiding providing generated
responses for queries that indicate the user is in a vulnerable situation, and
taking care to avoid including explicit, hateful, or violent content in its
responses. As we also discussed in that section, ChatGPT, Bing Chat, and Google
Bard all have a propensity for making mistakes (referred to as “hallucinations”
in the AI industry). As previously mentioned, CNET found that over half of the
AI-written long-form articles it published contained such hallucinations, while
research by Eric Enge confirmed that hallucinations (as well as omissions of
important information) occurred frequently even in short responses. Currently,
errors and omissions represent a critical area of risk with respect to using
generative AI for content production, to such a degree that the authors strongly
advise against using generative AI to create the content for your website unless
you have a subject matter expert review and fact-check the output and correct
identified inaccuracies and omissions. You don’t want your brand to be damaged
due to publishing poorly written, inaccurate, incomplete, or scraped content.
Additional areas of risk for your organization when utilizing AI may include,
but are not limited to:


Bias
The data sources used by generative AI tools may have a clear bias, and this will come through in the content the tools create. You may not want your content to present an unfair bias, as this might not reflect well on your brand.

Copyright issues
Generative AI takes content it finds on other websites and uses it to create its own answers to your queries. Basically, it's leveraging the work of others, and even with proper attribution, it may in some cases push the limits of "fair use." This issue may become the subject of lawsuits at some point in the future, perhaps brought by major news organizations. Best not to get caught up in the middle of that!

Liability issues As we’ve detailed in this chapter, generative AI solutions are
inherently prone to errors, inaccuracies, and biases. At this point in time, it
is plausible that there are risks involved when people follow advice from an
AI-reliant system and doing so results in harm to them (or to others). These
liability implications are not trivial. Obvious topic areas where this could
happen include health and financial information, but any topic area where there
is potential for harm is in play. This list merely scratches the surface of
potential areas of risk and liability when dealing with AI in your organization.
To design and deploy responsible AI technologies, the appropriate investigation
and research is necessary based on your specific organization’s platform,
business model, data access and usage, vertical, geographic region, user base,
and a myriad of other factors. What is a risk factor for one organization may
not be a risk factor for another, and some verticals, such as medtech, have
stricter regulatory compliance requirements than others. If you are utilizing
large datasets and deploying AI in your technology offerings, it would benefit
your organization to consult with relevant experts and advisors to address the
issues specific to your implementations.

Conclusion A common question in our industry these days is, “Will generative AI
spell the end of SEO?” We’ll start by addressing this in the simplest way
possible: no, generative AI will not bring about the end of SEO. Search engines,
including those that leverage generative AI, need third parties to create and
publish content on the web. In addition, they need those who publish content to
permit them to crawl it. Without this content being made available to search
engines (including their generative AI tools), the engines will have nothing to
show to their users.

On a more operational level, there remain many questions about how search
engines will integrate generative AI into their services. We can see what Bing
and Google are offering now, but it’s likely that things will evolve rapidly
over the next few years. For example, we may see changes in the user experience
design and implementation of new techniques to support the tools, such as new
Schema.org types. In addition, the issues that we discussed in “Generative AI
Challenges” on page 61 will become a bigger concern if the search engines
leverage the content of others without sharing some benefit (such as traffic)
back to the publisher sites. Sharing that benefit is therefore in the search
engines' interest, and this is reflected in how Bing and Google are implementing
generative AI in search.
All of these factors imply a need for SEO. In the US, there is also an issue
pertaining to Section 230 of Title 47 of the United States Code, which provides
immunity to online publishers of third-party content for harms stemming from
that third-party material. A notable question is, when a search engine generates
“new” (i.e., original) content in the form of answers/responses as a result of
its proprietary generative AI and index-based processes (drawing upon
third-party website content), does the search engine retain Section 230
immunity? Another important question is how the design, deployment, and
management of generative AI tools will evolve to address valid concerns about
errors and omissions. Fixing the accuracy issues in the output of these tools is
not an easy task, as these flaws are often inherent in the models that are
designed to incorporate data scraped from the web. The identification of errors
and omissions will be an ongoing task, and so far, it is a human task—machines
left to their own devices are unable to fact-check themselves. And in this
context, we’re not talking about hundreds, thousands, or millions of issues that
need to be “patched,” we’re likely talking about billions of issues (or more).
And speaking more broadly, consider the very real looming issue of what happens
as AI models increasingly train on synthetic (AI-generated) data, as examined in
the 2023 paper “The Curse of Recursion: Training on Generated Data Makes Models
Forget”. Consider also Sam Altman’s proclamation that “the age of giant AI
models is already over” and that “future strides in artificial intelligence will
require new ideas.” When will these new ideas emerge, and how will they impact
the systems, platforms, and processes we are accustomed to today? At this stage,
we expect new ideas to come thick and fast—but what we cannot predict is their
impact. Nor can we predict how burgeoning regulatory examination of various
AI-related issues may impact the deployment of, and access to, AI-based systems.
What we do know is that at this stage, there are many ways that generative AI
can be used constructively to help grow your organization, and there are an
increasing number of resources available to minimize your risk in taking the AI
leap.

CHAPTER THREE

Search Fundamentals

Search has become a fundamental aspect of how we find
information, identify solutions to problems, and accomplish an enormous range of
tasks, from making purchases to booking travel. Without search engines—whether
standalone like Google or platform-specific like LinkedIn Search—we would be
unable to quickly find what we need and execute the tasks we seek to accomplish.
We often don’t know where to find an answer or solution, or whether one exists;
and even when we know where something is located, it can still be more efficient
to use a search engine to retrieve it. For instance, consider the URL
https://blogs.example.com/archive/articles/2019/dogs/canine-dentistry-advice.html.
This web page may have information you’re looking for pertaining to canine
dentistry. However, you likely have no idea that this blog exists, let alone
where to find this specific page within the blog, given the complexity of the
URL structure. Additionally, given its age (2019) it’s likely that this article
will be archived, which means it won’t necessarily be on the blog’s home page—to
find it you’d have to drill down at least three levels, and you might have to
sift through dozens or hundreds of articles in the resulting list. Conversely,
consider the few keystrokes required to go to Google and perform a search for
canine dentistry advice. Assuming that our example page is in Google’s index,
and that Google considers it a relevant result for you when you perform the
search, it will be much more efficient to search for this topic than to type the
exact URL into your browser’s address bar, or to navigate to the page through
the blog’s archives. Thus, Google is not solely a provider of potential answers;
it is also the interface to the internet and a method of navigating websites,
many of which may be poorly designed. In essence, search engines enable us to
connect efficiently to information, people, and actionable online activities.
They don’t just help us “find things”; they connect people to their interests by
way of sites, pages, and services.

The internet, including the people and things that connect to it, is a system
comprising purposeful connections among disparate components and resources. As
an SEO practitioner within this system, you can influence the interconnectedness
of these various components to present quality, relevant, authoritative,
trustworthy, and useful content, and you can make your website easy to navigate
and understand for both users and search engines. To be an SEO professional is
to be a master of organization and connections—between searches and conversions;
customers and products; web content and search engines, users, and influencers;
and even between you and the people and organizations responsible for
developing, maintaining, and promoting the site you’re optimizing. You also need
to develop a comprehensive understanding of the subject matter of your website,
what user needs relate to your market area, and how users typically communicate
those needs, which is expressed in the search queries they use. This will serve
as a guide to what, and how much, content you need to create to be successful in
SEO. For those of you seasoned in SEO, or for those who simply have high digital
literacy, some of the material in this chapter may seem obvious or elementary at
a glance, while to others it may seem arcane and obscure. However, context is
essential in SEO, and understanding the digital ecosystem, including the people
who connect to and through it, will help you understand how to present your site
as a valuable, authoritative, and trustworthy resource on the web for your topic
area.

Deconstructing Search

While search technology has grown at a rapid pace, the
fundamentals of search remain unchanged: a user performs a search (referred to
as a query) using a search engine, and the search engine generates a list of
relevant results from its index and presents the results to the user on what is
referred to as a search engine results page (SERP). We can therefore divide the
user search process into three fundamental components: the query, the index, and
the results. Within each area there exist vast realms of information and
knowledge that the SEO professional must possess. As a starting point, it is
essential to begin to think of SEO in the following terms: search queries map to
topics, which map to keywords and proposed articles; a search engine’s index
maps to all aspects of technical SEO; and a given set of search results maps to
content creation, content marketing, and user behavior. Let’s start by analyzing
the search query, the human interface to a search engine.

The Language of Search

Keywords are the common language between humans and
search engines. On a basic level, a search engine seeks to understand the intent
of the user’s search query, and then identify web pages (or other forms of web
content) that best meet that intent. This is done by a variety of techniques,
including natural language processing (NLP). In the old days of SEO, keywords
were mostly individual words and combinations of words (phrases), minus stop
words (short words like to, of, and the). Currently, keywords (a general term
for a group of related words that will be used in a search query) are often
interpreted less literally by search engines as they seek to understand the
intent of the user by considering the meaning of language beyond exact word
matches, while utilizing various additional information about the user to help
inform the process. Google’s most advanced search system (as of 2022), named
MUM, is a multimodal training system comprising various algorithms that is
supposedly one thousand times as powerful as its predecessor, BERT. MUM
represents Google’s evolution toward understanding information across languages
and content types while utilizing machine learning to enable its systems to
learn as searches are performed, in order to better understand the user’s intent
and to identify the most relevant information available on the web to return to
that user within their search results. As users, we tend to simplify our
language when performing searches, in hopes that the search engines will be able
to understand us better. For instance, if you wanted to know how many Formula
One world championships the Williams Racing team has won, you might use a search
query like this: williams F1 "world championships"

While you wouldn’t use this language to ask a person about this, you have
learned that search engines understand this language perfectly. As you will see
in Figure 3-1, Google provides a direct answer for the query (in the form of a
OneBox result, defined in “Special Features” on page 77) because it has high
confidence in its interpretation of your intent. For simple queries, however,
search engines have evolved to the point that we can usually communicate with
them in a more natural way when our search intent is unambiguous and there is a
definitive answer to our question. For instance, you would be served the same
OneBox result for the preceding query as you would for the natural language
query how many F1 world championships does Williams have?

Figure 3-1. Sample OneBox result

In the initial query, we translated our intent into keywords, then arranged them
so that the search engine would interpret them properly. In the second query, we
asked the question in the form that we would use when speaking to another
person. When conducting keyword research, you must consider both of these types
of search behaviors. It’s important to identify the individual words and phrases
that exist to describe the topics related to your business, but it’s just as
important to understand the actual questions people might ask a search engine
when looking for the types of information and content you provide.

Word Order and Phrases

While different queries may return similar (or the same)
results, word order is still an important factor in search queries and can
influence the content, type, and ordering of the search results you receive. As
an example, currently, while in the same search session you may be shown the
same search results for williams F1 “world championships” as you would for F1
williams “world championships”, only the former query will display a OneBox
result with the answer. Similarly, if you were to use the singular “world
championship” in the first query instead of the plural, you might not receive a
OneBox answer. Search engines consider variations of keywords (such as synonyms,
misspellings, alternate spellings, and plurality) when determining searcher
intent and rendering search results pages. However, you don’t necessarily need
to use every keyword variant you see in your keyword research within your
content in order for the search engines to understand that your content is
relevant to a query. For example, as shown in Figure 3-2, if you offer resume
writing services, you can find many closely related keyword variants. You don’t
need to use every single one of those variants on your page or have a page for
each variant.

Figure 3-2. Keyword variants related to resume writing

Nonetheless, comprehensive keyword research is still important. When creating
your initial keyword list, a good first step is to list all of your products,
services, and brands. The next step should be to consider other words that might
appear before and after
them in a search query. This involves thinking about the larger topical context
around your keywords, and their relationships to other topics. To help with
that, profile all the user needs that relate to your products and/or services.
How might the user think about what they want? What are the initial queries they
might start with, and how will they progress as they move through the purchase
funnel? As you perform your keyword research, keep all of these considerations
in mind and make a plan to address as many of these user needs as possible.
Traditionally, search engines have ranked pages based on the similarity between
the user’s search query and the content they see on relevant websites. If a
search query appears verbatim in a page’s content, that page has a stronger
chance of being near the top of the results, but the use of related words and
phrases is also important and further reinforces the context. NOTE Be wary of
overoptimizing for a keyword. If it’s obvious that a website has thin content
that repeatedly and awkwardly uses keywords, Google will penalize the site by
lowering its rankings or even removing its pages from the index. This topic is
covered in more depth in Chapter 9.

Search Operators

Search operators are directives that (when used correctly in a
query) limit the scope of the returned search results in specific ways, making
them extremely useful for various forms of research and information retrieval.
Google doesn’t provide a comprehensive list of search operators, and the ones
that work change from time to time, so the best general approach to identify
available search operators is to search for “google search operators” and use
the after: search operator to limit the age of the pages in the result set:
"google search operators" after:2023-06-01

Bing does provide a list of its supported advanced search options and keywords.
Here are some important search operators that have remained fairly stable in
functionality over time:

OR Includes specific alternatives for a given keyword. Can be expressed as OR or
| (the pipe symbol): Christmas gifts for kids | children | boys | girls

AND Requires both terms to be present in the pages returned: Christmas gifts for
kids AND children

NOT The opposite of OR. Expressed as a – (dash) symbol before a keyword, with no
space between them: Christmas gifts for kids -teens

site: Limits the search scope to the specified domain: "the art of seo" site:oreilly.com

filetype: Limits the search scope to documents that have the specified
three-letter file extension: evil plan to destroy the linux operating system
filetype:doc site:microsoft.com

NOTE As the preceding example demonstrates, as long as they aren’t mutually
exclusive, you can usually use more than one operator in a query.

Wildcard When there are too many OR operators, you can use a wildcard (expressed
as the * symbol) to indicate an unknown word within the context of a query: evil
plan to destroy the * operating system filetype:doc site:microsoft.com

You can also use it to indicate all possible top-level domains or subdomains
with the site: operator: evil plan to destroy the linux operating system
filetype:doc site:*.microsoft.com

cache: Brings you directly to the most recent Google cache copy of the specified
URL, if there is one: cache:https://www.example.com/secrets/OMG_DELETE_THIS.htm

inanchor: Searches for a word used in anchor text (link text). You’d usually use
this with the site: operator: site:*.microsoft.com inanchor:monopolistic

allinanchor: Same as inanchor:, but for multiple words (notice that there’s a
space after the colon this time): site:*.linux.com allinanchor: steve ballmer
monkey dance

We’ll leave investigating the rest of the list—whatever it may contain at the
moment you read this—up to you.
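
As a practical illustration of combining operators, an SEO might check whether a
particular kind of document on a particular site is represented in Google's index
with a query such as the following (the phrase and domain shown are placeholders):

"seo checklist" filetype:pdf site:example.com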

Vertical and Local Intent

The concept of searcher intent is extremely important
to search engines because if a search engine accurately identifies user intent,
and therefore serves relevant search results that satisfy that intent, users are
more likely to return to the search service for future needs—which results in an
increased user base and increased advertising revenue (e.g., via Google Ads, for
Google). As a result, search engines have a powerful incentive to correctly
identify and understand user intent. When someone searches for a specific kind
of content (images, videos, events, news, travel, products, music, etc.), this
is referred to as a vertical search, even if it’s not performed from a vertical
search engine. NOTE This is a simplified overview of local, mobile, and vertical
search. For more comprehensive coverage, refer to Chapter 12.

A query can suggest vertical intent when it isn’t explicitly requested. For
instance, if you search for diamond and emerald engagement ring, Google may
interpret this query to mean that you want to see high-quality photos along with
jewelry retail product page links, so your search results may include image and
product results. If you search for parrot playing peekaboo, Google may interpret
this query to mean that you’re looking for videos and return videos prominently
in your search results. Importantly, Google will factor in any available
behavioral data about the user performing the search, to enhance its
understanding of the query’s intent. Signals of local search intent are usually
in the form of keywords, but a query does not necessarily need to include the
name of a specific city or town for a search engine to interpret it as having
local intent. For example, if a user searches for pizza without including a
location in the query, a search engine will likely interpret this query as
having local intent and utilize any information it has about the user to
identify where they are—e.g., the IP address or other location data gleaned from
the device being used by the user performing the search. Depending on how much
location data access you allow Google to have, a mobile search with obvious
local signals (for instance,
using the keywords near me in your query) will often generate results that are
relevant to the immediate area around you. Local scope can be as specific as a
street corner, or as broad as any place that has a name (a state, country, or
continent). For instance:

• Café near the Louvre

• Japanese food in Disney Springs

• Best golf course in Pennsylvania

• Most expensive place to live in America

• Nearest Ferrari dealership

Crawling

Web crawling is the process that search engines use to discover content
on your site and across the web (generally located at specific URLs). As a core
component of information and content discovery for search engines, crawling
plays a critical role in how search engines build their indexes of web documents
(URLs). As a result, making your site easy for search engines to crawl—that is,
making it “crawlable” or “crawler-friendly”—is a critical area of focus for
content development to support your ongoing SEO efforts. Overall, the web is too
vast for any search engine or any other entity to crawl completely, so search
engines like Google need to prioritize crawler efficiency and effectiveness by
limiting how much content they crawl, and how often. As a result, there is no
guarantee that a search engine crawler will crawl all of your site’s content,
especially if your site is quite large. Search engine crawlers may not crawl
areas of your site for many reasons, including:

• The crawler never finds a link to the URL, and it does not appear in your XML
sitemap file(s). (Perhaps you actually do link to the page, but it's only
accessible via JavaScript that does not render some of the content on your page
until a user clicks on a page element.)

• The crawler becomes aware of the URL, but it is far down in your hierarchy
(i.e., the crawler would have to crawl several other pages to reach it), so it
decides not to crawl the URL.

• The crawler has crawled the page at some point in the past, and based on the
search engine's interpretation of the page's content or other characteristics, it
decides that there is no need to crawl it again.

• Your site may be assigned a limited crawl budget, and there may not be enough
available for the crawler to reach all of the site's content. This can happen for
various reasons, such as if there are issues with the server hosting your site at
the time of the crawl or if there are multiple URLs that contain the exact same
content.

Ensuring the crawlability of your URLs requires understanding the
development platform and environment for your content, how URLs are constructed
and implemented, how redirects are handled and maintained, and numerous other
factors discussed further in Chapter 7.

The Index

Today, when we talk about the search index we're generally referring
to the index of indexes, which contains metadata on many more asset types than
HTML pages, including images, videos, PDF documents, and other file types. When
we say a site is indexed, that means that a search engine has connected to it
through some means (e.g., a link from a page already in the index, or a sitemap
submitted through the search engine’s backend channels), crawled it with a
script that discovers all links to find new content, fully rendered it so it can
see all of the page contents, performed semantic analysis of its contents to
understand the topic matter, created some descriptive metadata about it so that
it can be associated with the words and intentions of searchers, and then stored
it in a database (often referred to as “the index,” though as you will see it’s
really a set of indexes) for later retrieval in response to a related user
search query. As well as relying on information about URLs in its search index,
Google uses various data it obtains and stores about users to determine the
relevance of indexed content to a search query. In addition, there are certain
types of information, typically information in the public domain, that Google
stores in a database that it calls the Knowledge Graph, described in the
following section.

The Knowledge Graph

The Knowledge Graph is a rapidly evolving graph database
that Google uses to understand how topics and concepts relate to one another.
It’s composed of trusted facts and their relationships, and was originally
populated by structured data from reliable public sources such as Wikipedia,
Wikidata, and The CIA World Factbook. Today Google also incorporates structured
data from many other sites and services into the Knowledge Graph and uses
machine learning to analyze and collect data from search queries and other user
activity. There are two data types in the Knowledge Graph: entities, which are
real-world persons, places, or things; and concepts, which are abstract ideas or
constructs. For instance, going back to the example in “The Language of Search”
on page 67, the Knowledge Graph would define F1 (an abbreviation of Formula One)
as an entity that
is associated with the auto racing concept. Williams refers to Williams Racing,
a Formula One team, and is also connected to auto racing, so it would be
classified as an entity and connected to the Formula One entity. World
championships is an abstract concept that could apply to multiple entities but
is narrowed to the scope of Formula One, and then to the Williams Racing team,
due to the co-occurrence of those entities in the search query. You can see this
process in real time as you type; Google’s autocomplete feature will show you
what it thinks the next words in your query will be. The moment the words
williams F1 are typed into the query field, Google has already narrowed the
search scope to the Formula One and Williams Racing entities and their
relationships, and has calculated several words that could logically follow,
most notably world championships. You can see another example showing entities
and their relationships in Figure 3-3. We’ll talk more about this concept in
Chapter 7, in the context of natural language processing.

Figure 3-3. Simplified entities and relationships example
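
Although the Knowledge Graph itself is built and maintained by Google, site owners
can describe their own entities to search engines using Schema.org structured
data. As a simplified sketch (the organization name, URL, and profile links are
placeholders, and inclusion in the Knowledge Graph is never guaranteed), a JSON-LD
block embedded in a page might look like this:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Example Used Cars of Phoenix",
  "url": "https://www.example.com/",
  "sameAs": [
    "https://en.wikipedia.org/wiki/Example_Used_Cars",
    "https://www.wikidata.org/wiki/Q00000000"
  ]
}
</script>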

Vertical Indexes

As part of the crawling process, search engines discover and
catalog vertical content wherever possible. Each content type has a unique set
of attributes and metadata, so it’s logical to create niche indexes for them,
rather than attempting to describe and rank them according to generic web search
criteria. It also makes sense to create a niche search engine for each vertical
index, but it’s conceivable that there could be scenarios where a vertical index
would only be indirectly accessible through universal search.

Private Indexes

Over the past 20 years, Google has offered several different
products for creating private search indexes, and there are other third-party
companies that offer similar products, as well as open source search solutions.
These can be implemented publicly for on-site search capabilities, or privately
for intranet search for employees. Most companies with an intranet will
eventually need an effective method for users to search its contents. In some
instances, you may need to work with a hybrid search engine that returns results
from both a private intranet and the public internet. This may create a
situation in which you will be asked to optimize pages that will never be on the
public internet.

The Search Engine Results Page

SERPs are dynamically rendered based on many
different signals of user intent, such as the search query itself; current
trends/events; the user’s location, device, and search history; and other user
behavior data. Changing any one of those signals or conditions may trigger
different algorithms and/or generate different search results for the same
query, and some of these variations may offer better optimization opportunities
than others (as we previously explained with respect to vertical content).
Universal SERPs can vary wildly depending on the degree of confidence in
interpreting the searcher’s intent, the degree of confidence that certain
results will completely satisfy that intent, and how accessible those results
are to both the search engine and the searcher.

Organic Results

Organic search results are any results within a SERP that aren't
paid ads or content owned and published exclusively by the search engine. These
results can include vertical results and special features for pages that use
structured data elements (on-page metadata that Google can use to construct SERP
special features, covered in detail in Chapter 7). One core component of the
search results is listings that are pure text, showing URL links, titles for the
results, and some descriptive text. Some of these may include some enhanced
information about the result. A review of the makeup of these types of listings
follows.

The title and snippet The traditional organic search result included the page's
literal <title> element and either its meta description or the first couple of
sentences of page content. Today, both the title and the snippet are dynamically
generated by Google, using the content in the <title> element, the main visual
headline shown on a page, heading (<h1>, <h2>, etc.) elements, other content
prominently shown on the page, anchor text on the page, and text within links
that point to the page.
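
To make that concrete, here is a minimal, hypothetical HTML sketch of the on-page
elements Google may draw on when composing a title and snippet; the wording and
description are illustrative placeholders only:

<head>
  <title>Canine Dentistry Advice for Dog Owners | Example Pet Blog</title>
  <meta name="description" content="Practical advice on keeping your dog's teeth healthy.">
</head>
<body>
  <h1>Canine Dentistry Advice</h1>
  <!-- Prominent body content, headings, and anchor text may also inform the
       generated title and snippet. -->
</body>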

Organic search listings can be elevated to the top of the results as featured
snippets for some queries, in some instances where Google has a high degree of
certainty that the query can be answered by showing an excerpt of that page’s
content on the SERP. Featured snippets generally contain longer excerpts than
normal listings, and the URL for the website from which the answer was sourced
appears at the bottom instead of the top.

Cached and similar pages Most organic results have a link to Google’s most
recent cached copy of the page, though some webmasters choose to opt out of this
service (for various reasons) by utilizing the noarchive directive. A cached
copy of a page will generally show its text content in an unembellished fashion;
there is no fancy styling, programmatic elements are either disabled or
statically rendered, and images are either missing or slow to load. The typical
use case for a cached page is to see the content when the site is unavailable
due to a temporary service outage, but cached pages also give you some insight
into what Googlebot “sees” when it crawls the page—this is not the fully
rendered page (this is discussed in more detail in Chapter 7). Results for some
sites also include a Similar link. This leads to a new SERP that shows a list of
sites that Google recognizes as being closely related.
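
For reference, one common way a site opts out of cached copies is a robots meta
tag in the page's <head>, shown here as a minimal illustration (the same
noarchive directive can also be sent in an X-Robots-Tag HTTP header):

<meta name="robots" content="noarchive">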

Special Features

Under a variety of conditions, both universal and local
searches can generate SERPs with special features to highlight vertical and
structured data-driven content. Google adds new special features every so often,
many of which follow the same pattern of using structured data elements to
display vertical results; this section covers the ones that were current at the
time this book went to press, but more may have been added by the time you’re
reading it. NOTE Some special features (such as the map pack) generally increase
search traffic; others (such as enriched results and featured snippets) may in
some cases reduce traffic. Some SEOs choose to deoptimize a page when a special
feature results in lost traffic, but before you commit to doing that, ensure
that the lost traffic is worth fighting for. If your conversions haven’t
decreased, then the traffic lost to a SERP special feature was either worthless
or unnecessary. For instance, if you’re selling tickets to an event and Google
creates an enriched result for your sales page that enables searchers to buy
tickets directly from the SERP, then you’re likely to see less traffic but more
conversions.

OneBox results When a query is straightforward and has a definitive answer that
Google can provide without sourcing that information from a third-party website,
it puts a OneBox answer at the top of the SERP, followed by universal results.
These same results may also be served as responses in the Google Assistant app.
In some cases, they are combined with a featured snippet in order to allow
Google to provide supplemental information to the user. For example, queries
like the following will reliably return OneBox answers regardless of your
location, device, and other search signals:

• When is Father’s Day?

• What is the capital of Washington state?

• What time is it in Florence?

Figure 3-4 shows a SERP that combines a OneBox
with a featured snippet.

Figure 3-4. An example of a featured snippet

Knowledge panels When the query is straightforward, but the answer is too
nuanced or complex to deliver in a OneBox result, Google will put a knowledge
panel on the right side of the SERP, alongside the universal results. Knowledge
panels are tables of common facts about a well-documented entity or concept from
the Knowledge Graph. Sometimes they are generated based on structured data
elements, but most often they are just a SERP-friendly repackaging of the
infobox element of the topic’s Wikipedia page.

Map packs In addition to filtering out results from irrelevant locales, search
engines also generate special SERP features for queries that seem to have local
intent. The most obvious location-specific feature is the map pack, which is a
block of three local business results displayed under a map graphic that shows
where they are. If you see a map pack in a SERP for any of your keywords, that’s
proof that Google has detected local intent in your search query, even if you
didn’t explicitly define a location. You can see an example of a map pack (also
referred to as the local 3-pack) result in Figure 3-5. For some verticals, the
map pack shows business information on the left and website and direction links
on the right; for others, review scores are displayed on the left, and images
are shown on the right instead of links. In any map pack, you can click the
“View all” or “More places” link at the bottom to see the Local Finder page,
which generates a SERP focused entirely on local results. This is nearly
identical to a vertical search in Google Maps, though the search radius will
differ between the two depending on the search intent. In addition, there will
be variations in layout and functionality depending on whether you are searching
on mobile or desktop, as with nonlocal search results.

Figure 3-5. An example local 3-pack result

Rich results and enriched results In a rich result (also called a rich snippet),
the snippet is enhanced or replaced by an image thumbnail or review star rating
that summarizes or represents the originating content. Examples of when you’ll
find rich results include when you search for reviews of just about anything, or
interviews with public figures. You can see an example of this in the search
results for mahatma gandhi, shown in Figure 3-6.

Figure 3-6. An example of a rich result

Searches for job postings, recipes, or event listings can generate enriched
results. These go one step further than rich results by offering some kind of
interaction with the page, sometimes without the user having to visit the page
directly. For instance, you might be able to buy tickets to a show, track a
package that has been shipped to you, or send in your resume to a recruiter
directly from the SERP.
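
As a simplified sketch of the kind of structured data that can make a page
eligible for rich or enriched event results, a ticketing page might embed JSON-LD
like the following; the event details and URLs are placeholders, and eligibility
for any SERP feature remains at Google's discretion:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Event",
  "name": "Example Summer Concert",
  "startDate": "2024-07-20T19:30",
  "location": {
    "@type": "Place",
    "name": "Example Arena",
    "address": "123 Main St, Phoenix, AZ"
  },
  "offers": {
    "@type": "Offer",
    "url": "https://www.example.com/tickets/summer-concert",
    "price": "45.00",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock"
  }
}
</script>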

The carousel If there are multiple relevant rich results, Google may show them
in a carousel at the top of the SERP. You can see a good example of this if you
query for starting lineup of the New York Yankees. This can also apply to
universal queries that contain a high number of images, such as wading birds, or
a high number of product results, such as phones under $500.

Key moments in videos Google is also likely to show enhanced results when it
determines that videos may be more useful results for users. In this case the
SERP may include one or more videos, as well as links to specific key moments
within the video content. You can see an example of this in Figure 3-7.

Figure 3-7. A search result including videos

The sitelinks search box If Google determines that your query should have been
executed on a specific site’s internal search engine instead of Google, then you
may see a search field below that site’s snippet within the SERP. If you use
that secondary search field for a query, then the search scope will be limited
to that site (this is identical to using the site: operator, discussed in
“Search Operators” on page 70). To trigger a sitelinks search box, the site in
question has to have its own publicly accessible search feature and rank highly
in the results, and the Google search query has to be relatively broad. One
example of this is if you search Google for pinterest, as you can see in Figure
3-8.

Figure 3-8. An example result including a sitelinks search box
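
For sites that want to be considered for this feature, Google has documented
WebSite structured data with a SearchAction; adding it does not guarantee that a
sitelinks search box will appear. A minimal, hypothetical example (placeholder
domain and search URL pattern):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebSite",
  "url": "https://www.example.com/",
  "potentialAction": {
    "@type": "SearchAction",
    "target": "https://www.example.com/search?q={search_term_string}",
    "query-input": "required name=search_term_string"
  }
}
</script>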

Query Refinements and Autocomplete

Google has invested heavily in machine
learning technologies that specialize in analyzing human language (both spoken
and written). The following is a list of some of the algorithms it uses, with
descriptions of their purpose within Google Search:

The synonyms system While Google does not have an actual algorithm called the
“synonyms system,” it does have algorithms that analyze keywords in a query and
consider words and phrases that are similar in intent. This is similar to using
the OR operator to list multiple similar keywords, except it’s done
automatically for you. For instance, if you search for Christmas gifts for kids,
a page optimized for holiday presents for children might rank highly in the SERP
despite not including the exact keywords you searched for anywhere on the page.
This type of analysis also extends to how Google analyzes the content of web
pages, and it can extend beyond literal synonyms to contextually understanding
the relationships between potential antonyms. For instance, a query for where to
sell a guitar will likely include highly ranked results for where to buy a
guitar, because the concepts of buying and selling are very closely related in
search intent, even if they are semantic antonyms.

BERT Google uses BERT to analyze a query to determine the meaning of each of its
component words in context. Prior to BERT, Google’s language analysis could only
consider the words before or after a given word or a phrase to understand its
meaning. BERT enables Google to examine words before and after a word or a
phrase in a query to fully understand its meaning. For example, thanks to BERT,
Google is able to accurately decipher the intended meaning of a query like 2023
brazil traveler to usa need a visa as being about whether a person traveling
from Brazil to the US in 2023 needs a visa, whereas prior to the implementation
of this algorithm Google would have assumed that the query was about someone in
the US wanting to travel to Brazil.

Passages The passages algorithm analyzes sentences within the context of
paragraphs or pages of content. For example, if a page on your site contains a
2,000-word article about installing certain types of windows, and it includes a
unique piece of content about how to determine whether the windows contain UV
glass, this algorithm can help extract that specific piece of content.

MUM The Multitask Unified Model (MUM) algorithm uses language models similar to
BERT (discussed further in “BERT” on page 404) to provide responses to queries
that cross-reference two entities or concepts within the same topical domain.
For instance, MUM will enable Google to assemble a SERP specific to the mixed
context of a complex query such as I’m a marathon runner. What do I need to know
to train for a triathlon?

We’ll talk more about various algorithm updates that Google has made in recent
years in Chapter 9.

Search Settings, Filters, and Advanced Search

In addition to using search
operators to limit the search scope for a query, as discussed earlier in this
chapter, you can use the Advanced Search feature buried in Google’s SERP
Settings menu, also located at this specific URL:
https://www.google.com/advanced_search. From the Settings menu (accessed via the
gear icon on the SERP or the Settings link in the lower-right corner of the main
search page), you can also alter the search settings to limit results to pages
that are hosted in certain countries or contain content in certain languages,
change the number of results per page, enable or disable autocomplete, and
enable or disable the SafeSearch filter that attempts to exclude broadly
“offensive” results. On the SERP, the Tools menu enables you to filter results
by age or date range, and to enable or disable verbatim interpretation of the
query.

Ranking Factors

Search results are selected and ranked according to various
logical processes (algorithms) which apply a variety of scoring methodologies
and rulesets. In the early days of web search, search engines were not advanced
in how they assessed the quality of site content; they simply matched document
vocabulary and user vocabulary. Pages that contained a title, description, or
content that matched the search query verbatim would often reliably rank well
for some or all of the keywords in that query, even if the sites contained
low-quality content or were spam. Consequently, it was pretty easy to influence
search results by inserting keywords into pages in the right locations. The
magic that set Google apart from its early search competitors was in the way it
qualified pages in the index by analyzing whether they were linked to, and how
they were described by, other web pages. The PageRank algorithm uses the link
text (the text between the HTML tags on web pages) as an extra layer of
descriptive metadata for sites, pages, keywords, and topics, then evaluates each
page’s rankings based on the quantity of those links and the quality of the
source pages containing the links. More concisely, links are votes endorsing the
quality of your site (but all votes are not equal, and some don’t count at all).
The details involved in weighting those votes are among the most precious of
Google’s secrets, but there are some fundamental truths. In general, Google
gives more weight to links from sites that it trusts and that are linked
to or from other trustworthy, topically related websites. These factors are
discussed in depth in the following subsections.
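
As a point of reference, the original PageRank formula published by Google's
founders expresses this weighted-voting idea; production ranking systems have
long since grown far more complex, so treat it purely as a conceptual sketch:

PR(A) = (1 - d) + d * (PR(T1)/C(T1) + ... + PR(Tn)/C(Tn))

Here T1 through Tn are the pages linking to page A, C(T) is the number of
outbound links on page T, and d is a damping factor (originally 0.85)
representing the probability that a "random surfer" keeps clicking links rather
than jumping to an unrelated page.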

Relevance

The first and most important objective for a search engine, as
described earlier, is to determine query intent and then deliver results that
are relevant to the query by satisfying the intent of the user performing that
query. As a result, this is, and always will be, the largest ranking factor.
When a trusted site uses descriptive link text to link out to a page, Google
begins to establish topical relevance for that new page. For instance, if your
site sells used cars in the Phoenix, AZ, region, a descriptive link like this
one from the Phoenix Chamber of Commerce will establish your site’s relevance to
the Phoenix, AZ topic: Local used car virtual showroom

If that page is also linked to from a credible national or international
magazine site that reviews online used car dealerships, then that will establish
relevance to the used cars topics. This is the case even if the link text does
not contain the phrase “used cars,” because Google understands that a site that
has been publishing reviews about used car dealerships for the past 15 years is
within the used cars domain. With those two links, your site will soon be
included in results for the query used cars in Phoenix. The words, phrases, and
subjects that define your site in human terms can be referred to as its topical
domain (a subject covered in more depth in Chapter 6). All other pages on all
other sites that share your topical domain(s) are relevant to your site in some
way; the more topics that you have in common with an external page, the more
relevance you share with it. Inbound links that originate from highly relevant
sources are generally more valuable, in isolation, than links from partially or
tangentially related sites. Links from sites that have no relevance to yours
(i.e., which don’t share your topical domain at all) are generally less
valuable, but if the sites are quality sites (and aren’t ads) they still have
some value.

AI/Machine Learning's Impact on Relevance

While content and linking are
important factors in Google’s determination of a piece of content’s topic and
relevance to a query, modern machine learning technologies, such as the
algorithms discussed in “Query Refinements and Autocomplete” on page 83, play a
useful role in analyzing queries and the text content of web pages. The
resulting AI models are able to analyze new web pages and natural language
queries and determine their relevance to every topical domain in the index.

These algorithms, however, also have drawbacks that are difficult to address
programmatically, including containing inherent human biases with regard to
race, gender, and various other social and cultural elements, as well as being
vulnerable to the influence of organized disinformation campaigns, the
interpretation of negative associations, and the semantic collisions caused by
slang terms and regional dialects. Additionally, generating a new AI model
requires a massive amount of computing resources, using huge amounts of energy
and incurring substantial costs. Regardless of the algorithms involved, the SEO
fundamentals remain unchanged: conduct good keyword research that includes
natural language queries, build web content that is accessible to search
engines, and create high-quality, useful content that attracts high-quality
links to your responsive, cleanly structured website.

EEAT

In 2018, Google's Search Quality Raters Guidelines (SQRG), a set of
instructions provided to human reviewers who test the quality of its search
results, introduced the concept of EAT, an acronym for Expertise,
Authoritativeness, and Trustworthiness; in 2022, Google added the extra E, for
Experience. According to Google, this is not a direct ranking factor, but
rather a set of factors that human reviewers are asked to consider in evaluating
the quality of the search results. Google reportedly also uses other signals as
proxies to try to measure whether the content matches how users might perceive
the EEAT of a site. The reviewer’s input is not fed directly into the Google
algorithm; instead, it’s used to highlight examples of Google search results
that need to be improved. These can then be used as test cases when Google
engineers work on new algorithms to improve the overall results. We’ll briefly
introduce the components of EEAT here, and discuss them a bit more in Chapter 7:

Experience There are many situations where what users are looking for in content
is to benefit from the experience of others, and/or the point of view that
others have developed based on their experience. Note that if your content is
created by AI, it can really only put together summaries of what others have
published on the web, and this is the basic objection that Google has to such
content.

Expertise Expertise relates to the depth of knowledge that you offer on your
site. For example, contrast the expertise of a general copywriter you hire to
write your content with that of someone who has two decades of experience in the topic
area of your business. A general copywriter given a few hours to create a piece
of content will struggle to write material of the same quality that a true
subject matter expert can produce.

Authoritativeness Google assigns authority to sites that are linked to from
other authoritative sites that have proven to be trustworthy over time. Your
used car website will gain authority when it is linked to from external pages
that have a significant amount of relevant topical authority in the search
index. While relevance is easy to determine, authority is not. Calculating
authority requires a nuanced interplay between many objective and subjective
factors, and every search engine has its own methods for this.

Trustworthiness A search engine’s concept of trust is similar to the usual
sociological definition: it’s a measure of a page, site, or domain’s integrity
over time. Trusted sites have a long history of consistently playing by the
rules and have not been compromised by spammers or scammers. If Google had to
rebuild its index from scratch, one of the first things that information
architects might need to do is create a list of trusted sites. From there,
authority would be calculated as the product of relevance and trust. This is not
conceptually different from the process of moving to a new town and establishing
a new social network; you identify people you can trust, then expand outward
from there (because trustworthy people usually associate with other trustworthy
people). But as you get further away from the original trusted source, you have
to reduce the level of inherited trust. Using this theoretical approach, a site
that is one click away from a highly trusted source will inherit a lot of trust;
two clicks away, a bit less; three clicks away, even less; and so forth. Again,
search engines consider their actual trust algorithms to be valuable trade
secrets, so the best you can do is examine the results and try to work backward
toward the contributing factors (which is exactly what link analysis tools and
SEO platforms do). While the technical details will always be obscured, it’s
safe to assume that search engines follow the same paradigm as a human
“background check.” Which sites does your site link to, and which sites link to
it? What’s the history of this domain, including hosting and ownership? Does it
have a privacy policy and valid contact information for the owners? Is it doing
anything suspicious with scripts or redirects?

Local Signals and Personalization

As explained earlier in this chapter, results
can be heavily influenced by local intent and personalization
factors—essentially, taking user behavior data into account when determining
query intent and deciding which results to show. For example, depending on the
query, searches from mobile devices may be assumed to be “local first.” The
stronger the local signals are, the less likely it is that nonlocal results will
appear in a SERP, even if they are highly relevant to the query. For instance,
if you were to execute the query Disney’s haunted mansion from a desktop
computer with an IP address originating in Cleveland, OH, there would likely be
few or no local results. This is the title of a movie, and the name of a theme
park ride at both Disneyland in California and Walt Disney World in Florida. If
there are no local signals, the SERP will mostly pertain to the movie. However,
if you were to execute the same query from a smartphone with an IP address
indicating the Orlando, FL, area, the SERP would be more likely to skew toward
the theme park ride at Walt Disney World, and any results for the Disneyland
ride would be more likely to rank considerably lower, no matter how highly they
rank for the Disney’s haunted mansion keyword. You could arrive at a similar
SERP from your Ohio-based desktop computer just by adding Florida to the query.
If you follow up your initial search with one for Disney’s haunted mansion
Florida, you’ll still get a few results (and probably ads) pertaining to the
movie, since there is likely to be a shared interest between the movie and the
theme park ride. However, if you were to execute these queries in reverse order
(with Florida, then without), the second SERP would be more likely to be nearly
identical to the first: because you recently expressed interest in the theme
park ride in Florida, Google may assume that this is still your intent even
though you removed Florida from the query. Historically, the impact of
personalization extended beyond your most recent searches, leveraging your
longer-term search history and the preferences that you’ve shown over time.
However, Google appears to have scaled this back, announcing in 2018 that it
would be limiting personalization to the user’s location and recent searches.

Timing and Tenure

Search engines keep detailed records on linking relationships
between websites (as well as information pertaining to domain names, IP
addresses, pages, and URLs). With regard to linking relationships, the search
engines generally store the following information:

When the link was first seen (by Google) This isn’t just a simple date stamp;
it’s combined with an analysis of other changes in the index. For example, did
this link (URL) appear immediately after an article was published in The New
York Times?

When the link was no longer seen (by Google) Sometimes link retirement is
routine, such as when blog posts move from the home page to an archive page
after a certain period of time. However, if an inbound link disappears shortly
after you made major changes to your site, search
engines may interpret this as a negative signal. Did that site’s owner disagree
with the changes you made, and revoke its association with your page?

How long the link has existed If a link has been around for a long time, a
search engine can potentially give it more weight or less, depending on the
authority/trust of the site providing the link, or other secret factors.

Legitimacy

Google analyzes the content around links, as well as the context in
which they appear, to determine their legitimacy. In a previous era, search
engines were fooled by keyword stuffing and link farming techniques. Google, in
particular, now goes to great lengths to detect link schemes and spammy content,
and also to explicitly detect legitimate content. Here are some of the potential
factors that search engines may use to qualify content:

External links to the linking page Does the external page containing the inbound
link have its own inbound links? If the page linking to your site is benefiting
from incoming links, then this will make the link to your site more valuable.

Page placement Is your link in the main body of the content? Or is it in a block
of links at the bottom of the right rail of the web page? Better page placement
can be a ranking factor. This is also referred to as prominence, and it applies
to on-page keyword location as well.

Nearby text Does the text immediately preceding and following your link seem
related to the anchor text of the link and the content of the page on your site
that it links to? If so, that could be an additional positive signal. This is
also referred to as proximity.

Closest section header Search engines can also look more deeply at the context
of the section of the page where your link resides. This can be the nearest
header tag, or the nearest text highlighted in bold, particularly if it is
implemented like a header (two to four boldface words in a paragraph by
themselves).

Overall page context The relevance and context of the linking page are also
factors in the value of a link. If your anchor text, surrounding text, and the
nearest header are all related, that’s good. If the overall context of the
linking page is closely related too, that’s better still.

Overall site context Another signal is the context of the entire site (or
possibly the section of that site) that links to you. For example, if a site has
hundreds of pages that are relevant to your topic and links to you from a
relevant page, with relevant headers, nearby text, and anchor text, these all
add to the impact, so the link will have more influence than if the site had
only one page relevant to your content.

Source Diversity

In addition to sourcing links from similar types of websites,
you should also try to get links from pages that have different content and
serve different purposes. For example, if all your links come from blogs, then
you have poor source diversity. There are many examples of other types of link
sources that you could cultivate: national media websites, local media websites,
sites that are relevant but cover more than just your space, university sites
with related degree programs, and so on. If all your links come from a single
class of sites, search engines may view this as a potential link scheme. If you
have links coming in from multiple types of sources, search engines are more
likely to view your backlink profile as legitimate.

Keywords in Anchor Text

Anchor text (also called link text) refers to the clickable part of a link from one web page to another: This is anchor (or link) text.

Search engines use anchor text as descriptive metadata about the destination
page. However, don’t try to stuff keywords into anchor text if the words don’t
naturally fit with the surrounding content, and avoid overly descriptive anchor
text that can appear to be keyword stuffing. Search engines look for unnatural
language usage in anchor text, and if they detect this, they will lower the
ranking of the linked page. Similarly, if you have 20 external links to your
page and 19 of them use anchor text that matches your main keyword exactly, that
is likely to be seen as unnatural, and these links may be discounted.

Negative Ranking Factors

It's also possible to have negative ranking factors.
For example, if a site has a large number of low-quality inbound links that
appear to be the result of artificial efforts by the publisher to influence
search rankings, the links will likely be ignored. This is, in fact, exactly
what Google’s Penguin algorithm (discussed in “Quality Links” on page 428) does.
Some other potential negative ranking factors include:

Malware hosting Your site must not contain malicious software or scripts.
Usually this happens by accident; your site is hacked without your knowledge,
and malware is hosted clandestinely.

Cloaking Your site must show the same content to users that it shows to search
engines. If you try to circumvent this by showing a special page to web
crawlers, your site will be penalized. Note, however, that with client-side
rendering (CSR) there can be scenarios where users see different content than
search engines, and this is typically not seen as cloaking; this is discussed
further in Chapter 7.

Unqualified paid links If you sell links from your site to others, they must be
properly marked with an appropriate rel="sponsored" or rel="nofollow" attribute
(see Chapter 7 or Chapter 11 for more details). Otherwise, in extreme cases of
intentionally attempting to manipulate SEO results, your site could be
penalized.

Page load time If your site’s content is very slow to load, its visibility
within search results can be negatively impacted, as the search engines are
aware that users generally seek to avoid pages that don’t load quickly. Fix your
site first, then optimize it for search.

User Behavior Data

Search engines—Google in particular—monitor user behavior
data, including their interactions with SERPs, with the goal of providing a
better search experience to ensure repeat users. This data includes (but is not
limited to) location data, voice search data, mouse movements, and data gleaned
from authenticated Google account usage across various Google products. While
click-throughs from search results are a good signal for both search engines and
websites, if visiting users quickly abandon your page and come back to the SERP
(a bounce), this could be considered a negative signal. Users can bounce back to
the SERP for a variety of reasons, including slow page load time, poor user
interface design, irrelevant content, being presented with interstitial ads or
paywalls, or simply because they’ve accidentally clicked or tapped the wrong
result. Generally speaking, a high bounce rate can be a signal that something is
wrong with your site or that you’re providing a negative experience for users.
However, there are some issues with attempting to use this type of signal as a
ranking factor:

• Unless you have a site with a lot of traffic, and each page that the search engine is analyzing has a lot of traffic, there won't be enough data to evaluate.

• Poor user engagement, such as rapid bounces off of a web page, can also happen because the user finds what they were looking for very quickly.

In addition, Google specifically disclaims that user engagement signals such as bounce rate are used as ranking factors. However, if your site offers a bad experience that causes users to abandon it rapidly or to never come back, it's very likely that this will result in other behaviors that do negatively impact your ranking (such as obtaining fewer links, receiving bad reviews, etc.).

Conclusion

Although search technology and the digital ecosystem as a whole continue to rapidly evolve, having a solid understanding of search engine fundamentals is absolutely essential to SEO success. Understanding how search engines use various signals to deliver a good search experience (and keep users coming back) will put you in a good position to develop a strategy that connects your future customers to the content you create and leverages organic search for your business. Next, we will evaluate and assemble a set of
tools that will help you collect and analyze data about your site and the search
context that will most effectively connect people to it.

CHAPTER FOUR

Your SEO Toolbox

Success in SEO is highly dependent on having the right tools.
Before you can learn the tricks of the trade, you need to have the tools of the
trade. The services and utilities covered in this chapter will enable you to
analyze your site and identify technical or structural problems, discover the
most cost-effective topics and keywords, compare your site’s performance to that
of its top competitors, track incoming links, and measure visitor behavior.
You’re probably already familiar with some of them, but we’re going to
reintroduce them from an SEO perspective. These tools all have a variety of
purposes and functions that are useful for SEO, but the common thread for most
of them is their utility in keyword research, a topic that we cover in depth in
Chapter 6. The first and most important tool is a simple spreadsheet application
to aggregate data from multiple sources, and to calculate the best
opportunities. This is a requirement for maximum SEO productivity and
efficiency. Next, we’ll help you explore some options for technical SEO. It
doesn’t make sense to optimize a site that isn’t being indexed properly. There
are many technical utilities and site analysis features of larger SEO service
packages that can help you solve technical site problems. Some are standalone
(like Google Search Console), but every major SEO platform and most marketing
services suites have their own site and page analysis features. There are three
data perspectives on website activity: server-side, client-side (visitorside),
and search-side. Since your company or client controls the web server, it makes
sense to start this chapter by first analyzing the data you already have. Next,
we’ll explain how you can supplement that data with extra visitor context from
on-page JavaScript trackers. Finally, we’ll introduce you to SEO platforms that
provide search data for the keywords you’re targeting, the current search rank
for every indexed page on your site, and other valuable features that will help
you optimize your site and increase search traffic.

Some of the tools we cover in this chapter are free, but most require a license
fee, paid subscription plan, or SaaS contract. Paid tools tend to charge on a
per-user, per-site (or property), or per-client basis. While you don’t have to
make any decisions about any of these tools right now, in order to follow many
of the processes and examples throughout the rest of this book, you must have at
least one SEO platform subscription and a web analytics service or tag manager
deployed on your site, and you must set up and configure Google Search Console.

Spreadsheets

Our dream careers rarely align well with reality. Real
archaeologists spend a lot of their days on their knees in the hot sun with a
toothbrush and a garden shovel, not dodging ancient booby traps with a whip and
a revolver à la Indiana Jones. Real lawyers spend much of their time on
administrative tasks such as billing, collections, office management, case law
research, and reading and filing formal documents, not winning clever courtroom
battles with hostile witnesses or delivering dramatic closing arguments before a
jury. And a professional SEO often spends more billable time working in a
spreadsheet than a web browser or code editor. Hopefully that doesn’t throw too
much water on your fire. This is still a fun and fascinating industry! SEO
practitioners rely heavily on spreadsheets, and most commonly that means
Microsoft Excel, but you can use any modern equivalent. Regardless of which
spreadsheet app you use, you must be proficient enough with it to create and
maintain proper keyword plans for your company or clients. Specifically, you
must be comfortable working with data tables, basic formulas, filters, and pivot
tables. If you have a few knowledge gaps in these areas or you don’t feel
confident in your spreadsheet skills, then you should invest in training, or at
least be prepared to use the Help menu and Google to figure out how to use these
advanced features to filter, sort, and calculate your keyword lists. This topic
is covered in more detail in Chapter 6, where we walk you through the process of
creating a keyword plan spreadsheet.
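If it helps to see those operations outside of a spreadsheet, the same filter, sort, and pivot steps can be sketched in a few lines of Python with the pandas library. This is a minimal, hypothetical example; the file name keywords.csv and the column names (keyword, topic, monthly_volume, difficulty) are placeholders for whatever your own keyword plan contains.

    import pandas as pd

    # Load a hypothetical keyword plan exported from an SEO platform.
    df = pd.read_csv("keywords.csv")  # columns: keyword, topic, monthly_volume, difficulty

    # Filter: keep reasonably popular keywords that aren't too competitive.
    shortlist = df[(df["monthly_volume"] >= 500) & (df["difficulty"] <= 40)]

    # Sort: highest-volume opportunities first.
    shortlist = shortlist.sort_values("monthly_volume", ascending=False)

    # Pivot: summarize volume per topic, like a spreadsheet pivot table.
    pivot = shortlist.pivot_table(
        index="topic",
        values="monthly_volume",
        aggfunc=["sum", "mean", "count"],
    )
    print(pivot)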

Traffic Analysis and Telemetry

In order to analyze data, first you must collect
it. There are two paradigms for visitor data collection: raw web server logs
that record all incoming traffic from the internet, and JavaScript trackers
(also known as tags) that are embedded in the source code of every page on your
site. Each has its advantages and disadvantages, and to get a holistic
perspective on traffic and user behavior it’s often preferable to incorporate
both web server log input and JavaScript tracking. However, what type of
solution will work best for you will depend on your individual feature
requirements, and what’s already deployed (or required due to vendor contracts)
at your company.

Before you proceed, be aware that there is almost certainly already some kind of
web analytics package (or perhaps several) deployed somewhere at your company.
It isn’t unusual for a web-based business to use a variety of separate analytics
tools to measure different metrics or supply data to different services or
utilities. You should begin by taking stock of what’s already deployed (and paid
for) before making any decisions on analytics tools. As a consultant, there will
also be times when you’re stuck using a client’s preferred vendor or solution,
so you’ll have to learn to work within those boundaries. If nothing is currently
deployed, then free analytics packages such as Google Analytics and Open Web
Analytics are an excellent starting point. Even if a free service doesn’t
ultimately meet your needs, at the very least you can use it as a basis for
comparing against paid alternatives. Whether you’re evaluating an existing
solution or a new tool, make note of the features that you find valuable and any
gaps in functionality that a competing program might be able to cover. Then look
for ways to modify or extend the services you’re using, or for a higher-end
solution that covers those gaps. This is a long journey, not a one-time event.
As you gain more experience in SEO, you’ll continue to develop your requirements
and preferences, and will likely end up with a few different go-to options for
different scenarios. Be wary of services (especially free ones) that want you to
upload customer data or server logfiles; those service providers may collect
your data for other purposes, and this would probably represent a privacy and/or
security violation at your company. JavaScript trackers may also share or
collect data about your traffic and visitors, and though this is less invasive
and dangerous, it still may violate privacy laws or your internal IT policies.
(Legal and privacy issues are covered in more detail in Chapter 13.)

Google Search Console

Google Search Console is a free service that provides a
lot of technical site information that Google Analytics lacks. With it, you can
test your site for indexing, view inbound search query data (keywords,
impressions, click-through rate, rank), generate and test XML sitemaps, test
mobile compatibility, analyze page performance, and measure the performance of
structured data elements that generate SERP features such as enriched results
and OneBox answers. Google Search Console should almost always be the first
service you configure for an initial site audit, because it offers a quick way
to identify low-level problems. Its keyword research utility is limited to SERP
impressions your site is already receiving, but the data on existing search
traffic is useful.
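If you later want to pull that query data into your own spreadsheets or scripts instead of reading it in the web interface, the Search Console API exposes the same performance data. Below is a minimal sketch using the google-auth and google-api-python-client libraries; it assumes you have created a service account key (service-account.json is a hypothetical filename) and added that account as a user on your verified property, so treat it as a starting point and confirm the details against Google's current API documentation.

    from google.oauth2 import service_account
    from googleapiclient.discovery import build

    # Hypothetical service account credentials with read-only Search Console access.
    creds = service_account.Credentials.from_service_account_file(
        "service-account.json",
        scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
    )
    service = build("searchconsole", "v1", credentials=creds)

    # Query-level clicks and impressions for one month.
    response = service.searchanalytics().query(
        siteUrl="https://www.example.com/",
        body={
            "startDate": "2023-06-01",
            "endDate": "2023-06-30",
            "dimensions": ["query"],
            "rowLimit": 100,
        },
    ).execute()

    for row in response.get("rows", []):
        print(row["keys"][0], row["clicks"], row["impressions"], row["position"])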

Server-Side Log Analysis

Web server software outputs a constant stream of text that describes all of the activity it is handling and stores it in a file somewhere on the server. A typical server log is a plain-text list of HTTP requests. Here's an example of a line that you might see in an Apache web server log:

127.0.0.1 [05/Nov/2022:21:43:06 -0700] "GET /requested_page.html HTTP/1.1" 200 1585 "https://www.example.com/referring_page.html" "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:36.0) Gecko/20100101 Firefox/36.0"

From left to right, here's what that data represents:

• The IP address of the machine (computer, mobile device, or server) that made the request
• The time, date, and time zone of the request (relative to the web server)
• The HTTP request method (either GET or POST) and the resource being requested (in this example, it's a web page named requested_page.html)
• The HTTP status code (200 represents a successful request)
• The size of the request, in bytes (usually either the amount of data being returned to the client, or the size of the file being requested)
• The full URL of the page that referred to this resource (sometimes there isn't a referrer, such as when someone directly types or pastes a URL into a browser; referrers are only shown if the request came from an HTTP resource, such as when a user or crawler follows a link on a web page)
• The user agent string, which shows the browser or crawler name and version number (Firefox, for some reason, likes to report itself as Mozilla); the operating system name (X11 is a graphical user environment framework on the Ubuntu Linux operating system), revision or build number (expressed here as rv:36.0), and CPU architecture (x86_64 refers to an Intel or AMD 64-bit processor); and the HTML rendering engine and revision number (Gecko is the native engine in the Firefox browser, and the revision number is the same as the operating system's in this example because it was built for and distributed with Ubuntu Linux)

NOTE
To learn more about the elements of an HTTP request and how to adjust the verbosity and format of a logfile, consult your server software documentation.
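If you just want to poke at a log line before committing to a full analysis tool, a short Python script can split it into named fields. This is a minimal sketch keyed to the example line shown above; field layouts vary with the server's LogFormat configuration (many Apache "combined" logs include two extra identity fields after the IP address), so adjust the pattern to match your own logs.

    import re

    # Pattern matching the example line above; adjust to your own log format.
    LOG_PATTERN = re.compile(
        r'(?P<ip>\S+) \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
        r'(?P<status>\d{3}) (?P<size>\d+|-) '
        r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
    )

    line = ('127.0.0.1 [05/Nov/2022:21:43:06 -0700] "GET /requested_page.html HTTP/1.1" '
            '200 1585 "https://www.example.com/referring_page.html" '
            '"Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:36.0) Gecko/20100101 Firefox/36.0"')

    match = LOG_PATTERN.match(line)
    if match:
        fields = match.groupdict()
        print(fields["ip"], fields["status"], fields["path"], fields["user_agent"])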

Everything about these logfiles is configurable, from the level of detail and
verbosity to the location of the file and the conditions under which one logfile
should close and another one should begin. The web server log is the oldest tool
in the box. When it's
properly configured and its data is filtered and aggregated for human consumption, it can be useful for a variety of SEO purposes, such as:

• Determining how often search engines are crawling the site and each of its pages (and which pages they aren't crawling at all)
• Determining how much time is spent crawling low-value pages (ideally, the search engines would spend this time crawling more important pages on your site)
• Identifying pages that redirect using a means other than a 301 redirect
• Identifying chains of redirects
• Identifying pages on your site that return status codes other than “200 OK”
• Backlink discovery
• Finding missing pages/bad links
• Measuring site performance
• Determining visitors' platforms (device, operating system, and browser version)
• Determining visitors' locales

Web server logs can also be merged with other data sources to provide insights on conversion rates from paid and organic campaigns, server optimization, and URL canonicalization for duplicate content, among other things.

NOTE
Some web hosting providers may restrict or deny access to raw server logs and configuration files. Web server logs are useful for technical SEO, so if the hosting company won't give you the data you need, you may want to switch to a more SEO-friendly provider.
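As a rough illustration of the kind of aggregation involved in the uses listed above, the sketch below tallies Googlebot requests per URL and counts non-200 responses from a hypothetical access.log file, using the same hypothetical regex as the earlier example. A real report should also verify that requests claiming to be Googlebot actually come from Google, as discussed in "JavaScript Trackers" later in this chapter.

    import re
    from collections import Counter

    LOG_PATTERN = re.compile(
        r'(?P<ip>\S+) \[(?P<time>[^\]]+)\] '
        r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
        r'(?P<status>\d{3}) (?P<size>\d+|-) '
        r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
    )

    googlebot_hits = Counter()   # crawl frequency per URL
    error_statuses = Counter()   # non-200 responses across the whole site

    with open("access.log", encoding="utf-8", errors="replace") as log:
        for line in log:
            match = LOG_PATTERN.match(line)
            if not match:
                continue
            if "Googlebot" in match["user_agent"]:
                googlebot_hits[match["path"]] += 1
            if match["status"] != "200":
                error_statuses[match["status"]] += 1

    print("Most-crawled URLs:", googlebot_hits.most_common(10))
    print("Non-200 responses:", error_statuses.most_common())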

The raw logfile from a web server, while human readable, isn’t human
comprehensible. You’ll need a third-party analysis tool to cleanse, combine,
sort, and slice this transactional information into facts and dimensions that
actually mean something to you. There are so many different tools and methods
for logfile analysis that describing them could take up a whole (rather boring
and repetitive) chapter of this book, so in this section we’ll just mention a
few. First, as always, check to see if there’s already something deployed, and
whether it will meet your needs. Keep in mind that a server log is nothing more
than a plain-text transactional data source. Aside from dedicated web logfile
analysis tools, many companies have business analytics packages that can use
server logs to produce useful reports and dashboards. A business analyst within
the organization can potentially work with you to pull web log data into it and
configure the output to meet your needs. Also, check to see if the company is
already using an enterprise services suite
(such as Atlassian, Salesforce, IBM, or Oracle) that has an optional web log
analysis component. Even if the company is not currently paying for that
component, you’ll have a much easier time convincing management to expand the
budget for an existing solution than to buy into a completely new one. The
following are some logfile analysis tools that we’ve used and would recommend:

Splunk Splunk bills itself as a “data to everything platform,” meaning it can be
configured to use any logfile or database as input and produce any kind of
report, chart, or file as output. Therefore, Splunk isn’t functionally different
from most other business intelligence or data analytics solutions. It may be a
more affordable alternative to larger analytics suites, though, because Splunk
is more of an engine than a service, meaning you have to develop your own
solution with it rather than just copying and pasting some code or clicking a
few buttons in a web interface as with Google Analytics. Splunk is an extremely
high-end option, and it may be overkill if you’re only analyzing relatively
small transactional web logfiles. However, it’s popular enough that it may
already be deployed elsewhere within a large organization, which would make it
cheaper and easier for you to adopt it for your SEO project.

BigQuery BigQuery is Google’s platform for analyzing data at scale. It can
provide scalable analysis over petabytes of data. Google bills it as “a
serverless, cost-effective and multi-cloud data warehouse designed to help you
turn big data into valuable business insights.” If you’re dealing with large
volumes of data, then this platform may be your best bet.

Datadog Whereas Splunk and BigQuery are analysis tools, Datadog is more of a
real-time monitoring service. It can also take input from several different
logfiles, but its output is geared more toward real-time results. This is a good
solution for measuring the efficacy of time-limited campaigns, short-term
promotions, and multivariate testing efforts.

Screaming Frog SEO Log File Analyser Screaming Frog’s SEO Log File Analyser may
be the perfect standalone tool for web logfile analysis for SEO projects. It’s
inexpensive and its only purpose is SEO-oriented web server analytics, so you
aren’t paying for features and functions that have nothing to do with your work.
If you don’t want to buy into a long-term agreement with a service provider but
need solid server-side analytics, Screaming Frog should be your first
consideration. The free version has all the same features as the paid version,
but it’s limited to

100

CHAPTER FOUR: YOUR SEO TOOLBOX

importing only one thousand log lines, so it’s more of an “evaluation edition”
than a fully featured free edition—you’ll be able to see if it has the features
you need, but with such a big limitation on the amount of input, it isn’t viable
in production the way other free solutions are (such as Google Analytics and
Open Web Analytics).

Oncrawl Oncrawl is a full-service SEO platform that has particularly good
SEO-oriented site analysis tools which combine crawler-based performance metrics
with logfile data. You don’t have to subscribe to the full suite—you can just
pay for the logfile analyzer—but its other SEO components are worth considering.

Botify Botify is a large-scale technical SEO analysis service that can use
multiple local and external data sources to provide insights and metrics on all
of your digital assets. As with Oncrawl, logfile analysis is just part of its
larger toolset, and it’s worth your while to evaluate this tool in a larger
technical SEO context.

Sitebulb Sitebulb is a website crawler that focuses on delivering actionable
data insights for SEOs. It’s been built to crawl sites of up to tens of millions
of pages, and also offers a business model with no project limits and does not
charge extra for rendering JavaScript.

JavaScript Trackers

Web server log stats are a valuable SEO asset, but they can
only show you what the server can record. From this perspective, it can
sometimes be challenging to tell the difference between a real visitor and a bot
using a false user agent string, and you’ll have little or no ability to analyze
user activity across multiple devices and sessions. While most search crawlers
identify themselves as bots via unique user agents such as Googlebot, Bingbot,
DuckDuckBot, and archive.org_bot, anyone can write a simple program to scrape
web pages and use whatever user agent they like, and your server will record
whatever they report themselves as. A badly behaved bot can circumvent a site’s
robots.txt restrictions and pollute your server log with bad data, not to
mention overloading your server and network. Most of the time these bots aren’t
malicious and it’s just somebody clumsily trying to understand what’s on your
site. To be safe, you can block them with .htaccess directives or a service like
IPBlock.com. Both Google and Bing execute JavaScript in order to be able to
fully render the web pages they crawl, so it would be natural to expect that
they would be trackable by JavaScript trackers. However, both Googlebot and
Bingbot recognize the JavaScript of most popular JavaScript trackers and skip
executing them to save on resources. Most other bots don’t execute any
JavaScript at all.
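Because the user agent string alone proves nothing, a common way to separate genuine search engine crawlers from impostors in your logs is a reverse-then-forward DNS check, an approach Google and Bing both document for their crawlers. The sketch below is a minimal version for Googlebot; it performs live DNS lookups, so in practice you would cache results per IP address rather than checking every log line.

    import socket

    def is_genuine_googlebot(ip_address: str) -> bool:
        """Reverse-resolve the IP, check the domain, then confirm it resolves back."""
        try:
            hostname, _, _ = socket.gethostbyaddr(ip_address)
        except OSError:
            return False
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        try:
            # Forward confirmation: the claimed hostname must map back to the same IP.
            return socket.gethostbyname(hostname) == ip_address
        except OSError:
            return False

    # Example usage:
    print(is_genuine_googlebot("66.249.66.1"))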

JavaScript trackers also offer more in-depth visitor information such as time on
site, session recording or behavior analytics data (the exact mouse clicks or
screen taps a visitor makes when navigating a site), and demographics (if
visitor data is being stored in a cookie that your analytics package has access
to). However, many people use virtual private networks (VPNs), browser plug-ins,
and other privacy countermeasures that prevent JavaScript trackers from working
properly (though data on those users will still show up in your server logs).
Assuming that your trackers are not blocked, they’ll provide you with a highly
detailed view of a subset of the true visitor data. Over time, this method will
become less effective as privacy and security awareness increase. For that
reason, many modern solutions attempt to show you the whole picture by pulling
in external data from server logs or other tracking services and integrating it
with what’s been collected via JavaScript.

Google Marketing Platform/Google Analytics

Google Marketing Platform is Google's
suite of web analytics products, which encompasses the following components:

Google Analytics A web traffic and visitor behavior data collection and analysis
tool. Google Analytics is the key component in this platform because it’s the
primary data collector; you can use it on its own without involving other
aspects of the Google Marketing Platform. The rest of this section has more
details.

Looker Studio (formerly known as Data Studio) A basic data analysis tool that
enables you to combine data from several sources and create reports and
dashboards to visualize certain metrics and goals. This is not nearly as
powerful as a dedicated business intelligence solution such as Cognos, Qlik, or
Oracle, but you may be able to make it work for your project.

Tag Manager A framework for deploying multiple JavaScript trackers using one
lightweight tag. Refer to “Tag Managers” on page 105 for more details.

Surveys An easy-to-use utility that helps you create and deploy visitor surveys
on your site.

Everything in Google Marketing Platform is free, but all data
recorded through these services will be collected and used by Google. You can
also upgrade to the paid enterprise service level, which enables you to keep
your data private, adds a few new services that help you track marketing
campaigns and manage digital media assets, and includes “360” branded editions
of the free services listed above. The 360 editions include Campaign Manager
360, Display & Video 360, and Search Ads 360. These
have more integrations with other services, more filtering and funneling
options, and higher limits on web property views and data caps, among other
perks and upgrades. In this chapter we’re only covering Google Analytics and
Google Tag Manager. The rest of the Google Marketing Platform services may be
useful to you for other marketing purposes, but they either have little to do
with SEO or are inferior to the equivalent components in proper SEO platforms
and business analytics suites. As shown in Figure 4-1, Google Analytics is
active on over 72% of websites (the data shown in this figure predates the
replacement of Google Analytics with Google Analytics 4 [GA4], which took place
on July 1, 2023). The basic version of GA4 is free of charge, easy to deploy,
and offers a deep view of visitor data.

Figure 4-1. Google Analytics market share (source: Statista)

If you’re an in-house SEO and only manage one site, and you don’t have any
security or privacy concerns with Google retaining the data it collects from
your Analytics deployment (Google has access to much of this data anyway via its
Googlebot crawler, cookie trackers, Chrome users, and visitors who are logged
into a Google account during their browsing session), then the free version
might be the perfect solution for you. If you have some concerns with Google’s
data collection practices, or if you’re a freelance SEO and have multiple web
properties to monitor, then the upgrade to Analytics 360 is worth evaluating. As
with all individual pieces of larger service suites, when you’re using other
Google products like Google Ads, Google Search Console, or Google Ad Manager,
the unique integrations that they have with Google Analytics may prove to be a
major advantage over other analytics tools.

Obtaining keyword-specific data

Google Analytics doesn't reveal the keywords
that people search for when they click through to your pages from a SERP. To
obtain keyword insights, you’ll have to use a third-party tool to add that
missing information. Popular options include:

Keyword Hero This tool was specifically designed to provide data about on-site
user behavior per keyword in Google Analytics. It shows you how users respond to
each landing page per search query, as well as to your website at large for each
keyword. You get everything from behavioral metrics to performance metrics such
as conversions and revenue per keyword.

Keyword Tool Keyword Tool seeks to fill in the data gaps in the Google Ads
Keyword Planner, which makes it a good secondary resource for optimizing Google
Ads campaigns. Because it incorporates data from Google’s autocomplete feature,
Keyword Tool is also particularly good for local keyword research.

NOTE
Some SEO
platforms (covered later in this chapter) can provide missing keyword data, too:
most notably Semrush, Ahrefs, and Searchmetrics.

Kissmetrics

Traffic analysis is all Kissmetrics does, so you may want to give
it extra consideration if you prefer to build your SEO toolset with
interchangeable standalone services, or if you’re unhappy with the web analytics
capabilities of a larger marketing suite that is already deployed at the company
and are looking for a one-off replacement. There are two different Kissmetrics
analytics products: one for SaaS sites, and one specialized for ecommerce sites.
Both services attempt to identify individual visitors and build a profile for
them that encompasses all of their actions during all of their visits across all
of their devices and browsers. This has obvious SEO advantages, but it could
also be useful as a data source for expanding the information in your customer
relationship management (CRM) database.

Adobe Analytics Adobe offers a comprehensive suite of online marketing tools in
its Adobe Experience Cloud. Adobe Analytics is used by a large number of large
enterprises, either separately or as a part of Adobe Experience Cloud, due to
the scalability of the platform.

As a standalone solution, Adobe Analytics has interesting features that most of
its competitors don’t. These include access to a wider array of potential data
sources beyond what is collected via its JavaScript tag, and an AI component
that can predict future traffic levels based on patterns and anomalies in past
visitor data. It also is highly flexible and customizable and integrates with
other data platforms. According to Builtwith.com, Adobe Analytics is used on
over 200,000 websites.

Open Web Analytics Open Web Analytics is an open source web analytics package
written in PHP. It isn’t hosted externally; you have to deploy it on your web
server and configure it yourself (if you’re not in a small business, your system
administrator should be able to do this for you). The advantages are that you
control your data and your deployment in-house, you can pare it down to just the
data and visualizations that you need, and you have the unique ability to track
via both PHP and JavaScript, which allows you to collect some data about people
who use browser plug-ins or VPNs to block trackers, or have JavaScript disabled.
When properly deployed, Open Web Analytics looks and acts a lot like Google
Analytics without all of its extra marketing metrics and Google product
integrations. It’s an excellent bare-bones web analytics package with no bloat,
no monthly fees (though if you use it in production, you should give back to the
project via a donation), and no limit on how many sites you can deploy it to.
The downside is that someone at your company must spend time and effort to
deploy and maintain it.

Tag Managers

Your company or client may have several different JavaScript
trackers deployed on the same site, for various reasons. Multiple trackers will
potentially interfere with one another, and the calls these tags make to
external servers can slow page load times. Ultimately, it’s better to reduce the
number of tags per page, but in a large company that may not be possible due to
interdepartmental politics or budgeting constraints. Tag management services
combine several different JavaScript trackers into one short tag, which reduces
page load times, bandwidth usage, and the effort of managing multiple tags on a
large site. Examples include:

Google Tag Manager This is an extremely popular free tag management solution.
The only downside is that it may not support some less frequently used
JavaScript trackers, and you’ll need to use custom tags for unsupported
trackers. However, that’s only a deal-breaker if you aren’t savvy enough with
JavaScript (or don’t have access to someone who is) to create and maintain that
custom code.

Google Tag Manager is undeniably the first solution you should explore, not just
because it’s free, but because it might be the simplest option.

Tealium Tealium is a commercial tag manager that supports substantially more
tracker code templates than Google Tag Manager, which makes it easier for
nondevelopers to deploy. However, it may incorrectly render your pages for
visitors who block all trackers, and ultimately it’s the conversions that
matter, not the visitor stats.

Search Engine Tools and Features

We've already covered a few Google services
that are designed specifically for website optimization, but if you approach
many of Google’s other properties from a creative angle, you’ll find a lot of
hidden SEO value—particularly for keyword discovery and valuation. In this
section we’ll review several search engine tools and features that can play an
important role in your SEO program.

Autocomplete

Google uses algorithms to predict the rest of your query when you
start typing in the search field and shows you a list of top-ranked predictions
below it. This is an excellent way to see the most popular keywords for your
topics. For example, typing in lilac might reveal suggestions like those shown
in Figure 4-2. Google won’t tell you how many times lilac sugar cookies has been
searched for, but because it appears at the top of the list of suggestions, you
can infer that it was probably searched for more often than the phrases that
appear below it. This can give you important insight into what searchers are
looking for, or what they search for in relation to a specific topic.

NOTE
Autocomplete predictions are strongly influenced by the user’s location (e.g.,
wet n wild phoenix might be a prediction that shows up when a user in Phoenix,
AZ, types the letter w into the Google search box). If you’re searching from a
mobile device, you might also see apps in the list of suggestions.

Figure 4-2. Example Google autocomplete results
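There is no official API for autocomplete data, but for quick, informal research you can query the suggestion endpoint that browser search boxes use. The sketch below relies on that unofficial endpoint, so treat it as fragile: it is undocumented, may change or be rate limited, and will not reflect the personalization a signed-in user sees.

    import json
    import urllib.parse
    import urllib.request

    def autocomplete_suggestions(seed: str, lang: str = "en") -> list[str]:
        # Unofficial endpoint used by browser search boxes; not a supported API.
        url = (
            "https://suggestqueries.google.com/complete/search?client=firefox"
            f"&hl={lang}&q={urllib.parse.quote_plus(seed)}"
        )
        with urllib.request.urlopen(url) as response:
            payload = json.loads(response.read().decode("utf-8", errors="replace"))
        return payload[1]  # payload[0] is the seed term, payload[1] is the suggestion list

    print(autocomplete_suggestions("lilac"))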

Google Ads Keyword Planner

The Google Ads Keyword Planner can analyze your site
content and deliver a list of relevant keywords that are likely to be converted
to clicks in a Google Ads campaign. It’s free to use, but you have to have a
Google account (which is also free) in order to log in. It has excellent
documentation and a walkthrough for new users. Regardless of whether your
company invests in Google advertising, you can use the Keyword Planner to get
good suggestions for related terms, search volume estimates, search trends, and
ad cost estimates for any keyword or URL that you enter. You could use these
numbers to calculate keyword popularity, keyword difficulty, and cost per click
(CPC, covered in Chapter 6). Unfortunately, unless you are running a paid ad
campaign, the search volume will be approximated in wide ranges, so the data
won’t be precise enough for you to use for anything other than quick, basic
keyword valuation. You can work around this and get better data by setting up a
small, low-budget campaign; or, if you know someone who is spending money on
Google Ads campaigns, you can ask them to add you as an authorized user to their
account. The CPC data is much more precise than the other metrics and can be
useful for gauging keyword difficulty for both paid listings and (in a general
way) organic searches. You can get more exact estimates by selecting specific
budgets or costs per click, and
you can forecast the traffic impact and conversion rate based on data from the
past two weeks. The Google Ads Keyword Planner is most useful for projects that
will include advertising, but it can also serve as a free (or low-cost) option
for organic keyword research. Due to its limitations, though, this is no
replacement for search data from any of the major SEO platforms (covered later
in this chapter).

Google Trends

Google Trends enables you to view the relative popularity of a
keyword over time and by geography. You can also compare trend data between two
or more keywords. There are two datasets you can use: realtime (search data from
the past 7 days, up to the past hour), and non-realtime (search data spanning
the entire Google archive, starting in 2004 and ending about 36 hours ago). Both
datasets are sampled (a random, representative sample is taken from the complete
source, similar to a poll), normalized according to time and locale, and indexed
to the time span (on the line graph, the 0 point represents the lowest search
volume over that time span, and the 100 point represents peak search volume).
Google Trends is useful for identifying and predicting spikes in search volume
for your topics, and for finding potentially related topics and queries. By
drilling down into spikes in the trend graph, you can see the events and news
stories that contributed to them, which can give you some good ideas for site
content and media outreach. There’s more you can do with Google Trends,
including comparing relative volumes of different keywords over time (what’s
bigger, car repair or auto repair?), competitive brand strength research, and
seeing trends isolated down to any country in the world. You can read more about
this in Chapter 6. Google Trends is free to access (with a free Google account)
and easy to use, so there’s no harm or risk in looking up some of your topics or
keywords to see if there’s any useful information on them. Only popular search
phrases are sampled, though, so it’s unlikely to give you insights about obscure
long-tail keywords.
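If you would rather pull this data into a script or spreadsheet than read it off the web interface, the third-party pytrends library (pip install pytrends) wraps the same requests the Google Trends site makes. It is not a supported Google API and can break when the site changes, but a minimal sketch looks like this:

    from pytrends.request import TrendReq

    pytrends = TrendReq(hl="en-US")
    pytrends.build_payload(["car repair", "auto repair"], timeframe="today 5-y", geo="US")

    # Indexed 0-100 interest over time, one column per keyword (a pandas DataFrame).
    interest = pytrends.interest_over_time()
    print(interest.tail())

    # Related queries can surface content ideas for each term.
    print(pytrends.related_queries()["car repair"]["top"])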

Google News

Google News (like other searchable news aggregators) enables you to
gauge the media activity around a topic and the recent and real-time popularity
(and therefore competitiveness/difficulty) of keywords. You can also search for
mentions of your brand or company name (in quotes if it’s a phrase). Results are
sortable by time, all the way down to stories from the past hour, which makes
Google News an excellent resource for extremely down-to-the-minute current
keywords. The data from most other keyword research tools is at least a day old.

Related

Most search engines show “related” search results, either on the SERP or
somewhere on a content page. Google shows related searches at the bottom of
every SERP, and related queries under the “People also ask” heading. YouTube
shows related videos in a sidebar next to the viewing pane. X (formerly Twitter)
shows accounts that are similar to the one you’re currently looking at. Amazon
has several different “related” features on every SERP and product page, such as
“Customers who bought this item also bought,” “Save or upgrade with a similar
product,” “Compare with similar items,” and “Frequently bought together.”
“Related” features like these are especially useful for keyword research in
highly competitive markets.

Search Operators

The site: operator is one of the most used operators for
researching information on a site. It’s a fast and easy way to see what pages
Google has indexed for a website. Note, however, that Google normally limits the
number of pages it will show (the limit changes from time to time, but in the
past it has been 300 or 400 pages). An example of these results for mit.edu is
shown in Figure 4-3.

Figure 4-3. Sample site: query results

The pages shown by Google are usually the ones that it sees as being more
important; however, as there is a limit to the number of pages shown, the
absence of pages in this list does not mean that they are of poor quality. The
before: and after: query operators limit search results to those that were added
to the index before or after a given date (in YYYY, YYYY-MM, or YYYY-MM-DD
format). If you only specify a year, the rest of the date is assumed to be 01-01
(January 1). You can use both operators in the same query to limit results to a
specific window of time. For instance, if you wanted to see what people were
saying about the Blizzard Entertainment game Overwatch between its official
release date (October 4, 2022) and its launch (April 24, 2023), you could use
this query: Blizzard overwatch 2 before:2023-04-24 after:2022-10-04

You can also accomplish this via the Tools menu on the Google SERP (by selecting
“Any time,” then selecting “Custom range”). That’s fine for a single query, but
if you’re going to look up several keywords or use several different dates, it’s
quicker to use the search operators instead.
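If you end up generating many of these date-windowed queries, say one per keyword per quarter, it is easy to build the query strings (and the corresponding search URLs) programmatically. A small sketch using only the standard library and the public google.com/search?q= URL format:

    import urllib.parse
    from datetime import date

    def dated_query_url(query: str, start: date, end: date) -> str:
        # Compose the query with after:/before: operators, then URL-encode it.
        q = f"{query} after:{start.isoformat()} before:{end.isoformat()}"
        return "https://www.google.com/search?q=" + urllib.parse.quote_plus(q)

    print(dated_query_url("Blizzard overwatch 2", date(2022, 10, 4), date(2023, 4, 24)))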

SEO Platforms

SEO platforms offer incredibly valuable tools for a wide array of
SEO projects. Though their exact feature sets vary, the best ones can analyze
your site for optimization gaps, suggest topics for new content, compare your
site against your competitors, and find backlinks. Perhaps more importantly,
they also harvest search data from Google, cleanse it of bot traffic, and
connect it to in-depth keyword exploration features that can help you calculate
the best opportunities. These services are not free. Expect to pay anywhere
between $100 and $500 per month for the good ones, depending on the level of
service you need. It’s usually free or inexpensive to do a limited trial,
though, and the customer support is excellent (for the platforms we use and
recommend, anyway). If search traffic is important to you, then this is money
well spent. A professional SEO may even subscribe to several platforms, because
they each do something a little better (or just differently in a way that you
prefer) than the others, and certain projects may benefit from a particular
niche tool or feature. To do a proper valuation of your keyword list (more on
this in Chapter 6), you must have a reliable and current data source for, at the
very least, monthly search traffic and CPC, though ideally you’d have keyword
difficulty and current rank data from an SEO platform as well. We recommend any
of the services discussed here for this purpose (and more—they’re much more than
just keyword research tools!), but there are several others on the market that
might work better for your industry or region, and new ones may appear after
this book has gone to press.

Semrush

Semrush is more than an SEO platform; it also provides tools for market
research, social media management, content creation, and ad optimization. Within
its SEO package, there are tools for keyword research, page optimization, local
search optimization, rank tracking, backlink building, and competitive analysis.
All of these services will be helpful for your SEO work, but we really want to
highlight Semrush’s superior keyword research capabilities. If you only pay for
one keyword research tool, Semrush should be a top candidate. It’s not that the
others are bad (some are actually better for other purposes), but Semrush’s
biggest strength is its keyword database: at over 22 billion keywords (as of
this printing), to our knowledge it’s the largest and most comprehensive in the
industry. The database is built by scraping Google SERPs for the top 500 million
most popular keywords, then analyzing the sites in the top 100 positions for
each. The only potential shortcoming to this huge database is that it’s only
updated about once per month. That’s fine for most SEO purposes (except for
researching emerging trends), though, and it’s on par with the update cycles of
most other SEO platforms. Semrush has a variety of keyword research tools,
including:

Keyword Overview Produces a comprehensive report showing a keyword’s popularity,
difficulty, CPC, trends, and related keywords and questions. You can analyze up
to 1,000 keywords at a time by copying and pasting them from your spreadsheet or
text file into the Bulk Analysis field.

Organic Research Provides a report on the keywords used by the top 100 sites
that relate to or compete with yours.

Keyword Magic Based on the topic or keyword you give it, finds every related
relevant keyword in the Semrush database. You can also specify broad match,
phrase match, or exact match modes to tune how many keywords it provides.

Keyword Gap Analyzes your competitor sites and identifies the best opportunities
for keyword targeting and page optimization.

Keyword Manager Performs a real-time analysis of how up to 1,000 of your
keywords perform on SERPs and with competitors.

Organic Traffic Insights Provides insights related to the keywords driving your
site traffic data.

The other Semrush features either speak for themselves or should be evaluated on
an individual basis. In general, this platform is well-documented and easy to
use, but Semrush’s customer support is among the best in the industry if you end
up needing help.

Ahrefs

Ahrefs includes clickstream data from a wide variety of sources beyond
Google, such as Yandex, Baidu, Amazon, and Bing. It doesn’t offer the
big-picture marketing services beyond SEO like Semrush does, but its SEO toolset
is feature equivalent. Ahrefs provides the following tools:

Site Audit This tool provides a comprehensive SEO analysis of your site, using
over 100 optimization criteria. The report shows technical issues with HTML,
CSS, and JavaScript; inbound and outbound link impact; performance; and content
quality analysis.

Site Explorer You can use this tool to get a report on your competitors’ sites,
including the keywords they rank for, their ad campaigns, and a backlink
analysis.

Keywords Explorer Enter up to 10,000 keywords to get search volume and other
valuable data. This tool provides a large number of filtering options, including
a “topic” column.

Content Explorer You provide a topic, and Content Explorer shows you an analysis
of the top-performing articles and social media posts related to it. Excellent
for finding backlink opportunities.

Rank Tracker Monitors your site’s performance relative to your competitors. You
can have updated reports delivered to you via email every week.

Searchmetrics

Like Semrush, Searchmetrics is a larger marketing services and
consulting company that sells a powerful SEO suite. Searchmetrics was acquired
by Conductor in February 2023. There are four tools in the Searchmetrics Suite:

Research Cloud A domain-level market research tool that identifies valuable
topics and keywords, analyzes competing sites, and reveals content gaps.

Content Experience Provides data that helps you write effective search-optimized
content for a given topic, including the best keywords to use, seasonal
considerations, searcher intent, and competitive analysis.

Search Experience Provides performance monitoring and gap analysis for your site
and your competitors’ sites in organic search. Whereas Research Cloud is
centered on your site, Search Experience is centered on the search journey that
leads to it. This service has a much wider global reach than most others in this
space.

Site Experience Produces a technical audit of your site that reveals potential
problems with search indexing, broken links, orphaned pages, and mobile
responsiveness.

Searchmetrics Content Experience is outstanding for topic
research. One feature that really stands out is the Topic Explorer. You provide
a topic, and Topic Explorer creates a report containing search data and
statistics, and a color-coded interactive mind map that shows how it performs
relative to other semantically related topics in terms of search volume, rank,
seasonality, search intent, sales funnel, and level of competitiveness. You can
drill down into any of those topics to get a more refined view of the keywords
within them.

Moz Pro

Moz Pro offers SEO tools for keyword research, rank tracking, site
auditing, on-page optimization, and backlink building. Its main advantages are
the Page Authority and Domain Authority scores, which help you find optimal
backlink opportunities. The backlink research tool, Link Explorer, states that
it has data on over 47 trillion backlinks. For keyword research, the Moz Keyword
Explorer is an excellent resource, especially for natural language questions. It
also has one of the best keyword difficulty scoring systems in the industry. Moz
Pro’s site auditing tool can be set up to perform a full site audit and identify
a wide range of issues that may be hampering your SEO. In addition, you can set
up alerts that will check your site and proactively let you know when problems
are discovered. Moz also offers a Page Optimization Score to help identify
issues with your pages. This offers content optimization suggestions that enable
you to improve the ability of your pages to rank.

Rank Ranger

Rank Ranger (acquired by SimilarWeb in May 2022) is another
marketing services suite that includes excellent SEO tools for keyword research,
rank tracking, site auditing, and more. Its biggest selling point is its high
degree of customizability. Most SEO platforms have a few static presets for
charts and graphs; Rank Ranger enables you to build your own. What we want to
highlight in particular, though, is the superior natural language question
research capabilities of the Rank Ranger Keyword Finder. If mobile search is a
higher priority for your site than desktop search, and you only want to pay for
one SEO platform, Rank Ranger should be the first one you evaluate.

Other Platforms

As an SEO consultant, you may be asked to use the keyword
research functions that are included in a marketing services suite that your
client has already paid for or is already familiar with. The following are some
other good SEO platforms we’ve worked with in this capacity that we want to
mention:

• BrightEdge

• seoClarity

• Conductor

Again, they do most of the same things, and you'll likely find that
you prefer one over the others for specific purposes. Of course, there are many
more SEO platforms and keyword research services than the ones we’ve listed in
this chapter. These are just the ones we’ve used successfully and are
comfortable recommending. If you want to use a platform that isn’t covered in
this book, some important questions to ask before subscribing to it are:

• Where is the data coming from, and how often is it updated?
• Does the data apply to the region or locale that my site is being marketed to?
• Can I import my whole keyword list via copy and paste or by uploading a CSV?
• Can I export keyword data to a CSV (or XLS) file?
• Does it offer metrics like monthly search volume, keyword difficulty (or keyword competition), CPC, and rank?
• What are its unique reporting capabilities?

Another useful tool is SerpApi. This is a
programmatic interface to Google and other search engines that enables you to
run automated queries and returns SERP data in JSON format. You can plug that
data into a dashboard or reporting engine, or convert it to CSV or XLS and work
with it in a spreadsheet. This is similar to what most SEO platforms do to
scrape search data, except SerpApi offers many more data points, customization
options, and access to a wider array of search engines, including:

• Baidu

• YouTube

• Bing

• Walmart

• Yahoo!

• The Home Depot

• Yandex

• LinkedIn

• eBay

If you only need access to search data, and you have a web developer or
Python guru available to help, then SerpApi is a cheaper alternative to a more
comprehensive SEO platform.
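To give a sense of what working with SerpApi looks like, here is a minimal sketch that requests a Google SERP and prints the organic results. It assumes you have an API key, and the exact parameters and response fields (such as organic_results) should be confirmed against SerpApi's current documentation.

    import requests

    params = {
        "engine": "google",          # which search engine to query
        "q": "lilac sugar cookies",  # the keyword you're researching
        "location": "Portland, Oregon, United States",
        "api_key": "YOUR_SERPAPI_KEY",
    }
    response = requests.get("https://serpapi.com/search.json", params=params, timeout=30)
    data = response.json()

    for result in data.get("organic_results", []):
        print(result.get("position"), result.get("title"), result.get("link"))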

Automation

Typically, retrieving keyword rank data from an SEO platform is a
manual process: you provide a list of keywords, then filter and sort the data,
then export it to a CSV or XLS file. From there, you’d import or copy/paste the
data into a spreadsheet or analytics engine. Some Google Sheets extensions exist
that pull data from Google services like Google Search Console and Google
Analytics. One of our favorites is Search Analytics for Sheets. Some SEO
platforms also offer access to their data via an application programming
interface (API), which enables you to script part or all of the export process.
If your preferred platform or data provider has an API, expect to pay extra for
an API key and a certain number of monthly API usage units. Some services allow
you to pull just about any data from them via APIs, not just keyword reports.
You can use this data in custom or third-party dashboards or business analytics
packages and CMSs, or you can write a quick Python script to fetch your new
keyword data CSV every month. If you have a very large (by your own standards)
keyword dataset, or if you maintain separate keyword lists for subdomains,
content directories, or product lines, then you may find value in using AI to do
the categorization and sorting for you. Specifically, we want to call out BigML
as an easy-to-use platform for building machine learning models for sorting
large keyword datasets. Note that generative AI tools such as ChatGPT can be
used to perform specific functions here too. One example is to have these tools
help you classify a list of keywords based on the nature of their intent
(transactional, informational, navigational), or group them based on semantic
relevance. You can read about more applications for ChatGPT in executing SEO
tasks in Chapter 2.
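As a toy illustration of the categorization step described above, the sketch below applies simple keyword-matching rules to label each term as transactional, navigational, or informational and writes the result back to a CSV. In practice you would swap the rules for a trained model (for example, one built in BigML) or an LLM prompt, but the shape of the workflow, keyword in and label out, stays the same; the file names and trigger words here are hypothetical.

    import csv

    TRANSACTIONAL = ("buy", "price", "cheap", "deal", "coupon", "for sale")
    NAVIGATIONAL = ("login", "sign in", "website", ".com")

    def classify_intent(keyword: str) -> str:
        kw = keyword.lower()
        if any(term in kw for term in TRANSACTIONAL):
            return "transactional"
        if any(term in kw for term in NAVIGATIONAL):
            return "navigational"
        return "informational"

    # Reads a hypothetical keywords.csv with a "keyword" column and writes a labeled copy.
    with open("keywords.csv", newline="", encoding="utf-8") as src, \
         open("keywords_labeled.csv", "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames + ["intent"])
        writer.writeheader()
        for row in reader:
            row["intent"] = classify_intent(row["keyword"])
            writer.writerow(row)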

YouTube Optimization

YouTube is the world's second most popular website (Google
being #1), and it has so much content that it’s nearly impossible to navigate
without its built-in search function. Unfortunately, unlike Google’s video
vertical search feature, it only returns results for videos on YouTube, so its
benefit to SEO is at best indirect (this is covered in more detail in Chapters
11 and 12). That said, if you’re optimizing for YouTube search, vidIQ is one
resource that can help: it’s focused on gaining YouTube views and subscribers.
Another excellent YouTube resource to consider is TubeBuddy, which provides
useful features such as a keyword explorer and A/B testing of video titles and
thumbnails.

NOTE
Some SEO platforms (such as Ahrefs and Rank Ranger) also have
YouTube keyword research and rank-tracking capabilities.

Conclusion

In this chapter we covered the SEO tools that we like to use (or in
some cases are forced to use because of budget limitations or client
preferences), but there are hundreds of other options out there that you may
prefer for a variety of reasons. Perhaps there are tools that are better suited
to non-English languages, or plug-ins for enterprise service or analytics suites
that you’ll have to learn to use because they’re already deployed and paid for.
You may also find that your preferred toolset for one project may not be as good
a fit for a completely different project. Beyond that, search technology, the
web in general, and the field of SEO are all constantly evolving. New needs and
concerns will arise as old ones fade away; new utilities and services will
continue to enter the market, and some will merge with others or disappear
entirely. SEO platforms are very competitive and will add new features and
capabilities over time, so even if you’re an in-house SEO with only one site to
manage, it’s still a good idea to re-evaluate your SEO tools now and then.
Regardless of which platforms, services, and utilities you select, the most
important thing is that you have the right tools to do SEO your way.

CHAPTER FIVE

SEO Planning

Before you dig into the gritty details of your SEO campaigns, you
first need to work through the overall goals and determine the best approach for
your business to use. This will include understanding available resources,
budgets, limitations, history, and more. While this may not feel like an SEO
activity, your campaigns will not succeed if you don’t provide the right
foundation.

Strategy Before Tactics

Planning your SEO strategy in advance is the most
important step toward effective and successful SEO implementation. Things rarely
go perfectly according to expectations, but if you are thorough in your
process—if you put an appropriate level of effort into planning without relying
too heavily on the accuracy of the plan—you’ll be prepared to adapt to
unanticipated obstacles. Your SEO strategy should generally address the
following components: 1. Understanding the business you’re working with and its
objectives 2. Assessing web development, content, analytics, and management
resources: a. Identifying critical technical SEO issues b. Identifying the most
efficient and highest-impact improvements c. Prioritizing tactics and the
phasing of implementation 3. Developing a content strategy to efficiently and
effectively address the needs of users related to your products/services and the
value that they provide 4. Tracking, measuring, and refining on a timeline that
is supported by resources and overall business objectives

Developing an effective SEO strategy requires that you learn as much as you can
about your client’s (or employer’s) business. In some cases, your initial plan
will be limited to critical technical SEO tasks. After these fundamentals have
been addressed, you’ll be able to begin collecting and analyzing data that will
guide you in developing a long-term plan for higher-level SEO tasks, such as
competitive analysis, keyword research, seasonal planning, and content
marketing.

The Business of SEO

Your SEO strategy is only as valuable as your ability to
execute it. You can know exactly what’s wrong with a site, and exactly how to
fix it, but be financially or organizationally blocked from implementing the
necessary solutions. Thus, before diving into technical SEO implementation, it
is crucial to address the important topic of business management. NOTE This
section is only a basic overview of topics that are covered in much more detail
in books about project management and business administration. For a more
in-depth perspective on soft skills and client strategies for SEOs, we recommend
Product-Led SEO, by Eli Schwartz (Houndstooth Press).

This is true whether you are working in-house, as an independent consultant, or
within an agency. In general, there isn’t anything different about the SEO
consultancy industry, as standard business fundamentals apply—but there are a
few extra considerations when it comes to evaluating clients:

Retain an attorney or law firm knowledgeable in digital marketing and ecommerce
Eventually you will have a prospective client who may be doing something
illegal, and this is a client whom you should decline to take on. If there are
even slight concerns about the legality of a site you’re asked to work on, ask
your lawyer for an opinion before you agree to take on the client. Retaining
counsel is especially important when dealing with international clients, from
both a contract and a content perspective.

Customize your payment agreement terms to ensure you get paid for completed work
There are numerous ways to structure client agreements, and payment terms are no
exception. Some companies charge a flat rate paid monthly, with or without an
initial cost; others get paid based on specific projects, while still others get
paid in other ways tied to deliverables over a short or long term. Consider the
inclusion of late fees for payments not received by their due date.

Consider confidentiality and nondisclosure agreements (NDAs) as needed Both you
and your client have intellectual property you both likely want to protect.
Mutual confidentiality and NDAs can be an easy way to protect this valuable IP
and create a sense of safety within the relationship.

Clearly identify performance terms Consider what work will be done, by whom, by
when, and in what manner, as well as what support the client needs from you to
ensure performance. The right attorney can draft a contract template for you to
use with your clients which they will be comfortable enforcing and defending,
and which will set you up with the appropriate protection.

Extensively document your work Break down projects into tasks and keep track of
hours and expenses (with receipts). Establish analytics and measurement
requirements and/or protocols specific to the client’s needs and the overall SEO
strategy.

Audit all your legal boilerplate Have a legal expert do a comprehensive business
risk assessment, reviewing all contractual obligations and intellectual property
provisions to ensure your rights are protected and your responsibilities are
limited.

Ethical and Moral Considerations

Familiarize yourself with the type of content
your client, or prospective client, is asking you to promote online to ensure
that the client’s content is in accordance with the various community and safety
standards and guidelines on the platforms on which you choose to promote them.
Every site can benefit from good SEO, even those containing and promoting
content considered harmful, terrorist, and/or violent extremist. These types of
sites, and the entities behind them, are regularly identified by members of the
Global Internet Forum to Counter Terrorism (GIFCT), including Facebook,
Microsoft, X (formerly Twitter), and YouTube, in their efforts to remove harmful
content from internet circulation. Those seeking to reduce the visibility of
these content publishers may use a variety of countermeasures, including SEO
efforts.

The Escape Clause

Despite your initial due diligence, issues can still arise
after the contract is signed and/or the project is funded. Some examples
include: • There’s a change in management or ownership. • The information you
were given was incorrect (such as a deadline, sales target, budget, or
asset/resource control).

• A critical internal partner refuses to participate (e.g., the engineering
director or product manager refuses to assign designer or developer time to SEO
tasks). • Other priorities usurp the internal resources that were initially
assigned to SEO. Often, these issues stem from internal client misconceptions
about the complexity and importance of SEO, so make sure your requirements for
client support are clearly articulated before the beginning of work and
identified throughout the course of the engagement.

Deciding on Accepting Work

There is an important philosophical decision to be
made related to the clients you take on or the employment that you’re willing to
accept. For some SEO professionals it’s hard to turn down a client or job, even
if they are a “bad” client, simply because you need the money. Some people and
agencies take on all the new projects they are offered, to drive as much growth
as possible. Others are very selective about the work they accept or will fire
clients who cross certain lines (perhaps they find what the client markets and
sells objectionable, or simply determine that the likelihood that an SEO program
will be successful with them is low). Some warning flags to watch for include: •
Strongly held beliefs about how Google works that are inaccurate • A desire to
implement a spam-oriented approach to SEO • Unwillingness to invest in SEO, or
to make organizational changes if such changes are required to succeed •
Internal processes that make progress with SEO impossible (for example, having
many disparate groups in control of site content without a cohesive
communication and implementation strategy) How you choose to handle such
situations is entirely up to you, but engagements that fail, even if it’s not
your fault in any way, are unlikely to expand their business with you or provide
referrals to other potential clients or employers.

Typical Scenarios

Over time, most companies with a web presence figure out that
they need to have greater search visibility. The SEO solution that will work
best for them depends on various factors, including how large the company is,
how large a footprint the company has online, and how well it has understood and
leveraged the scope of the opportunity from a content perspective.

Startups (Unlaunched)

An unlaunched startup is as much of a “green field” as
you’ll ever find in the SEO industry. Since you are starting from scratch, you
will be able to take SEO into account in the initial technology and development
decisions. Your choices will directly influence the site’s SEO success for years
to come. Whether you’re an employee or a consultant, your role is likely to be
advisory as you guide the company toward avoiding search visibility and content
development issues, while also guiding overall usability, design, and
development in the initial startup phase. If you’re an employee, it’s likely
that your involvement in SEO will be on a part-time basis and you’ll have many
other duties at the same time.

Startups (Launched)

For startups that already have a web presence but have
little or no SEO visibility, the first question you’ll usually ask yourself is,
“Improve it or replace it?” Startups often have limited human resources. For
example, the administrator for the website may be the same person who serves as
the IT manager and/or IT support administrator (two entirely different roles!),
and this person is often tasked with managing the SEO efforts that the marketing
team decides to pursue. (In contrast, in a larger organization, the management
of SEO would likely be divided among several roles filled by several people.) In
this scenario, the scope of your role as an SEO will be dependent upon the
organization’s ability to understand SEO implementation requirements, and the
level of investment available for SEO. In all situations, it is important to
remember that to be effective (and for the organization to have a chance at SEO
success), you must have strong influence over technology, development, and
design decisions for the website, and you must have the ability to effectively
advocate for the resources needed to execute on those decisions.

Established Small Businesses

For the purposes of this section, what we mean by a
“small business” is a private company that doesn’t intend to go public. Startups
are usually heavy on early investment and aim to either go public or get bought
out by a bigger corporation. Sometimes, though, there’s enough profitability to
keep going, but not enough growth to attract an advantageous buyout. And then
there are businesses that start small and never have the intention of going
public or selling out—most often family-owned and sole proprietor businesses.
The good news about privately held businesses is that they’re forced to be
profitable; the bad news is that they also tend to be cautious when it comes to
spending. Private business owners tend to think in different time frames than
public business
executives; year-end sales totals and monthly expenses are usually more
important than quarterly profits. As with startup organizations, if you’re
responsible for SEO in a small business, then you likely have several other
responsibilities at the same time. If you’re consulting for one, you can avoid
sticker shock the same way car salesmen do: by spreading out your work over a
longer timeline to keep monthly costs low. Either way, you will need to
prioritize tasks that will most reliably show month-on-month progress. Small
business organizations can be a nightmare. They often want to do things as
cheaply as possible and have unrealistic expectations about timelines.
Frequently, someone at the company has already tried to improve search rankings
based on information from old articles and blog posts and ended up making things
worse. Then the search traffic stops, and you’ll get a panicked call or email
from the owner, begging you to fix it. Small business sites—especially ecommerce
and local service providers (including a surprising number of lawyers)—that have
been around for a decade or longer are a great place to find black hat SEO
gimmicks. If there has been a sudden drop in search traffic, you should begin by
investigating if the drop is due to a Google penalty (discussed in Chapter 9).
It isn’t impossible to undo the damage caused by unethical SEO hacks (see
Chapter 9), but you can’t make any guarantees. At best, sites that have been
manually removed from the search index will take a lot of time and effort to
repair.

Large Corporations

Whereas private companies present autocratic challenges for
SEOs, large public corporations often sabotage SEO efforts with bureaucracy and
interdepartmental warfare. Corporate environments may offer more resources for
SEOs (in the form of money and in-house talent), but only if you’re willing and
able to play political games. You might have approval from the CEO to do
whatever is necessary to improve search traffic, yet still find yourself
stonewalled by product managers and marketing VPs. NOTE Public companies are
rarely as top-down as they look from the outside. In fact, in many cases you’re
better off thinking of them as bottom-up. Successful corporate employees only
seek the approval of their direct manager; this paradigm continues upward, level
by level, all the way to the C-suite.

To be successful, you must be able to build a cross-functional and
multidisciplinary SEO team that spans departments and has the explicit support
of director-level management in engineering, IT, and marketing. The ideal SEO
team consists of a manager (who may have a manager, director, or higher title),
designer (and possibly other creatives), software engineer (or a full web
development team), systems administrator (or IT representative), business
analyst, and systems analyst. Ideally these people
identify as SEOs or have SEO experience, but if not, they should be competent
enough to learn how to tune their skill sets to the requirements and
sensibilities of this field. NOTE Nearly every department can help with SEO in
some way, but the only ones that can effectively block SEO efforts are
engineering, marketing (including design and product management), and IT.

The key to success in a large corporation is to integrate with its process for
defining and tracking goals. Every company’s culture is a little different, but
all corporations that have a history of success use some kind of paradigm for
defining very large goals and breaking them down into departmental objectives.
In order to get budget and resource commitments, you must figure out how to
attach SEO tasks to every level of that hierarchy. NOTE In this section we list
the most common methods of corporate goal tracking, but many others exist, and
new ones tend to emerge every time a new business book hits the bestseller list.
Don’t be afraid to ask a product manager, project manager, business analyst, or
systems analyst to help you understand corporate jargon.

At the top of the food chain is the big, hairy, audacious goal (BHAG; we will
leave it to you to figure out variants of this acronym). This is typically
something that seems impossible now but could be achievable in time with focused
and sustained effort. Some classical examples of BHAGs are: • Find a cure for
breast cancer.

• Organize the world’s information.

• Put a person on the Moon and bring them back to Earth safely.

One level below the BHAG is the management by objectives (MBO) paradigm,
popularized by the legendary author and business process consultant Peter
Drucker, which establishes major
business objectives (also MBOs, depending on your client’s corporate vernacular)
that work toward a BHAG, but are on a shorter timeline (usually one fiscal
year). Some MBO examples based on the previously listed BHAGs might be: •
Identify genetic and environmental factors involved with breast cancer. • Design
a spaceship that can go from the Earth to the Moon and back. • Create a computer
program that can find and record all relevant web pages and categorize them
according to a curated metadata model.

Below the MBO, you might have a statement of objectives and key results (OKRs).
This is a much more specific goal that defines exactly what someone will do to
achieve a measurable result. For instance: • I will develop a DNA test that can
identify genes associated with breast cancer. • I will design a spaceship that
can propel a human-habitable capsule beyond Earth’s atmosphere. • I will
engineer an automated script called Googlebot that will scrape an HTML page’s
text content, then follow all of its outbound links. Also below (or sometimes on
the same level as) MBOs are specific, measurable, achievable, relevant, and
time-bound (SMART) goals. Basically, these are OKRs that include a specific time
frame. For instance: • I will create a saliva test that can accurately detect
both a BRCA1 and a BRCA2 gene mutation before the next anniversary of Susan G.
Komen’s birthday. • I will design a multistage rocket that is capable of
achieving planetary escape velocity before my Russian competitors. • I will map
the World Wide Web as a graph data structure before I get evicted from the
Stanford dormitory. Regardless of the methodology and acronyms, everything you
do for your client’s company should align with the hierarchy of goals that it
claims to honor. At some point you will be asked to defend your SEO budget; the
best way to do this is to show how every dollar is spent in service to the goals
represented by the appropriate corporate acronyms.

Initial Triage

Congratulations—you got approval for the project (if you’re
in-house) or the contract (if you’re an agency consultant), and you’re ready to
get started. Here are your action items for phase one, regardless of the type or
size of company you’re working with: 1. Figure out what’s already been done in
terms of SEO: who was here before, what did they do, and, if they’re not here
anymore, why did they leave? In addition, map out how the company sees the role
of content in SEO and determine its willingness to invest in that content going
forward. 2. Identify the IT and SEO products and services that are in use right
now and determine whether you’ll be locked into certain vendor agreements for
your toolset. Learn what the content creation budget is and how flexible it
might be going forward. 3. Implement some baseline analytics (or configure the
existing analytics
service, if there is one), and start collecting data. 4. Look for major
technical SEO problems, then fix them. Identify content gaps and get the content
created to fill them.

Document Previous SEO Work

If there are another SEO’s fingerprints on the
website, stop and take stock of what they’ve done. Ideally, they did everything
right and documented their work in a detailed log, which you can compare against
analytics data to gauge the impact of each change. One important aspect of
investigating prior SEO work is determining what was done in terms of link
building activities. Were shady practices used that carry a lot of risk? Was
there a particular link building tactic that worked quite well? Analyzing the
history of link building efforts can yield tons of information that you can use
to determine your next steps. If no such log exists, start creating one
yourself. Begin by asking your client if they have a copy of the previous SEO’s
recommendations, requests, or statements of work. Next, look for prior snapshots
of the site on the Internet Archive’s Wayback Machine, and make a note of what
changed and approximately when. NOTE From this point forward, document every
change you make, including all relevant details such as URLs, timestamps, and
extra procedures such as server restarts and software upgrades. This may seem
tedious in the beginning, but you’ll thank us later when something goes wrong
and you don’t know which of the past 50 changes to revert.
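
If you need to reconstruct an approximate timeline of past site changes, the Wayback Machine's CDX API can list archived snapshots programmatically. The sketch below is a minimal Python example; the field list and example URL are assumptions to adapt to your own domain.

    import requests

    def list_snapshots(url, limit=50):
        """Return (timestamp, status code) pairs for archived copies of a URL."""
        resp = requests.get(
            "https://web.archive.org/cdx/search/cdx",
            params={
                "url": url,
                "output": "json",
                "fl": "timestamp,statuscode",
                "limit": limit,
            },
            timeout=30,
        )
        resp.raise_for_status()
        rows = resp.json() if resp.text.strip() else []
        return rows[1:]  # first row is the header

    for timestamp, status in list_snapshots("example.com/"):
        # Each snapshot can be viewed at web.archive.org/web/<timestamp>/<url>
        print(f"https://web.archive.org/web/{timestamp}/example.com/ ({status})")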

Look for Black Hat SEO Efforts

Occasionally you’re going to find some amount of
unethical SEO activity, otherwise known as black hat SEO. The most common black
hat tactics are:

Keyword stuffing Overusing specific keywords in your content to such an
obnoxious degree that no human would want to read it. This term can also refer
to dumping your keyword list into some part of a page (or, not uncommonly, every
page on a site), sometimes at the bottom or in invisible text.

Link trading Buying or selling links without qualifying them properly as
nofollow. Note that this includes paying to place guest posts.

Cloaking Showing Googlebot a search-friendly page but redirecting real people to
spammy pages.

Content theft Stealing content from other sites, then either republishing it as
is or spinning it (using a script to change the wording to try to defeat
duplicate content detection algorithms).

Thin content Similar to keyword stuffing, this refers to including content
that’s designed to appeal to search engines but is worthless to humans. Often
this is the same block of text published on several pages on a site, with the
keywords swapped out for each page.

Spamming forums and blog comments Placing links to your pages in the comment
sections of blogs and on forums.

Automation Using a script or service to automate any of the above, especially
content theft (scraping and spinning), thin content generation, content
submission to sites that accept user contributions, forum and blog comment
spamming, and auto-clicking on SERPs and ads. These are just the old standbys;
new dirty tricks are constantly being discovered and exploited. Black hat
tactics often work very well at first, but when Google detects them (and it
always does), it removes the offending pages from the search index and may even
ban the entire site. It’s never worth it. The quickest way to find the most
obvious illicit SEO tactics is to use a backlink analysis tool such as Majestic
or LinkResearchTools to find links from irrelevant and/or low-quality sites.

Watch for Site Changes That Can Affect SEO

Ideally your log should track all
changes to the website, not just those that were made with SEO in mind—though
you may not always have access to that information, especially if IT and
engineering services are outsourced. In larger organizations, many different
people can make different kinds of changes to the website that can impact SEO.
In some cases, they don’t think the changes will have an effect on SEO, and in
other cases they don’t think about SEO at all. Here are some examples of basic
changes that can interfere with your SEO project:

Adding content areas/features/options to the site This could be anything from a
new blog to a new categorization system.

Changing the domain name If not managed properly with 301 redirects, this will
have a significant impact.

Modifying URL structures Including the web server directory structure.

Implementing a new content management system (CMS) This will have a very big
impact on SEO. If you must change your CMS, make sure you do a thorough analysis
of the SEO shortcomings of the new system versus the old one, and make sure you
track the timing and the impact of the change so that you can identify critical
problems quickly.

Establishing new partnerships that either send links or require them Meaning
your site is earning new links or linking out to new places.

Making changes to navigation/menu systems Moving links around on pages, creating
new link systems, etc.

Content changes Publishing new content, revising existing content, or deleting
old content can all have a significant impact on SEO. There probably isn’t much
you can do about changes that happened more than a few months ago, except
perhaps redirect broken incoming links to new URLs and begin the work of
rebuilding the lost SEO traffic. More importantly, we mention these common site
changes because you will now have to be vigilant for them. You aren’t
necessarily going to be notified (let alone consulted!) about changes that other
people in the organization (or its outsourced partners) make.

Identify Technical Problems

A large part of SEO is a technical process, and as
such, it impacts major technology choices. For example, a CMS can either
facilitate or undermine your SEO strategy. The technology choices you make at
the outset of developing your site can have a major impact on your SEO results.

Servers and hosting Whether you host your own server onsite, colocate it in a
managed facility, or rent a server from a hosting provider, make sure it’s built
to handle the level of traffic that you expect to have. In many instances, it
may be underpowered for the traffic you already have. Googlebot makes note of
page load times, and while it may wait around a little while to see what’s on a
slow-loading page, a human often will not; this is what’s known in the industry
as “a poor search experience.” If your site loads slowly, this can have an
impact on how you rank in the search results (you’ll learn more about this in
“Google Page Experience” on page 355).
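
As a very rough first check of server performance, you can time a plain HTTP request to a few key URLs. This is only a sketch: it measures server response and HTML transfer time, not full page rendering or Core Web Vitals, and the URLs are placeholders.

    import time

    import requests

    # Placeholder URLs: use a handful of your own key templates/pages.
    URLS = [
        "https://www.example.com/",
        "https://www.example.com/products/",
    ]

    for url in URLS:
        start = time.monotonic()
        resp = requests.get(url, timeout=30)
        elapsed = time.monotonic() - start
        print(f"{resp.status_code}  {elapsed:5.2f}s  {len(resp.content):>8} bytes  {url}")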

Bandwidth limits Hosting providers limit the amount of incoming connections and
outgoing data transfer (generally referred to collectively as bandwidth) for
each server or account. You can usually purchase a higher service level to raise
your bandwidth cap; if not, then you’ll have to find a better hosting solution.
If you don’t have enough bandwidth, and you have some piece of content that
suddenly goes viral or your site suffers from a distributed denial-of-service
(DDOS) attack, your site will either load very slowly or not at all. If you’re
lucky, it’ll go back to normal when the tidal wave of traffic recedes; if you’re
not lucky, your site will stay offline until your monthly bandwidth allocation
is replenished. NOTE Be wary of hosting providers that claim to offer
“unmetered” bandwidth. This doesn’t usually mean what you probably think it
does. Most often it means that you still have a (sometimes secret) bandwidth
limit, but the provider won’t keep strict track of how much you’ve used. It may
also mean that bandwidth is throttled down after you reach a certain threshold,
which will cause your pages to load slowly.

Gated content Making content accessible only after the user has completed a form
(such as a login) or made a selection from an improperly implemented pull-down
list is a great way to hide content from search engines. Do not use these
techniques unless you want to hide your content from the whole web. Also, do not
attempt to use cloaking techniques to show the content to search crawlers, but
not to human visitors. If your business model is to get people to pay for
content, then make sure that a significant amount of content is visible without
going through the paywall in order to have a chance of still earning some
organic search traffic. In addition, you can use flexible sampling as a way to
let Google crawl and index your paywalled content. This requires that you let
users who see that content in the Google results read the entire piece without
going through the paywall, but you can limit how many times they get to do that
per month. You can read more about this (and Google’s guidelines on cloaking) in
Chapter 7.

Temporary URL redirects When you change a page URL, you have to instruct your
web server to resolve an existing URL to a new one, via a redirect. Redirects
can be implemented server-side or via meta refresh, using a <meta> element with
the http-equiv parameter set to
"refresh", and a content parameter specifying the URL to load and the time
interval to wait, in seconds. The former is generally the preferred method,
unless unsupported by your web platform. If your URL change is a temporary one,
then you would typically use a temporary redirect command in your web server
configuration (usually a 302-, 303-, or 307-type server-side redirect), but meta
refresh > 0 and http refresh > 0 also serve as temporary redirects. Most often,
however, URL changes are permanent. The correct HTTP status code for that is a
301 or 308 “moved permanently” redirect (less commonly used forms of permanent
redirects include meta refresh = 0 or http refresh = 0). When search engines
find a “moved permanently” redirect, they view this as a strong signal that the
redirect target should be canonical and update the index with the new URL,
passing most or all of the link authority from the old URL to the new one. If
Googlebot finds a “temporary” redirect, it may continue to keep the original URL
(the one being redirected) in its index, and that URL may be the one that
garners link authority instead of the new URL (the one being redirected to).
NOTE While there are many cases where a 302 redirect could lead to undesirable
results, there are some scenarios where 302 redirects are recommended. For
example, Google recommends 302 redirects as part of one solution for serving the
right home page to international users.

We’ll talk more about redirects in Chapter 7.
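
One quick way to audit existing redirects is to fetch an old URL and print each hop in the chain, confirming that permanent moves really return a 301 or 308 rather than a temporary status code. This is a minimal sketch using the Python requests library; the URL is a placeholder.

    import requests

    def show_redirect_chain(url):
        """Print every hop in the redirect chain, then the final destination."""
        resp = requests.get(url, allow_redirects=True, timeout=30)
        for hop in resp.history:  # intermediate (redirect) responses
            print(hop.status_code, hop.url, "->", hop.headers.get("Location"))
        print(resp.status_code, resp.url)

    show_redirect_chain("https://www.example.com/old-page")  # placeholder URL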

Mobile responsiveness The majority of internet searches come from mobile
devices, and Google bases its index on what it finds in a crawl of your mobile
site (though it continues to crawl the desktop versions of your pages). At
minimum, your site must be mobile-friendly in order to succeed. If you’re
building and launching a new website, it should be designed to be mobile-first,
optimizing for speed and mobile usability across different device types, screen
sizes, and mobile operating systems. For existing websites that haven’t been
substantially updated for a long time, it is recommended to set aside time for a
comprehensive audit to assess issues that are impeding your current search
visibility and determine the site’s mobile-friendliness. Note that monitoring
your search visibility and mobile-friendliness should be done on a regular
basis, such as monthly.

Know Your Client

Now your SEO planning shifts from the objective to the
subjective. Here are some useful questions to help guide you in the right
direction, optimally considered from a nontechnical frame of reference:

What are the company’s current objectives, and what are its future goals? This
might align with the official mission statement, and it might not. For instance,
Red Bull is manufacturing and selling energy drinks and hopes to expand its
market share, or a prelaunch startup is developing a product that it hopes to
bring to market on a specific release date.

What are all of the identifiable outcomes that the company’s success relies on?
For instance: a closed deal, product sale, mailing list subscription, social
media follow, download, or service sign-up.

Who are the people (your users) who are likely to perform those ultimate
actions? Who is this company trying to attract or influence? Who are its
customers, clients, users, or followers? What are their demographics? Where do
they live? What languages do they speak?

What are all of the current and possible methods of connecting people to those
ultimate actions? Are there online and offline advertising campaigns, discount
sales, seasonal opportunities, bulk mailings?

How does the company track and measure success? There can be several answers to
this, but there should be one metric that takes precedence. Gross revenue or
return on investment (ROI) is usually factored into this measurement, but within
the context of revenue generation there are various identifiable user actions
that, when tracked and measured, can generally correlate with broader financial
success over various time frames.

What does the competitive landscape look like? Who are the market leaders in
your market space? What are they doing that you’re not? What are the search
terms that you’re winning on, and what can you learn from that to guide your SEO
strategy going forward?

Take Inventory of the Client’s Relevant Assets

Determining the current assets
available for SEO is the first place to start when planning your overall
strategy. Even if your client’s business is relatively small or new, there is
likely still a cache of valuable material that you can use for planning and
keyword research, such as:

• Lists of products, brands, and trademarks • Commissioned studies and paid
market research reports • Customer interviews, reviews, testimonials, and
surveys • Licensed content (photos, videos, music) • Analysis reports of
previous marketing campaigns • Access to Google Analytics (or other
marketing/analytics platform) data • Access to Google Ads data • Site
architecture or information architecture documents Other assets that require
more consideration are listed in the following subsections.

Customer personas Customer personas are representations of your ideal customers
(which may be consumers or B2B clients). In some industries, these may be
referred to as buyer personas or customer profiles, and your company may have
already defined them. If so, use the ones that you already have. If you need to
create your own personas, then consider these questions about your ideal
customers: • What age range is each persona?

• Are they married or single?

• Where do they live?

• Do they have kids (and if so, how many)?

• What is their job title (or role within the organization you’re selling to)? •
What are their hobbies?

• Do they primarily search from a mobile device, or a computer?

• How much money do they make (or what is their net worth range)?

• What do they spend annually on industry products and services?

Next, consider some broader, more subjective questions, such as: • How late do
they work?

• What do they do on the weekends?

A good persona includes psychographics in addition to demographics. In other
words, it peeks into the customer’s heart and mind to glimpse what motivates
them and what keeps them up at night. Thus, you’ll also want to consider
questions like the following: • What are their fears?

• What are their aspirations?

• What are their frustrations?

• What are their wants?

• What problems are they trying to solve? (Go deeper and identify the unspoken
problem, not just the obvious, spoken problem.)

• Who are their mentors, heroes, supporters? Detractors, enemies, critics?

• What does success look or feel like? Failure?

The more details you can think of, the better. Record all of this information
in a text file, then give the
persona a one- or two-syllable name that is easy to remember. Create a persona
to represent each different type of ideal customer. This process will help you
divine each persona’s pain points and desires, and that will help you understand
the search queries they’ll use and the questions they’ll ask.

Domain names and subdomains If you have multiple domains, some relevant
considerations for most SEO projects would be: • Can you 301-redirect some of
those domains back to your primary domain, or to a subfolder on the site for
additional benefit? • Be sure to check the domain health in Google Search
Console and Bing Webmaster Tools before performing any 301 redirects, as you
want to ensure the domain in question has no penalties before closely
associating it with your main domain(s). • If you’re maintaining those domains
as separate sites, are you linking between them intelligently and appropriately?
If any of those avenues produce potentially valuable strategies, pursue
them—remember that it is often far easier to optimize what you’re already doing
than to develop entirely new strategies, content, and processes.

Vertical content Take the time to properly catalog all of your digital media
files. From an SEO perspective, it’s useful to think of digital media such as
images and video as vertical content that is separately indexed. A YouTube link
is a URL that points to a page; from a search indexing perspective, it’s also a
video, especially when considering that the same video hosted at that URL is
viewable within the YouTube mobile app. You cannot make changes to the HTML or
any of the code on a hosted service like YouTube; you can only optimize the
content you upload and the metadata for your uploaded files. When you control
the hosting, however, vertical content can be a secret weapon for SEO. There may
be high-difficulty keywords that you don’t have the budget to compete for in web
search, but which offer affordable opportunities in verticals like images or
video. Consider the Chapter 3 example of a query that has

132

CHAPTER FIVE: SEO PLANNING

vertical intent: diamond and emerald engagement ring. The product vertical
results are mostly huge brand names (Diamondere, Tiffany & Co.) who’ve
undoubtedly paid a lot of money for top placements. As we explained earlier,
though, this query suggests a search for both products and images. If the
product vertical is out of reach, then you can more affordably target the Google
Images vertical engine.

Offline and nonindexed content You may have articles, blog posts, and videos
that were never published on the web, or anywhere else. Even if content was
published on the web, if it was never indexed by a search engine, then it
effectively was never published. In the copy room filing cabinet there could be
hundreds of well-written articles that were only published in a print newsletter
which are still relevant enough to be reused. For example, you could publish the
entire cache of newsletters in a searchable archive on your website. The same
concept applies to email newsletters that were never published in a searchable
archive.

Perform a Competitive Analysis

Once you have performed in-depth keyword research
(or if you can rely on previously performed research), you are well positioned
to begin carrying out some competitive research. Business owners usually have a
very good idea of who their direct competitors are (regardless of search result
rankings), so this can be a useful starting point, especially for companies that
serve specific regions, locales, or niche markets. In a larger company, the
marketing and sales departments should be an excellent resource on this topic.
Marketing, at least, should have some competitive analysis reports or
presentations that you can learn from. NOTE High-ranking competitors aren’t
necessarily good examples of SEO, so be careful where you draw your inspiration
from. Don’t make any big decisions or assumptions about competition until you’ve
assessed their SEO strategies.

Information Architecture

All of the pages and digital media resources that
comprise your website must use concise but descriptive filenames, contain
appropriate metadata, and be linked to from other pages on the site. From a
usability perspective, users on any type of device should be able to quickly and
intuitively navigate to your home page and to the information they’re looking
for. If a user can find your content easily, so can a web crawler. That’s
because Googlebot discovers new pages by following links, processing sitemap
files, and analyzing page content.

Information architecture encapsulates everything described in the previous
paragraph: nomenclature, taxonomy, and navigation for a large collection of
information. It’s a subset of library science, and it’s useful to think of your
website in similar terms. If web pages and vertical digital media assets were
books, and your website were a library, how would you organize it such that
visitors do not need an index to find a specific resource? NOTE If you’re having
difficulty with this, you may need to engage with other team members for
clarification.

For planning purposes, you must create a site architecture document that defines
or proposes a filenaming convention and illustrates the directory structure and
content taxonomy. A good taxonomy has as few categories as possible, without
defining them so broadly that the classification becomes useless. You will be in
a much better position to flesh out your site architecture document after you’ve
gone through the keyword research process (covered in Chapter 6). Once you have
your site architecture mapped out, here are some refinements and addenda to
consider:

Cross-link your cross-references. Many of your pages and assets can reasonably
fall into more than one category. Beyond that, keyword research will yield many
associations that you wouldn’t be able to anticipate on your own. For example,
look at any product page on Amazon.com and note how many ways products are
cross-referenced (frequently bought together, customers who bought this item
also bought, etc.). Hashtags are another form of cross-referencing that you
should consider.

Use descriptive anchor text. For all internal links, avoid using irrelevant
anchor text such as “More” or “Click here.” Try to be as specific and
contextually relevant as possible, and include phrases when appropriate within
your link text.

Implement breadcrumb navigation. This is the best way to show users where they
are in the topical hierarchy, and an excellent way to properly use keywords in
the anchor text for each category page.

Refactor your architecture to minimize link depth. Pages that are reachable from
your home page in one or two clicks may be seen as more important to search
engines. From a human (and search crawler) perspective, the most important
things are always close at hand (your driver’s license, house key, and
smartphone, for instance), and the least important
things are archived or stored in unobtrusive places (receipts from nine years
ago, the owner’s manual for your car, out-of-season holiday decorations).
Therefore, the deeper a URL is in the site hierarchy, the less important it is
assumed to be from a search perspective. However, don’t go overboard and put a
link to every page in your 10,000-page site on the home page. This would be a
terrible user experience and perform poorly for SEO purposes. NOTE You may have
to make some compromises in good information architecture in order to reduce
link depth for SEO purposes. Some sites will need to be as flat as possible in
order to minimize the number of clicks from the home page.
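
If you want a rough picture of link depth before refactoring, a small breadth-first crawl can report how many clicks each discovered page is from the home page. The sketch below is a simplified example rather than a production crawler: it assumes a small site, ignores robots.txt and crawl politeness, and uses a placeholder start URL.

    from collections import deque
    from html.parser import HTMLParser
    from urllib.parse import urljoin, urlparse

    import requests

    class LinkParser(HTMLParser):
        """Collect href values from <a> tags on a page."""
        def __init__(self):
            super().__init__()
            self.links = []

        def handle_starttag(self, tag, attrs):
            if tag == "a":
                href = dict(attrs).get("href")
                if href:
                    self.links.append(href)

    def crawl_depths(home, max_pages=200):
        """Breadth-first crawl from the home page, recording click depth per URL."""
        domain = urlparse(home).netloc
        depths = {home: 0}
        queue = deque([home])
        while queue and len(depths) < max_pages:
            url = queue.popleft()
            try:
                resp = requests.get(url, timeout=15)
            except requests.RequestException:
                continue
            parser = LinkParser()
            parser.feed(resp.text)
            for href in parser.links:
                link = urljoin(url, href).split("#")[0]
                if urlparse(link).netloc == domain and link not in depths:
                    depths[link] = depths[url] + 1
                    queue.append(link)
        return depths

    pages = crawl_depths("https://www.example.com/")  # placeholder start URL
    for page, depth in sorted(pages.items(), key=lambda item: item[1]):
        print(depth, page)

Pages that consistently show up three or more clicks deep are good candidates for better internal linking or a flatter category structure.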

SEO Content Strategy

Due to the state of technology on the web, as well as the
continuing general lack of understanding of SEO across the industry, addressing
technical SEO issues can often be challenging. Updating your site so that your
content is easy for Google to crawl and process should always be your first
priority, until other major issues are addressed. However, this is only the
start of the SEO process. You must then take your SEO-ready site and build out
pages that offer the content that users might be looking for in your market
space. The first step in this process is performing keyword research, which
we’ll discuss in depth in Chapter 6. This is how you discover what types of
phrases users are entering in search engines. It’s important that you do this
research in great depth so that you can fully map out the variety of needs that
your users have, and then develop content to address those needs. This is where
an SEO strategy comes into play. You must map out the keyword space in detail,
understand what your competition is doing to succeed and where their gaps are,
and then decide what portions of the topic and keyword space you wish to
address.

The Long Tail of Search

Understanding just how much content you might need is
one of the trickier parts of developing an SEO strategy. To help with this,
consider the reality of the long tail of search, illustrated in Figure 5-1.

Figure 5-1. The long tail of search (source: Ahrefs)

As noted in Figure 5-1, long-tail keywords are those that have lower search
volumes. They also tend to be longer phrases, often four to seven words in
length. Ahrefs research shows that 95% of all keywords are long tail in nature,
and that they represent 35% of total search volume. In addition, because they
tend to be longer phrases addressing more specific questions, they may offer
higher conversion rates.

Examples of Sites That Create Long-Tail Content

There are many examples of sites
that have had great success with a long-tail content strategy. As Figure 5-2
shows, WebMD earns nearly as much traffic as the Mayo Clinic does with its
website.

Figure 5-2. Mayo Clinic (top) versus WebMD (bottom) organic traffic

Earning this level of traffic is a truly impressive accomplishment, given that
the Mayo Clinic has been one of the world’s premier medical institutions since
the late 1800s and WebMD was only founded in 1996. There are many reasons why
this has happened, but one of them is the great depth of the WebMD content.
Consider the query diabetes. At the time of writing, as shown in Figure 5-3,
WebMD has 397 different pages on the topic. Each of these pages addresses
different specific aspects of the disease, the symptoms, the outlook for people
with the disease, treatment options, and more.

Figure 5-3. Google SERP for “diabetes” on WebMD.com

For another example, look at Figure 5-4, which compares the website traffic
received by NerdWallet, launched in 2009, and Fidelity, which has been a leader
in providing financial services since 1946. As with WebMD, the depth and breadth
of NerdWallet’s content is a core driver of its success. Fidelity’s Learning
Center has hundreds of pages of content (at the time of writing, a
site:fidelity.com/learning-center query in Google showed that this section of
the site had 1,460 pages). In contrast, NerdWallet’s site is 100% dedicated to
offering informational content, and a comparable site: query suggested it had
over 24,000 pages.

Figure 5-4. Fidelity (top) versus NerdWallet (bottom) organic traffic

Examples like these lead us to the same conclusion, which is that sites with a
greater breadth and depth of content around a specific topic area win the most
traffic and market share.

Why Content Breadth and Depth Matter

Understanding the long tail of search is one of the keys to understanding why
Google likes sites that provide broader
and deeper content. As mentioned previously, research suggests that queries with
a very low monthly search volume account for 35% of all searches. Figure 5-5
presents data from Bloomreach that shows that over 75% of users’ time on
websites is spent in discovery mode, browsing and searching.

Figure 5-5. Bloomreach study of user browsing

In 2020 Google published a white paper titled “How People Decide What to Buy
Lies in the ‘Messy Middle’ of the Purchase Journey” that discusses this behavior
in more detail. As you can see in Figure 5-6, the main concept is that there may
be many different triggers that cause users to begin looking for something that
may lead to a purchase of a product. However, it’s rare that the journey from
the initial query to the purchase goes in a straight line. There are many
different considerations that users may wish to address before they make the
final decision of what to buy.

Figure 5-6. Google’s concept of the “messy middle”

As the Bloomreach data shown in Figure 5-5 illustrates, a large percentage of
the user’s time will be spent in discovery mode. This is because users
frequently have many related questions they want to address or selections they
wish to make. If your site doesn’t address these more specific questions, then
there is a strong chance that you will fail to meet the user’s needs and will
therefore fail to convert them. As a simple illustration of this, imagine buying
a car but not being allowed to select whether or not you get a sunroof, or a
reversing camera, or seat warmers. You would likely never buy a car without
having the opportunity to make those types of decisions, and the same goes for
many of the things you might consider buying online. The same logic applies to
informational content, too. Continuing our WebMD example from earlier in this
chapter, you can build a pyramid related to user needs for diabetes, as shown in
Figure 5-7.

Figure 5-7. A sample user needs pyramid

The chart only shows a partial view of all the needs around diabetes, but it
does help us understand why WebMD has 397 pages on the topic. The next key
insight is that creating content to cover all the various possible user
scenarios can require you to create a significant number of pages (as the WebMD
example illustrates). You’ll need to plan on doing this, or you’ll risk failing
to satisfy a large percentage of your visitors. Understanding all of this is
important to how you put together your SEO strategy. A successful strategy will
include the following elements:

• A detailed map of user search queries that includes the long tail • A thorough
understanding of users’ needs based on collecting and reviewing data from many
sources, such as: — Discussions with salespeople, marketing people, and/or
customer service team — Reviewing search query data from your site’s search
function — Interviews or surveys with customers and prospects • A clear
definition of the problem(s) that your product/service helps solve for users •
Matching these needs against your actual offerings and then creating the content
and pages that present this information to users, making it easy to navigate and
discover in an efficient manner Creating a site with the pages and content
required to meet all these needs is key to positioning your site as a leader in
its market space. It will help your site satisfy more users than other sites
that are less comprehensive, leading to better user engagement and conversion
rates, and make it more attractive to link to. In addition, Google likes to
surface these types of more comprehensive sites in the search results because
users are more likely to be satisfied with what they find there, and thus be
more satisfied with the search results that it provides to them.

Can Generative AI Solutions Help Create Content?

When faced with a large task,
it’s natural to want to look for tools to help you execute the process more
efficiently. Generative AI is such a tool, and it can play a role in making your
content creation efforts more efficient. However, this process needs to be 100%
driven by your true subject matter experts (SMEs). They need to be the
gatekeepers for all you do with your content, regardless of what tools,
processes, or writers you may be using to create it. That said, here are some
ways that generative AI solutions, such as ChatGPT, Bing Chat, or Google Bard,
can help: • Suggest article topics.

• Create draft article outlines.

• Review existing articles and identify content gaps.

• Group keywords by their semantic relevance.
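
As one illustration of that last point, the sketch below sends a keyword list to a generative AI API and asks it to group the terms by semantic relevance. It assumes the openai Python package and an OPENAI_API_KEY environment variable; the model name, prompt, and keywords are placeholders, and any output should still be reviewed by your SMEs.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Placeholder keywords: in practice, read these from your keyword spreadsheet.
    keywords = [
        "buy diamond engagement ring",
        "how are lab grown diamonds made",
        "emerald vs diamond durability",
        "tiffany engagement rings store near me",
    ]

    prompt = (
        "Group the following keywords into clusters of semantically related terms. "
        "Label each cluster and note its likely intent (transactional, "
        "informational, or navigational):\n\n" + "\n".join(keywords)
    )

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model name; use whatever your account offers
        messages=[{"role": "user", "content": prompt}],
    )
    print(response.choices[0].message.content)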

You could also potentially have these tools create drafts of content, but this
comes with a variety of risks, such as:

• Factual errors

• Bias

• Taking positions inconsistent with your brand

• Important omissions

Ultimately, whatever you decide to do with generative AI tools, ensure that your
SMEs own full responsibility for all of the content that you publish on your
site. One way to incentivize that is by putting their names on that content as
the authors.

Measuring Progress

Websites are complicated in their design and development, and
there are many hidden factors among the services that enable them. Search
engines are quick to crawl a site and discover new content, assuming the site is
search friendly with appropriate XML sitemaps submitted via Google Search
Console—so it’s imperative to create properly configured sitemaps and keep them
updated. You usually won’t know exactly how, when, or if a particular change
will impact search visibility and keyword rankings. There are various elements
to measuring SEO progress and demonstrating the value of investing in SEO:
appropriate planning, strategic and efficient implementation, documenting
changes, and tracking and measuring relevant data. You cannot afford to let the
organization take it on faith that you’re doing the right things, especially
when there often aren’t immediate results in the search. If something’s broken,
technical SEO improvements tend to have quicker results than on-page SEO
enhancements because they remediate fundamental problems with crawling and
indexing. Think of a site with technical SEO problems as a race car with two bad
spark plugs; replacing the failed components will of course lead to instant
performance gains, but in the larger context of a race, all you’ve done is
return to the baseline level of performance. Winning the race will take a lot
more time and effort than simple maintenance tasks. Much of your ability to
successfully measure the impact of your SEO efforts will depend on the analytics
platform data you have access to, so take charge of website analytics, and
leverage any other marketing/performance/SEO analytics platforms you can.
Decision-makers and internal influencers at a company can sabotage you with junk
metrics like “hits” or “mentions” or other things that aren’t relevant to the
company’s actual success criteria. You should be able to show, at any given
time, how what you’ve done has led to, or will lead to, progress toward the
company’s goals.
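
As a small illustration of the sitemap point above, the following Python sketch writes a basic sitemap.xml for a handful of URLs. The URL list is a placeholder; a real site would generate the file from its CMS or database and update it whenever content changes.

    import datetime
    from xml.etree import ElementTree as ET

    # Placeholder URLs: a real site would pull this list from its CMS or database.
    urls = [
        "https://www.example.com/",
        "https://www.example.com/products/",
        "https://www.example.com/blog/latest-post/",
    ]

    urlset = ET.Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
    today = datetime.date.today().isoformat()
    for loc in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = loc
        ET.SubElement(url_el, "lastmod").text = today

    ET.ElementTree(urlset).write("sitemap.xml", encoding="utf-8", xml_declaration=True)
    print("Wrote sitemap.xml with", len(urls), "URLs")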

Conclusion

The perfect plan can only be created in retrospect, after the job is
done and every problem is solved. It is therefore imperative to learn as much as
you can at the outset about your organization’s or client’s website, target
market, business objectives, and organizational structure. From the moment you
begin, document the relevant portions of your work, and set up the appropriate
analytics so you can track and measure your progress.

CHAPTER SIX

Keyword Research

Keyword research is the process of finding the words and
phrases that connect your customers or clients to your business. This is the
most important aspect of search engine marketing, but it also has a great deal
of business value beyond the web. Keyword research enables you to predict shifts
in demand; respond to changing market conditions; and provide new products,
services, and content that web searchers are actively seeking. In the history of
marketing, there has never been a more efficient and effective method of
understanding the motivations of consumers in virtually every niche.

The Words and Phrases That Define Your Business

In this chapter, we’ll walk you
through the entire keyword research process, beginning with building your domain
expertise and analyzing the language associated with your company and its
products. Next, we’ll give you some tips for developing a list of topics, and
how to use it to develop an initial keyword list. We’ll show you how to set up a
spreadsheet to track and calculate your keyword plan, and how to use data from
an SEO platform to calculate the best opportunities. Lastly, we’ll provide some
guidance on how and when to update your keyword plan. By the end of this
chapter, you will have a spreadsheet with search terms and phrases that are of
value to your business, categorized by topic and ordered by their level of
opportunity. This is the basis for every subsequent chapter in this book.
Without going through the keyword research process, you cannot develop
meaningful strategies and plans for updating your website to be more search
friendly, or effectively create new content to attract search traffic, or
successfully tune your analytics to look for future opportunities. Once you have
your keyword plan, on the other hand, you’ll be able to develop your website’s
editorial calendar, optimize your site’s title tags, identify internal linking
opportunities, develop your link-building campaigns, and much more.

Don’t rush through this. Keyword research is not a fast process, especially if
you have a large retail site with a lot of different products. Plan to spend at
least a few days on the initial effort. NOTE To do a proper valuation of your
keyword list, you must have a reliable and current data source for (at the very
least) monthly search traffic and cost per click (CPC), though ideally you’ll
have keyword difficulty and current rank data from an SEO platform or data
provider as well. If you haven’t made a decision about that yet, then you can
either use some of the free or low-cost tools we introduced in Chapter 4, or
delay your keyword research effort until you’re ready to sign up for an SEO
platform. However, don’t wait too long on this, as keyword research is a
foundational component of all your SEO plans.

The Different Phases of Keyword Research

The keyword research process can vary
quite a bit depending on where you are in your SEO project and the individual
needs of the business. Therefore, we’re going to present the whole process that
you’d follow if you were working on a completely new site that isn’t currently
ranking for anything. Even if this is not your situation, you should read this
chapter in sequence anyway, just in case there’s a knowledge gap. The first
keyword research effort is typically concurrent with an initial site audit that
establishes a starting point for the big-picture SEO project, but every project
and organization is different. Ultimately it depends on the information you are
required to deliver as part of your first estimate. This can include a basic
assessment of the keywords that a site currently ranks for, but it should go
beyond that to include a list of keywords that would likely benefit the company.
If you are expected to produce a list of title or meta tag recommendations as
part of your audit, then you’ll have to put a lot of effort into your keyword
plan. After you complete your initial optimization work, you will be able to
more accurately assess the cost of optimization for your keyword list. This is a
good time to drill down into long-tail keywords to look for the most efficient
opportunities. Depending on the size and complexity of the SEO project, this
phase of keyword research can be the most labor-intensive. Once you’ve built and
refined your keyword plan, you’ll schedule regular reviews to update it with new
data. This can be done once a month or once a quarter; you should never go more
than a year without a full keyword review. In most cases, seasonal keyword
research should be tracked and scheduled separately from your regular keyword
reviews. Nearly every site can benefit from seasonal keyword research, even if
there isn’t an obvious seasonality to the company.

Expanding Your Domain Expertise

Before you go further, ask yourself how well you
know the company and its industry. You cannot successfully conduct keyword
research for a business that you aren’t familiar with. You don’t have to be the
world’s foremost expert on it, but you should have a reasonable understanding of
the fundamental technologies behind the products and services it provides, its
history, its mission, the size of the market, who the main competitors are, the
impact of various seasons and holidays, and basic customer demographics. If
there are existing customer avatars or personas, ask to see them. If you’re an
outside consultant, this is a critically important part of the process. If
you’re an employee of this company, then take the time to learn something new
about it. Talk to the most recent hires and the longest-tenured employees about
their experiences, read the documentation, or do a Google News search and see
what’s been written about it.

Building Your Topics List

It’s a good idea to build a topics list before you go
further with keyword research. Start by identifying the entities and concepts
relevant to the company, its products or services, and the industry as a whole.
If possible, begin at the lowest level, then go up one step at a time. What are
the topics that apply to every search query that you want to rank highly for?
For instance, if the company sells smartphone cases and screen protectors, then
the lowest level might be smartphone accessories, or, if that’s too broad (say,
if you don’t sell earbuds, chargers, or cables), perhaps you need to think of
this as two topic areas (smartphone cases and screen protectors). How you create
your topics list needs to be thought out to best fit the logical structure of
your market. Think of these root-level topics as domains. Next, ask yourself
what the related topics are. Why do people buy smartphone accessories? What are
their concerns? What purposes do these products serve? What are the
alternatives? What are the most important features or qualities? As a solution
provider, you’re used to solution-side thinking; you’re at the end of the chain.
Your customers started their journey long before this, though, and you want to
be visible to them as early in the process as possible. In some instances, this
may be before they even know they have a need or a problem. Let’s say your
company sells cases and screen protectors for Apple and Samsung phones. The
problems that lead people to buy a case or a screen protector are pretty
obvious, but some of the alternative perspectives aren’t. Your customer may be
someone who’s dropped an expensive mobile device and their search query is for a
repair option, but consciously or subconsciously they’re also asking themselves
how they can
prevent this from happening again. That’s a great time to show up in their
search results. A contextually similar but topically separate concern is water
damage. If someone is searching for how to tell if their phone has been water
damaged, or how to properly dry out a phone that fell into a toilet, that’s also
a great time for your waterproof phone cases to show up in their search results.
They aren’t looking for a waterproof smartphone case right now, and in fact they
may not even know that such things exist until they see your site come up in
search results for What kind of rice should you use to dry out a soaked iPhone?
Insurance is an alternative path that someone might explore after suffering a
smartphone loss, but it’s expensive. You might position your products as a
cheaper preventative option to a monthly insurance fee. You could potentially
get a lot of sales from ranking highly in the results for a query like Is
smartphone insurance worth it? even though it’s only tangentially related to the
products you sell. So, your related topics might be:

• smartphone screen damage

• iPhone compatibility

• smartphone protection

• Samsung compatibility

• smartphone insurance

• stylish case

• waterproof case

These aren’t keywords, they’re classifications for keywords (though there may
be some overlap between the two). Since keywords will eventually provide more
context, you can simplify and generalize your topics as follows, by assuming
that they are subsets of your domain (which is smartphone cases in this
example):

• screen damage

• water damage

• protection

• iPhone

• insurance

• Samsung

• waterproof

• style

Among these, are there any that would apply to a disproportionately large number
of other topics (what in mathematical terms would be described as topics with a
high cardinality)? In our example, the two that stand out are Samsung and
iPhone, because (assuming you only sell cases for these two brands) one and only
one of them will always apply to every other topic. When you discover these
superset topics, make note of them and keep them separate from the main list.
When you start building out your keyword spreadsheet, you’ll create columns for
each of them so that you can do fine-grained sorting and filtering.

With the realization that Samsung and iPhone together span 100% of your topics
list but have no overlap, it makes sense to go one level lower to their common
domain: device brand. Even if you have a few products that are brand agnostic
(such as a screen protector that is compatible with some iPhone and Samsung
models), the taxonomy still requires one and only one phone brand per keyword
because people will only search for one or the other.

NOTE
This example exposes
some interesting problems with keywords. First, the internal company product
taxonomy is different from the keyword list taxonomy because the latter must be
governed by search intent. Second, if you have products that are compatible with
multiple brands and models, you may have to have a different product page for
each model, even though it’s the same product from your perspective. If your
product page lists 20 different models that this item is compatible with, then
it’s not optimized for searches on any of those models. For instance, a
smartphone screen cleaning kit may be compatible with all mobile devices, but if
you want it to rank highly for iphone 14 screen cleaner, you’ll have to create a
product page that is optimized only for that keyword (or perhaps just for iphone
screen cleaner, but we’ll get to that level of detail later in this chapter).

Now take another look at the topic list. Are there any other high-cardinality
topics that you might want to drill down into and sort by? In our example, the
only item that jumps out is style. Everything else is a Boolean; a case is
either waterproof or it isn’t, and screen damage, water damage, insurance, and
protection refer to searches peripheral to your products. Style, however, has
several important subcategories: color, materials, thickness, hardness, special
features. Among those, you could drill down even further. If you think that
you’ll need to sort your keyword list by any of those subtopics, then mark this
as a superset. If you end up being wrong, no big deal—it’s easy to add or remove
spreadsheet columns later. When the remaining topics have a similar cardinality,
or if it doesn’t make sense to break them down any further right now, then
whatever’s left is your general list of topics. This is as far as you need to
drill down in this example. There’s more work to do later, but at this point the
foundation is solid.

TIP
If you’re having trouble coming up with topics, refer
to “Researching Trends, Topics, and Seasonality” on page 161.

Preparing Your Keyword Plan Spreadsheet

If your keyword list is not yet in a
spreadsheet, then now is the time to migrate to one. Start with a new file.
Label your first worksheet Raw Keywords, and put every keyword
you’ve come up with into column A. This is only an initial, unrefined list of
potential search terms. A quick reminder: keyword is a general term for a group
of related words that will be used in a search query. For example, here are four
separate but similar keywords of varying lengths:

• pink waterproof iphone case

• pink iphone case

• waterproof iphone case

• iphone case

Many of your keywords will be as similar as those are. Even though they’re
almost the same, and may lead to largely the same set of results, they all have
different search data associated with them and will provide different levels of
value and opportunity. In fact, you could simply change the word order of each
and come up with four more keywords, all with different search data. For now, go
with whatever makes sense to you. When in doubt, choose the keyword that most
resembles a natural language question.

NOTE
This is not the final list; you only
need it to “prime the pump,” so to speak, so don’t exhaust yourself trying to
think of every possible search query. When you eventually load these raw
keywords into your preferred SEO platform(s), you’ll be able to see the traffic
levels, rankings, and difficulties for them and their many variations, so
there’s no sense in trying to go further than this right now.

Next, create a new worksheet tab called Keyword Plan. This will be your main
worksheet containing your refined list of keywords and their metadata. For now,
all you need to do is set up the structure. At the top of this worksheet,
create a table with a header row and one empty data row, and populate the
header row with the following titles:

• Keyword

• CPC

• Monthly search volume

• Superset

• Priority

• Topic

• Relevance

• Persona

• Difficulty

• URL

• Rank

The Keyword column will contain the refined list of worthwhile keywords that
you’ll eventually import from an SEO platform or search data provider.

NOTE
Don’t put your raw keywords here; put them in the Raw Keywords worksheet.

Monthly search volume is exactly what it sounds like: the average search volume
for the previous 12 months. Many SEO platforms can also provide indications of
seasonality, or seasonal monthly changes in search volume. Relevance refers to a
subjective assessment of how relevant this keyword is to your current page
content. This will be covered in more detail in “Keyword Valuation” on page 162.
Priority is either a flag or a rating that identifies topics and keywords that
are of major importance to your company, usually in alignment with your
company’s major business objectives, marketing initiatives, and/or sales
targets. This is also covered in more detail in the “Keyword Valuation” section.
Keyword difficulty (sometimes called keyword competition) refers to the relative
amount of competition for this keyword. If a lot of sites are fighting over the
same keyword, then it has a high degree of difficulty. It will take more work to
place higher in the SERP for that keyword, and ads will cost significantly more.
For instance, in our example, the broad keyword iphone case will almost
certainly have more competition (and therefore a higher keyword difficulty
rating) than the more specific pink waterproof iphone case. There are a few
different ways to represent keyword difficulty, depending on which data provider
you use (that’s covered later in this chapter), but the easiest standard to
conform to is a scale from 1 to 100. Rank refers to the SERP position that a
page on your site (which is specified in the URL column) currently has for this
keyword. These columns are not strictly required right now, but you’ll need them
later so that you can see your progress. Ranking data comes from SEO platforms
that analyze your site. Other keyword research tools may only provide
nonsite-specific data for keyword popularity, difficulty, and CPC, which is all
you need to develop an initial keyword plan. If your pages are mostly unranked
(or ranked very low), or if you are launching a new site, then these columns
will be empty anyway. Most data providers only include the top 100 pages for
each keyword, so this column will usually be a range from 1 to 100. CPC is the
amount of money you’d pay for a click-through on an ad that targets this
keyword. If you don’t have access to good data for keyword difficulty, you can
generally use CPC as a substitute, though ideally you’ll have both. The Superset
column is a placeholder for a high-cardinality topic. In the previous example we
defined two supersets: device brand and style. That means you’d rename this
column Device brand and create a second Style column. Add a column for each
superset that you defined. Topic refers to the refined list of topics that you
created earlier. This sheet is for keywords, though, so in this context a topic
is a dimension of a keyword—a category to which it belongs. That means you have
to list your topics somewhere else. Create
a third worksheet tab labeled Topics and Supersets. In it, convert column A into
a table with the name topics_list. Change the column title from Column1 to All
Topics. Then go back to your Keyword Plan worksheet, select the Topic column,
use the Data Validation feature to allow only a list, and use this formula as a
source:

=INDIRECT("topics_list[All Topics]")

This will avoid potential filtering problems by strictly enforcing the correct
nomenclature in the Topic column. This makes filtering easier, and it also makes
it possible to pivot the data table to aggregate all the keywords for each
topic, which you may find useful later. If you want to add new topics in the
future, you must add them to the All Topics column in the Topics and Supersets
worksheet. Repeat this process for each of your supersets (go to your Topics and
Supersets worksheet, add a new column to the topics_list table for each of your
supersets, retitle the columns to match each superset, populate each column with
the items in those lists, and enable Data Validation for the corresponding
columns in the Keyword Plan worksheet).

NOTE
If the data validation becomes burdensome at any point, disable it.
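For instance, if you named one of the superset columns in the topics_list table
Device brand (as in the earlier example), the Data Validation source formula for
the corresponding Keyword Plan column would follow the same pattern. This is
just a sketch; adjust the table and column names to whatever you actually used:

=INDIRECT("topics_list[Device brand]")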

Persona refers to the customer avatars or personas you created or acquired from
someone else at the company back in Chapter 5. You may find value in repeating
the data validation process for the Persona column (and creating another
worksheet tab for Personas), since that could be considered an abstraction of
the customers domain. This is probably a column you would want to filter, sort,
and pivot by. If you chose not to create customer avatars, then you can remove
this column. The URL column will contain the URL of the page that this keyword
is currently ranking for. If a page is unranked, this will be blank. This column
isn’t required for developing an initial keyword plan, but it’s useful later.
For example, it can be used to determine which pages have the most striking
distance opportunity based on the quality and search volume of the keywords for
which they rank in the top 20 positions. Depending on the nature of your site
and your industry, you may want to consider adding a column that identifies a
keyword as being either Branded or Nonbranded. This is just a flag for branded
keywords; you can mark it with an X or a 1 or whatever you prefer. No search
data will be imported into these cells. The reason we do this is because you
should not need to focus on optimizing your site for keywords that include your
brand name unless your business is very small or very new.
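If flagging branded keywords by hand is tedious, one optional shortcut is a
helper formula that marks any keyword containing your brand name. This is only a
sketch: it assumes your keywords are in column A, and the brand string "acme" is
a placeholder for your own brand. Spot-check the results, since SEARCH is
case-insensitive and will also match your brand name inside longer words:

=IF(ISNUMBER(SEARCH("acme", A2)), "X", "")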

You now have the basic spreadsheet framework to build out your keyword plan. You
may want to make a copy of this file to use as a template for future projects.
To the extent that it is relevant to your project, try to create the spreadsheet
exactly as described, even if you think that you won’t use some of this data.
You won’t truly know what you will and won’t need in your keyword plan until the
project is complete, and every project has different needs.

Internal Resources for Keyword Research

In Chapter 4 we covered a variety of
third-party tools that can help you identify topics, keywords, and questions.
Eventually this is where your keyword data will come from, but it works best
when you provide a comprehensive list of raw keywords as a “seed.” In addition
to your own brainstorming, there are many potential keyword resources in-house.
It may be useful to make note of your sources. If so, go ahead and create a
Source column in your Raw Keywords worksheet to identify where or whom the idea
for each keyword came from. It isn’t critical that you keep track of this, but
it could be helpful during future keyword reviews, or for other marketing
purposes beyond SEO. Gathering this kind of intelligence is what a traditional
marketer might have done prior to initiating a marketing campaign before the web
existed. And of course, if any of this data is available to you from other
departments of the company, be sure to incorporate it.

Web Logs, Search Console Reports, and Analytics

Collect all internal web traffic
data you possibly can. If you can get a CSV export of useful data, do it, and
add it as a new tab in your keyword spreadsheet. The most valuable web visitor
data from Google comes not from Google Analytics but from Google Search Console,
in the form of clicks and impressions data from search queries. That’s because
Google does not pass keyword data through to your website, and as a result a web
analytics package generally can’t list the search queries or keywords from
incoming search traffic.

TIP
There are tools that can get keyword-based data.
Keyword Hero, for example, was specifically designed to provide data about
on-site user behavior per keyword in Google Analytics. It shows you how users
respond to each landing page, per search query, as well as to your website at
large for each keyword. So, you’ll get everything from behavioral metrics to
performance metrics such as conversions and revenue per keyword.

Additionally, page-level information about site traffic is potentially useful,
including visitor counts (daily, monthly, and seasonal), locations, and
platforms (browser and operating system); bounce rates; inbound link (or
“referrer”) URLs; direct link URLs to assets such as images or videos; and 404
“page not found” errors.

Competitive Analysis

Your competitors face the same challenges with keyword
research and content optimization, and unless you are very lucky, they are
probably also resourceful and creative. Even if they haven’t invested in SEO, at
the very least you can expect that they’ve put a lot of effort into learning
about their customers and the best ways to appeal to them. Review your
competitors’ websites, and try to determine the keywords and phrases they’re
targeting for the products and services that compete with yours. Look for unique
variations and synonyms they use in their content. Do these unique terms
indicate shifting trends in the vernacular of your industry? Are they obviously
optimizing for certain keywords? What nonbrand terms do they use for their
business? Have they written any articles or blog posts? What does the media say
about them?

People

Every employee could have valuable insight into the thoughts and actions
of your customers. You’ll find a lot of value in talking to them, not just to
get ideas for keywords, but also to reveal gaps in the concepts and terminology
used by your organization and your customers. Some basic questions you might ask
are:

• What are the words and phrases that define our business and its products or
services?

• What words and phrases do customers use when they talk about our products or
services?

• What are the questions that prospects and customers ask us?

• What are the questions that people ask before they connect with us?

• Do you see gaps in how we talk about our products or services and how
customers see their needs?

The following sections describe the different people you might want to consult.

You

Before you meet with anyone, consider your topics, then generate an initial
list of terms and phrases that you think are relevant to your industry and what
your site or business offers. Include all of your various past and present brand
names, products,
and services. If your site has a massive number of products, consider stepping
back a level (or two) and listing the lower-level categories and subcategories.
Aim to come up with at least a hundred keywords and phrases that could
potentially be used in a search query by relevant, qualified customers or
visitors. Ideally you’ll come up with a list (or a series of lists) that looks a
lot like Bubba’s lengthy enumeration of the many ways that “the fruit of the
sea” can be prepared in the movie Forrest Gump:

• barbecue shrimp

• shrimp gumbo

• boiled shrimp

• pan-fried shrimp

• broiled shrimp

• deep-fried shrimp

• baked shrimp

• shrimp stir-fry

• sauteed shrimp

• pineapple shrimp

• shrimp kabobs

• lemon shrimp

• shrimp Creole

• shrimp and potatoes

(This assumes that shrimp is the common entity.) For now, stick to broad two- or
three-word phrases like these. You can drill down into one keyword and expand it
with relevant peripheral words and disambiguations if you really want to, but
generally it’s best to do that later, when you have access to search data. You
wouldn’t want to spend an hour thinking of 50 more keywords based on lemon
shrimp if that isn’t what your business wants to target, or if it ends up being
a low-volume, low-opportunity topic.

Employees across the company

If the business is small enough that it’s
logistically possible to call an all-hands meeting to brainstorm for keywords,
then this could be a good next step. You’re not going to get everything you need
from an all-hands meeting, of course; this is just a starting point to get
everyone thinking about keywords and questions. Ask people to email you more
suggestions as they think of them, and schedule some breakout sessions for
employees or departments who are particularly motivated to help with keyword
research. You can also arrange to send out an email survey to everyone at the
company. This is a fairly low-effort option, but you likely won’t get as much
participation as you would from a face-to-face meeting. Beyond your current
keyword research mission, this process also reveals the departments and
individuals who care most about search traffic and website performance.
You might consider forming a working group or committee for those who want to
participate in future keyword reviews.

Marketers

Technically speaking, SEO is a form of marketing, so the people in the
marketing department should have a great deal of insight for you, especially in
terms of traditional marketing data. You should already be working with your
company’s marketers by this point, unless you’re an outside SEO consultant—in
which case we advise you to engage with the entire marketing department, not
just your point of contact. Someone might have done some of this research
already and produced a keyword list that you can work with. Ask if there are any
upcoming product launches or marketing campaigns that you can align with your
SEO efforts. Be wary of letting marketing-speak and insider jargon slip into
your keyword list. Marketers try to create a certain impression in customers’
minds by using specific terms, but the language that customers actually use may
be quite different.

Salespeople

No one talks to customers more than the people in sales. They’re
familiar with the exact language that customers use, the problems people are
trying to solve by buying a product, the most common questions people ask before
buying, and related concerns such as price, reliability, warranty, returns, and
support. Specifics are important in keyword research, which is why it’s a good
idea to talk to sales after you’ve talked to marketing. For instance, the people
in marketing may have said that affordable is a good keyword, but the
salespeople may say that customers most commonly say or respond to cheapest or
least expensive instead. It’s too early to make judgments, though, so for now
include all three. You’ll identify the highest-value keywords later, and filter
out the duds. It’s also useful to ask the salespeople for feedback on the
company’s marketing efforts. Which parts of the marketing funnel are delivering
qualified leads? Which aren’t? Marketing is supposed to enable the sales team,
but there can be some disconnection—and sometimes even some resentment—between
them. Whereas marketing, user experience design, and product management roles
may have formal personas or avatars that represent ideal customers, people in
sales may have their own informal labels. For instance, a frequent, big-spending
customer might be referred to as a “whale,” and a past customer may be labeled
as an “upgrader” or an existing one as a “renewal.” Each of these classes may
use entirely different terminology. If you inherited customer avatars from the
marketing department, for example, it would be a good idea to modify them to
align with the sales perspective.

Engineering/product (a.k.a. IT)

The system administrator and/or webmaster will
know if there is any internal data that might be useful to you. In addition to
the web logs and analytics services that we’ve already mentioned, the IT
department will probably also have access to records from your site’s built-in
search function. The on-site search queries of existing visitors and customers
are a keyword gold mine. If your site doesn’t have on-site search, consider
implementing Google’s Programmable Search Engine.

Support/customer service

The support department is more of a resource for
content creation ideas than keyword research, but it’s still useful to ask what
words and phrases customers use and what questions they ask when talking about
the product. Customers who contact customer service may fit into specific
categories. If so, revise your avatars to account for this perspective. Support
personnel are a window into the problems and known issues with your products. As
much as the marketing leadership would like to pretend these don’t exist, in
reality your customers are probably searching for ways to deal with them. Your
job is to direct those searches to good content, so you need to know what they
are. As painful as it may be, you should (perhaps secretly) append the word
sucks to your branded keywords. If people are searching for your product name
and sucks, then you should be aware of it. You might also include other negative
terms, like scam, unreliable, and rip-off, or whatever might apply to what your
business sells. It’s also a good idea to add support to your branded keywords.
If your customers need product support, you want your support page to be
found—and you certainly don’t want your competitors’ sales pages to rank above
it.

Founders and owners

They started this company for a reason, right? What’s their
story? What problem did they set out to solve? What’s the next step for the
company? What do they want its public image to be? Answers to questions like
these can inform your keyword plan.

Customers

If you have access to existing customers, it’s useful to ask them how
they found your site. Some organizations include this question as part of the
closing or checkout process. You can also set up a survey on your site, or send
one as an email follow-up after the sale. Some people like to offer discounts
for customers who complete surveys; be aware that while this will get you more
survey responses, it’ll also add a lot of junk and bias to the data, which could
lead you astray.

Not everyone finds your site through a search engine, so customer feedback is
not always directly useful for discovering keywords. Indirectly, however, the
process that a customer followed up to the sale can reveal valuable keywords
that you might not have expected. For instance, before buying a new video card
from your PC retail site, a customer may have read a review on Tom’s Hardware,
then asked questions about it in the forum. The names of websites and
publications peripheral to your business may make good keywords, and discussion
forums are an excellent source of relevant customer questions.

NOTE
Try to talk to representatives from each of your customer avatars.

Noncustomers

Sometimes it helps to get a fresh, unbiased perspective. So far
everyone you’ve talked to is an insider. What would someone outside of this
industry search for if they wanted to buy a product you sell? At some point,
most or all of your customers were new to your industry. What was their search
journey like? You can also get a broader perspective by looking up industry
associations and media sites that pertain to your business. What language are
they using when they talk about this subject?

External Resources for Keyword Research

By this point you should have a
substantial list of raw keywords. The next step is to use third-party keyword
research tools to find similar search terms. We covered the most important ones
in Chapter 4, but there are a few niche keyword research tools that you should
consider in addition to one or more full-service SEO platforms or data
providers, especially if your company or client relies heavily on mobile search
traffic and/or seasonal trends.

Researching Natural Language Questions

Natural language searches are important
for all search methods, but they’re critical for mobile search marketing.
Compared to desktop users, people searching from a mobile device are far more
likely to speak their queries than type them. In these cases, there is a higher
tendency to use natural language questions or prepositions—statements that imply
questions—instead of search-friendly keywords. As shown in Figure 6-1, by 2019,
over a quarter of mobile users worldwide were using voice search (note, however,
that this doesn’t indicate that users are doing all of their searches by voice;
this statistic includes people using voice search at least once per month).
Nonetheless, use of voice to conduct search queries, improvements in
ranking algorithms, people’s desire to ask their questions of a search engine
naturally, and other factors are contributing to a rise in natural language
queries. Data released by Google in 2022 shows that it has seen a 60% increase
in natural language queries since 2015.

Figure 6-1. Percent of mobile users using voice search

If you can, try to have at least one corresponding natural language question for
every keyword in your list. Ideally you’d have a series of related questions
that refine the search scope; Google calls this process a journey, and it saves
the context of each user’s journey so that it can provide pre-refined results if
they come back to it in the future. The idea here is that searchers aren’t
seeking a single, objective answer; they’re subjectively evaluating many
possible answers by following their curiosity and exploring a topic over a
period of time. An efficient SEO strategy includes engaging with those searchers
as early in their search journey as possible. For instance, consider the
previous example of keywords for smartphone accessories. Thinking about the
domain and the topics within it, we came up with two good natural language
questions that could lead to an accessory sale: • What kind of rice should you
use to dry out a soaked iPhone?

• Is smartphone insurance worth it?

Let’s work with the first one. Is this really the first question in someone’s
search journey? If we go back a step or two, some better starting points might
be:

• Is the iPhone waterproof?

• What do I do with a wet iPhone?

• Will an iPhone work if you drop it in a pool?

• How do I dry out an iPhone?

The next level might be:

• Does AppleCare cover water damage?

• What does it cost to repair iPhone water damage?

• What kind of rice should you use to dry a soaked iPhone?

• Can I trade in a waterlogged iPhone?

Remember: the goal here is to sell a waterproof iPhone case. Some of these
questions are reactive, and some are proactive. Either way, the people who ask
these questions are probably interested in a waterproof iPhone case, even if
they aren’t directly searching for it yet. They may not even know it exists, in
which case your content will shape their first impression. Another great example
is baby furniture and clothing. If that’s what you’re selling, the search
journey might start with questions like:

• What baby names are most popular?

• How much maternity leave will I need?

• How much does a full-term pregnancy cost?

At some point later in this search journey, the person who asked these questions
will need a crib, crib accessories, a night light, a rocking chair, and baby
clothes—things your site sells. They aren’t searching for them right now; they
will be, but they haven’t gotten around to thinking about these items yet, so
this is your chance to get in front of this future customer ahead of your
competitors. Another path to consider is a parent who is having a second (or
more) child. They might ask:

• Can I reuse my old crib for my new baby?

• Are used baby clothes safe?

• Is my car seat still legal?

If you’re stumped, or if you want as many questions in your list as possible,
consider the third-party tools described in the following subsections.

TIP
SEO platforms like Rank Ranger, Ahrefs, and Moz Pro are also excellent sources
for natural language questions.

AlsoAsked

This site analyzes your natural language questions and returns a mind
map of all related questions in the search journey. You can drill down into each
related question if the topic is too broad. This service is free to use on a
limited exploratory basis, but if you want to export to CSV (which you’ll need
to do in order to import the list into your spreadsheet), you’ll have to upgrade
to a paid account. The AlsoAsked data comes from Google’s People Also Ask SERP
feature.

AnswerThePublic

This site analyzes a proposed topic and returns a mind map of
the most popular natural language questions and prepositions within it. This
service is free to use on a limited exploratory basis, but the paid version
includes historical reporting, “listening alerts” (notifications of changes for
the topics you’re tracking), CSV export (required for importing data to your
spreadsheet), and extensive training materials. AnswerThePublic gets its data
from Google’s autocomplete feature. A similar tool that is completely free is
AnswerSocrates.com.

Researching Trends, Topics, and Seasonality

A trend is a pattern of increasing
and/or decreasing activity that is supported by historical data. Seasonality is
a trend that recurs based on well-defined and predictable conditions. If you
have enough data, then both of these concepts are easily established with
ordinary keyword research and analytical tools. (There’s more detail on this
topic in “Trending and Seasonality” on page 169.) Hopefully your company’s
marketing department has already done plenty of research to define past trends
and seasonality, and it’s available to you to use. If not, or if you’re
launching an entirely new business, then this is a bit out of scope for an SEO
project; seasonal market research and planning is a whole-company effort.
Emerging trends are much more difficult to plan for because current events cause
keywords to fluctuate on a daily basis. Google Trends can help you recognize
short-term spikes in keyword demand, so if you believe you’re seeing an emerging
trend, you can check it there. Google processes over 5 billion queries per day,
approximately 15% of which are new. It takes at least a day for search data to
trickle into most keyword research sites, so there is a lot of variance and
volatility with relatively new keywords, and regardless of how much traffic data
you collect, at best you’re getting an approximation of yesterday’s search
volume for any given keyword. Trends don’t start or end on search engines,
though, so if you really want to know what’s trending today, you’ll have to rely
on other data sources. In addition to Google Trends (which was covered in
Chapter 4), the following subsections provide a few suggestions.

X (formerly Twitter)

X describes itself as “what’s happening and what people are
talking about right now.” The sidebar tracks the current most popular trending
topics, accounts, and hashtags, and the Explore feature has several sorting
options for researching emerging trends. Hashtags don’t usually make good
keywords, but they can inspire some ideas for good keywords and topics.

BuzzSumo

BuzzSumo allows the marketer to “see how much interest brands and
content generate in the wild.” The suite includes tools to research keywords,
track brand mentions, and identify influencers, but it’s best known for its
ability to spot trends and viral content as they take off. It provides a window
into what content is trending right now, not just overall, but also by topic or
keyword. Armed with this information, a content marketer is well-positioned to
“newsjack” on trending topics before their competitors.

Soovle

Soovle is a web-based keyword research tool that aggregates results from
other search engines. When you enter a topic or keyword, it shows you the most
popular related search queries on several sites, most notably YouTube, Amazon,
Wikipedia, and Answers.com. People often search these sites because they want to
learn more about something they’ve heard about recently, and that’s not
necessarily what they’d go to Google to look for, so Soovle’s results page may
be more likely to show evidence of emerging trends before they show up in
Google’s keyword data (though Soovle does show Google and Bing results as well).
Even if you’re not trying to capitalize on an emerging trend, Soovle is still an
overall excellent source of inspiration for keywords and topics.

Keyword Valuation

Up to this point, your keyword research has mostly focused on
building your raw keywords and topics lists. The initial brainstorming phase is
now complete. You’ve identified your topics and collected a lot of potentially
relevant keywords and questions, but you don’t yet know which ones represent
actionable and profitable SEO opportunities for your site. While they may be
relevant, some may not be attainable (the cost of optimizing for them may be too
high), and some may not have enough search volume (the benefit of optimizing for
them may be too low). The hierarchy of importance for keyword valuation is:

1. Priority. Keywords that serve a major business objective, sales goal,
branding initiative, or other critical marketing purpose should be considered
above everything else.

2. Relevance. You only want to rank for keywords that are highly relevant to
your site’s content. Low-relevance keywords are not necessarily bad, but they
should be moved to a separate list in case you need them later.

3. Popularity. You only want to optimize for keywords that are used in a
measurable number of actual searches. High-relevance but low-popularity
keywords should be filtered out (hidden in your spreadsheet) but not deleted,
because they may become popular in the future.

4. Difficulty. If the cost of acquiring traffic is higher than the benefit of
converting it, then you’re wasting money. High-difficulty keywords that are
highly relevant to your site can be broken down into variations that represent
more efficient opportunities.

To obtain these metrics, you’ll combine search data from SEO platforms with
your own subjective ratings for topics and keywords. If you have tens of
thousands of keywords, then this is going to take a lot of time and
patience—but that’s part of the job; it’s what you’re being paid to do. The
following subsections offer more detail on each of these points.
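There is no single formula for combining these four factors; the right weighting
depends on your business. Purely as an illustration (not a prescribed method),
once your data is imported you could add a helper column that divides monthly
search volume by keyword difficulty to get a rough cost/benefit ratio, then sort
or filter by Priority and Relevance before comparing those ratios. The column
references here are examples only; this sketch assumes Monthly search volume is
in column C and Difficulty is in column I, so adjust them to your own layout:

=C2 / MAX(I2, 1)

The MAX simply guards against a zero difficulty score causing a divide-by-zero
error.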

Importing Keyword Data

Before you go any further, you must populate your Keyword
Plan worksheet with useful search data. Good data isn’t free—or if it is, then
it has limited utility—so don’t skimp on this part. Bad data will lead to bad
decisions that will waste time and money, possibly get your site penalized, and
maybe get you fired. In SEO, it’s better to have no data than bad data, and it’s
better to do nothing than to do the wrong thing. We provided some suggestions
for search data providers in “SEO Platforms” on page 110. You only need one good
source, but if you have the budget for it, you may benefit from combining data
from several different providers. Regardless of which service(s) you use, you
must at least be able to upload your list of raw keywords to them and export
recent search data about those keywords to a CSV file with columns that include:
• Monthly search volume

• Current rank (if there is one)

• Keyword difficulty (a.k.a. “keyword competition”)

• URL (of the page corresponding to the current rank)

• CPC (nice to have, especially if a keyword difficulty score isn’t provided)

If there are other metrics that you have good data for, add columns for them.

NOTE
We’ll just refer to your export file as a CSV (comma separated values) file
since that’s supported by all of the keyword research tools that we use, but
many of the tools also support the Microsoft Excel format, and other spreadsheet
file formats that will work just as well for raw search data. Export to the
format that is most useful to you. For example, Excel files support many useful
functions such as filtering, multiple sheets, pivot tables, and more.

Once you have a CSV file with the exported search data, open your Keyword Plan
spreadsheet. Create a new worksheet tab and rename it to Data Import (or
something that reflects where you got it from, such as Semrush Import), then
import the CSV data into it (using a spreadsheet function, or plain old copy and
paste) and modify the headings and the order of the columns to match what’s in
your Keyword Plan worksheet. Alternatively, you may find it easier to reorder
the headings in your Keyword Plan spreadsheet to match the data export columns.
TIP
If you’re a spreadsheet expert, you can write a formula that pulls data into
each Keyword Plan column from the appropriate equivalent in the Data Import
worksheet. Assuming column A is for keywords and column B is for monthly search
volume, then this formula would copy the search volume data from the Data Import
worksheet to the same column in the Keyword Plan worksheet:

=VLOOKUP(A2, 'Data Import'!$A$1:$B$49995, 2, FALSE)

Repeat this process for all other data columns.
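One small refinement if you take this approach: any keyword in your Keyword Plan
that has no match in the Data Import worksheet will display an #N/A error.
Wrapping the lookup in IFERROR leaves those cells blank instead (same column
assumptions as in the tip above):

=IFERROR(VLOOKUP(A2, 'Data Import'!$A$1:$B$49995, 2, FALSE), "")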

Evaluating Relevance

At a glance, your entire keyword list might seem perfectly
relevant to the topics you defined, but keep in mind that relevance is relative.
A keyword might be highly relevant to your site’s current content but have low
relevance to your business model (or vice versa). Seasonal keywords fluctuate in
relevance throughout the year, and some branded keywords will become less
relevant when products are retired or replaced with newer models. Low-relevance
keywords aren’t necessarily lost opportunities; they’re just not at peak value
right now. When searchers click on your site and find the content to be
valuable, they’re likely to remember your brand, bookmark the page so they can
return to it later, and potentially link to it when suggesting it to a friend.
Low-relevance keywords, therefore, can present good opportunities to strengthen
the branding of your site. This type of brand value can lead to return visits by
those users when they are more likely to convert. Seasonal keywords also may be
qualified as low-relevance outside of the season, but they could be highly
relevant when laying the groundwork for the next season. To identify currently
relevant, high-quality keywords, ask yourself the following questions:

What is the search intent?

As discussed in Chapter 1, there are three major kinds of query intentions:

• Transactional: someone is actively seeking to initiate a conversion (buy
something, sign up for something, etc.).

• Navigational: someone is looking for a specific brand, product, or service.

• Informational: someone is seeking general information on a topic, or
researching in preparation for making an upcoming (but not immediate) buying
decision.

How closely is this keyword related to the content, services, products, or
information currently on your site?

If you’re selling smartphone cases, you may
end up with some keywords that seemed like they might be relevant, but actually
aren’t. For instance, iphone 14 data plans doesn’t apply to any of your product
pages. The search intent doesn’t align with your current content. Perhaps you
intended to write an article on this topic (which might be a great idea), but
never got around to it. Under the right conditions this could be a relevant
keyword, but presently it is not.

If searchers use this keyword in a query and click through to your site from a
SERP, what is the likelihood that they will convert?

Conversion rate is directly
related to how accurately your content matches searcher intent (though there are
other factors as well). So, if you do get around to writing that comparison
review of various iPhone 14 data plans (with an extra section that explains how
some iPhone 14 cases might affect the signal in positive and negative ways),
you’re reasonably well aligned with the searcher’s intent. Traffic for that
keyword may or may not convert—you’ll have to keep a close eye on it. This does
not mean that you shouldn’t have or create a page to address the keyword.
Ranking for relevant informational keywords is a great way to catch users at the
beginning of their customer journey and build loyalty so that you can grab that
conversion later.

How many people who search for this term will come to your site and leave
dissatisfied?

If your title or snippet promises a comprehensive comparison
review of iPhone 14 data plans, you have to deliver that or else the visitor
will quickly leave. Don’t take the position that more eyes on your site will
necessarily lead to more conversions (either now or the future). You want to
attract qualified leads. “Getting more people in through the door means more
sales” is a brick-and-mortar retail store theory that relies on customers having
to invest time and energy into physically traveling to your store, which doesn’t
translate well to the web because it takes no effort to close a tab or click the
back button. Web analytics can tell you if you’re delivering what you promise. A
high bounce rate on a page indicates that your content did not meet visitors’
expectations, though there could also be a usability or technical issue at
fault. Be aware, though, that informational content tends to have a higher
bounce rate, and that’s OK if it’s relevant and people are getting value from
it.

Some keyword research tools offer their own proprietary “relevance” or
“relevancy” score for search terms. This is not quite the same “relevance” that
we’re talking about in this section. Third-party relevance scores are a measure
of how many times a keyword appears in the overall search volume of its main
parent topic or keyword (if you’re researching similar terms). This can be
useful for keyword research within a specific topic, and for finding related
keywords, but it isn’t a good metric to use for keyword valuation. Instead of
relying on a third-party score, think about your customer avatars and their
search intents, your conversion goals, and the alignment between your keywords
and your page content, then develop your own relevance scoring system. We
suggest a scale from 1 to 3, with 1 being a highly relevant transactional query
(likely to convert), 2 being a reasonably relevant navigational query (may
convert), and 3 being a broad informational query (not likely to convert right
now, but maybe later) or a seasonal keyword that is currently in the off-season.
Start at the top of your Keyword Plan worksheet and add a score to the Relevance
column for all of your keywords.
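Once the Relevance column is filled in, a quick sanity check is to count how
many keywords landed in each bucket; if nearly everything scored a 1, your
ratings are probably too generous. The cell references here are illustrative
only: this assumes the relevance scores sit in column G of a list running
through row 10000, so adjust the range to your own layout:

=COUNTIF($G$2:$G$10000, 1)
=COUNTIF($G$2:$G$10000, 2)
=COUNTIF($G$2:$G$10000, 3)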

Priority Ratings for Business Objectives

Similar to relevance scoring, you may
also benefit from a subjective priority rating scheme to override or bias the
standard cost/benefit ratio. There are a lot of different ways to represent
priority, depending on the details of your business and what you want to sort
and filter by. Are there certain keywords you want to rank highly for, no matter
how difficult or costly it is? All other things being equal, are there certain
topics or keywords that are a top priority for your business? If two keywords
from two different topics have the same cost and benefit, and you can only focus
on one, which would it be? The best scoring method for this metric is one that
closely matches your company’s processes and culture. Many companies define an
official hierarchy of major business objectives (MBOs) on a quarterly or yearly
basis. Marketing departments have a list of priorities as well, and sales
departments often have monthly and quarterly targets for certain product groups
or services. If nothing else, then you should at least have an idea of the
relative benefit of each conversion type, or an estimate of how much each is
worth. (If you only have one value for all conversions, then you probably don’t
need to assign priority ratings, and can skip to the next section.) Using those
sources, develop a basic rating system for your topics and keywords that
reflects your company’s current priorities. There are several ways to do this.
We suggest one of these:

• Use a simple Boolean flag (such as a different cell color, or the letter X)
to indicate equally high-priority items; all unflagged items are equally low
priority.

• Use a numeric scale, from 1 to however many MBOs there are.

• Approach it from a product management perspective—create a list of MBOs
(using short names) and assign one or more of them to each keyword or topic.

In your Keyword Plan worksheet, start at the top and assign scores or flags in
the Priority column for each keyword (or just the topics, if that suits you
better).

Filtering Out Low-Traffic Keywords

Low-traffic keywords are generally not worth
the effort of optimization, but as with so many other things in SEO, it depends
on the context and the specifics of your situation. There are reasonable
scenarios where currently low-popularity keywords are worth targeting. The first
and most obvious one is to get ahead of an emerging trend, season, or future
event. A lot of big events, like the Olympics and the Super Bowl, are planned
many years or months in advance. Long before anyone knows who will be competing
in these events, you already know what they are, and when and where they will be
held. Likewise, the best time to start optimizing for the next back-to-school
season is shortly after the previous one ends, when the competition for traffic
has died down. (There’s more on this subject in “Trending and Seasonality” on
page 169.) You might also want to lay the groundwork for an upcoming product
launch or marketing campaign. This can be tricky, though, if some of your
keywords contain embargoed information. As explained in detail in the previous
chapter, a long-tail keyword strategy involves targeting a high number of
keywords that individually have relatively low traffic, but collectively offer a
lot of opportunity. In this case, try to figure out what the worthwhile traffic
threshold should be. This is subjective, but there’s definitely a point at which
the numbers are so low that the majority of the measurable search traffic is
probably just noise from automated queries that haven’t been filtered out of the
dataset. The irony of SEO data providers like Semrush and Ahrefs (and their
upstream partners) is that their keyword data collection scripts are responsible
for a certain amount of the search traffic that they’re reporting. Even if they
try to filter out their own queries, they can’t always filter out their
competitors’. Every time you test a query or examine a SERP, you’re adding to
those numbers as well. So in most cases, any keyword with only a handful of
monthly searches for a big market like the US should be a candidate for
exclusion unless you have a good reason for keeping it. If you’re unsure, and if
the conversion potential seems high, then test it—the cost of ranking for a
low-traffic keyword is intrinsically low.

Don’t delete low-traffic keywords; filter them out instead by using your
spreadsheet’s filter function on the Monthly search volume column in your
Keyword Plan spreadsheet. If they’re relevant, then they may become more
valuable in the future.
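If you prefer an explicit flag over an ad hoc filter, a helper column can mark
candidates for exclusion so they’re easy to hide and unhide later. The threshold
of 50 monthly searches below is a placeholder, not a recommendation; pick one
based on your market and the noise considerations discussed above. This sketch
assumes monthly search volume is in column C:

=IF(C2 < 50, "low traffic", "")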

Breaking Down High-Difficulty Keywords

Keyword difficulty can be calculated in a
variety of ways, which is why each SEO platform has its own special scoring
system for it. Typically it’s a function of one or more of the following data
points, weighted according to a top-secret proprietary formula: search volume,
number of results, backlink count, and paid ad CPC. On their own, each of these
is a weak indicator of keyword competition, but collectively they become more
reliable, so you’ll want to incorporate as many of them as possible when
calculating your keyword difficulty score. You can do this on your own in your
Keyword Plan spreadsheet, or you can rely solely on the difficulty or
competition score from an SEO platform.

NOTE
Your data must be normalized
(meaning all metrics are converted to the same scale) before you can use it to
calculate a difficulty score. You can do this in Excel via the STANDARDIZE,
AVERAGE, and STDEV.P functions.
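As a sketch of what that normalization might look like (one approach among many,
with illustrative cell references), suppose your imported CPC values sit in
column D and backlink counts in column E, rows 2 through 10000. Each metric can
be converted to a z-score with STANDARDIZE, and if those two formulas live in
helper columns F and G, a third column can average them into a rough blended
difficulty signal:

=STANDARDIZE(D2, AVERAGE($D$2:$D$10000), STDEV.P($D$2:$D$10000))
=STANDARDIZE(E2, AVERAGE($E$2:$E$10000), STDEV.P($E$2:$E$10000))
=AVERAGE(F2:G2)

You could then rescale that average if you want it to sit alongside a platform’s
1-to-100 difficulty score.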

High-difficulty keywords don’t usually represent good opportunities. They’re
high volume, but can be low relevance. If they’re too broad, they will bring in
a lot of useless traffic with a high bounce rate and few conversions. If they
are specific enough to generate qualified leads, then the cost of acquiring them
may be too high for the return on conversions. Regardless, if you don’t already
have a salvageable SERP placement for a popular keyword (or sometimes even if
you do), then it’s going to be a costly battle to get into the top 10 (which is
where 95% of SERP clicks are). It’s initially much more efficient to break it
down into more specific long-tail terms that are cheaper and easier to target,
and hopefully represent a higher return in aggregate. If you have the brand,
budget, or content to compete for those high-competition keywords and you
believe they offer a lot of potential, then by all means pursue them, but be
prepared to invest a lot of time and effort to succeed. Going back to our shrimp
example, let’s say that shrimp stir-fry is one of the most popular search terms
in your list, with several indicators of very high difficulty. You can reduce
the difficulty by breaking it out into related keywords, phrases, and questions
that are more closely associated with specific products or content on your site
(and are therefore more relevant as well), such as:

• shrimp stir-fry recipe

• shrimp stir-fry recipe “low sodium”

• how to cook shrimp stir-fry

• gluten-free shrimp stir-fry

• how much fat is in shrimp stir-fry?

• best oil for shrimp stir-fry

• what seasoning do I use for shrimp stir-fry?

• wok for shrimp stir-fry

TIP The best way to do this is through one of the SEO platforms we’ve
recommended. You could do it on your own, but if you have access to good data,
why aren’t you using it?

Next, check for potentially valuable disambiguations and alternate spellings and
phrasings. Hyphenation, verb tenses, and singular/plural variations of
high-difficulty keywords can provide additional opportunities to go after. For
example:

• shrimp stirfry

• shrip stir-fry

• stirfried shrimp

• shrimp stir-fy

• stir-fried shrimp

NOTE We are not encouraging you to intentionally insert misspellings into your
page content. That will affect your credibility with readers and will likely get
picked up by one of Google’s content quality algorithms.

A lot of English words are associated with unique colloquialisms or alternative
spellings across different dialects, such as plow and plough, check and cheque,
donut and doughnut, and hiccup and hiccough. In this case, prawn can be another
word for shrimp; you might try substituting prawn and prawns for shrimp in those
keywords. Don’t be afraid to consult a thesaurus! When you’ve broken down a
high-difficulty keyword into more efficient and affordable variations, be sure
to evaluate them for relevance before adding them to your spreadsheet. Don’t
delete the original keyword—you’ll want to keep track of that in the future, and
it provides an excellent reference point for calculating opportunities for other
keywords.
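
If you want to generate candidate variations mechanically before checking them
against real keyword data, a few lines of Python can enumerate modifier,
hyphenation, and synonym combinations. The term lists below are purely
illustrative; every candidate still needs to be validated for volume and
relevance before it earns a row in your spreadsheet.

head_term = "shrimp stir-fry"
prefixes = ["how to cook", "best oil for", "gluten-free"]  # illustrative modifiers only
suffixes = ["recipe", "ingredients", "calories"]
synonyms = {"shrimp": "prawn"}

# Build prefix and suffix variations of the head term.
variations = {f"{p} {head_term}" for p in prefixes} | {f"{head_term} {s}" for s in suffixes}
variations.add(head_term.replace("-", ""))  # hyphenation variant: "shrimp stirfry"

# Add synonym substitutions (e.g., prawn for shrimp) for every variation collected so far.
variations |= {v.replace(word, alt) for v in set(variations) for word, alt in synonyms.items()}

for keyword in sorted(variations):
    print(keyword)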

Trending and Seasonality

The keyword plan that you’ve been building has been
focused on present-day opportunities with current data. If you happen to be
doing keyword research on Black Friday or during a major industry convention
where a lot of product releases and keynote speeches are generating buzz, then
your data will probably be affected. That’s
why monthly keyword reviews are important (that’s covered in more depth later in
this chapter), and it’s also why you should create separate keyword plans for
seasonal search marketing. At the very least you should create a new column for
seasonality, or a new worksheet that enables better sorting and filtering for
multiple seasons without cluttering up your main Keyword Plan spreadsheet. Many
keyword research tools offer a fine-grained, long-term view of seasonal variance
in topics and keywords, but the best overall big-picture tool for analyzing
trends and seasonality is Google Trends. That’s going to give you access to the
largest amount of seasonal search data, and you can easily compare several
related keywords to see if there are better opportunities in and out of season.
Even a tiny difference like changing a singular word to plural can yield
different seasonal variance. Literally any keyword can be seasonally affected,
and not always during the times of year that you may expect. It’s likely that
there are regular and predictable trends that you aren’t aware of because they
don’t align with a traditional event or holiday, but that could still be
valuable to your site. For instance, sales of memory cards for digital cameras
could spike slightly in March due to a combination of unrelated overlapping
factors such as college basketball playoffs, Vancouver Fashion Week, and a
yearly increase in cruise ship voyages. If your site sells memory cards, you
might not even be aware of the impact of these trends because your competitors
are optimized for them, or because you’ve always seen a steady increase in sales
from March to August, and you assumed it was due to the usual peak wedding and
vacation seasons (which are probably much more expensive to target). It’s
worthwhile to challenge your assumptions and investigate any potential trends
that you find. Try to break them down to account for every major contributing
factor. Sometimes there are so many trends and seasons that impact your business
that it makes sense to step back and focus on the big picture first. For
instance, a costume shop is busy year-round supplying local and traveling
actors, theater production companies, makeup artists, models, musicians, clowns,
and a wide variety of private parties and special events. Halloween might be a
busier time of year than normal, but it might also be prohibitively expensive to
optimize for the Halloween season due to broader competition and low conversion
rate. If you specialize in expensive high-quality costumes and professional
stage makeup, then it might not be worth the effort to compete against
low-margin, high-volume retail juggernauts like Walmart and Amazon for sales of
cheap Halloween costumes. School semesters might also be “seasons” because of
school plays, ballets, and operas. The common theme here is that most of the
costume shop’s business is local, so even though the national Halloween season
might seem like an obvious choice for seasonal optimization, it’s likely that
focusing on smaller local trends and non-Halloween customers will be a more
efficient and profitable effort.
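
If you want to pull that kind of seasonal comparison into your own analysis, one
common route is pytrends, an unofficial, community-maintained Python client for
Google Trends. It is not a Google API, so its interface can change without
notice; treat the following as a sketch rather than a recipe.

from pytrends.request import TrendReq  # pip install pytrends (unofficial Google Trends client)

pytrends = TrendReq(hl="en-US", tz=0)
pytrends.build_payload(["shrimp stir-fry", "prawn stir-fry"], timeframe="today 5-y", geo="US")

# Weekly relative interest (0-100 per keyword); resample to monthly averages to spot seasons.
interest = pytrends.interest_over_time().drop(columns=["isPartial"], errors="ignore")
monthly = interest.resample("M").mean()
print(monthly.idxmax())  # the month in which each keyword peaked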

Seasonal optimization is a year-round process, and thus so is seasonal keyword
research. If you want to target the Halloween season (from August to October),
you can’t expect to start the process in July—or if you do, don’t expect to get
very far with it this year. Search traffic tends to start going up two or three
months in advance of major holidays and events, but Halloween costumes and
Christmas gifts both draw heavily from pop culture fads, trends, movies, TV
shows, and news events from the previous year, so there’s no such thing as “too
early” to start collecting keywords for the next holiday season.

Current Rank Data

To get where you want to go, first you have to know where you
are. The SEO platforms we’ve recommended in Chapter 4 are capable of analyzing
your site and providing the current SERP rankings for every page, as well as the
exact URL that’s being indexed at that rank. If multiple URLs are ranked for a
given keyword, then use the highest-ranked one for now. (You’ll decide what to
do with the lower-ranked pages in future chapters.) To calculate an opportunity
score (which is covered in the next section), you’ll need to have a number in
every cell in the Current rank column. If there are highly relevant
high-priority keywords that you want to place for but currently don’t, then fill
in the blank with 101. A SERP rank of 101 is more or less equivalent to not
ranking at all, and it falls just outside the “top 100 ranked pages” threshold that
most data providers use as a standard report, so if you improve your rankings
for these keywords in the future, you’ll be able to see your progress. It’s
possible to continue the keyword valuation process without current rank data
(and if you’re launching a new site, you’ll have to), but that makes the cost
(and therefore, opportunity) calculations less accurate.
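
In code, filling that placeholder (and keeping only the highest-ranked URL per
keyword from a rank-tracking export) might look like the following sketch; the
filenames and column names are assumptions, not a specific tool’s format.

import pandas as pd

plan = pd.read_csv("keyword_plan.csv")  # hypothetical Keyword Plan export
ranks = pd.read_csv("rank_export.csv")  # hypothetical rank-tracking export

# If several URLs rank for the same keyword, keep the highest-ranked (lowest-numbered) one.
best = ranks.sort_values("Rank").drop_duplicates(subset="Keyword", keep="first")

# Copy the ranks into the plan; unranked keywords get the 101 placeholder described above.
plan["Current rank"] = plan["Keyword"].map(best.set_index("Keyword")["Rank"]).fillna(101)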

Finding the Best Opportunities

At this point you have enough data to identify
good opportunities for optimization. Use your spreadsheet’s filtering, sorting,
and pivot table functions to narrow the scope of your keyword plan so that it
only shows the keywords that you want to work with. Particularly with pivot
tables, it’s best to create a new worksheet named Opportunities for this
purpose. That’s all you need to do for now. If you want to go further, there are
some extra valuation considerations in the following subsections.

Calculating an opportunity score

It’s possible to create a new column to calculate an opportunity score.
Unfortunately, it’s difficult to provide specific guidance on how to calculate
this because we don’t
know exactly what data you have, what your goals are, or which factors are most
important to you, so we don’t know exactly what a “good opportunity” means to
you. Throughout this chapter we’ve used mostly retail examples because they’re
common and easy to use for that purpose. However, many organizations aren’t
doing retail at all. Some are focused on branding, while others may be
collecting qualified leads for sales calls, acquiring email list subscribers, or
making money by selling advertising. If you have a good idea of how much an
average conversion is worth to you, though, then you can calculate a generic
opportunity score by performing the traditional cost/benefit analysis:

opportunity score = (# of potential conversions × conversion value) / cost of optimization

If this is an initial or early-stage keyword plan, then you won’t
yet have a good method for calculating the cost of optimization. Some people use
CPC as a cost metric here, and that’s better than nothing, but it won’t be very
accurate. While organic search costs and paid search costs will scale similarly
with keyword difficulty, you cannot reliably calculate one based on the other.
However, you do know what your current rank for a keyword is (or you used 101 as
a placeholder value for unranked pages) and how difficult it will be to compete
for a higher rank, so here’s how you might reframe “cost” and “benefit” in terms
of effort and traffic:

opportunity score = ((relevance × priority) × search volume) / (difficulty × current rank)

NOTE If you don’t have any difficulty scores (and you don’t want to calculate
them), you can substitute any of the difficulty factors except for search volume
(CPC, backlink count, total number of results), though this will reduce the
confidence interval.

This formula won’t work as written because the data isn’t normalized (the
metrics are not on the same scale), and the various factors are not weighted
according to importance. You can normalize the data with a spreadsheet formula
(don’t overwrite your Keyword Plan columns, though; use the Opportunities
worksheet instead). Weighting of the metrics is entirely subjective. We suggest
experimenting with different weighting to align with your budget and
implementation timeline.

TIP If you feel overwhelmed right now, ask an accountant or business analyst for
help.
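
As one concrete (and entirely adjustable) interpretation of that formula, the
sketch below min-max scales each factor into roughly a 0–1 range before
combining them. Min-max scaling is just one normalization choice, the column
names are assumptions, and any weighting you apply on top remains subjective.

import pandas as pd

df = pd.read_csv("keyword_plan.csv")  # hypothetical export with the columns used below

def minmax(series, eps=0.01):
    """Scale a column to roughly 0-1, adding a small epsilon so we never divide by zero."""
    spread = series.max() - series.min()
    if spread == 0:
        return pd.Series(eps, index=series.index)
    return (series - series.min()) / spread + eps

relevance = minmax(df["Relevance"])
priority = minmax(df["Priority"])
volume = minmax(df["Monthly search volume"])
difficulty = minmax(df["Difficulty"])
rank = minmax(df["Current rank"])  # includes the 101 placeholder for unranked keywords

# opportunity score = ((relevance × priority) × search volume) / (difficulty × current rank)
df["Opportunity"] = (relevance * priority * volume) / (difficulty * rank)
df.sort_values("Opportunity", ascending=False).to_csv("opportunities.csv", index=False)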

SERP space considerations

By default, Google has provided 10 organic search
results per page (though users can modify their search settings to display
longer SERPs), but that’s not a rule or a guarantee. Organic search results can
occupy a varying percentage of SERP real estate due to encroachment by Google
Ads and SERP special features, such as:

Knowledge panels
This is a variable-sized area on the right side of the SERP
that pulls relevant content from the Google Knowledge Graph. Often an
informational query is satisfied by a knowledge panel, which means there’s no
click-through to a result.

OneBox results
These are trusted answers to search queries. They’re shown in a
box above the organic results and are usually short text excerpts. Some examples
(among many) of queries that generate OneBox results are word definitions, unit
conversions, package tracking, mathematical equations, health issues, and hotel
searches.

Featured snippets
These are similar to OneBox results, except the information
comes from a highly ranked and trusted site instead of from the Knowledge Graph.
A featured snippet is a much larger text excerpt than a regular snippet, and it
appears above the URL instead of below it. You’re more likely to see featured
snippets for natural language queries and mobile searches.

The map pack
This is a small excerpt of an image from Google Maps that can
appear in the middle of the organic results, along with the top three Google
Business Profile listings. A map pack will appear for local (or
location-specific) queries such as sushi restaurants in Orlando.

The sitelinks search box
This is a search field that sometimes appears below a
snippet. If Google determines that a broad query is likely to lead to a second
query on a site that has its own integrated search engine, then Google will try
to preempt that second search by providing a search field with a site-limited
scope. This often happens when someone uses a URL or a well-known brand or site
name as a search query. For instance, if a user searches Google for pinterest,
Google will provide a sitelinks search box under the top result for
pinterest.com.

Rich results
In lieu of a snippet, some search results can display an image
thumbnail or review star rating. This is typically for results where the rich
element is an important part of the content, such as pages that contain reviews
of books or movies, or interviews with celebrities.

The carousel
If there are multiple pages on a site that are similar and contain
rich results, Google may choose to display them in a horizontal carousel.

Enriched results
When search results for job postings, recipes, or event listings lead to pages
that contain interactive elements, Google may add some of that functionality to
a rich result.

The more ads and SERP features there are
for a keyword, the shorter the list of organic results on the first page, and
the less opportunity you have for a click-through to your site. You don’t want
to spend time and money improving your rank on a SERP that has a low organic
CTR. Special features and ads can reduce the number of page 1 organic results
and siphon off much of the traffic from the remaining results. Note that you do
also have the potential for optimizing your content to show up in the search
features, and if that is a fit for your organization to pursue then this could
still potentially be worthwhile. Special features also occasionally create
bizarre scenarios in which you will get less traffic by ranking higher. If
Google is excerpting your content with a special feature, searchers may not need
to click through to your page unless they need additional information. By
deoptimizing that page so that it ranks slightly lower, you can force Google to
excerpt someone else’s content, which allows your page to return to the normal
SERP list (and hopefully stay on page 1). However, it may not be wise to
deoptimize in these cases, depending on how the changes in traffic impact your
actual conversions. In many cases that lowered level of search traffic delivers
much more highly qualified traffic, so make your decisions here with care. This
encroachment can apply to any keyword in your list, but it’s especially
impactful on natural language questions and local keywords. Keep in mind,
though, that it’s only a major issue for pages in the top 10 results. You can
easily check SERPs for special features by querying Google with your keywords,
but most of the good keyword research tools have scoring systems that measure
organic CTR and/or SERP feature encroachment.

Rank threshold values

When calculating the costs and benefits of optimization,
it helps to adopt a broad hierarchical view of search rankings. The actual rank
numbers don’t matter very much when they’re within certain ranges. Ranks lower
(meaning a larger number) than 20 are only meaningful from the perspective of
monitoring the impact of your SEO efforts and your potential to rank on those
terms over time. Above all else, rankings aren’t worth spending time on until
the fundamentals are solid. If your site recently launched, or if it has
technical or UI problems that prevent a
lot of your pages from being indexed, then you don’t have any data to work with
yet, and the first goal is to get everything into the index at any rank.
Likewise, if you’re repairing a site that is being penalized for spam or black
hat SEO tactics, then the first goal is to clean it up and get back into the
index. Keyword research is still important, but don’t worry about rankings yet.
From unranked, the next threshold is the top 100. From a conversion standpoint
the low end of this threshold is meaningless, because the ninth page of search
results is the SERP equivalent of Siberia; it may be on the map, but no one goes
there except by accident or adventure. The tenth page of results is the doorway
to the twilight zone; results beyond that are not truly ranked, and Google will
only provide an estimate as to how many there are unless you navigate to the
last results page (whatever it may be). Regardless of traffic or conversion
rate, from an analytics standpoint it’s worth the effort to get into the top 100
because most SEO platforms only provide data for the top 100 sites for each
keyword, so you’ll have access to useful metrics and can start tracking your
progress. To get minimal value out of a keyword, you have to place within the
top 20 results. Only 5% of organic clicks go to results beyond the first page,
so if that’s where you are, there’s a faint hope of a click-through, but unless
the search volume is very high it may not be measurable or predictable. So, if
you’re in the low twenties for ranking you still have a lot of work to do, but
if you’ve steadily improved your rankings to get there, then you can start to
get an idea of what kind of effort it will take to break into the top 10.
Depending on the keyword’s difficulty level, you might already be in striking
distance of the first page. Your SEO tools will help you to dial all that in,
with the help of their cost predictions, backlink suggestions, content ideas,
and comprehensive competitive analyses. About 95% of organic SERP clicks are on
the first page, and a large percentage of those go to the top result. The
further you are from #1, the fewer clicks you get, and remember: SERP features
can kick you off the first page even if you’re in the top 10, so if you want a
high-priority keyword to be on the first page, you need to know what the
threshold is for it. It would be disappointing to spend a lot of resources to
get to #10, only to find out that a rich result kicks you back to page 2. Take
these thresholds into account when you’re considering keyword valuation and
planning. If you’re currently placing #90 for a high-difficulty, high-volume,
high-priority, high-relevance keyword, it probably isn’t worth investing that
much to improve your ranking until you start to approach #20.
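
If you track keywords programmatically, it can help to bucket raw ranks into
these broad tiers rather than treating every single position change as
meaningful. A small helper like the following, with thresholds taken from the
discussion above, is enough.

def rank_tier(rank: int) -> str:
    """Bucket a SERP rank into the broad thresholds discussed above."""
    if rank <= 10:
        return "page 1"
    if rank <= 20:
        return "striking distance"
    if rank <= 100:
        return "trackable (top 100)"
    return "effectively unranked"  # includes the 101 placeholder

print(rank_tier(90))  # "trackable (top 100)" -- not yet worth heavy investment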

Filtering by topic

Instead of looking for the best opportunities among the
entire keyword list, you might consider filtering them by topic instead,
especially for a long-tail strategy where opportunity is measured in aggregate.
If the list is only a few thousand rows or less, you could sort your Keyword
Plan list by each individual topic to get a quick impression of which topics
represent the best opportunities, but it would be more efficient (and becomes a
necessity when working with larger lists) to create a new worksheet that
aggregates the number of keywords in each topic or superset, and their overall
search volume.
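
A sketch of that aggregation with pandas, with column names assumed from the
Keyword Plan spreadsheet:

import pandas as pd

df = pd.read_csv("keyword_plan.csv")  # hypothetical export; column names are assumptions

by_topic = (df.groupby("Topic")
              .agg(keywords=("Keyword", "count"),
                   total_volume=("Monthly search volume", "sum"))
              .sort_values("total_volume", ascending=False))

by_topic.to_csv("topic_opportunities.csv")  # one row per topic: keyword count and total volume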

Acting on Your Keyword Plan

Now that you have your keyword plan, the next step
is to leverage it to drive traffic growth for your site. Some of the key
activities include:

Improving pages in striking distance
These are the places where you can find quick wins. Sometimes simple
optimizations to title tags and/or content are enough to move you significantly
up in the rankings for one or more keywords. These are the kinds of fast results
that your (or your client’s) executive team will love.

NOTE We consider a page to be within striking distance when it currently ranks
in the top 20 search results for a query, but not in position 1. Optimizing
pages that are in striking distance is often the fastest way to increase traffic
(and revenue) from your SEO efforts.

Optimizing title tags
Follow up on your keyword research and identify pages with
suboptimal title tags, then take the time to improve them.

Optimizing meta descriptions
These do not impact search rankings, but as they
are often shown in the search results as the descriptions for your pages they
can increase CTR. For that reason, you should use keywords to improve the
relevance of your meta descriptions to users. Focus on providing search engine
users with an understanding of the value proposition of your site.

Improving site contextual interlinking
As mentioned in “Why Content Breadth and
Depth Matter” on page 139, users (both B2B and B2C) spend more than 75% of their
time on websites either browsing or searching. Global site navigation and search
boxes are a part of how
to help users find what they’re looking for, but contextual interlinking also
plays a big role. This is why in-context links are so important. Your keyword
research data will help you figure out the key topic areas and the phrases where
you need to do this.

Building out your editorial calendar
Identify the keywords that your current site content has enough depth to compete
on. Then build out a map of new pages to create to fill the gaps. Prioritize
this plan by aligning it with the key strategic objectives of your organization
(e.g., what lines of business are the key area of focus?).

These are a few of the areas where leveraging your keyword research is critical.
Don’t let that highly valuable data just sit in some soon-to-be-forgotten folder
somewhere on your PC!

Periodic Keyword Reviews

After you’ve developed a good keyword plan, you should
schedule regular reviews to update the data and adjust your calculations. We
suggest monthly reviews, since that’s the interval that most search data
providers use for updating keyword information. You should also do a keyword
review if there are any significant content additions to your site, changes in
business policies or practices, or shifts in trends or standards. If the
marketing plan changes, or if the CEO declares that they’re going to “bet the
company” on this new product release, or when old products are retired, or when
a competitor announces a new product, do a keyword review. Even a news story
that barely relates to your products or content can impact the search volume of
your keywords, such as the announcement of a new version of Microsoft Windows,
political unrest in Taiwan, or a shipping container shortage. Anything that
affects what people search for will affect keywords. Sometimes a change can
sneak up on you. Computer monitors, for instance, have switched connection
standards a few times over the years, from VGA to DVI, then to HDMI, and more
recently to DisplayPort. During each transition, people steadily search for
converters that enable the outgoing standard to work with the incoming one.
Let’s say your site sells cables for electronics, and you have a page optimized
for hdmi to displayport adapter that has been performing well ever since
DisplayPort was introduced to the market. Your previous keyword plans have
repeatedly identified this as a measurably better opportunity than displayport
to hdmi adapter, but there will be a certain point in the transition between
these standards when a fresh keyword review will show that displayport to hdmi
adapter offers a much better opportunity because all new devices now have a
DisplayPort connector, but everyone still has a bunch of old
HDMI cables that they’d like to reuse. The sooner you know about that inflection
point, the quicker you can respond to it with new optimization. Resist the urge
to say: “But I tested that, and I found the best query to optimize this page
for.” Optimization is an ongoing process, not an event. A simple change in the
word order of a query can have a significant impact on traffic at any time, even
if the context is the same. Though there may be a lot of overlap in the results
between hdmi to displayport adapter and displayport to hdmi adapter, and even if
they refer to the exact same product, they still lead to two distinct SERPs.
Even if your page is ranked at the same position in both SERPs, special features
such as People Also Ask boxes will favor the result that most closely matches
the exact word order of the query, regardless of its rank (within reason). SEO
platforms like the ones we’ve recommended in Chapter 4 will usually help you
keep track of your progress over time, but you might find some value in
analyzing the numbers on your own. We suggest creating a new Keyword Plan
spreadsheet (by copying the old one, or starting from scratch) every time you do
a keyword review. That way you can go back to previous plans and get a
page-specific view of how a particular keyword’s performance has changed.

Conclusion

Keyword research is a complex and time-consuming aspect of search
engine optimization, but the rewards are high. Once you learn where the keyword
search volume is, you can begin to think about how that affects the information
architecture and navigational structure of your site—two critical elements that
we will explore in greater detail in Chapter 7. You’ll also understand where
your content gaps are, and how to alter your metadata to improve search traffic
and increase conversions. Everything from title tags to page copy, internal
links, content calendars, and link-building campaigns will rely on a
well-researched keyword plan.

CHAPTER SEVEN

Developing an SEO-Friendly Website

In this chapter, we will examine ways to
assess the search engine–friendliness of your website. A search engine–friendly
website, at the most basic level, is one that allows for search engine access to
site content—and having your site content accessible to search engines is the
first step toward prominent visibility in search results. Once your site’s
content is accessed by a search engine, it can then be considered for relevant
positioning within search results pages. Search engine crawlers are basically
software programs, and like all software programs, they have certain strengths
and weaknesses. Publishers must adapt their websites to make the job of these
programs easier—in essence, leverage their strengths and make their weaknesses
irrelevant. If you can do this, you will have taken a major step toward success
with SEO. Developing an SEO-friendly site architecture requires a significant
amount of thought, planning, and communication due to the large number of
factors that influence how a search engine sees your site and the myriad ways in
which a website can be put together, as there are hundreds (if not thousands) of
tools that web developers can use to build a website—many of which were not
initially designed with SEO or search engine crawlers in mind. NOTE This chapter
is a long one, as it covers all the technical aspects of SEO. You can use it as
a guide to answer any questions you may have about this topic, but it may take
more than one sitting for you to read all of it!

Making Your Site Accessible to Search Engines

The first step in the SEO design
process is to ensure that your site can be found and crawled by search engines.
This is not as simple as it sounds, as there are many popular web design and
implementation constructs that the crawlers may not understand.

Content That Can Be Indexed

To rank well in the search engines, a substantial portion of your site’s
content—that is, the material available to visitors of your site—should be in
HTML text form or get pulled in by JavaScript that executes on the initial page
load (we refer to the resulting page that is shown as the fully rendered page).
You can read more about this in “Problems That Still
Happen with JavaScript” on page 310. Content that only gets pulled into the page
after a user clicks on a page element, causing JavaScript to then download it,
cannot be indexed. While Google’s capabilities to understand the content of
images are increasing rapidly, the reality is that the processing power required
to analyze visual elements at the scale of the web still limits how far it’s
possible to go with that. The search engines still rely on image filenames, alt
attributes, and nearby text to help them understand what is in an image. Google
enables users to perform a search using an image, as opposed to text, as the
search query (though users can input text to augment the query). By uploading an
image, dragging and dropping an image from the desktop, entering an image URL,
or right-clicking on an image within a browser (Firefox and Chrome with
installed extensions), users can often find other locations of that image on the
web for reference and research, as well as images that appear similar in tone
and composition. The mobile app Google Lens also represents a big step forward
in this area, making it easy for users to initiate searches by submitting an
image from their camera as the search query. Google Lens then identifies the
contents of the image and serves up relevant search results. While this does not
immediately change the landscape of SEO for images, it does give us an
indication of how Google is augmenting its abilities in image and visual
processing. Google Lens and other image-related capabilities are discussed
further in “Visual Search” on page 229.

Link Structures That Can Be Crawled

As we outlined in Chapter 3, search engines
use links on web pages to help them discover other web pages and websites. For
this reason, we strongly recommend taking the time to build an internal linking
structure that spiders can crawl easily (are spiderable). At Google I/O in May
2019, Google announced that from that point on Googlebot would always be based
on the latest version of the Chromium rendering engine. Since the Chromium
rendering engine is the underlying engine that defines what Chrome

180

CHAPTER SEVEN: DEVELOPING AN SEO-FRIENDLY WEBSITE

can render, this means that if Chrome can render the contents of a page,
Googlebot can successfully see those contents. While this reduced the frequency
of problems where websites create pages where links cannot be crawled, many
sites still create pages that have sections that only get rendered once the user
interacts with the page, and these sections can contain critical content and
links. As Googlebot will not replicate those user actions, it will not be able
to see that content or those links. To our knowledge, the spiders for other
search engines are likely even more limited in their capabilities. Many sites
still make the critical mistake of hiding or obfuscating their navigation in
ways that limit spider accessibility, thus impacting their ability to get pages
listed in the search engines’ indexes. Consider Figure 7-1, which shows how this
problem can occur.

Figure 7-1. Providing search engines with link structures that can be crawled

Here, Google’s spider has reached Page A and sees links to pages B and E.
However, even though pages C and D might be important pages on the site, the
spider has no way to reach them (or even to know they exist) because no direct
links that can be crawled point to those pages after the initial rendering of
the page; the links only become visible in the rendered page after some user
interaction occurs. As far as Google is concerned, they might as well not exist.
Great content, good keyword targeting, and smart marketing won’t make any
difference at all if the spiders can’t reach your pages in the first place. To
refresh your memory of the discussion in Chapter 3, here are some common reasons
why pages may not be reachable:

Links in submission-required forms
Search spiders will rarely, if ever, attempt
to “submit” forms, and thus any content or links that are accessible only via a
form are invisible to the engines. This even applies to simple forms such as
user logins, search boxes, and some types of pull-down lists.

Links in content not contained in the initially rendered page
If your page has
elements that are not present in the rendered version of the page until a user
interaction causes additional content and links to be retrieved from the web
server, search engines will not see that content or those links. If you want to
see what the initial fully rendered version of a page looks like, you can use
Google’s Rich Results Test or Mobile-Friendly Test tools, or the Chrome
Inspector tool.

Links in PowerPoint and PDF files
Search engines sometimes report links seen in
PowerPoint files or PDFs. These links are believed to be counted in a manner
similar to links embedded in HTML documents. However, the discussion around this
issue is murky at best, as confirmed in a January 2020 Twitter discussion
including Google’s John Mueller.

Links pointing to pages blocked by the robots meta tag, rel="nofollow", or robots.txt
The robots.txt file provides a very simple means for preventing web
spiders from crawling pages on your site. Using the nofollow attribute on a
link, or placing a robots meta tag with a nofollow value on the page containing
the link, instructs the search engine to not pass link authority via the link (a
concept we will discuss
further in “Content Delivery and Search Spider Control” on page 270). Note that
Google may still crawl pages that are pointed to by nofollow links, but they
will not pass PageRank through those links.

Links on pages with many hundreds or thousands of links
Historically, Google recommended a maximum of 100 links per page and warned that
above this limit, Googlebot might stop crawling additional links from that
page. While this recommendation has been withdrawn, you should still think about
the number of links you have on a page and the impact it has on how much
PageRank will get passed to each of the pages receiving the links. If a page has
200 or more links on it, then none of the links will get very much PageRank.
Managing how you pass PageRank by limiting the number of links is usually a good
idea. Third-party crawling tools such as Screaming Frog, Lumar, Botify, and
Oncrawl can run reports on the number of outgoing links you have per page.

Links in iframes
Technically, links in iframes can be crawled, but these still
present structural issues for the engines in terms of organization and
following. Unless you’re an advanced user with a good technical understanding of
how search engines index and follow links in iframes, it’s best to stay away
from them as a place to offer links for crawling purposes. We will discuss
iframes in more detail in “Creating an Optimal Information Architecture” on page
188.

XML Sitemaps

Google and Bing both support a protocol known as XML Sitemaps.
Google first announced it in 2005, and Yahoo!, which had its own search engine
at the time, and MSN Search (a predecessor to Bing) agreed to support the
protocol in 2006. Using the Sitemaps protocol, you can supply search engines
with a list of all the URLs you would like them to crawl and index. Adding a URL
to a sitemap file does not guarantee that URL will be crawled or indexed.
However, it can result in the search engine discovering and indexing pages that
it otherwise would not. Sitemaps are a complement to, not a replacement for, the
search engines’ normal, link-based crawl. The benefits include:

• For the pages the search engines already know about through their regular
spidering, they may use the metadata you supply, such as the last date the
content was modified (<lastmod>). Bing says that it relies on this and plans to
do so more in the future. Google says that it uses it if it is consistently and
verifiably accurate. Both search engines indicate that they do not use the
<priority> field.

• For the pages they don’t know about, they use the additional URLs you supply
to increase their crawl coverage.

• For URLs that may have duplicates, the engines can use the XML sitemaps data
to help choose a canonical version.

• The crawling/inclusion benefits of sitemaps may have second-order positive
effects, such as improved rankings or greater internal link popularity.

• Having a sitemap registered with Google Search Console can give you extra
analytical insight into whether your site is suffering from indexing, crawling,
or duplicate content issues.

• Sitemap files can also be used to implement hreflang statements if it is
otherwise difficult to implement them in the <head> section of your website.

XML sitemaps are a useful, and in many cases essential, tool for your website.
In particular, if you have reason to believe that the site is not fully indexed,
an XML
sitemap can help you increase the number of indexed pages. As sites grow larger,
the value of XML sitemap files tends to increase dramatically, as additional
traffic flows to the newly included URLs. John Mueller, Search Advocate at
Google, has commented that “Making a sitemap file automatically seems like a
minimal baseline for any serious website.”

Laying out an XML sitemap

The first step in the process is to create an XML sitemap file in a suitable
format. Because creating an XML sitemap requires a certain level of technical
know-how, it would be wise to involve your development team in the process from
the beginning. Here’s an example of some code from a sitemap:

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>XXXX-XX-XX</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.8</priority>
  </url>
</urlset>
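
As a minimal illustration of the kind of task the generator scripts described
below automate, the following Python sketch writes a file in that format from a
short, placeholder list of canonical URLs.

from datetime import date
from xml.etree.ElementTree import Element, SubElement, ElementTree

# Placeholder URL list; a real generator would pull this from a crawl, CMS, or access logs.
urls = ["https://example.com/", "https://example.com/recipes/shrimp-stir-fry"]

urlset = Element("urlset", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for url in urls:
    entry = SubElement(urlset, "url")
    SubElement(entry, "loc").text = url
    SubElement(entry, "lastmod").text = date.today().isoformat()
    SubElement(entry, "changefreq").text = "monthly"
    SubElement(entry, "priority").text = "0.8"

ElementTree(urlset).write("sitemap.xml", encoding="UTF-8", xml_declaration=True)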

To create your XML sitemap, you can use the following:

An XML sitemap generator
This is a simple script that you can configure to automatically create sitemaps,
and sometimes submit them as well. Sitemap generators can create sitemaps from a
URL list, access logs, or a directory path hosting static files corresponding to
URLs. Here are some examples of XML sitemap generators:

• Slickplan

• Dynomapper’s Visual Sitemap Generator

• SourceForge’s google-sitemap_gen

• XML-Sitemaps.com

• Inspyder’s Sitemap Creator

In addition, many content management systems have built-in sitemap generators,
and plug-ins such as the Yoast SEO plug-in are available to help generate XML
sitemaps for WordPress. Screaming Frog is a web crawler that will also allow you
to create an XML sitemap directly from a crawl.

Simple text
You can provide Google with a simple text file that contains one URL per line.
However, Google recommends that once you have a text sitemap file for your
site, you use a sitemap generator to create a sitemap from this text file using
the Sitemaps protocol.

Syndication feed
Google accepts Really Simple Syndication (RSS) 2.0 and Atom 1.0
feeds. Note that the feed may provide information on recent URLs only. You can
read more about this in the documentation.

Deciding what to include in a sitemap file

When you create a sitemap file, you need to take care in situations where your
site has multiple URLs that refer to one piece of content: include only the
preferred (canonical) version of the URL, as the search engines may assume that
the URL specified in a sitemap file is the preferred form of the URL for the
content. You can use the sitemap file to indicate to the search engines which
URL points to the preferred version of a given page. In addition, be careful
about what not to include. For example, as well as not including multiple URLs
that point to identical content, you should leave out pages that are simply sort
orders or filters for the same content, and/or any low-value pages on your site.
Additional URLs to leave out of your sitemap file include:

• URLs that include any tracking parameters

• URLs that redirect

• URLs that do not return a 200 HTTP status code

If your sitemap file
has too many of these errors, the search engines may ignore it. Best practice is
to have a process for automatically updating your sitemap file whenever changes
are made to your site. Conversely, it may be worth including information on the
following types of content in your sitemap, or creating separate sitemaps for
these kinds of content:

Videos
Including information on videos on your site in your sitemap file will
increase their chances of being discovered by search engines. Google supports
the following video formats: .mpg, .mpeg, .mp4, .m4v, .mov, .wmv, .asf, .avi,
.ra, .ram, .rm, .flv, and .swf. You can find details on implementing video
sitemap entries and using a media RSS (mRSS) feed to provide Google with
information about video content on your site in the “Video sitemaps and
alternatives” documentation.

Images
You can increase visibility for your images by listing them in your
sitemap file as well. For each URL you list in the file, you can also list the
images that appear on
those pages, in specialized tags (up to a maximum of 1,000 images per page). See
the “Image sitemaps” documentation for details on the format of these tags. As
with videos, listing images in the sitemap increases their chances of being
indexed. If you list some images and not others, it may be interpreted as a
signal that the unlisted images are less important. Using image sitemap files is
becoming increasingly important as Google continues to expand its image
processing capabilities (discussed further in “Visual Search” on page 229).

News
With news sitemaps, your content should be added to the existing sitemap
file as soon as it’s published (don’t put the new content in a new sitemap file;
just update the existing one). Your news sitemap should only include content
from the past two days, and older content should be removed. You can find
details on how to implement news sitemaps in the “News sitemaps” documentation.
As with image and video sitemaps, you can create a separate news sitemap or add
news-specific tags to your existing sitemap.

Uploading your sitemap file

When your sitemap file is complete, upload it to
your site in the highest-level directory you want search engines to crawl
(generally, the root directory), such as www.yoursite.com/sitemap.xml. You can
include more than one subdomain in your sitemap, provided that you verify the
sitemap for each subdomain in Google Search Console or by using the Domain
property in Google Search Console. Note, though, that it’s frequently easier to
understand what’s happening with indexation if each subdomain has its own
sitemap and its own profile in Google Search Console.

Managing and updating XML sitemaps

Once your XML sitemap has been accepted and
your site has been crawled, monitor the results and update your sitemap if there
are issues. With Google, you can return to your Google Search Console account to
view the statistics and diagnostics related to your XML sitemaps. Just click the
site you want to monitor. You’ll also find some FAQs from Google on common
issues such as slow crawling and low indexing. Update your XML sitemap with
Google and Bing when you add URLs to your site. You’ll also want to keep your
sitemap file up to date when you add a large volume of pages or a group of pages
that are strategic. There is no need to update the XML sitemap when you’re
simply updating content on existing URLs, but you should update your sitemap
file whenever you add any new content or remove any pages. Google and Bing will
periodically redownload the sitemap, so you don’t need to resubmit it unless
your sitemap location has changed.

Alternatively, you can enable Google and Bing to autodiscover where your sitemap
file(s) are stored by using the Sitemap directive in your site’s robots.txt
file. If you are adding or removing large numbers of pages to or from your site
on a regular basis, you may want to use a utility (or have your developers build
the ability) for your XML sitemap to regenerate with all of your current URLs on
a regular basis. Many sites regenerate their XML sitemaps daily via automated
scripts. If your sitemap file grows to be larger than 50 MB, it will be
necessary to create multiple sitemap files. To do this, the first step is to
create a sitemap index file. The tags in this file are:

<sitemapindex>
The parent tag surrounding each sitemap file.

<sitemap>
The parent tag for each sitemap listed in the sitemap index file. This tag is a
child of the <sitemapindex> tag.

<loc>
The location of the sitemap file. This tag is a child of the <sitemap> tag.

You can read more about this process in the documentation.

Google and the other major search engines discover and index websites by
crawling links. Google XML sitemaps are a way to feed the URLs that you want
crawled on your site to Google for more complete crawling and indexing, which
results in improved long-tail searchability. By creating and updating this XML
file(s), you help to ensure that Google recognizes your entire site, which in
turn will help people find it. If you have more than one URL pointing to the
same content, it also helps all of the search engines understand which version
is the canonical version.
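
Continuing the earlier generator sketch, a sitemap index file that references
child sitemap files can be produced the same way; the child URLs here are
placeholders.

from xml.etree.ElementTree import Element, SubElement, ElementTree

child_sitemaps = ["https://example.com/sitemap-1.xml", "https://example.com/sitemap-2.xml"]

index = Element("sitemapindex", xmlns="http://www.sitemaps.org/schemas/sitemap/0.9")
for sitemap_url in child_sitemaps:
    entry = SubElement(index, "sitemap")  # one <sitemap> entry per child file
    SubElement(entry, "loc").text = sitemap_url

ElementTree(index).write("sitemap_index.xml", encoding="UTF-8", xml_declaration=True)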

IndexNow

In October 2021, Bing and Yandex announced a new initiative called
IndexNow that pings search engines directly whenever content on a site is
created, modified, or deleted. The objective is to reduce the inherent delay in
waiting for a search engine crawler to return to the site to discover those
changes, shortening the time it takes for the search engines to understand how
to find and rank new or changed content, or remove deleted content from their
indexes. In addition to removing the burden of manually notifying the search
engines of changes you make to your site, this will reduce the load placed on
your servers by the crawlers of search engines that support IndexNow. IndexNow
differs from sitemaps in that you need only list URLs that have been added,
changed, or deleted, so it acts like a real-time notification system that the
search
engines can use to prioritize crawling of those pages. Of course, the engines
still need to decide whether or not they will choose to crawl those pages, based
on how important they see them as being. As of the time of writing, Google has
not yet adopted IndexNow, but it has confirmed it will be testing the protocol.
NOTE IndexNow is not intended as a replacement for XML sitemaps, and Bing still
strongly recommends that you implement them.
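
Here is a hedged sketch of an IndexNow notification, following the JSON POST
format in the public IndexNow documentation at the time of writing. The host,
key, and URLs are placeholders; verify the endpoint and payload against the
current specification before relying on it.

import json
import urllib.request

payload = {
    "host": "www.example.com",
    "key": "your-indexnow-key",  # the key file must be hosted at the keyLocation URL
    "keyLocation": "https://www.example.com/your-indexnow-key.txt",
    "urlList": [
        "https://www.example.com/new-page",
        "https://www.example.com/updated-page",
    ],
}

request = urllib.request.Request(
    "https://api.indexnow.org/indexnow",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json; charset=utf-8"},
)
with urllib.request.urlopen(request) as response:
    print(response.status)  # a 2xx status indicates the notification was accepted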

Creating an Optimal Information Architecture

Making your site friendly to search
engine crawlers also requires that you put some thought into your site’s
information architecture (IA). A well-designed site architecture can bring many
benefits for both users and search engines.

The Importance of a Logical, Category-Based Flow

Search engines face myriad
technical challenges in understanding your site. This is because crawlers are
not able to perceive web pages in the way that humans do, creating significant
limitations for both accessibility and indexing. A logical and properly
constructed website architecture can help overcome these issues and bring great
benefits in terms of both search traffic and usability. At the core of website
information architecture are two critical principles: making a site easy to use
and crafting a logical, hierarchical structure for content. These tasks are the
responsibility of the information architect, defined by one of the very early IA
proponents, Richard Saul Wurman, as follows: 1) the individual who organizes the
patterns inherent in data, making the complex clear; 2) a person who creates the
structure or map of information which allows others to find their personal paths
to knowledge; 3) the emerging 21st century professional occupation addressing
the needs of the age focused upon clarity, human understanding, and the science
of the organization of information.

Usability and search friendliness

Search engines try to reproduce the human
process of sorting relevant web pages by quality. If a real human were to do
this job, usability and user experience would surely play a large role in
determining the rankings. Given that search engines are machines and don’t have
the ability to segregate by this metric quite so easily, they are forced to
employ a variety of alternative, secondary metrics to assist in the process. One
of the best-known of these is a measurement of the inbound links to a website,
and a well-organized site is more likely to receive links (see Figure 7-2).
Search engines may use many other signals to get a measure of search
friendliness as well.

For example, Google’s page experience signal includes some basic factors, such
as the mobile-friendliness of a page and whether or not it uses interstitials
(ads that pop up over and obscure the content that the user wishes to see).

Figure 7-2. Well-organized sites are more attractive to link to

Since Google launched in the late 1990s, search engines have strived to analyze
every facet of the link structure on the web and have extraordinary abilities to
infer trust, quality, reliability, and authority via links. If you push back the
curtain and examine why links between websites exist and how they come to be,
you can see that a human being (or several humans, if the organization suffers
from bureaucracy) is almost always responsible for the creation of links. The
engines hypothesize that high-quality links will point to high-quality content,
and that great content and positive user experiences will be rewarded with more
links than poor user experiences. In practice, the theory holds up well. Modern
search engines have done a very good job of placing good-quality, usable sites
in top positions for queries.

A site structure analogy

Look at how a standard filing cabinet is organized. You
have the individual cabinet, drawers in the cabinet, folders within the drawers,
files within the folders, and documents within the files (see Figure 7-3).

Figure 7-3. Similarities between filing cabinets and web pages

There is only one copy of any individual document, and it is located in a
particular spot. There is a very clear navigation path to get to it. For
example, if you wanted to find the January 2023 invoice for a client
(Amalgamated Glove & Shoe), you would go to the cabinet, open the drawer marked
Client Accounts, find the Amalgamated Glove & Shoe folder, look for the Invoices
file, and then flip through the documents until you came to the January 2023
invoice (again, there is only one copy of this; you won’t find it anywhere
else). Figure 7-4 shows what it looks like when you apply this logic to the
popular website Craigslist.

Figure 7-4. Filing cabinet analogy applied to Craigslist

If you were seeking an apartment in Seattle, you’d navigate to
https://seattle.craigslist.org, choose apts/housing, narrow that down to two
bedrooms, and pick the two-bedroom loft from the list of available postings.
Craigslist’s simple, logical information architecture makes it easy for you to
reach the desired post in four clicks, without having to think too hard at any
step about where to go. This principle applies perfectly to the process of SEO,
where good information architecture dictates:

• As few clicks as possible to your most important pages

• Avoiding overly deep link architectures to the rest of your pages (note that
this doesn’t mean that you should stick a thousand links on all your pages; aim
to balance user experience and reasonable link depths)

• A logical, semantic flow of links from the home page to categories to detail
pages

Here is a brief look at how this basic filing cabinet approach can work for some
more complex information architecture issues:

Subdomains
You should think of subdomains as completely separate filing cabinets
within one big room. They may share a similar architecture, but they shouldn’t
share the same content; and more importantly, if someone points you to one
cabinet to find something, they are indicating that that cabinet is the
authority, not the other cabinets in the room. Why is this important? It will
help you remember that links (i.e., votes or references) to subdomains may not
pass all, or any, of their authority to other subdomains (e.g.,
*.craigslist.org, where * is a variable subdomain name). Those cabinets, their
contents, and their authority are isolated from one another and may not be
considered to be associated with one another. Note that at times search engines
like Google might treat a subdomain similar to how it treats subfolders, based
on how tightly the subdomain is interlinked with the main domain. However, in
many cases it is best to have one large, well-organized filing cabinet instead
of several, as the separation might prevent users and bots from finding what
they want.

Redirects
If you have an organized administrative assistant, they probably use
redirects inside physical filing cabinets. If they find themselves looking for
something in the wrong place, they might put a sticky note there pointing to the
correct location so it’s easier to find the next time they go looking for that
item. This will make it easier for anyone looking for something in those
cabinets to find it: if they navigate improperly, they’ll find a note pointing
them in the right direction. In the context of your website, redirecting
irrelevant, outdated, or misplaced content to the proper pages on the site will
ensure that users and search engines that try to visit those URLs will instead
be sent to the appropriate places.

URLs
It would be tremendously difficult to find something in a filing cabinet if
every time you went to look for it, it had a different name, or if that name
resembled jklhj25br3g452ikbr52k—a not-so-uncommon type of character string found
in some
dynamic website URLs. Static, keyword-targeted URLs are much better for users
and bots alike. They can always be found in the same place, and they give
semantic clues as to the nature of the content. These specifics aside, thinking
of your site’s information architecture as a virtual filing cabinet is a good
way to make sense of best practices. It’ll help keep you focused on a simple,
easily navigated, easily crawled, well-organized structure. It is also a great
way to explain an often complicated set of concepts to clients and coworkers.
Because search engines rely on links to crawl the web and organize its content,
the architecture of your site is critical to optimization. Many websites grow
organically and, like poorly planned filing systems, become complex, illogical
structures that force people (and search engine spiders) to struggle to find
what they want.

Site Architecture Design Principles

In planning your website, remember that
nearly every user will initially be confused about where to go, what to do, and
how to find what they want. An architecture that recognizes this difficulty and
leverages familiar standards of usability with an intuitive link structure will
have the best chance of making a visit to the site a positive experience. A
well-organized site architecture helps solve these problems and provides
semantic and usability benefits to both users and search engines. As Figure 7-5
shows, a recipes website can use intelligent architecture to fulfill visitors’
expectations about content and create a positive browsing experience. This
structure not only helps humans navigate a site more easily, but also helps the
search engines see that your content fits into logical concept groups. This
approach can help you rank for applications of your product in addition to
attributes of your product. Although site architecture accounts for a small part
of the algorithms, search engines do make use of relationships between subjects
and give value to content that has been organized sensibly. For example, if you
were to randomly jumble the subpages in Figure 7-5 into incorrect categories,
your rankings would probably suffer. Search engines, through their massive
experience with crawling the web, recognize patterns in subject architecture and
reward sites that embrace an intuitive content flow.

Figure 7-5. An organized site architecture

Site architecture protocol Although site architecture—the creation of structure
and flow in a website’s topical hierarchy—is typically the territory of
information architects (or is created without assistance from a company’s
internal content team), its impact on search engine rankings, particularly in
the long run, is substantial. It is, therefore, a wise endeavor to follow basic
guidelines of search friendliness. The process itself should not be overly
arduous if you follow this simple protocol:

1. List all of the requisite pages (blog posts, articles, product detail pages, etc.).
2. Create top-level navigation that can comfortably hold all of the unique types of detailed content for the site.
3. Reverse the traditional top-down process by starting with the detailed content and working your way up to an organizational structure capable of holding each page.
4. Once you understand the bottom, fill in the middle. Build out a structure for subnavigation to sensibly connect top-level pages with detailed content. In small sites, there may be no need for this level, whereas in larger sites, two or even three levels of subnavigation may be required.
5. Include secondary pages such as copyright, contact information, and other nonessentials.
6. Build a visual hierarchy that shows (to at least the last level of subnavigation) each page on the site.

Figure 7-6 shows an example of a well-structured site architecture.

Figure 7-6. Another example of a well-structured site architecture

Category structuring

As search engines crawl the web, they collect an incredible
amount of data (millions of gigabytes) on the structure of language, subject
matter, and relationships between content. Though not technically an attempt at
artificial intelligence, the engines have built a repository capable of making
sophisticated determinations based on common patterns. As shown in Figure 7-7,
search engine spiders can learn semantic relationships as they crawl thousands
of pages that cover a related topic (in this case, dogs).

Figure 7-7. Spiders learn semantic relationships

Although content need not always be structured according to the most predictable
patterns, particularly when a different method of sorting can provide value or
interest to a visitor, organizing subjects logically assists both humans (who
will find your site easier to use) and search engines (which will award you with
greater rankings based on increased subject relevance).

Naturally, this pattern of relevance-based scoring extends from single
relationships between documents to the entire category structure of a website.
Site creators can take advantage of this best by building hierarchies that flow
from broad subject matter down to more detailed, specific content. Obviously, in
any categorization system, there is a natural level of subjectivity; think first
of your visitors, and use these guidelines to ensure that your creativity
doesn’t overwhelm the project.

Taxonomy and ontology

In designing a website, you should also consider the
taxonomy and ontology. Taxonomy is essentially a two-dimensional hierarchical
model of the architecture of the site. You can think of ontology as mapping the
way the human mind thinks about a topic area. It can be much more complex than
taxonomy, because a larger number of relationship types are often involved. One
effective technique for coming up with an ontology is called card sorting. This
is a user testing technique whereby users are asked to group items together so
that you can organize your site as intuitively as possible. Card sorting can
help identify not only the most logical paths through your site, but also
ambiguous or cryptic terminology that should be reworded. With card sorting, you
write all the major concepts onto a set of cards that are large enough for
participants to read, manipulate, and organize. Your test group assembles the
cards in the order they believe provides the most logical flow, as well as into
groups that seem to fit together. By itself, building an ontology is not part of
SEO, but when you do it properly it will impact your site architecture, and
therefore it interacts with SEO. Coming up with the right site architecture
should involve both disciplines.

Flat Versus Deep Architecture

One long-standing guideline for search friendliness is
the creation of flat site architecture. Flat sites require a minimal number of
clicks to access any given page, whereas deep sites create long paths of links
required to access detailed content. For nearly every site with fewer than
10,000 pages, all content should be accessible through a maximum of four clicks
from the home page and/or sitemap page. That said, flatness should not be forced
if it does not make sense for other reasons. If a site is not built to be flat,
it can take too many clicks for a user or a search engine to reach the desired
content, as shown in Figure 7-8. In contrast, a flat site (see Figure 7-9)
allows users and search engines to reach most content in just a few clicks. Even
sites with millions of pages can have every page accessible in five to six
clicks if proper link and navigation structures are employed.

Figure 7-8. Deep site architecture

Figure 7-9. Flat site architecture

Flat sites aren’t just easier for search engines to crawl; they are also simpler
for users, as they limit the number of page visits the user requires to reach
their destination. This reduces the abandonment rate and encourages repeat
visits. When creating flat sites, be careful not to overload pages with links.
Pages that have 200 links on them are not passing much PageRank to most of those
pages (note that Google may still pass a decent amount of PageRank to the pages
it deems to be the most important, which will leave even less PageRank for the
remaining pages).

Pagination

The issue of the number of links per page relates directly to another
rule for site architects: avoid excessive pagination wherever possible.
Pagination (see Figure 7-10), the practice of creating a list of elements on
pages separated solely by numbers (e.g., in an ecommerce site where the product
catalog has more products than the site owners wish to show on a single page),
is problematic for many reasons. First, pagination provides virtually no new
topical relevance, as the pages are each largely about the same topic.
Therefore, search engines might see this as poor-quality or “thin” content.
Second, when the content on the paginated pages shifts because new articles or
products are added and/or old ones are deleted, search engines have to recrawl
and reanalyze the paginated pages from scratch, which can have a detrimental
impact on rankings. Finally, pagination can create spider traps (discussed in
“Search engine–friendly navigation guidelines” on page 204), and potentially
hundreds or thousands of extraneous, low-quality pages that can be detrimental
to search visibility.

Figure 7-10. Pagination structures

So, aim to implement flat structures and stay within sensible guidelines for the
number of links per page, while retaining a contextually rich link structure.
This is not always as easy as it sounds, and it may require quite a bit of
thought and planning to build such a structure on some sites. Consider a site
with 10,000 different men’s running shoes. Defining an optimal structure for
that site could be a very large effort, but that effort will pay serious
dividends in return. Solutions to pagination problems vary based on the content
of the website. Here are two example scenarios and their solutions:

Use simple HTML links to connect paginated pages

Google used to support
rel="next" and rel="prev" attributes in link tags, but in March 2019, Google’s
John Mueller clarified that it no longer uses these. As a result, simple HTML
links similar to what you see in Figure 7-11 are one of the best ways to
approach a paginated sequence of pages. Search engines can use this to
understand that there is a paginated sequence of pages.
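A bare-bones version of such links, with hypothetical URLs, might look like this:

    <!-- Plain, crawlable links between the pages in the sequence,
         as rendered on page 1 of a hypothetical product listing -->
    <nav>
      <a href="/mens-running-shoes?page=1">1</a>
      <a href="/mens-running-shoes?page=2">2</a>
      <a href="/mens-running-shoes?page=3">3</a>
      <a href="/mens-running-shoes?page=2">Next</a>
    </nav>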

Figure 7-11. Recommended pagination structures

Create a view-all page and use canonical tags

You may have lengthy articles on
your site that you choose to break into multiple pages. However, this results in
links to the pages with anchor text like "1", "2", and so forth. The titles of
the various pages may not vary in any significant way, so they tend to compete
with each other for search traffic. Finally, if someone links to the article but
does not link to the first page, the link authority from that link will largely
be wasted. One way to handle this problem is to retain the paginated version of
the article, but also create a single-page version of the article. This is
referred to as a view-all page. Then, use the rel="canonical" link element
(discussed in more detail in “Content Delivery and Search Spider Control” on
page 270) to point from each of the paginated pages to the view-all page (you
should also include a link to this page from each of the individual paginated
pages). This will concentrate all the link authority and search engine attention
on a single page. However, if the view-all page loads too slowly because of the
page size, it may not be the best option for you. Note that if you implement a
view-all page and do not implement any of these tags, Google will attempt to
discover the page and show it instead of the paginated versions in its search
results. However, we recommend that you make use of one of the aforementioned
two solutions, as Google cannot guarantee that it will discover your view-all
pages, and it is best to provide it with as many clues as possible.
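As a concrete illustration (URLs hypothetical), each paginated part of the article would carry something like this:

    <!-- In the <head> of /article/part-2 (and every other paginated part): -->
    <link rel="canonical" href="https://www.yourdomain.com/article/view-all">

    <!-- In the body, a visible link to the same page: -->
    <a href="https://www.yourdomain.com/article/view-all">View as a single page</a>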

Additional pagination considerations

On some sites the degree of pagination can
be quite extensive. For example, used car listing sites may have dozens of
paginated pages for a single make or model of a car. In addition, these sites
may have a large number of sort orders and filters, each of which creates new
pages, so the page counts can become very large. Similar issues can be seen on
many big ecommerce sites as well. In some of these cases it may not be desirable
to have large numbers of paginated pages indexed by search engines.

There is no easy answer to the question of whether you should expose all of your paginated pages, but a good starting point is to ask whether the presence of a large selection of items truly benefits the user. In the case of a used car site it probably
does, and exposing the full catalog to search engines makes sense as it shows
that your site has a large inventory.

Search-Friendly Site Navigation

Website navigation is something that web
designers have been putting considerable thought and effort into since websites
came into existence. Even before search engines were significant, navigation
played an important role in helping users find what they wanted. It plays an
important role in helping search engines understand your site as well. Search
engine spiders need to be able to read and interpret your website’s code to
properly index the content on your web pages. Unfortunately, some ways of
implementing web page navigation and content function well for humans but not
for spiders. This section presents a few basic guidelines. NOTE Do not confuse
this with following the rules of organizations such as the W3C, which issues
guidelines on HTML construction. Although following W3C guidelines can be a good
idea, the great majority of sites do not follow them, so search engines
generally overlook violations of these rules as long as their spiders can parse
the code.

Site elements that are problematic for spiders

Basic HTML text and HTML links such as those shown in Figure 7-12 work equally well for humans and search engine crawlers. So do links implemented in JavaScript as <a> tags, as long as they are present in the initial fully rendered version of the web page—that is, if
human interaction (such as clicking on an accordion on the page) is not required
to download additional content or display the links on the page. However, many
other types of content may appear on a web page and work well for humans but not
so well for search engines. Let’s take a look at some of the most common ones.

Figure 7-12. An example page with simple text and text links

Search and web forms.

Many sites incorporate search functionality. These “site search” elements are
specialized search engines that index and provide access to one site’s content.
This is a popular method of helping users rapidly find their way around complex
sites. For example, the Pew Research Center website provides a site search
function in the upper-right corner; this is a great tool for users, but search
engines will not make use of it. Search engines operate by crawling the web’s
link structure—they don’t, in most circumstances, submit forms or attempt random
queries in search fields, and thus any URLs or content solely accessible via
these means will remain invisible to them. In the case of site search tools,
this is OK, as search engines do not want to index this type of content (they
don’t like to serve search results within their search results). However, forms
are a popular way to provide interactivity on many sites, and crawlers
will not fill out or submit forms; thus, any content restricted to those who
employ them is inaccessible to the search engines. In the case of a simple
“Contact us” form, this is likely to have little impact, but with other types of
forms it can lead to bigger problems. Websites that have content behind paywalls
and/or login barriers will need to either provide text links to the content
behind the barrier (which defeats the purpose of the login) or implement
flexible sampling (discussed in “Content Delivery and Search Spider Control” on
page 270).

JavaScript.

In May 2019 Google announced it was updating its crawler to be “evergreen” with
Chromium, building it on top of the very latest Chromium rendering engine used
by the Chrome browser. As a result, anything that Chrome can render, Googlebot
should also be able to render. This was an important development because it
meant that many of the historical issues that made JavaScript problematic for
SEO were solved. However, there are still some issues that can crop up (as
you’ll learn in “Problems That Still Happen with JavaScript” on page 310). You
can use one of several methods to check to see if Google is able to read and
discover the content of the pages on your site, including the URL Inspection
tool, the Rich Results Test tool, and the Mobile-Friendly Test tool. You can
also compare the source of the site as seen in the Chrome Inspector with the
content you see on the screen. You can access the Chrome Inspector by hitting
F12 while in your browser or by right-clicking on a web page and selecting
Inspect. Click Elements to see the initial tree structure of the web page (also
known as the Document Object Model, or DOM), then click on any section of the
page in the DOM to see the detailed contents of that section; if the content or
links you’re concerned about appear there, then Google will see them. (Note that
this will work well for rendering done by Googlebot but may not necessarily show
how well the page renders in other search engines.) Sites generated with
JavaScript can run into other problems too. Some JavaScript frameworks make it
difficult for you to create or modify title tags, meta descriptions, or
robots.txt. In some cases these frameworks are also slow to load, which can lead
to a bad user experience and, if the performance is bad enough, can be a
negative ranking signal as well. Well-known frameworks include Vue, jQuery,
React, and Express. Each framework has different strengths and weaknesses, so
make sure to learn what those are, including their SEO weaknesses and how to
solve them, prior to selecting the one that you will use. If you’re using
Jamstack, you will end up using a static site generator (SSG). SSGs can exhibit
all of the same problems seen in the JavaScript frameworks. Popular options
include Gatsby, 11ty, Jekyll, and Next.js. As with the frameworks, be sure you
understand their strengths and weaknesses, as well as the potential SEO issues,
before
selecting one for the development of your site. You can learn more about
Jamstack and SSGs in “Jamstack” on page 308.

Another way to detect JavaScript
issues is to use a headless browser, a type of browser with no graphical user
interface (GUI). You can run both Chrome and Firefox as headless browsers by
setting them up via a command line or using a third-party tool such as
Puppeteer. These can then be used to capture the HTML that the browser receives,
which allows you to set up automated testing of your site at scale, reducing the
overall cost of site maintenance. Once you’ve set this up you can run these
tests on a repeatable basis, such as whenever you make updates to the website.
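For instance, a minimal Puppeteer sketch along these lines (the URL and what you do with the captured HTML are assumptions for illustration) could be:

    // Capture the rendered HTML of a page with headless Chrome via Puppeteer.
    const puppeteer = require('puppeteer');

    (async () => {
      const browser = await puppeteer.launch(); // headless by default
      const page = await browser.newPage();
      await page.goto('https://www.yourdomain.com/', { waitUntil: 'networkidle0' });
      const renderedHtml = await page.content(); // the fully rendered DOM as HTML
      // Compare renderedHtml against a stored baseline, or check that key
      // content and links are present, as part of an automated test run.
      await browser.close();
    })();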
Note that if you decide to implement dynamic rendering (discussed in “Dynamic
rendering and hybrid rendering” on page 306), headless browsers can be used as a
tool to capture the server-side rendered version of your pages for you. You can
read more about headless browsers and implementing server-side rendering in
Fernando Doglio’s blog post “Demystifying Server Side Rendering”.

Another
potential problem area is rendering different versions of content at the same
URL. The great majority of JavaScript frameworks and SSGs support the ability to
respond to user interactions with the page, followed by loading new content on
the page without changing the current URL. Historically, AJAX was the technology
used by developers to do this, but today this type of functionality is built
into the frameworks and SSGs themselves. However, because this approach uses
database calls to retrieve data without refreshing a page or changing URLs, the
content these technologies pull in is completely hidden from the search engines
(see Figure 7-13). This was confirmed in a June 2015 article by Eric Enge in
which Google’s Gary Illyes said, “If you have one URL only, and people have to
click on stuff to see different sort orders or filters for the exact same
content under that URL, then typically we would only see the default content.”
As a result, if you’re using this type of user-invoked dynamic content
generation and you want the rendered content to be seen by search engines, you
may want to consider implementing an alternative spidering system for search
engines to follow. These types of dynamic applications are so user-friendly and
appealing that forgoing them is simply impractical for many publishers. With
these traditional implementations, building out a directory of links and pages
that the engines can follow is a far better solution.

Figure 7-13. The problem with AJAX

When you build these secondary structures of links and pages, make sure to
provide users with access to them as well. Inside the dynamic content rendered
by the JavaScript application itself, give your visitors the option to “directly
link to this page” and connect that URL with the URL you provide to search
spiders through your link structures. This will enable content to be crawled,
and also provide users with accurate links that they can use to point to that
content. However, in some cases this type of “hidden” content can be a good
thing. For example, if you have a complex ecommerce site that generates
thousands of different combinations of pages due to the use of a complicated
array of sort orders and filters, eliminating these pages from the search engine
crawl paths through the implementation of this kind of user-driven dynamic
content can significantly reduce the crawl burden for search engines by removing
pages from the index that they weren’t likely to rank anyway.
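Returning to the “directly link to this page” option mentioned a moment ago: one common way to connect the user-facing state with the crawlable URL, offered here as an illustrative sketch rather than a prescribed method, is to update the address bar whenever the user changes a sort order or filter. The function and URL names below are hypothetical:

    // When the user applies a filter, load the new content as usual, then point
    // the address bar at the same crawlable URL exposed in your link structures.
    function applyFilter(filterId) {
      loadFilteredContent(filterId); // hypothetical existing dynamic-loading call
      const crawlableUrl = '/widgets/filter/' + encodeURIComponent(filterId);
      history.pushState({ filterId: filterId }, '', crawlableUrl);
    }

The “directly link to this page” option can then simply expose the same URL that pushState placed in the address bar.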

iframes.

For search engines, the biggest problem with iframes is that they often hold the
content from two or more URLs on a single page. Indeed, the search engines may
consider the content within an iframe as residing on a separate page from the
one the iframe is being used on.

NOTE
At times the engines might instead treat the iframed content as integrated into the page it is used on; which way this goes depends on a number of factors.

However, if the iframed content is treated as residing on a separate page, pages with nothing but iframed content will look virtually blank to the search engines.

Additionally, because search engines rely on links, and iframe pages will often
change content for users without changing the URL, external links often point to
the wrong
URL unintentionally. In other words, links to the page containing the iframe may
not point to the content the linker wanted to point to. Figure 7-14 illustrates
how multiple pages are combined into a single URL with iframes, which results in
link distribution and spidering issues.

Figure 7-14. A sample page using iframes
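A hypothetical snippet makes the issue easy to see: the user is looking at one URL, while the content actually lives at another.

    <!-- The visible page is https://www.yourdomain.com/dashboard, but the report
         content the user sees lives at a different URL inside the iframe. -->
    <iframe src="https://www.yourdomain.com/reports/latest"
            width="100%" height="600" title="Latest report"></iframe>

Links that other sites point at /dashboard confer nothing on /reports/latest, and a page that is little more than this iframe may look nearly empty to a crawler.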

Search engine–friendly navigation guidelines

Although search engine spiders have
become more advanced over the years, the basic premise and goals remain the
same: spiders find web pages by following links and record the content of the
pages they find in the search engine’s index (a giant repository of data about
websites and pages). In addition to avoiding the topics we just discussed, here
are some additional guidelines for developing search engine–friendly navigation:

Beware of “spider traps.” Even intelligently coded search engine spiders can get
lost in infinite loops of links that pass between pages on a site. Intelligent
architecture that avoids recursively looping 301 or 302 HTTP server codes (or
other redirection protocols) should negate this issue, but sometimes online
calendar links, infinite pagination that loops, or content being accessed or
sorted in a multitude of ways (faceted navigation)
can create tens of thousands of pages for search engine spiders when you
intended to have only a few dozen true pages of content. You can read more about
Google’s viewpoint on this in the Google Search Central Blog.

Watch out for session IDs and cookies. If you limit a user’s ability to view
pages or redirect them based on a cookie setting or a session ID, search engines
may be unable to crawl your content. Historically, Google didn’t accept cookies
from pages while crawling content. However, that has changed. In 2017, Google’s
Gary Illyes said “Googlebot generally doesn’t make use of cookies, but it can if
it detects it cannot get to the content without them.” Martin Splitt confirmed
that in a 2021 interview with Eric Enge, which you can watch on YouTube. While
this may be true, you can’t rely on Google reading cookies to access your
content, so ensure that the version of your website without cookies has all the
content you want it to see. You can see more on how Google handles rendering in
Splitt’s TechSEO Boost 2019 keynote presentation “Googlebot & JavaScript: A
Closer Look at the WRS”. Note that Google won’t keep cookies across page loads;
it only loads them on a page-by-page basis to ensure that it can see all of the
content intended for users on the current page. Whether Bing accepts cookies on
a similar limited basis is less clear. In addition, if you use session IDs for
the purposes of personalizing content for users, make sure that the search
engines get URLs without session IDs when they’re crawling the site so they see
the nonpersonalized version. Bots do not deal with session IDs properly: each
visit by the crawler gets a URL with a different session ID, and the search
engine sees the URLs with different session IDs as different URLs. You can use
dynamic rendering, discussed in “Dynamic rendering and hybrid rendering” on page
306, to provide server-side rendered pages to the search engine as the solution
here, as this approach ensures that search engines get the fully rendered
versions of your pages. Because you’re providing the search engines with a
different page than what users see, this could technically be considered
cloaking, but in this particular context and for this specific purpose it is
generally considered acceptable. Dynamic rendering can be implemented using a
headless browser, as mentioned in the previous discussion about JavaScript.

Avoid contradictory canonical tagging. Canonical tags are a great way to let
search engines know which URL you recommend they index when the same (or very
similar) content is available at more than one URL. However, Google treats these as a suggestion, and if the tags are not
used properly, they are ignored. One example of improper use is if you implement
two canonical tags on a page and they point to different target pages (e.g., if
page A has two canonical tags, one of which points to page B and the other to
page C). Another is if page A implements a canonical
tag pointing to page B, and page B implements a canonical tag pointing to page
A. In either of these cases, search engines will ignore the canonical tags.
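For example, a page whose <head> contains both of the following (hypothetical URLs) sends contradictory signals, and both tags will be disregarded:

    <!-- Page A: two canonical tags pointing at different targets -->
    <link rel="canonical" href="https://www.yourdomain.com/page-b">
    <link rel="canonical" href="https://www.yourdomain.com/page-c">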

Be mindful of server, hosting, and IP issues. Server issues rarely cause search
engine ranking problems—but when they do, disastrous consequences can follow.
The engines are acutely aware of common server problems, such as downtime or
overloading, and will give you the benefit of the doubt (though this will mean
your content cannot be crawled by spiders during periods of server dysfunction).
On the flip side, sites hosted on content delivery networks (CDNs) may get
crawled more heavily, and CDNs offer significant performance enhancements to a
website.

Root Domains, Subdomains, and Microsites

Among the common questions about
structuring a website (or restructuring one) are whether to host content on a
new domain, when to use subfolders, and when to employ microsites. As search
engines scour the web, they identify four kinds of structures on which to place
metrics:

Individual pages/URLs

These are the most basic elements of the web: filenames,
much like those that have been found on computers for decades, which indicate
unique documents. Search engines assign query-independent scores—most famously,
Google’s PageRank—to URLs and judge them using their ranking algorithms. A
typical URL might look something like https://www.yourdomain.com/page.

Subfolders

The folder structures that websites use can also inherit or be
assigned metrics by search engines (though there’s very little information to
suggest that they are used one way or another). Luckily, they are an easy
structure to understand. In the URL https://www.yourdomain.com/blog/post17,
/blog/ is the subfolder and post17 is the name of a file in that subfolder.
Search engines may identify common features of documents in a given subfolder
and assign metrics to these (such as how frequently the content changes, how
important these documents are in general, or how unique the content is that
exists in these subfolders).

Subdomains/fully qualified domains (FQDs)/third-level domains

In the URL
https://blog.yourdomain.com/page, three domain levels are present. The top-level
domain (also called the TLD or domain extension) is .com, the second-level
domain is yourdomain, and the third-level domain is blog. The third-level domain
is sometimes referred to as a subdomain. Common web nomenclature does not
typically apply the word subdomain when referring to www, although technically,
this
too is a subdomain. A fully qualified domain is the combination of the elements
required to identify the location of the server where the content can be found
(in this example, blog.yourdomain.com). These structures can receive individual
assignments of importance, trustworthiness, and value from the search engines,
independent of their second-level domains, particularly on hosted publishing
platforms such as WordPress, Blogspot, and so on. Note that subdomains may be
treated similarly to subfolders on the site, or as separate domains. One factor
that can influence this is how tightly the main domain and the subdomain are
interlinked.

Complete root domains/host domains/second-level domains

The domain name you need
to register and pay for, and the one you point DNS settings toward, is the
second-level domain (though it is commonly improperly called the “top-level”
domain). In the URL https://www.yourdomain.com/page, yourdomain.com is the
second-level domain and .com is the top-level domain. Other naming conventions
may refer to the second-level domain as the “root” or “pay-level” domain. Figure
7-15 shows some examples.

Figure 7-15. Breaking down some example URLs

When to Use a Subfolder

If a subfolder will work, it is the best choice about
90% of the time. Keeping content within a single root domain and a single
subdomain (e.g., https://www.yourdomain.com) gives the maximum SEO benefits, as
the search engines will maintain all of the positive metrics the site earns
around links, authority, and trust, and will apply these to every page on the
site. Subfolders have all the flexibility of subdomains (the content can, if
necessary, be hosted on a unique server or completely unique IP address through
post-firewall load balancing) and none of the drawbacks. Subfolder content will
contribute directly to how search engines (and users, for that matter) view the
domain as a whole. Subfolders can be registered with the major search engine
tools and geotargeted individually to specific countries and languages as well.
Although subdomains are a popular choice for hosting content, they are generally
not recommended if SEO is a primary concern. Subdomains may inherit the ranking
benefits and positive metrics of the root domain they are hosted underneath, but
they do not always do so (and thus, content can underperform in these
scenarios). Of course, there may be exceptions to this general guideline.
Subdomains are not inherently harmful, and there are some content publishing
scenarios in which they are more appropriate than subfolders, as we will discuss
next; it is simply preferable for various SEO reasons to use subfolders when
possible.

When to Use a Subdomain

If your marketing team decides to promote a URL that is
completely unique in content or purpose and would like to use a catchy subdomain
to do it, this can be a practical option. Google Maps is an example that
illustrates how marketing considerations make a subdomain an acceptable choice.
One good reason to use a subdomain is when the separation from the main domain makes the content look more authoritative to users. Another is when you wish to create some separation between the business
of the main domain and the subdomain. For example, perhaps the main domain is
for your consumer customers and the subdomain is for your B2B customers, or
maybe the content on the subdomain is not that closely related to the focus of
the content on the main domain. One signal that content may not be closely
related to your main domain is if it doesn’t make sense for you to implement
substantial links between the main domain and that content. Keep in mind that
subdomains may inherit very little link equity from the main domain. If you wish
to split your site into subdomains and have all of them rank well, assume that
you will have to support each with its own full-fledged SEO strategy.

When to Use a Separate Root Domain

If you have a single primary site that has
earned links, built content, and attracted brand attention and awareness, you
probably won’t want to place any new content on a completely separate domain
unless the focus of that content is materially different. There are occasions
when this can make sense, though; we’ll briefly walk through these here, as well
as explaining how singular sites benefit from collecting all of their content in
one root domain location. Splitting similar or relevant content from your
organization onto multiple domains can be likened to a store taking American
Express Gold cards but rejecting American Express Corporate or Blue cards—this
approach is overly segmented and dangerous for the consumer mindset. If you can
serve web content from a single domain, that domain will earn increased branding
in the minds of your visitors, references from them, links from other sites, and
bookmarks from your regular customers. Switching to a new domain forces you to
rebrand and to earn all of these positive metrics all over again. That said, if
you are operating multiple businesses that have materially different target
audiences or foci, then it can make sense to have these on separate root
domains. Trying to operate more than one business on the same domain can send
confusing signals to users and to search engines, potentially resulting in a
weaker business lowering the ranking potential of the other business(es) on the
same domain. In such cases, operating multiple domains can make sense. However,
if you do operate more than one domain, be aware that each one needs its own
marketing budget, SEO strategies, promotional strategies, etc. Each one needs to
earn its place on the web and in the SERPs based on its own merits. This is not
a small commitment and not one that should be taken lightly.

Microsites

There are times when it makes sense to operate a website that is
fully separate from your main site. Often these separate domains are smaller and
much more narrowly focused from a topical perspective. We refer to these types
of sites as microsites. The focus and target audience may be similar to those of
your main domain, but if the microsite is likely to gain more traction and
interest with webmasters and bloggers by being at arm’s length from your main
site, this strategy may be worth considering—for example, if you have a unique
product offering that you wish to market separately. Just be aware that, as
always when operating a separate root domain, you will have to have a separate
marketing budget, SEO strategies, promotional strategies, etc.

Here are a few other cases where you should consider a microsite:

When you plan to sell the domain

It is very hard to sell a folder or even a
subdomain, so this strategy is understandable if you’re planning to sell the
business related to the microsite.

When you’re a major brand building a “secret” or buzzworthy microsite

In this
case, it can be useful to use a separate domain (however, you should
301-redirect the pages of that domain back to your main site after the campaign
is over so that the link authority continues to provide long-term benefit—just
as the mindshare and branding do in the offline world).

When you want to create a user experience that is highly focused on one specific topic area

In some cases you may find it difficult to create the same level of
focus on your main domain, perhaps because of your CMS or ecommerce platform, or
because of the information architecture of your main site. You should never
implement a microsite that acts as a doorway page to your main site, or that has
substantially the same content as your main site. Consider building a microsite
only if you are willing to invest the time and effort to put rich original
content on it, and to invest in promoting it as an independent site. Such a site
may gain more links by being separated from the main commercial site. A
microsite may have the added benefit of bypassing internal political battles or
corporate legal and PR department hurdles, and you may have a freer hand in
making changes to the site. However, it can take nine months or more for a
microsite on a brand-new domain to build enough domain-level link authority to
rank in the search engines. Therefore, if you want to launch a microsite you
should start the clock running as soon as possible on your new domain by posting
at least a few pages to the URL and getting at least a few links to it, as far
in advance of the official launch as possible. It may take a considerable amount
of time before a microsite is able to house enough high-quality content and to
earn enough trusted and authoritative links to rank on its own. Finally, if the
campaign the microsite was created for is time sensitive, consider redirecting
the pages from that site to your main site well after the campaign concludes, or
at least ensure that the microsite links back to the main site to allow some of
the link authority the microsite earns to help the ranking of your main site.

Here are a few considerations for using a microsite:

Search algorithms sometimes favor large, authoritative domains. If the content
you’re considering placing on a microsite is closely related to that of your
main site, chances are high that it will rank higher if you simply place it in a
subfolder on the main domain. If you’re going to go the microsite route, then
ensure that you have a clear vision as to why the content is different enough,
or the separate marketing campaign is potent enough, to merit this approach.

Multiple sites split the benefits of links. As suggested in Figure 7-16, a
single good link pointing to a page on a domain positively influences the entire
domain and every page on it. Because of this phenomenon, it is much more
valuable to have every link you can possibly get pointing to the same domain to
help boost the rank and value of the pages on it. Having content or
keyword-targeted pages on other domains that don’t benefit from the links you
earn to your primary domain only creates more work.

Figure 7-16. How links can benefit your whole site

100 links to domain A ≠ 100 links to domain B + 1 link to domain A (from domain
B). In Figure 7-17, you can see how earning lots of links to page G on a
separate domain is far less valuable than earning those same links to a page on
the primary domain. For this reason, even if you interlink all of the microsites
or multiple domains that you build, the value still won’t be close to what you
could get from those links if they pointed directly to the primary domain.

Figure 7-17. Direct links to your domain are better

A large, authoritative domain can host a variety of content. Niche websites
frequently limit the variety of their discourse and content matter, whereas
broader sites can target a somewhat wider range of foci. This is valuable not
just for targeting the long tail of search and increasing potential branding and
reach, but also for viral content, where a broader focus is much less limiting
than a niche focus.

Time and energy are better spent on a single property. If you’re going to pour
your heart and soul into web development, design, usability, user experience,
site architecture, SEO, public relations, branding, and so on, you want the
biggest bang for your buck. Splitting your attention, time, and resources across
multiple domains dilutes that value and doesn’t let you build on your past
successes in a single domain. As shown in Figure 7-16, every page on a site
benefits from inbound links to the site. The page receiving the link gets the
most benefit, but other pages also benefit.

Selecting a TLD

It used to be that there was a strong bias toward wanting to run
your site on a .com TLD. Whatever TLD you choose, it’s still ideal if you can
own at least the .com, .net, and .org versions of your domain name, as well as
the various international variants. This last point is especially important for
the country where you’re headquartered and any countries in which you plan to
operate your business. If your business has a highly recognizable brand name,
then gaining control over all of your domain variants is critical. There are
situations, though, where this is less
important, and you might want to use a TLD without owning all of the respective
variants, for example:

• If you’re a smaller business and the risk of a third party damaging your business by operating a site with the same root but on a different TLD is low

• When you are serving only a single geographic region and are willing to permanently forgo growth outside that region (in this case a country code top-level domain, or ccTLD, such as .co.uk, .de, .it, etc., might suffice)

Due to the extremely high demand for domain names, the Internet Corporation for Assigned Names and Numbers (ICANN) started assigning additional generic top-level domains (gTLDs), sometimes referred to as new top-level domains
(nTLDs), in 2013. These cover a wide range of potential options, including
.christmas, .autos, .lawyer, .eat, .sydney, and many, many more. A full list can
be found on the ICANN website. One of the major questions that arises with these
is whether using such a TLD will help your site rank organically on terms
related to the TLD. The answer to this is no: in an online forum, Google’s John
Mueller stated that these new generic TLDs are treated the same as other gTLDs
and do not have any effect on your organic rankings. He also noted that even the
new TLDs that sound as if they are region-specific in fact give you no specific
ranking benefit in those regions, though he added that Google reserves the right
to change that in the future. Thus, there is currently no inherent SEO value in
having a TLD that is related to your keywords; having a .storage domain, for
example, does not mean you have some edge over a .com website for a
storage-related business. Nevertheless, you might still want to grab your
domains for key variants of the new TLDs. For example, you may wish to consider
ones such as .sucks. You may also wish to register those that relate directly to
your business. It is unlikely that these TLDs will give a search benefit in the
future, but it is likely that if your competition registers your name in
conjunction with one of these new TLDs your users might be confused about which
is the legitimate site. For example, if you are located in New York City, you
should probably purchase your domain name with the .nyc TLD; if you happen to
own a pizza restaurant, you may want to purchase .pizza; and so on.

Optimization of Domain Names/URLs

Two of the most basic parts of any website are
the domain name and the URLs for the pages. This section will explore guidelines
for optimizing these important elements.

Optimizing Domains

When you’re conceiving or designing a new site—whether it’s
for a new blog, a company launch, or even a small site for a friend—one of the
most critical items
to consider is the domain name. Here are 12 indispensable tips for selecting a
great domain name:

Brainstorm five top keywords. When you begin your domain name search, it helps
to have five or so terms or phrases in mind that best describe the domain you’re
seeking. Once you have this list, you can start to pair the terms or add
prefixes and suffixes to create good domain ideas. For example, if you’re
launching a mortgage-related website, you might start with terms such as
mortgage, finance, home equity, interest rate, and house payment, and play
around until you can find a good match that’s available.

Make the domain unique. Having your website confused with a popular site that
someone else already owns is a recipe for disaster. Thus, never choose a domain
that is simply a plural, hyphenated, or misspelled version of an already
established domain. For example, for years Flickr did not own
https://flicker.com, and the company probably lost traffic because of
that. It eventually recognized the problem and bought the domain, and as a
result https://flicker.com now redirects to https://flickr.com.

Try to pick from .com-available domains. If you’re not concerned with type-in
traffic, branding, or name recognition, you don’t need to worry about this one.
However, if you’re at all serious about building a successful website over the
long term, you should be concerned with all of these elements, and although
directing traffic to a .net or .org domain (or any of the other new gTLDs) is
fine, owning and 301-redirecting the .com domain (and/or the ccTLD for the
country your website serves, such as .co.uk for the United Kingdom) is critical.
If you build or have any level of brand recognition for your business, you don’t
want others piggybacking on it and misappropriating it.

Make it easy to type. If a domain name requires considerable attention to type
correctly due to spelling, length, or the use of unmemorable or nonsensical
words, you’ve lost a good portion of your branding and marketing value.
Usability experts even tout the value of having the words include only
easy-to-type letters (which we interpret as avoiding q, z, x, c, and p).

Make it easy to remember. Remember that word-of-mouth marketing relies on the
ease with which the domain can be called to mind. You don’t want to be the
company with the terrific website that no one can ever remember to tell their
friends about because they can’t remember the domain name.

Keep the name as short as possible. Short names are easy to type and easy to
remember (see the previous two rules). Short names also allow more of the URL to
display in the SERPs and are a better fit on business cards and other offline
media.

Create and fulfill expectations. When someone hears about your domain name for
the first time, they should be able to instantly and accurately guess the type
of content they might find there. That’s why we love domain names such as
NYTimes.com, CareerBuilder.com, AutoTrader.com, and WebMD.com. Less intuitive
names like Monster.com, Amazon.com, and Zillow.com require far more branding
effort.

Avoid trademark infringement. This is a mistake that isn’t made too often, but
it can kill a great domain and a great company. To be sure you’re not infringing
on anyone’s registered trademark with your site’s name, visit the US Patent and
Trademark Office site and search before you buy. Knowingly purchasing a domain
that includes a trademarked term with bad-faith intent is a form of
cybersquatting referred to as domain squatting.

Set yourself apart with a brand. Using a unique moniker is a great way to build
additional value with your domain name. A “brand” is more than just a
combination of words, which is why names such as MortgageForYourHome.com and
ShoesAndBoots.com aren’t as compelling (even if they accurately describe the
products/services available) as branded names such as Yelp and Gilt.

Limit the use of hyphens and numbers. Including hyphens and numbers in your
domain name makes it harder to convey verbally, and to remember or type. Avoid
spelled-out numbers and Roman numerals in domains too, as these forms can easily be confused and mistaken for one another.

Don’t follow the latest trends. Website names that rely on odd misspellings or
uninspiring short adjectives (such as TopX, BestX, or HotX) aren’t always the
best choice. This isn’t a hard-and-fast rule, but in the world of naming
conventions in general, just because everyone else is doing it doesn’t mean it’s
a surefire strategy. Just look at all the people who named their businesses “AAA
X” over the past 50 years to be first in the phone book; how many Fortune 1000s
are named “AAA Company”?

Use a domain selection tool. Domain registrar websites generally make it
exceptionally easy to determine the availability of a domain name. Just remember
that you don’t have to buy through these services: you can find an available
name that you like, and then go to your
registrar of choice. If the domain name you want is owned by someone else, and
they don’t appear to be using it, you can try to buy it directly from them. Most
registrars offer some level of functionality for waiting until a domain name
becomes available, or even making an offer. You can also try marketplaces such
as BuyDomains.com as an option to attempt to purchase domains that have already
been registered.

Picking the Right URLs

Search engines place a small amount of weight on keywords
in your URLs. Be careful, however, as long URLs with numerous hyphens in them
(e.g., Buy-this-awesome-product-now.html) can feel a bit spammy to users. The
following are some guidelines for selecting optimal URLs for the pages of your
site(s):

Describe your content. An obvious URL is a great URL. If a user can look at the
address bar (or a pasted link) and make an accurate guess about the content of
the page before ever reaching it, you’ve done your job. These URLs tend to get
pasted, shared, emailed, written down, and yes, even recognized by the engines.

Keep it short. Brevity is a virtue. The shorter the URL, the easier it is to
copy and paste, read over the phone, write on a business card, or use in a
hundred other unorthodox fashions, all of which spell better usability and
increased branding. You can always create a shortened URL for marketing purposes
that redirects to the destination URL of your content, but bear in mind that
this short URL will have no SEO value.

Static is the way. Search engines treat static URLs differently than dynamic
ones. Users also are not fond of URLs dominated by characters such as ?, &, and =, as they’re harder to read and understand.

Descriptive text is better than numbers. If you’re thinking of using
114/cat223/, you should go with descriptive text instead (e.g., /brand/adidas/).
Even if the descriptive text isn’t a keyword or is not particularly informative
to an uninitiated user, it’s far better to use words when possible. If nothing
else, your team members will thank you for making it that much easier to
identify problems in development and testing.

Keywords never hurt. If you know you’re going to be targeting a lot of
competitive keyword phrases on your website for search traffic, you’ll want
every advantage you can get. Keywords are certainly one element of that
strategy, so take the list from marketing, map it to the proper pages, and get
to work. For pages created dynamically
through a CMS, configure the system to allow you to include keywords in the URL.

Subdomains aren’t always the answer. First off, avoid using multiple subdomains
(e.g., product.brand.site.com) if you can; the resulting URLs are unnecessarily
complex and lengthy. Second, consider that subdomains have the potential to be
treated separately from the primary domain when it comes to passing link and
trust value. In most cases where just a few subdomains are used and there’s good
interlinking, it won’t hurt, but be aware of the downsides. For more on this,
and for a discussion of when to use subdomains, see “Root Domains, Subdomains,
and Microsites” on page 206.

Use fewer folders. A URL should contain no unnecessary folders (or words or
characters, for that matter). They do not add to the user experience of the site
and can in fact confuse users.

Hyphens separate best. When creating URLs with multiple words in the format of a
phrase, hyphens are best to separate the terms (e.g.,
/brands/dolce-and-gabbana/). While other characters, like + or _, also work as
separators, these are not recommended.

Stick with conventions. If your site uses a single format throughout, avoid
making just one section unique. Stick to your URL guidelines once they are
established so that your users (and future site developers) will have a clear
idea of how content is organized into folders and pages. This can apply globally
as well as for sites that share platforms, brands, and so on.

Don’t be case-sensitive. URLs can contain both uppercase and lowercase
characters, but don’t use any uppercase letters in your structure.
Unix/Linux-based web servers are case-sensitive, so
https://www.domain.com/Products/widgets is technically a different URL from
https://www.domain.com/products/widgets. Note that this is not true of Microsoft
IIS servers, but there are a lot of Apache web servers out there. In addition,
capitalization can be confusing to users, and potentially to search engine
spiders as well. Google treats URLs that differ by even a single character as
unique URLs. If your site shows the same content on
www.domain.com/Products/widgets/ and www.domain.com/products/widgets/, it could
be seen as duplicate content, and it also expands the potential crawl space for
the search engines. If you have such URLs now, implement tolower in your server
config file, as this will map all incoming requests that use uppercase letters
to be all lowercase. You can also
301-redirect them to all-lowercase versions, to help avoid confusion. If you
have a lot of type-in traffic, you might even consider a 301 rule that sends any
incorrect capitalization permutation to its rightful home.
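If your stack is Node/Express rather than a traditional server configuration, the same idea can be expressed as middleware. This is a sketch under that assumption, not the only way to do it:

    const express = require('express');
    const app = express();

    // 301-redirect any path containing uppercase letters to its lowercase form.
    app.use((req, res, next) => {
      const lowerPath = req.path.toLowerCase();
      if (req.path !== lowerPath) {
        const query = req.url.slice(req.path.length); // preserve any query string
        return res.redirect(301, lowerPath + query);
      }
      next();
    });

    app.listen(3000);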

Don’t append extraneous data. There is no point in having a URL from which characters can be removed while it still returns the same content. You can be virtually assured
that people on the web will figure it out; link to you in different fashions;
confuse themselves, their readers, and the search engines (with duplicate
content issues); and then complain about it.

Keyword Targeting

Search engines face a tough task: based on a few words in a
query (sometimes only one) they must return a list of relevant results ordered
by measures of importance, and hope that the searcher finds what they are
seeking. As website creators and web content publishers, you can make this
process massively simpler for the search engines—and, in turn, benefit from the
enormous traffic they send—based on how you structure your content. The first
step in this process is to research what keywords people use when searching for
businesses that offer products and services like yours. This practice has long
been a critical part of search engine optimization, and although the role
keywords play has evolved over time, keyword usage is still one of the first
steps in targeting search traffic. The first step in the keyword targeting
process is uncovering popular terms and phrases that searchers regularly use to
find the content, products, or services your site offers. There’s an art and
science to this process, but it consistently begins with developing a list of
keywords to target (see Chapter 6 for more on this topic). Once you have that
list, you’ll need to include these keywords in your pages. In the early days of
SEO, this process involved stuffing keywords repetitively into every HTML tag
possible. Nowadays, keyword relevance is much more aligned with the usability of
a page from a human perspective. Because links and other factors make up a
significant portion of the search engines’ algorithms, they no longer rank pages
with 61 instances of free credit report above pages that contain only 60. In
fact, keyword stuffing, as it is known in the SEO world, can actually get your
pages devalued via search engine algorithms. The engines don’t like to be
manipulated, and they recognize keyword stuffing as a disingenuous tactic.
Figure 7-18 shows an example of pages utilizing accurate keyword targeting.
Keyword usage includes creating titles, headlines, and content designed to
appeal to searchers in the results (and entice clicks), as well as building
relevance for search engines to improve your rankings. In today’s SEO, there are
also many other factors involved in ranking, including AI and machine learning
algorithms, term frequency-
inverse document frequency (TF-IDF), co-occurrence, entity salience, page
segmentation, and several others, which will be described in detail later in
this chapter. However, keywords remain important, and building a search-friendly
site requires that you prominently employ the keywords that searchers use to
find content. Here are some of the more prominent places where a publisher can
place those keywords.

Figure 7-18. Title and heading tags are powerful for SEO

Title Tags

For keyword placement, title tags are an important element for search engine relevance. The <title> tag is in the <head> section of an HTML document and is the only piece of meta information about a page that directly influences relevance and ranking. Keep in mind that a title tag for any given page must directly correspond to that page’s content. You may have five different keyword categories and a unique site page (or section) dedicated to each, so be sure to align a page’s title tag content with its actual visible content as well, with a focus on the content that makes the page unique from the other pages on your site.

Also be aware that Google may not use your title tag in the SERPs. As of September 2021, Google says that it uses the title tag specified by the site owner about 87% of the time. It may choose something other than your title tag to show as the title of your SERP listing based on several different factors, some of which are beyond your control; for example, because the title you provided is not a good match for the user query but Google still wants to show your page in the SERPs, or because it’s not a good description of the content of your page, or it may even be missing entirely (a November 2021 study by Ahrefs showed that 7.4% of top-ranking pages don’t have a title tag). For details on how Google generates title links in the SERPs, how it manages common issues with title tags, and more, see “Influencing your title links in search results” in the Google Search Central documentation. The following nine rules represent best practices for title tag construction:

Communicate with human readers. This needs to remain a primary objective. Even
as you follow the other rules here to create a title tag that is useful to the search
engines, remember that humans will likely see this text presented in the search
results for your page. Don’t scare them away with a title that looks like it’s
written for a machine.

Target searcher intent. When writing titles for web pages, keep in mind the
search terms your audience employed to reach your site. If the intent is
browsing or research-based, a more descriptive title tag is appropriate. If you’re
reasonably sure the intent is a purchase, download, or other action, make it
clear in your title that this function can be performed at your site. For
example, at the time of writing, the title tag of the Best Buy web page
https://bestbuy.com/site/video-games/playstation-4-ps4/
pcmcat295700050012.c?id=pcmcat295700050012 is “PS4 Games and Consoles for
PlayStation 4 - Best Buy.” This text makes it clear that you can buy PS4 games
and consoles at Best Buy.


Incorporate keyword phrases. This one may seem obvious, but it is critical to
prominently include in your title tag the keywords your research shows as being the
most valuable for capturing searches.

Place your keywords at the beginning of the title tag. It’s a common belief that this
positioning provides the most search engine benefit, though Google employees
have indicated in the past that this is not the case. That said, if you want to
include your brand name in the title tag, place it at the end. For most brands, the
most important thing to show the user first is the nature of the content that
they will get if they decide to click on your link in the SERPs.

Limit length to 65 characters (including spaces). The actual length supported by
Google has some variability to it, but title tag content beyond 65 characters may
get cut off in the display in the SERPs.

Target longer phrases if they are relevant. When choosing what keywords to
include in a title tag, use as many as are completely and uniquely relevant to the
page at hand while remaining accurate and descriptive. For example, it can be
much more valuable to have a title such as “SkiDudes | Downhill Skiing Equipment &
Accessories” rather than simply “SkiDudes | Skiing Equipment.” Including
additional terms that are both relevant to the page and receive significant
search traffic can bolster your page’s value. However, if you have separate
landing pages for “skiing accessories” and “skiing equipment,” don’t include one
page’s term in the other’s title. You’ll be cannibalizing your rankings by
forcing the engines to choose which page on your site is more relevant for that
phrase, and they might get it wrong. We will discuss the cannibalization issue
in more detail shortly.

Use a divider. When you’re splitting up the brand from the descriptive text,
options include | (the pipe character), >, –, and :, all of which work well. You
can also combine these where appropriate: for example, “Major Brand Name:
Product Category – Product.” These characters do not bring an SEO benefit, but
they can enhance the readability of your title.

Focus on click-through and conversion rates. The title tag is exceptionally similar to
the title you might write for paid search ads, only it is harder to measure and
improve because the stats aren’t provided for you as easily. However, if you
target a market that is relatively stable in search volume week to week, you can
do some testing with your title tags and improve the click-through rate.


Watch your analytics and, if it makes sense, buy search ads on the page to test
click-through and conversion rates of different ad text as well, even if it’s
for just a week or two. You can then look at those results and incorporate them
into your titles, which can make a huge difference in the long run. A word of
warning, though: don’t focus entirely on click-through rates. Remember to
continue measuring conversion rates.

Be consistent. Once you’ve determined a good formula for your pages in a given
section or area of your site, stick to that regimen. You’ll find that as you
become a trusted and successful “brand” in the SERPs, users will seek out your
pages on a subject area and have expectations that you’ll want to fulfill.
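Pulling several of these rules together, a hypothetical variation on the SkiDudes title used earlier (brand moved to the end, a divider in place, and the total length kept under 65 characters) might look like this; the markup is a sketch, not a quote from any live page:

<title>Downhill Skiing Equipment &amp; Accessories | SkiDudes</title>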

Meta Description Tags Meta descriptions have three primary uses:

• To describe the content of the page accurately and succinctly, with a focus on what unique benefits a user will get by clicking through to your page

• To serve as a short text “advertisement” to prompt searchers to click on your pages in the search results

• To display targeted keywords, not for ranking purposes, but to indicate the contents of the page to searchers

Great meta descriptions, just like great ads, can be tough to write; but for keyword-targeted pages, particularly in competitive search results, they are a critical part of driving traffic from the search engines through to your pages. Their importance is much greater for search terms where the intent of the searcher is unclear or different searchers might have different motivations. Here are six good rules for meta descriptions:

Tell the truth. Always describe your content honestly. If it is not as “sexy” as
you’d like, spice up your content; don’t bait and switch on searchers, or
they’ll have a poor brand association.

Keep it succinct. Be wary of character limits—currently Google and Bing display
as few as 155 to 160 characters. Make sure that your meta description is long
enough to be descriptive, but don’t go over 160 characters. Note that the actual
cap on length is based on the total pixel width, not characters, but for most of
us working with a character count is easier.


Write soft sell copy. Focus on describing to the user the benefit of clicking on
your listing in the SERPs. Make the text compelling and informative. However,
don’t do a hard sell; for most businesses a soft sell meta description is likely
to entice the most clicks.

Analyze psychology. The motivation for organic search clicks is frequently very
different from that of users clicking on paid results. Users clicking on PPC ads
may be very directly focused on making a purchase, while people who click on an
organic result may be more interested in research or learning about the company
or product. Don’t assume that successful PPC ad text will make for a good meta
description (or the reverse).

Include relevant keywords. It is extremely important to have your keywords in
the meta description tag—the boldface that the engines apply can make a big
difference in visibility and click-through rate. In addition, if the user’s
search term is not in the meta description, chances are reduced that the meta
description will be used as the description in the SERPs.

Don’t employ descriptions universally. You shouldn’t always write a meta
description. Conventional logic may hold that it is usually wiser to write a
good meta description yourself to maximize your chances of it being used in the
SERPs, rather than let the engines build one out of your page content; however,
this isn’t always the case. If the page is targeting one to three heavily
searched terms/phrases, go with a meta description that hits those users
performing that search. But if you’re targeting longer-tail traffic with
hundreds of articles or blog entries or even a huge product catalog, it can
sometimes be wiser to let the engines themselves extract the relevant text. The
reason is simple: when engines pull, they always display the keywords (and
surrounding phrases) that the user searched for. If you attempt to write the
descriptions yourself, you lose this benefit. Similarly, if you try to
machine-generate your meta descriptions, many of these may be of poor quality
and actually harm your CTR. In such cases it may be better to let the search
engines automatically generate meta descriptions for you. In some cases they’ll
overrule your meta description anyway, but because you can’t consistently rely
on this behavior, opting out of meta descriptions is OK (and for massive sites,
it can save hundreds or thousands of hours of work). Because the meta
description isn’t a ranking signal, it is a second-order activity at any rate.
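As an illustrative sketch only (the page and the copy are invented for the example), a meta description written along these lines might look like:

<meta name="description" content="Order New York City pizza online for fast delivery: hand-tossed pies, classic and specialty toppings, ready in 30 minutes.">

Note that the description stays under roughly 160 characters and describes the benefit of clicking through rather than hard-selling.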


Heading Tags The heading tags in HTML (<h1>, <h2>, <h3>, etc.) are designed to indicate a headline hierarchy in a document. Thus, an <h1> tag might be considered the headline of the page as a whole, whereas <h2> tags would serve as subheadings, <h3>s as tertiary-level subheadings, and so forth. The search engines have shown a slight preference for keywords appearing in heading tags. Generally when there are multiple heading tags on a page, the engines will weight the higher-level heading tags more heavily than those below them. For example, if the page contains <h1>, <h2>, and <h3> tags, the <h1> tag(s) will be weighted the heaviest. If a page contains only <h2> and <h3> tags, the <h2> tag(s) would be weighted the heaviest. In some cases, you can use the <title> tag of a page, containing the important keywords, as the <h1> tag. However, if you have a longer <title> tag, you may want to use a more focused, shorter heading tag including the most important keywords from the title. When a searcher clicks a result in a SERP, reinforcing the search term they used in a prominent headline helps to indicate that they have arrived on the right page, containing the content they sought. Many publishers assume that they have to use an <h1> tag on every page. What matters most, though, is the highest-level heading tag you use on a page, and its placement. If you have a page that uses an <h2> heading at the very top, and any other heading tags further down on the page are at the <h3> or lower level, then that first <h2> tag will carry just as much weight as if it were an <h1>. Again, what’s most important is the semantic markup of the page. The first heading tag is typically intended to be a label for the entire page (so it plays a complementary role to the <title> tag), and you should treat it as such. Other heading tags on the page should be used to label subsections of the content. It’s also a common belief that the size at which the heading tag is displayed is a factor. For the most part, the styling of your heading tags is not a factor in the SEO weight of these tags. You can style the first heading tag however you want, as shown in Figure 7-19, provided that you don’t go to extremes (and because it acts as a title for the whole page, it should probably be the largest text element on the page). Note that using heading tags in your content can also help with earning more featured snippets.
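To make the hierarchy concrete, here is a minimal sketch (the topic and headings are invented for illustration):

<h1>Downhill Skiing Equipment</h1>
<h2>Skis and Bindings</h2>
<h3>Choosing the Right Ski Length</h3>
<h2>Boots and Poles</h2>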


Figure 7-19. Headings styled to match the site

Document Text The HTML text on a page was once the center of keyword
optimization activities. In the early days of SEO, metrics such as keyword
density and saturation were used to measure the perfect level of keyword usage
on a page. To the search engines, however, text in a document—particularly the
frequency with which a particular term or phrase is used—has very little impact
on how happy a searcher will be with that page. In fact, quite often a page
laden with repetitive keywords attempting to please the engines will provide an
experience that users find unnatural, and this can result in lower rankings
instead of higher ones. It’s much more valuable to create semantically rich content that covers the topic matter implied by the page’s title in a comprehensive way. This means naturally including synonyms and covering related topic areas in a manner that increases the chances of satisfying the needs of a large percentage of visitors to that page. It’s a good idea to use the main keyword for a page in the <title> tag and the main heading tag. It might also appear in the main content, but the use of synonyms for the main keyword and related concepts is at least as important. As a result, it’s more important to focus on creating high-quality content than it is to keep repeating the main keyword.

Page segmentation In the early days of search, Google couldn’t understand page
layout that well because it could not read CSS files and process them like a
browser does. However, that has changed; Google has been able to read CSS files
for many years and explicitly recommends that you don’t block crawling of CSS
files. As a result, Google fully understands the layout of your pages. Given
this, where the keywords are used on the page also matters. Keywords that appear
in the left or right sidebar, or in the footer, are likely given less weight
than they would be if they appeared in the main content of your page. In
addition, with HTML5, new markup exists that allows you to explicitly identify
the section of your page that represents the main content. You can use this
markup to help make Google’s job easier, and to make sure that other search
engines are able to locate that content.
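A minimal sketch of the kind of HTML5 markup being described (the element contents are placeholders):

<body>
  <header>Site navigation</header>
  <main>
    <article>
      <h1>Main topic of the page</h1>
      <p>The primary content that you want search engines to treat as the heart of the page.</p>
    </article>
  </main>
  <aside>Sidebar links and related widgets</aside>
  <footer>Footer links and legal text</footer>
</body>

The <main> element (one visible instance per page) is the explicit signal for the main content, while <aside> and <footer> mark the secondary areas that are likely to carry less keyword weight.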

Synonyms Use of related terms is also a factor. A page about “left-handed golf
clubs” should not use that exact phrase every time the product is referenced.
This would not be a natural way of writing and could be interpreted by the
search engines as a signal of poor document quality, lowering the page’s
rankings. Instead, allow your content creators to write naturally. This will
cause them to use other phrases, such as “the sticks,” “set of clubs,” “lefty
clubs,” and other variants that people use in normal writing style. Using
synonyms represents a key step away from manipulative SEO techniques for
creating pages to try to rank for specific search terms. A good way to start
down this path is to avoid giving writers overly explicit instructions of what
phrases to include in their content. Let your writers focus on creating great
content, and they will naturally use many different ways to refer to the page
subject.

Co-occurrence, phrase-based indexing, and entity salience The presence of a
specific keyword or phrase on a page will likely increase the probability of
finding other related words on those pages. For example, if you are reading an
article on diabetes you would expect to be able to access information on
symptoms, causes, diagnosis, treatment, and more. The absence of these could be
a signal of a poor-quality page.


To put this in the positive, inclusion of more information that readers would
expect to see in the content can be interpreted as an indication that it’s a
good page. Chances are good that more people reading the article will be
satisfied with it as well, as users searching for that term might be looking for
a wide variety of information. Figure 7-20 is an example of such an in-depth
article from the Mayo Clinic.

Figure 7-20. A comprehensive article on diabetes from the Mayo Clinic

You don’t need to address every aspect of the topic on each individual page.
Linking to other relevant resources and high-quality content, both on your site
and on thirdparty sites, can play a key role in establishing your page as
providing a great answer to the user’s question.


This last step may well be equally important in the overall page optimization
process. No single page will answer every question from every possible user, so
addressing a significant percentage of questions, and then connecting with other
pages to answer follow-on questions on the same topic, is an optimal structure.
On the product pages of an ecommerce site, where there will not be article-style
content, this can mean a combination of well-structured and unique description
text and access to key refinements, such as individual brands, related product
types, the presence of a privacy policy and “about us” information, a shopping
cart, and more.

Image Filenames and alt Attributes Incorporating images on your web pages can
substantively enrich the user experience. However, as the search engines have
limited capabilities to understand the content of images, it’s important to
provide them with additional information. There are two basic elements that you
can control to give search engines information about the content of images:

The filename Search engines look at image filenames to see whether they provide
any clues to the content of the images. Don’t name your image
example.com/img4137a-bl2.jpg, as it tells the search engine nothing at all about
the image, and you are passing up the opportunity to include keyword-rich text.
Instead, if it’s a picture of Abe Lincoln, name the file abe-lincoln.jpg and/or
have the src URL string contain that keyword (as in
example.com/abe-lincoln/portrait.jpg).

The alt attribute text Image tags in HTML permit you to specify an alt
attribute. This is a place where you can provide more information about what’s
in the image, and again where you can use your targeted keywords. Here’s an
example for the picture of Abe Lincoln:
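A minimal sketch of such an image tag (the URL and alt text here are illustrative):

<img src="https://example.com/abe-lincoln/portrait.jpg" alt="Portrait of Abe Lincoln">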

Be sure to use quotes around the alt attribute’s text string if it contains
spaces. Sites that have invalid <img> tags frequently lump a few words into the tag
without enclosing them in quotes, intending this to serve as the alt content—but
with no quotes, all terms after the first word will be lost. Using the image
filename and the alt attribute in this way permits you to reinforce the major
keyword themes of the page. This is particularly useful if you want to rank in
image search. Make sure the filename and the alt text reflect the content of the
picture, though, and do not artificially emphasize keywords unrelated to the
image (even if they are related to the page). Also, although the alt attribute
and the image filename are helpful, you should not use image links as a substitute for text
links with rich anchor text, which carry much more weight from an SEO
perspective.

Visual Search Google is investing heavily in multimodal search experiences, in a
variety of ways. For example, in the interest of improving web page load times
Google developed WebP, an image format that enables the creation of smaller
images that will load faster than other formats, such as JPG and PNG files.
There are also many initiatives to expand the search engine’s capabilities,
ranging from the Google Lens app that allows images to be used as search
queries, to augmented reality (AR) and virtual reality (VR) experiences. Aparna
Chennapragada, who was at the time VP for Google Lens and AR, made the following
observation in December 2018 with regard to why the company was investing in
Google Lens: To me, using our cameras to help us with our day-to-day activities
makes sense at a fundamental human level. We are visual beings—by some
estimates, 30 percent of the neurons in the cortex of our brain are for vision.
This certainly sounds compelling, but it’s often very challenging to translate
such realizations into technology that gets widespread adoption. There are
certainly some areas, though, where great progress is being made. For example,
AR and VR are areas where many big players are making large investments.
Hardware technology and AI algorithms are advancing quickly, and this is driving
rapid growth. The global value of the virtual reality gaming market, for
instance, is projected to reach $6.9B in 2025 (Figure 7-21). We’re also seeing
the broad adoption of AR technology in the cars that we buy, including features
such as collision detection, lane departure warning, adaptive cruise control,
and automated parking. The value of this market was estimated at $21B in 2020
and is expected to grow to $62B by 2026. While Google is not in the virtual
reality gaming or automobile sales market, the increasing availability of these
technologies and the proliferation of ways that people will want to use them
suggest that providing more visual ways for people to search makes sense.


Figure 7-21. Virtual reality gaming content revenue worldwide in 2019, 2020, and
2025 (source: Statista)

While there is not much current dependable data available on the percentage of
searches that leverage image search, the most recent data suggests that it
already accounts for a sizable portion of total search volume. Data published by
Google in 2019 shows that “50 percent of online shoppers said images of the
product inspired them to purchase, and increasingly, they’re turning to Google
Images.” A May 2021 report by Data Bridge Market Research indicates that the
value of the visual search market will grow to $33B by 2028. This expanding
market will create opportunities for many new players. For example, Pinterest
introduced a camera search function called Lens in February 2017 that within a
year was reportedly being used to perform 600 million visual searches per month.
While that is a big number, and it has likely grown significantly since then,
this is still a fraction of the overall search queries we see happening on
Google—less than 1% of the approximately 225 billion total searches performed on
Google each month (see the opening of Chapter 1 for more on Google search
volumes).


Google Lens To compete with Pinterest Lens, Google launched Google Lens in
October 2017. This technology provides a wide array of visual searching
capabilities, including the ability to take a picture of an object and use that
picture as the search query, with results being provided that will identify what
you’ve taken a picture of and other related objects. This can tie in to a
variety of informational and shopping experiences. For example, at the Search On
event in September 2021 Google announced that it would be leveraging the
capabilities of its new Multitask Unified Model (MUM) algorithm to allow users
to combine an image search with text inputs for advanced query refinement. One
example that was shared was taking a picture of a colorful shirt and searching
on that to see that shirt and other similar ones, then providing text input to
indicate that what you really want is socks that use or match that color scheme.
Amazon has also started offering image search functionality to users, in two
ways: StyleSnap allows you to search on photos you see on social media, and the
Amazon App Camera Search feature (like Lens) enables you to take pictures of
items and search on them. Then we have Mark Zuckerberg’s vision of a Metaverse
where people can live in a world combining virtual reality, augmented reality,
and video. This highly ambitious concept may never come to pass and is likely a
long way away if it does, but it also suggests the powerful potential of where
our online world could be heading. Google sees this potential and has been
investing in many other aspects of visual experiences as well.

High-resolution images On the web we normally use images that are sized to fit
available screens and to minimize page load times. While this makes complete
sense, there are times when it’s desirable to offer users the opportunity to
access higher-resolution images. You can access these in Google via a simple
search such as the one shown in Figure 7-22.


Figure 7-22. Example of high-definition images

You can also access high-definition images by using advanced image search. To do
this, click on the gear icon and select Advanced Search. As shown in Figure
7-23, you can then select a resolution such as “Larger than 10MP” and get
appropriately filtered results.


Figure 7-23. Google advanced image search

To be eligible for your images to show up for this feature, you need to take two steps:

1. Opt in to the program via Google Search Console.

2. Implement an appropriate schema to let Google know where your high-resolution images can be found. The schema should be similar to the following:
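A minimal JSON-LD sketch along these lines, assuming an article page that references several high-resolution renditions of the same photo (all URLs and values are placeholders):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Example article headline",
  "image": [
    "https://example.com/photos/photo-16x9.jpg",
    "https://example.com/photos/photo-4x3.jpg",
    "https://example.com/photos/photo-1x1.jpg"
  ]
}
</script>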

The documentation suggests that to get the best results you should provide
multiple high-resolution images (with a minimum of 50K pixels when multiplying
width and height) with the aspect ratios shown here (16 × 9, 4 × 3, and 1 × 1).

Optimizing for visual search As mentioned in “Image Filenames and alt
Attributes” on page 228, using images that are related to the content of the
page can improve a page’s chances of ranking for relevant search queries. In
addition, the relevance of an image to the content of the page can help the
image rank higher for queries in Google image search. This makes sense, as users
who find an image relevant to their query may also want in-depth content related
to that query. Other factors that impact how high images may rank in image
search are:

Placement higher up on the page. This is treated as an indication of the
importance of the image to the content.

Use of image sitemaps. These can help search engines discover images and are
therefore good for SEO.
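To give a flavor of what this looks like, here is a minimal image sitemap entry using Google’s image sitemap extension (the URLs are placeholders):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://example.com/abe-lincoln/</loc>
    <image:image>
      <image:loc>https://example.com/abe-lincoln/portrait.jpg</image:loc>
    </image:image>
  </url>
</urlset>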

3D images and AR Google announced support for 3D images in May 2019. Since then,
it has been steadily adding support for different types of objects. Currently
supported are the following: • Land animals

• Chemistry terms

• Underwater and wetland animals

• Biology terms

• Birds

• Physics terms

• House pets

• Cultural heritage sites

• Human anatomy

• Cultural objects

• Cellular structures


These are currently only supported on mobile devices, and some of these image
types are only available in some languages or only in English. The process
starts with a normal search. For example, if you search for tiger, as shown in
Figure 7-24, you will see the option to view in 3D as part of the search results
(note that this works only on Android devices or iPhones; it does not work on
desktops).

Figure 7-24. Google search result for a search on “tiger” on an iPhone


From there you can see the selected object in 3D on your phone (Figure 7-25).

Figure 7-25. Google 3D search result on an iPhone

You can even take that image and superimpose it on the world around you. For
example, perhaps you want to put that tiger right in your kitchen (Figure 7-26).
Other enhancements are available too, such as being able to hear sounds related
to animals that you’ve embedded in your room. Currently there is no way to
submit your content to Google in 3D, and it’s not clear when such functionality
will be available.


Figure 7-26. Google 3D model in a kitchen on an iPhone

Google Discover In December 2016, Google introduced the idea of providing users
with a stream of content selected for them based on their preferences. The
announcement offered the following overview: Starting today, in the Google app
on Android (and coming soon to iOS), your cards will be organized into two
sections: a feed that keeps you current on your interests like sports, news, and
entertainment, and a section for your upcoming personal info, like flights,
appointments and more.


This content is presented to you when you open your Google app on either Android
or iOS devices; you can see an example of what it looks like in Figure 7-27.
Nearly two years later, Google finally gave this offering a name: Google
Discover.

Figure 7-27. Google Discover content


What makes this interesting is that you can potentially optimize your content to
show up in Google Discover, and therefore it can become a source of traffic to
your site. To understand how you may be able to get traffic from Google
Discover, it’s useful to have an understanding of what motivated Google to
promote this new offering. Key parts of that vision were outlined in an October
2018 blog post by Google’s Ben Gomes. The core elements of this vision are:

The shift from answers to journeys: To help you resume tasks where you left off and learn new interests and hobbies, we’re bringing new features to Search that help you with ongoing information needs.

The shift from queries to providing a queryless way to get to information: We can surface relevant information related to your interests, even when you don’t have a specific query in mind.

And the shift from text to a more visual way of finding information: We’re bringing more visual content to Search and completely redesigning Google Images to help you find information more easily.

Note the emphasis on a visual way to find information—this fits, as Google Discover uses a highly visual format for presenting its feed.

Types of sites appearing in Google Discover.

An April 2020 study published by Abby Hamilton on the Search Engine Journal
website broke down the types of sites appearing in Google Discover, based on a
review of over 11,000 URLs from 62 domains. The breakdown of industries as well as where most of the traffic goes is shown in Figure 7-28. The study found that 99% of clicks were on news sites. The remaining 1% were from other industries, including:

• B2B

• Finance

• Automotive

• Health

• Education

While these other market sectors are comparatively small, there is still an opportunity there to earn traffic from Google Discover. To understand how to do so, it’s helpful to have insight into how Google picks content to serve up within the feed. Google leverages several sources of information for this purpose, including:

• Search history

• App activity

• Browser history

• Location


Figure 7-28. Google Discover types of content

Based on this information, Google continuously adjusts its understanding of the
user’s interest areas over time. In addition, on Android platforms users can
provide specific information about their interests by tapping
Settings→Interests→Your Interests and checking or unchecking the available
items. Some of these are highly granular in nature, so users can customize things to a large degree. They can also turn the
feed off entirely if they desire.

Optimizing for Google Discover.

As with Search, Google leverages crawling to find content that it may place
within the Google Discover feed. It has provided some insight as to the types of
factors that it considers. In particular, Google places a high degree of emphasis on sites that have many individual pages that demonstrate a high level of Experience, Expertise, Authoritativeness, and Trustworthiness (EEAT). Some of the specific suggestions made by Google are as follows:

1. Use page titles that capture the essence of the content, but in a non-clickbait fashion.

2. Include compelling, high-quality images in your content, especially large images that are more likely to generate visits from Discover. Large images need to be at least 1200 px wide and enabled by the max-image-preview:large setting, or by using AMP. Avoid using a site logo as your image.

3. Avoid tactics to artificially inflate engagement by using misleading or exaggerated details in preview content (title, snippets, or images) to increase appeal, or by withholding crucial information required to understand what the content is about.

4. Avoid tactics that manipulate appeal by catering to morbid curiosity, titillation, or outrage.

5. Provide content that’s timely for current interests, tells a story well, or provides unique insights.

There are also guidelines about what content Google may choose not to show in the feed. In addition to Google Search’s overall policies, content should not violate the following policies for search features:

• Dangerous content

• Medical content

• Deceptive practices

• Sexually explicit content

• Harassing content

• Terrorist content

• Hateful content

• Violence and gore

• Manipulated media

• Vulgar language and profanity

Google Discover–specific content policies include the following:

• Ads and
sponsored content should not occupy more of the page than your other content,
and any advertising or sponsorship relationships should be fully disclosed.


• Election content that is outdated or that is purely user-generated may be removed.

• Misleading content that does not deliver to users what is promised in the content preview will not be included.

• Transparency is required. This includes providing clear dates and bylines; information about the authors, publication, and publisher; information about the company or network behind the content; and contact information.

In summary, for your content to show up in the Google Discover feed you need to adhere to all of the previously mentioned guidelines and policies. (Failure to do so can result in losing eligibility for your content to appear in the feed and may result in a manual action, discussed in Chapter 9, showing up in Google Search Console.) Google provides the matching algorithms, and the users define what they want to see either through their actions or by identifying specific interests within the Chrome browser on Android. Here are some additional recommendations to follow if you want your content to appear in the Discover feed:

Produce great content that appeals to users. This applies to the content of your
site as a whole; as the overall quality of the content on your website
increases, so do your chances of it being shown in Google Discover.

Analyze your results and find ways to continuously improve that content. For
example, if you’re not obtaining any Discover traffic, you may need to improve
the quality of your content or find ways to differentiate it from other content
in your market (perhaps by focusing on more specific aspects of a topic in
greater depth than others have). While Discover is not only for news articles,
it can help to write content that may be in high demand because it covers a hot
new topic.

Continue to improve the EEAT for your site. Google’s advice specifically
highlights this as a major factor in your ability to show up in the Discover
feed.

Avoid clickbait-like titles. These have a strong chance of being disqualified for
inclusion in the feed.

Leverage high-quality images as a core part of your content. Discover is visual
by design, and attractive high-resolution images will help you earn placements
in the feed.

Try implementing web stories. These are a natural fit for inclusion in Discover;
they’re highly visual and easily consumed.


On Android devices, users have the option to “follow” a website. If they choose
to follow your site, then your content is far more likely to show up in their
feeds. You can help Google understand which content you would like people to
follow by including a link element of this form in the <head> section of your web pages:
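A sketch of such a link element, assuming an RSS feed at a placeholder URL (an Atom feed declared with type="application/atom+xml" works the same way):

<link rel="alternate" type="application/rss+xml" title="Example Site" href="https://example.com/feed.xml">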

Ensure that your feed file is not blocked in robots.txt and that you keep it up
to date, just as you would with a sitemap file. If you redirect your feed, make
sure that you use a 301 redirect so Google can follow it. You can also use
multiple feeds across your site, specifying the feed that makes the most sense
for any given page. In the event that multiple feed files are applicable to a
single page, specify each of them, listing the one that you prefer that users
follow first and then the others in descending order of preference. Be aware
that given the nature of Discover, the volume of traffic you get can fluctuate
depending on how Google sees the fit of your content to users with related
interests. You can take steps to increase your eligibility, as outlined in this
section, but it will likely exhibit a fair amount of volatility. You can monitor
the level of traffic you get using the Performance report for Discover, which is
available in Google Search Console. Note that you are only able to see this
report if you have been obtaining a minimum threshold of impressions over the
past 16 months.

Boldface and Italicized Text In traditional approaches to writing it’s common to
use a variety of methods to help users understand the content of a page. This
can include using headers, lists, etc. to break up the content, as well as
judicious use of bold text or italics. Google’s John Mueller confirmed that
these signals are taken into account in November 2021: So usually, we do try to
understand what the content is about on a web page, and we look at different
things to try to figure out what is actually being emphasized here. And that
includes things like headings on a page, but it also includes things like what
is actually bolded or emphasized within the text on the page. So to some extent
that does have a little bit of extra value there, in that it’s a clear sign that
actually you think this page or this paragraph is about this topic here. And
usually that aligns with what we think the page is about anyway, so it doesn’t
change that much. However, this does not mean that you should bold or set in
italics all of the content on the page—if you do that, Google will just ignore
the markup. The key point is to use


the markup to cause specific important parts of the text to stand out. To bold text, you can use <b> or <strong> tags:

<b>important text</b>
<strong>important text</strong>

Either will cause the text to display in boldface on the rendered web page, and each will have the same level of SEO impact. To italicize text, you can use <i> or <em> tags:

<i>important text</i>
<em>important text</em>

Either implementation will cause the text to display in italics on the rendered
web page, and again each will have the same level of SEO impact.

Keyword Cannibalization As we discussed earlier, you should not use common
keywords across multiple page titles. This advice applies to more than just the <title> tags. One of the nastier problems that often crop up during the course of
designing a website’s information architecture is keyword cannibalization, a
site’s targeting of popular keyword search phrases on multiple pages, which
forces search engines to pick which is most relevant. In essence, the site
competes with itself for rankings and dilutes the ranking power of its internal
anchor text, external links, and keyword relevance. Avoiding cannibalization
requires a strict site architecture with attention to detail. Plot out your most
important terms on a visual flowchart (or in a spreadsheet file, if you prefer),
and pay careful attention to what search terms each page is targeting. Note that
when pages feature two-, three-, or four-word phrases that contain the target
search phrase of another page, linking back to that page within the content with
the appropriate anchor text will avoid the cannibalization issue. For example,
if you had a page targeting mortgages and another page targeting low-interest
mortgages, you would link back to the mortgages page from the low-interest
mortgages page using the anchor text “mortgages” (see Figure 7-29).
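In HTML terms, the cross-link from the low-interest mortgages page might look something like this (the URL is a placeholder):

<a href="https://example.com/mortgages/">mortgages</a>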

Figure 7-29. Adding value with relevant cross-links


You can do this in the breadcrumb or in the body copy. The New York Times does
the latter; keywords in the body copy link to the related resource pages on the
site.

Keyword Targeting in CMSs and Automatically Generated Content Large-scale
publishing systems, and those that produce automatically generated content,
present some unique challenges. If hundreds of pages are being created every
day, it is not feasible to do independent keyword research on each and every
page, making page optimization an interesting challenge. In these scenarios, the
focus turns to methods/recipes for generating unique titles, heading tags, and
content for each page. It is critical to educate the writers on ways to
implement titles and headings that capture unique, key aspects of the articles’
content. More advanced teams can go further with this and train their writing
staff on the use of keyword research tools to optimize this process even more.
In the case of automatically generated material (such as that produced by
algorithms that mine data from larger textual bodies), the key is to automate
means for extracting a short (fewer than 65 characters) description of the text
and making it unique with respect to other titles generated elsewhere on the
site and on the web at large.
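One common recipe, sketched here with generic moustache-style placeholders rather than any particular CMS’s template syntax, is to build titles and headings from the unique attributes of each record:

<title>{{product_name}} in {{city_name}} | {{brand_name}}</title>
<h1>{{product_name}} available in {{city_name}}</h1>

The uniqueness of the output then depends on how distinctive the underlying attribute values are, which is why capturing the unique, key aspects of each page matters so much.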

Effective Keyword Targeting by Content Creators The great majority of the time,
someone other than an SEO professional is responsible for content creation.
Content creators often do not have an innate knowledge of how SEO works, or
worse, they may think they know how it works but have the wrong idea about it.
Some training for your writers is critical. This is particularly important when
you’re dealing with large websites and large teams of writers. Here are the main
components of web page copywriting that your writers must understand, with
regard to SEO:

• Search engines look to match up a user’s search queries with the keyword phrases, their synonyms, and related concepts on your web pages. If some combination of all of these does not appear on the page, chances are good that your page will never achieve significant ranking for those search phrases.

• The search phrases users may choose to use when looking for something are infinite in variety, but certain phrases will be used much more frequently than others.

• Using the more popular phrases you wish to target on a web page in the content for that page is essential to SEO success for the page.

• Make sure that the writers understand the concepts of co-occurrence and entity salience, discussed earlier in this chapter, so they don’t create content that uses the main keyword excessively. They need to focus on creating semantically rich content that stays on the topic of the main target keyword phrase for the page, while still writing naturally.

• The <title> tag is the most important element on the page. Next is the first header (usually the <h1>), and then the main body of the content.

• There are tools (as outlined in Chapter 6) that allow you to research and determine what the most interesting phrases are.

If you can get these six
points across, you are well on your way to empowering your content creators to
perform solid SEO. The next key element is training them on how to pick the
right keywords to use. This can involve teaching them how to use keyword
research tools similar to the ones we discussed in the previous chapter, or
having the website’s SEO person do the research and provide the terms to the
writers. The most important factor to reiterate to content creators is that
content quality and user experience always come first. They should take the
keyword information and translate it into a plan for the topics that should be
included in the content, and then work on creating a great piece of content.
Otherwise, you will inevitably get poorer-quality content or even run into
keyword stuffing or other spam issues.

Long-Tail Keyword Targeting As we outlined in Chapter 5, the small-volume search
terms, when tallied up, represent about 35% of overall search traffic, and the
more obvious, higher-volume terms represent the remaining 65%. However, as an
additional benefit, longer, more specific phrases tend to convert higher, so
there is a lot to be gained by targeting this class of keywords. For example, if
you run a site targeting searches for new york pizza and new york pizza
delivery, you might be surprised to find that the hundreds of single searches
each day for terms such as pizza delivery on the corner of 57th & 7th or
Manhattan’s tastiest Italian-style sausage pizza, when taken together, will
actually provide considerably more traffic than the popular phrases you’ve
researched. As we covered in Chapter 5, this concept is called the long tail of
search. Finding scalable ways to chase long-tail keywords is a complex topic,
and targeting the long tail is an aspect of SEO that combines art and science.
This is also an area where many publishers get into a lot of trouble, as they
think they need to create new content for each potential search phrase that a
user might type in that’s related to their business. Chances are you won’t want
to implement entire web pages targeting the rare queries listed in Figure 7-30!


Figure 7-30. Example of the long-tail search curve

Fortunately, you don’t have to. You can address much of the long tail of search
by using the right content optimization practices on your site. Perhaps you have
a page for ordering pizza in New York City, with a good title and heading tag on
the page (e.g., “New York City Pizza: Order Here”), a phone number and a form
for ordering the pizza, but no other content. If that’s all you have, that page
is not competing effectively for rankings on long-tail search terms. To fix
this, you need to write additional content for the page. Ideally, this would be
content that talks about the types of pizza that are popular in New York City,
the ingredients used, and other related topics that might draw in long-tail
search traffic. If you also have a page for ordering pizza in San Jose, where
you’ve opened up a second restaurant, the picture gets even more complicated.
You don’t want your content on the San Jose page to be the same as it is on the
New York City page, as this could raise issues with duplicate content (discussed
in “Duplicate Content Issues” on page 256) or keyword cannibalization. To
maximize your success, find a way to generate different content for those two
pages, ideally tuned to the specific needs of the audiences that will arrive at
each of them. Perhaps the pizza preferences of the San Jose crowd are different
from those in New York City. Of course, the geographic information is inherently
different between the two locations, so driving directions from key locations
might be a good thing to include on the pages.


As you can imagine, if you have pizza parlors in 100 cities, this can get very
complex indeed. The key here is to remain true to the diverse needs of your
users, while using your knowledge of the needs of search engines and searcher
behavior to obtain that long-tail traffic.

Content Optimization Content optimization relates to how the presentation and
architecture of the text, image, and multimedia content on a page can be
optimized for search engines. Many of these recommendations lead to second-order
effects. Having the right formatting or display won’t boost your rankings
directly, but it can make you more likely to earn links, get clicks, and
eventually benefit in search rankings.

Content Structure Because SEO has become such a holistic part of website
development and improvement, it is no surprise that content formatting—the
presentation, style, and layout choices you make for your content—is a part of
the process. A browser-safe sans serif font such as Arial or Helvetica is a wise
choice for the web; Verdana in particular has received high praise from
usability/readability experts (for a full discussion of this topic, see WebAIM). Verdana is one of the most popular of the fonts designed for on-screen
viewing. It has a simple, straightforward design, and the characters or glyphs
are not easily confused. For example, the uppercase I and the lowercase L have
unique shapes, unlike in Arial, in which the two glyphs may be easily confused
(see Figure 7-31).

Figure 7-31. Arial and Verdana font comparison

Another advantage of Verdana is the amount of spacing between letters. However,
you should take into account the fact that it’s a relatively large font, so your
text will take up more space than if you used Arial, even at the same point size
(see Figure 7-32).

Figure 7-32. How fonts impact space requirements


The larger size improves readability but also can potentially disrupt carefully
planned page layouts. In addition to font choice, sizing and contrast issues are
important considerations. Type that is smaller than 10 points is typically very
challenging to read, and in all cases, relative font sizes are recommended so
that users can employ browser options to increase/decrease the size if
necessary. Contrast—the color difference between the background and text—is also
critical; legibility usually drops for anything that isn’t black (or very dark)
on a white background.

Content length and word count Content length is another critical piece of the
optimization puzzle that’s often mistakenly placed in the “keyword density” or
“unique content” bucket of SEO. In fact, content length can play a big role in
terms of whether your material is easy to consume and easy to share. People
often ask about the ideal length for a piece of content. The reality is that
this is heavily dependent on the nature of the topic being addressed. Many
pieces of content do well because they are short and easy to consume. On the
other hand, sometimes content that is lengthy and comprehensive in nature will
fare best. Deciding on the optimal length for your content can be a complicated
process. For example, for a complex topic like diabetes, there might be an
enormous amount of information that you want to communicate about the various
aspects of the disease. You would never try to present all this information on
one page (WebMD has hundreds of pages on diabetes!), but how do you get started
on breaking it up and devising a content plan for the topic? You can begin by
developing a map of all the aspects of the disease that you want to cover. For
your top-level page, you probably want to present a thorough overview and
provide an easy path for users to find the rest of your content on the subject.
You can then try to break out other aspects of the topic in logical ways, until
you have a complete plan for the content. The key is to map this out in advance,
rather than working through it organically and seeing how it turns out.

Visual layout Last but not least in content structure optimization is the
display of the material. Beautiful, simple, easy-to-use, consumable layouts
instill trust and garner far more readership and links than poorly designed
content wedged between ad blocks that threaten to take over the page.


CSS and Semantic Markup CSS is commonly mentioned as a best practice for general
web design and development, but its principles provide some indirect SEO
benefits as well. Google used to recommend keeping pages to under 101 KB, and it
was a common belief that there were benefits to implementing pages that were
small in size. Nowadays, however, search engines deny that code size is a factor
at all, unless it is extreme. Still, keeping file sizes low means your pages
have faster load times, lower abandonment rates, and a higher probability of
being fully read and more frequently linked to. This is particularly important
in mobile environments. Your experience may vary, but good CSS makes managing
your files easy, so there’s no reason not to make it part of your standard
operating procedure for web development. Use tableless CSS stored in external
files, keep JavaScript calls external, and separate the content layer from the
presentation layer. You can use CSS code to provide emphasis, to quote/reference
other material, and to reduce the use of tables and other bloated HTML
mechanisms for formatting, which can positively impact your SEO. Be sure to
allow Googlebot access to your CSS files, as it does read these files and use
them to understand your page layout. In 2011, Google, Bing, and Yahoo! sponsored
a standard for markup called Schema.org. This represented a new level of
commitment from the search engines to the concept of marking up content, or,
more broadly, to allowing the publisher to provide information about the content
to the search engines. By “marking up,” we mean tagging your content using XML
tags to categorize it. For example, you may label a block of content as
containing a recipe, and another block of content as containing a review. This
notion of advanced markup was not new, as all of the search engines already
supported semantic markup at a limited level and used this markup to display
rich results (a.k.a. rich snippets), some examples of which are shown in the
next section. NOTE See “Special Features” on page 77 for more on rich results
and other special features.

One of the original ways a publisher had to communicate information about a web
page to search engines was with meta tags. Unfortunately, these were so badly
abused by spammers that Google stopped using them as a ranking signal. The
company confirmed this publicly in a post in 2009, which noted that “Google has
ignored the keywords meta tag for years and currently we see no need to change
that policy.”


Google used to publicly state that it did not use markup as a ranking factor
either, and while those statements are no longer being publicly made, there
continues to be no evidence that it has been made a ranking factor. However,
there are important SEO benefits to using markup.

Markup in search results As mentioned, markup is sometimes used by search
engines to create rich snippets. Figure 7-33 shows some examples in a Google
SERP for the query loc lac recipe.

Figure 7-33. Examples of rich snippets


Based on the markup that Google found in the HTML for these pages, it has
enhanced the results by showing the recipe reviews (the number of stars), the
required cooking time, and the ingredients of the meal.

Supported types of markup There are a few different standards for markup. The
most common ones are microdata, microformats, and RDFa. Schema.org is based on
the microdata standard. However, prior to the announcement of Schema.org, the
search engines implemented rich snippets based on some (but not all) aspects of
microformats, and they will likely continue support for these. Here is an
example of the microformats code for a recipe-rich snippet:
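A minimal sketch of hRecipe-style microformats markup, with invented recipe details for illustration:

<div class="hrecipe">
  <h2 class="fn">Loc Lac (Cambodian Pepper Beef)</h2>
  <span class="duration">30 minutes</span>
  <ul>
    <li class="ingredient">1 lb beef sirloin, cubed</li>
    <li class="ingredient">2 tbsp soy sauce</li>
  </ul>
  <div class="instructions">Marinate the beef, sear over high heat, and serve with a lime-and-pepper dipping sauce.</div>
</div>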





Google used to support the data-vocabulary.org format but dropped that support
in February 2021. In addition, Google now recommends that JSON-LD be used for
structured data. Note that this doesn’t mean that search engines will drop
support for the other formats; indeed, in a 2020 test SearchPilot found that
there was no measurable difference in organic search traffic performance between
microdata and JSON-LD. That said, it is likely that any new forms of rich
snippets implemented by the search engines will be based on Schema.org (JSON-LD
or microdata), not microformats or RDFa. The Schema.org types supported by
Google at the time of writing include:

• Article
• Book
• Breadcrumb
• Carousel
• Course
• Dataset
• Education Q&A
• Employer aggregate rating
• Estimated salary
• Event
• Fact check
• FAQ
• Home activities
• How-to
• Image metadata
• Job posting
• Learning video
• Local business
• Logo
• Math solvers
• Movie
• Practice problems
• Product
• Q&A
• Recipe
• Review snippet
• Sitelinks search box
• Software app
• Speakable
• Subscription and paywalled content
• Video

We’ll talk more about semantic markup and Schema.org, including walking through an example of how to use it, in “Schema.org” on page 329.

Impact of rich snippets The key reason that the search engines are pursuing rich
snippets is that they have done extensive testing that has proven that rich
snippets can increase click-through rates. Searchers like seeing more
information about the page in the search results. Thus, you can expect that the
search engines will continue to implement support for more of these types of
search result enhancements based on markup. Meanwhile, from an SEO perspective,
increasing click-through rate is clearly highly desirable, as it brings us more
relevant traffic.

Content Uniqueness and Depth The search engines place a great deal of value on
robust, unique, value-added content—Google in particular works hard to kick
sites with low-quality content out of its indexes, and the other engines have
followed suit. The first critical designation to avoid is thin content—a phrase
that (loosely) refers to a page the engines do not feel contributes enough
unique content to warrant the page’s inclusion in the search results. How much
content is enough content to not be considered thin? The criteria have never
been officially listed, but here are some examples gathered from engineers and
search engine representatives: • At least 30 to 50 unique words, forming unique,
parsable sentences that other sites/pages do not have (for many pages much more
is appropriate, so consider this a minimum).


• Unique text content, different from other pages on the site in more than just
the replacement of key verbs and nouns (yes, this means all those webmasters who
build the same page and just change the city and state names, thinking this
makes the content “unique,” are mistaken). • Unique titles and meta description
tags. If you can’t think of a unique title for a page, that may be a clue that
you shouldn’t create it. If you can’t write unique meta descriptions, just omit
them; Google will create its own description to show in the SERPs. • Unique
video/audio/image content. Google sees this as unique content too, though you
might want to have more than a single image with no other content on a web page.
TIP You can often bypass these limitations if you have a good quantity of
high-value external links pointing to the page in question (though this is very
rarely scalable) or an extremely powerful, authoritative site.

The next criterion the search engines apply is that websites “add value” to the
content they publish, particularly if it comes from (wholly or partially) a
secondary source. In general, content that is created primarily for the purposes
of ranking in search engines is problematic too. This type of content is the
primary target of the helpful content algorithm update that Google rolled out in
August and September 2022.

A word of caution about repurposed content The search engines will view any
sites that attempt to rapidly generate content by repurposing third-party
content, including ecommerce sites that use product descriptions generated by
third parties, as having little or no value. Here are a few recommendations with
regard to repurposed content: • Don’t simply republish something that’s found
elsewhere on the web unless your site adds substantive value for users, and
don’t infringe on others’ copyrights or trademarks. • Small changes, such as a
few comments, a clever sorting algorithm or automated tags, filtering, a line or
two of text, simple mash-ups, or advertising do not constitute “substantive
value.” • If you’re hosting affiliate content, expect to be judged more harshly
than others, as affiliates in the SERPs are one of users’ top complaints about
search engines. This warning does not apply to affiliate sites that create and
publish their own unique, high-quality content.


• If you syndicate content from third parties, then you should consider making
use of cross-domain canonical tags to point to the original source of that
content. For some exemplary cases where websites fulfill these guidelines, check
out the way sites such as CNET and Metacritic take content/products/reviews from
elsewhere, both aggregating and adding value for their users. Last but not
least, Google recommends that content providers refrain from trying to place
“search results in the search results.” The Google Webmaster Guidelines contain
the following recommendation: Use the robots.txt file on your web server to
manage your crawling budget by preventing crawling of infinite spaces such as
search result pages. Sites can benefit from having their search results
transformed into more valuable listings and category/subcategory landing pages.
Sites that have done this have had great success in recovering rankings and
gaining traffic from Google. In essence, you want to avoid your site’s pages
being perceived, both by an engine’s algorithm and by human engineers and
quality raters, as search results. In general, you should not use: • Pages
labeled in the title or headline as “search results” or “results” • Pages that
appear to offer a query-based list of links to “relevant” pages on the site
without other content (add some text, image(s), and/or formatting that make the
“results” look like detailed descriptions/links instead) • Pages whose URLs
appear to carry search queries (e.g., ?q=miami+restaurants or ?search=Miami+restaurants versus /miami-restaurants) • Pages with text such as
“Results 1 through 10” These subtle, largely cosmetic changes can mean the
difference between inclusion and removal. Err on the side of caution and avoid
giving your pages the appearance of being search results.

Ensure that content is helpful to users Google wants to show pages in the SERPs
that help meet the needs of its users. It doesn’t want to rank content that is
targeted at ranking in search engines rather than at helping users, and its
Helpful Content algorithm is targeted specifically at filtering out this type of
content from the search results. In fact, sites that engage in publishing
significant volumes of content solely to rank in search results can see drops in
their search traffic sitewide, not just to the pages that are seen as search
engine–focused.


Content Themes A less discussed but also important issue is the fit of each
piece of content to your site. If you create an article about pizza, but the
rest of your site is about horseshoes, your article is unlikely to rank for the
term pizza. Search engines analyze and understand what sites, or sections of
sites, focus on—that is, their overall themes. If you start creating content
that is not on the same theme as the rest of the site or section, that content
will have a very difficult time ranking. Further, your off-topic content could
potentially weaken the theme of the rest of the site. One site can support
multiple themes, but each themed section needs to justify its own existence by
following good SEO practices, including getting third parties to implement links
from the pages of their sites to that section. Make sure you keep your content
on topic, as this will help the SEO for all of the pages of your site.

Duplicate Content Issues Duplicate content generally falls into three
categories: exact (or true) duplicates, where two URLs output identical content;
near duplicates, where there are small content differentiators (sentence order,
image variables, etc.); and cross-domain duplicates, where exact or near
duplication exists on multiple domains. All of these should be avoided, for the
reasons described in the following subsections. Publishers and inexperienced SEO
practitioners sometimes confuse two related concepts with duplicate content that
Google treats differently. These are:

Thin content The issue here isn’t so much duplicate content as pages without
much content (discussed in “Content Uniqueness and Depth” on page 253). An
example might be a set of pages built out to list all the locations for a
business with 5,000 locations, where the only content on each page is an
address.

Thin slicing These are pages with very minor differences in focus. Consider a
site that sells running shoes, with a different page for every size of each
product sold. While technically each page would actually be showing a different
product, there wouldn’t be much useful difference between the pages for each
model of shoe. Google has been clear that it doesn’t like thin content or thin
slicing. Identification of either of these issues can result in a manual
penalty, or simply lower rankings for your site. Exactly how Bing differentiates
duplicate content, thin content, and thin slicing is less clear, but it also
prefers that publishers avoid creating these types of pages. Duplicate content
can result from many causes, including licensing of content to or from your
site, site architecture flaws due to non-SEO-friendly content management
systems, or plagiarism. In addition, spammers in desperate need of content may
scrape it from legitimate sources, scrambling the words (through many complex
processes) and repurposing the text to appear on their own pages in the hopes of
attracting long-tail searches and serving contextual ads. Thus, today we’re
faced with a world of duplicate content issues and their corresponding
penalties. Here are some definitions that are useful for this discussion:

Unique content This is content written by humans that is completely different
from any other combination of letters, symbols, or words on the web and has
clearly not been manipulated through computer text-processing algorithms (such
as spam tools employing Markov chains). Note that an increasing number of
solutions exist for AI-based generation of draft content or article outlines,
including generative AI solutions such as ChatGPT, Bing Chat, and Google’s
Search Generative Experience, as well as Shortly and Jasper. The content from
these solutions may be considered unique, but may be of poor quality or contain
inaccuracies. Make sure to have a subject matter expert review it before you
publish it.

Snippets These are small chunks of content, such as quotes, that are copied and
reused. They are almost never problematic for search engines, especially when
included in a larger document with plenty of unique content, and especially if
these are sourced back to your site as the original publisher of the content.

Shingles Search engines look at relatively small phrase segments (e.g., five to
six words) for the presence of the same segments on other pages on the web. When
there are too many such shingles in common between two documents, the search
engines may interpret them as duplicate content.

Duplicate content issues This phrase is typically used to signal duplicate
content that is not in danger of getting a website penalized, but rather is
simply a copy of an existing page that forces the search engines to choose which
version to display in the index.

Duplicate content filtering This is when the search engine removes substantially
similar content from a search result to provide a better overall user
experience.

Duplicate content penalty Penalties are applied rarely and only in egregious
situations. Search engines may devalue or ban other pages on the site, too, or
even the entire website.


Consequences of Duplicate Content Assuming your duplicate content is a result of
innocuous oversights, the search engines will most likely simply filter out all
but one of the pages that are duplicates, with the aim of displaying just one
version of a particular piece of content in a given SERP. In some cases they may
filter out results prior to including them in the index, and in other cases they
may allow a page into the index and filter it out when assembling the SERPs in
response to a specific query. In the latter case, a page may be filtered out in
response to some queries and not others. Search engines try to filter out
duplicate copies of content because searchers want diversity in the results, not
the same results repeated again and again. This has several consequences: • A
search engine bot comes to a site with a crawl budget, which is the number of
pages it plans to crawl in each particular session. Each time it crawls a page
that is a duplicate (which is simply going to be filtered out of search
results), you have let the bot waste some of its crawl budget. That means fewer
of your “good” pages will get crawled. This can result in fewer of your pages
being included in the search engine’s index. • Links to duplicate content pages
represent a waste of link authority. Duplicated pages can gain PageRank, or link
authority, but because it does not help them rank, that link authority is
misspent. • No search engine has offered a clear explanation of how its
algorithm picks which version of a page it shows. In other words, if it
discovers three copies of the same content, which two does it filter out? Which
one does it still show? Does it vary based on the search query? The bottom line
is that the search engine might not favor the version you want. Although some
SEO professionals may debate some of the preceding specifics, the general points
will meet with near-universal agreement. However, there are a handful of caveats
to take into account. For one, on your site you may have a variety of product
pages and also offer printable versions of those pages. The search engine might
pick just the printer-friendly page as the one to show in its results. This does
happen at times, and it can happen even if the printer-friendly page has lower
link authority and will rank less well than the main product page. The best
potential fix for scenarios where you can’t eliminate the duplicate pages is to
apply the rel="canonical" link element to all versions of the page to indicate
which version is the original.


A second version of this problem can occur when you syndicate content to third
parties. The search engine may filter your copy of the article out of the
results in favor of the version in use by the person republishing your article.
There are three potential solutions to this: • Get the person publishing your
syndicated content to include a rel="canonical" link element pointing back to
the original page on your site. This will help indicate to the search engines
that your copy of the page is the original, and any links pointing to the
syndicated page will be credited to your page instead. • Have the syndicating
partner noindex their copy of the content. This will keep the duplicate copy out
of the search engine’s index. In addition, any links in that content back to
your site will still pass link authority to you. • Have the partner implement a
link back to the original source page on your site. Search engines usually
interpret this correctly and emphasize your version of the content when they do
this. Note, however, that there have been instances where Google attributes the
originality of the content to the site republishing it, particularly if that
site has vastly more authority and trust than the true original source of the
content. If you can’t get the syndicating site to implement any of these three
solutions, you should strongly consider not allowing them to publish the
content.
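
For the first of these options, the cross-domain canonical reference is a single
link element placed in the head of the syndicated copy on the partner's site; a
minimal sketch (the URLs are illustrative) looks like this:

<!-- In the <head> of the partner's copy of the article, pointing to the original -->
<link rel="canonical" href="https://www.yourdomain.com/original-article" />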

How Search Engines Identify Duplicate Content Some examples will illustrate the
process Google follows when it finds duplicate content on the web. In the
examples shown in Figures 7-34 through 7-37, three assumptions have been made: •
The page with text is assumed to be a page that contains duplicate content (not
just a snippet, despite the illustration). • Each page of duplicate content is
presumed to be on a separate domain. • The steps that follow have been
simplified to make the process as easy and clear as possible. This is almost
certainly not the exact way in which Google performs, but it conveys the effect.


Figure 7-34. Google finding duplicate content

Figure 7-35. Google comparing the duplicate content to the other copies


Figure 7-36. Duplicate copies getting filtered out

Figure 7-37. Google choosing one page as the original


A few factors about duplicate content bear mentioning; they can trip up
webmasters who are new to the duplicate content issue: 1. Location of the
duplicate content. Is it duplicate content if the copies are all on my site?
Yes. Duplicate content can occur within a site or across different sites. 2.
Percentage of duplicate content. What percentage of a page has to be duplicated
before I run into duplicate content filtering? Unfortunately, the search engines
will never reveal this information because it would compromise their ability to
prevent the problem. It’s also a near certainty that the percentage used by each
engine fluctuates regularly, and that more than one simple direct comparison
goes into duplicate content detection. The bottom line is that pages do not need
to be identical to be considered duplicates. Further, content can be duplicated
with respect to one specific search query even though most of the page is not
duplicated. 3. Ratio of code to text. What if my code is huge and there are very
few unique HTML elements across multiple pages? Will Google think the pages are
all duplicates of one another? No. The search engines do not care about your
code; they are interested in the content on your page. Code size becomes a
problem only when it becomes extreme. 4. Ratio of navigation elements to unique
content. Every page on my site has a huge navigation bar, lots of header and
footer items, but only a little bit of content; will Google think these pages
are duplicates? No. Google and Bing factor out the common page elements, such as
navigation, before evaluating whether a page is a duplicate. They are very
familiar with the layout of websites and recognize that permanent structures on
all (or many) of a site’s pages are quite normal. Instead, they’ll pay attention
to the “unique” portions of each page and often will largely ignore the rest.
Note, however, that these pages risk being considered thin content by the
engines. 5. Licensed content. What should I do if I want to avoid duplicate
content problems, but I have licensed content from other web sources to show my
visitors? Use <meta name="robots" content="noindex"> (commonly referred to as a meta robots noindex tag). Place this in
your page’s header, and the search engines will know that the content isn’t for
them. This is a general best practice, because then humans can still visit and
link to the page, and the links on the page will still carry value.


Another alternative is to make sure you have exclusive ownership and publication
rights for that content.

Copyright Infringement One of the best ways to monitor whether your site’s copy
is being duplicated elsewhere is to use Copyscape, a site that enables you to
instantly view pages on the web that are using your content. Don’t worry if the
pages of these sites rank far behind your own pages for any relevant queries—if
any large, authoritative, content-rich domain tried to fight all the copies of
its work on the web, it would have at least two full-time jobs on its hands.
Luckily, the search engines have placed trust in these types of sites to issue
high-quality, relevant content, and therefore recognize them as the original
issuers. If, on the other hand, you have a relatively new site or a site with
few inbound links, and the scrapers are consistently ranking ahead of you (or
someone with a powerful site is stealing your work), you’ve got some recourse.
One option is to ask the publisher to remove the offending content. In some
cases, the publisher is simply unaware that copying your content is not allowed.
Another option is to contact the site’s hosting company. Hosting companies could
be liable for hosting duplicate content, so they are often quick to react to
such inquiries. Just be sure to provide as much documentation as possible to
show that the content was originally yours. You could also file a DMCA
infringement request with Google and Bing (you should also file this request
with the infringing site’s hosting company). Finally, you could file a legal
suit (or threaten such) against the website in question. You may want to begin
with a less formal communication asking the publisher to remove the content
before you send a letter from the attorneys, as the DMCA motions can take up to
several months to go into effect; but if the publisher is nonresponsive, there
is no reason to delay taking stronger action, either. If the site republishing
your work has an owner in your country, this latter course of action is probably
the most effective first step. DMCA.com is a very effective and inexpensive
option for this process.

Example Actual Penalty Situations The preceding examples show cases where
duplicate content filtering will come into effect, but there are scenarios where
an actual penalty can occur. For example, sites that aggregate content from
across the web can be at risk of being penalized, particularly if they add
little unique content themselves. If you find yourself in this situation, the
only fix is to reduce the number of duplicate pages accessible to the search
engine crawler. You can accomplish this by deleting them, using rel="canonical"
on the duplicates, noindexing the pages themselves, or adding a substantial
amount of unique content. One example of duplicate content that may get filtered
out on a broad basis is a thin affiliate site. This nomenclature frequently
describes a site that promotes the sale of someone else’s products (to earn a
commission), yet provides little or no information differentiated from other
sites selling the products. Such a site may have received the descriptions from
the manufacturer of the products and simply replicated those descriptions along
with an affiliate link (so that it can earn credit when a click or purchase is
performed). The problem arises when a merchant has a large number of affiliates
using the same descriptive content, and search engines have observed user data
suggesting that, from a searcher’s perspective, these sites add little value to
their indexes. In these cases the search engines may attempt to filter out these
sites, or even ban them from their indexes. Plenty of sites operate affiliate
models but also provide rich new content, and these sites generally have no
problem; it is when duplication of content and a lack of unique, value-adding
material come together on a domain that the engines may take action.

How to Avoid Duplicate Content on Your Own Site As we’ve outlined, duplicate
content can be created in many ways. When material is duplicated on your own
site, you’ll need to use specific approaches to achieve the best possible
results from an SEO perspective. In many cases, the duplicate pages are pages
that have no value to either users or search engines. If that is the case, try
to eliminate the problem altogether by fixing the implementation so that all
pages are referred to by only one URL. Be sure to 301-redirect the old URLs
(this is discussed in more detail in “Redirects” on page 288) to the surviving
URLs, to help the search engines discover what you have done as rapidly as
possible and preserve any link authority the removed pages may have had. If that
process proves to be impossible, you have several other options, as we will
outline in “Content Delivery and Search Spider Control” on page 270. Here is a
summary of the guidelines on the simplest solutions for dealing with a variety
of scenarios: • Use robots.txt to block search engine spiders from crawling the
duplicate versions of pages on your site. • Use the rel="canonical" link
element. This is the next best solution to eliminating duplicate pages. • Use
meta robots noindex tags to tell the search engine not to index the duplicate
pages.


Be aware, however, that if you use robots.txt to prevent a page from being
crawled, then using noindex or nofollow on the page itself does not make
sense—the spider can’t read the page, so it will never see the noindex or
nofollow directive. With these tools in mind, here are some specific duplicate
content scenarios that you may encounter:

HTTP and HTTPS pages Hopefully you have converted your site to make use of the
Secure Sockets Layer (SSL) protocol for encrypted communications between the
browser and the web server. The great majority of sites use SSL, and most of
these redirect all of their http:// pages to https:// pages. This is a great
first step, but be aware that when a user types in just your domain name (e.g.,
yourdomain.com), the browser will initially attempt to load the http:// version
before seeing the 301 redirect. This is a moment of vulnerability that hackers
can try to exploit. For that reason, your site should also implement HTTP Strict
Transport Security (HSTS), a protocol that instructs the browser to never even
attempt loading the http:// version of the page and thereby closes this window.
In addition, some sites fail to convert all of their pages or don’t fully
implement the proper redirects, which leaves them vulnerable. You’ll also need
to watch for links on your https:// pages that link back to other pages on the
site using http:// in the URL (so, for example, the link to your home page is
http://www.yourdomain.com instead of https://www.yourdomain.com). If you have
these kinds of issues on your site, you may want to use 301 redirects or the
rel="canonical" link element, which we describe in “Using the rel=“canonical”
Attribute” on page 283, to resolve the problems. An alternative solution may be
to change any relative links to absolute links
(https://www.yourdomain.com/content instead of /content), which also makes life
more difficult for content thieves that scrape your site (all the links on their
pages will point back to your site instead of linking to their copy of your
content).
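
As a minimal sketch, the markup for a page served at
https://www.yourdomain.com/content (the URLs are illustrative) would combine a
canonical reference to the https:// URL with absolute internal links:

<head>
  <!-- Declare the https:// URL as the canonical version of this page -->
  <link rel="canonical" href="https://www.yourdomain.com/content" />
</head>
<body>
  <!-- Internal links use absolute https:// URLs rather than relative paths -->
  <a href="https://www.yourdomain.com/">Home</a>
</body>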

A CMS that creates duplicate content Sometimes sites have many versions of
identical pages because of limitations in the CMS, where it addresses the same
content with more than one URL. These are often unnecessary duplications with no
end user value, and the best practice is to figure out how to eliminate the
duplicate pages and 301-redirect the eliminated pages to the surviving pages.
Failing that, fall back on the other options listed at the beginning of this
section.

Printable pages or multiple sort orders Many sites offer printable pages to
provide the user with the same content in a more printer-friendly format, and
some ecommerce sites offer their products in multiple sort orders (such as by
size, color, brand, and price). These pages do have end user value, but they do
not have value to the search engine and will appear
to be duplicate content. To deal with these issues, use one of the options
listed previously in this subsection, or (for offering printable pages) set up a
printer-friendly CSS stylesheet.

Duplicate content in blogs and multiple archiving systems (e.g., pagination)
Blogs present some interesting duplicate content challenges. Blog posts can
appear on many different pages, such as the home page of the blog, the permalink
page for the post, date archive pages, and category pages. Search engines will
view this as duplicate content. Few publishers attempt to address the presence
of a post on the home page of the blog and also at its permalink, and this
scenario is common enough that the search engines likely deal reasonably well
with it. However, it may make sense to show only excerpts of the post on the
category and/or date archive pages.

User-generated duplicate content (e.g., repostings) Many sites implement
structures for obtaining user-generated content, such as a blog, forum, or job
board. This can be a great way to develop large quantities of content at a very
low cost. The challenge is that users may choose to submit the same content on
your site and several other sites at the same time, resulting in duplicate
content among those sites. It’s hard to control this, but there are three things
you can do to mitigate the problem: • Have clear policies that notify users that
the content they submit to your site must be unique and cannot be, or have been,
posted to other sites. This is difficult to enforce, of course, but it will
still help some to communicate your expectations. • Implement your forum in a
different and unique way that demands different content. Instead of having only
the standard fields for entering data, include fields that are likely to be
unique compared to what other sites offer, but that will still be interesting
and valuable for site visitors to see. • Conduct human reviews of all
user-submitted content and include as part of the process a check for duplicate
content.

Controlling Content with Cookies and Session IDs Sometimes you want to more
carefully dictate what a search engine robot sees when it visits your site. In
general, search engine representatives refer to the practice of showing
different content to users than to crawlers as cloaking, which violates the
engines’ Terms of Service (TOS) and is considered spam.


However, there are legitimate uses for this practice that are not deceptive to
the search engines or malicious in intent; for example, the delivery of content
to users for the purpose of providing a personalized experience. This section
will explore methods for controlling content with cookies and session IDs.

What’s a Cookie? A cookie is a small text file that websites can leave on a
visitor’s device, helping them to track that person over time. Cookies are the
reason Amazon remembers your username between visits and the reason you don’t
necessarily need to log in to your Gmail account every time you open your
browser. Cookie data typically contains a short set of information regarding
when you last accessed a site, an ID number, and potentially, information about
your visit (see Figure 7-38).

Figure 7-38. Using cookies to store data

Website developers use cookies for tracking purposes or to display different
information to users based on their actions or preferences. Common uses include
remembering a username, maintaining a shopping cart, and keeping track of
previously viewed content. For example, if you’ve signed up for an account with
Moz, it will provide you with options on your My Account page about how you want
to view the blog and will remember those settings the next time you visit.

What Are Session IDs? Session IDs are virtually identical to cookies in
functionality, with one big difference: as illustrated in Figure 7-39, when you
close your browser (or restart your device), session ID information is (usually)
no longer stored on the device. The website you were interacting with may
remember your data or actions, but it cannot retrieve
session IDs from your device that don’t persist (and session IDs by default
expire when the browser shuts down). In essence, session IDs are more like
temporary cookies (although, as you’ll see shortly, there are options to control
this).

Figure 7-39. How session IDs are used

NOTE Any user has the ability to turn off
cookies in their browser settings. This often makes web browsing considerably
more difficult, though, and many sites will actually display a page saying that
cookies are required to view or interact with their content. Cookies, persistent
though they may be, are also deleted by many users on a semi-regular basis. For
example, a study by comScore found that 33% of web users deleted their
first-party cookies at least once per month.

Although technically speaking session IDs are just a form of cookie without an
expiration date, it is possible to set them with expiration dates far in the
future (even decades out), making them virtually identical to cookies. Session
IDs do come with an important caveat, though: they
are frequently passed in the URL string, which can create serious problems for
search engines (as every request produces a unique URL with duplicate content).
It is highly desirable to eliminate session IDs from your URLs, and you should
avoid them if it is at all possible. If you currently use them on your site, a
short-term fix is to use the rel="canonical" link element (which we’ll discuss
in “Content Delivery and Search Spider Control” on page 270) to tell the search
engines that you want them to ignore the session IDs.
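
As a sketch, a page reached at a URL such as
https://www.yourdomain.com/product?sessionid=abc123 (an invented example) would
carry a canonical reference to the session-free URL:

<!-- Points the engines at the URL without the session ID -->
<link rel="canonical" href="https://www.yourdomain.com/product" />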


How Do Search Engines Interpret Cookies and Session IDs? Search engine spiders
do not look at cookies or session IDs, and act as browsers with this
functionality shut off. However, unlike visitors whose browsers won’t accept
cookies, the crawlers can sometimes reach sequestered content by virtue of
webmasters who want to specifically let them through. Many sites have pages that
require cookies or sessions to be enabled but have special rules for search
engine bots, permitting them to access the content as well. Although this is
technically cloaking, there is a form of this known as flexible sampling that
search engines generally allow (we will discuss this in more detail in “Content
Delivery and Search Spider Control” on page 270). Despite the occasional access
that engines are granted to cookie/session-restricted pages, the vast majority
of cookie and session ID usage creates content, links, and pages that limit
access. Web developers can leverage the power of options such as flexible
sampling to build more intelligent sites and pages that function in optimal ways
for both humans and search engines.

Why Would You Want to Use Cookies or Session IDs to Control Search Engine
Access? There are numerous potential tactics to leverage cookies and session IDs
for search engine control. Here are some of the major strategies you can
implement, but there are certainly limitless other possibilities:

Show multiple navigation paths while controlling the flow of link authority.
Visitors to a website often have multiple ways in which they’d like to view or
access content. Your site may benefit from offering many paths to reaching
content (by date, topic, tag, relationship, ratings, etc.), but these varied
sort orders may be seen as duplicate content. By implementing them, you expend
PageRank or link authority that would be better optimized by focusing on a
single search engine–friendly navigational structure. You can prevent the search
engines from indexing multiple pages with the same content by requiring a cookie
for users to access the alternative sort order versions of a page. One other
(but not foolproof) solution is to use the rel="canonical" link element to tell
the search engine that these alternative sort orders are really just the same
content as the original page (we will discuss canonical in the next section).

Keep limited pieces of a page’s content out of the engines’ indexes. Many pages
may contain some pieces of content that you’d like to show to search engines and
other pieces you’d prefer to appear only for human visitors. These could include
ads, login-restricted information, links, or even rich media. Once again,
showing noncookied users the plain version and cookie-accepting visitors the
extended information can be invaluable. Note that this option is often used
in conjunction with a login, so only registered users can access the full
content (such as on sites like Facebook and LinkedIn).

Grant access to pages requiring a login. As with snippets of content, there are
often entire pages or sections of a site to which you’d like to restrict search
engine access. This can be easy to accomplish with cookies/sessions, and it can
even help to bring in search traffic that may convert to “registered user”
status. For example, if you had desirable content that you wished to restrict,
you could create a page with a short snippet and an offer for the visitor to
continue reading upon registration, which would then allow them access to that
content at the same URL. We will discuss this more in the following section.

Avoid duplicate content issues. One of the most promising areas for
cookie/session use is to prohibit spiders from reaching multiple versions of the
same content, while allowing visitors to get the version they prefer. As an
example, at Moz, logged-in users can see full blog entries on the blog home
page, but search engines and nonregistered users will see only excerpts. This
prevents the content from being listed on multiple pages (the blog home page and
the specific post pages) and provides a richer user experience for members.

Content Delivery and Search Spider Control On occasion, it can be valuable to
show search engines one version of content and show humans a different version.
As we’ve discussed, this is technically called “cloaking,” and the search
engines’ guidelines have near-universal policies restricting it. In practice,
many websites, large and small, appear to use these techniques effectively and
without being penalized by the search engines. However, use great care if you
implement them, and know the risks that you are taking.

Cloaking and Segmenting Content Delivery Before we discuss the risks and
potential benefits of cloaking-based practices, take a look at Figure 7-40,
which illustrates how cloaking works. Google makes its policy pretty clear in
its guidelines on cloaking: Cloaking is considered a violation of Google’s
Webmaster Guidelines because it provides our users with different results than
they expected. It is true that if you cloak in the wrong ways, with the wrong
intent, Google and the other search engines may remove you from their
indexes—and if you do it egregiously, they certainly will.


Figure 7-40. How cloaking works

A big factor is intent: if the engines feel you are attempting to manipulate
their rankings or results through cloaking, they may take adverse action against
your site. If the intent of your content delivery doesn’t interfere with their
goals, you’re less likely to be subject to a penalty, but there is never zero
risk of a penalty. Google has taken a strong stand against all forms of
cloaking, regardless of intent. The message should be clear: cloaking won’t
always get you banned, and you can do some pretty smart things with it, but your
intent is key. If you are doing it for reasons that are not deceptive and that
provide a positive experience for users and search engines, you might not run
into problems. However, there is no guarantee of this, so use these types of
techniques with great care, and know that you may still get penalized for it.

Reasons for Showing Different Content to Search Engines and Visitors A few
common reasons for displaying content differently to different visitors,
including search engines, include:

Multivariate and A/B split testing Testing landing pages for conversions
requires that you show different content to different visitors to test
performance. In these cases, it is best to display the content using
JavaScript/cookies/sessions and give the search engines a single, canonical
version of the page that doesn’t change with every new spidering (though this
won’t necessarily hurt you). Google previously offered software called Google
Website Optimizer to perform this function, but it has been discontinued (it was
subsequently replaced with Google Analytics Content Experiments, which has also
been deprecated). If you have used either of these tools in the past, Google
recommends removing the associated tags from your site’s pages.

Content requiring registration and flexible sampling If you force users to
register (paid or free) in order to view specific pieces of content, it is best
to keep the URL the same for both logged-in and nonlogged-in users and to show a
snippet (one to two paragraphs is usually enough) to
nonlogged-in users and search engines. If you want to display the full content
to search engines, you have the option to provide some rules for content
delivery, such as showing the first one to two pages of content to a new visitor
without requiring registration, and then requesting registration after that
grace period. This keeps your intent more honest, and you can use cookies or
sessions to restrict human visitors while showing the full pieces to the
engines. In this scenario, you might also opt to participate in Google’s
flexible sampling program, wherein websites can expose “premium” or
login-restricted content to Google’s spiders, as long as users who visit the
site from the SERPs are given the ability to view the contents of the article
they clicked on (and potentially more pages) for free. Many prominent web
publishers employ this tactic, including the popular site Experts Exchange. To
be specific, to implement flexible sampling, publishers must grant Googlebot
(and presumably the other search engine spiders) access to all the content they
want indexed, even if users normally have to log in to see the content. The user
who visits the site will still need to log in, but the search engine spider will
not have to do so. This will lead to the content showing up in the search engine
results when applicable. However, if a user clicks on that search result, you
must permit them to view the entire article (including all pages of a given
article if it is a multiple-page article). If the user clicks a link to look at
another article on your site, you can still require them to log in. Publishers
can also limit the number of free accesses a user gets using this technique to a
certain number of articles per month, with that number at the publisher’s
discretion.
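
Relatedly, Google's documentation on paywalled content describes adding
structured data to gated pages so that flexible sampling is not mistaken for
cloaking. A minimal JSON-LD sketch, assuming the gated portion of the page is
wrapped in an element with the class paywalled-content (the class name and
headline here are invented), might look like this:

<!-- Illustrative values; cssSelector must match the element wrapping the gated content -->
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Example article headline",
  "isAccessibleForFree": "False",
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": "False",
    "cssSelector": ".paywalled-content"
  }
}
</script>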

Navigation unspiderable by search engines If your site is implemented in
JavaScript that only pulls down the links after a given user action (such as
clicking on a drop-down), or another format where a search engine’s ability to
parse it is uncertain, you should consider showing search engines a version that
has spiderable, crawlable content in HTML. Many sites do this simply with CSS
layers, displaying a human-visible, search-invisible layer and a layer for the
engines (and less capable browsers). You can also employ the noscript tag for
this purpose, although it is generally riskier, as many spammers have applied
noscript as a
way to hide content. Make sure the content shown in the search-visible layer is
substantially the same as in the human-visible layer.

Duplicate content If a significant portion of a page’s content is duplicated,
you might consider restricting spider access to it by placing it in an iframe
that’s restricted by robots.txt. This ensures that you can show the engines the
unique portion of your pages, while protecting against duplicate content
problems. We will discuss this in
more detail in “Additional Methods for Segmenting Content Delivery” on page 285.
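
A minimal sketch of this approach (the path and filename are illustrative): the
duplicated block is loaded from a URL that robots.txt disallows, so the spiders
never fetch it:

<!-- The duplicated or licensed block lives at a crawler-blocked URL -->
<iframe src="https://www.yourdomain.com/blocked/product-description.html"
        title="Product description"></iframe>
<!-- robots.txt would then contain a matching rule, for example:
     User-agent: *
     Disallow: /blocked/ -->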

Different content for different users At times you might target content uniquely
to users from different geographies (such as different product offerings that
are more popular in their area), users with different screen resolutions (to
make the content fit their screen size better), or users who entered your site
from different navigation points. In these instances, it is best to have a
“default” version of content that’s shown to users who don’t exhibit these
traits to show to search engines as well. There are a variety of strategies that
can be used to segment content delivery. The most basic is to serve content that
is not meant for the engines in unspiderable formats (e.g., placing text in
images, having content that does not load in the browser until after a user
action, plug-ins, etc.). You should not use these formats for the purpose of
cloaking; use them only if they bring a substantial end user benefit (such as an
improved user experience). In such cases, you may want to show the search
engines the same content in a spiderable format. When you’re trying to show the
engines something you don’t want visitors to see, you can use CSS formatting
styles (preferably not display:none, as the engines have filters to watch
specifically for this); user agent–, cookie-, or session-based delivery; or IP
delivery (showing content based on the visitor’s IP address). Be very wary when
employing these strategies. As noted previously, the search engines expressly
prohibit cloaking practices in their guidelines, and though there may be some
leeway based on intent and user experience (e.g., if your site is using cloaking
to improve the quality of the user’s experience, not to game the search
engines), the engines take these tactics seriously and may penalize or ban sites
that implement them inappropriately or with the intention of manipulation. In
addition, even if your intent is good, the search engines may not see it that
way and penalize you anyway. The following sections explore common ways of
controlling what search spiders see that are generally considered acceptable.
Then we’ll look at some other techniques that are sometimes used but usually are
better avoided.

Leveraging the robots.txt File The robots.txt file, located at the root level of
your domain (e.g., https://www.yourdomain.com/robots.txt), is a highly versatile
tool for controlling what the spiders are permitted to access on your site. You
can use this file to: • Prevent crawlers from accessing nonpublic parts of your
website. • Block search engines from accessing index scripts, utilities, or
other types of code.


• Avoid indexing duplicate content on a website, such as printable versions of
HTML pages or different sort orders for product catalogs. (Note: Use of
rel="canoni cal" or noindex tags on these pages may be a better approach to
dealing with the duplicate content. See “Duplicate Content Issues” on page 256
for more information on this topic.)
• Autodiscover XML sitemaps.

The robots.txt
file must reside in the root directory, and the filename must be entirely in
lowercase (robots.txt, not Robots.txt or any other variation that includes
uppercase letters). Any other name or location will not be seen as valid by the
search engines. The file must also be entirely in text format (not in HTML
format). Figure 7-41 illustrates what happens when a search engine robot sees a
directive in robots.txt not to crawl a web page.

Figure 7-41. The impact of telling a search engine robot not to crawl a page

Because the page will not be crawled, links on the page cannot pass link
authority to other pages (the search engine does not see the links). However,
the page can be in the search engine’s index. This can happen if other pages on
the web link to it. Of course, the search engine will not have very much
information on the page, as it cannot read it, and will rely mainly on the
anchor text and other signals from the pages linking to it to determine what the
page may be about. Any resulting search listings end up being pretty sparse when
you see them in the search engine’s index, as shown in Figure 7-42, which shows
the results for the Google query site:www.nytimes.com/cnet/. This is not a
normal query that a user would
enter, but you can see what the results look like: only the URL is listed, and
there is no description. This is because the spiders aren’t permitted to read
the page to get that data. With today’s algorithms, these types of pages don’t
rank very highly because their relevance scores tend to be quite low for any
normal queries.

Figure 7-42. Example search results for pages that are blocked from crawling in
robots.txt

Google, Bing, and nearly all of the legitimate crawlers on the web will follow
the instructions you set out in the robots.txt file. Commands in robots.txt are
primarily used to prevent spiders from accessing pages and subfolders on a site,
though they have other uses as well. Note that subdomains require their own
robots.txt files, as do the HTTP and HTTPS versions of a site (each protocol and
hostname combination is treated separately).

Syntax of the robots.txt file The basic syntax of robots.txt is fairly simple.
You specify a robot name, such as “Googlebot,” and then you specify an action.
The robot is identified by the User-agent line, and then the actions are
specified on the lines that follow. The major action you can specify is
Disallow, which lets you indicate any pages you want to block the bots from
accessing (you can use as many Disallow lines as needed). Some other
restrictions apply: • Each User-agent/Disallow group should be separated by a
blank line; however, no blank lines should exist within a group (between the
User-agent line and the last Disallow line).


• The hash symbol (#) may be used for comments within a robots.txt file, where
everything after # on that line will be ignored.
• Directories and filenames are case-sensitive: private, Private, and PRIVATE
are all different to search engines.

Here is an example of a robots.txt file:

User-agent: Googlebot
Disallow:

User-agent: Bingbot
Disallow: /

# Block all robots from tmp and logs directories
User-agent: *
Disallow: /tmp/
Disallow: /logs # for directories and files called logs

The preceding example will do the following: • Allow Googlebot to go anywhere. •
Prevent Bingbot from crawling any part of the site. • Block all robots (other
than Googlebot) from visiting the /tmp/ directory or directories, or files
called /logs (e.g., /logs or logs.php). Notice that in this example the behavior
of Googlebot is not affected by later instructions such as Disallow: /. Because
Googlebot has its own instructions from robots.txt, it will ignore directives
labeled as being for all robots (i.e., those that use an asterisk). These are
the most basic aspects of robots.txt files, but there are some subtleties, such
as understanding how search engines will parse more complex files.

Resolving rule conflicts in robots.txt Understanding how rule conflicts are
resolved is important; it can have a big impact on whether or not your
robots.txt file behaves as intended. The main guidelines are: • The order in
which instructions appear in robots.txt is irrelevant. • Matching is based on
finding the most specific string that matches the crawler’s user agent. For
example, a crawler user agent of googlebot-news would match a robots.txt
declaration for googlebot-news, but not one for googlebot. • Nonmatching parts
of the string are ignored. For example, googlebot would match both googlebot/1.2
and googlebot*.


These are the basics; for further details, consult the Google documentation on
how it interprets the robots.txt specification.

Additional robots.txt functionality robots.txt also offers some more specific
directives. Some of these methods are supported by only some of the engines, as
detailed here:

Crawl delay The Crawl-delay directive is supported by Bing and Ask, but is
ignored by Google. It instructs a crawler to wait the specified number of
seconds between crawling pages. The goal of the directive is to reduce the load
on the publisher's server. For example:

User-agent: Bingbot
Crawl-delay: 5

Pattern matching Pattern matching appears to be usable by both Google and Bing.
The value of pattern matching is considerable. You can do some basic pattern
matching using the asterisk (*) wildcard character. Here is how you can use
pattern matching to block access to all subdirectories that begin with private
(/private1/, /private2/, /private3/, etc.):

User-agent: Googlebot
Disallow: /private*/

You can match the end of the string using the dollar sign ($). For example, to
block access to URLs that end with .asp:

User-agent: Googlebot
Disallow: /*.asp$

You may wish to prevent the robots from accessing any URLs that contain
parameters. To block access to all URLs that include a question mark (?), simply
use the following pattern:

User-agent: *
Disallow: /*?*

The pattern-matching capabilities of robots.txt are more limited than those of
programming languages such as Perl, so the question mark does not have any
special meaning and can be treated like any other character.

Allow The Allow directive appears to be supported only by Google and Ask. It
works in the opposite way to the Disallow directive and provides the ability to
specifically call out directories or pages that may be crawled. When this is
implemented, it can partially override a previous Disallow directive. This may
be beneficial after large sections of the site have been disallowed, or if the
entire site has been disallowed.


Here is an example that allows Googlebot into only the google directory:
User-agent: Googlebot
Disallow: /
Allow: /google/

Noindex This is an unsupported feature that Google used to support but formally
dropped support for in September 2019. This directive used to work in the same
way as the meta robots noindex tag and told the search engines to explicitly
exclude a page from the index. You should not use this feature, and if you’re
still relying on it in your robots.txt file, you should remove it from there.

Sitemap We discussed XML sitemaps at the beginning of this chapter. You can use
robots.txt to provide an autodiscovery mechanism for the spiders to find your
XML sitemap file, by including a Sitemap directive:

Sitemap: sitemap_location

The sitemap_location should be the complete URL to the sitemap, such as
https://www.yourdomain.com/sitemap.xml. You can place this line anywhere in your file.
For full instructions on how to apply robots.txt, see Martijn Koster’s “A
Standard for Robot Exclusion”. You can also test your robots.txt file in Google
Search Console under Crawl→robots.txt Tester. You should use great care when
making changes to robots.txt. A simple typing error can, for example, suddenly
tell the search engines to no longer crawl any part of your site. After updating
your robots.txt file, it’s always a good idea to check it with the robots.txt
Tester tool.

Using the rel=“nofollow” Attribute In 2005, the three major search
engines—Google, Microsoft, and Yahoo! (which still had its own search engine at
that time)—all agreed to support an initiative intended to reduce the
effectiveness of automated spam. Unlike the meta robots nofollow tag, the new
directive could be employed as an attribute within an <a> or <link> tag to indicate that
the linking site “does not editorially vouch for the quality of the linked-to
page.” This enables a content creator to link to a web page without passing on
any of the normal search engine benefits that typically accompany a link (trust,
anchor text, PageRank, etc.). Originally, the intent was to enable blogs,
forums, and other sites where user-generated links were offered to shut down the
value of spammers who built crawlers that automatically created links. However,
this has expanded as Google, in particular, recommends use of nofollow on any
links that are paid for—its preference is that only those links that are truly
editorial and freely provided by publishers (without
being compensated) should count toward bolstering a site’s/page’s rankings. In
September 2019 Google announced two additional link-related attributes,
rel="ugc" and rel="sponsored". As one might expect, the recommendation is that
the ugc attribute be used on outbound links found in user-generated content on
your site, and the sponsored attribute on outbound links found within ads on
your website. Google says that you may still use the nofollow attribute for any
link for which you wish to block link signals (including UGC and ad links), but
it prefers you to use the more specific variants if you can. As an example, if
you want to link to a page (say, The New York Times home page) but it’s an ad
and you want to apply the sponsored attribute to that link, you can implement it
using the following format:

Note that although you can use nofollow, sponsored, or ugc to restrict the
passing of link value between web pages, the search engines may still crawl
through those links to the pages that they link to. To summarize, these
attributes don’t expressly forbid indexing or spidering, so they should not be
used in an attempt to prevent either of those. Figure 7-43 shows how a search
engine robot interprets a nofollow, ugc, or sponsored attribute when it finds
one associated with a link. In the early days of the nofollow attribute it was,
for a number of years, considered to not consume any link authority. As a
result, the concept of PageRank sculpting using nofollow was popular. The belief
was that when you nofollowed a particular link, the link authority that would
have been passed to that link was preserved and the search engines would
reallocate it to the other links found on the page. As a result, many publishers
implemented nofollow links to lower-value pages on their sites (such as the
About Us and Contact Us pages, or alternative sort order pages for product
catalogs). Way back in June 2009, however, Google’s Matt Cutts wrote a post that
made it clear that the link authority associated with a nofollowed link is
discarded rather than reallocated. This same treatment also applies to the ugc
and sponsored attributes. In theory, you can still use nofollow on any links you
want, but using it on internal links does not bring any real benefit and should
in general be avoided (and of course, using ugc or sponsored on internal links
would not make sense either). This is a great illustration of the ever-changing
nature of SEO: something that was once a popular, effective tactic is now viewed
as ineffective.

Figure 7-43. The impact of the nofollow, ugc, and sponsored attributes NOTE Some
more aggressive publishers continue to attempt PageRank sculpting by using even
more aggressive approaches, such as implementing links within iframes that have
been disallowed in robots.txt, so that the search engines don’t see them as
links. Such aggressive tactics are probably not worth the trouble for most
publishers.

Using the Robots Meta Tag The robots meta tag supports various directives that
can be used to control the indexing and serving of content. The most important
of these are noarchive, noindex, and nofollow. In each case, the default
behavior is to allow that action, so it is unnecessary to place the
corresponding archive, index, and follow directives on each page. The noarchive
directive tells the engine not to keep a publicly accessible cached copy of the page,
available via the “cached” link in the “About this result” portion of the search results
(see Figure 7-44). If you don’t include this, Google may store a cached version
of the page showing what it looked like the last time it was crawled, which
users can access through the search results.
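As a point of reference, a page that should be kept out of the index and not cached
could include the following tag in its <head> (the combination of directives shown is
illustrative):
<meta name="robots" content="noindex, noarchive">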

Figure 7-44. Accessing a cached page in the SERPs

The noindex directive tells the engine that the page is not allowed to be stored
in any capacity. A page marked noindex will thus be excluded entirely from the
search engines’ indexes. Figure 7-45 shows what a search engine robot does when
it sees a noindex tag on a web page.

Figure 7-45. The impact of noindex

The page is eligible to be crawled, but it will not appear in search indexes.
The page may still accumulate and pass link authority to other pages, though
Google has indicated that after a period of time it tends to not pass PageRank
from pages marked with noindex. TIP One good application for noindex is to place
this directive on HTML sitemap pages. These are pages designed as navigational
aids for users and search engine spiders to enable them to efficiently find the
content on your site. On some sites these pages are unlikely to rank for
anything of importance in the search engines, yet you still want them to provide
a crawl path to the pages they link to. Putting noindex on these pages keeps
these HTML sitemaps out of the indexes and removes that problem. Bear in mind
that these pages may not pass PageRank, as Google tends to block the passing of
PageRank on pages for which it believes the content is of low value; however,
they may still aid Googlebot in discovering more of the pages on your site.

Finally, applying nofollow in the robots meta tag tells the engine that none of
the links on that page should pass link value. This does not prevent the page
from being crawled or indexed; its only function is to prevent link authority
from spreading out, which has very limited application since the launch of the
rel="nofollow" attribute (discussed earlier), which allows this directive to be
placed on individual links. The default behavior is to crawl the links on the
page and pass link authority through them.

Figure 7-46 outlines the behavior of a search engine robot when it finds a meta
robots nofollow tag on a web page (assuming there are no other links pointing to
the three linked URLs).

Figure 7-46. The impact of nofollow

When you use the meta robots nofollow tag on a page, the search engine will
still crawl the page and place the page in its index. However, all links (both
internal and external) on the page will be disabled from passing link authority
to other pages. Note that there are no sponsored or ugc meta robots tags. These
can only be applied as link attributes: rel="sponsored" or rel="ugc".

Using the rel=“canonical” Attribute In February 2009, Google, Yahoo!, and
Microsoft debuted the rel="canonical" link element (sometimes referred to as the
canonical tag). This was a new construct designed explicitly for the purpose of
identifying and dealing with duplicate content. The implementation is very
simple and looks like this:
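<link rel="canonical" href="https://moz.org/blog" />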

This tag tells the search engines that the page containing the canonical tag
should not be indexed and that all of its link authority should be passed to the
page that the tag points to (in our example this would be https://moz.org/blog).
This is illustrated in Figure 7-47.

Figure 7-47. The impact of the canonical attribute

Including the rel="canonical" attribute in a <link> element is similar in many ways to
a 301 redirect, from an SEO perspective. In essence, you’re telling the engines
to ignore one page in favor of another one (as a 301 redirect does), but without
actually redirecting visitors to the new URL (for many publishers this is less
effort than some of the other solutions for their development staff). There are
some differences, though: • Whereas a 301 redirect points all traffic (bots and
human visitors) to the new URL, canonical is just for search engines, meaning
you can still separately track visitors to the unique URL versions. • A 301
redirect is a much stronger signal that multiple pages have a single, canonical
source. While redirects are considered directives that search engines and
browsers are obligated to honor, canonical is treated as a suggestion. Although
the search engines generally support this attribute and trust the intent of site
owners, there will be times that they ignore it. Content analysis and other
algorithmic metrics are applied to ensure that a site owner hasn’t mistakenly or
manipulatively applied canonical, and you can certainly expect to see mistaken
uses of it, resulting in the engines maintaining those separate URLs in their
indexes (in which case site owners will experience the problems noted in
“Duplicate Content Issues” on page 256). In general, the best solution is to
resolve the duplicate content problems at their core, and eliminate them if you
can. This is because rel="canonical" link elements are not guaranteed to work.
However, it is not always possible to resolve the issues by other means, and in those
cases using canonical provides a very effective backup plan. We will discuss some
applications for this tag later in this chapter. TIP The Google documentation also
indicates that you can include the rel="canonical" attribute directly within the HTTP
response header for your page. The code might look something like the following (the
URL in the Link header is illustrative):
HTTP/1.1 200 OK
Content-Type: application/pdf
Link: <https://www.yourdomain.com/downloads/white-paper.pdf>; rel="canonical"
Content-Length: 785710
...rest of HTTP response headers...

Additional Methods for Segmenting Content Delivery As mentioned previously,
there are several other strategies that can be employed to control what content
the search engine spiders see (again, bear in mind that you should not use these
techniques for the purpose of cloaking, but only if they bring some substantial end
user benefit). These include:

Blocking and cloaking by IP address range You can block particular bots from
individual IP addresses or entire ranges through server-side restrictions. Most of
the major engines crawl from a limited number of IP ranges,
making it possible to identify them and restrict access. This technique is,
ironically, popular with webmasters who mistakenly assume that search engine
spiders are spammers attempting to steal their content, and thus block the IP
ranges the bots use to restrict access and save bandwidth. Use caution when
blocking bots, and make sure you’re not restricting access to a spider that
could bring benefits, either from search traffic or from link attribution.

Blocking and cloaking by user agent At the server level, it is possible to
detect user agents and restrict their access to pages or websites based on their
declared identity. As an example, if a website detects a rogue bot, you might
double-check its identity before allowing access. The search engines all use a
similar protocol to verify their user agents via the web: a reverse DNS lookup
followed by a corresponding forward DNS lookup. An example for Google would look
like this:
> host 66.249.66.1
1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.
> host crawl-66-249-66-1.googlebot.com
crawl-66-249-66-1.googlebot.com has address 66.249.66.1

A reverse DNS lookup by itself may be insufficient, because a spoofer could set
up reverse DNS to point to xyz.googlebot.com or any other address.

Using iframes Sometimes there’s a certain piece of content on a web page (or a
persistent piece of content throughout a site) that you’d prefer search engines
didn’t see. For example, the content may be duplicated elsewhere but useful to
users on your page, so you want them to see it inline but don’t want the engines
to see the duplication. As mentioned earlier, clever use of iframes can come in
handy for this situation, as Figure 7-48 illustrates.

Figure 7-48. Using iframes to prevent indexing of content

The concept is simple: by using iframes, you can embed content from another URL
onto any page of your choosing. By then blocking spider access to the iframe
with robots.txt, you ensure that the search engines won’t “see” this content on
your page. Websites may do this for many reasons, including avoiding duplicate
content problems, reducing the page size for search engines, or lowering the
number of crawlable links on a page (to help control the flow of link
authority).

Hiding text in images As discussed earlier, the major search engines still have
limited capacity to read text in images (and the processing power required makes
for a severe barrier). Hiding content inside images isn’t generally advisable,
as it can be impractical for some devices (mobile devices, in particular) and
inaccessible to others (such as screen readers).

Requiring form submission Search engines will not submit HTML forms in an
attempt to access the information retrieved from a search or submission. Thus,
as Figure 7-49 demonstrates, if you keep content behind a forced form submission and never link to it
externally, they won’t see it.

Figure 7-49. Using forms, which are generally not navigable by crawlers, blocks
access to your content

The problem arises when content behind forms earns links outside your control,
such as when bloggers, journalists, or researchers decide to link to the pages
in your archives without your knowledge. Thus, although form submission may keep
the engines at bay, make sure that anything truly sensitive has additional
protection (e.g., through robots.txt or the robots meta tag).

Using login/password protection Password protection and/or paywalls of any kind
will effectively prevent any search engines from accessing content, as will any
form of human verification requirement, such as CAPTCHAs (the boxes requiring
users to copy letter/number combinations or select images containing a certain
object to gain access to content). The major engines won’t try to guess
passwords or bypass these systems.

Removing URLs from a search engine’s index As a secondary, post-indexing tactic,
URL removal from most of the major search engine indexes is possible through
verification of your site and the use of the engines’ tools. For example, Google
allows you to remove URLs through Search Console. Bing also allows you to remove
URLs from its index, via Bing Webmaster Tools.

Redirects A redirect is used to indicate when content has moved from one
location to another. For example, suppose you have some content at
https://www.yourdomain.com/old, but you decide to restructure your site and move
it to https://www.yourdomain.com/criticalkeyword. Once a redirect is
implemented, users who go to the old versions of your pages (perhaps via a
bookmark they kept for a page) will be sent to the new versions. Without the
redirect, the user would get a 404 “page not found” error. With the redirect,
the web server tells the incoming user agent (whether a browser or a spider) to
instead fetch the requested content from the new URL.

Why and When to Redirect Redirects are important for letting search engines, as
well as users, know when you have moved content. When you move a piece of
content, the search engines will continue to have the old URL in their indexes
and return it in their search results until they discover the page is no longer
there and swap in the new URL. You can help speed up this process by
implementing a redirect. Here are some scenarios in which you may need to
implement redirects: • You have old content that expires, so you remove it. •
You find that you have broken URLs that have inbound links and traffic. • You
change your hosting company. • You change your CMS. • You want to implement a
canonical redirect (e.g., redirecting all pages on https://yourdomain.com to
https://www.yourdomain.com). • You change the URLs where your existing content
can be found, for any reason. Not all of these scenarios require a redirect. For
example, you can change hosting companies without impacting any of the URLs used
to find content on your site, in which case no redirects are required. However,
for any scenario in which your URLs change, you need to implement redirects.

Good and Bad Redirects There are many ways to perform a redirect, but not all
are created equal. The two most common types of redirects that are implemented,
tied specifically to the HTTP status code returned by the web server to the
browser, are:

301 “moved permanently” This status code tells the browser (or search engine
crawler) that the resource has been permanently moved to another location, and
there is no intent to ever bring it back.

302 “moved temporarily” This status code tells the browser (or search engine
crawler) that the resource has been temporarily moved to another location, and
that the move should not be treated as permanent. Both forms of redirect send a
human or crawler to the new location, but the search engines interpret these two
HTTP status codes in somewhat different ways. When a crawler sees a 301 HTTP
status code, it assumes it should pass the historical link authority (and any
other metrics) from the old page to the new one, and that it should remove the
old page from the index and replace it with the new one. When a crawler sees a
302 HTTP status code it also assumes it should pass the historical link
authority from the old page to the new one, but it may keep the old page in the
index, and there may be other types of signals that the 302 redirect does not
pass. Note that redirects can pass no status code, or the wrong one, such as a
404 “page not found” error or a 200 “OK” status code. These are also problematic
and should be avoided. Similarly, meta refreshes (discussed in the following
section) and redirects that return a 303 “see other” or 307 “temporary redirect”
status code should also be avoided, as the search engines’ responses to them are
at best unpredictable. The preservation of historical link authority is critical
in the world of SEO. For example, imagine you had 1,000 links to
https://www.yourolddomain.com and you decided to relocate everything to
https://www.yournewdomain.com. If you used redirects that returned a 303 or 307
status code, you would be starting your link-building efforts from scratch
again. In addition, the old versions of the pages might remain in the search
engines’ indexes and compete for search rankings. You want to definitively
return a 301 HTTP status code for a redirect whenever you permanently move a
page’s location.
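You can verify which status code a redirect returns with a command-line tool such as
curl; for example, a correctly configured permanent redirect for the domain move
described above would produce output along these lines:
> curl -I https://www.yourolddomain.com/
HTTP/1.1 301 Moved Permanently
Location: https://www.yournewdomain.com/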

Methods for URL Redirecting and Rewriting As we just mentioned, there are many
possible ways to implement redirects. On Apache web servers (normally present on
machines running Unix or Linux as the operating system), it is possible to
implement redirects quite simply in a standard file called .htaccess, using the
Redirect and RedirectMatch directives. You can also employ more advanced
directives known as rewrite rules using the Apache module mod_rewrite, which we
will discuss in a moment.
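For example, the move described earlier, from /old to /criticalkeyword, could be
handled in .htaccess with a single directive like this:
Redirect 301 /old https://www.yourdomain.com/criticalkeyword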

On web servers running Microsoft Internet Information Services (IIS), different
methods are provided for implementing redirects. The basic method is through the
IIS console, but people with IIS servers can also make use of a text file with
directives, provided they use an ISAPI plug-in such as ISAPI_Rewrite (this
scripting language offers capabilities similar to Apache’s mod_rewrite module).
Many programmers use other techniques for implementing redirects, such as
directly in programming languages like Perl, PHP, ASP, and JavaScript. In this
case, the programmer must make sure the HTTP status code returned by the web
server is a 301. You can check the returned header with the Firefox plug-in HTTP
Header Live, or with a Chrome extension like Redirect Path.
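In PHP, for instance, a 301 redirect might be issued along these lines (the destination
URL is illustrative):
<?php
header("HTTP/1.1 301 Moved Permanently");
header("Location: https://www.yourdomain.com/criticalkeyword");
exit();
?>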

Using meta refresh As mentioned previously, you can also implement a redirect at
the page level via the meta refresh tag. The tag might look like this (the destination URL here is illustrative):
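<meta http-equiv="refresh" content="5;url=https://www.yourdomain.com/newlocation" />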

The first argument to the content parameter, in this case 5, indicates the
number of seconds the browser should wait before redirecting the user to the
indicated page, and the second is the URL to redirect to. A publisher might use
this technique to display a page letting users know that they’re going to get
redirected to a different page than the one they requested. The problem is that
it’s not clear how search engines will treat a meta refresh, unless you specify
a redirect delay of 0 seconds. In that case the engines will treat it as though
it were a 301 redirect, but you’ll have to give up your helpful page telling
users that you are redirecting them. To be safe, the best practice is simply to
use a 301 redirect if at all possible.

Using mod_rewrite or ISAPI_Rewrite mod_rewrite for Apache and ISAPI_Rewrite for
Microsoft IIS offer very powerful ways to rewrite your URLs. Here are some reasons for
using these tools: • You have
changed the URL structure on your site, so that significant amounts of content
have moved from one location to another. This can happen when you change your
CMS, or change your site organization for any reason. • You want to map your
search engine–unfriendly URLs into friendlier ones. If you are running an Apache
web server, you would implement the redirects by placing directives known as
rewrite rules within your .htaccess file or your Apache configuration file
(e.g., httpd.conf or the site-specific config file in the sites_conf directory).
Similarly, if you are running an IIS server, you’d use an ISAPI plug-in
such as ISAPI_Rewrite and place rules in an httpd.ini config file. The following
discussion focuses on mod_rewrite; if you’re using ISAPI_Rewrite, be aware that
the rules can differ slightly. To implement URL rewriting and redirecting on an
Apache web server, you should start your .htaccess file with:
RewriteEngine on
RewriteBase /

You should omit the second line if you’re adding the rewrites to your server
config file, as RewriteBase is supported only in .htaccess. We’re using
RewriteBase here so that you won’t have to type ^/ at the beginning of all the
rules, just ^ (we will discuss regular expressions in a moment). After this
step, the rewrite rules are implemented. Perhaps you want to have requests for
product page URLs of the format https://www.yourdomain.com/products/123 to
display the content found at https://www.yourdomain.com/get_product.php?id=123,
without the URL changing in the location bar of the user’s browser and without
you having to recode the get_product.php script. Of course, this doesn’t replace
all occurrences of dynamic URLs within the links contained on all the site
pages; that’s a separate issue. You can accomplish this first part with a single
rewrite rule, like so:
RewriteRule ^products/([0-9]+)/?$ /get_product.php?id=$1 [L]

This example tells the web server that all requests that come into the
/products/ directory should be mapped into requests to /get_product.php, while
using the subfolder of /products/ as a parameter for the PHP script. The ^
signifies the start of the URL following the domain, $ signifies the end of the
URL, [0-9] signifies a numerical digit, and the + immediately following that
indicates that you want to match one or more occurrences of a digit. Similarly,
the ? immediately following the / means that you want to match zero or one
occurrence of a slash character. The () puts whatever is wrapped within it into
memory. You can then use $1 to access what’s been stored in memory (i.e.,
whatever’s within the first set of parentheses). Not surprisingly, if you
included a second set of parentheses in the rule, you’d access that with $2, and
so on. The [L] flag saves on server processing by telling the rewrite engine to
stop if it matches the rule. Otherwise, all the remaining rules will be run as
well. As a slightly more complex example: URLs of the format
https://www.yourdomain.com/webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&langId=-1&categoryID=4&productID=123
would be rewritten to https://www.yourdomain.com/4/123.htm:

RewriteRule ^([^/]+)/([^/]+)\.htm$
/webapp/wcs/stores/servlet/ProductDisplay?storeId=10001&catalogId=10001&
langId=-1&categoryID=$1&productID=$2 [QSA,L]

The [^/] signifies any character other than a slash. That’s because, within
square brackets, ^ is interpreted as not. The [QSA] flag is for when you don’t
want the query string dropped (like when you want a tracking parameter
preserved). To write good rewrite rules you will need to become a master of
pattern matching (which is simply another way to describe the use of regular
expressions). Here are some of the most important special characters and how the
rewrite engine interprets them:
*   Zero or more of the immediately preceding characters
+   One or more of the immediately preceding characters
?   Zero or one of the immediately preceding characters
^   The beginning of the string
$   The end of the string
.   Any character (i.e., it acts as a wildcard)
\   “Escapes” the character that follows; for example, \. means the dot is not meant to be a wildcard, but an actual character
^   Inside brackets ([]), means not; for example, [^/] means not slash

It is incredibly easy to make errors in regular expressions. Some of the common
gotchas that lead to unintentional substring matches include: • Using .* when
you should be using .+ (because .* can match on nothing). • Not “escaping” with
a backslash a special character that you don’t want interpreted, such as when
you specify . instead of \. and you really meant the dot character rather than
any character (thus, default.htm would match on defaultthtm, and default\.htm
would match only on default.htm). • Omitting ^ or $ on the assumption that the
start or end is implied (thus, default\.htm would match on mydefault.html,
whereas ^default\.htm$ would match only on default.htm). • Using “greedy”
expressions that will match on all occurrences rather than stopping at the first
occurrence. Here’s an example that illustrates what we mean by “greedy”:
RewriteRule ^(.*)/?index\.html$ /$1/ [L,R=301]

This will redirect requests for https://www.yourdomain.com/blah/index.html to
https://www.yourdomain.com/blah, which is probably not what was intended (as you might
have guessed, the [R=301] flag tells the rewrite engine to do a 301 redirect
instead of a standard rewrite; the [L] flag says to stop processing the rule set
if the rule matches). This happens because the .* will capture the slash
character within it before the /? gets to see it. Thankfully, there’s an easy
fix: simply use [^/] or .*? instead of .* to do your matching. For example, use
^(.*?)/? instead of ^(.*)/?, or [^/]+/[^/] instead of .*/.*. So, to correct the
preceding rule, you could use the following: RewriteRule ^(.*?)/?index\.html$
/$1/ [L,R=301]

You wouldn’t use this:
RewriteRule ^([^/]*)/?index\.html$ /$1/ [L,R=301]

because it will match only on URLs with one directory. URLs containing multiple
subdirectories, such as
https://www.yourdomain.com/store/cheese/swiss/wheel/index.html, would not match.
As you might imagine, testing/debugging is a big part of URL rewriting. When you
are debugging, the RewriteLog and RewriteLogLevel directives are your friends!
Set RewriteLogLevel to 4 or more to start seeing what the rewrite engine is up
to when it interprets your rules. Another handy directive to use in conjunction
with RewriteRule is RewriteCond. You would use RewriteCond if you were trying to
match something in the query string, the domain name, or other elements not
present between the domain name and the question mark in the URL (which is what
RewriteRule looks at). Note that neither RewriteRule nor RewriteCond can access
what is in the anchor part of a URL—that is, whatever follows a #—because that
is used internally by the browser and is not sent to the server as part of the
request. The following RewriteCond example looks for a positive match on the
hostname before it will allow the rewrite rule that follows to be executed:
RewriteCond %{HTTP_HOST} !^www\.yourdomain\.com$ [NC]
RewriteRule ^(.*)$ https://www.yourdomain.com/$1 [L,R=301]

Note the exclamation point at the beginning of the regular expression. The
rewrite engine interprets that as not. For any hostname other than
https://www.yourdomain.com, a 301 redirect is issued to the equivalent canonical
URL on the www subdomain. The [NC] flag makes the rewrite condition
case-insensitive. Where is the [QSA] flag so that the query string is preserved,
you might ask? It is not needed for redirecting; it is implied. If you don’t
want a query string retained on a rewrite rule with a redirect, put a question
mark at the end of the destination URL in the rule, like so:

RewriteCond %{HTTP_HOST} !^www\.yourdomain\.com$ [NC]
RewriteRule ^(.*)$ https://www.yourdomain.com/$1? [L,R=301]

Why not use ^yourdomain\.com$ instead? Consider:
RewriteCond %{HTTP_HOST} ^yourdomain\.com$ [NC]
RewriteRule ^(.*)$ https://www.yourdomain.com/$1? [L,R=301]

That would not have matched on typo domains, such as yourdoamin.com, that the
DNS server and virtual host would be set to respond to (assuming that
misspelling was a domain you registered and owned). You might want to omit the
query string from the redirected URL, as we did in the preceding two examples,
when a session ID or a tracking parameter (such as source=banner_ad1) needs to
be dropped. Retaining a tracking parameter after the redirect is not only
unnecessary (because the original URL with the source code appended would have
been recorded in your access log files as it was being accessed); it is also
undesirable from a canonicalization standpoint. What if you wanted to drop the
tracking parameter from the redirected URL, but retain the other parameters in
the query string? Here’s how you’d do it for static URLs:
RewriteCond %{QUERY_STRING} ^source=[a-z0-9]*$
RewriteRule ^(.*)$ /$1? [L,R=301]

And for dynamic URLs:
RewriteCond %{QUERY_STRING} ^(.+)&source=[a-z0-9]+(&?.*)$
RewriteRule ^(.*)$ /$1?%1%2 [L,R=301]

Need to do some fancy stuff with cookies before redirecting the user? Invoke a
script that cookies the user and then 301-redirects them to the canonical URL:
RewriteCond %{QUERY_STRING} ^source=([a-z0-9]*)$
RewriteRule ^(.*)$ /cookiefirst.php?source=%1&dest=$1 [L]

Note the lack of a [R=301] flag in the preceding code. That’s intentional, as
there’s no need to expose this script to the user; use a rewrite and let the
script itself send the 301 after it has done its work. Other canonicalization
issues worth correcting with rewrite rules and the [R=301] flag include when the
engines index online catalog pages under HTTPS URLs, and when URLs are missing a
trailing slash that should be there. First, the HTTPS fix:
# redirect online catalog pages in the /catalog/ directory to HTTP if requested over HTTPS
RewriteCond %{HTTPS} on
RewriteRule ^catalog/(.*) http://www.yourdomain.com/catalog/$1 [L,R=301]

Note that if your secure server is separate from your main server, you can skip
the RewriteCond line. Now to append the trailing slash:
RewriteRule ^(.*[^/])$ /$1/ [L,R=301]

After completing a URL rewriting project to migrate from dynamic URLs to static,
you’ll want to phase out the dynamic URLs not just by replacing all occurrences
of the legacy URLs on your site, but also by 301-redirecting the legacy dynamic
URLs to their static equivalents. That way, any inbound links pointing to the
retired URLs will end up leading both spiders and humans to the correct new
URL—thus ensuring that the new URLs will be the ones that are indexed, blogged
about, linked to, and bookmarked, and the old URLs will be removed from the
index. Generally, here’s how you’d accomplish that:
RewriteCond %{QUERY_STRING} id=([0-9]+)
RewriteRule ^get_product\.php$ /products/%1.html? [L,R=301]

However, you’ll get an infinite loop of recursive redirects if you’re not
careful. One quick and dirty way to avoid that situation is to add a nonsense
parameter to the destination URL for the rewrite and ensure that this nonsense
parameter isn’t present before you do the redirect. For example:
RewriteCond %{QUERY_STRING} id=([0-9]+)
RewriteCond %{QUERY_STRING} !blah=blah
RewriteRule ^get_product\.php$ /products/%1.html? [L,R=301]
RewriteRule ^products/([0-9]+)/?$ /get_product.php?id=$1&blah=blah [L]

Notice that this example uses two RewriteCond lines, stacked on top of each
other. All rewrite conditions listed together in the same block will be “ANDed”
together. If you wanted the conditions to be “ORed,” you’d need to use the [OR]
flag. NOTE There is much more to discuss on this topic than we can reasonably
address in this chapter. This is intended only as an introduction to help orient
more technical readers, including web developers and site webmasters, on how
rewrites and redirects function. If you’re not interested in the details, you
may proceed to “How to Redirect a Home Page Index File Without Looping” on page
295.

How to Redirect a Home Page Index File Without Looping Many websites link to
their own home page in a form similar to https://www.yourdomain.com/index.html.
Sometimes this happens because it’s the default behavior of the site’s CMS or
ecommerce platform. The problem with that is that some people that link to the
site will link to the full URL, including the index file, but most of the
incoming links to the site’s home page will specify https://www.yourdomain.com,
thus splitting the link authority coming into the site between the two URLs. Once a publisher realizes this,
they will want to fix all their internal links to point to the main domain name
(e.g., https://www.yourdomain.com). Following that, there are two approaches to
recouping the link authority pointing to the index file. The first is to
implement a canonical tag on the index file page pointing to the domain name.

Using our example, the page https://www.yourdomain.com/index.html would
implement a canonical tag pointing to https://www.yourdomain.com. The second
approach is to implement a 301 redirect. However, this is a bit tricky, as a 301
redirect from https://www.yourdomain.com/index.html to
https://www.yourdomain.com will result in recursive redirects if not done
correctly. This is because when someone comes to your website by typing in
https://www.yourdomain.com, the Domain Name System (DNS) translates that
human-readable name into an IP address to help the browser locate the web server
for the site. When no file is specified (as in this example, where only the
domain name is given), the web server loads the default file, which in this
example is index.html. The filename can actually be anything, but most web
servers default to one type of filename or another (other common options include
index.htm, index.shtml, index.php, and default.asp). Where the problem comes in
is that many CMSs will expose both forms of your home page—that is, both
https://www.yourdomain.com and https://www.yourdomain.com/index.html—in links on
the site and hence to users, including those who may choose to link to your
site. And even if all the pages on the site link only to
https://www.yourdomain.com/index.html, given human nature, most of the links to
your home page from third parties will most likely point at
https://www.yourdomain.com. This can create a duplicate content problem if the
search engine sees two versions of your home page and thinks they are separate,
but duplicate, documents. Google is pretty smart at figuring out this particular
issue, but it is best to not rely on that. Based on the previous discussion, you
might conclude that the solution is to 301-redirect
https://www.yourdomain.com/index.html to https://www.yourdomain.com. Sounds
good, right? Unfortunately, there is a big problem with this approach. What
happens is that the server sees the request for
https://www.yourdomain.com/index.html and then sees that it is supposed to
301-redirect that to https://www.yourdomain.com, so it does. But when it loads
https://www.yourdomain.com, it retrieves the default filename (index.html) and
proceeds to load https://www.yourdomain.com/index.html. Then it sees that you
want to redirect that to https://www.yourdomain.com, and it enters an infinite
loop.

The Default Document Redirect Solution The solution that follows is specific to
the preceding index.html example. You will need to plug in the appropriate
default filename for your own web server. Here’s how to avoid the problematic
loop: 1. Copy the contents of index.html to another file. For this example,
we’ll be using sitehome.php. 2. Create an Apache DirectoryIndex directive for
your document root. Set it to sitehome.php. Do not set the directive on a
server-wide level; otherwise, it may cause problems with other folders that still
need to use index.html as a directory index. 3. Put this in an .htaccess file in
your document root, or, if you aren’t using per-directory config files, in your
httpd.conf file:

DirectoryIndex sitehome.php

4. Clear out the contents of your original index.html file, and insert a single line
of code that pulls in sitehome.php (assuming your server processes index.html as PHP),
such as:
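<?php include("sitehome.php"); ?>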

This sets it up so that index.html is not a directory index file (i.e., the
default filename). It forces sitehome.php to be read when someone types in the
canonical URL (https://www.yourdomain.com). Any requests to index.html from old
links can now be 301-redirected while avoiding an infinite loop. If you are using
a CMS, you also need to make sure when you are done with this process that all
the internal links now go to the canonical URL, https://www.yourdomain.com. If
for any reason the CMS started to point to
https://www.yourdomain.com/sitehome.php, the loop problem would return, forcing
you to go through this entire process again.

Using a Content Management System When looking to publish a new site, many
publishers may wonder whether they need to use a CMS, and if so, how to ensure
that it is SEO friendly. It’s essential to make this determination before you
embark on a web development project. You can use the flowchart in Figure 7-50 to
help guide you through the process.

Figure 7-50. A flowchart to determine whether you need a CMS

Due to the inexpensive, customizable, free platforms such as Drupal, Joomla,
WordPress, Wix, and Weebly, it is increasingly rare for a publisher to develop a
static site, even when a CMS isn’t required. NOTE If you are developing a static
site, you’ll find some information about using static site generators in
“Jamstack” on page 308.

The next step involves understanding how to ensure that a CMS will be search
engine–friendly. Here is a list of features you should look for in a CMS
(whether prebuilt or custom-made):

<title> tag customization and rules A search engine–friendly CMS must allow for
<title> tags not only to be customized on a page-specific level, but also to enable
rules for particular sections of a website. For example, if the <title> tag always
has to start with your site name followed by a colon followed by your article title,
your on-page optimization efforts will be limited—at least as far as the powerful
<title> tag is concerned. You should be able to revise the formulas you use to
generate the <title> tags across your site to make them more search optimal.

Static, keyword-rich URLs URLs have historically been the most problematic SEO
issue for CMS platforms. Nowadays, a search-friendly CMS should feature custom
URL creation. Figure 7-51 is an example from WordPress, which refers to a custom
URL as a post slug. Notice how the first line allows you to create the title of
the post, and the second enables you to manually create the URL structure (and
provides an automatic Generate button if you prefer to simply use the post
title).

Figure 7-51. An example of custom URL creation

Meta tag customization Being able to implement custom meta descriptions and
robots meta tags is critical. Enabling editorial control is essential for a good
CMS.
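For example, a CMS should let an editor set page-level tags such as these (the
description text is illustrative):
<meta name="description" content="A short, unique summary of this page's content.">
<meta name="robots" content="noindex">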

Enabling custom HTML tags A good CMS should offer the ability to customize the
HTML for your pages—for example, implementing nofollow on links and adding,
removing, or editing <h1> tags. These can be built-in features accessible through
menu options, or the CMS can simply allow for manual editing of HTML in the text
editor window when required. Having no <h1> tags on a given page, having too many
<h1> tags on the page, or marking up low-value content (such as the publication date)
as an <h1> is not desirable. The article title is typically the best default content
to have wrapped in an <h1>, and ideally the CMS will let you customize it from
there.

Internal anchor text flexibility For your site to be “optimized” rather than
simply search friendly, it’s critical to customize the anchor text for internal
links. Rather than simply making all links in a site’s architecture the page’s title, a great CMS should be flexible enough
to handle custom input from the administrators for the anchor text of
category-level or global navigation links.

Intelligent categorization structure Another common CMS problem is poor category
structure. When designing an information architecture for a website, you should
not place limits on how pages are accessible due to the CMS’s inflexibility. A
CMS that offers customizable navigation panels will be the most successful in
this respect.

Pagination controls Pagination can be the bane of a website’s search rankings,
so control it by including more items per page, more contextually relevant
anchor text when possible, and, if the situation warrants it, implementing
pagination links as discussed in “Pagination” on page 197.

XML/RSS pinging Although feeds are primarily useful for blogs, any content—from
articles to products to press releases—can be issued in a feed. By utilizing
quick, accurate pinging of the major feed services, you limit some of your
exposure to duplicate content spammers that pick up your feeds and ping the
major services quickly in the hopes of beating you to the punch.

Image handling and alt attributes alt attributes are a clear must-have from an
SEO perspective, serving as the “anchor text” when an image is used as a link and providing relevant content
that can be indexed for the search engines (note that text links are much better
than images with alt attributes, but if you must use image links you should
implement the alt attribute). Images in a CMS’s navigational elements should
preferably use CSS image replacement rather than mere alt attributes.

CSS exceptions The application of CSS styles in a proper CMS should allow for
manual exceptions so that a user can modify the visual appearance of a headline
or list elements. If the CMS does not offer this ability, writers might opt out
of using proper semantic markup for presentation purposes, which would not be a
good thing.

Static caching options Many CMSs offer caching options, which are a particular
boon if a page is receiving a high level of traffic from social media portals or
news sites. A bulky CMS often makes dozens of extraneous database connections,
which can overwhelm a server if caching is not in place, discouraging potential
inbound links and media attention.

URLs free of tracking parameters and session IDs Sticking session or tracking
information such as the user’s click path into your URLs is deadly for SEO, as
each time the search engine crawler comes to your site it will think it’s seeing
each of the pages for the first time. This leads to incomplete indexing and
duplicate content issues. To avoid these problems, verify that your CMS allows
you to create URLs without session IDs or tracking parameters; if it doesn’t,
select a different one.

Customizable URL structure If the default URL structure of the CMS doesn’t suit
your needs, you should be able to change it. For example, if you don’t want
/archives/ in the URLs of all your archived articles, you should be able to
remove it; likewise, if you want to reference the article’s name instead of its
database ID in the URL, you should be able to do that.

301 redirects to a canonical URL Duplicate content is a major concern for the
dynamic website owner. Automatic handling of this by the CMS through the use of
301 redirects is a must.

Static-looking URLs The most palatable URLs to spiders are the ones that look
like they lead to static pages—the CMS should not include query strings in the
URL.

Keywords in URLs Conversely, keywords in your URLs (used judiciously) can help
your rankings.

RSS feeds The CMS should automatically create RSS feeds to help your site rank in Google
News and other feed engines.

Multilevel categorization structure It is awfully limiting to your site
structure and internal hierarchical linking structure to have a CMS that doesn’t
allow you to nest subcategories into categories, sub-subcategories into
subcategories, and so on.

Paraphrasable excerpts Duplicate content issues are exacerbated on dynamic sites
such as blogs when the same content is displayed on permalink pages, category
pages, archives-by-date pages, tag pages, and the home page. Crafting unique
content for the excerpt, and having that content display on all locations except
the permalink page, will help strengthen the search engines’ perception of your
permalink page as unique content.

Breadcrumb navigation Breadcrumb (drill-down) navigation is great for SEO
because it reinforces your internal hierarchical linking structure with
keyword-rich text links.

Meta robots noindex tags for low-value pages Even if you don’t link to these
pages, other people may still link to them, which carries a risk of them being
ranked above some of your more valuable content. The CMS should allow you to
place noindex tags on pages that you do not want to be indexed.

Keyword-rich intro copy on category-level pages Keyword-rich introductory copy
helps set a stable keyword theme for the page, rather than relying on the latest
article or blog post to be the most prominent text on the page.

ugc or nofollow links in comments and for any other user-generated content If
you allow visitors to post comments or submit other content and do not implement
the ugc or nofollow attribute for the links, your site will be a spam magnet.
Heck, you’ll probably be a spam magnet anyway, but you won’t risk being seen as
endorsing spammy sites if you use nofollow attributes. Even if you do default
these links to ugc or nofollow, best practice is to perform some level of manual
review of all user-submitted content. sponsored or nofollow links in ads If you
have ads on your site and do not implement the sponsored or nofollow attribute
for the links, you run the risk of the search engines thinking that you’re
selling links. While Google will likely recognize the difference between regular
ads and links intended to sell PageRank, it’s best to not take that chance.

Customizable anchor text on navigational links Phrases like Click Here, Read
More, Full Article, and so on make for lousy anchor text—at least from an SEO
standpoint. Hopefully, your CMS allows you to improve such links to make the
anchor text more keyword rich.

XML sitemap generator Having your CMS generate your XML sitemap can save you a
lot of hassle, as opposed to trying to generate one with a third-party tool.

HTML4, HTML5, or XHTML validation Although HTML validation is not a ranking
signal, it is desirable to have the CMS automatically check for malformed HTML,
as search engines may end up seeing a page differently from how it renders on
the screen and accidentally consider navigation to be part of the content, or
vice versa.

Pingbacks, trackbacks, comments, and antispam mechanisms The problem with
comments/trackbacks/pingbacks is that they are vectors for spam, so if you have
one or more of these features enabled, you will be spammed. Therefore, effective
spam prevention in the form of tools such as Akismet, AntiSpam Bee, WP Cerber
Security, Mollom, or Defensio is a must.

If you want more information on picking a quality CMS, there are some great web
resources out there—one example is OpenSourceCMS.com—to help manage this task.
We’ll provide a few pointers in the next section.

CMS Selection There are many factors to consider when choosing a CMS. Many CMS
platforms are free, but some of them are proprietary with a license cost per
site. The majority were not designed with security, stability, search
friendliness, and scalability in mind, though in recent years a few vendors have
developed excellent systems that have search friendliness as their primary
focus. Many of these tools were developed to fit a certain market niche but can
be expanded to fit other purposes. Some are no longer maintained. Many are
supported and developed primarily by hobbyists who don’t particularly care if
you’re having trouble getting them installed and configured. Some are even
intentionally made to be difficult to install and configure, so that you’ll be
encouraged to pay the developers a consulting fee to do it all for you. Most CMS
and ecommerce platforms offer support for SEO but require configuration for
optimal results. Make sure you get that help up front to get the SEO for your
site off to a strong start. Selecting a CMS is an important process. If you make
the wrong choice, you will be faced with limited SEO options. Like most
software, a CMS is a moving target—what’s missing today may be a new feature
tomorrow. In addition, just because a feature exists doesn’t mean it is the
default option, so in many instances the desired functionality will need to be
enabled and possibly customized to work to your specifications.

Third-Party CMS or Ecommerce Platform Add-ons Many CMS platforms offer
third-party plug-ins or add-ons that extend their core functionality. In the
WordPress plug-in directory alone, there are over 60,000 plug-ins, including the
hugely popular Yoast SEO, Rank Math, and All in One SEO. The ecommerce platform
Magento is also dependent on plug-ins to function well for SEO; the most popular
SEO plug-ins for this platform are offered by Amasty and Mageworx. Plug-ins
provide a simple way to add new SEO features and functionality, making your
chosen CMS or ecommerce platform much more flexible and SEO friendly. It’s
particularly helpful when there is an active community developing plug-ins. An
active community also comes in very handy in providing free technical support
when things go wrong; and when bugs and security vulnerabilities crop up, it is
important to have an active developer base to solve those issues quickly. Many
add-ons—such as discussion forums, customer reviews, and user polls—come in the
form of independent software installed on your web server, or hosted services.
Discussion forums come in both of these forms: examples include bbPress, which is installed software and is optimized for search, and vBulletin, which is a
hosted solution and therefore more difficult to optimize for search. The problem
with hosted solutions is that you are helping to build the service providers’
link authority and not your own, and you have much less control over optimizing
the content. This illustrates the need to take the time to validate the SEO
impact of any CMS or ecommerce add-on that you choose, including WordPress
plug-ins. This can be positive or negative; for example, the plug-ins can slow
down your site’s page speed, or you may run into security compliance issues.

CMS and Ecommerce Platform Training Regardless of the platform(s) you choose,
you’ll inevitably need to invest time in configuring it to meet your SEO needs.
In many cases the issues you may have with the platform out of the box can cause
severe SEO problems or limitations. Most of the popular platforms allow you to
configure your site to be SEO friendly as long as you take the time to set them
up properly. Whether or not your platform requires an add-on for SEO, it’s
important that you train your developers and content authors to use the features
of the platform correctly. Among other things, they’ll need an understanding of
all of the following: • Where pages sit in the overall site hierarchy

• Title tag selection

• Implementing descriptive keyword-rich URLs

• Implementing meta descriptions

• How links to the pages are implemented

• Choosing the main heading tag

• Whether or not a page is included in the XML sitemap

For many platforms there are other issues that you’ll need to consider too, such
as what types of redirects are used; when links should be tagged as nofollow,
ugc, or sponsored; how canonical, nofollow, and noindex tags are implemented;
and XML sitemap generation. Having a training program for your developers and
content authors is key to success. The existence of the ability to configure
these aspects of your pages and your site architecture is great, but if your
staff is not properly trained to use them, they also represent potential points
of failure.

JavaScript Frameworks and Static Site Generators Historically, Google had
significant challenges with processing JavaScript. There were two major reasons
for this: Googlebot was not using the most up-to-date page rendering technology,
and Google did not have the infrastructure and processes in place to separately execute the additional page rendering resulting from the execution
of JavaScript. This changed in 2019, when Google announced that Googlebot would
from that point on leverage the latest Chromium rendering engine. Since this is
the core rendering engine of the Chrome browser, this should in principle mean
that Googlebot is able to render anything that the Chrome browser can render. At
the same time, Google also put in place the infrastructure to handle deferred
rendering, in a manner similar to that shown in Figure 7-52.

Figure 7-52. Google’s deferred rendering process

In principle these steps enable Google to crawl and fully render any page that
the Chrome browser can. However, a portion of that rendering takes place after a
brief period of delay. As of 2019, according to Google’s Martin Splitt, the
median time to render was 5 seconds and the 90th percentile was done in minutes.
This delay only applies if your site uses client-side rendering or hybrid
rendering, discussed in the following section.

Types of Rendering There are several different methods for rendering pages. The
type of rendering you use can impact the user experience, page performance, and
the level of effort required to develop and maintain the pages.

Server-side rendering versus client-side rendering Server-side rendering (SSR)
is the way that web pages were traditionally rendered on the web, and this can
be done without JavaScript. With SSR, pages are fully assembled on your web
server and are then delivered to the browser in completed form. You can see an illustration of how this works in Figure 7-53. SSR is faster at delivering
the initial page to a user, as there is less back and forth between the server
and the browser than with client-side rendering. Because users get their pages
faster, search engines also like this.

Figure 7-53. Server-side rendering

In contrast, client-side rendering (CSR) is when pages are primarily rendered on
the client side (within the browser). With this approach, the server delivers a
single-page application file to the browser along with some JavaScript that the
browser then executes to bring in the remaining page elements. Figure 7-54
provides an illustration of how this process works.

Figure 7-54. Client-side rendering

CSR is generally slower than SSR at rendering the first page a user requests
from your site. This is because the CSR process may require many rounds of
communication between the browser and the web server as different scripts
execute and request more information from the server. However, CSR can be faster
at delivering subsequent pages, as page elements downloaded to the browser’s
device for the first page that are reused on other ones don’t need to be
downloaded again.

Dynamic rendering and hybrid rendering Client-side rendering has some strong
advantages, but not all search engines can process JavaScript. In principle,
Google can handle it, but there are two alternative approaches to consider. One
of these is dynamic rendering. As illustrated in Figure 7-55, dynamic rendering delivers content to users using client-side rendering and
delivers content to search engines using server-side rendering.

Figure 7-55. Dynamic rendering

Google does not consider dynamic rendering to be cloaking if the content you
deliver to the search engine robot/crawler is substantially the same as what
gets delivered to the user. The other alternative to consider is hybrid
rendering, also known as server-side rendering with hydration. As shown in
Figure 7-56, this is a mix of SSR and CSR. Most of the page is assembled on the
server side and delivered to the browser up front, and then certain specific
elements are executed on the client side. This approach allows you to render
above-the-fold content server-side and then render other aspects of the page
using a client-side approach. This can provide fast page load times for users as
well as strong Largest Contentful Paint scores (part of the Core Web Vitals,
which are discussed in depth in “Google Algorithm Updates” on page 404).

Figure 7-56. Hybrid rendering

JavaScript Frameworks JavaScript is a scripting language that can be used to
build dynamic web applications. However, out of the box JavaScript does not
provide easy ways to develop user interfaces that are complex or that are easy
to maintain. This is the reason for making use of a JavaScript framework, which
provides code libraries that simplify this process. Some of the more popular
JavaScript frameworks are:

Vue
Developed by an ex-Googler, Vue is highly flexible and good at progressive site building. It offers higher customizability than other frameworks and is a good fit for midsized companies with sufficient programming resources. Potential challenges result from that flexibility, and there is limited support available.

React
Developed by Facebook, React is known for being good for UX design and fast iteration. It also has a lot of developers in the ecosystem. It’s well suited for large companies with a lot of development resources. React code can be a bit messy, and it can render poorly.

jQuery
jQuery is an open source library that was designed to simplify interaction with the DOM. While it remains popular, it was released in 2006 and is a bit dated. It can lead to a large and unwieldy codebase when used to build larger applications.

Express
Express was designed to be a fast, flexible framework for developing a backend. It can be used together with Node.js to build APIs for hybrid mobile and web apps. Some of the challenges with Express relate to how it uses callbacks and getting used to working with middleware.

Ember
Ember, created by Yehuda Katz, is a solid pick for building a rich UI on any device. Ember works best when it is the entire frontend of your application. One downside of the platform is that it is not backed by a large entity, and as a result it has a smaller community of developers.

Jamstack

Jamstack is an acronym derived from JavaScript, API, and markup. With Jamstack sites, the entire frontend is pre-rendered into optimized static pages that can be directly hosted on a content delivery network. There are a number of advantages to the Jamstack approach to development. These include:


Security
Jamstack sites have fewer moving parts, and they don’t dynamically interact with the web servers. For this reason, they have fewer points of vulnerability.

Improved portability
With less dependence on the specifics of the hosting environment, Jamstack sites are more easily moved from one environment to another.

Scale
Since Jamstack sites can be served entirely from a CDN, no complex logic is required to figure out how to deal with traffic at scale.

Speed
Since the sites can be entirely served from a CDN, Jamstack sites are often faster.

Jamstack is not necessarily for everyone. Developing with it does require significant web development experience. To help make this easier, a new class of tools called static site generators (SSGs) has emerged. You can think of using an SSG as a compromise between handcoding your pages and using a full-blown content management system. With handcoding, you have to work through all of the logic for your site on your own. You have a great deal of flexibility, but you’re responsible for configuring every aspect of everything. With a CMS (as you saw earlier in this chapter) many things are automatically taken care of for you, but you lose a lot of flexibility, and sometimes they just get in the way of you doing what you want to do. For example, if you’re trying to do something outside of the core functionality of the CMS, you’ll likely require some type of plug-in to do it. With an SSG you still have a great deal of freedom and flexibility, but templates are provided to make a lot of the task of assembling a static site much simpler and faster. And since SSGs use the Jamstack architecture, they also offer all the benefits that come with that.

However, like CMSs, SSGs do impose some limitations. For example, some SSGs are built with a desktop-first approach, and for SEO it’s important to have a mobile-first mentality. In addition, with some SSGs title tags and meta descriptions aren’t built into the templates. Each SSG will have its own limitations, and whatever those are will become a part of your site unless you find a way to work around them. Examples of popular SSGs include:

Gatsby
Gatsby is a strong platform for building high-performance sites with offline and sync capabilities, leveraging service workers. However, it can sometimes run into challenges with rendering, and its plug-ins can interfere with SEO and accessibility.

11ty
11ty is also fast and is able to mix together different template languages. It can be great for getting a quick start but lacks preinstalled themes or configurations.

Jekyll
Jekyll is strong at generating static websites and offers expressive templating. However, it uses Ruby (a less well-known language), is not super flexible, and can have slow build times.

Next.js
Next.js offers solid speed, and preloading is easy. It uses React, which is very well known, so there is a lot of community support. On the downside, most changes require development, it has a poor plug-in ecosystem, and it can be comparatively expensive.

Problems That Still Happen with JavaScript

There are still plenty of opportunities to get into trouble from an SEO perspective when developing sites using JavaScript. Some of the more common problems are described here:

• Many sites are built with portions of their content or links that do not appear in the pages when they are initially assembled within the browser. That is, there may be content or links that only get loaded into the page once a user clicks on something in the user interface. Since this content and these links are not available upon the initial load of the web page, the search engines will not see them. This is something that you can test by examining the fully rendered page. If there is some portion of content or some links that you are concerned about, you can check whether they’re visible after the initial page load using Google’s Mobile-Friendly Test tool or Rich Results Test tool, or using the Chrome Inspector (as described in “Site elements that are problematic for spiders” on page 199). Note that you can’t simply “view source” in your browser. That is because this will show the actual JavaScript on the page, not what the page looks like after the initial JavaScript executes. We refer to the state of the page after the initial JavaScript executes as the “fully rendered page.” This is an important concept to understand when it comes to debugging potential JavaScript problems. Where the problems come in is if, after you have a fully rendered page, there is still additional content or links that are downloaded only after the user clicks on something.

• Some sites block Google from crawling their JavaScript and/or CSS files. Blocking either of these file types prevents Google from fully understanding the content and layout of your pages. Ensure that these are not blocked in robots.txt (see the robots.txt sketch at the end of this section).

• If you’re using an SSG, be aware of the potential limitations of the page templates that you use. These may not provide you with an easy way to customize your title tags, meta descriptions, or other page elements. If you have this issue, you will have to find a way to work around it.

• JavaScript permits approaches to coding that are not well handled by search engines, even though Google renders pages with an evergreen version of Chrome, and if you use these they will not function well for users or for search engines. For example, you can implement links using <span> or <div> tags with click handlers instead of <a> tags, or have links that rely on fragments (#) to form unique URLs. While JavaScript permits you to do these things, you should not use these types of constructs.

JavaScript offers a lot of advantages to developers in the process of creating websites and is highly popular for that reason. In today’s environment JavaScript can be made very SEO friendly, but you must take the time and care to work around any limitations in the framework or SSG that you choose to use. Be sure to investigate the various SEO-related limitations of any platform you’re considering before you finalize your selection.
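
As a sketch of the robots.txt point from the list above, the following contrasts a problematic configuration with a safer one (the directory paths are illustrative; adjust them to wherever your site serves its script and style assets):

# Problematic: prevents crawlers from fetching the files needed to render pages
User-agent: *
Disallow: /assets/js/
Disallow: /assets/css/

# Safer: keep script and style assets crawlable; restrict only what truly needs it
User-agent: *
Allow: /assets/js/
Allow: /assets/css/
Disallow: /admin/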

Best Practices for Multilingual/Multicountry Targeting

One of the ways companies grow their organic search presence, traffic, conversions, and, ultimately, business profits is by targeting audiences across more countries (who can speak and search in the same or different languages), or audiences that speak different languages within the same country. Although targeting international markets can be an attractive way to scale, there are many common challenges when establishing a multilingual/multicountry SEO process, and these need to be addressed right from the start for a cost-effective process. Questions to consider include:

• Should you enable different language or country versions of your website to effectively rank across the different markets you’re targeting?

• How many country/language markets should you optimize for?

• How will you configure the different language or country versions of your site to ensure their crawlability and indexability? Using ccTLDs, subdomains, or subdirectories?

• How can you ensure the relevant country pages rank for the right audiences?

• How can your new language or country versions compete to rank in new markets?

There are SEO as well as broader marketing and business-related factors that affect the answers to these questions, such as the company’s business model and operations, legal and financial requirements, and other organizational constraints. Let’s go through the most important criteria to take into consideration when addressing these issues.

When to Enable a New Language or Country Version of Your Site

As with any new SEO process, the first step to assess the opportunity of targeting a new language or country from an SEO perspective is to perform keyword and competitive research (using keyword research tools like Semrush, Ahrefs, Mangools, etc. that support international markets). This assessment should leverage support from native language speakers and should be used to identify the following:

The organic search traffic potential in each market
How many searches are there related to your product or services that would be relevant to target? Is the volume of related searches in the market growing or decreasing?

The search behavior of the audience
Which are the most popular terms? What’s the impact of seasonality? What are the most commonly searched products or services?

The level and type of competition in the market
Who are your market competitors? How optimized, authoritative, and popular are they? Which are their best-performing queries and pages? What types of website structures do they use to target the market?

This information will allow you to evaluate whether
there’s enough search traffic potential for a positive ROI in each language or
country market you’re considering targeting, the alignment of search behavior
with your business offerings and goals, as well as the existing search marketing
trends and the competition. You can then combine this organic search market
information with insights from other potential traffic drivers such as the
business operations capacity, legal and financial requirements, and constraints
that you face in order to prioritize the potential markets to target. You must
also take into consideration the cost to translate or localize the content (and
provide support to users throughout the customer journey) to assess whether each
market might be profitable and, depending on that, whether or not it should be
targeted. Finally, you’ll need to consider the relative priority of doing so
versus other investments your business could be making.

Skipping this initial in-depth assessment can result in targeting unprofitable
markets; challenges in justifying the investment in content translation,
localization, and optimization efforts; and ultimately the failure of the
international SEO process.

When to Target a Language or Country with a Localized Website Version

Search engines allow you to target international audiences (and provide configurations to do so) on a language or country basis. Ideally, you should look to target your audience at as granular a scale as possible to better optimize your site for that market and connect with their search behavior. Typically this is at the country level, as search behavior, trends, seasonality, and terms used can vary from country to country, even among users who speak the same language. However,
the differences in search behavior across countries for your market might not be
substantial, and depending on your website’s business model and the search
potential per country market, it might not be profitable to target each country
individually. For example, if you have a tech news site in English, you’re
likely publishing content that already attracts users from all over the world
searching in this language. Due to the business model of such a site, you may
not need to target your English-speaking audience on a per-country basis,
despite certain differences in English-language terms used in different
countries (e.g., mobile versus cell phone). The variations might not be
meaningful enough to compensate for the effort involved in creating new
localized versions to target English-speaking users on a per-country basis. On
the other hand, if you have an ecommerce or marketplace site in English
targeting US visitors, even if you end up ranking in other English-speaking
countries for which the search terms used are the same, you won’t be able to
convert them into customers if you don’t offer delivery in those other countries
or accept currencies other than US dollars. In this case, to expand into those
markets you will likely need to enable a new website version for each new
country, as there will be meaningful product information and conditions that
will need to change on a country-by-country basis. In addition, there may be
differences in terms used to look for products (e.g., sneakers versus trainers),
which you’ll discover when doing keyword research. There might also be scenarios
in which you’ll want to target a continent, like Europe, or a region, like Latin
America. It’s not possible to effectively target these types of international
markets, as search engines only support configurations to target specific
languages or countries. In fact, Google’s “Managing Multi-Regional and
Multilingual Sites” documentation indicates that it treats regional top-level
domains, such as .asia and .eu, as if they were gTLDs, without
geolocating them. This makes sense, as across most continents searcher behavior
will likely differ in meaningful ways from country to country; users will tend
to search in their native languages, and it is therefore not recommended to target them as a whole with a single website
version for the whole continent. For example, if you had one website version
that was in English targeting audiences in Europe, you would have difficulty
attracting search traffic from users searching in French, German, Spanish,
Italian, etc.

Configuring Your Site’s Language or Country Versions to Rank in Different Markets

Once you’ve decided to target a new market, the first step is to create specific URLs to feature each new language or country page’s content and ensure their crawlability and indexability. These URLs can then be configured to specify their language and country targets. Dynamic solutions that show different translated versions of content under the same URL won’t be independently crawled or indexed to rank in their target markets, and therefore it’s not recommended to use these.

International website structure alternatives

There are different types of structures that you can use to enable different website versions to target different languages or countries, all of them with some advantages and disadvantages. The least recommended approach is the use of URL parameters in gTLDs to target different languages (e.g., yourbrand.com/?lang=es) or countries (e.g., yourbrand.com/?country=es). The reason this isn’t recommended is that you can’t use Google Search Console’s geolocation feature when targeting countries, or rely on the translation/localization of the URL structure depending on the selected language or country. If you decide to target specific countries, you can choose from three possible website structures: country code top-level domains (ccTLDs) for each country, or subdirectories or subdomains created under gTLDs. These three approaches have the following advantages and disadvantages:

ccTLDs (e.g., yourbrand.co.uk, yourbrand.es, etc.)

Advantages:

• Each domain is already geolocated by default to the relevant country (these geolocations cannot be changed).

NOTE When deciding how to structure an international website, it’s important to be aware that there are ccTLDs that Google treats as if they were gTLDs, like .co, .fm, and .ad, among others. These are specified by Google in its international SEO documentation, which you should check prior to picking a new ccTLD. These are not geolocated by default and need to be geolocated via Google Search Console to target the relevant countries.

• Allows you to easily separate different site versions and to better avoid inheriting issues from one version in another.

• Users may prefer to click on a ccTLD that they see as more clearly targeted toward them, rather than a gTLD.

• Can be independently hosted in the relevant country.

Disadvantages:

• More technical resources are required for the operation of multiple domains.

• Might not be available to be purchased or used by nonlocally-based companies, depending on each country’s legal restrictions.

• More effort will likely be needed to grow the popularity of a new domain for each market (as compared to a single site for all markets), which can impact capacity to rank for competitive terms. For example, you’ll need to build a new backlink profile for each domain, since you will not be able to benefit from the existing domain’s profile.

• Large sites might have less crawl budget available due to the lower popularity of each individual domain, as popularity is a factor taken into consideration when assigning crawl budget.

Subdirectories (e.g., yourbrand.com/en-gb/, yourbrand.com/es-es/, etc.)

Advantages:

• A single domain for all countries will likely require less operational and technical support than multiple country-specific domains.

• A single domain will likely require less marketing investment to grow organic search traffic than trying to promote a different domain for each country.

• Large sites will likely have more crawl budget available, due to higher consolidated popularity.

• Each subdirectory can be registered and geolocated via Google Search Console to the relevant country.

Disadvantages:

• Doesn’t allow you to easily separate different site versions and avoid inheriting issues from one version in another.

• Will likely have a deeper web structure that will need better internal linking to make it easily crawlable.

• Local users might prefer to click on sites using ccTLDs (which can look more targeted toward them) rather than a gTLD.

• Cannot be independently hosted in the relevant country.

Subdomains (e.g., en-gb.yourbrand.com, es-es.yourbrand.com, etc.)

Advantages:

• Each subdomain can be registered and geolocated via Google Search Console to the relevant country.

• Can be independently hosted in the relevant country.

• Might need fewer resources for operational and development support than ccTLDs, as it will be using a single domain.

Disadvantages:

• Local users might prefer to click on sites using ccTLDs rather than a gTLD, as they appear more targeted toward them.

• Google will need to assess whether the subdomains will be considered part of the same website or not. As John Mueller explained at a 2018 SearchLove event, “We try to figure out what belongs to this website, and sometimes that can include subdomains, sometimes that doesn’t include subdomains.”

Depending on your circumstances, some of the previous advantages or disadvantages will matter more than others, so it’s critical to assess the options based on the particular context of your organization and website, as well as the targeted markets’ requirements and constraints. For example, geolocating by using subdirectories under gTLDs will likely be the easiest, most straightforward option to start with if the site is already using a gTLD, as long as there are no legal or business constraints that prohibit it. This will allow
you to easily grow your presence across countries, without having to develop the
link profiles of different website versions independently. In this case, not
being able to host each website in the targeted countries might not even be a
critical disadvantage; many sites make use of CDNs for fast performance for
users all over the world anyway, and because of this, as Google specifies in its
international SEO documentation, local hosting isn’t a definitive signal.
However, if you’re targeting China, where the most popular search engine is
Baidu (Google doesn’t specifically target this market), it’s recommended to use
a ccTLD with a local IP to avoid the “Great Firewall” filters. Because of these
factors, it’s also recommended to analyze and take into consideration the website structures of the competing sites currently ranking in the top positions for your targeted queries in each market.

NOTE If you’re targeting languages, ccTLDs are not the best approach, as they’re geolocated to a country by default. In this case you can choose from subdirectories or subdomains. The advantages and disadvantages of these two approaches are similar to those in the country targeting scenario, but without the geolocation considerations; you won’t need to geolocate the subdirectories or subdomains to a specific country through Google Search Console.

Best practices for optimizing your international website versions for their targeted markets

Once you’ve selected a website structure for your international markets, it’s critical to optimize your new sites to ensure they are relevant, popular, and competitive so they have a good chance of ranking in their targeted markets. The most important international SEO actions to follow to achieve this are:

Publish website versions with content translation and/or localization and
optimization supported by native speakers. The content of new language or
country pages should be translated or localized and optimized by native speakers
to ensure accuracy, quality, and relevance to the targeted audiences. From URLs
and metadata, to headings, navigation, and main content, all the information on
your site should be translated or localized for each new website version, taking
into consideration the keyword research and targeted queries of each market as
input.

NOTE Despite the improvements in automated translation systems that you might
want to leverage for the content translation process, it’s always recommended to
have native speakers validate the results and optimize the content.

It’s also important to note that even when you target different countries that
speak the same language, there will tend to be differences in the terms used,
product or service preferences, seasonality, currency, conditions, etc., and
it’s therefore important to localize the content accordingly in order to connect
with and satisfy the audience’s needs. On the other hand, if in your target countries the audience’s search behavior is very similar or even identical, the same language is spoken, and it doesn’t make sense to differentiate the content,
you shouldn’t be afraid of featuring the same or very similar content across
your sites, as long as the content is actually optimized to target the needs of
these different audiences.

Use URL translation/localization when using Latin characters. Ideally, URLs
should be translated into the relevant language of the featured page, as happens
with the rest of the page content, so if the page information is in Spanish, the
URL should be translated into this language too. However, in the case of URLs,
it’s recommended to avoid using special characters (like accents) and to make
use of transliteration to keep using Latin characters in them, rather than
featuring URLs in other scripts, such as Cyrillic for Russian audiences.

Avoid automatic IP or language-based redirects. It’s not recommended to
automatically redirect users based on their IP location or browser language, as
this might prevent search engines from being able to effectively crawl all
language or country versions. Instead, when users land on a page that is not
shown in the language of their browser, or that is not targeted toward their
identified IP location, it’s recommended to suggest that the user go to the
alternate relevant page version that is in their language or targets their
country, using a nonintrusive banner or notification. If the user makes such a
selection, allow them to save that value and specify it as their “preferred”
country or language version (or do it for them), so they can be automatically
redirected in subsequent visits to the site.

Cross-link between language/country web versions. Add crawlable links referring
to the other language or country versions of each page, to allow users to switch
between them, facilitate their crawlability, and pass link popularity between
them.
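
For example, a simple crawlable language/country switcher can be built from plain anchor links rather than JavaScript-only handlers (the URLs here are illustrative):

<nav aria-label="Language and country selector">
  <a href="https://www.brand.com/en-us/">English (US)</a>
  <a href="https://www.brand.com/es-es/">Español (España)</a>
  <a href="https://www.brand.com/es-mx/">Español (México)</a>
</nav>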

Attract local backlinks to each of your international website versions. Promote
each of your international website versions within their targeted
country/language to attract backlinks from relevant, authoritative local sites.
If you have a physical in-country presence, feature the relevant local addresses
and phone numbers in your pages, as well as create a Google Business Profile for
each location.

Specify your page’s language and/or country with relevant tags. Specify your page’s language and/or the targeted country by using the “content-language” meta tag or embedding this information in the HTTP headers (this is supported by Bing). For example, for US English the meta tag is the following, featuring a two-letter ISO 639 language code, followed by a dash and the relevant ISO 3166 country code:
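
In this format, the tag looks like this:

<meta http-equiv="content-language" content="en-us">

The equivalent HTTP response header is Content-Language: en-us.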

Google and Yandex also support the use of hreflang annotations for this purpose,
as described in the following section.

Using hreflang annotations

Use hreflang annotations when you have a website in more than one language or target more than one country to give an additional signal to Google and Yandex about which are the relevant pages that should be indexed and shown in the search results for users searching in each country/language. This will help to avoid ranking pages in nonrelevant language or country markets. You can identify such pages by using the Google Search Console Performance report or tools like Semrush or Ahrefs that allow you to see the keywords for which your site ranks in any given country, and whether there are any pages ranking in markets for which you have more appropriate localized versions of the same pages.

TIP It’s not necessary to use hreflang annotations on pages that are only available in a single country or language (or pages that cannot be indexed); instead, focus their usage on those pages of your site that can be indexed and are published in different language or country versions.

Google uses the ISO 639-1 format for the hreflang language values and the ISO
3166-1 alpha-2 format for countries. The annotations can be included in the HTML
head section, within XML sitemaps, or in HTTP headers. These each have
advantages and disadvantages, depending on the scenario:

In the HTML head
This method is best for sites with a small number of language and/or country versions to tag, and when using a platform that easily allows the addition and editing of the tags in the page’s HTML head. hreflang tags should be included in the page’s HTML head, specifying each of the page’s alternates. You can see an example of this in Figure 7-57; the basic syntax is as follows:
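
<link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/" />

Here the hreflang value identifies the language (and, optionally, country) of the alternate version, and href points at that version’s URL (the brand.com URL is illustrative).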

Figure 7-57. Example of hreflang code in the head section of a web page

In XML sitemaps
This tends to be the best method for bigger sites featuring many language/country versions, where including hreflang tags for all of them in the HTML head would add too many extra lines of code, increasing the page size. This is also the best alternative whenever there’s no way to edit the HTML head to add the hreflang tags there. Here’s an example of hreflang tags in an XML sitemap indicating that the home page is available in English for the US, Spanish for Spain, and Spanish for Mexico:




<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:xhtml="http://www.w3.org/1999/xhtml">
  <url>
    <loc>https://www.brand.com/en-us/</loc>
    <xhtml:link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/" />
    <xhtml:link rel="alternate" hreflang="es-es" href="https://www.brand.com/es-es/" />
    <xhtml:link rel="alternate" hreflang="es-mx" href="https://www.brand.com/es-mx/" />
  </url>
  <url>
    <loc>https://www.brand.com/es-es/</loc>
    <xhtml:link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/" />
    <xhtml:link rel="alternate" hreflang="es-es" href="https://www.brand.com/es-es/" />
    <xhtml:link rel="alternate" hreflang="es-mx" href="https://www.brand.com/es-mx/" />
  </url>
  <url>
    <loc>https://www.brand.com/es-mx/</loc>
    <xhtml:link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/" />
    <xhtml:link rel="alternate" hreflang="es-es" href="https://www.brand.com/es-es/" />
    <xhtml:link rel="alternate" hreflang="es-mx" href="https://www.brand.com/es-mx/" />
  </url>
</urlset>

In HTTP headers
For non-HTML documents like PDFs, Word files, etc., you can use the HTTP headers to specify the available language/country versions. It’s generally better to use this method only in such cases, as it is easier to overlook and can be more difficult to validate. Here’s an example of what this would look like for a file.pdf document available in English for the US, German for Germany, and French for France (the URLs are illustrative):

Link: <https://www.brand.com/en-us/file.pdf>; rel="alternate"; hreflang="en-us",
      <https://www.brand.com/de-de/file.pdf>; rel="alternate"; hreflang="de-de",
      <https://www.brand.com/fr-fr/file.pdf>; rel="alternate"; hreflang="fr-fr"

In these examples, the URLs are all under the same domain, but hreflang annotations can also be specified for cross-domain configurations (e.g., between ccTLDs). It’s also possible to combine different methods to implement hreflang tags for your site; just be sure to define clear rules for when you use one method or another and the scope of each implementation, to avoid overlapping them.

It’s not required, but if you have a language or country version (or a “global,” generic one) that you want to be used as the default or fallback for visitors from countries or using languages that aren’t specifically targeted, you can do this by applying the hreflang="x-default" annotation to that page version. For example, if you want to specify that your English for the US home page is also the default page to fall back to for users in countries that you’re not specifically targeting, you can add a second hreflang tag for that page version to indicate that, as shown here:
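
<link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/" />
<link rel="alternate" hreflang="x-default" href="https://www.brand.com/en-us/" />

(The brand.com URL is illustrative; the same x-default annotation can equally be expressed in an XML sitemap or HTTP header.)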

To facilitate the correct generation of hreflang annotations, you can use free tools like the Hreflang Tags Generator created by Aleyda Solis or the Hreflang-Generator by SISTRIX to ensure the tags use Google’s specified values and syntax, and Merkle’s hreflang Tags Testing Tool or the Hreflang Tag Checker Chrome add-in from Adapt Worldwide to check the implementation. As with any SEO implementation, it’s also recommended to first release the change in a test environment and use SEO crawlers that support hreflang to check the implementation in bulk, like Screaming Frog, Lumar, Botify, Sitebulb, or Ryte, to validate that the annotations have been added correctly. The two most common hreflang implementation issues are:

• Pointing to URLs that cannot be indexed (they might be triggering a 4xx or 5xx HTTP error or a redirect, have a meta robots noindex tag, canonicalize to other pages, etc.) or that do not add a return hreflang tag that points back (see the sketch after this list).

• Using incorrect/nonsupported language or country codes, or only specifying the country code without adding a language, which should always be included. This can be avoided by using the ISO 639-1 format for the language values and the ISO 3166-1 alpha-2 format for the countries, as suggested previously.
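
As a minimal sketch of the return-tag requirement, if the en-US and es-ES versions of a page reference each other, each version should carry the full set of annotations, including a self-reference (URLs illustrative):

<!-- On https://www.brand.com/en-us/page/ -->
<link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/page/" />
<link rel="alternate" hreflang="es-es" href="https://www.brand.com/es-es/page/" />

<!-- On https://www.brand.com/es-es/page/ -->
<link rel="alternate" hreflang="en-us" href="https://www.brand.com/en-us/page/" />
<link rel="alternate" hreflang="es-es" href="https://www.brand.com/es-es/page/" />

If the es-ES page omitted the annotation pointing back to the en-US page, the pair might be ignored.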

NOTE A special thanks to Aleyda Solis for her contributions to this portion of
the chapter.

The Impact of Natural Language Processing

In the early days of search engines, the primary focus was on matching the keywords in user search queries with web pages that used those keywords. This approach was pretty primitive and also relatively easy for spammers to fool. With the launch of Google in 1998 a major new factor was added into the mix: links. This was a big change because it added an element of external validation into the picture. However, this left Google and other search engines with many challenges in identifying the content that best matches the needs implied by a user query. For that reason, for over a decade now Google has been investing in improving its natural language processing (NLP) capabilities. NLP is a branch of computer science focused on helping computers process language in much the same way that humans do. This is one of the most significant ways that machine learning technology is being deployed by search engines today. Since 2013 Google has announced many NLP algorithm releases, and there are likely others that we have not heard about. Some of the ones that have been announced are:

Hummingbird
Hummingbird was the first machine learning algorithm announced by Google in 2013. At the time, Google described the algorithm as follows: “Hummingbird is paying more attention to each word in a query, ensuring that the whole query—the whole sentence or conversation or meaning—is taken into account, rather than particular words.” However, it is likely that this was still being done in a pretty limited way, as you will see given some of the following releases.

RankBrain
RankBrain was the next major machine learning algorithm announced by Google in 2015. It focused on trying to match up a given query with other queries that use different words but that are otherwise similar in intent. This is done using a machine learning concept known as similarity vectors. This approach is particularly useful with the roughly 15% of daily queries that Google has never seen before; it can look at past results for similar queries to optimize the results it serves for the new queries.

BERT
Prior to the release of BERT in October 2019, Google was only able to consider the words that appeared immediately before or after a keyword in a piece of content to understand the intent related to the keyword, as the words in the text were processed one by one in order. This algorithm, which enabled bidirectional processing of text (considering the words that appear both before and after the keyword), was considered a major step forward when it was announced—but it should also put into perspective how limited Google’s NLP capabilities were prior to BERT’s release. You can read a more extensive commentary in “BERT” on page 404.

MUM
The Multitask Unified Model (MUM) algorithm, which Google claims is 1,000 times more powerful than BERT, was released in May 2021. This algorithm allows Google to seamlessly process information across languages and mediums (video, images, text).

All of these algorithms are intended to enable Google to better understand the meaning behind the language of user queries and the language it encounters on websites and web pages. With this better understanding, it can continue to improve on how well it provides searchers with the information and resources that they want across the web.

Generative AI

In late 2022 OpenAI made a major splash with its generative AI model ChatGPT. This took the potential for search in a whole new direction, where a SERP could consist primarily of a long-form text response to a query like tell me about the Boxer Rebellion. Microsoft has been a major investor in OpenAI, and Bing was quick to implement a version of its search engine that integrated ChatGPT (a solution it refers to as Bing Chat). Google has also launched a rival to ChatGPT called Bard. You can read more about using generative AI solutions to create content in Chapters 1 and 2, and more about the future of and challenges with generative AI in Chapter 15.

For some types of queries, these developments could materially change the appearance of the SERPs. However, it’s highly likely that the basics of what you need to do (as defined in this book) will remain the same. You will still need to make your site crawlable, you will still need to provide great content, and you will still need to do the right things to raise the overall visibility of your organization.

Entities

NLP is a field filled with vast complexity, made all the more complicated by the fact that these types of algorithms have no model of the real world. It’s possible to help address some of that shortcoming by developing models that can be incorporated
into these algorithms, for example by defining entities and their related
attributes. An entity is something that exists in itself. It can be real, like a
car, or fictional, like a film or book character (say, Harry Potter). In either
case, it possesses a set of properties, qualities, and attributes that make up
the “thing” it represents. All of these are language independent, though
obviously when an entity is described we do need a language with which to
describe it. As a way of facilitating entity-based search, Google introduced the
Knowledge Graph in May 2012. As discussed in Chapter 3, its aim is to connect
entities with their different properties, qualities, and attributes, along with
associated entities, in order to better understand what users are searching for
and what content will be most useful to provide. Google describes the purpose of
the Knowledge Graph as being “to discover and surface publicly known, factual
information when it’s determined to be useful.” This includes answering common
questions like What is the capital of Australia? and Who is Britney Spears?

NOTE See the following section for a discussion of some of the legal issues involved with using information discovered on third-party websites on your site.

The Knowledge Graph and entities are at the heart of what Google calls the
transition from “strings to things.” While all the information on the web is
data, entities allow Google to better understand how that information fits in
and how accurate it might be. For an entity to be created in Google’s index,
Google needs to index numerous pieces of information about it and understand the
relational connections between them. Figure 7-58 shows an example of a set of
entities and the relationships between them. If you’ve been overfocused on
keywords and not on having subject matter experts create rich, high-quality
content for your site, then entities will have a large impact on how you pursue
SEO. While tasks such as keyword research and getting links to your site remain
very important, you must also have the experience and expertise to understand
the scope and depth of a topic area, and that experience, expertise, and
understanding must show in the content that you create for your site. In
addition, you should pursue holistic strategies to build the reputation and
visibility of your business to create positive associations across the web in
order to establish your business as an entity with authority in those topic
areas.

Figure 7-58. Example set of entities and relationships

This brings SEO and marketing a lot closer than they have ever been, and makes
SEO something that should be part of the DNA of a business rather than a bolt-on
activity that can be picked up and added in as the need arises. So, how do you
begin? What are the guiding principles that you need to have in mind in this new
world of SEO? Funnily enough, the concept is as simple to plan as it is
difficult to apply. It starts with answering a few basic questions: Who is your
audience? What are they looking for that you can provide? Why are you one of the
best organizations to provide it? If you cannot answer these three questions
successfully—that is, in a way that expresses a distinct and unique identity for
your business—then chances are good that neither can Google or your prospective
customers. Your SEO is likely governed by activities that make sense at a
technical level but not at a brand identity one. SEO today is all about
establishing that identity, even—especially, one might argue—from a business
point of view. This is what helps with the formation of entities in the Google
search index. Entities then become high-trust points that help Google’s
algorithms understand the value of information better. As you engage in your
overall digital marketing strategy, keep these four areas of concern in focus:

• Experience

• Expertise

• Authoritativeness

• Trustworthiness

This will help your business find its audience, keep it, and grow. When you’re
demonstrating Experience, Expertise, Authoritativeness, and Trustworthiness
(EEAT), people across the web will recognize that. You’ll attract more links,
you’ll get better reviews, users will spend more time on your site, and in the
process your perceived value as an entity will increase.

Fair Use

As Google provides more and more search results where it presents the answers to the users’ questions based on information that it has sourced from other websites, some publishers have begun to object, as they feel that Google is “stealing” their content and profiting from it. The question becomes whether or not Google’s usage can be considered fair use. According to the US Copyright Office, there are four factors involved in determining fair use:

• The purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes

• The nature of the copyrighted work

• The amount and substantiality of the portion used in relation to the copyrighted work as a whole

• The effect of the use upon the potential market for, or value of, the copyrighted work

There is actually no definitive definition of fair use, but the depth and nature of what you take from the third party and how you use it are important indicators. It’s common practice among those who quote others, or who attempt to make fair use of someone else’s copyrighted material, to provide attribution. However, the US Copyright Office indicates that this might not be enough: “Acknowledging the source of the copyrighted material does not substitute for obtaining permission.” And of course, this is more than a US-only concern, and the laws differ from country to country.

This is an issue that has started to see its day in court. As a result, in October 2020 Google CEO Sundar Pichai pledged that the company would pay $1B in licensing fees to news publishers over the next three years to display their content in Google News Showcase. In addition, the Australian Competition and Consumer Commission found that Google’s dominance of the online advertising industry has led to an environment that harms competition, and part of its response was to create a mandatory code requiring companies like Google and Facebook to pay for use of news content. How this will continue to unfold is not clear, but it will remain a significant issue for Google until a broader resolution is found.

One additional aspect to consider is that public domain information—for example,
the fact that Olympia is the capital of the state of Washington—is not
copyrightable. Information of this type that Google is able to extract from a third-party site is thus not subject to this discussion.

Structured Data

Structured data is the label applied to a number of markup formats that allow Google to better understand the data it is indexing. You can think of it as metadata (data about data) implemented for search engines rather than people. Unfortunately, many SEOs believe that this is a shortcut to better rankings, which it’s not. As mentioned in “CSS and Semantic Markup” on page 250, Google, Microsoft (Bing), and Yahoo! collaborated to develop an independent, W3C-approved way of implementing structured data across the web, known as Schema.org (see the following section for more information). The rationale for this project was that the search engines recognized that understanding the entirety of the web is a highly complex endeavor. Crawling and indexing the web is a gargantuan task, and the engines can use a little help to rapidly identify key elements on the pages they find.

That does not mean, however, that structured data on a website is a ranking signal. It helps in better indexing but does not play a role in ranking. Regardless of whether or not you have Schema.org markup on your web pages, the search engines will still attempt to extract entity information from unstructured data through their own efforts. There are several good reasons for this, including:

Adoption
Structured data markup can be difficult to implement if you do not know any coding.

Accuracy
Since human agents are involved in the markup of data, mistakes will happen.

Consistency
Even when structured data is applied without errors, there are still differences in the categorization of content and confusion over how to best apply semantic identifiers.

Reliability
There will always be a temptation to try to game search by implementing structured data markup in ways intended to boost ranking; Google has a number of manual action penalties that it can apply to remove such spammy results.

The million-dollar question is: is there anything you can do to help Google index your site better if you do not implement structured data markup?

The answer is yes: implement all the search engine optimization tools you have
in your arsenal in a way that makes sense for a human user first, and a search
engine second. Namely, your on-page SEO should help a reader better navigate
your content and make sense of it at a glance. The keywords, synonyms, and
entities you use in your content should do the same. Any links you include, and
the anchor text of those links, must similarly fill in those blanks. Use
Schema.org to help search engines better understand the content of your pages,
and where possible, make use of Google’s Data Highlighter tool for teaching
Google about the pattern of structured data on your site. If you’re running a
brick-and-mortar business, all the relevant information should be included on
your pages, such as your name, address, and phone number. Make use of Google
Business Profile and ensure you have a cohesive presence on the web, whose
effectiveness you can measure. Finally, aim to build lots of positive
relationships on the web that help drive signals of trust and authority back to
your website and business.

Schema.org

Schema.org is best viewed as part of a much larger idea, one that traces its origins back to the foundational concepts of the web itself, and its progenitor, Tim Berners-Lee. In their seminal article in Scientific American in 2001, Berners-Lee, James Hendler, and Ora Lassila described a semantic web that “will bring structure to the meaningful content of Web pages, creating an environment where software agents roaming from page to page…will know not just that [a] page has keywords such as ‘treatment, medicine, physical, therapy’…but also that Dr. Hartman works at this clinic on Mondays, Wednesdays and Fridays.”

Schema.org is arguably one of the most practical, accessible, and successful outcomes of the semantic web movement to date. With the marketing prowess of Google, Yahoo!, Bing, and Yandex behind it, and with the powerful incentive of gaining additional, more inviting shelf space in the SERPs, it’s no surprise that webmasters have embraced Schema.org. And Berners-Lee et al.’s words now read like a prophetic description of the search engine spiders crawling the web and extracting meaning for display in enhanced search results!

At its core, Schema.org is about standardizing and simplifying the process of adding semantic markup to your web pages, and providing tangible benefits for doing so. The most visible such benefits come in the form of rich snippets, such as the star ratings and price range shown in Figure 7-59.

Figure 7-59. A rich snippet in a Google SERP

However, it’s clear that Schema.org markup plays a larger, and perhaps
expanding, role in how the SERPs are constructed. Other benefits that can be
attributed to it include local SEO ranking benefits received from clearly
communicating a business’s so-called NAP (name, address, phone number)
information by marking it up with Schema.org, and even supplying Google with
information that can appear in the knowledge panel and “answer box” results (see
Figures 7-60 and 7-61).

Figure 7-60. Google SERP knowledge panel on Tim Berners-Lee

Before Schema.org, semantic markup was largely the province of academia,
research and development, specialty niche businesses, and others with specific
requirements to exchange and understand data in a deeply meaningful way. With
Schema.org, the local pizza joint can hope to have “5 star reviews” jump off the
search results page; local governments can publicize civic events and have that
information re-presented in the SERPs, providing “instant answers” to searchers;
and the list goes on. With such practical benefits in mind, and with the
simplified approach of Schema.org over its predecessors like RDFa, many people
responsible for building web pages are making the effort to incorporate this
markup into their sites.

Figure 7-61. Google answer box for Tim Berners-Lee query

Schema.org Markup Overview

Schema.org markup communicates the meaning of web pages to computer programs that read them, like search engine spiders. While humans can infer the meaning of words on a page through a number of contextual clues, computer programs often need help to extract such meaning. Let’s walk through a simple example. Imagine you have a page that displays information about the book 20,000 Leagues Under the Sea. You might create such a page with HTML code along these lines:

<div>
  <h1>20,000 Leagues Under the Sea</h1>
  <!-- the image filename here is illustrative -->
  <img src="20000-leagues-cover.jpg" alt="Book cover"/>
  <span>Author: Jules Verne</span>
  <span>Rating: 5 stars, based on 1374 reviews</span>
  <span>ISBN: 978-1-904808-28-2</span>
</div>

After being marked up, the source code might look like Figure 7-62. The
Schema.org microdata markup is highlighted and explained after the figure.

Figure 7-62. Annotated Schema.org markup
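
As a rough sketch of the kind of microdata the figure annotates (the specific tag names and the image filename here are illustrative; the markup follows the standard Schema.org microdata syntax):

<div itemscope itemtype="https://schema.org/Book">
  <h1 itemprop="name">20,000 Leagues Under the Sea</h1>
  <img itemprop="image" src="20000-leagues-cover.jpg" alt="Book cover"/>
  <span>Author: <span itemprop="author">Jules Verne</span></span>
  <div itemprop="aggregateRating" itemscope itemtype="https://schema.org/AggregateRating">
    Rating: <span itemprop="ratingValue">5</span> stars,
    based on <span itemprop="reviewCount">1374</span> reviews
    <meta itemprop="bestRating" content="5"/>
    <meta itemprop="worstRating" content="0"/>
  </div>
  <span>ISBN: <span itemprop="isbn">978-1-904808-28-2</span></span>
</div>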

Line 1: itemscope itemtype=https://schema.org/Book
Adding itemscope to a container element, in this case a <div>, is the way to begin defining an entity. This attribute makes the element the outermost, enclosing type definition for the specified entity. The itemtype=https://schema.org/Book attribute, also added to the element, declares the type of this entity. Together, this makes the entire <div> a container for a single book type entity.

Line 2: itemprop="name"
Adding itemprop to an HTML element defines it as the container for a property. In this case, the property is the name of the book, and the value is the inner text of the enclosing tags, 20,000 Leagues Under the Sea.

Line 3: itemprop="image"
This is similar to the previous itemprop attribute, but in this case the property is an image of the book cover and the value is the URL referenced in the src attribute of the <img> tag.

Line 4
Compare this to line 2. In line 2, the inner text of the element was our exact title. Here, we also have a label ("Author:"), which is not part of the actual author property. To keep our browser display looking the same as the original but omit the "Author:" label from our author property, we use a nested <span> element that encloses only the author’s name and carries the itemprop.

Lines 5 and 6
Our item property in this case is not a simple text string or URL, but rather another item—a schema.org/AggregateRating. It is simultaneously a property of the book (so it uses the itemprop attribute) and a type itself (so it uses itemscope and itemtype, as we saw in line 1 for our outermost book type).

Lines 7 and 8
These lines add properties for the aggregateRating, in much the same way we defined "name" and "author" in lines 2 and 4. Note the careful enclosure of the data with <span> tags so as to include only the data itself, not the surrounding labels, in our property. This is the same technique we used in line 4.

Lines 9 and 10
These itemprops contain information that is needed to provide context for the item rating (namely, that our scale is 0 to 5, with 0 being worst and 5 being best), but which is not displayed on the page. In the previous examples, the values of the properties came from the inner text of the HTML tags. In this case, there is no text to display in the browser, so we use the value of the content attribute of a <meta> element.

Line 12
This code defines the "isbn" property with an itemprop, again using a nested <span> element to keep the display and the data cleanly separated.

Schema.org is a simple idea, with a mostly simple implementation, but getting it right can be tricky. Fortunately, the operating environment is pretty forgiving, and a few tools can help ease the task. Search engines understand that most webmasters aren’t structured data experts with a deep understanding of ontologies and advanced notions of relations, entities, etc. Thus, they are generally quite adept at figuring out what you mean by your Schema.org markup, even if there are errors or ambiguities in how you express it. Clearly you should strive to be accurate, but you should approach this exercise knowing that you don’t have to understand every single nuance of structured data markup or produce perfect markup in order to succeed.

How to Use Schema.org

Let’s talk about the best way to approach using Schema.org. Semantic markup is designed to help you provide meaning and clarity about what your website and each of its pages are about, so you should be clear about this before attempting to add the markup. Think real-world tangible objects, or in semantic markup parlance, entities. For example, if you’re a purveyor of fine linen, your site may have pages related to pillowcases, bedsheets, duvet covers, and so on. Your pages are “about” these entities. If you’re willing to make the common conceptual leap here, you could say these entities “live” on your web pages. Job one is to figure out how to map your entities
to Schema.org’s catalog of “types” (the Schema.org documentation sometimes uses
the word items in place of types). At this level, Schema.org is a large,
ever-growing, and evolving catalog of types that attempts to classify everything
that can be represented on web pages. Let’s take a look at the Schema.org page
for the Book type (Figure 7-63).

Figure 7-63. Schema.org definition of Book

The idea is straightforward: the type definition identifies the key attributes
that you would use to uniquely describe an “instance” (that is, a single,
real-world example) of this type. It may help if you open this page in your web
browser as we discuss it. Note that the Schema.org definitions are frequently
reviewed and updated based on active user feedback, so you may even see minor
variations on the current page. The overall structure will likely remain very
similar, however, and the major elements of the page are the same for all item
types. First, note the simple description, confirming that this is, indeed, the
model for a book. Let’s ignore the Thing > CreativeWork > Book breadcrumb for
now; we’ll come back to that later. Next comes a table of properties—what we
might think of as the attributes that uniquely describe our individual
entity—which, in this example, are the things that describe the book 20,000
Leagues Under the Sea. Each property has a name (the Property column), an
expected type, and a description. The expected type tells us whether this
property is simply a text value (like a name), or something more complex; that
is, a type itself. For example, the illustrator property should contain not the
name of the illustrator, but a full person entity, using the Schema.org Person
type definition (which, as you would expect, itself contains a name property,
and that’s where you’d include the illustrator’s name). As you begin examining a
possible mapping of your entities to Schema.org types, you’ll often encounter
this nesting of types within types. While many of the properties of an entity
are simple descriptions (text strings like "blue", "extra large", or even "Jan
17, 2023"), others are more complex and are entities in their own right. This is
analogous to the concept of composing larger-scale things from a collection of
smaller ones, as in describing a car (a single entity in its own right) as being
made up of an engine, a chassis, wheels, interior trim, and so on (all entities
themselves). Extending this idea further, to an auto mechanic, an engine—a
component of a car— is itself a composite thing (an entity made up of other
entities). To understand the engine in more detail, it’s important to break it
down into its component entities, like carburetors, spark plugs, filters, and so
forth. Schema.org, then, is a set of conventions for modeling complex things in
the real world and marking them up in a way that search engines can consume,
enabling a deeper understanding of web pages. This deeper understanding in turn
leads to many current and future benefits when the search engines subsequently
present that data back to users in compelling, contextually relevant ways.
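
To make this nesting idea concrete, here is a minimal, hypothetical JSON-LD sketch of a Book entity (the illustrator name and other values are illustrative, not taken from the book’s own sample markup). Note how the illustrator property holds a complete nested Person entity rather than a bare text name:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Book",
      "name": "20,000 Leagues Under the Sea",
      "author": {
        "@type": "Person",
        "name": "Jules Verne"
      },
      "illustrator": {
        "@type": "Person",
        "name": "Example Illustrator"
      },
      "datePublished": "1870"
    }
    </script>
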
There’s one more preliminary concept we should cover; it seems complicated at
first but isn’t once we break it down. One thing you’ll notice as you browse
Schema.org’s types is that each one lives within a hierarchical family tree. We saw this
earlier with the breadcrumb on the Book page, shown again in Figure 7-64.

Figure 7-64. The “inheritance” hierarchy for the Book type

It’s important to note that this kind of hierarchy, referred to among computer
scientists as inheritance, is different from the composition hierarchy example
we discussed earlier (a car made up of an engine and other smaller entities).
The Schema.org type hierarchy is a way of categorizing things from most generic
to most specific—what we call an ontology. Its form is similar to the well-known
animal kingdom charts we’ve all seen, or the myriad other classification schemes
we all tend to take for granted, which are often represented on web pages with
features like breadcrumbs, navigation menus, and faceted navigation filters. The
key point to remember here is that when choosing the Schema.org types to model
your entities, it’s always best to choose the most specific type you can. That
is, choose Restaurant over LocalBusiness if you’re operating a restaurant,
choose Book over CreativeWork for books, and choose HighSchool over
EducationalOrganization for high schools. Doing so ensures you are giving the
most specific information possible to the search engines, rather than settling
for generic descriptions. With that background covered, let’s run through the
general plan for adding Schema.org markup to your website. Here are the six major steps:

1. Determine the Schema.org types that best describe the entities represented on your web pages, which may be different for each of your different page archetypes.

2. For each page archetype you’re modeling, perform a detailed mapping of the information elements displayed on the page to the Schema.org type properties.

3. Choose the approach you will use to express the Schema.org markup.

4. Edit the HTML document templates, or update the CMS settings, or modify the scripts—whatever best describes how your pages are generated—to incorporate the Schema.org markup.

5. Test the markup to see if your syntax is accurate, and if you’ve properly modeled complex entities.

6. Monitor how well the search engines are consuming your structured data, and whether and how that data is being presented in the SERPs.

Let’s look at these one at a time, in more detail.

Step 1: Determine Schema.org types

In this step, you think carefully about which
web pages you want to mark up while simultaneously browsing the Schema.org
website for appropriate types (actually, the browsing capability is fairly
limited as of the time of this writing, so you might be better off searching for
types using the search box displayed at the top of each page on the site; see
Figure 7-65).

Figure 7-65. The Schema.org search box

For example, if our website is about community theater groups and includes one
page for each group along with a list of their upcoming performances, we would
begin by searching https://schema.org for something like theater. The resulting
page should look something like Figure 7-66. Scanning the results, we quickly
spot TheaterGroup as a likely candidate for the type of our main entities.
Taking a closer look at the TheaterGroup page, shown in Figure 7-67, we can see
a few of our core concepts at work:

• TheaterGroup is part of a logical hierarchy, starting with the most generic type (Thing—actually the topmost ancestor of all Schema.org types), then proceeding to more and more refined types: Organization, PerformingGroup, TheaterGroup.

• A TheaterGroup is composed of many elements (called properties), some of them simple, like the name of the group, and some of them actual types in their own right (such as address, aggregateRating, employee, etc.).

Examining the list of properties confirms our belief that this is the best type for describing our local theater entities on our web pages.

Figure 7-66. Search results for “theater” on Schema.org

Figure 7-67. The Schema.org TheaterGroup type
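
To make step 1 more tangible, a page for one such group might ultimately be described with markup along these lines (a hypothetical sketch expressed in JSON-LD for readability, even though the choice of format comes later, in step 3; the group name, URL, and address are invented):

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "TheaterGroup",
      "name": "Example Community Players",
      "url": "https://www.example.com/example-community-players",
      "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Main Street",
        "addressLocality": "Anytown",
        "addressRegion": "CA",
        "postalCode": "90210"
      }
    }
    </script>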

It’s during this step that you want to deal with the question “What is this page
about?” and choose the Schema.org type that best describes the overall contents
of the page. Often this choice is obvious, but at times it can be tricky. For
example, for a page with a product for sale, should you choose the Offer or
Product type to model the page? Examining both pages on the Schema.org website,
you can see that an Offer has a property called itemOffered, with an expected
value of Product. This means that you can describe the contents of the page as
an Offer (Schema.org’s concept of something for sale), where the item for sale (the Product) is contained within the Offer,
using the itemOffered property. Alternatively, you could use the Product type,
which has a property called offers that can, as you might expect, contain one or
more Offer types. The decision probably depends on the overall purpose of the
page. If the page is a detailed product page describing many attributes of the
product, and the offer information is just a piece of that, it probably makes
sense to model the page as a Product and include the offer information in the
itemOffered property. However, it’s not out of the question that you could
invert this model. Either of the approaches is valid, as both convey the meaning
that you want—but take another look at Offer. You can see that it is a complex
concept, and it has many properties that are themselves types (for example,
aggregateRating). Other complex nestings of types and attributes can easily
arise, and it’s important to model these in the way that best matches the
meaning of the page. The best approach often won’t be obvious at this stage of
analysis, so you may need to revisit your thinking after you complete step 2 of
the process.
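
The two alternatives can be sketched in skeletal JSON-LD as follows (the product name and price are hypothetical). The first snippet models the page as a Product that contains an Offer via the offers property; the second inverts the relationship, modeling the page as an Offer that contains the Product via itemOffered:

    <!-- Page modeled as a Product containing an Offer -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Example Duvet Cover",
      "offers": {
        "@type": "Offer",
        "price": "79.99",
        "priceCurrency": "USD"
      }
    }
    </script>

    <!-- Page modeled as an Offer containing the Product -->
    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Offer",
      "price": "79.99",
      "priceCurrency": "USD",
      "itemOffered": {
        "@type": "Product",
        "name": "Example Duvet Cover"
      }
    }
    </script>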

Step 2: Map Schema.org properties to elements on the web page

The first task
here is to survey the various data elements displayed on the web page and match
them up with the Schema.org types and properties you selected in step 1. In this
step, you may discover relationships that resolve some of the potential
ambiguities from step 1. For example, continuing the Product/Offer discussion,
let’s assume that one of the items displayed on the page is an overall
rating—say, a value on a scale of 1 to 5— representing user evaluations of the
product. We notice that both Product and Offer have a property called
aggregateRating, so this hasn’t quite settled our debate on which type to model
the page on. Let’s also assume that we display several different prices, perhaps
for new or used versions of the product, or with different shipping options or
different currencies. It now starts to become obvious that we should model the
entire page as a Product that contains multiple Offers and a single
aggregateRating that applies to the Product itself. Finally, things are starting
to take shape! You might notice that there are properties defined on the
Schema.org type that you’re not currently displaying to browsers, but which you
have access to. Continuing with our Product example, perhaps your web
application’s database stores the MPN (manufacturer’s part number), but you
don’t choose to display that on the page. What should you do?

Ideally, you want a very high degree of consistency between what you mark up and
what’s visible to “normal users” via web browsers. Technically, there are
mechanisms that allow you to communicate to the search engines metadata about
your entities that shouldn’t be displayed to users (we saw this earlier in our
aggregateRating example, and we’ll explore that example a bit more momentarily).
However, it’s important to use these mechanisms sparingly, and not be tempted to
stuff a lot of extra data into the Schema.org markup that is not visible to
human users. In the MPN case, our choice should be between adding this as a
visible element on the page (and then of course adding it to our Schema.org
markup), or forgoing it entirely. As you think about this, it should become
clear that marking up a lot of data that is not displayed to the user is
conceptually something a spammer might do, and for that reason the search
engines frown on it. What are the valid reasons for marking up nondisplayed
data? Usually it’s because you need to convey some different context that is
obvious to people, but not to search engine spiders. For example, when you mark
up an aggregateRating, you’re strongly encouraged to specify the scale—if you
display 4 stars for a review on a scale of 0 to 5, this is usually quite clear
in the visual representation, but it needs to be stated explicitly in the
Schema.org markup. Thus, aggregateRating entities have worstRating and
bestRating properties, and you’ll want to supply the values 0 and 5,
respectively, corresponding to your star rating scale. We saw this in the sample
code for our book in the overview at the beginning of this section. Upon
completing this step, you should have a complete mapping between the data
displayed on the page and the various Schema.org types and properties that make
up your model. Your model may be simple or complex, with multiple levels of
nesting. It’s best to make all these decisions before you begin actually adding
the markup on the page.
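
Continuing the hypothetical product example, the completed model from this mapping step might look roughly like the following in JSON-LD (all names and values are placeholders). Note the explicit worstRating and bestRating on the AggregateRating, and the multiple nested Offers for the new and used versions:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Example Duvet Cover",
      "aggregateRating": {
        "@type": "AggregateRating",
        "ratingValue": "4.2",
        "reviewCount": "87",
        "worstRating": "0",
        "bestRating": "5"
      },
      "offers": [
        {
          "@type": "Offer",
          "price": "79.99",
          "priceCurrency": "USD",
          "itemCondition": "https://schema.org/NewCondition"
        },
        {
          "@type": "Offer",
          "price": "49.99",
          "priceCurrency": "USD",
          "itemCondition": "https://schema.org/UsedCondition"
        }
      ]
    }
    </script>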

Step 3: Choose your implementation technique

For most people, this step means
“go mark up the page.” Sounds simple, right? And for some pages, especially
those that are template-driven with mostly static data, it should be fairly
straightforward. Or, if you’re lucky enough to be using a content management
system or publishing platform that has built-in support for Schema.org, you can
do most of the actual implementation by setting a few configuration parameters.
For other, more dynamic sites that generate their pages through a complex
pipeline of page generation programs, tweaking things to insert the right tags
in the right places can be far more difficult. Validating that the generated
schema is correct is also challenging for such sites, as the Schema.org markup
may be sporadically injected into kilobytes of code. Google supports three
formats—JSON-LD, microdata, and RDFa—but recommends that you use JSON-LD.

Google recommends this format because the nesting structures on web pages can be
quite complex, and therefore it can be difficult to correctly insert microdata
or RDFa code within them. This leads to a relatively high rate of errors. With
JSON-LD, however, the Schema.org markup is placed in the head section of the web
page and is therefore independent of the nesting structures within the page. If
you use JSON-LD, you simply need to implement the Schema.org code within the
head section of your site. If you don’t use JSON-LD, the primary implementation
technique is to edit templates and/or modify page generation programs to insert
the microdata or RDFa markup as needed to produce the desired final output. The
key is to have a clear mapping of the model from step 2 showing the final
desired HTML with microdata or RDFa markup inserted, and use this to validate
that the final page produced by the web server matches the model. As you’ll see
in step 5, there are some tools that can help with this verification as well.
For those without access to the code or backend systems, or who want a simpler
approach, Google offers the Structured Data Markup Helper as part of Google
Search Console. This proprietary Google tool allows you to annotate a page,
using a point-and-click editor (see Figure 7-68). It’s just an alternative way of
providing the same data you provide via JSON-LD, microdata, or RDFa markup, but
you’re feeding it directly to Google and do not change the page’s source code at
all.

Figure 7-68. Google Structured Data Markup Helper

So why doesn’t everyone do this? There are two good reasons why this isn’t the
best fit for everyone. First, the information is available only to Google, not
to

SCHEMA.ORG

343

other Schema.org-aware search engines (or other applications that may make use
of Schema.org markup). Second, as is often the case with this kind of tool, the
visual editor is more limited than the markup syntax in its ability to express
rich and complex information.
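
Whichever route you choose, it can help to see the two main syntaxes side by side. The JSON-LD sketches shown earlier live in a single script block, typically in the head; the same hypothetical product fragment expressed as microdata is woven directly into the page’s visible HTML:

    <div itemscope itemtype="https://schema.org/Product">
      <h1 itemprop="name">Example Duvet Cover</h1>
      <div itemprop="offers" itemscope itemtype="https://schema.org/Offer">
        <span itemprop="priceCurrency" content="USD">$</span>
        <span itemprop="price" content="79.99">79.99</span>
      </div>
    </div>

This contrast also illustrates why inline markup errors are more common on complex pages: every property must be attached to the correct element inside an often deeply nested template.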

Step 4: Implement the changes to generate the target Schema.org code

This step
is really just saying, “Now it’s time for your web developers to breathe some
life into your creation”; that is, go get these pages served up by your web
server! This is where the content management system is tweaked, the templates
are updated, the page production programs are modified, and so on.

Step 5: Test

When you reach this stage, your web server is shooting out bundles
of HTML with tidy bits of Schema.org markup embedded in it that add meaning and
structure to your data. At this point, the generated markup should be
syntactically correct and should express the right model—that is, the whole
composition of smaller properties and types into larger properties and types
needed to accurately model the information displayed on your pages. Of course,
it’s important to verify this! The hard way to do that is to examine the
generated code by hand, looking for the opening and closing tags and ensuring
that all the data is there, nested properly. Fortunately, there’s an easier way
(though you should still be prepared to roll up your sleeves and dig into the
code to debug potential problems). The easier way is to use one or more of the
tools available to verify your markup. The most important of these is Google’s
Rich Results Test tool (this supersedes the earlier Structured Data Testing
tool, which has now been retired). The Rich Results Test tool is an elegant
utility that examines your page (either directly by supplying a URL, or
alternatively by cutting/pasting HTML source code) and gives you feedback on the
structured data it finds. Figure 7-69 shows such a result. Note that you can
find more information on the warnings the tool produces by clicking on those
lines; the tool will then provide you with specifics on the nature of the error.

Figure 7-69. Sample output from Google’s Rich Results Test tool

Step 6: Monitor

Finally, monitor the rich results that you have earned, as well as the ones that you wish to obtain, to detect changes as they occur. This will allow you to see if problems have emerged or if Google has made algorithm changes. If either of those occurs, you can do further research to see what happened. You can do the following:

• See if Google has announced changes in how it is treating some aspects of Schema.

• Read industry blogs and forums to see if others have seen similar changes in their rich results.

• Test your page in the Rich Results Test tool to see if your Schema is still valid.

These steps will help you determine what happened and what your next steps are.

Summarizing Schema.org’s Importance

As you’ve seen, Schema.org is a standard for
providing search engines (and potentially other applications) with structured
data describing the meaning of website content. The notion of data structuring
is actually quite intuitive, and maps well to the way we commonly categorize
various types of collections of related items. This logical, webmaster-friendly
approach has led to rapid adoption of Schema.org by the webmaster and content
production communities. Currently, the most common ways to structure data with Schema.org are to add JSON-LD or microdata markup to HTML documents. Search engines use this
data to extract meaning and enrich SERPs with rich snippets, answer boxes, and
knowledge panels, providing a more relevant and deeper search result.
Implementing Schema.org can bring these benefits to both users and publishers
today, and can help set the stage for publishers to gradually delve more deeply
into the emerging world of semantic search in the coming years.

Google’s EEAT and YMYL

In Chapter 3 we talked briefly about the concept of
Expertise, Authoritativeness, and Trustworthiness (EAT), which rose to
prominence after Google’s Medic update was released in August 2018. Google asks
its human raters to use this concept—which was expanded to EEAT, with the first
E representing Experience, in December 2022—as a measure of the quality of a
search result, with instructions for how to do that provided in the Search
Quality Raters Guidelines (SQRG). The SQRG is a regularly updated manual for
training the human evaluators of the quality of Google’s search algorithms. The
raters are shown sets of actual web pages and tasked with rating them according
to the standards set forth in the guidelines. Google engineers then analyze
these ratings to evaluate the effectiveness of proposed changes to the search
ranking algorithms. It is important to emphasize that the ratings of the Search
Quality Raters do not in any way influence actual search rankings. Rather, the
findings from their ratings are used in aggregate to provide feedback to Google
engineers that they can use to improve the overall algorithms.

NOTE
Although the guidelines used to be confidential, Google began making them public in 2015. Evaluating the changes in each revision of the SQRG can be useful in discerning how Google engineers are thinking about search result quality and how it can be improved.

From Google’s perspective, that’s all that EEAT is: something that the Search
Quality Raters evaluate. It does not directly relate to what Google uses its
algorithms to measure. However, there are many other aspects of your site’s
experience, expertise, authoritativeness, or trustworthiness that Google can
measure. For example, if your site has many factual errors, or sells ads to
questionable businesses, that could potentially hurt your rankings, as in either
case Google may be less likely to send visitors to your site. Figure 7-70 shows
what Google’s Danny Sullivan had to say about whether EAT is a direct ranking
factor (his tweet precedes the addition of Experience to the acronym, but the
concept is still the same and this comment still applies to EEAT).

Figure 7-70. Danny Sullivan on EAT as a ranking factor

In a 2019 white paper, Google further stated: Our ranking system does not
identify the intent or factual accuracy of any given piece of content. However,
it is specifically designed to identify sites with high indicia of experience,
expertise, authoritativeness and trustworthiness. So, while there is no such
thing as an EEAT “score” that is used as a direct ranking factor, it appears we
can say with a good degree of certainty that Google has an explicit goal of
promoting pages and sites that best match up with the stated aims of EEAT. Why
isn’t EEAT itself a ranking factor? It’s important to understand that Google’s
definition is exactly as specified in the SQRG and doesn’t include the many
related concepts that could potentially be indicative of a site or page’s
experience, expertise, authoritativeness, or trustworthiness. That specific
definition can’t readily be measured directly by Google’s algorithms, so instead
it uses things it can measure—what Sullivan called “proxy signals”—to come as
close as it can to what humans would recognize as sites and content with high
EEAT. A second, closely related concept has to do with so-called Your Money or
Your Life (YMYL) sites. These are websites that deal with critical life issues,
such as healthcare, financial management, and insurance, where bad or incorrect information can have a significant negative impact on a user’s life.
For that reason, it’s even more important that the EEAT of these sites be at a
very high level. From a practical perspective, you should take steps to
demonstrate EEAT, including:

• Publish content with differentiated added value. “Me too” content that adds nothing new to a topic area is not going to cut it.

• Create content that benefits from your direct experience, be it using a product, visiting a restaurant, developing creative new programming algorithms, or something else. Bear in mind that your potential customers probably want to understand what makes you (or your organization) unique and special, and if all your content is created by generative AI, you aren’t creating anything special or unique.

• Proactively develop strong relationships. Strive to maintain a highly positive, highly visible profile in your industry (bonus points for being a thought leader in your market area). Among other things, this will likely result in earning links from highly trusted sources that have high EEAT themselves.

• Treat your customers in a way that earns strong reviews from them, in Google Business Profile and across other review-oriented platforms that are relevant to your business.

• In addition to ensuring that your content is correct when you publish it, make sure you keep it up to date.

• Identify your authors. Especially for information or advisory content, you should identify the creator of that content and present their bio. These should be subject matter experts whose experience and expertise reflect well on your brand. We’ll talk more about the issue of author authority in the following section.

• Provide clear sources and credit. Any time you refer to third-party information, you should cite the sources of that information in your content. Providing links is a great idea.

• If you sell ads or have affiliate partners, these need to be fully disclosed. Users on your site should be fully informed about any relationships you have that might influence the content that you’re offering them. You should do this even if those advertisers have no influence over your content.

• Moderate any user-generated content (UGC). UGC is an area of high risk unless it is closely monitored to prevent spam or other types of misinformation from ending up on your site.

• Be careful with your advertisers and who you link to. Any links from your site should point only to authoritative, high-quality sites.

Don’t consider this an exhaustive list of the considerations that you should
have in mind when thinking about EEAT. Focus your mindset on crafting a strong,
positive presence in your marketplace. Provide value and be valued.

Author Authority and Your Content

For many years now Google has shown an
interest in the concept of author authority. With the addition of EEAT to the
Search Quality Raters Guidelines, we might better say Google seems interested in
not just the authority, but the experience, expertise, authoritativeness, and
trustworthiness of the authors of web content. One of the most interesting
insights into how Google thinks of this topic was the three-year experiment
known as Google Authorship. Google Authorship was a program that allowed online
authors to identify and verify their content with Google. This was accomplished
by a two-way link between the author’s content across the web and their profile
on Google+, Google’s now-defunct social network. While the Authorship program
was officially discontinued on August 28, 2014, it is likely that Google’s
interest in the value of the authority, trust, and reputation of an author in a
given topical area continues. The roots of Google Authorship lie in a patent
originally granted to Google in 2007 for something called agent rank. The patent
described methods whereby a search engine could identify distinct “agents”
(which could be the author or authors of a web document) and assign a score to
each agent that could then be used as a factor in search rankings. The company
didn’t appear to do anything with this patent until June 2011, when Google’s
Othar Hansson announced in a blog post that it would begin to support the use of
the HTML5 standard rel="author" and the XFN standard rel="me", and that
webmasters could use that markup to identify authors and author profiles on
their sites. The next major step in Authorship came just 21 days later, when
Google unveiled its new social network, Google+. Google+ provided personal
profiles that Google could use to verify authors using the rel="author" markup
on web pages. This intention was confirmed in a YouTube video by Othar Hansson
and Matt Cutts published on August 9, 2011, titled “Authorship Markup”. In the
video, Hansson and Cutts explained that Google wanted web authors to have
Google+ profiles, and that they should link from the “Contributor To” link
sections of those profiles to each domain where they published content. Over
time, Google offered several options by which a publisher could confirm the
relationship by linking back to the author’s Google+ profile.
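
For historical context, the markup itself was simple. A sketch of what a publisher page and an author’s own profile page might have contained during the Authorship era (the URLs are placeholders, and this markup no longer has any effect on Google’s search results):

    <!-- On an article page: link the byline to the author's profile -->
    <a href="https://plus.google.com/[profile-id]" rel="author">Jane Example</a>

    <!-- On the author's profile or personal site: assert "this is me" -->
    <link rel="me" href="https://plus.google.com/[profile-id]">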

In that video, Google confirmed that there could be rewards to authors who
implemented Authorship markup. The immediate possible benefit was the potential
for an author’s profile image and byline to be shown with search results for
their content; additional potential benefits mentioned by Hansson and Cutts were
higher search rankings and the fact that Google might be able to use Authorship
to identify the original author of a piece of web content, thus giving that
author’s copy precedence in search over scraped copies. Over time, Google added
several tools and features to make Authorship easier to implement and more
useful for authors and publishers. This was probably the result of the problems
the company saw with a lack of adoption of this markup. The first major hint
that Google might be pulling back on its Authorship experiment came in October
2013, when AJ Kohn revealed that Othar Hansson had left the Authorship team and
was not being replaced. In that same month, Matt Cutts revealed that Google would soon be cutting back on the number of Authorship rich snippets shown in search, as its tests had shown that doing so improved the quality of those results. Cutts’s words proved true in December 2013, when observers noticed a 15% reduction in the number of author photos being shown for most queries. In
June 2014 Authorship was further reduced in search as Google announced that it
would no longer show author photos in results, just bylines. The only announced
reason for this was to bring its mobile and desktop user experiences more into
sync. However, only two months later, as previously noted, Google announced the
complete removal of Authorship data from search and stated that it would no
longer be tracking any data from rel="author" links. The Google Authorship
program, or at least any program based on rel="author" links and showing rich
snippets in search results, was over.

Why Did Google End Support for rel=“author”?

In his official announcement on the
end of the Authorship program, John Mueller of Google Webmaster Central said,
“Unfortunately, we’ve also observed that [Authorship] information isn’t as
useful to our users as we’d hoped and can even distract from those results. With
this in mind we’ve made the difficult decision to stop showing authorship in
search results.” He went on to elaborate, saying that this decision was based on
user experience concerns. After three years of testing, Google was no longer
seeing any particular user benefits from showing Authorship results. Mueller
said that removing the Authorship results “did not seem to reduce traffic to
sites.” It would seem, then, that searchers were no longer viewing these results
as anything special.

What else may have factored into the decision to stop showing Authorship
results? In his post Mueller mentioned that he knew that Authorship “wasn’t
always easy to implement.” Could it be that low implementation rates by most
sites fed Google’s decision? If Google were ever going to rely on Authorship as
a signal for search, it would need to have data from a wide variety of sites. In
a study published just after the ending of Authorship, Eric Enge confirmed from
a sampling of 150 top publishing sites that Authorship implementation was indeed
low. He found that 72% of these sites had attempted Authorship markup in some
way, but out of those nearly three-quarters had errors in their implementation.
Even worse, 71% of the 500 authors sampled from those sites had done nothing on
their side to implement Authorship. This combination of high error rates and low participation was likely also a contributing factor in Google’s decision. Google
may have learned that data you want to use as a ranking factor can’t be
dependent upon voluntary actions by webmasters and authors.

Is Author Authority Dead for Google?

Does the end of rel="author"–based
Authorship mean Google has lost all interest in understanding, tracking, and
making use of data concerning the authority levels of online authors? Most
likely not. Shortly after this program was ended, on September 2, 2014, Google
was granted a patent for a system that would retrieve, rank, and display in
search authors considered authoritative for a topic based on their relationship
(in social networks) to the searcher. Also, Google spokesperson Matt Cutts often
spoke during the last year of Google Authorship about his interest in and
support for Google eventually being able to use author reputation as a means of
surfacing worthwhile content in search results, though he noted that he saw it
as a long-term project. While Cutts seemed to be voicing his personal opinion in
such statements, it is doubtful that he would speak so frequently and positively
about the topic if it weren’t actually active at Google. Another area that seems
to support the notion that Google has not given up on author authority is the
importance of entities to Google. We know that Google invests heavily in identifying various entities and understanding the relationships
between them. As the original agent rank patent makes clear, authors of web
content are certainly a useful type of entity. Google understands that real
people often evaluate authority and trustworthiness based not just on a
document’s contents or what links to it, but on the reputation of the author.
Semantic search at its simplest is a quest to enable Google’s search algorithms
to evaluate the world more closely to the way people do. Therefore, it makes
sense that Google would continue to pursue the ability to evaluate and rank
authors by the trust and authority real people place in them for a given topic.

However, for most SEOs the removal of Authorship rich snippets from Google’s
search results was seen as the end of the author authority project, and interest
in it waned—that is, until the EAT concept made its way into the Search Quality
Raters Guidelines.

Author Authority and EEAT

As mentioned in “Google’s EEAT and YMYL” earlier in this chapter,
EAT (Expertise, Authoritativeness, and Trustworthiness) made its first
appearance in the July 2018 update to the SQRG, where it featured heavily in
most sections having to do with content quality. In December 2022, the second E
(for Experience) was added to the concept. One major application of the EEAT
principle was related to content authors: Search Quality Raters were told to
take notice of the author of a piece of content and given several methods of
determining the likely EEAT of that author. Google provides its Search Quality
Raters with three main guidelines for making these determinations: • A clearly
identified author for a piece of content • Biographical information about that
author, either on the page or elsewhere on the site • The number of citations
for that author (i.e., the other places on the web where they have been
published and the credibility and reputation of those sites) One of the proxy
signals Google might use to measure this could be the byline of an authoritative
author associated with the content on a page. How would that work? As discussed
earlier in this chapter, Google has invested heavily in better understanding
entities and the relationships between them. An author is an entity, and if
factors can be determined that establish the likely unique identity of a known
author, then an algorithm can construct a web of connections that attest to that
author’s experience, expertise, authoritativeness, and trustworthiness. The only
questions that remain are to what extent Google is able to identify all of this
information at the scale of the web, and to what extent it is making use of that
information. We can’t answer either of those questions with certainty, but the
evidence suggests it is likely that Google is making some attempt at determining
author authority and EEAT, even if its capabilities are limited. One possibility
is that Google may for now be using author authority, in whatever form it is
able to determine that, only (or mostly) for YMYL sites that provide information
that could have a significant effect on the finances or physical well-being of
real humans.
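
Although rel="author" is no longer consumed, Schema.org offers a current, machine-readable way to make a byline explicit. Here is a minimal sketch of an article page identifying its author and pointing to that author’s profiles via sameAs (the names and URLs are hypothetical, and this is not presented as a guaranteed EEAT signal):

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Article",
      "headline": "Example Article Title",
      "author": {
        "@type": "Person",
        "name": "Jane Example",
        "url": "https://www.example.com/authors/jane-example",
        "sameAs": [
          "https://www.linkedin.com/in/janeexample",
          "https://twitter.com/janeexample"
        ]
      }
    }
    </script>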

Author Authority Takeaways

It appears that Google remains interested in the
concept of author authority as a factor in search rankings, and it’s likely that
it’s working on methods to identify and evaluate authors in an automated way using machine-detectable signals that help its
algorithms understand the EEAT of a given page or site. Here are some tips
on how to build author authority for yourself (if you’re a content creator) or
your site:

Publish with real names. To build author authority, search engines have to be
able to recognize that multiple pieces of content are connected with a
particular individual. Several of the following tips relate to building your
personal authority both on and offline, so using your real name with your
content is important.

Keep your name consistent. If you’re a content creator, use exactly the same
name for the byline on all your content as well as in all your social profiles.
That will help search engines gain confidence about your identity and make it
more likely that all of your online content will be used to evaluate your
authority. Publishers should make sure they include author bylines and use the
names their authors are best known by.

Cross-link your profiles. Wherever possible, create links between all your
online profiles. This is another way to help search engines have more confidence
in your unique identity.

Link your social profiles to your content. Wherever possible, create links from
your social and site profiles to the sites on which you publish content.
Publishers, include links to your authors’ social profiles either with their
content or on their bio pages on your site.

Cite original sources. Ensure that claims made in your articles link back to the
source(s) of the claim, such as a research paper or survey. This shouldn’t just
be a circular link back to another article on your site that reiterates the same
claim.

Have bio pages for all your authors. Create a detailed biography page for each
regular content contributor to your site. Include as much detail as you can
establishing their qualifications to write about their subject(s), including
links to content they’ve published on other prominent and relevant sites, and
links to any sites stating their qualifications, such as a university bio page
or a LinkedIn profile.

Produce content about all aspects of your field. More and more, we’re seeing
indications that Google’s measures of a website’s topical authority take into account how complete and well-rounded the content is. It’s no longer effective
to merely hammer away at certain long-tail keywords. You need to build
contextually rich content that looks at your topic(s) from all sides. That doesn’t just apply to individual content pieces, but also to the content across
an entire site or across your profile as an author on many sites.

Produce content that goes into depth on specifics of your field. As well as
broadly covering your area(s) of experience and expertise, your content needs to
explore them deeply. That doesn’t mean every piece of content needs to be an
academic paper, or even lengthy, but you should be seeking as often as possible
to produce content that gives a unique perspective on a topic or that goes into
more depth and detail than most other similar pieces on the web.

Cultivate an audience. Every content producer should be as concerned with
building a loyal audience as they are with producing quality content. That means
being active on social networks, for one. Seek to build good relationships with
those who might be interested in your experience and expertise and are likely to
share your content with their networks.

Participate in relevant conversations. Go beyond just broadcasting your content
to participating in relevant online conversations and communities. Doing that
can have multiple benefits. As you contribute to such communities, you get a
chance to display your experience and expertise before a broader audience, some
of whom may start to follow you. That means you are not only growing your
audience (see above), but doing it in places where you are more likely to pick
up followers with high interest in what you do.

Don’t forget real-world opportunities. Attending conferences and networking
events in your field can lead to online connections that help reinforce your
online authority. This is especially true if you are a speaker or panelist at
such events or get interviewed by a media outlet. You can accelerate this effect
by actively inviting people at these events to connect with you online. For
example, always place your primary social profiles prominently in any
presentations you do.

Incubate and promote brand subject matter experts. Publishers should not ignore
the power of individual topical authority used in conjunction with their brands.
Many companies are reluctant to empower individual employees or representatives
to build their own authority, but they miss a real opportunity by not doing so.
People identify with, trust, and connect with real individuals long before they
do with faceless brands. Therefore, wise brands will cultivate SMEs who have a
direct connection with their brand, knowing that the audience and authority
those SMEs build will ultimately reflect back on the brand.

NOTE
A special thanks to Mark Traphagen of seoClarity for his contributions to the author authority portions of the chapter.

Google Page Experience

Google places considerable importance on the user experience of the websites it sends traffic to. Simply put, if a user clicks through to a
web page from its search results, it wants that user to have a good experience
when they get there. If they don’t find what they want or otherwise are not
satisfied by what they do find, that reflects badly on the quality of the search
results provided. Some user experience factors have been part of the Google
algorithm for some time. These include page speed as well as considerations like
whether your site uses intrusive interstitials, delivers all of its content over
a secure connection, and is mobile-friendly. To simplify management of these
diverse signals and to make it simpler to consider and integrate new user
experience–related signals in the future, Google rolled out its page experience
algorithm update in June–August 2021. This initial update focused on mobile
sites, but it was extended to desktop in early 2022. This section will discuss
this update and the related ranking factors, and how to best optimize for them.

Use of Interstitials and Dialogs

Google gives preference to sites that don’t
obscure the main content of their pages with interstitials (overlays that cover
the entire page) or dialogs (overlays that cover a portion of the page). Both of
these can be disruptive to the user experience, as they block the user from
seeing the content they are looking for. These types of overlays are frequently
used for advertising or to prompt users to install an app. Google recommends
using less intrusive banners instead. In particular, it suggests using the
in-app install experience for Chrome or Smart App Banners with Safari. Google
also notes that this ranking factor is not applied to sites that must show
interstitials for regulatory compliance, such as casino sites or sites that sell
adult products like tobacco or alcohol, which show an age verification dialog
before letting users enter. Of course, interstitials and dialogs can be very
effective for advertising, and many sites use them to help generate revenue.
This can still be done safely, provided that you are not overly aggressive about
it. Google’s John Mueller had this to say about Google’s main concern: What
we’re looking for is, really, interstitials that show up on the interaction
between the search click and going through the page and seeing the content.

… What you do afterwards, like if someone clicks on stuff within your website or
closes the tab or something like that, then that’s kind of between you and the
user. Based on this clarification, a best practice is to not load intrusive
interstitials or dialogs during the initial load of a page. The reason that this
is Google’s main concern is that it wants users who click on the search results
it provides to immediately get the content they requested. While Google can’t
require sites to not use interstitials or dialogs, it can give ranking
preference to sites that limit their use to ways it finds acceptable. However,
be aware that Google could change its policies at any time. This might include
penalizing other types of use of interstitials or dialogs, or removing this as a
ranking factor altogether.
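
As an example of the less intrusive approach mentioned above, a Safari Smart App Banner is triggered by a single meta tag in the page’s head rather than by an overlay that hides the content (the app ID below is a placeholder):

    <meta name="apple-itunes-app" content="app-id=123456789">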

Mobile-Friendliness

Another component of the page experience signal is validating that your site is mobile-friendly. One of the easiest ways for a webmaster to do this is to use Google’s Mobile-Friendly Test tool. There are many
elements that Google tests with this tool, and the issues it can flag include
the following:

Text too small to read Google wants sites to present content that is readable by
users without them needing to zoom in. The recommendation is that more than 60%
of the content on the page use a 12-point font or larger.

Clickable elements too close together Mobile devices typically have small
screens, and it can be a bad user experience if clickable elements are placed
too close to each other (making it difficult for users to correctly choose one
of them and not the other). Google recommends that clickable elements have a
touch target size of 48 pixels to make them large enough to avoid this problem.

Viewport not set The meta viewport tag instructs the browser on how to control
the page’s size and scale. If this is not set, your page will likely not fit
properly on users’ mobile devices, leading to a frustrating experience.

Viewport not set to "device-width" If your page is set up using a fixed width,
it can’t dynamically adapt to different screen sizes. Google suggests that you
enable this by using the meta viewport value width="device-width". This will
cause the page to dynamically resize itself to the available screen size on the
user’s device.
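
The commonly recommended form of this declaration, placed in the page’s head, looks like the following (this is standard HTML rather than anything specific to Google’s tooling):

    <meta name="viewport" content="width=device-width, initial-scale=1">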

Content wider than screen If you use images that are designed primarily for
viewing on desktop devices or that do not resize well, or you make CSS
declarations that use fixed values rather than relative width and position
values, you can get this error from the Mobile-Friendly Test tool.

Uses incompatible plugins This error indicates that the page is using plug-ins
that are not widely available in mobile browsers, such as Flash. Google
recommends that your pages should instead use broadly supported approaches to
support animation, such as HTML5. An example of a report produced by the
Mobile-Friendly Test tool is shown in Figure 7-71.

Figure 7-71. A sample Mobile-Friendly Test tool report

TIP
You can also test for mobile-friendliness using Google Lighthouse, a Chrome extension.

Mobile-friendliness is a concern for Google because poor user experiences can
frustrate users, resulting in them being dissatisfied with the pages returned in
the SERPs.

Secure Web Pages (HTTPS and TLS)

Google prefers pages that use secure
communications to transfer a requested web page to a user. This is generally
visible to users because the URL of the web page will start with https://
instead of http://. If users see https:// at the start of the URL, this means
that Transport Layer Security (TLS) is in use. The benefit of this is that it
seeks to ensure that the user’s browser gets the exact same content that was
sent to it by your web server, and not something that has been accessed or
modified in transit.

NOTE
The original protocol used to encrypt these
communications, SSL, did not provide a sufficiently robust level of protection.
For that reason, TLS was developed: it offers a much tighter level of security.
Security is important because the web consists of interconnected networks of
computers, and many other computers may be involved in any transfer of data
between your web server and the user’s browser. This opens up the possibility
that bad actors can intercept your content and modify it before the user sees
it. This is referred to as a “man in the middle” attack.

Providing secure web pages through the use of HTTPS became a ranking signal as
of 2014. This was later incorporated into the page experience signal. In
addition, the Chrome web browser will alert users when they are receiving
content that is not considered secure (this indication can be found in the
address bar of the browser, just to the left of the page URL). Figure 7-72 shows
the types of symbols that Google will show and their meaning.

Figure 7-72. Chrome symbols indicating site security level

Core Web Vitals

Core Web Vitals (CWV) is the name that Google has given to the
factors that relate to page loading behavior, including overall page speed, and
it is the page speed component of the page experience ranking signal. CWV scores
are based on real user data that is obtained from the Chrome User Experience
Report (CrUX). CWV has three major components: Largest Contentful Paint (LCP),
First Input Delay (FID), and Cumulative Layout Shift (CLS). All three of these
metrics must have Good scores for a page to have a Good CWV score, and
consequently to be eligible for a Good page experience rating.

Google evaluates CWV for each page individually when there is enough data to do
so, and otherwise it groups pages, aggregating their data, to evaluate them. In
addition, Google may use the data for the home page (if available in sufficient
quantity) for all pages on the site if other options are not available. It is
not unusual for people to get hung up on the PageSpeed Insights score, which isn’t necessarily terrible, but that score shouldn’t be used as a KPI in place of CWV. In fact, optimizing for it can directly hurt CWV, since PageSpeed Insights strongly encourages removing render-blocking CSS and JavaScript, which can result in very poor CLS if taken to an extreme. Hence, it’s important to learn how to approach these metrics, and the tools used to measure them, appropriately.

Largest Contentful Paint

LCP is a measure of how quickly the main content block in the initial page viewport loads. To put it another way, this factor tracks how fast users get the main body of the content they were looking for from the page. Elements considered as content elements that can make up the largest content element include:

• Images, including background images

• Videos

• Block-level elements containing text nodes (for example, paragraph, heading, and div elements)

The LCP should be complete within 2.5 seconds for at least 75% of users for a page to have a Good LCP. Figure 7-73 provides further insight into how Google grades the various levels of LCP performance.

Figure 7-73. LCP performance metrics

Optimizing for LCP is a highly complex affair. It is unlikely that you will
impact it just by fixing a single element. For that reason, you should take the
time to learn about all the various components that go into LCP. Google’s
web.dev website has a great resource on this. As outlined on that page, there
are four major components that make up your total LCP time:

Time to first byte (TTFB) The time delay from when the user requests the page to
when the first byte of the HTML response is received by the browser.

Resource load delay The time delay from TTFB to when the browser starts loading
the LCP resource itself.

Resource load time The time it takes the browser to load the LCP resource (but
not actually render it).

Element render delay The time delay from when the LCP resource is loaded to when
the LCP resource is fully rendered by the browser. This book does not discuss
these specific areas further, but many online resources exist to help you better
understand what they are and how to improve them.
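
If you want a quick look at where a page stands, LCP can also be observed in the field with a small script. This is a minimal sketch using the standard PerformanceObserver API (supported in Chromium-based browsers); it simply logs candidate values to the console rather than reporting them anywhere:

    <script>
      // Each entry is an LCP candidate; the last one reported before
      // user input is the page's LCP value (in milliseconds).
      new PerformanceObserver((entryList) => {
        for (const entry of entryList.getEntries()) {
          console.log('LCP candidate (ms):', entry.startTime);
        }
      }).observe({ type: 'largest-contentful-paint', buffered: true });
    </script>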

First Input Delay

FID measures how long it takes before the browser can respond
when a user interacts (such as via a mouse click or keyboard input) with a page.
Input delay is typically caused by the browser being busy with a different
activity, such as executing JavaScript. Figure 7-74 shows how a user might
experience this type of delay.

Figure 7-74. First Input Delay

The FID should be no more than 0.1 seconds for at least 75% of users for a page to be rated a Good
experience. Figure 7-75 provides further insight into how Google grades the
various levels of FID performance.

Figure 7-75. FID performance metrics

A long First Input Delay typically occurs between the First Contentful Paint
(FCP) and the time when the page becomes fully interactive, measured by the Time
to Interactive (TTI) metric. During this period, the page has rendered some of
its content but isn’t yet completely interactive.

Cumulative Layout Shift

CLS measures the stability of page content during
initial page load. Instability can occur because resources load asynchronously
or new page elements get loaded in the DOM above content that has already
rendered on the page. For example, the main content may load initially, and then
an ad may be rendered above that content, causing it to appear to jump down.
This can be very frustrating, as the user may be attempting to read that
content, or worse, click on something within the main content; if the ad renders
right where they are clicking they can be taken to a different page than the one
they intended. Figure 7-76 shows how a user might experience this type of delay.

Figure 7-76. Cumulative Layout Shift

The Cumulative Layout Shift should be 0.1 or less for at least 75% of users for a page to be rated a
Good experience. Figure 7-77 provides further insight into how Google grades the
various levels of CLS performance.

Figure 7-77. CLS performance metrics

Here are some tips for improving CLS:

• Always specify size attributes in your image and video elements. As an alternative, you may specify the space required with CSS aspect ratio boxes (see the sketch after this list). This will enable the browser to reserve the proper space for the image/video while it’s still loading. The unsized-media feature policy can also be used to require this behavior in browsers that support feature policies.

• When rendering a page, always proceed in a clear top-to-bottom pattern and avoid rendering content above previously rendered content. As an exception to this, once the page is loaded, user interactions may result in changes in layout; this will not impact your CLS score.

• If your page has animations, use transform animations instead of ones that will trigger layout changes. Ensure that your animations are implemented in a way that provides context and continuity from state to state.

You can learn more about improving CLS on the web.dev website.
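
Here is what the first tip can look like in practice (a minimal sketch; the file paths, dimensions, and class name are arbitrary):

    <!-- Explicit width and height let the browser reserve space before the image loads -->
    <img src="/images/hero.jpg" alt="Example hero image" width="1200" height="800">

    <!-- Alternatively, reserve the space with a CSS aspect ratio -->
    <style>
      .hero-video {
        width: 100%;
        aspect-ratio: 16 / 9; /* box is reserved even before the video loads */
      }
    </style>
    <video class="hero-video" src="/videos/demo.mp4" controls></video>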

How Much of a Ranking Factor Are Core Web Vitals?

The Google Search Central
documentation has this to say about the importance of the page experience
signal: Google Search always seeks to show the most relevant content, even if
the page experience is sub-par. But for many queries, there is lots of helpful
content available. Having a great page experience can contribute to success in
Search, in such cases. In addition, John Mueller addressed the topic in an SEO
Office Hours webinar, stating: The other thing is that the page experience
update is generally quite a subtle update. It’s not something that would kind
of—or shouldn’t make or break a website. So if you saw a significant drop in
traffic from search, I would not assume that it is purely due to kind of [sic]
you having a slower website.

If it’s even too slow to be tested [rendered in Google’s testing tool], that
seems like something to fix just from a user point of view, but again I think
from a search point of view for most of these things I would not assume that
making a website from really slow to really fast would cause a significant rise
in visibility. Like there will be some changes but it’s usually much [sic]
subtle. This was also addressed in a video Google published as a part of its SEO
Mythbusting series. This video was a discussion between Martin Splitt and
coauthor Eric Enge and focused on page speed as a ranking factor. In the video
they had this exchange:

MARTIN SPLITT: “If you have bad content, if you are the fastest website out there but the content is not great, then that’s not helping you.”

ERIC ENGE: “Right. Right. I mean, to get the content you don’t want quickly is probably not what the user is looking for.”

In summary, it’s
important to realize that how well your content addresses the user’s needs as
implied by a given search query (as determined by a combination of its relevance
and quality) is, and will always be, the top ranking factor. In spite of this
type of guidance, many SEO professionals and website publishers assumed that the
rollout of the page experience update meant that Core Web Vitals were now going
to be a large ranking factor. As a result, coauthor Eric Enge and his then
employer Perficient conducted a study to measure the effect of the update. Since
the timing of the rollout was preannounced by Google, Enge and Perficient were
able to set up tracking of rankings and page speed across a wide number of
search queries starting on June 7, 2021 (one week before the rollout began), and
continuing to September 27, 2021 (several weeks after it was completed). The
study monitored the top 20 positions for 1,188 keywords and used Lighthouse
Tools (the Chrome browser extension) to track CLS, FID, and LCP. Tracking of
URLs was performed in two different ways:

1. The first method was based on tracking the URLs ranking in the top 20 positions for each keyword. For each ranking URL, ranking position and CWV scores were tracked. This set of “Global URLs” was used to track how CWV scores improved across each ranking position.

2. In addition, 7,623 URLs that persisted in the same ranking position throughout the study were tracked. These “Control URLs” were chosen because they enabled evaluation of whether any gains seen in the test dataset were driven by general improvements in CWV scores across the tested sites rather than an increase in the weight of CWV metrics as a ranking factor.

The data for how the LCP scores fared is shown in Figure 7-78. The results are shown as the percentage of URLs
returning a Good score. For example, a score of 72% for LCP for position 6 in the SERPs for week 1 indicates that 72% of URLs ranking in
that position during that week across all sampled keywords returned a Good
result.

Figure 7-78. Effect of the rollout on LCP scores (source: Perficient)

If you look at the Week 1 Global line (the red line from the first week of the
test) versus the Week 17 Global line (the brown line from the last week of the
test), you will see that it looks like the average page speed per ranking
position indeed improved. This might lead you to believe that LCP is a strong
ranking factor, but looking only at this data is misleading. You also need to
compare the Week 1 Control line (in light gray) with the Week 17 Control line
(in black). This comparison shows a nearly identical trend to that observed in
the Global group (our test group), which indicates that it’s more likely that
many websites were busily improving their LCP performance during the time frame
of the test, probably in response to Google’s announced rollout of this update.
We see similar results if we look at the FID scores during the course of the
study, as shown in Figure 7-79.


Figure 7-79. The impact of FID on ranking test results (source: Perficient)

The Global versus Control data is even more closely aligned for FID scores than
it was for LCP scores, suggesting that if this is a ranking factor at all, it’s
a relatively small one. To complete the CWV review, Figure 7-80 shows the
corresponding data for CLS. Once again, there is no clear signal that CLS is a
significant ranking factor. Taken as a whole, does this data suggest that Core
Web Vitals are not a ranking factor? No, not at all—it just means that they’re
not a big one. This is consistent with what Google has been telling the SEO and
webmaster communities, and also our intuition that relevance and quality of
content are (and must be) the most important factors.


Figure 7-80. The impact of CLS on ranking test results (source: Perficient)

Using Tools to Measure Core Web Vitals Many free tools are available for
measuring the performance of your web pages. These include three provided by
Google:

Lighthouse Lighthouse is an extension for the Chrome browser. To open the
Lighthouse extension, you need to go into the Chrome Settings menu and select
More Tools→Developer Tools. Then click the >> symbol in the top menu bar and
select Lighthouse. From there you can generate a Lighthouse performance report
that provides detailed information on numerous metrics, including Largest
Contentful Paint and Cumulative Layout Shift. In addition to these metrics,
detailed recommendations are provided on steps that you can take to improve
overall page performance.

Chrome User Experience (UX) Report This tool provides metrics similar to the
other two, in what's commonly referred to as the CrUX report. Its most
significant distinction is that the information is based on actual user data
(collected anonymously) for the web page being tested. This real-world data is
valuable because it takes into account actual user locations, connections, and
devices used, providing a more robust view of the true UX. In contrast, other
tools, such as Lighthouse, generate what is considered “lab” data.


In other words, they perform a real-time test of the submitted web page and
measure the results. Per the About CrUX web page, another factor that makes this
data so important is that “CrUX is the official dataset of the Web Vitals
program.”

PageSpeed Insights Like Lighthouse, PageSpeed Insights returns data on numerous
metrics and suggestions for how to improve page performance. All three CWV
metrics are included in the data tracked. As shown in Figure 7-81, using the
PageSpeed Insights tool requires only that you enter a URL and press the
Analyze button. The figure also shows a sample of the results the tool
generates.

Figure 7-81. A sample PageSpeed Insights performance report
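As a rough illustration of this kind of lab-style testing, the following Python sketch queries the v5 PageSpeed Insights API for a URL and prints a few headline metrics. The endpoint is Google's public API, but the specific audit keys read here are assumptions based on the API's typical response shape and should be verified against the current documentation (an API key may also be needed for anything beyond light use).

# A minimal sketch: pull lab metrics for a URL from the PageSpeed Insights API.
# The audit keys below reflect common Lighthouse audit IDs and should be
# verified against the current API documentation before relying on them.
import requests

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def fetch_lab_metrics(url, strategy="mobile"):
    """Return a few headline Lighthouse metrics for the given page."""
    resp = requests.get(PSI_ENDPOINT, params={"url": url, "strategy": strategy}, timeout=60)
    resp.raise_for_status()
    audits = resp.json()["lighthouseResult"]["audits"]
    return {
        "LCP": audits["largest-contentful-paint"]["displayValue"],
        "CLS": audits["cumulative-layout-shift"]["displayValue"],
        "TBT": audits["total-blocking-time"]["displayValue"],
    }

if __name__ == "__main__":
    print(fetch_lab_metrics("https://www.example.com/"))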

There are also third-party tools available on the market that provide lab data
testing capabilities. Depending on your needs, these may also be helpful to you:
• GTmetrix

• WebPageTest

• Pingdom Website Speed Test

Optimizing Web Pages for Performance The task of improving page loading
performance can bring many benefits. While it may be a small ranking factor, it
can be helpful in scenarios where the gap between your page and a competing one
that is just above it in the rankings is small. If all other signals are nearly
equal, then having a materially faster page may be the extra push you need to
move in front of the competition. In addition, faster pages provide a better
experience for users and even increase conversion rates. As discussed in the
previous section, many of the tools you can use to measure your page speed can
also provide suggestions on how to improve it. An example from Lighthouse is shown in Figure 7-82. Note that in Lighthouse you can click on any
of the rows and get details on specific problematic resources.

Figure 7-82. Lighthouse performance improvement recommendations

This data can be quite helpful in identifying areas to work on. However, these
issues are not always that easy to fix. For example, the required change may be
in code that is embedded within your chosen ecommerce platform, CMS platform, or
JavaScript framework. Making the fix may be extremely difficult or even
impossible, in which case you’ll just have to live with the problem. This
doesn’t mean that you shouldn’t look at these suggestions— there may be various
improvements you can make, so it’s definitely worth reviewing them and finding
those issues that you can fix. However, there are many other areas that merit
investigation as well, which we will cover in the next few sections.

Approach to Rendering Pages As discussed in “Types of Rendering” on page 305,
how you render your pages can impact performance. As a reminder, in the early
days of the web, JavaScript was not used to build web pages; they were instead
built by your web server and then delivered to the user’s browser. This approach
was referred to as server-side rendering. As JavaScript became more popular,
developers started to build pages using client-side rendering, where a
single-page application was initially delivered to the browser along with a
script that would manage the process of requesting additional content from the
web server. While this reduced server load and could reduce hosting costs, it
also slowed down the initial page load times for users and search engines. To
avoid this problem two alternatives were developed, known as dynamic rendering
and hybrid rendering (a.k.a. server-side rendering with hydration). Dynamic
rendering provides server-side rendered content to search engines and client-side rendered
content to users. Google considers this a “workaround” and does not recommend it
as a long-term solution. In addition, dynamic rendering did not really solve the
problem of speeding up performance for users, which is why hybrid rendering is
now the preferred option. With this approach, the majority of the above-the-fold
content is built server-side and then delivered to users. Then additional page
elements that are below the fold are built using rendering that is done
client-side. This provides the user with a faster perceived page load time and
also results in better Largest Contentful Paint scores. At the same time, it can
still provide many of the benefits of reducing server load that come with a
client-side rendering approach.

Server Configuration There are many ways that you can configure your web server
environment to improve performance. Exactly what applies to you depends on how
your site is set up. For example, small websites that get a few thousand
visitors per month have much simpler requirements than large ecommerce sites
with hundreds of visitors per second. It’s beyond the scope of this book to list
all the ways that you can speed up your server here, but some things to consider
include:

Dedicated versus VPS versus shared servers If you are on a shared server, your
site is competing for resources with other sites on the same server. Moving to a
virtual private server (VPS) can help speed up site performance. In addition, as
your traffic grows you can consider moving to a dedicated server.

Server processing power and memory Once you have a dedicated server you can
begin to look at its configuration. This can include upgrading the processing
power of the server and its memory.

Caching Setting up caching of your pages can also speed up performance. If
popular pages are cached, as each user requests them, your server environment
can send them to the user without having to reconstruct them. This is known as
HTTP caching, but there are other types of caching that you can also set up,
such as OpCode caching, memory caching, and application caching.

Gzip compression If your web server has not enabled gzip compression, that’s
something you should do: it compresses the contents of your page before sending
them to the user’s browser.
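Whether caching and compression are actually in effect is easy to confirm from the response headers. The following Python sketch is a minimal illustration, with a placeholder URL: it requests a page and reports the Cache-Control, Expires, and Content-Encoding headers the server sends back.

# A minimal sketch: check whether a page is served with caching headers and
# gzip (or Brotli) compression. The requests library advertises Accept-Encoding
# automatically; the response headers reveal what the server actually applied.
import requests

def check_headers(url):
    resp = requests.get(url, timeout=30)
    print("Status:          ", resp.status_code)
    print("Cache-Control:   ", resp.headers.get("Cache-Control", "(not set)"))
    print("Expires:         ", resp.headers.get("Expires", "(not set)"))
    print("Content-Encoding:", resp.headers.get("Content-Encoding", "(none reported)"))

if __name__ == "__main__":
    check_headers("https://www.example.com/")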


Content delivery networks Depending on how far away your users may be from your
hosting provider (for example, if your site serves users across the globe), they
may experience long delays for the content from your pages to be delivered to
them across the network. CDNs reduce that network delay time by hosting all or
portions of your content on servers closer to your users.

Database server optimizations If your configuration includes database servers,
then the robustness of these servers and how they are connected to your web
server(s) is another area to review.

Defragmenting the database If your database has been in use for a long time, the
data records can become fragmented and slow down performance. Defragmentation
can speed up overall database performance.

Multiple servers As you reach higher levels of traffic you’ll need to move to a
multiple-server environment. These can be set up with load balancers to share
the burden of handling your traffic.

Ecommerce/CMS Selection and Configuration Most sites today use a content
management system or ecommerce platform, as well as some type of JavaScript
platform (generally either a JavaScript framework or static site generator, as
discussed earlier in this chapter). The choices you make can have a significant
impact on the overall performance of your site. For example, React is a
JavaScript framework that can be relatively slow, and Gatsby is an SSG that is
known for being relatively fast. Of course, there are many other considerations
that may impact what platform you choose, but be aware of the impact on speed as
you make your selection. The other important consideration is how you set up
your platform configuration. For example, if you’re using Magento, you need to
turn its caching capabilities on. You also need to review the impact of any
extensions you have installed, as these may be slowing down the overall
performance of the platform.

Analytics/Trackers You may be using Google Analytics, Adobe Analytics, or
another similar program to measure your overall site traffic. One outcome of
using such a program is that every time one of your pages loads it sends some
information off to a remote server where detailed information is kept regarding
the traffic to your site. This communication with the analytics provider's remote server causes some delay in your overall page
load times. The effect of having one such program running is not too bad, and
you should measure your overall traffic, but many sites have multiple different
types of tracking programs embedded. If you use more than one analytics program,
that means time is spent sending data to two different remote servers. Any time
you put a pixel on your page, such as a Facebook tracking pixel, you have added
yet another remote server that your page communicates with every time it’s
loaded. Other ad networks may also require you to add a tracking pixel to your
page. In addition, you may be using other programs to monitor user behavior to
aid with conversion optimization, such as Hotjar, Mixpanel, or Twilio Segment,
among many others. If you have three or more analytics or tracking programs on
your web page you may have too many, and you should look for ways to reduce
their number to streamline performance.

User Location and Device Capabilities Another consideration that you should not
overlook relates to the locations and device capabilities of your users. This
can impact the complexity of the design and implementation of your website.
There are two major areas of concern related to your user base that can impact
your site’s performance:

Location If, for example, your business is based in Germany but it does a lot of
business in Canada, this could be a key factor in performance. You will want to
set up your hosting environment to serve Canadian users pages from servers based
in Canada.

Device type Users can have many different types of devices. For example, some of
your users may have 3G phones, which will give them substantially less bandwidth
for receiving content. In such cases, you will need to design your website to
use smaller page sizes.

Domain Changes, Content Moves, and Redesigns Whenever you make structural
changes to your website, there is a risk that you will confuse the search
engines and harm your search rankings and organic traffic. The types of site
changes that can influence your site’s organic search performance include
changing your domain name, changing your content management system or ecommerce
platform, redesigning your site, altering your site’s architecture and/or
functionality, changing your blog platform, and many others—basically, anything
that fundamentally alters your site's frontend and/or backend visual or functional
elements can potentially influence your organic search performance. In this
section, we will walk through various scenarios. Be sure to refer back to
“Content Delivery and Search Spider Control” on page 270 to review the technical
specifics of options for moving content from one location to another.

The Basics of Moving Content There are very important reasons to move content
properly, all of which can be easily overlooked by inexperienced or hurried
webmasters and development teams. Google groups site moves into two categories:
moves with URL changes and moves without URL changes. It provides specific
guidelines for handling moves within each category. “Moving content” refers to
any situation in which content that used to be located and accessed at one URL
(e.g., https://www.yourdomain.com/pageA) is moved to another URL (e.g.,
https://www.yourdomain.com/products/pageA). One of your goals when you move
content is to make sure users and search engine crawlers that attempt to visit
the old URL (/pageA) are presented with the content from the new location
(/products/pageA). In addition, when you move content from one URL to another,
the links to the old URL will stop providing value to your rankings in the
search engines for that content unless you properly implement a redirect. As
discussed in “Redirects” on page 288, both 301 and 302 redirects cause the
search engines to pass most of the value of any links for the original page over
to the new page, but only a 301 redirect will result in the rapid deindexation
of the old URL and indexation of the new one. Because link authority is a
precious asset, and because you should take control over what page is indexed by
the search engines, you should always use 301 redirects when you move content
permanently.

Large-Scale Content Moves Setting up redirects becomes difficult when changes
affect large quantities of content. For example, when you change your domain
name, every single piece of content on your site will move to a new URL, even if
the site architecture is identical (https://www.olddomain.com... moves to https://www.newdomain.com...). This is challenging because you might have to set up individual 301 redirects for every single page on the site, as in this example:

• https://www.olddomain.com/page1 to https://www.newdomain.com/page1
• https://www.olddomain.com/page2 to https://www.newdomain.com/page2
• ...
• https://www.olddomain.com/page1000 to https://www.newdomain.com/page1000


Unfortunately, some systems still require that these redirects be set up one at
a time, so this could be quite a painful process. Imagine a site with a million
pages! Fortunately, publishers who use an Apache web server (Unix and Linux
servers) can take advantage of the power of Apache’s mod_rewrite module, which
can perform the redirect of every URL on the old domain to the same URL on the
new domain in two lines of code:

RewriteCond %{HTTP_HOST} ^olddomain\.com [NC]
RewriteRule ^/(.*) https://www.newdomain.com/$1 [R=301,L]

The preceding code presumes that you prefer the “www” version as the canonical
URL. You can also use two similar lines of code to specify the “non-www” version
as the canonical URL (see coauthor Stephan Spencer’s article “URL Rewrites &
Redirects: The Gory Details (Part 1 of 2)” for examples without “www” and other
alternative approaches). Another highly popular web server is Microsoft’s IIS.
In many installations of IIS, you will find yourself in a situation where you
have to implement a separate redirect instruction for each page, one at a time.
Fortunately, you can utilize an ISAPI plug-in such as ISAPI_Rewrite, which
enables you to perform large, scalable rewrites in a language similar to that
used by Apache’s mod_rewrite. Both of these plug-ins were discussed earlier in
this chapter, in “Methods for URL Redirecting and Rewriting” on page 289.

Mapping Content Moves Sometimes a site redesign is simply a “reskinning” of the
visual elements of the old site with a new look and feel, retaining the same
technical elements of information architecture, URL and directory names, and
user navigation. Other times, a redesign changes both the visual design and the
technical elements. And sometimes it’s a combination of the two approaches. For
sites changing both design and function, the first stage of planning is to
figure out which content will be moved where, and which content will be removed
altogether. You will need this information to tell you which URLs you will need
to redirect, and to which new locations. The best way to start this process is
by generating a complete map of your information architecture with full URLs.
This might not be as simple as it sounds, but tools are available to help. Here
are some ways to tackle this problem:

• Extract a list of URLs from your web server's logfiles and site architecture documentation.
• Pull the list from your XML sitemap file, if you believe it is reasonably complete.
• Use a free crawling tool, such as Screaming Frog.
• Use tools such as Ahrefs, Majestic, Semrush, and/or Moz's Link Explorer and Google Search Console to get a list of external links to your site. Since each of these tools provides a limited view of all your links, you can get the best results by using multiple tools, combining the results into one list, and then deduping it.
• Check Bing Webmaster Tools' Index Explorer to find all of the crawlable URLs that you may not know still exist on the site.

These tools should help you assemble a decent list of all your URLs. After determining which URLs
have content that will remain on the site, you must then map out the pages that
you want to redirect the “migrating” content to. Additionally, for content that
is being retired, you need to determine whether to redirect those links at all
(it’s a definite yes if the URLs for these pages have many internal and external
links), and if so, what new URLs to redirect them to. One way to do this is to
lay it out in a spreadsheet, which might end up looking like Table 7-1.

Table 7-1. Spreadsheet mapping redirects one URL at a time

Old URL                              New URL
https://www.olddomain.com/page1      https://www.newdomain.com/page1
https://www.olddomain.com/page2      https://www.newdomain.com/page2
https://www.olddomain.com/page3      https://www.newdomain.com/page3
https://www.olddomain.com/page4      https://www.newdomain.com/page4
https://www.olddomain.com/page5      https://www.newdomain.com/page5
https://www.olddomain.com/page6      https://www.newdomain.com/page6
https://www.olddomain.com/page7      https://www.newdomain.com/page7
https://www.olddomain.com/page8      https://www.newdomain.com/page8
https://www.olddomain.com/page9      https://www.newdomain.com/page9
https://www.olddomain.com/page10     https://www.newdomain.com/page10

If you are redirecting a massive number of URLs, you should look for ways to
simplify this process, such as writing rules that communicate what you need to
know. For example, you could abbreviate the list in Table 7-1 to the short list
in Table 7-2.

Table 7-2. Spreadsheet mapping redirects of URLs in bulk

Old URL                              New URL
https://www.olddomain.com/page*      https://www.newdomain.com/page*

Then you can save the individual lines for the more complicated moves, so your
resulting spreadsheet would look like Table 7-3.


Table 7-3. Spreadsheet mapping URL moves individually and in bulk

Individual page moves
Old URL                                        New URL
https://www.olddomain.com/about-us             https://www.newdomain.com/about-us
https://www.olddomain.com/contact-us           https://www.newdomain.com/contact-us
https://www.olddomain.com/press-relations      https://www.newdomain.com/press

Large-scale page moves
Old URL                                        New URL
https://www.olddomain.com/content/*            https://www.newdomain.com/content/*
https://www.olddomain.com/page*                https://www.newdomain.com/page*

The purpose of this is to efficiently give your developers a map for how the
content move should take place. Note that the spreadsheet should contain a map
of all changed URLs, which may include downloadable content such as PDF files,
PowerPoint presentations, videos, and any other types of content being moved.
You can also note retiring content via additional entries in the left column,
with the entries in the right column indicating where users looking for that old
content should be sent. Now your spreadsheet might look like Table 7-4.

Table 7-4. Spreadsheet combining page moves with redirects for eliminated pages

Individual page moves
Old URL                                        New URL
https://www.olddomain.com/about-us             https://www.newdomain.com/about-us
https://www.olddomain.com/contact-us           https://www.newdomain.com/contact-us
https://www.olddomain.com/press-relations      https://www.newdomain.com/press

Large-scale page moves
Old URL                                        New URL
https://www.olddomain.com/content/*            https://www.newdomain.com/content/*
https://www.olddomain.com/page*                https://www.newdomain.com/page*

Eliminated pages
Old URL                                        Redirect to
https://www.olddomain.com/widgets/azure        https://www.newdomain.com/widgets/blue
https://www.olddomain.com/widgets/teal         https://www.newdomain.com/widgets/green
https://www.olddomain.com/widgets/puce         https://www.newdomain.com/widgets

The new entries show what should happen to retired pages. The first two retired
pages may represent products that you no longer carry, so you would likely want
to redirect them to the closest existing products you have. The third retired page
represents a URL where there is no appropriate replacement, so you may choose to
redirect that one to the parent page for that topic area. As you can see, a
major SEO objective during content migration is to preserve as much link
authority and traffic from the old URLs as possible, while providing the best
possible user experience for people who arrive at the old URLs.
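Once a redirect map like the ones above exists, it is worth verifying that each old URL actually returns a 301 pointing at the intended destination, both in staging and after launch. The following Python sketch is a minimal illustration; the CSV filename and its old_url/new_url column names are assumptions for this example, not a standard format.

# A minimal sketch: validate a redirect map stored as a CSV with old_url and
# new_url columns (filename and column names are assumptions for this example).
# Flags anything that is not a 301 pointing at the expected target.
import csv
import requests

def validate_redirects(csv_path="redirect_map.csv"):
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            old_url, expected = row["old_url"], row["new_url"]
            resp = requests.get(old_url, allow_redirects=False, timeout=30)
            location = resp.headers.get("Location", "")
            if resp.status_code != 301:
                print(f"NOT 301 ({resp.status_code}): {old_url}")
            elif location.rstrip("/") != expected.rstrip("/"):
                print(f"WRONG TARGET: {old_url} -> {location} (expected {expected})")

if __name__ == "__main__":
    validate_redirects()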

Expectations for Content Moves The big downside to content migration is that the
search engines generally don’t adapt to the URL changes immediately. This is
because they are dependent on crawling to fully understand those changes. The
IndexNow initiative from Bing and Yandex (see “IndexNow” on page 187) aims to
enable search engines to adapt more quickly, but Google has not yet adopted this
protocol. Thus, in Google many sites temporarily lose rankings after making a
large-scale content move, then recover after a period of time. So naturally, the
question is, how long will it take to get your organic rankings and traffic
back? The reality is that a number of factors are involved, depending on your
particular situation. Some examples of these factors might include:

The size and complexity of your site Bigger, more complex sites may take longer
to process.

The complexity of the move If the site has been fundamentally restructured, it
is likely to take more time for the search engines to adapt to the new
structure.

The perceived authority of the site Sites that have a higher (search engine)
perceived authority may be processed faster. Related to this is the rate at
which the site is typically crawled.

The addition of new links to the new pages Obtaining new links to the new URLs,
or changing old links that used to point to the old URLs so that they point to
the new URLs, can help speed up the process. If you are moving to an entirely
new domain, you can aid the process in Google by using the Change of Address
tool inside Google Search Console. Before using this tool, make sure that both
your old domain and your new domain are verified in Search Console. Then, on the
Search Console home page, click on the old domain. Under Site Configuration,
click “Change of Address,” and then select the new domain. Bing offers a similar
tool, described in a blog post by SearchBrothers cofounder Fili Wiese. When all
is said and done, a reasonable estimate is that a significant dip in traffic
from the search engines after a move should rarely last longer than 60 to 90
days. Many sites recover in a shorter time span, although for very large sites (1M+ pages)
that migrate to new domains and change all of their URLs at the same time, the
time to recover can be longer. Another approach to content moves (especially
when you’re updating and redirecting an entire site’s URLs) is to perform the
URL migration in a phased manner as opposed to “wiping out” and redirecting all
of the site’s URLs at once. You’d do this for a few reasons, such as wanting to
test the search engines’ handling of such a migration on your site before
committing to the sitewide change and its resulting impact. Another reason is to
mitigate potential organic traffic dips that will occur during the updating
period; it is often easier to tolerate a series of 10% traffic losses than a
30%–40% traffic loss all at once, especially for websites that rely upon
traffic-based advertising revenue.

Maintaining Search Engine Visibility During and After a Site Redesign Companies
may decide to launch a site redesign as part of a rebranding of their business,
a shift in their product lines, a marketing makeover, or for a variety of other
business reasons. As discussed in the previous sections, any number of things
may change during a site redesign. For example:

• Content may move to new URLs.

• New site sections may be added.

• Content might be eliminated.

• New site functionality may be added.

• Content may be changed.

• Navigation/internal linking structures may be changed significantly.

• Content could be moved behind a login or paywall.

Of course, the move may involve moving everything to a new domain as well, but
we will cover that in the next section—as you’ll see, there is a lot of overlap,
but there are also some unique considerations. Here are some best practices for
handling a site redesign that involves these technical elements:

• Create 301 redirects for all URLs from the original version of the site pointing to the new URLs on the redesigned site. This should cover scenarios such as any remapping of locations of content and any content that has been eliminated. Use a spreadsheet similar to the ones shown earlier to map out the moves to make sure you cover all of them. When the URL changes are implemented, you can also leave up copies of both your old and new XML sitemaps; this will enable Googlebot to rapidly make the connections between the old URLs and the new URLs, speeding up the transition (a minimal sitemap-generation sketch appears after this list).


• Review your analytics for the top 100 or so domains sending traffic to the moved and/or eliminated pages and contact as many of these webmasters as possible about changing their links. This can help the search engines understand the new layout of your site more rapidly and provides both better branding and a better user experience.
• Review a backlink report (using your favorite backlink analysis tool) for your site and repeat the previous process with the top 200 to 300 or so results returned. Consider using more advanced tools, such as Ahrefs, Majestic, Semrush, and/or Link Explorer, and Google Search Console.
• Make sure you update your sitemap and submit it to Google Search Console and Bing Webmaster Tools. Consider using multiple sitemaps, one for each content type and/or content area, to submit and monitor the indexing of your new site URLs.
• Monitor your rankings for the content, comparing old to new over time—if the rankings fall, post in the Google Webmaster Central Help forum detailing what you did, what happened, and any other information that might help someone help you. Google employees do monitor these forums and sometimes comment in situations where they think they can help. Don't use the forums to complain; state what has happened and ask for help, as this gives you the best chance of getting feedback.
• Monitor your Search Console account and your analytics for 404 errors and to see how well Google is handling your redirects. When you see 404 errors occurring, make sure you have properly implemented 301 redirects in place. Don't limit this checking just to 404 errors; also be on the lookout for HTTP status codes such as 500 and 302. Maintain the XML sitemap of old URLs until the search engines discover the redirects.
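As referenced in the first bullet of this list, keeping an XML sitemap of the old URLs available alongside one for the new URLs helps Googlebot find and process the redirects. The following Python sketch is a minimal illustration that writes a bare-bones sitemap from a plain list of URLs; the input and output filenames are placeholders.

# A minimal sketch: generate a bare-bones XML sitemap from a text file
# containing one URL per line. Filenames are placeholders for this example.
from xml.sax.saxutils import escape

def write_sitemap(url_file="old_urls.txt", out_file="sitemap-old.xml"):
    with open(url_file) as f:
        urls = [line.strip() for line in f if line.strip()]
    with open(out_file, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        out.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for url in urls:
            out.write(f"  <url><loc>{escape(url)}</loc></url>\n")
        out.write("</urlset>\n")

if __name__ == "__main__":
    write_sitemap()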

Maintaining Search Engine Visibility During and After Domain Name Changes There
may come a time when you have a strong business need—such as a rebranding,
renaming, or merger/acquisition—to change your site’s domain name. This section
will cover some of the considerations and challenges involved in a domain name
change.

Unique challenges of domain name changes One of the more challenging aspects of
a domain name change is potentially losing the trust the search engines have
associated with your old domain. Another issue is that if there were
business-specific keywords present in your old domain name that are not in your
new domain name, you may see a decline in organic traffic, even if you maintain
or recover placement after migration. This decline is a result of domain bias—the propensity for searchers to click on domains in search results that
include keywords they used in their search query. You may also see a slightly
negative impact in your rankings for organic search terms related to the
keywords in your previous domain; although Google is cracking down on
exact-match-domain (EMD) websites with low-quality content that were ranking
well in search, it still places weight on the words in a domain. Another unique
challenge is the “youth” of the new domain, especially if it was recently
purchased and/or has no historical backlink profile. Because of its recency, the
new domain may be slow to rank. Although the site’s relevance and inbound link
profile (including the links 301-redirected from the old domain) may suggest a
high ranking for some search queries, because the new domain is not yet trusted,
the rankings are suppressed and traffic is much lower than it would otherwise
be. Domain youth is another reason why updating valuable third-party links to
reflect your new domain is important. If the prospect of taking a “young domain”
hit is too unappealing, another tactic you can try is to make the move to a
different domain that has a backlink history associated with it—just make sure
that history is a positive one! You don’t want to move to an old domain that had
any historical spam, manual review, or other negative associations, so be sure
to perform a thorough backlink audit with your preferred link auditing tools.
While you’re at it, see if you can get Google Search Console access to research
whether there were any manual spam actions reported against the domain.

Pre-move preparations for domain changes Unfortunately, losing traffic is common
when you make domain name changes, though the traffic loss is usually temporary.
If you do things properly, you can and should recover from any negative impact,
and hopefully quickly—but you should be prepared for the potential traffic hit
of a domain switch. If you are planning a domain migration, buy the new domain
as early as you can, get some initial content on it, and acquire some links. The
purpose of this exercise is to get the domain indexed and recognized by the
engines ahead of time. Then, register the new domain with Google Search Console
and Bing Webmaster Tools. This is just another part of making sure Google and
Bing know about your new domain as early as possible and in as many ways as
possible. Once you’ve done this, follow these best practices for handling a
domain name change:

• Create 301 redirects for all URLs from the old site pointing to the proper URLs on the new site. Hopefully you will be able to use mod_rewrite or ISAPI_Rewrite to handle the bulk of the work. Use individual rewrite rules to cover any exceptions. Have this in place at launch.
• Review your analytics for the top 100 or so domains sending traffic to the old pages, and contact as many of these webmasters as possible about changing their links.
• Make sure that both the old site and the new site have been verified and have sitemaps submitted at Google Search Console and Bing Webmaster Tools.
• Launch with a media and online marketing blitz—your goals are to get as many new inbound links as possible pointing to the new site as quickly as possible, and to attract a high number of branded searches for the redesigned site.

Monitoring rankings after domain changes Monitor your Search Console account for
404 errors and to see how well Google is handling your redirects. If you see any
404 errors pop up, make sure you have properly implemented 301 redirects in
place. If not, add them right away. Monitor the search engine spidering activity
on the new domain. This can provide a crude measurement of search engine trust,
as the engines spend more time crawling sites they trust. When the crawl level
at the new site starts to get close to where it was with the old site, you are
probably most of the way there. Watch your search traffic referrals as well.
This should provide you some guidance as to how far along in the process you
have come. You can also check your server logs for 404 and 500 errors. These
will sometimes flag problems that your other checks have not revealed. Google
recommends that if you are moving multiple domains and consolidating them all on
one new domain, you consider executing this one domain at a time. This may allow
Google to respond to these changes faster than if you tried to move all of the
domains at once. The value of this approach is that it reduces the risk
associated with the move by breaking the migration process down into more
manageable chunks. Even if you do this, however, you should still follow the
guidelines outlined in this section to implement the move of each section of
your new site and check on its progress.

Changing Servers You might decide you want to move servers without changing your
domain name or any of your URLs. A common reason for this change is that the
growth of your traffic requires you to upgrade your hosting environment to a
faster server. If you are using third-party hosting, perhaps you are changing
your hosting company, or if you have your own data center, you may need to move or expand your facilities, resulting
in changes in the IP addresses of your servers. This is normally a
straightforward process, as you can simply go to the registrar where you
registered the domain name and update the DNS records to point to the new server
location. You can also temporarily decrease the site’s DNS time-to-live (TTL)
value to five minutes (or something similar) to make the move take place faster.
This is really the bulk of what you need to do, though you should follow the
monitoring recommendations we will outline shortly. Even if you follow this
process, certain types of problems can arise. Here are the most common:

• You may have content that can't function on the new platform—for example, if you use Perl in implementing your site and Perl is not installed on the new server. This can happen for various other reasons as well, and the result can be pages that return 404 or 500 errors instead of the content you intended.
• Unfortunately, publishers commonly forget to move key content or files over, such as robots.txt, analytics files, XML sitemaps, or the .htaccess file. It is imperative that these important files are migrated to your new server.
• Server configuration differences can also lead to mishandling of certain types of requests. For example, even if both your old server and your new server are running IIS, it is possible that the new server is configured in such a way that it will transform any 301 redirects you have in place into 302 redirects. Be sure to double- and triple-check that all server directives are properly migrated from the old servers to the new ones.

The best advice for dealing with
these concerns is to make a list of special files and configuration requirements
and verify that everything is in place prior to flipping the switch on any
server moves. In addition, you should test the new site in its new location
before making the move. You will need to access the content on the new site
using its physical IP address. So, the page at https://www.yourdomain.com/pageA
will be found at an address similar to https://206.130.117.215/pageA. To access the site, add that IP address to your test machine's hosts file (this assumes you are running Windows) with a corresponding hostname of www.yourdomain.com, which will allow you to surf the site at the new IP
address seamlessly. This advance testing should allow you to check for any
unexpected errors. Note that the location of the hosts file varies across
different versions of Windows, so you may need to search online to get
information on where to find it on your machine.


As with our other scenarios, post-launch monitoring is important. Here are the
basic monitoring steps you should take after a server move:

1. Monitor your Google Search Console and Bing Webmaster Tools accounts for 404 errors and to see how well the search engines are handling your 301 redirects. When you see 404 errors, make sure you have properly implemented 301 redirects in place.
2. Monitor the spidering activity on the new domain to make sure no unexpected drops occur.
3. Watch your search traffic referrals for unexpected changes.

You can also check your server logs for 404 and 500 errors, which will sometimes expose problems that your other checks have not revealed.
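The server log check mentioned above can be partially automated. The following Python sketch is a minimal illustration that scans a combined-format access log and counts the most frequent URLs returning 404 or 5xx responses; the log path and the regular expression are assumptions that may need adjusting for your server's actual log format.

# A minimal sketch: count the most common 404 and 5xx URLs in a combined-format
# access log. The log path and the regular expression are assumptions and may
# need adjusting for your server's configuration.
import re
from collections import Counter

LOG_LINE = re.compile(r'"[A-Z]+ (?P<path>\S+) HTTP/[^"]+" (?P<status>\d{3})')

def scan_log(path="access.log", top_n=20):
    errors = Counter()
    with open(path, errors="replace") as f:
        for line in f:
            m = LOG_LINE.search(line)
            if m and (m.group("status") == "404" or m.group("status").startswith("5")):
                errors[(m.group("status"), m.group("path"))] += 1
    for (status, url), count in errors.most_common(top_n):
        print(f"{count:6d}  {status}  {url}")

if __name__ == "__main__":
    scan_log()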

Changing URLs to Include Keywords in Your URL It’s also quite common for
companies to consider changing the URLs where their content is found in order to
insert keywords into the URLs, as keywords in the URL are widely believed to be
a ranking factor. However, as shown in Figure 7-83, Google’s John Mueller has
advised that the SEO impact of keywords in URLs is minimal.

Figure 7-83. John Mueller on the impact of keywords in URLs

Note that Mueller also advises that he does not recommend such projects,
suggesting that these types of URL changes are not worth it—you’ll take at least
a temporary hit, and about the best you can hope for is to regain your original
rankings. Our advice on this point is the same: engaging in a project to change
the URLs on your site to insert keywords is not worth the risk and effort.

Accelerating Discovery of Large-Scale Site Changes Any time you make large-scale
changes to your site (content changes, domain changes, URL changes, etc.), one
area to be concerned about is how rapidly Google will crawl your site and discover the changes. There are several strategies that
you can use to accelerate how quickly that will happen:

1. Remove or fix error pages (4xx, 5xx, soft 404s).
2. Delete duplicate pages, or noindex them if deletion is not possible.
3. Reduce faceted navigation page bloat by implementing pages not worth indexing via AJAX to make them invisible to Google, or noindex these pages to deprioritize them.
4. Block crawling of any search results pages that get rendered by your site search tool.
5. Prioritize your more important pages by improving internal linking to them on your site (link to them higher in the site hierarchy or link to them more often).

In addition,
Google’s Gary Illyes provided some great insight on how Google views crawl
budget in a 2017 blog post. One of the key observations from this post is as
follows:

If the site responds really quickly for a while, the limit goes up, meaning more connections can be used to crawl. If the site slows down or responds with server errors, the limit goes down and Googlebot crawls less.

Based on this, optimizing the server response time of your site and the total
load it can handle are also key steps in encouraging Google to crawl your site
faster. There are two methods that can be used to help do that:

Optimize First Input Delay Google uses server response time as a factor in
deciding how much a server can handle, so it’s worth working on optimizing this
to help increase how much crawl budget Google will allocate to your site. (For
more information, see “Core Web Vitals” on page 358.)

Implement application scaling This is an approach to structuring your web
application (or website) and hosting infrastructure to scale dynamically as
demand grows. In the case of large-scale site changes, you want to be able to
scale as Googlebot’s crawling demand grows, enabling it to crawl faster. While
this requires more effort to set up, it could be worthwhile for enterprise-scale
businesses with high-traffic websites. John Mueller also offered some
interesting insight in a Webmaster Office Hours hangout:

In general, we try to do our crawling based on what we think the page might be changing. So, if we think that something stays the same for a longer period of time, we might not crawl it for a couple of months.


What this tells us is that Google attempts to recognize pages that change
frequently and those that do not and adjusts the way it allocates its crawl
budget for your site accordingly. Along with crawl budget, index budget is
another aspect of search that you will need to manage. Arguably, this is even
more important than crawl budget for most sites, yet it’s rarely talked about:
it’s the number of pages from your site that Google budgets to have in its
index, and the more authoritative your site is, the more index budget you will
get. While Google has never confirmed this concept, anecdotal evidence suggests
that a large percentage of sites do not get all of their pages indexed. A key
reason you should manage your site’s index budget with care is that Google may
index pages that you don’t care about at the expense of those that you do. For
example, index budget gets wasted when previously indexed pages get “removed”
from the index using robots.txt Disallow directives instead of meta robots
noindex tags. More specifically, if you are blocking pages in robots.txt, those
pages won’t use crawl budget but they can still potentially be indexed. If your
goal is to keep them out of the index, you may be better served by removing the
Disallow directives in robots.txt and noindexing them. Google will crawl these
pages, see the noindex directives in the robots meta tags, and keep the pages
out of the index. It’s good practice to regularly review the list of “crawled,
currently not indexed” pages in Google Search Console’s coverage report to look
for potential index budget issues.

NOTE: A special thanks to Mats Tolander of Perficient for his contributions to several sections of the chapter.

Conclusion Making your website readable by search engines and taking advantage
of the rich features available (such as Schema.org markup) are cornerstones of
the overall SEO process. If you fail to execute these tasks effectively, you can
end up with a website that struggles to earn the level of organic search traffic
that it might otherwise deserve. Mastering this part of SEO is the first step in
a robust overall program. Once you’ve dealt with this, you can then begin to
focus on how to scale your site’s visibility further through creating great
content for users and promoting it effectively.


CHAPTER EIGHT

SEO Analytics and Measurement An essential component of digital marketing is the
tracking, measurement, and analysis of various data points via analytics and
other data-oriented tools. These tools include traffic and performance analysis
platforms, link analysis and discovery tools, keyword and topic research tools,
and other systems designed to support your online marketing efforts. Once you
have selected the appropriate analytics and measurement tools to use in
coordination with your SEO efforts, it is important to establish baseline
measurements, with proper baselines accounting for as many variables as are
applicable to your business and industry. These baselines can include metrics
such as conversion rates by acquisition channel, conversion rates by content
type, URLs sending organic traffic, indexing status, backlinks, and more.
Benefits of these activities include:

• Quantifying organic search traffic by search engine
• Determining which site content areas are getting organic search traffic and quantifying this per area and per search engine
• Quantifying conversions, user behavior events, and other metrics search engines track
• Identifying poorly performing pages (e.g., low-converting, high bounce rate)
• Identifying best-performing pages (e.g., high traffic, high conversion rate)
• Tracking search engine crawler activity on the site
• Determining the number of indexed pages
• Determining whether the indexed pages are getting search traffic
• Determining whether best-selling product pages are indexed and getting search traffic

Your baselines are going to help you as you measure your efforts to generate various outcomes through SEO, including:

• Transactions
• Subscriptions and memberships

• Audience building and community creation

• Lead generation

• Affiliate revenue generation

• Ad revenue generation

Why Measurement Is Essential in SEO It is important to recognize that effective
SEO requires understanding various SEO-related metrics, such as organic traffic,
average position (as can be seen in your website’s Google Search Console data),
crawling, indexing, and backlinks, as well as understanding the relevant
business outcome metrics (e.g. conversions, sales, and ROI). As an SEO
practitioner, you should always aim to integrate understanding and analysis of
each of these aspects in order to appropriately align your SEO efforts with your
business’s digital marketing goals, and to effectively measure the impact of
your efforts.

Analytics Data Utilization: Baseline, Track, Measure, Refine An effective
measurement process generally includes the following components, some of which
may vary and/or be specific to your unique situation:

1. Determine your baselines based on your overall strategy for achieving your objectives. If your strategy for generating more ecommerce sales is to create more rich snippet results in Google, establish baselines appropriate to that strategy. Ensure you don't get false baselines due to seasonal factors or other unusual events. Comparing year-over-year data will usually eliminate fluctuations due to seasonality (a minimal comparison sketch appears after this list); however, you must also consider how changes in the market, new competition, elimination of competition, industry consolidation, changes in your business strategy, and changes in the search engines themselves may affect your data.

2. Identify the appropriate tools for tracking and measuring the various metrics specific to your objectives and collect the newest data for each metric you decided to focus on. SEO can take many months (and usually longer) to show results; ensure you wait long enough for your efforts to have a measurable impact. Some considerations:

• If your site is brand new, it may take longer for your changes to take effect.
• If the scope of the change is drastic (such as a complete redesign incorporating new information architecture, new URLs, etc., versus a simple visual reskinning), the time to see results will probably be longer.
• Sites that get crawled at great depth and frequency will probably yield visible results faster.

3. Compare the baseline data to the new data within your various data sources to identify areas for improvement, modification, and expansion, and refine your efforts. If you don't see any changes in your traffic, conversion rates, or other relevant metrics over the appropriate measurement time frame (or if the changes are measurably negative), then you will need to examine the data more closely. Perhaps you've overlooked something important, or there was an error in your implementation. If you are achieving great results—such as a noticeable increase in traffic and/or conversions—then consider expanding your efforts and applying these effective strategies to more areas of your site.
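To make the seasonality point in step 1 concrete, the following Python sketch compares monthly organic sessions year over year from a CSV export. It is a minimal illustration: the filename and the month/organic_sessions column names are placeholders for whatever your analytics platform actually produces, and it assumes one row per month.

# A minimal sketch: compare monthly organic sessions year over year to separate
# seasonal swings from real change. The CSV filename and column names are
# assumptions for this example; the data is expected to have one row per month.
import pandas as pd

def yoy_change(csv_path="organic_sessions.csv"):
    df = pd.read_csv(csv_path, parse_dates=["month"])
    df = df.sort_values("month").set_index("month")
    df["yoy_pct_change"] = df["organic_sessions"].pct_change(periods=12) * 100
    return df

if __name__ == "__main__":
    print(yoy_change().tail(12))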

Measurement Challenges It might sound easy to record a set of SEO metrics before
the start of the project and then compare the same set of metrics over 30, 60,
or 90 days to measure the progress. But what if you don’t make any changes
during one quarter, and the metrics reflect an improvement that can reasonably
be attributed to earlier SEO efforts? Conversely, what if the improvements are
because of external business factors? How can your SEO project get attribution
for its business impact? One other issue that may significantly impact your
ability to establish a baseline for measurement is a seemingly spontaneous drop
in traffic. When analyzing the organic traffic to a site, if you notice a large
drop it’s vital to determine the cause before establishing your baselines. Large
reductions in traffic can be caused by a number of factors, including a
large-scale site redesign or rebuild, a shift in the nature of the business,
seasonal factors (which can usually be determined by looking at several years’
worth of data at a time), or even organic search algorithm updates. An extremely
useful tool to use in determining whether a large traffic shift might have been
caused by an algorithm update is Panguin (a concatenation of the words panda and
penguin, which are the names of older Google algorithms you can learn more about
in Chapter 9), created by Barracuda Digital. Panguin allows you to overlay your Google Analytics 4 (GA4) organic traffic with Moz's history of algorithm updates
to see if traffic shifts coincide with the updates. It’s not always possible to
definitively determine if you have been impacted by an algorithm update, as
occasionally several updates happen over a very short period of time, or you may
have a traffic drop due to another cause that simply happens to coincide roughly
with an algorithm update. However, before embarking on an SEO project it’s
important to note the possibility of how future algorithm updates may impact the
potential success of the project.

Analytics Tools for Measuring Search Traffic The following are some commonly
used web, data, and customer insight analytics platforms for digital marketers:
• Google Analytics 4 (Google Marketing Platform)

NOTE: As of July 1, 2023, Google has replaced Universal Analytics with Google Analytics 4. From this point on, any websites using the old Google Analytics tracking code will no longer have their data tracked.

• Adobe Web Analytics
• Adobe Marketing Analytics
• Woopra Customer Journey Analytics
• Clicky (privacy-friendly web analytics)
• Matomo (privacy-friendly web analytics)
• Webtrends Analytics for Web Apps

Valuable SEO Data in Web Analytics You can extract numerous types of valuable
data from web analytics. Here are some examples of information you may want to
extract:

Organic traffic by search engine A high-level data point is to identify your
traffic by search engine. Generally speaking, you will likely see more traffic
from Google than from other search engines because of Google’s dominant search
market share in most countries (though there will always be outliers, or you may
be based in one of the few countries where Google doesn’t dominate the market).


Organic impressions, traffic, and conversions by landing page and content type
An indirect way of measuring the impact of your SEO efforts (including
indexation status) is to keep an eye on the number of pages and types of content
that are earning organic search impressions and traffic (generally measured as
sessions in most analytics applications). This data represents a subset of the
total amount of content indexed, but has greater value in that it indicates your
content was compelling and ranked highly enough that users decided to engage
with it. There are several ways in which someone knowledgeable in analytics can
configure your analytics account(s) to provide insight into which of your pages
and content types are receiving organic traffic where the keyword is “not
provided” (which will be the great majority of them). From this you can
extrapolate which groups of related topics those landing pages are optimized for
and which queries they actually rank for. You can also use Google Search Console
to cross-reference your pages and see what search terms are the source of that
“not provided” traffic.
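One way to do the Search Console cross-referencing described above programmatically is through the Search Console API's Search Analytics query method. The following Python sketch is a minimal illustration: it assumes a service account credentials file that has been granted access to the verified property, and the property URL, date range, and filename shown are placeholders. It requires the google-api-python-client and google-auth packages.

# A minimal sketch: pull top query/landing page combinations from the Search
# Console Search Analytics API. The credentials file, property URL, and date
# range are placeholders; the property must be shared with the service account.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]

def top_queries_by_page(site_url="https://www.example.com/",
                        creds_file="service-account.json"):
    creds = service_account.Credentials.from_service_account_file(creds_file, scopes=SCOPES)
    service = build("searchconsole", "v1", credentials=creds)
    body = {
        "startDate": "2024-01-01",
        "endDate": "2024-03-31",
        "dimensions": ["page", "query"],
        "rowLimit": 100,
    }
    response = service.searchanalytics().query(siteUrl=site_url, body=body).execute()
    for row in response.get("rows", []):
        page, query = row["keys"]
        print(f"{row['clicks']:>6} clicks  {page}  <-  {query}")

if __name__ == "__main__":
    top_queries_by_page()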

Referring Domains, Pages, and Sites Referrer data is very useful to help SEO
practitioners identify where nonorganic traffic is coming from, which can often
point to external links gained. You can often see those new links in referrer
reports first, even before your backlink analysis tools report them.

Event Tracking Event tracking goes one step deeper than basic analytics, and the
ability it provides to identify specific user actions (i.e., events), beyond
transaction data, is incredibly valuable to digital marketers. Rather than
simply observing what pages are visited and how many unique sessions are
generated, event tracking allows you to narrow down groups of visitors based on
the behaviors they perform on your site and how they engage with your content.
The following is a basic list of trackable user behavior events:

Add to cart Studies have shown that users who “add to cart,” even if they do not
complete the checkout process, are more likely to return to make a purchase.
This is also a good way to calculate shopping cart abandonment rates and make
changes to refine and improve the checkout process.

Complete checkout An obvious one, this action will show you what percentage of
each user group is converting into sales. It is also of interest to measure what
percentage of people start the checkout process but do not complete it.


Save to wish list Ecommerce sites offering wish lists are still in the minority,
but they’re a great way to track interest that isn’t quite a purchase.

Share/send to a friend Many sites offer a “share this page” function, and it’s a
great action to be aware of. If folks are sending out your link, you know you
have a hit. (Note that this is different from a social media share, listed
below.)

Subscribe to newsletter A subscription is a tacit endorsement of your brand and
a desire to stay in contact. It may not be a conversion, but for B2B, it may be
the next best thing.

Contact form submission Filling out a contact form can be even more valuable
than a newsletter subscription, in some cases. Though some of these forms will
report support issues, many may contain questions about your products/services
and will indicate a desire to open a sales conversation.

Email link As with contact forms, direct email links have the possibility of
becoming sales contacts. Be sure to clearly label sales emails and track them
separately from support or business issues.

Post comment For blogs, anyone who is participating or contributing content
should be paid attention to (as should those channels that earn user
engagement).

Social bookmark/share All those folks who are submitting your content to
Facebook, X (formerly Twitter), TikTok, Reddit, and other social media and news
aggregation/discussion sites deserve to be recognized (and sought after).

Register as a user Registered users provide you with some form of information
you can use in your digital marketing efforts, ranging from name and location to
email address and phone number. This data can be useful in all of your digital
marketing efforts, including email marketing and display retargeting.

Sign up for a subscription and/or membership Many sites operate on a
subscription or membership model, where recurring payments drive value beyond
the original event.

Contribute content When a user publishes content to your site, discovering the
path the user takes in the process is important and can help you refine your
site’s layout and calls to action.

Vote/rate Even low levels of participation, such as a rating or a vote, are
worth tracking when every piece of participation counts.

Social engagement metrics Likes, shares, mentions, follows, and many other user
engagement actions are trackable.
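Here is the minimal sketch mentioned above of recording one of these events. On most sites the event would be sent from the page with gtag.js or Google Tag Manager; this example instead uses the GA4 Measurement Protocol from Python, and the measurement ID, API secret, and client ID shown are placeholders.

```python
# Minimal sketch: send an "add to cart" event to GA4 via the Measurement Protocol.
import requests

MEASUREMENT_ID = "G-XXXXXXXXXX"   # GA4 measurement ID (placeholder)
API_SECRET = "your-api-secret"    # Measurement Protocol API secret (placeholder)
CLIENT_ID = "555.1234567890"      # normally read from the _ga cookie (placeholder)

payload = {
    "client_id": CLIENT_ID,
    "events": [
        {
            "name": "add_to_cart",
            "params": {
                "currency": "USD",
                "value": 49.99,
                "items": [{"item_id": "SKU_123", "item_name": "Example widget"}],
            },
        }
    ],
}

resp = requests.post(
    "https://www.google-analytics.com/mp/collect",
    params={"measurement_id": MEASUREMENT_ID, "api_secret": API_SECRET},
    json=payload,
    timeout=10,
)
resp.raise_for_status()  # the endpoint accepts events silently; use the debug endpoint to validate payloads
```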

Connecting SEO and Conversions As we discussed previously in this chapter, it is
important to tie your SEO efforts to the results they bring to the business. A
fundamental piece of that is measuring the conversions driven by organic search
traffic. The following are some of the most common types of conversions (you’ll
notice that these are also trackable “events,” as described in the previous
section):

Sales and transactions This is the one that everyone assumes is part of
conversions. Sales and sales revenue (or better still, margin) conversions can
be the simplest things to track, except when you are selling many different
products at different price points and in different quantities. In this case,
the process would need to be a bit more sophisticated. If your site is
advertising driven, you need to look at the impact of organic search traffic on
advertising revenue. If you have no financial goals for your site, you’ll need
to look at some of the other types of conversions and determine their value or
worth.

Newsletter sign-ups Anytime a user signs up to receive regular communications
from you, it’s a win. Even though there are not direct financial consequences to
this action, it is still a conversion. Someone who has subscribed to something
you offer is more likely to become a customer than a first-time visitor to your
site.

Subscriptions and memberships Perhaps you offer content on a subscription or
membership basis—each new subscriber or member is a converting user.

Content downloads Many sites offer free downloads, such as white papers or free
downloadable tools. Even if you do not require a sign-up of any type, you should
still count a download as a conversion; you’re getting your message out there
with the downloads you offer.

Contact form submissions and phone calls This is when your users request contact
from you or contact you directly. Phone calls can be tracked through various
tracking systems in available advertising platforms.

Users sharing your content This kind of conversion happens when a visitor shares
information they found on your site with someone else. This includes Facebook
shares and links on X. In addition, if your site has a “share” or “send to a
friend” feature, you can track conversions by noting each time a user uses that
feature.

Users linking to your content A user has visited your site and found its content useful, entertaining, or otherwise compelling enough to link to it from their own site.

It’s important to assign a value to every type of conversion you receive. SEO software packages such as Conductor’s Organic Marketing Platform, BrightEdge, Searchmetrics, and seoClarity allow you to view search ranking data together with traffic and revenue data. This enables you to connect organic search traffic to conversions to measure SEO performance. You can see a sample screen from Conductor’s Organic Content Performance report in Figure 8-1.

Figure 8-1. Conductor’s Organic Content Performance report

Attribution Another issue to be aware of is attribution—identifying user actions that lead to specific outcomes. For example:

• A user performs a search, clicks on an organic search result that takes them to your site, views a few pages of content, and leaves. The next day, they remember what they saw, perform another search, click on a paid search ad for your site, and buy a product. In this instance, organic search should receive some credit (i.e., attribution).

• A user performs a search on a mobile browser, clicks on an organic search result that takes them to your site, looks around, and leaves. A few days later, they remember your website from the search results, download your mobile app, and make a purchase. Organic search should also receive some credit for this transaction.

Ideally, you should attempt to track the impact of your SEO efforts in a multichannel data tracking environment. The following marketing attribution tools offer ways to measure attribution across channels:

• Adobe Analytics Attribution IQ

• Attribution

• Branch

• Dreamdata

• C3 Metrics

Segmenting Campaigns and SEO Efforts by Conversion Rate Once you have conversion and attribution tracking configured, how do you use it to focus your SEO efforts? You’ll want to track conversion data in different ways, such as the following (a brief sketch of this kind of segmentation appears after the list):

By keyword What keywords are bringing the best results?

By referrer Which traffic source leads to the most conversions?

By web page Which pages on your site result in the highest number of
conversions?

By initial entry page Which initial entry pages ultimately lead to the highest
number of conversions?
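Here is the small segmentation sketch mentioned above. It groups exported session and conversion counts by dimension using pandas; the column names and figures are purely illustrative, and in practice the input would come from a GA4 (or other analytics) export.

```python
# Minimal sketch: conversion rate segmented by landing page and by referrer.
import pandas as pd

df = pd.DataFrame({
    "landing_page": ["/guide", "/guide", "/pricing", "/pricing"],
    "referrer":     ["google / organic", "bing / organic", "google / organic", "(direct)"],
    "sessions":     [1200, 300, 800, 400],
    "conversions":  [36, 6, 40, 8],
})

for dim in ("landing_page", "referrer"):
    grouped = df.groupby(dim)[["sessions", "conversions"]].sum()
    grouped["conv_rate"] = grouped["conversions"] / grouped["sessions"]
    print(grouped.sort_values("conv_rate", ascending=False), "\n")
```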

Increasing Conversions As an SEO practitioner, you should strive to become an
expert at conversion rate optimization (CRO) because higher conversion rates
mean a greater impact for your SEO efforts. Various tools are available to
assist with CRO, including:

• Optimizely

• Unbounce

• VWO

• Attention Insight

• Mouseflow

• UserTesting

• Plerdy

• Hotjar

Calculating SEO Return on Investment An effective SEO process is one that
continuously works toward a positive return on investment. Your methodology for
determining the ROI of your SEO will be very specific to your business, though
most will likely include the following components:

Number of people searching for your keywords This can be challenging to
estimate, because you cannot completely map out the long tail. One rough
estimation strategy is to multiply the search volume for the top terms for your
business by 3.3 (i.e., assume that the head terms account for about 30% of the
available volume).

Average conversion rate Once you have attracted a user through organic search,
how successful are you at completing a conversion? Typical conversion rates for
a website might be between 2% and 5%. It should be easy to get this data from
your analytics. You should already know what your conversion rate is!

Average transaction value Last but not least, factor in the average transaction value. Again, this is data you already have. Here’s an example formula for calculating ROI based on revenue:

• SEO revenue = people searching for topics related to your content (e.g., identified keywords and/or queries) * click-through rate * average conversion rate * average transaction amount

• For example: 10,000 per day * 10% * 3% * $100 = $3,000 per day

• SEO ROI = SEO revenue / SEO cost (use total $ spent for salaries, consulting, and development, and/or number of hours spent)

• For example: $3,000 per day / $500 per day = an ROI of 6x

Predicting an SEO project’s ROI based on impressions and click-through rate can, in some cases, be problematic because you have very little control over the variables. You end up relying on numbers that you have a limited ability to influence. As an alternative approach, you can measure and track SEO ROI based on an increase in search visibility. To do this, begin by determining two things:

• How many pages and content types are generating traffic for a given time period

• How many visits each page and/or content type receives for a given time period

Next, record these supporting metrics:

• Average ranking across the whole keyword spectrum

• Average click-through rate

Now, by making it easier for search bots to find more pages on your site,
consolidating duplicate content, improving the depth and breadth of your
content, changing your pages to better match your users’ needs, and improving
page titles, metadata, and microdata, you should see an increase in the number
of pages getting search clicks and/or the number of clicks per page. Your
combined efforts should result in more traffic when compared year-over-year.
With this approach, the formula given previously for calculating SEO revenue can be modified as follows:

SEO revenue = increase in (pages/content getting search clicks * search clicks per page) * average conversion rate * average transaction value

Controlled tests, like those possible using tools such as SearchPilot,
provide additional potential for calculating ROI in more concrete ways, by
quantifying the uplift associated with different initiatives, providing data
about drops in performance avoided by testing, and reducing the requirement for
engineering investments in ineffective changes. Their “full funnel” approaches
also allow the combining of organic search traffic data and conversions to tie
SEO activities to revenue. We discuss this kind of testing more in Chapter 14.
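To tie the two approaches together, the following minimal sketch expresses both the revenue-based formula and the visibility-based alternative as simple Python functions; every input number is an illustrative placeholder to be replaced with your own analytics data.

```python
# Minimal sketch of the two SEO ROI calculations described above.

def seo_revenue(daily_searches, ctr, conversion_rate, avg_transaction_value):
    """Revenue-based estimate: searches * CTR * conversion rate * average order value."""
    return daily_searches * ctr * conversion_rate * avg_transaction_value

def seo_roi(revenue, cost):
    """ROI expressed as a multiple of SEO spend."""
    return revenue / cost

# Revenue-based approach, using the example figures from earlier in this section:
revenue = seo_revenue(10_000, 0.10, 0.03, 100)   # $3,000 per day
print(seo_roi(revenue, 500))                     # 6.0, i.e., a 6x return on $500/day of SEO cost

# Visibility-based alternative: value the year-over-year increase in pages earning
# search clicks and in clicks per page, instead of forecasting impressions and CTR.
def visibility_based_revenue(extra_pages_with_clicks, extra_clicks_per_page,
                             conversion_rate, avg_transaction_value):
    return (extra_pages_with_clicks * extra_clicks_per_page
            * conversion_rate * avg_transaction_value)

print(visibility_based_revenue(200, 15, 0.03, 100))   # $9,000 of incremental revenue
```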

Diagnostic Search Metrics Thus far in this chapter we have focused on the
basics—the dollars and cents of determining whether your SEO campaign is
succeeding. As we noted at the beginning of the chapter, determining these
metrics should be your first priority in your analytics efforts. In this
section, we will start looking at metrics that you can use to diagnose specific
SEO issues. One example of this would be finding out whether a major section of
your site is not indexed. Another example is seeing how your traffic growth
compares to that of your competitors (this helps you decide whether you have set
the right objectives for your efforts). Numerous tools allow you to monitor your
site and those of your competitors, providing insight into your SEO progress.
You can also use these tools to figure out what your competitors are doing from
an SEO perspective. This type of intelligence can provide you with new ideas on
how to adapt your strategy to get better results. As with all such tools, it is
always important to understand their capabilities and to have an idea of what
you are looking for. Better knowledge of your competitors’ strategies is
certainly one valuable goal. Detecting a problem in how your website is crawled
is another. By selecting specific and actionable goals, you can set yourself up
for the highest possible return.

Site Indexing Data It is often valuable to know how many pages of a site are in a search engine’s index. For example, this enables you to:

• Determine whether important parts of your site are not in the index. If key parts of the site are not in the index, you can then embark on an effort to determine why.

• Learn about your competitors’ sites and strategies by checking how many of their pages are indexed.

You can get an estimate of the number of indexed pages for a site using the site: operator in Google or Bing. Figure 8-2 shows the results for The New York Times website, at the time of writing.

Figure 8-2. Site indexing data

As shown in Figure 8-2, Google reports over 4.8 million pages in its index.
While this is not necessarily an exact figure (you can find more accurate Google
indexation data for your site in Google Search Console, discussed later in this
chapter), many site owners can use this information to estimate the number of
their site’s pages that are indexed. You can also use this method to identify
the indexation status of specific pages, such as those containing a particular
term in the URL, as shown in Figure 8-3.

Figure 8-3. Combining the site: and inurl: operators

This shows that, at the time we did this search, the searchengineland.com domain
had 165 pages in the index with “privacy” in the URL.

Index-to-Crawl Ratio This is the ratio of pages indexed to unique crawled pages.
If a page gets crawled by Googlebot, that doesn’t guarantee it will show up in
Google’s index. A low ratio can mean your site doesn’t carry much weight in
Google’s eyes.

Search Visitors per Crawled Page Calculated for each search engine separately,
this is how much traffic the engine delivers for every page it crawls. Each
search engine has a different audience size. This metric helps you fairly
compare the referral traffic you get from each one. As you optimize your site
through multiple iterations, watch the aforementioned KPIs to ensure that you’re
heading in the right direction. Those who are not privy to these metrics will
have a much harder time capturing the long tail of SEO.
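As a simple illustration, the following sketch computes both diagnostic ratios from placeholder counts; in practice the crawl figure would come from your log files or crawl reports, the indexation figure from GSC, and organic sessions from your analytics platform.

```python
# Minimal sketch of the two crawl-related diagnostic metrics described above.

def index_to_crawl_ratio(pages_indexed, unique_pages_crawled):
    return pages_indexed / unique_pages_crawled

def search_visitors_per_crawled_page(organic_sessions, unique_pages_crawled):
    return organic_sessions / unique_pages_crawled

unique_pages_crawled = 40_000                                            # placeholder
print(index_to_crawl_ratio(28_000, unique_pages_crawled))                # 0.7: 70% of crawled pages are indexed
print(search_visitors_per_crawled_page(120_000, unique_pages_crawled))   # 3.0 organic sessions per crawled page
```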

Free SEO-Specific Analytics Tools from Google and Bing To get data that is
specific to your website directly from these two search engines, you can use the
free tools they provide for website owners and webmasters, including Google
Search Console (GSC) and Bing Webmaster Tools (BWT). Both of these tools are
useful, and the data they provide can assist you tremendously in understanding
how your website is performing in these search engines, and why.

Using GA4 and GSC Together Google provides for data sharing between Google Analytics and Google Search Console, enabling you to correlate pre-click data (e.g., search queries, impressions, and average position) with post-click user behaviors (e.g., page views, bounce rate, and conversions) in a more streamlined fashion.

Differences in How GA4 and GSC Handle Data There are some differences between
how GA4 and GSC handle and present data. For example, whereas GA4 uses the
actual URL, GSC aggregates landing page data under canonical URLs, defined as
follows: A canonical URL is the URL of the best representative page from a group
of duplicate pages. … For example, if you have two URLs for the same page (such
as example.com?dress=1234 and example.com/dresses/1234), Google chooses one as
canonical. Similarly, if you have multiple pages that are nearly identical,
Google can group them together (for example, pages that differ only by the
sorting or filtering of the contents, such as by price or item color) and choose
one as canonical. Google can only index the canonical URL from a set of
duplicate pages. A duplicate can be in a different domain than its canonical
(such as example.com/mypage and example2.com/myduplicate). As a result, GSC will
show aggregated impressions and aggregated clicks in its reports, as shown in
Table 8-1.

Table 8-1. An example of how GSC aggregates impressions and clicks

URL                           Impressions   Clicks
https://www.example.com       1,000         100
https://m.example.com         1,000         100
https://www.example.com/amp   1,000         100

Canonical URL                 Aggregated impressions   Aggregated clicks
https://www.example.com       3,000                    300

Most of the time the landing page shown in GSC is also the canonical URL. To
obtain the canonical URL associated with a landing page, use the GSC URL
Inspection tool. Table 8-2 lists the various discrepancies in how GSC and GA4
handle different kinds of data.

Table 8-2. Differences in how GSC and GA4 handle data

Landing-page URLs that redirect
  Search Console: Reports the canonical URL for a landing page, even when the click was to a noncanonical landing page. If www.example.com/amp has a canonical URL of www.example.com, Search Console reports search metrics for www.example.com.
  Analytics: Reports the URL that results from the redirect; for example: www.example.com.

Page has no Analytics tracking code
  Search Console: Data for the page appears in Search Console.
  Analytics: Data for the page does not appear in Analytics.

Number of URLs recorded per site per day
  Search Console: Records up to 1,000 URLs for landing pages.
  Analytics: Does not observe the 1,000-URL limit and can include more landing pages.

Analytics property tracks multiple domains
  Search Console: Can link to a single domain.
  Analytics: If an Analytics property collects data for multiple domains, the Search Console reports have data for only the single linked domain.

Time zones vary
  Search Console: Timestamps data according to Pacific Daylight Time.
  Analytics: Timestamps data in each view according to the time zone defined in the view settings.

JavaScript not enabled in browsers
  Search Console: Collects data regardless of whether JavaScript is enabled.
  Analytics: Collects data only when JavaScript is enabled. Users can opt out of data collection by implementing a browser add-on.

NOTE At the time of writing, Google’s data retention policy for GSC data is 16
months, and Bing’s Webmaster Tools data retention policy is 6 months. If you
want to maintain a history of the data provided by these tools, you should plan
on regularly downloading and saving your data. Many third-party analytics,
marketing, and business intelligence platforms provide the ability to pull down
GSC and BWT data via API data source connectors.

Differences Between Metrics and Dimensions in GA4 and GSC Table 8-3 identifies terms that are used in both GSC and GA4 reports and shows how they differ.

Table 8-3. Term usage differences between GSC and GA4

Impressions
  Search Console usage: Used exclusively for Google Search impressions.
  Analytics usage: Used for both Google Ads impressions and Google Search impressions.

Clicks
  Search Console usage: Used exclusively for Google Search clicks.
  Analytics usage: Used for both Google Ads clicks and Google Search clicks.

Average Position
  Search Console usage: Average ranking in Google Search results.
  Analytics usage: Average ranking in Google Search results.

CTR (click-through rate)
  Search Console usage: Clicks/impressions for Google Search clicks.
  Analytics usage: Clicks/impressions for both Google Ads and Google Search clicks.

Keyword
  Search Console usage: Applies to the key terms used in the written content of the website pages. These terms are the most significant keywords and their variants that Google found when crawling your site. When reviewed along with the Search queries report and your targeted keywords, this provides insight into how Google is interpreting the content of your site.
  Analytics usage: In paid-search or Google Ads reports, describes a paid keyword from a search engine results page. In the organic-search reports, describes the actual query string a user entered in a web search.

Query
  Search Console usage: The actual query a user entered in Google Search.
  Analytics usage: Only used in the Search Console reports. Applies to the actual query a user entered in Google Search.

First-Party Data and the Cookieless Web As a result of the growing emphasis on
user data privacy on the web, digital marketers are working in an increasingly
cookieless environment. First-party data is king, and alternative methods will need to be developed for tracking users in order to offer them personalized solutions. The Privacy Sandbox is Google’s initiative for deprecating third-party cookies and migrating web technologies toward a more privacy-focused web. Google’s migration to Google Analytics 4 in July 2023 represents a shift
from the “pageview” metric in Universal Analytics to an event-based model,
moving away from “bounce rates” to engagement rates. Much of this new reporting
will be reliant upon model-based estimations of data, rather than actual data.
Much remains to be seen in terms of how digital marketers will adapt to these
kinds of shifts within the major analytics tools and platforms. The topics of
data privacy, regulation, and related legal issues are covered further in
Chapter 13.

Conclusion For SEO professionals, effective tracking, measurement, and analysis
are critical to the ability to productively implement and optimize your SEO
efforts. Putting the right analytics tools in place can provide the insights you
need to empower this process and ensure that you continue to build a strong
foundation for SEO success.

CHAPTER NINE

Google Algorithm Updates and Manual Actions/Penalties

Google tunes and tweaks
its algorithms on a daily basis, and periodically releases larger algorithm
updates. In addition, Google actively reviews its results to find sites that are
violating its guidelines, and those sites may be subjected to ranking penalties.
These measures are designed to help improve the overall quality of the search
results. Both algorithm changes and penalties can have a major impact on your
organic traffic, and significant decreases in search engine traffic can be
devastating to a business. For example, if the business shown in Figure 9-1
generates most or all of its revenue from organic search traffic, the drop
depicted here would represent a crippling blow. This type of loss of revenue can
mean laying off employees, or even closing the business.

Figure 9-1. Major loss in traffic

If you have already suffered such a traffic loss or been otherwise impacted by
an update or penalty, it is important to understand the cause and what you need
to do to recover. For that reason, you need to have a working understanding of
how the Google ecosystem works, how Google recommends that you operate your
website, and
the various scenarios that can lead to visibility and traffic losses. Otherwise,
you may be adversely affected by Google updates or penalties, and it may seem
like you’re suffering for reasons beyond your control. However, with knowledge
of what Google is trying to achieve through these mechanisms, you can not only
reduce your exposure to them but potentially even set yourself up to benefit
from the updates.

Google Algorithm Updates Google’s updates to its various search algorithms take
many different forms, including changes to search functionality, updates to
search result composition and layout, changes in various aspects of the
relevance and ranking algorithms, and daily testing and bug fixes. In this
section we’ll review the types of changes that Google makes and their impact on
the search results that users ultimately engage with.

BERT BERT, short for Bidirectional Encoder Representations from Transformers, is
a neural network–based technique for natural language processing (NLP). Prior to
the introduction of BERT in October 2019, when Google’s algorithms were trying
to understand the meaning of a word or phrase, they could only consider nearby
text that came either before or after that keyword, as the text was processed in
order. BERT made it possible for Google to analyze the text before and after the
keyword to understand its meaning. Figure 9-2 shows an example of how this
capability affected the search results returned for a complex travel-related
query.

Figure 9-2. Sample search query results before and after BERT

As Figure 9-2 illustrates, prior to BERT, Google did not properly understand the
query and hence did not correctly answer the user’s question. But as you can see
on the right, this key algorithm update enabled the search engine to fully
understand the context of the query, and thus deliver more relevant results. As
with all of Google’s algorithms, BERT continued to evolve and improve over time,
and the results continued to improve as well. Another example of the kind of
complex query this important algorithm update enabled the search engine to
understand is shown in Figure 9-3.

Figure 9-3. Query understanding enabled by BERT

In the initial rollout, BERT was only applied to English-language queries in the
US, and Google determined that it would enable better understanding of (and thus
impact the results for) 10% of those queries. In December 2019, Google announced
that BERT had been further rolled out to 70 languages. In April 2020, Google
published a paper on a new algorithm called the Siamese Multidepth
Transformer-based Hierarchical (SMITH) Encoder, which could be the next step
beyond BERT. SMITH has improved capabilities for understanding longer passages
within long documents, in the same way that BERT understands words or phrases.
As of the publication of this book it was not clear whether the SMITH algorithm
had been rolled out in Google Search, but the description in the paper sounds
very similar to the behavior of the passages algorithm, discussed next.

Passages and Subtopics Google announced its intention to release two new search
algorithms in October 2020. The first of these to be released was an algorithm
that would enable Google to divide search results into topics and subtopics. The
genesis of this was Google’s recognition that in many cases broad user queries
get followed rapidly by additional queries designed to further refine what the
user is looking for. Figure 9-4 shows how this might work.

Figure 9-4. An illustration of the subtopics algorithm

If a user searches on pest control, Google may show some initial results for that query but also add subsections for “how to do pest control at home by yourself?”, “paprika pest control”, and “pest control under deck”, as these are popular follow-on queries for users. Google’s Danny Sullivan confirmed in
January 2021 that the subtopics algorithm had been rolled out by Google in
mid-November 2020. The second of the announced algorithms was one that would
enable Google to identify and “index” specific passages within a web page
separately from the content of the rest of the page. The purpose of this update
was to allow Google to provide answers to very specific user questions. Figure
9-5 shows the example of such a query that Google shared in its announcement.

Figure 9-5. An example of the passages algorithm in action

The reason that this algorithm is important is that many user queries are highly
specific. While the answers to these queries may exist in various places on the
web, many of these answers may be buried inside other content whose general
relevance may not align well with the specific user question. With this update,
Google can start to recognize specific passages within a larger document that
are relevant to such specific queries. Danny Sullivan confirmed that the initial
release of the passages algorithm had been rolled out on February 11, 2021.

MUM MUM (an abbreviation for Multitask Unified Model) represented another
substantial step forward with natural language processing for Google, adding
significant capabilities. This algorithm, released in May 2021, has the
capability to work with multimodal data; it can seamlessly search across 75
languages as well as multiple media types (text, images, video, and audio). In
the announcement of MUM, Google shared an example of a user entering the search query “I’ve hiked Mt. Adams and now want to hike Mt. Fuji next fall, what should I do differently to prepare?” Per Google’s explanation, MUM can apply its depth of knowledge to:

1. Realize that the mountains are close to the same elevation.

2. Understand the weather on Mt. Fuji in the fall and recommend appropriate clothing.

3. Surface related topics such as training exercises and the best available gear.

4. Show relevant videos and images in the SERPs.

For example,
key information to answer the user’s question may be found only in a local
Japanese blog post. With MUM, Google can surface information from the blog and
deliver it to a user making the query while sitting in Chicago. In addition,
Google will be able to seamlessly include relevant results from other media such
as images, videos, and audio files that add additional pertinent context related
to the query—so, whereas before this algorithm update Google still could have
surfaced images and videos about Mt. Fuji in the SERPs, with MUM Google can
deliver much more specific information about where to find items such as onsen
towels and clothing.

Page Experience and Core Web Vitals Google announced its intention to begin
using a new signal called page experience in May 2020, and the initial rollout
of the update took place between mid-June and late August 2021. The page
experience signal is actually a collection of many preexisting signals related
to whether or not a site offers a good experience to users. Figure 9-6 provides
a visual representation of what those signals are and how they come together
into one ranking signal.

NOTE The page experience algorithm update was discussed in some depth in Chapter 7. This section provides a brief recap.

Figure 9-6. Google’s page experience signal (source: Google Search Central Blog)

A brief description of each of the components follows:

Core Web Vitals As discussed in “Core Web Vitals” on page 358, this is a set of metrics related to page speed and visual stability. The three components are LCP, FID, and CLS. (A small sketch of checking these field metrics with the PageSpeed Insights API appears at the end of this section.)

Largest Contentful Paint is a measure of how long it takes for the main content elements of the page to be displayed. Web.dev (a site owned and operated by Google) defines LCP as “the render time of the largest image or text block visible within the viewport, relative to when the page first started loading.” The emphasis on the viewport is an important detail, as this means that the true focus is on rendering of the content “above the fold.” Google suggests that websites try to have this occur in 2.5 seconds or less.

First Input Delay is a measure of “the time from when a user first interacts with a page (i.e. when they click a link, tap on a button, or use a custom, JavaScript-powered control) to the time when the browser is actually able to begin processing event handlers in response to that interaction.” This is essentially a measure of how quickly things begin to happen after a user clicks on something on your web page. Google suggests targeting a value of 0.1 seconds or less.

Cumulative Layout Shift is a measure of visual stability. Many sites begin painting content, and then the page content appears to jump around as it continues to render. This is bad because the user may attempt to click on something during the rendering process, only to have the page jump around right at that moment so they end up clicking on something that they didn’t intend to. Web.dev defines CLS as “the sum total of all individual layout shift scores for every unexpected layout shift that occurs during the entire life span of the page.”

Mobile-friendliness This search signal provides an indication of whether or not
your site offers a good experience on mobile devices. You can test this with
Google’s Mobile-Friendly Test tool.

Secure browsing The next search signal is whether or not you have implemented
HTTPS/TLS for your website. You can learn more about how to do that by reading
Google’s page on enabling HTTPS on your servers. You can also learn how to see
if a site is secure by visiting the Chrome Help Center’s page on checking if a
site’s connection is secure.

No intrusive interstitials or dialog boxes The final component of the page
experience signal is the absence of interstitials or dialog boxes that block a
user’s access to the information they came for, particularly on initial page
load. Grouping all of these signals together into the page experience ranking
factor makes it much simpler to communicate with the webmaster community about
them. In addition, the relative weighting of the individual components can be
determined in isolation from the main algorithm, and new components can easily
be added to the page experience signal without having to tinker with the larger
algorithm. While page experience is important, remember that content relevance
and quality are always the most important signals. To illustrate, your page
about tadpoles will not begin ranking for user search queries about cooking pots
just because it’s fast. Similarly, a strong page experience score will not help
your poor-quality content rank, even if it is highly relevant. However, in
instances where queries are highly competitive, with many potential pages that
offer highly relevant, high-quality content to address what the user is looking
for, the page experience signal can play a role in helping you rank just a bit
higher than your competition. You can read more about page experience and
optimizing for it in “Google Page Experience” on page 355.
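To check the Core Web Vitals field metrics described earlier for a specific page, one common route is the free PageSpeed Insights API, which returns CrUX field data alongside a Lighthouse lab report. The following is a minimal sketch; the page URL is a placeholder, and the metric keys shown may be absent for pages with little field data.

```python
# Minimal sketch: fetch Core Web Vitals field data for a page via the PageSpeed Insights API.
import requests

PAGE_URL = "https://www.example.com/"   # page to check (placeholder)

resp = requests.get(
    "https://www.googleapis.com/pagespeedonline/v5/runPagespeed",
    params={"url": PAGE_URL, "strategy": "mobile"},
    timeout=60,
)
resp.raise_for_status()
metrics = resp.json().get("loadingExperience", {}).get("metrics", {})

for key in ("LARGEST_CONTENTFUL_PAINT_MS",
            "FIRST_INPUT_DELAY_MS",
            "CUMULATIVE_LAYOUT_SHIFT_SCORE"):
    data = metrics.get(key)
    if data:
        # "percentile" is the 75th-percentile field value; "category" is FAST, AVERAGE, or SLOW
        print(key, data.get("percentile"), data.get("category"))
```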

The Link Spam Update Google continues to consider links to your website as
important ranking signals, and to invest in improving its algorithms for
detecting both high-quality links and low-quality links. The first link spam
update, announced on July 26, 2021, and rolled out over the next four weeks, was
a step in that process. While the announcement does not specifically discuss
what the update addressed, it does include a focused discussion on problems with
affiliate links and guest posting, along with a reminder of how
important it is to use the appropriate rel attribute values (nofollow,
sponsored, and ugc) in link tags. That does not mean that other aspects of link
spam weren’t potentially addressed by the update as well, but it does suggest
that these areas were the core focus. A second link spam update was rolled out
in December 2022, incorporating the use of SpamBrain, Google’s AI-based
spam-prevention system, to “neutralize the impact of unnatural links on search
results.” As these updates indicate, fighting link spam remains a real issue for
Google, and while a lot of progress has clearly been made over the years,
there’s still room for improvement.

The Helpful Content Update When users use a search engine, their goal is to
locate something—this might be a video to watch, a product or service that they
want, some piece of information that they need, or one of myriad other
possibilities. They’re looking to address a need, and when websites publish
content that does not serve their needs well, it can provide users with a bad
experience, driving them away. The primary concern of any publisher of a website
should therefore be doing a good (or great!) job of delivering high-value
experiences that meet user needs. The array of potential user needs that can be
addressed with online content is nearly limitless, but publishing pages that
focus on very specific niche user needs can be a great way to grow your presence
in search. In the section “SEO Content Strategy” on page 135, we mentioned that
building out a broad and deep content portfolio that thoroughly covers all
aspects of your market space is a great strategy. Unfortunately, many
organizations that try to pursue such a strategy do so with the sole goal of
increasing their organic search traffic, and try to implement it with minimum
cost. This can lead to content being created for search engines, not users,
which results in a poor experience for users. This behavior was the motivation
for the helpful content update that Google rolled out in August and September
2022. This update initially targeted only English-language queries, but a
subsequent release in December 2022 extended it to work globally across all
languages. The algorithm is applied to all search queries, but Google predicted
that the market areas that would see the largest impact would be “online
education, as well as arts and entertainment, shopping and tech-related
content.” The algorithm works on a sitewide basis, meaning that if you have a
section of your content that is deemed to be content that is not helpful to
users, it will apply a negative ranking factor to your entire site. Further, if
you are impacted by this algorithm and you fix or remove the offending content,
you will likely need to wait some number of months before the negative ranking
signals are lifted from your site. For these reasons, it’s important to ensure
that all your content is written with the primary
goal of serving users. This means ensuring that it is all created by subject
matter experts, and that they are given sufficient time to create content that
has significant meaningful value for users. It also requires that the
instructions to the SMEs are not SEO-centric, such as being specific about the
length of the content or specifying lists of keywords that must be used in the
creation of the content. So, if you do pursue a strategy of publishing a large
content library in your market area, keep the focus on building out a deep
resource that will be more helpful to users in your topic area than other sites
on the web. Spewing out content from an assembly line that is more concerned
with the quantity of pages published than the quality of those pages will put
you at high risk of falling foul of this algorithm update.

Broad Core Algorithm Updates Broad core algorithm updates are one of the
fundamental processes that Google uses to continuously improve the quality of
its search algorithms, as acknowledged by Danny Sullivan in a March 9, 2018,
tweet (Figure 9-7). Google has regularly announced these larger updates ever
since that date, and as Sullivan indicated, they happen several times per year.
It’s also important to note that the announcements are about the updates that
Google deems significant enough to confirm; many others are made as well that
Google does not make any comment on.

NOTE In fact, Google makes tweaks to its algorithms on a daily basis; indeed, in July 2019 Sullivan noted that the company had made over 3,200 algorithm changes in the past year.

From newest to oldest, the dates of the acknowledged core algorithm updates are:

• March 15, 2023
• September 12, 2022
• May 25, 2022
• November 17, 2021
• July 1, 2021
• June 2, 2021
• December 3, 2020
• May 4, 2020
• January 13, 2020
• September 24, 2019
• June 3, 2019
• March 12, 2019
• August 1, 2018
• April 17, 2018
• March 7, 2018

Figure 9-7. Google’s Danny Sullivan responding to a question about the March
2018 update

You can see Google’s list of its announced updates on the Google Search Status
Dashboard. Moz also maintains its own list of Google algorithm updates, which
includes those that were noticed by the industry but were not announced by
Google. In addition to the confirmed updates, the industry has noted many dates
where unannounced algorithm changes by Google appear to have had a significant
impact, with many websites gaining or losing traffic.

Functionality Changes Google periodically makes tweaks to the way that search
works. Some of these are announced as well. Examples as of the time of writing
include:

Page title rewrites update (August 16, 2021) Prior to this update, Google had
begun to experiment with showing titles for pages in the search results that
were not the same as what the pages used as their <title> tags (as defined in the <head> section of the web page). The update modified the algorithm for deciding when
and how to do this and expanded the number of search results where Google chose
to make those page title rewrites.

Passage indexing update (February 10, 2021) This update (discussed in more
detail in “Passages and Subtopics” on page 406) marked the initial rollout of
passages functionality into the core algorithm.

Featured snippet deduping (January 22, 2020) Prior to this update, pages shown
in featured snippets would also be shown as regular search results further down
in the SERPs. This approach was a result of the featured snippet algorithm being
a separate one from the main Google ranking algorithm. With this update, the
featured snippet algorithm was effectively fully integrated. As a result, Google
stopped showing the regular search results for pages highlighted in featured
snippets.

BERT update (October 22, 2019) This update significantly improved Google’s
ability to understand the intent of the searcher from their search query. BERT
was discussed earlier in this chapter.

Site diversity update (June 6, 2019) Prior to this update, Google had many
instances of search results where many listings came from the same domain. The
site diversity update reduced the frequency of these occurrences.

Chrome “not secure” site warnings (July 24, 2018) While not an update to the
Google search algorithms, with the release of Chrome 68 Google began labeling
all sites that were not running on HTTPS as “not secure.” For sites that had not
converted as of that date, this was an action that likely impacted their
traffic.

Mobile page speed update (July 9, 2018) With this release, page speed officially
became a ranking factor for mobile results. Google stated then, and continues to
state, that this only impacts the slowest of sites.

Video carousels update (June 14, 2018) With this update, Google moved all videos
appearing in the search results into video carousels, causing significant
changes in click-through rates in the SERPs. However, this particular change was
partially reversed in August 2018.

Search snippet length adjustments (May 13, 2018) On this date, Google formally ended tests it had been running that showed longer snippets for some queries in the search results.

Mobile-first indexing rollout (March 26, 2018) This update marked the beginning
of the formal process of moving sites into being considered “mobile-first.” The
move was first announced on November 4, 2016, and Google had largely been
testing the potential impact since then. After
the rollout was announced, the migration took place at a slow pace to minimize
disruption to the search results.

Zero-result SERPs test (March 14, 2018) For some knowledge panel results, Google
tested zero-result SERPs combined with a “Show all results” button. An example
is shown in Figure 9-8. Per Danny Sullivan, this test was canceled just a few
days later, likely because Google determined that this format was not
appreciated by users.

Figure 9-8. Example of a Google zero-result SERP

Google Bug Fixes Google search is a large and complex ecosystem, and it’s
inevitable that bugs will show up in the system from time to time. Google
typically acknowledges a few of these bug fixes per year, though it’s likely
that there are many other issues it fixes that we do not hear about.

Google Search Console Google Search Console is a web service from Google that
provides webmasters with insight on how Google sees their sites. Some publishers
fear that setting up a Search Console account will give Google access to more
data about their business or website than they want to share, but this is not
the case—the only thing that Search Console does is give you access to
information that Google already has about your site. In fact, this is a great
way to get invaluable data about the performance of your site. Google Search Console provides:

• Detailed information on the search queries sending traffic to your site, including clicks, rankings, impressions, and click-through rates

• Crawl statistics, including crawl rate data

• A record of crawl errors found on your site

• The ability to validate your robots.txt file’s behavior

• Tools to submit and check a sitemap

• Page indexation status information

• Site speed data from the Chrome User Experience Report (CrUX)

• A list of external links that Google sees pointing to your site (note that this list is not comprehensive and is heavily filtered)

• Notifications and messages from Google, including about any manual penalties assessed by Google

• Details on any security issues that have been identified by Google, such as a hacked site or if any malware has been found

• Access to the Google Data Highlighter tool, which can be used to mark up structured data on pieces of your site

• The ability to analyze your structured data

• A URL Inspection tool that can be used to see how Google renders your page

Creating a Search Console account requires that you validate yourself as the site owner. Several methods are available to do this, including:

• Adding an HTML tag to your home page

• Uploading an HTML file to the root folder of your home page

• Providing the site’s Google Analytics tracking code

• Providing the site’s Google Tag Manager tracking code

Setting up your Search Console account is a foundational part of any SEO program. Access to this data can provide you with rich information about what Google sees and how your site is performing in Google search.

Google Webmaster Guidelines If you’re the owner/publisher of a website who wants
to grow your traffic from Google, it’s valuable to develop a strong
understanding of Google’s Webmaster Guidelines. These detail the principles that
Google wants webmasters to follow with their websites. While Google can’t
require you to follow these guidelines, it can choose to give poorer rankings to
websites that don’t. The basic principles that Google wants webmasters to follow
are:

Make pages primarily for users, not for search engines. This is a critical
aspect for the web presence of any business. Knowing what your target users
want, how they search, and how to create a site that presents that in an
understandable and engaging way is just good business, and it’s good for ranking
in Google search as well.

Don’t deceive your users. Sadly, this guideline is necessary because many websites use bait-and-switch tactics to draw users into content and experiences that aren’t what they expected. For example, sites that have problems with Cumulative Layout Shift (discussed in “Core Web Vitals” on page 358) may cause users to accidentally
click on an element other than the one they intended to, creating a very poor
user experience.

Avoid tricks intended to improve search engine rankings. A good rule of thumb is
to ask yourself whether you’d feel comfortable explaining what you’ve done to a
website that competes with you, or to a Google employee. Another useful test is
to ask, “Does this help my users? Would I do this if search engines didn’t
exist?” Take the final sentence of this principle to heart. At one level, it may
sound naïve, but when you realize that Google actively tunes its algorithms to
find the sites that do the best job of serving users, it starts to make more
sense. Google is tuning all its algorithms to find the best user experiences,
and as a result, focusing your efforts on creating great value for users is
strongly aligned with maximizing your chances of ranking in Google search.

Think about what makes your website unique, valuable, or engaging. Make your
website stand out from others in your field. A focus on users is necessary but
not sufficient. You must also strive to create a site that stands out, just as
you seek to have a business that stands out from the competition. Otherwise,
there will be nothing about it that compels users to want to see your website,
and likewise Google will have little reason to rank it highly in the search
results. In addition to these basic principles, Google also discusses a number
of specific guidelines. In the following sections, these are divided into
practices to avoid as well as a couple of good practices to follow.

Practices to Avoid Google recommends avoiding the following:

Automatically generated content Here Google is targeting pages that are
artificially generated for purposes of attracting search traffic, and that add
no practical unique value. Of course, if you run a retail site, you may be using
your ecommerce platform to automatically generate pages representing your
database of products, and that is not Google’s concern here. This is more
targeted at machine-generated (a.k.a. “spun”) content that makes little sense to
users.

Participating in link schemes Since links to your site remain a core part of the
Google algorithm, many parties out there offer ways to cheaply (and
artificially) generate such links. As we’ll
discuss in “Quality Links” on page 428, you should focus your efforts instead on
attracting links that represent legitimate citations of your site.

Creating pages with little or no original content These can take many forms
(described elsewhere in this list), such as pages that are automatically
generated, pages with little user value or purpose that exist just to get
someone to click on an affiliate link, content scraped from other sites, or
doorway pages.

Cloaking Google defines this as “the practice of presenting different content or
URLs to human users and search engines.” This has been an issue because some
websites were structured to show Google a rich informational experience that it
might choose to rank, but when users arrived at the site, they would get
something entirely different.

Sneaky redirects This is the practice of using redirects to send users to a
different page than what gets shown to Googlebot. As with cloaking, the concern
is that users may get sent to content that does not match up with what they
expected when they click on a link in a Google SERP.

Hidden text or links These are spammy tactics that date back to the early days
of search engines where content is rendered on a page in such a way that it’s
not visible, such as implementing white text on a white background, or using CSS
to position it well off the page. With links, a common spam tactic was to
include a link to a page but implement the link on only one character, such as a
hyphen.

Doorway pages These are pages that were created solely to attract search engine
traffic and not for the purpose of creating a good user experience. In practice,
these pages are often created in high volumes and are not well integrated into
the rest of the website. They also may be designed to target lots of highly
similar, though not exactly identical, search phrases.

Scraped content Taking content from third-party sites (“scraping” it) and
republishing it on your own site is not only a copyright violation, but Google
frowns on it as well. Making minor modifications such as using synonyms is not
sufficient; if you’re going to quote something from another site, you must
provide a citation to that site and add your own unique value to it.

Participating in affiliate programs without adding sufficient value
Historically, Google had problems with money-making websites finding ways to
rank poor-quality content in the search results. There is nothing wrong with
making some, or all, of your revenue from affiliate programs. (Figure 9-9 shows
what Google’s Gary Illyes had to say about one affiliate site, indicating that
being an affiliate site is not a problem by itself.) However, if you don’t offer
added value to users, Google will not want to rank your site.

Figure 9-9. Google’s Gary Illyes praises an affiliate site’s content

Loading pages with irrelevant keywords Also called “keyword stuffing,” loading
your page with irrelevant or overly repeated uses of words creates a poor user
experience and is also seen as spammy behavior by Google.

Creating pages with malicious behavior, such as phishing or installing viruses,
trojans, or other badware Google not wanting to serve these pages in the search
results makes a great deal of sense, but it’s not always the result of actions
taken by the publisher of the website. Sites can get hacked; it pays to be
vigilant in maintaining the security of your site and regularly checking to see
if it has been attacked.

Abusing structured data markup Structured data provides you with opportunities
to enhance the appearance of your listings in Google’s search results, but there
is the potential for abuse here too. For example, placing rating markup on all
pages of your website is likely to earn you a manual action.

Sending automated queries to Google This is the practice of using automated
tools to send large quantities of queries to Google. This type of activity is
often used for rank tracking purposes, and Google does not care for it; it uses
up its resources without returning any benefit. Note, though, that tools such as
Ahrefs, BrightEdge, Moz, Rank Ranger, Searchmetrics, Semrush, seoClarity,
Conductor, and more do large-scale rank tracking and can be a valuable component
of your SEO program.

Good Hygiene Practices to Follow The list of recommended practices is
comparatively short and focuses on two areas that represent best practices in
site hygiene:

Monitoring your site for hacking and removing hacked content as soon as it
appears This is unfortunately more common than you might expect. Hackers create
programs to comb the web looking for sites that have security vulnerabilities
and then infiltrate those vulnerable sites, injecting code into the web pages
and often inserting invisible links to their own spammy web pages. One of the
best defenses you can implement to limit your risk here is to always keep your
software platform up to date. For example, if you use WordPress, always install
the latest updates very soon after they become available to you. This includes
any plug-ins.

Preventing and removing user-generated spam on your site Any site that allows
users to contribute content in any form has a high risk of having spammy content
put there by users. Common examples of user-generated content are if you allow
comments on the content you publish, or host forums on your site. Bad actors can
come in and manually submit spammy content, and even worse actors implement
programs that crawl the web looking to submit spammy content en masse. Some of
the best practices here include requiring moderation of all user-generated
content or reviewing all user-generated content shortly after submission. There
are gradations of this too, such as requiring moderation of the first comment or
post by any user, but then letting them contribute additional content without
moderation thereafter. However, you should still plan to review those
contributions once they’re posted. Do take the time to read through the Google
Webmaster Guidelines. Anyone who is serious about their organic search presence
should understand these guidelines and take steps to ensure that their
organization does not cross the lines.

Quality Content Since we, as publishers of website content, want traffic from
Google, it becomes our task to provide high-quality content. This requires
developing an understanding of our target audiences and how and what they search
for, and then providing quality content wrapped in a great user experience so
they can find what they want quickly. However, as you might expect, creating
high-quality content is not always that easy, and many parties attempt to take
shortcuts that can potentially result in low-quality content in the search
results. To combat this, Google does many things to minimize the presence of poor-quality content in its SERPs.

Google took a big step forward when it released the Panda algorithm on February 24, 2011. In its announcement of the release, Google said the following:

Many of the changes we make are so subtle that very few people notice them. But in the last day or so we launched a pretty big algorithmic improvement to our ranking—a change that noticeably impacts 11.8% of our queries—and we wanted to let people know what’s going on. This update is designed to reduce rankings for low-quality sites—sites which are low-value add for users, copy content from other websites, or sites that are just not very useful. At the same time, it will provide better rankings for high-quality sites—sites with original content and information such as research, in-depth reports, thoughtful analysis, and so on.

Panda really upleveled Google’s capabilities for evaluating content quality. Part of this
involved downgrading sites that were publishing low-quality content in high
volume to drive large quantities of search traffic. Over time, though, Panda was
adapted to address issues of content quality at a much broader level. Initially
Panda was an algorithm that ran separately from the main Google algorithm, but
in January 2016 Google confirmed that it had fully integrated Panda into the
main search algorithm. In November 2022, the original Panda algorithm was
replaced by a newer algorithm called Coati. Content quality remains a critical
component of Google’s algorithms, and creating great content is still the best
way to improve your rankings (see Figure 9-10).

Figure 9-10. Google recommends creating great content


Content That Google Considers Lower Quality

Some of the key types of content that Google considers to be poor are as follows:

Thin content
As you might expect, this is defined as pages with very little content (see the
sketch at the end of this list for one way to flag such pages). Examples might
be user profile pages on forum sites with very little
information filled in, or an ecommerce site with millions of products but very
little information provided about each one. However, this goes beyond word
count. Pages that have a large number of words that provide minimal value in
addressing the related user need can also be seen as thin content. For example,
some sites publish thousands of city-specific pages that are related to their
products or services when there is no logical local component to such products
or services. These pages may provide incidental information about the city
itself or variants of the text for the products/services that add no real value
beyond what you would find on the noncity-specific pages.

Unoriginal content
These may be scraped pages, or pages that are only slightly
rewritten, and Google can detect them relatively easily. Having even a small
number of these types of pages can be considered a negative ranking factor.

Nondifferentiated content
Even if you create all original content, that may not
be enough. If every page on your site covers topics that others have written
about hundreds or thousands of times before, then you really have nothing new to
add to the web. Consider, for example, the number of articles in the Google
index about making French toast. As shown in Figure 9-11, there are over 3,000
pages in the Google index that include the phrase how to make french toast in
their title. From Google’s perspective, it doesn’t need more web pages on that
topic.

Figure 9-11. There are thousands of pages with the title tag “how to make french
toast”


Poor-quality/inaccurate content
This is content that contains factual
inaccuracies, is poorly assembled, or is otherwise of low quality. Poor-quality
content may be hard to detect, but one indicator is content that includes poor
grammar or a lot of spelling mistakes. Google could also potentially use fact
checking as another way to identify poor-quality content.

Curated content
Sites that have large numbers of pages with lists of curated
links may not be seen as having content of value, and as a result may receive
little traffic. Content curation is not inherently bad, but if you’re going to
do it, it’s important to incorporate a significant amount of thoughtful
commentary and analysis. Pages that simply include lots of links will not do
well, nor will pages that include links and only a small amount of unique text.

Too-similar content
This used to be a popular tactic for content farms, which
published many articles on essentially the same topic. For example, if the topic
was schools with nursing programs, they might create lots of highly similar
articles with titles such as “nursing schools,” “nursing school,” “nursing
colleges,” “nursing universities,” “nursing education,” and so forth. There is
no need for all those different articles, as they will have no material
differentiation from each other.

Database-generated content
Like providing curated content, the practice of using
a database to generate web pages is not inherently bad. However, when done at a
large scale this can lead to lots of thin-content or poor-quality pages, which
Google does not care for. Note that ecommerce platforms essentially generate
content from a database, and that’s OK as long as you also work to get strong
product descriptions and other information onto those pages.

AI-generated content
In the past, Google has repeatedly stated that it doesn’t
want to rank AI-generated content. For example, the post announcing the helpful
content update asks, “Are you using extensive automation to produce content on
many topics?” Google’s John Mueller further addressed this topic in an SEO
Office Hours session, indicating that if Google’s webspam team determines that
content is AI-generated they are authorized to take action against it. However,
that position has since been softened a bit. A January 2023 post titled “Google
Search’s Guidance About AI-Generated Content” includes a section subtitled “How
automation can create helpful content” that acknowledges:

Automation has long
been used to generate helpful content, such as sports scores, weather forecasts,
and transcripts. AI has the ability to power new
levels of expression and creativity, and to serve as a critical tool to help
people create great content for the web.

So, not all AI-generated content is bad. In addition, as discussed in Chapter 2,
generative AI tools like ChatGPT
can be used to identify potential article topics, create draft article outlines,
and identify gaps in existing content. What can still be problematic, however,
is if AI-generated content accounts for substantial portions of the content on
your site. Our advice in a nutshell is: • If you use generative AI to create
draft articles for you, have a true subject matter expert review the content
before publishing it. In order to ensure that you have their full attention
during this review, publish the article under their name. • Don’t use this
approach for the bulk of your site. Potential customers that come to your site
want to understand what makes you special and unique— what you offer that others
don’t, or why they should buy from you rather than other places they could go.
AI-generated content is at best a summarization of information found elsewhere
on the web. This means that it does not fit well with the Experience piece of
the EEAT equation (discussed in Chapters 3 and 7).
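
Several of the problems in this list (thin pages and too-similar pages in particular) can be surfaced with a simple pass over a crawl export. Below is a minimal sketch that assumes a hypothetical CSV with url and body_text columns; the 200-word threshold is an arbitrary starting point for review, not a Google rule.

    # Flag potentially thin or near-duplicate pages from a crawl export.
    # Assumes a hypothetical CSV with "url" and "body_text" columns.
    import hashlib
    import pandas as pd

    pages = pd.read_csv("crawl_export.csv")

    pages["word_count"] = pages["body_text"].fillna("").str.split().str.len()
    thin = pages[pages["word_count"] < 200]   # arbitrary review threshold
    print(f"{len(thin)} pages under 200 words")

    # Crude duplicate check: hash the normalized text and look for collisions.
    def fingerprint(text: str) -> str:
        normalized = " ".join(str(text).lower().split())
        return hashlib.md5(normalized.encode("utf-8")).hexdigest()

    pages["fingerprint"] = pages["body_text"].apply(fingerprint)
    dupes = pages[pages.duplicated("fingerprint", keep=False)].sort_values("fingerprint")
    print(f"{len(dupes)} pages share identical normalized text with another page")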

The Importance of Content Diversity

Diversity is important to overall search
quality for Google. One simple way to illustrate this is with the search query
jaguar. This word can refer to an animal, a car, a guitar, or even an NFL team.
Normal ranking signals might suggest the results shown in Figure 9-12. Note that
the search results at the top focus on the car, which may be what the basic
ranking signals suggest the searcher is looking for. If the searcher is looking
for information on the animal, they’ll see those results further down in the
SERPs.


Figure 9-12. Normal ranking signals may show these results for “jaguar”

In such cases where the search query is ambiguous, Google might use other
signals to decide to alter the results to look more like those shown in Figure
9-13.


Figure 9-13. Search results may be altered to satisfy differing search intents

In this version of the SERP, one of the animal-related results has been inserted
into the second position. Google makes these types of adjustments to the SERPs
using a concept commonly known in the SEO industry as Query Deserves Diversity
(QDD), though it’s not clear what name Google uses for this approach. Google
makes these adjustments by measuring user interaction with the search results to
determine what ordering of the results provides the highest level of user
satisfaction. In this example, even if traditional ranking signals would put
another page for the car next, it might make sense for the next result to be
about the animal, as this might result in a higher percentage of satisfied users.


The Role of Authority in Ranking Content

Consider again the search query how to
make French toast that we showed in Figure 9-11. While Google has plenty of
results on the topic, there are, of course, some sites that rank more highly
than others for this search query. How is their rank determined? Very
high-authority sites are likely to do fine when publishing content on a topic
that is already well covered on the internet. There are a few possible reasons
why this is the case:

• Reputation and authority are a big factor. For example, if The New York Times
  Lifestyle section posted a new article on how to make French toast, even
  though it is not particularly unique, readers might respond well to it anyway.
  User interaction signals with the search result for that content would
  probably be quite strong, simply because of the site’s reputation.
• High-authority sites probably got to be that way because they don’t engage in
  much of the behavior that Google advises webmasters to avoid. Chances are that
  you won’t find a lot of thin content, cookie-cutter content, too-similar
  content, or any of the other issues that are triggers for Google’s algorithms.
• Google may simply be applying looser criteria to a high-authority site than it
  does to other sites.

Exactly what factors allow higher-authority sites to have
more leeway is not clear. Is it that Google is measuring user interaction with
the content, the quality of the content itself, the authority of the publisher,
or some combination of these factors? There are probably elements of all three
in what Google does.

The Impact of Weak Content on Rankings

Weak content in even a single section of
a larger site can cause Google to lower the rankings for the whole site. This is
true even if the content in question makes up less than 20% of the total pages
for the site. This may not be a problem if the rest of your site’s content is
strong, but it’s best not to take the chance: if you have known weak pages, it’s
worth the effort to address them.

Improving weak content
When addressing thin content, it’s best to dig deep and
take on hard questions about how you can build a site full of fantastic content
that gets lots of user interaction and engagement. Highly differentiated content
that people really want, enjoy, share, and link to is what you want to create on
your site.


There is a science to creating content that people will engage with. We know
that picking engaging titles is important, and that including compelling images
matters too. Make a point of studying how to create engaging content that people
will love, and apply those principles to every page you create. In addition,
measure the engagement you get, test different methods, and improve your ability
to produce high-quality content over time.

Ways to address weak pages
As you examine your site, a big part of your focus
should be addressing its weak pages. They may come in the form of an entire
section of weak content, or a small number of pages interspersed among the
higher-quality content on your site. Once you have identified those pages, you
can take a few different paths to address the problems you find:

Improve the content
This may involve rewriting the content on the page and
making it more compelling to users who visit.

Add the noindex meta tag to the page
You can read about how to do this in “How
to Avoid Duplicate Content on Your Own Site” on page 264. This will tell Google
to not include these pages in its index, and thus will take them out of the
Coati equation.

Delete the pages altogether, and 301-redirect visitors to other pages on your site
Use this option only if there are high-quality pages that are relevant to
the deleted ones.

Delete the pages and return a 410 HTTP status code when someone tries to visit a deleted page
This tells the search engine that those pages have been removed from your site
(see the sketch after these options for one way to implement the 301 and 410
responses).

Use the Remove Outdated Content tool to take the pages out of Google’s index
This should be done with great care. You don’t want to accidentally delete other
quality pages from the Google index!
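
The mechanics of the redirect and 410 options depend on your server or CMS. As one possible illustration only (not a recommendation of any particular stack), here is a minimal Flask sketch with hypothetical URL paths; the noindex option is simply a robots meta tag with the value "noindex" in the page's head section.

    # Minimal Flask sketch of the "delete and 301-redirect" and "delete and
    # return 410" options. The URL paths here are hypothetical examples.
    from flask import Flask, redirect, abort

    app = Flask(__name__)

    # Thin pages that have a strong, relevant replacement: 301 to the replacement.
    REDIRECTS = {
        "/old-thin-page/": "/comprehensive-guide/",
    }

    # Thin pages that were removed outright: answer with 410 Gone.
    REMOVED = {
        "/defunct-page/",
    }

    @app.route("/<path:page>")
    def handle(page):
        path = "/" + page if not page.startswith("/") else page
        if not path.endswith("/"):
            path += "/"
        if path in REDIRECTS:
            return redirect(REDIRECTS[path], code=301)
        if path in REMOVED:
            abort(410)
        abort(404)   # everything else in this sketch is simply not found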

Quality Links

To understand how Google uses links, we need only review Larry
Page and Sergey Brin’s original thesis, “The Anatomy of a Large-Scale
Hypertextual Web Search Engine”. At the beginning is this paragraph:

The
citation (link) graph of the web is an important resource that has largely gone
unused in existing web search engines. We have created maps containing as many
as 518 million of these hyperlinks, a significant sample of the total. These
maps allow rapid calculation of a web page’s “PageRank,” an objective measure of
its citation importance that corresponds well with people’s subjective idea of
importance. Because of this correspondence, PageRank is an excellent way to
prioritize the results of web keyword searches.

The concept of a citation is
critical. Consider the example of an academic research paper, which might
include citations similar to those shown in Figure 9-14.

Figure 9-14. Academic citations
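
To make the idea of "citation importance" from the quoted passage concrete, here is a minimal sketch of a PageRank-style calculation. It is a simplified illustration for intuition only, not Google's production algorithm, and the example graph is hypothetical.

    # Minimal sketch of a PageRank-style iteration. Simplified for illustration:
    # dangling pages are ignored rather than redistributed.
    def pagerank(links, damping=0.85, iterations=50):
        """links maps each page to the list of pages it links to (its citations)."""
        pages = set(links)
        for targets in links.values():
            pages.update(targets)
        n = len(pages)
        rank = {page: 1.0 / n for page in pages}

        for _ in range(iterations):
            new_rank = {page: (1.0 - damping) / n for page in pages}
            for page, targets in links.items():
                if not targets:
                    continue
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share   # each link passes along a share of rank
            rank = new_rank
        return rank

    # Hypothetical example: page "a" is cited by both "b" and "c", so it ends up
    # with the highest score.
    if __name__ == "__main__":
        graph = {"a": ["b"], "b": ["a"], "c": ["a", "b"]}
        for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
            print(page, round(score, 3))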

The paper’s author uses the citation list to acknowledge major sources they
referenced as they wrote the paper. If you did a study of all the papers on a
given topic area, you could fairly easily identify the most important ones,
because they would have the most citations (votes) by prestigious papers around
the same topic area.

To understand why links have value as a signal for search
engines, one need only consider what they represent. When someone links to
your website, they’re offering users the opportunity to leave their site and go
to yours. Generally speaking, most website publishers work hard to bring as many
people to their sites as they can. They want those people to then complete some
action of value on their site, such as buying something, viewing or clicking on
ads, or signing up for a newsletter. For some sites where expressing a strong
position on a debated topic is the goal, the desired outcome may simply
be to get the user to read your entire viewpoint. In all these cases, the direct
commercial value of having a user click on a link to a third-party website which
is not an ad can be difficult to see. Ultimately, what it comes down to is that
people implement links when they believe they are referring the user to a
high-quality resource on the web that will bring value to that user. This brings
value back to the site implementing the link, because the user will have had a
good experience due to that referral, and it may lead to the user returning for
future visits.

The way that Google uses this information is to help it determine
which resources are the best-quality resources on the web. For example, if
someone enters a query such as make a rug, Google likely has tens of thousands
of pages in its index to choose
from that discuss this topic. How does it know which one is the best, the second
best, and so forth? Even the latest AI algorithms can’t make that determination
by simply analyzing the content. Links help Google see what others on the web
think are great resources and act as a valuable input to its algorithms for
determining the quality of content.

That said, not all links are useful. Ads, of
course, are biased because they have been paid for. Links from pages that are of
low relevance to the page they link to likely count for less too. In addition,
there remain many sites that seek to game the link algorithm and drive high
rankings for their sites without really deserving them. As well as understanding
the reasons why some sites might organically implement links to third-party
sites, it is useful to understand what types of behavior are unnatural and
therefore likely to be ignored or penalized by Google. For example, in the
academic world you do not buy placement of a citation in someone else’s research
paper. Nor do you barter such placements (“I’ll mention you in my paper if you
mention me in yours”), and you certainly would not implement some tactic to
inject mentions of your work in someone else’s research paper without the
writer’s knowledge. You would also not publish dozens or hundreds of poorly
written papers just so you could include more mentions of your work in them. Nor
would you upload your paper to dozens or hundreds of sites created as
repositories for such papers if you knew no one would ever see it there, or if
such repositories contained a lot of illegitimate papers that you would not want
to be associated with. In principle, you can’t vote for yourself.

Of course, all
these examples have happened on the web with links—and all of these practices
run counter to the way that search engines want to use links, as they are
counting on the links they find being ones that were earned by merit. This means
that search engines don’t want you to purchase links for the purpose of
influencing their rankings. You can buy ads, of course—there’s nothing wrong
with that—but search engines would prefer those ad links to have the nofollow or
sponsored attribute, so they know not to count them.

Additionally, pure barter
links are valued less or ignored altogether. Many years ago, it was quite
popular to send people emails that offered to link to them if they linked to
you, on the premise that this helped with search engine rankings. Of course,
these types of links are not real citations either. Google will not place any
value on the links from user-generated content sites, such as social media
sites, either. Anywhere people can link to themselves is a place that search
engines will simply discount, or even potentially punish if they detect patterns
of abusive behavior.
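
If you do sell ads or accept sponsored placements, you can spot-check that outbound paid links carry the attributes Google expects. A minimal sketch using the third-party requests and beautifulsoup4 packages follows; the internal/external test is deliberately crude and the example URL is hypothetical.

    # Spot-check that outbound links carry rel="nofollow", "sponsored", or "ugc".
    # Requires the third-party "requests" and "beautifulsoup4" packages.
    import requests
    from bs4 import BeautifulSoup

    def audit_outbound_rel(page_url):
        html = requests.get(page_url, timeout=30).text
        soup = BeautifulSoup(html, "html.parser")
        for a in soup.find_all("a", href=True):
            # Crude external-link test: absolute URL that doesn't contain our own URL.
            if a["href"].startswith("http") and page_url not in a["href"]:
                rel = a.get("rel") or []          # rel comes back as a list of tokens
                flagged = {"nofollow", "sponsored", "ugc"} & set(rel)
                print(a["href"], "->", ",".join(sorted(flagged)) or "no rel attribute")

    # Example (hypothetical URL):
    # audit_outbound_rel("https://www.example.com/some-page/")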


Google has invested heavily in developing techniques for detecting poor-quality
links. For many years this was a highly manual process, but Google took a huge
step forward with the release of the Penguin algorithm in April 2012. This
marked the beginning of Google automatically detecting links that were poor in
quality and either discounting them or assigning an algorithmic adjustment to
the sites receiving these links. Penguin ran separately from the main algorithm
and was updated only on a periodic basis until the release of Penguin 4.0 in
September 2016, when it was fully absorbed into the main algorithm. Google also
changed the algorithm to solely focus on identifying poor-quality links and
discounting them to zero value. By this point its confidence in the efficacy of
Penguin had grown high enough that there was no longer a need to penalize these
links. Google’s webspam team, however, still manually reviews the link profiles
for sites that are considered to have a suspicious link profile and may assign
penalties to those sites (we will discuss this more in “Google Manual Actions
(Penalties)” on page 442). For that reason, it is a good idea to have an
understanding of what types of links Google doesn’t care for.

Links Google Does Not Like

Here is a list of various types of links that Google
may consider less valuable, or potentially not valuable at all. Detecting these
links is pretty trivial for Google:

Article directories
These are sites that allow you to upload an article to them,
usually with little or no editorial review. These articles can contain links
back to your site, which is a form of voting for yourself.

Cheap directories
Many directories on the web exist only to collect fees from as
many sites as possible. These types of directories have little or no editorial
review, and the owner’s only concern is to make money.

NOTE
These comments on
directories are not meant to apply to local business directories, whose dynamics
are quite different. Inclusion in local business directories is helpful to your
business. This topic is discussed more in Chapter 12.

Links from countries where you don’t do business
If your company does business
only in Brazil, there is no reason you should have large numbers of links from
sites based in Poland or Russia. There is not much you can do if people choose
to give you links you did not ask for, but there is
certainly no reason for you to proactively engage in activities that would
result in you getting links from such countries.

Links from foreign sites in a different language than the page content
Some
aggressive SEO professionals actively pursue getting links from nearly anywhere.
As shown in Figure 9-15, there is no reason to have a “Refinance your home
today!” link on a page where the rest of the text is in Chinese.

Figure 9-15. Foreign language mismatch

Comment spam
Another popular technique in the past was to drop links in comments
on forums and blog posts. This practice became much less valuable after Google
introduced the nofollow attribute, but aggressive spammers still pursue it,
using bots that drop comments on an automated basis on blog posts and forums all
over the web. They may post a million or more comments this way, and even if
only 0.1% of those links are not nofollowed, it still nets the spammers 1,000
links.

Guest post spam
These are generally poorly written guest posts that add little
value for users and have been written just to get a link back to your own site.
Consider the example in Figure 9-16, where the author was looking to get a link
back to their site with the anchor text “buy cars.” They didn’t even take the
time to work that phrase into a single sentence!


Figure 9-16. Guest post spam

Guest posts not related to your site
This is a type of guest post spam where the
article written does not really relate to your site. If you sell used cars, you
should not expect Google to see any value in a guest post you write about
lacrosse equipment that links back to your site. There is no relevance.

In-context guest post links
Another form of guest posting that Google frowns
upon is posts that include links in the body of the article back to you,
particularly if those links are keyword rich, and if they don’t add a lot of
value to the post itself. Figure 9-17 shows a fictional example of what this
might look like.

Figure 9-17. Embedded keyword-rich anchor text links

Advertorials
This is a form of guest post that is written like an ad. Given the
structure, it’s highly likely that the site posting it was influenced to do so
in some manner. If you are going to include guest posting as part of your
strategy, focus on sites that don’t permit these types of guest posts.


Guest posts
While the prior four examples all relate to guest posts, Google more
or less frowns on any type of guest posting done for link building. This does
not mean you should never guest post, but your goal in doing so should be to get
people to read the content you write, not to get links.

Widgets
One tactic that became quite popular in the early days of Google’s
prominence as a search engine was building useful or interesting tools (widgets)
and allowing third-party websites to publish them on their own sites. These
normally contained a link back to the widget creator’s site. If the content is
highly relevant, there is nothing wrong with this idea in principle, but the
problem is that the tactic was abused by SEOs, resulting in Google wanting to
discount many of these types of links.

Infographics
This is another area that could in theory be acceptable but was
greatly abused by SEOs. It’s not clear what Google does with these links at this
point, but you should create infographics only if they are highly relevant,
highly valuable, and (of course) accurate.

Misleading anchor text
This is a more subtle issue. Imagine an example where the
anchor text of a link says “Free WordPress themes,” but the page that’s linked
to offers only a single paid theme. This is not a good experience for users and
is not something that search engines will like.

Sites with malware
Of course, Google looks to discount these types of links.
Sites containing malware are very bad for users, and hence any link from them is
of no value and potentially harmful. A related scenario is where a site has been
hacked and had links injected into its source code in a manner that hides them
from the website owner and users. Google will also try to identify these links
and discount them.

Footer links
Once again, there is nothing inherently wrong with a link from the
footer of someone’s web page, but as these links are less likely to be clicked
on or viewed by users, Google may discount their value. For more on this topic,
you can read Bill Slawski’s article “Google’s Reasonable Surfer: How the Value
of a Link May Differ Based Upon Link and Document Features and User Data”, which
discusses Google’s “reasonable surfer” patent. Also, while those footer links
might appear on every page of a website, Google may well give more weight to one
single link that appears in the main body of the content from a highly relevant
page.


Links in a list with unrelated links
This can be a sign of a purchased link.
Imagine Google finds a link to your “Travel Australia” website mixed in with a
list of links to an online casino site, a mortgage lead generation site, and a
lottery ticket site. This will not look legitimate to Google.

Links from poor-quality sites
The links that have the most value are the ones
that come from very high-quality sites that show substantial evidence of strong
editorial control. Conversely, as quality drops, editorial control tends to as
well, and Google may not count these links at all.

Press releases
It used to be quite popular to put out lots of press releases,
complete with keyword-rich text links back to your site. Of course, this is a
form of voting for yourself, and this is not the way that press releases should
be used to promote your site. As shown in Figure 9-18, a much better way to use
press releases is to get your news in front of media people and bloggers, and
hope that it’s interesting enough that they will write about you or share your
news on social media.

Figure 9-18. The right way to use press releases


Cleaning Up Toxic Backlinks

The first part of the link cleanup process is to establish the right mindset. As
you review your backlink profile, consider how Google looks at your links. Here
are some rules of thumb to help you determine whether a link has real value:

• Would you want that link if Google and Bing did not exist?
• Would you proudly show it to a prospective customer that’s about to make a
  purchasing decision?
• Was the link given to you as a genuine endorsement?

While reviewing your backlinks, you may find yourself at times trying to justify
a link’s use. This is usually a good indicator that it’s not a
good link. High-quality links require no justification—it’s obvious that they
are good links.

Another key part of this process is recognizing the need to be
comprehensive. Losing a lot of your traffic is scary, and being impatient is
natural. If there is a manual link penalty on your site, you will be anxious to
send in your reconsideration request, but as soon as you do, there’s nothing you
can do but wait. If you don’t do enough to remove bad links, Google will reject
your reconsideration request, and you will have to go through the whole process
again. If you end up filing a few reconsideration requests without being
successful, Google may send you a message telling you to pause for a while. Make
a point of being very aggressive in removing and disavowing links, and don’t try
to save a lot of marginal ones. This almost always speeds up the process in the
end. In addition, those somewhat questionable links that you’re trying to save
often are not helping you much anyway. For more information on this topic, refer
to “Filing Reconsideration Requests to Remediate Manual Actions/Penalties” on
page 451.

With all this in mind, you also want to be able to get through the
process as quickly as possible. Figure 9-19 provides a visual outline of the
link removal process. Categorizing the links is quite helpful in speeding up
this process. For example, you can identify many of the blogs simply by using
the Excel filter function and filtering on “blog.” This will allow you to more
rapidly review that set of links for problems. Tools such as Remove’em and Link
Detox will do this for you as well. This step is especially helpful if you know
you have been running an aggressive guest posting campaign, or worse, paying for
guest post placements. Some additional tips include:


• You do not need to worry about links that are marked as nofollow unless they
  have other problems, such as contributing to excessive anchor text. In
  addition, if you use a tool that identifies them as toxic, you can put them in
  your disavow file.
• Links from sites with very low authority for their home page probably are
  adding little to no value to your site.
• Links from very low-relevance pages are not likely to be adding much value
  either.

Figure 9-19. Process for cleaning up low-quality links
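
The categorization step described above (isolating blog links, for example) can also be scripted rather than done in Excel. Here is a minimal sketch using pandas, assuming you have exported your backlinks to a CSV with a source_url column; that layout is hypothetical, so adjust the column names to match whatever your tools export.

    # Minimal sketch: bucket backlinks by simple URL patterns to speed up review.
    # Assumes a CSV export with a "source_url" column (hypothetical layout).
    import pandas as pd

    links = pd.read_csv("backlinks.csv")

    def categorize(url: str) -> str:
        url = str(url).lower()
        if "blog" in url:
            return "blog"
        if "forum" in url or "thread" in url:
            return "forum"
        if "directory" in url or "/dir/" in url:
            return "directory"
        return "other"

    links["category"] = links["source_url"].apply(categorize)
    print(links["category"].value_counts())

    # Review one bucket at a time, e.g. write the blog links out for manual review:
    links[links["category"] == "blog"].to_csv("blog_links_to_review.csv", index=False)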

Contacting sites directly to request that they remove links shows good intent to
Google. Google likes to see that you’re putting in the effort to clean up those
bad links. It also recognizes that you will get low compliance from toxic site
owners, so don’t fret if you don’t hear back even after multiple contact
attempts. Google’s webspam team reviews reconsideration requests. This
introduces a human element that you can’t ignore. The members of this team make
their living dealing with sites that have violated Google’s guidelines, and as
far as they are concerned, yours is one of those sites.


As we note in the following section, even when you use all the available sources
of link data, the information you have is incomplete. This means that it’s
likely that you will not have removed all the bad links when you file your
reconsideration request, even if you are very aggressive in your review process,
simply because you don’t have all the data. Showing reviewers that you’ve made a
good faith effort to remove some of the bad links is very helpful, and can
impact their evaluation of the process. However, there is no need to send link
removal requests to everyone in sight. For example, don’t send them to people
whose links to you are marked with nofollow. Once the process is complete, if
(and only if) you have received a manual penalty, you are ready to file a
reconsideration request.

Sources of Data for Link Cleanup

Google provides a list of external links in the
Search Console account for your site. Figure 9-20 shows a sample of that report.

Figure 9-20. Search Console links report


The problem with this list is that it tends to be incomplete. Thus, we recommend
that you also pull links from several other sources. Some of the best additional
sources include Ahrefs, Majestic, Semrush, Link Explorer, and LinkResearchTools.
As with Search Console, each of these tools only offers a partial list of the
links to your site. These software vendors are relatively small, and the
challenge of crawling the web as thoroughly as Google does is formidable, so it
should be no surprise that they don’t cover the entire web. That said,
aggregating link data across multiple such tools will provide you with a more
complete list of links. A study performed by coauthor Eric Enge and Perficient
on various link tool vendors (which included Ahrefs, Majestic, Moz, and Semrush)
found that using these data sources together resulted in finding twice as many
links as were found by the individual tool that reported the most links. Of
course, there will also be a lot of overlap in what they show, so make sure to
deduplicate the list. Also, bear in mind that even the combination of all these
sources is not comprehensive. Google shares only a portion of the links it is
aware of in Search Console. The other link sources are reliant on the crawls of
the individual companies, and as we’ve mentioned, crawling the entire web is a
big job for which they simply do not have the resources.
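
Here is a minimal sketch of that aggregation and deduplication step, assuming each tool's export has been saved as a CSV. The filenames and column names are hypothetical placeholders; substitute the actual headers from your exports.

    # Combine link exports from several tools and deduplicate them.
    # Filenames and column names are hypothetical; adjust to your actual exports.
    import pandas as pd

    EXPORTS = {
        "search_console.csv": "Linking page",
        "ahrefs.csv": "Referring page URL",
        "majestic.csv": "SourceURL",
        "semrush.csv": "Source url",
    }

    frames = []
    for filename, column in EXPORTS.items():
        df = pd.read_csv(filename)
        frames.append(df[[column]].rename(columns={column: "source_url"}))

    combined = pd.concat(frames, ignore_index=True)
    combined["source_url"] = combined["source_url"].str.strip().str.lower()
    combined = combined.drop_duplicates(subset="source_url")

    combined.to_csv("combined_backlinks.csv", index=False)
    print(f"{len(combined)} unique linking URLs across all sources")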

Using Tools for Link Cleanup

As mentioned previously, there are tools (such as
Remove’em and Link Detox) available to help speed up link removal by automating
the process of identifying bad links. However, it is a good idea to not rely
solely on these tools to do the job for you. Each tool has its own algorithms
for identifying problem links, and this can save you time in doing a full
evaluation of all your links. Keep in mind, though, that Google has spent over
20 years developing its algorithms for evaluating links, and it is a core part
of its business to evaluate them effectively, including detecting link spam.
Third-party tools won’t be as sophisticated as Google’s algorithm, nor can they
crawl the web to the depth that Google can. They can detect some of the bad
links, but not necessarily all of the ones you will need to address. You should
plan on evaluating all of the links—not only the pages labeled as toxic, but
also any that are marked as suspicious. You may even consider a quick review of
those marked as innocuous. Use your own judgment, and don’t just rely on the
tools to decide for you what is good or bad.


The Disavow Links Tool

Google provides a tool that allows you to disavow links.
This tells Google that you no longer wish to receive any PageRank (or other
benefit) from those links, providing a method for eliminating the negative
impact of bad links pointing to your site. However, we recommend that you don’t
rely solely on this tool, as Google staff who review reconsideration requests
like to see that you have invested time in getting the bad links to your site
removed. Figure 9-21 shows what the Disavow Links tool’s opening screen looks
like. It includes a warning stating that this is an advanced feature that should
be used with caution, and recommending that it only be used “if you believe that
there are a considerable number of spammy, artificial, or low-quality links
pointing to your site, and if you are confident that the links are causing
issues for you.”

Figure 9-21. Google’s Disavow Links tool

Once you select a site, you are able to upload a file (or edit an existing one)
containing a list of web pages and domains from which you would like to disavow
links, as shown in Figure 9-22.


Figure 9-22. The final screen of Google’s Disavow Links tool

The sample screen in Figure 9-22 shows that the current disavow file for this
site (the name is blacked out) is disavowing 75 domains and 9,065 URLs.
Considering that the link data you have is incomplete (as described in “Sources
of Data for Link Cleanup” on page 438), it is best practice to disavow entire
domains. In other words, if you see one bad link coming to you from a domain, it
is certainly possible that there are other bad links coming to you from that
domain, and that some of these bad links are not in the data available to you.
An example would be a guest post that you want to disavow. Perhaps you have done
only one guest post on that site, but the post will also appear in category
pages and date-based archives on that blog. If you disavow only the post page,
Google will still find many other bad links from that site. In the example shown
in Figure 9-22, it is quite likely that this publisher has not solved their
problem and that many (if not all) of the disavowed URLs should be made into
full disavowed domains. That is usually the safest course of action. Note that
Google does not remove disavowed links from your link graph until it visits the
linking page. This means the crawler must return to all of the pages listed in
the disavow file before that information is fully processed. Some tools can help
speed up this process, such as Link Detox Boost, but otherwise you may need to
wait a few months until the Google crawler visits the links. Refer to the Google
help page for more specifics on formatting the disavow file.
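
As a simple illustration of the file format (one entry per line, a domain: prefix for whole domains, full URLs for individual pages, and # for comments), here is a sketch that assembles a disavow file from a reviewed list. The input data is hypothetical.

    # Build a disavow.txt from a reviewed backlink list.
    # Format: "domain:example.com" for whole domains, full URLs for single pages,
    # and "#" for comment lines.
    from urllib.parse import urlparse

    # Hypothetical review output: URLs you decided to disavow, plus whole domains.
    bad_urls = [
        "https://spammy-directory.example/listing/123",
    ]
    bad_domains = {
        "cheap-links.example",
        "article-farm.example",
    }

    lines = ["# Disavow file generated after manual link review"]
    lines += [f"domain:{domain}" for domain in sorted(bad_domains)]
    # Only keep individual URLs whose domain isn't already disavowed wholesale.
    lines += [url for url in bad_urls if urlparse(url).netloc not in bad_domains]

    with open("disavow.txt", "w", encoding="utf-8") as f:
        f.write("\n".join(lines) + "\n")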


Google Manual Actions (Penalties)

There are two ways that you can lose traffic:
due to algorithm changes by Google and manual actions. Algorithm changes are not
penalties and do not involve any human component, whereas manual penalties do.
While the details of what prompts Google to perform a manual review of a website
are not always evident, there appear to be several ways that such reviews can be
triggered. Figure 9-23 illustrates various possibilities in the case of a site
that has problems with its link profile.

Figure 9-23. Possible ways that Google manual reviews may be triggered

Note that in some cases an algorithm may trigger an algorithmic ranking
adjustment (as shown in Figure 9-23, algorithmic adjustments are made only when
Google’s confidence in the signals is very high; if the confidence level is not
high but there are indications of a problem, a manual review might be
initiated). However, these are not considered “penalties” by Google. Here is a
summary of the major potential triggers:

Spam report
Any user (including one of your competitors) can file a spam report
with Google. While Google has not revealed how many of these it receives on a
daily basis, it’s likely that there are a lot of them. Google evaluates each
report, and if it finds one
credible (it may run some type of algorithmic verifier to determine that), then
it conducts a manual review.

Algorithmically triggered review
While this approach has never been verified by
Google, it’s likely that Google uses algorithms to trigger a manual review of a
website. The presumption is that it uses algorithms that identify large
quantities of sites whose behavior is potentially bad, but not bad enough for
Google to algorithmically adjust them, and these sites are queued for manual
review. Google could also implement custom algorithms designed to flag sites for
review.

Regular search results reviews
Google maintains a large team of people who
perform manual reviews of search results to evaluate their quality. This effort
is primarily intended to provide input to the search quality team at Google that
they can use to help them improve their algorithms. However, it is quite
possible that this process could also be used to identify individual sites for
further scrutiny.

Once a review is triggered, the human reviewer uses a set of
criteria to determine if a penalty is merited. Whatever the outcome of that
review, it is likely that Google keeps the notes from the review in a database
for later use. Google most likely keeps a rap sheet on all webmasters and their
previous infractions, whether they result in a penalty or not.

NOTE
It’s
Google’s policy to notify all publishers that receive a manual penalty via a
message in their Search Console describing the nature of the penalty. These
messages describe the penalty in general terms, and it is up to the publisher to
figure out how to resolve it. Generally, the only resource that Google provides
to help with this is its Webmaster Guidelines. If you receive such a message,
the reconsideration request option in Google Search Console becomes available.

Google provides two key pages to help you understand the different types of
penalties and what they mean: the Manual Actions report and the Security Issues
report. Familiarity with the content of these two pages is an important aspect
of your SEO program, as they detail the types of behaviors that cause Google to
have concerns with your site.

Types of Manual Actions/Penalties

Manual penalties come in many forms. The
best-known types are related to thin content or links, but you can also get a
variety of other penalties. Some of the most common types of manual penalties
are discussed in the following subsections.


Thin content penalties
This type of penalty relates to pages that don’t add
enough value for users, in Google’s opinion. Figure 9-24 shows an example of a
thin content message from Google in Search Console.

Figure 9-24. A thin content penalty message

Unfortunately, when you receive this kind of penalty, Google doesn’t provide any
guidance on what the cause might be. It does tell you that it is a thin content
penalty, but it’s up to you to figure out exactly what the issue is and how to
fix it. There are four primary triggers for thin content penalties:

Pages with little useful content
As the name of the penalty suggests, pages with
very little content are potential triggers. This is especially true if a large
number of these pages are on the site, or in a particular section of the site.

Pages with too-similar content
Publishers whose primary aim is to implement
pages that are really designed to just garner search traffic often try to build
pages for each potential search query a visitor might use, with small or
insignificant variations in the content. To use an earlier example, imagine a
site with information on nursing schools with different pages with the following
titles, and very similar content:

• Nursing schools
• Nursing universities
• Nursing school
• Best nursing schools
• Nursing colleges

Sometimes publishers do this unintentionally, by
autogenerating content pages based on queries people enter when using the
website’s search function. If you decide to do something like this, then it’s
critical to have a detailed review process for screening out these too-similar
variants; pick one version of the page, and focus on that.

Doorway pages
These are pages that appear to be generated just for monetizing
users arriving from search engines. One way to recognize them is that they are
usually pretty
much standalone pages with little follow-on information available, and/or they
are pages that are largely written for search engines and not users. The user
arriving on these pages basically has two choices: buy now or leave.

Once you
believe you have resolved these issues, you need to submit a reconsideration
request. You can read more about this in “Filing Reconsideration Requests to
Remediate Manual Actions/Penalties” on page 451. If you are successful, then you
are in good shape and just need to make sure not to overstep your boundaries
again in the future. Otherwise, it’s back to the drawing board to see what you
might have missed.

Partial link penalties
Another possible manual penalty is a partial link
penalty. This is sometimes called an “impacts links” penalty, as that term is
part of the message you get from Google (see Figure 9-25). These penalties
indicate that one or a small number of your pages have been flagged for bad
linking behavior. Normally, only the rankings and traffic for those particular
pages suffer as a consequence of this penalty.

Figure 9-25. A partial link penalty message

Unfortunately, Google does not tell you which of your pages is receiving the
penalty, so you have to determine that for yourself. This penalty is normally
due to too many questionable or bad links to pages other than your home page.


The cause is often a link-building campaign focused on bringing up the rankings
and search traffic to specific money pages on your site. One of the more common
problems is too many links with keyword-rich anchor text pointing to those
pages, but other types of bad links can be involved as well. The steps to
recover from this type of penalty are:

1. Pull together a complete set of your links, as described in the section
   “Sources of Data for Link Cleanup” on page 438.
2. Look for pages on your site, other than the home page, that have the most
   links.
3. Examine these pages for bad links, as described in “Links Google Does Not
   Like” on page 431.
4. Use the process described in “Cleaning Up Toxic Backlinks” on page 436.
5. Submit a reconsideration request, as described in “Filing Reconsideration
   Requests to Remediate Manual Actions/Penalties” on page 451.

To maximize your chances of success, it’s best to be aggressive in removing
any links that you consider questionable. There is no win in trying to get cute
and preserve links that likely have little value anyway.

Sitewide link penalties
Manual link penalties can also be applied on a sitewide
basis. This usually means more than a few pages are involved and may well also
involve the home page of the site. With this type of penalty, rankings are
lowered for the publisher on a sitewide basis. Consequently, the amount of lost
traffic is normally far more than it is for a partial link penalty. Figure 9-26
shows an example of a sitewide link penalty message. The steps to recover from
this type of penalty are the same as those outlined in the previous section for
partial link penalties. As with partial link penalties, it’s best to be
comprehensive in removing/disavowing questionable links.


Figure 9-26. A sitewide link penalty message

Other types of manual actions/penalties
Some of the other manual penalties
include:

Cloaking and/or sneaky redirects
You can get this message if Google believes you
are showing different versions of pages to Googlebot than you show to users. To
diagnose this, use the URL Inspector tool in Search Console to retrieve the
page, then use the tool to load the same page in another browser window and
compare the two versions. If you don’t have access to Search Console, the next
best bet is the Mobile-Friendly Test tool. You can also compare the cache of the
page in the search results with the real page.


Another idea is to use Preserve Log in Chrome Developer Tools (you’ll need to
explicitly check the Preserve Log checkbox on the Network tab). This will enable
you to spot if redirects are happening that could be changing the delivered
content from what Google was shown. If you see differences, invest the time and
effort to figure out how to remove the differing content. You should also check
for URLs that redirect and send people to pages that are not in line with what
they expected to see—for example, if they click on anchor text to read an
article about a topic of interest but instead find themselves on a spammy page
trying to sell them something. Another potential source of this problem is
conditional redirects, where users coming from Google search, or a specific
range of IP addresses, are redirected to different pages than other users.
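
From your own machine, you can do a rough first check for user-agent-based differences and unexpected redirect chains with a short script like the one below. It uses the third-party requests package, cannot reproduce Googlebot's IP addresses or JavaScript rendering, and is only a supplement to the Search Console tools described above; the example URL is hypothetical.

    # Rough check for user-agent-based cloaking and unexpected redirect chains.
    # Requires the third-party "requests" package. Treat it as a partial check only.
    import requests

    GOOGLEBOT_UA = ("Mozilla/5.0 (compatible; Googlebot/2.1; "
                    "+http://www.google.com/bot.html)")
    BROWSER_UA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"

    def fetch(url, user_agent):
        resp = requests.get(url, headers={"User-Agent": user_agent},
                            timeout=30, allow_redirects=True)
        chain = [r.url for r in resp.history] + [resp.url]
        return resp.text, chain

    def compare(url):
        bot_html, bot_chain = fetch(url, GOOGLEBOT_UA)
        user_html, user_chain = fetch(url, BROWSER_UA)
        if bot_chain != user_chain:
            print("Different redirect chains:", bot_chain, "vs", user_chain)
        # A large difference in page size is a hint worth investigating manually.
        biggest = max(len(bot_html), len(user_html), 1)
        if abs(len(bot_html) - len(user_html)) > 0.2 * biggest:
            print("Response sizes differ noticeably; compare the two versions by hand.")

    # Example (hypothetical URL):
    # compare("https://www.example.com/some-page/")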

Hidden text and/or keyword stuffing
This message is generated if Google believes
you are stuffing keywords into your pages for the purpose of manipulating search
results. An example of such a violation is if you put content on a page with a
white background using a white font, so it is invisible to users but search
engines can still see it. Another way to generate this message is to simply
repeat your main keyword over and over again on the page in hopes of influencing
search results.

User-generated spam
This type of penalty is applied to sites that allow
user-generated content but are perceived to not be doing a good job of quality
control on that content. It’s very common for such sites to become targets for
spammers uploading low-quality content with links back to their own sites. The
short-term fix for this is to identify and remove the spammy pages. The
longer-term fix is to implement a process for reviewing and screening out spammy
content, to prevent it from getting on your site in the first place.

Unnatural links from your site
This is an indication that Google believes you
are selling links to third parties, or participating in link schemes, for the
purposes of passing PageRank. The fix is simple: remove the links on your site
that look like paid links, or add a nofollow or sponsored attribute to those
links.

Pure spam
Google will give you this message in Search Console if it believes
that your site is using very aggressive spam techniques. This can include things
such as automatically generated gibberish or other tactics that appear to have
little to do with trying to add value for users.


If you get this message, the best course of action may be simply to shut down
the site and start over with a new one.

Spammy free hosts
Even if your site is clean as a whistle, if a large percentage
of the sites using your hosting company are spamming, Google may take action
against all of the sites hosted there. Take care to make sure you are working
with a highly reputable hosting company!

For any of these problems, you need to
address the source of the complaints. When you believe you have done so, follow
the procedure outlined in “Filing Reconsideration Requests to Remediate Manual
Actions/Penalties” on page 451.

Security Issues

If Google sees indications that your site has been hacked or is
otherwise dangerous to visit, it will communicate this by sending you a message
in Search Console and/or displaying a warning to potential visitors (either a
label in the search results or an interstitial warning page). Details of
Google’s findings will be available in the Security Issues report. The most
common cause for security issues is failing to keep up with updates to your
content management system or its installed plug-ins. Spammers may take advantage
of vulnerabilities in the CMS or plug-ins to modify your web pages, most often
for the purpose of inserting links to their own sites, but sometimes for more
nefarious purposes such as accessing credit card data or other personally
identifiable information. To resolve the problem, you will need to determine how
your site has been compromised. If you don’t have technical staff working for
you, you may need to get help to detect and repair the problem. To minimize your
exposure going forward, always keep your CMS and all active plug-ins on the very
latest version possible.

Diagnosing the Cause of Traffic/Visibility Losses

The first step to identifying
the cause of traffic and visibility losses is to check your analytics data to
see if the drop is in fact a loss of organic search engine traffic. If you have
Google Analytics, Adobe Analytics, or some other analytics package on your site,
make sure you check your traffic sources, and then isolate just the Google
traffic to see if that is what has dropped. If you confirm that it is a drop in
Google organic search traffic or visibility, then the next step is to check if
you have received a message in Google Search Console indicating that you have
been penalized by Google. You can see if you have any messages by clicking on
the bell icon in the top right of the Search Console screen for your site.
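
If you prefer to work from a raw export rather than the analytics UI, isolating Google organic sessions by day takes only a few lines. A minimal sketch assuming a hypothetical CSV export with date, source, medium, and sessions columns:

    # Isolate Google organic sessions by day to pinpoint when a drop started.
    # Assumes a hypothetical CSV export with date, source, medium, and sessions columns.
    import pandas as pd

    traffic = pd.read_csv("traffic_export.csv", parse_dates=["date"])

    organic = traffic[(traffic["source"] == "google") & (traffic["medium"] == "organic")]
    daily = organic.groupby("date")["sessions"].sum().sort_index()

    # Flag days where sessions fell more than 20% versus the prior day.
    change = daily.pct_change()
    print(daily.tail(30))
    print("Possible drop dates:")
    print(change[change < -0.20])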


If you have received such a message, you now know what the problem is, and you
can get to work fixing it. It’s not fun when this happens, but knowing what you
are dealing with is the first step in recovery. If you don’t have such a
message, you will need to dig deeper to determine the source of your problem.
The next step is to determine the exact date on which your traffic dropped. You
can use many tools on the market to see if there were significant Google updates
on that day. Here are eight possible tools that you can use for this purpose:

• MozCast
• Algoroo
• Google Algorithm Update History from Moz
• Advanced Web Ranking’s Algorithm Changes
• Semrush Sensor
• cognitiveSEO Signals’ Google Algorithm Updates
• Rank Ranger’s Rank Risk Index tool
• AccuRanker’s “Google Grump” rating

For example, if your site traffic dropped
on October 10, 2022, Figure 9-27 suggests that sites that suffered traffic
losses on this date may have been impacted by Google’s passage indexing update.

Figure 9-27. Moz’s Google Algorithm Update History page showing the passage
indexing update

If you haven’t gotten a message in Google Search Console, and the date of your
traffic loss does not line up with a known Google algorithm update, the process
of figuring out how to recover is much harder, as you don’t know the reason for
the drop. Google does make smaller changes to its algorithms on a daily basis,
as mentioned earlier in this chapter. From its perspective, these are not major
updates and they are not reported. However, even these small updates could
possibly have a significant impact on traffic to your site, either positive or
negative. If they do impact you negatively, such tweaks may be hard to recover
from. The best strategy is to focus on the best practices outlined in Chapters 7
and 8, or, if you can afford SEO advice, bring in an expert to help you figure
out what to do next.


Part of the reason Google makes daily tweaks is that it allows the company to
make small improvements on a continuous basis, as well as to run a variety of
tests for the purposes of improving the algorithms. Sometimes the scope of these
changes rises to a level where the industry notices them, and you can see active
discussions about what’s taking place on X (formerly Twitter) or in the major
search industry journals, such as Search Engine Land, Moz, Search Engine
Journal, and others. Some of these updates get confirmed by Google, and others
do not. Nonetheless, any of them can impact traffic to your site in material
ways.

Filing Reconsideration Requests to Remediate Manual Actions/Penalties
Reconsideration requests are applicable only to penalties. Unless you have a
manual penalty, you will not be able to file a reconsideration request to
address traffic losses you may have experienced. Be aware that a person will
review this request, and that person likely reviews large numbers of them every
single day. Complaining about what has happened to your business, or getting
aggressive with the reviewer, is not going to help your cause at all. The best
path is to be short and to the point:

1. Briefly define the nature of the problem. Include some statistics if possible.
2. Explain what went wrong. For example, if you were ignorant of the rules, just
   say so, and tell them that you now understand. Or, if you had a rogue SEO
   firm do bad work for you, say that.
3. Explain what you did to clean it up:
   • If you had a link penalty, let them know how many links you were able to
     get removed.
   • If you did something extraordinary, such as removing and/or disavowing all
     of your links from the past year, tell them that. Actions such as this can
     have a strong impact and improve your chances of success.
4. Clearly state that you intend to abide by the Webmaster Guidelines going
   forward.

As already noted, keep your
reconsideration request short. Briefly cover the main points and then submit it
using the Search Console account associated with the site that received the
penalty. In fact, you can’t send a reconsideration request from an account
without a manual penalty. Once you’ve filed the request, you now get to wait.
The good news is that you generally get a response in two to three weeks.
Hopefully, you will be successful! If not, you have to go back to the beginning
of the process to figure out what you missed.


Recovering from Traffic Losses Not Due to a Manual Action/Penalty

As mentioned
in the previous section, you can only file reconsideration requests if you have
been subject to a penalty. For all other causes of lost traffic, all you can
really do is to make the improvements to your site that you believe will help
you recover, and wait. Google has to recrawl your site to see what changes you
have made. If you have made sufficient changes, it may still take Google several
months before it has seen enough of the changed or deleted pages to tilt the
balance in your favor. What if you don’t recover? Sadly, if your results don’t
change, this usually means that you have not done enough to address whatever
issues caused your traffic loss. Don’t overlook the possibility that your
development team may have made changes that cause your site to be difficult for
Google to crawl. Perhaps they made a change to the platform the site is
implemented on, used JavaScript in a way that hides content from Google, blocked
content from being crawled in robots.txt, or created some other technical issue.
If you can’t identify the cause of the problem, then you will simply need to
keep investing in the areas of your site that you think might be related to the
traffic drop, or, more broadly, that will help increase the value of your site.
Address this situation by taking on the mission to make your site one of the
best on the web. This requires substantial vision and creativity. Frankly, it’s
not something that can be accomplished without making significant investments of
time and money. One thing is clear: you can’t afford to cut corners when trying
to address the impact of traffic losses from Google. If you continue to invest a
lot of time and make many improvements, but you still have content that you know
is not so great, or other aspects of the site that need improvement, chances are
pretty good that you haven’t done enough. You may find yourself four months
later wishing that you had kept at the recovery process.

In addition, as you've
seen, Google’s algorithms are constantly evolving. Even if you have not been hit
by traffic loss, the message from Google is clear: it is going to give the
greatest rewards to sites that provide fantastic content and great user
experiences. Thus, your best path forward is to be passionate about creating a
site that offers both. This is how you maximize your chances of recovering from
any traffic loss, and from being impacted by future Google updates.


Conclusion

Traffic and visibility losses due to manual actions/penalties or
algorithmic updates can have a significant impact on your business. It is
therefore critical as a digital marketer to understand Google’s ever-evolving
Webmaster Guidelines to create compelling websites that satisfy the needs of
your end users, and to promote those websites with legitimacy and longevity in
mind.


CHAPTER TEN

Auditing and Troubleshooting

Even if you have a mature SEO program, new challenges and/or opportunities can still arise. The reasons for this are many, including:

• The technology you use to implement your site requires you to work around its limitations to properly support SEO.

• Many (or most) of the people in your organization don't understand how SEO works, or worse, don't value SEO, leading to mistakes being made.

• Google algorithm changes can create new opportunities or issues to address.

• Your competition may invest heavily in SEO, resulting in new challenges to your organic search market share.

As a result, knowing how to conduct an SEO audit and being able to troubleshoot SEO problems are essential skills for any professional SEO.

SEO Auditing

There are many reasons why you may need to conduct an SEO audit.
Perhaps you plan to perform one every quarter, or every six months. Perhaps a
Google algorithm update impacted your site traffic. Or maybe you saw a drop in
your organic search traffic in your analytics data. It could also be that you
are proactively trying to find ways to better optimize your site in order to
increase organic traffic market share. Regardless of how you think of why you’re
doing the audit, the underlying purpose of performing it comes down to finding
ways to improve the SEO of your site and therefore increase the organic search
traffic to the site. It’s good to keep that in mind throughout the entire
process and to remind all impacted stakeholders as well. For this reason, this
chapter focuses on finding SEO issues and opportunities, either as a result of
technical SEO problems on your site, through the creation of new content,
or by making improvements to existing content (you can read more about the
broader goal of creating an SEO strategy in Chapter 5). Once you’ve built a list
of problems and opportunities, the next step is to devise a plan to make changes
to improve your overall organic search results. Because of the dynamic nature of
the environment that we live in, it’s a good idea to conduct audits on a regular
basis. For example, you might decide to schedule an audit once per quarter or
twice per year. This type of cadence will enable you to limit the amount of
technical debt–driven SEO problems that your site accumulates. The regular audit
schedule will offer other benefits as well, including helping raise the
consciousness of your entire organization around the impact and importance of
SEO.

Unscheduled Audits

Having regularly scheduled audits is smart business, but the need can also arise from time to time for an unscheduled audit. Some reasons why this may occur include:

• A Google algorithm update

• An unexpected organic search traffic drop

• A merger or some other corporate activity that requires merging web properties together

• An unexpected rankings drop

• A site redesign

• Site changes being pushed live before SEO checks

• New brand guidelines that necessitate significant site changes

• A senior manager demanding some type of review

If any of these events occur, you may feel the need to do an immediate audit. In such a case the focus of your audit will be guided by the event causing you to do it. For example, if you have experienced an unexpected rankings drop for one page or across a single section of a site, your audit will likely focus on the aspects of your site that are most likely to impact that ranking.

Customized Approaches to Audits

There are many variations of SEO audits that you
can conduct. Sometimes this could be because the audit was unscheduled and
therefore focused on a specific issue. Some of the reasons why you may conduct a subset of a full audit include:

• Routine analytics monitoring may show specific issues you want to investigate. For example, perhaps traffic to one specific URL has dropped; your audit may focus on issues specific to that page.


• You may have a new version of the site that you are about to release and are checking to ensure that new problems have not been introduced. In this event it is very useful to perform a technical SEO check only.

• A new Google algorithm may have been released, and industry commentary may have exposed specific issues that the update focuses on. In such a case, you should consider a mini-audit to determine whether or not your site has those issues and might be impacted by the change.

• You may discover that traffic to different language and country versions of your site is lagging far behind. In this event, you may focus an audit just on the international pages and how they are integrated into your overall web ecosystem. This should include verifying that your hreflang tags are correctly implemented across all language and country versions of the site (or sites, if your language and country versions appear on different domains).

• Your website may have page experience issues. This Google ranking signal (which is discussed more in Chapter 7) includes the Core Web Vitals (Cumulative Layout Shift, First Input Delay, and Largest Contentful Paint) as well as whether or not the site is mobile-friendly, uses interstitials, and serves pages over HTTPS. Audits of these individual factors are easily done separately, as they often involve different teams to do the work.

• Changes may have been proposed to the main navigation of the site. There are many reasons beyond SEO why you may want to make such changes, including usability and user experience concerns. Whatever their motivation, these changes can have a large impact on SEO, and this should be monitored.

• Brand guidelines may have been updated, necessitating a review of how your brand is represented across the site. In this type of audit, you will want to validate that SEO for the site is not being damaged in the process.

• Marketing may want to have a brand audit of the site done. During this process you would focus on validating your brand guidelines, and ensure that any corrections made don't harm SEO.

• If the organization acquires another business or website, you may want to combine those new web assets into the existing site or establish cross-linking between the web properties. These types of activities should be optimized for SEO.

• Multiple departments may be responsible for different portions of the site. This can lead to frustrating outcomes if different groups at your organization make changes to the site without first validating that SEO has been actively considered in the process. If changes have already been implemented on the site without SEO being considered, you should retroactively check if they have an impact on SEO.


• The site may have a paywall that is correctly implemented from a user perspective, but that is preventing it from performing in organic search how you want it to. It may be that flexible sampling (which you can read more about in "Reasons for Showing Different Content to Search Engines and Visitors" on page 271) isn't implemented properly.

• Your organization may be making changes to the site to improve accessibility to users with some form of disability, such as vision impairment, reduced mobility, or cognitive impairment (in the US these issues are addressed by a civil rights law known as the Americans with Disabilities Act of 1990; other countries have different standards). Improving site accessibility can be important for your business, but there may be significant interactions with SEO optimization of the site.

• Some organizations set up a rotation of regularly scheduled partial audits for the website. For larger web properties this is an excellent idea, and it can allow you to balance the workload of having to conduct a full audit of everything all at once.

• Due to resource limitations, you may want to focus on an audit that tackles only your biggest areas of concern.

These are just some examples of reasons you may want to perform different types of audits. There are many other reasons why you may want to focus on auditing specific issues. Regardless of the motivation, once you've decided to validate some portion of the site via a partial audit, the key is to determine what factors are of most importance based on the actual need you're trying to address.

Pre-Audit Preparations

Prior to beginning an audit, you should take several
steps to lay the groundwork for a successful result. This preparation will help
ensure that the audit focuses on the right areas and that the work goes as
smoothly as possible. Key steps at this stage are:

• Identify the reasons why you are deciding to conduct an audit. Is it a routine check that you perform quarterly or twice per year? Or is it driven by some event such as a drop in organic search traffic? Are there specific issues that you are looking for? Make a clear list of what you want to watch out for, and use that to determine the specific areas where you will focus your audit.

• Connect with the various stakeholders that are impacted by the audit to find out what they might be looking to learn from it and what their concerns might be. Be sure to include any development teams or content teams who might expect to receive work requests to support the audit.

• Define the scope of the audit. Are you reviewing the entire website or just a portion? Is an SEO content audit part of the plans?


• Develop an SEO audit plan that includes what steps will be taken and in what order.

• Review the proposed plan with the stakeholders and get their buy-in.

• Identify the people who will lead and conduct the audit and confirm their availability to do the work.

• Determine which special tools or databases may be required to support the audit and ensure that these are properly set up before starting the work.

Once this preparation is complete, you're ready to begin the audit. Going through each of these steps prior to beginning the work will improve the chances that the entire process will go smoothly.

Additional SEO Auditing Tools

While your analytics package and Google Search
Console will both play a large role in your SEO audit, there are many other
tools that can be invaluable aids. A comprehensive discussion and review of
these is provided in Chapter 4, but what follows is a short summary of the top
tools you can leverage in an audit.

Crawlers

There are many high-quality crawlers available. These include (listed alphabetically):

Botify
Botify provides both crawling and logfile analysis solutions, as well as
site change alerts and keyword tracking. It promotes its solutions as tailored
to specific market segments (E-Commerce, Travel, Publisher, Classifieds, and
Consumer Products). Botify is known for being well suited to crawling
large-scale websites and can crawl 100 URLs requiring JavaScript rendering per
second. Botify pricing is by quotation only.

Lumar
Lumar (previously known as Deepcrawl) offers a solution that includes a
crawler, a logfile analyzer, and an additional tool called Lumar Monitor that is
used to validate new code releases before you push them live and can also be
used to detect changes to your site. Lumar supports crawls in excess of 10M
pages. Pricing is by quotation only.

Oncrawl
Oncrawl also offers a web crawler and a logfile analysis solution and
provides guidance on how they apply to specific industry segments. It also
offers business intelligence and machine learning solutions to enable customers
to do more with their data. Oncrawl was acquired by BrightEdge in March 2022.
Pricing starts at under $100 per month.

SEO AUDITING

459

Screaming Frog
Screaming Frog is highly popular partly because it provides a robust free version that can crawl up to 500 URLs, and the paid version is comparatively inexpensive (less than $200 per year). A separate logfile analyzer is also available, also costing less than $200 per year. Note that Screaming Frog runs on your own computer rather than as a hosted service.

SEO platforms

A wide range of SEO platforms are available. These include (listed alphabetically):

• Ahrefs

• BrightEdge

• Conductor

• Moz

• Searchmetrics

• Semrush

• seoClarity

Each of these provides a variety of ways to audit your website. Figure 10-1 shows a sample of the data you can get from the Semrush SEO auditing tool.

Figure 10-1. Sample Semrush Site Audit report

The Semrush report summary provides a series of thematic reports, and you can
click the “View details” buttons to get more details on each item. These
automated SEO
checks can uncover many types of problems. However, with many of these tools the
number of pages they let you crawl is limited, and there are also many other
things you will want to check, so use this as the start of your audit—not the
whole audit.

Core SEO Audit Process Summary

Before you start your audit, you should put together a list of all the types of checks you want to perform and the issues that you want to look for (these are described in the following section). That SEO audit checklist can include six major areas of activity that you will want to execute:

1. Crawl the website with one of the crawlers discussed in the previous section. The crawl of your site is one of the richest sources of information available to you during an audit. Crawls of large sites can take days, so start your crawl at the very start of the audit process. You can see the types of checks that crawlers can help perform in "Issues that can be found through crawl analysis" on page 462.

2. Use Google Search Console to learn what you can about how Google sees the site. Bonus: also use Bing Webmaster Tools, as this may provide some different insights on how search engines see your website. You can learn more about how to use Search Console in your audit in "Issues that can be found in Google Search Console" on page 465.

3. Review your analytics data. Regardless of whether you use Google Analytics, Adobe Analytics, or another platform, this should provide invaluable insight as to where the main level of activity takes place on your site. Analytics can help expose a number of types of issues; you can learn more about these in "Issues that can be found in your analytics tool" on page 468. If you have not done so already, set up a benchmark of current traffic and conversions, rankings, bounce rate, and other factors. You will be able to use this in future audits. If you have done this previously, then leverage it to see what has changed in terms of progress or setbacks for your site.

4. Use backlink tools to audit your backlinks. This can help you determine if your site has toxic backlinks or a weak backlink profile, identify broken incoming links, and more. Details of the types of checks you can perform with these tools are provided in "Issues that can be found with backlink tools" on page 469.

5. Have an SEO expert conduct a detailed human review of the site. While all of the above tools are valuable sources of information, there are many types of issues that are best spotted by the human eye. These are discussed further in "Issues best found by human examination" on page 470.

6. Review the Google SERPs to identify additional issues. This can include identifying missed opportunities and structured data errors, as well as checking the titles and descriptions Google uses for your pages. You'll find more information on
this in “Issues best found by reviewing the SERPs” on page 471 and “Validating
Structured Data” on page 483. While these are the main areas of activity, there
are many other areas that you may choose to include in your audit. For example,
you may choose to perform a detailed review of other elements of your site, such
as structured data or hreflang tags, or you may have other areas where you know
you have specific issues and you want to examine those. These are described in
detail in “Troubleshooting” on page 475.

Sample SEO Audit Checklist

There are many different types of issues that you'll
want to look for during a technical SEO audit, which may involve various tools
and approaches. The following sections discuss the different types of analyses
that you can include in an audit and the types of issues that each addresses.
You can use this information to put together your own SEO checklist.

Issues that can be found through crawl analysis

One of the critical steps in any
audit is to perform a fresh crawl of your site, as well as an analysis of the
site’s logfiles. Figure 10-2 shows an example of a crawl report from Semrush.

Figure 10-2. Sample Semrush crawl report


Crawls like these can expose many core issues, including the following:

Incorrect or missing canonical tags Incorrectly implemented canonical tags can
come in the form of links to pages that don’t exist or links to incorrect pages.
You can read more about how these tags should be used in Chapter 7.

Broken internal links These are links that return a 404 or 410 error. This can
happen if the links were improperly implemented or when a page is removed from
the site. You can see a sample of 404 response codes found by Screaming Frog in
Figure 10-3.

Figure 10-3. Sample Screaming Frog 404 errors report

Use of nofollow, sponsored, or ugc attributes on internal links As noted in
“Using the rel=“nofollow” Attribute” on page 278, there is no reason to apply
any of these attributes on links to other pages on your site.

Pages blocked in robots.txt Your audit should validate that you’re not blocking
crawling of any pages or resources that you want Google to be able to see. A
common error is blocking crawling of CSS or JavaScript files.

Bad redirects Any of the well-known crawling tools will provide a report on
redirects found on the site, including those that do not return a 301 HTTP
status code.

Missing title or heading tags Each of the popular SEO crawlers includes
reporting on title tags and heading tags that makes it easy to identify pages
that are missing these tags. Figure 10-4 shows an example of missing tags
discovered in a crawl by Screaming Frog.


Figure 10-4. Sample Screaming Frog Missing H1s report

Duplicate title or heading tags Every page on your site should have title and
heading tags that are unique to that page. If you are not able to come up with
unique tags for a given page, then you need to consider whether or not the page
in question should exist. These issues are easily identified in your crawl
report. Note that if you have duplicate title and heading tags across paginated
pages this is likely not a problem, though we might still advise you to include
simple differentiators, such as a page number.

Missing or generic meta descriptions Instances of missing or generic meta
descriptions should be validated. On very large sites it may not be possible to
tailor meta descriptions for every page. In such a case, using a more generic
meta description or no meta description can be a valid choice to make.
Nonetheless, identifying critical pages where these can be improved is a key
activity during an audit.

Missing images Any missing images should be identified and fixed. These will
show up as 404 errors in your crawler reports.

Missing/inaccurate image alt attributes
alt attributes play an important role in assisting search engines in understanding the content of images.

Nondescriptive image filenames Naming your image files in a descriptive way
helps confirm what they contain.

Pages on your site that are not being crawled by Google, or important pages that
are crawled with low frequency The easiest way to check this is to use a logfile
analysis tool to compare the pages crawled by Googlebot with the list of pages
on your site found with your SEO
crawler. How to determine what may be causing these issues is discussed more in
“Pages Not Being Crawled” on page 475.

Duplicate content These are pages that are essentially copies of other pages on
your site or of other pages on the web. Crawl reports can help you potentially
identify duplicate content by a variety of means. For detection of literally
duplicate pages, tools use either a content hash or a checksum for the page. In
addition, you can use other techniques to detect pages that may have too much
content duplicated, such as looking at pages that have identical title tags.
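As a rough illustration of the content-hash approach, the following Python sketch fingerprints the visible text of each page and groups pages whose fingerprints match exactly. The URLs are placeholders, and the tag stripping is deliberately crude; real crawlers use far more robust content extraction:

    # Flag exact-duplicate pages by hashing their normalized visible text.
    import hashlib
    import re
    import urllib.request
    from collections import defaultdict

    urls = [
        "https://www.yourdomain.com/page-a",  # placeholder URLs from your crawl export
        "https://www.yourdomain.com/page-b",
    ]

    def content_fingerprint(url):
        html = urllib.request.urlopen(url, timeout=10).read().decode("utf-8", "ignore")
        text = re.sub(r"<script.*?</script>|<style.*?</style>", " ", html, flags=re.S | re.I)
        text = re.sub(r"<[^>]+>", " ", text)               # drop remaining tags
        text = re.sub(r"\s+", " ", text).strip().lower()   # normalize whitespace and case
        return hashlib.sha256(text.encode("utf-8")).hexdigest()

    groups = defaultdict(list)
    for url in urls:
        groups[content_fingerprint(url)].append(url)

    for members in groups.values():
        if len(members) > 1:  # identical fingerprints suggest exact duplicates
            print("Possible duplicates:", members)

Exact-match hashing only catches literal duplicates; near-duplicates still require fuzzier comparisons or a manual review.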

Suboptimal anchor text Review the anchor text of your links for generic phrases
like “Learn More” or other language that is not likely to be descriptive of the
destination page, as well as repetitive/overused anchor text.

Content on your site that has no links pointing to it (“orphan pages”) You can
identify orphan pages by looking for pages in your XML sitemaps that are not
found in your crawl of the site, pages identified in your analytics program that
are not found in the crawl, pages shown in Google Search Console that are not
shown in your crawl, and pages on your site that appear in your logfiles but not
in the list of pages from a complete crawl of your site.
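A minimal sketch of the first of these checks might look like the following Python snippet, which assumes you have a local copy of your XML sitemap and a CSV export from your crawler (the filenames and the "Address" column are assumptions to adapt to your own tooling):

    import csv
    import xml.etree.ElementTree as ET

    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    def sitemap_urls(path):
        tree = ET.parse(path)
        return {loc.text.strip() for loc in tree.getroot().findall("sm:url/sm:loc", NS)}

    def crawled_urls(path, column="Address"):
        with open(path, newline="", encoding="utf-8") as f:
            return {row[column].strip() for row in csv.DictReader(f)}

    in_sitemap = sitemap_urls("sitemap.xml")           # local copy of your sitemap
    found_by_crawl = crawled_urls("crawl_export.csv")  # URLs your crawler reached via links

    for url in sorted(in_sitemap - found_by_crawl):
        print("In sitemap but not reached by the crawl (orphan candidate):", url)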

Content on your site that is too many clicks away from the home page This is
important because the number of clicks that your content is away from your home
page is an indicator of how important you see that content as being to your
site. If it’s five or more clicks away, then you’re sending Google a strong
signal that it’s very low priority. Use this analysis to verify that you’re
comfortable with that for any pages for which that is the case.
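If your crawler does not report click depth directly, you can approximate it with a breadth-first search over an exported internal-link edge list (source URL, destination URL). This is only a sketch; the file and column names are assumptions:

    import csv
    from collections import defaultdict, deque

    HOME = "https://www.yourdomain.com/"  # placeholder home page URL

    links = defaultdict(set)
    with open("internal_links.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            links[row["Source"]].add(row["Destination"])

    depth = {HOME: 0}
    queue = deque([HOME])
    while queue:
        page = queue.popleft()
        for target in links[page]:
            if target not in depth:            # first visit = shortest click path
                depth[target] = depth[page] + 1
                queue.append(target)

    # Print the 20 deepest pages; anything five or more clicks deep deserves a second look.
    for url, d in sorted(depth.items(), key=lambda item: -item[1])[:20]:
        print(d, url)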

Overreliance on PDF file content Review the list of pages found on your site to
see how many of them are PDF files. Google does not value PDF file content as
much as HTML content. If you wish to offer users downloadable PDFs, make sure
that you also have HTML versions of the pages.

Issues that can be found in Google Search Console

Google Search Console is an
invaluable tool for obtaining data on how Google sees your site. While this
section focuses on Search Console, you should also strongly consider setting up
Bing Webmaster Tools, which provides a different search engine’s perspective on
your site; in some cases the information provided is different.


Search Console could find the following:

• Pages on your site that are not being indexed by Google.

• The top-performing queries sending traffic to your site.

• JavaScript not rendering properly, causing content or links to be invisible to Google. An example of this is shown in Figure 10-5.

Figure 10-5. Google Search Console’s URL Inspection tool

• Issues with XML sitemaps. These may include missing sitemaps, sitemaps not specified in robots.txt, sitemaps for multiple domains (e.g., international domains), or sitemaps with too many errors.

• New or updated pages that should be submitted for recrawling.

• Structured data problems. Search Console offers high-level summary reports of structured data issues and also lets you drill down to see the details for each page. We describe how to do this in more detail in "Validating Structured Data" on page 483.


• Page experience issues. As discussed in Chapter 7, the page experience signal has many components. These include:

Lack of mobile-friendliness
Check your pages with the Mobile-Friendly Test tool. You can also check this directly within the Google SERPs if you search on google search console mobile friendly test. You can see an example of this in Figure 10-6.

Figure 10-6. The Mobile-Friendly Test tool in the Google SERPs

Presence of insecure (HTTP) pages
The presence of HTTP (rather than HTTPS) pages on your site is a negative ranking factor. Your crawl report will make identification of these pages easy.

Use of interstitials on the site
Google treats using interstitials on your site, particularly on initial page load, as a negative ranking factor. While this can be effective in generating revenue, you should seriously consider the negative implications, in terms of both organic search rankings and how users experience your site.

Core Web Vitals factors such as slow page load speed or high Cumulative Layout Shift (CLS)
While page speed and CLS are small ranking factors, they do matter. Search Console provides detailed information on the pages with errors on your site. You can see an example report in Figure 10-7.


Figure 10-7. Search Console Core Web Vitals report

In addition, you can use Search Console to:

• See a snapshot of some of the external links pointing to your site. This list is not comprehensive but can still be useful.

• Review and learn what you need to do to improve internal links on your site.

• Identify issues with the implementation of Accelerated Mobile Pages (AMP) on your site (if you have done this).

Issues that can be found in your analytics tool

Your analytics program also has
a key role to play in your SEO audits. Whether you use Google Analytics, Adobe
Analytics, or some other analytics package, you should be able to see the
following information about your site:

Overall traffic trends over time Here you can see if your traffic is growing, shrinking, or staying the same. For example, if your traffic is down, then you'll need to focus your audit on trying to find out why.


Which pages/sections of your site are getting traffic This can help you identify
important sections or specific key pages of your site that aren’t performing the
way you would hope. You may then decide to focus your audit on these pages or
site sections to figure out why they may be underperforming. Are there ways that
the content can be improved? Does the site section/page have sufficient links
from other pages on the site? Are there any external links to the page or
section? How do your pages compare to those of competitors?

Pages that have seen sudden traffic drops Perhaps they have been penalized, or a
recent change to the page’s content or to the site caused the search rankings to
drop.

Issues that can be found with backlink tools

As we'll discuss in Chapter 11,
links continue to play a major role in search rankings. For that reason,
understanding how your backlink profile compares to your competition and keeping
track of any recent changes to that profile can be very revealing. There are
many tools that can help with this. Some of the most popular tools are
(alphabetically): Ahrefs, Majestic, Moz Link Explorer, and Semrush. The kinds
of issues these tools can help you find include:

A weak backlink profile Do your competitors have much stronger link profiles
than you do? If their overall link authority is significantly greater, this can
impede your ability to compete and grow your traffic over time.

Backlink profile gaps Are there certain types of links that your site lacks?
Perhaps your competitors have earned links from many different types of sites
(say, media sites, blogs, education sites, government sites, and industry
conference sites), while your site only has links from blogs and industry
conference sites. This type of coverage gap could lead to diminished overall
authority.

Broken external links External links that go to pages that return 40x errors do
not add value to your site’s ranking. For that reason, rapidly identifying those
links and taking steps to repair them can be a fruitful activity. You can see
more about how to do this in Chapter 11.

Link disavowal Many of these tools have the ability to create a link disavowal
file or list of URLs to disavow. However, use this capability with care, as
Google primarily just discounts toxic links (so disavowing links should not be
necessary). In addition,
chances are that some of the links you submit for disavowal may be perfectly
fine, so disavowing them would be harmful to your site. You can see example
output from the Semrush backlink audit tool in Figure 10-8.

Figure 10-8. Sample Semrush Backlink Audit report

There are many other ways that link analysis tools can be used to bolster your
organic search visibility. These are discussed in more detail in Chapter 11.

Issues best found by human examination

Some issues are more easily recognized by
human examination. Examples of these include:

Poorly structured site navigation Your site navigation plays a large role in how
users navigate through your site. For that reason, it also plays a key role in
how Google can access the site and how it sees what content you are
prioritizing.

Opportunities for interlinking How you interlink on your site is also a powerful
signal for SEO. This includes your overall navigation as well as in-line
interlinking within your content. It’s often productive to review your content
in detail and find opportunities to add more highly relevant in-context links in
the content. Some of the crawling tools, such as Oncrawl and Botify, can perform
an internal linking analysis and crossreference that with traffic and logfile
data.


Thin or poor-quality content While data from a crawl of the website can identify pages with less content on them, in many cases it's best to examine the pages by hand prior to concluding that they are thin content. Removing these pages from
the Google index can in some cases offer a powerful SEO upside, especially on
what Google considers Your Money or Your Life (YMYL) sites.

Content created without sufficient expertise Content quality is a key part of
long-term SEO success, and determining the quality of content on a site is very
difficult to do programmatically. Having someone with subject matter expertise
review the content for quality and then addressing any shortcomings can offer
significant benefits.

Content created primarily for search engines Content created with the primary
goal of ranking in search engines rather than for users is problematic.
Detecting this requires evaluating content to see if your site is providing any
unique expertise or perspective that other sites do not. It can also be helpful to review your organization's perspective during the content creation process: was the primary goal to include certain keywords in the content, or were you focused on helping users?

Problems with Experience, Expertise, Authoritativeness, and Trustworthiness
(EEAT) Evaluating how your site encourages EEAT (discussed in Chapters 3 and 7)
and how others see your site has many subtleties that make it a task best
performed by a human, and it’s an important part of SEO today. The preceding
list is by no means comprehensive; there are many different types of issues that
can crop up in your SEO program where a detailed review by an experienced SEO is
needed to finalize what actions should be taken with respect to potential
problems or opportunities.

Issues best found by reviewing the SERPs

It can also be quite helpful to review
the SERPs on which your pages appear, as well as the ones where you would like
them to appear (missed opportunities). At the start of your audit, make a list
of the search queries that you want to review. Some potential inputs to this list are:

• Make a list of the most important search queries to your site based on which ones deliver the most organic search traffic to your site.

• Make a list of the most important target search queries that don't currently deliver the amount of search traffic that you would like to be receiving.

• If you have an SEO enterprise platform (BrightEdge, Conductor, Searchmetrics, Semrush, seoClarity) you can use that to find major changes in traffic on search
queries over time. Each of these platforms will also show you which search
result pages currently have which search features. There are a number of
different areas that you can investigate during your reviews of the SERPs. Two
of the most important are:

• Changes in search results page layout. Changes in the SERPs, including the addition or removal of search features, can have a large impact on the amount of search traffic you can receive. Review how these may have impacted your organic search traffic.

• The titles and meta descriptions that Google uses for your search listings. For those queries where one or more of your pages rank, review what Google is using for your page title and meta description. Are they the same as the title tag and the meta description that you wrote for the page? If not, consider why Google may have chosen to write something different. The main reason these are rewritten is that Google feels that its version is a better fit to the search query. Based on that, consider how you might update your title or meta description. You may be able to create a title or meta description that covers the value proposition of your site better but still captures what Google was looking for to describe your page.

Auditing Backlinks

SEO auditing can sometimes include auditing the links
pointing to your site. One of the key reasons for doing this is to determine if
any of these links are problematic (you’ll learn more about how to perform this
analysis in “Bad or Toxic External Links” on page 488). The first step in this
type of audit process is to get as much information as possible on your
backlinks. The most popular tools for pulling backlink data were listed in
“Issues that can be found with backlink tools” on page 469. Each tool provides
access to a significant link database. However, as we’ve mentioned before,
crawling the entire web is a massive endeavor, and even Google—which has the
world’s largest infrastructure for crawling, dwarfing that of any of these
comparatively small vendors—doesn’t crawl all of it. Nonetheless, pulling a list
of links from any of these tools will provide a great deal of value in your
backlink audit. If you are able to afford working with one more provider of
links, it is recommended that you pull link data from more than one vendor. If
you’re able to do this, you will need to dedupe the lists as there will likely
be significant overlap; however, it’s also likely that your deduped link list
will be as much as 50% larger than the list you would get by working with only
one backlink vendor.
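As a simple starting point, the following Python sketch merges two vendors' backlink exports and dedupes them on a normalized source URL. The filenames and the "source_url" column are assumptions; adjust them to match whatever your vendors' CSV exports actually contain:

    import csv
    from urllib.parse import urlsplit, urlunsplit

    def normalize(url):
        # Lowercase scheme/host, drop "www." and trailing slashes so near-identical rows collapse.
        parts = urlsplit(url.strip())
        host = parts.netloc.lower().removeprefix("www.")
        path = parts.path.rstrip("/") or "/"
        return urlunsplit((parts.scheme.lower(), host, path, parts.query, ""))

    def load(path, column="source_url"):
        with open(path, newline="", encoding="utf-8") as f:
            return {normalize(row[column]) for row in csv.DictReader(f)}

    vendor_a = load("vendor_a_backlinks.csv")
    vendor_b = load("vendor_b_backlinks.csv")
    merged = vendor_a | vendor_b

    print(f"Vendor A: {len(vendor_a)}  Vendor B: {len(vendor_b)}  Merged and deduped: {len(merged)}")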


Once you have pulled the list, there are two types of audits you can perform:

1. Review your backlinks to look for bad links, as discussed in "Bad or Toxic External Links" on page 488.

2. Compare your backlinks with those of your competitors. You'll learn more about how to do this in Chapter 11.

SEO Content Auditing

Content plays a critical role in how much you can scale your SEO program. Once you have resolved your technical SEO issues, working on the depth, breadth, and quality of your content is how you build scale. For that reason, auditing your content is something that you should also do on a regular basis. The key stages in a content audit are as follows:

1. Compile a complete list of all the content on your site. If you don't have an easy way to do this, you can leverage a crawl of your website to get a complete inventory of all your pages.

2. Build a list of pages that you want to review in detail.

3. Segregate those pages into classes:

   a. Informational content targeted at the middle and the top of the funnel

   b. Commercial and transactional content targeted at driving conversions

4. Conduct audits for each of these two groups, as further described in the next two sections.

Reviewing content for SEO

One of the basic areas to investigate for all your
content is how well it has been optimized for SEO. The first step is to ensure
that your content has been created, or at least reviewed, by subject matter
experts. It’s not possible to build authority on a topic with content created by
copywriters who are not knowledgeable about that topic. Imagine that other
subject matter experts are reviewing your site’s content. What will they think
about the content? Will they be impressed? Then consider how well optimized the
content is for SEO. Does the title tag include the main keyword related to the
topic of the content? Does the content naturally use related keywords and
synonyms that indicate it has thoroughly covered the topic? Does the content
link to other related content on your site or other sites? Are you linking
to/citing sources that support your claims?


Note that content created by SMEs will naturally be rich in keywords and ideally
already be in good shape from an SEO perspective because of that. However, a
detailed review by an experienced SEO may reveal some ways to tweak the title
tags, meta description, or the content to help the page rank higher.

Informational content audits

Informational content can play a large role in
building your overall site authority and also fill the top of your purchase
funnel with long-term prospects. Your audit of your informational content should
start with validating SEO optimization, as described in the previous section.
Beyond that, the following factors are the key to success with your
informational content:

Breadth If your coverage of the information people are looking for is too
narrow—for example, if you only create content for the highest-volume search
terms (a.k.a. head terms), then most users will not be fully satisfied by the
content they find on your site.

Depth As with breadth, depth also helps you satisfy a larger percentage of the
visitors to your site. The deeper you go on a topic, the easier it is for you to
be seen as authoritative. When you consider the depth and breadth of your
content, it’s helpful to build out a map of the topics covered by your
competition. Crawl their sites if you can, or use other means such as manual
reviews to build a complete map of all the informational content that you can
find on their sites. Then ask yourself if they have significantly more
informational content than you do on your site. Also, how does the content
quality compare? Learn what you can about what you can do to improve the
competitiveness of your content. Having better-quality content and more depth and
breadth is a great place to start. If you don’t have the resources to create an
overall broader and deeper array of content than one or more of your
competitors, consider doing so on some specific subtopics. As an output of this
part of the audit, build out a map of the additional pages of content you want
to create and/or pages that you need to enhance. This type of targeted effort
can help you increase your market share around these subtopics and can
potentially provide high ROI.

Commercial content audits

Commercial content pages often have significantly less
text content than informational pages, as much of the focus is on driving
conversions. However, optimization of the
content on these pages can still have a large impact. This starts with
optimizing the title tags and the meta descriptions, and the other actions
described in “Reviewing content for SEO” on page 473. You can also consider
adding content to the page that assists users in their experience with your
commercial pages. Informational content about product/service options that is
integrated with the page experience can help with conversion. Don’t fall into
the trap of sticking text content blocks at the bottom of your page where no one
will ever read it, but instead find ways to integrate it into the overall
experience of the page. Another issue that is common on ecommerce sites occurs
when they resell products from third parties and use descriptive copy from the
manufacturer. The problem arises because the manufacturer will provide that same
descriptive copy to anyone who sells their product. As a result, there is
nothing unique about it, and it is essentially duplicate content. If you can,
write your own copy for the products; if there are too many of them, find ways
to add value (for example, user reviews or other value-add content), or at least
create your own content for the most important products.

Content audit summary

Creating the right mix of informational and commercial
content can provide your site with an excellent way to grow market share,
particularly if this is combined with efforts to raise your brand visibility and
attract high-authority links to the site. You can read more about attracting
links in Chapter 11. There are many other aspects of content that you can audit
as well. These include whether the content follows brand guidelines, how well it addresses your target personas, and its ability to help drive conversions. Describing these is
beyond the scope of this book, but remember to include those considerations
during any content auditing process.

Troubleshooting

Even when you try to do all the right things within your SEO
program, you can still run into problems from time to time. This section reviews
many common problem SEO scenarios and how to figure out where they are occurring
on your site.

Pages Not Being Crawled

As a first step in the troubleshooting process, you can
determine whether or not a page is being crawled by looking for it in your
logfiles. If the page can’t be found in your logfiles over an extended period of
time, such as 60 days, it’s possible that it is not being crawled at all. At
this point you may need to consider the possibility that the page has some type
of problem either preventing it from being crawled or causing
Google to not want to crawl it. To investigate this further there are a series
of checks you can perform to identify potential causes.
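As one quick way to perform the logfile check described above, the following Python sketch counts Googlebot requests for a single URL path in a combined-format access log. The log location and target path are placeholders, and because user-agent strings can be spoofed, a production check should also verify Googlebot via reverse DNS:

    import re

    LOG_FILE = "access.log"                 # placeholder log location
    TARGET_PATH = "/products/blue-widget/"  # placeholder page you are investigating

    request_re = re.compile(r'"(?:GET|HEAD) (\S+) HTTP/[^"]*"')
    hits = 0
    with open(LOG_FILE, encoding="utf-8", errors="ignore") as f:
        for line in f:
            if "Googlebot" not in line:
                continue
            match = request_re.search(line)
            if match and match.group(1).split("?")[0] == TARGET_PATH:
                hits += 1

    print(f"Googlebot requests for {TARGET_PATH}: {hits}")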

Blocked by robots.txt

Check your robots.txt file
(https://www.yourdomain.com/robots.txt) to see whether you are preventing the
crawlers from accessing parts of the site that you actually want them to see.
This mistake is quite common. Both Google Search Console (see Figures 10-9 and
10-10) and Bing Webmaster Tools provide simple ways for you to see whether they
are aware of content that robots.txt is blocking them from crawling.

Figure 10-9. Google Search Console: restricted by robots.txt


Figure 10-10. Google Search Console: crawl errors

These reports are helpful when you have content on the site that has links to it
(either internal or external) but that the search engines don’t crawl because
they are excluded from it in robots.txt. The solution is simple: figure out what
line in your robots.txt file is blocking the search engines and remove it or
update it so that the content is no longer being blocked.
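You can also sanity-check a specific URL against your live robots.txt with Python's standard-library parser; the domain and URL below are placeholders, and note that this parser does not implement every Google-specific extension, so treat it as a first check rather than a definitive answer:

    from urllib.robotparser import RobotFileParser

    # Point the parser at your live robots.txt and test a URL you expect to be crawlable.
    rp = RobotFileParser("https://www.yourdomain.com/robots.txt")
    rp.read()

    url = "https://www.yourdomain.com/category/widgets/"
    for agent in ("Googlebot", "Googlebot-Image", "bingbot"):
        status = "allowed" if rp.can_fetch(agent, url) else "BLOCKED"
        print(f"{agent}: {status} for {url}")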

Blocked by the robots meta tag

The robots meta tag in a page's header might look something like this:
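    <meta name="robots" content="noindex">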

As we discussed in “Using the Robots Meta Tag” on page 280, a setting of noindex
will tell the search engine that it is not allowed to include the page in its
index. Clearly, you should check to see whether you have made this error if you
find that the engines are not crawling certain pages that you want crawled.
While the noindex tag does not explicitly tell the search engines not to crawl
the page, after some period of time Google will stop crawling a page that is
marked with this tag. Also, pages that have a noindex tag for a long period of
time will have an implied nofollow tag. A nofollow tag will tell search engines
not to pass any link authority to the pages linked to on that page, though the
engines treat this as a hint (or suggestion), not a
directive. If all the links on your site to a particular piece of content are
nofollowed, you are passing no link authority to the page. This tells the search
engines that you don’t value the page, and as a result they won’t treat the
links as endorsements for it. Again, while this does not specifically instruct
the search engines to not crawl or index the page, it can result in their
choosing not to do so. Solving this problem requires locating the places where
these robots meta tags are on your site and removing them. Note that the default
setting for the robots meta tag is content="index, follow", so there is no need
to implement the tag if that is your desired setting. Just make sure you don’t
have robots meta tags in place that change the default where that is not the
desired behavior. You can see whether a page has a noindex tag on it by checking the
Coverage report in Search Console, finding the page in the crawl report from
your web crawling tool, or looking in the source code for the page and searching
for the noindex tag.

No direct links

You may find that a particular piece of content has no links to
it. You can also make links invisible to the search engines (possibly
unintentionally) by encrypting the links to the content in some fashion. If
Google is aware of the page (perhaps via your XML sitemap, or IndexNow for Bing
and Yandex) it may still choose to crawl the page, but that is not guaranteed.
The solution here is to make sure you implement plain-text (or image) links to
the content. Better still, get some third-party websites to link to the content
as well.

Form submission requirement

Requiring a login or some other type of form
submission to see content is another common cause of nonspidering. Search
engines will not attempt to fill out forms to see what is behind them. The
simplest solution is often to remove the requirement for the form if you want
the search engines to index this content. However, some sites sell content on a
subscription basis (also referred to as being behind a paywall), and they will
not want to offer their content for free. In October 2017, Google announced the flexible sampling program, which allows subscription-based sites to have their
content crawled and indexed by Google, but still allows the publisher to require
human visitors to subscribe to access the content. You can read more about
flexible sampling in “Reasons for Showing Different Content to Search Engines
and Visitors” on page 271.


Session IDs

Session IDs confuse search engine crawlers: every time the engines
come to your site, they see a different page. For example, they may see
https://www.yourdomain.com?SessID=2143789 one time and
https://www.yourdomain.com?SessID=2145394 the next. Even though your intent is
to track the session of a particular user, and you think of these URLs as
pointing to the same page, the search engine does not. You can read more about
session IDs in “Controlling Content with Cookies and Session IDs” on page 266.

Not enough link authority to warrant crawling

Sometimes the nonspidering problem
has nothing to do with the issues we just discussed. The search engines may see
the page just fine, but there may not be enough link juice going to it to merit
inclusion in their main indexes. This is more common than people think, and it
happens because the search engines do not attempt to index all the world’s web
pages. For example, content that Google perceives to be of low importance (i.e.,
content that doesn’t have enough link authority, or is perceived to be duplicate
content) will be excluded from the main index. Many years ago, this content may
have been relegated to what Google called its “supplemental index,” but in 2014,
Google’s John Mueller confirmed that for the purposes of treating pages
differently, Google no longer has a supplemental index. Google wants to
emphasize the more important pages on the web and doesn’t want the rate at which
it delivers search results to be slowed down by pages that most people probably
don’t want to see.

Page Indexing Problems It’s also possible to have content that gets crawled by
Google but doesn’t get indexed. There are a few ways that this can happen, such
as:

The page is marked with a noindex meta tag. You can read more about this in
“Blocked by the robots meta tag” on page 477.

The page does not have enough quality content for search engines to want to
index it. Search engines want to deliver high-quality content to their users. If
there is little or no content on a page, or all the content is hidden behind a
form so search engines can’t see it, they may choose to not include that page in
their index. This can also happen if the content is of poor quality (for more on
this, see “Content That Google Considers Lower Quality” on page 422). In such
cases the fix is to increase the quality and/or quantity of the content on the
page. While looking at your crawl report data for the page may provide hints as
to the quantity of
content on a page (based on the page size), this issue is best detected by human
examination.

The page does not have enough link authority for Google to want to index it. As
discussed in “Issues that can be found through crawl analysis” on page 462,
there are a few ways that you can identify orphan pages or pages that receive
very few links on your site:

• Look for pages in your XML sitemaps that are not found in the crawl.

• Look for pages in your analytics program that are not found in the crawl.

• Look for pages in Google Search Console that are not shown in your crawl.

• Identify pages on your site that appear in your logfiles but not in the list of pages from a complete crawl of your site.

Even if there are
no links from your own site to the page, you can check to see if any third-party
sites link to it. That is best done by using one or more link analysis tools, as
described earlier in this chapter. As a best practice, any page on your site you
want the search engines to find should receive one or more links from other
pages on your site. While search engines may choose to index pages that have no
links (which they might still find if you list them in your XML sitemaps), it is
far less likely that they will do so.

The page is subject to a manual action. As discussed in Chapter 9, Google may
assign different types of manual actions to websites, or specific pages on
websites, for various reasons. If your page is the subject of one of these
actions, this may be impacting its indexation. To determine if this is the case
for your page, check the Manual Actions section within Search Console to see if
you have received any such penalty. To address this issue you will need to
understand and address the cause of the manual action, and then file a
reconsideration request within Search Console.

The page was removed by the Remove Outdated Content tool in Search Console. You
can manually remove pages from the Google search results using the Remove
Outdated Content tool. Note that these pages are not actually removed from the
index—they are just blocked from showing up in the search results. In addition,
the removal is not permanent; the pages will reappear again after 180 days.

Duplicate Content

There are a number of techniques that you can use to attempt
to detect duplicate content. These include:


• Identify pages with duplicate titles and heading tags and manually check them to see if they're duplicates.

• Take a string of text from a page on your site, surround it in double quote characters, and search on it in Google to see if it appears elsewhere on the web. Use a site: query to specify your domain to see if it is duplicated on your site; for example, "the boats circled endlessly in the storm" site:yourdomain.com.

• Use Siteliner to examine your site. This will provide an overview and a detailed report of duplicate content on the site (excluding common content such as menus and navigation). You can see an example of a duplicate content report from Siteliner in Figure 10-11.

Figure 10-11. Siteliner duplicate content report

• You can also check for content on your site being duplicated on other pages on the web using Copyscape. Here are a few ways this can happen:

— Your content creators may have taken content from a third-party website and provided it to you as if it was their original work. Best practice is to proactively check for this with any content provided to you prior to publishing, but you may also discover instances of this during an audit.

— Third parties may have scraped pages of your site and republished them on their sites. You can address these instances by demanding that they take the content down, notifying their web hosts of the problem (in many countries the hosting company is required to remove such content once it has been notified of its existence), and/or filing a DMCA takedown request with Google.


NOTE
Quoting or citing content published on other websites in your content with
appropriate attribution is not considered posting duplicate content, as long as
the quoted content does not make up too large a portion of the page.
Unfortunately, there is no clear metric for what constitutes “too large,” but
you should consider it a best practice that the great majority of content on a
given page be your own original content.

We discussed duplicate content issues, including copyright infringement, in
Chapter 7. You can read more about detecting duplicate content in Brian
Harnish’s excellent Search Engine Journal article.

Broken XML Sitemaps

You may also have issues with your XML sitemaps. These
issues can cause search engines to ignore the content of your sitemaps entirely,
and hence render them useless. The easiest way to determine if you have such a
problem is to use the Sitemaps report in Google Search Console. This will
provide you with a summary of all the errors Google has detected. Most
commercial crawling tools also provide means for crawling sitemap files. These
include (listed alphabetically):

• Ahrefs
• Botify
• ContentKing
• Lumar
• Oncrawl
• Screaming Frog
• Semrush
• Sitebulb

Ultimately, what you want is to have all the pages in the XML sitemap meet the following criteria:

1. The page returns a 200 HTTP status code (not a 3xx, 4xx, or 5xx status code) when loaded.

2. The page does not contain a canonical tag pointing to a different URL.

3. The page does not contain a meta robots noindex tag.

While it's not always possible to have all of your pages meet these criteria, you should aim for the number of pages that have one of these errors to represent less than one percent of the pages in the sitemap file.
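A rough Python sketch of such a check, using only the standard library, might look like the following. It samples URLs from a local sitemap copy and flags the three issues above; a real audit would add throttling, retries, and proper HTML parsing instead of regexes:

    import re
    import urllib.error
    import urllib.request
    import xml.etree.ElementTree as ET

    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
    urls = [loc.text.strip()
            for loc in ET.parse("sitemap.xml").getroot().findall("sm:url/sm:loc", NS)]

    def canonical_href(html):
        # Return the href of the first rel="canonical" link tag, if any.
        for tag in re.findall(r"<link[^>]+>", html, flags=re.I):
            if re.search(r'rel=["\']canonical["\']', tag, flags=re.I):
                m = re.search(r'href=["\']([^"\']+)', tag, flags=re.I)
                return m.group(1) if m else None
        return None

    for url in urls[:50]:  # sample; drop the slice for a full check
        try:
            with urllib.request.urlopen(url, timeout=10) as resp:  # note: redirects are followed
                html = resp.read().decode("utf-8", "ignore")
        except urllib.error.HTTPError as err:
            print(f"{url}: returned status {err.code}")
            continue

        problems = []
        canonical = canonical_href(html)
        if canonical and canonical.rstrip("/") != url.rstrip("/"):
            problems.append(f"canonical points to {canonical}")
        if re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', html, flags=re.I):
            problems.append("meta robots noindex")
        if problems:
            print(f"{url}: " + "; ".join(problems))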


Validating Structured Data

As we discussed in "CSS and Semantic Markup" on page
250, Schema.org markup is a critical aspect of SEO. Using it properly can result
in enhanced display of your pages in the search results, and this can increase
traffic to your site. There are many options for discovering or debugging
structured data problems. Google Search Console is a powerful tool for
identifying where pages have problems and what the specific issues are on a
page-by-page basis. You can start with the Unparsable Structured Data report to
see how many total errors there are. You can see an example of this report in
Figure 10-12.

Figure 10-12. Search Console Unparsable Structured Data report

From here, you can click on each of the reported errors to get more details on what the specific issues are. Search Console also offers more specific reports for different types of structured data, including:

• Breadcrumbs
• Products
• FAQs
• Sitelinks search boxes
• How-tos
• Videos
• Logos

Figure 10-13 shows an example of what one of these reports looks like.


Figure 10-13. Sample Structured Data Errors report

In this report, you can click on the “Error” or “Valid with warnings” summary boxes to get more details on the specific issues that were identified. Search Console also provides a URL Inspection tool that allows you to validate whether a page has any of these types of structured data on it:

• Breadcrumb

• Logo

• Dataset

• Product

• Event

• Q&A

• FAQ

• Recipe

• Fact check

• Review snippet

• Guided recipe

• Sitelinks search box

• How-to

• Special announcement

• Image metadata

• Video

• Job posting

The information provided includes descriptions of each item and details about any warnings or errors found.


Another tool you can use is the Schema Markup Validator. This tool allows you to
input a single URL at a time to verify your schema implementation on that page.
You can see an example of the Schema Markup Validator showing a warning in
Figure 10-14.

Figure 10-14. Sample Schema Markup Validator errors

You can also use Google’s Rich Results Test tool. The specific purpose of this
tool is to determine whether your page is eligible for rich results in the
SERPs. You can see an example of what these results might look like in Figure
10-15.


Figure 10-15. Sample Rich Results Test tool output
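When debugging with these tools, it can help to compare your markup against a minimal known-good example. The following is an illustrative JSON-LD Product block of the kind the validators report on; the property names come from Schema.org, but every value shown is a placeholder.

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "Product",
      "name": "Example Widget",
      "image": "https://www.example.com/images/widget.jpg",
      "description": "A placeholder product used to illustrate JSON-LD markup.",
      "offers": {
        "@type": "Offer",
        "price": "19.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock"
      }
    }
    </script>

Missing required properties typically produce errors, while missing recommended properties (such as review or aggregateRating for products) typically produce warnings, which corresponds to the “Error” and “Valid with warnings” statuses in the reports discussed above.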

Validating hreflang Tags

If you support multiple languages on your site, it’s important to use hreflang tags to help search engines understand the relationships between the various versions of your international content. If you determine that you have problems ranking in alternative language markets, one of the first areas to investigate is whether or not your hreflang tags are properly implemented. Proper implementation requires including tags that cross-reference all of the translations of the page, and a tag that points to the page itself. This complete handshake ensures that Google will properly identify all of the translated versions of the page. You will need to validate your implementation of the hreflang tags during your audit. To do so, you can use tools such as Merkle’s hreflang Tags Testing Tool or the Hreflang Tag Checker Chrome add-in from Adapt Worldwide. You can read more about how to properly implement hreflang tags in “Using hreflang annotations” on page 319.
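As a concrete illustration of the complete handshake, here is what the tags might look like on the US English version of a page that also has Spanish and German translations; the URLs are placeholders, and every translated version must carry the same full set of tags, including the one that references itself.

    <link rel="alternate" hreflang="en-us" href="https://www.example.com/page/" />
    <link rel="alternate" hreflang="es" href="https://www.example.com/es/page/" />
    <link rel="alternate" hreflang="de" href="https://www.example.com/de/page/" />
    <link rel="alternate" hreflang="x-default" href="https://www.example.com/page/" />

If any version omits the return tag pointing back to the others, the handshake is incomplete and the annotations for that pair may be ignored.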

Local Search Problems

For multilocation businesses, local search plays a huge role in search visibility. Identifying whether you have a problem in local search can be done through a manual review of the search results, but the complexity of this task rises rapidly as the number


of locations you have and the number of search queries you target scale up. In these scenarios, using a tool that can review your listings in an automated way can be a big help. Selecting a tool can be complicated because there are a large number of different local search tools available in the market. These include:

• Advice Local
• Birdeye
• BrightLocal
• GeoRanker
• Localo (previously Surfer Local)
• Moz Local
• ReviewTrackers
• Rio SEO
• Semrush Listing Management
• SOCi
• Synup
• Uberall
• Whitespark
• Yext

Yext has the largest market share among medium to large enterprise organizations, but other tools also have significant enterprise clients, including BrightLocal, Moz Local, Rio SEO, and Semrush Listing Management. As Yext is a premium platform, it costs more than many of the other options, but it is a full-service provider. For smaller businesses, some of the other platforms may be a better fit. Whichever platform you choose, use it to map out how your business is performing in the SERPs in order to identify the scope of any problem you may have there. You can also use it to review the quality of your business listings in local search directories, business listing sites, and Google Business Profile. When selecting a tool, evaluate the quality of the data it provides in these areas, as this is a critical factor in choosing the one that is best for your business. You can read more about local search and local search optimization in Chapter 12.

Missing Images

Image tags specify the URL where the image file can be found. However, it can happen that an incorrect URL gets placed in the tag or that the image file gets deleted or moved. Any of the commercial crawling tools mentioned previously will identify any cases of missing images (a spot-check sketch follows this list). Resolving the problem is straightforward; the options are:

1. Remove the image tag.
2. Find the original image and update the tag to point to its current location, or move it to the location pointed to by the image tag.
3. Find a new image, upload it to an appropriate location on the site, and update the image tag to point to the new image.
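If you want to spot-check a single page rather than wait for a full crawl, a short script can confirm whether the image URLs referenced on the page actually resolve. This is a minimal sketch assuming the requests and beautifulsoup4 packages and a placeholder page URL.

    import requests
    from bs4 import BeautifulSoup
    from urllib.parse import urljoin

    PAGE_URL = "https://www.example.com/some-page/"  # placeholder

    html = requests.get(PAGE_URL, timeout=30).text
    soup = BeautifulSoup(html, "html.parser")

    for img in soup.find_all("img"):
        src = img.get("src")
        if not src:
            continue
        img_url = urljoin(PAGE_URL, src)  # resolve relative paths
        status = requests.head(img_url, timeout=30, allow_redirects=True).status_code
        if status >= 400:
            print(f"Missing or broken image ({status}): {img_url}")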


Missing alt Attributes for Images

alt attributes play a key role in helping search engines better understand the content of an image. Any of the major commercial crawlers (such as those listed in “Broken XML Sitemaps” on page 482) will identify any images on your pages that are missing these attributes.
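The fix itself is simply a descriptive alt attribute on the image tag; the filename and wording below are placeholders.

    <img src="/images/austin-hotel-pool.jpg"
         alt="Rooftop pool at a downtown Austin hotel at sunset" />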

Improper Redirects

While Google has confirmed that each type of 30x redirect you might use (e.g., 301, 302, 307, or 308) will still pass PageRank to the destination URL, that doesn’t mean that these redirects will pass all other signals through. For example, in the case of a 302 redirect Google may keep the source page containing the redirect in the index, rather than the target page. As a result, you may not have the page you want in the index. For that reason, 301 redirects are still the recommended type of redirect to use. You can find out if you’re using any other kinds of redirects on your site by looking at the crawl report from whatever tool you have been using to crawl your site. These tools will all provide you with a report where you can see any potential redirect problems.

NOTE For a full discussion of redirect options and the differences among them, see “Redirects” on page 288.
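How you change a redirect type depends on where the redirect is implemented. As one hedged example, assuming an Apache server with mod_alias and placeholder paths, converting a 302 to a 301 in an .htaccess file might look like this:

    # Before: temporary redirect, which may leave the old URL in Google's index
    Redirect 302 /old-page/ https://www.example.com/new-page/

    # After: permanent redirect, the generally recommended choice
    Redirect 301 /old-page/ https://www.example.com/new-page/

Equivalent directives exist for nginx, IIS, and most CMS redirect plug-ins.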

Bad or Toxic External Links

As discussed in the previous chapter, how SEO professionals view link building has evolved significantly over the past two decades. It used to be common practice to buy links, implement extensive guest posting programs, and participate in large-scale link acquisition schemes. Not everyone did these things, of course, but it happened enough, and if your site has been around for a long time, you may inherit some of these historical problems (and be stuck with the headache of having to clean them up). Please note that most sites don’t need to go through this type of process—it’s generally only necessary if the scale of potentially bad links you have is extensive. If your site has just a few bad links, Google will likely just mark them so they don’t pass any PageRank.

Before you commit to a large-scale project to clean up your links, first make the determination that this level of effort is warranted. You can start by reviewing the Google guidelines on links, including “Spam Policies for Google Web Search” and the blog post “A Reminder About Links in Large-Scale Article Campaigns”. If you believe your organization has been systematically building links that violate Google’s guidelines, then you should implement a project to find the potentially offending


backlinks and either get them removed from the pages containing them or disavowed using Google’s Disavow Links tool.

One of the first steps you should take is to talk to various members of your organization to see if you can discover any history of using link schemes to influence SEO. You should also consider the possibility that a more general marketing campaign may have been problematic. For example, many large brands have participated in programs for paid guest post placements in which SEO was not a consideration. This may have been done simply to drive brand visibility, but if the placements were paid and links back to the website were not marked as nofollow, then this could also present a problem.

Once you have conducted your investigation within your organization, you should also pull the raw backlink data for your site and begin looking through it for more potential problem links. There are many tools in the market that can help you with this task, including Ahrefs, LinkResearchTools, Majestic, Moz Link Explorer, and Semrush. Be aware that while these tools can identify some of the links that may be problematic for you, you should not rely on them to identify all of the problem links. You will also want to review the full backlink files to see if you can identify problem links that you will want to address.

If you have a large quantity of backlinks, this work might be quite extensive, but you can simplify it somewhat by learning to look for certain types of patterns. For example, if you know that your organization has engaged in questionable link building practices, one of the first things to look for is links that have anchor text that is almost too good. Most links that your site receives will use your domain name, organization name, or relatively basic phrases like “click here.” If you find that key pages on your site have rich anchor text in most of the links pointing to them, that can be a leading indicator of a problem. For example, if you own a travel site and have a page for “Austin Hotels” and that page has 35 links, 28 of which use “Austin Hotels” as the anchor text, there is a high likelihood that you have an issue.

A second pattern to look for is when deep URLs on your site appear to have too many external links pointing to them. For example, it’s rare for ecommerce sites to have links to a large number of their specific product pages. If your backlink research determines that you have a material percentage of ecommerce pages that have external links, even if it’s only one or two per page, and most of those also use rich anchor text, then that could be a key indicator of a problem. In such cases, you should dig deeper into those links. Ask around the organization to determine if there is any known history of link acquisition programs related to those pages. You should also visit the pages containing the links and see where they are on


the linking pages for further indication as to whether or not the links were improperly obtained.

Another common problem is when your organization has participated in large-scale guest posting campaigns. If that has happened, focus on links that are from blogs on other sites. Some tips for that are to look for linking URLs that have one or more of these attributes:

• /blog/ is part of the URL.
• The URL contains ?q= (this is common in WordPress blogs).
• The links come from a blog hosting domain such as wordpress.org or blogger.com, or other domains that host third-party blogs.
• The links use anchor text that looks like it may have been suggested by your organization.

Once you have completed your investigation, you will need to decide if you want to disavow the bad links you may have found. If they account for less than 10% of the links to your site, and you also find that none of the key pages on your site appear to have links that are mostly bad, you may decide that you don’t want to do so. However, if you find that there is a serious problem, you can use the Disavow Links tool to let Google know that you are disavowing those links and you don’t wish to receive any PageRank from them. You can read more about how to do that in the Search Console Help Center.
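One way to make the anchor text review more systematic is to compute, for each target page, what share of its external links use the same anchor text. The following is a minimal sketch assuming a hypothetical backlink export named backlinks.csv with target_url and anchor_text columns; real export formats vary by tool, and the 20-link and 60% thresholds are arbitrary starting points.

    import csv
    from collections import Counter, defaultdict

    # Count anchor text per target page from a hypothetical backlink export.
    anchors_by_target = defaultdict(Counter)
    with open("backlinks.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            anchor = row["anchor_text"].strip().lower()
            anchors_by_target[row["target_url"]][anchor] += 1

    # Flag pages where a single anchor phrase dominates the link profile.
    for target, counts in anchors_by_target.items():
        total = sum(counts.values())
        anchor, top = counts.most_common(1)[0]
        if total >= 20 and top / total > 0.6:
            print(f'{target}: {top}/{total} links use the anchor "{anchor}"')

A page like the “Austin Hotels” example above (28 of 35 links with the same anchor) would be flagged by this kind of check, though brand-name anchors will also surface and need to be filtered out by eye. If you do proceed to a disavow, the file you upload is a plain text list with one URL or domain: entry per line, as described in the Search Console Help Center.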

Single URL/Section Ranking/Traffic Loss

In the event that you detect a significant traffic loss to a single URL or a specific section of your site, you’ll want to try to determine the cause of the problem. Since the traffic loss appears to be localized on your site, it’s likely that the causes are somewhat localized too. This distinction should guide a key part of the plan for your investigation. Some key areas to explore are:

Tech changes
Did your development team make changes to the impacted pages on the site? In particular, were these changes made in a way that impacted only those pages that have shown a traffic loss? If so, review the changes and see if they may have hurt the SEO potential of those pages. For example, were pages marked as noindex, or were canonical tags misapplied to those pages on the site? Was that particular subdirectory or group of pages accidentally included in a Disallow directive in the robots.txt file, or were key portions of the content hidden from the search engines by requiring a user action to retrieve them? If you find these types of issues, fixing them may resolve the problems causing your traffic loss (illustrations of such directives follow this entry).
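If you are unsure what to look for in the page source or the robots.txt file, these are illustrations of the kinds of accidental directives described above; the paths and URLs are placeholders.

    <!-- A meta robots noindex tag left on the impacted pages -->
    <meta name="robots" content="noindex, nofollow">

    <!-- A canonical tag mistakenly pointing pages in the section to another URL -->
    <link rel="canonical" href="https://www.example.com/some-other-page/">

    # robots.txt: a Disallow rule accidentally covering the affected section
    User-agent: *
    Disallow: /affected-section/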


Loss of internal links
Another type of change that can impact traffic to a specific page or section of pages is if they have lost some or all of the internal links that previously existed to them from your site. Any of the popular crawling tools (e.g., Botify, Lumar, Oncrawl, Screaming Frog, or Semrush) will help you identify pages with lost internal links. Restoring those links may help address the traffic loss.

Loss of external links
It’s also possible that your pages may have lost external links that previously were helping them rank. Link analysis tools such as Ahrefs, Majestic, Moz, or Semrush can help you identify lost links. However, be aware that these tools have some latency; an external link may be gone for some time before the tools notice this and show it in your reports. For that reason, you should make a point of identifying the most impactful external links to the affected pages and checking the linking pages manually to see if those links are still in place. Unfortunately, there is not always that much you can do to address this particular issue. If you have lost some key external links, you can always ask the site(s) that removed the links to put them back, but they may simply decline. In such a case you either need to live with the traffic loss or launch a promotional campaign around those pages and hope to attract some more links to them.

Changes to the content
Another potential driver of traffic loss could be material changes that have been made to the content of the impacted page(s). Perhaps Google finds the new version of the content less valuable to users and has lowered the page’s ranking. In this case, you may want to consider if it makes sense to restore some of the changed content to what it was previously. One resource for checking what changes have been made to your site is the Internet Archive, a site that crawls the web regularly and keeps historical snapshots of websites at various points in time.

Aging of the content
Ask yourself how current the content on the impacted page(s) is. Has it grown out of date? Are there events that have occurred that should be accounted for in the content but aren’t? A simple example for an ecommerce site would be that the version of the product you are showing is no longer the latest product. Informational content can also become out-of-date. For example, new scientific discoveries can render an explanation of how something works inaccurate, or dietary advice can change when medical advances allow us to understand more about how the human body works. The fix for this is to update the content and make it as current as possible.

In each of these cases, you will need to make the changes and then wait for Google to recrawl those pages and process them through its algorithms to see if your fix was


successful. Depending on the size of your site, the number of pages involved,
and the frequency with which those pages are crawled, you may need to wait
anywhere from a week to several months. Once you’re certain that the pages have
been recrawled, you can determine if the fixes worked. If not, you’ll need to
dive back in and consider what factors you may have missed that could be causing
your traffic loss.

Whole Site Ranking/Traffic Loss

What if your site experiences a broad-based traffic loss? Some of the potential causes are similar to those you would look for when you experience a drop in traffic to a single page or section of a site. However, there are some differences and additional issues to consider:

Tech changes
Changes made by your development team can be a potential cause of traffic loss. Review any recent changes that might have impacted most of the pages of the site, and look for any possible SEO impact. For example, were pages marked as noindex, or were canonical tags misapplied across much of the site? Or were key portions of the content or navigation hidden from the search engines by requiring a user action to retrieve them? If you find these types of issues, fixing them might resolve the problem causing the drop in traffic.

Changes in internal linking
Significant changes to the internal linking on your site can also impact SEO across the site. Any of the popular crawling tools (such as Botify, Lumar, Oncrawl, Screaming Frog, or Semrush) will help you identify changes in internal linking. Restoring the internal linking structure may help address the traffic loss.

Loss of external links
Have you lost key external links that used to point to your site? Link analysis tools such as Ahrefs, Majestic, Moz, or Semrush can help you identify lost links. However, as mentioned in the previous section, it can take a while for these tools to notice lost links and surface them in your reports, so while you’re investigating this possibility it’s a good idea to do manual checks to ensure that your most impactful external links are still in place. Unfortunately, there is not always that much you can do to address this particular issue. If you have lost some key external links, you can always ask the site(s) that removed the links to put them back, but they may simply decline. Addressing this will require launching promotional campaigns to attract more high-quality links to the site.


Changes to site content
Another potential driver of traffic loss could be any significant changes you may have made to site content. If material changes have been made to the content across the site, such as deciding to significantly reduce the amount of content, then Google may find your site less helpful to users and lower its ranking accordingly. Consider reversing those changes if this is a probable cause for your traffic loss.

Aging of the content
How often do you update the content on your site? Has it possibly become out of date? If so, consider investing the time to update it. Also, implement policies to refresh your content on a regular basis so that it does not get out of date in the future.

Google algorithm updates
Has there been a recent announced update to the Google search algorithms, or possibly an unannounced update that is causing chatter? Perhaps industry sources such as Search Engine Roundtable (from Barry Schwartz) and SEOs on X are discussing industry buzz about some type of Google change. For a list of additional places you can check, see “Diagnosing the Cause of Traffic/Visibility Losses” on page 449. If you see a change whose timing lines up with when your traffic drop began, research all that you can about what other people are saying about the update. Try to determine whether those factors might apply to your site, and do what you can to address them.

EEAT issues
As discussed in “Google’s EEAT and YMYL” on page 346, Google asks its Search Quality Raters to evaluate the Experience, Expertise, Authoritativeness, and Trustworthiness of a website. While this is not a ranking signal in itself, Google’s Danny Sullivan has confirmed that Google does look at other factors that help it determine “if content seems to match E-A-T as humans would assess it” (this comment predates the addition of Experience to the acronym, but no doubt still holds true). Most changes you might make to your site that are relevant to EEAT are unlikely to cause significant traffic loss, but in certain cases they may have a larger effect—for example, if you have a site that deals with what Google refers to as Your Money or Your Life topics and you publish a lot of content of questionable quality. Google is very sensitive about YMYL sites, and if it believes that your site is providing bad or questionable information on YMYL topics, it may well lower the site’s rankings in a broad way. In this case, the solution is to either remove the poor-quality content or improve it significantly.


In each of these cases, you will need to make the changes and then wait for
Google to recrawl the affected pages and process them through its algorithms to
see if your fix was successful. Depending on the size of your site, the number
of pages involved, and the frequency with which those pages are crawled, you may
need to wait anywhere from a week to several months. Once you’re certain that
the pages have been recrawled, you can determine if the fixes worked. If not, you’ll need to dive back in and consider what factors you may have missed that could be causing your traffic loss.

Page Experience Issues

Google began rolling out the page experience algorithm update, covered in depth in Chapters 7 and 9, in mid-June 2021. The initial update targeted mobile sites, but in February 2022 it was extended to desktop sites. As discussed in previous chapters, the page experience ranking factor includes the Core Web Vitals (CLS, FID, and LCP) as well as various search signals (mobile-friendliness, use of interstitials, and use of HTTPS). Google Search Console offers an overall Page Experience report as well as a report specific to Core Web Vitals.

Detecting and resolving problems with mobile-friendliness, interstitials, or HTTP pages is comparatively easy. You can use Google’s Mobile-Friendly Test tool to evaluate the mobile-friendliness of your site; simply sample URLs from each of the major page templates and test them in the tool to see how they score. The Google Search Central documentation outlines what Google is looking for with regard to interstitials and dialogs. If there is a problem, you may want to modify how you are using page elements that obstruct users’ view of the content.

There are many methods available to find pages on your site that don’t use HTTPS:

• Review the output of your crawl report to see if any pages were found during the crawl that use HTTP.
• Check your XML sitemaps to see if any HTTP pages are listed (see the sketch below).
• Use Chrome to check whether your site is secure.
• Create a Domain property or a URL-prefix property for your site, as described in the Search Console Help page for the Page Experience report.

If you find any problems with these three aspects of page experience, you’ll need to come up with a plan for fixing them. From a technical perspective, the needed changes are usually relatively straightforward to define (though your site platform may make them hard to implement). However, this may involve design decisions and/or business decisions that can be difficult to get agreement on within your organization.
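As one example of the sitemap check in the list above, a few lines of scripting can flag any plain-HTTP URLs; this minimal sketch assumes the requests package and a placeholder sitemap location.

    import requests
    from xml.etree import ElementTree

    SITEMAP_URL = "https://www.example.com/sitemap.xml"  # placeholder
    NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

    sitemap_xml = requests.get(SITEMAP_URL, timeout=30).content
    locs = ElementTree.fromstring(sitemap_xml).findall(".//sm:loc", NS)
    http_urls = [loc.text.strip() for loc in locs
                 if loc.text and loc.text.strip().startswith("http://")]

    print(f"{len(http_urls)} non-HTTPS URLs listed in the sitemap")
    for url in http_urls:
        print(url)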


Detecting problems with Core Web Vitals is also easy to do, but determining how to resolve them can be a lot more complex. As discussed in Chapter 7, numerous tools are available to help with detecting CWV issues, including:

Chrome User Experience Report (CrUX)
From the Chrome team, this is actually a Google Data Studio dashboard that you can use to see aggregated real-world performance data for users on your site. It requires some configuration, but the data is invaluable as it also helps you understand the mix of different types of devices that people use to access your site. The CrUX report can also be seen within PageSpeed Insights and Google Search Console.

PageSpeed Insights
This highly valuable tool from Google will show you data from the CrUX report or, if there is not enough CrUX data available, lab data based on real-time testing of the page. PageSpeed Insights also provides a list of potential items to fix.

Lighthouse
This tool, also from Google, is an extension embedded within Chrome. You can access it by clicking the three dots in the top-right corner of the browser (next to your picture), then More Tools, and finally Developer Tools. From the Lighthouse tab, you can generate a full report of page performance.

GTmetrix
GTmetrix also offers a well-known free tool for monitoring your page speed, along with a variety of test centers so you can test performance from different locations. In addition, for a fee you can set up ongoing metrics that will trigger alerts when page performance falls below a threshold that you define.

WebPageTest
This is another well-known third-party tool offering a wide variety of metrics and diagnostics. The site also has an active blog and highly active forums that are rich with information on the tricky aspects of page speed optimization.

While many tools are available to monitor your website’s performance, truly speeding up your pages can be quite a complex task. Since interactions between your web server and the user’s browser are multithreaded, it can be hard to determine where the actual bottlenecks are. In addition, some of the problems may not be in how you coded your page, but could instead be within your overall hosting infrastructure.
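If you want to pull this data programmatically rather than through the web interfaces, the PageSpeed Insights API exposes both the Lighthouse lab results and the CrUX field data. The following is a minimal sketch against the v5 API; the URL is a placeholder, an API key is recommended for regular use, and the response field names shown here reflect the API at the time of writing, so consult the current documentation before relying on them.

    import requests

    API = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"
    params = {
        "url": "https://www.example.com/",  # page to test (placeholder)
        "strategy": "mobile",               # or "desktop"
        # "key": "YOUR_API_KEY",            # recommended for regular use
    }

    data = requests.get(API, params=params, timeout=120).json()

    # Lab performance score from Lighthouse (0.0 to 1.0).
    score = data["lighthouseResult"]["categories"]["performance"]["score"]
    print("Lighthouse performance score:", score)

    # Field (CrUX) data, present only when Google has enough real-user samples.
    field = data.get("loadingExperience", {}).get("metrics", {})
    for metric, details in field.items():
        print(metric, details.get("percentile"), details.get("category"))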


For example, you may need to:

• Upgrade the web server you are using. This could mean moving from a shared server to a dedicated one, or if you already have a dedicated server, upgrading to a faster one.
• Get more memory for your hosting server.
• Upgrade the connection bandwidth to your server.
• Configure a CDN or change the configuration of your CDN.
• Enable gzip compression on your web server (see the example below).
• Update your database servers.

This is by no means a complete list of potential issues; these are just examples of the issues you may be facing. Determining exactly which changes to make to improve the performance of your page(s) can be very hard to do, and in some cases the changes can be very difficult to implement. For many organizations, working with someone who specializes in addressing these types of issues may be the best course of action. Smaller organizations may need to rely on a best practices approach, or, if you’re using a platform like WordPress, there are also many plug-ins that you can use to help speed up your site.
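As a small example of one server-side change from the list above, enabling gzip compression on an nginx server might look like the following; the directives go in the http or server block, the list of MIME types is illustrative, and HTML responses are compressed by default once gzip is on. Apache offers the equivalent via mod_deflate.

    gzip on;
    gzip_comp_level 5;
    gzip_min_length 256;
    gzip_types text/css application/javascript application/json image/svg+xml;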

Thin Content

As discussed in the previous chapter, a page is considered thin in content if it has little information of value to users. This is considered bad, and sites with significant amounts of thin content can have their rankings lowered or even receive a manual penalty from Google (see “Thin content penalties” on page 444). Thin content is not necessarily a matter of the number of words on a page. Other examples include:

• Pages with autogenerated content
• Affiliate pages that simply repeat commonly available information
• Doorway pages created for the sole purpose of ranking for search queries
• Pages containing content scraped from another website
• Substantially similar pages that match highly similar search phrases

Note that pages with very little text content might be considered thin content, but there are many cases where this is not a sufficient measure. For example, ecommerce pages may have a number of product listings with only very basic descriptions, and still be considered pages that offer some user value.


Some of the key questions to consider when working to determine whether or not you have thin content on your site are:

• Have you been autogenerating content without taking sufficient care to ensure its quality? Note that advances in AI are making machine-generated content something that may soon be practical, but you need to anticipate that human reviews of the content (and ownership of the final output) are still a requirement.
• If you’re running a site that monetizes through affiliate programs, are you taking care to create unique content of high value to users?
• Have you created a large number of pages for the sole purpose of ranking for a wide variety of search phrases, with little attention to the content on those pages? These pages may be considered doorway pages and are often poorly integrated into the core part of your site.
• Does your site contain content that has largely been scraped from other sites?
• Do you have many pages designed to rank for minor variants of search phrases? For example, if you offer a site that assists users in creating resumes, do you have separate pages for keywords like:

— Resume writing services
— Services for writing resumes
— Resume writing service
— Resume writing assistance

Asking these types of questions can help you find pages that Google may consider thin content pages. Ultimately, it’s very helpful to examine the pages manually to evaluate whether or not they are actually helpful to users.

Poor-Quality Content

While thin content is one example of poor-quality content, there are other ways it can manifest as well. As discussed in Chapter 9, other examples include content that:

• Is low in relevance to your site
• Relies on keyword stuffing
• Is inaccurate or misleading
• Offers users one thing but then tries to sell them something else (bait-and-switch content)
• Does not address the user need in a meaningful way

One leading indicator of pages that may be of poor quality is a failure to attract any organic search traffic. However, that on its own does not mean that the pages are necessarily of poor quality; for example, pages may offer highly useful content on topics that do not have high search volume, or there may be other reasons why they are not ranking.


Here are some questions to consider when attempting to identify pages where you may have poor-quality content:

• Has your content been created by subject matter experts, or at least reviewed by SMEs before publication?
• Does the content fit with the overall theme of the site?
• Has all of the content been fact-checked for accuracy?
• Do the pages on your site deliver content highly related to their title tags?

If the answer to any of these questions is no, then you should look at the pages in question to see if they contain content that would objectively be considered to be of poor quality. Read the pages carefully and evaluate how they help users. Even if the content seems decent, take the opportunity to improve it while you’re there!

Content That Is Not Helpful to Users

Google’s primary goal in constructing its search results is to deliver pages to users that help address their needs. This helps the company maintain its market share and increase usage of its search engine. However, ranking in search engines is of critical importance to many organizations, and some of them implement programs to generate content solely for this purpose. The problem with starting with this mindset is that it can lead to content spew—creating large volumes of content that does not add any new value or perspectives on the topic covered (a.k.a. unhelpful content).

Google’s helpful content algorithm update, discussed in Chapter 9, was designed to identify this type of content and apply a negative ranking factor to sites that implement such practices. In the blog post announcing this update, Google described it as “part of a broader effort to ensure people see more original, helpful content written by people, for people, in search results.” Unfortunately, because the negative ranking factor is applied sitewide, it doesn’t just lower the rankings of the unhelpful content; it lowers the rankings for all of the pages across the entire site. In this regard, this algorithm update has a punitive aspect to it (though Google does not label it as a penalty; it applies that term only to manual actions). Further, resolving the problems with the unhelpful content on your site doesn’t result in speedy recovery—Google’s announcement indicated that once the problems are addressed, it can take many months to recover your previous rankings across other parts of the site.

In evaluating your content to see if this algorithm change is impacting you, here are some questions that you can ask yourself:

• Is the primary focus of your content to help users?


• Does your site have a clear purpose, or are you generating content in many areas to see what works?
• Is your content created by people with clear expertise on the topic?
• Was your content created by people rather than machines?
• Does your content add new value to the web on the topic that it addresses?
• Did you create the content simply because it was trending, or does it actually address user needs?
• Are you creating large volumes of content on many different topics just to gain search rankings?

Ultimately, one of the best ways to determine if you have content that may be considered unhelpful is by reviewing your motivations in creating it and the approach used in the process. If the primary goal was to gain organic search traffic, that is a leading indicator of a potential problem. Some other aspects of your approach to the content creation process that can be problematic are:

• Using AI/machine learning algorithms to create the content.
• Failure to use subject matter experts to author your content.
• Providing SEO-centric instructions to guide the content creation. This includes specifying content length and keyword-centric guidelines in your instructions to the author.

In general, the best way to guide your content strategy is to focus on the unique user value that you’re addressing with the content you create. If Google determines that your site contains unhelpful content, your options for addressing it are to remove the unhelpful content from your site or add the noindex tag to the unhelpful content pages. According to Google’s John Mueller, though, removing the pages from the index via noindex tags should be viewed as a patch; the best solution is to remove the offending content from the site. If the pages were created primarily in an effort to boost your search rankings and do not contain quality content, they are not likely to be helpful in building your reputation and relationship with the users that encounter them anyway.

Google Altering Your Title or Meta Description

As we’ve discussed previously in this book, one key area of SEO is to create well-written page titles and meta descriptions, as these are often what will be shown for your page when a listing is included for it within the SERPs. These are parameters that appear within the <head> section of your web page and typically are configured within

your CMS or ecommerce platform. If you’re using a CMS such as WordPress, you may require a plug-in like Yoast SEO to set these values. However, Google sometimes chooses to not use the title and description that you specified and instead configures its own. There are several reasons that Google may choose to edit your title and/or meta description prior to showing them in the SERPs:

• They may not accurately reflect the content that can be found on your page.
• They may not accurately reflect how the content on your page offers value related to the specific search query entered by the user.
• They may be overly self-promotional.

The most common way that you can discover when this is happening is by trying sample search queries and seeing what Google shows for your pages in the search results. If what Google is showing differs from the text you provided, the best approach to fixing the problem is to take a stab at rewriting the titles/meta descriptions that Google does not appear to like. Try to address the likely cause for why Google felt it necessary to rewrite them.

However, first consider whether that is even necessary. Perhaps Google is only editing them for very specific user search queries (and not others), and these changes are actually improvements in the context of those queries. Your titles and meta descriptions should focus on the primary intent of the page; if Google is rewriting them in certain cases for secondary intents that are not the core purpose of the page, then perhaps you should leave well enough alone.
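For reference, the two elements in question live in the <head> of the page and might look like the following; the wording is purely illustrative.

    <head>
      <title>Austin Boutique Hotels | Example Travel Co.</title>
      <meta name="description" content="Compare boutique hotels in downtown Austin
        with photos, verified guest reviews, and free cancellation.">
    </head>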

Hidden Content

In “Content Delivery and Search Spider Control” on page 270, we discussed ways that you can hide content from the search engines when you want to. However, at times this is done unintentionally—that is, sometimes publishers produce great content and then, for one reason or another, fail to expose that content to search engines. Valuable content can be inadvertently hidden from the search engines, and occasionally, the engines can find hidden content and construe it as spam, whether that was your intent or not.

Identifying content that search engines don’t see

How do you determine when you have unintended hidden content? Sometimes the situation is readily apparent; for example, if you have a site that receives a high volume of traffic and your developer accidentally places noindex tags on every page, you will see a catastrophic drop in traffic. Most likely this will set off a panicked investigation, during which you’ll quickly identify the noindex issue as the culprit.


Does this really happen? Unfortunately, it does. As an example scenario, suppose you’re working on a site update on a staging server. Because you don’t want the search engines to discover this duplicate version of your site, you noindex the pages on the staging server. Normally, when you move the site from the staging server to the live server you would remove the noindex tags, but unfortunately, many site owners forget to do this.

This type of problem can also emerge in another scenario. Some webmasters implement a robots.txt file that prohibits the crawling of their staging server website. If this file gets copied over when the site on the staging server is switched to the live server, the consequences will be just as bad as in the noindex case. The best way to prevent this type of situation is to implement a series of safety checks on the site that take place immediately after any update of the live server (a minimal sketch of such a check appears at the end of this section).

There are potential problems, however, that are much more difficult to detect. First, with a new site launch you won’t have any preexisting traffic, so there will be no drop in traffic levels to alert you that something is wrong. In another scenario, you may have an established site where you accidentally do something to hide only a portion of the site from the engines, so the issue is less obvious.

Regardless of your situation, web analytics can help you detect these issues. Use your analytics software to find pages on your site that get page views but no referring search traffic. By itself, this is not conclusive, but it provides a good clue as to where to start. Note that this review can also be helpful for another purpose: if you see content that is getting search referrals even though you don’t want or expect it to, you may want to hide that content.

Another data point you can examine is the number of pages the search engines report as indexed for your site. In a new site scenario, you can look at this to see whether the search engines appear to be picking up your content. For example, if you have a site with 1,000 pages with a good inbound link profile, and after three months only 10 pages are indexed, that could be a clue that there is a technical problem. Using multiple sitemap files, one for each site content area covering a specific segment of URLs, can be helpful in diagnosing such issues. Take care not to overreact to the count of indexed pages, because the numbers that the search engines report will naturally fluctuate quite a bit. But if you are aware of the types of numbers typically reported for your site, and they drop to an unusually low level and stay there (or keep dropping), you probably have a problem.
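One way to implement the post-launch safety checks mentioned earlier in this section is a small script, run immediately after each deployment, that verifies a handful of key URLs are reachable and indexable. This is a minimal sketch with placeholder URLs, assuming the requests and beautifulsoup4 packages.

    import requests
    from bs4 import BeautifulSoup

    SITE = "https://www.example.com"
    KEY_URLS = [SITE + "/", SITE + "/category/widgets/"]  # placeholders

    # Make sure robots.txt does not block the entire live site.
    robots = requests.get(SITE + "/robots.txt", timeout=30).text
    if "Disallow: /" in [line.strip() for line in robots.splitlines()]:
        print("WARNING: robots.txt disallows crawling of the entire site")

    # Make sure key pages return 200 and are not marked noindex.
    for url in KEY_URLS:
        resp = requests.get(url, timeout=30)
        if resp.status_code != 200:
            print(f"WARNING: {url} returned status {resp.status_code}")
        tag = BeautifulSoup(resp.text, "html.parser").find(
            "meta", attrs={"name": "robots"})
        if tag and "noindex" in tag.get("content", "").lower():
            print(f"WARNING: {url} carries a meta robots noindex tag")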

Identifying hidden content that may be viewed as spam

Hidden text is one of the challenges that webmasters and search engines still face. For example, spammers continue to use hidden text to stuff keywords into their pages, to


artificially boost their rankings. Search engines seek to figure out when
spammers are doing this and then take appropriate action. There are many ways to
create hidden text unintentionally, though, and no one wants to be penalized for
something they did not intend to do. Google’s spam policies prohibit the use of
hidden text or links to place content on a page “in a way solely to manipulate
search engines and not to be easily viewable by human visitors.” This includes
hiding text behind an image, setting the font size to 0, displaying white text
on a white background, and other suspicious tactics. If you’re using such
techniques to try to stuff keywords into your web pages, you’re definitely over
the line and into black hat territory. However, there are also scenarios where
your CMS may create some hidden text, as outlined in the next section;
fortunately, Google is generally able to recognize these and will not penalize
you for them.

Unintentionally creating hidden text

There are a few ways to create hidden text without intending to do so. One of the most common ways is via your CMS, which has some CSS-based methods built into it. For example, many content management systems use the display:none technique to implement drop-down menus or other widgets that “expand” to display more text when clicked. Tab folders are a great example of this. This technique is sometimes used in user-generated content systems too, where the page normally shows the number of comments on a post, but suppresses the text “0 comments” in the event that no comments have been made.

Another scenario where hidden text may be created is when providing enhancements for the visually impaired. For example, suppose you have a short video on your web page and want to provide users with a text description of the content. You may not want to place the text on the page, as it might make the page look cluttered to a user with normal vision. The solution some people use to serve both audiences is to hide the text from sighted users and make it accessible only to screen readers.

Many of these scenarios have no SEO value, even when manipulated by spammers. These types of techniques generally do not carry a risk of being penalized, because there is no reason to suspect negative intent.
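As an illustration of the accessibility scenario just described, a common pattern is a “visually hidden” CSS class that keeps text available to screen readers while removing it from the visual layout. This sketch shows one widely used variant; the class name and text are placeholders.

    <style>
      .visually-hidden {
        position: absolute;
        width: 1px;
        height: 1px;
        overflow: hidden;
        clip: rect(0 0 0 0);
        white-space: nowrap;
      }
    </style>

    <p class="visually-hidden">
      Video description: a two-minute walkthrough of assembling the example widget.
    </p>

Because the intent here is to help users of assistive technology rather than to manipulate rankings, this usage falls into the category the text describes as carrying little risk of a penalty.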

Conclusion

Audits play a critical role in your SEO program. Even highly aware SEO organizations can have problems creep into their websites, be impacted by Google algorithm changes, or discover new opportunities. A well-planned SEO auditing program can minimize the scope and risk of SEO problems as well as enable you to remain highly competitive for search traffic.


CHAPTER ELEVEN

Promoting Your Site and Obtaining Links

The first steps in SEO are getting your website optimized and search friendly, getting your SEO toolset configured and operational, and creating a spreadsheet of researched and refined keywords that you want to target. Once you have completed these steps you have completed much of the SEO puzzle, but not all of it. You also need to think about how you want to promote your business and establish your organization as a leader in your topical domain or market areas on the web.

The mechanics of increasing search traffic are rarely understood by people who aren’t in the industry. This is one of those aspects of SEO where traditional brick-and-mortar marketing tactics are often ineffective, which can make it difficult to steer your organization or client in the right direction. The goal isn’t to “get anyone in the door and hope they stick around and buy something.” Similarly, from a link-building perspective, the goal isn’t to get any link you can. Instead, you need to be thinking about how your promotional strategies create a presence for you on the web that establishes you as a leader in your market space.

To do this, you’ll want to identify the other leaders in your market—some will be competitors, but you should also look at media organizations that cover the space, significant customers, government and educational sites that touch your space, related suppliers, etc. Through the process of mapping all the related parties, you’ll develop a vision of what sources potential customers trust most, and therefore which sources Google will trust most. From there you can lay out a strategy to build your presence and visibility across these key players and attract some links from them to your site.

The process of earning those high-quality links is called link building or link attraction, and it’s foundational for success in SEO. This is not purely a numbers game, though. In this chapter, we’ll explore the many factors involved with link valuation and help you


identify the most valuable sites and pages to target; links from these sites have a very high probability of increasing your rankings for relevant keywords.

The first step is to have something to link to. The best way to approach this is by creating a site with content that is highly valuable to users. This should always be your primary focus. Establish yourself as a leader in your marketplace, cultivate an image of your brand as trustworthy and authoritative, and publish lots of top-quality content and resources. If you do these three things, many people will link to you naturally. You could paraphrase this advice as “create a site with the best content (in your topical domain) on the internet.” It may be beyond your means to do this on a broad scale within your market; if so, find a way to create the best content within one or more specific areas of that market.

For many site owners, the instinct is to focus on getting links to their commercial pages, but that’s not likely to be effective. Nobody wants to link to you just to help you make money; you need to give them something that they want to link to because it helps their users. Therefore, the best strategy is to create excellent content that enhances the reputation of your (or your client’s) organization and brand, and that sets your site apart from competitors. Unless your site is already popular, though, even the best content doesn’t market itself; you’ll have to put some effort into promoting it. We have plenty of advice for creating and marketing excellent content in this chapter.

Link building shouldn’t be a one-off event or a time-limited process; it should be integrated into the company’s culture. Your organization should always be thinking of cool content ideas that users will find interesting, looking for new sites that might be valuable to those same users, blogging and engaging on social media, and encouraging customers to share links and reviews. If you have the resources, you can go one step further and build a strong public presence in your market by supporting worthy causes, by publishing content in content hubs or blogs, or otherwise getting the word out through podcasts, books, interviews, and speaking engagements.

Why People Link

Before we go further, let’s consider why people implement links to other sites. As noted, they certainly don’t link to your site to help you make money. Site owners actually have a strong incentive not to link to other people’s websites. Why? Because implementing a link somewhere else is inviting them to leave your site. Let that sink in—why would someone do that?

As we said at the chapter’s opening, the main reason to link to other sites is if doing so will help users. This can happen in many ways, for example:

• The site implementing the link (the “Linking Site”) believes that linking to the page on the other site (the “Target Site”) will help its users so much that those


users will value their relationship with the Linking Site more as a result. (Note: brands that regularly link to valuable content on other sites may be seen as more trustworthy than sites that don’t link to other sites.)
• The Target Site has published high-value research data that the Linking Site wants to write about and share with its audience. This is a common strategy for media sites that are hungry for news stories: they can cite data from a third-party site and drive lots of page views.
• The Linking Site is quoting or using data from the Target Site, and as a part of fair use it has implemented a link to the Target Site. (You can read more about fair use at the US Copyright Office’s website.)
• The Linking Site has content that discusses or analyzes the Target Site or some aspect of the people, products, or business of the site owner.
• A member of the organization that owns the Target Site is being referenced or quoted on the Linking Site.

This is only a partial list of scenarios, and as you operate your marketing program for your website, you may discover more. Nonetheless, understanding why people may choose to link to your site is a key part of this process. In all cases, sites link to other sites because there is some perceived benefit for them. Understand what those potential benefits may be before you start, or even initially consider, any link-building campaign.

Google’s View on Link Building

Google’s search algorithms continue to place material value on links, as they represent endorsements of your content and site. Note that these types of endorsements carry value outside the realm of search as well, especially when you get them from highly respected and trusted people or organizations that operate in the same topical domain as your organization. As explained in the previous section, the reason that links are a powerful indicator of prominence and value is that when someone puts a link on their site pointing to another site, it’s offering users the opportunity to leave their site. The motivations for doing that are all rooted in a belief that the site they are linking to will provide value to their users.

Unfortunately, because links are a ranking factor for Google, a great deal of effort has been invested by SEO professionals and website publishers in getting people to link to their sites any way they can. Historically, this has resulted in tactics such as:

• Paying site owners to link
• Swapping links
• Exchanging charitable donations for a link
• Excessive guest posting


• Offering site owners gifts, free products, or compensation of some sort in exchange for links
• Using other tactics to obtain links from many other sites without regard to their relevance to your brand

NOTE Making donations or contributing guest posts in ways that are highly
relevant to your brand can be acceptable, provided that this is not done to
excess. For example, publishing monthly columns on a couple of sites that you
know reach your target audience is just good marketing, and is perfectly fine.
On the other hand, creating 10 guest posts each month and publishing them on 10
different sites where you’ve never published before is not. Supporting a great
cause that aligns with your brand values is also a great thing to do, but
actively researching dozens of causes to identify those that link to their
donors and donating some small amount to all of them is not.

Over time, the aggressiveness with which many sites were pursuing low-relevance
link campaigns began to cause problems for Google in terms of the quality of its
results. Thus, over a decade ago Google began implementing algorithms to detect
poor-quality links (such as Penguin, discussed in Chapter 9), and prior to that
it instituted a process for assigning manual link penalties. Google
representatives also made various public statements about link-building schemes,
like the tweet from John Mueller shown in Figure 11-1.

Figure 11-1. John Mueller tweet on link building


Note the responses from well-known SEO industry professionals Dixon Jones and Ammon Johns. Between their two comments, they expose a key aspect of the right way to pursue link attraction—it is best to start by understanding what you would do to promote your website if there were no Google, or if external links to your site were not considered a ranking signal. Some of those actions might include:

• Developing close relationships with major influencers in your market
• Building strong connections with media outlets (including bloggers) that cover your market
• Partnering with other leading businesses that publish sites that reach audiences that include your potential customers
• Supporting causes that you believe in and that are of interest to your target audience
• Using effective vehicles, such as social media, to raise your visibility
• Creating great content on your site that can help your users, and promoting it

This is just a small sampling of what you would likely consider doing if there were no search engines or links didn’t matter to them. In the process of pursuing these activities, you would be seeking endorsements from others. These might come in the form of being written about, either on their websites or in their social media feeds, and ideally would also include links to your site, helping to raise your visibility and helping you reach your target audience.

You will note that we’re taking great pains to frame this in a way that aligns well with what you would do if search engines didn’t exist. Here are the key reasons that we’re doing this:

• It’s good for your organization.
• The links you will attract during this process are likely to have higher value to search engines.
• The links you will attract are less likely to be ignored (or penalized) by the search engines.

That said, you need to be purposeful in getting high-value sites to link to your site. It’s just good marketing to develop that level of visibility. Done properly, these activities are all within the scope of traditional marketing: these are normal things that you would do to grow your business. Once you start veering significantly away from these traditional marketing activities, you need to do so very carefully. If you’re publishing guest posts on dozens (or even hundreds) of different websites, or supporting dozens of different causes, or implementing any other type of campaign to excess, you’re heading for trouble.


On the other hand, if you stay focused on targeting sites that serve your
potential customer base, and their relevance and authority are evident, then any
links that you do earn in the process will likely be seen as valuable by Google.
Keep this in mind as you read through the rest of this chapter.

How Links Affect Traffic

Before we dive into creating and marketing content, we feel it’s important to take a closer look at the value of links from a conceptual perspective. First, let’s review the three key terms involved in a successful link-building strategy: trust, authority, and relevance. (For more information on ranking factors, refer to Chapter 3.)

In general, a link to an external page is a vote for its relevance (to the topic that pertains to the page you’re linking from) and authority (on that topic) and creates an association between it and the page that links to it. When a lot of pages within a topical domain link to the same page in that same topical domain, that page will be ranked highly in relevant search results. Links to your page that come from sites in different topical domains may still convey some value depending on their authority and trustworthiness, but probably not as much. The link text (also known as anchor text), which is the text on the page that a user can click on to go to the page pointed to by the link, is also important; Google uses this text to better understand the subject matter of the page receiving the link.

Collectively, these factors are used by Google as a measure of the quality and authoritativeness of your content and, more generally, your website. The types of sites you get links from, including their quality and authoritativeness, matter a great deal too. Bear in mind that trying to measure concepts like “quality” and “authoritativeness” is very difficult to do simply by evaluating the content itself, so it makes sense for Google to place considerable weight on how your site is perceived on the web. Google typically doesn’t value links and citations from sites where all the content is user generated (such as social media sites), as these are easily gamed, or sites that aren’t trustworthy or authoritative. For that reason, just as you would do in a broader marketing plan for your organization, it makes sense to focus your energy on sites that will generate a lot of visibility for your business—that is, sites that have a significant number of users whose interests may make your site attractive for them to visit.

Novices often make the mistake of focusing on getting as many links as possible without considering how to go about it or weighing the value of each link. The number of incoming links you earn is a performance indicator, and while you should keep track of them, you shouldn’t think about increasing this metric in isolation. Remember: the goal is not to get links, or even to get traffic—it’s to increase sales, conversions, ad revenue, or visibility by drawing the right kind of attention to your content, services, or products. Incoming links should reflect the quality and popularity

508

CHAPTER ELEVEN: PROMOTING YOUR SITE AND OBTAINING LINKS

of your content, and by extension, the value and integrity of your business. If
you try to get more abstract about it than that, you’ll end up wasting time and
possibly breaking Google’s rules and getting your site penalized or banned from
the index.

Resource Links Versus Organic Contextual Links

There are two backlink paradigms of note: resource links and contextual links. Resource links often don’t have any link text beyond the URL or the page title and are typically included in a small list of links to external resources that are relevant to the page content. For instance, a page that has an article about recovering from a heart attack might have, in a sidebar or at the end of the article, a list of links to the websites for the American Heart Association, HealthLine, and the Wikipedia pages for “atherosclerosis” and “heart disease.”

NOTE
Be careful with pages that have too many resource links. If a page’s primary purpose is to post a bunch of links to other sites, then it may be considered a “link farm,” and links from those types of pages will not help you rank higher in search results.

Contextual links are integrated into page content and use link text that is a
natural part of a sentence. This approach has been a standard part of online
journalism since before the World Wide Web and is also used in PDFs and ebooks.
Figure 11-2 shows an example of a contextual link highlighted as bolded text
(the example is courtesy of educational services provider Study.com).

Figure 11-2. Example of a contextual link on Study.com

Between these two, contextual links are likely to be more impactful for your
search rankings, but if you can earn a resource link, take it! Note that it’s
important that contextual links be organic—i.e., that the link text is a natural
part of a sentence, not contrived to include sought-after commercial keywords.
For example, if a search engine finds 12 external links to a page on your site
and they all use the exact same anchor text that also happens to be the main
keyword for your page, that may be perceived as unnatural.
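
To make the anchor text point concrete, here is a minimal, hypothetical sketch (not from this book’s authors or any particular tool) of how you might scan a backlink export for over-optimized anchor text. The backlinks.csv file and its anchor column are invented examples, and the 30% threshold is purely illustrative.

    # Minimal sketch: flag anchor-text over-optimization in a backlink export.
    # "backlinks.csv" and its "anchor" column are assumptions; the 30% threshold
    # is illustrative, not a documented limit from any search engine.
    import csv
    from collections import Counter

    anchors = Counter()
    with open("backlinks.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            anchors[row["anchor"].strip().lower()] += 1

    total = sum(anchors.values()) or 1  # avoid dividing by zero on an empty file
    for anchor, count in anchors.most_common(10):
        share = count / total
        flag = "  <-- possibly unnatural" if share > 0.3 and count >= 10 else ""
        print(f"{anchor!r}: {count} links ({share:.0%} of all anchors){flag}")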

Finding Authoritative, Relevant, Trusted Sites

The previous section laid out the evaluation criteria for external sites that have the most link value to offer yours. The simplest way to find these sites is to search for your target keywords, make a note of the top 20 sites, then analyze them to determine for which of these you may be able to provide value to their users with your content. Remember: diversity of content types and relevant industry segments is important. Don’t just search for some X accounts and call it a day (links from X and other social media sites offer minimal to no SEO value anyway, as they are nofollowed). Here are some examples of kinds of sites that you should keep an eye out for in the SERPs for your keywords:

• Noncompeting sites in your market space
• Sites that link to your competitors
• Media sites
• Review sites (such as Trustpilot or ConsumerAffairs)
• Blogs (on host sites such as WordPress.com or the domains or subdomains of other sites)
• University and college sites
• Related hobbyist sites
• Streamer/vlogger channels
• Sites for large events, conferences, and expos
• Government sites

In developing this list, it’s a good idea to consider what content would appeal to the people who own or publish on these sites. Don’t give up on big sites too easily. It might seem impossible to get a link from a site like Harvard.edu, until you ask around and find out that a Harvard alumnus works at your company, and they might be able to contribute an article that would be of mutual interest to your company and the school. A couple of scenarios that might work with such authoritative sites include:

• Publish an interview with a professor whose field aligns with your topic.
• Find related research that they have done, do some supplemental research that adds value to theirs, and reach out to them to discuss what you’ve found—then suggest potential ways to collaborate.

Note that many other types of marketing activities can help support your link-building efforts. While links from social media sites pass little to no value for SEO purposes, an active and prominent presence on sites like Facebook, X, Instagram, Pinterest, or TikTok can provide a strong visibility boost to your business. This can result in an increased likelihood of people linking to your site from other types of sites where the links do provide SEO value. Similarly, having a piece of content go viral for you in a subreddit will likely drive many links to your site due to the resulting visibility boost that you’ll get.

For that reason, don’t limit your link-building efforts to simply those sites that provide links which have direct SEO value. As we noted at the beginning of this chapter, this entire exercise should be focused on lifting the visibility of your brand as much as possible. Link attraction is just one positive outcome of a holistic, well-thought-out promotional strategy for your business. The sites with the largest potential audiences that might be interested in your content will likely also be the ones that will provide the most SEO value with their links, due to their relevance and authority. Further, establishing yourself as a leader in your market space will make all your marketing efforts that much easier.

Link Analysis Tools

Your existing marketing efforts have likely made you aware of high-value sites within your topical domain. In-depth link analysis tools (a.k.a. “backlink checkers”) will help you develop this list of potential partners. They can also list all the sites that link to a popular page and provide a score that represents their relative authority. Some tools even take it a step further and provide a trust score as well that represents how trustworthy the page or site is. The link analysis tools that we recommend to get in-depth data to support your marketing and link-building campaigns include:

• Ahrefs
• LinkResearchTools
• Moz Link Explorer
• Semrush
• Majestic

One of the most important features these services offer is an analysis of where your competitors are getting their backlinks from. The sites that link to your competitors’ pages might make excellent candidates for a content marketing campaign. Keep in mind, though, that if you only get links from the same sites that your competitors do, this will not help you develop a stronger market presence or outrank them in the
SERPs. To climb higher, you’ll need more high-impact backlinks than the sites
that rank above you. Again, bear in mind that these tools are from companies
that are quite small compared to Google. They do not have the infrastructure to
crawl nearly as much of the web as Google does, and as a result they can’t be
expected to show you all your backlinks. For that reason, as recommended in
previous chapters, you may want to consider using more than one link analysis
service and combining (and then de-duping) the data to get a more robust view of
the links to your site. While this likely still won’t show you all the links, it
will show you more of them, and it’s always helpful to have more of this kind of
data.
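
As a small illustration of that combine-and-dedupe step (a generic sketch, not an official workflow from any of these vendors), the following assumes two hypothetical CSV exports that each contain a source_url column; real export headers will differ by tool.

    # Minimal sketch: merge backlink exports from two tools and de-dupe them.
    # The filenames and the "source_url" column are assumptions; adjust them to
    # match the actual export format of whichever tools you use. Python 3.9+.
    import csv
    from urllib.parse import urlparse

    def load_urls(path):
        with open(path, newline="", encoding="utf-8") as f:
            return {row["source_url"].strip() for row in csv.DictReader(f)}

    combined = load_urls("ahrefs_export.csv") | load_urls("semrush_export.csv")
    domains = {urlparse(u).netloc.lower().removeprefix("www.") for u in combined}

    print(f"{len(combined)} unique linking URLs from {len(domains)} unique domains")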

Identifying the Influencers

Influencer is the term we assign to people who are trustworthy, knowledgeable on a particular topic, and popular—not just on the web, but in real-world communities. No matter what you do or where you do it, one of your easiest paths to success will be through the support of an influencer. Regardless of whether the influencer is an organization, a site, or a person, building a relationship with them is dependent on getting them to trust you—and accomplishing that depends on how your site, your content, and your representative who interacts with them present themselves. Without that trust, you’ll never get anywhere with them, so make sure to bring your A-game across the board. In addition, you will need your content or whatever you’re pitching to them to be highly relevant and highly valuable to their users.

Influencers on social media are easy to identify, but it’s useful to expand that concept to include the less famous people who control influential sites. Government and academic sites are generally controlled by webmasters, many of whom have other responsibilities and only put a part-time effort into content management. They may also be bound by institutional rules or regulations that limit what can be published, whether they can link to external sites, and/or what sites they are allowed to link to. Social media accounts are typically controlled by the influencer whose public persona is featured there (or an assistant of theirs), or by a full-time social media manager (for corporations and other group entities). In any case, the person who publishes content may not be the person who decides what should be published. Before reaching out, you should gather as much intelligence as you can about a site’s publishing process and the people involved with it.

The more popular an influencer or their site is, the more you should consider their second- and third-degree connections. Who influences them? Which bloggers or social media accounts do they follow? Who do they trust? It may be cheaper and more effective to target these second- and third-degree connections instead.

Determining the Value of Target Sites

Once you have a decent list of potentially valuable pages to target for a link-building campaign, you must consider their relative value so that you can prioritize your efforts according to the Pareto principle, or 80/20 rule. The first step in this process is to use the scoring system from a link analysis tool, such as Ahrefs’s Domain Rating, Semrush’s Authority Score, Moz’s Domain Authority score, or Majestic’s Citation Flow (discussed further in “Link Target Research and Outreach Services” on page 538). As none of these systems are perfect, the next step is to review the resulting scores and filter out obvious misses, such as sites that fall into one of these three categories:

• Sites where all the outbound links are marked nofollow (examples: Forbes.com, YellowPages.com)
• Sites where all the content is user generated (examples: Wikipedia, YouTube, AboutUs, social media sites, forums)
• Domains that host user-created content, such as third-party blogging sites (examples: WordPress.com, Blogspot.com)

Once you have this filtered-down list, sort the remaining sites into four categories, as shown in Table 11-1 (a minimal scripted version of this bucketing step is sketched after the table).

Table 11-1. Prioritizing potential target sites (site value and whether it’s worth the effort)

Low (Tier 4): Getting links from these types of sites is usually not worth the effort, as they add very little value and come with some risk. It is best to skip these Tier 4 sites and focus on the higher-value categories. As you earn links from higher-value sites, some of these sites will link to you anyway.

Medium (Tier 3): Links from these sites may offer some value, but not much. If you choose to pursue getting links from sites in this category, it should be primarily as a stepping-stone to the next two tiers. As you get further along in building your reputation, you can stop looking for links from Tier 3 sites; the links you get from this category will be the result of broader “buzz” campaigns.

High (Tier 2): Links from these sites are definitely valuable. Potential targets in this category will likely be identified by principals of the business or senior marketing people, or through market analysis. Because the value of links from these sites is so high, it will take a lot of effort to tailor a contact campaign for them. You might need to develop content specifically for a Tier 2 site.

Very high (Tier 1): These are similar to Tier 2 sites, but you should consider going to even greater lengths to foster these relationships, including figuring out how to meet with key people face-to-face or pursuing other tactics to build meaningful relationships with them. The value here is so high that putting in extra effort is worthwhile.
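
The following is a minimal sketch of that bucketing step, assuming you have already exported candidate sites with an authority-style score on a 0-100 scale. The example sites and the tier cutoffs are illustrative placeholders, not thresholds recommended by any tool vendor or by this book.

    # Minimal sketch: bucket candidate sites into tiers by an authority-style score.
    # The candidate list and the cutoffs below are illustrative assumptions only.
    from collections import defaultdict

    candidates = [  # (domain, authority-style score on a 0-100 scale)
        ("example-news.com", 78),
        ("industry-blog.net", 54),
        ("local-directory.biz", 12),
        ("university.example.edu", 91),
    ]

    def tier(score):
        if score >= 80:
            return "Tier 1 (very high)"
        if score >= 60:
            return "Tier 2 (high)"
        if score >= 30:
            return "Tier 3 (medium)"
        return "Tier 4 (low)"

    buckets = defaultdict(list)
    for domain, score in candidates:
        buckets[tier(score)].append((domain, score))

    for name in sorted(buckets):
        print(name, buckets[name])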

Creating a Content Marketing Campaign

Now that you have a sense of which sites and people you want to target, you can start planning to develop content that they’re likely to find valuable to their users. Some of it may be expensive, but it’s useful to keep it on the list as a reference point. Also, it’s difficult to put a dollar value on the long-term effects of remarkable content. Perhaps more important than its SEO benefit, this type of content is an asset that can play a lead role in defining your brand and enhancing your reputation and visibility online. With enough money, it’s possible to get a link from nearly any site on the internet. What seems too expensive now may seem like a bargain later.

When you pursue relationships with highly trusted and highly authoritative sites, however, you should make those discussions much broader than the pursuit of high-value links. These types of sites can bring you far greater value than just getting a link. In addition, they will be wary of any approach that sounds like it’s a part of an SEO-driven link-building campaign. They get these types of offers all the time (likely daily), and there will be many others who have preceded you with their ill-conceived pitches. As we’ll discuss further in this chapter, your discussion with these sites needs to be about the value that you can bring to them/their audience and the benefits of a partnership with your organization, not what you want from them (to be clear: your initial pitch should not ask them for a link).

We’ve said it before, but it’s worth a reminder: don’t plan this in isolation. No marketing campaign should ever be run without aligning with existing efforts. At least make sure that the people in marketing, PR, and sales are aware of what you’re doing in terms of creating content to increase search traffic. Better still, get your marketing team on board with creating content that drives both visibility and links, and you can actively collaborate with them. Even if you can’t work that closely with them, everyone at your company should understand that in addition to boosting your market credibility, incoming links can help increase search traffic.

Here’s the basic content marketing game plan:

1. Ensure your site is optimized and ready for more traffic (this is covered in several other chapters in this book).

2. Ensure your company is ready for more public attention and can scale up to handle more sales or conversions. This includes verifying the capacity of your web servers and your overall hosting environment.

3. Research current trends, hot topics, and key needs across the topical domain that applies to your site. Supplement this with a review of the media coverage in your industry and identify important gaps in available information and unmet needs, both in the media and in your potential customer base.

4. Review your keyword plan (discussed in Chapter 6) and consider the larger context of customer needs and interests that those keywords and natural language questions suggest.

5. Create a diverse array of remarkable content that improves your reputation and standing in your field or industry, follows a current hot trend, and/or would be useful or entertaining to the people who control the sites you want to build relationships with and potentially get links from.

6. Develop a plan to systematically promote the great content you have and be seen as a leader in your market space. This will include publishing data and research studies that help others understand your industry or field better, jointly creating content or events with other leaders, developing deep relationships with media that cover your space, and many other tactics.

7. Host your content on your own site, and/or post it to your social media channels, and/or submit or contribute it to popular sites where your target audience can be found. In the process, set up your profile on those sites and include links back to your domain (but don’t expect those links to pass SEO value; they’re for visibility purposes).

8. Carefully and respectfully solicit links to, and shares of, your content from influential people on the web.

A successful content marketing campaign raises your overall visibility and improves your site’s search engine rankings for your targeted keywords, which will increase organic search traffic, which will lead to more sales or conversions. The web traffic directly generated by the content should serve to fill the top of your sales funnel, so it won’t lead to many immediate conversions. However, the lift in visibility for your site and the links that result will help you generate qualified leads by increasing your search rankings. Next, we’ll discuss some specific steps you can follow to create a plan for your campaign.

Discuss Scalability

If your site can’t handle an increase in traffic and the server goes down or pages load too slowly, that will largely undo your content marketing efforts. Similarly, if your company only has the capacity to fulfill 100 orders per day and your campaign results in double that number of orders coming in daily, you’ll have angry customers and canceled orders, and your reputation will begin to trend negatively—that’s the opposite of your goal.

It usually doesn’t make sense to preemptively increase your resources before your campaign launches, but your company should have a scalability plan so that increased traffic and sales aren’t handled on a last-minute basis. Start by defining your limitations (a back-of-the-envelope capacity check is sketched after these questions):

• How many concurrent visitors can your web server handle?
• How many transactions per hour can your payment processor handle at your current level of service?
• How many phone calls can you answer in an hour?
• How many orders can be fulfilled per day?
• For nonsales conversions such as mailing list sign-ups and opt-ins, do you have a target number to reach? Is there a hard limit, beyond which you’d have “too many” conversions?
• If you’re collecting leads, how long will it take to qualify them? Will they get “cold” before you can follow up?

Some other questions to consider include:

• Which areas of the business will be impacted by increased traffic and sales or conversions?
• What is the cost and process for upgrading your server resources and network bandwidth, on both a temporary and a permanent basis?
• Can you hire temporary employees to help? Can some of the work be outsourced?
• Will your shipping service provider be able to handle more orders? Will you have to upgrade your level of service? Would you need another shipping scale or label maker?
• Can you reliably obtain more materials? Will you be able to order in bulk?
• Are there any policies or promises that you can alter now so that you don’t have to apologize later?
• Are your service providers reliable? Could your bank or payment processor suddenly put your account on hold?
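
Here is a minimal back-of-the-envelope sketch of that exercise. Every number is a hypothetical placeholder to be replaced with your own measurements; none of these figures come from this chapter.

    # Minimal sketch: back-of-the-envelope capacity check before a campaign.
    # All numbers are hypothetical placeholders; substitute your own measurements.
    current_daily_orders = 100      # orders fulfilled per day today
    fulfillment_capacity = 140      # maximum orders your team can fulfill per day
    expected_campaign_lift = 2.0    # e.g., the campaign doubles daily orders

    projected_orders = current_daily_orders * expected_campaign_lift
    shortfall = projected_orders - fulfillment_capacity

    if shortfall > 0:
        print(f"Projected {projected_orders:.0f} orders/day exceeds capacity "
              f"by {shortfall:.0f}; plan extra staffing or stagger the campaign.")
    else:
        print(f"Projected {projected_orders:.0f} orders/day fits within current capacity.")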

Audit Existing Content

Before you spend money on developing new content, take inventory of what you already have. Do you have content that is worthy of people talking about it, linking to it, and sharing it? Most content does not meet that standard. If your existing content
doesn’t work for your promotional campaigns, can you salvage any of it with some
inexpensive modifications? On the other hand, don’t try to make bad content work
just because it’s already paid for. Don’t throw good money after bad, as the old
adage goes. This is a good opportunity to check your existing content for
problems that could sabotage you. Are its links still valid? Does it make
promises or offers that you no longer honor? Does it look dated, or reference
something that is currently out of the public consciousness? Most importantly,
is it offensive by modern standards? Western culture has changed a lot over the
past 20 years, especially regarding acceptable labels and terms. Even something
as small as the company’s CEO retweeting some comment from an actor,
professional athlete, or political campaign a decade ago can spark outrage that
destroys your reputation or erodes trust and authority.
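
One part of that audit, confirming that the links in your existing content still resolve, is easy to automate. The sketch below is a minimal illustration: the URL list is a placeholder (a real audit would pull URLs from your CMS or a crawl), and it relies on the third-party requests package.

    # Minimal sketch: check whether links in existing content still resolve.
    # The URL list is a placeholder; a real audit would pull URLs from your CMS
    # or a site crawl. Requires the third-party "requests" package.
    import requests

    urls_to_check = [
        "https://www.example.com/old-resource",
        "https://www.example.org/partner-page",
    ]

    for url in urls_to_check:
        try:
            status = requests.head(url, allow_redirects=True, timeout=10).status_code
        except requests.RequestException as exc:
            status = f"error ({exc.__class__.__name__})"
        print(f"{url}: {status}")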

Seek Organically Obtained Links

You never know where your links are going to come from. You may be targeting certain sites or influencers with your content, but others that you didn’t consider are going to link to it without you even trying. If you publish great content and people learn about it, some of those people will link to it. Your job is to ensure that your content is easily discovered, and then promote your content effectively so that this becomes a common experience. Even without a large social media following, mailing list, or readership, just publishing your new high-quality content on your website and promoting it through whatever channels you have can lead to high-value links, and it’ll also help grow your audience. Regardless of whether you get the links you’re aiming for, that alone can make a content campaign worthwhile.

Researching Content Ideas and Types

At the heart of most of the best link-building campaign ideas is great content. This includes building out a fully robust site experience that addresses the needs of your target users. A large part of this work can be performed via keyword research, but to do this research well, you must have the right approach to your strategy. You can read more about this in “SEO Content Strategy” on page 135, and about how to perform the keyword research in Chapter 6.

How you progress from there in researching content ideas depends on the actual campaigns you choose to pursue. This section introduces many different types of content, with suggestions for how to come up with related content ideas. Content can be designed for one specific kind of media or adapted to fit many. There’s a lot of psychology about content formats and what they mean to certain people about
certain topics, but try not to go too far down that path. To maximize your
visibility, think about who you’re trying to reach in terms of media,
influencers, and your target audience. Usually, it’s best to approach idea
generation from multiple angles. A particular influencer or site may link to a wide variety of media types, or to anything that’s remarkable and relevant, even if they tend to prefer one or two formats.

Articles and Blog Posts

Text is the cheapest, easiest, and quickest content format, and it’s also the most easily indexed by search engines. However, people often don’t have the time to read lengthy articles, or find it difficult to read on a small smartphone screen while commuting or doing other things.

Most of the time, an article is more than just text. If possible (and if you own the publishing rights to them), include images, animated GIFs, and other visual media to make your content more engaging. Most platforms allow you to embed tweets, too, provided they’re presented in the context of your article. Ideally, you’ll be able to link to your other content too.

Where you publish the content is also a factor here. Are you going to publish your article directly on your site? On a blog connected to your site? On a social media platform? On a third-party site as a contributor? Note that links to content you create and publish on social media sites or on third-party sites don’t directly help improve rankings for your site but may play a more general role in lifting your visibility online. The only links that raise the search visibility for your site are links that go directly to it.

Articles can take the form of opinion pieces, interviews, research, product reviews, how-to guides, news reporting (even if it’s coverage of an industry convention that you’re attending), and more. Some content types will be better than others for various sites, businesses, industries, topics, or keywords; your SEO research can help reveal the best opportunities for your situation.

Videos

Videos are easily consumed in reading-unfriendly environments like a subway train or a deli queue. The proliferation of wireless earbuds means that you can watch a video almost anywhere without causing a disturbance, and even if you must watch with the sound muted, you can read the subtitles.

Speaking of subtitles, make sure your videos have them. Some platforms (like Facebook) will mute videos by default, so you should at least have subtitles for the first minute or so, if not the entire video. Some video services (most notably YouTube) offer automatic subtitle generation via a speech-to-text engine. It’s rarely perfect, but
it’s usually quicker to fix small mistakes in autogenerated subtitles within YouTube Creator Studio than it is to create the subtitling by hand. If you plan to offer foreign language subtitles, you’ll need to use a human translator or an AI translation service like DeepL Translator. YouTube supports the uploading of foreign language translations of your video in Creator Studio. Bear in mind that if you add the subtitles directly to the video and you want to make updates or corrections, you will have to upload a new video.

Deciding where to host your videos is a significant question you will need to address. The YouTube platform has the most reach (by far) and makes it easy to embed videos within your site. While the total search volume on YouTube is small compared to Google, it’s the second-most popular website on the web, and hosting your videos there is the easiest way to get them to show up in Google’s search results. Other video hosting services you could consider include Vimeo and Wistia. Note that video SEO is its own art/science; you can read more about it in “Video Search” on page 618.

Research Reports, Papers, and Studies

Good research backed up by good data is often a marketing gold mine, as the press and influencers love original research—the right type of content can attract a lot of visibility and links. Success with this type of content depends on producing research that is original, interesting, informative, timely, and actionable. Before starting any project to produce such content, investigate what similar data is already out there and determine what new data you can produce. There are many ways to produce interesting data. Examples include:

• Survey one thousand consumers about their opinions on a related topic (or a smaller number of B2B buyers if that applies better to your organization).
• Leverage data on the buying trends of customers on your site over time.
• Test how a variety of products work, and assemble data that shows how they compare.
• Assemble a group of case studies.
• Mine data from other resources and then provide unique analysis or an interesting data mash-up. Better still is if you can supplement this with some of your own data.

Once you have pulled the data together, take the time and effort to figure out the best way to present it. There are many options for this, such as bar charts, line charts, interactives (discussed in the following section), and more. Use the format that makes the data most interesting to consume. These types of research pieces usually benefit from integrating well-structured, text-based discussion and analysis of the data.
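
As a small illustration of that presentation step, the sketch below turns a handful of made-up survey results into a simple bar chart with the third-party matplotlib package. The response counts are invented placeholders, and a bar chart is only one of the formats mentioned above.

    # Minimal sketch: present hypothetical survey results as a bar chart.
    # The response counts are placeholders; requires the third-party matplotlib package.
    import matplotlib.pyplot as plt

    responses = {
        "Strongly agree": 220,
        "Agree": 410,
        "Neutral": 180,
        "Disagree": 140,
        "Strongly disagree": 50,
    }

    plt.figure(figsize=(8, 4))
    plt.bar(list(responses.keys()), list(responses.values()))
    plt.title("How respondents feel about the surveyed trend (n=1,000)")
    plt.ylabel("Number of respondents")
    plt.tight_layout()
    plt.savefig("survey_results.png")  # embed the resulting chart in the write-up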

Don’t make the mistake of only publishing the content as a PDF download and
requiring viewers to provide an email address to get it. While this may be a
great way to collect email addresses, it will not be as effective in driving
visibility for your organization (and obtaining links in the process) as
providing the content directly. Figure 11-3 shows Semrush data on links to the
FiveThirtyEight.com domain, a site whose stated mission is: “We use data and
evidence to advance public knowledge.” The site is showing nearly 21M links!

Figure 11-3. Links to FiveThirtyEight.com (source: Semrush)

Interactives

Interactives are a form of content that is responsive to user input. For them to be successful, the nature of the content needs to be highly engaging. One strategy is to create an interactive where the user gets to see how input they provide (their guess) compares to real-world data. For example, consider the New York Times interactive in Figure 11-4. This interactive proved to be a major link magnet. As you see in Figure 11-5, the content received 322 links from 150 different domains—an impressive result.

Figure 11-4. An example of interactive content (source: The New York Times)

Figure 11-5. Links to NY Times interactive (source: Semrush)

Collaborations with Other Organizations

One way to jumpstart the promotion process is to partner with other organizations that are influential. For example, if you were able to partner with a major corporation on performing and publishing research, that would provide instant credibility to your content. You might wonder why such a company would want to partner with you, but all organizations have limited resources and budgets. If you drive the process of analyzing the data and writing up the results, you can take that burden off them. You can then have your partner play a key role as an active advisor to the project and help promote the resulting content. The involvement of that partner organization can drive the visibility resulting from the campaign much higher than you may have been able to manage on your own. To maximize your results, try to structure the arrangement so that the content is published on your site.

Collaborations with Experts

Another approach to attracting links is to ask influencers or experts to review or contribute to a piece of content that you are creating. This role could take a few different forms, such as:

• Providing guidance and input to your team during the content creation process.
• Reviewing and/or editing your content once it has been created.
• Being quoted in or contributing portions of the content.

What’s great about this process is that it helps you produce content of higher value to users. The perspective of recognized experts in a topic area is exactly the type of insight that users (and Google) are looking for. Andy Crestodina’s post on blogging statistics is an excellent example of content that does this extremely well. The content is based on a survey of over 1,000 bloggers, including some top experts, and it includes input and quotes from over a dozen recognized content marketing experts. This post ranks #2 in the Google SERPs for the phrase blogging statistics. Consequently, it continues to receive visibility, including from other media and bloggers looking for such information who may choose to link to it. Per Semrush, this article has over 16,000 backlinks from over 3,000 domains. In short, because of the high quality of the content it has earned a great deal of visibility, and it continues to earn new, additional links due to the depth and breadth of the resource and its high Google ranking.

Once your content is published (or updated with their contribution if it’s being added to previously published content), you and your expert now have a mutual interest in
promoting the content. As part of the process, openly discuss your promotional
plans with them and ask them if they can help with that as well. Ideally, the
organization your expert works for will also link to the content and contribute
to raising awareness about it by sharing it on social media. A simple example of
an organization doing this can be seen in Figure 11-6.

Figure 11-6. Example of a university linking to content featuring one of its
professors

While this may not be the most awesome link you could imagine (given the anchor
text and the brevity of the description), the fact remains that the University
of Washington (Figure 11-6 is a screenshot from that site) chose to promote this
content. Some organizations will do full write-ups of the content in which their
experts are quoted or interviewed and promote that content in their social media
feeds. While you are working on your campaign, keep your eyes open for potential
collaborators with whom a deeper relationship may be desirable. If you
particularly like their contributions and they seem open to it, perhaps you can
engage in more extensive projects with them. This could be a way to create even
more great content offering high user value.

Quizzes and Polls

In this section we’ll focus our discussion on polls, but the same conceptual approach can be applied to quizzes (the main difference is that quizzes can be used to gather more information than a poll can). This type of information gathering from your website visitors can provide direct value to your business. One simple form of poll is to ask visitors for input on what they would like to see in your product or service.

For purposes of driving visibility and attracting links to your site, consider asking users questions directly related to your market area, with a focus on questions that might be of interest to media and influencers that cover your industry. Don’t waste your time gathering data on topics that are already broadly understood in your industry. Unless you come up with a surprise conclusion, this will be a waste of time and energy. Also, stay true to what the data says. You must let the data tell the story. If you fail to do this and share conclusions that are sensational but hard to believe and not backed by an objective analysis of the data you gathered, then you run a high risk of getting called out and discredited.

Executed the right way, these types of campaigns can provide great results. For example, Pew Research conducted a survey on American opinions on how tough a stance the US government should take toward China. As you can see in Figure 11-7, this poll earned over eight thousand links!

Personality tests (e.g., “which superhero are you?”) and self-assessments (e.g., “how much is your website worth?”) are variations on quizzes that can be quite effective at engaging visitors and gamifying the polling process.

Figure 11-7. An example of links earned by a poll (source: Semrush)

Contests

This approach to driving buzz and visibility has a lot in common with quizzes and polls. Once again, you’re seeking to drive visibility through participation of your users in an activity. The big difference with contests is that you’re not publishing the results but instead are drawing attention due to the nature of the contest, the spokesperson (if you have one), notable entries, and the prize offered. One of these aspects needs to be a natural attention getter—merely offering a $100 Amazon gift card is not going to cut it.

Creativity is the key to success. For example, you could run a contest where you give a prize to the winning user and offer a matching donation to a great cause, perhaps making that donation contingent
on a minimum level of participation in the contest. What makes this effective is that the cause you are supporting will want to help promote it, as will others who are strong supporters of that cause.

Another approach that could work is to provide the winner with an exceptional prize, such as tickets to the next Super Bowl or World Cup. This works because of the scarcity of such tickets, combined with extremely high demand. A well-known spokesperson can also make a big difference.

Cause Marketing

If you pick the right cause and promote it the right way, you can draw a lot of attention to your brand and site. It’s important to pick your causes carefully and ensure that they align with your brand values. A great example of this is outdoor and sporting goods supplier REI’s #OptOutside campaign. As part of this campaign, REI’s retail locations are all closed on Black Friday every year, and the company encourages its employees to use the time to do something outdoors.

For REI, the campaign aligns perfectly with company values. In addition, since it’s a large brand, it can easily garner a lot of visibility and press (though the company must still actively notify its media contacts and let them know about the campaign). Having a strong brand helps here too, as these media contacts are already familiar with the company and in many cases direct relationships already exist. You can see data from Semrush in Figure 11-8 that shows that REI’s news announcement has 7,300 backlinks.

Figure 11-8. Links to REI’s #OptOutside news announcement (source: Semrush)

If you’re not a major brand, you will likely not get as much exposure as REI did, but the general concept remains quite effective. Most niche causes are striving to develop as much visibility as they can, and even if you are a relatively small company, most smaller charities will be highly motivated by your support. There are many ways to go about this. For example:

• Add a page related to the cause on your website.
• Donate to the cause in a highly public way.
• Sponsor the cause at a public event.
• Incorporate an aspect of the cause in one of your content marketing campaigns.

What will work for you depends on the best fit between your organization and the cause you’re supporting. REI had a very natural fit with the #OptOutside cause. Find that best fit, and that’s what will help you get the most out of your cause marketing efforts.

Comprehensive Guides and In-Depth Content Libraries

The concept here is simple: create the best resource on the web on a given topic. This may not always be easy to execute, but if you are able to create such a resource and promote it effectively, you’ll have a good chance of drawing a lot of attention and links to your site. This can come in two forms:

• Create one in-depth piece of content that can serve as an indispensable and comprehensive guide to what your potential customers, influencers, and the media need to know about your market, or some important subset of your market. Creating such a guide may be hard to do, but it can bring big rewards due to the high amount of value that it provides to users. Then you can support this content with proper promotion to drive brand visibility and attract links.

• Create a library of content that covers your topic area more thoroughly than what is currently available in your market, or some important subset of your market. To maximize the value of this resource, it’s important to do a good job of connecting and interlinking each of the pages and making it easy for users to navigate and find the specific information that they are looking for. As with the single in-depth piece of content approach, support this with proper promotion.

Figure 11-9 shows an example of the backlinks earned by such a guide published by Search Engine Journal (which is itself a solid resource to use in learning more about how to create a comprehensive guide).

Figure 11-9. Links to the Search Engine Journal guide (source: Semrush)

Infographics

Infographics are among the most easily consumed and shared content types, but they have a few drawbacks: they don’t usually render nicely on smartphone screens, it’s time-consuming to update them, and they won’t be indexed as neatly as a text article (though you can address that issue by embedding the infographic as an expandable and linkable thumbnail within an article). Additionally, if you don’t have a graphic designer on staff who knows how to create a good infographic, then you’ll have to hire one. That can get expensive, especially if you don’t already have most of the art assets you want to include in it. Sure, you can hire someone on Fiverr, but the results may be spotty. Alternatively, you can do it yourself with an infographic construction kit tool like Piktochart, or you can hire a design studio with experience in infographics, like Cool Infographics or Studio1 Design.

Because they are visual, infographics may seem cool, and you might encounter SEO professionals who believe that they are the best medium to use in attracting links to your site. However, don’t fall into the trap of thinking that infographic campaigns are the solution to all your link-building needs. Infographics are only highly effective when this is the best medium for representing the information that you are trying to convey. Also be aware that infographics tend to be data-focused, and that data may become inaccurate quickly and permanently, which can have a negative impact on your site’s reputation in the future.

Tools, Calculators, or Widgets

These can be effective if they provide exceptional value to users. As when providing a comprehensive guide, aim to create something of very high value to users and then promote it to help attract attention and links. However, this tactic has been well known (and abused) by some SEO professionals for over two decades. For that reason, take care to strive for very high levels of quality and relevance with your tools, calculators, and widgets. In addition, be aware that becoming the 17th website to publish a tool to be used for a given purpose is likely not going to attract much
attention. For example, don’t create a mortgage calculator and expect to drive much interest—a Google search on “mortgage calculator” (including the quotes) indicates that there are millions of results. To succeed, you need to come up with something new and unique. Also, be careful with how you promote your tool, as many companies have abused this tactic in the past. For example, creating a “booze death calculator” (how many shots of whiskey would it take to kill you based on your weight?), allowing other sites to republish it, and embedding a link in the tool that points to a page on your site that is not at all relevant to the content of the calculator is a tactic that could get you into trouble, and the links would almost certainly be discounted.

Viral Marketing Content

Another approach is to create content that is designed to go viral, benefiting from a rapid spread of shares. Content that becomes viral can deliver a huge spike in traffic and can also attract a large number of links. This process is often jump-started when a significant influencer shares the content, as their extensive reach accelerates exposure for the content, and their reputation and authority increase its credibility. The viral behavior makes it a form of news, and may cause many people to write about it and link to it. The result can be a major spike in visibility.

Success with this type of strategy is harder to achieve than it may sound. The market conditions for creating such content that’s related to your business may come around infrequently; you’ll need to execute the campaign flawlessly and beat your competition to the punch. In addition, viral marketing campaigns may not be a fit for many organizations, depending on the nature of their brand.

A number of sites focus on popular trends and viral content, such as:

• BuzzFeed
• ViralNova
• Reddit
• Bored Panda
• Upworthy
• Distractify
• Ranker
• Thought Catalog

Search those sites for your keywords and see what the most popular recent results are. The point is not to find new keywords; it’s to find popular, current content that pertains to your keywords and topics. You can consider the authors of the most popular results to be influencers (and potentially hire them to create content for you), and the sites they publish on to be influential. You’ll want to target those people or sites with influencer campaigns. This is also a good exercise in market research to see which content types are hot right now and gauge the audience’s engagement levels.

Then, the follow-on question to ask yourself is what you could add to the
conversation that could potentially be interesting to many people, including
major influencers. As seen in Figure 11-10, the number of links earned by
content that goes viral can be spectacular. The example content is a study
conducted by MIT that reported that the human brain can process an image in as
little as 13 milliseconds. The number of links earned? 8,900!

Figure 11-10. Links earned by viral content

Memes

The same benefits and caveats apply here as with infographics, except memes are easier and cheaper to create, and have a higher risk of copyright infringement issues. A lot of popular memes use photos and drawings without the copyright holder’s permission, and that can get you into legal trouble. The best way to avoid that issue is to use your own graphics, though that may make your content less shareable. Note that depending on the nature of your business and brand, a meme may not be appropriate for you. For example, many B2B companies may not get value from this type of campaign.

A meme isn’t usually valuable in isolation; it’s best used as a component of, or gateway to, more substantial content. At the very least, ensure you have a link back to your site somewhere in the margin, and in the image metadata. Tools to create memes include Meme Generator and GIPHY.
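
If you want to stamp that attribution into the image file itself, the following is a minimal sketch using the third-party Pillow library. The filenames and URL are placeholders, and this is just one possible approach rather than a recommendation from this chapter.

    # Minimal sketch: embed an attribution URL in an image's EXIF metadata.
    # Filenames and the URL are placeholders; requires the third-party Pillow package.
    from PIL import Image

    SOURCE = "meme.jpg"
    OUTPUT = "meme_with_attribution.jpg"
    ATTRIBUTION = "Created by Example Co. - https://www.example.com/memes"

    img = Image.open(SOURCE)
    exif = img.getexif()
    exif[0x010E] = ATTRIBUTION  # 0x010E is the standard EXIF ImageDescription tag
    exif[0x8298] = ATTRIBUTION  # 0x8298 is the standard EXIF Copyright tag
    img.save(OUTPUT, exif=exif)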

Content Syndication

There are also times when you might consider allowing someone to syndicate your content even if they won’t implement canonical tags or links back to your original article or website. Remember, your goal is to build your site’s reputation. If The Wall Street Journal wants to republish your article without a canonical metadata tag, see if they will implement a link. But if they won’t, don’t argue—just go with the flow. The reputation and visibility benefits will far outweigh the downside of having The Wall
Street Journal rank for your content instead of your site, and it’s not a certainty that your page will rank lower in the search results than their syndicated copy.

Focus on high-value targets for content syndication. Sites that will take any article with little editorial review are not likely to offer a lot of visibility or high-quality links. In fact, those links may be of no value at all. Getting onto higher-value sites, such as a major regional newspaper’s, may require more effort, but those links will be much more valuable not just for SEO, but for direct traffic as well.
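
When a syndication partner does agree to add a canonical tag or a link back, it is worth verifying that it actually went live. Here is a minimal sketch of such a check; the URLs are placeholders, and it relies on the third-party requests and beautifulsoup4 packages.

    # Minimal sketch: verify that a syndicated copy includes a rel="canonical" tag
    # or an in-content link pointing back to the original article. The URLs are
    # placeholders; requires the third-party "requests" and "beautifulsoup4" packages.
    import requests
    from bs4 import BeautifulSoup

    ORIGINAL_URL = "https://www.example.com/original-article"
    SYNDICATED_URL = "https://news.example.org/republished-article"

    html = requests.get(SYNDICATED_URL, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")

    canonical = soup.find("link", rel="canonical")
    has_canonical = bool(canonical) and canonical.get("href") == ORIGINAL_URL
    has_backlink = any(a["href"] == ORIGINAL_URL for a in soup.find_all("a", href=True))

    print(f"canonical points to original: {has_canonical}")
    print(f"in-content link to original:  {has_backlink}")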

Social Media Posts

Content posted solely on social media (such as a post on Facebook, Instagram, TikTok, or X (formerly Twitter)) is easily shared within the walled garden of the host platform, but outside of that it just looks like a generic link to a social media page. Bear in mind that any links back to your site in that content will not count as links from the search engines’ perspective. They may use the link to discover and crawl your content, but other than that it has no SEO value. This is true even if other users of the social media platform reshare your post or create their own posts and link to your site. However, social media is another platform where you can deliver value to users and raise your general visibility. In addition, social media pages can show up in Google’s search results, though how much social media content the SERPs include varies from time to time.

NOTE
A content marketing campaign can be a good way to build your following on social media, which can expand your brand reach and give you easier access to customers.

Creating Remarkable Content

High-quality content tends to get discovered more quickly, is shared more widely, and continues to be noticed for as long as it remains relevant. The worst rookie mistake in creating and managing website content is to think of it as ephemeral—something that’s only valuable for a short period of time, like a billboard or a magazine ad. That couldn’t be further from the truth! Content is an asset with long-term (often permanent) value. If you lose sight of that principle, then the time and money you spend on content creation and marketing will largely be wasted.

The second-worst rookie mistake is creating content whose sole purpose is to drive your visibility (content marketing content) and then using it to sell stuff to your customers. That may sound crazy, but consider this: you’re not making a sales pitch here—this isn’t a TV ad. The purpose of your content should be to address user needs (in
areas directly related to your overall market). Those needs will be related to a problem they want to solve or something that they want to do, and they may not involve buying your product or service just yet. If you do this effectively, it can increase the search rankings for your content marketing content so that potential customers find you when they search for content to address their needs.

You must minimize the commercial nature of your content marketing content pages; don’t try to sell anything right here, right now. On the other hand, don’t disguise the relationship between the content and the commercial part of your site. You’re creating content for the purposes of helping customers and impressing influencers, and at the end of the customer journey this ultimately includes them buying your product or service. This is all part of one integrated continuum. In addition, because of the increases in your reputation and authority, and the links that your site will receive, the search rankings for your commercial pages will improve too, and this is how you’ll increase your sales and/or leads.

You should have content on your commercial pages too. This content will not typically be considered content marketing content. It should be designed to help users understand the options they have in buying your products or services, or how they can help them. In other words, commercial page content should help visitors to your site make intelligent buying decisions. If you do that well, your commercial pages have a stronger chance of ranking higher. These improvements to your commercial pages will combine with the lift in visibility and links that you get from your content marketing content to provide you with the strongest possible results.

Hiring Writers and Producers

While you definitely have in-house domain expertise, you may not have all the right people on staff to create content for a content marketing campaign. Ultimately, what you want to do is create quality content that:

• Provides high value to users
• Is visually engaging in a way that enhances the value to users and encourages sharing
• Is search engine–optimized so that it can rank for the appropriate search terms

This may require several different people to execute, such as:

• A subject matter expert to create the content
• Graphic arts talent that can create the appropriate imagery
• An SEO professional to provide input on how to properly optimize the content

The best way to find good creators for your content marketing program is to find some remarkable content that fits the style you want to emulate and is directed at the topical domain or industry segment that you’re aiming at, and hire the people that created it. If you didn’t already do so, go back to the sites where you found the highest-ranked content for your keywords, make note of the authors, producers, and publishers of that content, then research them. Do they solicit freelance offers, or can you hire them? Ideally you want to hire someone with a long history of creating great content in your market space. Better still is if you find that they have social media pages that have high visibility, as they may also be willing to share the content that they create for you.

You can also look for creators through ordinary freelance channels like Craigslist, Upwork, Fiverr, ProBlogger Jobs, and The Society of Professional Journalists. For creating images, infographics, and videos, search for graphic design and video production firms that operate on a work-for-hire basis. The firms in the top 10 results obviously know what it takes to rank highly in a SERP, so working with them could be a bonus (but it’s not required).

Regardless of how you find them, review each creator’s portfolio to ensure that they have a history of making the kind of content you’re looking for. Do they have significant expertise in your topic area? Does their content enhance the reputation of the sites where it’s been published? Does it seem spammy, gaudy, or untrustworthy? Has this person or firm published content that you definitely don’t want associated with your brand or business?

Generating and Developing Ideas for Content Marketing Campaigns

In “SEO Content Strategy” on page 135 we discussed how you can use competitive research to get tons of ideas for content that you can create. From that effort you will have been able to identify:

• Key topic areas that your current content does not cover
• Weaknesses in your existing content
• Strengths in your competitors’ content
• Important gaps in your competitors’ content

Chapter 6 covered how to perform keyword research to identify the search terms that represent key opportunities for your organization to provide valuable content to your prospective and current customers. During the process you will have exposed many content marketing ideas. These ideas may cover a lot of ground beyond what you find
during your competitive research. If you haven’t gone through those steps yet,
now is a good time to do that. You should also plan on doing in-depth
brainstorming with those in your company who best understand your target market,
the media and influencers covering it, and your customers. Use these internal
company conversations to learn about customers’ wants and needs and how you can
best serve them. Make a point of learning and understanding all of this not only
to provide outstanding content for your users, but also to support your content
marketing efforts. Having the best content in your market (or for a significant
subtopic within your market) is an outstanding way to lift your visibility, but
you should keep your eye out for what the hot topics are too. One approach is to
research key areas of interest to influencers within your market. These may not
currently be hot topics or in the news, but they’re areas where a need exists.
For example, imagine that no one has ever gotten a firm handle on how customers
in your market feel about certain related trends. Can you conduct a survey and
get the answer to that question? This is a bit different from riding the
coattails of a hot trend; it’s more focused on addressing a real market need. In
addition, because of the value of your content, you will need the help of your
top subject matter experts to validate that the content ideas that you come up
with during brainstorming have merit. And don’t plan on using generative AI
tools like ChatGPT to create unique, high-value content for you—these tools can
only create homogenized renderings of what is already on the internet. Many
great uses for such tools are documented elsewhere in this book, but creating
your site content without significant review is not one of them.

Speedstorming

An excellent technique for collecting internally sourced content ideas is speedstorming. This
approach is designed to stimulate creativity. Here is how it works:

1. Identify five people who know the industry well and are familiar with the concept of content
   marketing. Schedule a 20-minute meeting with them.
2. When everyone’s assembled, seat them around a table and give each person a blank sheet of
   paper. Make sure they know that during this exercise they can’t discuss their ideas or look at
   one another’s papers (until step 4).
3. Tell them all to come up with three content ideas, and set a timer for five minutes.
4. When the timer goes off, have each participant pass their paper to the person seated to their
   left.
5. Repeat the previous two steps five times, then collect the papers.

Just like that, you should have 75 ideas (five people × three ideas × five rounds) to consider
for your campaign. Even if two-thirds of the suggestions are not particularly useful, this will
still leave you with 25 pretty decent ideas to start from. What makes this process work well
is that it gets all five people involved. If you were to try to do the same
exercise with a whiteboard, you would likely get far fewer ideas, and some of
the people in the room would contribute little for fear of looking foolish or
incompetent. This process gets everyone involved, and it’s actually fun as well!

The importance of the headline

The headline is essentially the title of your content. It may be used as the <title> tag and/or
the <h1> tag for the page, and people linking to your content may use it as the anchor text in
the link. It also plays
a critical role in enticing people to read and engage with your content. Be
careful to avoid headlines that are pure “clickbait.” These are headlines that
are overly sensational or that are too formulaic. In time, as you publish more
content, people will start to see through these, and they may even harm your
reputation. However, it’s important that you make your headlines engaging to
entice users to click into your content. With these principles in mind, here are some
characteristics of effective headlines:

• Don’t lie or mislead.
• Keep it as short as you reasonably can.
• Include the most important keywords that apply to your content.

Some other tips that may help you create effective headlines are:

• Address the reader with a second-person pronoun (you, your, yours, you’re).
• Include a number.
• If you can do so without exaggerating, use an emotional adjective (surprising, exciting,
  shocking, amazing) and/or a superlative (best, worst, most, least, craziest, creepiest, etc.).
• If the content is instructional, use “how to” in the title.
• Search for your headline before you commit to it—try to avoid using the headline of an existing
  piece or page unless you’re confident that what you created is much better.

Don’t treat any of these tips for headlines as hard and fast rules. Use them to help you generate
ideas. At the end of the day, you need to do the work to determine what is most likely to be
effective in your topical domain. A few of the mechanical checks are easy to automate, as shown
in the sketch below.
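To make the mechanical checks repeatable across a large batch of candidate headlines, you can
script them. The following is a minimal Python sketch (not part of this book’s toolset) that
flags length, keyword presence, a number, a second-person pronoun, and a “how to” opener; the
65-character cutoff and the sample headlines are illustrative assumptions only, and none of this
measures how engaging or truthful a headline actually is.

    import re

    def headline_checks(headline: str, primary_keyword: str) -> dict:
        """Run a few mechanical checks on a candidate headline."""
        return {
            "short_enough": len(headline) <= 65,   # rough title-length assumption
            "has_keyword": primary_keyword.lower() in headline.lower(),
            "has_number": bool(re.search(r"\d", headline)),
            "second_person": bool(re.search(r"\b(you|your|yours|you're)\b", headline, re.I)),
            "is_how_to": headline.lower().startswith("how to"),
            "word_count": len(headline.split()),
        }

    if __name__ == "__main__":
        # Hypothetical candidates for a home-cider site.
        candidates = [
            "How to Brew Your First Batch of Hard Cider at Home",
            "7 Surprising Mistakes That Ruin Homebrewed Cider",
            "An overview of cider production methodologies in contemporary contexts",
        ]
        for h in candidates:
            print(h)
            print("   ", headline_checks(h, "cider"))

Treat the output as a prompt for discussion, not a verdict: a headline that fails every check can
still be the right one for your audience.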

Don’t Be a Troll

Don’t try to be remarkable in a negative way. As Maya Angelou
said, “I’ve learned that people will forget what you said, people will forget
what you did, but people will never forget how you made them feel.” Your content
campaign must satisfy your business purpose and align with your values and
goals; you should be proud to have your name on it. Content marketing’s main
focus should always be, above all else, building your reputation and visibility
with your target audience. Negative content associates you with negative
emotions, and ultimately with negative actions. In many places around the world,
your business may be held criminally and/or civilly responsible for negative
actions instigated by your content. We humbly admit that in previous editions of
this book, we advised readers to try to evoke any kind of emotional response
with their content—to inspire laughter or outrage. Evoking an emotional response
is still generally desirable, but you must consider the moral impact of the mood
you’re trying to create. Laughter is good, but outrage is not. Over the past
several years a lot of companies have gone all-in on troll tactics, relying on a
constant supply of outrage to generate clicks and shares. Ultimately this does
not seem to have been beneficial to them, except to establish their reputation
as trolls. When you become dependent on outrage to drive engagement, you put
yourself on a path that never stops escalating toward violence. Your brand will
become associated with hate and division, and while that may generate some new
sales in the beginning, in the long term you will alienate more people than
you’ll attract. At that point, the only way to repair your reputation is with a
complete rebranding. Trolling content isn’t even very effective in the short
term, because this type of content is easily displaced. When you stimulate
negative emotions in people, you make them vulnerable to even more provocative
content, so one hate-click leads to another, and you become a stepping-stone for
someone less scrupulous than you. None of this is in service to your goal of
increasing SERP rankings by creating linkable content. In an online world
increasingly filled with outrage and hate, the brands with hopeful, positive,
supportive, nonjudgmental messaging will stand out as trustworthy and
authoritative—not just today, but long into the future.

Don’t Spam, and Don’t Hire Spammers

Over the years we’ve seen many freelance job
postings for “SEO content writers” that make us wince. The requirements almost
always involve keyword stuffing—jamming as many keywords onto a page as
possible—and explicitly breaking other search engine rules. On many occasions,
we’ve been hired to help fix a site that has been
banned from Google’s search index because a bad SEO used black hat tactics to
try to climb through the search rankings faster and with less money. You should
use your most important keyword naturally in your content title or headline,
then no more than once or twice per paragraph of text (for an article) or
narrative (for a video or podcast), and use related synonyms regularly
throughout. When in doubt, read it aloud to yourself—or to someone else—and ask
if it sounds unnatural or strange. Does it sound like you’re trying to use a
particular word or phrase as often as possible? Subject matter experts writing
high-quality articles don’t reuse the same words and phrases repeatedly because
it’s amateurish and annoying to read. Instead, a good writer substitutes
often-used terms with alternatives, abbreviations, and pronouns. Remarkable
content should include your keywords, but never to the point that it stops being
remarkable (in a good way). Ideally a reader shouldn’t be able to detect the
keywords you’re targeting—or that your content was created for link-building
purposes. SEO-friendly content should be indistinguishable from content that wasn’t written
specifically for SEO. If your content looks, sounds, or in any way feels
spammy, then it is; if it seems spammy to you, you can bet that it’s going to
seem spammy to Google, too.
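If you want a quick sanity check before the read-aloud test, a script can count how often a
target keyword appears in each paragraph of a draft. This is a minimal sketch, assuming the draft
is a plain-text file with paragraphs separated by blank lines; the once-or-twice-per-paragraph
threshold mirrors the rough guideline above rather than a hard rule, and the filename is
hypothetical.

    import re

    def flag_keyword_stuffing(text: str, keyword: str, max_per_paragraph: int = 2) -> list:
        """Return (paragraph_index, count) pairs where a keyword appears more
        often than the rough once-or-twice-per-paragraph guideline."""
        pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
        flagged = []
        # Assume paragraphs are separated by blank lines in a plain-text draft.
        for i, paragraph in enumerate(re.split(r"\n\s*\n", text)):
            count = len(pattern.findall(paragraph))
            if count > max_per_paragraph:
                flagged.append((i, count))
        return flagged

    if __name__ == "__main__":
        draft = open("draft-article.txt", encoding="utf-8").read()  # hypothetical file
        for idx, count in flag_keyword_stuffing(draft, "home brewing"):
            print(f"Paragraph {idx}: keyword appears {count} times; consider a synonym or pronoun.")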

Relationships and Outreach

This section is all about how to raise awareness of
your remarkable content. In a perfect world, you wouldn’t have to do much to
promote an article or ask for links; you’d just post some content, and a loyal
group of fans and followers would link to it from everywhere. You may, in fact,
already be close to that ideal situation if you have a well-known brand or have
built up your mailing list and social media pages, or if you have access to
free, useful publicity. But regardless of your existing marketing and PR
resources, it always pays to continue expanding your reach.

Building Your Public Profile

Your company should already be expanding its brand
reach through regular customer-oriented site content, social media pages,
promotions, ads, and direct contact through an email or physical mailing list.
That’s typical marketing stuff. To achieve long-term success in the 21st
century, you should also be building your own personal brand. Whether you’re an
SEO consultant, an employee working on SEO at a company outside of the SEO
industry, or a small business owner who must do a little bit of everything, you
will benefit from this effort. It will follow you from one job, company, or
client to the next, and will make your outreach efforts much easier.

The more well-known and respected you are, the more links your content will get.
The only way to build a solid reputation in your industry is to show, over time,
that you’re interesting, trustworthy, and in general the sort of person who most
people want to be friends with. The challenge is that you must do this work in
the public eye—i.e., in public forums such as social media, traditional media,
and in-person events. This necessitates engaging positively with strangers in a
selfless manner, volunteering to help newcomers in your industry, contributing
articles to newsletters and journals, and speaking at conferences or on live
streams (or assisting with production).

Link Reclamation

One of the easiest, cheapest, and most effective link-building
tactics is to find all current mentions of your company’s name, brands,
products, and content (or, if you’re a public figure, your own name) on other
sites that are not currently linking to yours (i.e. unlinked mentions). If the
content on someone’s page is already directed at you in some way, it makes
perfect sense for them to link to your site. As a secondary exercise, you should
investigate why those people didn’t link to your site to begin with. Perhaps
they merely didn’t think to look for the URL of your page, or didn’t have the
time to look it up. Perhaps they chose to link to something else, such as your
Facebook profile or a product review video on YouTube, instead of your actual
product page. A number of services can assist with finding mentions without
links. Aside from your existing SEO backlinking tools and the other services
mentioned in the following section, you might find BuzzSumo and Mention useful
for this purpose. They’re convenient and reasonably priced, but probably aren’t
necessities unless you have a lot of high-popularity names/titles/brands to look
up and track. For smaller-volume link reclamation research, just use your Google
search skills, and set up a Google Alert to continue monitoring for new
mentions. Beyond names (companies, brands, people), you should also search for
titles of your pages and content (article titles, for instance), image alt text,
filenames (of images, PDFs, HTML files—anything published on your site), and
URLs (sometimes people don’t use an <a> tag and just paste in a URL in plain text).
TIP
If you do not have excellent search skills, we suggest picking up the most
recent edition of coauthor Stephan Spencer’s Google Power Search.
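For a small-scale, do-it-yourself version of this research, a short script can take a list of
candidate URLs (from your searches, alerts, or tool exports) and report which pages mention your
brand without linking to your domain. This is a minimal Python sketch using the third-party
requests library; the brand terms, domain, and example URL are placeholders, and the href check
is deliberately crude.

    import re
    import requests  # third-party: pip install requests

    BRAND_TERMS = ["Green and Growing", "greenandgrowing"]   # hypothetical brand names
    YOUR_DOMAIN = "greenandgrowing.org"                      # hypothetical domain

    def classify_mention(url: str) -> str:
        """Fetch a candidate page and report whether it mentions the brand,
        and if so, whether it already links to your domain."""
        try:
            html = requests.get(url, timeout=10).text
        except requests.RequestException as exc:
            return f"error: {exc}"
        mentions = any(term.lower() in html.lower() for term in BRAND_TERMS)
        # Crude linked-vs-unlinked test: is the domain inside any href attribute?
        linked = re.search(r'href=["\'][^"\']*' + re.escape(YOUR_DOMAIN), html, re.IGNORECASE)
        if not mentions:
            return "no mention"
        return "already linked" if linked else "UNLINKED MENTION: outreach candidate"

    if __name__ == "__main__":
        # Candidate URLs would come from search results or a mention-monitoring export.
        for url in ["https://example.com/gardening-resources"]:
            print(url, "->", classify_mention(url))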

Link Target Research and Outreach Services

Your preferred marketing suite or SEO
platform will have a variety of features that are directly or indirectly useful
for link acquisition. What follows is a list of some of the most important tools
for link research:

Ahrefs

While Ahrefs is useful for keyword research, site auditing, crawl
troubleshooting, and competitive analysis, it’s also an excellent tool for
monitoring backlinks—new backlinks, recently lost backlinks, pages linking to
multiple competitors but not you, etc. You can also use it to analyze
competitors’ websites. Ahrefs calculates an authority metric called the Domain
Rating (DR) score for each website in its database, as well as another metric
called URL Rating (UR) that is calculated on a per-URL basis. DR and UR run from
0 to 100 and are on a logarithmic scale. DR is used by many in the industry as
the standard for assessing the authority of a domain, and Ahrefs recommends that
people working on link-building efforts use UR, as it aligns closely with
Google’s traditional PageRank metric. One important component of the Ahrefs
offering is that it actively works to remove spammy links from the calculation,
so they don’t artificially influence the DR. This provides a much cleaner score
than would otherwise be possible.

LinkResearchTools

LinkResearchTools, as the name implies, is an in-depth toolset
for link analysis. You can analyze the authority, trustworthiness, and backlink
profile of a single page, a subdomain, or an entire site. This is particularly
important information to have when you are seeking links from large sites,
because their trustworthiness and authority are not uniformly spread across all
pages; entire areas of websites (such as forums or login-restricted services)
can be disregarded or disfavored by search engine crawlers even if other areas
of the site have top search rankings for your keywords. One of this suite’s most
helpful tools is Link Detox, the industry standard in analyzing the toxicity of
a site’s inbound links.

Majestic

Majestic claims to visit billions of pages per day and to have the
largest commercial link intelligence database in the world. Like Ahrefs, it
assigns two scores to reflect the authority and trustworthiness of the pages it
crawls: Citation Flow (CF) and Trust Flow (TF). TF is a metric offered uniquely
by Majestic that adds an additional dimension by evaluating how much value a
link from a given page may provide. Majestic’s incredibly handy Link Context
tool shows where in the page a link is contained and how many other links are on
the page with it.

Moz

Moz offers an excellent link analysis tool called Link Explorer and boasts
an index with 40.7 trillion links across 718 million domains and 7 trillion
pages. It assigns
two authority scores, Domain Authority (DA) and Page Authority (PA), as well as
a Spam Score to enable you to rapidly filter out links that are potentially of
poor quality.

Semrush

Semrush is a later entrant into the backlink tool market, but it has
made great strides and states that it has over 43 trillion links in its
database. Like the other tools mentioned here, you can use it to track your
backlinks and those of your competitors. Semrush also offers a Domain Authority
(DA) score that can be used to help assess the value of a given link, as well as
a backlink auditing tool that enables you to detect spammy links.

Many of these
tools, such as Ahrefs, Majestic, and Moz, include many other functions and
features, and all the tools provide mechanisms for identifying poor-quality or
even toxic links that may be pointing to your site. Many other tools can assist
in your link campaigns in other ways. Two of those that the authors are familiar
with are:

SparkToro

SparkToro’s primary focus is on audience research: understanding the
larger context around the people you’re trying to reach. What other interests do
they have? Which social media profiles do they follow? What are their
demographics? This is an outstanding tool for rapidly finding new link targets
on the web and social media, either from the top down (starting with a specific
site, topic, or social media profile and analyzing the people who visit, talk
about, or follow it) or from the bottom up (analyzing the related interests of
specific influencers, customers, or demographic cohorts that you want to reach).
SparkToro’s “podcast insights” feature is particularly useful for finding
podcasts that are relevant to your topics and keywords. Podcasters are always
looking for new guests, and they will link to the guest’s site and/or social
media accounts in the episode description. The amount of useful SEO data you can
get from this service is amazing, not just for link building but for finding
content ideas, gaining competitive insights, and keyword research.

Pitchbox

A CRM for link building and digital PR is an accurate way to summarize
this service that can play a key role in scaling your link-building campaign.
Pitchbox helps you find quality link prospects and design custom communication
strategies to reach them through its specialized email system and analytics
engine. Whereas SparkToro is a research tool, Pitchbox is a management tool that
spans the outreach life cycle, from first contact to continued relationship
building with people to acquire links.

Qualifying Potential Link Targets

If you’re serious about obtaining more links,
then you should create a list of pages and social media profiles that could
potentially be of value. You can build the list yourself, but the most efficient
method is to export it from a research tool like the ones mentioned in the
previous section, or from an enterprise SEO platform such as BrightEdge,
Conductor, Searchmetrics, or seoClarity. As with keyword research, it’s easiest
to compare, filter, and sort your link target list in a spreadsheet. We suggest
these columns:

URL
This is the URL of the specific page that you’re evaluating.

Site Authority
Using this metric from your backlink tool of choice is a useful
way to get a measurement of the authority of the potential link target.

Topic
This is a subjective measure of how well the page fits your content’s
topic or theme. Use whatever scale you prefer. Some suggestions: numbers from 0
to 5; words such as “good,” “mediocre,” and “poor”; stoplight colors; or Boolean
pass/fail indicators.

Trust
How much do you trust this site’s content and stewardship? Does it look
like the site or the domain owner might be doing something illegal, unethical,
or against Google’s guidelines? Is this a new site with no meaningful history?
Will Google be suspicious if your pages are linked to from this site?

Traffic
If you have traffic estimates (daily, weekly, or monthly visitor count),
put them in this column. If you don’t have traffic data, you can skip this.

Comments
Put any other notes you may have about the page here, such as its
domain ownership, traffic history, potential penalties or spam, or any other
insights.

Link Quality
This is your overall “opportunity score” that takes all factors
into consideration. How to balance those factors is highly variable depending on
your topics and industry, but we suggest that Topic be the factor with the most
weight. Links from high-traffic, high-trust sites that aren’t topically related
to yours are usually not as valuable as links from high-relevance sites,
regardless of traffic or trust (within reason).

The process of qualifying potential link targets is much like developing a
keyword plan, except your worksheet will pivot on the URL column instead of the
Keyword column. When you’ve filled in all the relevant rows, you can identify
your best opportunities by sorting and filtering the table by the Link Quality
column. It usually pays to be somewhat forgiving when evaluating links; as long
as they aren’t complete duds, they may still have some value in aggregate. Don’t
limit yourself to a few perfect candidates, and don’t be afraid to aim
high—don’t assume any influencer or website is too big and famous to bother
with. The next step is figuring out how to approach the owners or maintainers of
those pages to ask them for links. This will involve understanding why some
aspect of your site offers enough value to their users that they would be
interested in linking to it and communicating that effectively (see “Why People
Link” on page 504).
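If your link target worksheet lives in a spreadsheet or CSV export, the sorting and filtering
step is easy to script. Here is a minimal sketch using the third-party pandas library that
assumes the column names suggested above; the weighting formula and the 2.5 cutoff are
illustrative assumptions, not recommendations from this chapter beyond giving Topic the most
weight, and the filename is hypothetical.

    import pandas as pd  # third-party: pip install pandas

    # Load a link-target worksheet exported as CSV. The column names below assume
    # the layout suggested in this section; adjust them to match your own export.
    targets = pd.read_csv("link-targets.csv")  # hypothetical filename

    # Example of a weighted "Link Quality" score, with Topic weighted most heavily.
    # The weights and scales are arbitrary illustrations.
    targets["Link Quality"] = (
        0.5 * targets["Topic"]                      # topical relevance, assumed 0-5 scale
        + 0.3 * targets["Trust"]                    # subjective trust rating, assumed 0-5 scale
        + 0.2 * (targets["Site Authority"] / 20)    # scale a 0-100 authority metric down to 0-5
    )

    # Identify the best opportunities: filter out the duds, then sort descending.
    shortlist = (
        targets[targets["Link Quality"] >= 2.5]
        .sort_values("Link Quality", ascending=False)
    )
    print(shortlist[["URL", "Link Quality", "Topic", "Trust"]].head(20))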

The Basic Outreach Process

Understanding the inner workings of the outreach
process will help you understand the kinds of value propositions you need to
bring to the potential linking site and how to position the dialogue. This
section walks through the steps involved in preparing and crafting your pitch.

Effort versus opportunity

Before you can contact anyone, you must know who they
are and how to contact them. The nonobvious aspect of this process is that you
must also consider how much time it will take you to find that information, and
how much time you might spend on communication efforts that don’t result in
links. Even if you don’t have a huge list to work through, you can easily waste
a lot of time on this. Limit yourself to a certain amount of time to find the
correct contact information for the person who is responsible for updating site
content. One minute is usually sufficient; if you cannot find it within that
time frame, then it’s likely going to require a much longer effort. In some
cases, it absolutely will be worth the extra time because the site has a high
amount of traffic and relevance and you have a good reason to believe the
influencer or webmaster will be amenable to linking to you, but for your first
round of outreach attempts, it’s better to skip those prospects for now and
focus on the sites that are easier to contact. Note that outreach tools provide
built-in capabilities for identifying email addresses. Regardless of how
valuable a potential link may seem, you must still consider whether you’re
wasting your time chasing after it. That time may be better spent getting a few
other links from other sites that are easier to contact. In aggregate, many
less-impactful links are likely to provide more value than the one difficult
one. When
you’ve exhausted your supply of easy requests, reevaluate the remaining options
and focus on the best opportunities first.

NOTE
If you aren’t using a CRM or
some other outreach-tracking tool, then you should add two new columns to your
spreadsheet: Contact and History. Put the webmaster’s contact information (name,
email, possibly social media accounts as well) in the Contact column (or, if you
prefer, you can separate the various aspects of their contact information into
many columns), and document your efforts to get in touch with them in the
History column.

Understand the value proposition you have to offer

As we discussed in “Why
People Link” on page 504, people are not going to link to you just to help you
out or to help you make money. People implement links to other sites because
they believe doing so will bring enough value to their users that they’re
willing to have those users leave their site. To succeed at outreach, you must
fully understand what it is about your site, and the specific page on that site
that you want a link to, that will bring a lot of value to their users. If you
don’t have a clear understanding of what this is, then you’re not ready to
conduct outreach.

The pitch email

This is another critical part of the link outreach process, and
the easiest one to do poorly. Here are some guidelines:

Recognize that you’re not the only one pitching them. Pretty much every
webmaster of a site of any visibility is getting pitches to add links to other
people’s sites on a regular basis. The other people pitching them will range
from those who are bluntly asking for links (a bad idea) to those who are very
sophisticated and highly focused on showing the recipient of the email that they
have something of great value to their users. You need to do the right things to
stand out. You also need to avoid doing the wrong things. Pitches that read like
you’re primarily concerned with how the recipients can help you will set off
alarm bells and can eliminate any chance you have of building a meaningful
relationship with them that will include them linking to your content.

Remember that it’s a small community. Don’t treat the people you reach out to
like someone with whom you will never communicate again. If they’re running a
site with users who are in your target
audience, you may have a desire to interact with them again. Don’t ruin that
with a selfish pitch.

Contact the right person. Who at the organization is most likely to be
responsible for updating page content? Don’t email the CEO. If you can’t figure
out who the appropriate person is, then begin by contacting someone who is
public-facing and obligated to respond to a business inquiry—someone in public
relations, or an administrative assistant—and who can forward your message or at
least provide the information you’re looking for.

Be brief. No one has time to read unsolicited email, but many people will at
least glance at it before clicking “delete.” Without sacrificing friendliness or
leaving out key information, use as few words and sentences as possible, and
make your subject line accurate and enticing without being hyperbolic or spammy.

Don’t ask for a link. In general, try to get out of the link-seeking mindset.
Directly asking for links makes your outreach about them helping you, when they
have little reason for doing so. They’ll write you off as a spammer, making
future contact with them even more difficult. The exceptions to this rule are
when you’re asking to modify an existing link (if the URL has changed or the
link is broken due to a typo), or in a link reclamation scenario (as described
earlier in this chapter), or if the site often publishes resource links (as long
as it isn’t a link farm and doesn’t seem spammy). Aside from those scenarios,
the focus of your pitch should be on the value proposition you have to offer to
their users—keep in mind this is not a sales offer, but more about the value of
your website or content to their users. Ideally, you want your value proposition
to be so interesting and useful to this person (and their site’s users) that
they write a blog entry or social media post that links to it in a positive
context.

Don’t ask for a link to be changed from nofollow. If someone has implemented a
link to your page and they have chosen to attach the nofollow attribute to that
link, just thank them for linking to you and move on. Trying to get the nofollow
attribute removed will set off alarms for them, and the most likely outcome is
that your link will be removed. The link has value to you as is, so it’s best to
leave well enough alone.

Show that you know who they are and what their site is about. Personalize your
pitch email so that you’re speaking to a person about the content they publish.
The less personalized your pitch is, the more it seems like it’s part of a bulk
mailing campaign, so the more spammy it will seem.

Be friendly… Try to make some level of connection with the recipient, but don’t
overdo it. For example, if you lead your pitch with a lot of superlatives about
how great they are, or how you’ve been reading their site for years, chances are
that they’re going to feel like you’re just blowing smoke. Don’t be overly
formal, but make it clear that you’re seeking a relationship. You’re meeting
someone new, so show them that you’re a real person seeking a real connection
with a like-minded individual. Note, though, that there may be many exceptions
to this rule, especially outside of the US. In some cultures, overformalizing
with titles and honorifics is a sign of due respect and would be expected in a
pitch email; in others, it might be seen as pretentious and dehumanizing. Know
your audience.

…but don’t be too friendly. If someone’s public contact information lists their
name as “Elizabeth Tudor, PhD,” then you should address them as Elizabeth in
your pitch email—not as Beth, Liz, Betty, Lisa, Liza, Dr. Tudor, Professor
Tudor, or Ms. Tudor—and don’t make Queen Elizabeth jokes; she’s heard them all
before and they have never been funny. People will teach you how they prefer
their equals to address them; by not honoring this preference, you prove that
you have not considered their perspective and therefore don’t know how to
connect with them.

Be helpful. You aren’t asking for a link; you’re just letting the people who run
this awesome site know about some new material that they and their visitors may
be interested in. What content would they love to link to, but can’t find?

Ask for (a little) help. In many cases you’re asking for this person’s expert
opinion—which you highly respect—on a new piece of content you’ve published.
Don’t be too presumptive about what they will enjoy or find valuable; ask them
to tell you instead. If nothing else, they’ll tell you what you’d need to create
in order to attract their attention.

Offer something new. You must learn to be as objective as possible when
assessing the value of your content to the person you’re pitching. If it’s a
site dedicated to the Beatles, then the webmaster is probably not interested in
yet another “top 10” list or other low-effort content, but would probably be very
interested in a previously unreleased video clip or interview transcript from
the 1970s. Assume that the person you’re pitching to is obsessed with this
topic, but has pretty much “seen it all” at this point. What are you offering
that would be new to them?

Unless you are literally a professional clown and are promoting a site for
clowns, don’t be a clown. Don’t use weird fonts, HTML emails with backgrounds,
inline GIFs, emojis, or file attachments, and do not try to get a laugh. Even if
you’re a comedian promoting your act, it’s better for your pitch email to be
short and friendly, and for your comedic content to speak for itself.

If appropriate, brag (a little). If you’re making the person you contact aware
of a great article you wrote, your request may benefit from sharing examples of
recognition you’ve received from prestigious sites. For instance, if The New
York Times or Washington Post published your op-ed, or if you were interviewed
for a show on CNN or CNBC, it’s worth including links to that material—but not
too many, and make sure they aren’t paywalled (if possible). Consider which
other accolades, awards, honors, and credentials you might list either in the
email or as part of your signature line, but don’t go too far with it; you only
want to establish your expertise, which suggests how reputable you are.

The subject line.

The best pitch email in the world will not be opened if its
subject line is unengaging or feels spammy. Keep the subject line simple, get to
the point immediately, don’t lie, don’t exaggerate, don’t let it be boring, and
don’t make it too long. Questions often make good subject lines, as do apparent
responses to your target’s most recent or most popular content. Here are some
sample subject lines:

• Is this unreleased Beatles clip legitimate?
• More recent data to consider RE: mortgage defaults
• Quick question RE: your latest home brewing video
• Would this work for low-noise hothouse ventilation?
• Projected damage from named storms
• Something to consider RE: Microsoft’s Surface product roadmap
• Full list (with map) of Florida scrub jay sightings this year
• Anything you’d add to this video on jewelry photography?
• A few things to think about for your next piece on corn subsidies

Hopefully you can
infer who the recipients would be, and the topics their websites cover, by
reading these subject lines. A good pitch email subject line is a delicate
balance between being engaging and presumptuous, enticing and vague,
extraordinary and hyperbolic. To avoid going
overboard, don’t try to manipulate your reader, and don’t assume that they know
who you are and care what you think or do. For instance, here are some alternate
subject lines that would not be nearly as effective:

• My Beatles video :)
• Apple > Microsoft
• Please link to my mortgage industry blog thanks
• Link to better info
• Best ventilation solutions
• I finally got around to doing that video I promised everyone last year lol enjoy
• The Bahamas are doomed
• Link to put in your next article
• Follow my homebrew Facebook page

Email templates.

Every site, business, industry, and topic will have its own unique challenges
when it comes to writing successful pitch emails. It isn’t possible to create a
perfect one-size-fits-all template for you to follow for every situation, so we
created a few examples that might work for different campaigns. Here is an
example of a basic request to update a brand mention of your site that is
currently not a link:

Hi ,

My name is Craig and I’m the editor over at Green and Growing. Thank you for mentioning us in
your article yesterday on resources that people can use to learn more about how to find success
in their efforts to grow their own garden. I also saw the other resources you suggested in the
same article and think that it’s a great collection. I notice that you didn’t include the
American Horticultural Society in your list. Do you think that would be a good addition to your
article? Also, would you consider updating the mention of our site to a link? This will make it
easier for users who are looking for more info. The correct URL is https://www.greenandgrowing.org.

Hope to hear back from you.

Thank you,
Craig

Another way to be helpful in a pitch email is to point out a missing image or broken link on the
page you’re targeting:

Hi ,

I was on your website this morning and really appreciated the depth and quality of resources you
provide to home brewers like me. I noticed that there was a broken link to the Homebrewer’s
Association on your page located at
www.yourdomain.com/homebrewingresources. The broken link can be found under the
second image and is using the anchor text “Home Brewer’s Association.” It currently points to
https://www.homebrewersassociation.com. The correct link should be
https://www.homebrewersassociation.org. I also have a resource that you might want to include as
an addition while you are fixing that broken link. This is a relatively new site that has
excellent homebrewing content on a variety of topics, from growing your own hops to managing a
small brewing business: www.example.org/cheersbrewing. Let me know what you think of it!

Thank you,
Norm Peterson
Cheers Brewing

If you take the approach of creating comprehensive guides, then it’s likely some of the people
you’ll be pitching are linking to the top-ranked articles that your article will compete with.
Here’s an example of an approach you can take in such cases:

Hi ,

My name is Johnny Appleseed, and I run a website called Example Site for This Book. We cover
everything from harvesting apple seeds to pruning wild trees to backyard DIY orchard projects. My
first contact with your site was your excellent article on Steve Jobs’s favorite apple breeds:
https://www.yourdomain.com/articles/steve_jobs_preferred_granny_smith_over_mcintosh.html.
I’m not sure if you’re aware, but we ran a recent survey that found consumers appear to agree
with Steve Jobs, with 51% preferring Granny Smith apples and only 38% preferring McIntosh apples
(11% were undecided). You can see the full survey results in our “Ultimate Apple Guide” located
at this URL: https://www.example.org/world-domination/ultimate-apple-guide.html. I’d be thrilled
if you decided to share it or add it to your website, and if you think it can be improved, just
let me know how!

Thanks,
John “Johnny Appleseed” Doe
Example Site for This Book

Finally, here’s an example of a more understated approach. The purpose is not to immediately get
a link, but just to start a conversation:

Hi Deb,

I just had a quick question re: your post on the Succulent Lovers Facebook group. Your comments
on organic pesticides were great. I saw that you didn’t mention onion in your list there. Is
there some reason that you didn’t include it?

Thanks!

If you get a reply, you can continue the conversation and
introduce your site at some point. If this is the approach you take, it pays to
keep in touch with this person on a regular basis, either through email or
social media. You’ll expand your professional network, which will hopefully lead
to future partnership opportunities (and links).

Case study: GHC Housing.

GHC Housing, a client of coauthor Stephan Spencer, was
rehabbing an apartment building in downtown Denver, turning it into luxury
Section 8 housing. Shortly before the grand reopening, the client asked Stephan
to help them get some online press coverage (ideally with links). The first step
was to find a relevant recent article in the local newspaper, The Denver Post.
The article “Denver rents actually inch up as apartments rise at near-record
clip” fit the bill perfectly. It was only three weeks old at the time, so it was
still fresh enough to email the journalist with a comment and for it not to seem
weird. The general manager of GHC then drafted an email to send to the
journalist. The first draft wouldn’t have been effective, as it read like a
mini-press release, as you can see:

Denver’s rental market has matured into one
of the most desirable markets in the nation for residents and investors alike.
As affordability has become a greater concern for the city’s leaders and
residents, a renewed concentration has been placed on meeting increased
residential needs with new construction. New construction is a key component of
keeping housing costs in check by helping meet demand, but a focus on
maintaining the existing stock of housing at an affordable level is equally as
important. This is most apparent in affordable and subsidized housing. The cost
of renovating an existing affordable housing community is a fraction of the cost
and time it takes to build a new residential building, and once the existing
affordable housing is lost to new unaffordable redevelopment projects it can
take years to replace, if and when it is actually feasible. Considering the
limited resources allocated toward affordable housing development, developers
and local leaders should consider placing more focus on the benefits of
revitalizing our existing housing stock to help meet the community’s needs. GHC
Housing Partners (GHC) specializes in the acquisition and redevelopment of
affordable housing communities across the country. GHC has acquired over 2,000
units in Colorado over the past 24 months, making it one of the largest private
owners of affordable housing in the state. GHC is focused on creating the
highest-quality properties, utilizing top design and construction methods to
create comfortable and high-quality living environments for residents. GHC is
completing
renovations at six Colorado affordable housing properties this year with plans
to complete over a dozen more in the next 24 months.

It needed to sound like it was written by a real human being, not the communications team. After
some coaching from Stephan, the revised version was personal, thought-provoking, and insightful:

I read your excellent article last week about Denver rents
increasing and I was compelled to comment. I concur that Denver’s rents keep
going up and I think that threatens Denver’s future. But what do we do about it?
There are only two market-focused ways to gain more affordability or reduce
rents: increase supply and/or decrease demand. Most people don’t want to
decrease demand; they want more jobs, more growth, and more economic vibrancy,
hence more demand for housing and population growth. Attacking supply is also
problematic, as flooding the market with new projects and easy building
standards may drive down prices, but it comes at the price of smart planning and
to the detriment of the existing owners and stakeholders. So, there need to be
quasi-market solutions to promote both new affordable housing and preservation
of existing affordable housing. Whether it is through bonuses to new building
density or tax abatement to buildings that have an affordability component, or
it is granting incentives for owners to keep their buildings affordable, it is
important we think about this. Having a monolithic society of only a
concentration of wealth is not healthy for economies; you lose diversity,
creativity, not to mention a sense of empathy to those senior citizens or
families who need a place to live. Diversity is a necessary and vital ingredient
which makes a place like Denver so amazing. We are really proud to have recently
worked with the regional Denver HUD office to preserve and rehabilitate Halcyon
House in Downtown Denver and sign an affordability contract for 20 years,
ensuring an affordable home for over 200 senior citizens and disabled residents
for the future. But we need as many solutions as possible, to continue to keep
Denver the type of place that has made it one of the top cities in the US.

Within hours the journalist replied, saying he’d shared the comments with his
colleague who covers the affordable housing beat. That colleague attended the
grand reopening the following week and took photos. The result was a full-page
article in The Denver Post, including a photo of the new building: “Low-Income
Apartment Tower Halcyon House Unveils $7M Renovation”. That single article led
to an important deal for GHC. That just goes to show that a link won’t
necessarily be the biggest needle-mover for the company; sometimes it’s the
press coverage itself.

Following up.

So, you’ve sent off an excellent pitch email that follows all the
guidelines you learned about in The Art of SEO, including an epic subject
line…but you didn’t get a response. What now?

First, wait at least three business days, and give some consideration to
potential holidays and long weekends. Do not assume anything other than the
following scenarios:

1. The person you emailed didn’t get the message because:
   a. They’re on vacation or out of the office and aren’t reading email.
   b. It got marked as spam and didn’t make it to their inbox.
2. The person you emailed did get the message, but:
   a. As a continuation of 1a, they’ve returned to work and have hundreds of unread messages to
      deal with, and yours is the lowest priority.
   b. They’re too busy with more important immediate concerns and can’t take the time to reply.
   c. They read it on their phone during lunch or on the subway and meant to respond later, then
      forgot, then other emails took priority.
   d. They read your message, deleted it, and will not consider your request.

Unfortunately, you have little influence in most of these scenarios. If you
suspect your email was never received due to spam filtering, then you can
compose a different pitch email and send it from a different email address on a
different domain, and mention in a postscript that your first email didn’t seem
to get through. If your email was received and deleted, then all future
communications will not be welcome, but you have no way of knowing that until
you get a response. Therefore, you might consider a follow-up email that
restates your pitch even more concisely, and explicitly asks whether you should
follow up again. Lastly, if someone’s not able to read your pitch right now for
any reason, then that’s the reality and you have to live with it for the moment.
The best you can do is to follow up in a month or so and hope they’re back at
work and/or not occupied with urgent matters. When considering all these
scenarios, it benefits you most to assume that several may be true. Therefore,
we recommend a two-step follow-up strategy:

1. After three business days (accounting for potential holidays), send a brief follow-up from the
same email address, and mention that your first message may have been inadvertently filtered out.

2. If you get no response, then take the position that your previous messages have not yet been
received because the recipient either was not the correct contact, is still out of the office or
away from work, and/or has been busy and will not be available to respond for a while longer.
Your second and final follow-up should be sent anywhere between two and four weeks
later, and you should consider
sending this from a different email account in case your email is being filtered
out. If you still do not get a response, then it’s time to set this site aside
for now and reevaluate it sometime in the future.

Creating a value proposition for high-value relationships

The highest-value targets usually take a lot more effort to contact. In order to get in touch
with an influencer or someone who controls an influential site, you must be interesting and
useful to them. Just about everyone has some experience with that (we hope), but here are some
specific tips for influencer outreach:

• Start by engaging with them on social media or by posting comments on their articles—meaningful
comments that add value to the discussion and show that you read their article or watched their
video, not quick throwaway stuff like “Great video!” Asking a question tangential to one of the
topics they’ve covered is an excellent opener.
• Ask them to provide a quote for an article you are writing. Make this easy for them, so it’s
more likely they will agree. (This might not work if you don’t have a history of publishing
trustworthy articles, or if your article doesn’t have enough reach to make it worth their while.)
• If you have a podcast, stream, or other regular content channel, offer to interview them on or
for it.
• If they will be attending a conference, go to that conference yourself and find a way to meet
them in person. One great way to do this is to attend a session where they’re presenting and go
up and speak to them afterward.
• Monitor their article and social postings and take note if they ever ask for help in some
fashion (“I wish someone would show me how to…”). Then, offer them that help—even if you have to
spend a few hours learning how to do it yourself first.
• You’re probably not over two degrees of separation from them. If you know someone who knows
them, ask for an introduction. LinkedIn is a great way to find the shortest path to creating a
new professional connection with someone, as you can see what people you know in common.

Basically, just be of service
without seeming desperate or sycophantic. Don’t treat this as an “I need
something, so I will just ask for it” situation. These relationships are
important, and it’s not a good idea to start off on the wrong foot. Once you
have a relationship, many possibilities start to open up, and your reputation
and visibility can grow as a result. Don’t ask for a link, just focus on
building a relationship, and at the right point in time, the person you built
that relationship with will decide on their own to link to a fantastic piece of
content you have created that is
of high interest to them. Better still is if you can find a way to help them
with some of their content by doing research or contributing key data to
something they’re working on. At this point you may even get into a situation
where you’re collaborating with them.

What Not to Do

Linking is the natural result of discovering great content; if
you publish great content, then you should have faith that people will
eventually link to it. That said, as good marketers it makes sense to
proactively make people aware of your great content. However, ensure that your
outreach follows sound traditional marketing tactics. Influencer outreach is a
delicate, long-term process that should be completed slowly and carefully. If
you make a small mistake, like asking for a link without establishing a
relationship first, then you might be able to get onto the right track and be
forgiven. If you make a major mistake, it could cost you more than just a lost
opportunity for a link. Some examples of things to avoid are described in the
following subsections (refer to Chapter 9 for a more comprehensive explanation
of black hat SEO tactics, spam, and penalties).

Being a pest

If someone isn’t responding to your pitch and (at most two)
follow-ups, then it probably isn’t a good idea to increase the pressure on them.

Asking for specific anchor text

The descriptive text inside the <a> element is an
extremely important ranking factor. That is why the major search engines forbid
people from requesting that incoming links use specific words or phrases. You
can ask for a link, but don’t ask for specific link text.

Quid pro quos (reciprocal linking)

Don’t offer or request a reciprocal link (“I
link to you, you link to me”). While it’s natural that topically related sites
would link to each other in their page content, reciprocal linking as a
deliberate practice always ends up looking spammy. If it looks like a link farm
to you, then it looks like one to Google, too, and you will get no benefit from
a link from it. Having said that, in some cases the people you request links
from are going to ask you for a link too. Use good judgment here. If you think
their site is a strong resource for your site visitors and you link to their
site in a natural way (in a blog post or other page content), then that should
be fine. If you dump the link in a list of other links in your page footer or on
a Links page, then you’re not doing anything of benefit for
your users or the page to which you’re linking, and in extreme cases you could
end up being penalized by search engines.

Paid reviews and endorsements

If you are trying to reach a product reviewer, be
warned: some high-level reviewers may ask you for money in exchange for a
review. Among journalists this is considered highly unethical, but bloggers and
social media power users don’t always fall into the “journalism” category and
may not know or choose to honor those standards. That doesn’t matter, though; in
the United States, anyone who publicly posts a product review, whether they
consider themselves a professional reviewer or not, must disclose the fact that
they received compensation (including a free product) for it. This is a US
Federal Trade Commission (FTC) regulation, and it applies to all published
content—even product endorsements on X (formerly Twitter). The FTC puts the
responsibility for disclosure on the brand, not the publisher, so if you send a
check with your review materials, or if you tell a blogger or journalist to go
ahead and keep the review unit you’re sending, you must ensure that they clearly
print an appropriate disclosure notice. Giving out free products for a blogger
or influencer to review is generally acceptable as long as they provide proper
disclosure; they do not need to disclose anything if they return or destroy the
product prior to publication. For instance, if you send a prerelease ebook to a
reviewer, they could avoid the disclosure requirement by deleting the ebook from
their library; and if you send a physical product to a reviewer, you can include
a prepaid return shipping label and ask that it be returned by a certain date.
Unless you own a content site that publishes product reviews, paying someone to
review a product is almost universally considered unethical, and it’s against
the rules on most retail sites. Amazon regularly deletes fake reviews, bans paid
reviewers who don’t disclose their compensation, delists products that have a
lot of fake reviews, and terminates the accounts of sellers who use fake review
services. It isn’t just a matter of direct or indirect compensation; the FTC
rules also require disclosing any connection between a reviewer or endorser and
the marketer, manufacturer, or seller. That encompasses anything that could
affect how consumers evaluate the review or endorsement. So, if your cousin
works for (on a salaried or freelance basis) The Wall Street Journal, they would
have to disclose their familial relationship to you if their review of your
product were to be published there (or anywhere else).

Private blog networks (PBNs)

Some SEOs try to get away with operating a network
of sites—blogs, usually—that only exist for the secret purpose of link building
for a client site. Google is pretty good
at detecting PBNs. If you get caught, at best you’ve wasted your time and money,
and at worst you’ll receive a manual penalty or algorithmic adjustment. This is
only a bad tactic when those blogs aren’t legitimate content sites and are
clearly fronts for rank manipulation. For instance, if you genuinely maintain
several blogs on different topics—one about Apple devices, one about online
gaming, one about self-driving cars—then you aren’t doing anything wrong, and
there’s nothing underhanded about linking out from two or three of them when
their topics overlap. However, Google may aggregate or collectively discount the
authority of their outbound links when it detects that those sites are owned by
the same entity.

Buying expired domains to use for links

When a domain that has some authority
with search engines expires or goes up for sale, some SEOs think they can buy it
and use its link authority to enhance the rankings of other sites. As with most
spam tactics, Google is good at detecting this, so don’t do it.

Paid links

Google does not like it when people pay to be linked to from
influential sites and demands that webmasters qualify paid links with an HTML
attribute that designates them as such (refer to “Using the rel=“nofollow”
Attribute” on page 278). When webmasters honor these requirements, you get no
SEO benefit from their links (though you will still get direct traffic from
them); when they break the rules, Google almost always finds out, and may assign
manual penalties to those involved. If being seen on a site is important enough that you decide
to pay for a link that includes the sponsored or nofollow attribute (say, if the site has
particularly high brand value or can deliver high-value traffic), it may be worth it. Some social
media influencers
offer extra promotional services on a commercial basis. For instance, a popular
Pinterest user might solicit money to do a photo shoot as part of an in-depth
product review or interview, or food bloggers might offer to publish a recipe
and high-quality photos that revolve around a particular culinary product. This
is not prohibited by Pinterest, but if you take this route, make sure any links
use Google-approved rel attributes, and the proper FTC-approved disclosure is
printed.

Outbound Linking

This chapter has focused on incoming links, but outbound links
can also have an impact on search rankings. From Google’s perspective, you’re
endorsing a site’s authority, and the relevance of that site to yours, when you
link out to it. Linking out to untrustworthy or irrelevant (to the topics that
apply to your business or field) sites can negatively affect your search
rankings. Conversely, linking out to well-curated
pages that are on highly relevant and authoritative sites will offer more value
to your users, and it may cause search engines to see your site as more valuable
as well. Note that you can qualify outbound links with rel attributes (such as
nofollow, sponsored, or ugc), as discussed in “Using the rel=“nofollow”
Attribute” on page 278. These are ways of informing Google that you don’t wish
to endorse the content being linked to. The two most common reasons for
implementing these attributes are if you sell ads on your site (use the
sponsored or nofollow attribute) or if the link is found in user-generated
content on your site (use ugc or nofollow). There may also be other scenarios
where you choose to not endorse content that you’re linking to, however; for
example, if your content has a reason to link to a competitor’s site or a
low-quality site (use nofollow).
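A periodic audit of your own outbound links helps you catch ad or user-generated links that are
missing these attributes. The sketch below is a minimal example using the third-party requests
and BeautifulSoup libraries; it lists each external link on a page along with its rel value. The
URL is a placeholder, and deciding which links actually need sponsored, ugc, or nofollow remains
a manual judgment.

    from urllib.parse import urlparse

    import requests                    # third-party: pip install requests
    from bs4 import BeautifulSoup      # third-party: pip install beautifulsoup4

    def audit_outbound_links(page_url: str):
        """List outbound links on a page along with their rel attributes, so you
        can spot paid or user-generated links that still need rel qualifiers."""
        own_host = urlparse(page_url).netloc
        soup = BeautifulSoup(requests.get(page_url, timeout=10).text, "html.parser")
        for a in soup.find_all("a", href=True):
            host = urlparse(a["href"]).netloc
            if host and host != own_host:               # external links only
                rel = " ".join(a.get("rel", [])) or "(none)"
                print(f"{a['href']:60.60}  rel={rel}")

    if __name__ == "__main__":
        audit_outbound_links("https://www.example.org/blog/some-post")  # hypothetical URL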

Conclusion

Great content marketing comes from this simple idea: “Build great
content, tell everyone about it, and motivate them to share.” Quality content
will naturally attract and earn inbound links—and links remain a large factor in
search rankings. The best links can deliver traffic to your site on their own
and are most likely to be seen as valuable by search engines in the long term,
but those links would be valuable even if there were no search impact. A solid
content development and content marketing plan is essential to all your online
efforts, not just SEO. You should view content development and marketing as an
ongoing activity, ideally with an editorial calendar optimized for seasonal
strategies. We have seen cases where a client focused only briefly on link
accumulation (with or without focused content marketing), achieved a little bit
of success, then stopped. Unfortunately, due to the lack of fresh links the
client’s sites soon lost momentum and rankings to their competitors (the same
ones they had previously passed), and it proved very difficult to catch up to
them again.


CHAPTER TWELVE

Vertical, Local, and Mobile SEO

SEO has traditionally been focused on web
search, but the massive expansion of internet-connected device technology over
the past few decades has forced us to think beyond the World Wide Web. It’s no
longer just about what someone is searching for; it’s also about where that
person is when they search for it, the device they’re using to perform that
search, the data the search engine has about them that might influence the
search results that person sees, their input method, and whether the SERP is
likely to show them one or more special features in addition to the usual web
search results. People don’t search for web pages, they search for content, and
though that content is increasingly hosted on parts of the internet that are not
web pages, people still use web search engines to find it. They use Google to
search for music, images, videos, and apps; to look for a 24-hour plumber,
restaurant, gas station, or ATM; or to gain quick access to a very specific
piece of information such as a restaurant’s current menu, a local store’s
opening hours, a phone number, a street address and directions to it, technical
specifications for one of their devices, the price and availability of a certain
product, the weather forecast, the time of day, the exact date of a particular
holiday, a hotel room in a certain place, or a ticket to an event. While most of
this information can be found on various web pages, it’s still often quicker and
easier to ask Google instead—even if you know the URLs of those pages. In some
instances, the answer isn’t on the internet at all; for many years running a
popular Google search query has consistently been what is my IP address? This
chapter explains how to optimize for various special kinds of searches that
reach beyond the traditional web.


Defining Vertical, Local, and Mobile Search

There are three search categories that require extra consideration in SEO:

Vertical search Refers to queries for specific content types that are separately
indexed.

Local search Refers to queries that are explicitly (as part of the query) or
implicitly (via location data or search settings) determined to be within the
scope of a certain locale.

Mobile search Refers to queries executed from a mobile device (or, more specifically, from a mobile browser, search app, or widget).

In other words, local is about where you search (or the locale that you want to focus on), mobile is about how you search (from a specific device, or using a specific input method), and vertical specifies the kind of content you’re interested in finding.

Vertical Search

The major search engines maintain separate vertical search
indexes—often automatically created through structured data on relevant sites
and services—for content such as images, videos, events, news, travel, products,
music, and people. In addition, there are many other types of vertical search
engines, such as (to give just a few examples) YouTube, Spotify, corporate
intranets, newspaper archives, and case law databases. When people use vertical
search, they’re typically either looking for a specific result (such as Lady
Gaga’s “Bad Romance” music video, or images from the surface of Mars) or
drilling down into universal search results. As an example of the latter, let’s
say you want to know more about Dr. Martin Luther King, Jr.’s famous “I Have a
Dream” speech, so you search on martin luther king “I Have a Dream”. The initial
SERP will show many different content types and special features (SERP special
features are covered in Chapter 3): there’s a carousel at the top that offers
more details on specific aspects of the speech, a knowledge panel on the right
side of the screen that lists basic information, video results, news stories,
images, and links to books and audiobooks related to this topic. These are all
vertical results. Below them, far down the page, are the usual web results. “I
Have a Dream” can be represented in several different ways: as text on a web
page, text in a printed book, text overlaid on an image, text overlaid on a
video, digital images of Dr. King delivering the speech, video of Dr. King
delivering the speech, print images (posters, postcards, etc.) of the speech
and/or of Dr. King, video commentary about the speech, audio of the speech in a
video or an audiobook, current news stories about the speech, and historical
news stories about the speech (either as images of
scanned newspapers or text produced by performing optical character recognition
on scanned images of newspapers). Before you continue reading, stop for a moment
and consider all the ways that your content can be represented beyond “text on a
web page.” We’re going to take a look at local and mobile search next, but later
in this chapter we’ll explore optimizing for news, image, and video search.

Local and Mobile Search

Google and Bing used to maintain separate indexes (and crawling processes) for mobile and desktop search, but since at least 2016 each has maintained only a single universal search index. Google announced in March
2020 that it would begin transitioning to mobile-first indexing for all websites
in September of that year—meaning it would store information from the mobile
site, not the desktop site, in its index for all sites on the internet. The
transition had not been fully completed as of November 2022 (and likely still
has not been at the time of publication), but this is to be expected as some
sites have very weak (or no) mobile versions. However, you can assume that the
overwhelming majority of sites are being indexed on a mobile-first basis.
Consequently, you can safely assume that any information you have on your
desktop site but not on your mobile site will not get indexed. The search index
doesn’t make decisions about search results—it’s just a metadata model—but the
algorithms that generate SERPs will make some decisions on what to show partly
based on the device from which the user conducted their search. From a
functional perspective, a mobile search query is far more likely to:

• Be in the form of a natural language question
• Lead to interaction with mobile apps instead of web results
• Be associated with a Google or Microsoft account (which enables a higher level of personalization)
• Be associated with much more precise location data from various integrated radios (it’s not just GPS coordinates; WiFi, Bluetooth, and wireless data signals can also be used to calculate a searcher’s location)

Over half of searches on the major search
engines are executed from mobile devices, and mobile search has a greater
probability of having local intent. Therefore, mobile search frequently has
local search scope, but local search is larger than mobile; it also encompasses
desktop search queries that seem to have local intent. For instance, an
obviously local search query might include the name of the location as a
keyword, as in:

Sarasota photographers

or a relative location, as in:

photographers near me

Google is smart enough to know that a photographer is usually a local service
provider and will consider the possibility that you want to find a professional
photographer near where you seem to be. So, if you live in Sarasota, FL, and
Google has access to your location data (or if it can reasonably guess at your
location based on your IP address or search settings), then this query might
provide many of the same top results as the previous two:

photographers

Of course, such a broad and undetailed search context creates a lot of
uncertainty about your intention, so in a case like this you’ll get a mix of
local and global results—maybe you’re researching famous photographers for a
paper or article, or you’re just interested in the topic of photography and
aren’t interested in hiring anyone. Until your search context is refined with a
more specific query, the results will reflect the search engine’s best guess
about your intention, but for a query like photographers it will still skew
local—the SERP will be different depending on where Google thinks you are right
now, and doubly so if you’re searching from a mobile device. Note that a query
like albert einstein is less likely to be treated with local intent.

NOTE
It’s important to note that word order in the search query matters to Google. Sarasota photographers is a different search query from photographers Sarasota and will return different results.

The Impact of Personalization

Location data isn’t the only signal for search
results, and in fact many of the differences you may see between a mobile and a
desktop SERP could have less to do with location, and more to do with
personalization. Note that Google does not consider localization of search
results as a form of personalization, so even if you use incognito mode in your
browser to disable personalization, your results may still be localized. On a
low level, many people show a preference for different content types in mobile
versus desktop SERPs, or when searching from different locations. For example,
someone who might never click a video link in a desktop search from their home
office PC might only click video results from mobile searches. Similarly, when
searching from a specific device in a household, they might only click through
to streaming services. Despite there being no explicit preference setting for
this, Google monitors searches across devices and locations and will adjust
universal results according to the content types that it predicts will be most
impactful to the user at that time and place.


Personalization can also factor into vertical results on universal SERPs.
Google’s current approach to personalization is that it’s enabled by default,
but you can choose to turn it off. Turning it off will not stop Google from
collecting information about you, and you may still get different results than
other users based on language settings or your location. This can potentially
impact what vertical search results you get as well.

Journeys and Collections

Major search engines store various data associated with
past queries (even if you aren’t signed in to a user account) and will consider
each new query in the context of the previous ones, on a user-specific basis as
well as in aggregate for the query. The extent to which any one user’s
historical search queries and user behaviors are incorporated into a search
engine’s rendering of that user’s subsequent search results depends on numerous
factors, including but not limited to whether the user was authenticated (and if
so, what their search settings are), the privacy level of the devices and
browsers used by that user, and other advertising-specific data the search
engine may have stored about that user over time. Depending on the time frame
over which you performed the searches, if you executed the previous three
example queries in the order they’re printed, by the time you got to
photographers the results would very likely be skewed toward the Sarasota, FL,
locale, and would probably be mostly identical to the results from the second
query. However, if you executed these queries in reverse order (or if you appear
to be a new searcher by switching web browsers, logging out of your Google or
Microsoft account, or connecting to a VPN), you’d get a completely different
result set for photographers. Google, for example, describes taking this one
step further by recording and analyzing the context of a series of topically
related queries over an extended period of time—what the company refers to as a
journey. The previous three queries would constitute an early-stage search
journey full of uncertainty. If your next query is reasonably related to the
topic of photographers, then it will be added to the journey and the scope will
narrow to the remaining overlapping topics. For instance, consider how this
query would both expand and contract the search journey:

Sarasota caterers

A person executing this set of queries, based on their content and sequencing,
might very well be planning an event such as a wedding in Sarasota, FL. You
could leave out Sarasota and, based on your location data and the search journey
so far, the results would still strongly skew toward the Sarasota locale. You
can prove that by making this your next query:

tuxedo rentals

Again, this is a local service provider kind of business, so the results (and
on-SERP ads) will always skew as local as possible, and since the topics of
photographers, caterers,
and tuxedo rentals are all within the domain of formal occasions, Google will
start trying to guess the occasion and anticipate your future queries. At a
glance, it may look like a wedding because that’s the most common formal event
that involves all of these services, but there isn’t enough evidence to declare
this a certainty; it could also be any one of a number of less common one-off
events such as a Masonic lodge function, a fashion show, a movie premiere gala,
or a “big band” concert. If the search index contains information on one or more
of those types of events in Sarasota in the near future, the results could skew
toward them, but weddings would not be eliminated from consideration. However,
if your next query is: bridesmaid dressmakers

it’ll be a certainty that you’re interested in planning a wedding in Sarasota,
FL, and other event types will generally be excluded from the results until you
deviate from this journey. At this point, Google will shapeshift into a wedding
planning machine. It will begin guessing your future queries and providing
preemptive answers in the results, and Google Ads and content from Google
marketing partners will show you wedding planning content everywhere they
reasonably can—not just on the SERPs, but on any site or in any app that Google
shares advertising data with. Remarketing agencies could even send you snail
mail ads and print catalogs from wedding service providers (this actually
happened to one of the authors after testing these query examples while logged
into a personal Google account). Search journeys can stop and resume over a
period of months, depending on how much data Google collects about you (which,
again, is based on various choices you’ve made on your end of the technology
connection). So, if you walk away and come back in an hour and query for
something like best dog food for toy poodles, then that query will constitute
its own separate journey because it does not appear to share a context with your
“Sarasota wedding” search journey. Three months from now, if you search for: dj
OR band OR "wedding singer" -"Adam Sandler"

then that query will probably go into your “Sarasota wedding” search journey,
and the SERPs will reflect both past and presumed future queries. You can also
curate this process by creating a collection in your Google mobile app. A
collection is essentially the same thing as a journey, except you choose the
queries, topics, and bookmark URLs yourself, and you can also include images and
deep links to mobile apps. Modeling searcher behavior in this manner can be
extremely helpful in keyword research, and in identifying local and mobile
search optimization gaps.

How Local Is Local Search?

Mobile devices can provide extremely accurate
location data that can dramatically impact local search results. Google can
pinpoint the location of a device with a discrete GPS chip (rather than a
cheaper software-based GPS solution) with a clear view to
the sky in an urban or suburban environment with several nearby WiFi hotspots,
and with every radio turned on and all accuracy improvement options enabled,
with a margin of error of 1 meter, which is good enough to reliably soak someone
with a water balloon. With some radios disabled, accuracy can slip to 10–20
meters, and in poor conditions it may be as much as 100 meters off. Even when
GPS services aren’t available due to interference or manual disablement, Google
Maps can determine your location through your phone’s camera—with your
permission, of course. This optional feature is called Live View, and it
attempts to match your captured images to those stored in Google Street View.
Google refers to this dynamic method of calculating coordinates from many
potential sources as its fused location service. Regardless of how location is
determined, in the most reasonable worst-case scenarios for internet marketing
purposes (assuming your storefront isn’t a yurt in the remote steppes of
Mongolia), the fused location service will be accurate to about 20–30 meters,
which is good enough to deliver search results “within walking distance” of the
given coordinates. However, the radius of local search results can vary
depending on factors such as commercial competition, population density, and
access to public transportation. In a mall or in the heart of a large urban
commercial district, local search results will be limited to a relatively short
distance; in a rural or relatively uncompetitive market, local results could
encompass a radius of several miles. In a competitive local market, mobile
search traffic could make or break a business. For instance, if you’re promoting
a restaurant on International Drive in Orlando, FL, your main SEO efforts may
include targeting keywords related to Orlando and restaurants because this is a
major tourist and industrial convention destination, and you’ll want to rank
highly in universal search results for people who are planning their vacations
and business trips ahead of time. However, the Orlando metropolitan region
encompasses an area of about 4,000 square miles where over 75 million annual
visitors and 2.5 million residents have over 7,000 restaurants to choose from;
in other words, Orlando may seem like a local query, but it encompasses a fairly
large area. In this case, local search optimization should focus on the
International Drive district, where you are competing with hundreds of other
restaurants to reach a large number of pedestrians, hotel guests, and tourist
attraction visitors when they unlock their smartphones and ask:

what are the best restaurants near me?

Vertical Search Overlap

Consider the kind of results you would want to see if
you were on the previously chronicled “Sarasota wedding” search journey. What
sort of content would most effectively capture your interest and earn your
trust? Many searchers will prefer visual results to see venue galleries, dress
designs, floral displays, and the like. Given this preference, a wedding service
provider in the Sarasota, FL, market should prioritize creating image-rich content as well as optimization
for image search, even if your service has nothing to do with photography. There
are few local service industries that can’t benefit from image search
optimization. Photographers, for instance, have two challenges: marketing
themselves as fun and easy to work with, and showing the feelings invoked by the
quality of their finished work. Similarly, music and live entertainment service
providers must also market the mood they’re meant to inspire—people dancing and
having a good time. Caterers, tailors, and jewelers rely on high-quality
industry-specific professional product photography for sales leads: the food has
to look delicious; the dresses and tuxedos have to look glamorous on happy,
beautiful models; the jewelry has to look expensive. Video search optimization
can benefit the same industries in similar ways. Since planning a wedding is
often rather stressful for the betrothed, wedding photographers would likely
benefit from a video showing them engaging with a playful wedding party in a fun
way. Caterers would benefit from a video montage of guests raving about the
food. Dressmakers would benefit from professional video shoots with commercial
models who look like they’re having a great time at a wedding reception, and
fashion models who are both happy and attractive as they walk down the aisle in
some brilliant nondenominational wedding venue (or you could make a series of
videos in different venues that would appeal to specific religious
demographics). Entertainment service providers must have videos of people
dancing and singing. The goal is to make your pages accessible in a wide range
of local search contexts. Images and videos represent vertical search results,
but local is a search scope, not a vertical; it is an extra set of algorithms
that apply location-based information to a universal or vertical search query.
You can expect that someone who executes a universal search for Sarasota
bridesmaid dressmakers may switch to the images vertical at some point in their
search journey if the web pages they find don’t have enough imagery to satisfy
them, so your local optimization efforts should include relevant vertical
indexes. Since vertical results are often pulled into universal SERPs, it may
even prove to be more cost-effective to optimize for image and video verticals
than universal search because of the reduced competition within a local search
scope.
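
Optimizing for image search is covered in detail later in this chapter, but as a minimal, hypothetical sketch of what image-friendly markup looks like (the filename, alt text, and caption here are invented for illustration), an embedded photo on a venue or portfolio page might be marked up like this:

    <figure>
      <img src="/weddings/sarasota-bayfront-sunset-ceremony.jpg"
           alt="Bride and groom exchanging vows at sunset on the Sarasota bayfront"
           width="1200" height="800" loading="lazy">
      <figcaption>Sunset ceremony at a Sarasota bayfront wedding venue</figcaption>
    </figure>

Descriptive filenames, meaningful alt text, and nearby captions give the image vertical something concrete to index, whatever service you happen to be promoting.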

Optimizing for Local Search

The search engines utilize multiple algorithms to
return the best results to searchers, based on the intent of each individual
search query. When a search is determined to have local intent, the search
engines will use their local algorithms. These algorithms weight signals
differently and include additional signals that the traditional search
algorithms don’t. Any entity that does face-to-face business with customers
should be optimizing for the local algorithms. Whether you’re a business with a
brick-and-mortar storefront (such as a doctor, a restaurant, or a car
dealership) or a business that services customers at
their location (like an electrician or a plumber), the local algorithms will be
used to display search results related to you. When local search results are
displayed, the local algorithms constrain the results to a particular geographic
area. Instead of competing with every similar business online, a local business
only has to compete with every similar business in the local market. The radius
of local search results can vary. In a crowded and densely competitive market,
local search results might be limited to businesses within a radius of a few
miles of the searcher. In a more rural or less competitive market, local results
could be displayed in a much wider radius. While standard search results are
displayed based on relevance and prominence, local results are also displayed
based on proximity. Google can determine where a user is located when they
conduct a search, either by the IP address of the computer or by GPS or cell
tower triangulation on a mobile device. The proximity of search results to the
searcher (how far away they are from the user who’s conducting a search) will
influence how local businesses rank.

Figure 12-1. How Google determines local ranking


Where Are Local Search Results Returned?

It’s important to understand where
results are returned from local algorithms. We’ll use Google for this example,
but the other search engines work the same way. The most obvious location served
by Google’s local algorithm is the map pack, where three results are displayed
directly under a map showing their locations. An example of this is shown in
Figure 12-2.

Figure 12-2. Google map pack results

If you see a map pack for any search phrases that matter to your business,
that’s proof that Google ascribes local intent to that search query, and you
know you need to do local SEO.


NOTE
Remember that a search query conducted in Google Maps doesn’t produce the same results when conducted in Google search, and in fact may not include a map pack.

For some verticals, the map pack looks like the example shown here, with
top-line business information on the left and Website and Directions links on
the right. For other verticals, review scores are displayed on the left and
images are shown on the right instead of links. On any map pack, you can click
the “View all” or “More …” link at the bottom of the pack to see the Local
Finder page, which displays every local search result (instead of only the top
three). You can see an example of what this looks like in Figure 12-3.

Figure 12-3. An example of a Google Local Finder page

Users can also search directly in Google Maps. The result looks similar to the
Local Finder page even though it’s fully integrated into the standard search
results, as you can see in Figure 12-4.


Figure 12-4. Local Finder results as seen in the regular search results

Notice how the local radius of displayed results is much tighter in the Google
Maps search result. The Local Finder is expanded from the map pack and typically
covers a much wider search radius. Finally, Google also uses the local algorithm
to return the “ten blue links” you’re used to seeing in standard search
results—but in local intent searches, the results are localized, as shown in
Figure 12-5. Many search queries are assumed to have local intent even when a
location term isn’t included in the query. Figure 12-6 shows an example of the
results after a search for Ford dealer conducted from the Plano, TX, area.


Figure 12-5. Example results for a query with local intent


Figure 12-6. Example of Google assuming local intent for a query

The map pack will show the closest Ford dealers to your real-world location,
even though you didn’t specify a location in the query. It’s important to test
your own important keywords once you’ve finished your keyword research. If you
see any that return localized results, you’ll need to optimize for the local
algorithm. More broadly, it’s a great idea to assess your current situation
across the local search landscape in all the areas that you service. This can be
quite tricky, as the search results returned by Google are dependent on how it
perceives the current location of the searcher. One tool that can help with this
is Local Falcon; it assesses your visibility across a given area by searching
from multiple grid points. Local Falcon also provides you with a Share of Local
Voice (SoLV) report including data on your competitors to help give you a robust
picture of how your local search ranking campaign is going, and which areas
continue to represent opportunities.
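
Broadly speaking, SoLV reflects the share of scan points at which your listing shows up in the top three results. With hypothetical numbers: if a 7 × 7 scan (49 points) of your service area finds your listing in the top three at 20 of those points for a given keyword, your share of local voice for that keyword is roughly 20/49, or about 41%, and the map of the remaining points shows you exactly where the opportunity lies.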

Factors That Influence Local Search Results

As mentioned previously, the search
engines’ local algorithms include additional signals and weight the “standard”
signals differently. Whitespark releases an annual Local Search Ranking Factors
report detailing the results of an in-depth survey about the various factors
that influence visibility in Google’s local search results, completed by the top
35–40 worldwide experts on local SEO. The results are aggregated and outlined,
so local SEO practitioners know which signals have the most influence on
visibility and how Google’s local algorithm has changed over time. The signals
are grouped into two categories: signals that influence visibility in the map
pack and Google Maps, and signals that influence visibility in localized organic
results.


Full details are available in the report, but the results are also summarized
with two bar charts, as Figures 12-7 and 12-8 show.

Figure 12-7. Local map pack/Google Maps ranking factors

Figure 12-8. Localized organic ranking factors

Optimizing Google Business Profiles

Originally named “Google Places” (and later
“Google+ Local Pages” then “Google My Business”), Google Business Profiles are
the most important element of local SEO. Your Google Business Profile (GBP)
listing is what allows you to show up in map packs or Google Maps results. It’s
basically your new home page—it’s the first impression you make with potential
customers. In a search for your company name, your GBP listing will appear in
the right column as a knowledge panel. The listing will display your business
name, address, phone number, website link, reviews, and photos. If you’re just
starting out with local SEO, you’ll need to check if a GBP listing exists for
your business. Do a search for the business name; if a GBP panel appears on the
right, the listing already exists and you’ll just need to click the link to
claim the business.


If the listing doesn’t exist, you’ll need to go to Google Business to set up a
new listing. Enter your business name, and you’ll see a list of possible
matches. If you don’t see your business listed, you can add it. You’ll need to
verify that you really represent that business, and in most cases that means
Google will send you a verification PIN on a postcard. The postcard takes 10–14
business days to arrive. Once you enter the PIN, you’ll have full access to edit
and optimize your GBP listing. You should fill out and optimize every available
element. Here are a few basic guidelines for what to include:

Business name The name of your business is a ranking factor. Use your actual
business name. Adding additional keywords is against GBP guidelines and can
cause a suspension of your listing.

Address The address of your business is a ranking factor. Enter your business
address. Suite numbers don’t matter to the algorithm, but they should be
included for potential customers. Be sure to check your map pin location and
ensure correct placement. If you’re a service business with no brick-and-mortar
storefront, click the checkbox to hide your address so it won’t be displayed in
search results.

Phone number The phone number is not a ranking factor. It’s important to enter a
phone number with a local area code. That doesn’t mean you can’t use a call
tracking number though. In fact, using a tracking number is best practice—just
make sure it’s a tracking number with a local area code. Enter the tracking
number as the primary number, and your actual local number as one of the
alternate numbers.

Website The website link used in your listing is a ranking factor. In most
cases, you should link to your home page. If your business has multiple
locations but uses a single website, links that reference a specific location
should point to the page for that location. Be sure that the address and phone
number for your business are displayed on the page that you link to. In
addition, the content on your site plays an indirect role in local rankings. We
know this because Google has said that organic presence (“prominence,” including
your Google review count, review score, and position in web results) is a
ranking factor.
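
The requirement here is simply that your name, address, and phone number appear on the landing page; one common and entirely optional way to also make that information unambiguous to search engines is schema.org LocalBusiness markup. The following is a minimal sketch with invented business details; whatever you mark up should match your GBP listing and the visible text on the page exactly:

    <script type="application/ld+json">
    {
      "@context": "https://schema.org",
      "@type": "LocalBusiness",
      "name": "Sarasota Bayfront Photography",
      "url": "https://www.example.com/sarasota/",
      "telephone": "+1-941-555-0123",
      "address": {
        "@type": "PostalAddress",
        "streetAddress": "123 Main St, Suite 4",
        "addressLocality": "Sarasota",
        "addressRegion": "FL",
        "postalCode": "34236",
        "addressCountry": "US"
      }
    }
    </script>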

Categories Categories are a ranking factor. The categories you select have a
massive influence on how you show up in local searches. You can choose up to 10
categories, but
it’s important to only select categories that match what you do. PlePer has an
interactive list of GBP categories that’s incredibly helpful if you’re not sure
which to select.

Hours Hours are not a ranking factor, but they affect user engagement, which is
a ranking factor. Enter your standard hours of operation. If you offer
additional service options like “seniors only” or “curbside pickup,” use the
“more hours” widget to select the hours for those services.

Photos and Videos Photos and videos are not local ranking factors, but they
affect user engagement, which is a ranking factor. Upload high-quality photos
that show potential customers who you are. Include interior and exterior photos,
along with photos of your products and services. You can also upload videos, as
long as they’re less than 75 MB in size and under 30 seconds in length.

Description The description is not a ranking factor, but it affects user
engagement, which is a ranking factor. As this field has zero influence on
ranking, don’t stuff it with keywords. Write a compelling description of your
business that helps you stand out from competitors.

Questions and Answers The Questions and Answers widget is actually a community
discussion feature from Google Maps that’s displayed in your GBP listing. Many
businesses don’t know it exists because it isn’t represented in the GBP
dashboard. This widget allows anyone to ask your business a question, and anyone
can answer for you. As a business owner, you can also create your own questions
and answer them, which makes this a valuable opportunity to improve user
engagement. It’s important to upload commonly asked questions and then answer
them (like you were creating a pre-site FAQ page). Monitor answers and keep your
business owner answers upvoted so they’ll appear as primary answers.

Google Posts Google Posts are not a ranking factor, but they affect user
engagement, which is a ranking factor. Google Posts are almost like free ads
that show in your GBP listing. Don’t share social fluff; share compelling
messages that will help you stand out from competitors and increase
click-throughs to your website. Pay attention to how the images and text are
cropped to the thumbnails that appear in your listing. If a thumbnail isn’t
compelling, no one will click it to see the full content of the Post.


The Whitespark study mentioned in the previous section also discusses the most important ranking factors within your GBP. In order, these are:

• Primary GBP category
• Physical address in city of search
• Keywords in the GBP business title
• Additional GBP categories
• Proximity of address to the searcher

Note that local search practitioners used to focus on the proximity to the center (or “centroid”) of the city. This is no longer what Google is believed to focus on; proximity to the searcher is considered a better signal.

Customer Reviews and Reputation Management

Customer reviews are incredibly important for attracting new customers, but they’re also weighted in Google’s local algorithm. You need a solid reputation management process in place. It’s important to make it easy for customers to leave a review, and it can really make an impact if you get in the habit of asking every customer to do so. Set up a "leave us a review" page on your site with a simple message that says something like "Thanks for doing business with us today, let us know how we did," and list links to your various review sites. Include a link to your review page in email signatures and on printed receipts.
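
As a minimal, hypothetical sketch of such a page, the core of it can be as simple as a short thank-you message followed by direct review links; the Place ID and the second platform below are placeholders, so substitute the review sites that actually matter in your market and confirm each platform's current direct-review link format:

    <h1>Thanks for doing business with us today. Let us know how we did!</h1>
    <ul>
      <li><a href="https://search.google.com/local/writereview?placeid=YOUR_PLACE_ID">
        Leave us a review on Google</a></li>
      <li><a href="https://www.yelp.com/writeareview/biz/YOUR_YELP_BUSINESS_ID">
        Leave us a review on Yelp</a></li>
    </ul>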
Monitor for new reviews and respond to every review you receive. When you respond to negative reviews,
remember that your response isn’t for the person who left the review—it’s for
every potential customer who reads the review in the future. Be honest and
individualize your replies—never give a canned, impersonal response.
Unfortunately, for a variety of reasons, occasionally an irate customer will
exaggerate or outright fabricate some details in a review. Perhaps they didn’t
feel heard when the problem initially occurred, or they subconsciously recognize
that their rage is disproportionate to the impact of the actual issue, or they
were turned away or asked to leave for legitimate reasons (drunkenness,
inappropriate behavior toward other patrons, being abusive to employees,
refusing to pay, violating a posted dress code) and want revenge, or they’re
posting a fake review at the behest of a friend (perhaps a former employee who
left on negative terms, or one of your competitors, or an angry customer who has
recruited a hate brigade on social media). Here are some tips for dealing with
these situations:

Privately identify the customer. If this is a legitimate customer, then this
probably isn’t the first time you’ve heard from them. If you know this customer
and have already tried to resolve the issue privately, and if their negative
review is honest and accurate, then you should publicly respond with empathy and
gratitude, and explain (for the
benefit of observers) any measures you’ve taken to avoid the problem in the
future. For instance, maybe someone mistakenly selected the wrong options for a
custom-ordered product that cannot be returned. In that instance there’s nothing
to apologize for, but you can offer a discount on future purchases, and say that
you’ll review your sales process to look for ways to reduce similar mistakes in
the future. If you can’t match the reviewer to an actual customer interaction,
then say that in your response, but don’t take the position that this is a fake
review. Say something like “I’d be glad to work with you to resolve these
issues, but I can’t find you in our sales records. Please give me a call or send
me an email so we can work together to make this right.” If you don’t hear from
them, then you’ve done all you can.

Identify yourself, but don’t reveal a customer’s private information in your
reply. If a reviewer isn’t using their real name in their review, don’t use it
in your response even if you can identify them—respect their privacy. Google
recommends that you identify yourself by name in your response, though. It shows
that you’re a real human being, and that you’re a responsible person within the
business and not some uncaring employee who’s paid to say “sorry” all day to
angry customers.

Talk to any employees who were involved. If the review claims that an employee
did or said something, ask the employee for more information. Something may have
been misinterpreted, or there might have been a mistake or an accident that
wasn’t properly addressed. If there is a vast difference between the two
accounts, don’t try to litigate the entire matter; evaluate the situation,
decide what you want the future to look like, and at the very least respond to
the review and say that you’ve read the complaint and spoken to the employee in
question. Don’t throw your people under the bus, though; customers don’t like to
see you treat your employees like they’re disposable. Also, there are
circumstances where the customer is definitively wrong, and in that case you
should diplomatically say so in your response.

Always act with professionalism. Everything you write in your response is
attributed to your name and reflects your business’s reputation. Exaggeration
and distortion are common in negative reviews; they should never appear in
replies. When you engage with a reviewer, be as neutral as possible, and state
the relevant details when they are of benefit. For instance, a reviewer might
say that your restaurant staff was rude and inattentive, but in reality they’re
upset because their repeated advances toward some of your staff members were
politely rebuffed. In that case, be the adult; explain that inappropriate
behavior toward your staff or other customers will not be tolerated,
acknowledge that we all have bad days when we’re not at our best and end up
doing things that we later regret, and let them know that they are welcome back
anytime as long as they behave in a respectful manner. At face value this seems
like a very negative situation, but think about how it will be interpreted by
people who read the review and the response: you’re showing that your restaurant
is a safe place, and that you care about your customers and employees. You
should strive to be above a 4.0 review rating at a minimum. If you drop below a
4.0, you automatically get filtered out for a range of qualitative keywords such
as best and greatest. Obtaining reviews can seem difficult, and you might be
tempted to buy fake reviews. Don’t do it! The legal system is becoming more
aggressive in taking action against fake reviews (the FTC can impose penalties
of upwards of $43,000 per review), and it’s not worth the risk. Remember, SEO is
a long-term game. Shortcuts usually have negative consequences.

Understanding Your Local Search Performance

One of the trickiest things to
understand is how you’re performing in local search results. It’s understandable
that you’d want to see some measure of success with regard to your local SEO
efforts. The best way to assess this is through a series of checks. These should
be performed in order, as skipping any of these checks could lead to false
signals. In a business context, start with a sales analysis. How many sales did
you close this month, versus a year ago? If your sales are up year-over-year,
your overall efforts are paying off—but you might have different channels, such
as ads, TV commercials, and SEO efforts, and you want to figure out which of
these are truly working. For the sake of this exercise, let’s assume your sales
are down year-over-year. You’ll probably want to think about your lead
conversion process next. Is the issue with your sales team’s efforts on
converting leads to sales? Suppose you determine that this isn’t the problem.
Next, you’ll want to take a look at your website. Think about the customer
journey. Imagine you’re a customer. If you landed on your website’s home page,
would you know what to do next to find what you were looking for? Are there
sufficient directions or calls to action? Your SEO efforts could be driving a
ton of leads, but if people get to your website and don’t know how to find what
they’re looking for, their journeys will end. You can run a huge number of
checks on your website and your organic search presence, but let’s assume your
site is well optimized and there aren’t any obvious problems to fix here. What
should you look at now?


Because our focus here is on local search, we’re going to dive into Google Maps
next. If you’re a local business, your Google Maps listing is going to be one of
your most important assets and will likely drive a good percentage of your
leads. So, the next check is to see how that listing is performing. The first
step in this process is to log in to your Google Business Profile dashboard, as
shown in Figure 12-9.

Figure 12-9. Log in to the Google Business Profile dashboard

Once you’re logged in, click Performance, as shown in Figure 12-10.

Figure 12-10. Select Performance in the Google Business Profile dashboard

You should now be able to see a breakdown of how your Google Maps listing is
performing. The metrics to look at are calls and website clicks. Calls will tell
you if people are calling from your Google Maps listing, and website clicks will
tell you if people are following the link from your Google Maps listing to your
website. How do these figures compare to the figures from a year ago? If calls
and website clicks are up year-over-year, this is an indication that your local
SEO efforts are working.

OPTIMIZING GOOGLE BUSINESS PROFILES

577

Note that Google only allows historical comparisons of performance stats for
completed months, so if you’re partway through a month you’ll need to deselect
that one when you choose the comparison period, as shown in Figure 12-11.

Figure 12-11. Viewing historical performance stats

What if the numbers of calls and clicks you’re getting from your Google Maps
listing are down? If you’re a multilocation business, you’ll want to maintain
perspective in terms of your overall success. If you have three Google Maps
listings and one is down in calls and website clicks year-over-year but the
other two are up, you’re still doing well. Definitely think about how you can
work on the one listing that is down, but it isn’t exactly a five-alarm fire. On
the other hand, if all three listings are down for the year, that is cause for
concern and you’ll definitely need to figure out what’s going wrong. If you’re
experiencing a problem like this, what should you look at next? One of the
primary causes of fluctuations in traffic from Google Maps listings is updates
to Google’s local search algorithm, such as the so-called “vicinity update”
rolled out in December 2021. Tracking these updates is important to
understanding your success (or lack thereof). In this case, the update appeared
to focus on prioritizing proximity as a ranking factor. For example, before
December 2021, many Google Maps listings were ranking well for searches by users
over five miles from the business’s location. These businesses were getting a
healthy influx of leads from Google Maps and were happy. But when the vicinity
update hit, many businesses experienced a drop in calls and website clicks: if
they were more than three to five miles from the target location and there were
competitive options available within that range, their listings wouldn’t appear.
If you experienced such a drop and you were aware of this update and its
effects, you likely would have just considered this to be the new normal
(unfortunately, there
isn’t a simple workaround to counteract it). However, if you weren’t aware the
local algorithm had been updated to prioritize results closer to the searcher’s
location, you might well have assumed you or your marketing team had messed
something up. This could have caused you to rethink your entire local SEO
strategy, or to spend time and resources trying to return to the old normal
without realizing that you were fighting an algorithm change with limited
chances of success. And even if you could regain some of your old ground, the
same time, energy, and money could have likely been spent elsewhere for a better
result. But what if you can’t find any announcements or discussion of algorithm
updates that coincide with the drop in traffic that you’re seeing? What might
you examine next? Start by taking a look at your Google Business Profile
listing. You’ll want to make sure that you have set this up properly, filled it
out completely, and optimized it, as discussed in the introduction to this
section. You should also be working to keep it fresh by attracting new reviews
and images and posting about new and exciting developments with your business.
If your profile gets stale, you risk falling behind your competition. Let’s
assume the listings for your various locations are optimized and you’re
maintaining a consistent stream of reviews, images, and posts. What should you
look at now? Notice that we haven’t mentioned keyword rankings yet. This is
because the checks you’ve been running up until now encompass every single
keyword combination possible. If your local search rankings for certain keywords
are weak, but GBP-delivered leads are up and sales are up, then individual
keyword rankings don’t really matter. For keywords with a high search volume, or
in a dense, competitive area, even a relatively low ranking can still bring in
plenty of leads. Say you’re a personal injury attorney in New York City. You
might be tempted to track various versions of your main keyword, like:

• personal injury attorney
• personal injury attorney new york city
• personal injury attorneys
• nyc personal injury attorney
• personal injury lawyer
• personal injury lawyers
• …and many other variations

Indeed, looking at the rankings of a single keyword
can be misleading. You could have a strong ranking for personal injury attorney
but a weak ranking for personal injury attorneys. If you didn’t track both, you
would get a false sense of how you’re doing—but the incomplete picture doesn’t
end there. Let’s say you have a strong ranking for personal injury attorney and
a weak ranking for personal injury attorneys. So, you decide to check personal
injury lawyers, and you see that
your ranking for this keyword is strong. But for personal injury lawyer, your
ranking is weak. Do you see the hole you’re getting yourself into? How many
keyword variations would you have to check to say whether you’re doing well or
not? The simple solution is not to get bogged down exploring variations of each
keyword, and instead to rely on your GBP-delivered calls and website clicks to
tell the story of whether your local SEO efforts are working or not. Suppose,
though, that you want to go a little further with your keyword checks. Type the
first keyword you care about into Google and ask yourself the following:

• Does the keyword currently generate a map pack on page 1 of the SERPs? If not, is this a recent change? If the keyword has lost its map pack in Google search, this could be a reason for a loss of traffic. If the keyword does generate a map pack, ask yourself the next question.

• Where on the SERP does the map pack display? A map pack at the bottom of the page generates much less traffic than a map pack at the top of the page; again, if this is a recent change it could explain a drop in traffic.

These are important checks because, while Google
doesn’t change the status of a map pack often, it can happen occasionally. If a
keyword that was sending you a lot of traffic through its map pack at the top of
page 1 of the SERPs loses that map pack, this can have a big effect on your
local organic search traffic. You could manually track the map pack changes, or
you could use a keyword tracking tool like GetStat.net that will track them for
you. Let’s assume the keyword(s) you checked all have a map pack at the top of
the page, and there weren’t any position changes to the map pack. What could you
examine next? Are you using a geogrid tool? A traditional keyword ranking
tracker will be insufficient for tracking your Google Maps presence, as you’re
about to find out. Continuing with our previous example, you might track the
following keywords: personal injury attorney, car accident attorney, truck
accident attorney. When it comes to tracking keyword rankings in local search,
it’s important to understand “where” you rank in a physical sense. You can’t use
traditional keyword tracking tools for this purpose, because they only tell you
at which position you rank. Consider the example keyword report shown in Table
12-1. The solution is to use a program like BrightLocal or Local Falcon to run a
grid scan. You can see a sample grid in Figure 12-12. This shows that the office
is in the center of the grid, and you’re doing well toward the northwest and
southwest, but not toward the east. No matter what number a traditional keyword
ranking report shows, it can’t tell you the story the grid scan tells!


Table 12-1. An example keyword report

Keyword                     Google Maps ranking
personal injury attorney    4
car accident attorney       3
truck accident attorney     5

Figure 12-12. An example grid view from Local Falcon

Suppose these checks don’t turn up any surprises. What might you look at next?
One action worth considering is conducting a fake competitor audit, using
keywords that are important to your market. If there are listings that are
competing with you using unfair tactics, they could be stealing market share
from you. We’ll talk more about those issues in the next section, but for
purposes of this section, checking keyword rankings is the farthest we will go.
Digging deeper would require more strategic discussions. If your campaign is
working at its peak efficiency, it might be a matter of altering your current
strategies or increasing your spend. Maybe there’s room to ramp up your content
creation or link-building efforts, or maybe you could even consider opening a
new office to create an additional GBP listing. On the other hand, maybe you
just need to acknowledge that your strategies can’t be altered, and spending
can’t be increased. It is important to be honest with yourself about whether you
have room to grow, or if you’re already operating at your peak.


The most important thing is to make sure that you aren’t making decisions based
on bad information or faulty reasoning. In summary, if you notice a problem with
your local search performance (or just want to get a clear picture of how you’re
performing), these are the checks to perform:

1. Examine your lead conversion process.
2. Examine your website from the point of view of a customer, ensuring that the conversion funnel is optimized.
3. If no obvious problems are discovered in steps 1 and 2, examine your Google Business Profile performance metrics, comparing them to a previous baseline.
4. If you identify a drop in traffic from local search, check for announcements or industry discussion of local algorithm updates that might be affecting your local search rankings.
5. Examine the GBP profiles for each of your locations, and make sure they’re filled out and optimized.
6. Examine your keywords. Determine whether any map pack changes have occurred that might be affecting your local search traffic, and look beyond rankings at where you are ranking for your most important keywords, using a specialized local rank tracking tool.

The Problem of Fake Competitors

There are many actions you can take to improve
your local search rankings and overall local SEO performance. In this section we
are going to discuss one powerful, simple, but often overlooked white hat
strategy: fake competition audits. The concept of success in a local three-pack
is simple. It’s a zero-sum game: three positions, three winners. So, it’s an
issue if any of those winners are obsolete listings (businesses that no longer
exist), fake listings (made by hackers whose goal is to intercept traffic and
sell it to real businesses), or keyword-stuffed listings (intended to manipulate
the system to unfairly gain an edge). To begin a detailed audit of your local
search rankings, you’ll want to run a grid scan, as depicted in Figure 12-12. As
mentioned in the previous section, you can use tools like BrightLocal or Local
Falcon for performing these scans. Next, you’ll need to check each node to see
who’s competing against you and if any of them appear to be suspicious. You can
use the criteria listed in this section to determine whether a competing listing
is worth reporting to Google. If this seems like a lot of effort, bear in mind
that the potential value of every listing removed is huge.


Let’s say you run a scan on the keyword personal injury attorney, for which you
rank third in local search, and you determine based on the criteria given here
that the other two listings in the map pack appear to be fake. So, you report
them to Google. If Google agrees with your report and delists both of those fake
listings, you could jump from third position to first in the map pack! But the
impact of the removal of these fake competitors is even wider reaching. You’ll
move up two positions wherever those two fake listings competed against you for
the keyword personal injury attorney, and indeed for every other keyword where
your listing and the two fake listings were in competition. The impact of such a
removal can be seen within hours. It’s recommended to conduct a fake competition
audit monthly, as fake, obsolete, and keyword-stuffed listings enter the market
on an ongoing basis. Here are the criteria we recommend you use as you’re
assessing questionable listings. First, look for signs of obsolete or unclaimed
listings. Ask yourself these questions:

What does your gut tell you? Does the listing display “Claim this business”?
This is a clear indicator the listing isn’t claimed and might be obsolete (out
of business or moved).

Are there any reviews? If so, are there any recent ones? While this doesn’t
necessarily mean the listing is obsolete, it does mean it might not be claimed,
or perhaps that the company has gone out of business.

What sort of photos are used? Not having any photos, or no recent photos (e.g.,
none in the last year), is a potential sign that the listing is unclaimed or
obsolete.

Do you see signage using Street View? If there is no signage visible in Street
View, the business in the listing might have moved or closed down. Next, it’s
worth examining competitive listings to see if they’re fake. To do this, ask
yourself the following questions:

What does your gut tell you? Does it make sense for that particular business to
be in that area or on that street? For example, strip malls usually house
businesses like grocery and clothing stores. It would be a little strange to
find a garage door repair business in a strip mall, or on the main street of a
town, where you’d expect to see retail shops, restaurants, and bars clumped
together. Make a quick judgment call and move on to the next criterion.

Does the name sound strange? If you saw a listing for “Bob’s Law Firm,” this
would seem normal. A listing for “Bob’s Law Firm - Personal Injury Attorney”
would be an indication of keyword stuffing (discussed next), but that doesn’t mean the listing is fake. However,
if you saw a listing with just the name “Personal Injury Attorney,” that would
be suspicious, and you might want to report it.

Are there any reviews? Odds are, if it’s a legitimate business, it’s going to
have at least a few reviews. The likelihood of a business having zero reviews is
low, so this is a potential indicator of a fake listing.

Are there any images? Similarly, most legitimate listings will include at least
one or two photos. Not having any images is a potential indicator it could be
fake.

What sort of images are used? Fake listings might have very generic images. Most
businesses will add their logo as an image, but a fake business does not
typically display a logo. Also, as you begin to look at listings, you might pick
up patterns. Completely different listing names might use the same images.

Does the address on the listing look normal? Sometimes the address looks off
(for example, if you know it’s not a long street but the number given is in the
800s). This is an indicator the listing could be fake.

Do you see signage using Google’s Street View? This is a standard check that
Google representatives use as well. If you display your address, then you’re
saying you service your customers there, and signage is essential. Fake listings
can’t create signage at the locations they list, so this is a strong indicator
that the listing could be fake.

Does Street View show the address listed in a spot that you can’t click to view
it? A trick that spammers sometimes use is to put the listing in a spot where
you can’t use Street View in Google Maps. Their intent is to make it harder for
Google reps to check Street View for signage. This is an indicator the listing
could be fake.

Is the business in a residential area, and if so does that make sense for the
type of business? It would make sense for an accountant to have a home office.
It makes less sense for a locksmith to have a storefront at their house.

Is the listing using a business.site website? Google allows you to make a
website via your Business Profile. It will have the name you choose, with a
business.site domain. Most business owners won’t use this, but people who make
fake listings like leveraging it because it’s fast and easy to produce a website
there.

Does the website look like it was made with little to no effort or is a
cookie-cutter site? This one can be tricky to assess, but if you look at enough
websites you will begin to pick up on patterns. A low-quality website isn’t by
itself an indicator, but if you see multiple businesses with similarly
low-quality site structures, you might be looking at a group of fake listings.
Finally, to identify potential keyword-stuffed listings, ask yourself the
following questions:

What does your gut tell you? Does the name just sound off to you? Does it seem
odd that there’s a city or a keyword like Personal Injury Attorney built into
the listing’s name?

What is the URL of the website? If the listing says “Bob’s Law Firm - Personal
Injury Attorney” but the website URL says www.bobslawfirm.com, the difference in
naming conventions is an indicator of keyword stuffing.

What does the website’s logo show? It’s unlikely the logo will be stuffed with
keywords, so this is a great indicator of the non-keyword-stuffed name.

What does the copyright section in the footer say? Often, the copyright section
in the footer of the website will reveal the nonkeyword-stuffed name. If the
company name shown here is different from what you see in the company’s Google
Business Profile, this is an indicator the listing is keyword stuffed.

What does the Contact Us page say? The Contact Us page will sometimes reveal the
non-keyword-stuffed version of the name.

What do the branding images in the listing indicate? This is another likely
place to see the real name of the business. This might look like a lot of work,
but once you get used to identifying suspicious listings, it shouldn’t take any
longer than 10 or 15 minutes to determine whether a competitor’s listing is
legitimate. You can conduct these fake competitor audits yourself, or you can
hire a company to do it for you. There could be hundreds of fake competitors in
your area, and they could keep reappearing every month. If you think there is a
strong chance that a listing is fake, or belongs to a business that has closed its doors or moved, go ahead and report it. Remember, all suggested edits are
just suggestions, so it’s OK if you occasionally get something wrong—but do take
care to not flood Google with tons of frivolous reports.

Common Myths of Google Business Profile Listings There are many strategies and
tactics you can use to increase your presence in local search, but for every one
that works, you’ll find others promoted online that do not (or no longer) work.
This section will review some of the common myths.

Myth 1: Google My Maps is a ranking factor Google My Maps is a program that
allows you to create your own interactive map leveraging Google Maps. The myth
states that creating thousands of driving directions within the map and pointing
them all to your listing will help your rankings. This is a complete
fabrication. If it has any effect at all, it is either very close to zero or
negative. Your efforts are certainly better spent elsewhere. What does work is
people actually using Google Maps to drive to your location. If Google sees
people driving to your location more than other competitors, this is considered
in the rankings. Think of creative ways to encourage people to visit your
location. It doesn’t always have to be about business; maybe you’re holding a
raffle or giving free workshops.

Myth 2: Embedding your Google Maps listing on your website affects rankings
Embedding a Google Maps listing on your website doesn’t impact rankings. But it
does indirectly complement your overall SEO campaign. It helps customers get
directions to your business locations and saves them from opening a new window
for this purpose. The fewer hoops someone must jump through to get to you, the
better the odds are of them doing so. There are additional advantages; for
example, the Google Maps listing shows off your review count and ratings, which
can influence whether someone will do business with you or not. It also makes
your contact information easier to find and confirms your address, so there’s no
confusion on where to go. All these considerations help the client in their
journey of converting into a potential sale. So, while having your Google Maps
listing on your website doesn’t help with rankings, it can help with
conversions.

Myth 3: Geotagging images impacts rankings There’s been a massive debate in the
SEO community about geotagging images and whether it has any value. If there is
any value, it’s so small that the ranking impact is drowned out by the simple action
of you posting images and people viewing those images. In other words, your
efforts are better spent elsewhere. Let’s think about it for a second. You take
one image and add three pieces of data to it, then you send the image to Google.
You have no way of knowing exactly what Google does with those pieces of data. It might throw out the first piece, acknowledge the second
but not display it or use it as a ranking factor, and acknowledge and display
the third but still not use it as a ranking factor. Just because Google
acknowledges something doesn’t mean it’s a ranking factor, and Google isn’t
obligated to acknowledge or care about anything the rest of the world sends it.
What’s more, even if the third piece of data is a ranking factor today, that
doesn’t mean it will be tomorrow. In conclusion, the effort you might spend
geotagging images is better spent elsewhere.

Myth 4: Paying for Google Ads improves rankings Just because you see the same
business ranking with an ad and with an organic position at the same time
doesn’t mean ads and organic SEO impact each other. When a query is commercial
in nature, ads are often shown, and these always appear above the organic
listings. A business may just happen to focus on both SEO and PPC; there isn’t
actually a correlation. Consider the mobile SERPs shown in Figure 12-13. Notice
how Fortuna & Cartelli PC shows up twice on the same page. That’s because the
result at the top is an ad they are paying for, while the lower result is their
position in organic search. The other two law firms rank higher in organic
search, but haven’t paid for ads. Paying Google for an ad slot has zero impact
on your organic presence.

Figure 12-13. Sample local search results

Myth 5: Call tracking will hurt rankings If you’re going to use a call tracking
number, make that number the primary number in your Google Business Profile
listing and make the receiving number the secondary number. That way Google
knows the relationship between the two numbers. There will not be any negative
impact on your rankings with this setup.

Myth 6: Keywords in review replies help rankings This is patently false.
Keywords in the reviews people leave for you can help your rankings, but
keywords in your responses have no impact. If anything, keyword stuffing in your
replies will hurt you. Remember, user engagement is a big ranking factor, so if
someone sees a review reply that looks sketchy, it’s going to negatively impact
that person’s customer journey. Figure 12-14 shows an extreme example of what
not to do when replying to a review.

Figure 12-14. An extremely bad approach to use in responding to reviews

Myth 7: Keywords in your Google Business Profile description impact rankings As
discussed in “Optimizing Google Business Profiles” on page 571, the only ranking
factors in your listing are as follows:

• Your business’s name

• The categories you selected

• Your business’s address

• The website link used in your listing

Everything else, including your description, only affects your ranking
indirectly, through its impact on user engagement. If you stuff your description
with keywords and make it sound artificial or otherwise unappealing, this will
harm user engagement, which will in turn negatively affect your rankings.

Myth 8: Having a defined service area impacts rankings This is a huge myth. The
only thing a service area does is create a red outline around the area you serve
in Google Maps. It has no impact on your rankings at all. In fact, showing or
hiding your address has no effect on your rankings in Google Maps. One is not
better than the other. The rankings for all Google Maps listings stem from the
address used to verify the listing, and this is a very important factor to
understand. Even if a business hides its address, the rankings will be based
around the address used to verify the listing, and its rankings will be the
strongest in the vicinity of that address. Let’s say you’ve verified your
business address as 100 North Central Expressway, Dallas, TX 75201. And let’s
say you’re a storefront business, meaning you show your address, and you service
people at this address. This will be the focal point of your rankings; you can
expect those rankings to begin to degrade around 3 to 5 miles out, depending on
the competitiveness for a given keyword. Now suppose you’ve verified that same
address as your business address, but you’ve decided to hide your address and
define your service area as encompassing Dallas, Houston, and Austin. Your
Google Business Profile listing will still rank well for searches in a radius of
about 3 to 5 miles from 100 North Central Expressway in Dallas, but
unfortunately, despite you indicating that you service Houston and Austin, you
will never have a local search presence in those cities because your verified
business address is much further than 5 miles from them! This leads us to
another point: if you’re not seeing rankings for a listing that hides its
address, always ask yourself (or your client) what address was used to verify
the listing and run a scan over that address. If the address is hidden,
third-party tools like Local Falcon will default to the center point of the area
the business says it services, so you’ll have to manually adjust the scan to run
from the actual address. Continuing with our previous example of a business
defined as servicing Dallas, Houston, and Austin, running a scan with default
settings would place the scan somewhere near Bremond, TX. That’s way too far
away to pick up rankings for a listing in Dallas, so you would likely see no
rankings in your report. This is why people often think hiding your address
hurts your rankings. In fact, only if you run the scan using your verified
address as the center point (or if it happens to be in the middle of your
defined service area) will you see your proper rankings.

In this scenario, the only solution if you want to rank in local search in
Houston and Austin is to open additional offices in those cities and set up
Google Business Profile listings for those addresses. You can try to boost your
presence by focusing on local Google Ads or organic search for your one office,
but to show up in Google Maps searches, there are limits on how far you can be
from the searcher’s desired location.

Optimizing On-Site Signals for Local When it comes to optimizing your website
for local search, it works the same as for universal search: if you want to show
up in the SERPs for a particular keyword, you need to have a page of
high-quality content on your site about that topic. The difference here is that
to rank highest, instead of having the best content on the internet for that
search query, you only need the best content in your local area. Remember, the
local algorithm will constrain search results to the local market. So, make sure
your content is unique and that it’s actually about your business and about the
local area. Once you’ve written your content, you need to optimize the page for
the local algorithm. Geo-optimization helps make it clear to Google and
potential customers that you’re an amazing local solution. Here are the page
elements you need to optimize for local search (remember to use the same keyword
in each element):

<title> tag Include your target keyword and city and state abbreviation in the title. You should never list your business name first—you’re the only business with that name, so you’ll always rank first for local searches for your name. Don’t waste valuable optimization space.

<h1> tag Include the same keyword and location information in your headline. This should be more conversational than the title and succinctly summarize the page.

Content If you’ve written content that’s unique and truly about your business
and the local area, it will already include your keyword and location multiple
times, so you shouldn’t need to do much (if anything) to further optimize your
content.

URL Include the keyword and location in your URL. This will make it easier for
humans to read than the machine-generated URLs that some CMSs create, and it’s a
better signal to the local algorithm.

Image alt text Including your target keyword and location information in image
alt text helps build additional relevancy. Note that this will only impact
rankings in image search, as Google does not use this signal in organic search.

Meta description Remember, the meta description isn’t a signal considered by the
algorithm—but including your keyword and location in the description can help
drive home the fact that you are the best resource in the local area. A more
compelling meta description will lead to more local click-throughs. It’s also
incredibly important to be sure that your phone number and address are listed on
the page that you link to from your Google Business Profile. It’s a good idea to
include your address and phone number on every page of your site, but it’s vital
to include them on your GBP landing page so that the algorithm sees the same
information that is listed in your profile. If you’re a multilocation business
with a single website, you should create a unique page for each location.
Location pages should include top-line information (address, phone number, and
hours) and a unique description of the location. Include photos of the location
as well. Be sure to include the appropriate local business Schema.org markup on
your GBP landing page as well (either your home page or your location page).
This markup only needs to appear on this page, not on every page of the website.
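Pulling these elements together, the sketch below shows what a geo-optimized GBP landing page might look like for a hypothetical Dallas plumbing business; the business name, keyword, phone number, and URL are placeholders, and the street address reuses the example address from earlier in this chapter:

<!-- Hypothetical URL: https://www.example.com/emergency-plumber-dallas-tx/ -->
<head>
  <title>Emergency Plumber in Dallas, TX | Acme Plumbing</title>
  <meta name="description" content="Acme Plumbing provides 24/7 emergency plumbing service in Dallas, TX. Call (214) 555-0142 for same-day repairs.">
</head>
<body>
  <h1>Emergency Plumber Serving Dallas, TX</h1>
  <!-- Unique content about the business and the local area goes here -->

  <footer>
    <!-- NAP details, matching the Google Business Profile exactly -->
    Acme Plumbing | 100 North Central Expressway, Dallas, TX 75201 | (214) 555-0142
  </footer>
</body>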
NOTE For more information about Schema.org markup for local business–related
structured data, see “Structured Data for Google Business Profile Landing Pages”
on page 593.

Optimizing Inbound Links for Local Links are weighted differently in the local
algorithm. Links from local websites are more valuable, even if they’re from
businesses or websites that are unrelated to your site. In other words, in local
search, links from others in your community carry more weight. When you’re doing
link research for local SEO, you can (to an extent) ignore the traditional
authority metrics in link research tools. Typically, links from smaller local
websites will have much less authority, but because locality is a big factor for
your inbound links, these are valuable links to acquire. The easiest way to
acquire local links is to get involved in the local community. Pay attention to
things you’re already doing and relationships you’ve already established in your area, as these are likely
opportunities for link building. Here are a few of the most common local link
acquisition tactics:

Local sponsorships Buying links is a bad idea that can lead to a penalty.
However, buying a sponsorship that results in a link is perfectly OK. Little
league teams, golf tournaments, 5k races, and any other local events are all
opportunities for garnering valuable local links.

Local volunteer opportunities Instead of donating money, in this case you’re
donating time for a worthy cause. Any local service–based activities can lead to
link opportunities.

Local meetups Use a site like Meetup.com to find local groups with regular
monthly meetings. If you have a conference room or meeting room available, look
for groups that don’t have a permanent meeting space and offer them yours;
you’ll get a few links that way. If you don’t have a meeting space, look for
groups seeking meeting sponsors—for $40–50 a month, you can buy their snacks and
drinks and get a few links.

Local blogs Find local bloggers and get them to write about your business. Even
if you give them a free product or service and the blog post mentions that fact,
you still get a great local link.

Local clubs and organizations Talk to staff (especially upper management and
ownership) about what they’re passionate about in their free time. If someone is
involved in a local club or organization, particularly if they’re part of the
leadership of that group, it’s usually fairly easy to get a link from that
organization’s website.

Local business associations Join all the applicable local business associations.
The links you’ll get are powerful signals of local relevancy, so they’re well
worth the annual membership fees.

Optimizing Citations for Local Citations are mentions of your business’s NAP
information (name, address, and phone number) on third-party websites. The
importance of citations has waned over the past few years as Google’s local
algorithm has gotten smarter. Citations are a foundational factor you need to
get right, but they have limited influence, so this doesn’t need to be an
ongoing tactic.

Most importantly, you’ll want to use a service to submit your business
information to the three primary data aggregators, so the correct information is
fed to all of the directory and citation sites. Then, search for your business
name on Google and check the first few pages of search results. Make sure that
the correct information is displayed on any sites that appear (and correct any
sites that display the wrong information).

Structured Data for Google Business Profile Landing Pages Google maintains
structured data schemas for many different kinds of information, including local
business information. For optimal local search visibility, you should add a
structured data block to the page that your Google Business Profile links to.
There’s no need to put it on any other pages unless you have multiple separate
pages for different locations (in which case each profile landing page would
need its own customized structured data block). There are many potential
structured data elements you can use to describe your business and its location;
complete documentation is available in the Google Search Central documentation.
Here’s an example of what a minimal structured data block might look like for a
restaurant:
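(In the sketch below, the business name, URL, phone number, cuisine, price range, coordinates, and hours are placeholder values for a hypothetical business; the street address reuses the Dallas example from earlier in this chapter.)

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Restaurant",
  "name": "Example Bistro",
  "url": "https://www.example.com/",
  "image": "https://www.example.com/photos/storefront.jpg",
  "telephone": "+1-214-555-0142",
  "servesCuisine": "Italian",
  "priceRange": "$$",
  "address": {
    "@type": "PostalAddress",
    "streetAddress": "100 North Central Expressway",
    "addressLocality": "Dallas",
    "addressRegion": "TX",
    "postalCode": "75201",
    "addressCountry": "US"
  },
  "geo": {
    "@type": "GeoCoordinates",
    "latitude": 32.7881,
    "longitude": -96.7988
  },
  "openingHoursSpecification": {
    "@type": "OpeningHoursSpecification",
    "dayOfWeek": ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday"],
    "opens": "11:00",
    "closes": "22:00"
  }
}
</script>

The name, address, and phone number in this markup should match the NAP details in your Google Business Profile exactly.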

NOTE A special thanks to Greg Gifford and Marian Mursa for their contributions
to the local search–related sections of this chapter.

Image Search A significant amount of traffic can come from image search, and the
number of people competing effectively for that traffic is much lower than it is
in universal search. Industries that don’t immediately seem to provide
compelling subjects for images may enjoy greater potential in this area, because
the competition might not know the advantages of integrating images into their
sites and into an overall search marketing strategy. There are a few different
ways that image search optimization can help improve traffic and conversions for
your site:

Subtle reputation management Images of your products, services, or facility
assist consumers during the research phase of their shopping journey and lend an
implicit message of openness and forthrightness to your business. Providing
images can improve consumers’ confidence in your company, increasing the chances
that they’ll decide to do business with you.

Sales via image search results Increasingly, consumers are searching for
products via image search engines because they can rapidly find what they are
seeking without having to dig through promotion-laden websites. This is
especially true for sites that have products that lend themselves to visual
display. If your products can be found in the image search engine, then you have
an improved chance of being found by those people. With no pictures, there’s zero chance of your site being found in image
search. Google has also increasingly supported shopping usage of image search by
displaying product information such as pricing, reviews, brand, and in-stock
status within image search.

Increased chances of showing up in universal search/blended search results
Performing image search optimization improves your chances of showing up in
additional positions on the main search results pages, as universal search pulls
image search content into the main SERPs for some keyword search terms. You can
see an example of this in Figure 12-15.

Figure 12-15. Image content integrated into universal search

Site/business promotion opportunities If you have a flexible enough organization
and you hold the legal copyrights to your images, you can allow others to reuse
the images in return for promotion of your site/business.

Improved click-through rates for some types of universal search result listings
Google and Bing frequently add thumbnail images beside listings for various
types of content, particularly content that incorporates structured data markup.
Thumbnails may be included for recipes, news articles, personal profiles, and
other content, and these listings draw more attention and more clicks than
text-only listings.

Image Optimization Tips In terms of the page content, you should give particular
emphasis to the text immediately preceding and following the image. This is what
the user most closely associates with an image, and the search engines will view
it the same way. A descriptive caption underneath the image is highly
beneficial. You can do a number of things to optimize your images. Here are the
most important: • Make sure the image filename or img src attribute contains
your primary keyword. If it is a picture of Abraham Lincoln, name the file
abe-lincoln.jpg and/or have the src URL string contain that keyword, as in
https://example.com/abe-lincoln/portrait.jpg. The filename is also a useful place to include your keyword, though how much weight it may carry is unclear. If your content management system relies upon a gobbledygook ID number for images, you may be able to set it up to include the primary keyword first, followed by the ID number, as in https://example.com/abe-lincoln~ABC12345.jpg.
• Always use the alt attribute for images. This attribute helps vision-impaired
users to understand your site, and search engines use it to better understand
what your images are about. Our recent research indicates that this feature is
still not used for lots of sites’ images, and that many sites have tried to use
it with invalid HTML. Make sure the alt parameter is valid, as in this example:
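(A minimal sketch; the filename and alt text below are illustrative placeholders.)

<!-- Hypothetical filename and alt text -->
<img src="/images/abe-lincoln-portrait.jpg" alt="Portrait of Abraham Lincoln, 1863">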

• Use the quotes if you want to include spaces in the text string of the alt
attribute. Omitting the quotes is a common problem; without them, all words
after the first will be lost, if any are used at all. (Note that many CMSs
automatically generate alt text content. It is important to use HTML entities or
properly escaped characters for symbols that appear within the alt text. A
common error occurs when unescaped quotation marks or apostrophes are replicated
within this text, invalidating the HTML.)

• The title attribute may optionally be used for images, but it’s likely
unnecessary if you are using the alt attribute. If you do specify a title, use
the same words as the alt text. Some systems implement the title with spaces
replaced by dashes. This still works for search engines but is off-putting for
human users who may see this text when using certain interfaces or browsers. •
Avoid query strings in the src URL, just as you should for page URLs. If you
must use URLs that include query strings, use no more than two or three
parameters. Consider rewriting the query strings in the URLs so that they do not
contain an excessive number of parameters, which could cause spiders to refuse
to crawl the links. Although Google claims to no longer have problems with these
types of situations, it’s better to be safe than sorry. • Use good-quality
pictures, which will display well when shown in thumbnail format. Good contrast
is typically the key here. Lower-contrast images are visually harder to process,
and it’s common sense that if the thumbnail image doesn’t look good, it will not
invite a click. • The content of the image is also a key factor. Google does try
to process image content, and the more relevant it is to the content of the
page, the better. • Do not save images as graphics files with embedded
thumbnails—turn this feature off in Photoshop and other image editing software.
Search engines may copy your image, reduce it in size, save it in compressed
format, and deliver up a thumbnail of it for their results pages. An embedded
thumbnail can wreak havoc with some compression software, and it increases your
file size slightly, so just leave that feature disabled. • Don’t store the
images in a sidebar column with your ads or inside the header/footer navigation
elements; if you do this, the search engines will ignore them, just as they
ignore page decor and navigation graphics. More important images should ideally
be placed higher up on the page and above the fold, meaning they should be
visible when the page first loads without requiring scrolling to see them. (Be
aware that “above the fold” means different things with regard to how web pages
are designed for mobile versus desktop devices. You likely will need to design
your mobile interface to display images at a different size than for the desktop
view of your web pages.) • Include structured data or IPTC photo metadata for
your important images. Structured data for images used on pages containing those
images can enable overlay badges to appear in Google image search, enabling
users to readily recognize content they may be seeking. For example, you can
incorporate structured data for recipes, videos, and products to elicit the
added treatment. This also allows Google to show more details about the image in
the image search results, such as the name of the creator/photographer, how people can license the image to use it
(the “Licensable” badge), and credit information. • Take care with asynchronous
image delivery (“lazy loading”) on your web pages, where AJAX or JavaScript will
load the images after the main page elements as the user scrolls down the page;
this can delay Google’s absorption of images, or result in Google failing to
index the images altogether. Image sitemaps (discussed in “Deciding what to
include in a sitemap file” on page 185) can help mitigate this effect, and
following the latest responsive image techniques can help. • Incorporate responsive images via <picture> elements that may contain multiple versions of the same image, enabling browsers to select the optimal image to use for a particular device and screen. The <picture> element should include an <img> element as a fallback to display for older browsers that do not recognize this element. An alternative to using <picture> is to use the <img> element with a srcset attribute, enabling you to provide multiple copies of an image, each associated with a different screen size. • For articles (news, blogs, etc.), Google recommends including a few high-resolution
images in three different aspect ratios: 1×1, 4×3, and 16×9. Google also
recommends this for a number of other schema types, including LocalBusiness,
Recipe, Product, and Event. • Have a proper copyright license! You need to have
a license to display other people’s images that you use on your site, so that
you don’t get sued. Be careful about trying to use images from Wikimedia Commons
or other public stock photo sites, as you cannot be sure that those images
really are in the public domain. When you “purchase” an image from a stock photo
site, you are not purchasing the copyright—you are purchasing the right to use
the image. Be sure to understand the terms of the agreement well, as some of the
time these arrangements require that you link back to the copyright holder.
While copyright approval is primarily a legal consideration, note that Google is
now assessing how many copyright removal demands are associated with websites,
and if a website has too many, it will potentially cause a ranking reduction in
Google search results. • Ensure that your server configuration allows your
site’s images to be displayed when called from web pages on other domains. Some
system administrators have disabled this setting to keep people from displaying
their images on other sites, and this could cause problems if you want your
images displayed in search engines’ image results pages. Likewise, make sure
that your robots.txt file does not block the crawlers from accessing your image
file directories, and that the HTTP header for your site’s images does not block
crawlers. • Ideally, your images should be hosted on the same top-level domain
as the web pages where they are displayed. If your page is on www.example.com,
your images ideally could be on the same domain, or on another subdomain that shares the
same top-level domain, such as images.example.com. • If it is a fit for your
business, specify that others are free to use your images for online display as
long as they link back to your website from a credit line below or adjacent to
the image, where they display your copyright notice. Enabling others to use your
photos invites more promotional attention when people wish to write about you in
blogs or in news articles. • Page speed is a factor that is affected by images
themselves, so it is important to specify the size that images are used at
within web pages, and also to compress your images efficiently to have the
smallest file size while not degrading quality too much. • Avoid delivering
images via CSS only! Google parses the HTML of web pages to index images but
does not index CSS background images. Display images on your web pages via <img> tags, not via CSS background styles applied to other elements. It’s also a good idea to create an
image sitemap to highlight images you consider particularly important, or to
help the search engines find images that might otherwise be difficult for their
crawlers to discover. Including images in articles or blog posts likely helps,
directly or indirectly, to enhance their rankings in search, as users like
visual elements illustrating the text. However, Google and the W3C recommend
against having alt text for decorative images on websites. So, spacer images,
border decor, and similar decorative elements should be implemented without this
attribute or with nothing between the quote marks: alt="".
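To make the image sitemap suggestion concrete, here is a minimal sketch of an image sitemap entry; the URLs are placeholders, and each <url> element can list multiple <image:image> blocks:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Placeholder page and image URLs -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:image="http://www.google.com/schemas/sitemap-image/1.1">
  <url>
    <loc>https://www.example.com/abe-lincoln/</loc>
    <image:image>
      <image:loc>https://www.example.com/abe-lincoln/portrait.jpg</image:loc>
    </image:image>
  </url>
</urlset>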

Image Sharing Sites With the advent of blended search and the web’s overall
movement toward more visually compelling, image-rich content, there are ever
more opportunities to achieve organic exposure with images—and the area is also
becoming more competitive. The following sections list some image optimization
best practices to use when uploading your images to an image sharing site. These
tips will help ensure your image files have the best chance of appearing in the
SERPs for targeted, qualified searches.

Pinterest Pinterest is a great platform for promoting images and getting them to
appear in Google image search. It offers plenty of flexibility for how you
organize your images and earns them a lot of visibility in Google’s search
results. The following list discusses best practices for optimizing your
Pinterest presence for search visibility: • When you upload images to Pinterest
(Pinterest uniquely refers to photos as “Pins”), always add a title. The title
is the most influential element, so use great keywords to describe your images. Pinterest allows up to 100 characters in the
title, but only the first 40 characters will appear in feeds and other
interfaces that display its content. It’s fine to use the full 100 characters,
as this will appear in the photo page’s title when indexed by Google; just be
aware that the first 40 characters need to convey what the picture is about. •
Add a description—this is essentially a caption for the photo, which Pinterest
displays to the right of the picture on desktop, and below the picture in
mobile. Use good description text that reiterates the primary and secondary
keywords about the image, but be aware that only the first 50 characters will
appear to users when they land on the image page; further text will be hidden
and therefore of lower importance when Google indexes it. • Include specific alt
text. Pinterest allows you to supply text for the alt parameter, and this should
again reinforce the main topic and keyword to be associated with the picture.
Annoyingly, Pinterest adds the text “This contains:” at the beginning; keep this
in mind when writing your alt text. • Link the photo to your website. You can
associate a photo with a link, and this is a worthwhile connection to make with
your website. • Create thematic “boards” for your photos (as in “pinning images
to a bulletin board”). You must associate each photo with at least one board to
display it on your profile. Boards should be designed as categories for your
images with themes that end users will readily understand. Having these
topic-specific boards to categorize your photos facilitates browsing and
interaction with your images. Boards should be given short, clear names, and a
brief description should be added as well, although this is optional. A
representative image may be used to represent the board; this should be one of
the best from that particular set in order to attract visitors. For those boards
that are geographically specific, you can specify a location to associate the
board with. • Pin images to multiple boards. This improves the chances that
users will see them and save them, increasing their exposure and distribution
potential. To associate an image with multiple boards, once you have initially
saved it, view the image’s page and click the board name drop-down beside the
Save button in the upper-right corner (on desktop) or the Save button below the
image (on mobile). Select a board name from the list (or choose “Create board”
to create a new one), and click Save. • You can upload images directly to
Pinterest, but it also allows you to add images directly from a web page where
they appear. On a desktop, click the + button at the bottom of your home feed,
select Create Pin, and choose “Save from site.” Enter the URL, then select the
image and click “Add to Pin” once Pinterest identifies it. The process is
similar on mobile. One benefit of this is that the interface will already have the page URL to associate the image with. In some instances,
however, Pinterest may be unable to parse the web page to locate the image, or
the image could be in a format that Pinterest cannot grab and use. In those
cases, you’ll need to upload it yourself. If the format is an issue, you could
screen-grab the image and then upload the cropped, screen-grabbed version that
has been saved as a JPG. When uploading images from a web page, also be aware
that the content Pinterest associates with the image from that page may override
any title, description, or alt text you hand-enter.

Instagram Instagram is likewise an excellent platform for the purpose of
promoting images and getting them to appear in Google image search. The best way
to add images to your Instagram account is via the smartphone app interface, but
it is possible to add images via some third-party social media management
platforms. Instagram provides very little flexibility for image optimization.
Here are a few tips: • Try using filters. Naturally, you should use good-quality
images in the first place, but Instagram has some fairly sophisticated filters
that can make your images look even better. Experiment with these to see what
produces the best effect, or use the custom image adjustment settings to improve
the image’s appearance. Most images benefit from some slight tweaking of the
contrast, brightness, and color saturation. • Crop your images to fit. Images do
best on Instagram with a one-to-one ratio. It is possible to upload a single
image with a longer vertical ratio, but be aware of how Instagram may
automatically crop it when it generates the thumbnail view to display on your
profile. • Write a good caption that includes keywords. The caption field
provides a very generous space for long text. However, generally shorter
descriptions perform better if you craft them well and include vital
keywords—unless you can craft a caption that intrigues the viewer enough that
they want to click to see the full text description. When more viewers click to
expand the text, the engagement generated is a ranking signal, indicating to
Instagram that that image is more interesting than other images that are also
relevant for the same keywords. • Include good hashtags. Instagram particularly
revolves around hashtags, so add them for your keywords using the format
#ThisIsMyHashtag (a leading hash mark followed by a word or set of words with no
spaces). You can include multiple hashtags. As you start typing one into the
Instagram interface, it will show you some of the hashtags beginning with the
same letters as you are typing, and a count of how many times that hashtag has been used. This is great for
discovering additional keyword hashtags to add. • Tag other people related to
the image you are adding, such as people appearing in the photo who also have
Instagram accounts, or accounts that you reasonably believe might find the image
particularly interesting. • Add location data to the image as you submit it, if
the image is associated with a specific place. This feature enables people to
discover all the images associated with a location, including your image if you
add this information. • Make the images available to appear in Google SERPs.
Make sure your account is not set to private so that Google can access and crawl
the photo pages. Linking to your profile from your website can help the images
be discovered and indexed. Also, embedding your more important photos into your
web pages or blog posts can help expedite indexing and enhance their ability to
rank in search. Many of these suggestions can generally be applied to images
used on other third-party social media platforms that heavily leverage images as
well. For these types of sites, be sure to write keyword-rich descriptions and
make ample and appropriate use of hashtags and other options and features
similar to those described here for Pinterest and Instagram, such as associating
images with locations, tagging related accounts, and adding alt text when
possible. NOTE A special thanks to Chris Silver Smith for his contributions to
the “Image Search” section of this chapter.

News Search News stories can be in the form of text (articles, interviews, press
releases) or videos, and they can appear on any Google property that has a news
surface (an area that displays news content), such as universal SERPs, Google
News, YouTube, Google Assistant, and the Discover feed in the Google mobile app.

Within the news vertical, there are three subtypes:

Top news Is represented in the Headlines section in Google News and the Breaking
News section in YouTube. These are the most important global news stories,
according to Google’s algorithms.

Personalized news Is represented in the Discover feed in the Google mobile app,
the For You section of Google News, and the Latest tab in YouTube. These are
personalized results that contain news stories specific to your locale, news
that pertains to your search history, and new (but not necessarily news) content
that you’ve elected to be notified of through a variety of means: saved news
stories, saved searches, followed topics and sources, periodical subscriptions,
and subscribed or followed YouTube channels.

Deep context and diverse perspectives Provide more information beyond top news
stories, and from different sources. On some news surfaces, these will appear
underneath top news results; on others, you must select the More menu next to a
top news story, then select Full Coverage in order to see these results.

Google News Google News is available in over 125 countries and 40 languages.
Data from the NewzDash SEO tool shows that most news sites get between 50% and
60% of their traffic from Google and Google News—according to Google, over 24
billion visits per month. Since the great majority of the traffic that news sites attribute to Google comes from Top Stories (which is a Google News feature), this makes up a very large percentage of overall site traffic.
Additionally, according to a September 2022 Gallup poll, a minority of Americans
say they place a “great deal” or even a “fair amount” of trust in mass media
sources (7% and 27%, respectively). In contrast, a Reuters survey published at
around the same time found that 53% of Americans trust Google as a news source.
In short, if you’re a news publisher, you want to be in Google News. All the top
media publishers are aware of this, as the data in Figure 12-16 shows.

Figure 12-16. Top news publishers in Google News (source: NewzDash)

Google News surfaces Google News is how many people get the latest information
on current events. As the name implies, the recency of the content is a
significant factor in what Google chooses to show prominently in the News search
results; most users are not looking for old news! The name Google News is
commonly used to represent all news surfaces in Google, but to be specific,
Google has three surfaces that provide news:

Google News This is the core Google News platform, which can be found at
news.google.com or in the Google News app. According to NewzDash, this surface
provides 3–8% of traffic to top news publishers. Figure 12-17 shows an example
of the results from news.google.com.

Figure 12-17. Sample results from news.google.com

Top Stories Figure 12-18 shows an example of Top Stories results within
universal search. While this surface also shows news results, it has a different
algorithm than Google News and can draw content from different sources
(including Reddit, X, and other user-generated sources of news). According to
NewzDash, top news publishers can get 20–50% of their traffic from Top Stories.
Top Stories can show up as a mix of text links and one to two carousels. Where
Top Stories show up in the SERPs as well as how many stories are shown is
determined by factors like search demand, how important a topic is deemed to be,
and availability of fresh coverage from reputable news publishers. Top Stories
on desktop lists up to seven results with an additional “More News” link
pointing to the News tab. On mobile, there could be one to three news carousels
containing up to about 30 results each. Mobile therefore represents the bigger
opportunity: most of the traffic to publisher sites (up to 80–90% of their total
traffic) is generated from mobile devices.

Figure 12-18. Sample Top Stories results in the standard Google SERPs

The News tab in Google universal search This is an extension of Top Stories, but
it is less used by users, so it delivers less traffic (NewzDash estimates it
accounts for 2–4% of traffic for major media sites). Figure 12-19 shows an
example of the results on the Google News tab. In addition, for some queries
news stories are prominently featured in Google Discover, so this can also
provide an additional source of traffic. One thing to note is that being
included in Google News doesn’t mean you will get good visibility and rankings
in Google News or Top Stories. Indexation and rankings are two separate
processes, as discussed in the following two sections.

Figure 12-19. Sample Top Stories results on the News tab in Google search

How to get into Google News Historically, publishers had to apply for their
content to be shown in Google News results, but today that’s no longer possible.
Google scans the web to find sites that are quality publishers of news and will
include them automatically if they are deemed to be eligible. Here are some
prerequisites that you must meet to be eligible to be selected for inclusion in
Google News: • Your site must be a dedicated news site. If you simply have a
news section on your site, this will not be eligible. • You must publish
original content. Simply republishing content from other sites is not
acceptable. In addition, Google News doesn’t allow content that “conceals or
misrepresents sponsored content as independent, editorial content.” • You must
use highly credible authors. Authors must be subject matter experts in the topic
areas that they cover. • You must update your content frequently. Updates should
be made to your site multiple times per day.

Additional steps to increase your chances of inclusion include: • Don’t publish
content that can be considered dangerous, hateful, terrorist, violent, vulgar,
sexually explicit, manipulated, or deceptive. Providing medical content that is
inaccurate or that is not backed by sufficient medical expertise can disqualify
you as well. • Don’t publish content that is misleading, where what is promised
by the headline doesn’t reflect what the user will find in that content. •
Become familiar with Google’s efforts to fight fake news and support journalism,
known as the Google News Initiative. • Create and submit a news sitemap, following the process outlined in the documentation (a minimal example appears at the end of this section). • Create bio pages for each
of your authors and editorial staff, including information on their experience
and qualifications. • Publish a page listing your writing and editorial staff,
including links to each of their bio pages. • Publish an About Us page that
shows your physical address and information on how to contact you. • Prominently
display an author byline and a date of publication for each piece of content. If
you meet all the requirements, you can create a Google Publisher Center account
by filling out the form and clicking Add Publication. Once this process is
complete you can submit the URL of your news feed, though that will not
influence inclusion. You can learn more about Google’s expectations via the
Google Publisher Center’s Help Center. Some publishers may think that setting up
an account and submitting a URL is all it takes to be included in Google News,
but this is incorrect. As mentioned previously, your content will be subjected
to a series of automated checks that it must pass, and your publishing history,
reputation, and other factors are also taken into account. John Shehata’s
NewzDash article lists a couple of ways for publishers to determine if they are
included in Google News.
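Returning to the news sitemap prerequisite above, here is a minimal sketch of a Google News sitemap entry; the publication name, URL, headline, and date are placeholders:

<?xml version="1.0" encoding="UTF-8"?>
<!-- Placeholder article URL, publication name, and headline -->
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
        xmlns:news="http://www.google.com/schemas/sitemap-news/0.9">
  <url>
    <loc>https://www.example.com/business/example-story.html</loc>
    <news:news>
      <news:publication>
        <news:name>The Example Times</news:name>
        <news:language>en</news:language>
      </news:publication>
      <news:publication_date>2023-03-01T08:00:00+00:00</news:publication_date>
      <news:title>Title of a News Article</news:title>
    </news:news>
  </url>
</urlset>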

Google News ranking factors Once you’re in Google News, there are many factors
that are analyzed algorithmically that determine how your content will rank.
These are:

Relevance to the user query Presence of the keyword or close synonyms in the
title and article copy plays a large role in determining the relevance. This is
true for both Google News and Top Stories.

Prominence Your chances of ranking highly are increased if this is a topic that
you cover in depth frequently, and if that content has been well received.

Authoritativeness The more credible your sources, authors, and editorial staff
are on the topic matter, the better.

Content originality Simply summarizing other content or syndicating it will not
position you well for ranking in Google News. Provide original content that is
ideally differentiated from what is published by others.

Freshness As a news story updates, make a point of updating your story, even if
things are changing on a minute-by-minute basis. The more up-to-date you are,
the better. This can be particularly significant for ranking in Top Stories.

Content quality As with producing content that you hope to see Google surface
across any of its platforms, the quality of your content matters. This includes
practicing good journalism, creating content based on evidence and facts, as
well as using proper grammar and spelling.

Inclusion of images and/or videos Original, high-quality, relevant images and/or
videos are seen as valuable to users.

Audience loyalty It’s helpful to cultivate loyalty among your readers, as their
preferences are a material factor in what those users will see. If users show a
strong preference for reading content from your site, you will rank higher in
Google News for those users.

Topic preference Google will use its understanding of the topics that users
prefer to deliver content that best fits those preferences. Google News
determines a user’s preferences by building a history of what they have selected
over time. Users can also show interest (or disinterest) in topics whenever they
search, scroll to the bottom of the results, and click “More stories like this”
or “Fewer stories like this.” In addition, Google may make use of topic
preferences provided by users within the Google app.

Diversity of results Google strives to show results from a broad range of
perspectives and sources. This can provide publishers with an additional way
into the Google News SERPs.

Location Users may be served content from news providers that are from their
local area. Figure 12-20 shows an example analysis performed by NewzDash on the
Google News share of voice for queries performed in the UK market.

Figure 12-20. News share of voice in the UK market (source: NewzDash)

Language Google will favor content that is in the user’s preferred language.

Publication frequency and timing Publish content just prior to the time frame
when your readers are likely to consume it. Make sure you learn the timing
preferences of your audience and publish new content shortly prior to that. For
example, entertainment content may be more in demand in the evening, and world
events/breaking news content tends to be in more demand in the morning.

Top Stories ranking factors While there are many common aspects to ranking in
Top Stories and Google News, there are also some differences. Relevance, as
always, is the most important ranking factor. It’s also important to make
prominent use of the target keyword(s) in the headline, subheadline, and opening
paragraph. Working out the best keywords will be driven by how users are
searching for the related news. Start by concisely capturing the topic of your
story in a clear, short phrase appropriate for a headline. Google also places a
great deal of emphasis on the authority of your site on the topic of the
content, as part of its effort to combat fake news and incorrect information.
Hence, the authority of your editors and authors is just as important for
ranking in Top Stories as it is for Google News. Other factors that are significant
include:

Freshness It’s important to keep your stories current and up-to-date. This means
updating the content frequently as the story develops.

Click-through rate Google measures the rate at which your story receives clicks.
If you are not receiving many clicks, you may be lowered in the Top Stories
rankings, and correspondingly, if you are receiving a lot of clicks you may be
moved higher up.

Citations References to your content are highly valuable. This can include
signals such as links and citations (references to the content without a link),
but other factors, such as tweets, Facebook shares, and Reddit mentions are also
valuable. Traditionally, Top Stories were reserved only for news pages that
implemented Accelerated Mobile Pages (AMP). This requirement was removed in
April 2021. Google has also clarified that a high Core Web Vitals score is not a
requirement. NOTE In September 2022, Google announced that it would begin
displaying local news coverage from global sources with automatic translation
into the user’s preferred language in 2023. For example, US English users
searching for news about a major earthquake in Mexico will get Spanish news
articles from Mexican news sites directly in Top Stories, with Google Translate
providing instant content translation. As of publication, however, the rollout
plan for this feature has not yet been confirmed.

As with other aspects of its search algorithms, entities are a big part of what
helps Google understand the fabric of the web and its most important content.
Several types of entities could be the focus of your news content, such as:

Events These can be scheduled events (a sporting event, a concert, a movie
release, etc.) or breaking news events (a natural disaster, two celebrities
getting engaged, a major business merger, etc.).

People Famous individuals (Jennifer Aniston, Dolly Parton, Barack Obama, Jeff
Bezos, etc.) can be the focus of news stories. There is a lot of interest in
things that famous people do.

Organizations Organizations (the Boston Red Sox, Amazon, Progressive Insurance,
etc.) are also frequent generators of news stories of interest.

Places A lot of news is local in nature, so including location names (Milan,
Paris, Mumbai, Brazil, Canada, etc.) in your story can be quite helpful.

Things/concepts In many cases, news can relate to things (Tesla cars, insulin,
drones, government budgets, etc.).

As you develop your story, consider which entity (or entities) is the major
focus of your content. Positioning this properly in your headline and in the
content can materially impact your click-through rate in Top Stories, and
therefore your rankings. As we’ve already noted, authority influences rankings
in both Google News and Top Stories. Bear in mind that authority is something
that must be cultivated. Invest in having true subject matter experts create
your content. Get someone with a known name in the topic area of interest if you
can, or develop them if you must. Either way, repeated coverage in the same
topic area over time will go a long way toward making it clear that this is an
area where your organization has deep expertise.

Schema.org markup for Google News and Top Stories As with universal search,
including structured data markup in your content can impact how your stories are
shown in Google News and Top Stories. For example, specifying the NewsArticle
schema type can help Google show better titles, images, and date information for
your articles. This can lead to higher click-through rates and help drive
rankings in Top Stories. An example schema for a news article is shown here:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Title of a News Article",
  "image": [
    "https://example.com/photos/1x1/photo.jpg",
    "https://example.com/photos/4x3/photo.jpg",
    "https://example.com/photos/16x9/photo.jpg"
  ],
  "datePublished": "2023-01-05T08:00:00+08:00",
  "dateModified": "2023-02-05T09:20:00+08:00",
  "author": [
    {
      "@type": "Person",
      "name": "Jane Doe",
      "url": "https://example.com/profile/janedoe123"
    }
  ]
}
</script>
The properties you can specify for an article (of type Article, NewsArticle, or
BlogPosting) are:

author The author of the article. This can be a person or an organization.
Google’s structured data guide provides details on author markup best practices.

author.name The author’s name.

author.url The URL of a page that uniquely identifies the author of the article.

dateModified The most recent date that the content was modified. This should be
in ISO 8601 format.

datePublished The date that the content was originally published.

headline The title of the article. This should not exceed 110 characters.

image A link to a relevant image that you want to represent the article.


Complete details on the structured data markup for articles and other content
types are available in Google’s structured data guide. Other types of markup
that may apply for news content include:

speakable Designed to enable you to identify portions of your content that are
suitable for being used in audio playback. This makes your content eligible for
being returned in response to voice queries, such as those that are more
commonly used with Google Assistant on mobile devices (a markup sketch follows
this list).

VideoObject Use this to mark up any videos that you have embedded in your
article.

LiveBlogPosting For use with content intended to provide live ongoing coverage
of an event in progress.
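
For instance, a minimal speakable sketch might look like the following, assuming
the headline and opening summary of the page can be selected via CSS classes
named headline and article-summary (substitute whatever selectors and URL your
own templates use):

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "WebPage",
  "name": "Title of a News Article",
  "speakable": {
    "@type": "SpeakableSpecification",
    "cssSelector": [".headline", ".article-summary"]
  },
  "url": "https://example.com/news/title-of-a-news-article"
}
</script>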

Subscription and paywalled content You only need to include paywalled content
markup if your content is paywalled and you want to allow Google to crawl it.
You also need to permit users coming from Google to read the full article
without being presented with a paywall (other users coming to your site will
still have to pay for access). You’ll also need to implement flexible sampling
to enable Google to crawl the paywalled content. This allows you to specify how
many times per month users coming to your site from Google can get the full
article content without paying, so effective use of this program can help drive
new subscriptions.
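
A minimal sketch of this markup, assuming the gated portion of the article is
wrapped in an element with a class named paywall (a name you choose in your own
templates, with a placeholder headline here), might look like this:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "NewsArticle",
  "headline": "Title of a News Article",
  "isAccessibleForFree": "False",
  "hasPart": {
    "@type": "WebPageElement",
    "isAccessibleForFree": "False",
    "cssSelector": ".paywall"
  }
}
</script>

The corresponding HTML would wrap the subscriber-only portion of the article in
an element such as <div class="paywall"> ... </div> so that Google can tell the
free preview apart from the gated content.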
Some other tips for ensuring that your article content is displayed the way you
want it to be in the Google search results are:

• Provide permanent and unique article URLs. Each news article should have a
dedicated, static URL.
• Use as few redirects as possible. Redirect chains are bad practice, not just
for Google News but for Google overall. In addition, don’t use a meta refresh to
redirect users; see “Redirects” on page 288 for further details.
• Avoid parameters in your URLs. They can cause Google to ignore your content.
• Optimize your headlines. There are many components to this (see the example
markup after this list). These include:
— Create a title that has between 2 and 22 words, and a minimum of 10
characters.
— Use the same title for both your <title> tag and the headline tag (e.g.,
<h1>) for the page.


— Place your headline tag in a highly visible spot above the article content,
the author’s name, and the date. Make sure that there is no other text between
the tag containing your title and the main content of your article.
— Don’t start your title with a number. For example, avoid something like: “22
Most Important Lies Told in Last Night’s Debate.”
— Don’t use the headline in anchor text used on the article page or elsewhere on
the site to link to other pages on the site.
— Do use the headline in anchor text on other pages on the site to link to your
article page.
— Don’t use the date or time of the event in your headline. This can cause
Google to get confused about when the content was published or last modified.
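
Putting several of these headline guidelines together, a minimal page sketch
might look like the following (the headline, byline, date, and class names are
illustrative placeholders, not values from any particular site):

<!DOCTYPE html>
<html>
  <head>
    <title>Mayor Unveils Plan to Rebuild Downtown Bridge</title>
  </head>
  <body>
    <article>
      <!-- The headline matches the <title> tag and sits above the byline,
           date, and article body -->
      <h1>Mayor Unveils Plan to Rebuild Downtown Bridge</h1>
      <p class="byline">By Jane Doe</p>
      <time datetime="2023-03-14T07:30:00-05:00">March 14, 2023</time>
      <!-- The article content begins immediately after the byline and date -->
      <p>The mayor announced on Tuesday that ...</p>
    </article>
  </body>
</html>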

Images in news content As mentioned earlier, one of the Google News ranking
factors is rich media; embedding photos and videos in an article can impact
rankings. In the past, Google News frequently displayed images from one source
with articles from another source, but Google no longer uses that approach. Here
are some tips to increase the likelihood that your images are included in Google
News:

• Place the image below the article headline.
• Images must be relevant to the story: no logos, stock images, or icons.
• Use structured data tags (Schema.org and/or Open Graph Protocol) to define
full image and thumbnail metadata (refer to “Thumbnail” on page 622 for more
information).
• Use the JPG or PNG image formats.
• Format images for modern standard aspect ratios, such as 1:1, 4:3, 3:2, and
16:9.

NOTE Your site should be responsive to multiple user agents and device types, so
there should be multiple versions of each image, each corresponding to a
different screen size.

• Use reasonable image resolutions, with the smallest thumbnail size being 60×90
px, and define each image’s dimensions (in pixels) with both a width and a
height attribute.
• Use the display: inline style attribute in the <img> tag, or in the CSS that
corresponds to it.


• Put an appropriate (descriptive, yet concise) caption for each image in the
<img> tag’s alt attribute.

Here’s an example of a Google News–friendly <img> tag that incorporates these
guidelines (the URL and caption are placeholders):

<img src="https://example.com/images/downtown-bridge-plan.jpg"
     alt="The mayor presents the downtown bridge rebuilding plan at City Hall"
     width="1200" height="800" style="display: inline">
REPORT "THE ART OF SEO: MASTERING SEARCH ENGINE OPTIMIZATION [4 ED.] 1098102614,
9781098102616"

×
--- Select Reason --- Pornographic Defamatory Illegal/Unlawful Spam Other Terms
Of Service Violation File a copyright complaint


Close Submit

--------------------------------------------------------------------------------


CONTACT INFORMATION

Michael Browner
info@dokumen.pub

Address:

1918 St.Regis, Dorval, Quebec, H9P 1H6, Canada.




SUPPORT & LEGAL

--------------------------------------------------------------------------------

 * O nas
 * Skontaktuj się z nami
 * Prawo autorskie
 * Polityka prywatności
 * Warunki
 * FAQs
 * Cookie Policy


SUBSCRIBE TO OUR NEWSLETTER

--------------------------------------------------------------------------------

Be the first to receive exclusive offers and the latest news on our products and
services directly in your inbox.

Subscribe
Copyright © 2024 DOKUMEN.PUB. All rights reserved.
 * 
 * 
 * 
 * 
 * 

Unsere Partner sammeln Daten und verwenden Cookies zur Personalisierung und
Messung von Anzeigen. Erfahren Sie, wie wir und unser Anzeigenpartner Google
Daten sammeln und verwenden. Cookies zulassen