
Download
Documentation
Resources
Community
Commercial Support
FAQ
Fork on GitHub

An open source and collaborative framework for extracting the data you need from websites.

In a fast, simple, yet extensible way.

Maintained by Zyte and many other contributors


Install the latest version of Scrapy

Scrapy 2.11.2

pip install scrapy
PyPI · Conda · Release Notes
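Scrapy is also packaged on conda-forge; a minimal alternative to the pip install shown below, assuming a conda environment:

 conda install -c conda-forge scrapy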

Terminal

 pip install scrapy
 cat > myspider.py <<EOF


import scrapy

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://www.zyte.com/blog/']

    def parse(self, response):
        for title in response.css('.oxy-post-title'):
            yield {'title': title.css('::text').get()}

        for next_page in response.css('a.next'):
            yield response.follow(next_page, self.parse)
EOF
 scrapy runspider myspider.py
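
The run above prints the scraped items to the log. To write them to a file instead, the same command can use Scrapy's feed exports, for example (the output filename is illustrative; the format is inferred from the extension):

 scrapy runspider myspider.py -o titles.jsonl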


Build and run your web spiders

Terminal

 pip install shub
 shub login
Insert your Zyte Scrapy Cloud API Key: <API_KEY>

# Deploy the spider to Zyte Scrapy Cloud
 shub deploy

# Schedule the spider for execution
 shub schedule blogspider 
Spider blogspider scheduled, watch it running here:
https://app.zyte.com/p/26731/job/1/8

# Retrieve the scraped data
 shub items 26731/1/8


{"title": "Improved Frontera: Web Crawling at Scale with Python 3 Support"}
{"title": "How to Crawl the Web Politely with Scrapy"}
...

Deploy them to Zyte Scrapy Cloud

or use Scrapyd to host the spiders on your own server
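
For the self-hosted route, a minimal Scrapyd sketch, assuming a Scrapy project with a [deploy] target configured in its scrapy.cfg (project and spider names are placeholders):

Terminal

 pip install scrapyd scrapyd-client
 scrapyd &
# Package the project and upload it to the local Scrapyd instance
 scrapyd-deploy
# Schedule a run through Scrapyd's JSON API
 curl http://localhost:6800/schedule.json -d project=default -d spider=blogspider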


FAST AND POWERFUL

write the rules to extract the data and let Scrapy do the rest
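
The rules are plain CSS or XPath selectors, which can be tried interactively in Scrapy's shell before they go into a spider; the URL and selectors below are the ones from the example above:

Terminal

 scrapy shell 'https://www.zyte.com/blog/'
>>> response.css('.oxy-post-title ::text').getall()
>>> response.css('a.next::attr(href)').get()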


EASILY EXTENSIBLE

extensible by design, plug new functionality easily without having to touch the core
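
For example, custom behaviour usually plugs in through extension points such as item pipelines, spider middlewares and downloader middlewares. A minimal sketch of an item pipeline that drops items with no title, enabled through the ITEM_PIPELINES setting (module and project names are illustrative):

# pipelines.py
from scrapy.exceptions import DropItem

class RequireTitlePipeline:
    """Drop any scraped item that is missing a 'title' field."""

    def process_item(self, item, spider):
        if not item.get('title'):
            raise DropItem('missing title')
        return item

# settings.py -- activate the pipeline without touching Scrapy's core
ITEM_PIPELINES = {
    'myproject.pipelines.RequireTitlePipeline': 300,
}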


PORTABLE, PYTHON

written in Python and runs on Linux, Windows, Mac and BSD


HEALTHY COMMUNITY

 * 43,100 stars, 9,600 forks and 1,800 watchers on GitHub
 * 5,500 followers on Twitter
 * 18,000 questions on Stack Overflow


WANT TO KNOW MORE?

 * Discover Scrapy at a glance
 * Meet the companies using Scrapy
