Scraping Real Estate Data from Zillow | Scraping Real Estate Data Using Python

January 22, 2021

Scraping real estate data is a practical way for agents and sellers to keep track of available listings. Data scraped from real estate websites such as Zillow.com can help you adjust the prices of your own listings or build a business database. In this tutorial, we will use Python to extract real estate data from Zillow, searching for listings by zip code.

Follow these steps to scrape Zillow:

Build the URL of a Zillow search results page. For example: https://www.zillow.com/homes/02126_rb/

Download the HTML of the search results page using Python Requests.

Parse the page using LXML, which lets you navigate the HTML tree structure with XPath.

Save the data to a CSV file.
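The download and parse steps above can be sketched as follows. This is a minimal sketch, not the original zillow.py: the function names, the User-Agent header, the sample markup, and the XPath selectors are all assumptions made for illustration, since Zillow's real page structure is not shown here and may change over time.

```python
# Hypothetical markup standing in for a search results page.
SAMPLE_HTML = """
<html><body>
  <ul>
    <li class="listing">
      <a class="title" href="/homedetails/example-1/">House for sale</a>
      <span class="price">$450,000</span>
    </li>
    <li class="listing">
      <a class="title" href="/homedetails/example-2/">Condo for sale</a>
      <span class="price">$299,900</span>
    </li>
  </ul>
</body></html>
"""


def fetch_html(url):
    """Download a page and return its HTML as text."""
    import requests  # third-party package: pip install requests

    # The User-Agent value is an assumption; adjust it as needed.
    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper)"}
    response = requests.get(url, headers=headers, timeout=30)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.text


def parse_listings(page_html):
    """Extract title, price and URL from each listing using XPath."""
    from lxml import html  # third-party package: pip install lxml

    tree = html.fromstring(page_html)
    listings = []
    # These selectors match SAMPLE_HTML only, not Zillow's real markup.
    for node in tree.xpath("//li[@class='listing']"):
        listings.append({
            "title": node.xpath("string(.//a[@class='title'])").strip(),
            "price": node.xpath("string(.//span[@class='price'])").strip(),
            "url": node.xpath("string(.//a[@class='title']/@href)"),
        })
    return listings


if __name__ == "__main__":
    for row in parse_listings(SAMPLE_HTML):
        print(row)
```

To use it against a live page, you would pass the result of fetch_html(url) to parse_listings and replace the selectors with ones that match the page you actually receive.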

We will scrape the following data fields from Zillow:

  • Title
  • Street Name
  • City
  • Zip Code
  • State
  • Price
  • Features and Facts
  • Property Provider
  • URL
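One way to keep these fields together is a small record type, sketched below. The class name and attribute names are hypothetical (the tutorial does not specify column names); they simply mirror the list above.

```python
from dataclasses import asdict, dataclass, fields


@dataclass
class PropertyListing:
    """One scraped listing; attribute names are hypothetical column names."""
    title: str = ""
    street_name: str = ""
    city: str = ""
    zip_code: str = ""  # kept as text so a leading zero (02126) survives
    state: str = ""
    price: str = ""
    facts_and_features: str = ""
    property_provider: str = ""
    url: str = ""


# Column order for a CSV header, taken from the dataclass definition.
COLUMNS = [f.name for f in fields(PropertyListing)]

if __name__ == "__main__":
    example = PropertyListing(title="House for sale", zip_code="02126")
    print(COLUMNS)
    print(asdict(example))
```

Storing every value as text keeps the scraped data unchanged; prices can be converted to numbers later if needed.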

[Screenshot: some of the data fields that we will scrape from Zillow]

Required Tools

Install Python 3 and Pip

To install Python 3 on Linux, follow this guide:
http://docs.python-guide.org/en/latest/starting/install3/linux/

Mac users can follow this guide: http://docs.python-guide.org/en/latest/starting/install3/osx/

Web Scraping Packages

For this tutorial, we need a few Python 3 packages to download and parse the HTML. Here are the package requirements:

PIP, to install the following packages in Python (https://pip.pypa.io/en/stable/installing/)

Python Requests, to make requests and download the HTML content of the pages (http://docs.python-requests.org/en/master/user/install/)

Python LXML, to parse the HTML tree structure using XPath (installation instructions: http://lxml.de/installation.html)

The Code

First, we need to build the URL of the search results page. We create this URL manually from the zip code and then scrape the results from the page. For instance, here is the one used for Boston: https://www.zillow.com/homes/02126_rb/
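A minimal sketch of building that URL from a zip code is shown below. The function name build_search_url and the five-digit validation are assumptions; the URL pattern itself is taken from the Boston example.

```python
import re


def build_search_url(zipcode):
    """Return the search results URL for a 5-digit US zip code."""
    # The zip code is handled as text so a leading zero (02126) is kept.
    if not re.fullmatch(r"\d{5}", zipcode):
        raise ValueError(f"expected a 5-digit zip code, got {zipcode!r}")
    return f"https://www.zillow.com/homes/{zipcode}_rb/"


if __name__ == "__main__":
    print(build_search_url("02126"))
```

Passing the zip code as an integer would drop the leading zero, which is why the sketch accepts only strings.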

Run the Zillow Scraper

Assume the script is named zillow.py. When you run it in a terminal or command prompt with the -h flag, it prints the following help message:

usage: zillow.py [-h] zipcode sort

positional arguments:
  zipcode
  sort        available sort orders are :
              newest : Latest property details
              cheapest : Properties with cheapest price

optional arguments:
  -h, --help  show this help message and exit
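A command-line interface like this could be defined with Python's argparse module, as in the sketch below. This is an assumption about how zillow.py might declare its arguments, not its actual source; in particular, restricting sort to the two listed values via choices is a choice made here, and the exact help layout varies between Python versions.

```python
import argparse

SORT_HELP = (
    "available sort orders are :\n"
    "newest : Latest property details\n"
    "cheapest : Properties with cheapest price"
)


def build_parser():
    """Build a parser that takes a zip code and a sort order."""
    parser = argparse.ArgumentParser(
        prog="zillow.py",
        # Keep the line breaks in SORT_HELP when printing the help text.
        formatter_class=argparse.RawTextHelpFormatter,
    )
    # Parsed as text by default, so a leading zero (02126) is preserved.
    parser.add_argument("zipcode")
    parser.add_argument(
        "sort",
        choices=["newest", "cheapest"],
        metavar="sort",
        help=SORT_HELP,
    )
    return parser
```

Calling build_parser().parse_args() in the script then gives you args.zipcode and args.sort; an unknown sort order makes argparse print an error and exit.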

Run the Zillow scraper with Python, passing the zip code and the sort order as arguments. The sort argument accepts either ‘newest’ or ‘cheapest’. For example, to get the newest properties for sale in Boston, Massachusetts, run the script like this:

python3 zillow.py 02126 newest

This creates a CSV file named properties-02126.csv in the same folder as the script. Here is some sample data extracted from Zillow.com by the above command:

[Screenshot: sample data]
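Writing the output file can be sketched with the standard csv module, as below. The function name write_csv and its parameters are hypothetical; only the properties-<zipcode>.csv naming comes from the tutorial.

```python
import csv
from pathlib import Path


def write_csv(zipcode, rows, fieldnames, folder="."):
    """Write listing dictionaries to properties-<zipcode>.csv."""
    path = Path(folder) / f"properties-{zipcode}.csv"
    # newline="" lets the csv module control line endings itself.
    with path.open("w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return path
```

For example, write_csv("02126", rows, ["title", "price", "url"]) would create properties-02126.csv in the current folder, with one line per listing below the header.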

Limitations

The Zillow scraper can extract real estate listings for most of the zip codes you provide, but not necessarily all of them. If you need professional assistance with scraping real estate data, just contact us!