How to Scrape Flipkart Product Data with Beautiful Soup and Python?

Publish Date

May 14, 2021

Author

Scraping Intelligence

In this blog, we will see how to scrape Flipkart product data using BeautifulSoup and Python in a simple and elegant manner.

The aim of this blog is to get you started on solving a practical problem while keeping things simple, so that you get familiar, practical results as quickly as possible.

The first thing we need is Python 3. If you don't have it yet, install Python 3 before continuing. Then install Beautiful Soup along with the requests and lxml packages that the code below relies on:

pip3 install beautifulsoup4 requests lxml

Once they are installed, open your editor and type in:

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests

Now visit the Flipkart listing page and examine what information we can get.

Now let's get back to the code. Let's fetch the page while pretending to be a browser, like this:

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests
import re

# Send a browser-like User-Agent so the request looks like it comes from a browser
headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.flipkart.com/mobile-accessories/power-banks/pr?sid=tyy,4mr,fu6&otracker=categorytree&otracker=nmenu_sub_Electronics_0_Power Banks'

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')
print(soup)  # print the fetched HTML

Save this as scrapeFlipkart.py and run it:

python3 scrapeFlipkart.py

You will be able to see the full HTML page.

Now, use CSS selectors to get the data you need. To do that, open Chrome's Inspect tool.

We see that each product's information is wrapped in an element with the attribute data-id. You will also notice that the attribute's value is gibberish and keeps changing. But the clue is the presence of the data-id attribute itself. That is all we need. So let's select on that.
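The fetch step can also be wrapped in a small helper. This is only a sketch: the names `build_request` and `fetch_html` and the 10-second timeout are assumptions made for illustration, not part of the original script. Preparing the request separately makes it possible to check the headers and the encoded URL without touching the network.

```python
import requests

HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 "
        "(KHTML, like Gecko) Version/9.0.2 Safari/601.3.9"
    )
}
URL = (
    "https://www.flipkart.com/mobile-accessories/power-banks/pr"
    "?sid=tyy,4mr,fu6&otracker=categorytree"
    "&otracker=nmenu_sub_Electronics_0_Power Banks"
)


def build_request(url):
    """Prepare (but do not send) a GET request with browser-like headers."""
    return requests.Request("GET", url, headers=HEADERS).prepare()


def fetch_html(url, timeout=10):
    """Send the prepared request and return the page text.

    Hypothetical helper: raises requests.HTTPError on a 4xx/5xx response
    instead of silently returning an error page.
    """
    with requests.Session() as session:
        response = session.send(build_request(url), timeout=timeout)
    response.raise_for_status()
    return response.text


prepared = build_request(URL)
print(prepared.url)  # the space in "Power Banks" is percent-encoded here
# html = fetch_html(URL)  # uncomment to perform the real network request
```

Checking the status code up front helps distinguish a blocked or failed request from a page that simply has no products.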

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests
import re

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.flipkart.com/mobile-accessories/power-banks/pr?sid=tyy,4mr,fu6&otracker=categorytree&otracker=nmenu_sub_Electronics_0_Power Banks'

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

# Every product container carries a data-id attribute
for item in soup.select('[data-id]'):
    try:
        print('----------------------------------------')
        print(item)
    except Exception as e:
        # raise e
        pass

This prints the content of each container that holds the product information.

Now let's work out how to get each field we need. This is interesting because Flipkart's HTML has no meaningful CSS class names we can use. So we will resort to a few tricks that are dependable.
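To try the selector without hitting the live site, here is a minimal offline sketch. The markup is hypothetical and only mimics the structure described above; it uses Python's built-in `html.parser` so that lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics the listing structure described above.
SAMPLE_HTML = """
<div class="row">
  <div data-id="PWBX1"><a href="/item-one/p/itm1"><img alt="Power Bank One"></a></div>
  <div data-id="PWBX2"><a href="/item-two/p/itm2"><img alt="Power Bank Two"></a></div>
  <div class="banner">Sponsored banner without a data-id</div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# "[data-id]" matches on the presence of the attribute,
# so the random-looking values do not matter.
cards = soup.select("[data-id]")
card_ids = [card["data-id"] for card in cards]
print(len(cards), card_ids)
```

The banner element is skipped because it has no data-id attribute, which is why selecting on the attribute's presence works even though its values keep changing.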

print(item.select('a img')[0]['alt'])
print(item.select('a')[0]['href'])

The first line above gives us the product name from the image's alt text, and the second gives us the URL of the listing.

For the rating, we can use the *= attribute selector to select any element whose id contains the term productRating, like this:
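If the printed href values turn out to be site-relative paths (an assumption worth checking against your own output), they can be turned into full links with the standard library. The helper name `absolute_url` and the sample path are made up for illustration.

```python
from urllib.parse import urljoin

BASE_URL = "https://www.flipkart.com"


def absolute_url(href):
    """Resolve a possibly relative href against the site root."""
    return urljoin(BASE_URL, href)


# Hypothetical relative path, not a real product link.
print(absolute_url("/sample-power-bank/p/itm123?pid=ABC"))
```

`urljoin` leaves an href that is already absolute unchanged, so the helper is safe to apply to every link.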

print(item.select('[id*=productRating]')[0].get_text().strip())

Extracting the price is more interesting, as it has no visible ID or class name to use as a hint. But it always contains the currency symbol ₹.
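Indexing with `[0]` raises an IndexError when a product has no rating yet. A hedged alternative is `select_one`, which returns None when nothing matches. The markup and the helper name `rating_of` below are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the first product has a rating, the second does not.
SAMPLE_HTML = """
<div data-id="A1"><div id="productRating_A1_x"><div>4.3</div></div></div>
<div data-id="B2"><span>No rating yet</span></div>
"""


def rating_of(item):
    """Return the rating text, or None if no id contains 'productRating'."""
    node = item.select_one('[id*="productRating"]')
    return node.get_text().strip() if node is not None else None


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
ratings = [rating_of(item) for item in soup.select("[data-id]")]
print(ratings)
```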

prices = item.find_all(string=re.compile('₹'))
print(prices[0])

We do the same to get the discount.

discounts = item.find_all(string=re.compile('off'))
print(discounts[0])
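The matched strings are still plain text. Note that the pattern `off` also matches words such as "offer", so a stricter pattern can help. The sketch below converts both values into numbers; it assumes formats like "₹1,299" and "35% off", and the helper names are made up for illustration.

```python
import re


def parse_price(text):
    """Return the rupee amount in a string like '₹1,299' as an int, or None."""
    match = re.search(r"₹\s*(\d[\d,]*)", text)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def parse_discount(text):
    """Return the percentage in a string like '35% off' as an int, or None."""
    match = re.search(r"(\d+)\s*%\s*off", text, re.IGNORECASE)
    return int(match.group(1)) if match is not None else None


print(parse_price("₹1,299"), parse_discount("35% off"))
print(parse_discount("Special offer"))  # no percentage, so this is None
```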

Put everything together

# -*- coding: utf-8 -*-
from bs4 import BeautifulSoup
import requests
import re

headers = {'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9'}
url = 'https://www.flipkart.com/mobile-accessories/power-banks/pr?sid=tyy,4mr,fu6&otracker=categorytree&otracker=nmenu_sub_Electronics_0_Power Banks'

response = requests.get(url, headers=headers)
soup = BeautifulSoup(response.content, 'lxml')

for item in soup.select('[data-id]'):
    try:
        print('----------------------------------------')
        # print(item)
        print(item.select('a img')[0]['alt'])    # product name
        print(item.select('a')[0]['href'])       # listing URL
        print(item.select('[id*=productRating]')[0].get_text().strip())  # rating
        prices = item.find_all(string=re.compile('₹'))
        print(prices[0])
        discounts = item.find_all(string=re.compile('off'))
        print(discounts[0])
    except Exception as e:
        # raise e
        pass
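One limitation of the script above is that a single missing field (for example, a product with no rating) raises an exception and skips the remaining fields for that product. A possible variation, sketched here on hypothetical markup with made-up helper names, extracts each field independently and stores None where nothing is found.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical markup: the second product has no rating and no discount.
SAMPLE_HTML = """
<div data-id="A1">
  <a href="/power-bank-one/p/itm1"><img alt="Power Bank One"></a>
  <div id="productRating_A1">4.3</div>
  <div>₹1,299</div>
  <div><span>35% off</span></div>
</div>
<div data-id="B2">
  <a href="/power-bank-two/p/itm2"><img alt="Power Bank Two"></a>
  <div>₹899</div>
</div>
"""


def first_text(item, pattern):
    """Return the first text node matching the regex, stripped, or None."""
    node = item.find(string=re.compile(pattern))
    return node.strip() if node is not None else None


def extract(item):
    """Collect the fields of one product into a dict, tolerating gaps."""
    image = item.select_one("a img")
    link = item.select_one("a")
    rating = item.select_one('[id*="productRating"]')
    return {
        "title": image.get("alt") if image is not None else None,
        "url": link.get("href") if link is not None else None,
        "rating": rating.get_text().strip() if rating is not None else None,
        "price": first_text(item, "₹"),
        "discount": first_text(item, r"\d+% off"),
    }


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
products = [extract(item) for item in soup.select("[data-id]")]
for product in products:
    print(product)
```

Returning dictionaries instead of printing also makes it easier to save the results later, for example to a CSV or JSON file.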

If you need to scale up your scraping and don't want to set up your own infrastructure, you can use our Flipkart product data crawler to scrape millions of URLs at high speed.

If you are looking for the best Flipkart Data Scraping Services, then you can contact Scraping Intelligence for all your queries.

About the author

Zoltan Bettenbuk

Zoltan Bettenbuk is the CTO of ScraperAPI - helping thousands of companies get access to the data they need. He’s a well-known expert in data processing and web scraping. With more than 15 years of experience in software development, product management, and leadership, Zoltan frequently publishes his insights on our blog as well as on Twitter and LinkedIn.
