How e-commerce teams can use web scraping to monitor prices in near real-time


If you're running your own online business or work within an e-commerce team, you know how important it is to competitively price your products and/or services.

According to the experts at BigCommerce, almost 90% of online shoppers in the USA make their buying decisions based on price, rating it as the most important factor. 

According to a study by the Pew Research Center, over 85% will compare prices before making a purchase. Ultimately, just under four in five online shoppers will opt for the best bargain.

This means it's vital that your business is aware not only of competitors' prices but also any discounts or special offers they're currently running. You also need to have a clear idea of exactly what they're selling, so you aren't comparing apples with oranges.

In this guide, you'll learn how e-commerce teams can use web scraping to monitor prices in near real-time to beat the competition.



Reader Offer: Exclusive 50% discount
Get an exclusive 50% discount for all Smartproxy eCommerce scraping API subscriptions with code TECHRADAR.

Preferred partner

What is Price Monitoring? 

Price monitoring is simply the activity of checking market prices for certain products or services as often as possible. Businesses most often use it to check competitors' pricing. 

In theory, you could hire an e-commerce team to work around the clock and monitor every product on rival websites, but this is extremely costly and time-consuming. This is where price scraping comes in. 

What is Price Scraping?

Web scraping is simply the activity of extracting data from a website for analysis. In theory, if you copy and paste flight times from an airport's website, you've 'scraped' the information.

As the name suggests, price scraping is the process of extracting pricing information from competitors' websites. You can do this using specialist scripts and automated tools to gather the pricing data you need from other websites, then export it to a format for your e-commerce team to analyze later. You can also configure a specialist API (Application Programming Interface) to monitor prices in real time and adjust those on your site if necessary.
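To make the idea concrete, here's a minimal sketch of price extraction in Python. The URL, product page structure, and CSS class names are hypothetical; on a real competitor site you'd inspect the page to find the right selectors.

```python
# Minimal price-scraping sketch using BeautifulSoup. The HTML below stands
# in for a fetched competitor page; class names are made up for this example.
from bs4 import BeautifulSoup

# In practice you'd fetch the page first, e.g.:
#   import requests
#   html = requests.get("https://example-competitor.com/product/123").text
html = """
<div class="product">
  <h2 class="product-name">Wireless Mouse</h2>
  <span class="product-price">$24.99</span>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
name = soup.select_one(".product-name").get_text(strip=True)
price_text = soup.select_one(".product-price").get_text(strip=True)
price = float(price_text.lstrip("$"))  # normalise "$24.99" -> 24.99

print(name, price)  # Wireless Mouse 24.99
```

The same few lines, pointed at each competitor product page, are the core of most price-scraping scripts; everything else in this guide is about running them reliably and politely.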


Monitoring pricing via scraping has a number of benefits to your business, including:

Build Market Share through lowest pricing

It stands to reason that if you know exactly how much your competitors are charging for their products or services you can adjust your prices to make sure they represent the best value for money.

Although you might make less profit on individual products, if customers sign up to your site, you'll be in a better position to offer them other products/services you sell via online ads, e-mail marketing and so on.

Increase Prices on Low Competition Products

Another advantage of constantly monitoring competitors' prices is that you can identify which of your products have minimal competition, then maximize this advantage by raising prices accordingly. This works best when your business sells particularly original or novelty items.

Discover Competitive Products

Price scraping isn't just about extracting raw numbers from a website. It's a way to discover any new products that your competitors have placed on sale. If a high-sale, low-margin product enters the market, you need to be the first to know so you can offer it too. Similarly if a product is failing to sell well, you can avoid making the same mistake as your competitors by not investing in it. 

Timing Promotions

Running promotions by offering 2-for-1 deals or a percentage off your chosen product is par for the course in online business. Still, there's no sense in running a promotion for a product that's already more competitively priced than competitors' equivalents. Similarly, if your promotion doesn't offer better value for money than others available elsewhere, it's unlikely to generate any sales. 

Web & price scraping is an excellent way to monitor competitors to check what promotions they're running (if any), then price your business' products accordingly.

How to get the most from price scraping 

If you're now sold on the idea of using price scraping to monitor competitors' products and costs, you need to develop a strategy that'll make sure you're scraping in the most efficient way possible. 

Best practices for price scraping include:

Choose the Right Software

There are any number of scraping tools available to help your e-commerce team, but you'll need to choose one that's the right fit for your business. You can better inform your decision by making a list of the competitor websites you want to monitor, as well as considering whether you need real-time (preferable) or periodic updates.

Open-source tools like BeautifulSoup and Scrapy are very cost-effective, but you'll need to be familiar with coding Python scripts to make full use of them. If you want to scrape a very popular website like Amazon Marketplace, you may be able to configure an existing script to your needs.

If your e-commerce team isn't comfortable with coding, there are a number of commercial solutions available such as import.io or Octoparse, which have intuitive user interfaces and helpful customer support. 

Some providers like Octoparse offer a free plan, but you'll probably need to have a subscription to obtain useful amounts of pricing data. 

Determine your Data

Once you've chosen which websites you want to target and have a scraping tool in mind, you'll need to decide what data you want to scrape. Naturally this will vary depending on your e-commerce team's needs but in general terms you should focus on:

Price Index

The most basic use of a price scraping tool should be to monitor prices of goods or services over a set period of time. You can program your bot to monitor and record prices at set intervals and/or a particular time of day (see below).
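A price index only exists if every observation is recorded somewhere. One simple sketch, assuming a CSV log file of our own choosing, is to append a timestamped row each time the bot checks a price:

```python
# Sketch: build a price index by appending each observed price with a
# UTC timestamp to a CSV log. The file name and column layout are our
# own choices, not a standard.
import csv
from datetime import datetime, timezone

def record_price(path, product, price):
    """Append one (timestamp, product, price) observation to a CSV log."""
    with open(path, "a", newline="") as f:
        csv.writer(f).writerow(
            [datetime.now(timezone.utc).isoformat(), product, price]
        )

# One observation per scheduled check builds the index over time.
record_price("price_index.csv", "wireless-mouse", 24.99)
```

Because each run only appends, the same function works whether your scheduler fires hourly or daily.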

Product Availability

For limited items, your bot should always track how many are left in stock. Rapidly falling stock signals strong demand, so as we learned earlier, this is an extremely good way to spot new products that are likely to sell well and avoid those that are hard to shift.

Special Offers

If your competitors are running promotions on certain products, you can use this information to do the same. Similarly if there's a particular product or service they're failing to promote well, your e-commerce team can exploit this by running special offers on your own website.

Use IP rotation

As useful as competitors' pricing data can be to you as an e-commerce team member, repeated scraping can cause an increase in web traffic. This is why many websites have safeguards built in to prevent price scraping.

One of the most common ways this is done is by checking the IP address of devices and blocking those that make repeated requests. 

You can get around issues like these by making use of proxy servers. These essentially act as a gateway between your scraping tools and the website, so your scraping bot can connect via a different IP address each time.

In the context of scraping, there are two main proxy server types: ISP/Residential proxies and Datacenter proxies. You can read more about the differences between the two in our online guide, but in the context of price scraping we recommend using an ISP proxy.

This is because in order to monitor prices in real-time, you'll need your scraping tool to make repeated requests for data. ISP proxies can rotate the IP address used each time you connect to the site. 
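The rotation itself can be as simple as cycling through a pool of proxy endpoints so each request leaves from a different address. In this sketch the proxy URLs are placeholders; a paid provider gives you real endpoints (or a single gateway that rotates for you):

```python
# Sketch of IP rotation: cycle through a pool of proxy endpoints so each
# request uses a different exit address. The credentials and hostnames
# below are placeholders for whatever your proxy provider issues.
from itertools import cycle

PROXY_POOL = cycle([
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
    "http://user:pass@proxy3.example.com:8000",
])

def next_proxies():
    """Return a requests-style proxies dict using the next pool entry."""
    proxy = next(PROXY_POOL)
    return {"http": proxy, "https": proxy}

# Usage (not executed here -- requires a live proxy):
#   import requests
#   r = requests.get("https://example-competitor.com", proxies=next_proxies())

print(next_proxies()["http"])  # proxy1 on the first call
```

With a rotating-gateway service, the pool collapses to a single URL and the provider handles the cycling server-side.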

In theory you can set up a proxy manually but it's much simpler to subscribe to a dedicated service. Platforms like Smartproxy even have integrated support for popular scraping tools like Octoparse. 


Set up an automated scheduler

When it comes to price scraping, you'll need to decide how often you want your tool of choice to check prices. If this is too frequent, you run the risk of your bot being blocked. However if the bot fails to update the prices quickly enough, you may be missing out on vital data.

Most web scraping tools allow you to alter the frequency at which prices are gathered, e.g. from daily to hourly. You can also set specific times to check certain sites, such as during business hours. (That said, some sites are less likely to block bots that try to scrape data during off-peak hours.)

Automating the process in this way makes gathering pricing data much more efficient.
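A simple polling loop covers most scheduling needs. The sketch below checks prices at a base interval with a little random jitter, so the request pattern looks less mechanical; `check_price` is a placeholder for your actual scraping call:

```python
# Sketch of a polite polling scheduler: repeat a price check at a base
# interval, with +/- 20% random jitter so requests don't land at exactly
# predictable times. The interval and jitter values are our own choices.
import random
import time

CHECK_INTERVAL = 60 * 60  # base interval: one hour, in seconds

def next_delay(base=CHECK_INTERVAL, jitter=0.2):
    """Return the base interval adjusted by up to +/- 20% random jitter."""
    return base * (1 + random.uniform(-jitter, jitter))

def run(check_price, iterations, base=CHECK_INTERVAL):
    """Call check_price the given number of times, sleeping in between."""
    for _ in range(iterations):
        check_price()
        time.sleep(next_delay(base))
```

For production use you'd more likely hand the same cadence to cron or your scraping tool's built-in scheduler, but the jitter idea carries over either way.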

Identify Trends

Assuming that your price scraping tool is now gathering relevant data, you need to put it to work. If your scraping tool uses an API it may be able to automatically populate a database containing historical pricing data.

This is most likely the best way that your e-commerce team will be able to track changes to prices and product availability, such as during seasonal periods like Christmas.

The best way to present this data is through visualizations such as pie charts or line/candlestick graphs. If your scraping tool can export information to a spreadsheet format, e.g. CSV, these are very easy to create using popular software like Microsoft Excel or LibreOffice Calc.
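Even before it reaches a spreadsheet, a scraped price log can surface trends with a few lines of standard-library Python. The rows below stand in for a real scrape log:

```python
# Sketch: summarise a scraped price log (here an in-memory CSV sample)
# to spot a trend before opening a spreadsheet. The products and prices
# are invented for illustration.
import csv
import io
from statistics import mean

log = io.StringIO(
    "timestamp,product,price\n"
    "2024-12-01,wireless-mouse,24.99\n"
    "2024-12-08,wireless-mouse,22.49\n"
    "2024-12-15,wireless-mouse,19.99\n"
)

prices = [float(row["price"]) for row in csv.DictReader(log)]
print(f"avg={mean(prices):.2f} min={min(prices)} max={max(prices)}")
# A falling minimum and average over successive weeks suggests a
# competitor is discounting ahead of a seasonal peak.
```

Swap the in-memory sample for your real CSV file and group by product, and the same pattern scales to a full catalogue.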

This is also an excellent way to uncover your competitors' own pricing strategies, such as how they alter their own prices in response to demand for a product. This is particularly useful when deciding if and when to set up promotions and discounts for your own business.

With enough data, you may be able to use a machine learning tool like XGBoost for predictive modeling. In other words when properly configured and with enough valid data, it could actually predict pricing trends, giving you an edge over your competitors.

Is Price Scraping Legal?

In general terms, it's not illegal to gather publicly available pricing data from websites, provided doing so doesn't violate their terms and conditions. One excellent way to stay on the right side of other retailers is to examine the 'robots.txt' file for the website in question: this specifies which parts of the site bots are allowed to access, and may include a crawl delay indicating how often your bot should make requests.
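Python's standard library can read robots.txt for you, answering both "may I fetch this path?" and "what crawl delay is requested?". The rules below are a made-up example of what a site might serve:

```python
# Sketch: check a site's robots.txt before scraping, using the standard
# library. The rules below are an invented example of a site that blocks
# bots from /checkout/ and asks for a 10-second gap between requests.
from urllib.robotparser import RobotFileParser

robots_txt = """\
User-agent: *
Disallow: /checkout/
Crawl-delay: 10
"""

rp = RobotFileParser()
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("my-price-bot", "/products/mouse"))   # True
print(rp.can_fetch("my-price-bot", "/checkout/basket"))  # False
print(rp.crawl_delay("my-price-bot"))                    # 10
```

In a live scraper you'd call `rp.set_url("https://example-competitor.com/robots.txt")` followed by `rp.read()` instead of parsing a string, then honour `crawl_delay` in your scheduler.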

There have been cases of companies being sued for gathering pricing data, but this is usually when doing so has placed an unfair burden on the target website's traffic. Play safe by respecting the website's robots.txt. 

If your target website has an API for accessing data like this, make sure to configure your bot to use it. Not only is this less likely to get your bot banned but it's also a much more efficient way to obtain real-time pricing data. Major websites like Amazon and Google Shopping have APIs for this purpose.

Pricing Pitfalls 

While price scraping is an excellent way for your e-commerce team to gather useful business intelligence, it can go wrong very quickly. This can happen if your target websites change their layout even slightly: your scraping bot may then make invalid data requests, which can result in it being banned.

As we've learned, many websites also have bot countermeasures, such as CAPTCHA challenges designed to deter automated traffic. 

One good way to make your bot's traffic seem more legitimate is to use an ISP/Residential proxy, as these work through real devices. You can also reduce your chance of being banned by making sure your scraping tool requests data no faster than a human being could. CAPTCHA challenges can also be resolved by having your bot use a CAPTCHA-solving service.

You should also make sure to read through our guide on how to crawl or scrape a website without being blocked.

Bottom Line

Web scraping is an excellent way to gather pricing data for similar products/services offered by your competitors. If set up correctly, it can automatically gather large amounts of relevant information which you can use to monitor prices in real time, then set your own pricing strategies.

Before getting started, think carefully about what websites you want to target and what data you wish to gather. You should also check the target sites' Terms and Conditions and use a proxy to reduce the chance of your scraping tool being blocked.

Nate Drake is a tech journalist specializing in cybersecurity and retro tech. He broke out from his cubicle at Apple 6 years ago and now spends his days sipping Earl Grey tea & writing elegant copy.