Screen scraping: how to stop the internet's invisible data leeches

Is your data safe?

Data is your business's most valuable asset, so it's never a good idea to let it slip into the hands of competitors.

Sometimes, however, that can be difficult to prevent because of an automated technique known as 'screen scraping', which has for years provided a way of extracting data from web pages so that it can be indexed over time.
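To make the technique concrete, here is a minimal sketch of what a scraper does once it has fetched a page: it walks the HTML and pulls out the fields it cares about. The markup, the class names `name` and `price`, and the helper `scrape_prices` are all hypothetical and chosen purely for illustration; real pages and real scrapers vary widely.

```python
from html.parser import HTMLParser

# Hypothetical product page markup (an assumption for this example).
SAMPLE_PAGE = """
<html><body>
  <div class="product"><span class="name">Widget</span><span class="price">9.99</span></div>
  <div class="product"><span class="name">Gadget</span><span class="price">24.50</span></div>
</body></html>
"""


class PriceScraper(HTMLParser):
    """Collects (name, price) pairs from spans with class 'name' or 'price'."""

    def __init__(self):
        super().__init__()
        self._field = None   # which span we are currently inside, if any
        self._current = {}   # fields gathered for the product being read
        self.products = []

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "span" and cls in ("name", "price"):
            self._field = cls

    def handle_data(self, data):
        # Only keep text that sits inside one of the spans of interest.
        if self._field:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span" and self._field:
            self._field = None
            # Once both fields are present, emit the product and start over.
            if "name" in self._current and "price" in self._current:
                self.products.append(
                    (self._current["name"], float(self._current["price"]))
                )
                self._current = {}


def scrape_prices(html):
    """Return a list of (name, price) tuples found in the given HTML."""
    parser = PriceScraper()
    parser.feed(html)
    return parser.products
```

Run repeatedly against a live site, a loop around something like `scrape_prices` is enough to build up a history of a competitor's prices, which is why the technique is so hard to distinguish from ordinary browsing.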

eBay introduced an API in 2004 to combat screen scraping (credit: homerjoe426)

TRP: Have there been any recent developments in competitive screen scraping?

AS: In contrast, developments in competitive screen scraping over the past few years are not necessarily so welcome. It is OK for a site to be scraped by a search engine crawler if the crawler's visits are infrequent.

It is also OK for a site to be the target of a price comparison scraper if the information obtained is used fairly. However, as the number of specialized search engines continues to increase and the frequency of price-check visits skyrockets, these automated page views can rise to levels that impact the intended operation of the target site.
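One way a site owner might put a number on "too frequent" is to count each client's page views over a sliding time window. The sketch below is an illustration only, not a method described in the interview: the class name `RequestRateMonitor`, the client identifier and the thresholds are all assumptions, and a real deployment would need to choose how clients are identified and what limit is fair.

```python
from collections import defaultdict, deque


class RequestRateMonitor:
    """Flags clients whose page views exceed a limit within a time window."""

    def __init__(self, max_requests, window_seconds):
        self.max_requests = max_requests      # hypothetical threshold
        self.window_seconds = window_seconds  # hypothetical window length
        self._hits = defaultdict(deque)       # client_id -> recent timestamps

    def record(self, client_id, timestamp):
        """Record one page view; return True if the client is over the limit."""
        hits = self._hits[client_id]
        hits.append(timestamp)
        # Drop page views that have fallen out of the window.
        while hits and hits[0] <= timestamp - self.window_seconds:
            hits.popleft()
        return len(hits) > self.max_requests


# Usage: allow at most 3 page views per client in any 60-second window.
monitor = RequestRateMonitor(max_requests=3, window_seconds=60)
flags = [monitor.record("client-a", t) for t in (0, 1, 2, 3)]
```

In this sketch the fourth view inside the window is the first to be flagged; what the site then does with a flagged client (slow it down, block it, or point it at an API) is a separate decision.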

More specifically, if the target site is the victim of competitive scraping, the information obtained can be used to undermine the site owner's business, for example by undercutting prices, beating odds, aggressively acquiring event tickets or reserving inventory.

Kane Fulton
Kane has been fascinated by the endless possibilities of computers since first getting his hands on an Amiga 500+ back in 1991. These days he mostly lives in the realm of VR, where he's working his way into the world Paddleball rankings in Rec Room.