Can you scrape Yelp?

What does Yelp say?

Can you scrape Yelp? A cursory look at the Google search results for this query might have you believe that you cannot. The first result that pops up is Yelp’s support page, which deters scraping enthusiasts with a straightforward “No, Yelp does not allow any scraping of the site…”.

Well, I would have been satisfied with this answer if it really answered the question we’re asking here. Whether or not Yelp “allows” us to scrape their site doesn’t really matter. Scraping public information remains a fundamental right of every internet user. The real question is whether it can be done.

The answer to that question is a straightforward “Yes, with some persistence, you can.” Go ahead and paste the following command into your terminal. What response do you see?

curl https://www.yelp.com/biz/crowne-plaza-hy36-midtown-manhattan-new-york

You will most likely see a huge blob of HTML on your terminal’s standard output. If you do, you did in fact just scrape Yelp. Well, kind of.
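If you would rather stay in Python than in the terminal, the same request might look like the sketch below. It uses only the standard library. The User-Agent string and the function names build_request and fetch_html are illustrative assumptions, and Yelp may still decline the request.

```python
import urllib.request

# The listing URL from the curl command above.
URL = "https://www.yelp.com/biz/crowne-plaza-hy36-midtown-manhattan-new-york"


def build_request(url: str) -> urllib.request.Request:
    """Build a GET request with an explicit User-Agent header.

    The header value is an assumption for illustration, not something
    Yelp documents or requires.
    """
    headers = {"User-Agent": "Mozilla/5.0 (compatible; example-scraper/0.1)"}
    return urllib.request.Request(url, headers=headers)


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Download the page and return its body as text."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")
```

Calling fetch_html(URL) performs the actual download and returns the same blob of HTML that curl prints, provided the request is not blocked.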

Why (and why not) scrape Yelp?

Here are a few good reasons to extract data from Yelp. They range from academic research to improving your customers’ experience.

  • Tracking the performance and customer reviews of businesses to gain insights into different business categories.
  • Analyzing reviews with machine learning models to understand customers’ language and gain insights into various aspects of their experience.
  • Verifying the authenticity and estimating the performance of a business in an automated manner. This could be very useful to the credit and insurance industries.
  • Adding Yelp reviews to your internal dashboards for monitoring and marketing purposes.

Since we’re listing good reasons to scrape Yelp even though Yelp does not “allow” it, we should also say that the following are BAD reasons to scrape Yelp, and we do not endorse them.

  • Building blackhat SEO websites: copying and publishing Yelp reviews on your own ghost website without attribution.
  • Hitting Yelp’s servers with requests at an unreasonably high rate. To avoid this, keep to a healthy rate limit.
  • Artificially inflating advertisement impressions where there is a conflict of interest.

How to scrape Yelp?

For developers

If you’re a developer and have experience with Python, I would recommend jumping right into BeautifulSoup. If you prefer JavaScript, cheerio is worth a look. These libraries make it easy to select the elements you are looking for using HTML document selectors such as classes or other attributes.
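To make "selecting elements by class" concrete, here is a minimal sketch. To stay dependency-free it uses Python's built-in html.parser rather than BeautifulSoup, and the markup and class names (review, author, comment) are made up for illustration. Yelp's real markup is different and changes often.

```python
from html.parser import HTMLParser

# Hypothetical markup: these class names are NOT Yelp's real ones.
SAMPLE_HTML = """
<div class="review"><span class="author">Ana</span><p class="comment">Great stay.</p></div>
<div class="review"><span class="author">Ben</span><p class="comment">Clean rooms, slow check-in.</p></div>
"""

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ClassTextExtractor(HTMLParser):
    """Collect the text of every element that carries a given class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.results = []
        self._depth = 0    # greater than 0 while inside a matching element
        self._buffer = []  # text pieces of the element being collected

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._depth:
            self._depth += 1
        elif self.target_class in classes:
            self._depth = 1
            self._buffer = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            self.results.append("".join(self._buffer).strip())

    def handle_data(self, data):
        if self._depth:
            self._buffer.append(data)


def select_text(html, css_class):
    """Return the text of all elements in `html` that have `css_class`."""
    parser = ClassTextExtractor(css_class)
    parser.feed(html)
    return parser.results
```

Here select_text(SAMPLE_HTML, "comment") returns the two review texts. With BeautifulSoup installed, the same selection shrinks to roughly BeautifulSoup(html, "html.parser").select(".comment"), which is why a dedicated library is the more comfortable choice for real pages.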

Your scraper will request the HTML page in an automated way and select the elements you’re looking for. You can then run a loop over the URLs you wish to scrape and store the data in the format you desire. This should work at a small scale, but Yelp will soon notice that there’s something fishy about your requests, and you will see a page telling you that you’re not allowed to visit Yelp anymore.
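The loop described above could be sketched roughly as follows. The function names scrape_all and to_csv, the two-second default delay, and the idea of passing in your own fetch and parse functions are all assumptions made for illustration. The pause between requests is only a courtesy and will not, on its own, get you past a block.

```python
import csv
import io
import time
from typing import Callable, Dict, Iterable, List


def scrape_all(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    parse: Callable[[str], Dict[str, str]],
    delay_seconds: float = 2.0,
) -> List[Dict[str, str]]:
    """Fetch and parse each URL in turn, pausing between requests."""
    rows = []
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)  # keep the request rate modest
        try:
            html = fetch(url)
        except OSError as error:  # urllib's URLError and HTTPError subclass OSError
            print(f"skipping {url}: {error}")
            continue
        row = {"url": url}
        row.update(parse(html))
        rows.append(row)
    return rows


def to_csv(rows: List[Dict[str, str]]) -> str:
    """Serialize rows as CSV text; assumes every row has the same keys."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Any function that takes a URL and returns HTML can serve as fetch, and any function that turns HTML into a dictionary of fields can serve as parse. That keeps the loop itself unchanged when you swap in a different HTTP client or parser.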

What must you do then? Using a tool designed to do this is the quickest way, unless you’re experienced with rotating web proxies and the stack that’s required to bypass modern anti-bot systems—in which case you would probably not be reading this post in the first place.

For non-developers

Using an existing tool that does this for you is the most cost-effective solution to this problem. It’s not uncommon for experienced developers to use a tool as well, rather than investing their own time in writing and maintaining a parser for Yelp.

This is where we step in. Unwrangle offers multiple solutions for extracting data from Yelp. The Yelp Reviews Scraper lets you scrape all the reviews for any business listing on Yelp: it takes a listing URL and makes all the reviews available as CSV or JSON. The Yelp Search API lets you extract data from Yelp’s search results pages with a simple GET request: it takes a keyword, a location and a page number, and responds with the results in JSON format. In both cases, the inputs are similar to what you would enter when browsing Yelp manually.
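As a rough illustration of what "a simple GET request" with those three inputs looks like, the sketch below builds a query URL. The endpoint and the parameter names (keyword, location, page) are placeholders invented for this example, not Unwrangle's actual API. Any authentication the service requires is also left out, so check the official API documentation for the real values.

```python
from urllib.parse import urlencode

# Placeholder endpoint: an assumption for illustration, NOT the real API URL.
SEARCH_ENDPOINT = "https://api.example.com/yelp/search"


def build_search_url(keyword: str, location: str, page: int = 1) -> str:
    """Build the GET URL for one page of search results.

    The parameter names are hypothetical; only the three inputs
    (keyword, location, page number) come from the description above.
    """
    query = urlencode({"keyword": keyword, "location": location, "page": page})
    return f"{SEARCH_ENDPOINT}?{query}"
```

urlencode takes care of escaping spaces and punctuation, so a location such as "New York, NY" can be passed in as it is.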

The Yelp Reviews Scraper can be used with our no-code application. You can sign up here for a free trial and start collecting business reviews from Yelp with a few clicks.

As an example, here’s a link to a dataset with more than 20,000 reviews from popular hotels in Las Vegas that we created with our own API.

Our service will let you scrape millions of reviews from Yelp, Google Maps and other websites without worrying about anti-bot systems. We want to make it easier for you to uncover insights from your customers’ feedback. If you have any questions, please write to us at support@unwrangle.com.