How to Scrape More than 100 Reviews from Amazon?

With the increasing importance of product reviews in influencing purchase decisions, the demand for scraping reviews from online marketplaces like Amazon has surged. However, Amazon recently limited the number of reviews one can view to just 100. But what if you need more data? In this article, we’ll explore a workaround using Unwrangle.com’s Amazon Product Reviews API and how you can potentially fetch up to 500 reviews.

Why the Limitation?

Amazon has implemented a limit, restricting users from viewing beyond the 10th page of product reviews, which equates to approximately 100 reviews. This has posed a challenge for researchers, marketers, and developers who rely on comprehensive review data for their projects.

Since mid-July 2023, Amazon has displayed 0 reviews beyond page 10 for every product on its website.

The Workaround: Exploiting Filters

While it’s true that the direct view is limited, there’s a nifty trick you can employ. Amazon allows users to filter reviews based on different criteria, such as star ratings, helpfulness, or recency. Each filter can display up to 100 reviews, providing an avenue to extract a larger dataset.

Here’s how you can use Unwrangle.com’s API to take advantage of this:

1. Choose Your Product

Identify the Amazon product for which you need reviews and obtain its URL.

2. Initialize the API Call

Make a GET request to Unwrangle’s endpoint: /api/getter/?platform=amazon_reviews

For instance, to fetch reviews for an iPhone 13 listing, you’d use:

curl -v -L 'https://data.unwrangle.com/api/getter/?platform=amazon_reviews&url=https%3A%2F%2Fwww.amazon.com%2FApple-iPhone-13-128GB-Blue%2Fdp%2FB09LNX6KQS%2F&filter_by_star=all_stars&api_key=API_KEY'

Note: Always add the -L option to curl so that it follows any redirects.
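
The same request can be assembled in Python. The following is a minimal sketch using only the standard library. The helper names build_reviews_url and fetch_reviews are illustrative and not part of Unwrangle’s tooling, and API_KEY is a placeholder for your own key.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "https://data.unwrangle.com/api/getter/"


def build_reviews_url(product_url, api_key, filter_by_star="all_stars"):
    """Build the request URL, percent-encoding the product URL."""
    params = {
        "platform": "amazon_reviews",
        "url": product_url,
        "filter_by_star": filter_by_star,
        "api_key": api_key,
    }
    return BASE_URL + "?" + urlencode(params)


def fetch_reviews(request_url):
    """Send the GET request and decode the JSON body.

    urlopen follows redirects by default, like curl -L.
    """
    with urlopen(request_url, timeout=30) as response:
        return json.load(response)


request_url = build_reviews_url(
    "https://www.amazon.com/Apple-iPhone-13-128GB-Blue/dp/B09LNX6KQS/",
    api_key="API_KEY",  # placeholder, replace with a real key
)
print(request_url)
```

The printed URL is the same one used in the curl command above. Calling fetch_reviews(request_url) with a real key would perform the request and return the parsed response.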

3. Apply Filters

Apply Amazon’s review filters to the same product URL one at a time, repeating the API call for each filter. This lets you collect reviews for each star rating, sorted by helpfulness or recency.

Here are the query parameters you can utilize to apply filters:

  • filter_by_star: Filter reviews by star rating, e.g. all_stars, five_star, four_star, etc.
  • sort_by: Sort reviews by ‘recent’, ‘helpful’, etc.
  • page: Specify the page number to navigate through paginated results.
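
To see how the filters add up, here is a rough sketch of a request plan. It assumes that the remaining star filters follow the naming pattern of the examples above (three_star, two_star and one_star are extrapolated, not confirmed here) and that each filter exposes at most 10 pages of 10 reviews. The function name plan_requests is illustrative.

```python
from itertools import product

# five_star and four_star appear in the list above; the other three are
# assumed to follow the same naming pattern.
STAR_FILTERS = ["five_star", "four_star", "three_star", "two_star", "one_star"]
MAX_PAGES = 10         # Amazon shows nothing beyond page 10 per filter
REVIEWS_PER_PAGE = 10  # roughly 100 reviews across 10 pages


def plan_requests(sort_by="recent"):
    """Return one set of query parameters per (star filter, page) pair."""
    return [
        {"filter_by_star": star, "sort_by": sort_by, "page": page}
        for star, page in product(STAR_FILTERS, range(1, MAX_PAGES + 1))
    ]


plan = plan_requests()
print(len(plan), "requests, up to", len(plan) * REVIEWS_PER_PAGE, "reviews")
```

Five star filters with 10 pages each give 50 requests, which is where the ceiling of roughly 500 reviews comes from. Repeating the plan with a different sort_by value may surface additional reviews, though some of them will overlap.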

4. Compile the Data

Aggregate the results from each API call into a single, comprehensive dataset of reviews. The following JSON is the response to the example API call above.

{
    "success": true,
    "url": "https://www.amazon.com/Apple-iPhone-13-128GB-Blue/dp/B09LNX6KQS",
    "page": 1,
    "total_results": 2256,
    "no_of_pages": 226,
    "result_count": 10,
    "reviews": [
        {
            "id": "R3C7EMIUQ2O4UZ",
            "date": "2024-04-04",
            "author_name": "Noelia García Vicente",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AFXNT4KDBML67PMTBJ6XWU22NI6A/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 1,
            "review_title": "1.0 out of 5 stars\nProducto inservible",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R3C7EMIUQ2O4UZ/",
            "review_text": "He devuelto el producto porque no se puede utilizar por las condiciones en las que está. No recomiendo comprarlo reacondicionado, si no es con mejores garantías.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R30LZTKM09XXY6",
            "date": "2024-04-04",
            "author_name": "Smmusa",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AGV5SVBK7TRSYDUB6M3QKM3CFTEA/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 1,
            "review_title": "1.0 out of 5 stars\nNot in excellent condition",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R30LZTKM09XXY6/",
            "review_text": "Arrived with a screen problem . Not in excellent condition as advertised. Problem on-left side screen showing a pixel. Check picture.Returned it!",
            "review_imgs": [
                "https://m.media-amazon.com/images/I/61G5wmATGHL._SL1600_.jpg"
            ],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "RJEMYPZ33M923",
            "date": "2024-04-03",
            "author_name": "Robert",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AEL6WQC3QPEXYQJKDXHZ57KUUFEA/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 5,
            "review_title": "5.0 out of 5 stars\nLooks like new",
            "review_url": "https://www.amazon.com/gp/customer-reviews/RJEMYPZ33M923/",
            "review_text": "This was a gift for my grand daughter. She loves it and says it works great. She's a teen, and they love their phones, so that's a really big endorsement. I would definitely recommend this vendor.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "RH558IZDCGKBY",
            "date": "2024-04-03",
            "author_name": "Spacecowboy",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AEITSH74TL3S7X5UZUSJJX3J6MKQ/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 5,
            "review_title": "5.0 out of 5 stars\nVery good product and experience",
            "review_url": "https://www.amazon.com/gp/customer-reviews/RH558IZDCGKBY/",
            "review_text": "I bought this as a replacement phone and it meets all my expectations. The phone is in very good condition with no visible scratches or damage, and the battery life is 98%. I would purchase a renewed phone from here in the future with no hesitation.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R2SBQ9GPTVYZKV",
            "date": "2024-04-03",
            "author_name": "R. Sanchez Mx",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AGR4W76CGZ74UFQDHFKJSJVA7KNA/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 4,
            "review_title": "4.0 out of 5 stars\nGood but.. no charger included",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R2SBQ9GPTVYZKV/",
            "review_text": "Iphone came with 90% of battery life and no charger. Seller stated it was a mistake from Amazon which is not fare at all. All other features are in very good condition.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R30YM1DVPEAQF9",
            "date": "2024-04-03",
            "author_name": "Marc Robaczynski",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AEFZAXXV4SWCTOXQSACJIJIV7IKA/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 2,
            "review_title": "2.0 out of 5 stars\nPoor battery life for \"excellent condition\" iphone 13",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R30YM1DVPEAQF9/",
            "review_text": "I received the iphone 13 in nice shape, but its battery life is only 81% which is low considering the price point of this phone versus other sellers.  I'd expect this battery life to be closer to 85-88%.  I will be returning.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R1SRZIKWUZKU9L",
            "date": "2024-04-03",
            "author_name": "Lt.",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AGW3PJ4GILLV4WF3SG3YZYCBRQZA/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 5,
            "review_title": "5.0 out of 5 stars\nMuch cheaper alternative to the I phonec15",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R1SRZIKWUZKU9L/",
            "review_text": "Truth be told, I have a fellow Veteran that needed a newer phone. He wanted to buy the newest I Phone and just about fell over from the price. So we found this refurbished I Phone on Amazon, ordered it and a case and well thats all she wrote! Happy",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R2LVEKPMHOAZIX",
            "date": "2024-04-03",
            "author_name": "Dez",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AEW5TBMA4GVKNT4PLJ4SA5JNV2AA/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 1,
            "review_title": "1.0 out of 5 stars\nScreen was unresponsive",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R2LVEKPMHOAZIX/",
            "review_text": "Received in visually good condition. But the screen was unresponsive around all edges.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R3IFECR4MZBHIB",
            "date": "2024-04-02",
            "author_name": "JM28",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AENPWUBU5N3RI5AVSAGWGPLTJPGQ/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 3,
            "review_title": "3.0 out of 5 stars\nGood",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R3IFECR4MZBHIB/",
            "review_text": "The phone overall was in great condition the outer part of the camera lens was a little scratched up. But the screen was fine just one light scratch but great. Although the battery was not great as it didn’t seem to last that long as I expected and charging it would take too long compared to my last phone.",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        },
        {
            "id": "R13C41VHGP3Y7Z",
            "date": "2024-04-02",
            "author_name": "Melissa",
            "author_url": "https://www.amazon.com/gp/profile/amzn1.account.AFDN3QBZILFQTRJOTPFRQJAOANZQ/ref=cm_cr_arp_d_gw_btm?ie=UTF8",
            "rating": 5,
            "review_title": "5.0 out of 5 stars\nWorks great",
            "review_url": "https://www.amazon.com/gp/customer-reviews/R13C41VHGP3Y7Z/",
            "review_text": "I purchased this for my daughter. She loves it. Works perfectly!",
            "review_imgs": [],
            "meta_data": {
                "verified_purchase": true
            },
            "location": "United States"
        }
    ],
    "meta_data": {
        "total_ratings": 7104,
        "rating_distribution": {
            "5 star": "68%",
            "4 star": "13%",
            "3 star": "5%",
            "2 star": "3%",
            "1 star": "11%"
        }
    },
    "remaining_credits": 614630
}
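
Because the same review can come back under more than one filter or sort order, it helps to de-duplicate while aggregating. The sketch below is one possible approach, assuming every response has the shape shown above (a "success" flag and a "reviews" list whose items carry an "id"). The function name merge_reviews and the trimmed sample dictionaries are illustrative.

```python
def merge_reviews(responses):
    """Combine the "reviews" lists of several responses, one copy per review id."""
    merged = {}
    for response in responses:
        if not response.get("success"):
            continue  # skip failed calls
        for review in response.get("reviews", []):
            # setdefault keeps the first copy seen and ignores later duplicates
            merged.setdefault(review["id"], review)
    return list(merged.values())


# Trimmed-down stand-ins for two API responses; only the fields used are kept.
page_a = {"success": True, "reviews": [
    {"id": "R3C7EMIUQ2O4UZ", "rating": 1},
    {"id": "RJEMYPZ33M923", "rating": 5},
]}
page_b = {"success": True, "reviews": [
    {"id": "RJEMYPZ33M923", "rating": 5},  # also returned by another filter
    {"id": "R2SBQ9GPTVYZKV", "rating": 4},
]}

all_reviews = merge_reviews([page_a, page_b])
print([review["id"] for review in all_reviews])
```

Keying on the review id keeps the merged dataset free of duplicates regardless of the order in which the filtered calls were made.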

In Conclusion

While Amazon’s new limitation might seem restrictive, with the right tools and methods, you can still access a rich dataset of reviews. Unwrangle.com’s Amazon Product Reviews API is a powerful ally in this quest, turning challenges into opportunities for richer data extraction.

Can you scrape Yelp?

What does Yelp say?

Can you scrape Yelp? A cursory look at the Google search results for this query might lead you to believe that you cannot. The first result that pops up is Yelp’s support page, which deters scraping enthusiasts with a straightforward “No, Yelp does not allow any scraping of the site….”

Well, I would have been satisfied with this answer if it really answered the question we’re asking here. Whether Yelp “allows” us to scrape their site doesn’t really matter. Scraping public information remains a fundamental right of every internet user.

The real question is whether you can, and the answer to that is a straightforward “Yes, with some persistence you can”. Go ahead and paste the following command into your terminal. What response do you see?

curl https://www.yelp.com/biz/crowne-plaza-hy36-midtown-manhattan-new-york

You should most likely see a huge blob of HTML on your terminal’s standard output. If you do, you did in fact just scrape Yelp. Well, kind of.

Why (and why not) scrape Yelp?

Here are a few good reasons to extract data from Yelp. They range from academic research to enhancing your customer experience.

  • Tracking the performance and customer reviews of businesses to gain insights into different business categories.
  • Analyzing reviews with machine learning models to understand customers’ language and gain insights into various aspects of their experience.
  • Verifying the authenticity and estimating the performance of a business in an automated manner. This could be very useful to the credit and insurance industries.
  • Adding Yelp reviews to your internal dashboards for monitoring and marketing purposes.

Having listed some good reasons to scrape Yelp even though they do not “allow” it, we also think that the following are BAD reasons to scrape Yelp, and we do not endorse these applications.

  • Blackhat SEO websites – copying and publishing Yelp reviews to your own ghost website without attribution.
  • Sending requests to Yelp’s servers at an unreasonably high rate. To avoid this, maintain a healthy rate limit.
  • Artificially increasing impressions on advertisements where there is a conflict of interest.

How to scrape Yelp?

For developers

If you’re a developer with Python experience, I would recommend jumping right into BeautifulSoup. If you prefer JavaScript, cheerio is worth a look. These libraries make it easy to select the elements you are looking for using HTML document selectors such as classes or other attributes.

Your scraper will request the HTML page in an automated way and select the elements you’re looking for. You can then run a loop over the URLs you wish to scrape and store the data in the format you desire. This should work at a small scale, but soon Yelp will notice that there’s something fishy about your requests, and you will see a page telling you that you’re not allowed to visit Yelp anymore.
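
As a rough illustration of the "select the elements" step, here is a small sketch that uses Python’s built-in html.parser as a dependency-free stand-in for BeautifulSoup. The sample markup and the class name review-text are invented for this example. Yelp’s real markup is different and changes over time, so treat the selector as a placeholder.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collect the text of every <p> whose class attribute contains a target class."""

    def __init__(self, target_class):
        super().__init__()
        self.target_class = target_class
        self.texts = []
        self._capturing = False

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "p" and self.target_class in classes:
            self._capturing = True
            self.texts.append("")  # start a new text entry

    def handle_endtag(self, tag):
        if tag == "p":
            self._capturing = False

    def handle_data(self, data):
        if self._capturing:
            self.texts[-1] += data


# Hypothetical markup; a real page would come from an HTTP request.
SAMPLE_HTML = """
<div><p class="review-text">Great stay, friendly staff.</p>
<p class="other">Ignore me</p>
<p class="review-text highlight">Rooms were small.</p></div>
"""

parser = ClassTextExtractor("review-text")
parser.feed(SAMPLE_HTML)
print(parser.texts)
```

In a real scraper, the same extraction would run inside a loop over the URLs you want, with a pause between requests (for example time.sleep) to keep the request rate reasonable.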

What must you do then? Using a tool designed to do this is the quickest way, unless you’re experienced with rotating web proxies and the stack that’s required to bypass modern anti-bot systems—in which case you would probably not be reading this post in the first place.

For non-developers

Using an existing tool that does this for you is the most cost-effective solution to this problem. It’s not uncommon for experienced developers to use a tool as well, rather than investing their own time in manually writing and maintaining a parser for Yelp.

This is where we step in. Unwrangle’s Yelp Search API and Yelp Reviews API let you query search results and business reviews from Yelp with a simple GET request. The inputs are similar to what you would provide when browsing Yelp manually: the search endpoint requires a keyword, a location and a page number, and the reviews endpoint requires the listing URL and a page number.

If you are not used to working with APIs, you can also use our self-service application to get all reviews for any business on Yelp without any code. Sign up here to get a free trial and start scraping Yelp in seconds.

As an example, here’s a link to a dataset with more than 20,000 reviews from popular hotels in Las Vegas that we created with our own API.

Our API will let you scrape millions of reviews from Yelp and other websites without worrying about anti-bot systems, so that you can focus on finding insights in your customers’ feedback. We’d love to hear from you. Write to us at support@unwrangle.com.