Bug: URL scraper not getting specific item metadata from Amazon #94

Closed
gearopolis opened this issue Jan 19, 2024 · 3 comments · Fixed by #95
Labels
bug Something isn't working

Comments

@gearopolis

Describe the bug
When I add a new item sold on Amazon.com to a wish list from my PC, the scraper does not get the correct item name, price, or image. This does not seem to be an issue with URLs from other online stores. I also did not see this bug when adding the same item from my phone.

To Reproduce
Steps to reproduce the behavior:

  1. On a computer, go to 'My Wishes' in the UI
  2. Click on the 'round plus' icon in the bottom right
  3. In the popup interface, paste an Amazon URL into the top field, 'Item URL'
    I used this
  4. Click out of or tab out of the field
  5. See that the 'Item Name', 'Price' and 'Image URL' fields populate incorrectly (the image is a CAPTCHA image)

Expected behavior
The fields should populate correctly, as they do with other online stores.

Screenshots
image

image

Desktop (please complete the following information):

  • OS: Windows 10
  • Browser: Firefox
  • Version: 121.01

Smartphone (please complete the following information):

  • Device: Google Pixel 8 Pro
  • OS: Android 14
  • Browser: Firefox 121.01

Additional context
As mentioned in the description, I do not experience this bug on my phone. I tried copying the link from the Amazon app and pasting it in, and it worked. Because that link appeared to have been run through a URL shortener, I also browsed to the same product, copied the URL from the browser, and added it to the wish list; that worked as expected too.

@gearopolis added the bug label Jan 19, 2024
@cmintey
Owner

cmintey commented Jan 19, 2024

Yeah, unfortunately Amazon doesn't like web scraping all that much, so if they detect it they will sometimes present a captcha, in which case we can't do anything. I already have a retry in place: if we are presented with a captcha, we try again, and that fixes it sometimes but not always.
I don't know of an easy workaround that doesn't make the scraper more complex, which I want to avoid. I will make it so that if we can't get past the captcha, the user is shown a message and no data is filled into the form.
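
For anyone following along, the behavior described above (retry when a captcha comes back, then give up with a message and leave the form empty) could look roughly like the sketch below. This is an illustration only, not the project's actual code: the names `scrape_with_retry`, `looks_like_captcha`, `ScrapeResult`, and `CAPTCHA_MARKERS`, the marker strings, and the wording of the message are all assumptions, and the page fetcher and metadata extractor are passed in as plain callables so the sketch stays self-contained.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical marker strings; a real captcha page may look different.
CAPTCHA_MARKERS = ("captcha", "enter the characters you see")


@dataclass
class ScrapeResult:
    metadata: Optional[dict] = None  # item name, price, image URL, ...
    message: Optional[str] = None    # shown to the user when scraping fails


def looks_like_captcha(html: str) -> bool:
    """Return True if the page text contains any assumed captcha marker."""
    lowered = html.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)


def scrape_with_retry(
    url: str,
    fetch: Callable[[str], str],
    extract: Callable[[str], dict],
    max_attempts: int = 2,
) -> ScrapeResult:
    """Fetch the page up to max_attempts times, stopping at the first non-captcha page."""
    for _ in range(max_attempts):
        html = fetch(url)
        if not looks_like_captcha(html):
            return ScrapeResult(metadata=extract(html))
    # Every attempt hit a captcha: return no metadata so the form stays empty.
    return ScrapeResult(
        message="Could not read item details from this URL. Please fill in the form manually."
    )
```

The caller would fill the form only when `metadata` is set and otherwise show `message`, so a captcha image never ends up in the 'Image URL' field.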

@gearopolis
Author

That is a great workaround. It is not surprising that Amazon would limit that functionality. I was thinking about sharing this with the rest of the family, and I think the bug could be a blocker for some of my less tech-savvy family members. Thanks again for sharing and for the great work!

@gearopolis
Author

I just circled back around to this and it works as described. Thanks!
