File "/Users/rdm/Dev/funda-scraper-2022/funda/spiders/funda_spider.py", line 32, in parse_dir_contents
postal_code = re.search(r'\d{4} [A-Z]{2}', title).group(0)
AttributeError: 'NoneType' object has no attribute 'group'
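It will say: <h1 class="fd-h1 fd-m-none">Je bent bijna op de pagina die je zoekt</h1> <p class="fd-text-size-l--bp-m fd-color-dark-3 fd-m-bottom-none fd-m-top fd-p-right-6xl--bp-m">We houden ons platform graag veilig en spamvrij. Daarom moeten we soms verifiëren dat onze bezoekers echte mensen zijn.</p> (In English: "You are almost at the page you are looking for. We like to keep our platform safe and spam-free. That is why we sometimes have to verify that our visitors are real people.")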
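To tell the two failure modes apart, a spider could check a response for this verification page before parsing. This is only a rough sketch: the helper name `is_verification_page` is hypothetical and not part of the scraper, and it assumes the heading text quoted above is stable enough to use as a marker.

```python
# Heading text quoted from the verification page above; assumed to be a stable marker.
VERIFICATION_MARKER = "Je bent bijna op de pagina die je zoekt"


def is_verification_page(html: str) -> bool:
    """Return True if the HTML looks like the 'verify you are human' page."""
    return VERIFICATION_MARKER in html


blocked = '<h1 class="fd-h1 fd-m-none">Je bent bijna op de pagina die je zoekt</h1>'
listing = '<title>Huis te koop: Voorbeeldstraat 1 1234 AB Amsterdam</title>'
print(is_verification_page(blocked), is_verification_page(listing))
```

If this returns True for a response, the failure is anti-scraping related rather than a parsing bug.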
Obviously they try to block scrapers, but I have definitely seen this scraper work with some modifications in the config.
As per my original comment: it works on most homes. About 1 out of 20 fails due to HTML parsing, not due to captcha/blacklist/anti-scrape stuff.
The scraper definitely works, but the parser sometimes has issues.
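For what it's worth, the `AttributeError` comes from calling `.group(0)` on the `None` that `re.search` returns when the title has no match. A minimal sketch of a guard, assuming it is acceptable to skip the postal code for such homes (the helper name `extract_postal_code` is made up for illustration and is not in the spider):

```python
import re
from typing import Optional

# Same pattern as in the traceback: four digits, a space, two capital letters.
POSTAL_CODE_RE = re.compile(r'\d{4} [A-Z]{2}')


def extract_postal_code(title: str) -> Optional[str]:
    """Return the postal code found in the title, or None if there is none."""
    match = POSTAL_CODE_RE.search(title)
    # re.search returns None on no match, so check before calling .group().
    return match.group(0) if match else None


print(extract_postal_code("Huis te koop: Voorbeeldstraat 1 1234 AB Amsterdam"))
print(extract_postal_code("Je bent bijna op de pagina die je zoekt"))
```

The caller can then log the URL and skip or partially fill the item when the result is `None`, instead of crashing on that home.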
The traceback above happens on some homes, not all.