
Banyule Victoria AU not working #302

Open
Boabee opened this issue Jul 17, 2022 · 5 comments
Labels
source defect Defect of a source reported

Comments

@Boabee

Boabee commented Jul 17, 2022

I may be using the integration incorrectly, but I think there may be an issue: the local council recently introduced another bin, so the calendar may no longer pull the data the same way.

My waste collection calendar is as follows:

waste_collection_schedule:
  sources:
    - name: banyule_vic_gov_au
      args:
        street_address: an address in, IVANHOE
      customize:
        - type: recycling
          alias: Fogo
          show: true
          icon: mdi:recycle
          picture: false
      calendar_title: Recycling

@mampfes added the source defect label Jul 23, 2022
@ravngr
Contributor

ravngr commented Jul 28, 2022

I authored that source. I'm not sure if the council has changed the interface for the new bins; in any case it's a bit moot at the moment, since it seems OpenCities (Banyule and a number of other councils have outsourced their websites to them) have recently implemented anti-scraping features on that API.

The anti-scraping protection is a little nasty. In my testing the response I get from the API without some magic cookies redirects to a JavaScript file that's heavily obfuscated. PR #250 was reverted in #256 for this reason. In my original PR (#160) we discussed sharing some common code for OpenCities-sourced APIs - since they seem identical - but now it means there are probably a range of sources that are broken by one feature change on their end.
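Given that the challenge page is obfuscated JavaScript rather than JSON, a source could at least fail loudly instead of producing a confusing parse error further down. A minimal sketch (the function name and error message are mine, not from the actual source):

```python
# Hedged sketch: fail loudly when the OpenCities anti-scraping layer answers
# with its obfuscated JavaScript page instead of the expected JSON payload.
# The function name and error message are illustrative, not from the source.
import json


def parse_waste_response(text: str) -> dict:
    """Return the parsed JSON payload, or raise if we got a challenge page."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # The challenge is served as obfuscated JavaScript, so a JSON decode
        # failure is the simplest signal that the scraping protection kicked in.
        raise RuntimeError(
            "Response is not JSON; likely the anti-scraping challenge page"
        )
```

This wouldn't bypass anything, but it would make breakage across the shared OpenCities sources much easier to diagnose from the logs.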

@dt215git
Contributor

It looks like this can be made to work if you just make the final API call using the geolocationid directly.

import json
from datetime import datetime

import requests
from bs4 import BeautifulSoup

URL = "https://www.banyule.vic.gov.au/ocapi/Public/myarea/wasteservices"
HEADERS = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/112.0",
    "referer": "https://www.banyule.vic.gov.au/Waste-environment/Bin-collection",
}
GEOLOC = "4f7ebfca-1526-4363-8b87-df3103a10a87"  # borrowed from banyule_vic_gov_au.py
PARAMS = {
    "geolocationid": GEOLOC,
    "ocsvclang": "en-AU",
}

s = requests.Session()
r = s.get(URL, headers=HEADERS, params=PARAMS)

# The API returns JSON with an HTML fragment in "responseContent"
schedule = json.loads(r.text)
soup = BeautifulSoup(schedule["responseContent"], "html.parser")

# Each service has a "note" div (waste type) and a "next-service" div (date)
notes = soup.find_all("div", {"class": "note"})
services = soup.find_all("div", {"class": "next-service"})

waste_types = [item.text.strip() for item in notes]
next_dates = [
    datetime.strptime(item.text.strip(), "%a %d/%m/%Y").date() for item in services
]

print(list(zip(waste_types, next_dates)))

Output:
[('Food organics and garden organics', datetime.date(2023, 4, 17)), ('Recycling', datetime.date(2023, 4, 17)), ('Rubbish', datetime.date(2023, 4, 24))]

Implementing this change probably means anyone who was previously using it will have to change their config, and spend a few minutes extracting the geolocationid the website is using for their address. I'd assume that's acceptable if it's currently not working?
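For anyone hunting down their own geolocationid, the address search endpoint used by the existing source can be queried directly. This is a hedged sketch: the /api/v1/myarea/search path and the Items/Id response shape are assumptions based on how other OpenCities council sites behave, so verify them in your browser's network tab if it doesn't match.

```python
# Hedged sketch of looking up a geolocationid by street address.
# The endpoint path and response shape ({"Items": [{"Id": ...}]}) are
# assumptions, not confirmed against Banyule's current site.
import requests

SEARCH_URL = "https://www.banyule.vic.gov.au/api/v1/myarea/search"  # assumed path


def extract_geolocation_id(payload: dict) -> str:
    """Pull the first result's Id out of an assumed {"Items": [...]} payload."""
    items = payload.get("Items") or []
    if not items:
        raise ValueError("no matching address found")
    return items[0]["Id"]  # the geolocationid to put in the config


def find_geolocation_id(address: str) -> str:
    """Query the (assumed) address search endpoint and return the first Id."""
    r = requests.get(SEARCH_URL, params={"keywords": address}, timeout=10)
    r.raise_for_status()
    return extract_geolocation_id(r.json())
```

Of course, if the anti-scraping layer blocks this endpoint too, the fallback is simply reading the id out of the request the council's own bin-collection page makes.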

@mampfes
Owner

mampfes commented Apr 16, 2023

Ok, this sounds very interesting. I think changing the config is not a big issue, because no one uses this source.

@ravngr
Contributor

ravngr commented Apr 17, 2023

The source supports providing geolocation_id manually, thus skipping the address lookup step; an example is included in the source documentation. Otherwise, I think the only difference in the code from @dt215git is the headers? I experimented with copying all the headers from the browser at one stage, excluding the anti-scraping magic cookie, but couldn't bypass the anti-scraping once triggered. Maybe it will help avoid triggering it in the first place 🤷.
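For readers landing here later, a config that uses that option and skips the address lookup might look like the following. The id shown is the one hard-coded in the script earlier in this thread; substitute the id for your own address:

```yaml
waste_collection_schedule:
  sources:
    - name: banyule_vic_gov_au
      args:
        geolocation_id: 4f7ebfca-1526-4363-8b87-df3103a10a87  # your own id here
```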

From memory the issue is somewhat transient, coming and going at the whim of the back-end. I used the source for a few weeks before getting redirects almost 100% of the time. When it broke I assumed an update on their end, maybe it's been backed off since or I (and others) got unlucky?

@dt215git
Contributor

Same script now generates errors, so maybe I got lucky when looking at this last week.

4 participants