
Banyule Victoria AU not working #302

Open
Boabee opened this issue Jul 17, 2022 · 5 comments
Labels
source defect Defect of a source reported

Comments

@Boabee

Boabee commented Jul 17, 2022

I may be using the integration incorrectly, but I think there may be an issue: the local council recently introduced another bin, so the calendar may no longer pull the data the same way.

My waste collection calendar is as follows:

waste_collection_schedule:
  sources:
    - name: banyule_vic_gov_au
      args:
        street_address: an address in, IVANHOE
      customize:
        - type: recycling
          alias: Fogo
          show: true
          icon: mdi:recycle
          picture: false
      calendar_title: Recycling

@mampfes added the source defect label Jul 23, 2022
@ravngr
Contributor

ravngr commented Jul 28, 2022

I authored that source. I'm not sure if the council has changed the interface for the new bins; in any case it's a bit moot at the moment, since it seems OpenCities (Banyule and a number of other councils have outsourced their websites to them) have recently implemented anti-scraping features on that API.

The anti-scraping protection is a little nasty. In my testing the response I get from the API without some magic cookies redirects to a JavaScript file that's heavily obfuscated. PR #250 was reverted in #256 for this reason. In my original PR (#160) we discussed sharing some common code for OpenCities-sourced APIs - since they seem identical - but now it means there are probably a range of sources that are broken by one feature change on their end.
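Given that the challenge page is obfuscated JavaScript rather than JSON, a source could at least fail loudly instead of producing a confusing parse error further down. A minimal sketch (the function name and error message are mine, not from the actual source):

```python
# Hedged sketch: fail loudly when the OpenCities anti-scraping layer answers
# with its obfuscated JavaScript page instead of the expected JSON payload.
# The function name and error message are illustrative, not from the source.
import json


def parse_waste_response(text: str) -> dict:
    """Return the parsed JSON payload, or raise if we got a challenge page."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # The challenge is served as obfuscated JavaScript, so a JSON decode
        # failure is the simplest signal that the scraping protection kicked in.
        raise RuntimeError(
            "Response is not JSON; likely the anti-scraping challenge page"
        )
```

This wouldn't bypass anything, but it would make breakage across the shared OpenCities sources much easier to diagnose from the logs.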

@dt215git
Contributor

It looks like this can be made to work if you just make the final API call using the geolocationid directly.

import json
from datetime import datetime

import requests
from bs4 import BeautifulSoup

URL = "https://www.banyule.vic.gov.au/ocapi/Public/myarea/wasteservices"
HEADERS = {
    "user-agent": "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/112.0",
    "referer": "https://www.banyule.vic.gov.au/Waste-environment/Bin-collection",
}
GEOLOC = "4f7ebfca-1526-4363-8b87-df3103a10a87"  # borrowed from banyule_vic_gov_au.py
PARAMS = {
    "geolocationid": GEOLOC,
    "ocsvclang": "en-AU",
}

s = requests.Session()
r = s.get(URL, headers=HEADERS, params=PARAMS)

# The API returns JSON with an HTML fragment in "responseContent"
schedule = json.loads(r.text)
soup = BeautifulSoup(schedule["responseContent"], "html.parser")

# Each service has a "note" div (waste type) and a "next-service" div (date)
notes = soup.find_all("div", {"class": "note"})
services = soup.find_all("div", {"class": "next-service"})

waste_types = [item.text.strip() for item in notes]
next_dates = [
    datetime.strptime(item.text.strip(), "%a %d/%m/%Y").date() for item in services
]

print(list(zip(waste_types, next_dates)))

Output:
[('Food organics and garden organics', datetime.date(2023, 4, 17)), ('Recycling', datetime.date(2023, 4, 17)), ('Rubbish', datetime.date(2023, 4, 24))]

Implementing this change probably means anyone who was previously using it will have to change their config, and spend a few minutes extracting the geolocationid the website is using for their address. I'd assume that's acceptable if it's currently not working?
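For anyone hunting down their own geolocationid, the address search endpoint used by the existing source can be queried directly. This is a hedged sketch: the /api/v1/myarea/search path and the Items/Id response shape are assumptions based on how other OpenCities council sites behave, so verify them in your browser's network tab if it doesn't match.

```python
# Hedged sketch of looking up a geolocationid by street address.
# The endpoint path and response shape ({"Items": [{"Id": ...}]}) are
# assumptions, not confirmed against Banyule's current site.
import requests

SEARCH_URL = "https://www.banyule.vic.gov.au/api/v1/myarea/search"  # assumed path


def extract_geolocation_id(payload: dict) -> str:
    """Pull the first result's Id out of an assumed {"Items": [...]} payload."""
    items = payload.get("Items") or []
    if not items:
        raise ValueError("no matching address found")
    return items[0]["Id"]  # the geolocationid to put in the config


def find_geolocation_id(address: str) -> str:
    """Query the (assumed) address search endpoint and return the first Id."""
    r = requests.get(SEARCH_URL, params={"keywords": address}, timeout=10)
    r.raise_for_status()
    return extract_geolocation_id(r.json())
```

Of course, if the anti-scraping layer blocks this endpoint too, the fallback is simply reading the id out of the request the council's own bin-collection page makes.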

@mampfes
Owner

mampfes commented Apr 16, 2023

Ok, this sounds very interesting. I think changing the config is not a big issue, because no one uses this source.

@ravngr
Contributor

ravngr commented Apr 17, 2023

The source supports providing geolocation_id manually, thus skipping the address lookup step; an example is included in the source documentation. Otherwise, I think the only difference in the code from @dt215git is the headers? I experimented with copying all the headers from the browser at one stage, excluding the anti-scraping magic cookie, but couldn't bypass the anti-scraping once triggered. Maybe it will help avoid triggering it in the first place 🤷.
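For readers landing here later, a config that uses that option and skips the address lookup might look like the following. The id shown is the one hard-coded in the script earlier in this thread; substitute the id for your own address:

```yaml
waste_collection_schedule:
  sources:
    - name: banyule_vic_gov_au
      args:
        geolocation_id: 4f7ebfca-1526-4363-8b87-df3103a10a87  # your own id here
```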

From memory the issue is somewhat transient, coming and going at the whim of the back-end. I used the source for a few weeks before getting redirects almost 100% of the time. When it broke I assumed an update on their end, maybe it's been backed off since or I (and others) got unlucky?

@dt215git
Contributor

Same script now generates errors, so maybe I got lucky when looking at this last week.

4 participants