Releases: danieldotnl/ha-multiscrape
v7.2.0 🚀 Scraping HTML
New Feature: Raw HTML Scraping with Multiscrape 🌐
I'm excited to announce that it is now possible to scrape raw HTML in Multiscrape! This feature has been a recurring request over the years, and I'm happy I could finally implement it. 🎉
It could, for example, be used to display rich content on a Markdown card.
A new configuration option for selectors has been added, called extract. It is optional and can have these values:
- Text (default): Extracts plain text, as you are used to. 📝
- Content: Returns the content of the selected tag. 📜
- Tag: Returns both the content and the tag itself. 🏷️
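As a minimal sketch, a sensor using the new option might look like the following. The resource URL, sensor names, and selector are placeholders, and the lowercase spelling of the extract value is an assumption; check the README for the exact syntax.

```yaml
# Hypothetical example: URL, names, and selector are placeholders.
multiscrape:
  - resource: https://example.com/news
    scan_interval: 3600
    sensor:
      - unique_id: example_news_html
        name: Example news HTML
        select: "div.article > p"
        # Assumed lowercase value; "content" should return the inner HTML
        # of the selected tag, without the tag itself.
        extract: content
```

Home Assistant limits a sensor state to 255 characters, so longer HTML fragments are probably better stored in an attribute than in the state.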
With this feature, your sensors (or attributes) can now have a state/value like:
<p>This is an <b>example</b> of what can be scraped with the <i>extract</i> feature.</p>
Thank you for your continued support, and happy scraping! 🥳
Changes
- New feature to scrape and extract raw html @danieldotnl (#416)
- Update README.md @danieldotnl (#413)
- Update dependency pytest-homeassistant-custom-component to v0.13.158 @renovate (#414)
- Update dependency ruff to v0.6.3 - autoclosed @renovate (#410)
- Update dependency pytest-homeassistant-custom-component to v0.13.157 @renovate (#412)
- Update actions/setup-python action to v5.2.0 @renovate (#411)
- Update dependency pytest-homeassistant-custom-component to v0.13.156 @renovate (#408)
v7.1.2 ⭐ Cookies and form variables
I'm excited to share some awesome new features and improvements in this update! 🚀 Here’s what’s new:
✨ New Feature: Form Variables
A big shoutout to @jeremicmilan for his incredible dedication to this feature! 👏 I’ve added Form Variables to Multiscrape, allowing you to scrape the (token of a) page returned after logging in on some sites (specifically PHP). This token can then be sent in a header for authentication or other purposes. For all the details, make sure to check out the README! 📚
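A rough sketch of how form variables might be configured is shown below. Everything specific here is an assumption for illustration: the URLs, selectors, input fields, the variable name token, and the exact shape of the variables block. The README remains the authoritative reference.

```yaml
# Hypothetical sketch: URLs, selectors, field names, and the variable
# name are placeholders, not taken from the release notes.
multiscrape:
  - resource: https://example.com/api/data
    form_submit:
      submit_once: true
      resource: https://example.com/login.php
      select: "#login-form"
      input:
        username: !secret example_username
        password: !secret example_password
      # Assumed syntax: scrape a token from the page returned after login.
      variables:
        - name: token
          select: "#session-token"
    headers:
      # Assumed: the scraped value is available as a template variable.
      Authorization: "Bearer {{ token }}"
    sensor:
      - unique_id: example_protected_value
        name: Example protected value
        select: ".value"
```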
🍪 New Feature: Cookies support
You asked, I delivered! The long-awaited support for cookies is finally here! 🎉 Now, all cookies returned in HTTP sessions are automatically transferred to the next request. Plus, I’ve added logging so you can easily see which cookies are set. Sweet, right? 🍪
🤖 Automated Tests!!
I’m taking stability to the next level with the newly set up automated testing infrastructure! The first 2 automated tests have been added to Multiscrape, ensuring even more reliability in the future. Continuous improvements are on the way! 🛠️
As always, a huge thank you to the amazing community for your continued support and feedback. Happy scraping! 🕷️💻
pre-v7.1.1 🌈 Fix issue with hass variable in HA 2024.8
v7.0.3 Fix issue with hass variable in HA 2024.8
As of HA release 2024.8, an issue (#391) occurs when using a value_template, producing the following error:
Unable to scrape data: hass variable not set on template
This release fixes it.
pre-v7.1.0 🌈 Form variables
New features: Form variables
A big thank you to @jeremicmilan for his dedication on this one! We now have form variables, which can be used to scrape the (token of a) page that's returned after logging in on some sites (specifically PHP). The token can then be sent in a header. See the README for more details.
Changes
- Form variables @jeremicmilan (#374)
- Update dependency pytest-homeassistant-custom-component to v0.13.144 @renovate (#384)
- Update dependency ruff to v0.5.0 @renovate (#382)
- Update dependency pytest-homeassistant-custom-component to v0.13.138 @renovate (#383)
- Update dependency pytest-homeassistant-custom-component to v0.13.137 @renovate (#381)
- Update dependency pytest-homeassistant-custom-component to v0.13.136 @renovate (#379)
- Update dependency ruff to v0.4.10 @renovate (#378)
- Update dependency pytest-homeassistant-custom-component to v0.13.135 @renovate (#375)
- Update dependency ruff to v0.4.9 @renovate (#377)
- Update actions/checkout action to v4.1.7 @renovate (#376)
- Update dependency ruff to v0.4.8 @renovate (#371)
- Update dependency pytest-homeassistant-custom-component to v0.13.133 @renovate (#372)
- Update dependency pytest-homeassistant-custom-component to v0.13.124 @renovate (#370)
- Update actions/checkout action to v4.1.6 @renovate (#369)
- Update dependency pytest-homeassistant-custom-component to v0.13.123 @renovate (#358)
- Update dependency ruff to v0.4.4 @renovate (#367)
- Update actions/checkout action to v4.1.5 @renovate (#366)
- Update actions/checkout action to v4.1.4 @renovate (#362)
- Update dependency ruff to v0.4.3 @renovate (#357)
- Update dependency ruff to v0.3.6 @renovate (#356)
- Update dependency pytest-homeassistant-custom-component to v0.13.113 @renovate (#354)
v7.0.2 Separate http settings for form submit and scrape requests
Update: Breaking change
In this release, the HTTP request settings for submitting a form are separate from those for scraping. Before this release, some of the settings were shared. If you need specific settings for form_submit (like headers or verify_ssl), you can specify those under the form_submit part of your configuration.
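The split could look roughly like this. The URLs, selector, input fields, and header values are placeholders chosen for illustration; only the idea of placing headers and verify_ssl under form_submit comes from the note above.

```yaml
# Hypothetical example: URLs, selector, inputs, and header values are placeholders.
multiscrape:
  - resource: https://example.com/dashboard
    # These settings apply to the scrape request only.
    headers:
      User-Agent: Mozilla/5.0
    form_submit:
      resource: https://example.com/login
      select: "#login-form"
      input:
        username: !secret example_username
        password: !secret example_password
      # These settings apply to the form submit request only.
      headers:
        Referer: https://example.com/login
      verify_ssl: false
    sensor:
      - unique_id: example_dashboard_value
        name: Example dashboard value
        select: ".value"
```

If a configuration previously relied on the shared settings, anything the form submit request needs should now be repeated under form_submit.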
Changes
- Separate http settings for form submit and scrape request @danieldotnl (#353)
- Update dev environment @danieldotnl (#352)
- Update lint.yml to python 3.12 @danieldotnl (#349)
v7.0.1 Fix for missing trigger services
Changes
- Create a trigger service for each configuration. @danieldotnl (#351)
- Upgrade dev container to python 3.12 + prep pytest @danieldotnl (#350)
- Update dependency homeassistant to v2024.4.0 @renovate (#347)
- Update dependency ruff to v0.3.5 @renovate (#344)
v7.0.0 Scrape service with response
New services!
This major release contains two brand-new services that should make figuring out your configuration and CSS selectors much easier!
They make use of new functionality in Home Assistant that lets services return a response. To make this possible, significant refactoring was required.
multiscrape.get_content
This service retrieves the content of the website you want to scrape. It shows the same data for which you previously had to enable log_response and open the page_soup.txt file.
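A call to this service from a script or automation might be sketched as follows. The URL and the variable name page_content are placeholders, and the minimal set of required fields is an assumption; response_variable is the standard Home Assistant way to capture a service response.

```yaml
# Hypothetical script step: URL and variable name are placeholders.
- service: multiscrape.get_content
  data:
    resource: https://example.com
  # Stores the service response for use in later steps.
  response_variable: page_content
```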
multiscrape.scrape
This does what it says. It scrapes based on a configuration you provide in the service data. It is ideal for quickly trying out multiple CSS selectors, or for scraping data in an automation that you only need when running that automation.
A nice detail is that both services accept exactly the same configuration as you provide in your configuration YAML. Even the form_submit features are supported! However, there is a small but important caveat. Read more about it in the README.
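As an illustration, a scrape call that tries out a single selector might look like this. The URL, sensor names, selector, and the variable name scraped are placeholders; the data block is assumed to mirror a regular multiscrape configuration entry, as described above.

```yaml
# Hypothetical script step: URL, names, and selector are placeholders.
- service: multiscrape.scrape
  data:
    resource: https://example.com
    sensor:
      - unique_id: example_headline
        name: Example headline
        select: "h1"
  # Stores the scraped values for use in later steps.
  response_variable: scraped
```

Running such a call from the developer tools is a quick way to iterate on a selector before committing it to the configuration.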
Changes
- Update readme and version for release with new services @danieldotnl (#343)
- Add service icons for hassfest validation @danieldotnl (#342)
- Update actions/setup-python action to v5.1.0 @renovate (#341)
- Update dependency homeassistant to v2024.3.3 @renovate (#339)
- Update dependency ruff to v0.3.4 @renovate (#338)
- Update dependency homeassistant to v2024.3.1 - autoclosed @renovate (#321)
- Update dependency ruff to v0.3.3 @renovate (#322)
- Implement service for scraping @danieldotnl (#335)
- Update dependency pip to v24 @renovate (#326)
- Update dependency colorlog to v6.8.2 @renovate (#323)
- Update release-drafter/release-drafter action to v6 @renovate (#324)
- Update dependency homeassistant to v2024.1.3 @renovate (#318)
- Update dependency ruff to v0.1.13 @renovate (#317)
- Vscode task to upgrade dependencies @danieldotnl (#316)
- Update dependency homeassistant to v2024.1.2 @renovate (#314)
Fix issue with headers when using form_submit
Changes
- Fix issue in creating renderers for dictionaries with templates @danieldotnl (#313)