Hey, so I wanted to use Price Tracker with an Unreal Engine Marketplace URL, but the discount price is ignored in favor of the (higher) base price, which makes the tracker pretty much useless for this domain.
So I thought, "OK, I'll just add unrealengine.com with XPath to parser_configuration.json and submit a pull request." But ParserXPath never actually gets called in scraper.dart: because the page contains a JSON-LD object, ParserSD is called instead of ParserXPath.
The problem is: the price attribute in the JSON-LD object on unrealengine.com/marketplace is for the base price (which rarely changes), not the discount price (which changes a lot).
My question: What would be your advice if I want to add unrealengine.com to the parser_configuration.json + ignore the JSON-LD for this domain + submit a pull request?
Hi, thanks for the bug report @3id0.
The scraper favors structured data over any configured XPaths. This problem has never occurred before, and it puzzles me why Epic Games isn't updating their JSON-LD when they have it available.
To fix this case, the parser priority has to change. The problem is that the configuration already contains several XPaths that serve as fallback alternatives, so changing the priority for every domain would require a lot of manual testing.
As a temporary fix I would suggest adding a boolean property called "favorXPath" or something to the configuration JSON for each domain and checking this property before using the structured data parser (in scraper.dart).
The parser priority list is then as follows:
1. XPath/Selector, if available and favored in the configuration
2. Default structured data parser
3. Fallback XPath/Selector
This should work without having to test all domains in the configuration manually.
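The priority order above could be sketched roughly like this in Dart. This is only an illustration, not the actual code in scraper.dart: `ParserKind`, `parserOrder`, and its parameters are hypothetical names, and only "favorXPath" comes from the suggestion above.

```dart
// Hypothetical sketch; none of these names are taken from scraper.dart.
enum ParserKind { xpath, structuredData }

/// Returns the parsers to try, in order, for a single page.
List<ParserKind> parserOrder({
  required bool hasXPath, // the domain has an XPath/selector configured
  required bool hasStructuredData, // the page contains JSON-LD
  bool favorXPath = false, // the proposed per-domain flag
}) {
  final order = <ParserKind>[];
  // 1. XPath/selector first, but only when configured and explicitly favored.
  if (hasXPath && favorXPath) order.add(ParserKind.xpath);
  // 2. Default: structured data when the page provides it.
  if (hasStructuredData) order.add(ParserKind.structuredData);
  // 3. Fallback: XPath/selector when it was not already tried in step 1.
  if (hasXPath && !favorXPath) order.add(ParserKind.xpath);
  return order;
}

void main() {
  // Domain that opts in, such as unrealengine.com in this issue.
  print(parserOrder(hasXPath: true, hasStructuredData: true, favorXPath: true));
  // prints [ParserKind.xpath, ParserKind.structuredData]

  // Every other domain keeps today's behavior.
  print(parserOrder(hasXPath: true, hasStructuredData: true));
  // prints [ParserKind.structuredData, ParserKind.xpath]
}
```

Because the flag defaults to false, domains that do not set it keep the current order, which is why the existing entries would not need to be retested.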
Add new boolean property "favorXPath" in parser_configuration.json
In some cases, a page's content may include JSON-LD (sdJSON) but
the data (e.g. price, name) is outdated/wrong compared to what can
be scraped with ParserXPath (this is the case with product pages
on unrealengine.com/marketplace for example).
In this case, the property "favorXPath" can be set to "true" for
a specific domain in order to ignore the problematic JSON-LD data.
Closes #55
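For illustration, a domain entry using the new property might look something like the fragment below. The surrounding shape is an assumption: the thread does not show the real layout of parser_configuration.json, so the domain key and the "price" field with its placeholder value are hypothetical. Only "favorXPath" is taken from the commit message.

```json
{
  "unrealengine.com": {
    "favorXPath": true,
    "price": "<XPath for the discounted price>"
  }
}
```

Domains that omit "favorXPath" would presumably be treated as false and keep preferring JSON-LD.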