Added wraith spider [config_name] command #488
Merged
We have a lot of open 'Spider mode' issues, many of which arise from the Wraith spidering logic being quite complex and difficult to test.
Wraith has to maintain a lot of internal state, triggering spidering automatically when certain config properties are missing, but only triggering the spidering if it hasn't been done in the last x-config-option days, etc.
We can make Wraith much simpler under the hood by giving spidering its own dedicated spider command, which the user must choose to run manually, as regularly as they choose. A new 'Imports' feature allows users to import configs into one another.

Proposed workflow
Say we have a simplified config:
wraith spider test.yml => spiders the sites and saves the paths to spider_configs.yml. NB: the first time you run this, Wraith will warn that spider_configs.yml doesn't exist. That's OK.
wraith capture test.yml => the spider_configs.yml paths are automatically imported into test.yml, and Wraith continues as if the paths were specified manually.

The 'Imports' feature has scope for lots of different uses. For example, you may have a common Wraith config defining the browser engine, screen sizes to capture, colour of the diff image, etc., and then multiple Wraith configs, one for each of your sites, all of which import the base config so that you don't have to duplicate that information.
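
To make the base-config idea concrete, here is a minimal Ruby sketch of how an import could be merged into the config that references it. It is an illustration under stated assumptions, not necessarily how this PR implements it: the `imports` key name, the `apply_import` helper, and the example config values are all hypothetical, and the rule that the importing config wins on conflicting keys is an assumed design choice.

```ruby
require "yaml"

# Hypothetical helper (not taken from the Wraith codebase): merge an imported
# config hash into the config that imports it. Keys defined in the importing
# config take precedence over keys from the imported one. The merge is
# shallow, so nested hashes are replaced rather than combined.
def apply_import(config, imported)
  imported.merge(config)
end

# Assumed shared base config: settings common to every site.
base = YAML.safe_load(<<~CONFIG)
  browser: phantomjs
  screen_widths:
    - 320
    - 1280
CONFIG

# Assumed per-site config that points at the base config.
site = YAML.safe_load(<<~CONFIG)
  imports: base.yml
  domains:
    example: "http://www.example.com"
CONFIG

merged = apply_import(site, base)
puts merged.keys.sort.inspect
```

With a shallow merge like this, a site config can override any single top-level setting from the base config simply by redefining it, while everything else is inherited unchanged.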