Merge pull request #333 from eliashaeussler/feature/config-api
[!!!][FEATURE] Introduce Config API
eliashaeussler authored Mar 26, 2024
2 parents b43777e + 55076c7 commit e1c164b
Showing 65 changed files with 3,663 additions and 77 deletions.
122 changes: 122 additions & 0 deletions README.md
@@ -86,6 +86,7 @@ The following input parameters are available:
|-------------------------------------------------|------------------------------------------------------------------------|
| [`sitemaps`](#sitemaps) | URLs or local filenames of XML sitemaps to be warmed up |
| [`--allow-failures`](#--allow-failures) | Allow failures during URL crawling and exit with zero |
| [`--config`](#--config) | Path to configuration file |
| [`--crawler-options`, `-o`](#--crawler-options) | JSON-encoded string of additional config for configurable crawlers |
| [`--crawler`, `-c`](#--crawler) | FQCN of the crawler to use for cache warmup |
| [`--exclude`, `-e`](#--exclude) | Patterns of URLs to be excluded from cache warmup |
@@ -130,6 +131,33 @@ $ cache-warmup -u "https://www.example.org/" -u "https://www.example.org/de/"
| Multiple values allowed ||
| Default | **** |

#### `--config`

Path to a configuration file. Read more in the
[Configuration file](#configuration-file) section below.

At the moment, the following file formats are available:

* [`json`](#json-and-yaml)
* [`php`](#php)
* [`yaml`/`yml`](#json-and-yaml)

> [!NOTE]
> Configuration options provided by command options take
> precedence over those provided by configuration files.
> Command options, on the other hand, will be overwritten
> by [environment variables](#environment-variables).

```bash
$ cache-warmup --config cache-warmup.yaml
```
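The precedence described in the note above (configuration file first, then command options, then environment variables) can be pictured as a layered merge in which later sources override earlier ones. The following is a minimal sketch under that assumption, not the library's implementation: `mergeByPrecedence()` and the option values are hypothetical, and list-type options such as `sitemaps` may be combined differently in practice.

```php
<?php

/**
 * Hypothetical helper (not part of the library): merges option arrays,
 * ordered from lowest to highest precedence. Later sources win.
 *
 * @param array<string, mixed> ...$sources
 *
 * @return array<string, mixed>
 */
function mergeByPrecedence(array ...$sources): array
{
    // array_replace() overwrites values of earlier arrays with
    // values of later arrays that share the same key
    return array_replace([], ...$sources);
}

// Assumed example values for illustration only
$fromConfigFile = ['limit' => 10, 'progress' => false];
$fromCommandOptions = ['limit' => 50];
$fromEnvironment = ['progress' => true];

$effective = mergeByPrecedence($fromConfigFile, $fromCommandOptions, $fromEnvironment);

var_dump($effective);
```

Here the `limit` from the command options replaces the one from the configuration file, and the `progress` value from the environment replaces both.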

| Shorthand ||
|:------------------------|:------|
| Required | **** |
| Multiple values allowed | **** |
| Default | **** |

#### `--exclude`

Patterns of URLs to be excluded from cache warmup.
@@ -533,6 +561,100 @@ object. It includes the following properties:

💡 The complete JSON structure can be found in the provided [JSON schema][22].

### Configuration file

Instead of passing all available configuration options as
command parameters, it is also possible to provide a
configuration file to be used for cache warmup. At the moment,
the following file formats are supported:

* [`json`](#json-and-yaml)
* [`php`](#php)
* [`yaml`/`yml`](#json-and-yaml)

#### JSON and YAML

For JSON and YAML files, the name of each configuration option
can be derived from the list of available
[command parameters](#command-line-usage) and must be written in
camel case, e.g. `crawler-options` is configured as `crawlerOptions`.

> [!NOTE]
> Crawler options must be written as a nested object (mapping) rather
> than as a JSON-encoded string (see examples below).

JSON example:

```json
{
    "sitemaps": [
        "https://www.example.org/sitemap.xml"
    ],
    "crawlerOptions": {
        "concurrency": 3,
        "request_options": {
            "delay": 3000
        }
    }
}
```

YAML example:

```yaml
sitemaps:
  - https://www.example.org/sitemap.xml
crawlerOptions:
  concurrency: 3
  request_options:
    delay: 3000
```
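The naming rule for JSON and YAML files (command parameter name written in camel case) can be expressed as a small conversion. This is a minimal sketch for illustration only: `optionToConfigKey()` is a hypothetical helper and not part of the library, and whether every command parameter has a config file counterpart is not confirmed here.

```php
<?php

/**
 * Hypothetical helper (not part of the library): derives the config file
 * key from a command parameter name.
 */
function optionToConfigKey(string $option): string
{
    // Strip leading dashes of long options, e.g. "--crawler-options"
    $name = ltrim($option, '-');

    // Uppercase the first letter after each dash, drop the dashes,
    // then lowercase the very first letter
    return lcfirst(str_replace('-', '', ucwords($name, '-')));
}

foreach (['sitemaps', '--allow-failures', '--crawler-options'] as $option) {
    echo $option, ' => ', optionToConfigKey($option), PHP_EOL;
}
```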
> [!TIP]
> For a full list of supported configuration options, have a look at the
> [`EliasHaeussler\CacheWarmup\Config\CacheWarmupConfig`](src/Config/CacheWarmupConfig.php)
> class.

#### PHP

PHP config files must return a closure which returns an instance
of [`EliasHaeussler\CacheWarmup\Config\CacheWarmupConfig`](src/Config/CacheWarmupConfig.php).

Example:

```php
use EliasHaeussler\CacheWarmup;

return static function (CacheWarmup\Config\CacheWarmupConfig $config): void {
    $config->addSitemap('https://www.example.org/sitemap.xml');
    $config->setCrawlerOption('concurrency', 3);
    $config->setCrawlerOption('request_options', [
        'delay' => 3000,
    ]);
};
```

### Environment variables

Several cache warmup options can also be configured using environment
variables. Each environment variable is prefixed with `CACHE_WARMUP_`,
followed by the cache warmup option in upper snake case (uppercase
letters, with dashes replaced by underscores).

Examples:

* The `sitemaps` option is expected as `CACHE_WARMUP_SITEMAPS`
* The `crawler-options` option is expected as `CACHE_WARMUP_CRAWLER_OPTIONS`

The following value transformations between environment variables and
cache warmup options exist:

* Lists: Values are separated by commas
  - Example: `CACHE_WARMUP_SITEMAPS="https://www.example.org/sitemap.xml, /var/www/html/sitemap.xml"`
* Booleans: Values matching `true`, `yes` or `1` are interpreted as `true`
  - Example: `CACHE_WARMUP_PROGRESS=yes` or `CACHE_WARMUP_PROGRESS=1`
* All other values are converted to their underlying type
  - Example: `CACHE_WARMUP_LIMIT=50` is converted to an integer value

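The naming and value rules for environment variables can be sketched as follows. This is an illustrative approximation, not the library's actual adapter: all function names are hypothetical, and details such as trimming whitespace around list entries or rejecting non-numeric integers are assumptions.

```php
<?php

/**
 * Hypothetical helpers illustrating the rules described above.
 * They are not part of the library.
 */
function envNameForOption(string $option): string
{
    // "crawler-options" becomes "CACHE_WARMUP_CRAWLER_OPTIONS"
    return 'CACHE_WARMUP_' . strtoupper(str_replace('-', '_', ltrim($option, '-')));
}

/**
 * @return list<string>
 */
function parseListValue(string $value): array
{
    // Split on commas, trim each entry (assumption) and drop empty entries
    $items = array_map('trim', explode(',', $value));

    return array_values(
        array_filter($items, static fn (string $item): bool => '' !== $item),
    );
}

function parseBooleanValue(string $value): bool
{
    // Only "true", "yes" and "1" count as true; everything else is false
    return in_array(trim($value), ['true', 'yes', '1'], true);
}

function parseIntegerValue(string $value): int
{
    $result = filter_var($value, FILTER_VALIDATE_INT);

    if (false === $result) {
        throw new InvalidArgumentException(sprintf('"%s" is not an integer.', $value));
    }

    return $result;
}

var_dump(envNameForOption('crawler-options'));
var_dump(parseListValue('https://www.example.org/sitemap.xml, /var/www/html/sitemap.xml'));
var_dump(parseBooleanValue('yes'));
var_dump(parseIntegerValue('50'));
```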
## 🧑‍💻 Contributing

Please have a look at [`CONTRIBUTING.md`](CONTRIBUTING.md).
3 changes: 2 additions & 1 deletion composer.json
@@ -26,7 +26,8 @@
         "psr/http-message": "^1.0 || ^2.0",
         "psr/log": "^2.0 || ^3.0",
         "symfony/console": "^5.4 || ^6.0 || ^7.0",
-        "symfony/filesystem": "^5.4 || ^6.0 || ^7.0"
+        "symfony/filesystem": "^5.4 || ^6.0 || ^7.0",
+        "symfony/yaml": "^5.4 || ^6.0 || ^7.0"
     },
     "require-dev": {
         "armin/editorconfig-cli": "^1.8 || ^2.0",
74 changes: 73 additions & 1 deletion composer.lock

Some generated files are not rendered by default.

20 changes: 20 additions & 0 deletions phpstan-baseline.neon
@@ -5,6 +5,21 @@ parameters:
count: 1
path: src/CacheWarmer.php

-
message: "#^Parameter \\#1 \\$crawler of method EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\CrawlerFactory\\:\\:get\\(\\) expects class\\-string\\<EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\CrawlerInterface\\>, string given\\.$#"
count: 1
path: src/Config/Adapter/ConsoleInputConfigAdapter.php

-
message: "#^Parameter \\#1 \\$crawler of method EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\CrawlerFactory\\:\\:get\\(\\) expects class\\-string\\<EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\CrawlerInterface\\>, string given\\.$#"
count: 1
path: src/Config/Adapter/EnvironmentVariablesConfigAdapter.php

-
message: "#^Variable property access on \\$this\\(EliasHaeussler\\\\CacheWarmup\\\\Config\\\\CacheWarmupConfig\\)\\.$#"
count: 1
path: src/Config/CacheWarmupConfig.php

-
message: "#^Property EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\AbstractConfigurableCrawler\\<TOptions of array\\<string, mixed\\>\\>\\:\\:\\$options \\(TOptions of array\\<string, mixed\\>\\) does not accept array\\<string, mixed\\>\\.$#"
count: 1
@@ -45,6 +60,11 @@ parameters:
count: 1
path: tests/src/CacheWarmerTest.php

-
message: "#^Parameter \\#1 \\$limit of method EliasHaeussler\\\\CacheWarmup\\\\Config\\\\CacheWarmupConfig\\:\\:setLimit\\(\\) expects int\\<0, max\\>, \\-10 given\\.$#"
count: 1
path: tests/src/Config/CacheWarmupConfigTest.php

-
message: "#^Parameter \\#1 \\$crawler of method EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\CrawlerFactory\\:\\:get\\(\\) expects class\\-string\\<EliasHaeussler\\\\CacheWarmup\\\\Crawler\\\\CrawlerInterface\\>, string given\\.$#"
count: 2
30 changes: 3 additions & 27 deletions src/CacheWarmer.php
@@ -26,18 +26,12 @@
 use GuzzleHttp\Client;
 use GuzzleHttp\ClientInterface;
 use GuzzleHttp\Exception\GuzzleException;
-use GuzzleHttp\Psr7;
-use Psr\Http\Message;

 use function array_key_exists;
 use function array_values;
 use function count;
-use function fnmatch;
 use function is_array;
 use function is_string;
-use function preg_match;
-use function str_contains;
-use function str_starts_with;

 /**
  * CacheWarmer.
@@ -75,7 +69,7 @@ final class CacheWarmer
     private array $excludedUrls = [];

     /**
-     * @param array<string> $excludePatterns
+     * @param list<Config\Option\ExcludePattern> $excludePatterns
      */
     public function __construct(
         private readonly int $limit = 0,
@@ -124,8 +118,7 @@ public function addSitemaps(array|string|Sitemap\Sitemap $sitemaps): self
         foreach ($sitemaps as $sitemap) {
             // Parse sitemap URL to valid sitemap object
             if (is_string($sitemap)) {
-                $sitemapUri = $this->resolveSitemapUri($sitemap);
-                $sitemap = new Sitemap\Sitemap($sitemapUri);
+                $sitemap = Sitemap\Sitemap::createFromString($sitemap);
             }

             // Throw exception if sitemap is invalid
@@ -196,19 +189,6 @@ public function addUrl(string|Sitemap\Url $url): self
         return $this;
     }

-    private function resolveSitemapUri(string $sitemap): Message\UriInterface
-    {
-        // Sitemap is a remote URL
-        if (str_contains($sitemap, '://')) {
-            return new Psr7\Uri($sitemap);
-        }
-
-        // Sitemap is a local file
-        $file = Helper\FilesystemHelper::resolveRelativePath($sitemap);
-
-        return new Psr7\Uri('file://'.$file);
-    }
-
     private function exceededLimit(): bool
     {
         return $this->limit > 0 && count($this->urls) >= $this->limit;
@@ -217,11 +197,7 @@ private function exceededLimit(): bool
     private function isExcluded(string $url): bool
     {
         foreach ($this->excludePatterns as $pattern) {
-            if (fnmatch($pattern, $url)) {
-                return true;
-            }
-
-            if (str_starts_with($pattern, '#') && 1 === preg_match($pattern, $url)) {
+            if ($pattern->matches($url)) {
                 return true;
             }
         }
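
The last hunk replaces the inline `fnmatch()` and `preg_match()` checks with a single call to `$pattern->matches($url)`. The implementation of `Config\Option\ExcludePattern` is not shown in the rendered part of this diff, so the following is only a minimal sketch of how a value object could encapsulate the removed logic. The class name `ExcludePatternSketch` and its `fromString()` factory are hypothetical and do not claim to mirror the real class.

```php
<?php

/**
 * Hypothetical stand-in for an exclude pattern value object.
 * Not the library's actual implementation.
 */
final class ExcludePatternSketch
{
    private function __construct(
        private readonly string $pattern,
    ) {}

    public static function fromString(string $pattern): self
    {
        return new self($pattern);
    }

    public function matches(string $url): bool
    {
        // Shell-style wildcard match, e.g. "*/de/*"
        if (fnmatch($this->pattern, $url)) {
            return true;
        }

        // Patterns starting with "#" are additionally tried as regular expressions
        return str_starts_with($this->pattern, '#')
            && 1 === preg_match($this->pattern, $url);
    }
}

$url = 'https://www.example.org/de/page';

var_dump(ExcludePatternSketch::fromString('*/de/*')->matches($url)); // wildcard pattern
var_dump(ExcludePatternSketch::fromString('#/de/#')->matches($url)); // regular expression
var_dump(ExcludePatternSketch::fromString('*/fr/*')->matches($url)); // unrelated pattern
```

Moving the check into an object keeps `isExcluded()` free of pattern-type detection, which matches the direction of the change above.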