RDSL lib init commit #869

Merged: 6 commits, Dec 7, 2023

Conversation

norberttech
Member

Change Log

Added

  • RDSL library

Fixed

Changed

Removed

Deprecated

Security


Description

RDSL is a simple library for defining a chain of DSL function calls to execute.
It comes with access control lists (ACLs) that include or exclude specific functions from given namespaces, which determines which functions are part of the DSL.
A separate ACL defines which functions are available as entry points.

Usage:

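<?php

// $builder is an RDSL builder instance created beforehand (its setup is not shown here).
$executables = $builder->parse(
    [
        [
            // Entry point function and its arguments.
            'function' => 'int',
            'args' => [0],
            // Method called on the value returned by the function above.
            'call' => [
                'method' => 'add',
                'args' => [
                    // Arguments can themselves be nested function definitions.
                    [
                        'function' => 'lit',
                        'args' => [5],
                    ],
                ],
            ],
        ],
    ]
);

// Execute the parsed chain and collect the results.
$results = (new Executor())->execute($executables);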
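
To make the shape of this definition concrete, here is a minimal, self-contained sketch of how such an array could be evaluated. It is not the RDSL implementation: IntValue, evaluate_node(), resolve_args() and the $allowed allow-list are hypothetical names invented for this illustration, and the allow-list only approximates the idea of an ACL.

```php
<?php

declare(strict_types=1);

// Hypothetical value object standing in for whatever a DSL function returns.
final class IntValue
{
    public function __construct(public readonly int $value)
    {
    }

    public function add(IntValue|int $other) : self
    {
        $otherValue = $other instanceof self ? $other->value : $other;

        return new self($this->value + $otherValue);
    }
}

/**
 * Hypothetical evaluator: calls the allowed function named in the node,
 * then follows the optional "call" chain on the returned object.
 *
 * @param array<string, mixed> $node
 * @param array<string, callable> $allowed allow-list acting as a very small ACL
 */
function evaluate_node(array $node, array $allowed) : mixed
{
    $name = $node['function'];

    // Reject anything that is not explicitly allowed.
    if (!\array_key_exists($name, $allowed)) {
        throw new InvalidArgumentException("Function \"{$name}\" is not allowed");
    }

    $result = $allowed[$name](...resolve_args($node['args'] ?? [], $allowed));

    // Each "call" step invokes a method on the result of the previous step.
    $call = $node['call'] ?? null;

    while ($call !== null) {
        $result = $result->{$call['method']}(...resolve_args($call['args'] ?? [], $allowed));
        $call = $call['call'] ?? null;
    }

    return $result;
}

/**
 * Nested function definitions are evaluated first; plain values pass through unchanged.
 */
function resolve_args(array $args, array $allowed) : array
{
    return \array_map(
        static fn (mixed $arg) : mixed => \is_array($arg) && isset($arg['function'])
            ? evaluate_node($arg, $allowed)
            : $arg,
        $args
    );
}

// Assumption: both functions simply wrap an integer; the real "int" and "lit" may differ.
$allowed = [
    'int' => static fn (int $value) : IntValue => new IntValue($value),
    'lit' => static fn (int $value) : IntValue => new IntValue($value),
];

$definition = [
    'function' => 'int',
    'args' => [0],
    'call' => [
        'method' => 'add',
        'args' => [
            ['function' => 'lit', 'args' => [5]],
        ],
    ],
];

$result = evaluate_node($definition, $allowed);

echo $result->value, PHP_EOL; // prints 5
```

In this sketch a definition that names a function missing from $allowed fails with an exception before anything is executed, which is the behaviour one would expect from an ACL check.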

The goal is to define the data_frame/df functions as entry points from which other functions can be called.
ETL will parse pipelines from YAML/JSON/XML, turn them into associative arrays, and build a DataFrame from them.

All of this will be possible through a static factory, DataFrame::parse(array $definition) : DataFrame.
Specific parsers would then just use that static factory, for example Flow\ETL\Parser\JsonParser::parse(string $json).
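
A rough sketch of that delegation, under stated assumptions: PipelineSketch stands in for DataFrame and JsonParserSketch for the JSON parser. Both classes, their validation rules, and the sample JSON payload are hypothetical and only illustrate the "decode, then hand the array to a static factory" idea; they are not the planned Flow API.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for DataFrame; it only stores the definition it was built from.
final class PipelineSketch
{
    private function __construct(public readonly array $definition)
    {
    }

    // Static factory: the single place where an array definition becomes an object.
    public static function parse(array $definition) : self
    {
        if ($definition === []) {
            throw new InvalidArgumentException('Pipeline definition can not be empty');
        }

        return new self($definition);
    }
}

// Hypothetical format-specific parser: decodes JSON and delegates to the factory.
final class JsonParserSketch
{
    public static function parse(string $json) : PipelineSketch
    {
        // JSON_THROW_ON_ERROR makes json_decode throw JsonException on malformed input.
        $definition = \json_decode($json, true, 512, JSON_THROW_ON_ERROR);

        if (!\is_array($definition)) {
            throw new InvalidArgumentException('Pipeline definition must decode to an array');
        }

        return PipelineSketch::parse($definition);
    }
}

// Assumed payload: "df" as the entry point with no arguments.
$pipeline = JsonParserSketch::parse('[{"function": "df", "args": []}]');

echo $pipeline->definition[0]['function'], PHP_EOL;
```

Keeping the array-based factory as the only construction path means YAML and XML parsers would need nothing beyond their own decoding step.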

Contributor

github-actions bot commented Dec 7, 2023

Flow PHP - Benchmarks

Results of the benchmarks from this PR are compared with the results from 1.x branch.

Extractors
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| benchmark             | subject           | revs | its | mem_peak         | mode             | rstdev          |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
| AvroExtractorBench    | bench_extract_10k | 1    | 3   | 35.123mb +0.02%  | 710.695ms +0.26% | ±0.99% -52.34%  |
| CSVExtractorBench     | bench_extract_10k | 1    | 3   | 4.822mb +0.16%   | 302.266ms +0.74% | ±0.77% +263.28% |
| JsonExtractorBench    | bench_extract_10k | 1    | 3   | 4.920mb +0.15%   | 927.157ms -0.52% | ±0.48% +192.82% |
| ParquetExtractorBench | bench_extract_10k | 1    | 3   | 239.626mb +0.00% | 1.129s +1.30%    | ±1.23% +14.49%  |
| TextExtractorBench    | bench_extract_10k | 1    | 3   | 4.699mb +0.16%   | 24.866ms +0.27%  | ±0.76% -6.12%   |
| XmlExtractorBench     | bench_extract_10k | 1    | 3   | 4.701mb +0.16%   | 420.463ms +4.14% | ±1.82% +264.07% |
+-----------------------+-------------------+------+-----+------------------+------------------+-----------------+
Transformers
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| benchmark                   | subject                  | revs | its | mem_peak         | mode            | rstdev         |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
| RenameEntryTransformerBench | bench_transform_10k_rows | 1    | 3   | 110.386mb +0.01% | 65.443ms +4.32% | ±0.22% -82.83% |
+-----------------------------+--------------------------+------+-----+------------------+-----------------+----------------+
Loaders
+--------------------+----------------+------+-----+------------------+------------------+------------------+
| benchmark          | subject        | revs | its | mem_peak         | mode             | rstdev           |
+--------------------+----------------+------+-----+------------------+------------------+------------------+
| AvroLoaderBench    | bench_load_10k | 1    | 3   | 94.784mb +0.01%  | 456.957ms +1.84% | ±1.54% +33.98%   |
| CSVLoaderBench     | bench_load_10k | 1    | 3   | 54.862mb +0.01%  | 71.562ms +0.10%  | ±3.13% +6263.68% |
| JsonLoaderBench    | bench_load_10k | 1    | 3   | 105.445mb +0.01% | 55.634ms +3.82%  | ±0.94% +242.23%  |
| ParquetLoaderBench | bench_load_10k | 1    | 3   | 320.654mb +0.00% | 1.256s -1.29%    | ±0.04% -96.85%   |
| TextLoaderBench    | bench_load_10k | 1    | 3   | 17.739mb +0.04%  | 41.823ms +3.53%  | ±0.81% +101.61%  |
+--------------------+----------------+------+-----+------------------+------------------+------------------+
Building Blocks
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| benchmark               | subject                    | revs | its | mem_peak         | mode             | rstdev          |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+
| RowsBench               | bench_chunk_10_on_10k      | 2    | 3   | 76.437mb +0.01%  | 2.891ms +1.76%   | ±3.23% +14.94%  |
| RowsBench               | bench_diff_left_1k_on_10k  | 2    | 3   | 96.230mb +0.01%  | 179.555ms -3.64% | ±0.68% -23.98%  |
| RowsBench               | bench_diff_right_1k_on_10k | 2    | 3   | 74.755mb +0.01%  | 18.281ms -4.53%  | ±0.50% -4.91%   |
| RowsBench               | bench_drop_1k_on_10k       | 2    | 3   | 77.677mb +0.01%  | 1.988ms +18.40%  | ±1.03% +139.75% |
| RowsBench               | bench_drop_right_1k_on_10k | 2    | 3   | 77.677mb +0.01%  | 1.933ms +15.03%  | ±2.64% +33.94%  |
| RowsBench               | bench_entries_on_10k       | 2    | 3   | 74.789mb +0.01%  | 3.107ms +25.67%  | ±1.39% +32.34%  |
| RowsBench               | bench_filter_on_10k        | 2    | 3   | 75.318mb +0.01%  | 16.701ms +15.72% | ±2.52% +361.53% |
| RowsBench               | bench_find_on_10k          | 2    | 3   | 75.318mb +0.01%  | 16.808ms +18.53% | ±1.68% -6.51%   |
| RowsBench               | bench_find_one_on_10k      | 10   | 3   | 73.220mb +0.01%  | 1.900μs +0.32%   | ±0.00% -100.00% |
| RowsBench               | bench_first_on_10k         | 10   | 3   | 73.220mb +0.01%  | 0.400μs 0.00%    | ±0.00% 0.00%    |
| RowsBench               | bench_flat_map_on_1k       | 2    | 3   | 86.777mb +0.01%  | 12.294ms -5.48%  | ±2.61% +10.13%  |
| RowsBench               | bench_map_on_10k           | 2    | 3   | 116.137mb +0.01% | 63.439ms +1.44%  | ±1.80% +139.60% |
| RowsBench               | bench_merge_1k_on_10k      | 2    | 3   | 75.838mb +0.01%  | 1.936ms +7.02%   | ±2.00% +122.11% |
| RowsBench               | bench_partition_by_on_10k  | 2    | 3   | 78.111mb +0.01%  | 35.929ms +3.62%  | ±1.75% +94.02%  |
| RowsBench               | bench_remove_on_10k        | 2    | 3   | 77.939mb +0.01%  | 3.931ms +3.28%   | ±2.34% +720.71% |
| RowsBench               | bench_sort_asc_on_1k       | 2    | 3   | 73.366mb +0.01%  | 40.294ms -1.87%  | ±0.95% -56.28%  |
| RowsBench               | bench_sort_by_on_1k        | 2    | 3   | 73.366mb +0.01%  | 40.370ms +1.23%  | ±2.64% +121.71% |
| RowsBench               | bench_sort_desc_on_1k      | 2    | 3   | 73.366mb +0.01%  | 39.662ms -2.95%  | ±1.98% -24.86%  |
| RowsBench               | bench_sort_entries_on_1k   | 2    | 3   | 75.663mb +0.01%  | 7.391ms +1.53%   | ±0.81% +87.59%  |
| RowsBench               | bench_sort_on_1k           | 2    | 3   | 73.220mb +0.01%  | 28.867ms -0.28%  | ±0.98% +179.27% |
| RowsBench               | bench_take_1k_on_10k       | 10   | 3   | 73.220mb +0.01%  | 13.638μs +2.49%  | ±2.87% +712.44% |
| RowsBench               | bench_take_right_1k_on_10k | 10   | 3   | 73.220mb +0.01%  | 15.800μs +0.42%  | ±0.52% -60.60%  |
| RowsBench               | bench_unique_on_1k         | 2    | 3   | 96.231mb +0.01%  | 178.932ms -4.11% | ±0.86% +2.58%   |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 59.381mb +0.01%  | 339.109ms -0.80% | ±0.92% -69.89%  |
| TypeDetectorBench       | bench_type_detector        | 1    | 3   | 14.304mb +0.05%  | 65.653ms -0.32%  | ±0.58% -42.37%  |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 115.979mb +0.01% | 378.800ms +1.28% | ±0.84% +533.71% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 59.698mb +0.01%  | 195.783ms +3.58% | ±1.26% +218.15% |
| NativeEntryFactoryBench | bench_entry_factory        | 1    | 3   | 14.822mb +0.04%  | 41.684ms +0.87%  | ±1.41% -30.73%  |
+-------------------------+----------------------------+------+-----+------------------+------------------+-----------------+

@github-actions github-actions bot added the core label Dec 7, 2023
@@ -46,6 +46,7 @@ jobs:
tools: composer:v2
php-version: "${{ matrix.php-version }}"
ini-values: memory_limit=-1
extensions: :psr
Member Author

I'm honestly not sure why the psr extension suddenly started being installed, but it was breaking the test suite, so I disabled it according to the docs.

@norberttech norberttech merged commit 290ea18 into flow-php:1.x Dec 7, 2023
17 checks passed