Refactor workInternal #297

wirew0rm · 2024-03-14T12:34:16Z

This PR aims to restructure the core work function dispatch mechanism implemented in Block::workInternal.

Noteworthy changes:

breaks up the code into determining available samples and afterwards dispatching to the correct work function
restructure and partially reimplement decimate and stride functionalities:
- the decimation settings (numerator/denominator) now determine the minimum atomic amount of work that can be processed (previously this was determined by the ports' minimum samples)
- stride is applied after denominator samples. (Previously controlled by the ports' maximum samples)
cleanup
- reduce (gcc) warnings to 0
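
A minimal sketch of the first decimation bullet, assuming that `denominator` input samples are converted into `numerator` output samples and that work is only done in whole multiples of that pair. `ChunkPlan` and `planChunks` are hypothetical names for illustration and are not taken from Block.hpp; stride handling is left out.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type, not part of the actual Block implementation.
struct ChunkPlan {
    std::size_t nChunks; // number of atomic work units that fit
    std::size_t nInput;  // input samples to consume  (nChunks * denominator)
    std::size_t nOutput; // output samples to produce (nChunks * numerator)
};

// Assumption: one atomic unit of work turns `denominator` input samples
// into `numerator` output samples; partial units are never processed.
inline ChunkPlan planChunks(std::size_t availableIn, std::size_t availableOut, std::size_t numerator, std::size_t denominator) {
    assert(numerator > 0 && denominator > 0);
    const std::size_t chunksIn  = availableIn / denominator; // limited by input
    const std::size_t chunksOut = availableOut / numerator;  // limited by output space
    const std::size_t nChunks   = chunksIn < chunksOut ? chunksIn : chunksOut;
    return ChunkPlan{ nChunks, nChunks * denominator, nChunks * numerator };
}
```

With 1055 input samples available, a 1:10 decimation under these assumptions would consume 1050 samples and leave the incomplete last chunk of 5 untouched, which is the case listed under the open issues.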

Open issues:

clang/emscripten test failures
treatment of incomplete last chunk
support for moving tags to the first tag of this or the next tag if port constraints don't allow chunking.
remove debug printouts
squash git history
investigate and reenable asan

restructure the workInternal function to be easier to read and modify. This also subtly changes the meaning of the resampling and stride annotations.

RalphSteinhagen · 2024-03-22T08:58:06Z

core/include/gnuradio-4.0/Port.hpp

@@ -955,7 +956,7 @@ template<typename T>
 concept TagPredicate = requires(const T &t, const Tag &tag, Tag::signed_index_type readPosition) {
    { t(tag, readPosition) } -> std::convertible_to<bool>;
 };
-inline constexpr TagPredicate auto defaultTagMatcher    = [](const Tag &tag, Tag::signed_index_type readPosition) noexcept { return tag.index >= readPosition || tag.index < 0; };


Why the drop of tag.index < 0? This used to indicate that a tag should be applied immediately regardless of other index constraints.

This predicate was and is used in nSamplesUntilNextTag(port, offset=0) to get the next matching tag and the matching tag after that, where offset is the default of 0 for getting the first tag and 1 for getting the Tag not at position 1. But since offset only ends up in the readPosition that the predicate compares against the tag index, as soon as there is a tag with index < 0, that tag will also be returned when looking for the second tag... I did not investigate in more detail how this worked previously; re-adding the condition makes qa_Block spin indefinitely.

Tag propagation and setting _cachedTag are not influenced by this, since there the -1 condition is checked separately in Port::getTags(untilOffset).

I'm not completely sure whether this is the correct change; this is just the rationale for removing it. I tried to keep the changes outside of workInternal (and its subfunctions) small, as I've sometimes been bitten by making extensive changes throughout the codebase in too many places at once.
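
The effect described above can be reproduced in isolation. The sketch below is not the Port.hpp code: `Tag` is reduced to its index, and `firstMatch` is a hypothetical stand-in for the lookup that nSamplesUntilNextTag performs. It only illustrates why a predicate that accepts `tag.index < 0` keeps returning the same tag regardless of the read position.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Simplified stand-in for gr::Tag; only the index matters here.
struct Tag {
    std::int64_t index;
};

// Old predicate: a negative index matches regardless of the read position.
inline bool oldMatcher(const Tag &tag, std::int64_t readPosition) { return tag.index >= readPosition || tag.index < 0; }

// New predicate: only the position comparison remains.
inline bool newMatcher(const Tag &tag, std::int64_t readPosition) { return tag.index >= readPosition; }

// Hypothetical lookup: position (within `tags`) of the first tag accepted by the predicate.
template<typename Predicate>
std::optional<std::size_t> firstMatch(const std::vector<Tag> &tags, std::int64_t readPosition, Predicate matches) {
    for (std::size_t i = 0; i < tags.size(); ++i) {
        if (matches(tags[i], readPosition)) {
            return i;
        }
    }
    return std::nullopt;
}
```

For tags with indices -1, 5 and 20, the old predicate returns the -1 tag for every read position, so a search that advances the read position to find the "next" tag never makes progress; the new predicate moves on to the tag at index 20 once the read position passes 5.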

RalphSteinhagen

@wirew0rm kudos, this was a tremendous amount of work to get it done this neatly and cleanly. 👍

The changes make the code much more readable and the now linear flow of processing w/o the earlier jumps easier to follow, to test and -- I believe -- also to maintain.

I added some small comments and suggestions for improvements. These are minor and the PR could from my point-of-view be merged as is, but I think these are quality items that would be beneficial and/or at least should be discussed/documented for a follow-up.

Main things are:

use of structs and named fields for function returns rather than std::tuple. This would complement and mirror the already used structured bindings.
use of emitErrorMessage(..) rather than throwing exceptions in core-lib code
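
To illustrate the first point, here is a small sketch contrasting a std::tuple return with a struct return. The function and field names (`resampledAsTuple`, `resampledAsStruct`, `ResampledSamples`, `nInput`, `nOutput`, `insufficient`) are invented for this example and do not come from the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>

// Tuple version: the caller has to remember which position means what.
inline std::tuple<std::size_t, std::size_t, bool> resampledAsTuple(std::size_t in, std::size_t out) {
    return std::make_tuple(in, out, in == 0 || out == 0);
}

// Struct version: the same data, but every field carries its meaning.
struct ResampledSamples {
    std::size_t nInput;
    std::size_t nOutput;
    bool        insufficient;
};

inline ResampledSamples resampledAsStruct(std::size_t in, std::size_t out) {
    return ResampledSamples{ in, out, in == 0 || out == 0 };
}

// Call sites: structured bindings work with both, named access only with the struct.
inline void usageExample() {
    auto [tupleIn, tupleOut, tupleBad] = resampledAsTuple(100, 50);
    auto [nInput, nOutput, bad]        = resampledAsStruct(100, 50);
    assert(tupleIn == nInput && tupleOut == nOutput && tupleBad == bad);
    assert(!resampledAsStruct(100, 50).insufficient); // self-documenting without a binding
}
```

The struct keeps structured bindings usable at the call site while also allowing named access when only one field is needed.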

The other things and missing features you mentioned in the PR description could be done in a later follow-up.

RalphSteinhagen · 2024-03-22T09:03:23Z

core/test/qa_Block.cpp

        // assuming buffer size is approx 65k
-        stride_test( {.n_samples = 1000000, .stride = 250000, .in_port_max = 100, .exp_in = 100, .exp_out = 100, .exp_counter = 4, .exp_total_in = 400, .exp_total_out = 400 }, thread_pool);
-        stride_test( {.n_samples = 1000000, .stride = 249900, .in_port_max = 100, .exp_in = 100, .exp_out = 100, .exp_counter = 5, .exp_total_in = 500, .exp_total_out = 500 }, thread_pool);
+        stride_test( {.n_samples = 1000000, .numerator = 100, .denominator = 100, .stride = 250000, .exp_in = 100, .exp_out = 100, .exp_counter = 4, .exp_total_in = 400, .exp_total_out = 400 }, thread_pool);


TODO: as discussed, naming is hard: we need to find better descriptive names for 'numerator' and 'denominator'.

This number pair describes not only the resampling ratio of the up- or down-conversion but also the default number of 'M' input samples that are converted into 'N' output samples ... Maybe our other SYS colleagues have some suggestions, since they are also the ones who should recognise/understand this complex ... the more intuitive the name, the less we have to document the up-/down-conversion.

RalphSteinhagen · 2024-03-22T09:11:14Z

core/include/gnuradio-4.0/Block.hpp

                            if constexpr (Out::isMultiThreadedStrategy()) {
                                if (!out.isFullyPublished()) {
-                                    fmt::print(stderr, "Block::write_to_outputs - did not publish all samples for MultiThreadedStrategy\n");
+                                    fmt::print(stderr, "Block::publishWriters - did not publish all samples for MultiThreadedStrategy\n");
                                    std::abort();
                                }
                            }


Perhaps this should be part of the policy definition inside the buffer itself and handled in the ~ProducableSpan() destructor.

I know we had some discussion about this with @drslebedev, but I don't remember what the outcome was.
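
A rough sketch of what "handled in the destructor" could look like, assuming a span that knows its release policy and publishes on destruction if nothing was published explicitly. `AutoPublishSpan`, `FakeWriter` and `ReleasePolicy` are hypothetical stand-ins, not the buffer's actual types, and the Terminate policy is omitted to keep the example testable.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical policy enum mirroring two of the SpanReleasePolicy values.
enum class ReleasePolicy { ProcessAll, ProcessNone };

// Hypothetical stand-in for a buffer writer; it only counts published samples.
struct FakeWriter {
    std::size_t published = 0;
    void        publish(std::size_t n) { published += n; }
};

// Hypothetical span that applies its policy when it goes out of scope.
class AutoPublishSpan {
    FakeWriter   *_writer;
    std::size_t   _size;
    ReleasePolicy _policy;
    bool          _isPublished = false;

public:
    AutoPublishSpan(FakeWriter &writer, std::size_t size, ReleasePolicy policy) : _writer(&writer), _size(size), _policy(policy) {}
    AutoPublishSpan(const AutoPublishSpan &)            = delete;
    AutoPublishSpan &operator=(const AutoPublishSpan &) = delete;

    // Explicit publish by the block; may be called at most once.
    void publish(std::size_t n) {
        assert(!_isPublished && n <= _size);
        _writer->publish(n);
        _isPublished = true;
    }

    // Fallback: if the block never published, apply the policy here.
    ~AutoPublishSpan() {
        if (_isPublished) {
            return;
        }
        _writer->publish(_policy == ReleasePolicy::ProcessAll ? _size : 0);
    }
};
```

With this shape the caller would not need a separate post-processing pass over the writers; whether that fits the multi-threaded strategy check discussed above is an open question.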

RalphSteinhagen · 2024-03-22T09:12:05Z

core/include/gnuradio-4.0/Block.hpp

                                if constexpr (Out::spanReleasePolicy() == SpanReleasePolicy::Terminate) {
-                                    fmt::print(stderr, "Block::write_to_outputs - did not publish samples, default SpanReleasePolicy is {}\n", magic_enum::enum_name(SpanReleasePolicy::Terminate));
+                                    fmt::print(stderr, "Block::publishWriters - samples were not published, default SpanReleasePolicy is {}\n", magic_enum::enum_name(SpanReleasePolicy::Terminate));
                                    std::abort();
                                } else if constexpr (Out::spanReleasePolicy() == SpanReleasePolicy::ProcessAll) {
-                                    out.publish(available_values_count);
+                                    out.publish(nSamples);
                                } else if constexpr (Out::spanReleasePolicy() == SpanReleasePolicy::ProcessNone) {
                                    out.publish(0U);
                                }
                            }


as above: functionally correct but should perhaps be pushed to the ~ProducableSpan() destructor.

RalphSteinhagen · 2024-03-22T10:02:45Z

core/include/gnuradio-4.0/Block.hpp

+    work::Status
+    invokeProcessBulk(auto &inputSpans, auto &writersTuple) {
+        // cannot use std::apply because it requires tuple_cat(inputSpans, writersTuple). The latter doesn't work because writersTuple isn't copyable.
+        return [&]<std::size_t... InIdx, std::size_t... OutIdx>(std::index_sequence<InIdx...>, std::index_sequence<OutIdx...>) {
+            return self().processBulk(std::get<InIdx>(inputSpans)..., std::get<OutIdx>(writersTuple)...);
+        }(std::make_index_sequence<std::tuple_size_v<std::remove_cvref_t<decltype(inputSpans)>>>(), std::make_index_sequence<std::tuple_size_v<std::remove_cvref_t<decltype(writersTuple)>>>());
+    }


Not (yet) fully tested, but the code below could convert the present std::vector<std::span> or std::vector<"ConsumableSpan"> to std::span<std::span..> or std::span<"ConsumableSpan">:

work::Status invokeProcessBulk(auto &inputSpans, auto &writersTuple) {
    auto convertIfNeeded = [](auto &&arg) -> decltype(auto) {
        using T = std::decay_t<decltype(arg)>;
        if constexpr (std::is_same_v<T, std::vector<typename T::value_type>>) {
            // convert std::vector to std::span of [Consumable,Producable]Spans
            return std::span<typename T::value_type>{ arg.data(), arg.size() };
        } else if (...) {
            // similar for std::array<...>
        } else {
            // pass-through for other types
            return std::forward<decltype(arg)>(arg);
        }
    };
    return [&]<std::size_t... InIdx, std::size_t... OutIdx>(std::index_sequence<InIdx...>, std::index_sequence<OutIdx...>) {
        return self().processBulk(convertIfNeeded(std::get<InIdx>(inputSpans))..., convertIfNeeded(std::get<OutIdx>(writersTuple))...);
    }(std::make_index_sequence<std::tuple_size_v<std::remove_cvref_t<decltype(inputSpans)>>>(), std::make_index_sequence<std::tuple_size_v<std::remove_cvref_t<decltype(writersTuple)>>>());
}

@drslebedev your view?

N.B. this way we do not expose whether the dynamic input ports/spans are stored as a vector or an array, which IMO makes for a more maintainable and future-proof API. Feedback is welcome.

Using std::span for an array of ports as the signature for processBulk() looks good to me.
std::span<std::span..> is better than std::vector<std::span..>.

However, I recommend postponing this to the next pull request, as it requires further modifications in the unit tests.

drslebedev

Thanks for the PR!
Really great work to simplify the code and make it clear and readable!
Given our previous private discussions about your changes, I've added a few minor comments. Please consider implementing them at your convenience.

drslebedev · 2024-03-22T09:51:58Z

core/include/gnuradio-4.0/Block.hpp

@@ -618,11 +523,10 @@ class Block : public lifecycle::StateMachine<Derived>, //
        , numerator(std::move(other.numerator))


We've previously had a private discussion regarding potential new names for numerator and denominator, given their altered meaning following your changes. Essentially, they now align more closely with the limits on the number of samples or chunk sizes for input and output. Are we still considering this adjustment, or do you prefer to retain the original names?

drslebedev · 2024-03-22T10:31:48Z

core/include/gnuradio-4.0/Block.hpp

+        auto         writersTuple = prepareStreams(outputPorts<PortType::STREAM>(&self()), limitByFirstTag ? 1 : decimatedOut);
+        work::Status ret;
+        std::size_t  processed    = limitByFirstTag ? 1 : decimatedIn;
+        std::size_t  processedOut = limitByFirstTag ? 1 : decimatedOut;


Suggested change

-        std::size_t  processedOut = limitByFirstTag ? 1 : decimatedOut;
+        std::size_t  nSamplesToProcessOut = limitByFirstTag ? 1 : decimatedOut;

These are the only 2 variables that I didn't manage to make const; otherwise I'd agree with the rename, but these are also updated with the result of the processing...
What I'm missing here is the ability to define an uninitialized const variable and then have the compiler check that it is assigned exactly once in every branch... Java does that, but in C++ the only way to get something like that is to wrap the logic in a lambda and return from there, which would make the code a lot less straightforward to read...
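
For reference, the lambda workaround mentioned above is the immediately invoked lambda idiom. The sketch below uses a hypothetical `Mode` enum and `samplesToProcess` function instead of the real workInternal branches; it only shows the shape of the idiom, not a proposed change.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical branch selector, standing in for the conditions in workInternal.
enum class Mode { FirstTag, Decimated, Skip };

inline std::size_t samplesToProcess(Mode mode, std::size_t decimatedOut) {
    // The lambda is invoked immediately; its return value initialises a const
    // variable, so each branch "assigns" exactly once by returning.
    const std::size_t nSamples = [&]() -> std::size_t {
        switch (mode) {
        case Mode::FirstTag: return 1;
        case Mode::Decimated: return decimatedOut;
        case Mode::Skip: return 0;
        }
        return 0; // only reached for values outside the enum
    }();
    return nSamples;
}
```

The variable is const afterwards, but as noted above this does not help for values that are later updated with the result of the processing, and it adds a level of nesting.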

wirew0rm · 2024-03-22T17:55:57Z

Addressed most of the review comments and some sonarcloud issues and compiler warnings, open points:

naming numerator/denominator
moving publish logic into producable span destructor
using span for dynamic ports

drslebedev · 2024-03-22T20:26:49Z

naming numerator/denominator

moving publish logic into producable span destructor

using span for dynamic ports

At least the second and third points should be addressed in the next PRs.

sonarcloud · 2024-03-25T11:40:33Z

Quality Gate failed

Failed conditions
39.9% Coverage on New Code (required ≥ 80%)

See analysis details on SonarCloud

frankosterfeld · 2024-03-28T11:19:59Z

core/include/gnuradio-4.0/Block.hpp

-                throw std::runtime_error(fmt::format("Block is not defined as `ResamplingRatio<>`, but numerator = {}, denominator = {}, they both must equal to 1.", numerator, denominator));
+                auto e = gr::Error(fmt::format("Block is not defined as `ResamplingRatio<>`, but numerator = {}, denominator = {}, they both must equal to 1.", numerator, denominator));
+                emitErrorMessage("Block::checkParametersAndThrowIfNeeded:", e);
+                requestStop();


Doesn't this return DONE where it should return ERROR?

frankosterfeld · 2024-03-28T11:38:23Z

core/include/gnuradio-4.0/Block.hpp

+        // if the block state changed to DONE, publish EOS tag on the next sample
+        if (ret == work::Status::DONE) {
+            this->setAndNotifyState(lifecycle::State::STOPPED);
+            publishTag({ { gr::tag::END_OF_STREAM, true } }, 1);


Why index 1 here and not 0, like in other places?

The done tag always gets the index of the next sample (which will never be published). Otherwise a block could not publish EoS without publishing more samples. But you are right, that seems to be already considered in publishTag: it does not publish to the last written index but to the next written one. Probably it doesn't matter, because in the next iteration it sees the state and publishes another tag at the correct position... so maybe it should just be removed, such that all EoS tag handling is done at the beginning of the function...

frankosterfeld · 2024-03-28T11:38:41Z

core/include/gnuradio-4.0/Block.hpp

+        if (ret == work::Status::DONE) {
+            this->setAndNotifyState(lifecycle::State::STOPPED);
+            publishTag({ { gr::tag::END_OF_STREAM, true } }, 1);
+            ret = work::Status::DONE;


This line should be unnecessary (ret is already DONE).

There probably was a second condition in the if before... It can be removed when fixing the EoS behavior for a processBulk that doesn't consume anything.

frankosterfeld · 2024-03-28T11:50:57Z

core/include/gnuradio-4.0/Block.hpp

+            static_assert(gr::meta::always_false<gr::traits::block::stream_input_port_types_tuple<Derived>>, "neither processBulk(...) nor processOne(...) implemented");
+        }
+        forwardTags();
+        if (lifecycle::isShuttingDown(this->state()) || nextEosTag <= processedIn + 1) {


Something is going wrong here for my Delay block (processBulk(ConsumableSpan, PublishableSpan)): the block produces/consumes 0, but both processedIn and processedOut are 100, which is the amount of data available at the input port. This check then considers the EOS tag to be in range and finishes.

Yes, see also this comment: https://github.com/fair-acc/graph-prototype/pull/297/files#diff-d0a6abe85a332778390dc74e1c498e4c580677fe46db76395cdf2afefc41fb1eR1507 Until now, this has not been a problem because no block relied on being able to leave input samples unconsumed in processBulk. Probably the easiest and most generic way to fix it would be to remove the eosTag condition and only shut down the block in the next call to processBulk... I have to check whether this breaks something else.

wirew0rm linked an issue Mar 14, 2024 that may be closed by this pull request

[Xpt] graph: refactor Block::work() function #81

Closed

5 tasks

wirew0rm force-pushed the refactorWork branch 10 times, most recently from 56ee033 to 26da7e9 Compare March 21, 2024 12:45

wirew0rm force-pushed the refactorWork branch from 26da7e9 to 08ffb37 Compare March 21, 2024 14:21

drslebedev mentioned this pull request Mar 21, 2024

Fix qa_source effective sample rate test. #301

Merged

wirew0rm force-pushed the refactorWork branch from 08ffb37 to 073ddbc Compare March 21, 2024 14:54

wirew0rm marked this pull request as ready for review March 21, 2024 16:03

wirew0rm force-pushed the refactorWork branch from 073ddbc to 5432d47 Compare March 21, 2024 16:27

wirew0rm mentioned this pull request Mar 21, 2024

Code cleanup and build improvements #302

Merged

refactor workInternal

bf423d3

restructure the workInternal function to be easier to read and modify. This also subtly changes the meaning of the resampling and stride annotations.

wirew0rm force-pushed the refactorWork branch from 5432d47 to bf423d3 Compare March 22, 2024 08:34

RalphSteinhagen reviewed Mar 22, 2024

RalphSteinhagen approved these changes Mar 22, 2024

drslebedev approved these changes Mar 22, 2024

Incoroprate reveiw and sonarcloud feedback

77b9038

feedback II

a6dedc2

RalphSteinhagen merged commit 697f99d into main Mar 25, 2024
7 of 8 checks passed

RalphSteinhagen deleted the refactorWork branch March 25, 2024 12:01

frankosterfeld reviewed Mar 28, 2024
