Add instrumentation about volume of data parsed during resynchronization #1675

Closed
Mohan-Dhawan opened this issue Feb 20, 2024 · 10 comments · Fixed by #1676
@Mohan-Dhawan

The Spicy profiler provides no information about the time spent in the resynchronization code. It would be great to have metrics around the volume of data required to achieve resynchronization and the total time taken.

@rsmmr rsmmr self-assigned this Feb 21, 2024
rsmmr added a commit that referenced this issue Feb 21, 2024
With `--enable-profiling` the output for Spicy units/fields now
includes a new `volume` column, like this:

```
#name                                                   count       time      avg-%    total-%          volume
[...]
spicy/unit/test::A                                          1     285500      43.96      43.96               8
spicy/unit/test::A/__gap__                                  4       3167       0.12       0.49               0
spicy/unit/test::A/__synchronize__                          1      35500       5.47       5.47               4
spicy/unit/test::A::a                                       1      74833      11.52      11.52               -
spicy/unit/test::A::b                                       1      15333       2.36       2.36               1
spicy/unit/test::A::c                                       1      19125       2.94       2.94               1
spicy/unit/test::A::d                                       1       7583       1.17       1.17               1
spicy/unit/test::A::e                                       1       8042       1.24       1.24               1
```

Three different things here:

- The `volume` column for `spicy/unit/TYPE` and
  `spicy/unit/TYPE::FIELD` augments the already existing timing
  measurement and reports the total, aggregate number of bytes that
  this unit/field got to parse over the course of the processing.

- For units going into synchronization mode, there are now additional
  rows `spicy/unit/TYPE/__synchronize__` that report both CPU time and
  volume spent in synchronization while processing that unit.

- For units encountering input gaps during synchronization, there are
  now additional rows `spicy/unit/TYPE/__gap__` that report the total,
  aggregate gap size encountered while processing the unit.

All the volume measurements are taken as differences of two offsets
inside the input stream. For normal unit/field parsing, we subtract
the final offset after parsing an instance from the initial offset
where its parsing started.[1] For synchronization, it's the offset
where synchronization stopped successfully minus where it started.[2]
For gaps, it's the offset where we continued after the gap minus where
the gap started.[3] All these differences are then added up for each
row over the course of total input stream processing.

Note that volume isn't counted if parsing for some reason never
reaches the point where the end measurement would be taken (e.g., a
parser error prevents it from being reached; in the output above
that's the case for `spicy/unit/test::A::a`).

Closes #1675.

[1] This *includes* any ranges that the unit spent in synchronization
mode trying to recover from parse errors.

[2] This does *not* include any gaps encountered because they don't
affect stream offsets.

[3] Little glitch: these values can currently be off by one due to some
internal ambiguity.
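The offset-difference bookkeeping described above can be sketched as follows. This is a minimal illustration, not the actual Spicy runtime API: the `VolumeCounter` class and its method names are invented here to mirror the description (record the start offset when a measurement opens, add the delta when it closes, and add nothing if parsing never reaches the end point).

```python
# Sketch (hypothetical, not the Spicy runtime API) of per-row volume
# accounting as differences of two stream offsets.

class VolumeCounter:
    """Accumulates offset deltas per profiler row."""

    def __init__(self):
        self.volume = {}   # row name -> aggregated bytes
        self._starts = {}  # row name -> start offset of the open measurement

    def begin(self, row, offset):
        # Remember where parsing/synchronization for this row started.
        self._starts[row] = offset

    def end(self, row, offset):
        # Add the delta to the row's aggregate. If parsing never reaches
        # this end point (e.g., a parse error aborts the instance), end()
        # is never called and nothing is added for that instance.
        start = self._starts.pop(row, None)
        if start is not None:
            self.volume[row] = self.volume.get(row, 0) + (offset - start)

# Example mirroring the output above: unit A parses bytes 0..8, with
# synchronization covering offsets 2..6.
c = VolumeCounter()
c.begin("spicy/unit/test::A", 0)
c.begin("spicy/unit/test::A/__synchronize__", 2)
c.end("spicy/unit/test::A/__synchronize__", 6)
c.end("spicy/unit/test::A", 8)
```

With these calls, the unit row aggregates 8 bytes and the `__synchronize__` row 4, matching the `volume` column in the sample output.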
rsmmr commented Feb 21, 2024

@Mohan-Dhawan give #1676 a try.

@Mohan-Dhawan

Thanks, @rsmmr. I do get the volume stats in the output. It would also be nice to know whether a higher reported volume in gaps or synchronize is detrimental to performance.

rsmmr commented Feb 23, 2024

> Thanks, @rsmmr. I do get the volume stats in the output.

Can I see the output?

> It would also be nice to know whether a higher reported volume in gaps or synchronize is detrimental to performance.

I don't follow what you mean; can you elaborate on how the numbers could be improved?

@Mohan-Dhawan

```
spicy/unit/<pdu>/__gap__                   172043743 12695491160       0.00      14.53      2013841114
spicy/unit/<pdu>/__synchronize__               82553 3648394463       0.00       4.18             184
```

What I wanted to know is: what are acceptable limits, performance-wise, for the gap and synchronize volumes?

rsmmr commented Feb 23, 2024

There's no general answer to that. You need to put it in relation to the input volume / standard parsing.
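"Putting it in relation" might look like the rough sketch below. The `percent_of_input` helper is made up for illustration; the volumes come from the profiler rows quoted above, and the total input size is an assumption based on the ~698 MB trace mentioned elsewhere in this thread.

```python
# Hedged sketch: express reported gap/synchronize volumes as a fraction
# of the total input, to judge whether they are significant.
gap_volume = 2_013_841_114       # bytes, from the __gap__ row above
sync_volume = 184                # bytes, from the __synchronize__ row above
total_input = 698 * 1024 ** 2    # assumed total trace size in bytes

def percent_of_input(volume, total):
    """Return a volume as a percentage of the total input volume."""
    return 100.0 * volume / total

print(f"gaps:        {percent_of_input(gap_volume, total_input):.1f}% of input")
print(f"synchronize: {percent_of_input(sync_volume, total_input):.6f}% of input")
```

Note that aggregated gap volume can exceed the raw trace size, since it is summed over every unit instance that observes a gap rather than counted once per input byte.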

Mohan-Dhawan commented Feb 23, 2024

The context here is that I have a 698 MB trace with 8883 connections, but with gaps in the content. A flamegraph of its execution showed that close to 66% of samples were in the unit responsible for synchronization. About half of those samples came from MatchState::advance, and the majority of the calls from it went to the function jrx_regexec_partial_min. Given that the volume of bytes in the __synchronize__ entry is just 184, is the high volume of jrx_* calls indicative of an edge case?

rsmmr commented Feb 27, 2024

Can you send me the full output please?

rsmmr commented Mar 6, 2024

For the record, I never received the full output, so we need to take the measurement with a grain of salt for now.

Mohan-Dhawan commented Mar 6, 2024

Hi @rsmmr. Sorry, it completely slipped my mind. I've sent you the detailed output in the Zeek Slack DM.

@rsmmr rsmmr closed this as completed in d90f191 Mar 6, 2024
@bbannier

Somewhat related, I bumped #1133 into TODO.
