Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

util/log: new experimental integration with Fluentd #57170

Merged
merged 1 commit into from
Feb 3, 2021

Conversation

knz
Copy link
Contributor

@knz knz commented Nov 26, 2020

Release note (cli change): It is now possible to redirect logging to
Fluentd-compatible network collectors. See
the documentation for details. This is an alpha-quality feature.

@knz knz requested a review from itsbilal November 26, 2020 11:23
@knz knz requested a review from a team as a code owner November 26, 2020 11:23
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@knz knz changed the title util/log: new experimental integration with Fluentd util/log: new JSON output formats, and experimental integration with Fluentd Dec 3, 2020
@knz
Copy link
Contributor Author

knz commented Dec 3, 2020

rebased, PTAL

@knz knz force-pushed the 20201126-fluent branch 2 times, most recently from d299879 to a99b80c Compare December 3, 2020 14:46
Copy link
Member

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly looks good to me, one comment about JSON serialization.

Reviewed 15 of 15 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz)


pkg/util/log/format_json.go, line 88 at r1 (raw file):

}()

func formatJSON(entry logpb.Entry, stacks []byte, forFluent, compact bool) *buffer {

This seems like a lot of manual legwork just to serialize an object into json. Was json.Marshal on a custom struct not good here?


pkg/util/log/logconfig/config.go, line 207 at r1 (raw file):

	CommonSinkConfig `yaml:",inline"`

	// used during validation.

Nit: First word should be capitalized.

@knz knz changed the title util/log: new JSON output formats, and experimental integration with Fluentd util/log: new experimental integration with Fluentd Dec 21, 2020
craig bot pushed a commit that referenced this pull request Dec 21, 2020
57990: eventpb: new JSON serialization with redaction markers r=itsbilal a=knz

Fixes  #45643

All commits but the last from  #58070 and prior 
This is a prereq for presenting structured events "inline" in  #57170. 

This patch introduces a new code generator and infrastructure to emit
structured event payloads using the JSON syntax, but also including
redaction markers for fields not marked explicitly as safe for
reporting.

This is not yet connected to the remainder of the logging system --
this change will be performed in a later commit.

Release note: None

58015: backupccl: fix progress updating test r=pbardea a=pbardea

This test was previously flaky since it was assuming that backup would
issue the same number of requests as spans issued. This assumption was
incorrect, and fixing the progress accounting for backup revealed that
this test was faulty.

While planning the work distribution for backup worker nodes,
PartitionSpans automatically merges spans that are co-located on the
same node, therefore reducing the number of ExportRequests issued.
Before this commit, we used to block on the ExportRequest responses.
The blocking was triggered on a per-range-request level. However,
progress is only updated when the processor-level batch request is
returned. This meant that the test might block a batch request and
therefore the progress of the job would be less than what the test
expected.

This is now fixed by adjusting the blocking mechanism and range counting
to all be at the level of the merged ranges with which the backup
processor operates.

Fixes #57831.

This test would fail under stress pretty quickly (~40) -- I was able to get
over 600 runs without failure with this patch.

Release note: none

58054: opt: generate semi and anti inverted joins on multi-column indexes r=rytaft a=mgartner

This commit adds tests for generating semi and anti inverted joins on
multi-column inverted indexes. A `ConstFilters` field was added to
`InvertedJoinPrivate`, similar to `LookupJoinPrivate` so that row count
estimates for these expressions are more accurate. This was necessary to
make the semi and anti joins the chosen plans for the exploration tests.
A future commit will add more comprehensive stats tests.

Release note: None

58081: roachtest,roachprod: Use new --log flag, fix parameterRE in expander r=itsbilal a=itsbilal

Now that logging configuration is specified using --log, and
not using that flag adds excess output, it's more desirable
to move the disk-stalled roachtest to that flag.

Also fixes a bug in the regular expression expander
in roachprod, as that was conflicting with parts
of YAML arguments that it shouldn't be touching.

Fixes #58021.

Release note: None.

58117: sql/rowexec: don't allocate buf per row in sketchInfo.addRow r=nvanbenschoten a=nvanbenschoten

The `intbuf` array was meant to stay on the stack, but was escaping to the heap because the call through the `hash` function variable was opaque to escape analysis.

At the end of a 4 hour, 2.2 TB IMPORT of TPC-E, this was responsible for **76.70%** of all heap allocations (by object).

<img width="1684" alt="Screen Shot 2020-12-20 at 9 58 04 PM" src="https://user-images.githubusercontent.com/5438456/102735277-fb065a00-430f-11eb-837a-7ad1c903cf55.png">

58120: storage: avoid heap allocation per value in pebbleIterator.FindSplitKey r=nvanbenschoten a=nvanbenschoten

This was fallout from 95b836d. `pebble.Iterator.Value` is unsafe (no copy) but `pebbleIterator.Value` is safe (alloc + copy).

```
name                                     old time/op    new time/op    delta
MVCCFindSplitKey_Pebble/valueSize=32-16    97.0ms ± 5%    76.5ms ±15%   -21.08%  (p=0.000 n=9+10)

name                                     old speed      new speed      delta
MVCCFindSplitKey_Pebble/valueSize=32-16   693MB/s ± 5%   881MB/s ±13%   +27.19%  (p=0.000 n=9+10)

name                                     old alloc/op   new alloc/op   delta
MVCCFindSplitKey_Pebble/valueSize=32-16    27.8MB ± 0%     0.0MB ±27%   -99.99%  (p=0.000 n=10+10)

name                                     old allocs/op  new allocs/op  delta
MVCCFindSplitKey_Pebble/valueSize=32-16      580k ± 0%        0k ± 0%  -100.00%  (p=0.000 n=10+9)
```

At the end of a 4 hour, 2.2 TB IMPORT of TPC-E, this was responsible for 15.96% of all heap allocations (by object).

<img width="1680" alt="Screen Shot 2020-12-20 at 10 13 35 PM" src="https://user-images.githubusercontent.com/5438456/102735999-d6ab7d00-4311-11eb-81fb-a1bd6d807323.png">


58121: storage: re-use buffer across calls to sstIterator.SeekGE r=nvanbenschoten a=nvanbenschoten

At the end of a 4 hour, 2.2 TB IMPORT of TPC-E, this was responsible for 2.67% of all heap allocations (by object) due to the call to `sstIterator.SeekGE` from `checkForKeyCollisionsGo`.

```
      File: cockroach
Type: alloc_objects
Time: Dec 21, 2020 at 2:51am (UTC)
Duration: 5.32s, Total samples = 6963804
Active filters:
   focus=SeekGE
Showing nodes accounting for 212999, 3.06% of 6963804 total
----------------------------------------------------------+-------------
                                            185692   100% |   github.com/cockroachdb/cockroach/pkg/storage.checkForKeyCollisionsGo /go/src/github.com/cockroachdb/cockroach/pkg/storage/mvcc.go:3807
         0     0%     0%     185692  2.67%                | github.com/cockroachdb/cockroach/pkg/storage.(*sstIterator).SeekGE /go/src/github.com/cockroachdb/cockroach/pkg/storage/sst_iterator.go:93
                                            185692   100% |   github.com/cockroachdb/cockroach/pkg/storage.EncodeKey /go/src/github.com/cockroachdb/cockroach/pkg/storage/batch.go:126
```

58135: util: add crdb_test_off build tag to disable metamorphization r=yuzefovich a=yuzefovich

We are now metamorphizing most of the test builds. However, there are
cases where that behavior is undesirable (for example, when using
`-rewrite` test flag with logic tests) - in such scenarios we want the
"production" build to occur. In order to have such functionality this
commit makes `crdb_test` build be dependent on the absence of
`crdb_test_off` build tag. An example usage of `-rewrite` option would
be `make testoptlogic TESTFLAGS='-rewrite' TAGS=crdb_test_off`.

Release note: None

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: Paul Bardea <pbardea@gmail.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@knz knz force-pushed the 20201126-fluent branch 2 times, most recently from eb01d89 to 8be8904 Compare December 28, 2020 19:29
craig bot pushed a commit that referenced this pull request Dec 31, 2020
58126: util/log: new JSON output formats r=itsbilal a=knz

Fixes #44755

(This code was extracted from #57170 to further simplify it.)

Release note (cli change): It is now possible to set the `format`
parameter of any log sink, including file sinks, to `json`,
`json-compact`, `json-fluent` or `json-fluent-compact` to write
entries as structured JSON. Refer to the linked reference
documentation for details.

Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rebased; now also with unit test. PTAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @itsbilal)


pkg/util/log/logconfig/config.go, line 207 at r1 (raw file):

Previously, itsbilal (Bilal Akhtar) wrote…

Nit: First word should be capitalized.

Done.

@knz knz force-pushed the 20201126-fluent branch 4 times, most recently from cf4dea0 to 73b3aa6 Compare January 29, 2021 18:58
@knz
Copy link
Contributor Author

knz commented Feb 1, 2021

Flaked on #59644 which is unrelated.

@knz
Copy link
Contributor Author

knz commented Feb 3, 2021

friendly ping

Release note (cli change): It is now possible to redirect logging to
[Fluentd](https://www.fluentd.org)-compatible network collectors. See
the reference sink documentation for details. This is an alpha-quality
feature.  Note that Fluent-enabled configuration only provide
minimal event buffering, and log entries are dropped if the
logging server becomes unavailable or network errors are
encountered. This is a known limitation and will be likely improved in
a later version.
Copy link
Member

@itsbilal itsbilal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Sorry for the late review!

Reviewed 16 of 16 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

@knz
Copy link
Contributor Author

knz commented Feb 3, 2021

Thank you!

bors r=itsbilal

@craig
Copy link
Contributor

craig bot commented Feb 3, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants