Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xsk: increase NAPI budget for AF_XDP zero-copy path #14

Closed
wants to merge 5 commits into from

Conversation

kernel-patches-bot
Copy link

Pull request for series with
subject: xsk: increase NAPI budget for AF_XDP zero-copy path
version: 1
url: https://patchwork.ozlabs.org/project/netdev/list/?series=199973

tsipa and others added 5 commits September 7, 2020 12:24
The NAPI budget (NAPI_POLL_WEIGHT), meaning the number of received
packets that are allowed to be processed for each NAPI invocation,
takes into consideration that each received packet is aimed for the
kernel networking stack.

That is not the case for the AF_XDP receive path, where the cost of
each packet is significantly less. Therefore, this commit adds a new
NAPI budget, which is the NAPI_POLL_WEIGHT scaled by 4. Typically that
would be 256 in most configuration. It is encouraged that AF_XDP
zero-copy capable drivers use the XSK_NAPI_WEIGHT, when zero-copy is
enabled.

Processing 256 packets targeted for AF_XDP is still less work than 64
(NAPI_POLL_WEIGHT) packets going to the kernel networking stack.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 include/net/xdp_sock_drv.h | 3 +++
 1 file changed, 3 insertions(+)
Start using XSK_NAPI_WEIGHT as NAPI poll budget for the AF_XDP Rx
zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Start using XSK_NAPI_WEIGHT as NAPI poll budget for the AF_XDP Rx
zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/net/ethernet/intel/ice/ice_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
Start using XSK_NAPI_WEIGHT as NAPI poll budget for the AF_XDP Rx
zero-copy path.

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
@kernel-patches-bot
Copy link
Author

At least one diff in series https://patchwork.ozlabs.org/project/netdev/list/?series=199973 expired. Closing PR.

@kernel-patches-bot kernel-patches-bot deleted the series/199972 branch September 15, 2020 17:49
kernel-patches-bot pushed a commit that referenced this pull request Sep 16, 2020
…s metrics" test

Linux 5.9 introduced perf test case "Parse and process metrics" and
on s390 this test case always dumps core:

  [root@t35lp67 perf]# ./perf test -vvvv -F 67
  67: Parse and process metrics                             :
  --- start ---
  metric expr inst_retired.any / cpu_clk_unhalted.thread for IPC
  parsing metric: inst_retired.any / cpu_clk_unhalted.thread
  Segmentation fault (core dumped)
  [root@t35lp67 perf]#

I debugged this core dump and gdb shows this call chain:

  (gdb) where
   #0  0x000003ffabc3192a in __strnlen_c_1 () from /lib64/libc.so.6
   #1  0x000003ffabc293de in strcasestr () from /lib64/libc.so.6
   #2  0x0000000001102ba2 in match_metric(list=0x1e6ea20 "inst_retired.any",
            n=<optimized out>)
       at util/metricgroup.c:368
   #3  find_metric (map=<optimized out>, map=<optimized out>,
           metric=0x1e6ea20 "inst_retired.any")
      at util/metricgroup.c:765
   #4  __resolve_metric (ids=0x0, map=<optimized out>, metric_list=0x0,
           metric_no_group=<optimized out>, m=<optimized out>)
      at util/metricgroup.c:844
   #5  resolve_metric (ids=0x0, map=0x0, metric_list=0x0,
          metric_no_group=<optimized out>)
      at util/metricgroup.c:881
   #6  metricgroup__add_metric (metric=<optimized out>,
        metric_no_group=metric_no_group@entry=false, events=<optimized out>,
        events@entry=0x3ffd84fb878, metric_list=0x0,
        metric_list@entry=0x3ffd84fb868, map=0x0)
      at util/metricgroup.c:943
   #7  0x00000000011034ae in metricgroup__add_metric_list (map=0x13f9828 <map>,
        metric_list=0x3ffd84fb868, events=0x3ffd84fb878,
        metric_no_group=<optimized out>, list=<optimized out>)
      at util/metricgroup.c:988
   #8  parse_groups (perf_evlist=perf_evlist@entry=0x1e70260,
          str=str@entry=0x12f34b2 "IPC", metric_no_group=<optimized out>,
          metric_no_merge=<optimized out>,
          fake_pmu=fake_pmu@entry=0x1462f18 <perf_pmu.fake>,
          metric_events=0x3ffd84fba58, map=0x1)
      at util/metricgroup.c:1040
   #9  0x0000000001103eb2 in metricgroup__parse_groups_test(
  	evlist=evlist@entry=0x1e70260, map=map@entry=0x13f9828 <map>,
  	str=str@entry=0x12f34b2 "IPC",
  	metric_no_group=metric_no_group@entry=false,
  	metric_no_merge=metric_no_merge@entry=false,
  	metric_events=0x3ffd84fba58)
      at util/metricgroup.c:1082
   #10 0x00000000010c84d8 in __compute_metric (ratio2=0x0, name2=0x0,
          ratio1=<synthetic pointer>, name1=0x12f34b2 "IPC",
  	vals=0x3ffd84fbad8, name=0x12f34b2 "IPC")
      at tests/parse-metric.c:159
   #11 compute_metric (ratio=<synthetic pointer>, vals=0x3ffd84fbad8,
  	name=0x12f34b2 "IPC")
      at tests/parse-metric.c:189
   #12 test_ipc () at tests/parse-metric.c:208
.....
..... omitted many more lines

This test case was added with
commit 218ca91 ("perf tests: Add parse metric test for frontend metric").

When I compile with make DEBUG=y it works fine and I do not get a core dump.

It turned out that the above listed function call chain worked on a struct
pmu_event array which requires a trailing element with zeroes which was
missing. The marco map_for_each_event() loops over that array tests for members
metric_expr/metric_name/metric_group being non-NULL. Adding this element fixes
the issue.

Output after:

  [root@t35lp46 perf]# ./perf test 67
  67: Parse and process metrics                             : Ok
  [root@t35lp46 perf]#

Committer notes:

As Ian remarks, this is not s390 specific:

<quote Ian>
  This also shows up with address sanitizer on all architectures
  (perhaps change the patch title) and perhaps add a "Fixes: <commit>"
  tag.

  =================================================================
  ==4718==ERROR: AddressSanitizer: global-buffer-overflow on address
  0x55c93b4d59e8 at pc 0x55c93a1541e2 bp 0x7ffd24327c60 sp
  0x7ffd24327c58
  READ of size 8 at 0x55c93b4d59e8 thread T0
      #0 0x55c93a1541e1 in find_metric tools/perf/util/metricgroup.c:764:2
      #1 0x55c93a153e6c in __resolve_metric tools/perf/util/metricgroup.c:844:9
      #2 0x55c93a152f18 in resolve_metric tools/perf/util/metricgroup.c:881:9
      #3 0x55c93a1528db in metricgroup__add_metric
  tools/perf/util/metricgroup.c:943:9
      #4 0x55c93a151996 in metricgroup__add_metric_list
  tools/perf/util/metricgroup.c:988:9
      #5 0x55c93a1511b9 in parse_groups tools/perf/util/metricgroup.c:1040:8
      #6 0x55c93a1513e1 in metricgroup__parse_groups_test
  tools/perf/util/metricgroup.c:1082:9
      #7 0x55c93a0108ae in __compute_metric tools/perf/tests/parse-metric.c:159:8
      #8 0x55c93a010744 in compute_metric tools/perf/tests/parse-metric.c:189:9
      #9 0x55c93a00f5ee in test_ipc tools/perf/tests/parse-metric.c:208:2
      #10 0x55c93a00f1e8 in test__parse_metric
  tools/perf/tests/parse-metric.c:345:2
      #11 0x55c939fd7202 in run_test tools/perf/tests/builtin-test.c:410:9
      #12 0x55c939fd6736 in test_and_print tools/perf/tests/builtin-test.c:440:9
      #13 0x55c939fd58c3 in __cmd_test tools/perf/tests/builtin-test.c:661:4
      #14 0x55c939fd4e02 in cmd_test tools/perf/tests/builtin-test.c:807:9
      #15 0x55c939e4763d in run_builtin tools/perf/perf.c:313:11
      #16 0x55c939e46475 in handle_internal_command tools/perf/perf.c:365:8
      #17 0x55c939e4737e in run_argv tools/perf/perf.c:409:2
      #18 0x55c939e45f7e in main tools/perf/perf.c:539:3

  0x55c93b4d59e8 is located 0 bytes to the right of global variable
  'pme_test' defined in 'tools/perf/tests/parse-metric.c:17:25'
  (0x55c93b4d54a0) of size 1352
  SUMMARY: AddressSanitizer: global-buffer-overflow
  tools/perf/util/metricgroup.c:764:2 in find_metric
  Shadow bytes around the buggy address:
    0x0ab9a7692ae0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692af0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b10: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b20: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  =>0x0ab9a7692b30: 00 00 00 00 00 00 00 00 00 00 00 00 00[f9]f9 f9
    0x0ab9a7692b40: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
    0x0ab9a7692b50: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
    0x0ab9a7692b60: f9 f9 f9 f9 f9 f9 f9 f9 00 00 00 00 00 00 00 00
    0x0ab9a7692b70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x0ab9a7692b80: f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9 f9
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:	   fa
    Freed heap region:	   fd
    Stack left redzone:	   f1
    Stack mid redzone:	   f2
    Stack right redzone:     f3
    Stack after return:	   f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:	   f6
    Poisoned by user:        f7
    Container overflow:	   fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
    Shadow gap:              cc
</quote>

I'm also adding the missing "Fixes" tag and setting just .name to NULL,
as doing it that way is more compact (the compiler will zero out
everything else) and the table iterators look for .name being NULL as
the sentinel marking the end of the table.

Fixes: 0a507af ("perf tests: Add parse metric test for ipc metric")
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Reviewed-by: Sumanth Korikkar <sumanthk@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: http://lore.kernel.org/lkml/20200825071211.16959-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Sep 24, 2020
The evsel->unit borrows a pointer of pmu event or alias instead of
owns a string.  But tool event (duration_time) passes a result of
strdup() caused a leak.

It was found by ASAN during metric test:

  Direct leak of 210 byte(s) in 70 object(s) allocated from:
    #0 0x7fe366fca0b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x559fbbcc6ea3 in add_event_tool util/parse-events.c:414
    #2 0x559fbbcc6ea3 in parse_events_add_tool util/parse-events.c:1414
    #3 0x559fbbd8474d in parse_events_parse util/parse-events.y:439
    #4 0x559fbbcc95da in parse_events__scanner util/parse-events.c:2096
    #5 0x559fbbcc95da in __parse_events util/parse-events.c:2141
    #6 0x559fbbc28555 in check_parse_id tests/pmu-events.c:406
    #7 0x559fbbc28555 in check_parse_id tests/pmu-events.c:393
    #8 0x559fbbc28555 in check_parse_cpu tests/pmu-events.c:415
    #9 0x559fbbc28555 in test_parsing tests/pmu-events.c:498
    #10 0x559fbbc0109b in run_test tests/builtin-test.c:410
    #11 0x559fbbc0109b in test_and_print tests/builtin-test.c:440
    #12 0x559fbbc03e69 in __cmd_test tests/builtin-test.c:695
    #13 0x559fbbc03e69 in cmd_test tests/builtin-test.c:807
    #14 0x559fbbc691f4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #15 0x559fbbb071a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #16 0x559fbbb071a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #17 0x559fbbb071a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #18 0x7fe366b68cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: f0fbb11 ("perf stat: Implement duration_time as a proper event")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Sep 24, 2020
The test_generic_metric() missed to release entries in the pctx.  Asan
reported following leak (and more):

  Direct leak of 128 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c9396980e in calloc (/lib/x86_64-linux-gnu/libasan.so.5+0x10780e)
    #1 0x55f7e748cc14 in hashmap_grow (/home/namhyung/project/linux/tools/perf/perf+0x90cc14)
    #2 0x55f7e748d497 in hashmap__insert (/home/namhyung/project/linux/tools/perf/perf+0x90d497)
    #3 0x55f7e7341667 in hashmap__set /home/namhyung/project/linux/tools/perf/util/hashmap.h:111
    #4 0x55f7e7341667 in expr__add_ref util/expr.c:120
    #5 0x55f7e7292436 in prepare_metric util/stat-shadow.c:783
    #6 0x55f7e729556d in test_generic_metric util/stat-shadow.c:858
    #7 0x55f7e712390b in compute_single tests/parse-metric.c:128
    #8 0x55f7e712390b in __compute_metric tests/parse-metric.c:180
    #9 0x55f7e712446d in compute_metric tests/parse-metric.c:196
    #10 0x55f7e712446d in test_dcache_l2 tests/parse-metric.c:295
    #11 0x55f7e712446d in test__parse_metric tests/parse-metric.c:355
    #12 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #13 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #14 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #15 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #16 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #17 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #18 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #19 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #20 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 6d432c4 ("perf tools: Add test_generic_metric function")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Sep 24, 2020
The metricgroup__add_metric() can find multiple match for a metric group
and it's possible to fail.  Also it can fail in the middle like in
resolve_metric() even for single metric.

In those cases, the intermediate list and ids will be leaked like:

  Direct leak of 3 byte(s) in 1 object(s) allocated from:
    #0 0x7f4c938f40b5 in strdup (/lib/x86_64-linux-gnu/libasan.so.5+0x920b5)
    #1 0x55f7e71c1bef in __add_metric util/metricgroup.c:683
    #2 0x55f7e71c31d0 in add_metric util/metricgroup.c:906
    #3 0x55f7e71c3844 in metricgroup__add_metric util/metricgroup.c:940
    #4 0x55f7e71c488d in metricgroup__add_metric_list util/metricgroup.c:993
    #5 0x55f7e71c488d in parse_groups util/metricgroup.c:1045
    #6 0x55f7e71c60a4 in metricgroup__parse_groups_test util/metricgroup.c:1087
    #7 0x55f7e71235ae in __compute_metric tests/parse-metric.c:164
    #8 0x55f7e7124650 in compute_metric tests/parse-metric.c:196
    #9 0x55f7e7124650 in test_recursion_fail tests/parse-metric.c:318
    #10 0x55f7e7124650 in test__parse_metric tests/parse-metric.c:356
    #11 0x55f7e70be09b in run_test tests/builtin-test.c:410
    #12 0x55f7e70be09b in test_and_print tests/builtin-test.c:440
    #13 0x55f7e70c101a in __cmd_test tests/builtin-test.c:661
    #14 0x55f7e70c101a in cmd_test tests/builtin-test.c:807
    #15 0x55f7e7126214 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:312
    #16 0x55f7e6fc41a8 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:364
    #17 0x55f7e6fc41a8 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:408
    #18 0x55f7e6fc41a8 in main /home/namhyung/project/linux/tools/perf/perf.c:538
    #19 0x7f4c93492cc9 in __libc_start_main ../csu/libc-start.c:308

Fixes: 83de0b7 ("perf metric: Collect referenced metrics in struct metric_ref_node")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20200915031819.386559-9-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Sep 24, 2020
Ido Schimmel says:

====================
mlxsw: Refactor headroom management

Petr says:

On Spectrum, port buffers, also called port headroom, is where packets are
stored while they are parsed and the forwarding decision is being made. For
lossless traffic flows, in case shared buffer admission is not allowed,
headroom is also where to put the extra traffic received before the sent
PAUSE takes effect. Another aspect of the port headroom is the so called
internal buffer, which is used for egress mirroring.

Linux supports two DCB interfaces related to the headroom: dcbnl_setbuffer
for configuration, and dcbnl_getbuffer for inspection. In order to make it
possible to implement these interfaces, it is first necessary to clean up
headroom handling, which is currently strewn in several places in the
driver.

The end goal is an architecture whereby it is possible to take a copy of
the current configuration, adjust parameters, and then hand the proposed
configuration over to the system to implement it. When everything works,
the proposed configuration is accepted and saved. First, this centralizes
the reconfiguration handling to one function, which takes care of
coordinating buffer size changes and priority map changes to avoid
introducing drops. Second, the fact that the configuration is all in one
place makes it easy to keep a backup and handle error path rollbacks, which
were previously hard to understand.

Patch #1 introduces struct mlxsw_sp_hdroom, which will keep port headroom
configuration.

Patch #2 unifies handling of delay provision between PFC and PAUSE. From
now on, delay is to be measured in bytes of extra space, and will not
include MTU. PFC handler sets the delay directly from the parameter it gets
through the DCB interface. For PAUSE, MLXSW_SP_PAUSE_DELAY is converted to
have the same meaning.

In patches #3-#5, MTU, lossiness and priorities are gradually moved over to
struct mlxsw_sp_hdroom.

In patches #6-#11, handling of buffer resizing and priority maps is moved
from spectrum.c and spectrum_dcb.c to spectrum_buffers.c. The API is
gradually adapted so that struct mlxsw_sp_hdroom becomes the main interface
through which the various clients express how the headroom should be
configured.

Patch #12 is a small cleanup that the previous transformation made
possible.

In patch #13, the port init code becomes a boring client of the headroom
code, instead of rolling its own thing.

Patches #14 and #15 move handling of internal mirroring buffer to the new
headroom code as well. Previously, this code was in the SPAN module. This
patchset converts the SPAN module to another boring client of the headroom
code.
====================

Signed-off-by: David S. Miller <davem@davemloft.net>
kernel-patches-bot pushed a commit that referenced this pull request Nov 16, 2020
Ido Schimmel says:

====================
nexthop: Add support for nexthop objects offload

This patch set adds support for nexthop objects offload with a dummy
implementation over netdevsim. mlxsw support will be added later.

The general idea is very similar to route offload in that notifications
are sent whenever nexthop objects are changed. A listener can veto the
change and the error will be communicated to user space with extack.

To keep listeners as simple as possible, they not only receive
notifications for the nexthop object that is changed, but also for all
the other objects affected by this change. For example, when a single
nexthop is replaced, a replace notification is sent for the single
nexthop, but also for all the nexthop groups this nexthop is member in.
This relieves listeners from the need to track such dependencies.

To simplify things further for listeners, the notification info does not
contain the raw nexthop data structures (e.g., 'struct nexthop'), but
less complex data structures into which the raw data structures are
parsed into.

Tested with a new selftest over netdevsim and with fib_nexthops.sh:

Tests passed: 164
Tests failed:   0

Patch set overview:

Patches #1-#4 introduce the aforementioned data structures and convert
existing listeners (i.e., the VXLAN driver) to use them.

Patches #5-#6 add a new RTNH_F_TRAP flag and the ability to set it and
RTNH_F_OFFLOAD on nexthops. This flag is used by netdevsim for testing
purposes and will also be used by mlxsw. These flags are consistent with
the existing RTM_F_OFFLOAD and RTM_F_TRAP flags.

Patches #7-#14 gradually add the new nexthop notifications.

Patches #15-#18 add a dummy implementation for nexthop offload over
netdevsim and a selftest to exercise both good and bad flows.

Changes since RFC [1]:

Patch #1: s/is_encap/has_encap/
Patch #3: Add a blank line in __nh_notifier_single_info_init()
Patch #5: Reword commit message
Patch #6: s/nexthop_hw_flags_set/nexthop_set_hw_flags/
Patch #7: Reword commit message
Patch #11: Allocate extack on the stack

Follow-up patch sets:

selftests: forwarding: Add nexthop objects tests
mlxsw: Preparations for nexthop objects support - part 1/2
mlxsw: Preparations for nexthop objects support - part 2/2
mlxsw: Add support for nexthop objects
mlxsw: Add support for blackhole nexthops
mlxsw: Update adjacency index more efficiently

[1] https://lore.kernel.org/netdev/20200908091037.2709823-1-idosch@idosch.org/
====================

Link: https://lore.kernel.org/r/20201104133040.1125369-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kernel-patches-bot pushed a commit that referenced this pull request Nov 20, 2020
This fix is for a failure that occurred in the DWARF unwind perf test.

Stack unwinders may probe memory when looking for frames.

Memory sanitizer will poison and track uninitialized memory on the
stack, and on the heap if the value is copied to the heap.

This can lead to false memory sanitizer failures for the use of an
uninitialized value.

Avoid this problem by removing the poison on the copied stack.

The full msan failure with track origins looks like:

==2168==WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x559ceb10755b in handle_cfi elfutils/libdwfl/frame_unwind.c:648:8
    #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #23 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceb106acf in __libdwfl_frame_reg_set elfutils/libdwfl/frame_unwind.c:77:22
    #1 0x559ceb106acf in handle_cfi elfutils/libdwfl/frame_unwind.c:627:13
    #2 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #3 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #4 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #5 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #6 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #7 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #8 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #9 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #10 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #11 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #12 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #13 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #14 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #15 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #16 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #17 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #18 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #19 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #20 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #21 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #22 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #23 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #24 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceb106a54 in handle_cfi elfutils/libdwfl/frame_unwind.c:613:9
    #1 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #2 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #3 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #4 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #5 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #6 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #7 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #8 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #9 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #10 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #11 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #12 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #13 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #14 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #15 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #16 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #17 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #18 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #19 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #20 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #21 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #22 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #23 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559ceaff8800 in memory_read tools/perf/util/unwind-libdw.c:156:10
    #1 0x559ceb10f053 in expr_eval elfutils/libdwfl/frame_unwind.c:501:13
    #2 0x559ceb1060cc in handle_cfi elfutils/libdwfl/frame_unwind.c:603:18
    #3 0x559ceb105448 in __libdwfl_frame_unwind elfutils/libdwfl/frame_unwind.c:741:4
    #4 0x559ceb0ece90 in dwfl_thread_getframes elfutils/libdwfl/dwfl_frame.c:435:7
    #5 0x559ceb0ec6b7 in get_one_thread_frames_cb elfutils/libdwfl/dwfl_frame.c:379:10
    #6 0x559ceb0ec6b7 in get_one_thread_cb elfutils/libdwfl/dwfl_frame.c:308:17
    #7 0x559ceb0ec6b7 in dwfl_getthreads elfutils/libdwfl/dwfl_frame.c:283:17
    #8 0x559ceb0ec6b7 in getthread elfutils/libdwfl/dwfl_frame.c:354:14
    #9 0x559ceb0ec6b7 in dwfl_getthread_frames elfutils/libdwfl/dwfl_frame.c:388:10
    #10 0x559ceaff6ae6 in unwind__get_entries tools/perf/util/unwind-libdw.c:236:8
    #11 0x559ceabc9dbc in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:111:8
    #12 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #13 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #14 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #15 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #16 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #17 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #18 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #19 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #20 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #21 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #22 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #23 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #24 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #25 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was stored to memory at
    #0 0x559cea9027d9 in __msan_memcpy llvm/llvm-project/compiler-rt/lib/msan/msan_interceptors.cpp:1558:3
    #1 0x559cea9d2185 in sample_ustack tools/perf/arch/x86/tests/dwarf-unwind.c:41:2
    #2 0x559cea9d202c in test__arch_unwind_sample tools/perf/arch/x86/tests/dwarf-unwind.c:72:9
    #3 0x559ceabc9cbd in test_dwarf_unwind__thread tools/perf/tests/dwarf-unwind.c:106:6
    #4 0x559ceabca5cf in test_dwarf_unwind__compare tools/perf/tests/dwarf-unwind.c:138:26
    #5 0x7f812a6865b0 in bsearch (libc.so.6+0x4e5b0)
    #6 0x559ceabca871 in test_dwarf_unwind__krava_3 tools/perf/tests/dwarf-unwind.c:162:2
    #7 0x559ceabca926 in test_dwarf_unwind__krava_2 tools/perf/tests/dwarf-unwind.c:169:9
    #8 0x559ceabca946 in test_dwarf_unwind__krava_1 tools/perf/tests/dwarf-unwind.c:174:9
    #9 0x559ceabcae12 in test__dwarf_unwind tools/perf/tests/dwarf-unwind.c:211:8
    #10 0x559ceabbc4ab in run_test tools/perf/tests/builtin-test.c:418:9
    #11 0x559ceabbc4ab in test_and_print tools/perf/tests/builtin-test.c:448:9
    #12 0x559ceabbac70 in __cmd_test tools/perf/tests/builtin-test.c:669:4
    #13 0x559ceabbac70 in cmd_test tools/perf/tests/builtin-test.c:815:9
    #14 0x559cea960e30 in run_builtin tools/perf/perf.c:313:11
    #15 0x559cea95fbce in handle_internal_command tools/perf/perf.c:365:8
    #16 0x559cea95fbce in run_argv tools/perf/perf.c:409:2
    #17 0x559cea95fbce in main tools/perf/perf.c:539:3

  Uninitialized value was created by an allocation of 'bf' in the stack frame of function 'perf_event__synthesize_mmap_events'
    #0 0x559ceafc5f60 in perf_event__synthesize_mmap_events tools/perf/util/synthetic-events.c:445

SUMMARY: MemorySanitizer: use-of-uninitialized-value elfutils/libdwfl/frame_unwind.c:648:8 in handle_cfi
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: clang-built-linux@googlegroups.com
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sandeep Dasgupta <sdasgup@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lore.kernel.org/lkml/20201113182053.754625-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Dec 15, 2020
Fix the following possible deadlock reported by lockdep disabling BH
running mt76_free_pending_txwi()

================================
WARNING: inconsistent lock state
5.9.0-rc6 #14 Not tainted
--------------------------------
inconsistent {IN-SOFTIRQ-W} -> {SOFTIRQ-ON-W} usage.
rmmod/1227 [HC0[0]:SC0[0]:HE1:SE1] takes:
ffff888156a83530 (&dev->lock#2){+.?.}-{2:2}, at: mt76_dma_cleanup+0x125/0x150 [mt76]
{IN-SOFTIRQ-W} state was registered at:
  __lock_acquire+0x20c/0x6b0
  lock_acquire+0x9d/0x220
  _raw_spin_lock+0x2c/0x70
  mt76_dma_tx_cleanup+0xc7/0x200 [mt76]
  mt76x02_poll_tx+0x31/0xb0 [mt76x02_lib]
  napi_poll+0x3a/0x100
  net_rx_action+0xa8/0x200
  __do_softirq+0xc4/0x430
  asm_call_on_stack+0xf/0x20
  do_softirq_own_stack+0x49/0x60
  irq_exit_rcu+0x9a/0xd0
  common_interrupt+0xa4/0x190
  asm_common_interrupt+0x1e/0x40
irq event stamp: 9915
hardirqs last  enabled at (9915): [<ffffffff8124e286>] __free_pages_ok+0x336/0x3b0
hardirqs last disabled at (9914): [<ffffffff8124e24e>] __free_pages_ok+0x2fe/0x3b0
softirqs last  enabled at (9912): [<ffffffffa03aa672>] mt76_dma_rx_cleanup+0xa2/0x120 [mt76]
softirqs last disabled at (9846): [<ffffffffa03aa5ea>] mt76_dma_rx_cleanup+0x1a/0x120 [mt76]

other info that might help us debug this:
 Possible unsafe locking scenario:

       CPU0
       ----
  lock(&dev->lock#2);
  <Interrupt>
    lock(&dev->lock#2);

 *** DEADLOCK ***

1 lock held by rmmod/1227:
 #0: ffff88815b5eb240 (&dev->mutex){....}-{3:3}, at: driver_detach+0xb5/0x110

stack backtrace:
CPU: 1 PID: 1227 Comm: rmmod Kdump: loaded Not tainted 5.9.0-rc6-wdn-src+ #14
Hardware name: Dell Inc. Studio XPS 1340/0K183D, BIOS A11 09/08/2009
Call Trace:
 dump_stack+0x77/0xa0
 mark_lock_irq.cold+0x15/0x39
 mark_lock+0x1fc/0x500
 mark_usage+0xc7/0x140
 __lock_acquire+0x20c/0x6b0
 ? find_held_lock+0x2b/0x80
 ? sched_clock_cpu+0xc/0xb0
 lock_acquire+0x9d/0x220
 ? mt76_dma_cleanup+0x125/0x150 [mt76]
 _raw_spin_lock+0x2c/0x70
 ? mt76_dma_cleanup+0x125/0x150 [mt76]
 mt76_dma_cleanup+0x125/0x150 [mt76]
 mt76x2_cleanup+0x5a/0x70 [mt76x2e]
 mt76x2e_remove+0x18/0x30 [mt76x2e]
 pci_device_remove+0x36/0xa0
 __device_release_driver+0x16c/0x220
 driver_detach+0xcf/0x110
 bus_remove_driver+0x56/0xca
 pci_unregister_driver+0x36/0x80
 __do_sys_delete_module.constprop.0+0x127/0x200
 ? syscall_enter_from_user_mode+0x1d/0x50
 ? trace_hardirqs_on+0x1c/0xe0
 do_syscall_64+0x33/0x80
 entry_SYSCALL_64_after_hwframe+0x44/0xa9
RIP: 0033:0x7ff0da54e36b
Code: 73 01 c3 48 8b 0d 2d 0b 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d fd 0a 0c 00 f7 d8 64 89 01 48

Fixes: dd57a95 ("mt76: move txwi handling code to dma.c, since it is mmio specific")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Felix Fietkau <nbd@nbd.name>
kernel-patches-bot pushed a commit that referenced this pull request Dec 15, 2020
Ido Schimmel says:

====================
mlxsw: Introduce initial XM router support

This patch set implements initial eXtended Mezzanine (XM) router
support.

The XM is an external device connected to the Spectrum-{2,3} ASICs using
dedicated Ethernet ports. Its purpose is to increase the number of
routes that can be offloaded to hardware. This is achieved by having the
ASIC act as a cache that refers cache misses to the XM where the FIB is
stored and LPM lookup is performed.

Future patch sets will add more sophisticated cache flushing and
selftests that utilize cache counters on the ASIC, which we plan to
expose via devlink-metric [1].

Patch set overview:

Patches #1-#2 add registers to insert/remove routes to/from the XM and
to enable/disable it. Patch #3 utilizes these registers in order to
implement XM-specific router low-level operations.

Patches #4-#5 query from firmware the availability of the XM and the
local ports that are used to connect the ASIC to the XM, so that netdevs
will not be created for them.

Patches #6-#8 initialize the XM by configuring its cache parameters.

Patch #9-#10 implement cache management, so that LPM lookup will be
correctly cached in the ASIC.

Patches #11-#13 implement cache flushing, so that routes
insertions/removals to/from the XM will flush the affected entries in
the cache.

Patch #14 configures the ASIC to allocate half of its memory for the
cache, so that room will be left for other entries (e.g., FDBs,
neighbours).

Patch #15 starts using the XM for IPv4 route offload, when available.

[1] https://lore.kernel.org/netdev/20200817125059.193242-1-idosch@idosch.org/
====================

Link: https://lore.kernel.org/r/20201214113041.2789043-1-idosch@idosch.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kernel-patches-bot pushed a commit that referenced this pull request Mar 10, 2021
Calling btrfs_qgroup_reserve_meta_prealloc from
btrfs_delayed_inode_reserve_metadata can result in flushing delalloc
while holding a transaction and delayed node locks. This is deadlock
prone. In the past multiple commits:

 * ae5e070 ("btrfs: qgroup: don't try to wait flushing if we're
already holding a transaction")

 * 6f23277 ("btrfs: qgroup: don't commit transaction when we already
 hold the handle")

Tried to solve various aspects of this but this was always a
whack-a-mole game. Unfortunately those 2 fixes don't solve a deadlock
scenario involving btrfs_delayed_node::mutex. Namely, one thread
can call btrfs_dirty_inode as a result of reading a file and modifying
its atime:

  PID: 6963   TASK: ffff8c7f3f94c000  CPU: 2   COMMAND: "test"
  #0  __schedule at ffffffffa529e07d
  #1  schedule at ffffffffa529e4ff
  #2  schedule_timeout at ffffffffa52a1bdd
  #3  wait_for_completion at ffffffffa529eeea             <-- sleeps with delayed node mutex held
  #4  start_delalloc_inodes at ffffffffc0380db5
  #5  btrfs_start_delalloc_snapshot at ffffffffc0393836
  #6  try_flush_qgroup at ffffffffc03f04b2
  #7  __btrfs_qgroup_reserve_meta at ffffffffc03f5bb6     <-- tries to reserve space and starts delalloc inodes.
  #8  btrfs_delayed_update_inode at ffffffffc03e31aa      <-- acquires delayed node mutex
  #9  btrfs_update_inode at ffffffffc0385ba8
 #10  btrfs_dirty_inode at ffffffffc038627b               <-- TRANSACTIION OPENED
 #11  touch_atime at ffffffffa4cf0000
 #12  generic_file_read_iter at ffffffffa4c1f123
 #13  new_sync_read at ffffffffa4ccdc8a
 #14  vfs_read at ffffffffa4cd0849
 #15  ksys_read at ffffffffa4cd0bd1
 #16  do_syscall_64 at ffffffffa4a052eb
 #17  entry_SYSCALL_64_after_hwframe at ffffffffa540008c

This will cause an asynchronous work to flush the delalloc inodes to
happen which can try to acquire the same delayed_node mutex:

  PID: 455    TASK: ffff8c8085fa4000  CPU: 5   COMMAND: "kworker/u16:30"
  #0  __schedule at ffffffffa529e07d
  #1  schedule at ffffffffa529e4ff
  #2  schedule_preempt_disabled at ffffffffa529e80a
  #3  __mutex_lock at ffffffffa529fdcb                    <-- goes to sleep, never wakes up.
  #4  btrfs_delayed_update_inode at ffffffffc03e3143      <-- tries to acquire the mutex
  #5  btrfs_update_inode at ffffffffc0385ba8              <-- this is the same inode that pid 6963 is holding
  #6  cow_file_range_inline.constprop.78 at ffffffffc0386be7
  #7  cow_file_range at ffffffffc03879c1
  #8  btrfs_run_delalloc_range at ffffffffc038894c
  #9  writepage_delalloc at ffffffffc03a3c8f
 #10  __extent_writepage at ffffffffc03a4c01
 #11  extent_write_cache_pages at ffffffffc03a500b
 #12  extent_writepages at ffffffffc03a6de2
 #13  do_writepages at ffffffffa4c277eb
 #14  __filemap_fdatawrite_range at ffffffffa4c1e5bb
 #15  btrfs_run_delalloc_work at ffffffffc0380987         <-- starts running delayed nodes
 #16  normal_work_helper at ffffffffc03b706c
 #17  process_one_work at ffffffffa4aba4e4
 #18  worker_thread at ffffffffa4aba6fd
 #19  kthread at ffffffffa4ac0a3d
 #20  ret_from_fork at ffffffffa54001ff

To fully address those cases the complete fix is to never issue any
flushing while holding the transaction or the delayed node lock. This
patch achieves it by calling qgroup_reserve_meta directly which will
either succeed without flushing or will fail and return -EDQUOT. In the
latter case that return value is going to be propagated to
btrfs_dirty_inode which will fallback to start a new transaction. That's
fine as the majority of time we expect the inode will have
BTRFS_DELAYED_NODE_INODE_DIRTY flag set which will result in directly
copying the in-memory state.

Fixes: c53e965 ("btrfs: qgroup: try to flush qgroup space when we get -EDQUOT")
CC: stable@vger.kernel.org # 5.10+
Reviewed-by: Qu Wenruo <wqu@suse.com>
Signed-off-by: Nikolay Borisov <nborisov@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
kernel-patches-bot pushed a commit that referenced this pull request Mar 10, 2021
The evlist has the maps with its own refcounts so we don't need to set
the pointers to NULL.  Otherwise following error was reported by Asan.

  # perf test -v 4
   4: Read samples using the mmap interface      :
  --- start ---
  test child forked, pid 139782
  mmap size 528384B

  =================================================================
  ==139782==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f1f76daee8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x564ba21a0fea in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
    #2 0x564ba21a1a0f in perf_cpu_map__read /home/namhyung/project/linux/tools/lib/perf/cpumap.c:149
    #3 0x564ba21a21cf in cpu_map__read_all_cpu_map /home/namhyung/project/linux/tools/lib/perf/cpumap.c:166
    #4 0x564ba21a21cf in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:181
    #5 0x564ba1e48298 in test__basic_mmap tests/mmap-basic.c:55
    #6 0x564ba1e278fb in run_test tests/builtin-test.c:428
    #7 0x564ba1e278fb in test_and_print tests/builtin-test.c:458
    #8 0x564ba1e29a53 in __cmd_test tests/builtin-test.c:679
    #9 0x564ba1e29a53 in cmd_test tests/builtin-test.c:825
    #10 0x564ba1e95cb4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #11 0x564ba1d1fa88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #12 0x564ba1d1fa88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #13 0x564ba1d1fa88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #14 0x7f1f768e4d09 in __libc_start_main ../csu/libc-start.c:308

    ...
  test child finished with 1
  ---- end ----
  Read samples using the mmap interface: FAILED!
  failed to open shell test directory: /home/namhyung/libexec/perf-core/tests/shell

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Link: https://lore.kernel.org/r/20210301140409.184570-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Mar 10, 2021
The evlist and the cpu/thread maps should be released together.
Otherwise following error was reported by Asan.

Note that this test still has memory leaks in DSOs so it still fails
even after this change.  I'll take a look at that too.

  # perf test -v 26
  26: Object code reading                        :
  --- start ---
  test child forked, pid 154184
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  symsrc__init: cannot get elf header.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
  Parsing event 'cycles'
  mmap size 528384B
  ...
  =================================================================
  ==154184==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 439 byte(s) in 1 object(s) allocated from:
    #0 0x7fcb66e77037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154
    #1 0x55ad9b7e821e in dso__new_id util/dso.c:1256
    #2 0x55ad9b8cfd4a in __machine__addnew_vdso util/vdso.c:132
    #3 0x55ad9b8cfd4a in machine__findnew_vdso util/vdso.c:347
    #4 0x55ad9b845b7e in map__new util/map.c:176
    #5 0x55ad9b8415a2 in machine__process_mmap2_event util/machine.c:1787
    #6 0x55ad9b8fab16 in perf_tool__process_synth_event util/synthetic-events.c:64
    #7 0x55ad9b8fab16 in perf_event__synthesize_mmap_events util/synthetic-events.c:499
    #8 0x55ad9b8fbfdf in __event__synthesize_thread util/synthetic-events.c:741
    #9 0x55ad9b8ff3e3 in perf_event__synthesize_thread_map util/synthetic-events.c:833
    #10 0x55ad9b738585 in do_test_code_reading tests/code-reading.c:608
    #11 0x55ad9b73b25d in test__code_reading tests/code-reading.c:722
    #12 0x55ad9b6f28fb in run_test tests/builtin-test.c:428
    #13 0x55ad9b6f28fb in test_and_print tests/builtin-test.c:458
    #14 0x55ad9b6f4a53 in __cmd_test tests/builtin-test.c:679
    #15 0x55ad9b6f4a53 in cmd_test tests/builtin-test.c:825
    #16 0x55ad9b760cc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #17 0x55ad9b5eaa88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #18 0x55ad9b5eaa88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #19 0x55ad9b5eaa88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #20 0x7fcb669acd09 in __libc_start_main ../csu/libc-start.c:308

    ...
  SUMMARY: AddressSanitizer: 471 byte(s) leaked in 2 allocation(s).
  test child finished with 1
  ---- end ----
  Object code reading: FAILED!

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210301140409.184570-6-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Mar 10, 2021
The evlist and the cpu/thread maps should be released together.
Otherwise following error was reported by Asan.

  $ perf test -v 28
  28: Use a dummy software event to keep tracking:
  --- start ---
  test child forked, pid 156810
  mmap size 528384B

  =================================================================
  ==156810==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7f637d2bce8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x55cc6295cffa in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
    #2 0x55cc6295da1f in perf_cpu_map__read /home/namhyung/project/linux/tools/lib/perf/cpumap.c:149
    #3 0x55cc6295e1df in cpu_map__read_all_cpu_map /home/namhyung/project/linux/tools/lib/perf/cpumap.c:166
    #4 0x55cc6295e1df in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:181
    #5 0x55cc626287cf in test__keep_tracking tests/keep-tracking.c:84
    #6 0x55cc625e38fb in run_test tests/builtin-test.c:428
    #7 0x55cc625e38fb in test_and_print tests/builtin-test.c:458
    #8 0x55cc625e5a53 in __cmd_test tests/builtin-test.c:679
    #9 0x55cc625e5a53 in cmd_test tests/builtin-test.c:825
    #10 0x55cc62651cc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #11 0x55cc624dba88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #12 0x55cc624dba88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #13 0x55cc624dba88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #14 0x7f637cdf2d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 72 byte(s) leaked in 2 allocation(s).
  test child finished with 1
  ---- end ----
  Use a dummy software event to keep tracking: FAILED!

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210301140409.184570-7-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-bot pushed a commit that referenced this pull request Mar 10, 2021
The evlist and cpu/thread maps should be released together.
Otherwise the following error was reported by Asan.

  $ perf test -v 35
  35: Track with sched_switch                    :
  --- start ---
  test child forked, pid 159287
  Using CPUID GenuineIntel-6-8E-C
  mmap size 528384B
  1295 events recorded

  =================================================================
  ==159287==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
    #0 0x7fa28d9a2e8f in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:145
    #1 0x5652f5a5affa in cpu_map__trim_new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:79
    #2 0x5652f5a5ba1f in perf_cpu_map__read /home/namhyung/project/linux/tools/lib/perf/cpumap.c:149
    #3 0x5652f5a5c1df in cpu_map__read_all_cpu_map /home/namhyung/project/linux/tools/lib/perf/cpumap.c:166
    #4 0x5652f5a5c1df in perf_cpu_map__new /home/namhyung/project/linux/tools/lib/perf/cpumap.c:181
    #5 0x5652f5723bbf in test__switch_tracking tests/switch-tracking.c:350
    #6 0x5652f56e18fb in run_test tests/builtin-test.c:428
    #7 0x5652f56e18fb in test_and_print tests/builtin-test.c:458
    #8 0x5652f56e3a53 in __cmd_test tests/builtin-test.c:679
    #9 0x5652f56e3a53 in cmd_test tests/builtin-test.c:825
    #10 0x5652f574fcc4 in run_builtin /home/namhyung/project/linux/tools/perf/perf.c:313
    #11 0x5652f55d9a88 in handle_internal_command /home/namhyung/project/linux/tools/perf/perf.c:365
    #12 0x5652f55d9a88 in run_argv /home/namhyung/project/linux/tools/perf/perf.c:409
    #13 0x5652f55d9a88 in main /home/namhyung/project/linux/tools/perf/perf.c:539
    #14 0x7fa28d4d8d09 in __libc_start_main ../csu/libc-start.c:308

  SUMMARY: AddressSanitizer: 72 byte(s) leaked in 2 allocation(s).
  test child finished with 1
  ---- end ----
  Track with sched_switch: FAILED!

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Acked-by: Jiri Olsa <jolsa@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20210301140409.184570-8-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 24, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 25, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 26, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Signed-off-by: NipaLocal <nipa@local>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Aug 27, 2024
A sysfs reader can race with a device reset or removal, attempting to
read device state when the device is not actually present. eg:

     [exception RIP: qed_get_current_link+17]
  kernel-patches#8 [ffffb9e4f2907c48] qede_get_link_ksettings at ffffffffc07a994a [qede]
  kernel-patches#9 [ffffb9e4f2907cd8] __rh_call_get_link_ksettings at ffffffff992b01a3
 kernel-patches#10 [ffffb9e4f2907d38] __ethtool_get_link_ksettings at ffffffff992b04e4
 kernel-patches#11 [ffffb9e4f2907d90] duplex_show at ffffffff99260300
 kernel-patches#12 [ffffb9e4f2907e38] dev_attr_show at ffffffff9905a01c
 kernel-patches#13 [ffffb9e4f2907e50] sysfs_kf_seq_show at ffffffff98e0145b
 kernel-patches#14 [ffffb9e4f2907e68] seq_read at ffffffff98d902e3
 kernel-patches#15 [ffffb9e4f2907ec8] vfs_read at ffffffff98d657d1
 kernel-patches#16 [ffffb9e4f2907f00] ksys_read at ffffffff98d65c3f
 kernel-patches#17 [ffffb9e4f2907f38] do_syscall_64 at ffffffff98a052fb

 crash> struct net_device.state ffff9a9d21336000
    state = 5,

state 5 is __LINK_STATE_START (0b1) and __LINK_STATE_NOCARRIER (0b100).
The device is not present, note lack of __LINK_STATE_PRESENT (0b10).

This is the same sort of panic as observed in commit 4224cfd
("net-sysfs: add check for netdevice being present to speed_show").

There are many other callers of __ethtool_get_link_ksettings() which
don't have a device presence check.

Move this check into ethtool to protect all callers.

Fixes: d519e17 ("net: export device speed and duplex via sysfs")
Fixes: 4224cfd ("net-sysfs: add check for netdevice being present to speed_show")
Signed-off-by: Jamie Bainbridge <jamie.bainbridge@gmail.com>
Link: https://patch.msgid.link/8bae218864beaa44ed01628140475b9bf641c5b0.1724393671.git.jamie.bainbridge@gmail.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kuba-moo added a commit to linux-netdev/testing-bpf-ci that referenced this pull request Sep 7, 2024
…rnel/git/netfilter/nf-next

Pablo Neira Ayuso says:

====================
Netfilter updates for net-next

The following patchset contains Netfilter updates for net-next:

Patch kernel-patches#1 adds ctnetlink support for kernel side filtering for
	 deletions, from Changliang Wu.

Patch kernel-patches#2 updates nft_counter support to Use u64_stats_t,
	 from Sebastian Andrzej Siewior.

Patch kernel-patches#3 uses kmemdup_array() in all xtables frontends,
	 from Yan Zhen.

Patch kernel-patches#4 is a oneliner to use ERR_CAST() in nf_conntrack instead
	 opencoded casting, from Shen Lichuan.

Patch kernel-patches#5 removes unused argument in nftables .validate interface,
	 from Florian Westphal.

Patch kernel-patches#6 is a oneliner to correct a typo in nftables kdoc,
	 from Simon Horman.

Patch kernel-patches#7 fixes missing kdoc in nftables, also from Simon.

Patch kernel-patches#8 updates nftables to handle timeout less than CONFIG_HZ.

Patch kernel-patches#9 rejects element expiration if timeout is zero,
	 otherwise it is silently ignored.

Patch kernel-patches#10 disallows element expiration larger than timeout.

Patch kernel-patches#11 removes unnecessary READ_ONCE annotation while mutex is held.

Patch kernel-patches#12 adds missing READ_ONCE/WRITE_ONCE annotation in dynset.

Patch kernel-patches#13 annotates data-races around element expiration.

Patch kernel-patches#14 allocates timeout and expiration in one single set element
	  extension, they are tighly couple, no reason to keep them
	  separated anymore.

Patch kernel-patches#15 updates nftables to interpret zero timeout element as never
	  times out. Note that it is already possible to declare sets
	  with elements that never time out but this generalizes to all
	  kind of set with timeouts.

Patch kernel-patches#16 supports for element timeout and expiration updates.

* tag 'nf-next-24-09-06' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf-next:
  netfilter: nf_tables: set element timeout update support
  netfilter: nf_tables: zero timeout means element never times out
  netfilter: nf_tables: consolidate timeout extension for elements
  netfilter: nf_tables: annotate data-races around element expiration
  netfilter: nft_dynset: annotate data-races around set timeout
  netfilter: nf_tables: remove annotation to access set timeout while holding lock
  netfilter: nf_tables: reject expiration higher than timeout
  netfilter: nf_tables: reject element expiration with no timeout
  netfilter: nf_tables: elements with timeout below CONFIG_HZ never expire
  netfilter: nf_tables: Add missing Kernel doc
  netfilter: nf_tables: Correct spelling in nf_tables.h
  netfilter: nf_tables: drop unused 3rd argument from validate callback ops
  netfilter: conntrack: Convert to use ERR_CAST()
  netfilter: Use kmemdup_array instead of kmemdup for multiple allocation
  netfilter: nft_counter: Use u64_stats_t for statistic.
  netfilter: ctnetlink: support CTA_FILTER for flush
====================

Link: https://patch.msgid.link/20240905232920.5481-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 23, 2024
iter_finish_branch_entry() doesn't put the branch_info from/to map
elements creating memory leaks. This can be seen with:

```
$ perf record -e cycles -b perf test -w noploop
$ perf report -D
...
Direct leak of 984344 byte(s) in 123043 object(s) allocated from:
    #0 0x7fb2654f3bd7 in malloc libsanitizer/asan/asan_malloc_linux.cpp:69
    #1 0x564d3400d10b in map__get util/map.h:186
    #2 0x564d3400d10b in ip__resolve_ams util/machine.c:1981
    #3 0x564d34014d81 in sample__resolve_bstack util/machine.c:2151
    #4 0x564d34094790 in iter_prepare_branch_entry util/hist.c:898
    #5 0x564d34098fa4 in hist_entry_iter__add util/hist.c:1238
    #6 0x564d33d1f0c7 in process_sample_event tools/perf/builtin-report.c:334
    #7 0x564d34031eb7 in perf_session__deliver_event util/session.c:1655
    #8 0x564d3403ba52 in do_flush util/ordered-events.c:245
    #9 0x564d3403ba52 in __ordered_events__flush util/ordered-events.c:324
    #10 0x564d3402d32e in perf_session__process_user_event util/session.c:1708
    #11 0x564d34032480 in perf_session__process_event util/session.c:1877
    #12 0x564d340336ad in reader__read_event util/session.c:2399
    #13 0x564d34033fdc in reader__process_events util/session.c:2448
    #14 0x564d34033fdc in __perf_session__process_events util/session.c:2495
    #15 0x564d34033fdc in perf_session__process_events util/session.c:2661
    #16 0x564d33d27113 in __cmd_report tools/perf/builtin-report.c:1065
    #17 0x564d33d27113 in cmd_report tools/perf/builtin-report.c:1805
    #18 0x564d33e0ccb7 in run_builtin tools/perf/perf.c:350
    #19 0x564d33e0d45e in handle_internal_command tools/perf/perf.c:403
    #20 0x564d33cdd827 in run_argv tools/perf/perf.c:447
    #21 0x564d33cdd827 in main tools/perf/perf.c:561
...
```

Clearing up the map_symbols properly creates maps reference count
issues so resolve those. Resolving this issue doesn't improve peak
heap consumption for the test above.

Committer testing:

  $ sudo dnf install libasan
  $ make -k CORESIGHT=1 EXTRA_CFLAGS="-fsanitize=address" CC=clang O=/tmp/build/$(basename $PWD)/ -C tools/perf install-bin

Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Link: https://lore.kernel.org/r/20240807065136.1039977-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kernel-patches-daemon-bpf bot pushed a commit that referenced this pull request Sep 23, 2024
The fields in the hist_entry are filled on-demand which means they only
have meaningful values when relevant sort keys are used.

So if neither of 'dso' nor 'sym' sort keys are used, the map/symbols in
the hist entry can be garbage.  So it shouldn't access it
unconditionally.

I got a segfault, when I wanted to see cgroup profiles.

  $ sudo perf record -a --all-cgroups --synth=cgroup true

  $ sudo perf report -s cgroup

  Program received signal SIGSEGV, Segmentation fault.
  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
  48		return RC_CHK_ACCESS(map)->dso;
  (gdb) bt
  #0  0x00005555557a8d90 in map__dso (map=0x0) at util/map.h:48
  #1  0x00005555557aa39b in map__load (map=0x0) at util/map.c:344
  #2  0x00005555557aa592 in map__find_symbol (map=0x0, addr=140736115941088) at util/map.c:385
  #3  0x00005555557ef000 in hists__findnew_entry (hists=0x555556039d60, entry=0x7fffffffa4c0, al=0x7fffffffa8c0, sample_self=true)
      at util/hist.c:644
  #4  0x00005555557ef61c in __hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
      block_info=0x0, sample=0x7fffffffaa90, sample_self=true, ops=0x0) at util/hist.c:761
  #5  0x00005555557ef71f in hists__add_entry (hists=0x555556039d60, al=0x7fffffffa8c0, sym_parent=0x0, bi=0x0, mi=0x0, ki=0x0,
      sample=0x7fffffffaa90, sample_self=true) at util/hist.c:779
  #6  0x00005555557f00fb in iter_add_single_normal_entry (iter=0x7fffffffa900, al=0x7fffffffa8c0) at util/hist.c:1015
  #7  0x00005555557f09a7 in hist_entry_iter__add (iter=0x7fffffffa900, al=0x7fffffffa8c0, max_stack_depth=127, arg=0x7fffffffbce0)
      at util/hist.c:1260
  #8  0x00005555555ba7ce in process_sample_event (tool=0x7fffffffbce0, event=0x7ffff7c14128, sample=0x7fffffffaa90, evsel=0x555556039ad0,
      machine=0x5555560388e8) at builtin-report.c:334
  #9  0x00005555557b30c8 in evlist__deliver_sample (evlist=0x555556039010, tool=0x7fffffffbce0, event=0x7ffff7c14128,
      sample=0x7fffffffaa90, evsel=0x555556039ad0, machine=0x5555560388e8) at util/session.c:1232
  #10 0x00005555557b32bc in machines__deliver_event (machines=0x5555560388e8, evlist=0x555556039010, event=0x7ffff7c14128,
      sample=0x7fffffffaa90, tool=0x7fffffffbce0, file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1271
  #11 0x00005555557b3848 in perf_session__deliver_event (session=0x5555560386d0, event=0x7ffff7c14128, tool=0x7fffffffbce0,
      file_offset=110888, file_path=0x555556038ff0 "perf.data") at util/session.c:1354
  #12 0x00005555557affaf in ordered_events__deliver_event (oe=0x555556038e60, event=0x555556135aa0) at util/session.c:132
  #13 0x00005555557bb605 in do_flush (oe=0x555556038e60, show_progress=false) at util/ordered-events.c:245
  #14 0x00005555557bb95c in __ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND, timestamp=0) at util/ordered-events.c:324
  #15 0x00005555557bba46 in ordered_events__flush (oe=0x555556038e60, how=OE_FLUSH__ROUND) at util/ordered-events.c:342
  #16 0x00005555557b1b3b in perf_event__process_finished_round (tool=0x7fffffffbce0, event=0x7ffff7c15bb8, oe=0x555556038e60)
      at util/session.c:780
  #17 0x00005555557b3b27 in perf_session__process_user_event (session=0x5555560386d0, event=0x7ffff7c15bb8, file_offset=117688,
      file_path=0x555556038ff0 "perf.data") at util/session.c:1406

As you can see the entry->ms.map was NULL even if he->ms.map has a
value.  This is because 'sym' sort key is not given, so it cannot assume
whether he->ms.sym and entry->ms.sym is the same.  I only checked the
'sym' sort key here as it implies 'dso' behavior (so maps are the same).

Fixes: ac01c8c ("perf hist: Update hist symbol when updating maps")
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Matt Fleming <matt@readmodwrite.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20240826221045.1202305-2-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
kuba-moo pushed a commit to linux-netdev/testing-bpf-ci that referenced this pull request Oct 8, 2024
Daniel Machon says:

====================
net: sparx5: prepare for lan969x switch driver

== Description:

This series is the first of a multi-part series, that prepares and adds
support for the new lan969x switch driver.

The upstreaming efforts is split into multiple series (might change a
bit as we go along):

    1) Prepare the Sparx5 driver for lan969x (this series)
    2) Add support lan969x (same basic features as Sparx5 provides +
       RGMII, excl.  FDMA and VCAP)
    3) Add support for lan969x FDMA
    4) Add support for lan969x VCAP

== Lan969x in short:

The lan969x Ethernet switch family [1] provides a rich set of
switching features and port configurations (up to 30 ports) from 10Mbps
to 10Gbps, with support for RGMII, SGMII, QSGMII, USGMII, and USXGMII,
ideal for industrial & process automation infrastructure applications,
transport, grid automation, power substation automation, and ring &
intra-ring topologies. The LAN969x family is hardware and software
compatible and scalable supporting 46Gbps to 102Gbps switch bandwidths.

== Preparing Sparx5 for lan969x:

The lan969x switch chip reuses many of the IP's of the Sparx5 switch
chip, therefore it has been decided to add support through the existing
Sparx5 driver, in order to avoid a bunch of duplicate code. However, in
order to reuse the Sparx5 switch driver, we have to introduce some
mechanisms to handle the chip differences that are there.  These
mechanisms are:

    - Platform match data to contain all the differences that needs to
      be handled (constants, ops etc.)

    - Register macro indirection layer so that we can reuse the existing
      register macros.

    - Function for branching out on platform type where required.

In some places we ops out functions and in other places we branch on the
chip type. Exactly when we choose one over the other, is an estimate in
each case.

After this series is applied, the Sparx5 driver will be prepared for
lan969x and still function exactly as before.

== Patch breakdown:

Patch kernel-patches#1        adds private match data

Patch kernel-patches#2        adds register macro indirection layer

Patch kernel-patches#3-kernel-patches#4     does some preparation work

Patch kernel-patches#5-kernel-patches#7     adds chip constants and updates the code to use them

Patch kernel-patches#8-kernel-patches#13    adds and uses ops for handling functions differently on the
                two platforms.

Patch kernel-patches#14       adds and uses a macro for branching out on the chip type.

Patch kernel-patches#15 (NEW) redefines macros for internal ports and PGID's.

[1] https://www.microchip.com/en-us/product/lan9698

To: David S. Miller <davem@davemloft.net>
To: Eric Dumazet <edumazet@google.com>
To: Jakub Kicinski <kuba@kernel.org>
To: Paolo Abeni <pabeni@redhat.com>
To: Lars Povlsen <lars.povlsen@microchip.com>
To: Steen Hegelund <Steen.Hegelund@microchip.com>
To: horatiu.vultur@microchip.com
To: jensemil.schulzostergaard@microchip.com
To: UNGLinuxDriver@microchip.com
To: Richard Cochran <richardcochran@gmail.com>
To: horms@kernel.org
To: justinstitt@google.com
To: gal@nvidia.com
To: aakash.r.menon@gmail.com
To: jacob.e.keller@intel.com
To: ast@fiberby.net
Cc: netdev@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org

Signed-off-by: Daniel Machon <daniel.machon@microchip.com>
====================

Link: https://patch.msgid.link/20241004-b4-sparx5-lan969x-switch-driver-v2-0-d3290f581663@microchip.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
kuba-moo added a commit to linux-netdev/testing-bpf-ci that referenced this pull request Oct 31, 2024
Daniel Machon says:

====================
net: sparx5: add support for lan969x switch device

== Description:

This series is the second of a multi-part series, that prepares and adds
support for the new lan969x switch driver.

The upstreaming efforts is split into multiple series (might change a
bit as we go along):

        1) Prepare the Sparx5 driver for lan969x (merged)

    --> 2) add support lan969x (same basic features as Sparx5
           provides excl. FDMA and VCAP).

        3) Add support for lan969x VCAP, FDMA and RGMII

== Lan969x in short:

The lan969x Ethernet switch family [1] provides a rich set of
switching features and port configurations (up to 30 ports) from 10Mbps
to 10Gbps, with support for RGMII, SGMII, QSGMII, USGMII, and USXGMII,
ideal for industrial & process automation infrastructure applications,
transport, grid automation, power substation automation, and ring &
intra-ring topologies. The LAN969x family is hardware and software
compatible and scalable supporting 46Gbps to 102Gbps switch bandwidths.

== Preparing Sparx5 for lan969x:

The main preparation work for lan969x has already been merged [1].

After this series is applied, lan969x will have the same functionality
as Sparx5, except for VCAP and FDMA support. QoS features that requires
the VCAP (e.g. PSFP, port mirroring) will obviously not work until VCAP
support is added later.

== Patch breakdown:

Patch kernel-patches#1-kernel-patches#4  do some preparation work for lan969x

Patch kernel-patches#5     adds new registers required by lan969x

Patch kernel-patches#6     adds initial match data for all lan969x targets

Patch kernel-patches#7     defines the lan969x register differences

Patch kernel-patches#8     adds lan969x constants to match data

Patch kernel-patches#9     adds some lan969x ops in bulk

Patch kernel-patches#10    adds PTP function to ops

Patch kernel-patches#11    adds lan969x_calendar.c for calculating the calendar

Patch kernel-patches#12    makes additional use of the is_sparx5() macro to branch out
             in certain places.

Patch kernel-patches#13    documents lan969x in the dt-bindings

Patch kernel-patches#14    adds lan969x compatible string to sparx5 driver

Patch kernel-patches#15    introduces new concept of per-target features

[1] https://lore.kernel.org/netdev/20241004-b4-sparx5-lan969x-switch-driver-v2-0-d3290f581663@microchip.com/

v1: https://lore.kernel.org/20241021-sparx5-lan969x-switch-driver-2-v1-0-c8c49ef21e0f@microchip.com
====================

Link: https://patch.msgid.link/20241024-sparx5-lan969x-switch-driver-2-v2-0-a0b5fae88a0f@microchip.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants