Merge tag 'mlx5-updates-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5-updates-2020-11-03

This series includes updates to the mlx5 software steering component.

1) A few improvements in the DR area, such as removing unneeded checks,
  renaming to more general names, and refactoring in some places.

2) Software steering (DR) Memory management improvements

This patch series contains SW Steering memory management improvements:
using a buddy allocator instead of the existing bucket allocator, and
several other optimizations.

The buddy system is a memory allocation and management algorithm
that manages memory in power of two increments.

The algorithm is well known and well described, for example here:
https://en.wikipedia.org/wiki/Buddy_memory_allocation

Linux uses this algorithm for managing and allocating physical pages,
as described here:
https://www.kernel.org/doc/gorman/html/understand/understand009.html

In our case, although the algorithm is in principle similar to the
Linux physical page allocator, the "building blocks" and the circumstances
are different: in SW steering, the buddy allocator doesn't actually allocate
memory, but rather manages ICM (Interconnect Context Memory) that was
previously allocated and registered.

The ICM memory that is used in SW steering is always a power
of 2 in size (order), so the buddy system is a good fit for it.

Patches in this series:

[PATCH 4] net/mlx5: DR, Add buddy allocator utilities
  This patch adds a modified implementation of the well-known buddy allocator,
  adjusted for SW steering needs: the algorithm is in principle similar to
  the Linux physical page allocator, but in our case the buddy allocator
  doesn't actually allocate memory, but rather manages ICM memory that was
  previously allocated and registered.

[PATCH 5] net/mlx5: DR, Handle ICM memory via buddy allocation instead of bucket management
  This patch changes ICM management of SW steering to use the buddy-system
  mechanism instead of the previous bucket management.

[PATCH 6] net/mlx5: DR, Sync chunks only during free
  This patch makes syncing happen only when freeing memory chunks.

[PATCH 7] net/mlx5: DR, ICM memory pools sync optimization
  This patch adds tracking of the pool's "hot" memory and makes the
  check for whether a steering sync is required much shorter and faster.

[PATCH 8] net/mlx5: DR, Free buddy ICM memory if it is unused
  This patch adds tracking of the buddy's used ICM memory
  and frees the buddy once all of its memory becomes unused.

3) Misc code cleanups

* tag 'mlx5-updates-2020-11-03' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net: mlx5: Replace in_irq() usage
  net/mlx5: Cleanup kernel-doc warnings
  net/mlx4: Cleanup kernel-doc warnings
  net/mlx5e: Validate stop_room size upon user input
  net/mlx5: DR, Free unused buddy ICM memory
  net/mlx5: DR, ICM memory pools sync optimization
  net/mlx5: DR, Sync chunks only during free
  net/mlx5: DR, Handle ICM memory via buddy allocation instead of buckets
  net/mlx5: DR, Add buddy allocator utilities
  net/mlx5: DR, Rename matcher functions to be more HW agnostic
  net/mlx5: DR, Rename builders HW specific names
  net/mlx5: DR, Remove unused member of action struct
====================
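
For readers unfamiliar with the scheme, the buddy idea described in the cover letter above can be sketched in a few lines of user-space C. This is a toy illustration only and is not taken from the driver's dr_buddy.c: every name here (toy_buddy, TOY_MAX_ORDER, and so on) is hypothetical, and the per-order boolean table is a simplification chosen for readability. Like the allocator the series describes, it only hands out segment indices inside a region that already exists; it never allocates memory itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Toy buddy allocator (illustrative names, not the mlx5 implementation).
 * It manages indices into a pre-existing region of 2^TOY_MAX_ORDER
 * equally sized entries and never allocates memory on its own.
 */
#define TOY_MAX_ORDER   4
#define TOY_NUM_ENTRIES (1u << TOY_MAX_ORDER)

struct toy_buddy {
	/* free_map[o][i] is true when the i-th block of order o is free */
	bool free_map[TOY_MAX_ORDER + 1][TOY_NUM_ENTRIES];
};

void toy_buddy_init(struct toy_buddy *b)
{
	memset(b, 0, sizeof(*b));
	/* Initially one free block of the maximum order covers the region */
	b->free_map[TOY_MAX_ORDER][0] = true;
}

/* Returns the first entry index of a free 2^order block, or -1 if none. */
int toy_buddy_alloc(struct toy_buddy *b, unsigned int order)
{
	unsigned int o, i;

	if (order > TOY_MAX_ORDER)
		return -1;

	/* Look for the smallest free block that is large enough */
	for (o = order; o <= TOY_MAX_ORDER; o++)
		for (i = 0; i < (TOY_NUM_ENTRIES >> o); i++)
			if (b->free_map[o][i])
				goto found;
	return -1;

found:
	b->free_map[o][i] = false;
	/* Split down to the requested order: keep the lower half and
	 * mark the upper half (the "buddy") free at each step.
	 */
	while (o > order) {
		o--;
		i <<= 1;
		b->free_map[o][i ^ 1] = true;
	}
	return (int)(i << order);
}

/* Frees the 2^order block starting at entry index seg. */
void toy_buddy_free(struct toy_buddy *b, unsigned int seg, unsigned int order)
{
	unsigned int i = seg >> order;

	assert(order <= TOY_MAX_ORDER && seg < TOY_NUM_ENTRIES);

	/* Merge with the buddy for as long as the buddy is free too */
	while (order < TOY_MAX_ORDER && b->free_map[order][i ^ 1]) {
		b->free_map[order][i ^ 1] = false;
		i >>= 1;
		order++;
	}
	b->free_map[order][i] = true;
}
```

A caller would run toy_buddy_init() once, then pair each toy_buddy_alloc(b, order) with a later toy_buddy_free(b, seg, order) using the same order. Once every block has been returned, the region coalesces back into a single maximum-order block.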

Link: https://lore.kernel.org/r/20201105201242.21716-1-saeedm@nvidia.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
kuba-moo committed Nov 6, 2020
2 parents c1aedf0 + 5144368 commit c9448e8
Showing 19 changed files with 591 additions and 467 deletions.
2 changes: 1 addition & 1 deletion drivers/net/ethernet/mellanox/mlx4/fw_qos.h
@@ -135,7 +135,7 @@ int mlx4_SET_VPORT_QOS_get(struct mlx4_dev *dev, u8 port, u8 vport,
  * @dev: mlx4_dev.
  * @port: Physical port number.
  * @vport: Vport id.
- * @out_param: Array of mlx4_vport_qos_param which holds the requested values.
+ * @in_param: Array of mlx4_vport_qos_param which holds the requested values.
  *
  * Returns 0 on success or a negative mlx4_core errno code.
  **/
2 changes: 1 addition & 1 deletion drivers/net/ethernet/mellanox/mlx5/core/Makefile
@@ -81,7 +81,7 @@ mlx5_core-$(CONFIG_MLX5_EN_TLS) += en_accel/tls.o en_accel/tls_rxtx.o en_accel/t

 mlx5_core-$(CONFIG_MLX5_SW_STEERING) += steering/dr_domain.o steering/dr_table.o \
 					steering/dr_matcher.o steering/dr_rule.o \
-					steering/dr_icm_pool.o \
+					steering/dr_icm_pool.o steering/dr_buddy.o \
 					steering/dr_ste.o steering/dr_send.o \
 					steering/dr_cmd.o steering/dr_fw.o \
 					steering/dr_action.o steering/fs_dr.o
34 changes: 34 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/en/params.c
@@ -2,6 +2,8 @@
 /* Copyright (c) 2019 Mellanox Technologies. */

 #include "en/params.h"
+#include "en/txrx.h"
+#include "en_accel/tls_rxtx.h"

 static inline bool mlx5e_rx_is_xdp(struct mlx5e_params *params,
 				   struct mlx5e_xsk_param *xsk)
@@ -152,3 +154,35 @@ u16 mlx5e_get_rq_headroom(struct mlx5_core_dev *mdev,

 	return is_linear_skb ? mlx5e_get_linear_rq_headroom(params, xsk) : 0;
 }
+
+u16 mlx5e_calc_sq_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
+{
+	bool is_mpwqe = MLX5E_GET_PFLAG(params, MLX5E_PFLAG_SKB_TX_MPWQE);
+	u16 stop_room;
+
+	stop_room  = mlx5e_tls_get_stop_room(mdev, params);
+	stop_room += mlx5e_stop_room_for_wqe(MLX5_SEND_WQE_MAX_WQEBBS);
+	if (is_mpwqe)
+		/* A MPWQE can take up to the maximum-sized WQE + all the normal
+		 * stop room can be taken if a new packet breaks the active
+		 * MPWQE session and allocates its WQEs right away.
+		 */
+		stop_room += mlx5e_stop_room_for_wqe(MLX5_SEND_WQE_MAX_WQEBBS);
+
+	return stop_room;
+}
+
+int mlx5e_validate_params(struct mlx5e_priv *priv, struct mlx5e_params *params)
+{
+	size_t sq_size = 1 << params->log_sq_size;
+	u16 stop_room;
+
+	stop_room = mlx5e_calc_sq_stop_room(priv->mdev, params);
+	if (stop_room >= sq_size) {
+		netdev_err(priv->netdev, "Stop room %hu is bigger than the SQ size %zu\n",
+			   stop_room, sq_size);
+		return -EINVAL;
+	}
+
+	return 0;
+}
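
The validation added in the hunk above boils down to comparing the reserved stop room against the send queue size derived from log_sq_size. The following user-space sketch, under stated assumptions, reproduces only that arithmetic so it can be tried outside the kernel. The struct, the function names, the constant TOY_MAX_WQEBBS and the body of toy_stop_room_for_wqe() are hypothetical stand-ins and should not be read as the driver's actual values; the TLS contribution is modelled as a plain number supplied by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few parameters the check looks at */
struct toy_params {
	uint8_t  log_sq_size;   /* SQ size is 1 << log_sq_size */
	bool     is_mpwqe;      /* multi-packet WQE mode enabled */
	uint16_t tls_stop_room; /* room a TLS offload would reserve, 0 if none */
};

/* Assumed placeholder for the largest WQE size, in basic blocks */
#define TOY_MAX_WQEBBS 16u

/* Assumed placeholder formula for the room one WQE may need */
static uint16_t toy_stop_room_for_wqe(uint16_t wqe_size)
{
	return (uint16_t)(wqe_size * 2 - 1);
}

uint16_t toy_calc_sq_stop_room(const struct toy_params *p)
{
	uint16_t stop_room = p->tls_stop_room;

	stop_room += toy_stop_room_for_wqe(TOY_MAX_WQEBBS);
	if (p->is_mpwqe)
		/* MPWQE mode reserves room for one more maximum-sized WQE */
		stop_room += toy_stop_room_for_wqe(TOY_MAX_WQEBBS);

	return stop_room;
}

/* Returns 0 if the stop room fits in the queue, -EINVAL otherwise */
int toy_validate_params(const struct toy_params *p)
{
	size_t sq_size = (size_t)1 << p->log_sq_size;

	if (toy_calc_sq_stop_room(p) >= sq_size)
		return -EINVAL;

	return 0;
}
```

Running the check on the candidate parameters before they are applied, as the ethtool and MTU paths later in this diff do, lets a bad ring size be rejected with an error at configuration time rather than being detected during queue allocation.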
4 changes: 4 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/en/params.h
@@ -30,6 +30,7 @@ struct mlx5e_sq_param {
 	u32                        sqc[MLX5_ST_SZ_DW(sqc)];
 	struct mlx5_wq_param       wq;
 	bool                       is_mpw;
+	u16                        stop_room;
 };

 struct mlx5e_channel_param {
@@ -124,4 +125,7 @@ void mlx5e_build_xdpsq_param(struct mlx5e_priv *priv,
 			     struct mlx5e_params *params,
 			     struct mlx5e_sq_param *param);

+u16 mlx5e_calc_sq_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params);
+int mlx5e_validate_params(struct mlx5e_priv *priv, struct mlx5e_params *params);
+
 #endif /* __MLX5_EN_PARAMS_H__ */
8 changes: 4 additions & 4 deletions drivers/net/ethernet/mellanox/mlx5/core/en_accel/ktls_tx.c
@@ -13,20 +13,20 @@ struct mlx5e_dump_wqe {
 	(DIV_ROUND_UP(sizeof(struct mlx5e_dump_wqe), MLX5_SEND_WQE_BB))

 static u8
-mlx5e_ktls_dumps_num_wqes(struct mlx5e_txqsq *sq, unsigned int nfrags,
+mlx5e_ktls_dumps_num_wqes(struct mlx5e_params *params, unsigned int nfrags,
 			  unsigned int sync_len)
 {
 	/* Given the MTU and sync_len, calculates an upper bound for the
 	 * number of DUMP WQEs needed for the TX resync of a record.
 	 */
-	return nfrags + DIV_ROUND_UP(sync_len, sq->hw_mtu);
+	return nfrags + DIV_ROUND_UP(sync_len, MLX5E_SW2HW_MTU(params, params->sw_mtu));
 }

-u16 mlx5e_ktls_get_stop_room(struct mlx5e_txqsq *sq)
+u16 mlx5e_ktls_get_stop_room(struct mlx5e_params *params)
 {
 	u16 num_dumps, stop_room = 0;

-	num_dumps = mlx5e_ktls_dumps_num_wqes(sq, MAX_SKB_FRAGS, TLS_MAX_PAYLOAD_SIZE);
+	num_dumps = mlx5e_ktls_dumps_num_wqes(params, MAX_SKB_FRAGS, TLS_MAX_PAYLOAD_SIZE);

 	stop_room += mlx5e_stop_room_for_wqe(MLX5E_TLS_SET_STATIC_PARAMS_WQEBBS);
 	stop_room += mlx5e_stop_room_for_wqe(MLX5E_TLS_SET_PROGRESS_PARAMS_WQEBBS);
Original file line number Diff line number Diff line change
@@ -14,7 +14,7 @@ struct mlx5e_accel_tx_tls_state {
 	u32 tls_tisn;
 };

-u16 mlx5e_ktls_get_stop_room(struct mlx5e_txqsq *sq);
+u16 mlx5e_ktls_get_stop_room(struct mlx5e_params *params);

 bool mlx5e_ktls_handle_tx_skb(struct tls_context *tls_ctx, struct mlx5e_txqsq *sq,
 			      struct sk_buff *skb, int datalen,
6 changes: 2 additions & 4 deletions drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.c
@@ -385,15 +385,13 @@ void mlx5e_tls_handle_rx_skb_metadata(struct mlx5e_rq *rq, struct sk_buff *skb,
 	*cqe_bcnt -= MLX5E_METADATA_ETHER_LEN;
 }

-u16 mlx5e_tls_get_stop_room(struct mlx5e_txqsq *sq)
+u16 mlx5e_tls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
 {
-	struct mlx5_core_dev *mdev = sq->channel->mdev;
-
 	if (!mlx5_accel_is_tls_device(mdev))
 		return 0;

 	if (mlx5_accel_is_ktls_device(mdev))
-		return mlx5e_ktls_get_stop_room(sq);
+		return mlx5e_ktls_get_stop_room(params);

 	/* FPGA */
 	/* Resync SKB. */
4 changes: 2 additions & 2 deletions drivers/net/ethernet/mellanox/mlx5/core/en_accel/tls_rxtx.h
@@ -43,7 +43,7 @@
 #include "en.h"
 #include "en/txrx.h"

-u16 mlx5e_tls_get_stop_room(struct mlx5e_txqsq *sq);
+u16 mlx5e_tls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params);

 bool mlx5e_tls_handle_tx_skb(struct net_device *netdev, struct mlx5e_txqsq *sq,
 			     struct sk_buff *skb, struct mlx5e_accel_tx_tls_state *state);
@@ -71,7 +71,7 @@ mlx5e_accel_is_tls(struct mlx5_cqe64 *cqe, struct sk_buff *skb) { return false;
 static inline void
 mlx5e_tls_handle_rx_skb(struct mlx5e_rq *rq, struct sk_buff *skb,
 			struct mlx5_cqe64 *cqe, u32 *cqe_bcnt) {}
-static inline u16 mlx5e_tls_get_stop_room(struct mlx5e_txqsq *sq)
+static inline u16 mlx5e_tls_get_stop_room(struct mlx5_core_dev *mdev, struct mlx5e_params *params)
 {
 	return 0;
 }
5 changes: 5 additions & 0 deletions drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -32,6 +32,7 @@

 #include "en.h"
 #include "en/port.h"
+#include "en/params.h"
 #include "en/xsk/pool.h"
 #include "lib/clock.h"

@@ -369,6 +370,10 @@ int mlx5e_ethtool_set_ringparam(struct mlx5e_priv *priv,
 	new_channels.params.log_rq_mtu_frames = log_rq_size;
 	new_channels.params.log_sq_size = log_sq_size;

+	err = mlx5e_validate_params(priv, &new_channels.params);
+	if (err)
+		goto unlock;
+
 	if (!test_bit(MLX5E_STATE_OPENED, &priv->state)) {
 		priv->channels.params = new_channels.params;
 		goto unlock;
30 changes: 5 additions & 25 deletions drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -1121,28 +1121,6 @@ static int mlx5e_alloc_txqsq_db(struct mlx5e_txqsq *sq, int numa)
 	return 0;
 }

-static int mlx5e_calc_sq_stop_room(struct mlx5e_txqsq *sq, u8 log_sq_size)
-{
-	int sq_size = 1 << log_sq_size;
-
-	sq->stop_room  = mlx5e_tls_get_stop_room(sq);
-	sq->stop_room += mlx5e_stop_room_for_wqe(MLX5_SEND_WQE_MAX_WQEBBS);
-	if (test_bit(MLX5E_SQ_STATE_MPWQE, &sq->state))
-		/* A MPWQE can take up to the maximum-sized WQE + all the normal
-		 * stop room can be taken if a new packet breaks the active
-		 * MPWQE session and allocates its WQEs right away.
-		 */
-		sq->stop_room += mlx5e_stop_room_for_wqe(MLX5_SEND_WQE_MAX_WQEBBS);
-
-	if (WARN_ON(sq->stop_room >= sq_size)) {
-		netdev_err(sq->channel->netdev, "Stop room %hu is bigger than the SQ size %d\n",
-			   sq->stop_room, sq_size);
-		return -ENOSPC;
-	}
-
-	return 0;
-}
-
 static void mlx5e_tx_err_cqe_work(struct work_struct *recover_work);
 static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 			     int txq_ix,
@@ -1176,9 +1154,7 @@ static int mlx5e_alloc_txqsq(struct mlx5e_channel *c,
 		set_bit(MLX5E_SQ_STATE_TLS, &sq->state);
 	if (param->is_mpw)
 		set_bit(MLX5E_SQ_STATE_MPWQE, &sq->state);
-	err = mlx5e_calc_sq_stop_room(sq, params->log_sq_size);
-	if (err)
-		return err;
+	sq->stop_room = param->stop_room;

 	param->wq.db_numa_node = cpu_to_node(c->cpu);
 	err = mlx5_wq_cyc_create(mdev, &param->wq, sqc_wq, wq, &sq->wq_ctrl);
@@ -2225,6 +2201,7 @@ static void mlx5e_build_sq_param(struct mlx5e_priv *priv,
 	MLX5_SET(wq, wq, log_wq_sz, params->log_sq_size);
 	MLX5_SET(sqc, sqc, allow_swp, allow_swp);
 	param->is_mpw = MLX5E_GET_PFLAG(params, MLX5E_PFLAG_SKB_TX_MPWQE);
+	param->stop_room = mlx5e_calc_sq_stop_room(priv->mdev, params);
 	mlx5e_build_tx_cq_param(priv, params, &param->cqp);
 }

@@ -3999,6 +3976,9 @@ int mlx5e_change_mtu(struct net_device *netdev, int new_mtu,

 	new_channels.params = *params;
 	new_channels.params.sw_mtu = new_mtu;
+	err = mlx5e_validate_params(priv, &new_channels.params);
+	if (err)
+		goto out;

 	if (params->xdp_prog &&
 	    !mlx5e_rx_is_linear_skb(&new_channels.params, NULL)) {
18 changes: 11 additions & 7 deletions drivers/net/ethernet/mellanox/mlx5/core/eq.c
@@ -189,19 +189,21 @@ u32 mlx5_eq_poll_irq_disabled(struct mlx5_eq_comp *eq)
 	return count_eqe;
 }

-static void mlx5_eq_async_int_lock(struct mlx5_eq_async *eq, unsigned long *flags)
+static void mlx5_eq_async_int_lock(struct mlx5_eq_async *eq, bool recovery,
+				   unsigned long *flags)
 	__acquires(&eq->lock)
 {
-	if (in_irq())
+	if (!recovery)
 		spin_lock(&eq->lock);
 	else
 		spin_lock_irqsave(&eq->lock, *flags);
 }

-static void mlx5_eq_async_int_unlock(struct mlx5_eq_async *eq, unsigned long *flags)
+static void mlx5_eq_async_int_unlock(struct mlx5_eq_async *eq, bool recovery,
+				     unsigned long *flags)
 	__releases(&eq->lock)
 {
-	if (in_irq())
+	if (!recovery)
 		spin_unlock(&eq->lock);
 	else
 		spin_unlock_irqrestore(&eq->lock, *flags);
@@ -223,11 +225,13 @@ static int mlx5_eq_async_int(struct notifier_block *nb,
 	struct mlx5_eqe *eqe;
 	unsigned long flags;
 	int num_eqes = 0;
+	bool recovery;

 	dev = eq->dev;
 	eqt = dev->priv.eq_table;

-	mlx5_eq_async_int_lock(eq_async, &flags);
+	recovery = action == ASYNC_EQ_RECOVER;
+	mlx5_eq_async_int_lock(eq_async, recovery, &flags);

 	eqe = next_eqe_sw(eq);
 	if (!eqe)
@@ -249,9 +253,9 @@ static int mlx5_eq_async_int(struct notifier_block *nb,

 out:
 	eq_update_ci(eq, 1);
-	mlx5_eq_async_int_unlock(eq_async, &flags);
+	mlx5_eq_async_int_unlock(eq_async, recovery, &flags);

-	return unlikely(action == ASYNC_EQ_RECOVER) ? num_eqes : 0;
+	return unlikely(recovery) ? num_eqes : 0;
 }

 void mlx5_cmd_eq_recover(struct mlx5_core_dev *dev)
8 changes: 5 additions & 3 deletions drivers/net/ethernet/mellanox/mlx5/core/fpga/sdk.h
@@ -47,11 +47,12 @@
 /**
  * enum mlx5_fpga_access_type - Enumerated the different methods possible for
  * accessing the device memory address space
+ *
+ * @MLX5_FPGA_ACCESS_TYPE_I2C: Use the slow CX-FPGA I2C bus
+ * @MLX5_FPGA_ACCESS_TYPE_DONTCARE: Use the fastest available method
  */
 enum mlx5_fpga_access_type {
-	/** Use the slow CX-FPGA I2C bus */
 	MLX5_FPGA_ACCESS_TYPE_I2C = 0x0,
-	/** Use the fastest available method */
 	MLX5_FPGA_ACCESS_TYPE_DONTCARE = 0x0,
 };

@@ -113,6 +114,7 @@ struct mlx5_fpga_conn_attr {
 	 * subsequent receives.
 	 */
 	void (*recv_cb)(void *cb_arg, struct mlx5_fpga_dma_buf *buf);
+	/** @cb_arg: A context to be passed to recv_cb callback */
 	void *cb_arg;
 };

@@ -145,7 +147,7 @@ void mlx5_fpga_sbu_conn_destroy(struct mlx5_fpga_conn *conn);

 /**
  * mlx5_fpga_sbu_conn_sendmsg() - Queue the transmission of a packet
- * @fdev: An FPGA SBU connection
+ * @conn: An FPGA SBU connection
  * @buf: The packet buffer
  *
  * Queues a packet for transmission over an FPGA SBU connection.