ClawNetwork is a tool to simulate a network and evaluate the impact of failures on Top of Rack switches.
It has been specially crafted for Clos matrix networks. For now, cyclic graphs are not supported; only trees are.
ClawNetwork is in active development.
The main features are implemented, but endpoints, structures and functions may change.
It is nonetheless usable, and you are free to play with it :)
The main use case is to evaluate whether an operation on a device in your core network will impact a Top of Rack.
Such operations include upgrades, reboots, risky maintenance, etc.
ClawNetwork can also be leveraged to detect SPOFs or anomalies such as a spine without downlinks.
Simply run the ClawNetwork app using go run .
Alternatively, build the binary via go build and run it.
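For example (the binary name below is only an illustration, not the project's official artifact name):

$ go run .
# or build a binary and run it
$ go build -o clawnetwork .
$ ./clawnetwork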
Run ClawNetwork with the default backend (FileRepository):
docker-compose -f compose/docker-compose.yml up -d
FileRepository stores the topologies in dedicated JSON files on disk.
By default, this uses the examples/ directory provided in this repository.
At the moment this is not customizable, but it will be soon.
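Once the stack is up, you can check that the example topologies are available (assuming the default listen address and port):

$ curl http://127.0.0.1:8080/topology | jq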
To run ClawNetwork with an alternative backend (recommended for production if you need to store topologies):
docker-compose -f compose/docker-compose.yml -f <backend>.yml up -d
At the moment, RedisJSON is the only alternative backend:
docker-compose -f compose/docker-compose.yml -f redisjson.yml up -d
This backend leverages the RedisJSON module to store pure JSON in Redis. Persistence is enabled and forced on each change (ADD/DELETE) by ClawNetwork.
ClawNetwork can be configured either via environment variables or a YAML file (settings.yaml).
List of available parameters (varenv format | YAML format):
- CLAW_LISTENADDRESS | ListenAddress: ClawNetwork API listen address (default: "0.0.0.0")
- CLAW_LISTENPORT | ListenPort: ClawNetwork API listen port (default: "8080")
- CLAW_TOPDEVICEROLE | TopDeviceRole: Role of device at the top of the topology graph (default: "edge")
- CLAW_BOTTOMDEVICEROLE | BottomDeviceRole: Role of device at the bottom of the topology graph (default: "tor")
- CLAW_BACKEND | Backend: Backend used to store topologies (choices: "file", "redis", default: "file")
- CLAW_BACKENDS_FILE_PATH | Backends.File.Path: Directory where the file backend stores topology JSON files (default: "./topologies/")
- CLAW_BACKENDS_REDIS_HOST | Backends.Redis.Host: Redis server address (default: "localhost")
- CLAW_BACKENDS_REDIS_PORT | Backends.Redis.Port: Redis server port (default: "6379")
- CLAW_BACKENDS_REDIS_PASSWORD | Backends.Redis.Password: Redis password (default: "")
- CLAW_BACKENDS_REDIS_DB | Backends.Redis.DB: Redis DB to use (default: 0)
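As an illustration, the same parameters can be exported as environment variables before starting ClawNetwork (the values below are arbitrary examples):

$ export CLAW_LISTENADDRESS="0.0.0.0"
$ export CLAW_LISTENPORT="8080"
$ export CLAW_BACKEND="redis"
$ export CLAW_BACKENDS_REDIS_HOST="localhost"
$ export CLAW_BACKENDS_REDIS_PORT="6379"
$ export CLAW_BACKENDS_REDIS_DB=0
$ go run .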
- GET /topology: list stored topologies
- GET /topology/:topology_name: get topology definition
- POST /topology/:topology_name: create a new topology
- DELETE /topology/:topology_name: delete a topology
- GET /topology/details: list stored topologies with some stats
- GET /topology/:topology_name/details: get topology stats
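For example, assuming the default listen address and port and that a topology named full_topology_with_issues is stored:

$ curl http://127.0.0.1:8080/topology/full_topology_with_issues | jq            # get the topology definition
$ curl http://127.0.0.1:8080/topology/full_topology_with_issues/details | jq    # get the topology stats
$ curl -X DELETE http://127.0.0.1:8080/topology/full_topology_with_issues       # delete the topology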
Simulation endpoints:
- GET /topology/:topology_name/device/:device/down/impact: run simulations on an existing topology
- POST /topology/custom/device/:device/down/impact: run simulations on the topology provided in the request body
The GET endpoint runs a simulation on a stored topology.
If :device is set to each, it simulates the failure impact of each device, excluding Top of Racks.
- GET /topology/:topology_name/anomalies: get topology anomalies
This lists all anomalies in the topology graph, i.e. nodes that are not properly connected to the graph.
For example:
- a ToR does not have any uplinks
- a spine does not have any downlinks or any uplinks
- an edge does not have any downlinks
This does not consider the status of the links; it only checks whether a link exists.
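A quick example, reusing the topology name and default address used elsewhere in this README:

$ curl http://127.0.0.1:8080/topology/full_topology_with_issues/anomalies | jq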
The topology to provide looks like this in JSON:
{
"nodes": [
{
"hostname": "tor-01-01",
"role": "tor",
"status": true,
"layer": 1
},
{
"hostname": "fabric-1-01",
"role": "fabric",
"status": true,
"layer": 2
}
],
"links": [
{
"south_node": "tor-01-01",
"north_node": "fabric-1-01",
"status": true,
"uid": "10.0.0.0->10.0.0.1"
}
]
}
This structure is subject to change, as the API is not considered stable at the moment.
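As a sketch, assuming the request body is exactly the topology JSON shown above (saved locally as topology.json, a hypothetical file name), a topology can be stored under a name of your choice, or submitted directly for a custom simulation:

$ curl -X POST -H "Content-Type: application/json" -d @topology.json http://127.0.0.1:8080/topology/my_topology
$ curl -X POST -H "Content-Type: application/json" -d @topology.json http://127.0.0.1:8080/topology/custom/device/fabric-1-01/down/impact | jq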
Topology: 4 healthy fabric nodes + 4 healthy ToRs.
Simulations:
- first simulation: the first fabric node is considered down
- second simulation: the second fabric node is considered down, with the first back up
- ...
You can query the following endpoint to simulate the down impact of each device. It gets the topology example from example/full_topology_with_issues.json.
$ curl http://127.0.0.1:8080/topology/full_topology_with_issues/device/each/down/impact | jq
{
"scenarios_result": {
"edge-0": {
"impacts": null,
"parameters": {
"devices_down": [
"edge-0"
],
"links_down": null
}
},
"edge-1": {
"impacts": null,
"parameters": {
"devices_down": [
"edge-1"
],
"links_down": null
}
},
"fabric-1-01": {
"impacts": [
"tor-01-01"
],
"parameters": {
"devices_down": [
"fabric-1-01"
],
"links_down": null
}
},
...,
"compute_time": "89 ms"
}
As you can see, tor-01-01 would be down if we shut down fabric-1-01.
The topology defined in example/full_topology_with_issues.json also specifies some devices as down: all the fabric nodes of pod 01 have been set to down except fabric-1-01. This is why a failure on this device impacts tor-01-01, as this ToR only had one healthy uplink left.
Note: more advanced examples will be provided soon, with more complex scenarios.
Below are some ideas of possible integrations:
- the client pushes the topology with the simulation request; the topology is not stored
+-------------------------+
| Observability metrics |
| example: Prometheus |
+-------------------------+
^
|
| get metrics
|
|
|
+-----------------------+
| | get impact
| Client | on custom topology +---------------+
| => convert metrics |--------------------------------->| ClawNetwork |
| to topology | +---------------+
+-----------------------+
- the client provides the topologies and they are stored
+-------------------------+
| Observability metrics |
| example: Prometheus |
+-------------------------+
^
|
| get metrics
|
|
|
+-----------------------+
| Client | push topology +---------------+ save topology +-------------------------+
| => convert metrics |------------------------->| ClawNetwork |<------------------------>| Storage (FS, redis,...) |
| to topology | get impact +---------------+ get topology +-------------------------+
+-----------------------+
- dedicated topology provider
+---------------------+
+-------------------------+ | Topology provider |
| Observability metrics | <------------------- | => convert metrics |
+-------------------------+ | to topology |
+---------------------+
|
|
| push topology
|
|
|
v
+-----------------------+ get impact +---------------+ save topology +-------------------------+
| Client |------------------------->| ClawNetwork |<------------------------>| Storage (FS, redis,...) |
+-----------------------+ +---------------+ get topology +-------------------------+