A good approximation of the Minimum Vertex Cover Solver
frankvegadelgado committed Jan 26, 2025
1 parent 86d8fb8 commit 3d542ab
Showing 42 changed files with 1,553 additions and 1 deletion.
31 changes: 31 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,31 @@
name: Upload Python Package to PyPI when a Release is Created

on:
  release:
    types: [created]

jobs:
  pypi-publish:
    name: Publish release to PyPI
    runs-on: ubuntu-latest
    environment:
      name: pypi
      url: https://pypi.org/p/capablanca
    permissions:
      id-token: write

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: "3.x"
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel
      - name: Build package
        run: |
          python setup.py sdist bdist_wheel
      - name: Publish package distributions to PyPI
        uses: pypa/gh-action-pypi-publish@release/v1
13 changes: 13 additions & 0 deletions .gitignore
@@ -94,6 +94,12 @@ ipython_config.py
# install all needed dependencies.
#Pipfile.lock

# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
@@ -160,3 +166,10 @@ cython_debug/
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# PyPI configuration file
.pypirc

# user custom
app.log
sparse_matrix_*
195 changes: 194 additions & 1 deletion README.md
@@ -1 +1,194 @@
# capablanca
# Capablanca: Minimum Vertex Cover Solver

![Honoring the Memory of Jose Raul Capablanca (Third World Chess Champion from 1921 to 1927)](docs/capablanca.jpg)

This work builds upon [The Minimum Vertex Cover Problem](https://www.researchgate.net/publication/388382740_The_Minimum_Vertex_Cover_Problem).

---

# The Minimum Vertex Cover Problem

The Minimum Vertex Cover (MVC) problem is a classic optimization problem in computer science and graph theory. The goal is to find the smallest set of vertices in a graph that covers all the edges: for every edge in the graph, at least one of its endpoints must be in the chosen set.

## Formal Definition

Given an undirected graph $G = (V, E)$, where $V$ is the set of vertices and $E$ is the set of edges, a vertex cover is a subset $V' \subseteq V$ such that for every edge $(u, v) \in E$, at least one of the vertices $u$ or $v$ belongs to $V'$. The Minimum Vertex Cover problem aims to find a vertex cover $V'$ with the smallest possible cardinality (i.e., the fewest number of vertices).
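
The definition translates directly into a membership test over the edges. The following is a minimal sketch; the function name `is_vertex_cover` and the edge-list representation are assumptions made for illustration and are not taken from the Capablanca package.

```python
def is_vertex_cover(edges, cover):
    """Return True if every edge has at least one endpoint in `cover`."""
    cover = set(cover)
    return all(u in cover or v in cover for u, v in edges)


# Hypothetical example: a path on three vertices, 1 - 2 - 3.
edges = [(1, 2), (2, 3)]
print(is_vertex_cover(edges, {2}))  # True: vertex 2 touches both edges
print(is_vertex_cover(edges, {1}))  # False: edge (2, 3) is uncovered
```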

## Visual Example

Consider the following simple graph:

A --- B
|     |
C --- D

In this graph:

- $\{A, B, C, D\}$ is a vertex cover (it includes all vertices, trivially covering all edges).
- $\{B, C\}$ is also a vertex cover, as it covers all edges: $(A, B)$, $(A, C)$, $(B, D)$, and $(C, D)$.
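- $\{A, D\}$ is also a vertex cover: $A$ covers $(A, B)$ and $(A, C)$, and $D$ covers $(B, D)$ and $(C, D)$.
- $\{A, B\}$ is _not_ a vertex cover because the edge $(C, D)$ is not covered.

$\{B, C\}$ and $\{A, D\}$ are the minimum vertex covers for this graph: no single vertex touches all four edges, so at least two vertices are needed.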
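
For a graph this small, the minimum covers can be found by exhaustive search. The sketch below uses the four edges listed for the square graph; the helper name `minimum_vertex_covers` is hypothetical, and the approach is exponential in the number of vertices, so it is only suitable for tiny examples.

```python
from itertools import combinations

# The square graph from the example: edges (A,B), (A,C), (B,D), (C,D).
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")]


def minimum_vertex_covers(vertices, edges):
    """Return every vertex cover of the smallest possible size."""
    for size in range(len(vertices) + 1):
        covers = [
            set(subset)
            for subset in combinations(vertices, size)
            if all(u in subset or v in subset for u, v in edges)
        ]
        if covers:
            # The first size that admits a cover is the minimum size.
            return covers


covers = minimum_vertex_covers(vertices, edges)
print(sorted(sorted(c) for c in covers))  # [['A', 'D'], ['B', 'C']]
```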

## Importance and Applications

The Minimum Vertex Cover problem is important for several reasons:

- **Theoretical Significance:** It is a well-studied NP-hard problem: no polynomial-time algorithm is known that solves every instance optimally. This makes it a central problem in complexity theory.
- **Practical Applications:** It has applications in various fields, including:
  - **Network security:** Finding critical nodes in a network that, if compromised, would disrupt connections.
  - **Bioinformatics:** Identifying important genes in gene regulatory networks.
  - **Wireless sensor networks:** Determining the minimum number of sensors needed to monitor a given area.

## Related Problems

The Minimum Vertex Cover problem is closely related to other graph problems, such as:

- **Maximum Independent Set:** A set of vertices where no two vertices are adjacent. The size of the minimum vertex cover plus the size of the maximum independent set is equal to the total number of vertices in the graph.
- **Set Cover Problem:** A more general problem where sets of elements are used to cover a universe of elements.
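
The relationship with Maximum Independent Set follows from a simple fact: a set is independent exactly when its complement is a vertex cover. The sketch below checks this on a small path graph chosen for illustration; the graph and the helper names are assumptions, not part of the package.

```python
from itertools import combinations


def is_independent_set(edges, candidate):
    """True if no edge has both endpoints inside `candidate`."""
    return not any(u in candidate and v in candidate for u, v in edges)


def is_vertex_cover(edges, candidate):
    """True if every edge has at least one endpoint inside `candidate`."""
    return all(u in candidate or v in candidate for u, v in edges)


# Hypothetical example graph: the path 0 - 1 - 2 - 3 - 4.
vertices = set(range(5))
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]

subsets = [
    set(c)
    for k in range(len(vertices) + 1)
    for c in combinations(sorted(vertices), k)
]

# Every subset is independent exactly when its complement is a cover.
for s in subsets:
    assert is_independent_set(edges, s) == is_vertex_cover(edges, vertices - s)

max_independent = max((s for s in subsets if is_independent_set(edges, s)), key=len)
min_cover = vertices - max_independent
print(len(max_independent), len(min_cover))  # 3 2
```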

# Our Algorithm - Polynomial Runtime

## The algorithm explanation:

We employ the `minimum_edge_cut()` function from NetworkX to identify the minimum edge cut within an undirected graph. By iteratively solving the minimum edge cut problem on the connected components of the graph, we can obtain an approximate solution to the Minimum Vertex Cover Problem with an approximation ratio less than $\sqrt{2}$.

# Compile and Environment

## Install Python >=3.10.

## Install Capablanca's Library and its Dependencies with:

```bash
pip install capablanca
```

# Execute

1. Clone the repository and go to the package directory to use the benchmarks:

```bash
git clone https://github.com/frankvegadelgado/capablanca.git
cd capablanca
```

2. Execute the script:

```bash
cover -i .\benchmarks\testMatrix1.txt
```

This uses the `cover` command provided by Capablanca's library to process the Boolean adjacency matrix `capablanca\benchmarks\testMatrix1.txt`. Compressed .txt files with the extensions .xz, .lzma, .bz2, and .bzip2 are also supported.
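
How the package reads these compressed formats is not shown here. As a rough sketch, Python's standard `lzma` and `bz2` modules can open all four extensions in text mode; the `OPENERS` table and the `read_matrix_rows` helper below are hypothetical names, and the package's own loader may be organised differently.

```python
import bz2
import lzma
from pathlib import Path

# Hypothetical mapping from file extension to an opener function.
OPENERS = {
    ".xz": lzma.open,
    ".lzma": lzma.open,
    ".bz2": bz2.open,
    ".bzip2": bz2.open,
}


def read_matrix_rows(path):
    """Read a plain or compressed text file and return its non-empty rows."""
    path = Path(path)
    # Fall back to the built-in open() for uncompressed .txt files.
    opener = OPENERS.get(path.suffix.lower(), open)
    with opener(path, "rt", encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip()]
```

Calling `read_matrix_rows("benchmarks/testMatrix1.txt")` from the repository root would then return the five rows of that benchmark as strings.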

## The console output will display:

```
testMatrix1.txt: Vertex Cover Found 2, 3, 4
```

This means that the graph described by the Boolean adjacency matrix `capablanca\benchmarks\testMatrix1.txt` has a vertex cover consisting of nodes `2, 3, 4`.
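
The reported cover can be checked by hand against the matrix. The sketch below embeds the contents of `benchmarks/testMatrix1.txt` and assumes that node labels are zero-based row indices; that assumption is not stated in the README, but it is the reading under which `2, 3, 4` covers every edge. The helper name `edges_from_matrix` is hypothetical.

```python
# Contents of benchmarks/testMatrix1.txt, copied from the repository.
MATRIX_TEXT = """\
00101
00010
10001
01000
10100
"""


def edges_from_matrix(text):
    """Turn rows of '0'/'1' characters into a sorted list of undirected edges."""
    rows = [line.strip() for line in text.splitlines() if line.strip()]
    return sorted(
        (i, j)
        for i, row in enumerate(rows)
        for j, bit in enumerate(row)
        if bit == "1" and i < j  # keep each undirected edge once
    )


edges = edges_from_matrix(MATRIX_TEXT)
print(edges)  # [(0, 2), (0, 4), (1, 3), (2, 4)]

# Assumption: the reported labels are zero-based row indices.
reported = {2, 3, 4}
print(all(u in reported or v in reported for u, v in edges))  # True
```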

---

## Size of the Approximate Vertex Cover - Polynomial Runtime

The `-c` flag counts the nodes in the approximate vertex cover.

**Example:**

```bash
cover -i .\benchmarks\testMatrix2.txt -c
```

**Output:**

```
testMatrix2.txt: Vertex Cover Size 5
```

## Runtime Analysis:

We employ the same algorithm used to find the vertex cover set.

---

# Command Options

To display the help message and available options, run the following command in your terminal:

```bash
cover -h
```

This will output:

```
usage: cover [-h] -i INPUTFILE [-a] [-b] [-c] [-v] [-l] [--version]

Approximate Solution to the Minimum Vertex Cover Problem for an undirected graph represented by a Boolean Adjacency Matrix given in a File within a factor lesser than sqrt(2).

options:
  -h, --help            show this help message and exit
  -i INPUTFILE, --inputFile INPUTFILE
                        input file path
  -a, --approximation   enable comparison with a polynomial-time approximation approach within a factor of 2
  -b, --bruteForce      enable comparison with the exponential-time brute-force approach
  -c, --count           calculate the size of the vertex cover
  -v, --verbose         enable verbose output
  -l, --log             enable file logging
  --version             show program's version number and exit
```

This output describes all available options.
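
The `-a` option compares against a factor-2 approximation. The README does not say which one the package uses; the best-known candidate is the maximal-matching heuristic, sketched here purely as an assumption. The function name `two_approx_vertex_cover` is hypothetical, and the edge list is that of `benchmarks/testMatrix1.txt` read with zero-based indices.

```python
def two_approx_vertex_cover(edges):
    """Maximal-matching heuristic: the result is at most twice the optimum."""
    cover = set()
    for u, v in edges:
        # Take both endpoints of any edge that is still uncovered.
        if u not in cover and v not in cover:
            cover.update((u, v))
    return cover


# Edges of benchmarks/testMatrix1.txt, assuming zero-based node indices.
edges = [(0, 2), (0, 4), (1, 3), (2, 4)]
cover = two_approx_vertex_cover(edges)
print(sorted(cover))  # [0, 1, 2, 3]
```

The heuristic returns four nodes here, while the cover `2, 3, 4` reported by `cover` has three, which is the kind of gap the comparison flag is meant to expose.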

## The Capablanca Testing Application

A command-line tool, `test_cover`, has been developed for testing algorithms on randomly generated, large sparse matrices. It accepts the following options:

```
usage: test_cover [-h] -d DIMENSION [-n NUM_TESTS] [-s SPARSITY] [-a] [-b] [-c] [-w] [-v] [-l] [--version]

The Capablanca Testing Application.

options:
  -h, --help            show this help message and exit
  -d DIMENSION, --dimension DIMENSION
                        an integer specifying the dimensions of the square matrices
  -n NUM_TESTS, --num_tests NUM_TESTS
                        an integer specifying the number of tests to run
  -s SPARSITY, --sparsity SPARSITY
                        sparsity of the matrices (0.0 for dense, close to 1.0 for very sparse)
  -a, --approximation   enable comparison with a polynomial-time approximation approach within a factor of 2
  -b, --bruteForce      enable comparison with the exponential-time brute-force approach
  -c, --count           calculate the size of the vertex cover
  -w, --write           write the generated random matrix to a file in the current directory
  -v, --verbose         enable verbose output
  -l, --log             enable file logging
  --version             show program's version number and exit
```

**This tool is designed to benchmark algorithms for sparse matrix operations.**

It generates random square matrices with configurable dimensions (`-d`), sparsity levels (`-s`), and number of tests (`-n`). Brute-force and heuristic comparisons are available but not recommended for large datasets due to performance issues. Additionally, the generated matrix can be written to the current directory (`-w`), and verbose output or file logging can be enabled with the (`-v`) or (`-l`) flag, respectively, to record test results.
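
How `test_cover` generates its matrices is not described in detail. One plausible reading of the `-d` and `-s` options is sketched below using only the standard library; the function name `random_adjacency_matrix`, the `seed` parameter, and the choice to keep the matrix symmetric with a zero diagonal are all assumptions.

```python
import random


def random_adjacency_matrix(dimension, sparsity, seed=None):
    """Build a symmetric 0/1 matrix with a zero diagonal.

    Each possible edge is present with probability (1 - sparsity),
    so 0.0 gives a dense graph and values near 1.0 give a sparse one.
    """
    rng = random.Random(seed)
    matrix = [[0] * dimension for _ in range(dimension)]
    for i in range(dimension):
        for j in range(i + 1, dimension):
            if rng.random() >= sparsity:
                # Mirror the entry so the graph stays undirected.
                matrix[i][j] = matrix[j][i] = 1
    return matrix


matrix = random_adjacency_matrix(dimension=6, sparsity=0.7, seed=1)
# Print in the same row-of-digits format as the benchmark files.
print("\n".join("".join(map(str, row)) for row in matrix))
```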

---

# Code

- Python code by **Frank Vega**.

---

# Complexity

```diff
+ We present a polynomial-time approximation algorithm for the vertex cover problem in undirected graphs, achieving an approximation ratio of less than sqrt(2).
+ This algorithm has implications for solving other computational problems in combinatorial optimization and graph theory.
+ The existence of such an algorithm implies that P = NP.
```

---

# License

- MIT.
5 changes: 5 additions & 0 deletions benchmarks/testMatrix1.txt
@@ -0,0 +1,5 @@
00101
00010
10001
01000
10100
5 changes: 5 additions & 0 deletions benchmarks/testMatrix10.txt
@@ -0,0 +1,5 @@
00000
00001
00000
00000
01000
4 changes: 4 additions & 0 deletions benchmarks/testMatrix11.txt
@@ -0,0 +1,4 @@
0000
0001
0000
0100
5 changes: 5 additions & 0 deletions benchmarks/testMatrix12.txt
@@ -0,0 +1,5 @@
00000
00000
00001
00000
00100
6 changes: 6 additions & 0 deletions benchmarks/testMatrix13.txt
@@ -0,0 +1,6 @@
000000
000000
000000
000001
000001
000110
2 changes: 2 additions & 0 deletions benchmarks/testMatrix14.txt
@@ -0,0 +1,2 @@
00
00
5 changes: 5 additions & 0 deletions benchmarks/testMatrix15.txt
@@ -0,0 +1,5 @@
00100
00101
11000
00000
01000
6 changes: 6 additions & 0 deletions benchmarks/testMatrix16.txt
@@ -0,0 +1,6 @@
011100
100100
100101
111000
000000
001000
4 changes: 4 additions & 0 deletions benchmarks/testMatrix17.txt
@@ -0,0 +1,4 @@
0100
1000
0000
0000
5 changes: 5 additions & 0 deletions benchmarks/testMatrix18.txt
@@ -0,0 +1,5 @@
00001
00001
00000
00000
11000
6 changes: 6 additions & 0 deletions benchmarks/testMatrix19.txt
@@ -0,0 +1,6 @@
011001
101001
110000
000000
000000
110000
12 changes: 12 additions & 0 deletions benchmarks/testMatrix2.txt
@@ -0,0 +1,12 @@
011101011011
100101011010
100100001010
111010001010
000100011010
110000000000
000000000000
110010000000
111110000000
000000000000
111110000000
100000000000
7 changes: 7 additions & 0 deletions benchmarks/testMatrix20.txt
@@ -0,0 +1,7 @@
0101100
1001000
0000000
1100100
1001000
0000000
0000000
15 changes: 15 additions & 0 deletions benchmarks/testMatrix21.txt
@@ -0,0 +1,15 @@
001011010000100
000011010000100
100011010000100
000011010000100
111101110011110
111110000000000
000010000000000
111110000000000
000000000000000
000000000000000
000010000000000
000010000000000
111110000000000
000010000000000
000000000000000
15 changes: 15 additions & 0 deletions benchmarks/testMatrix22.txt
@@ -0,0 +1,15 @@
001011010000100
000011010000100
100011010000100
000011010000101
111101110011110
111110000000000
000010000000000
111110000000000
000000000000000
000000000000000
000010000000000
000010000000000
111110000000000
000010000000000
000100000000000
9 changes: 9 additions & 0 deletions benchmarks/testMatrix23.txt
@@ -0,0 +1,9 @@
000111000
000111000
000111010
111000000
111000000
111000000
000000000
001000000
000000000
7 changes: 7 additions & 0 deletions benchmarks/testMatrix24.txt
@@ -0,0 +1,7 @@
0000000
0000001
0000000
0000000
0000000
0000000
0100000