BICL-SDP is an exact algorithm, based on the branch-and-cut technique, for solving the biclustering problem through the
Sudoso, A. M. (2024). A Semidefinite Programming-Based Branch-and-Cut Algorithm for Biclustering. INFORMS Journal on Computing, https://doi.org/10.1287/ijoc.2024.0683.
BICL-SDP calls the semidefinite programming solver SDPNAL+ by using the MATLAB Engine API for C++. It requires the MATLAB engine library libMatlabEngine and the Matlab Data Array library libMatlabDataArray. BICL-SDP calls the linear programming solver Gurobi and uses Armadillo to handle matrices and linear algebra operations efficiently.
Ubuntu and Debian instructions:
-
Install MATLAB (>= 2021b)
-
Install Gurobi (>= 10.0.2)
-
Install CMake, OpenBLAS, LAPACK and Armadillo:
sudo apt-get update
sudo apt-get install cmake libopenblas-dev liblapack-dev libarmadillo-dev
-
Open the makefile
biclustering_cpp/Makefile
- Set the variable
matlab_path
with your MATLAB folder.
- Set the variable
-
Compile the code:
cd biclustering_cpp/
make
- Download SDPNAL+, move the folder
biclustering_matlab
containing the MATLAB source code of BICL-SDP in the SDPNAL+ main directory and set the parameterSDP_SOLVER_FOLDER
of the configuration file accordingly. This folder and its subfolders will be automatically added to the MATLAB search path when BICL-SDP starts.
This code has been tested under Ubuntu 22.04 LTS with MATLAB R2021b, Gurobi 10.0.2 and Armadillo 12.6.
Various parameters used in BICL-SDP can be modified in the configuration file biclustering_cpp/config.txt
:
BRANCH_AND_BOUND_TOL
- optimality tolerance of the exact algorithmBRANCH_AND_BOUND_PARALLEL
- thread pool size: single thread (1), multi-thread (> 1)BRANCH_AND_BOUND_MAX_NODES
- maximum number of nodesBRANCH_AND_BOUND_VISITING_STRATEGY
- best first (0), depth first (1), breadth first (2)MATLAB_SESSION_THREADS_ROOT
- number of threads for the MATLAB session at the root noeeMATLAB_SESSION_THREADS_CHILD
- number of threads for the MATLAB session for child nodesSDP_SOLVER_FOLDER
- full path of SDPNAL+ folderSDP_SOLVER_TOL
- accuracy of SDPNAL+ in the relative KKT residualSDP_SOLVER_VERBOSE
- do not display log (0), display log (1)CP_MAX_ITER
- maximum number of cutting-plane iterationsCP_TOL
- tolerance between two consecutive cutting-plane iterationsCP_MAX_INEQ
- maximum number of valid inequalities to separateCP_PERC_INEQ
- fraction of the most violated inequalities to addCP_EPS_INEQ
- tolerance for checking the violation of the inequalitiesCP_EPS_ACTIVE
- tolerance for detecting active inequalitiesGUROBI_FOLDER
- Gurobi solver pathGUROBI_VERBOSE
- do not display log (0), display log (1)
cd biclustering_cpp/
./bb <W_PATH> <K> <LOG_PATH> <RESULT_PATH>
W_PATH
- full path of the data matrixK
- number of biclustersLOG_PATH
- path of the log fileRESULT_PATH
- path of the optimal bicluster assignment matrices
File W_PATH
contains the weights w_ij
and the must include an header line with the number of rows n
and columns m
:
n m
w_11 w_12 ... w_1m
w_21 w_22 ... w_2m
...
...
w_n1 w_n2 ... w_nm
The log file reports the progress of the algorithm:
N
- number of rows at the current nodeM
- number of columns at the current nodeID_PAR
- id of the parent nodeID
- id of the current nodeUB_PAR
- upper bound of the parent nodeUB
- upper bound of the current nodeTIME (s)
- running time in seconds of the current nodeCP_ITER
- number of cutting-plane iterationsCP_FLAG
- termination flag of the cutting-plane procedure-2
- maximum number of iterations-1
- SDP not solved or partially solved successfully0
- no violated inequalities1
- node must be pruned2
- upper bound greater than the previous one3
- upper bound decrease is not sufficiently large
CP_INEQ
- number of inequalities added in the last cutting-plane iterationLB
- current lower boundBEST_LB
- global lower boundSET
- vertex set selection for branchingU
- branch on the vertices in UV
- branch on the vertices in V-1
- branching is not needed
I J
- indices of branching decisionNODE_GAP
- gap at the current nodeGAP
- overall gapOPEN
- number of open nodes