HMG ==> GA/SA driven MDL-based DNA Compression

Overview

This repository contains a Genetic Algorithm (GA) and Simulated Annealing (SA) based approach to DNA compression that leverages the Minimum Description Length (MDL) principle. The method compresses genomic sequences by searching for the k-mers (patterns) that give the best compression performance. The project targets the rapid growth of genomic data and the storage and management burden that comes with it.
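To make the MDL idea concrete, here is a minimal sketch of how a candidate set of k-mers could be scored. It is not taken from this repository: the class name MdlCostSketch, the fixed-length token code, the 2-bits-per-base dictionary cost, and the greedy longest-match parse are all assumptions chosen for illustration. The score is the number of bits needed to store the k-mer dictionary plus the number of bits needed to store the sequence once it is rewritten using that dictionary; a lower total means a better k-mer set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Illustrative only: a toy two-part MDL cost, not the cost function used by HMG.
class MdlCostSketch {

    // Returns model bits (the dictionary) plus data bits (the encoded sequence).
    static long descriptionLengthBits(String sequence, List<String> kmers) {
        // Keep non-empty k-mers and try longer ones first (greedy longest match).
        List<String> byLength = new ArrayList<>();
        for (String k : kmers) {
            if (!k.isEmpty()) byLength.add(k);
        }
        byLength.sort((a, b) -> Integer.compare(b.length(), a.length()));

        // Assumed model cost: each dictionary base is stored with 2 bits.
        long modelBits = 0;
        for (String k : byLength) modelBits += 2L * k.length();

        // Assumed fixed-length code over 4 literal bases plus one symbol per k-mer.
        int symbols = 4 + byLength.size();
        int bitsPerToken = 32 - Integer.numberOfLeadingZeros(symbols - 1);

        // Parse left to right: emit a k-mer token on a match, else one literal base.
        long tokens = 0;
        int i = 0;
        while (i < sequence.length()) {
            int step = 1;
            for (String k : byLength) {
                if (sequence.startsWith(k, i)) {
                    step = k.length();
                    break;
                }
            }
            i += step;
            tokens++;
        }
        return modelBits + tokens * bitsPerToken;
    }

    public static void main(String[] args) {
        String seq = "ACGTACGTACGTACGT";
        // No dictionary: 16 literals at 2 bits each.
        System.out.println(descriptionLengthBits(seq, Collections.<String>emptyList())); // prints 32
        // One k-mer: 8 dictionary bits + 4 tokens at 3 bits each.
        System.out.println(descriptionLengthBits(seq, Arrays.asList("ACGT")));           // prints 20
    }
}
```

In this toy example, adding the k-mer ACGT lowers the total from 32 to 20 bits, so under the assumed cost it would be worth keeping in the dictionary.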

Features

Genetic Algorithm Optimization: Uses a genetic algorithm to optimize k-mer selection, improving compression efficiency.
Simulated Annealing Optimization: Uses simulated annealing to optimize k-mer selection, improving compression efficiency.
Minimum Description Length Principle: Applies the MDL principle to identify the most compact representation of genomic sequences.
Flexible Input: Supports various genomic datasets in text and FASTA format.
Performance Benchmarking: Evaluates compression ratio and running time against state-of-the-art methods.
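As a rough illustration of the simulated annealing feature listed above, the sketch below searches over subsets of candidate k-mers, represented as a boolean mask with one entry per candidate. It is a generic textbook loop, not the code in this repository: the class name AnnealingSketch, the single-bit-flip neighbourhood, the geometric cooling schedule, and every parameter name are assumptions. The cost function is passed in, so an MDL-style score could be plugged in; the main method uses a toy cost only to show the call.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

// Illustrative only: generic simulated annealing over a k-mer selection mask.
class AnnealingSketch {

    // Returns the lowest-cost mask seen; n is the number of candidates (n > 0).
    static boolean[] anneal(int n, ToDoubleFunction<boolean[]> cost,
                            int iterations, double startTemp, double cooling, long seed) {
        Random rng = new Random(seed);
        boolean[] current = new boolean[n];          // start with no k-mer selected
        double currentCost = cost.applyAsDouble(current);
        boolean[] best = current.clone();
        double bestCost = currentCost;
        double temp = startTemp;

        for (int it = 0; it < iterations; it++) {
            // Neighbour move: toggle one randomly chosen candidate.
            int flip = rng.nextInt(n);
            current[flip] = !current[flip];
            double candidateCost = cost.applyAsDouble(current);
            double delta = candidateCost - currentCost;

            // Always accept improvements; accept worse moves with probability exp(-delta / temp).
            if (delta <= 0 || rng.nextDouble() < Math.exp(-delta / temp)) {
                currentCost = candidateCost;
                if (currentCost < bestCost) {
                    bestCost = currentCost;
                    best = current.clone();
                }
            } else {
                current[flip] = !current[flip];      // reject: undo the toggle
            }
            temp *= cooling;                         // geometric cooling (assumed schedule)
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy cost: number of positions that differ from a hidden target mask.
        boolean[] target = {true, false, true, true, false, false, true, false};
        ToDoubleFunction<boolean[]> cost = mask -> {
            int diff = 0;
            for (int i = 0; i < mask.length; i++) if (mask[i] != target[i]) diff++;
            return diff;
        };
        boolean[] best = anneal(target.length, cost, 2000, 2.0, 0.995, 42L);
        System.out.println("best cost = " + cost.applyAsDouble(best));
    }
}
```

Tracking the best mask separately from the current one means the returned solution is never worse than the starting point, even when the search ends on an accepted uphill move.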

Installation

Prerequisites

Java Development Kit (JDK) 8 or higher
Maven (for building the project)

Clone the Repository

  git clone https://github.com/MuhammadzohaibNawaz/HMG.git
  cd HMG

Build the Project

  mvn clean install

Usage

To run the compression algorithm, modify the DNAClassification class according to your input files and parameters.

Set Dataset: Modify the DS variable to select your dataset.
Adjust Parameters: Change parameters such as generations and topSubsequences (supplied as input when the program runs) to suit your requirements.
Run the Program: Execute the main method to start the compression process.

  java -cp target/HMG-1.0-SNAPSHOT.jar dna.HMG

Contributions

Contributions are welcome! Please open an issue or submit a pull request for any enhancements or bug fixes.

Acknowledgments

The development of this compression algorithm was inspired by the need for efficient genomic data storage.
