feature: Allow alignment between protein or nucleotide sequences #16
Comments
Thanks for submitting! It's an interesting idea, and it's definitely a use case for the Levenshtein distance algorithm. Is this algorithm purely for biology? From the perspective of a library user (not a developer), the Hamming and Levenshtein distance algorithms have a wide range of applications. That includes biology, but not only biology. Ideally, a library should only ship what will be used. Those two algorithms are (at least for now) the main focus, whereas Needleman-Wunsch is biology-focused. I do like the idea though, and I think it would make more sense to turn this repository into a monorepo of related crates. We can do this with a "Cargo workspace" (https://doc.rust-lang.org/book/ch14-03-cargo-workspaces.html). If you look at the GitHub repository for the I think we can do something like this, except move it to where all crates are in a
And then in the future, having a workspace would also make room for a crate like
If this is something you're interested in, we should first create an issue to set up the repository as a monorepo, and then we can create a crate for the Needleman-Wunsch algorithm.
As of right now, there are two algorithms I would like to implement in bio_diff, those being
Both have to do with aligning protein or nucleotide sequences. Each algorithm will have its own file, similar to how differ is structured. I am thinking of reusing the same enums and structs, EXCEPT that I plan on implementing the memory optimizations from issue #25 from the start.
By the way, I mentioned earlier that you can have a crate as a dependency of another crate :) So, you can have
If a crate depends on another crate, does this have any performance downsides? Also, if bio_diff depends on differ, does the user just have access to bio_diff, or also to differ?
No performance downsides here. Think of it this way: if bio_diff didn't depend on differ, both libraries would have to duplicate code, and a user who used both would pay for that duplication. It would also be a burden on the developer to maintain the duplicate code.
They'd just have access to
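To illustrate the visibility point in general Rust terms, here is a minimal single-file sketch. It is not the actual code of differ or bio_diff: the nested private module only stands in for a dependency, and the names `mismatches` and `hamming` are hypothetical. In a real two-crate setup, a user of one crate can reach a dependency's items only if that crate re-exports them with `pub use`, or if the user adds the dependency to their own `Cargo.toml`.

```rust
// Stand-in for the `bio_diff` crate. The nested private module plays the role
// of its `differ` dependency (hypothetical layout, for illustration only).
mod bio_diff {
    mod differ {
        /// Hamming distance: number of positions at which two equal-length
        /// inputs differ. Returns `None` when the lengths do not match.
        pub fn hamming(a: &str, b: &str) -> Option<usize> {
            if a.len() != b.len() {
                return None;
            }
            Some(a.bytes().zip(b.bytes()).filter(|(x, y)| x != y).count())
        }
    }

    /// Public API of the outer "crate"; it uses the dependency internally.
    pub fn mismatches(a: &str, b: &str) -> Option<usize> {
        differ::hamming(a, b)
    }

    // Uncommenting this re-export would expose `hamming` to users as
    // `bio_diff::hamming`:
    // pub use differ::hamming;
}

fn main() {
    // Users can call the public API of the outer "crate".
    println!("{:?}", bio_diff::mismatches("GATTACA", "GACTATA")); // prints Some(2)

    // This would not compile, because the `differ` module is private:
    // bio_diff::differ::hamming("GATTACA", "GACTATA");
}
```

Whether to re-export anything is a design choice: keeping the dependency private leaves the outer crate free to change it later without breaking users.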
This can be done with the Needleman–Wunsch algorithm. As the title mentions, it's an algorithm that allows you to align protein or nucleotide sequences. This algorithm will be in its own file to follow the standard of the project.
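For reference, a rough sketch of what the scoring part of Needleman–Wunsch could look like in Rust. The function name, the signature, and the linear gap penalty are assumptions made for illustration, not an existing API of this project. It returns only the global alignment score (no traceback), which is why keeping two rows of the table is enough here.

```rust
/// Global alignment score via Needleman-Wunsch dynamic programming.
/// Hypothetical signature: `match_score`, `mismatch` and `gap` are the scores
/// added for a matching pair, a mismatching pair, and a gap, respectively.
fn needleman_wunsch_score(a: &str, b: &str, match_score: i32, mismatch: i32, gap: i32) -> i32 {
    let a = a.as_bytes();
    let b = b.as_bytes();

    // Row 0: aligning the empty prefix of `a` with b[..j] costs j gaps.
    let mut prev: Vec<i32> = (0..=b.len() as i32).map(|j| j * gap).collect();
    let mut curr = vec![0i32; b.len() + 1];

    for i in 1..=a.len() {
        // Column 0: aligning a[..i] with the empty prefix of `b` costs i gaps.
        curr[0] = i as i32 * gap;
        for j in 1..=b.len() {
            let sub = if a[i - 1] == b[j - 1] { match_score } else { mismatch };
            let diag = prev[j - 1] + sub; // align a[i-1] with b[j-1]
            let up = prev[j] + gap;       // a[i-1] aligned with a gap
            let left = curr[j - 1] + gap; // b[j-1] aligned with a gap
            curr[j] = diag.max(up).max(left);
        }
        // The row just computed becomes the previous row for the next step.
        std::mem::swap(&mut prev, &mut curr);
    }
    prev[b.len()]
}

fn main() {
    // Three matches and one gap: 3 * 1 + 1 * (-1) = 2.
    println!("{}", needleman_wunsch_score("ACGT", "AGT", 1, -1, -1)); // prints 2
}
```

Producing the actual alignment (not just the score) would require keeping the full table, or some other way to trace back the chosen moves.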