Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: Optimize and reduce size of StringDiffOpKind enum #25

Closed
neoncitylights opened this issue Jan 1, 2023 · 2 comments · Fixed by #30
Closed

perf: Optimize and reduce size of StringDiffOpKind enum #25

neoncitylights opened this issue Jan 1, 2023 · 2 comments · Fixed by #30
Assignees
Labels
lvl-1-easy Easy-ranking issue p2-normal Priority 2: Someone plans to work on this task in the foreseeable future. t-code-quality Improvements to code to make it more maintainable, readable, etc t-performance Issue relating to performance, like optimizations or benchmarks

Comments

@neoncitylights
Copy link
Contributor

neoncitylights commented Jan 1, 2023

  • StringDiffOpKind::Delete holds a char, but this isn't needed. We only need to know at what position something should be deleted, regardless of what exists at that position.
  • StringDiffOpKind::Transpose holds two usize integers but on reflection, doesn't need to hold any (if we're only detecting adjacent transpositions, which means the two letters have to be right next to each other). For example, the Damerau-Levenshtein distance algorithm builds on top of the Levenshtein distance by recognizing transpositions too, but only if they're adjacent.

Ultimately, the StringDiffOpKind enum should look like:

pub enum StringDiffOpKind {
   Substitute(char, char),
   Insert(char),
   Delete,
   Transpose,
}

This allows for two nice optimizations:

  • previously, the total size of StringDiffOpKind was 24 bytes. Now, by reducing unnecessary associated data, it's 8 bytes -> this allows for some memory alignment since it'll take a shorter time to copy on the stack.
  • The size of a StringDiffOp also reduces from 32 bytes to 16 bytes.

You can test this by calling std::mem::size_of(), with an online Rust playground link to try it out.

fn main() {
    println!("{}", std::mem::size_of::<StringDiffOpKind>());
    println!("{}", std::mem::size_of::<StringDiffOp>());
}

Some nice resources:

@neoncitylights neoncitylights added lvl-1-easy Easy-ranking issue p2-normal Priority 2: Someone plans to work on this task in the foreseeable future. t-performance Issue relating to performance, like optimizations or benchmarks labels Jan 1, 2023
@neoncitylights neoncitylights changed the title Optimize and reduce size of StringDiffOpKind enum perf: Optimize and reduce size of StringDiffOpKind enum Jan 1, 2023
@neoncitylights neoncitylights added the t-code-quality Improvements to code to make it more maintainable, readable, etc label Jan 1, 2023
@notalfredo
Copy link
Member

This looks really good. Like you mentioned when I was implementing the deletion match I noticed that the char field for deletion was probably not needed. Overall this looks really good.

@neoncitylights
Copy link
Contributor Author

Thanks! Do you want to work on this? I can work on this instead as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lvl-1-easy Easy-ranking issue p2-normal Priority 2: Someone plans to work on this task in the foreseeable future. t-code-quality Improvements to code to make it more maintainable, readable, etc t-performance Issue relating to performance, like optimizations or benchmarks
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants