Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syntax: optimize some functions in IntervalSet. #1051

Closed
wants to merge 12 commits into from

Conversation

Licheam
Copy link
Contributor

@Licheam Licheam commented Jul 21, 2023

Replace the canonicalize function the in-place way with constant memory.
Replace the negate function the in-place way with constant memory.
Simplify the difference function without changing the algorithm.
Implement symmetric_difference function like others.
Implement push function like others.
Force the symmetric_difference function in Interval return ranges in order.

Licheam added 2 commits July 21, 2023 20:32
Replace the old canonicalize part with the in-place way with constant memory.
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

w00t! This is awesome. Nice work. :-)

regex-syntax/src/hir/interval.rs Outdated Show resolved Hide resolved
@Licheam Licheam changed the title syntax: get canonicalization down in-place with constant memory. syntax: optimize some functions in IntervalSet. Jul 22, 2023
@Licheam Licheam marked this pull request as draft July 22, 2023 09:54
@Licheam Licheam marked this pull request as ready for review July 22, 2023 10:20
@BurntSushi
Copy link
Member

I see you're working on more stuff here. If possible, please avoid merge commits. Instead, please rebase and force push.

@Licheam
Copy link
Contributor Author

Licheam commented Jul 22, 2023

I see you're working on more stuff here. If possible, please avoid merge commits. Instead, please rebase and force push.

Sorry about the inconvenience I brought here, and thank you for your professional advices. I am still learning to contribute codes in public repo. The work on this PR is down from my perspective and sorry again for my inexperienced in git.

@BurntSushi
Copy link
Member

No need to apologize! Everyone deals with git and commit history a little differently. If you look at the commit log for this repo, it's a straight linear line with (usually) decent messages. Basically, if you can just avoid merges then it should be easier for me to massage your PR.

@Licheam
Copy link
Contributor Author

Licheam commented Jul 23, 2023

I see. Thx for the explanation. I'll pay attention to this point in future potential PRs.

@Licheam Licheam force-pushed the interval_canoicalization_opt branch from 5c97a7b to c4b718f Compare July 23, 2023 04:21
@Licheam Licheam force-pushed the interval_canoicalization_opt branch from c4b718f to a71e3c8 Compare July 23, 2023 04:27
@Licheam Licheam marked this pull request as draft July 23, 2023 06:06
@Licheam Licheam marked this pull request as ready for review July 23, 2023 09:44
@Licheam Licheam force-pushed the interval_canoicalization_opt branch from 2f2db21 to dd46d34 Compare July 23, 2023 13:50
BurntSushi pushed a commit that referenced this pull request Oct 6, 2023
This reduces or eliminates allocation when combining Unicode classes and
should make some things faster. It's unlikely for these optimizations to
matter much in practice, but they are likely to help in niche or
pathological cases where there are a lot of ops in a class.

Closes #1051
Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Some of this looks... quite gnarly. But I feel pretty confident with the test suite. I also tried some random mutations to provoke test failures, and it did. So that's good too.

This is really great work. I remember banging my head against the wall trying to write these routines without using extra memory and just gave up.

BurntSushi pushed a commit that referenced this pull request Oct 8, 2023
This reduces or eliminates allocation when combining Unicode classes and
should make some things faster. It's unlikely for these optimizations to
matter much in practice, but they are likely to help in niche or
pathological cases where there are a lot of ops in a class.

Closes #1051
BurntSushi pushed a commit that referenced this pull request Oct 8, 2023
This reduces or eliminates allocation when combining Unicode classes and
should make some things faster. It's unlikely for these optimizations to
matter much in practice, but they are likely to help in niche or
pathological cases where there are a lot of ops in a class.

Closes #1051
BurntSushi pushed a commit that referenced this pull request Oct 9, 2023
This reduces or eliminates allocation when combining Unicode classes and
should make some things faster. It's unlikely for these optimizations to
matter much in practice, but they are likely to help in niche or
pathological cases where there are a lot of ops in a class.

Closes #1051
@BurntSushi BurntSushi closed this in 6d2b09e Oct 9, 2023
BurntSushi added a commit that referenced this pull request Oct 11, 2023
This reverts commit 6d2b09e.

Sadly I just don't have the time to fix this code myself. It's too
subtle. So I'm just reverting it entirely for now.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153
Ref #1051
BurntSushi added a commit that referenced this pull request Oct 11, 2023
This reverts commit 6d2b09e.

Sadly I just don't have the time to fix this code myself. It's too
subtle. So I'm just reverting it entirely for now.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63168
Ref #1051
@BurntSushi
Copy link
Member

FYI, I ended up needing to revert this change because of bugs caught by OSS-fuzz. See #1102 for more info.

BurntSushi added a commit that referenced this pull request Oct 11, 2023
This reverts commit 6d2b09e.

Sadly I just don't have the time to fix this code myself. It's too
subtle. So I'm just reverting it entirely for now.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63168
Ref #1051
BurntSushi added a commit that referenced this pull request Oct 11, 2023
This reverts commit 6d2b09e.

Sadly I just don't have the time to fix this code myself. It's too
subtle. So I'm just reverting it entirely for now.

Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63155
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63153
Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=63168
Ref #1051
BurntSushi added a commit that referenced this pull request Oct 12, 2023
Licheam added a commit to Licheam/regex that referenced this pull request Oct 12, 2023
@Licheam
Copy link
Contributor Author

Licheam commented Oct 12, 2023

FYI, I ended up needing to revert this change because of bugs caught by OSS-fuzz. See #1102 for more info.

Based on the test cases, it turns out that the implementation of push is not very thoughtful. I refractor the whole push function making it more direct and readable while fixing the bugs found. Sorry for my carelessness and thx for the test cases. I made a PR #1104 for the change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants