Wrong boundary detected when converting from camel case #12

Open
ThoFrank opened this issue Nov 21, 2022 · 2 comments

Comments

@ThoFrank

Hi there,
I really like your crate. However, I came across a small issue. If the second char is uppercase in a camel case string, no boundary should be inserted after the first char.

Here are two test cases:

fails with "foo_bar" != "f_oo_bar"

use convert_case::{Case, Casing};

#[test]
fn test_camel_case() {
    assert_eq!("foo_bar", "fOOBar".from_case(Case::Camel).to_case(Case::Snake))
}

succeeds

#[test]
fn test_pascal_case() {
    assert_eq!("foo_bar", "FOOBar".from_case(Case::Camel).to_case(Case::Snake))
}

It's probably quite a niche bug, and it's also easy to work around by always making the first letter uppercase and then converting from pascal case instead.
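A minimal sketch of that workaround, using only the standard library. The name `capitalize_first` is a hypothetical helper chosen for illustration and is not part of convert_case.

```rust
/// Hypothetical helper: uppercase the first character of `s` and leave
/// the rest of the string unchanged.
fn capitalize_first(s: &str) -> String {
    let mut chars = s.chars();
    match chars.next() {
        // `to_uppercase` yields an iterator because some characters
        // uppercase to more than one char; chain the remainder after it.
        Some(first) => first.to_uppercase().chain(chars).collect(),
        None => String::new(),
    }
}
```

The result would then go through the same conversion as in the second test above, for example `capitalize_first("fOOBar").from_case(Case::Pascal).to_case(Case::Snake)`.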

@rutrum
Owner

rutrum commented Nov 24, 2022

Hello there,

This is not a bug; it is intended behavior. Recall that from_case really just pulls the list of Boundary values that are commonly associated with that case. In camel case, for example, you would expect a lowercase letter followed by an uppercase letter to be a boundary (aA). There are boundaries for digits as well. Luckily, convert_case lets you easily see the boundaries associated with a case. Here are those for camel case.

println!("{:?}", Case::Camel.boundaries());
[LowerUpper, Acronym, LowerDigit, UpperDigit, DigitLower, DigitUpper]

We can also look at all the possible boundaries that can be identified in a provided string. Let's look at what is in your example strings.

println!("{:?}", Boundary::list_from("FOOBar"));
[Acronym]
println!("{:?}", Boundary::list_from("fOOBar"));
[LowerUpper, Acronym]

FOOBar contains the acronym boundary, and because that is in camel case's boundaries it is used as the point to split the string into words. It gets split into FOO and Bar, which are then combined into foo_bar as snake case.

fOOBar contains the acronym boundary AND the lowerupper boundary. This lowerupper boundary is at the first two characters, fO. It is also in camel case's list of boundaries, so the string is split into f, OO, and Bar, which are combined into f_oo_bar as snake case.

This lowerupper boundary is expected for camel case, since that's how we join words. The end of one word is lowercase and the next begins with uppercase. In the case of fOOBar, the first word is f, followed by OOBar.
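To make the two splits concrete, here is a simplified stand-in for just the lowerupper and acronym boundaries. It is a sketch under assumptions, not the crate's actual implementation: the names `split_points` and `to_snake` are hypothetical, and the digit boundaries are ignored.

```rust
/// Returns the char indices at which `s` would be split, considering only
/// simplified versions of the lowerupper and acronym boundaries.
fn split_points(s: &str) -> Vec<usize> {
    let chars: Vec<char> = s.chars().collect();
    let mut points = Vec::new();
    for i in 1..chars.len() {
        let prev = chars[i - 1];
        let cur = chars[i];
        // lowerupper: a lowercase letter followed by an uppercase one (aA).
        let lower_upper = prev.is_lowercase() && cur.is_uppercase();
        // acronym: two uppercase letters where the second starts a new
        // word, i.e. it is followed by a lowercase letter (AAa -> A | Aa).
        let acronym = prev.is_uppercase()
            && cur.is_uppercase()
            && chars.get(i + 1).map_or(false, |c| c.is_lowercase());
        if lower_upper || acronym {
            points.push(i);
        }
    }
    points
}

/// Splits `s` at the detected boundaries, lowercases each word, and joins
/// the words with underscores.
fn to_snake(s: &str) -> String {
    let chars: Vec<char> = s.chars().collect();
    let mut words = Vec::new();
    let mut start = 0;
    for p in split_points(s) {
        words.push(chars[start..p].iter().collect::<String>().to_lowercase());
        start = p;
    }
    words.push(chars[start..].iter().collect::<String>().to_lowercase());
    words.join("_")
}
```

With this sketch, `to_snake("FOOBar")` gives `foo_bar` (one split, at index 3), while `to_snake("fOOBar")` gives `f_oo_bar` (splits at indices 1 and 3), matching the results described above.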

All that is to say this is expected behavior.

@ThoFrank
Author

Thanks for the answer,
I already dug around the code a bit, and I understand that what happened is expected based on the boundary logic. However, I think that in the case of fOOBar the more correct approach would be to ignore the first boundary, i.e. treat the first letter as uppercase.
It would probably take an ugly patch to "fix" it (to make it behave the way I consider more correct), and the user-side fix is quite easy to do.
The main motivation of this bug report was to make you aware of this and then maybe tell others who are looking for this that they have to manually fix it on their end.

Cheers
