Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add string/base/last-grapheme-cluster #1810

Merged
merged 9 commits into from
Mar 21, 2024

Conversation

adityacodes30
Copy link
Contributor

@adityacodes30 adityacodes30 commented Mar 9, 2024

Resolves #1729

Description

What is the purpose of this pull request?

This PR adds the package @stdlib/string/base/last-grapheme-cluster

This pull request:

  • Adds the package @stdlib/string/base/last-grapheme-cluster which returns the last n grapheme clusters (i.e., user-perceived characters) of a string.

Related Issues

Does this pull request have any related issues?

#854

Questions

Any questions for reviewers of this pull request?

No.

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.

Checklist

Please ensure the following tasks are completed before submitting this pull request.


@stdlib-js/reviewers

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 9, 2024

I have implemented the last-grapheme-cluster package using prev-grapheme-cluster-break. I noticed that the implementation of prev-grapheme-cluster-break was rather contrary to pattern of other similar packages. On further research i found that according to #1062 prev-grapheme-cluster-break has to be rewritten.

I think before moving ahead with this pr we should address rewriting of prev-grapheme-cluster-break. Should i open an issue and start working on that?

@kgryte @steff456

edit prev-grapheme-cluster -> prev-grapheme-cluster-break

@kgryte
Copy link
Member

kgryte commented Mar 9, 2024

It doesn't need to be rewritten tmk, just fixed.

@kgryte
Copy link
Member

kgryte commented Mar 9, 2024

Relevant issues:

Now that I am jogging my memory, yes, prev-grapheme-cluster-break does need refactoring. But we should first get #1431 over the finish line before doing so.

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 9, 2024

Yes, to be fixed

Also just to be clear the packages mentioned in #1062 next-grapheme-cluster and prev-grapheme-cluster refer to next-grapheme-cluster-break and prev-grapheme-cluster-break right.

What should be the further course of action, fixing prev-grapheme-cluster-break or should we move ahead as is?

@kgryte
Copy link
Member

kgryte commented Mar 9, 2024

@adityacodes30 I'd say go ahead and implement in terms of next-grapheme-cluster-break. It won't be as efficient, as it requires iterating over the entire string, but it's more important at this point to get something working.

@adityacodes30
Copy link
Contributor Author

Alright , ill edit this package to use next-grapheme-cluster-break instead of prev-grapheme-cluster-break

@adityacodes30
Copy link
Contributor Author

i have implemented this package with the logic

num = num of total grapheme strings 
n = num of strings to return 
no = num - n 

while ( no > 0 ) {
		i = nextGraphemeClusterBreak( str, i );
		no -= 1;
	}
	
return str.substring(  i  , len  )

i think this approach should suffice until prev-grapheme-cluster-break receives the fixes, will push code in a bit

@kgryte
Copy link
Member

kgryte commented Mar 10, 2024

Hmm...I wonder if this can be implemented in a single pass, even with next-grapheme-cluster-break.

@kgryte
Copy link
Member

kgryte commented Mar 10, 2024

In fact, I am fairly sure you don't need to first compute the number of grapheme clusters.

@adityacodes30
Copy link
Contributor Author

I thought about it as well, however nextgraphemeclusterbreak will always move us forward in the string, while we need to go backwards. Do you have a particular logic in mind ?

@kgryte
Copy link
Member

kgryte commented Mar 11, 2024

Yes, you can use a circular buffer (see https://github.com/stdlib-js/stdlib/tree/develop/lib/node_modules/%40stdlib/utils/circular-buffer), where the buffer size is n. Use nextGraphemeClusterBreak to populate with cluster breaks until the next break is -1 (i.e., no more breaks). You can then substring by using the first element (index) in the buffer.

@Planeshifter Planeshifter added Feature Issue or pull request for adding a new feature. JavaScript Issue involves or relates to JavaScript. Needs Changes Pull request which needs changes before being merged. labels Mar 12, 2024
@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 12, 2024

got it ! I thought of another approach as well , we could use the split-grapheme-cluster package and then join the last n grapheme clusters.

Edit - Thought a bit more on it, it would unnecessarily increase the space complexity. Circular buffer approach seems the best.

@adityacodes30
Copy link
Contributor Author

implementation updated !

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 13, 2024

this pr is ready for review, after merging this we can perhaps move towards resolving #854
@kgryte

@adityacodes30
Copy link
Contributor Author

adityacodes30 commented Mar 17, 2024

@Planeshifter @kgryte
can we get a review on this one so work on #854 can begin !

Copy link
Member

@kgryte kgryte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. @adityacodes30 I recommend spending some time going over the changes I made to simplify your implementation and avoid unnecessary string concatenation.

@kgryte kgryte removed JavaScript Issue involves or relates to JavaScript. Needs Changes Pull request which needs changes before being merged. labels Mar 21, 2024
@kgryte kgryte added the Utilities Issue or pull request concerning general utilities. label Mar 21, 2024
@kgryte kgryte merged commit bbc3cfc into stdlib-js:develop Mar 21, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Feature Issue or pull request for adding a new feature. Utilities Issue or pull request concerning general utilities.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RFC]: add @stdlib/string/base/last-grapheme-cluster
3 participants