Strings #5

Open
cmontella opened this issue Jul 20, 2016 · 8 comments

Comments

@cmontella
Contributor

cmontella commented Jul 20, 2016

RFC: https://github.com/witheve/rfcs/blob/master/proposed/strings.md

From @shamrin:
I'm still trying to build search-as-you-type input with Eve. However, Eve seems to lack any string functions.

The bare minimum would be an expression that checks for a substring in a string. Something like this JS function:

var contains = (search, string) => string.indexOf(search) !== -1;

The most flexible option would be a regexp match expression. Something like this:

var matches = (search, string) => !!string.match(new RegExp(search));

The middle ground is to allow prefix-matching for words inside a string:

var matches = (search, string) => !!string.match(new RegExp('\\b' + search + '\\w*\\b'));

matches('ab', 'abc def'); // => true
matches('bc', 'abc def'); // => false
matches('de', 'abc def'); // => true
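
One caveat with the regexp variants above: the search text is interpolated into the pattern as-is, so a `.` typed by the user matches any character and an unbalanced `(` throws a SyntaxError. A minimal sketch of a literal-text version follows; `escapeRegExp` and `matchesLiteral` are hypothetical names introduced here for illustration, not part of Eve or of the original proposal.

```javascript
// Escape characters that have special meaning in a regular expression,
// so that user input is matched literally. (Hypothetical helper.)
const escapeRegExp = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');

// Same word-prefix match as above, but with the search text treated literally.
const matchesLiteral = (search, string) =>
  new RegExp('\\b' + escapeRegExp(search) + '\\w*\\b').test(string);

console.log(matchesLiteral('ab', 'abc def')); // true
console.log(matchesLiteral('bc', 'abc def')); // false
console.log(matchesLiteral('.', 'abc def'));  // false (the dot is literal here)
```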

The only thing I can currently do is pre-build the index with external tools, generating a huge number of [#word-prefix-match] objects:

build the index
  freeze
    [#word-prefix-match "a" "apple computer"]
    [#word-prefix-match "ap" "apple computer"]
    [#word-prefix-match "app" "apple computer"]
    // …
    [#word-prefix-match "c" "apple computer"]
    [#word-prefix-match "co" "computer"]

And I can't even build this index with Eve code: there are no split or prefix-match functions.
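
For illustration, the external pre-building step could look roughly like the sketch below. It is written in JavaScript to match the snippets above; `wordPrefixes` and `buildIndex` are hypothetical names, and lowercasing and splitting on non-word characters are assumptions rather than anything the thread specifies.

```javascript
// Hypothetical helper: every non-empty prefix of every word in `text`.
// Assumption: words are lowercased and separated by non-word characters.
const wordPrefixes = (text) => {
  const prefixes = new Set();
  for (const word of text.toLowerCase().split(/\W+/).filter(Boolean)) {
    for (let end = 1; end <= word.length; end++) {
      prefixes.add(word.slice(0, end));
    }
  }
  return [...prefixes];
};

// Emit one record per prefix, in the shape used in the index above.
const buildIndex = (text) =>
  wordPrefixes(text).map((prefix) => `[#word-prefix-match "${prefix}" "${text}"]`);

console.log(buildIndex('apple computer').join('\n'));
```

Even for a two-word string this produces one record per prefix of each word, which is why the index grows so quickly.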

P.S. Bonus points for somehow ignoring common words like a or an, so that an wouldn't match an apple, but would match anne.
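
A rough sketch of that stop-word idea, again in JavaScript: words on an ignore list are dropped before prefix matching. The list contents and the name `matchesIgnoringStopWords` are assumptions made for this example.

```javascript
// Assumed stop-word list; the thread only names "a" and "an".
const STOP_WORDS = new Set(['a', 'an']);

// Prefix-match `search` against the words of `string`, skipping stop words.
const matchesIgnoringStopWords = (search, string) => {
  const needle = search.toLowerCase();
  return string
    .toLowerCase()
    .split(/\W+/)
    .filter((word) => word && !STOP_WORDS.has(word))
    .some((word) => word.startsWith(needle));
};

console.log(matchesIgnoringStopWords('an', 'an apple')); // false
console.log(matchesIgnoringStopWords('an', 'anne'));     // true
```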

@RubenSandwich

While all of your proposals meet the baseline for "search-as-you-type", I suggest aiming for a fuzzy search algorithm from the start, for the simple reason that the search most used by the general computing public is Google, and Google uses fuzzy search; any deviation from this might confuse newcomers (especially those without a software background). Might I suggest using https://github.com/krisk/Fuse?

@shamrin
Contributor

shamrin commented Jul 20, 2016

@RubenSandwich Thank you for the Fuse link! Another JavaScript library I've found interesting is Lunr.js.

(I don't think we can use any of these libraries except for inspiration. The Eve runtime is currently written in C and Lua.)

If we are talking about search engines, they work somewhat differently. Compared to Fuse they are transparent about their fuzziness. Compare:

searrch at duckduckgo

(Personally, I've found Fuse search results confusing. That said, I have a software background… I am likely biased in the wrong way.)

Maybe it makes sense to come up with a minimal set of features for Eve. Developers could then use those features to implement search the way they want.

@cmontella
Contributor Author

cmontella commented Jul 21, 2016

We talked a little about this today, and we've decided on some basic string functions that we can start implementing immediately.

  • concat - this is already implemented in the form of string interpolation.
  • split - (token, index) = split(text, by) - takes a text and splits it according to by, returning the tokens and the indices of those tokens in the original string.
  • join - text = join(token, index, with) - essentially the opposite of split; takes tokens and their indices and joins them together with with, returning the full string as text.
  • char-at - char = char-at(text, index) - returns the character in text at position index.
  • find - found = find(text, subtext) - finds every instance of subtext in text and returns the index of each match.
  • length - len = length(text) - returns the number of characters in text.
  • replace - new = replace(text, subtext, with) - replaces every instance of subtext in text with with and returns the resulting string.
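
To make the intended behavior concrete, here is a rough JavaScript model of the functions listed above. It is only a sketch under assumptions the list does not settle: indices are taken to be 1-based, split's index is taken to be the token's ordinal position, multiple results are modeled as arrays, and length counts UTF-16 code units. The JavaScript names are illustrative and are not the Eve implementation.

```javascript
// split: [token, index] pairs; index = 1-based ordinal position (assumption).
const split = (text, by) => text.split(by).map((token, i) => [token, i + 1]);

// join: order [token, index] pairs by index, then glue them with `sep`.
const join = (pairs, sep) =>
  [...pairs].sort((a, b) => a[1] - b[1]).map(([token]) => token).join(sep);

// char-at: the character at a 1-based index (assumption).
const charAt = (text, index) => text.charAt(index - 1);

// find: 1-based index of every occurrence of subtext, overlaps included.
const find = (text, subtext) => {
  const found = [];
  if (subtext === '') return found;
  let from = text.indexOf(subtext);
  while (from !== -1) {
    found.push(from + 1);
    from = text.indexOf(subtext, from + 1);
  }
  return found;
};

// length: counts UTF-16 code units, which equals characters for ASCII text.
const textLength = (text) => text.length;

// replace: every instance of subtext replaced by `replacement`.
const replace = (text, subtext, replacement) => text.split(subtext).join(replacement);

console.log(split('a,b,c', ','));             // [ [ 'a', 1 ], [ 'b', 2 ], [ 'c', 3 ] ]
console.log(join(split('a,b,c', ','), '-'));  // a-b-c
console.log(find('banana', 'ana'));           // [ 2, 4 ]
```

Under this model, split followed by join with the same separator returns the original text, which is the "opposite of split" relationship the list describes.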

I think advanced string features need some more discussion. For example, regarding regex, we could certainly do such a thing, but maybe there is a better way? For instance, being able to support BNF-style grammars. What are the expectations here for people?

@cmontella
Contributor Author

This RFC hasn't seen attention in a while. Sometime next week I'm going to close this RFC and open a new RFC that is generally about the standard library. We can talk about strings and math and anything else that we feel needs to be in the std lib.

@jimmyhmiller

@cmontella I know you mentioned closing this and starting a general std lib RFC, but I was wondering if you were still interested in these functions. I started implementing them as an exercise to understand Eve internals better.

@cmontella
Contributor Author

Yeah, we still need some of these, so any help is appreciated!

@jimmyhmiller

jimmyhmiller commented Dec 4, 2016

So, I started implementing a bunch of string functions and noticed a pattern. I ended up implementing a higher-order function to make constraints.

The code can definitely use some clean-up, but I wanted to see if you were open to this sort of approach. It vastly simplified implementing JavaScript string functions and seems to work as long as your results are value types.

@cmontella
Contributor Author

Jimmy, I'll post some feedback for you, sorry I had forgotten to take a look at that!

4 participants