"any segmentation" for precedence and near operator #188

Open
thomaskrause opened this issue Dec 1, 2015 · 1 comment

Comments

@thomaskrause
Member

Currently the precedence/near operator either has no named argument (and is thus defined on token precedence) or carries the specific name of a segmentation chain. When you search for e.g. "the" . "house" and the corpus contains segmentations, the segmentations are also searched for the annotation values "the" and "house". Unfortunately, there is no "any segmentation" counterpart for the operator itself. My suggestion is to mark this with a character that is not allowed in an ID. In SQL, this would only require a check that both segmentation names are equal.
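To make the intended semantics concrete, here is a minimal sketch in Python rather than the actual SQL. It assumes each matched node carries the name of its segmentation and its position in that segmentation chain; the names `SegNode`, `seg_name`, `index`, the layer names `dipl` and `norm`, and both functions are hypothetical and not part of ANNIS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SegNode:
    value: str     # annotation value, e.g. "the"
    seg_name: str  # name of the segmentation layer (hypothetical field)
    index: int     # position inside that segmentation chain (hypothetical field)


def precedes_in_any_segmentation(a: SegNode, b: SegNode) -> bool:
    """Direct precedence that holds for any segmentation, as long as
    both nodes belong to the same one."""
    return a.seg_name == b.seg_name and b.index == a.index + 1


def search(first: str, second: str, nodes: list[SegNode]) -> list[tuple[SegNode, SegNode]]:
    """Return all node pairs matching `first <op> second`."""
    return [
        (a, b)
        for a in nodes if a.value == first
        for b in nodes if b.value == second
        and precedes_in_any_segmentation(a, b)
    ]


# Toy corpus with two hypothetical segmentation layers.
nodes = [
    SegNode("the", "dipl", 0),
    SegNode("house", "dipl", 1),
    SegNode("the", "norm", 0),
    SegNode("house", "norm", 2),  # not adjacent to "the" in "norm"
]

matches = search("the", "house", nodes)
```

In this toy data only the `dipl` pair matches: pairs that mix `dipl` and `norm` are rejected by the name-equality check, and the `norm` pair is not adjacent.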

My suggestions for the character are:

"the" .~ "house"
"the" .? "house"
"the" .+ "house"
"the" .@ "house"
"the" .= "house"

All of them have advantages and disadvantages: some have a semantically similar meaning in regular expressions (like "+"), some are already used in AQL, and some would be completely new and therefore possibly confusing. My current favourite is ".=" since it would express that both segmentations need to be the same (as a kind of binding).

@amir-zeldes, @CarolinOdebrecht Do you have any ideas what syntax would be the best?

@amir-zeldes

Hm, I think this is not a bad idea, but I don't like .=, because it looks like the Perl/PHP in-place concatenation:

> $hello = "hello ";
> $hello .= "world";
> print $hello;
"hello world"

My vote would be for .~ because the tilde often has the semantics 'sort of, kind of', so we're saying 'kind of adjacent' or 'some sort of adjacent'. And we don't really use tilde anywhere else, so it's not confusing. Both . and = have other meanings in AQL, whereas for regex we just use slashes for the value (outside of the weird query builder situation).

@thomaskrause thomaskrause transferred this issue from korpling/ANNIS Aug 21, 2021