Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dialect support #1060

Merged
merged 14 commits into from
Dec 16, 2019
Merged

Dialect support #1060

merged 14 commits into from
Dec 16, 2019

Conversation

ampli
Copy link
Member

@ampli ampli commented Dec 13, 2019

The implemented API here is similar to my original proposal.
The code contains also an incomplete and undebugged implementation of the "many API functions" approach for user dialect definitions, but I abandoned this code path when I realized it would be a nightmare to actually use it in link-parser.

The implemented link-parser UI is a string value assigned to the !dialect variable, which is just a section in the dialect file 4.0.dialect, written as one line with commas instead of newlines. !help dialect explains this format and provides examples (of course this explanation should be rewritten to use the correct terminology, and the examples should be corrected to real ones). Additional explanations are in the the dialect file. An additional major benefit of this "string" dialect interface is that if the dialect support of the library is enhanced, no changes are needed in client code (like link-parser) in order to use new features).

Some variables names are left from my old code which implements "expression tags". I would like to rename them to the corresponding "dialect" terminology after we decide on it. The terminology in the dialect file and code comments may need changes too.

Dialect definitions in the dialect file may be hierarchical, when a dialect may contain sub-dialects. Dialect components (the expression tags that actually appear in the dictionary) are a kind of terminal sub-dialects in this hierarchy, and of course only them can be assigned costs. The dialect definition decoding code checks for loops.

The current default cost for dialect components is 10000 (it could be 10 too). For normal disjunct_cost values, it is just as infinity. The reason FLT_MAX is not used instead is that a value like 1000 is displayed nicely on debugging. It also has a "benefit" that all the dialects can be used at once by e.g. issuing !cost_max=10001.

I already have code improvement ideas, but I think it is better to wait to the final terminology decision and to a wider dialect support in the dict.

Edit: This PR implements the dialect idea from issue #402 and the bad-spelling dialect from issue #404.

@ampli
Copy link
Member Author

ampli commented Dec 16, 2019

Forced-push a rebase on #1061.

Copy link
Member

@linas linas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Merging.

and the dialect components are assigned costs as defined there.
Names with values are dialect components and their values. A missing
value after a ":" delimiter denotes a "very high" cost (to disable
the related disjuncts).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you say ":" but in the example below you use an equals-sign bad-spelling=2.2.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will be fixed.

drinkin.#drinking-v: [[drinking.v]0.05]bad-spelling;
drinkin'.#drinking-v: [[drinking.v]0.05]bad-spelling;
runnin'.#running-v: [[running.v]0.05]bad-spelling;
kidnappin'.#kidnapping-v: [[kidnapping.v]0.05]bad-spelling;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

after the merge, I plan to change these and many others to "colloquial" instead of "bad-spelling"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can also define a parent dialect that will include both, e.g.

[alt-spell]
bad-spelling
colloquial

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since bad-spelling and colloquial are the default, maybe instead:

[no-alt-spelling]
bad-spelling:
colloquial:

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, don't get too fancy too quick. I'm already finding this combination confusing:

[default]
no-headline

[no-headline]
headline: 4

[headline]
headline: 0

I mean, I understand what it does, but I have to think about it first, and its confusing and for a while it seems broken or wrong somehow. it matches the spec, but ... it's confusing.

@@ -61,25 +62,27 @@ int size_of_expression(Exp * e)
return size;
}

Exp *copy_Exp(Exp *e, Pool_desc *Exp_pool)
Exp *copy_Exp(Dictionary dict, Exp *e, Pool_desc *Exp_pool, Parse_Options opts)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that dict isn't used anywhere; I assume that maybe an earlier version had the cost_table in dict instead of opts.

Copy link
Member Author

@ampli ampli Dec 17, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is indeed copy&paste from an earlier version. But I checked and that earlier version doesn't use this parameter too, and it is a copy&paste from even an earlier one...

@linas linas merged commit 5abcf76 into opencog:master Dec 16, 2019
@ampli ampli mentioned this pull request Dec 17, 2019
@ampli ampli deleted the dialect branch January 23, 2020 15:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants