
feat: sync with upstream, position+recurrence dim creation, time+money dim fixes #2

Merged
merged 73 commits from sean/develop into master on Dec 22, 2020

Conversation


@scroskey-clinc scroskey-clinc commented Dec 22, 2020

Pulls in previous releases and syncs with upstream.

Features:

  • Syncs with upstream as of 12/21/2020
  • Fixes classifier issues with "after <time>" and "through <time>" after rebase
  • Added "greater than <money>" rule
  • Added "below <money>" rule
  • Removed latent amount rules (preventing 10 dollars 30 from being extracted as "$10.30")
  • Added "since <time> till <time>" rule
  • Improves position dimension by always having span go forward from index
  • Adds synonyms for "bucks" for money amounts
  • Improves time extraction with date ranges
  • Adds Recurrence and Position dimensions
  • Fixes several Time dimension issues
  • Removed "from" from "since " rule so that these extractions are no longer treated as open-ended date ranges (e.g. "show my transactions from january" returning a 1 month range in January and not the range from January until now)
  • Regenerated classifiers to fix ambiguities with "since <date> til <date>"

Few changes in behavior:

  • “the <time>” now includes “the” alongside extracted tokens
  • “next <day-of-week>” will jump to the next calendar week instead of simply the next instance
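A quick way to spot-check the new money rules from GHCi, in the style of the `debug` sessions quoted in the commits below (a sketch only: output is omitted, and the dimension wrapper is `Seal` on trees that include facebook#533, `This` on older ones):

```
*Duckling.Debug> debug (makeLocale EN Nothing) "below 10 dollars" [Seal AmountOfMoney]
*Duckling.Debug> debug (makeLocale EN Nothing) "greater than 5 bucks" [Seal AmountOfMoney]
```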

@scroskey-clinc scroskey-clinc changed the title Rebase latest from upstream, merge prior features, update classifier issues feat: sync with upstream, position+recurrence dim creation, time+money dim fixes Dec 22, 2020
ahitrov and others added 28 commits December 22, 2020 01:10
Summary: Pull Request resolved: facebook#403

Reviewed By: haoxuany

Differential Revision: D18348752

Pulled By: patapizza

fbshipit-source-id: ce3b5c76cb2cf39114216842529d4eaa8df5b93f
Summary: Pull Request resolved: facebook#428

Reviewed By: haoxuany

Differential Revision: D18348514

Pulled By: patapizza

fbshipit-source-id: 9b0b9c2caa9fec8330746059eefa6185a8f3e072
Summary:
- Setup Afrikaans (AF) language
- Added Numeral Dimension

Some of the paths have changed, and some extra files were necessary, after
basing initial work off facebook@24d3f19

I followed some of the Numeral examples from Dutch as well as Hungarian,
since Afrikaans and Dutch have some similarities.

One thing was examples for numbers having the number as an example, which I
didn't do here, because I'm not sure it's necessary.
Pull Request resolved: facebook#422

Reviewed By: awalterschulze

Differential Revision: D18348617

Pulled By: patapizza

fbshipit-source-id: b8c4218629c264b48d6f2cecc4c23e2e281a64da
Summary: Supporting "orthodox good friday" in addition to "orthodox great friday" in the regex

Reviewed By: chinmay87

Differential Revision: D19604033

fbshipit-source-id: c6ca68fc34e284304ca2ba07a8f1bf81378c3558
Summary:
Adding locale rules for ES Numeral because Spain uses "," as the decimal separator but South American countries use ".".

Wiki: https://en.wikipedia.org/wiki/Decimal_separator

Reviewed By: haoxuany

Differential Revision: D20040111

fbshipit-source-id: e2a4bfc2928df19976ef98e90ee82e7d21b52313
Summary: Adding new holiday.

Reviewed By: haoxuany

Differential Revision: D20193781

fbshipit-source-id: c8be523293b7b6ee836965c8914e3db58cc41085
Summary:
Leveraging `predNthClosest` helper in English rules.
"the second closest monday to february 6"
"the closest tax day to boss day 2018"

Reviewed By: haoxuany

Differential Revision: D20214444

fbshipit-source-id: b6be32f63097d221aa7ccc6df4e3639e4deee4a9
Summary: Adding new locale rules to cabal file

Reviewed By: patapizza

Differential Revision: D20288009

fbshipit-source-id: 71fe63973b4bc58d2fa7952af725b11238c99ef9
Summary:
Pull Request resolved: facebook#460

Exposing the TimeGrain feature

Reviewed By: patapizza

Differential Revision: D20250270

fbshipit-source-id: 726f85eebd95ae31d911ebd9a43428d549aba877
Summary:
This change applies roughly the same rules for supporting intervals
in Spanish AmountOfMoney that we support in English: intervals using
`entre _ e _` / `de _ a _` / `_ - _` with either money in both slots
or a number in the first slot and money in the second.

My Spanish is okay but not great - I'm confident these rules are good and
cover the most likely phrases, but there's probably room to add more coverage.
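For reference, here is a hedged sketch of what one of these interval rules can look like, modeled on the corresponding English AmountOfMoney rules. The helper names (`withInterval`, `currencyOnly`, `isSimpleAmountOfMoney`) are the ones the EN rules use; the exact ES regexes and rule names that actually landed may differ.

```
{-# LANGUAGE GADTs #-}
{-# LANGUAGE OverloadedStrings #-}

import Duckling.AmountOfMoney.Helpers (currencyOnly, isSimpleAmountOfMoney, withInterval)
import Duckling.AmountOfMoney.Types (AmountOfMoneyData (..))
import Duckling.Dimensions.Types
import Duckling.Types

-- "entre cinco y diez dolares" style intervals, with money in both slots.
ruleIntervalBetween :: Rule
ruleIntervalBetween = Rule
  { name = "entre <amount-of-money> y <amount-of-money>"
  , pattern =
    [ regex "entre"
    , Predicate isSimpleAmountOfMoney
    , regex "y|e"
    , Predicate isSimpleAmountOfMoney
    ]
  , prod = \tokens -> case tokens of
      (_:
       Token AmountOfMoney AmountOfMoneyData{value = Just from, currency = c}:
       _:
       Token AmountOfMoney AmountOfMoneyData{value = Just to}:
       _) | from < to ->
        Just . Token AmountOfMoney . withInterval (from, to) $ currencyOnly c
      _ -> Nothing
  }
```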

Reviewed By: patapizza

Differential Revision: D20425979

fbshipit-source-id: deb17fc331e1aa192d91dd47bc7f3864a246f0be
…tiply apply (facebook#406)

Summary:
I noticed two ambiguous parses would occur when both ruleNegative and ruleMultiply would apply.

For example: "minus three million two hundred thousand"

```
*Duckling.Debug> debug (makeLocale EN Nothing) "minus three million two hundred thousand" [This Numeral]
compose by multiplication (minus three million two hundred thousand)
-- negative numbers (minus three million two hundred)
-- -- regex (minus)
-- -- intersect 2 numbers (three million two hundred)
-- -- -- compose by multiplication (three million)
-- -- -- -- integer (0..19) (three)
-- -- -- -- -- regex (three)
-- -- -- -- powers of tens (million)
-- -- -- -- -- regex (million)
-- -- -- compose by multiplication (two hundred)
-- -- -- -- integer (0..19) (two)
-- -- -- -- -- regex (two)
-- -- -- -- powers of tens (hundred)
-- -- -- -- -- regex (hundred)
-- powers of tens (thousand)
-- -- regex (thousand)
negative numbers (minus three million two hundred thousand)
-- regex (minus)
-- intersect 2 numbers (three million two hundred thousand)
-- -- compose by multiplication (three million)
-- -- -- integer (0..19) (three)
-- -- -- -- regex (three)
-- -- -- powers of tens (million)
-- -- -- -- regex (million)
-- -- compose by multiplication (two hundred thousand)
-- -- -- compose by multiplication (two hundred)
-- -- -- -- integer (0..19) (two)
-- -- -- -- -- regex (two)
-- -- -- -- powers of tens (hundred)
-- -- -- -- -- regex (hundred)
-- -- -- powers of tens (thousand)
-- -- -- -- regex (thousand)
```

This PR fixes this ambiguity and Duckling will only return the second (correct) parse.
Pull Request resolved: facebook#406

Test Plan:
regen'd classifiers (no-op)

  :test Duckling.Tests

Imported from GitHub, without a `Test Plan:` line.

Reviewed By: chinmay87, girifb

Differential Revision: D20303354

Pulled By: patapizza

fbshipit-source-id: 280b0e33b7c944f9d87a7c23afda2f6a843e28a4
Summary:
When I first skimmed our rules for "half an hour" vs "an hour and a half"
I actually thought there might be a bug, because `timesOneAndAHalf`
sounds like it's actually multiplying by `1.5`.

There's no bug and the implementation is entirely correct, but it does
not multiply by 1.5; it adds 0.5 to any integer value at the given grain.
This diff renames the function to be more descriptive.

Handy trick for doing this kind of refactor without IDE tooling:
```
find duckling/Duckling/Duration/ -name 'Rules.hs'| xargs sed -i 's/timesOneAndAHalf/nPlusOneHalf/g'
```

Reviewed By: haoxuany

Differential Revision: D20456966

fbshipit-source-id: 35020685f091a41618b30b7e5f95dbfa48509b88
Summary: For consistency.

Reviewed By: jtliao

Differential Revision: D20524369

fbshipit-source-id: 44031667adccab9bca7b3b6d42c80878bb96ccae
Summary: Fix `ruleYearLatent` to be the same as the one in `en`. We don't want to match numerals that could have been hours.

Reviewed By: patapizza

Differential Revision: D20683975

fbshipit-source-id: cdef9b1b5f8a21dc5e207ed2a7afcad84c56a596
Summary: Pull Request resolved: facebook#467

Reviewed By: chinmay87

Differential Revision: D20700248

Pulled By: patapizza

fbshipit-source-id: 17f933106c6f18fcd93b73f42af458220d93b6cf
Summary:
When I was working on some related diffs, I noticed that there were some
asymmetries between the regexes for ruleIntervalMax and ruleIntervalMin:
 - we had no support for "at most", even though we did have "at least"
 - we had no support for "not? less than"
 - the ordering of the different constructions didn't match

This is a minor tweak to make things match better.

Reviewed By: patapizza

Differential Revision: D20484594

fbshipit-source-id: c3c54a9cc1b83402e42634b7a98a1a3b8cc5e09c
Summary:
* Reduces size of final image from 5GB to 130MB
* Builds any checkout (not locked to master)
* Doesn't run stack on CMD (executes static build of Duckling instead)
Pull Request resolved: facebook#341

Reviewed By: chinmay87

Differential Revision: D21083018

Pulled By: patapizza

fbshipit-source-id: d909158f20f5b8da5b0248a25103b850797bc3a3
Summary:
1. ~~Fixed broken build due to the problem with main test entry point;~~
2. Fixed the ambiguous results caused by mishandling the
ranking rules for parsing frames in ES. For example, "una hora" could
be interpreted either as a "Duration" or as "1pm" in the "Time" dimension,
whereas the expected result should be in the "Duration" dimension.
3. ~~ignore stack lock file~~
Pull Request resolved: facebook#478

Test Plan:
```
:test Endpoint.Duckling.Tests --hide-successes
[1003 of 1003] Endpoint.Duckling.Tests (Duckling.Api changed)
Ok, two modules loaded.

All 357 tests passed (79.69s)
```

```
haxlsh> H.io $ debug (makeLocale ES Nothing) "de una horas" [This Time, This Duration]
<integer> <unit-of-duration> (una horas)
-- number (0..15) (una)
-- -- regex (una)
-- hora (grain) (horas)
-- -- regex (horas)
[Entity {dim = "duration", body = "una horas", value = RVal Duration (DurationData {value = 1, grain = Hour}), start = 3, end = 12, latent = False, enode = Node {nodeRange = Range 3 12, token = Token Duration (DurationData {value = 1, grain = Hour}), children = [Node {nodeRange = Range 3 6, token = Token Numeral (NumeralData {value = 1.0, grain = Nothing, multipliable = False, okForAnyTime = True}), children = [Node {nodeRange = Range 3 6, token = Token RegexMatch (GroupMatch ["una","","a","","",""]), children = [], rule = Nothing}], rule = Just "number (0..15)"},Node {nodeRange = Range 7 12, token = Token TimeGrain Hour, children = [Node {nodeRange = Range 7 12, token = Token RegexMatch (GroupMatch ["ora"]), children = [], rule = Nothing}], rule = Just "hora (grain)"}], rule = Just "<integer> <unit-of-duration>"}}]
it :: [Entity]
```

Reviewed By: fascpt

Differential Revision: D21770015

Pulled By: chinmay87

fbshipit-source-id: 3056fcf656140c9d65b70b5c604a286ea2c307b2
…tes. (facebook#483)

Summary:
Current behavior:
"an hour and 45 minutes" -> parsed as "1 hour" [dimension: "Duration"]
"a minute and 30 seconds" ->parsed as "1 minute" [dimension: "Duration"]

Expected behavior:
"an hour and 45 minutes" -> "105 minutes" with dimension as "Duration"
"a minute and 30 seconds" -> "90 seconds" with dimension as "Duration"

The fix:

Add a new rule to handle this duration composition
pattern (<some duration> and <some other duration>); the arithmetic is sketched below.
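A minimal sketch of the composition arithmetic in plain Haskell (not the Duckling rule itself; the `Grain` type and `inSeconds` here are illustrative stand-ins for Duckling's own grain machinery):

```
-- Hypothetical stand-ins for Duckling's TimeGrain machinery.
data Grain = Second | Minute | Hour deriving (Eq, Ord, Show)

inSeconds :: Grain -> Int -> Int
inSeconds Second n = n
inSeconds Minute n = n * 60
inSeconds Hour   n = n * 3600

-- Fold "<duration> and <duration>" into one value at the finer grain.
composeDurations :: (Grain, Int) -> (Grain, Int) -> (Grain, Int)
composeDurations (g1, v1) (g2, v2) = (finer, total `div` inSeconds finer 1)
  where
    finer = min g1 g2
    total = inSeconds g1 v1 + inSeconds g2 v2

-- composeDurations (Hour, 1)   (Minute, 45) == (Minute, 105)
-- composeDurations (Minute, 1) (Second, 30) == (Second, 90)
```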
Pull Request resolved: facebook#483

Reviewed By: haoxuany

Differential Revision: D21850773

Pulled By: chinmay87

fbshipit-source-id: 62eb6859e0ce2b88cf8ae48d836a1a6a1ac8705d
…facebook#487)

Summary:
Current:

"el dia nueve" -> "9pm" of current day

Expected:
"el dia nueve" -> 9th of current or next month

Fix:

Added a new ES rule to handle patterns like "el dia <day of month>".
Pull Request resolved: facebook#487

Reviewed By: girifb

Differential Revision: D21850807

Pulled By: chinmay87

fbshipit-source-id: d8edd81273c7e5f700b440ccc8c7e7bded679051
Summary:
Added a new EN rule to parse phrases that contain "midday".
Pull Request resolved: facebook#490

Differential Revision: D21959562

Pulled By: chinmay87

fbshipit-source-id: f9ab45aecd551e8959d00b0025ed38b616ed6b14
…k#484)

Summary:
Current behavior:
Sentences with the pattern "xxx.yyy minutes" are parsed as yyy minutes.

Expected behavior:
xxx.yyy minutes = 60*xxx+0.yyy*60 seconds

For example:
"15.5" minutes = 60*15+0.560 = 930 seconds
Pull Request resolved: facebook#484

Reviewed By: haoxuany

Differential Revision: D21850782

Pulled By: chinmay87

fbshipit-source-id: c007901d4dd6476e5e383a13892ecff9b2191fff
…rter of hour. (facebook#489)

Summary: Pull Request resolved: facebook#489

Differential Revision: D21959268

Pulled By: chinmay87

fbshipit-source-id: 2b785b44da5437c7b27af098daef551139dad990
…quarter" or "quarters" (facebook#485)

Summary:
Current:
If a fractional-hour expression describes the hour fraction with a term like "quarter" or "quarters", Duckling couldn't recognize it correctly.
Expected:
Duckling should be able to identify this kind of expression and parse it correctly.
Fix:
Add a new rule to parse fractional-hour patterns that contain keywords like "quarter" or "quarters".
Pull Request resolved: facebook#485

Test Plan: Imported from GitHub, without a Test Plan: line.

Reviewed By: haoxuany

Differential Revision: D21850804

Pulled By: chinmay87

fbshipit-source-id: 818b7b3f37e3f8a6d1a7d579db19fb2cfb2763f4
Summary: as title

Reviewed By: girifb

Differential Revision: D21998107

fbshipit-source-id: 7c1c91db9a1ebf29d702930570341dc3b6b0ce65
Summary:
While computing the score used to rank parses, Duckling currently sums up the log likelihoods learned during training. When ranking, the goal is to find the (same-span) parse candidate which is _more_ likely to lead to a *correct* parse. However, the old logic was summing up the "more confident of the two classes" log likelihood. From what I understand, this is the part that feels wrong.

I created an example of two rules (there is a numeric sketch of them below):
#1. a rule where the classifier learns that the rule is very confidently NOT the correct parse:
- okdata (positive class) is very low confidence (a large negative prior)
- kodata (negative class) is very high confidence (a small negative prior)

#2. a rule where the classifier is confident that it is the correct parse, but not Very Confident:
- okdata (positive class) is high confidence (nonzero, but a small negative prior)
- kodata (negative class) is very low confidence (a large negative prior)

These two rules match the same regex, and thus the same span. When Duckling parses the input, it turns out that rule #1 ranks higher than rule #2, because #1 is MORE confident that it is the INCORRECT parse (one that does not contribute) than rule #2 is confident that it is the correct one. Does this make sense?

To solve this problem, I changed the ranking score estimation to use only the positive class scores (okdata). In the example above, rule #2 now ranks higher, because its positive-class confidence is higher than #1's.

I would really love some deeper input from Duckling experts. I re-learned Haskell and learned Haxl to craft a small example here, and I am very new to Duckling (I just started reading the ranking code on Friday). I know Duckling is battle-tested, but I also don't believe that means a bug can't exist. Further, this specific bug may not happen a whole lot, for two reasons:
- there are not a lot of rules which end up with higher negative confidence than positive (it requires enough negative corpus examples relative to positive ones)
- ranking uses span width first, and only when the spans are equivalent does the score-based ranking come into play; so two rules have to match the same span before any actual score calculation even matters
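A hedged numeric illustration of the two rules described above (plain Haskell, not Duckling's classifier code; the scores are made up to mirror the #1/#2 example):

```
-- okLogLik / koLogLik stand in for the learned positive/negative class
-- log likelihoods (okdata / kodata). Values are illustrative only.
data Clf = Clf { okLogLik :: Double, koLogLik :: Double }

rule1, rule2 :: Clf
rule1 = Clf { okLogLik = -5.0, koLogLik = -0.1 }  -- very confidently NOT the correct parse
rule2 = Clf { okLogLik = -0.5, koLogLik = -4.0 }  -- confident it IS the correct parse

oldScore, newScore :: Clf -> Double
oldScore c = max (okLogLik c) (koLogLik c)  -- "more confident of the two classes"
newScore c = okLogLik c                     -- positive class (okdata) only

main :: IO ()
main = do
  print (oldScore rule1 > oldScore rule2)  -- True: rule #1 wins, which is the bug
  print (newScore rule2 > newScore rule1)  -- True: rule #2 wins, as intended
```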

Reviewed By: patapizza

Differential Revision: D22009276

fbshipit-source-id: 13491689d39d810da526fa4bb8b6e526d4cafd35
Summary:
The new rules can parse phrases of the form
xxx upcoming weeks
upcoming xxx weeks
Pull Request resolved: facebook#491

Test Plan: Imported from GitHub, without a Test Plan: line.

Differential Revision: D21959647

Pulled By: chinmay87

fbshipit-source-id: a062a8c7a6c2e23b921b1099b886fa589c69c454
Summary:
The root cause was an error in parsing ES numeral values [1-9] that are spelled with two words instead of one.

For example, "cero dos" should be parsed as "dos". Currently it is being parsed as two numeral values: 0 and 3.

Reviewed By: chinmay87

Differential Revision: D22162804

fbshipit-source-id: 949956935a21e742f6788e7afa788ff728dd9a8d
patapizza and others added 25 commits December 22, 2020 01:11
Summary: Guarding against grains, shortening regexes.

Reviewed By: jtliao

Differential Revision: D23387716

fbshipit-source-id: de84d0efa79c4ae10bd9fbf14e82a724fee1a1f2
Summary: sad_palpatine

Differential Revision: D23718913

fbshipit-source-id: 363bf9a43d8d1cd77405882bc70a7fa1a1de2dbe
Summary:
Pull Request resolved: facebook#533

In recent versions of Data.Some, the constructor `This` has been renamed to `Some`. This has become rather problematic for us to migrate, so we're just going to remove the dependency. The meat of this diff is adding the type `Seal` to `Duckling.Types`; that type replaces `Some`.
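The wrapper is roughly of the following shape (a sketch; the real definition and any helpers live in `Duckling.Types`, and details may differ):

```
{-# LANGUAGE GADTs #-}
{-# LANGUAGE RankNTypes #-}

-- An existential wrapper hiding the index of a GADT such as Dimension,
-- playing the role that Data.Some's `This`/`Some` used to play.
data Seal s where
  Seal :: s a -> Seal s

withSeal :: Seal s -> (forall a. s a -> r) -> r
withSeal (Seal x) f = f x
```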

Reviewed By: pepeiborra

Differential Revision: D23929459

fbshipit-source-id: 8ff4146ecba4f1119a17899961b2d877547f6e4f
Summary:
This PR accomplishes several things:

- removes dist-newstyle (local build artifacts should not be checked in)
- extends the .gitignore to include many common build artifacts/editor artifacts
- allow more modern dependencies (upper bounds of many were out of date by one or two years' worth of releases)
- upgrade stack lts (9.2 -> 14.2) to GHC 8.6.5
- regenerate .travis.yml using the now-standard haskell-ci (many haskell core libraries use this), instead of the outdated script that was maintained by hvr; as a precursor to this, the tested-with versions were updated

Reviewed By: patapizza

Differential Revision: D24623967

fbshipit-source-id: 838fe571df0b8d44106349659ce8ce8ab82f0bc6
Summary:
Keeps accents consistent: "quinquagésimo" no longer contains "Ü".

Pull Request resolved: facebook#531

Reviewed By: patapizza

Differential Revision: D23770703

Pulled By: chessai

fbshipit-source-id: f8a34c02028faf9f51eca6a016b5bad988a83f04
…ook#539)

Summary:
The Dockerfile build part did not copy the Duckling implementation into the container, making the build fail.

I also harmonized the target Debian release to Buster, which is the one currently behind `haskell:8`.

Pull Request resolved: facebook#539

Reviewed By: patapizza

Differential Revision: D24688839

Pulled By: chessai

fbshipit-source-id: 0ffcc4d28a599b7edad668730117828d26e116ad
Summary:
Spanish (ES) will now have all the same quantity rules as English (EN) (which I think is the most-supported language), plus more.

This includes the following:
* bowls - (bol(es)?|tazón(es)?|cuencos?|platos? (soperos?)|(hondos?)) (EN does not currently have this)
* cups - (tazas?)
* dishes - (platos?|fuentes?) (EN does not currently have this)
* grams - (((m(ili)?)|(k(ilo)?))?g(ramo)?s?)
* ounces - ((onzas?)|oz)
* pints - (pintas?) (EN does not currently have this)
* pounds - ((lb|libra)s?)
* quarts - (cuartos? de galón) (EN does not currently have this)
* tablespoons - (cucharadas? (grande)?) (EN does not currently have this)
* teaspoons - (cucharaditas?) (EN does not currently have this)

Reviewed By: patapizza

Differential Revision: D24628214

fbshipit-source-id: 2e8d500661f30fa0928cb7d3f21470afc01e2285
Summary:
Found a missing frequent duration in German and a small typo in an existing one.

Pull Request resolved: facebook#509

Reviewed By: patapizza

Differential Revision: D24690104

Pulled By: chessai

fbshipit-source-id: b49a7a636abf5b92f2fe7c0d5b2ca2fe64acbaa2
Summary: Pull Request resolved: facebook#550

Reviewed By: haoxuany

Differential Revision: D24844625

Pulled By: chessai

fbshipit-source-id: 52dcf5f9488386f7f407535e876bff1207823fe0
Summary:
* use regex-pcre-builtin by default on windows
* update cabal version to 2.2 to support leading commas
    - requires the very first line in cabal file be the
      cabal-version line
    - BSD3 is not BSD-3-Clause (don't ask me why)

resolves facebook#547

Pull Request resolved: facebook#549

Reviewed By: haoxuany

Differential Revision: D24838317

Pulled By: chessai

fbshipit-source-id: 376eb30a94ab88420915b868dffddb252fd08e76
Reviewed By: haoxuany

Differential Revision: D24929661

fbshipit-source-id: 3858d14ef1655f079daa33d2b159e8cb918a70ac
Summary:
Add support for additional Hindi numbers like 300, 81, 150, 1000, 1520. These are not supported in the current master version.

Pull Request resolved: facebook#552

Reviewed By: ashwinp-fb, girifb

Differential Revision: D25072230

Pulled By: chessai

fbshipit-source-id: 35277a2349384bcf44a20e74852113f5c010e618
Summary: Pull Request resolved: facebook#520

Reviewed By: patapizza

Differential Revision: D25072459

Pulled By: chessai

fbshipit-source-id: 5db72eda36fe166a452b2345cab75fb1508b192b
Summary:
Improves the recognition of German time-approximation language and fixes a single error in the "<time-of-day> approximately" rule.

Pull Request resolved: facebook#435

Reviewed By: patapizza

Differential Revision: D24934281

Pulled By: chessai

fbshipit-source-id: 641bcb6a7e5c26e66c735fe13bccae9b7a8909ae
Summary:
Missing "tercer" regex in rule

Pull Request resolved: facebook#477

Reviewed By: patapizza

Differential Revision: D24934794

Pulled By: chessai

fbshipit-source-id: a51f6fe3187749885784bfaacfee09cf26a8df6d
Summary:
Facebook is migrating away from Travis CI, to GitHub actions.

Pull Request resolved: facebook#555

Reviewed By: patapizza

Differential Revision: D25228779

Pulled By: chessai

fbshipit-source-id: a392b93e5a7b02d1f47b477b6c459901d3171e05
Summary: External users are repeatedly confused by the lack of results from the duckling example executable. We should just go through all dimensions for the duckling call in the example app.

Reviewed By: patapizza

Differential Revision: D25468199

fbshipit-source-id: 6cf56b130d4d0aa3181f098d6a7c9a133bfa85ff
Summary:
'miej' in Polish is the imperative form of the verb 'mieć' (to have). "mniej więcej" means "more or less", which was the intention here.

Pull Request resolved: facebook#426

Reviewed By: patapizza, girifb

Differential Revision: D25546380

Pulled By: chessai

fbshipit-source-id: 1047b83109cab917f1f4dbe87b667f8ccd2fb92d
Summary:
Crore (1e7) and Lakh (1e5) are both commonly used to describe an amount of Indian currency. Common abbreviations are "Cr" (Crore) and "lkh", "L", "lac" (lakh).

Additionally, common spellings of "crore" include "karor" and "koti".

Reviewed By: patapizza

Differential Revision: D25550546

fbshipit-source-id: 0c1479d9027431cb0d1182b5117eabca6f939cb2
Summary:
Egyptian Arabic is a dialect of Arabic that is mostly spoken and used in everyday communication.
This PR adds new locale to Arabic to support the differences between Modern Standard Arabic (MSA) and Egyptian Arabic (EG).
I have mainly depended on the different locales of Spanish that are supported by Duckling to create the new Egyptian Arabic locale.
New modifications are added to the `Numeral` dimension since I didn't spot differences in other dimensions.

Pull Request resolved: facebook#554

Reviewed By: patapizza

Differential Revision: D25543502

Pulled By: chessai

fbshipit-source-id: 4cbb7be78a52071c8681380077f0b4dc033a60de
Summary: Pull Request resolved: facebook#560

Reviewed By: patapizza

Differential Revision: D25564850

Pulled By: chessai

fbshipit-source-id: 631f96a3ed71b9d7707560ff6bfe7596feee2305
Summary: Pull Request resolved: facebook#423

Reviewed By: girifb

Differential Revision: D25573001

Pulled By: chessai

fbshipit-source-id: 5474f108e968bdfb53ebc2518b46f28befdeba89
Summary:
This pull request adds support for the Telugu language (Numeral dimension) to Duckling.

Pull Request resolved: facebook#470

Differential Revision: D25546700

Pulled By: chessai

fbshipit-source-id: 1d88ee27da8a577a4a79ff31be8cb55ed6444c4e
Summary: Pull Request resolved: facebook#538

Reviewed By: haoxuany

Differential Revision: D24640854

Pulled By: chessai

fbshipit-source-id: 51eb0d530b143511f79992a91ca8f465b7860b6e
@scroskey-clinc scroskey-clinc merged commit 6f1fcb6 into master Dec 22, 2020
@scroskey-clinc scroskey-clinc deleted the sean/develop branch December 22, 2020 16:46
scroskey-clinc pushed a commit that referenced this pull request Dec 23, 2020