TN checks get swapped when resources updated #7198

Closed
PhotoNomad0 opened this issue Nov 25, 2021 · 13 comments

@PhotoNomad0
Contributor

PhotoNomad0 commented Nov 25, 2021

Problem

In translationNotes for 2 John, there are two selections for the same verse, 1:3, under "Abstract Nouns":

1:3 "Grace, mercy, and peace will be with us from God the Father and from Jesus Christ"
1:3 "in truth and love"

For the Kannada project "kn_kglt_2jn_book", these two text selections for the same verse 1:3 are interchanged/displayed incorrectly.

Please check below screenshots.

kn_kglt_2jn_book_screenshot1.png

kn_kglt_2jn_book_screenshot2.png

Per the user Vishwanath (@jobby Prasannan), it looks like this issue happens in some other books as well: "When two instances come in one verse, highlighted lines or words are jumbled up in TN & TW. Even after updating the content, I am facing the same issue in different books."

kn_kglt_2jn_book repo - https://git.door43.org/India_BCS/kn_kglt_2jn_book

To Reproduce

  • deleted resources folder
  • opened tc 3.0.1
  • imported from DCS: https://git.door43.org/India_BCS/kn_kglt_2jn_book
  • for tN tool: selected all tN checks and set GL to English
  • opened tN - could not see the checks specified
  • downloaded resource updates for en and kn
  • selected another project and reopened the above project
  • for tN tool: selected all tN checks
  • opened tN - now could see the checks specified
  • downloaded resource updates for el-x-koine
  • selected another project and reopened the above project
  • now see the checks scrambled

Additional Context

It looks like there is a similar cause for #7153


@PhotoNomad0 PhotoNomad0 self-assigned this Nov 25, 2021
@PhotoNomad0
Contributor Author

PhotoNomad0 commented Nov 29, 2021

Copied from discussion thread

The checks were showing properly for me at first, but then after updating el-x-koine they swapped, so I don't think it's user error.

FYI - I dug into the data stored on the repo and discovered that the saved selections are correct, so the selections themselves are not corrupted. It is a bug in tC that displays the selections on the wrong check when there are multiple checks of the same type on the same verse. Created an issue for it, Birch: #7198

Looking at tCore, we have been trying to migrate selections for checks whenever the original language changes. It has worked OK so far in most cases. For example, in the above kn project, tC will migrate selections where the quote changes from "ἐντολή" => "ἐντολὰς". That works OK if there is only one check in the same category in the same verse, but if there are more, tC may update the selection with the wrong quote (since it does not understand Greek or Hebrew well). For tN this caused a problem with, for example, two abstract-noun checks in the same verse with the quotes “ἔσται μεθ’ ἡμῶν χάρις, ἔλεος, εἰρήνη, παρὰ Θεοῦ Πατρός καὶ παρὰ Ἰησοῦ Χριστοῦ” and “ἐν ἀληθείᾳ καὶ ἀγάπῃ”. Here it updated the wrong selection.

Since it seems that it would take too long to teach tC to understand Greek and Hebrew, I am suggesting one of two options:

  1. We limit migrating selections (by updating the original language quotes) to the case where there is only one check per verse in the same category. There is minimal risk that the quote will be updated for the wrong selection.
  2. We stop migrating selections. This will make more work for checkers, since more selections will be lost when the original language updates, but there is no chance that tC will update a selection with the wrong original language quote.
    Once we move to using the IDs of the checks (tN support in tC 3.0.2), it will be much easier to update the quote for the check.

Russ Perry said:

I'd vote for the choice that limits the re-do for the GL checkers, so I'd prefer not choosing option 2. For option 1, it sounds like when there are two checks in a verse it will invalidate the selection and then it would force the GL team to re-do the selection. I'm assuming that would happen only if the Greek or Hebrew quote has been changed in that verse? Anyway, option 1 sounds reasonable to me.

Correct - basically it is due to changes in the original language. tCore sees it when the tN tsv files have the original quote changed or, in the case of tW, when the spelling of the word changed in the original Bible. But another case is when the user changes the GL for checking. The original language quotes may change, for example, when switching from en to kn, because kn may be using an older original language quote from an older version of the UGNT. The longer-term solution in tCore is using the IDs (such as xyza) in the tsv, so we are sure which selection may have had its original language quote changed. tNotes support is coming in tC 3.0.2, whereas ID support in tWords will require updating tCore to use catalog next and the twls.

@PhotoNomad0
Contributor Author

PhotoNomad0 commented Nov 30, 2021

Description of Problem and Possible Solutions:

Background: the tCore parsing of the tN tsv was implemented a long time ago. We kept track of checks by reference, groupID, original language quote, and occurrence. Later it was realized that original quotes would need to change whenever the original language text changed. To make it easier for the checkers, it was decided to update the quotes in their selections automatically, minimizing rework whenever the tNs are updated. Later still it was realized that matching previous selections based on reference, groupID, and occurrence was not sufficient, since it gets messy when more checks are added that have the same reference, groupID, and occurrence, so an ID field was added to the checks to uniquely identify a check in the tsv. But tCore was never updated to make use of this.
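
To illustrate why matching on reference, groupID, and occurrence alone is ambiguous, here is a minimal Python sketch (not tCore's actual code). The field names, the check IDs `ab12` and `cd34`, the occurrence values, and the `figs-abstractnouns` groupID are assumptions for illustration; the two quotes are the ones reported for 2 John 1:3.

```python
from collections import Counter
from typing import NamedTuple


class Check(NamedTuple):
    check_id: str    # hypothetical ID value from the tsv
    reference: str
    group_id: str    # assumed groupID for "Abstract Nouns"
    quote: str
    occurrence: int  # assumed occurrence value


CHECKS = [
    Check("ab12", "2jn 1:3", "figs-abstractnouns",
          "ἔσται μεθ’ ἡμῶν χάρις, ἔλεος, εἰρήνη, παρὰ Θεοῦ Πατρός καὶ παρὰ Ἰησοῦ Χριστοῦ", 1),
    Check("cd34", "2jn 1:3", "figs-abstractnouns", "ἐν ἀληθείᾳ καὶ ἀγάπῃ", 1),
]


def legacy_key(check):
    """Key without the quote or ID: all that is left to match on once the quote has changed."""
    return (check.reference, check.group_id, check.occurrence)


def ambiguous_keys(checks):
    """Return the legacy keys that are shared by more than one check."""
    counts = Counter(legacy_key(c) for c in checks)
    return [key for key, count in counts.items() if count > 1]
```

In this sketch both checks collapse onto the same legacy key, while their IDs stay distinct, which is the situation the ID field was added to resolve.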
 
Current problem:

  • on resource change - mismatching may occur whenever more than one tN check has the same reference, groupID, and occurrence
    • a resource change can occur whenever:
      • the user downloads updates to the tN for the currently selected GL
      • the user selects a different GL (tNotes for the GLs can be based on different original language versions, or have a different number of checks)

Automatic quote updating was intended for cases where there were fixes in the original language, etc. Here is an artificial example where the GL of a selection needs to be updated:

Screen Shot 2021-11-30 at 9 09 25 AM copy

Unfortunately it breaks when there is more than one resource with the same reference, groupID, occurrence, such as:

Screen Shot 2021-11-30 at 8 48 14 AM copy

Long term we need to fix tC to recognize the IDs when migrating selections so we can confidently match, such as:

Screen Shot 2021-11-30 at 8 53 11 AM copy

Summary of the current state of the problem:

  • this bug is in all releases of tC, including v3.0.2, which is currently being released.
  • quote corruption can occur whenever:
    • the user downloads updates to the tN for the currently selected GL
    • the user selects a different GL

Suggested Fix
Since it seems that it would take too long to teach tC to understand Greek and Hebrew, I am suggesting one of two options:

  1. We limit migrating selections (by updating the original language quotes) to the case where there is only one check per verse in the same category. There is minimal risk that the quote will be updated for the wrong selection.
  2. We stop migrating selections. This will make more work for checkers, since more selections will be lost when the original language updates or they change GLs, but there is no chance that tC will update a selection with the wrong original language quote.

Additionally, we need to fix tCore to match the IDs of the checks when migrating.
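
Option 1 could look roughly like the following Python sketch. It is only an illustration under stated assumptions, not tCore's code: the dictionary fields (`reference`, `groupId`, `quote`, `selection`) and the function name `migrate_quotes` are hypothetical, and selections that cannot be migrated safely are simply returned as skipped so the caller can decide what to do with them.

```python
from collections import defaultdict


def _group(items):
    """Group checks or selections by (reference, groupId)."""
    grouped = defaultdict(list)
    for item in items:
        grouped[(item["reference"], item["groupId"])].append(item)
    return grouped


def migrate_quotes(selections, updated_checks):
    """Update a selection's quote only when its verse and category hold exactly one check.

    Returns (migrated, skipped). Skipped selections are left untouched.
    """
    old_groups = _group(selections)
    new_groups = _group(updated_checks)
    migrated, skipped = [], []
    for key, old_items in old_groups.items():
        new_items = new_groups.get(key, [])
        if len(old_items) == 1 and len(new_items) == 1:
            # Unambiguous: one old selection, one updated check.
            migrated.append({**old_items[0], "quote": new_items[0]["quote"]})
        else:
            # Ambiguous or missing: do not guess which quote belongs where.
            skipped.extend(old_items)
    return migrated, skipped
```

With two checks in the same verse and category, both selections end up in `skipped` instead of receiving each other's quote.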

@PhotoNomad0 changed the title from "TN checks get swapped when resources updated in kn" to "TN checks get swapped when resources updated" on Nov 30, 2021
@PhotoNomad0
Contributor Author

PhotoNomad0 commented Dec 4, 2021

@elsylambert Tested in translationCore 3.0.2 (0f2d103).
I created a fork https://git.door43.org/photonomad1/kn_kglt_2jn_book https://git.door43.org/tCore-test-data/kn_kglt_2jn_book to capture the project before they made changes.
You can look at the steps in the PR #7199 to see what I was looking at for testing.

@BincyJ

BincyJ commented Dec 10, 2021

@PhotoNomad0 , does @mannycolon need to review this issue or is it ready to be tested?

@BincyJ

BincyJ commented Dec 10, 2021

@birchamp , can you confirm if this is 3.0.3 or 3.0.2 release

@PhotoNomad0
Contributor Author

@BincyJ Moved to QA (forgot that), should be a 3.0.2 issue @birchamp. Ready for @elsylambert to test.

@elsylambert

Verified in translationCore 3.0.2 (0f2d103). TN checks are not swapped and are displayed correctly as per the steps in PR #7199.
@PhotoNomad0 However, I have the following observation on one of the categories: Assumed Knowledge ...

  • Some of the multiple checks in this category were not displayed correctly.

Screen Shot 2021-12-15 at 9 45 19 PM

  • Could not find the selections for this category in the TC folders.

Screen Shot 2021-12-15 at 9 57 14 PM

Here is the zipped project:
kn_kgl_2jn_book.zip

@PhotoNomad0
Contributor Author

* Could not find the selections for this category in the TC folders.
Screen Shot 2021-12-15 at 9 57 14 PM

Here is the zipped project: kn_kgl_2jn_book.zip

That is always hard to track down, because we show localized strings on the left. But if they are using en as the GL, you can find the mapping between groupIDs and displayed text in the en_tn resource (e.g. ~/translationCore/resources/en/translationHelps/translationNotes/v49/culture/index.json), which will look like this:

[
  {
    "id": "figs-explicit",
    "name": "Assumed Knowledge and Implicit Information"
  },
  {
    "id": "translate-names",
    "name": "How to Translate Names"
  },
  {
    "id": "translate-unknown",
    "name": "Translate Unknowns"
  },
  {
    "id": "writing-symlanguage",
    "name": "Symbolic Language"
  },
  {
    "id": "figs-go",
    "name": "Go and Come"
  },
  {
    "id": "translate-symaction",
    "name": "Symbolic Action"
  },
  {
    "id": "translate-bmoney",
    "name": "Biblical Money"
  },
  {
    "id": "translate-bdistance",
    "name": "Biblical Distance"
  },
  {
    "id": "translate-hebrewmonths",
    "name": "Hebrew Months"
  },
  {
    "id": "translate-bvolume",
    "name": "Biblical Volume"
  },
  {
    "id": "translate-bweight",
    "name": "Biblical Weight"
  }
]

But that list doesn't seem complete. Still looking for the others.
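
Since an index file of this shape maps groupIDs to displayed names, a small script can do the lookup for a category name seen in the tool. The Python sketch below assumes the JSON shape shown above and uses a hypothetical helper name, `group_ids_for_name`; it embeds a few entries instead of reading the real file, and it matches on a name prefix because the displayed name may be truncated in the UI.

```python
import json

# A few entries copied from the index.json excerpt above.
INDEX_JSON = """
[
  {"id": "figs-explicit", "name": "Assumed Knowledge and Implicit Information"},
  {"id": "translate-names", "name": "How to Translate Names"},
  {"id": "translate-unknown", "name": "Translate Unknowns"}
]
"""


def group_ids_for_name(index_entries, displayed_name):
    """Return the groupIDs whose displayed name starts with the given text (case-insensitive)."""
    wanted = displayed_name.strip().casefold()
    return [
        entry["id"]
        for entry in index_entries
        if entry["name"].casefold().startswith(wanted)
    ]


entries = json.loads(INDEX_JSON)
matches = group_ids_for_name(entries, "Assumed Knowledge")
```

The same function would need to be run over each category's index file, since one file only lists the groups of one category.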

@PhotoNomad0
Contributor Author

@elsylambert The full list is very difficult to navigate. It's in the tA such as https://git.door43.org/Door43-Catalog/en_ta/src/branch/master/translate

The localized strings are stored like https://git.door43.org/Door43-Catalog/en_ta/src/branch/master/translate/figs-explicit/title.md

@PhotoNomad0
Contributor Author

Verified in translationCore 3.0.2 (0f2d103). TN checks are not swapped and are displayed correctly as per the steps in PR #7199. @PhotoNomad0 However, I have the following observation on one of the categories: Assumed Knowledge ...

* Some multiple checks in this category were not displayed correctly.
Screen Shot 2021-12-15 at 9 45 19 PM

I don't know the history of this project, but it looks like the behavior is expected. Those are duplicated checks and until we release a fix for tCore that supports the IDs we will not be able to match previous selections to updated resources.

Unfortunately it will not be until after 3.0.2 is released and the duplicate alignments are corrected by the user, that this problem will go away.

@PhotoNomad0
Contributor Author

Whenever the user updates resources, tCore goes through all the previous selections for the tool (stored at ~/translationCore/projects/kn_kgl_2jn_book/.apps/translationCore/checkData/selections/2jn/1/5, for example) and tries to match them to the checks in the updated resources. But without the check IDs stored there, tC 3.0.1 would make mistakes matching the old selections. This is because new checks are often added and old checks changed (such as by changing the original language strings). This caused selections to sometimes be duplicated or switched. Going forward we will be able to use the check IDs to prevent this from happening.
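
As a rough illustration of ID-based matching (a sketch only, not the tCore implementation), the Python below pairs each saved selection with the updated check that carries the same ID. The field names `checkId`, `quote`, and `selection`, the function name `match_by_id`, and the ID values are all hypothetical.

```python
def match_by_id(selections, updated_checks):
    """Pair saved selections with updated checks by check ID.

    Returns (matched, unmatched). Matched selections get the check's current
    quote; selections without an ID, or whose ID no longer exists, are
    returned untouched as unmatched.
    """
    checks_by_id = {check["checkId"]: check for check in updated_checks}
    matched, unmatched = [], []
    for selection in selections:
        check = checks_by_id.get(selection.get("checkId"))
        if check is None:
            unmatched.append(selection)
        else:
            matched.append({**selection, "quote": check["quote"]})
    return matched, unmatched
```

Because the ID does not depend on the quote, two checks in the same verse and category can both have their quotes changed without their selections being swapped.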

I think the only way for you to check that this is working is to fix the selections in the project. Then you switch the GL to hi and reopen the tN. Then switch back to GL of en and when you reopen tN the selections should be remembered correctly.

@PhotoNomad0
Contributor Author

Switching GLs forces the resources for the new GL to be reloaded in tCore. As far as tCore is concerned this is the same as updating resources.

@elsylambert

Thanks @PhotoNomad0 for the information you provided. After the discussion and walkthrough, I am passing this issue. It works fine in tN and tW as well.
