Merge pull request #52 from unfoldingWord/RJHimprovements
Move stuff back out of core and of NPM
RobH123 authored Sep 30, 2020
2 parents bd38838 + 49d71c0 commit 860cb05
Showing 38 changed files with 14,313 additions and 7,426 deletions.
16 changes: 15 additions & 1 deletion .npmignore
@@ -1 +1,15 @@
./src/demos
# We don't need any of the demo stuff in the NPM package
./src/demos/

# We don't need the testing stuff in the NPM package
./cypress/
./src/__tests__/

# Nor do we need any of the styleguidist stuff from the core
./styleguide.*
./src/code/*.md

# Nor do we need some of the extra stuff
./scripts/
./makeNoticeList.py
./noticeList.txt
39 changes: 36 additions & 3 deletions README.md
@@ -45,11 +45,13 @@ This code is designed to thoroughly check various types of Bible-related content

Note: There is also a separate function for checking individual TN/TSV lines which is intended to be able to provide immediate user feedback if built into a TSV editor.

The top-level checking demonstrations:
The top-level checking demonstrations return:

1. A list of things that were checked (successList)
1. Typically a list of (higher-priority) errors and a list of (lower-priority) warnings, but other formats for display of messages are also demonstrated.

### Notice Objects (in noticeList)

However, the lower-level checking functions provide only the list of success message strings and one list of `notices` (i.e., warnings/errors combined). Each notice is typically an object with some or all of the following fields (as available/relevant):

There are two compulsory fields in all of these notice objects:
@@ -59,7 +61,7 @@ There are two compulsory fields in all of these notice objects:

All of the following fields may be missing or undefined, i.e., they're all optional:

1. `details`: More details about the notice (if relevant)
1. `details`: More details about the notice (if applicable)
1. `bookID`: The 3-character UPPERCASE [book identifier](http://ubsicap.github.io/usfm/identification/books.html) or [OBS](https://www.openbiblestories.org/) (if relevant)
1. `C`: The chapter number or OBS story number (if relevant)
1. `V`: The verse number or OBS frame number (if relevant)
@@ -82,9 +84,40 @@ There is a second version of the function which splits into `Severe`, `Medium`,

However, the user is, of course, free to create their own alternative version of these functions. This is possibly also the place to consider localisation of all the notices into different interface languages???

## User-settable Options

Checking can be altered and/or sped up when the calling app sets some or all of the following fields in `optionalCheckingOptions`:

- extractLength: an integer which defines how long excerpts of lines containing errors should be -- the default is 10 characters -- the package attempts to place the error in the middle of the extract
- getFile: a function which takes the four parameters ({username, repository, path, branch}) and returns the full text of the relevant Door43 file -- default is to use our own function and associated caching
- fetchRepositoryZipFile: a function which takes the three parameters ({username, repository, branch}) and returns the contents of the zip file containing all the Door43 files -- default is to use our own function and associated caching
- getFileListFromZip: takes the same three parameters and returns a list/array containing the filepaths of all the files in the zip file from Door43 -- default is to use our own function and associated caching
- originalLanguageVerseText: the Hebrew/Aramaic or Greek original language text for the book/chapter/verse of the TSV line being checked -- this enables `OrigQuote` fields to be checked without needing to load and parse the actual USFM file
- originalLanguageRepoUsername and originalLanguageRepoBranch: these two fields can be used to specify the username/organisation and/or the branch/tag name for fetching the UHB and UGNT files for checking
- taRepoUsername, taRepoBranchName: these two fields can be used to specify the username/organisation and/or the branch/tag name for fetching the TA files for checking
- taRepoLanguageCode and taRepoSectionName: can be used to specify how the `SupportReference` field is checked in TA -- defaults are 'en' and 'translate'
- twRepoUsername, twRepoBranchName: these two fields can be used to specify the username/organisation and/or the branch/tag name for fetching the TW files for checking
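
As a minimal sketch of how a calling app might assemble these options: the field names come from the list above, but every value is illustrative, and `myGetFile` is a hypothetical stand-in for whatever fetching or caching the app already has. How the object is then passed to the checking functions is not shown here.

```javascript
// Hypothetical caller-supplied fetcher: a real app would fetch the file from
// Door43 (or its own cache) instead of returning this fixed string.
async function myGetFile({ username, repository, path, branch }) {
  return `Contents of ${username}/${repository}/${branch}/${path}`;
}

// Field names are from the list above; all values are illustrative only.
const optionalCheckingOptions = {
  extractLength: 15, // longer excerpts than the default of 10 characters
  getFile: myGetFile, // replaces the package's own fetching and caching
  originalLanguageRepoUsername: 'unfoldingWord', // illustrative value
  originalLanguageRepoBranch: 'master', // illustrative value
  taRepoUsername: 'unfoldingWord', // illustrative value
  taRepoBranchName: 'master', // illustrative value
  taRepoLanguageCode: 'en', // the documented default
  taRepoSectionName: 'translate', // the documented default
};
```

Any field left out of the object would fall back to the default described in the list.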

Most of the high-level demonstrations allow a choice of one of three display formats for notices:

- 'SingleList': sorts notices by priority (highest first) then colours the highest ones bright red, slowly fading to black for the lowest priorities
- 'ErrorsWarnings': arbitrarily divides notices into a list of *errors* and a list of *warnings*, each displayed in different colours
- 'SevereMediumLow': divides notices into three lists which are displayed in different colours
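
The 'SingleList' format could be approximated roughly as follows. This is only a sketch, not the demonstrations' actual code: it assumes each notice has a numeric `priority` field, that priorities run from 0 to 999, and that the fade is linear. The names `colourForPriority` and `toSingleList` are hypothetical.

```javascript
// Assumption: priorities run from 0 to 999, with higher numbers more serious.
const MAX_PRIORITY = 999;

// Map a priority to a CSS colour: bright red at the top, fading to black at 0.
function colourForPriority(priority) {
  const clamped = Math.min(Math.max(priority, 0), MAX_PRIORITY);
  const red = Math.round((clamped / MAX_PRIORITY) * 255);
  return `rgb(${red}, 0, 0)`;
}

// Return a new array sorted highest priority first, with a colour per notice.
// The input array is left unmodified.
function toSingleList(noticeList) {
  return [...noticeList]
    .sort((a, b) => b.priority - a.priority)
    .map((notice) => ({ ...notice, colour: colourForPriority(notice.priority) }));
}
```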

In addition, there are some options in the display of notices for the demonstrations, set in `optionalProcessingOptions` used by the sample notice processing functions:

- ignorePriorityNumberList: a list (array) of integers that causes notices with these priority values to be dropped during notice processing
- sortBy: a string which can be set to 'ByPriority' -- the default is 'AsFound', i.e., unsorted
- errorPriorityLevel: an integer which can define *errors* (vs *warnings*) (if relevant) -- defaults to 700 (and above)
- severePriorityLevel: an integer which can define *severe* errors (if relevant) -- defaults to 800 (and above)
- mediumPriorityLevel: an integer which can define *medium* errors (if relevant) -- defaults to 600 (and up to `severePriorityLevel`)
- cutoffPriorityLevel: an integer which can define notices to be dropped/ignored -- defaults to 0 so none are dropped
- maximumSimilarMessages: an integer which defines how many of a certain notice to display, before summarising and saying something like *99 similar errors suppressed* -- zero means don't ever summarise notices -- defaults to 3
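
To illustrate how several of these options might interact, here is a hedged sketch of an errors/warnings split. It is not the package's own notice processing function: `processNoticesSketch` is a hypothetical name, notices are assumed to have `priority` and `message` fields, "similar" is assumed to mean an identical `message`, and the cutoff is assumed to drop notices strictly below `cutoffPriorityLevel`.

```javascript
// Defaults as documented in the list above.
const DEFAULT_PROCESSING_OPTIONS = {
  ignorePriorityNumberList: [],
  sortBy: 'AsFound',
  errorPriorityLevel: 700,
  cutoffPriorityLevel: 0,
  maximumSimilarMessages: 3,
};

function processNoticesSketch(noticeList, optionalProcessingOptions = {}) {
  const options = { ...DEFAULT_PROCESSING_OPTIONS, ...optionalProcessingOptions };

  // Drop explicitly ignored priorities and (by assumption) anything below the cutoff.
  let notices = noticeList.filter(
    (notice) => !options.ignorePriorityNumberList.includes(notice.priority)
      && notice.priority >= options.cutoffPriorityLevel
  );

  // 'AsFound' keeps the original order; 'ByPriority' puts the highest first.
  if (options.sortBy === 'ByPriority') {
    notices = [...notices].sort((a, b) => b.priority - a.priority);
  }

  const errorList = [];
  const warningList = [];
  const seenCounts = new Map(); // message text -> number of times seen so far
  let numSuppressed = 0;
  for (const notice of notices) {
    const count = (seenCounts.get(notice.message) || 0) + 1;
    seenCounts.set(notice.message, count);
    // A limit of zero means notices are never summarised.
    if (options.maximumSimilarMessages > 0 && count > options.maximumSimilarMessages) {
      numSuppressed += 1;
      continue;
    }
    const target = notice.priority >= options.errorPriorityLevel ? errorList : warningList;
    target.push(notice);
  }
  return { errorList, warningList, numSuppressed };
}
```

A display layer could then use `numSuppressed` to show a summary line such as *2 similar errors suppressed*.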

## Still To Do

Still unfinished (in rough priority order):

1. Add manifest read and parsing functions to determine filenames to check
1. Checking of general markdown and naked links (esp. in plain text and markdown files)
1. Write the correct checks for the forthcoming new TSV annotation formats
1. Work through all [Issues](https://github.com/unfoldingWord/uw-content-validation/issues)
2 changes: 1 addition & 1 deletion noticeList.txt
@@ -14,7 +14,7 @@ Got 212 notices:
991, "Unresolved GIT conflict", characterIndex, extract, location: ourLocation
988, `Wrong number of tabbed fields (expected $NUM_EXPECTED_TN_TSV_FIELDS)`, extract: `Found $fields.length field$fields.length === 1 ? '' : 's'`, lineNumber: n + 1, location: ourLocation
988, `Wrong number of tabbed fields (expected $NUM_EXPECTED_ANNOTATION_TSV_FIELDS)`, extract: `Found $fields.length field$fields.length === 1 ? '' : 's'`, lineNumber: n + 1, location: ourLocation
`"`.indexOf(line[0]) < 0 ? 980 : 280, C, V, "Expected line to start with backslash", lineNumber: n, characterIndex: 0, extract: line[0], location: ourLocation
`"`.indexOf(line[0]) < 0 ? 880 : 280, C, V, "Expected line to start with backslash", lineNumber: n, characterIndex: 0, extract: line[0], location: ourLocation
987, C, V, "Expected \\id line to start with book identifier", lineNumber: n, characterIndex: 4, extract, location: ourLocation
979, "Invalid book identifier passed to checkTN_TSVDataRow", location: ` '$bookID' in first parameter: $tlcNCerror`
979, "Invalid book identifier passed to checkTNLinksToOutside", location: ` '$bookID' in first parameter: $tnlcError`
2 changes: 1 addition & 1 deletion package.json
@@ -1,7 +1,7 @@
{
"name": "uw-content-validation",
"description": "Functions for Checking Door43.org Scriptural Content/Resources.",
"version": "0.8.13_alpha3",
"version": "0.8.14_alpha3",
"private": false,
"homepage": "https://unfoldingword.github.io/uw-content-validation/",
"repository": {