Improve specification conformance of unicode bidi library-Initial step #19

moharnab123saikia · 2015-10-30T06:42:30Z

Files changed: src/lib.rs and tools/generate.py

•added code to tools/generate.py to download the files: http://www.unicode.org/Public/UNIDATA/BidiTest.txt and http://www.unicode.org/Public/UNIDATA/BidiCharacterTest.txt
•wrote manual rust tests in lib.rs that can be run automatically

eefriedman · 2015-10-30T06:50:39Z

Please squash the commits.

Please remove the extra lib.rs~.

Stray tab in generate.py.

for class in rem_classes etc. in test_removed_by_x9.

In Rust code, space before function opening brace, and no blank lines after the opening brace or before the closing brace.

metajack · 2015-11-03T21:00:24Z

@eefriedman can you re-review? Or should @mbrubeck review?

eefriedman · 2015-11-03T22:02:22Z

@metajack Re-review what? My initial review comments still weren't addressed.

@moharnab123saikia We can't merge a PR with failing tests; change the test to reflect the current state of the code, and add a comment like // TODO(#999): This should equal "foo" (replace 999 with the issue number for the issue you filed).

metajack · 2015-11-03T22:25:12Z

@eefriedman Sorry. I assumed he had addressed them since there were new commits.

samruds1 · 2015-11-08T20:53:49Z

@eefriedman - we have made the changes. Can you please review the changes?

eefriedman · 2015-11-08T21:15:50Z

Review status: 0 of 5 files reviewed at latest revision, 8 unresolved discussions.

^{src/lib.rs, line 980 [r8] (raw file):}
Needs some sort of note about why the test is commented out.

^{src/lib.rs, line 1003 [r8] (raw file):}
Missing newline.

^{src/lib.rs, line 1009 [r8] (raw file):}
for x in rem_classes.

^{src/lib.rs, line 1012 [r8] (raw file):}
for x in not_classes

^{src/lib.rs, line 1019 [r8] (raw file):}
Space after commas.

^{src/lib.rs, line 1020 [r8] (raw file):}
for x in non_x9_classes.

^{src/lib.rs, line 1021 [r8] (raw file):}
Space after comma.

^{tools/generate.py, line 199 [r8] (raw file):}
Indentation, and please limit comments to 100 columns where possible.

Comments from the review on Reviewable.io

samruds1 · 2015-11-09T01:11:12Z

@eefriedman Please review.

eefriedman · 2015-11-09T01:27:35Z

Review status: 0 of 5 files reviewed at latest revision, 5 unresolved discussions.

^{src/lib.rs, line 980 [r9] (raw file):}
Indentation. (Tab?)

^{src/lib.rs, line 1013 [r9] (raw file):}
Space before opening brace (same for other loops).

^{tools/generate.py, line 196 [r9] (raw file):}
Are you sure reindenting the if statement is correct? It changes the semantics...

^{tools/generate.py, line 198 [r9] (raw file):}
Trailing whitespace

Comments from the review on Reviewable.io

eefriedman · 2015-11-09T01:28:35Z

@mbrubeck Do we want the testcase files in this repository?

vicky-katara · 2015-11-09T05:58:47Z

@eefriedman, we have made the requested changes. Please let us know if you need any other changes. The only change that we have left out is the commented test case. This test case was failing and should get handled in the next steps. We have left it commented there for reference, and will remove the steps when the further steps have been completed.

samruds1 · 2015-11-10T02:53:22Z

@eefriedman, @mbrubeck - Do we have to incorporate anymore changes? Can you please review.

eefriedman · 2015-11-10T03:02:22Z

I'm waiting for an answer to #19 (comment) ; looks fine otherwise.

mbrubeck · 2015-11-10T17:33:02Z

Thanks! The new tests look good.

Since the .txt files aren't used yet, let's leave them out of the repo (but keep the code to fetch them). Once we have some code to process the files, we can decide whether it makes sense to check them in.

Reviewed 3 of 4 files at r1, 1 of 1 files at r10, 1 of 1 files at r11.
Review status: all files reviewed at latest revision, 6 unresolved discussions.

^{src/BidiCharacterTest.txt, line 0 [r11] (raw file):}
I think we should leave these files out of the repo at least for now.

Comments from the review on Reviewable.io

samruds1 · 2015-11-10T17:47:19Z

@mbrubeck - I have commented out the code and deleted those two files from the repository. Please review the changes.

eefriedman · 2015-11-10T20:17:59Z

Please fix the commit message (it currently contains a bunch of useless text from squashing).

moharnab123saikia · 2015-11-10T22:36:29Z

@ https://github.com/eefriedmaneefriedman https://github.com/eefriedman -
we have deleted the garbage messages and only kept the details about the
changes we made. Please review.

Thanks and Regards,
Moharnab Saikia

Please fix the commit message (it currently contains a bunch of useless
text from squashing).

—
Reply to this email directly or view it on GitHub
#19 (comment).

added test is_ltr Added test is_rtl added test removed_by_X9 added test not_removed_by_x9() Added test case to test_reorder_line()

eefriedman · 2015-11-10T22:41:17Z

It looks like the new version isn't rebased correctly (GitHub shows two commits).

vicky-katara · 2015-11-11T20:21:58Z

@eefriedman, we only see one commit from our side which is 'unicode-bidi/ 2a99bca'. Do you want us to squash our commit into your commit that is 'unicode-bidi/ 69e42af'

eefriedman · 2015-11-11T20:34:35Z

You branch contains commit 69e42af, which doesn't exist on master (the equivalent commit on master is a202520).

samruds1 · 2015-11-12T16:05:59Z

@eefriedman - We are planning to reset and rebase our commits again. Maybe that will solve the issue.

samruds1 · 2015-11-13T07:35:25Z

@eefriedman - When I clone the unicode-bidi repository (with none of our changes added) and run the tests defined in lib.rs, a folder called target is getting created. After I add our changes, should I let the folder remain in the repository and push the changes?

mbrubeck · 2015-11-13T17:05:18Z

target is where Cargo puts compiled code and other files it generates. Git is configured to ignore this directory (using the .gitignore file), so you can leave it in place as you commit and push changes.

samruds1 · 2015-11-13T19:05:25Z

@mbrubeck - Thanks for the clarification. Can you please review the changes.

vicky-katara · 2015-11-13T20:40:14Z

@mbrubeck @eefriedman , we wanted some clarification.

How are generate.py and Cargo linked?
Should 'cargo test' call generate.py somehow?
What should be the location of 'UnicodeData.txt' and 'ReadMe.txt'?

eefriedman · 2015-11-13T21:04:13Z

Currently, generate.py and Cargo aren't linked; generate.py modifies the source code. We could change this so generate.py generates a build artifact instead, I guess... but that isn't obviously worthwhile.

Note that you wouldn't want the build process to fetch data from unicode.org: that would make the resulting binary unpredictable.

What should be the location of 'UnicodeData.txt' and 'ReadMe.txt'?

Currently, nowhere; the fact that the current generate.py saves the files into the current working directory is a bug.

If we wanted to add the files to the repository, we could add a unicode/ directory, I guess, and add a separate script to update those files. That said, I'm not sure that's necessary.

mbrubeck · 2015-11-13T21:05:06Z

How are generate.py and Cargo linked?

Currently, generate.py is run manually, and its output (src/tables.rs) is checked in to the git repository. This way, we only need to run generate.py when upgrading to a new version of Unicode, which happens relatively infrequently.

Should 'cargo test' call generate.py somehow?

I suggest we do the same thing we do for the non-test code: Run generate.py manually, and check in all the files needed for running tests (either the .txt files, or generated Rust files that contain the same data). This has the disadvantage of making the repository larger, but it should make our test process more reproducible (since it doesn't depend on network resources outside our control).

What should be the location of 'UnicodeData.txt' and 'ReadMe.txt'?

src/UnicodeData.txt and src/ReadMe.txt are temporary files used only by generate.py and should not be checked in to the repository. They are listed in the .gitignore file.

vicky-katara · 2015-11-13T21:29:05Z

@eefriedman and @mbrubeck, thanks for the detailed replies.

As far as we see, all are changes are in place. Do you see any more changes? Or are we good to go on the merge?

mbrubeck · 2015-11-13T21:34:52Z

This looks good to me. Thanks!

Reviewed 3 of 4 files at r12.
Review status: all files reviewed at latest revision, 6 unresolved discussions.

Comments from the review on Reviewable.io

mbrubeck · 2015-11-13T21:36:40Z

Oops, there's still a duplicate commit in there (accidentally created in a rebase?) but it doesn't seem to cause any issues so I'll just let this go through.

mbrubeck · 2015-11-13T21:52:05Z

@bors-servo r+

bors-servo · 2015-11-13T21:52:07Z

📌 Commit 2a99bca has been approved by mbrubeck

bors-servo · 2015-11-13T21:52:08Z

⌛ Testing commit 2a99bca with merge 06dedd1...

Improve specification conformance of unicode bidi library-Initial step Files changed: src/lib.rs and tools/generate.py •added code to tools/generate.py to download the files: http://www.unicode.org/Public/UNIDATA/BidiTest.txt and http://www.unicode.org/Public/UNIDATA/BidiCharacterTest.txt •wrote manual rust tests in lib.rs that can be run automatically  [<img src="https://reviewable.io/review_button.png" height=40 alt="Review on Reviewable"/>](https://reviewable.io/reviews/servo/unicode-bidi/19)

bors-servo · 2015-11-13T21:53:24Z

☀️ Test successful - travis

vicky-katara · 2015-11-25T01:12:26Z

Hi @mbrubeck , @eefriedman

We have the following queries about the subsequent steps of our project as mentioned here: https://github.com/servo/servo/wiki/Improve-specification-conformance-of-unicode-bidi-library

How do we interpret test cases mentioned in BidiTest.txt?
For example: @levels: x
@reorder:
LRE; 7
LRO; 7
RLE; 7
RLO; 7
PDF; 7
BN; 7

            #Count: 6

What function from lib.rs should we call to create test cases present in BidiTest.txt.

What function from lib.rs should we call to create test cases present in BidiTest.txt? Is it reorder-line?
What do 'passes' specifically mean in context of "applies steps W1-W7 in a single pass". Are they character by character scans of the input line? Or are they procedure calls of resolve_weak?

eefriedman · 2015-11-25T02:18:09Z

There's some basic documentation for the meaning of the bidi tests at the top of BidiTest.txt. Each test line consists of a list of bidi character classes. For each test line (lines without a starting # or @), two things should get checked: the level of each character (essentially the levels array returned from process_text()) and the visual order of the resulting text (essentially the result of reorder_line()). The relevant level and ordering are specified by the @levels and @reorder lines. (The very first test in the file is kind of confusing to look at because it's a test which doesn't actually contain any text.)

I'm pretty sure https://github.com/servo/unicode-bidi/blob/master/src/lib.rs#L687 is referring to the fact that the function only examines each character in the string once; it has nothing to do with the callers.

samruds1 · 2015-12-01T05:21:19Z

@eefriedman Thanks for the explanation. I am unable to understand what the number 7 denotes in the test cases given below:
L LRE R R; 7
L; 7
L LRE; 7
Can you please help me out.

eefriedman · 2015-12-01T05:39:56Z

# A data line has the following format:
# <input> ; <bitset>
#   <input>  =      An ordered list of BIDI property values
#   <bitset> =      A hex bitset for paragraph levels (P): 1 = auto-LTR, 2 = LTR, 4 = RTL
#                   Auto-LTR (standard BIDI) uses the first L/R/AL character, and is LTR if none is found.

It's a bitset, so 7 means 1|2|4, i.e. the test applies in contexts with an automatic paragraph level, a forced LTR paragraph level, and a forced RTL paragraph level; see http://unicode.org/reports/tr9/#The_Paragraph_Level .

metajack assigned eefriedman Nov 3, 2015

Version 0.2.2

69e42af

Added code to generate.py to download conformance library

2a99bca

added test is_ltr Added test is_rtl added test removed_by_X9 added test not_removed_by_x9() Added test case to test_reorder_line()

bors-servo merged commit 2a99bca into servo:master Nov 13, 2015

mbrubeck mentioned this pull request Nov 30, 2015

Add conformance tests #1

Closed

Improve specification conformance of unicode bidi library-Initial step #19

Improve specification conformance of unicode bidi library-Initial step #19

Conversation

moharnab123saikia commented Oct 30, 2015

eefriedman commented Oct 30, 2015

metajack commented Nov 3, 2015

eefriedman commented Nov 3, 2015

metajack commented Nov 3, 2015

samruds1 commented Nov 8, 2015

eefriedman commented Nov 8, 2015

samruds1 commented Nov 9, 2015

eefriedman commented Nov 9, 2015

eefriedman commented Nov 9, 2015

vicky-katara commented Nov 9, 2015

samruds1 commented Nov 10, 2015

eefriedman commented Nov 10, 2015

mbrubeck commented Nov 10, 2015

samruds1 commented Nov 10, 2015

eefriedman commented Nov 10, 2015

moharnab123saikia commented Nov 10, 2015

eefriedman commented Nov 10, 2015

vicky-katara commented Nov 11, 2015

eefriedman commented Nov 11, 2015

samruds1 commented Nov 12, 2015

samruds1 commented Nov 13, 2015

mbrubeck commented Nov 13, 2015

samruds1 commented Nov 13, 2015

vicky-katara commented Nov 13, 2015

eefriedman commented Nov 13, 2015

mbrubeck commented Nov 13, 2015

vicky-katara commented Nov 13, 2015

mbrubeck commented Nov 13, 2015

mbrubeck commented Nov 13, 2015

mbrubeck commented Nov 13, 2015

bors-servo commented Nov 13, 2015

bors-servo commented Nov 13, 2015

bors-servo commented Nov 13, 2015

vicky-katara commented Nov 25, 2015

eefriedman commented Nov 25, 2015

samruds1 commented Dec 1, 2015

eefriedman commented Dec 1, 2015