Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[makeotfexe] Increase code coverage #1008

Merged
merged 15 commits into from
Oct 31, 2019
Merged

Conversation

khaledhosny
Copy link
Collaborator

@khaledhosny khaledhosny commented Oct 18, 2019

These tests increase code coverage of feat.c, featgram.c, and featscan.c to nearly 80% (these three files, I believe, is what parses feature files and handles their syntax). with most of the remaining uncovered code being unreachable code (will add more tests once #1006 is fixed).

Opening this to get early review since I have been working on it for a few weeks already.

Fixes #781.

@cjchapman
Copy link
Contributor

@khaledhosny You've got some tests failing in CI. Let me know if you need help fixing them, and if not, then let me know when you're done fixing them.

@khaledhosny
Copy link
Collaborator Author

4 of the failing issues are testing the code below (and a similar block in the same function):

if (m.ulval > ((unsigned long)0xFFFFFFFF))
zzerr("not in range 0 .. ((1<<32) -1)");

But for some reason this check is not working on Windows and out of boundary values are accepted, I wonder if the size of unsigned long on Windows is to blame here.

@cjchapman
Copy link
Contributor

@khaledhosny I've been changing other code in the AFDKO to use uint32_t instead of unsigned long to deal with issues like that.

@cjchapman
Copy link
Contributor

@khaledhosny In general, I've been trying to move the code over to using explicit bit sizes via:

uint8_t
int8_t
uint16_t
int16_t
uint32_t
int32_t
uint64_t
int64_t

@khaledhosny
Copy link
Collaborator Author

@khaledhosny I've been changing other code in the AFDKO to use uint32_t instead of unsigned long to deal with issues like that.

That is the right thing to do. In this case I think we need 64bit integers though since the code is storing values that don’t find in 32bit and we don’t want unsigned 32bit to silently warp around, so I used int64_t and uint64_t. Lets see if it fixes the issue.

@khaledhosny
Copy link
Collaborator Author

Hmm, this fixed one test but not the other 3 which does not make sense. I think I’ll need to debug this on a Windows machine. @cjchapman I appreciate if you can help with this, since I haven’t done AFDKO development on Windows before and not even sure I will be able to have a working setup.

@cjchapman
Copy link
Contributor

@khaledhosny OK, I have a Windows VM I can use for debugging. I won't be able to get it until Monday, at the earliest.

@khaledhosny
Copy link
Collaborator Author

I spent 2 hours trying to get AFDKO to build on Windows with no luck. I might give it another try tomorrow.

@cjchapman
Copy link
Contributor

OK.

The code is supposed to fix dflt script into DFLT, but it actually kept
it dflt which would then lead to empty script in the table output.
These lines will never be reached since featMsg(hotERROR, ...) will
longjmp out of the function.
The lookup is not defined here not already define.
The tests with `.bad` suffix are files with errors and should fail to be
processed, the other files are good files and have expected output to
compare against. It is assumed that the current makeotfexe output is
correct.
Use explicit int64_t and uint64_t instead of platform-dependent long and
unsigned long so that code that stores values needing more than 32bit
works on Windows as well.
@khaledhosny
Copy link
Collaborator Author

I managed to get Windows built setup, ran the tests and, surprisingly enough, none failed!

@cjchapman
Copy link
Contributor

@khaledhosny I wish that were surprising for me. We've seen that before with other tools that had memory bugs, especially uninitialized memory bugs. Try running the test with address sanitizer (ASAN) on Mac or Linux (or Valgrind on Linux, or I think there's some Visual Studio equivalent for Windows), that might help. If not, I can dig into it further next week (Monday at the earliest).

@@ -7,5 +7,5 @@ table OS/2 {
67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87
88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106
107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122
123 124 125 126 127 128;
123 124 125 126 127;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is supposed to check that extra bits are gracefully ignored.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, it was triggering a buffer overflow. When I fixed the buffer overflow, it was complaining that the ; was not found.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I didn't see "bad" in the name, I incorrectly assumed that you meant it to be "good". 😀

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So... the extra bits are not gracefully ignored before or after my buffer overflow check. I'm already not crazy about the fact that I hand-edited what's supposed to be ANTLR-generated code, but I wanted to prevent the memory bug from occurring.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it to you how you want to handle this from the test side. At least now we know why one of the four CI failures was occurring.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to be clear: I'm fine with debugging issues in any part of the code we plan to keep (i.e. the human-written parts of the code). I just don't want to spend too much time debugging issues in the ANTLR-generated parts of the code that we plan to replace (i.e. with something generated by a newer tool).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problems don‘t seem to be in the generated code, the grammar file featgam.g has many C snippets and these seem to be the culprit. It is rather too late here for me to think straight though, so I‘ll check your finding later.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@khaledhosny OK. I dug into this a bit further. I'm seeing the value set to 0x000000007fffffff by the strtol() call in line 1050 below:

void zzcr_attr(Attrib *attr, int type, char *text) {
if (type == T_NUM) {
attr->lval = strtol(text, NULL, 10);
} else if (type == T_NUMEXT) {

Perhaps you could replace the strtol with strtoll, if strtoll is portable.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I went ahead and replaced strtol with strtoll in that zzcr_attr function. That seems to have fixed the remaining three Windows CI issues.

I noticed some other strtol in that file that I did not change, so if you run into any more cases of a value getting capped at 0x000000007fffffff on Windows, then that's a good place to start looking.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@khaledhosny I just pushed a better fix for 9f-3.fea in which the code just ignores the extra bits in the UnicodeRange. Note that I hand-edited both featgram.g and featgram.c as I don't have antlr built on my system, so please check that you can still compile featgram.g to featgram.c.

@cjchapman
Copy link
Contributor

Apparently no one has built antlr in the past two years. 🤦‍♂

@cjchapman
Copy link
Contributor

I tried compiling featgram.g to featgram.c with antlr and then cleaning it up with clang-format. Miraculously, I get the same code with just minor line break & white space differences.

@khaledhosny
Copy link
Collaborator Author

khaledhosny commented Oct 22, 2019

Apparently no one has built antlr in the past two years. 🤦‍♂

I did two days earlier, it works pretty well actually (the minor edits required are even described in c/makeotf/makeotf_lib/build/hotpccts/readme.txt and there is even a makefile.

Use the makefile under c/makeotf/makeotf_lib/build/hotpccts/, keeping
the manual edits to fix cpplint issues.
@cjchapman
Copy link
Contributor

Apparently no one has built antlr in the past two years. 🤦‍♂

I did to days earlier, it works pretty well actually (the minor edits required are even described in c/makeotf/makeotf_lib/build/hotpccts/readme.txt and there is even a makefile.

Did you just not check in the fix to err.h then? I couldn't build it until I fixed that.

@khaledhosny
Copy link
Collaborator Author

Apparently no one has built antlr in the past two years. 🤦‍♂

I did to days earlier, it works pretty well actually (the minor edits required are even described in c/makeotf/makeotf_lib/build/hotpccts/readme.txt and there is even a makefile.

Did you just not check in the fix to err.h then? I couldn't build it until I fixed that.

I had to fix some file to get it to compile indeed, but forgot to commit it, and it seems to be the same fix you committed.

@miguelsousa
Copy link
Member

@cjchapman @khaledhosny I've restarted Azure but the macOS job is still failing the same way. Comparing with the last commit that passed, I see that the wheel's tags of yesterday's job are macosx-10.13-x86_64 whereas today's are macosx-10.14-x86_64.

Our Azure config specifies 10.13 and the job seems to use a 10.13 image indeed, so I don't quite understand why the wheel ends up referencing 10.14.

@josh-hadley
Copy link
Collaborator

josh-hadley commented Oct 23, 2019

Regarding the Azure build failures, looking at the Mac "Build wheel, run tests, and uninstall" step that is failing, one difference that I see is that the new (failing) ones use runner == 2.159.2 whereas the older ones use 2.158.0 (compare ~line 16 of the build logs). Maybe that's something to do with the failure (environment variables)? Not sure if that can be configured anywhere in the Azure scheme.

@miguelsousa
Copy link
Member

I'd say let's go ahead and merge it. The Azure CI is just a backup anyway. Perhaps they broke it in the process of making a 10.15 (Catalina) image available.

@khaledhosny khaledhosny marked this pull request as ready for review October 23, 2019 20:35
@khaledhosny
Copy link
Collaborator Author

I still want to add a few more tests. but we can merge this now and open a new PR later.

@miguelsousa
Copy link
Member

@khaledhosny no, keep going. I thought this was done.

@khaledhosny
Copy link
Collaborator Author

I think I’m done, no more tests to add.

@cjchapman cjchapman requested review from readroberts and removed request for cjchapman October 26, 2019 01:29
@cjchapman
Copy link
Contributor

@readroberts Would you please review this PR? You know this code much better than I do. Also, I made some of the changes and suggested some of the others, so I think I should recuse myself from reviewing for those reasons as well.

Copy link
Contributor

@readroberts readroberts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have reviewed all the code changes. Looks good to me.

@cjchapman cjchapman merged commit df3768e into develop Oct 31, 2019
@cjchapman cjchapman deleted the khaled-makeotfexe-fea-spec branch October 31, 2019 17:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add tests that cover all FEA spec snippets
5 participants