-
Notifications
You must be signed in to change notification settings - Fork 168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[tx] Replace UFO .glif file parsing with libxml2 #1556
Conversation
f451a7d
to
0b425f0
Compare
This pull request introduces 4 alerts when merging 5aa50b1 into adb9656 - view on LGTM.com new alerts:
|
@josh-hadley @skef this is marked as a draft while I add more tests to the PR, but if you'd like to, you can start reviewing whenever available! Note 1: I'll fix the lgtm alerts soon! Note 2: I think there will be some refactoring needed in the future, perhaps not necessarily in this PR— I don't like that I had to keep adding global state tracking variables defined at the top of uforead. I would like to make this cleaner. I will still need to track which part of the UFO is being parsed, but I'll manage this better. Along with this, I'm planning on splitting the setFontDictKey function into a function for each part of the UFO, rather than have all of those keys parsed in one function. I may do all of this refactoring in a different PR, but am open to feedback on this anytime! Thanks 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor twiddles
ed50df7
to
23305ec
Compare
This pull request introduces 2 alerts when merging efa4bb7 into f0e7023 - view on LGTM.com new alerts:
|
3b7fe6f
to
c895138
Compare
@@ -271,6 +282,14 @@ struct ufoCtx_ { | |||
ctlStreamCallbacks stm; | |||
} cb; | |||
dnaCtx dna; | |||
struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a request for a change, but generally with these state structs you can rely on the struct name being part of the string. So no need to include "parsing" in the field name when the struct name is "parseState".
c/shared/source/uforead/uforead.c
Outdated
const char* filetype = "glyph"; | ||
int char_begin = 0; | ||
int char_end = 0; | ||
unsigned long *unicode = memNew(h, sizeof(unsigned long)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't see a need for allocating this as opposed to declaring unicode
as unsigned long
and passing &unicode
where needed. And that's certainly better than leaking the data, which I think is what is currently happening.
c/shared/source/uforead/uforead.c
Outdated
return cur; | ||
} | ||
|
||
static bool strEqual(char* string1, char* string2) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not declaring these string arguments as const char *
will lead to many compiler warnings.
c/shared/source/uforead/uforead.c
Outdated
if ((!xmlStrEqual((cur)->name, (const xmlChar *) "dict"))) { | ||
xmlFreeDoc(doc); | ||
fatal(h, ufoErrSrcStream, "Error reading outermost <dict> in %s.\n", filename); | ||
if (!strcmp("plist", filetype)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're doing strEqual()
should we use it here too?
c/shared/source/uforead/uforead.c
Outdated
if (h->parseState.UFOFile == preParsingGLIF) { /* when called from preParseGLIF */ | ||
if (strEqual(keyName, "com.adobe.type.cid.CID")) { | ||
keyValue = parseXMLKeyValue(h, cur); | ||
h->parseState.GLIFInfo.currentCID = atoi(keyValue); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it safe to pass NULL
to atoi()
? We advanced cur
above so it could be NULL
(?) and therefore the value passed back from parseXMLKeyValue()
could be NULL
. (Same question WRT next code section below.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I just created a test case for this and it crashes. Would you suggest I add the NULL check to right before setting h->parseState.GLIFInfo.currentCID, or should I return a value like -1 instead of NULL in parseXMLKeyValue so that it fixes this scenario for other callers?
I also realized that if the value of com.adobe.type.cid.CID is a blank value (like the example below) then com.adobe.type.cid.CID is set to 0. I feel like it also probably needs to remain -1 (which is currently what it starts with when unset).
<key>com.adobe.type.cid.CID</key>
<integer></integer>
I'll look for more places where NULL may be returned from parseXMLKeyValue and the other functions. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think being more explicit with NULL checks is probably better for this. Returning "fake" values can eventually result in those values being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Although I'd have to look in more detail to give you a conclusive answer ...)
parseType1HintDataV2(h, cur->children); | ||
} | ||
} | ||
cur = cur->next; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could cur
be NULL
here because we advanced it once already?
6da6103
to
53d3aaa
Compare
This pull request introduces 4 alerts when merging de56e35 into 3920d69 - view on LGTM.com new alerts:
|
@@ -1682,32 +1681,39 @@ static char* getXmlAttrValue(xmlAttr *attr){ | |||
return (char*) attr->children->content; | |||
} | |||
|
|||
static void parseXMLLib(ufoCtx h, xmlNodePtr cur) { | |||
static bool setXMLLib(ufoCtx h, xmlNodePtr cur, char* keyName) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You've declared this bool
, presumably to track whether you ran into some problem parsing (given the check on the call), but you're not returning anything.
c/shared/source/uforead/uforead.c
Outdated
} else if (h->parseState.UFOFile == parsingGLIF) { /* when called from parseGLIF */ | ||
if (strEqual(keyName, "com.adobe.type.autohint.v2")) { | ||
h->parseState.type1HintDataV2 = true; | ||
if (cur != NULL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If cur
is NULL, do we want to just ignore the fact there was a recognized key but no value here? Seems like that should be a warning at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, we've been aiming to reduce the amount of warnings until we add the verbosity flag option. However, I'm curious what @frankrolf thinks about this specifically. Frank, if within a UFO's .glif file's there is an empty com.adobe.type.autohint.v2
, would it be helpful to the user to receive a warning?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My experimental repo branch (not yet copied back from my machine) has already pulled all the C-code-level logging messages into a single framework with multiple "levels". So after we start working with that code it will be easier to unify the logging across all of our tools.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! So we can keep the warning messages for when we work with your changes
c/shared/source/uforead/uforead.c
Outdated
static bool setType1HintDataV2(ufoCtx h, char* keyName, HintMask* curHintMask, xmlNodePtr cur) { | ||
char* keyValue; | ||
if (strEqual(keyName, "hintSetList")) { | ||
parseHintSetListV2(h, cur); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are all of the uses of cur
in this method NULL-safe? Seems like since you parsed a key at the level above already, it should always have a value, it would be easier to check whether cur
is NULL above, deal with that condition there, and always be sure cur
is not NULL here.
@skef @josh-hadley This PR is up to date with the latest changes you both requested, and I've added more tests covering edge cases & etc. Let me know if you have any more thoughts on this! |
…ArrayValue, delete V1 parsing functions
…tDictKey & add sub-functions
b9f5b94
to
4b14310
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At some point we should nail down what the return value of stm_xml_read
represents, as it's sort of being used as a weird boolean right now, but I don't see that as a reason to hold things up.
Description
This PR replaces UFO .glif file parsing code with libxml2 #1522 as part of the tx parser replacement with libxml2 project. It replaces quite a few parsing functions as well as adds some.
I'll be adding tests soon as part of this PR.
Checklist: