Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Usernames in scientific notation are interpreted incorrectly somehow #461

Closed
eritbh opened this issue Feb 28, 2021 · 5 comments · Fixed by #462
Closed

Usernames in scientific notation are interpreted incorrectly somehow #461

eritbh opened this issue Feb 28, 2021 · 5 comments · Fixed by #462
Labels
bug something isn't working core: tbstorage
Milestone

Comments

@eritbh
Copy link
Member

eritbh commented Feb 28, 2021

https://new.reddit.com/r/toolbox/comments/ltwj4e/bug_mod_actions_dialog_does_not_work_correctly/

Been investigating this, issue seems to be caused by the TBStorage.purifyObject function called on the response of API requests. Which is... not great. Will post further investigation and (hopefully) a PR to fix this shortly.

@eritbh eritbh added bug something isn't working core: tbstorage labels Feb 28, 2021
@eritbh eritbh added this to the v5.5.3 milestone Feb 28, 2021
@eritbh
Copy link
Member Author

eritbh commented Feb 28, 2021

These lines in purifyObject seem to be responsible:

case 'string':
// Let's see if we are dealing with json.
// We want to handle json properly otherwise the purify process will mess up things.
try {
const jsonObject = JSON.parse(input[key]);
purifyObject(jsonObject);
input[key] = JSON.stringify(jsonObject);
} catch (e) {
// Not json, simply purify
input[key] = TBStorage.purify(input[key]);
}
break;

All JSON values are valid root values, not just objects/arrays. This means that when we parse all keys to determine whether or not we should purify the objects they represent as well, we can end up parsing a string into a primitive, purifying it, and re-stringifying it. Most of the time this doesn't matter, but since JSON supports scientific notation, it can't be assumed that JSON.stringify(JSON.parse(something)) === something. Specifically, the cause of this bug is that JSON.stringify(JSON.parse("1e2")) === "100".

The upshot of this is that when an object is run through the purify function, any string value of the object which appears to the JSON parser as scientific notation will be mangled. This is a very deep issue which probably impacts many areas of the extension.

There are two ways we can remediate this: Either remove the code here which tries to detect nested JSON strings in purified objects, instead requiring any module code which relies on this behavior to purify the nested objects after parsing them manually, or we add a type guard here so this rewriting only occurs if typeof jsonObject === 'object' && jsonObject != null. I'm leaning towards the former, because I don't think we have many (any?) cases where this behavior is leveraged, and it makes this function difficult to reason about. Pinged creesch on Discord for his opinion.

@creesch
Copy link
Member

creesch commented Feb 28, 2021

requiring any module code which relies on this behavior to purify the nested objects after parsing them manuall

I am not sure we actually can do this, not easily anyway. We are purifying because Mozilla requires us to do so for external data sources like API responses and such meaning we can't defer it and do it later as is based on the assumption that we don't control outside data sources.

@eritbh
Copy link
Member Author

eritbh commented Feb 28, 2021

We are purifying because Mozilla requires us to do so for external data sources like API responses and such meaning we can't defer it and do it later as is based on the assumption that we don't control outside data sources.

We should be able to do the sanitizing whenever we want, as long as we do it before the remote data is used in a jQuery call/DOM object construction.

Currently, every string value passed to purifyObject is parsed as JSON, and the resulting value is purified and re-stringified. Modules then have to re-parse the JSON in order to work with it. What I'm proposing is that we move all this handling to module code, so we don't have to parse the string twice - modules that know they need to parse a JSON string from some remotely-loaded object will be responsible for purifying the resulting value themselves, and other strings that are never interpreted as JSON aren't parsed at all, avoiding the side effect we're running into here.

I'm currently looking for situations where this would require a change - maybe seeing a concrete example of what I'm talking about might help.

@creesch
Copy link
Member

creesch commented Mar 1, 2021

On discord I already talked about this but I feel like there is a simpler approach we could take here. Effectively what the code is written for is detecting JSON objects, so we aren't interested in any other types that can result from parsing it.

With that in mind I see two possible simple solutions here.

The first one is to simply check if we actually got an object back from parsing.

case 'string':
    // Let's see if we are dealing with json.
    // We want to handle json properly otherwise the purify process will mess up things.
    try {
        const jsonObject = JSON.parse(input[key]);

        // check if we actually got an object if not, handle it as string.
        if (typeof jsonObject === 'object' && jsonObject !== null) {
            purifyObject(jsonObject);
            input[key] = JSON.stringify(jsonObject);
        } else {
            input[key] = TBStorage.purify(input[key]);
        }

    } catch (e) {
        // Not json, simply purify
        input[key] = TBStorage.purify(input[key]);
    }
    break;

The second one isn't as clean but does mean we don't parse unnecessarily. We simply check if the string starts with { and ends with }

case 'string':
    // Let's see if we are dealing with json.
    // We want to handle json properly otherwise the purify process will mess up things.
    if(input[key].trim().startsWith('{') && input[key].trim().endsWith('}'))
    try {
        const jsonObject = JSON.parse(input[key]);
            purifyObject(jsonObject);
            input[key] = JSON.stringify(jsonObject);
    } catch (e) {
        // Not json, simply purify
        input[key] = TBStorage.purify(input[key]);
    }
    break;

@creesch
Copy link
Member

creesch commented Mar 1, 2021

Just to make the information here complete

What I'm proposing is that we move all this handling to module code, so we don't have to parse the string twice - modules that know they need to parse a JSON string from some remotely-loaded object will be responsible for purifying the resulting value themselves, and other strings that are never interpreted as JSON aren't parsed at all, avoiding the side effect we're running into here.

Mozilla reviewers have a limited amount of time available and in my experience aren't diving deeply into code. Purify was introduced as a requirement from the Mozilla review proces and if I remember correctly their stance was that data should be cleaned up where it is received for two reasons:

  1. The review process, they can't possibly check all possible use cases of data to make sure we purify properly.
  2. The above is also somewhat true for us, it is easy to overlook where we use external data in HTML specifically as almost all data at some point is used in html somewhere.

With the above in mind and the fact that I know we all have very limited time on our hands I very strongly prefer my proposed solution as that effectively fixes the actual bug here without having to restructure toolbox in any way.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug something isn't working core: tbstorage
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants