This repository has been archived by the owner on Nov 15, 2017. It is now read-only.

Necessary script in frame not shown in matrix #215

Closed
ghost opened this issue Mar 24, 2014 · 27 comments

Comments

@ghost

ghost commented Mar 24, 2014

http://www.spiegel.de/politik/deutschland/pressekompass-zum-afd-parteitag-bernd-lucke-unter-druck-a-960345.html

This site contains a frame from compass.pressekompass.net. If I whitelist the frame cell, nothing is shown and the related script cell remains empty. Once I whitelist the (empty) script cell for compass.pressekompass.net and reload the site, that cell now shows 3 scripts not shown before. A new line for kompassdemo.appspot.com appears and after whitelisting the script cell for the latter, the site is finally readable.

@gorhill
Owner

gorhill commented Mar 24, 2014

I did whitelist frame for compass.pressekompass.net, and got this as a result (4 script objects shown for compass.pressekompass.net):

[Screenshot: httpsb_issue-215]

The script cell for compass.pressekompass.net was shown empty for you?

@ghost
Author

ghost commented Mar 24, 2014

The script cell for compass.pressekompass.net was shown empty for you?

Yes, it was. I was able to reproduce it just a minute ago. Strange that
it wasn't for you ...

@gorhill
Owner

gorhill commented Mar 24, 2014

I tried on Chromium 33. I will have to try on something closer to what you use. Is it Chrome? Version, etc. Is smart-reload enabled? Any other add-ons? When the bug occurs, can you see the requests for scripts in the Statistics tab?

@ghost
Author

ghost commented Mar 25, 2014

I'm on Chrome 33.0.1750.152 under Kubuntu 13.10. Smart-reload is enabled. I tried it also with all other add-ons disabled and Chrome restarted - the script cell for compass.pressekompass.net remains empty even after whitelisting the frame cell.

In the statistics tab I can see some allowed scripts like:

http://www.spiegel.de/politik/deutschland/pressekompass-zum-afd-parteitag-bernd-lucke-unter-druck-a-960345.html{inline_script}
http://www.spiegel.de/politik/deutschland/pressekompass-zum-afd-parteitag-bernd-lucke-unter-druck-a-960345.html{3rd-party_scripts}

but no blocked scripts referring to pressekompass.net.

EDIT: Adblock complex rules are enabled, and I'm using most block lists. ABP rules only block some png's and gif's on that site, though.

@gorhill
Owner

gorhill commented Mar 25, 2014

Ok, I still can't reproduce using Chrome. There is a hint in there though.

You see only 3 scripts reported eventually, while I see 4 reported right away. The proper number is actually 4, 3 external javascript files + 1 for the inline javascript. The only way that inline javascript is counted is through the injection of a content script, which looks for <script> tags. So it appears that HTTPSB's content script is not being executed on your side. Now I have to find out why.
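
To make the count of 4 concrete, here is a minimal sketch of the tallying idea, written as a standalone function so it can run outside a browser. The `countScriptSources` name, the plain node objects (stand-ins for real `<script>` elements), and the example.com file names are all hypothetical; this is not HTTPSB's actual code.

```javascript
// Hypothetical sketch: tally script sources the way a content script might.
// Each "node" is a plain object standing in for a <script> element.
var countScriptSources = function(nodes) {
    var sources = {}; // keyed by source, to avoid duplicates
    nodes.forEach(function(node) {
        var text = (node.textContent || '').trim();
        if ( text !== '' ) {
            // All inline scripts collapse into a single entry
            sources['{inline_script}'] = true;
        }
        var src = (node.src || '').trim();
        if ( src !== '' ) {
            sources[src] = true;
        }
    });
    return Object.keys(sources).length;
};

// Made-up example: three external files plus two inline blocks
var exampleNodes = [
    { src: 'http://example.com/a.js', textContent: '' },
    { src: 'http://example.com/b.js', textContent: '' },
    { src: 'http://example.com/c.js', textContent: '' },
    { src: '', textContent: 'var x = 1;' },
    { src: '', textContent: 'var y = 2;' }
];
var withInline = countScriptSources(exampleNodes);
var withoutInline = countScriptSources(exampleNodes.slice(0, 3));
```

With the inline blocks included the tally is 4; with only the external files it is 3, which is the same gap as between the two reports.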

An extension's content script is executed in the context of the page, so if you bring up the console for that page, you should be able to see any error output from the content script. Such errors can be identified by the source file in which they occurred: if the content script failed, an error in the file [extension id]/js/contentscript.js should be reported. That's what I would look at first if I could reproduce it.

@gorhill
Owner

gorhill commented Mar 25, 2014

Looking at the code, I think I see a mistake. If so, such a bug would show up depending on how fast a page loads (which depends on the computer, network, etc.)... At line #217 of contentscript.js, I suspect the code should be

if ( document.readyState === 'complete' ) {

instead of

if ( document.readyState === 'interactive' ) {

Since I can't reproduce the bug, would you mind trying the above change to see if it resolves the problem?

EDIT: Looking more into this I doubt there is a bug in the code above, but I guess it should be ruled out by trying the change just to be sure.
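
To illustrate the timing concern, here is a sketch of a readiness check that does not depend on catching the `load` event. `runWhenReady` and `makeFakeDocument` are hypothetical names used for illustration only; `document.readyState` (with values `loading`, `interactive`, `complete`) and the `DOMContentLoaded` event are standard. A handler registered for `load` after `load` has already fired never runs, so the sketch defers only while the document is still `loading`.

```javascript
// Hypothetical helper: run `handler` once the DOM is parsed.
// `doc` is any object exposing readyState and addEventListener.
var runWhenReady = function(doc, handler) {
    if ( doc.readyState === 'loading' ) {
        // DOM not parsed yet: wait for DOMContentLoaded
        doc.addEventListener('DOMContentLoaded', handler);
        return false; // deferred
    }
    // 'interactive' or 'complete': the DOM is already there
    handler();
    return true; // ran immediately
};

// Minimal fake document for trying this outside a browser
var makeFakeDocument = function(readyState) {
    var listeners = {};
    return {
        readyState: readyState,
        addEventListener: function(type, fn) { listeners[type] = fn; },
        fire: function(type) { if ( listeners[type] ) { listeners[type](); } }
    };
};

var calls = [];
var ranNowInteractive = runWhenReady(makeFakeDocument('interactive'), function() {
    calls.push('interactive');
});
var ranNowComplete = runWhenReady(makeFakeDocument('complete'), function() {
    calls.push('complete');
});
var loadingDoc = makeFakeDocument('loading');
var ranNowLoading = runWhenReady(loadingDoc, function() {
    calls.push('loading');
});
loadingDoc.fire('DOMContentLoaded'); // simulate the browser finishing the parse
```

In a real content script the call would presumably be `runWhenReady(document, loadHandler)`.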

@ghost
Author

ghost commented Mar 25, 2014

Unfortunately the change in line #217 didn't help.

In the console I found this:
Blocked script execution in 'http://compass.pressekompass.net/compasses/spiegel/die_partei_das_bin_ich' because the document's frame is sandboxed and the 'allow-scripts' permission is not set. content.js:48
c.sendMessage content.js:48
L.run content.js:218
L.runHlp content.js:213
M content.js:257
messageListener extensions::messaging:343
Event.dispatchToListener extensions::event_bindings:394
Event.dispatch_ extensions::event_bindings:378
Event.dispatch extensions::event_bindings:400
dispatchOnMessage extensions::messaging:307

And there is an entry that says:

Uncaught SecurityError: Blocked a frame with origin "http://www.spiegel.de" from accessing a frame with origin "null". The frame requesting access has a protocol of "http", the frame being accessed has a protocol of "data". Protocols must match.
prevel.js:70

(Sorry, I was not able to upload an image: "Something went really wrong, and we can't process that image." I don't know why.)

@gorhill
Owner

gorhill commented Mar 25, 2014

Blocked script execution in 'http://compass.pressekompass.net/compasses/spiegel/die_partei_das_bin_ich' because the document's frame is sandboxed and the 'allow-scripts' permission is not set. content.js:48

content.js is not HTTPSB, and not from the server from what I can see. It's probably from another extension which injected javascript in the frame through the DOM, not through extension content scripting. I don't see how this would affect HTTPSB's own js code anyways, unless that injected js removed <script> tags, which would prevent HTTPSB from finding them (but then, according to the error report, it was blocked, so it can't do anything...) Do you have an extension which injects javascript directly in the DOM?

I'm really puzzled at this point. There seems to be something specific in your environment which prevents HTTPSB's own content script from running properly, or which lets it run properly but with the <script> tags already removed by the time it runs.

@ghost
Author

ghost commented Mar 25, 2014

The only one I can think of is Tampermonkey. But as mentioned earlier, I had disabled all other add-ons (and I've done it for Tampermonkey again) but the problem persists.

BTW, there is one entry that says:

event.returnValue is deprecated. Please use the standard event.preventDefault() instead. javascript-V5-0-3.js:26

Don't know if that's somehow relevant.

@gorhill
Owner

gorhill commented Mar 26, 2014

I installed Tampermonkey, and I confirm the above reference to content.js is from Tampermonkey. I don't know much about Tampermonkey, but my understanding is that it tries to add a script to the page. Is it possible for me to obtain the script it tries to add on your side?

@ghost
Author

ghost commented Mar 26, 2014

I installed Tampermonkey, and I confirm the above reference to
content.js is from Tampermonkey. I don't know much about Tampermonkey,
but my understanding is that it tries to add a script to the page. Is it
possible for me to obtain the script it tries to add on your side?

Sure, it's this one: http://userscripts.org/scripts/show/158054

However, as mentioned before I had disabled all extensions (except
HTTPSB), cleared the cache and restarted Chrome but still had this
problem. Thus, I don't think that's the culprit.

@gorhill
Owner

gorhill commented Mar 26, 2014

Do you have "Predict network actions to improve page load performance" enabled in Settings? There is one case where the content script handler would fail to account for script tags on the page: if the tab id is negative, in which case HTTPSB wouldn't know where the found script tags are to be recorded. I believe a tab id of -1 is possible if a page is preemptively fetched. I'm speculating here about how this works internally in the browser.
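
As an illustration of the case described above, a background-side handler could refuse to record a summary when there is no usable tab id. This is a sketch under assumptions: `recordScriptSummary` and `pageStats` are hypothetical names, not HTTPSB's code. In Chrome's messaging API the `sender.tab` property is only present when a message comes from a tab, so the sketch treats a missing tab the same way as a negative id.

```javascript
// Hypothetical sketch: only record a summary when the sender has a real tab.
var recordScriptSummary = function(pageStats, sender, summary) {
    var tabId = sender && sender.tab ? sender.tab.id : -1;
    if ( typeof tabId !== 'number' || tabId < 0 ) {
        // No tab to attach the scripts to: nothing can be recorded
        return false;
    }
    if ( !pageStats[tabId] ) {
        pageStats[tabId] = {};
    }
    Object.keys(summary.scriptSources).forEach(function(src) {
        pageStats[tabId][src] = true;
    });
    return true;
};

var pageStats = {};
var summary = { scriptSources: { '{inline_script}': true } };
var recorded = recordScriptSummary(pageStats, { tab: { id: 12 } }, summary);
var skipped = recordScriptSummary(pageStats, { tab: { id: -1 } }, summary);
var noTab = recordScriptSummary(pageStats, {}, summary);
```

Returning a boolean makes the dropped case visible to the caller, which could then log it instead of silently losing the script tags.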

@ghost
Author

ghost commented Mar 26, 2014

Do you have "Predict network actions to improve page load performance"
in Settings?

Yes, it's enabled here. However, after disabling it the problem remains.

@gorhill
Owner

gorhill commented Mar 26, 2014

Since I can't reproduce it, the only way I can think of to investigate is to provide you with a debug version of HTTPSB's files contentscript.js and contentscripthandler.js that logs the various steps to the console. This would allow me to narrow down where in the code something doesn't go as expected.

@gorhill
Owner

gorhill commented Mar 26, 2014

EDIT: Never mind. I slightly re-wrote the code in there which results in the exact same behavior you are describing. I will investigate further.

@gorhill
Owner

gorhill commented Mar 26, 2014

Alright, I believe this could fix the issue, but since you can reliably reproduce it, only you at this point can validate this fix, if you don't mind. Replacing all this at line #217 of contentscript.js:

if ( document.readyState === 'interactive' ) {
    loadHandler();
} else {
    window.addEventListener('load', loadHandler);
}

with just

loadHandler();

I believe should fix the problem. In effect, HTTPSB's content script wasn't being executed in your case.

@ghost
Author

ghost commented Mar 27, 2014

Alright, I believe this could fix the issue, but since you can reliably
reproduce it, only you at this point can validate this fix, if you don't
mind. Replacing all this at line #217 of contentscript.js
https://github.com/gorhill/httpswitchboard/blob/master/js/contentscript.js#L217:

if ( document.readyState === 'interactive' ) {
    loadHandler();
} else {
    window.addEventListener('load', loadHandler);
}

with just

loadHandler();

I believe should fix the problem. In effect, HTTPSB's content script
wasn't being executed in your case.

Raymond, I edited that file as you said but, unfortunately, the problem
still exists.

@gorhill
Owner

gorhill commented Mar 27, 2014

I don't know why the content script does not get executed. The change I proposed should have made HTTPSB's content script execute unconditionally. I installed Debian-based Chrome itself here, and I am still unable to reproduce. Something is preventing the content script from running, and I just can't find out what without first reproducing the bug, so that I can walk through the code. Did you try on another computer?

@ghost
Author

ghost commented Mar 27, 2014

Did you try on another computer?

I just tried Chrome in Manjaro Linux running in Virtualbox. Same problem :-(

I tried a second time after editing contentscript.js as suggested by you
but that seems to have negative side-effects: Now even the script cell
for www.spiegel.de remains empty ...

Raymond, I don't want to keep you from more important work. The problem
seems to be very specific for me. I will try to see if it also occurs on
other computers and report accordingly.

@gorhill
Owner

gorhill commented Mar 27, 2014

Could you confirm that the property run_at is set to document_end on your installation in the manifest file?

No reason why it should be otherwise, but I am at a loss to understand what is happening.
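
For reference, one quick way to check this from a parsed manifest.json is sketched below. The `listRunAt` helper and the sample manifest object are hypothetical and made up for illustration (they are not a copy of HTTPSB's manifest); `content_scripts`, `run_at`, and the values `document_start`, `document_end`, and `document_idle` (the default when `run_at` is omitted) are standard manifest fields.

```javascript
// Hypothetical helper: report the effective run_at of each content script entry.
var listRunAt = function(manifest) {
    return (manifest.content_scripts || []).map(function(entry) {
        return {
            js: entry.js || [],
            // Chrome falls back to document_idle when run_at is not given
            runAt: entry.run_at || 'document_idle'
        };
    });
};

// Made-up manifest fragment, for illustration only
var sampleManifest = {
    content_scripts: [
        {
            matches: ['http://*/*', 'https://*/*'],
            js: ['js/contentscript.js'],
            run_at: 'document_end'
        },
        {
            matches: ['https://example.com/*'],
            js: ['js/other.js']
        }
    ]
};
var runAtReport = listRunAt(sampleManifest);
```

The first entry reports `document_end` and the second falls back to `document_idle`.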

Raymond, I don't want to keep you from more important work. The problem
seems to be very specific for me.

It's unlikely that I will be able to stop worrying about this problem. This is pretty serious, as the bug here causes the user to be misinformed, and hence unable to take proper action. It's more likely that other people are also suffering from the same problem without realizing that information is missing.

@ghost
Author

ghost commented Mar 27, 2014

Could you confirm that the property run_at is set to document_end on
your installation in the manifest file
https://github.com/gorhill/httpswitchboard/blob/master/manifest.json#L22?

Yes, it is.

@ghost
Author

ghost commented Mar 27, 2014

I tried a second time after editing contentscript.js as suggested by you
but that seems to have negative side-effects: Now even the script cell
for www.spiegel.de remains empty ...

BTW, I've noticed this side-effect also on other sites.

@gorhill
Owner

gorhill commented Mar 27, 2014

I tried a second time after editing contentscript.js as suggested by you but that seems to have negative side-effects: Now even the script cell for www.spiegel.de remains empty ...

Now this is just plain weird. The API doc clearly says that when a content script is injected at document_end, the DOM is all in there, except that sub-resources might not have been loaded yet. So all <script> tags should be there and easy to account for, even though the scripts themselves are not loaded (HTTPSB just cares about the src property of the <script> tag, not the script itself).

Would you mind if I send you a custom version of contentscript.js to replace yours, and which has debug log code in it? The output would help me narrow the failure point. This is what I get on my side (in the console of the page itself, not the extension):

[Screenshot: httpsb_issue-215-b]

That would be the content of contentscript.js, with the added console.debug code:

/*******************************************************************************

    httpswitchboard - a Chromium browser extension to black/white list requests.
    Copyright (C) 2013  Raymond Hill

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU General Public License for more details.

    You should have received a copy of the GNU General Public License
    along with this program.  If not, see {http://www.gnu.org/licenses/}.

    Home: https://github.com/gorhill/httpswitchboard
*/

// Injected into content pages

/******************************************************************************/
/*------------[ Unrendered Noscript (because CSP) Workaround ]----------------*/

var fixNoscriptTags = function() {
    var a = document.querySelectorAll('noscript');
    var i = a.length;
    var realNoscript,
        fakeNoscript;
    while ( i-- ) {
        realNoscript = a[i];
        fakeNoscript = document.createElement('div');
        fakeNoscript.innerHTML = '<!-- HTTP Switchboard NOSCRIPT tag replacement: see <https://github.com/gorhill/httpswitchboard/issues/177> -->\n' + realNoscript.textContent;
        realNoscript.parentNode.replaceChild(fakeNoscript, realNoscript);
    }
};

var checkScriptBlacklistedHandler = function(response) {
    if ( response.scriptBlacklisted ) {
        fixNoscriptTags();
    }
};

var checkScriptBlacklisted = function() {
    chrome.runtime.sendMessage({
        what: 'checkScriptBlacklisted',
        url: window.location.href
    }, checkScriptBlacklistedHandler);
};

/******************************************************************************/

var localStorageHandler = function(mustRemove) {
    if ( mustRemove ) {
        window.localStorage.clear();
        // console.debug('HTTP Switchboard > found and removed non-empty localStorage');
    }
};

/******************************************************************************/

var nodesAddedHandler = function(nodeList, summary) {
    var i = 0;
    var node, src, text;
    while ( node = nodeList.item(i++) ) {
        if ( !node.tagName ) {
            continue;
        }

        switch ( node.tagName.toUpperCase() ) {

        case 'SCRIPT':
            text = node.textContent.trim();
            if ( text !== '' ) {
                summary.scriptSources['{inline_script}'] = true;
                summary.mustReport = true;
            }
            src = (node.src || '').trim();
            if ( src !== '' ) {
                summary.scriptSources[src] = true;
                summary.mustReport = true;
            }
            break;

        case 'A':
            if ( node.href.indexOf('javascript:') === 0 ) {
                summary.scriptSources['{inline_script}'] = true;
                summary.mustReport = true;
            }
            break;

        case 'OBJECT':
            src = (node.data || '').trim();
            if ( src !== '' ) {
                summary.pluginSources[src] = true;
                summary.mustReport = true;
            }
            break;

        case 'EMBED':
            src = (node.src || '').trim();
            if ( src !== '' ) {
                summary.pluginSources[src] = true;
                summary.mustReport = true;
            }
            break;
        }
    }
};

/******************************************************************************/

var mutationObservedHandler = function(mutations) {
    var summary = {
        what: 'contentScriptSummary',
        locationURL: window.location.href,
        scriptSources: {}, // to avoid duplicates
        pluginSources: {}, // to avoid duplicates
        mustReport: false
    };
    var iMutation = mutations.length;
    var mutation;
    while ( iMutation-- ) {
        mutation = mutations[iMutation];
        if ( !mutation.addedNodes || !mutation.addedNodes.length ) {
            // TODO: attr changes also must be dealt with, but then, how
            // likely is it...
            continue;
        }
        nodesAddedHandler(mutation.addedNodes, summary);
    }

    if ( summary.mustReport ) {
        chrome.runtime.sendMessage(summary);
    }
};

/******************************************************************************/

var firstObservationHandler = function() {
    var summary = {
        what: 'contentScriptSummary',
        locationURL: window.location.href,
        scriptSources: {}, // to avoid duplicates
        pluginSources: {}, // to avoid duplicates
        localStorage: false,
        indexedDB: false,
        mustReport: true
    };
    // https://github.com/gorhill/httpswitchboard/issues/25
    // &
    // Looks for inline javascript also in at least one a[href] element.
    // https://github.com/gorhill/httpswitchboard/issues/131
    nodesAddedHandler(document.querySelectorAll('script, a[href^="javascript:"], object, embed'), summary);

    // Check with extension whether local storage must be emptied
    if ( window.localStorage && window.localStorage.length ) {
        summary.localStorage = true;
        chrome.runtime.sendMessage({
            what: 'contentScriptHasLocalStorage',
            url: summary.locationURL
        }, localStorageHandler);
    }

    // TODO: indexedDB
    if ( window.indexedDB && !!window.indexedDB.webkitGetDatabaseNames ) {
        // var db = window.indexedDB.webkitGetDatabaseNames().onsuccess = function(sender) {
        //    console.debug('webkitGetDatabaseNames(): result=%o', sender.target.result);
        // };
    }

    // TODO: Web SQL
    if ( window.openDatabase ) {
        // Sad:
        // "There is no way to enumerate or delete the databases available for an origin from this API."
        // Ref.: http://www.w3.org/TR/webdatabase/#databases
    }

    console.debug('HTTPSB> firstObservationHandler(): found %d script tags', Object.keys(summary.scriptSources).length);

    chrome.runtime.sendMessage(summary);
};

/******************************************************************************/

var loadHandler = function() {
    // Checking to see if script is blacklisted
    // Not sure if this is right place to check. I don't know if subframes with
    // <noscript> tags will be fixed.
    checkScriptBlacklisted();

    firstObservationHandler();

    // Observe changes in the DOM
    // https://github.com/gorhill/httpswitchboard/issues/176
    var observer = new MutationObserver(mutationObservedHandler);
    observer.observe(document.body, {
        attributes: false,
        childList: true,
        characterData: false,
        subtree: true
    });
};

/******************************************************************************/

// rhill 2013-11-09: Weird... This code is executed from HTTP Switchboard
// context first time extension is launched. Avoid this.
// TODO: Investigate if this was a fluke or if it can really happen.
// I suspect this could only happen when I was using chrome.tabs.executeScript(),
// because now a declarative content script is used, along with "http{s}" URL
// pattern matching.
console.debug('HTTPSB> window.location.href = "%s"', window.location.href);
if ( /^https?:\/\/./.test(window.location.href) ) {
    // rhill 2014-01-26: If document is already loaded, handle all immediately,
    // otherwise defer to later when document is loaded.
    // https://github.com/gorhill/httpswitchboard/issues/168
    loadHandler();
}

@ghost
Author

ghost commented Mar 27, 2014

No problem, I'll do that. However, I'm about to leave now so I won't find the time until tomorrow evening. I'll report!

@ghost ghost closed this as completed Mar 28, 2014
@ghost ghost reopened this Mar 28, 2014
@ghost
Author

ghost commented Mar 28, 2014

I'm sorry that I'm not able to upload an image as I'm still getting the error message mentioned earlier (although everything for github is whitelisted ...).

You can see the image here: http://www.myimg.de/?img=js01b49.png

@gorhill
Owner

gorhill commented Mar 28, 2014

Thank you very much, this allowed me to figure out how to reproduce the bug. The key setting is that you have "Block third-party cookies and site data" enabled, while I don't. The fact that a 3rd party cannot access window.localStorage when the setting is enabled causes an exception to be raised, which is not handled in HTTPSB's content script code.

So this confirms extensions are also seen as 3rd parties, which means they can't access a web site's stored data if the setting is enabled, which I think is interesting information (I will go enable mine). [Correction: content script executes as if origin is same as frame origin].

Also, this means that HTTPSB can't clear a site's data (local storage) itself if the browser setting is enabled. [Correction: only when inside a frame, so the feature actually still works properly.]

I will work on a fix, for today hopefully.
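
One possible shape for such a fix is to guard the `window.localStorage` access, since merely reading the property can throw a `SecurityError` when the browser blocks site data for the frame. This is only a sketch: `hasLocalStorage` and the fake window objects are hypothetical, and it is not necessarily the fix that shipped.

```javascript
// Hypothetical guard: true only if localStorage is readable and non-empty.
var hasLocalStorage = function(win) {
    try {
        var storage = win.localStorage; // may throw when site data is blocked
        return !!storage && storage.length > 0;
    } catch (e) {
        // Access denied: treat as "no local storage" instead of aborting
        return false;
    }
};

// Fake windows for trying the guard outside a browser
var blockedWindow = {};
Object.defineProperty(blockedWindow, 'localStorage', {
    get: function() { throw new Error('SecurityError: access is denied'); }
});
var emptyWindow = { localStorage: { length: 0 } };
var usedWindow = { localStorage: { length: 2 } };

var results = [
    hasLocalStorage(blockedWindow),
    hasLocalStorage(emptyWindow),
    hasLocalStorage(usedWindow)
];
```

With a guard like this, an exception on the storage check no longer prevents the rest of the content script (including the report of the script tags found) from running.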

Regarding drag-n-drop of images on Github, I revised the preset rules, so you can either update the 3rd-party assets for the new Github preset, or wait for the next release. A rule has to be created in the chromium-behind-the-scene scope however, and I realize there is a shortcoming in the persist button in the matrix, which persists only the current scope, not other scopes. In other words, if you import the Github preset you will have to go to the chromium-behind-the-scene scope to also persist the other related s3.amazonaws.com rule.

@ghost
Author

ghost commented Mar 28, 2014

Raymond, I'm very glad that you found the "culprit". I temporarily allowed 3rd-party cookies and site data and loaded that website - and, hurray, the script cell for compass.pressekompass.net now shows 4 scripts. This confirms your findings. Thanks for your patience and persistence!
