Incapsula changed cookie value creation algorithm for whoscored.com #4
The code diff needs to be translated into Python and inserted into the method below, which can be found in incapsula.session.IncapSession._set_incap_cookie():

def _set_incap_cookie(self, v_array, domain=''):
    """
    Calculate the final value for the cookie needed to bypass incapsula.
    .. note:: Translated from:
        function setIncapCookie(vArray) {
            var res;
            try {
                var cookies = getSessionCookies();
                var digests = new Array(cookies.length);
                for (var i = 0; i < cookies.length; i++) {
                    digests[i] = simpleDigest((vArray) + cookies[i]);
                }
                res = vArray + ",digest=" + (digests.join());
            } catch (e) {
                res = vArray + ",digest=" + (encodeURIComponent(e.toString()));
            }
            createCookie("___utmvc", res, 20);
        }
    :param v_array: Comma delimited, urlencoded string which was returned from :func:`simple_digest`.
    :param domain: Cookie domain.
    :return:
    """
    cookies = self._get_session_cookies()
    digests = []
    for cookie_val in cookies:
        digests.append(simple_digest(v_array + cookie_val))
    # Translated code must be applied here.
    res = v_array + ',digest=' + ','.join(digests)
    logger.debug('setting ___utmvc cookie to {}'.format(res))
    self._create_cookie('___utmvc', res, 20, domain=domain)
|
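For anyone who wants to try the value assembly outside the session class, here is a minimal standalone sketch of the pre-change logic only. It is not the repo's code: `build_utmvc_value` is a hypothetical name, and `simple_digest` is a stand-in that assumes the digest is the sum of the character codes, which may differ from the real helper.

```python
def simple_digest(text):
    """Stand-in digest (assumption): sum of the character codes of `text`."""
    return sum(ord(ch) for ch in text)


def build_utmvc_value(v_array, cookie_values):
    """Mirror the pre-change setIncapCookie(): one digest per session cookie.

    `cookie_values` is a plain list of cookie value strings; the real method
    reads them from the session instead.
    """
    # Digests are converted to str so they can be joined with commas.
    digests = [str(simple_digest(v_array + value)) for value in cookie_values]
    return v_array + ',digest=' + ','.join(digests)
```

Calling `build_utmvc_value('a', ['b', 'c'])` gives `'a,digest=195,196'` under the stand-in digest. The changed algorithm discussed in this issue would replace the body of `build_utmvc_value`, not the digest loop.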
The full de-obfuscated .js file from whoscored.com can be found here. |
the variable
The most straightforward solution I can think of right now is:
|
Hi!

import re

b = ""  # placeholder: the hex-encoded script payload goes here
char_list = []
# Code equivalent to:
# for (var i = 0; i < b.length; i += 2) {
#     z = z + parseInt(b.substring(i, i + 2), 16) + ",";
# }
# z = z.substring(0, z.length - 1);
# eval(eval('String.fromCharCode(' + z + ')'));
for i in range(0, len(b), 2):
    char_list.append(int(b[i:i+2], base=16))
code = ""
for char in char_list:
    code = code + chr(char)
# Regex to match the value of the sl variable
sl_var = re.search('sl = "(.+)";', code).group(1)
dd = ""  # placeholder: left empty in the original post; must be non-empty to run
asl = ""
# Code equivalent to:
# for (var i = 0; i < sl.length; i++) {
#     asl += (sl.charCodeAt(i) + dd.charCodeAt(i % dd.length)).toString(16);
# }
for i in range(len(sl_var)):
    asl = asl + format(ord(sl_var[i]) + ord(dd[i % len(dd)]), 'x')

Also, I'm not sure whether matching the Incapsula_Resource URL in the script tag of the original page with a regex is a good idea. Something like: re.search("(/_Incapsula_Resource.+)'", response.text).group(1) |
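The steps in the comment above can also be wrapped into small functions so they are easier to test with made-up input. This is only a sketch: the function names are hypothetical, the regexes are non-greedy variants of the ones in the comment (a choice made here, not something confirmed by the thread), and real values for the payload and `dd` still have to come from the page.

```python
import re


def hex_decode(payload):
    """Turn a string of two-digit hex codes into text, like the JS loop over `b`."""
    return ''.join(chr(int(payload[i:i + 2], 16)) for i in range(0, len(payload), 2))


def extract_sl(code):
    """Return the value assigned to `sl` in the decoded script, or None."""
    match = re.search(r'sl = "(.+?)";', code)
    return match.group(1) if match else None


def compute_asl(sl, dd):
    """Add each char code of `sl` to the cycling char codes of `dd`, as hex."""
    if not dd:
        raise ValueError('dd must not be empty')
    return ''.join(
        format(ord(ch) + ord(dd[i % len(dd)]), 'x') for i, ch in enumerate(sl)
    )


def find_resource_path(html):
    """Hypothetical helper: first /_Incapsula_Resource path in the page, or None."""
    match = re.search(r"(/_Incapsula_Resource[^'\"]*)", html)
    return match.group(1) if match else None
```

For example, `compute_asl('ab', 'k')` returns `'cccd'` (97 + 107 = 0xcc, 98 + 107 = 0xcd). Returning `None` instead of calling `.group(1)` directly avoids an `AttributeError` when the page layout changes and the pattern no longer matches.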
@hades1996 - That logic will work. I ended up with a working solution, but not in Python. Used this repo to get most of the logic :) |
Thanks @hades1996, I updated the script using your code. I hope that @ziplokk1 accepts my PR. |
Sorry, I have been a bit busy lately. I will try to review the changes this weekend and create and |
Just to add to @LuisUrrutia answer, I had to do something different in in
So the function is now:
Hope it helps :) EDIT: First in
I've added the
Then I've modified
I've added the I hope this helps someone! |
@andresarslanian Your solution doesn't seem to work. I got blocked by a reCAPTCHA error. BTW, I've changed
to
@ziplokk1 any updates here? Does it work for you now? |
Also, as I can see, we now have a new problem: Incapsula says 'Request unsuccessful. Incapsula incident ID: 108002140047883972-116655804302232934' - this is something new |
Another few hours spent on research led me to understand that it is almost impossible to deobfuscate whoscored. Instead of some meaningful JS I'm just getting the following (not full code):
It seems to be executable javascript, however there is no |
@lemm-leto I tried to investigate how Incapsula does client-side verification a few weeks ago, but I couldn't find anything useful. You're indeed right: it seems Incapsula's client challenge is now obfuscated, and because of that it is very hard to understand. I did some research, and the obfuscated scripts users receive from Incapsula look like the ones generated by this tool: https://javascriptobfuscator.herokuapp.com/ If Incapsula is obfuscating their code using random seeds and adding debug protection, we are pretty much scrubbed |
I was able to decode it to the following format (apparently variable and function names are non-recoverable): (function(_0x171531, _0x41e00e) { ... }(_0xfb6b, 0xb8)); But the main question is why it cannot be compiled without errors. I tried many times, but the compiler says variable _0xfb6b is not assigned. |
I ended up getting pretty far into reading the obfuscated code and used this project for months until about 1 year ago. I was going the way you went - learned a lot about minification and could work through many of the constructs following that link for the javascript obfuscator above. Came to the realization that any change to the challenges would break my scraper and some sites had the higher level of incapsula protection. I got it working with chrome headless and taking over the cookies from a session into my scraper. I switched over many months ago with no issues so far. |
@brianzinn, so you use Selenium to get past Incapsula? Once you have the valid cookies, you use them in your regular requests until they expire, after which you pick up new ones with Selenium? It would be interesting to hear how you guys built a robust solution around this problem. Any recent progress (@ziplokk1, @Nyadesune, @LuisUrrutia)? |
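The cookie handoff described in the last two comments can be sketched with the standard library alone. This is not anyone's actual implementation from the thread. It assumes each cookie is a dict with `name`, `value` and an optional `expiry` in epoch seconds, which is the shape Selenium's `driver.get_cookies()` is expected to return; all function names are hypothetical.

```python
import time


def live_cookies(cookies, now=None):
    """Keep cookies with no 'expiry' (session cookies) or one still in the future."""
    now = time.time() if now is None else now
    return [c for c in cookies if c.get('expiry') is None or c['expiry'] > now]


def cookie_header(cookies):
    """Build a Cookie request-header value from the name/value pairs."""
    return '; '.join('{}={}'.format(c['name'], c['value']) for c in cookies)


def needs_refresh(cookies, now=None):
    """True when nothing usable is left and the browser should fetch new cookies."""
    return not live_cookies(cookies, now)
```

A scraper would call `needs_refresh` before each batch of requests, launch the headless browser only when it returns `True`, and otherwise send `cookie_header(live_cookies(cookies))` as the `Cookie` header. Whether Incapsula also ties the cookies to the browser's User-Agent or IP is not covered by this sketch.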
Original Javascript method looked like this:
Now they have changed it to this:
Here is the relevant diff: