This repository has been archived by the owner on Mar 31, 2020. It is now read-only.

Use to bypass sites which use incapsula to block access to webscraping bots.


ebates-edc/incapsula-cracker

 
 


This version is no longer maintained. Please see here for the newest version.

Please Note

  • This library is for Python 2.7+.
  • As of 2017-05-22, I am officially no longer maintaining this version. See above for the Python 3 implementation.
  • I have not been able to keep this script updated. If you run into issues, check out the hotfix branch and see whether that works.

Description

This module wraps requests to web pages that are blocked by Incapsula.

Usage

With Requests

from incapsula import crack
import requests

# Option 1: crack a response obtained from a regular requests session.
session = requests.Session()
response = session.get('http://example.com')  # url is blocked by incapsula
response = crack(session, response)  # url is no longer blocked by incapsula

# Option 2: use IncapSession directly.
from incapsula import IncapSession

session = IncapSession()
response = session.get('http://example.com')  # url is not blocked by incapsula
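The fetch-then-crack pattern above can be folded into one small helper. This is only a sketch: `fetch_unblocked` is a hypothetical name that is not part of the library, and it assumes `crack` keeps the `crack(session, response)` call shape shown in the usage example.

```python
def fetch_unblocked(session, url, crack):
    """Fetch `url` with `session`, then pass the response through `crack`.

    `session` is assumed to behave like requests.Session and `crack` like
    incapsula.crack(session, response). Both are passed in as arguments,
    so this hypothetical helper imports neither package itself.
    """
    response = session.get(url)
    return crack(session, response)


# Intended usage (requires requests and incapsula-cracker to be installed):
#   import requests
#   from incapsula import crack
#   response = fetch_unblocked(requests.Session(), 'http://example.com', crack)
```

Passing `crack` in as an argument keeps the helper easy to exercise with stand-in objects when the real site is not reachable.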

With Scrapy

settings.py

DOWNLOADER_MIDDLEWARES = {
    'incapsula.IncapsulaMiddleware': 900
}
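The value 900 is the middleware's priority. Scrapy calls `process_request` in ascending priority order and `process_response` in descending order, so a higher number sits closer to the downloader. The sketch below illustrates that ordering in plain Python. The two built-in entries are assumptions taken from Scrapy's documented defaults and may differ between versions, and `middleware_order` is a hypothetical helper, not a Scrapy function.

```python
# Assumed built-in priorities (from Scrapy's documented defaults; may vary).
BUILTIN_DEFAULTS = {
    'scrapy.downloadermiddlewares.redirect.RedirectMiddleware': 600,
    'scrapy.downloadermiddlewares.cookies.CookiesMiddleware': 700,
}

# The project setting from settings.py.
DOWNLOADER_MIDDLEWARES = {
    'incapsula.IncapsulaMiddleware': 900,
}


def middleware_order(*settings):
    """Return middleware paths sorted by ascending priority (hypothetical helper)."""
    merged = {}
    for mapping in settings:
        merged.update(mapping)
    return sorted(merged, key=merged.get)


# Order in which process_request would be called.
request_order = middleware_order(BUILTIN_DEFAULTS, DOWNLOADER_MIDDLEWARES)
# process_response is called in the reverse order.
response_order = list(reversed(request_order))
```

Under these assumptions, `incapsula.IncapsulaMiddleware` is last in `request_order` and first in `response_order`, meaning it would see a downloaded response before the cookie and redirect middlewares do.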

Setup

pip install incapsula-cracker

There should be no problems using incapsula-cracker right out of the box.

If there are issues, try the following:

  • Open incapsula/serialize.html in a browser.
  • Copy and paste the JSON data into incapsula/navigator.json.
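After pasting, it can help to confirm that the file still parses as JSON. The following is a minimal sketch using only the standard library. `load_navigator` is a hypothetical helper, the default path assumes you run it from the directory containing the `incapsula` folder, and it checks only that the file is valid JSON, not that it has the structure the library expects.

```python
import json


def load_navigator(path='incapsula/navigator.json'):
    """Parse the navigator file and return its contents.

    Raises ValueError if the file does not contain valid JSON.
    """
    with open(path) as handle:
        return json.load(handle)
```

A `ValueError` here usually means the copied text was truncated or extra characters were pasted along with it.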

Notes

  • config.py, navigator.json, and serialize.html have only been tested with Firefox.
  • As of now, this is only proven to work with bjs.com.
  • There is minimal commenting because I'm not sure exactly why Incapsula sends requests to certain pages other than to obtain cookies. This is just a literal reverse engineering of Incapsula's JavaScript code.
  • If you would like to contribute, or if there are any other sites that you would like me to add, contact me at sdscdeveloper@gmail.com.

Languages

  • Python 90.7%
  • HTML 9.3%