Siteshooter

Automate full website screen shots and PDF generation with multiple view port support

Features

Crawls specified host and generates a sitemap.xml on the fly
Generates entire website screen shots based on sitemap.xml
Define multiple view ports
Automated PDF generation
Includes crawled meta data in generated PDF
Reports on broken website links (HTTP 404 responses)
Supports HTTP basic authentication
Supports Microsoft Online 3 step authentication
Supports Salesforce Visualforce 3 step authentication
Supports site maps with HTTP, HTTPS, and FTP protocol URLs
Follows HTTP 301 redirects
Custom JavaScript inject file - injects into page prior to screen shooting
Trigger page events by passing querystring values to custom inject.js file

Do you need a website and workflow management platform?

Give Catapult a shot

In This Documentation

Getting Started
Siteshooter Configuration File
CLI Options
Tests
Troubleshooting & FAQ

Getting Started

Dependencies

Install the following prerequisite on your development machine:

Node.js - version >= 6.0.0

Notable npm Modules

Quick Start

$ npm install siteshooter --global

If siteshooter is already installed, make sure you have the latest version by running:

$ npm update siteshooter --global

You may need to run these commands with elevated privileges (e.g. sudo); you will be prompted to do so if needed.
Installing with the --global flag makes the siteshooter command available on your machine's command line from any path.
Read more about the --global flag here.

Create a Siteshooter Configuration File

$ siteshooter --init

Update Siteshooter Configuration File

View the full siteshooter.yml example

Inside siteshooter.yml, add additional options.

All Simple Web Crawler options can be added to sitecrawler_options and will pass through to the crawler process
Generated screenshot image files are optimized using imagemin and imagemin-pngquant modules, which reduce the overall size of generated PDFs. To adjust the image quality, update the image_quality option in your siteshooter.yml file.

domain:
  name: https://www.devopsgroup.io
  auth:
    user:
    pwd:

pdf_options:
  excludeMeta: true

screenshot_options:
  delay: 2000
  image_quality: '60-80'

sitecrawler_options:
  exclude:
    - "pdf"
  stripQuerystring: false
  ignoreInvalidSSL: true

viewports:
  - viewport: desktop-large
    width: 1600
    height: 1200
  - viewport: tablet-landscape
    width: 1024
    height: 768
  - viewport: iPhone5
    width: 320
    height: 568
  - viewport: iPhone6
    width: 375
    height: 667

CLI Options

$ siteshooter --help

Usage: siteshooter [options]

OPTIONS
_______________________________________________________________________________________
-c --config            Show configuration
-C --cwd               Set working directory, which will load a siteshooter.yml file in the specified path
-e --debug             Output exceptions
-h --help              Print this help
-i --init              Create siteshooter.yml template file in working directory
-p --pdf               Generate PDFs, by defined view ports, based on screen shots created via Siteshooter
-q --quiet             Only return final output
-s --screenshots       Generate screen shots, by view ports, based on sitemap.xml file
-S --sitemap           Crawl domain name specified in siteshooter.yml file and generate a local sitemap.xml file
-v --version           Print version number
-V --verbose           Verbose output
-w --website           Report on website information based on Siteshooter crawled results

When siteshooter is run without any options, the following options run in order by default:

--sitemap
--screenshots
--pdf
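
The default sequence above can also be driven one step at a time from a small Node.js script, for example to stop as soon as one step fails. This is a minimal sketch, assuming siteshooter is installed globally and that each flag can be run on its own as listed in the help output; `DEFAULT_STEPS`, `runSteps`, and `runSiteshooter` are hypothetical names, not part of Siteshooter.

```javascript
'use strict';

const spawnSync = require('child_process').spawnSync;

// The three flags siteshooter runs by default, in order (from the help output).
const DEFAULT_STEPS = ['--sitemap', '--screenshots', '--pdf'];

// Run each step through `run` and stop at the first non-zero exit status.
// `run` is injectable so the sequencing can be exercised without the CLI.
function runSteps(steps, run) {
    const completed = [];
    for (let i = 0; i < steps.length; i++) {
        const status = run(steps[i]);
        if (status !== 0) {
            return { ok: false, failedAt: steps[i], completed: completed };
        }
        completed.push(steps[i]);
    }
    return { ok: true, failedAt: null, completed: completed };
}

// Default runner: shells out to the globally installed siteshooter command.
// spawnSync reports a null status when the command cannot be started,
// which runSteps treats as a failure.
function runSiteshooter(flag) {
    return spawnSync('siteshooter', [flag], { stdio: 'inherit' }).status;
}
```

Calling `runSteps(DEFAULT_STEPS, runSiteshooter)` from the directory that contains siteshooter.yml runs the three steps in order and returns which steps completed and, if one failed, which one.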

Custom JavaScript Inject File

To manipulate the DOM prior to the screen shot process, add an inject.js file in the same working directory as the siteshooter.yml.

Example: inject.js file

/**
 * @file:            inject.js
 * @description:     used to inject custom JavaScript into a web page prior to a screen shot. 
 */

console.log('JavaScript injected into page.');

if ( typeof(jQuery) !== "undefined" ) {

    jQuery(document).ready(function() {
        console.log('jQuery loaded.');
    });
}
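
The same pattern works without jQuery. The sketch below hides elements that would otherwise cover the page in a screen shot, such as a cookie banner. It is an illustration only: the `.cookie-banner` selector and the `hideElements` helper are hypothetical, and the code sticks to ES5 syntax on the assumption that the page is rendered by an older engine such as PhantomJS.

```javascript
/**
 * @file:            inject.js
 * @description:     hides overlay elements prior to a screen shot (no jQuery needed).
 */

// Hide every element matching `selector` and return how many were hidden.
// `doc` is passed in so the helper does not depend on a global document.
function hideElements(doc, selector) {
    var nodes = doc.querySelectorAll(selector);
    for (var i = 0; i < nodes.length; i++) {
        nodes[i].style.display = 'none';
    }
    return nodes.length;
}

// Only touch the page when running inside a browser context.
if (typeof document !== 'undefined') {
    // '.cookie-banner' is a hypothetical selector; replace it with your own.
    hideElements(document, '.cookie-banner');
}
```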

Trigger JavaScript Events

When using the optional inject.js file, you can trigger events through the pevent querystring parameter:

<!-- Add a URL with the pevent querystring parameter to the generated sitemap.xml -->
<url>
    <loc>https://www.devopsgroup.io?pevent=open-privacy-overlay</loc>
    <changefreq>weekly</changefreq>
</url>
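
If many such entries are needed, they can be generated rather than typed by hand. The following Node.js sketch builds a `<url>` entry in the shape shown above. It assumes a Node.js version whose `url` module exports the WHATWG `URL` class; `escapeXml`, `withPageEvent`, and `sitemapEntry` are hypothetical helper names, not part of Siteshooter.

```javascript
'use strict';

const { URL } = require('url');

// Escape the characters that are not allowed verbatim inside XML text.
function escapeXml(text) {
    return text
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;');
}

// Return pageUrl with the pevent querystring parameter set to eventName.
function withPageEvent(pageUrl, eventName) {
    const url = new URL(pageUrl);
    url.searchParams.set('pevent', eventName);
    return url.toString();
}

// Build a <url> entry in the same shape as the sitemap.xml snippet above.
function sitemapEntry(pageUrl, eventName) {
    return [
        '<url>',
        '    <loc>' + escapeXml(withPageEvent(pageUrl, eventName)) + '</loc>',
        '    <changefreq>weekly</changefreq>',
        '</url>'
    ].join('\n');
}

console.log(sitemapEntry('https://www.devopsgroup.io', 'open-privacy-overlay'));
```

The `URL` class normalizes an empty path to `/`, so the generated location is `https://www.devopsgroup.io/?pevent=open-privacy-overlay`. When the page URL already carries other parameters, the joining `&` is written as `&amp;` so the sitemap stays valid XML.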

Example: Event detection & triggering

/**
 * @file:            inject.js
 * @description:     used to inject custom JavaScript into a web page prior to a screen shot. 
 */


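// Return the decoded value of a querystring parameter, or undefined if it is absent.
function getQueryVariable(variable) {
    var query = window.location.search.substring(1);
    var vars = query.split('&');
    for (var i = 0; i < vars.length; i++) {
        var pair = vars[i].split('=');
        if (decodeURIComponent(pair[0]) === variable) {
            // Rejoin so values containing "=" survive; a bare key yields ''
            return decodeURIComponent(pair.slice(1).join('='));
        }
    }
}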

if ( typeof(jQuery) !== "undefined" ) {

    jQuery(document).ready(function() {
        var pageName = window.location.pathname.replace('/', ''),
            pageEvent = getQueryVariable('pevent');

        console.log('document ready.');
        console.log('userAgent', navigator.userAgent);
        console.log('Page: ', pageName);
        console.log('Event: ', pageEvent);

        switch (pageName) {

            // home
            case '':

                switch (pageEvent) {
                    case 'open-privacy-overlay':

                        jQuery('a[data-target~="#modal-privacy"]').trigger('click');
                        break;
                }

                break;
        }

    });
}
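
As the number of pages and events grows, the nested switch statements above can become hard to follow. One alternative, shown here as a sketch rather than as Siteshooter's own approach, is a lookup table keyed by page name and event. The `handlers` table and the `findHandler` helper are hypothetical names; the selector is the one from the example above.

```javascript
/**
 * @file:            inject.js
 * @description:     triggers page events from a lookup table instead of nested switches.
 */

// Map "<pageName>|<pageEvent>" to the function that triggers the event.
// The empty page name is the home page, as in the switch example.
var handlers = {
    '|open-privacy-overlay': function ($) {
        $('a[data-target~="#modal-privacy"]').trigger('click');
    }
};

// Return the handler for a page/event pair, or null when none is registered.
function findHandler(table, pageName, pageEvent) {
    var key = pageName + '|' + pageEvent;
    return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : null;
}

if (typeof jQuery !== 'undefined') {
    jQuery(document).ready(function () {
        var pageName = window.location.pathname.replace('/', '');
        // Read pevent straight from the querystring (undefined when absent).
        var match = /[?&]pevent=([^&]*)/.exec(window.location.search);
        var pageEvent = match ? decodeURIComponent(match[1]) : undefined;
        var handler = findHandler(handlers, pageName, pageEvent);

        if (handler) {
            handler(jQuery);
        }
    });
}
```

Adding a new event then means adding one entry to `handlers`, with no change to the dispatch logic.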

Tests

Tests are written with Mocha and can be run with npm test.

Troubleshooting

If you're having issues with Siteshooter, submit a GitHub Issue.

Make sure you have a siteshooter.yml file in your working directory and the yaml file is well formatted
Experiencing font-loading issues? Try increasing the delay setting in your siteshooter.yml file

screenshot_options:
  delay: 2000

Trying to take a screenshot of a page with a video? Unfortunately, PhantomJS does not support videos. As such, here's one approach to showing a video's poster image.

/**
 * @file:            inject.js
 * @description:     used to display a video's poster image
 */

jQuery('video').each(function() {
    var video = jQuery(this);
    // Swap each video for an image of its own poster so it shows in the screen shot
    video.parent().prepend('<img src="' + video.attr('poster') + '"/>');
    video.remove();
});

Seeing "SimpleCrawler TypeError: The header content contains invalid characters"? Try setting the acceptCookies option to false.

sitecrawler_options:
  acceptCookies: false

Code of Conduct

Take a moment to read our Code of Conduct

Contributing to the project

We are always looking for quality contributions! Please check the CONTRIBUTING.md for contribution guidelines.
