Skip to content

Latest commit

 

History

History

examples

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Examples

First git clone https://github.com/spider-rs/spider.git and cd spider. Use the release flag for the best performance --release when running the examples below.

Basic

Simple concurrent crawl Simple.

  • cargo run --example example

Subscribe to realtime changes Subscribe.

  • cargo run --example subscribe

Live handle index mutation example Callback.

  • cargo run --example callback

Enable log output Debug.

  • cargo run --example debug

Scrape the webpage with and gather html Scrape.

  • cargo run --example scrape

Scrape and download the html file to fs Download HTML. *Note: Enable feature flag [full_resources] to gather all files like css, jss, and etc.

  • cargo run --example download

Scrape and download html to react components and store to fs Download to React Component.

  • cargo run --example download_to_react

Crawl the page and output the links via Serde.

  • cargo run --example serde --features serde

Crawl links with a budget of amount of pages allowed Budget.

  • cargo run --example budget

Crawl links at a given cron time Cron.

  • cargo run --example cron

Crawl links with chrome headless rendering Chrome.

  • cargo run --example chrome --features chrome

Crawl links with chrome headed rendering Chrome.

  • cargo run --example chrome --features chrome_headed

Crawl links with chrome headless rendering remote connections Chrome.

  • cargo run --example chrome_remote --features chrome

Crawl links with view port configuration Chrome Viewport.

  • cargo run --example chrome_viewport --features chrome

Take a screenshot of a page during crawl Chrome Screenshot.

  • cargo run --example chrome_screenshot --features="spider/sync spider/chrome spider/chrome_store_page"

Crawl links with smart mode detection. Runs HTTP by default until Chrome Rendering is needed. Smart.

  • cargo run --example smart --features smart

Use different encodings for the page. Encoding.

  • cargo run --example encoding --features encoding

Use advanced configuration re-use. Advanced Configuration.

  • cargo run --example cache_chrome_hybrid --features="spider/sync spider/chrome spider/cache_chrome_hybrid"

Use chrome hybrid caching. Chrome Cache Hybrid.

  • cargo run --example advanced_configuration

Use URL globbing for a domain. URL Globbing.

  • cargo run --example glob --features glob

Use URL globbing for a domain and subdomains. URL Globbing Subdomains.

  • cargo run --example url_glob_subdomains --features glob

Downloading files in a subscription. Subscribe Download.

  • cargo run --example subscribe_download

Add links to gather mid crawl. Queue.

  • cargo run --example queue

Use OpenAI to get custom Javascript to run in a browser. OpenAI. Make sure to set OPENAI_API_KEY=$MY_KEY as an env variable or pass it in before the script.

  • cargo run --example openai

or

  • OPENAI_API_KEY=replace_me_with_key cargo run --example openai

or setting multiple actions to drive the browser

  • OPENAI_API_KEY=replace_me_with_key cargo run --example openai_multi

or to get custom data from the GPT with JS scripts if needed.

  • OPENAI_API_KEY=replace_me_with_key cargo run --example openai_extra