This repository has been archived by the owner on May 23, 2019. It is now read-only.

Whitelist

Gabriel Iovino edited this page May 13, 2015 · 7 revisions

Whitelisting

CIF can whitelist observations, which keeps them out of a feed during the feed generation process.

How does whitelisting work in CIF?

Any observation (IP, domain, URL) that meets both of the following conditions will be whitelisted during feed generation:

  • tag == whitelist
  • Confidence >= 25
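
The two conditions above can be sketched as a simple predicate. This is only an illustration: it assumes an observation is a plain dict with hypothetical `tags` and `confidence` keys, which may not match CIF's actual schema.

```python
# Hypothetical names; CIF's real observation schema may differ.
WHITELIST_TAG = "whitelist"
MIN_CONFIDENCE = 25


def is_whitelist_entry(observation):
    """Return True if the observation meets both whitelisting conditions."""
    tags = observation.get("tags", [])
    confidence = observation.get("confidence", 0)
    return WHITELIST_TAG in tags and confidence >= MIN_CONFIDENCE
```

An observation tagged `whitelist` with a confidence of 24 would not qualify under this sketch, while the same observation at 25 would.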

How does an observation get an assessment of "whitelist" and a confidence >= 25?

By default CIF is configured with the following whitelists:

Looking at the 00_whitelist.yml file you'll see there are additional configuration files that contribute to whitelisting. When these feeds are processed, the CIF API applies the following logic:

  • resolve all domains to their IPs, slightly degrade the confidence value, and whitelist the IPs
  • resolve all IPs to their BGP prefix, slightly degrade the confidence value, and whitelist the prefix (/16, /18, /22, /24, etc.)

For example:

  1. google.com is given the assessment 'whitelist' with a confidence value of 95%
  2. google.com resolves to 173.194.46.64-78, which are whitelisted at ~69% confidence
  3. 173.194.46.64-78 resolves to 173.194.46.0/24 (BGP prefix lookup)
  4. 173.194.46.0/24 is whitelisted at 47% confidence
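
The example chain can be written out as data, using Python's standard `ipaddress` module. The confidence values below are copied from the example rather than computed, since the exact degradation formula is not given here. The entry layout and the hard-coded prefix are assumptions for illustration; a real deployment would perform DNS and BGP lookups.

```python
import ipaddress

# Step 1: the domain, as listed in a whitelist feed.
domain_entry = {"observable": "google.com", "confidence": 95}

# Step 2: the resolved addresses 173.194.46.64 through 173.194.46.78.
base = ipaddress.ip_address("173.194.46.0")
resolved_ips = [base + offset for offset in range(64, 79)]
ip_entries = [{"observable": str(ip), "confidence": 69} for ip in resolved_ips]

# Steps 3 and 4: the covering prefix (hard-coded here instead of a BGP lookup).
prefix = ipaddress.ip_network("173.194.46.0/24")
prefix_entry = {"observable": str(prefix), "confidence": 47}

# Every resolved address falls inside the whitelisted prefix.
all_inside = all(ip in prefix for ip in resolved_ips)

whitelist = [domain_entry] + ip_entries + [prefix_entry]
```

Each step lowers the confidence, but every entry in this example stays above the threshold of 25, so all of them take part in whitelisting.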

When a feed is generated, a whitelist data-set is pre-populated with these values and the feed items are checked against them (sub-domains included).
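
A minimal sketch of that check, assuming feed items are plain strings that are either IP addresses or domain names (URLs are left out for brevity). The function names are hypothetical and are not part of CIF.

```python
import ipaddress


def domain_is_whitelisted(domain, whitelisted_domains):
    """Match the domain itself or any of its sub-domains."""
    domain = domain.lower().rstrip(".")
    return any(domain == w or domain.endswith("." + w) for w in whitelisted_domains)


def ip_is_whitelisted(ip, whitelisted_prefixes):
    """Match an address against whitelisted networks (raises ValueError if not an IP)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in whitelisted_prefixes)


def filter_feed(items, whitelisted_domains, whitelisted_prefixes):
    """Return the feed items that are not covered by the whitelist."""
    kept = []
    for item in items:
        try:
            covered = ip_is_whitelisted(item, whitelisted_prefixes)
        except ValueError:
            # Not an IP address, so treat the item as a domain name.
            covered = domain_is_whitelisted(item, whitelisted_domains)
        if not covered:
            kept.append(item)
    return kept
```

The sub-domain match requires a leading dot, so `mail.google.com` is covered by a whitelisted `google.com` while `notgoogle.com` is not.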
