This repository has been archived by the owner on May 23, 2019. It is now read-only.

ParsingFeeds

Gabriel Iovino edited this page May 2, 2015 · 66 revisions

This page is Under Construction

Reference:

Introduction

CIF ships with many Open-source Intelligence (OSINT) feeds preconfigured, and more feeds are expected to be added to that set over time. A tutorial on how to create a new feed config file can be found here.

File Syntax

CIF feed configuration files are written in YAML.

File Format

Any parameter can be set as a Global parameter (under defaults) or as a Feed parameter (under a named feed). If a parameter is specified in both places, the Feed parameter supersedes the Global parameter.

defaults:
  <parameter>: <value>
  <parameter>: <value>
  <parameter>:
    - <value>
    - <value>
feeds:
  <feed name>:
    <parameter>: <value>
  <feed name>:
    <parameter>: <value>
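
The precedence rule can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not CIF's own code (CIF itself is not written in Python); the function name `effective_parameters`, the dict contents, and the example.org URL are all made up for the example.

```python
def effective_parameters(defaults, feed):
    """Return the parameters a feed ends up with: defaults overlaid by feed values."""
    merged = dict(defaults)  # start from a copy of the global values
    merged.update(feed)      # feed-level values win when both are set
    return merged


# Hypothetical configuration, written as Python dicts instead of YAML.
defaults = {"confidence": 85, "tlp": "amber", "provider": "example.org"}
feed = {"remote": "http://example.org/feed.txt", "confidence": 75}

params = effective_parameters(defaults, feed)
print(params["confidence"])  # 75: the feed value supersedes the default
print(params["tlp"])         # amber: inherited from defaults
```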

Common Parameters

| Parameter Name | Values | Description | Queryable | Required |
|---|---|---|---|---|
| ? | ? | ? | ? | ? |

Delimited Text Files

parser: delim
defaults:
  confidence: 85
  tlp: amber
  provider: malwaredomains.com

feeds:
  domains:
    remote: http://mirror3.malwaredomains.com/files/domains.zip
    pattern: '[\t|\f]'
    values:
      - null
      - null
      - observable
      - description
      - provider
      - null
    tags:
      - exploit
      - malware
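
| Parameter Name | Values | Description |
|---|---|---|
| values | - <value> | YAML sequence of field names, one per delimited column, in order; null skips a column |
| delimiter | <string> | a pseudo-regex that splits each line of the feed into columns |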
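
To make the interplay between the split pattern and the values list concrete, here is a rough Python sketch. It is an approximation for illustration, not how CIF implements the delim parser; the function name `parse_delimited_line` and the sample line are invented, and YAML null is represented as Python `None`.

```python
import re

# Same character class as in the example config: tab, pipe, or form feed.
PATTERN = re.compile(r"[\t|\f]")

# The values list from the example; YAML null becomes None here.
VALUES = [None, None, "observable", "description", "provider", None]


def parse_delimited_line(line, pattern=PATTERN, values=VALUES):
    """Split one line into columns and keep only the named ones."""
    columns = pattern.split(line.rstrip("\n"))
    return {name: col for name, col in zip(values, columns) if name is not None}


# Made-up sample line in the same shape as the config expects (two empty leading columns).
sample = "\t\texample.com\tphishing\texample-source\t20150101\n"
record = parse_delimited_line(sample)
print(record)
```

Columns paired with null are dropped, so only observable, description, and provider survive in the resulting record.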

Non-Delimited Text Files

defaults:
  tlp: amber
  provider: 'dshield.org'
  tags: scanner

feeds:
  scanners:
    remote: http://feeds.dshield.org/block.txt
    confidence: 75
    pattern: ^(\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)\t\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b\t(\d+)
    values:
      - observable
      - mask
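
| Parameter Name | Values | Description |
|---|---|---|
| pattern | <string> | a regex whose capture groups pull the fields out of each line of the feed |
| values | - <value> | YAML sequence of field names, one per capture group in pattern, in order |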
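
The following Python sketch shows how the capture groups in pattern could line up with the values list. It assumes each capture group is assigned to the value at the same position, which is what the example config suggests; it is not CIF's implementation, and the function name `parse_line` and the sample line (using a documentation address range) are hypothetical.

```python
import re

# The pattern from the example config, split over two lines for readability.
PATTERN = re.compile(
    r"^(\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b)\t"
    r"\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b\t(\d+)"
)
VALUES = ["observable", "mask"]


def parse_line(line):
    """Return a dict of named fields, or None when the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None  # e.g. comment or header lines
    return dict(zip(VALUES, match.groups()))


# Made-up line: start address, end address, netmask, then further columns.
sample = "192.0.2.0\t192.0.2.255\t24\t1234\tEXAMPLE-NET"
print(parse_line(sample))
print(parse_line("# this is a comment line"))
```

Only the first and third columns are captured; the second address is matched but not wrapped in a group, so it never reaches the values list.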

XML Files

Parameter Name Values Description

JSON Files

parser: json
defaults:
  provider: phishtank.com
  tlp: amber
  application:
    - http
    - https
  confidence: 85
  tags: phishing
  protocol: tcp
  remote: http://data.phishtank.com/data/online-valid.json.gz
  altid_tlp: green

feeds:
  urls:
    otype: url
    map:
      - submission_time
      - url
      - target
      - phish_detail_url
      - details
    values:
      - lasttime
      - observable
      - description
      - altid
      - additional_data
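
| Parameter Name | Values | Description |
|---|---|---|
| map | - <value> | YAML sequence of keys to read from each JSON record |
| values | - <value> | YAML sequence of field names, assigned to the map keys in the same order |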
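
As a rough illustration of how map and values pair up, the Python sketch below renames the keys of one JSON record position by position. The pairing-by-position behavior is inferred from the example config, not taken from CIF's source; the function name `map_record` and the sample record are invented.

```python
import json

# Key names to read from each JSON record (the "map" list in the example).
MAP = ["submission_time", "url", "target", "phish_detail_url", "details"]
# Field names to store them under (the "values" list), in the same order.
VALUES = ["lasttime", "observable", "description", "altid", "additional_data"]


def map_record(item, keys=MAP, fields=VALUES):
    """Rename the selected keys of one JSON record to their target field names."""
    return {field: item[key] for key, field in zip(keys, fields) if key in item}


# Made-up record with the same keys as the example config.
raw = """{
  "submission_time": "2015-05-02T00:00:00+00:00",
  "url": "http://phish.example.com/login",
  "target": "Example Bank",
  "phish_detail_url": "http://example.org/detail/1",
  "details": [],
  "unrelated_key": "ignored"
}"""
record = map_record(json.loads(raw))
print(record["observable"])  # http://phish.example.com/login
```

Keys that do not appear in map, such as `unrelated_key` here, are simply left out of the result.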

More examples

Additional example feed configuration files can be found here.

All Parameters

| Parameter Name | Values | Description | Queryable | Required |
|---|---|---|---|---|
| adata | <string> | Additional data - string, json, csv | no | |
| address | ? | ? | ? | ? |
| altid | <string> | Usually a URL pointing to the original data point (as a reference id) | no | no |
| altid_tlp | <string> | white, green, amber, red | no | no |
| application | <string> | ? | yes | no |
| asn | <string> | ? | yes | ? |
| asn_desc | <string> | ? | no | ? |
| cc | ? | ? | yes | ? |
| citycode | ? | ? | no | ? |
| confidence | <int> | See Confidence | yes | ? |
| content | ? | ? | ? | no |
| description | <string> | ? | yes | no |
| disabled | ? | ? | no | ? |
| end | <int> | ? | no | no |
| firsttime | ? | ? | yes | ? |
| group | ? | ? | yes | ? |
| header | ? | ? | no | no |
| ignore | ? | ? | no | ? |
| lasttime | ? | ? | yes | ? |
| latitude | <double> | ? | no | ? |
| longitude | <double> | ? | no | ? |
| limit | ? | ? | no | ? |
| map | ? | ? | no | ? |
| mask | ? | ? | no | ? |
| metrocode | ? | ? | no | ? |
| node | ? | ? | no | ? |
| null | ? | ? | no | ? |
| observable | ? | ? | yes | ? |
| otype | <string> | url, binary | yes | ? |
| parser | <string> | default (?), csv, html, pipe, rss, delim, json, text | no | ? |
| password | ? | ? | no | ? |
| pattern | <string> | Perl regex with capturing | no | no |
| peers | ? | ? | no | ? |
| period | ? | ? | no | ? |
| portlist | <int> | 22 or 80,443 or 6660-7000 | yes | no |
| prefix | ? | ? | no | ? |
| protocol | <int>, <string> | 1,6,17 or icmp, tcp, udp | no | no |
| provider | <string> | Friendly name of entity providing the feed | yes | yes |
| rank | ? | ? | no | ? |
| rdata | ? | ? | yes | ? |
| reference | ? | ? | no | ? |
| related | ? | ? | no | ? |
| remote | <string> | http(s) URL of feed | no | yes? |
| reporttime | ? | ? | yes | ? |
| rir | ? | ? | no | ? |
| skip | ? | ? | no | ? |
| start | <int> | ? | no | no |
| store_content | <int> | ? | no | ? |
| subdivision | ? | ? | no | ? |
| tags | <string> | See Tags | yes | yes |
| timezone | ? | ? | no | ? |
| title | ? | ? | no | ? |
| tlp | <string> | white, green, amber, red | no | no |
| username | ? | ? | ? | ? |
| values | <string> | Used with pattern, map | no | no |
| ? | ? | ? | ? | ? |
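
Several parameters above accept a small set of value shapes, for example tlp (white, green, amber, red) and portlist (22 or 80,443 or 6660-7000). The Python sketch below shows one way such values might be checked before a config is used. It is a hypothetical helper, not part of CIF: the names `is_valid_tlp` and `parse_portlist` are invented, and the 1-65535 port bound is an assumption rather than something this page states.

```python
TLP_LEVELS = ("white", "green", "amber", "red")  # values listed for tlp and altid_tlp


def is_valid_tlp(value):
    """True when value is one of the four listed TLP levels."""
    return value in TLP_LEVELS


def parse_portlist(text):
    """Parse '22', '80,443' or '6660-7000' into a list of (start, end) tuples."""
    ranges = []
    for part in str(text).split(","):
        low, sep, high = part.strip().partition("-")
        start = int(low)                      # raises ValueError on non-numeric input
        end = int(high) if sep else start     # a single port is a one-element range
        if not (1 <= start <= end <= 65535):  # assumed bounds, see lead-in
            raise ValueError("invalid port range: %r" % part)
        ranges.append((start, end))
    return ranges


print(parse_portlist("22"))
print(parse_portlist("80,443"))
print(parse_portlist("6660-7000"))
print(is_valid_tlp("amber"))
```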