investigativedata/investigraph

investigraph

ETL pipeline, graphical explorer and general toolbox for investigations with followthemoney data

online documentation

https://docs.investigraph.dev

Tutorial: https://docs.investigraph.dev/tutorial/

build with investigraph

what is this all about?

Research and implementation of an ETL process for a curated and up-to-date public and open-source data catalog of frequently used datasets in investigative journalism.

investigraph is an ETL framework that allows research teams to build their own data catalog as easily and reproducibly as possible. The investigraph framework provides logic for extracting, transforming and loading any data source into followthemoney entities.
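To make the three stages concrete, here is a minimal sketch in plain Python. It does not use the investigraph or followthemoney APIs; the sample CSV, the `example_dataset` name and the helper functions are assumptions made up for illustration. The only thing it borrows is the general shape of a followthemoney entity: an `id`, a `schema` and multi-valued `properties`.

```python
import csv
import hashlib
import io
import json

# Hypothetical source data; a real pipeline would fetch this from a remote or local file.
SOURCE_CSV = "name,country\nJane Doe,de\nJohn Smith,gb\n"


def extract(text):
    """Extract: yield one dict per CSV row."""
    yield from csv.DictReader(io.StringIO(text))


def make_id(*parts):
    """Derive a stable ID from key values so that re-runs produce the same entity IDs."""
    return hashlib.sha1("|".join(parts).encode("utf-8")).hexdigest()


def transform(record, dataset="example_dataset"):
    """Transform: shape a record like a followthemoney entity."""
    return {
        "id": make_id(dataset, record["name"], record["country"]),
        "schema": "Person",
        "properties": {
            # Property values are always lists, even for a single value.
            "name": [record["name"]],
            "country": [record["country"]],
        },
    }


def load(entities):
    """Load: serialize entities as JSON lines, one entity per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in entities)


entities = [transform(r) for r in extract(SOURCE_CSV)]
output = load(entities)
```

Deriving the ID from the record's key values, rather than from a row counter, is what keeps repeated runs reproducible: the same source row always maps to the same entity.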

For most common data source formats, this process is possible without programming knowledge, by means of an easy YAML specification interface. However, if it turns out that a specific dataset cannot be parsed with the built-in logic, a developer can plug in custom Python scripts at specific places within the pipeline to handle even the most unusual edge cases in data processing.
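The split between a declarative specification and custom code can be pictured roughly as follows. This is a simplified sketch, not investigraph's actual configuration format or hook signature: the `MAPPING` dict stands in for a YAML file, and `apply_mapping`, `split_aliases` and `run` are hypothetical names chosen for this example.

```python
# Stand-in for a YAML mapping: target property -> source column (hypothetical layout).
MAPPING = {
    "schema": "Company",
    "properties": {"name": "company_name", "jurisdiction": "country_code"},
}


def apply_mapping(record, mapping):
    """Built-in path: copy mapped columns into multi-valued properties, skipping empty values."""
    properties = {}
    for prop, column in mapping["properties"].items():
        value = (record.get(column) or "").strip()
        if value:
            properties[prop] = [value]
    return {"schema": mapping["schema"], "properties": properties}


def split_aliases(record, entity):
    """Custom hook: split a semicolon-separated column, which this simplified mapping cannot express."""
    aliases = [a.strip() for a in record.get("other_names", "").split(";") if a.strip()]
    if aliases:
        entity["properties"]["alias"] = aliases
    return entity


def run(records, mapping, hook=None):
    """Apply the declarative mapping, then the optional custom hook, to every record."""
    for record in records:
        entity = apply_mapping(record, mapping)
        yield hook(record, entity) if hook else entity


records = [
    {"company_name": "ACME Ltd", "country_code": "gb", "other_names": "Acme; ACME Limited"},
    {"company_name": "Beispiel GmbH", "country_code": "de", "other_names": ""},
]
entities = list(run(records, MAPPING, hook=split_aliases))
```

Without the `hook` argument the pipeline runs on the mapping alone, which mirrors the idea that custom code is an opt-in escape hatch rather than a requirement.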

Value for investigative research teams

  • Standardized process to convert different data sets into a uniform and thus comparable format
  • Control of this process for non-technical people
  • Creation of your own (internal) data catalog
  • Regular, automatic updates of the data
  • A growing community that makes more and more data sets accessible
  • Access to a public (open source) data catalog operated by "investigraph"

components / child repositories

3rd party contributions

This project builds on top of great technology. Contributions to 3rd party libraries are listed below.

nomenklatura

Rendering / static page

This documentation can be rendered via mkdocs using the mkdocs-material theme.

Local development:

pip install -r requirements.txt

mkdocs serve

Follow the documentation at mkdocs-material

build

mkdocs build

supported by

Media Tech Lab Bayern batch #3
