yarm

Yarm: Yet Another Report Maker.

Features

Yarm makes it easy for you to create recurring reports by:

Importing multiple spreadsheets and CSVs into a temporary database.
Offering easy options for common data cleaning tasks (e.g. replace, slugify_columns, pivot)
Running SQL queries (or, for pandas fans, custom Python code) on all this data.
Exporting the results as a new spreadsheet, CSV, or even SQLite database.
All configured in a simple YAML file for easy reuse. Download fresh data, yarm run, and you're done.

Basic Usage

First Time You Run a New Report

Collect your XLSX and/or CSV data files into a directory for this report.
Initialize a new YAML config file:

$ yarm new

Edit the YAML config file (see below).
- Configure your input spreadsheets and CSV files as tables.
- Write one or more SELECT queries on these tables to create output sheets.
- (Optional) Need advanced manipulation of your data? Write pandas code in a separate .py file.
Run the report:

$ yarm run

Send the output spreadsheet to your boss/client/head of state. Was it really that easy?

Every Subsequent Time

Collect fresh data. Save it over the old files.
Run the report.

$ yarm run -f

Send the output spreadsheet.
Take the afternoon off.

Advanced Usage

Please see the extensive documentation for more details and features.

Example Report Config File

You configure a report in a single YAML file.

Each query becomes a separate sheet in your output spreadsheet.

This example config file is moderately complex. Your report can be much simpler; you might have only one or two tables and a single query. (Or you might have ten queries, each with a custom postprocess function...)

---
output:
  dir: Output
  basename: Sales_Report

# Optional input options (more are available):
input:
  slugify_columns: true
  lowercase_columns: true

# Set up your data sources:
tables_config:
  # CSV file: the easiest data source.
  products:
    - path: Products.csv

  # Spreadsheet: You need both the path and the sheet name.
  orders:
    - path: Orders.xlsx
      sheet: Orders

  # You can import different sheets as separate tables.
  order_details:
    - path: Orders.xlsx
      sheet: Order Details

  # You can combine multiple data sources into a single table,
  # as long as their columns can be merged.
  tax:
    - path: Sales Tax Rates Northeast.xlsx
      sheet: NY
    - path: Sales Tax Rates Northeast.xlsx
      sheet: PA
    - path: TAXES_SOUTH.csv

# Set up your output spreadsheet:
queries:
  - name: Order Details with Product Names
    sql: SELECT * FROM order_details as od JOIN products as p ON od.product_id = p.id;

  - name: Orders With Sales Tax
    sql: >
      SELECT orders.*,
      tax.rate
      FROM orders
      JOIN tax
      ON orders.billing_state = tax.state
      ;
    # These query results will need a Python function to complete this sheet:
    postprocess: calculate_tax
    # But first, we can do simple regex replacements right here:
    replace:
      billing_state:
        Virginia: VA
        West Virginia: WV

# Since we need that custom function calculate_tax(), we'll
# write it in a separate Python file.
import:
  - path: custom.py

Read more about basic configuration and advanced options.

Custom Postprocessing Code

If the power of SQL and make-it-easy options like slugify_columns aren't enough for you, you can write a custom postprocess function for any query you like.

Status: Alpha (Try It!)

Yarm is currently in alpha. Core features are working and thoroughly documented.

I rely on yarm for my own recurring reports.

If you are desperate to stop doing a recurring report by hand, give yarm a try.

If something breaks, or if you have any suggestions or comments, please file an issue. I'd love to hear what you think.

For upcoming features, see the Roadmap.

Requirements

Python 3.7 or later
A terminal
One or more spreadsheets that you want to query
Something to do with all this impending free time...

Installation

You can install yarm via pip from PyPI:

$ pip install yarm

But since yarm is a command line tool, you may prefer the excellent pipx:

$ pipx install yarm

Documentation

Complete, extensive documentation is at yarm.readthedocs.io.

Dive right in.

Is `yarm` for You?

This tool has a clear focus: Make it easy to run and rerun reports from the command line that query multiple sources of tabular data.

Once you set up the initial configuration file, the workflow for future reports is simple. Download fresh data over the old files, then rerun the report.

This means that yarm is not a tool for data exploration.

True, you may still want yarm to prepare your data for exploration. Once you get used to listing a few data sources, setting a few options, and spitting out a nice, clean SQLite database or set of CSV files to play with, you may get hooked.

But for iterative tinkering with your data, you're going to need other tools.

Other Open Source Tools You Might Prefer

sqlitebrowser: An excellent GUI for exploring your SQLite database. I sometimes use this to figure out my queries before I save them into my config file.
Jupyter Lab: If you find your SQL queries getting more and more arcane and complex, it's probably time to learn pandas, and that means unleashing the power of this interactive "notebook". Some reports are so complex that they really deserve to be run step-by-step, with immediate output after every command. Jupyter Lab makes that absurdly easy... and repeatable.
SQL Notebook: A newer offering that I haven't used yet, but it looks like an interesting GUI combination of sqlitebrowser and a "Jupyter-style notebook interface". Could be very powerful.
For quick, one-off data manipulations on the command line, you can reach for excellent tools like jq for JSON, mlr for CSV, and even htmlq for HTML. But once the command gets long and complex enough that you want to save it to a script, you might start missing SQL queries and yarm features like slugify_columns: true.

Contributing

Contributions are very welcome. To learn more, see the Contributor Guide.

License

Distributed under the terms of the Apache 2.0 license, Yarm is free and open source software.

Issues

If you encounter any problems, please file an issue along with a detailed description.

Credits

This project was generated from @cjolowicz's Hypermodern Python Cookiecutter template.

Name		Name	Last commit message	Last commit date
Latest commit History 395 Commits
.github		.github
docs		docs
src/yarm		src/yarm
tests		tests
.cookiecutter.json		.cookiecutter.json
.darglint		.darglint
.flake8		.flake8
.gitattributes		.gitattributes
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
.readthedocs.yml		.readthedocs.yml
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
codecov.yml		codecov.yml
mypy.ini		mypy.ini
noxfile.py		noxfile.py
poetry.lock		poetry.lock
pyproject.toml		pyproject.toml
pytest.ini		pytest.ini
yarm.yaml		yarm.yaml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

yarm

Features

Basic Usage

First Time You Run a New Report

Every Subsequent Time

Advanced Usage

Example Report Config File

Custom Postprocessing Code

Status: Alpha (Try It!)

Requirements

Installation

Documentation

Is `yarm` for You?

Other Open Source Tools You Might Prefer

Contributing

License

Issues

Credits

About

Releases 3

Packages

Contributors 2

Languages

License

billalive/yarm

Folders and files

Latest commit

History

Repository files navigation

yarm

Features

Basic Usage

First Time You Run a New Report

Every Subsequent Time

Advanced Usage

Example Report Config File

Custom Postprocessing Code

Status: Alpha (Try It!)

Requirements

Installation

Documentation

Is yarm for You?

Other Open Source Tools You Might Prefer

Contributing

License

Issues

Credits

About

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases 3

Packages 0

Contributors 2

Languages

Is `yarm` for You?

Packages