
Transfer Excel data I/O from message_ix #289

Merged Mar 25, 2020
Commits (44)

4d32748
Add backend.io.s_write_excel
khaeru Mar 20, 2020
6458b1a
Add backend.io.s_read_excel
khaeru Mar 20, 2020
0613cf8
Expand TestScenario.test_excel_io
khaeru Mar 20, 2020
f551acc
Adjust tests.backend.test_base.test_class
khaeru Mar 20, 2020
ddfe9e7
Remove unused utils.pd_read, .pd_write
khaeru Mar 20, 2020
730c0a4
Coerce path inputs to {to,read}_excel to pathlib.Path
khaeru Mar 20, 2020
5b8a705
Handle old- and new-style sheets for index sets
khaeru Mar 21, 2020
3cdb804
Improve exception handling in JDBCBackend._get_item
khaeru Mar 21, 2020
89b128c
Complete test_excel_io
khaeru Mar 21, 2020
543c010
Use Scenario.scheme if no model arg to Scenario.solve
khaeru Mar 21, 2020
2c75041
Improve exception handling in JDBCBackend._get_item 2
khaeru Mar 21, 2020
9c9894e
Document limitations of JDBCBackend
khaeru Mar 21, 2020
db38dd5
Handle existing items with read_excel(…, init_items=True)
khaeru Mar 21, 2020
7469c12
Scenario.add_set() with an empty key is a no-op
khaeru Mar 21, 2020
6f73837
Omit names from the item name -> ixmp type map if no data written
khaeru Mar 21, 2020
86b984a
Only log warning when reading sets/pars with mismatched idx_names
khaeru Mar 21, 2020
3a45711
Update docstrings in backend.io
khaeru Mar 21, 2020
f110a4a
Harmonize handling of exceptions, format strings in JDBCBackend
khaeru Mar 21, 2020
0d2233a
Add 'export' and expand 'import' CLI commands for Excel scenario data
khaeru Mar 21, 2020
3d0a6b9
Test Excel I/O CLI
khaeru Mar 21, 2020
ab636e3
Read CSV or Excel in utils.import_timeseries
khaeru Mar 21, 2020
e84922a
Add missing comma in test_import_ts
khaeru Mar 21, 2020
f46f55c
Adjust test_scenario.test_set
khaeru Mar 21, 2020
ad9dd12
Specify exceptions raised by Backend.item_get_elements
khaeru Mar 21, 2020
4fd571b
Adjust TestScenario.test_from_url
khaeru Mar 21, 2020
d563601
Update release notes
khaeru Mar 21, 2020
ac7ab5a
Add 'file-io' documentation page
khaeru Mar 21, 2020
56ec0a2
Check type before len of key in Scenario.add_set
khaeru Mar 21, 2020
11bb0e0
Move utils.import_timeseries to backend.io.ts_read_file
khaeru Mar 21, 2020
47278d7
Remove unused import in utils
khaeru Mar 21, 2020
15ee2b5
Add IXMP_JDBC_EXCEPTION_VERBOSE global option & tests, use in CI
khaeru Mar 23, 2020
b26795c
Allow Windows line endings in test_verbose_exception
khaeru Mar 23, 2020
660b9af
Raise ValueError from JDBCBackend.get
khaeru Mar 23, 2020
f24aa1e
Correct Travis YAML syntax for matrix builds
khaeru Mar 24, 2020
fe37314
Update error handling and logging
zikolach Mar 23, 2020
21ec585
Sync pytest in ci/conda-requirements.txt with setup.py
khaeru Mar 24, 2020
50a3b4f
Add cli.VersionType docstring
khaeru Mar 24, 2020
957a9d6
Lint tests.backend.test_jdbc
khaeru Mar 24, 2020
51f5068
Parametrize test_log_level
khaeru Mar 24, 2020
a0173f9
Improve documentation of {read,write}_{excel,file}, get_log_level
khaeru Mar 24, 2020
56e7b33
Lint tests.core.test_platform
khaeru Mar 24, 2020
b94a3e8
Increase coverage to 100% in ixmp.cli
khaeru Mar 25, 2020
e6798c8
Expand test_excel_io to cover unreadable items
khaeru Mar 25, 2020
386f3fe
Test TimeSeries.read_file(…, lastyear=…)
khaeru Mar 25, 2020

Files changed

2 changes: 2 additions & 0 deletions .appveyor.yml
@@ -4,6 +4,8 @@
environment:
  # Use the JDK installed on AppVeyor images
  JAVA_HOME: C:\Program Files\Java\jdk13
  # Always display verbose exceptions in JDBCBackend
  IXMP_JDBC_EXCEPTION_VERBOSE: '1'
  matrix:
    - PYTHON_VERSION: "3.6"
    - PYTHON_VERSION: "3.7"
21 changes: 12 additions & 9 deletions .travis.yml
@@ -1,20 +1,23 @@
# Continuous integration configuration for Travis
# NB use https://config.travis-ci.com/explore to validate changes

dist: xenial

language: r

r: release

matrix:
  include:
    - os: linux
      env: PYENV=py37
    - os: osx
      env: PYENV=py37
    # turn these on once travis support gets a little better, see pyam for example
    # - os: windows
    #   env: PYENV=py37
# - Build against Python 3.7
# - Always display verbose exceptions in JDBCBackend
env:
  PYENV=py37
  IXMP_JDBC_EXCEPTION_VERBOSE=1

os:
- linux
- osx
# TODO turn this on once Travis Windows support improves
# - windows

r_packages:
- IRkernel
1 change: 1 addition & 0 deletions RELEASE_NOTES.rst
@@ -4,6 +4,7 @@ Next release
All changes
-----------

- `#286 <https://github.com/iiasa/ixmp/pull/286>`_: Add :meth:`.Scenario.to_excel` and :meth:`.read_excel`; this functionality is transferred to ixmp from :mod:`message_ix`.
- `#270 <https://github.com/iiasa/ixmp/pull/270>`_: Include all tests in the ixmp package.
- `#212 <https://github.com/iiasa/ixmp/pull/212>`_: Add :meth:`Model.initialize` API to help populate new Scenarios according to a model scheme.
- `#267 <https://github.com/iiasa/ixmp/pull/267>`_: Apply units to reported quantities.
2 changes: 1 addition & 1 deletion ci/conda-requirements.txt
@@ -8,7 +8,7 @@ numpydoc
pandas
pint
pytest-cov
pytest>=3.9
pytest>=5
PyYAML
sparse
sphinx
5 changes: 5 additions & 0 deletions doc/source/api-backend.rst
@@ -36,6 +36,10 @@ Provided backends

.. tip:: Modifying an item by adding or deleting elements invalidates its cache.

JDBCBackend has the following limitations:

- The `comment` argument to :meth:`Platform.add_unit` is limited to 64 characters.

.. automethod:: ixmp.backend.jdbc.start_jvm

Backend API
@@ -86,6 +90,7 @@ Backend API

close_db
get_auth
get_log_level
get_nodes
get_scenarios
get_units
3 changes: 3 additions & 0 deletions doc/source/api-python.rst
@@ -72,6 +72,7 @@ TimeSeries
is_default
last_update
preload_timeseries
read_file
remove_geodata
remove_timeseries
run_id
@@ -144,6 +145,7 @@ Scenario
load_scenario_data
par
par_list
read_excel
remove_par
remove_set
remove_solution
@@ -152,6 +154,7 @@
set_list
set_meta
solve
to_excel
var
var_list

1 change: 1 addition & 0 deletions doc/source/api.rst
@@ -10,6 +10,7 @@ On separate pages:

api-python
api-backend
file-io
api-model
reporting

78 changes: 78 additions & 0 deletions doc/source/file-io.rst
@@ -0,0 +1,78 @@
File formats and input/output
*****************************

In addition to the data management features provided by :doc:`api-backend`, ixmp is able to write and read :class:`TimeSeries` and :class:`Scenario` data to and from files.
This page describes those options and formats.

Time series data
================

Time series data can be:

- Read using :meth:`.import_timeseries`, or the CLI command ``ixmp import timeseries FILE`` for a single TimeSeries object.
- Written using :meth:`.export_timeseries_data` for multiple TimeSeries objects at once.

Both CSV and Excel files in the IAMC time-series format are supported.
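
A minimal sketch of this in Python (the platform configuration, model and scenario names, and file name here are illustrative placeholders, not part of ixmp)::

    import ixmp

    mp = ixmp.Platform()
    ts = ixmp.TimeSeries(mp, model='model name', scenario='scenario name',
                         version='new')
    # Add data from an IAMC-format CSV file, then save it to the database
    ts.read_file('timeseries.csv')
    ts.commit('Import time series data from file')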

.. _excel-data-format:

Scenario/model data
===================

Scenario data can be read from/written to Microsoft Excel files using :meth:`.Scenario.read_excel` and :meth:`.to_excel`, and the CLI commands ``ixmp import scenario FILE`` and ``ixmp export FILE``.
The files have the following structure:

- One sheet named 'ix_type_mapping' with two columns:

  - 'item': the name of an ixmp item.
  - 'ix_type': the item's type as a length-3 string: 'set', 'par', 'var', or 'equ'.

- One sheet per item.

- Sets:

  - Sheets for index sets have one column, with a header cell that is the set name.
  - Sheets for one-dimensional indexed sets have one column, with a header cell that is the index set name.
  - Sheets for multi-dimensional indexed sets have multiple columns.
  - Sets with no elements are represented by empty sheets.

- Parameters, variables, and equations:

  - Sheets have zero (for scalar items) or more columns with headers that are the index *names* (not necessarily sets; see below) for those dimensions.
  - Parameter sheets have 'value' and 'unit' columns.
  - Variable and equation sheets have 'lvl' and 'mrg' columns.
  - Items with no elements are not included in the file.
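
A minimal round-trip sketch (``scenario`` and ``new_scenario`` are illustrative :class:`Scenario` objects; the file name is arbitrary)::

    # Write all set, parameter, variable, and equation data to file
    scenario.to_excel('scenario-data.xlsx')

    # Read the data back into an empty Scenario; init_items=True creates
    # any sets and parameters that do not already exist there
    new_scenario.read_excel('scenario-data.xlsx', init_items=True)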

Limitations
-----------

Reading variables and equations
   The ixmp API provides no way to set the data of variables and equations, because these are considered model solution data.

   Thus, while :meth:`.to_excel` will write files containing variable and equation data, :meth:`.read_excel` can not add these to a Scenario, and only emits log messages indicating that they are ignored.

Multiple dimensions indexed by the same set
   :meth:`.read_excel` provides the `init_items` argument to create new sets and parameters when reading a file.
   However, the file format does not capture information needed to reconstruct the original data in all cases.

   For example::

       scenario.init_set('foo')
       scenario.add_set('foo', ['a', 'b', 'c'])
       scenario.init_par(name='bar', idx_sets=['foo'])
       scenario.init_par(
           name='baz',
           idx_sets=['foo', 'foo'],
           idx_names=['foo', 'another_dimension'])
       scenario.to_excel('file.xlsx')

   :file:`file.xlsx` will contain sheets named 'bar' and 'baz'.
   The sheet 'bar' will have column headers 'foo', 'value', and 'unit', which are adequate to reconstruct the parameter.
   However, the sheet 'baz' will have column headers 'foo' and 'another_dimension'; this information does not allow ixmp to infer that 'another_dimension' is indexed by 'foo'.

   To work around this limitation, initialize 'baz' with the correct dimensions before reading its data::

       new_scenario.init_par(
           name='baz',
           idx_sets=['foo', 'foo'],
           idx_names=['foo', 'another_dimension'])
       new_scenario.read_excel('file.xlsx', init_items=True)
63 changes: 58 additions & 5 deletions ixmp/backend/base.py
@@ -4,6 +4,7 @@

from ixmp.core import TimeSeries, Scenario
from . import ItemType
from .io import ts_read_file, s_read_excel, s_write_excel


class Backend(ABC):
@@ -177,9 +178,30 @@ def open_db(self):
    def set_log_level(self, level):
        """OPTIONAL: Set logging level for the backend and other code.

        The default implementation has no effect.

        Parameters
        ----------
        level : int or Python logging level

        See also
        --------
        get_log_level
        """

    def get_log_level(self):
        """OPTIONAL: Get logging level for the backend and other code.

        The default implementation has no effect.

        Returns
        -------
        str
            Name of a :py:ref:`Python logging level <levels>`.

        See also
        --------
        set_log_level
        """

    @abstractmethod
@@ -232,6 +254,12 @@ def read_file(self, path, item_type: ItemType, **kwargs):
        the `path` and `item_type` methods. For all other combinations, it
        **must** raise :class:`NotImplementedError`.

        The default implementation supports:

        - `path` ending in '.xlsx', `item_type` is ItemType.MODEL: read a
          single Scenario given by kwargs['filters']['scenario'] from file
          using :meth:`pandas.DataFrame.read_excel`.

        Parameters
        ----------
        path : os.PathLike
@@ -260,8 +288,13 @@ def read_file(self, path, item_type: ItemType, **kwargs):
        --------
        write_file
        """
        # TODO move message_ix.core.read_excel here
        raise NotImplementedError
        s, filters = self._handle_rw_filters(kwargs.pop('filters', {}))
        if path.suffix in ('.csv', '.xlsx') and item_type is ItemType.TS and s:
            ts_read_file(s, path, **kwargs)
        elif path.suffix == '.xlsx' and item_type is ItemType.MODEL and s:
            s_read_excel(self, s, path, **kwargs)
        else:
            raise NotImplementedError
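        # Illustrative usage (not part of this diff): a caller such as
        # Scenario.read_excel might reach this default implementation roughly as
        #
        #   backend.read_file(
        #       Path('scenario-data.xlsx'),
        #       ItemType.MODEL,
        #       filters={'scenario': scenario},
        #       init_items=True,
        #   )
        #
        # where the 'scenario' filter selects the target Scenario and any other
        # keyword arguments are passed on to s_read_excel().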

    def write_file(self, path, item_type: ItemType, **kwargs):
        """OPTIONAL: Write Platform, TimeSeries, or Scenario data to file.
@@ -270,6 +303,12 @@ def write_file(self, path, item_type: ItemType, **kwargs):
        the `path` and `item_type` methods. For all other combinations, it
        **must** raise :class:`NotImplementedError`.

        The default implementation supports:

        - `path` ending in '.xlsx', `item_type` is ItemType.MODEL: write a
          single Scenario given by kwargs['filters']['scenario'] to file using
          :meth:`pandas.DataFrame.to_excel`.

        Parameters
        ----------
        path : os.PathLike
@@ -289,8 +328,11 @@ def write_file(self, path, item_type: ItemType, **kwargs):
        --------
        read_file
        """
        # TODO move message_ix.core.to_excel here
        raise NotImplementedError
        s, filters = self._handle_rw_filters(kwargs.pop('filters', {}))
        if path.suffix == '.xlsx' and item_type is ItemType.MODEL and s:
            s_write_excel(self, s, path)
        else:
            raise NotImplementedError
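        # Illustrative usage (not part of this diff): Scenario.to_excel might
        # reach this default implementation roughly as
        #
        #   backend.write_file(
        #       Path('scenario-data.xlsx'),
        #       ItemType.MODEL,
        #       filters={'scenario': scenario},
        #   )
        #
        # _handle_rw_filters() pops the 'scenario' entry from `filters` and
        # s_write_excel() performs the actual writing.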

    @staticmethod
    def _handle_rw_filters(filters: dict):
@@ -355,6 +397,12 @@ def get(self, ts: TimeSeries, version):
        -------
        None

        Raises
        ------
        ValueError
            If :attr:`~.TimeSeries.model` or :attr:`~.TimeSeries.scenario` does
            not exist on the Platform.

        See also
        --------
        ts_set_as_default
@@ -742,14 +790,19 @@ def item_get_elements(self, s: Scenario, type, name, filters=None):
            When *type* is 'set' and *name* an index set (not indexed by other
            sets).
        dict
            When *type* is 'equ', 'par', or 'set' and *name* is scalar (zero-
            When *type* is 'equ', 'par', or 'var' and *name* is scalar (zero-
            dimensional). The value has the keys 'value' and 'unit' (for 'par')
            or 'lvl' and 'mrg' (for 'equ' or 'var').
        pandas.DataFrame
            For mapping sets, or all 1+-dimensional values. The dataframe has
            one column per index name with dimension values; plus the columns
            'value' and 'unit' (for 'par') or 'lvl' and 'mrg' (for 'equ' or
            'var').

        Raises
        ------
        KeyError
            If *name* does not exist in *s*.
        """

    @abstractmethod