Merge branch 'upstream-master' into feature/moto-version-update
* upstream-master:
  S3 client refactor (spotify#2482)
  Rename to rpc_log_retries, and make it apply to all the logging involved
  Factor log_exceptions into a configuration parameter
  Fix attribute forwarding for tasks with dynamic dependencies (spotify#2478)
  Add a visiblity level for luigi.Parameters (spotify#2278)
  Add support for multiple requires and inherits arguments (spotify#2475)
  Add metadata columns to the RDBMS contrib (spotify#2440)
  Fix race condition in luigi.lock.acquire_for (spotify#2357) (spotify#2477)
  tests: Use RunOnceTask where possible (spotify#2476)
  Optional TOML configs support (spotify#2457)
  Added default port behaviour for Redshift (spotify#2474)
  Add codeowners file with default and specific example (spotify#2465)
  Add Data Revenue to the `blogged` list (spotify#2472)
dlstadther committed Aug 14, 2018
2 parents 483d5b1 + c696f40 commit e5a131f
Showing 36 changed files with 1,578 additions and 207 deletions.
12 changes: 12 additions & 0 deletions .github/CODEOWNERS
@@ -0,0 +1,12 @@
# The following patterns are used to auto-assign review requests
# to specific individuals. Order is important; the last matching
# pattern takes the most precedence.

# These owners will be the default owners for everything in
# the repo. Unless a later match takes precedence,
* @dlstadther @Tarrasch @ulzha

# Specific files, directories, paths, or file types can be
# assigned more specifically.
contrib/redshift*.py @dlstadther

1 change: 1 addition & 0 deletions README.rst
@@ -149,6 +149,7 @@ or held presentations about Luigi:
* `Leipzig University Library <https://ub.uni-leipzig.de>`_ `(presentation, 2016) <https://de.slideshare.net/MartinCzygan/build-your-own-discovery-index-of-scholary-eresources>`__ / `(project) <https://finc.info/de/datenquellen>`__
* `Synetiq <https://synetiq.net/>`_ `(presentation, 2017) <https://www.youtube.com/watch?v=M4xUQXogSfo>`__
* `Glossier <https://www.glossier.com/>`_ `(blog, 2018) <https://medium.com/glossier/how-to-build-a-data-warehouse-what-weve-learned-so-far-at-glossier-6ff1e1783e31>`__
* `Data Revenue <https://www.datarevenue.com/>`_ `(blog, 2018) <https://www.datarevenue.com/en/blog/how-to-scale-your-machine-learning-pipeline>`_

Some more companies are using Luigi but haven't had a chance yet to write about it:

40 changes: 34 additions & 6 deletions doc/configuration.rst
@@ -1,18 +1,35 @@
Configuration
=============

All configuration can be done by adding configuration files. They are looked for in:
All configuration can be done by adding configuration files.

* ``/etc/luigi/client.cfg``
* ``luigi.cfg`` (or its legacy name ``client.cfg``) in your current working directory
* ``LUIGI_CONFIG_PATH`` environment variable
Supported config parsers:

* ``cfg`` (default)
* ``toml``

You can choose the parser via the ``LUIGI_CONFIG_PARSER`` environment variable. For example, ``LUIGI_CONFIG_PARSER=toml``.

Config files for the default (cfg) parser are looked for in:

* ``/etc/luigi/client.cfg`` (deprecated)
* ``/etc/luigi/luigi.cfg``
* ``client.cfg`` (deprecated)
* ``luigi.cfg``
* ``LUIGI_CONFIG_PATH`` environment variable

Config files for the `TOML <https://github.com/toml-lang/toml>`_ parser are looked for in:

* ``/etc/luigi/luigi.toml``
* ``luigi.toml``
* ``LUIGI_CONFIG_PATH`` environment variable

Both config lists increase in priority (from low to high). The order only matters in case of key conflicts (see docs for ConfigParser.read_). These files are meant for both the client and ``luigid``. If you decide to specify your own configuration you should make sure that both the client and ``luigid`` load it properly.

.. _ConfigParser.read: https://docs.python.org/3.6/library/configparser.html#configparser.ConfigParser.read

The config file is broken into sections, each controlling a different part of the config. Example configuration file:
The config file is broken into sections, each controlling a different part of the config.

Example cfg config:

.. code:: ini
@@ -23,6 +40,17 @@ The config file is broken into sections, each controlling a different part of th
[core]
scheduler_host=luigi-host.mycompany.foo

Example toml config:

.. code:: python

    [hadoop]
    version = "cdh4"
    streaming-jar = "/usr/lib/hadoop-xyz/hadoop-streaming-xyz-123.jar"

    [core]
    scheduler_host = "luigi-host.mycompany.foo"

.. _ParamConfigIngestion:
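
The priority rule in the documentation hunk above (later files win on key conflicts) follows from how ``ConfigParser.read`` handles a list of paths. The following is a minimal sketch using only the standard library; the file names and values are made up for illustration, and this is not Luigi's own loading code.

```python
import configparser
import os
import tempfile


def read_in_priority_order(paths):
    """Read config files in order; later files override earlier ones."""
    parser = configparser.ConfigParser()
    # read() silently skips paths that do not exist and returns
    # the list of files it actually parsed.
    loaded = parser.read(paths)
    return parser, loaded


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-ins for a global and a local config file.
    low = os.path.join(tmp, "global.cfg")
    high = os.path.join(tmp, "local.cfg")
    missing = os.path.join(tmp, "missing.cfg")

    with open(low, "w") as f:
        f.write("[core]\nscheduler_host=global-host\nscheduler_port=8082\n")
    with open(high, "w") as f:
        f.write("[core]\nscheduler_host=local-host\n")

    parser, loaded = read_in_priority_order([low, missing, high])
    host = parser.get("core", "scheduler_host")  # set in both files
    port = parser.get("core", "scheduler_port")  # set only in the first
```

Keys that appear in only one file survive unchanged, which is why the order matters only for conflicting keys.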

19 changes: 19 additions & 0 deletions doc/parameters.rst
@@ -88,6 +88,25 @@ are not the same instance:
>>> hash(c) == hash(d)
True

Parameter visibility
^^^^^^^^^^^^^^^^^^^^

Using :class:`~luigi.parameter.ParameterVisibility` you can configure parameter visibility. By default, all
parameters are public, but you can also set them hidden or private.

.. code:: python

    >>> import luigi
    >>> from luigi.parameter import ParameterVisibility

    >>> luigi.Parameter(visibility=ParameterVisibility.PRIVATE)

``ParameterVisibility.PUBLIC`` (default) - visible everywhere

``ParameterVisibility.HIDDEN`` - ignored in the web view, but saved into the database if saving ``db_history`` is enabled

``ParameterVisibility.PRIVATE`` - visible only inside the task

Parameter types
^^^^^^^^^^^^^^^
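
The three visibility levels added in the hunk above can be read as two filters: one for the web view and one for the database history. The sketch below illustrates that reading with a hypothetical ``Visibility`` enum and made-up helper functions. It does not import Luigi, the numeric values are an assumption, and Luigi's real filtering code may differ.

```python
from enum import IntEnum


class Visibility(IntEnum):
    """Hypothetical stand-in for luigi.parameter.ParameterVisibility."""
    PUBLIC = 0
    HIDDEN = 1
    PRIVATE = 2


def web_view_params(params):
    """Only PUBLIC parameters are shown in the web view."""
    return {name: value for name, (value, vis) in params.items()
            if vis is Visibility.PUBLIC}


def db_history_params(params):
    """PUBLIC and HIDDEN parameters are stored; PRIVATE ones stay in the task."""
    return {name: value for name, (value, vis) in params.items()
            if vis is not Visibility.PRIVATE}


# Made-up parameter values, each paired with its visibility level.
params = {
    "date": ("2018-08-14", Visibility.PUBLIC),
    "batch_id": ("b-42", Visibility.HIDDEN),
    "api_token": ("s3cr3t", Visibility.PRIVATE),
}

web = web_view_params(params)
history = db_history_params(params)
```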

27 changes: 27 additions & 0 deletions luigi/configuration/__init__.py
@@ -0,0 +1,27 @@
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
from .cfg_parser import LuigiConfigParser
from .core import get_config, add_config_path
from .toml_parser import LuigiTomlParser


__all__ = [
    'add_config_path',
    'get_config',
    'LuigiConfigParser',
    'LuigiTomlParser',
]
41 changes: 41 additions & 0 deletions luigi/configuration/base_parser.py
@@ -0,0 +1,41 @@
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging


# IMPORTANT: don't inherit from `object`!
# ConfigParser has some trouble in this case.
# More info: https://stackoverflow.com/a/19323238
class BaseParser:
    @classmethod
    def instance(cls, *args, **kwargs):
        """ Singleton getter """
        if cls._instance is None:
            cls._instance = cls(*args, **kwargs)
            loaded = cls._instance.reload()
            logging.getLogger('luigi-interface').info('Loaded %r', loaded)

        return cls._instance

    @classmethod
    def add_config_path(cls, path):
        cls._config_paths.append(path)
        cls.reload()

    @classmethod
    def reload(cls):
        return cls.instance().read(cls._config_paths)
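
The singleton pattern in ``base_parser.py`` relies on each subclass supplying its own ``_instance`` and ``_config_paths`` class attributes and a ``read`` method. The sketch below shows the resulting call order; ``SingletonBase`` and ``RecordingParser`` are hypothetical names used only for illustration, not Luigi classes.

```python
class SingletonBase:
    """Same shape as BaseParser above, minus the logging."""

    @classmethod
    def instance(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = cls(*args, **kwargs)
            # _instance is already set here, so reload() -> instance()
            # returns it instead of recursing.
            cls._instance.reload()
        return cls._instance

    @classmethod
    def add_config_path(cls, path):
        cls._config_paths.append(path)
        cls.reload()

    @classmethod
    def reload(cls):
        return cls.instance().read(cls._config_paths)


class RecordingParser(SingletonBase):
    """Hypothetical parser that records which paths it was asked to read."""
    _instance = None
    _config_paths = []

    def __init__(self):
        self.read_calls = []

    def read(self, paths):
        self.read_calls.append(list(paths))
        return list(paths)


first = RecordingParser.instance()   # creates the instance and loads once
second = RecordingParser.instance()  # returns the same object, no reload
RecordingParser.add_config_path("extra.cfg")  # appends and re-reads all paths
```

Because ``_instance`` is assigned before the first ``reload()``, the mutual calls between ``instance()`` and ``reload()`` terminate after one read.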
34 changes: 4 additions & 30 deletions luigi/configuration.py → luigi/configuration/cfg_parser.py
@@ -29,7 +29,6 @@
See :doc:`/configuration` for more info.
"""

import logging
import os
import warnings

@@ -38,37 +37,19 @@
except ImportError:
from configparser import ConfigParser, NoOptionError, NoSectionError

from .base_parser import BaseParser

class LuigiConfigParser(ConfigParser):

class LuigiConfigParser(BaseParser, ConfigParser):
NO_DEFAULT = object()
enabled = True
_instance = None
_config_paths = [
'/etc/luigi/client.cfg', # Deprecated old-style global luigi config
'/etc/luigi/luigi.cfg',
'client.cfg', # Deprecated old-style local luigi config
'luigi.cfg',
]
if 'LUIGI_CONFIG_PATH' in os.environ:
config_file = os.environ['LUIGI_CONFIG_PATH']
if not os.path.isfile(config_file):
warnings.warn("LUIGI_CONFIG_PATH points to a file which does not exist. Invalid file: {path}".format(path=config_file))
else:
_config_paths.append(config_file)

@classmethod
def add_config_path(cls, path):
cls._config_paths.append(path)
cls.reload()

@classmethod
def instance(cls, *args, **kwargs):
""" Singleton getter """
if cls._instance is None:
cls._instance = cls(*args, **kwargs)
loaded = cls._instance.reload()
logging.getLogger('luigi-interface').info('Loaded %r', loaded)

return cls._instance

@classmethod
def reload(cls):
@@ -124,10 +105,3 @@ def set(self, section, option, value=None):
ConfigParser.add_section(self, section)

return ConfigParser.set(self, section, option, value)


def get_config():
"""
Convenience method (for backwards compatibility) for accessing config singleton.
"""
return LuigiConfigParser.instance()
79 changes: 79 additions & 0 deletions luigi/configuration/core.py
@@ -0,0 +1,79 @@
# -*- coding: utf-8 -*-
#
# Copyright 2012-2015 Spotify AB
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
import logging
import os
import warnings

from .cfg_parser import LuigiConfigParser
from .toml_parser import LuigiTomlParser


logger = logging.getLogger('luigi-interface')


PARSERS = {
    'cfg': LuigiConfigParser,
    'conf': LuigiConfigParser,
    'ini': LuigiConfigParser,
    'toml': LuigiTomlParser,
}

# select parser via env var
DEFAULT_PARSER = 'cfg'
PARSER = os.environ.get('LUIGI_CONFIG_PARSER', DEFAULT_PARSER)
if PARSER not in PARSERS:
    warnings.warn("Invalid parser: {parser}".format(parser=PARSER))
    PARSER = DEFAULT_PARSER


def get_config(parser=PARSER):
    """Get configs singleton for parser
    """

    parser_class = PARSERS[parser]
    if not parser_class.enabled:
        logger.error((
            "Parser not installed yet. "
            "Please, install luigi with required parser:\n"
            "pip install luigi[{parser}]"
        ).format(parser=parser)
        )

    return parser_class.instance()


def add_config_path(path):
    """Select config parser by file extension and add path into parser.
    """
    if not os.path.isfile(path):
        warnings.warn("Config file does not exist: {path}".format(path=path))
        return False

    # select parser by file extension
    _base, ext = os.path.splitext(path)
    if ext and ext[1:] in PARSERS:
        parser_class = PARSERS[ext[1:]]
    else:
        parser_class = PARSERS[PARSER]

    # add config path to parser
    parser_class.add_config_path(path)
    return True


if 'LUIGI_CONFIG_PATH' in os.environ:
    add_config_path(os.environ['LUIGI_CONFIG_PATH'])
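
The extension-based selection in ``add_config_path`` can be isolated into a small pure function. The sketch below maps extensions to parser names rather than parser classes so that it runs without Luigi installed; ``PARSER_BY_EXTENSION`` and ``pick_parser`` are hypothetical names, not part of the commit.

```python
import os

# Mirrors the keys of PARSERS above, but maps to parser names (an assumption
# made so the example is self-contained).
PARSER_BY_EXTENSION = {
    "cfg": "cfg",
    "conf": "cfg",
    "ini": "cfg",
    "toml": "toml",
}


def pick_parser(path, default="cfg"):
    """Return the parser name for a path, falling back to the default."""
    _base, ext = os.path.splitext(path)
    # ext includes the leading dot (".toml"); it is empty if there is none.
    return PARSER_BY_EXTENSION.get(ext[1:], default)


toml_choice = pick_parser("/etc/luigi/luigi.toml")
conf_choice = pick_parser("site.conf")
unknown_choice = pick_parser("settings.yaml", default="toml")
bare_choice = pick_parser("luigi")
```

In the module above, the fallback is the parser selected through ``LUIGI_CONFIG_PARSER``, so a file with an unrecognised extension is handed to whichever parser the environment selected.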