[WIP] Reduce and explicitly define our public API #115

timokau · 2020-04-17T10:39:07Z

Description

This is a work in progress. I haven't quite figured out how githubs draft PRs work.

With the absolute imports in the tests we could have csrank/__init__.py empty now. The test suite would still pass. I think that would actually be my preference: Start with an empty top-level, add re-exports when experience shows that it would be more convenient.

However @kiudee advocated for some top-level imports on slack

[...] We should think about what our library looks like for a user. In scikit-learn, they group everything by topic (e.g. model_selection, linear_model etc) and inside each sub-module the hierarchy is flattened.

My opinion (please feel free to discuss): If we opt for a flat hierarchy, then there should be as few top-level class imports as possible. For example we should hide:

Dataset generators like DiscreteChoiceDatasetGenerator etc

Model cores like FATENetworkCore etc
behind their appropriate modules.
To summarize: Only the directly usable learners should be visible as top-level imports. Everything else is hidden in sub-modules

[...] the idea is that to a user it looks like he is importing learners directly from csrank. For dataset generators (s)he needs to use a module. The same thing for abstract classes like the cores
That way the immediate tab completion is very clean
accessing the sub modules is of course always possible

Which is also also fine by me.

Motivation and Context

Currently we re-export everything at the toplevel. That is suboptimal. It pollutes our public namespace with a lot of irrelevant definitions. It provides multiple ways to import things, while there should be one preferred way, ideally only one way to do it. It makes tab import less useful. It necessitates star imports.

Therefore we want to reduce our public API to an explicitly defined subset.

How Has This Been Tested?

Ran the testsuite and the pre-commit hooks.

Does this close/impact existing issues?

Closes #105.

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)

Checklist:

My code follows the code style of this project.
My change requires a change to the documentation.
I have updated the documentation accordingly.
I have added tests to cover my changes.

timokau · 2020-04-17T10:44:00Z

@kiudee, I would take your slack comments to mean that the top-level csrank.__all__ should contain

Choice Functions
- AllPositive
- FATEChoiceFunction
- FATELinearChoiceFunction
- FETAChoiceFunction
- FETALinearChoiceFunction
- GeneralizedLinearModel
- PairwiseSVMChoiceFunction
- RankNetChoiceFunction
Discrete Choice Functions
- RandomBaselineDC
- CmpNetDiscreteChoiceFunction
- FATEDiscreteChoiceFunction
- FATELinearDiscreteChoiceFunction
- FETADiscreteChoiceFunction
- FETALinearDiscreteChoiceFunction
- GeneralizedNestedLogitModel
- MixedLogitModel
- ModelSelector
- MultinomialLogitModel
- NestedLogitModel
- PairedCombinatorialLogit
- PairwiseSVMDiscreteChoiceFunction
- RankNetDiscreteChoiceFunction
Object Rankers
- CmpNet
- ExpectedRankRegression
- FATEObjectRanker
- FATELinearObjectRanker
- FETAObjectRanker
- FETALinearObjectRanker
- ListNet
- RankNet
- RankSVM
- RandomBaselineRanker

Is that understanding correct?

codecov · 2020-04-17T10:54:09Z

Codecov Report

Merging #115 into master will decrease coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master     #115      +/-   ##
==========================================
- Coverage   60.01%   59.97%   -0.04%     
==========================================
  Files         116      116              
  Lines        7642     7636       -6     
==========================================
- Hits         4586     4580       -6     
  Misses       3056     3056

Impacted Files	Coverage Δ
csrank/__init__.py	`100.00% <100.00%> (ø)`
csrank/tests/test_fate.py	`92.59% <100.00%> (ø)`
csrank/tests/test_ranking.py	`100.00% <100.00%> (ø)`
csrank/tests/test_util.py	`100.00% <100.00%> (ø)`

timokau · 2020-05-30T09:21:16Z

For the public record, @kiudee agreed in slack. Also the python conventions seem to agree with a two-level hierarchy: https://www.python.org/dev/peps/pep-0423/#avoid-deep-nesting

This is a pre-requisite for cleaning up our public API.

timokau · 2020-05-30T11:37:41Z

I'm sorry, its been a while and I don't quite remember what I meant myself with comment #115 (comment). Does this mean all those functions should be directly accessible from csrank or below their respective submodules?

I.e. do we want

Option 1

from csrank.choicefunction import FATEChoiceFunction
from csrank.objectranking import FATEObjectRanker

Option 2

from csrank.choicefunction import FATE
from csrank.objectranking import FATE as FATEObject

Option 3

from csrank import FATEChoiceFunction
from csrank import FATEObjectRanker

I have kidn of operated under the assumption (1) and implemented that for now. But the comment could also be read as option (3). If we go this way, I think we should

move most other top-level modules (callbacks, constants, layers, learner, losses, metrics_np, metrics, numpy_util, tensorflow_util, tunable, tuning) under the util namespace
maybe reneame dataset_reader to datasets, since it also provides generation functionality

Then we would have no direct exports from csrank, and only the top-level modules

choicefunction
objectranking
discretechoice
util
datasets

I think this would improve tab-completion and discoverability of our API.

timokau force-pushed the public-api branch from 719cc7a to d5dba80 Compare April 17, 2020 10:44

timokau mentioned this pull request Apr 30, 2020

[WIP] Make our estimators compatible with scikit-learn #116

Draft

7 tasks

kiudee marked this pull request as draft May 30, 2020 08:01

timokau added 2 commits May 30, 2020 13:18

Remove top-level imports in tests

130b258

This is a pre-requisite for cleaning up our public API.

Use a strict two-level structure in our public API

427d05e

timokau force-pushed the public-api branch from d5dba80 to 427d05e Compare May 30, 2020 11:41

kiudee added this to the 1.3 milestone Jun 5, 2020

This was referenced Jun 8, 2020

Importing csrank causes error without optional theano installed #137

Closed

Make our optional dependencies optional again #140

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] Reduce and explicitly define our public API #115

[WIP] Reduce and explicitly define our public API #115

timokau commented Apr 17, 2020

timokau commented Apr 17, 2020

codecov bot commented Apr 17, 2020 •

edited

Loading

timokau commented May 30, 2020

timokau commented May 30, 2020

[WIP] Reduce and explicitly define our public API #115

Are you sure you want to change the base?

[WIP] Reduce and explicitly define our public API #115

Conversation

timokau commented Apr 17, 2020

Description

Motivation and Context

How Has This Been Tested?

Does this close/impact existing issues?

Types of changes

Checklist:

timokau commented Apr 17, 2020

codecov bot commented Apr 17, 2020 • edited Loading

Codecov Report

timokau commented May 30, 2020

timokau commented May 30, 2020

Option 1

Option 2

Option 3

codecov bot commented Apr 17, 2020 •

edited

Loading