You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Do we really need such stub methods that only call the same superclass method with the same arguments? That's already the default which occurs if no method is present. By my understanding, doc-comment tools like Sphinx will, in their current versions, already propagate superclass API docs down to subclasses.
The only thing that's varying is the comment, and while it expresses a different expected-type from the superclass, in practice that doc may be misleading: I think (but have not recently checked) that these SaveLoad.load() methods can return objects that may not be what the caller expects. They return the class that's in the file, not the class-that-.load()-was-called-on.
If so, it might be a worthwhile short-term step as soon as 4.0.0 – for limiting the risk of confusion & requirement for redundant/caveat-filled docs – to deprecate the practice of calling SpecificClass.load(filename) entirely, despite its common appearance in previously-idiomatic gensim example code. Instead, either (1) call it only on class SaveLoad itself, to express that the only expectation for the returned type is that it's a SaveLoad subclass; (2) promote load functionality to model-specific top-level functions in each relevant model – a bit more like the load_facebook_model() function for loading Facebook-FasttText-native-format models – which might themselves do some type-checking, so any docs which imply they return a certain type are true; (3) just make one utils.py generic load() or load_model(), perhaps with an optional class-enforcement parameter, and encourage its use.
(For explicitness, I think I like this third option. In practice, it might appear in example code as:
from gensim.utils import load_model
from gensim.models import Word2Vec
w2v_model_we_hope = load_model('w2v.bin')
w2v_model_or_error = load_model('w2v.bin', expected_class=Word2Vec)
Plenty of code where the file is saved/loaded in the same example block, or under strong expectations & naming conventions, might skip the enforced-type-checking – but it'd be an option & true/explicit, rather than something that's implied-but-not-really-enforced in the current idiom Word2Vec.load('kv.bin'))
Despite the effort involved in making such changes, they could minimize duplicated code/comments & avoid some unintuitive gotchas in the current SaveLoad approach. They might also help make a future migration to some more standard big-model-serialization convention (as proposed by #2848) cleaner.
Do we really need such stub methods that only call the same superclass method with the same arguments? That's already the default which occurs if no method is present. By my understanding, doc-comment tools like Sphinx will, in their current versions, already propagate superclass API docs down to subclasses.
The only thing that's varying is the comment, and while it expresses a different expected-type from the superclass, in practice that doc may be misleading: I think (but have not recently checked) that these
SaveLoad
.load()
methods can return objects that may not be what the caller expects. They return the class that's in the file, not the class-that-.load()
-was-called-on.If so, it might be a worthwhile short-term step as soon as 4.0.0 – for limiting the risk of confusion & requirement for redundant/caveat-filled docs – to deprecate the practice of calling SpecificClass.load(filename) entirely, despite its common appearance in previously-idiomatic gensim example code. Instead, either (1) call it only on class
SaveLoad
itself, to express that the only expectation for the returned type is that it's aSaveLoad
subclass; (2) promote load functionality to model-specific top-level functions in each relevant model – a bit more like theload_facebook_model()
function for loading Facebook-FasttText-native-format models – which might themselves do some type-checking, so any docs which imply they return a certain type are true; (3) just make oneutils.py
genericload()
orload_model()
, perhaps with an optional class-enforcement parameter, and encourage its use.(For explicitness, I think I like this third option. In practice, it might appear in example code as:
Plenty of code where the file is saved/loaded in the same example block, or under strong expectations & naming conventions, might skip the enforced-type-checking – but it'd be an option & true/explicit, rather than something that's implied-but-not-really-enforced in the current idiom
Word2Vec.load('kv.bin')
)Despite the effort involved in making such changes, they could minimize duplicated code/comments & avoid some unintuitive gotchas in the current
SaveLoad
approach. They might also help make a future migration to some more standard big-model-serialization convention (as proposed by #2848) cleaner.Originally posted by @gojomo in #2939 (comment)
The text was updated successfully, but these errors were encountered: