Flake8 Round 2: We have to go deeper #158

Merged
merged 51 commits into from
Oct 4, 2021
Changes from 14 commits
12d352f
Fix D107
cthoyt Sep 24, 2021
fd719e8
Fix D105 and cleanup config stuff
cthoyt Sep 24, 2021
d1bd442
Fix deprecated function names
cthoyt Sep 24, 2021
5248308
Fix D104
cthoyt Sep 24, 2021
8c3c25e
Enable D103 check
cthoyt Sep 24, 2021
15fbe19
flake8 compliant.
hrshdhgd Sep 24, 2021
bd504d9
flake8 compliant
hrshdhgd Sep 24, 2021
23048b0
typo
hrshdhgd Sep 24, 2021
405e7b0
flake8 compliant
hrshdhgd Sep 24, 2021
2d0baf6
will address later
hrshdhgd Sep 24, 2021
d9998fa
flake8 compliant
hrshdhgd Sep 24, 2021
4833f18
flake8 compliant
hrshdhgd Sep 24, 2021
484ed3f
flake8 compliant
hrshdhgd Sep 24, 2021
c183999
flake8 compliant
hrshdhgd Sep 24, 2021
4fa46e5
Revert black ignore
cthoyt Sep 27, 2021
b47274d
Update test_data.py
cthoyt Sep 27, 2021
0eba665
Follow-up removal of ensure_test_dir_exists
cthoyt Sep 27, 2021
b060b28
Update more mypy issues
cthoyt Sep 27, 2021
68c838f
remove type from docstrings and standardized
hrshdhgd Sep 27, 2021
a46642e
added docstring
hrshdhgd Sep 27, 2021
e2d5235
added more info
hrshdhgd Sep 27, 2021
f532dcc
better doc elaboration
hrshdhgd Sep 27, 2021
8574a54
mypy:@cthoyt need help with cliques.py line 129
hrshdhgd Sep 28, 2021
6fd4230
Address antipattern in `invert_dict()`
cthoyt Sep 28, 2021
b16e887
Fix typing in cliques.py
cthoyt Sep 28, 2021
a1b83cf
Fix typo of spurious parentheses
cthoyt Sep 28, 2021
0854096
Fix type annotations and add FIXMEs in docs
cthoyt Sep 28, 2021
1e7cf51
This is already a builtin function
cthoyt Sep 28, 2021
d298e67
Update util.py
cthoyt Sep 28, 2021
260dfbf
Update docs and type annotations for test cases
cthoyt Sep 28, 2021
88a606d
based on discussion with Chris
hrshdhgd Sep 29, 2021
3673552
docstring update
hrshdhgd Sep 29, 2021
8315f05
docstring update
hrshdhgd Sep 29, 2021
abee8cb
unnecessary import
hrshdhgd Sep 29, 2021
2673fc3
deleted priors
hrshdhgd Sep 29, 2021
d0d9b34
Remove unnecessary re-implementation of pd.read_csv
cthoyt Sep 29, 2021
02184f9
There is no situation in which this can be none
cthoyt Sep 29, 2021
fdc5c97
@matentzn, didn't follow solution for this, help!
hrshdhgd Sep 29, 2021
0f576bb
Merge branch 'more-flake8' of https://github.com/mapping-commons/ssso…
hrshdhgd Sep 29, 2021
0e9f681
More docs updates
cthoyt Sep 29, 2021
9295e6f
Update sssom/sparql_util.py
cthoyt Sep 29, 2021
e1f1a6f
removed row
hrshdhgd Sep 29, 2021
b4a7bd8
Update docs, variable names, readability
cthoyt Oct 2, 2021
294a246
Update sssom/util.py
cthoyt Oct 2, 2021
8c02a4a
Update util.py
cthoyt Oct 2, 2021
329ff49
Update variable names
cthoyt Oct 2, 2021
cf80b47
Update util.py
cthoyt Oct 2, 2021
cf612a4
Update util.py
cthoyt Oct 2, 2021
f4991f6
Update util.py
cthoyt Oct 2, 2021
0f74c64
Fix mismatch of docs to functionality
cthoyt Oct 2, 2021
6a2cce8
Update util.py
cthoyt Oct 2, 2021
5 changes: 1 addition & 4 deletions setup.cfg
@@ -146,13 +146,10 @@ ignore =
S318 # don't worry about unsafe xml
S310 # TODO remove this later and switch to using requests
# The following are documentation things (remove one at a time in future PRs)
BLK100
D100
D101
D102
D103
D104
D105
D107
exclude =
sssom/sssom_datamodel.py
sssom/cliquesummary.py
32 changes: 29 additions & 3 deletions sssom/cliques.py
@@ -1,6 +1,6 @@
import hashlib
import statistics
from typing import Any, Dict
from typing import Any, Dict, List

import networkx as nx
import pandas as pd
@@ -59,7 +59,17 @@ def to_networkx(msdf: MappingSetDataFrame) -> nx.DiGraph:
return g


def split_into_cliques(msdf: MappingSetDataFrame):
def split_into_cliques(msdf: MappingSetDataFrame) -> List[MappingSetDocument]:
"""Split MappingSetDataFrames into cliques.

:param msdf: MappingSetDataFrame object
:type msdf: MappingSetDataFrame
:raises TypeError: If Mappings is not of type List
:raises TypeError: If each mapping is not of type Mapping
:raises TypeError: If Mappings is not of type List
:return: List of MappingSetDocument objects
:rtype: List[MappingSetDocument]
"""
doc = to_mapping_set_document(msdf)
g = to_networkx(msdf)
gen = nx.algorithms.components.strongly_connected_components(g)
@@ -87,6 +97,13 @@ def split_into_cliques(msdf: MappingSetDataFrame):


def invert_dict(d: Dict[str, str]) -> Dict[str, str]:
"""Invert Dictionary.

:param d: Dictionary
:type d: Dict[str, str]
:return: Dictionary with keys and values interchanged
:rtype: Dict[str, str]
"""
invdict: Dict[str, Any] = {}
for k, v in d.items():
if v not in invdict:
@@ -95,7 +112,16 @@ def invert_dict(d: Dict[str, str]) -> Dict[str, str]:
return invdict


def get_src(src, cid):
def get_src(src: str, cid: str):
Member Author:

`src` should be `Optional[str]`; see the `None` check in the function body.
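
A minimal sketch of the annotation this comment asks for. The `else` branch is not shown in the diff, so returning `src` unchanged is an assumption, as is the `str` return type.

```python
from typing import Optional


def get_src(src: Optional[str], cid: str) -> str:
    """Return the source label for an entity.

    :param src: Explicit source label, or None if no source was recorded
    :param cid: CURIE of the entity, e.g. ``HP:0000001``
    :return: ``src`` if it is given, otherwise the prefix of ``cid``
    """
    if src is None:
        # Fall back to the CURIE prefix: everything before the first colon.
        return cid.split(":")[0]
    # Assumption: the branch hidden in the diff returns src as-is.
    return src
```

With this signature, a type checker accepts `get_src(None, "HP:0000001")`, which the `src: str` annotation in the diff would reject.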

"""Get source.

:param src: Source
:type src: str
:param cid: CURIE
:type cid: str
:return: Source
:rtype: [type]
"""
if src is None:
return cid.split(":")[0]
else:
27 changes: 27 additions & 0 deletions sssom/context.py
@@ -15,14 +15,29 @@


def get_jsonld_context():
"""Get JSON-LD context.

:return: JSON-LD context
:rtype: Any
"""
return json.loads(sssom_context, strict=False)


def get_external_jsonld_context():
"""Get JSON-LD context.

:return: JSON-LD context
:rtype: Any
"""
return json.loads(sssom_external_context, strict=False)


def get_built_in_prefix_map() -> PrefixMap:
"""Get built-in prefix map.

:return: Prefix map
:rtype: PrefixMap
"""
contxt = get_jsonld_context()
prefix_map = {}
for key in contxt["@context"]:
@@ -36,6 +51,13 @@ def get_built_in_prefix_map() -> PrefixMap:
def add_built_in_prefixes_to_prefix_map(
prefix_map: Optional[PrefixMap] = None,
) -> PrefixMap:
"""Add built-in prefix map.

:param prefix_map: Prefix map, defaults to None
Member Author:

Obviously it defaults to None; that's the default. What happens if None is passed?
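
One way to answer that question in the docstring itself, as a sketch. `BUILT_IN_PREFIX_MAP` is a hypothetical stand-in for the result of `get_built_in_prefix_map()`, `PrefixMap` is assumed to be a plain `Dict[str, str]`, and the merge rule for a non-empty map (custom entries win) is an assumption because that branch is not shown in the diff.

```python
from typing import Dict, Optional

PrefixMap = Dict[str, str]  # assumption: the project's alias is a str-to-str dict

# Hypothetical stand-in for the map returned by get_built_in_prefix_map().
BUILT_IN_PREFIX_MAP: PrefixMap = {
    "owl": "http://www.w3.org/2002/07/owl#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def add_built_in_prefixes_to_prefix_map(
    prefix_map: Optional[PrefixMap] = None,
) -> PrefixMap:
    """Make sure a prefix map also contains the built-in prefixes.

    :param prefix_map: A custom prefix map. If None or empty, a copy of the
        built-in prefix map is returned on its own.
    :return: A new prefix map holding the custom entries plus every built-in
        prefix that the custom map did not already define
    """
    if not prefix_map:
        return dict(BUILT_IN_PREFIX_MAP)
    merged = dict(prefix_map)
    for prefix, uri_prefix in BUILT_IN_PREFIX_MAP.items():
        # Assumption: an entry supplied by the caller takes precedence.
        merged.setdefault(prefix, uri_prefix)
    return merged
```

The `:param:` line now states the behaviour for `None` instead of repeating the default value that the signature already shows.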

:type prefix_map: Optional[PrefixMap], optional
:return: Prefix map
:rtype: PrefixMap
"""
builtinmap = get_built_in_prefix_map()
if not prefix_map:
prefix_map = builtinmap
@@ -51,6 +73,11 @@ def add_built_in_prefixes_to_prefix_map(


def get_default_metadata() -> Metadata:
"""Get Default metadata.
Member Author:

It's almost never sufficient to just restate the name of the function. What is default metadata? How was it chosen? Imagine a user could read this docstring and not know what it is or where

Member Author:

Don't be afraid to be redundant with other functions' docstrings, since documentation readers might only find the one function they want to know more about.

Collaborator:

Agreed. For this PR to pass, do I need to fix all of these? Or should fixing them just be on the longer-term agenda?

Member Author:

Yes. The purpose of this PR is cleanup, so if the job isn't done (and done well), there's no advantage to merging it. As you can see, it's indeed possible to game flake8 by writing unhelpful docstrings. It's not really helpful for users to read documentation that just barely passes flake8; we have to go the extra step and make sure that the docstrings can educate and inform users. If we as the developers and maintainers don't currently know what some functions do, that is not a great situation to be in. It means we need to put in the legwork to figure out what all of the code does (even if we weren't the ones who wrote it).

One alternative is to scale back the scope of this PR (i.e., reinstate the ignore for D103) and delete the stub docstrings, but I think it's rather the case that @hrshdhgd just needs a bit of training and practice in writing useful docstrings. To be honest, after a quick Google search, I didn't find any material that really showed what a good docstring is beyond the minimum syntactic and semantic requirements for it to be readable by Sphinx, so this will take some effort and a bit more back-and-forth in the comments.

Member Author:

Another note on why this should be done well: it's 99% likely that once the D103 check (the one that makes sure public functions have docstrings) is enabled and passing, many of these docstrings will never be touched again. If we allow an unhelpful docstring to pass through review, it will never be fixed: there will be no flake8 error to flag it for us, and no second flake8 campaign where we organize effort to fix it.
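
To illustrate the point about gaming flake8, here is a sketch of two docstrings for the same function. `get_prefix_stub` and `get_prefix` are hypothetical names chosen for illustration and are not part of this PR. Both versions satisfy D103, since that check only looks for the presence of a docstring, but only the second tells a reader anything the function name does not.

```python
def get_prefix_stub(curie: str) -> str:
    """Get prefix."""
    return curie.split(":", 1)[0]


def get_prefix(curie: str) -> str:
    """Return the namespace prefix of a CURIE.

    A CURIE has the form ``prefix:local_id``. The prefix is everything before
    the first colon and identifies the resource that issued the identifier.

    :param curie: A compact URI such as ``HP:0000001``
    :return: The prefix, e.g. ``HP``. A string without a colon is returned
        unchanged, because there is nothing to split off.
    """
    return curie.split(":", 1)[0]
```

The second docstring answers the questions a reviewer would otherwise have to ask: what the input looks like, what comes back, and what happens in the edge case.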


:return: Metadata
:rtype: Metadata
"""
contxt = get_jsonld_context()
contxt_external = get_external_jsonld_context()
prefix_map = {}
33 changes: 31 additions & 2 deletions sssom/parsers.py
@@ -3,7 +3,7 @@
import re
import typing
from collections import Counter
from typing import Any, Dict, List, Optional, Set, TextIO, Union, cast
from typing import Any, Callable, Dict, List, Optional, Set, TextIO, Union, cast
from urllib.request import urlopen
from xml.dom import Node, minidom
from xml.dom.minidom import Document
@@ -294,6 +294,17 @@ def from_sssom_json(
prefix_map: Dict[str, str],
meta: Dict[str, str] = None,
) -> MappingSetDataFrame:
"""Get data from JSON.

:param jsondoc: JSON document
:type jsondoc: Union[str, dict, TextIO]
:param prefix_map: Prefix map
:type prefix_map: Dict[str, str]
:param meta: metadata, defaults to None
:type meta: Dict[str, str], optional
:return: MappingSetDataFrame object
:rtype: MappingSetDataFrame
"""
prefix_map = _ensure_prefix_map(prefix_map)
mapping_set = cast(
MappingSet, JSONLoader().load(source=jsondoc, target_class=MappingSet)
@@ -452,7 +463,17 @@ def from_obographs(
# All read_* take as an input a a file handle and return a MappingSetDataFrame (usually wrapping a from_* method)


def get_parsing_function(input_format, filename):
def get_parsing_function(input_format: str, filename: str) -> Callable:
"""Return appropriate function based on input format of file.

:param input_format: File format
:type input_format: str
:param filename: Filename
:type filename: str
:raises Exception: Unknown file format
:return: Appropriate 'read' function
:rtype: function
"""
if input_format is None:
input_format = get_file_extension(filename)
if input_format == "tsv":
@@ -633,6 +654,14 @@ def to_mapping_set_document(msdf: MappingSetDataFrame) -> MappingSetDocument:
def split_dataframe(
msdf: MappingSetDataFrame,
) -> typing.Mapping[str, MappingSetDataFrame]:
"""Split DataFrame.
Member Author:

How does it split the dataframe? Why?

:param msdf: MappingSetDataFrame object
:type msdf: MappingSetDataFrame
:raises RuntimeError: Object is None
Member Author:

Which object?
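
A sketch of a docstring and error message that would answer both questions (how the split works, and which object is None). This is a plain-Python stand-in and not the PR's implementation: it works on a list of row dictionaries instead of a `MappingSetDataFrame`, and the `object_id` column, the grouping rule, and the key format are all assumptions, since the diff only shows the extraction of subject prefixes.

```python
from collections import defaultdict
from typing import Dict, List, Optional

Row = Dict[str, str]


def split_rows_by_prefix(rows: Optional[List[Row]]) -> Dict[str, List[Row]]:
    """Group mapping rows by the prefixes of their subject and object.

    Rows whose ``subject_id`` and ``object_id`` have the same pair of CURIE
    prefixes land in the same group, so each group holds the mappings between
    exactly two resources (e.g. all HP-to-MP mappings).

    :param rows: Mapping rows, each with ``subject_id`` and ``object_id`` CURIEs
    :raises RuntimeError: If ``rows`` is None, i.e. no mapping table was loaded
    :return: Groups keyed by ``<subject prefix>_<object prefix>`` in lowercase
    """
    if rows is None:
        # Name the missing object so the caller knows what went wrong.
        raise RuntimeError("Cannot split mappings: the mapping table is None")
    groups: Dict[str, List[Row]] = defaultdict(list)
    for row in rows:
        subject_prefix = row["subject_id"].split(":", 1)[0]
        object_prefix = row["object_id"].split(":", 1)[0]
        key = f"{subject_prefix}_{object_prefix}".lower()
        groups[key].append(row)
    return dict(groups)
```

The summary line says what is grouped and why, and the `:raises:` entry names the object that was None.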

:return: Mapping object
:rtype: typing.Mapping[str, MappingSetDataFrame]
"""
if msdf.df is None:
raise RuntimeError
subject_prefixes = set(msdf.df["subject_id"].str.split(":", 1, expand=True)[0])
27 changes: 27 additions & 0 deletions sssom/sparql_util.py
@@ -84,10 +84,28 @@ def query_mappings(config: EndpointConfig) -> MappingSetDataFrame:


def curiefy_row(row: Mapping[str, str], config: EndpointConfig) -> Dict[str, str]:
"""CURIE-fy row.
Member Author:

Not helpful.


:param row: Mapping object row
:type row: Mapping[str, str]
:param config: Configuration
:type config: EndpointConfig
:return: Dictionary of CURIEs
:rtype: Dict[str, str]
"""
return {k: contract_uri(v, config) for k, v in row.items()}


def contract_uri(uristr: str, config: EndpointConfig) -> str:
"""Contract URI.
Member Author:

What does it mean to contract a URI?
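
A sketch of a docstring that would answer this question. To keep the example self-contained it takes the prefix map directly as a dictionary, where the function in the diff takes an `EndpointConfig`. The body of the loop is not shown in the diff, so the matching rule (first URI prefix that the string starts with) is an assumption.

```python
from typing import Dict, Optional


def contract_uri(uristr: str, prefix_map: Optional[Dict[str, str]]) -> str:
    """Contract a full URI into a CURIE using a prefix map.

    Contracting replaces a known URI prefix with its short name, so that
    ``http://purl.obolibrary.org/obo/HP_0000001`` becomes ``HP:0000001``
    when the map contains ``{"HP": "http://purl.obolibrary.org/obo/HP_"}``.

    :param uristr: The full URI to contract
    :param prefix_map: Short prefix to URI prefix. If None, nothing is contracted
    :return: The CURIE, or the unchanged URI if no URI prefix matches
    """
    if prefix_map is None:
        return uristr
    for prefix, uri_prefix in prefix_map.items():
        # Assumption: the first matching URI prefix wins.
        if uristr.startswith(uri_prefix):
            local_id = uristr[len(uri_prefix):]
            return f"{prefix}:{local_id}"
    return uristr
```

Stating the fallback for an unmatched URI in the `:return:` line also tells the reader that the result is not guaranteed to be a CURIE.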


:param uristr: URI string
:type uristr: str
:param config: Configuration
:type config: EndpointConfig
:return: URI string (contracted)
Member Author:

It's not a URI string once it's been contracted; then it's a CURIE.

:rtype: str
"""
if config.prefix_map is None:
return uristr
for k, v in config.prefix_map.items():
@@ -97,6 +115,15 @@ def contract_uri(uristr: str, config: EndpointConfig) -> str:


def expand_curie(curie: str, config: EndpointConfig) -> URIRef:
"""Expand CURIE.

:param curie: CURIE
:type curie: str
:param config: Configuration
:type config: EndpointConfig
:return: URI of CURIE
:rtype: URIRef
"""
if config.prefix_map is None:
return URIRef(curie)
for k, v in config.prefix_map.items():