-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix runtime warnings #94
Conversation
genet/utils/graph_operations.py
Outdated
@@ -270,6 +272,18 @@ def build_attribute_dataframe(iterator, keys: Union[list, str], index_name: str | |||
return df | |||
|
|||
|
|||
def get_pandas_dtype(dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed - this feels pretty hacky, but it gets the job done and only broke a single test in the end. I'm still hoping for a more elegant solution, but at least we know this does work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. I wonder where we should put this method, there is another one here: https://github.com/arup-group/genet/blob/master/genet/utils/dict_support.py#L115 which is also in theme of working with pandas data, and dict_support
is not really the best place for it... should we maybe start a new module in utils.. ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I also wondered if this was the best place for it. Perhaps a new pandas-utils
module in utils is the thing?
@@ -40,7 +40,7 @@ numpy>=1.18.4 | |||
osmnx==0.15.0 | |||
osmread @ git+https://github.com/dezhin/osmread.git@d8d3fe5edd15fdab9526ea7a100ee6c796315663#egg=osmread-0.2.dev0 | |||
packaging==20.4 | |||
pandas>=1.0.3 | |||
pandas==1.3.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Upgrading pandas scared me slightly, but it didn't break any unit tests or notebooks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to >= quite recently, I wanted it to play nicely with MC (but don't need to in the end because SCOP went a different direction). I prefer the fixed versions I think
@@ -54,8 +54,8 @@ py>=1.8.1 | |||
Pygments==2.7.4 | |||
pyparsing==2.4.7 | |||
pyproj>=3.1.0 | |||
pytest>=5.4.2 | |||
pytest-cov>=2.8.1 | |||
pytest==6.2.5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had to upgrade these to make the parallel test stuff work on my machine.
'agency_id': {0: float('nan')}, 'route_desc': {0: float('nan')}, 'route_url': {0: float('nan')}, | ||
actual_stops = gtfs['stops'].to_dict() | ||
assert_semantically_equal(expected_stops, actual_stops) | ||
expected_routes = {'route_id': {0: 'service'}, 'route_short_name': {0: 'name_2'}, 'route_long_name': {0: ''}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Strangely, I only had to replace NaN with None in expected routes, and not the other GTFS-related dictionaries. Not sure why this is the case, and did not investigate. Can you think of a reason, and does this seem like a problem?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, weird! I would have thought that None would happen when you have a mixture of empty and non empty values in a column, but that's not the case here. I don't think it should make a difference though, pandas considers None and NaN empty, so if you do fillna
both None and NaN values will be considered
@@ -2,7 +2,7 @@ | |||
<!DOCTYPE network SYSTEM "http://www.matsim.org/files/dtd/network_v2.dtd"> | |||
<network> | |||
<attributes> | |||
<attribute name="crs" class="java.lang.String">{'init': 'epsg:27700'}</attribute> | |||
<attribute name="crs" class="java.lang.String">epsg:27700</attribute> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't sure of the implications of having to make this change to a test network - could this mean existing networks will be broken now? I used this branch of genet for the network simplification step in a Matesto smoke test pipeline, and nothing fell over, but that doesn't feel like comprehensive reassurance. Presumably any network written by an older version of GeNet (or Puma?) will include the CRS attribute in the deprecated style?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will break anything actually, lol
GeNet just writes the projection to xml, doesnt use it (user has to specify the projection for the network always), it does use simplified tag though
genet/genet/inputs_handler/read.py
Line 47 in 909e034
if 'simplified' not in n.graph.graph: |
genet/genet/inputs_handler/matsim_reader.py
Line 161 in 909e034
elif elem.attrib['name'] == 'simplified': |
I think it would be cool to try read the projection from xml, but we've been doing fine with people specifying each time so I don't feel a strong pull towards it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, okay. Yes, reading the projection from XML seems like a nice-to-have at best.
…er the number of CPUs available
6913c27
to
0a22663
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️ the new warning numbers
And loving that we no longer have this dictionary 'init' BS all around, thanks @mfitz !
gdf_geometries.crs = self.epsg | ||
gdf_geometries = gpd.GeoDataFrame(self.link_attribute_data_under_keys(['geometry']), crs=self.epsg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice
@@ -213,7 +214,7 @@ def get_attribute_data_under_key(iterator: Iterable, key: Union[str, dict]): | |||
:param iterator: list or iterator yielding (index, attribute_dictionary) | |||
:param key: either a string e.g. 'modes', or if accessing nested information, a dictionary | |||
e.g. {'attributes': {'osm:way:name': 'text'}} | |||
:return: dictionary where keys are indicies and values are data stored under the key | |||
:return: dictionary where keys are indices and values are data stored under the key |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I always do that typo, always
genet/utils/graph_operations.py
Outdated
@@ -270,6 +272,18 @@ def build_attribute_dataframe(iterator, keys: Union[list, str], index_name: str | |||
return df | |||
|
|||
|
|||
def get_pandas_dtype(dict): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. I wonder where we should put this method, there is another one here: https://github.com/arup-group/genet/blob/master/genet/utils/dict_support.py#L115 which is also in theme of working with pandas data, and dict_support
is not really the best place for it... should we maybe start a new module in utils.. ?
@@ -40,7 +40,7 @@ numpy>=1.18.4 | |||
osmnx==0.15.0 | |||
osmread @ git+https://github.com/dezhin/osmread.git@d8d3fe5edd15fdab9526ea7a100ee6c796315663#egg=osmread-0.2.dev0 | |||
packaging==20.4 | |||
pandas>=1.0.3 | |||
pandas==1.3.3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I changed it to >= quite recently, I wanted it to play nicely with MC (but don't need to in the end because SCOP went a different direction). I prefer the fixed versions I think
'agency_id': {0: float('nan')}, 'route_desc': {0: float('nan')}, 'route_url': {0: float('nan')}, | ||
actual_stops = gtfs['stops'].to_dict() | ||
assert_semantically_equal(expected_stops, actual_stops) | ||
expected_routes = {'route_id': {0: 'service'}, 'route_short_name': {0: 'name_2'}, 'route_long_name': {0: ''}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh, weird! I would have thought that None would happen when you have a mixture of empty and non empty values in a column, but that's not the case here. I don't think it should make a difference though, pandas considers None and NaN empty, so if you do fillna
both None and NaN values will be considered
@@ -2,7 +2,7 @@ | |||
<!DOCTYPE network SYSTEM "http://www.matsim.org/files/dtd/network_v2.dtd"> | |||
<network> | |||
<attributes> | |||
<attribute name="crs" class="java.lang.String">{'init': 'epsg:27700'}</attribute> | |||
<attribute name="crs" class="java.lang.String">epsg:27700</attribute> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this will break anything actually, lol
GeNet just writes the projection to xml, doesnt use it (user has to specify the projection for the network always), it does use simplified tag though
genet/genet/inputs_handler/read.py
Line 47 in 909e034
if 'simplified' not in n.graph.graph: |
genet/genet/inputs_handler/matsim_reader.py
Line 161 in 909e034
elif elem.attrib['name'] == 'simplified': |
I think it would be cool to try read the projection from xml, but we've been doing fine with people specifying each time so I don't feel a strong pull towards it
Closes #93
Fixes a few hundred warnings - still a few hanging around.
I also had to bump pytest-related versions to make the parallel test stuff work on my local machine, and modified the number of processes so that it is automatically matched to the number of available processors reported by the machine.
I had a brief look at the
xfail
test, but couldn't see a quick fix for it.Before
After