Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Magical java typing for additional attributes #124

Merged
merged 9 commits into from
Jun 28, 2022

Conversation

KasiaKoz
Copy link
Collaborator

Mentioned in 'Future Work' of #118 and addition to that branch of work. This PR is a change in addition to #118 and allows the representation of additional attributes in a new, simpler format. (please note the branch of a branch merging request 😬)

Old, long form format:

'attributes': {
        'additional_attrib': {'name': 'additional_attrib', 'class': 'java.lang.String', 'text': 'attrib_value'}
    }

is still allowed, and encouraged when the user wants to be specific about java types of the values.

This PR makes GeNet use simple form attributes:

'attributes': {
        'additional_attrib': 'attrib_value' 
    }

across the board, i.e. when

  • reading xml files,
  • setting 'simplified' tag for the network
  • crs for network and schedule
  • reading osm (i.e., when creating network graph from osm, the format of osm tags being saved)
  • road pricing osm ID to toll matching

Breaking changes, in some cases

Because genet now represents all additional attributes in simple form, the following possible problems will need to be announced (via new release of genet with changelog).

  • In project scripts that search for links with values of specific osm tag values. Example:
links = n.extract_links_on_edge_attributes(
    conditions= {'attributes': {'osm:way:highway': {'text': 'primary'}}},
)

this will break because genet now represents this data in short form, do the way to query it is:

links = n.extract_links_on_edge_attributes(
    conditions= {'attributes': {'osm:way:highway': 'primary'}},
)

Other annoying things

The road pricing functionality in genet that allows OSM ID to toll mappings is affected by this change. The problem is that now the data can be many types. In particular, OSM IDs can be integers (genet's type when reading OSM), float (PT2MATSim network's type via being saved as java.lang.Long) and strings (puma's default when creating network from OSM). This is dealt with via an additional parameter in the offending function, the user needs to be mindful of their data types and set them accordingly to match. This was also noted in the update of the example jupyter notebook for road pricing. All defaults are set to strings which should mean no breaking changes in anyone's current setup.

Please note, this PR includes a complete rerun of many jupyter notebooks and updates to some documentation stored whithin, which I think will make it look a lot bigger than it actually is...

@KasiaKoz KasiaKoz requested review from mfitz and ana-kop June 23, 2022 11:19
Copy link
Contributor

@mfitz mfitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left a handful of minor comments.

I've also left a comment about how a network could be changed purely by the round trip of reading in to GeNet and then writing out without making any modifications (attributes with an original class=java.lang.Long will become class=java.lang.Float), that has raised a flag in my brain. I can't think through the ramifications of this behaviour, but it seems worth trying to get a handle on whether or not this change is important, particularly in terms of the effect on MATSim.

genet/input/matsim_reader.py Show resolved Hide resolved
genet/output/matsim_xml_writer.py Outdated Show resolved Hide resolved
genet/output/matsim_xml_writer.py Outdated Show resolved Hide resolved
genet/utils/java_dtypes.py Show resolved Hide resolved
genet/utils/java_dtypes.py Outdated Show resolved Hide resolved
tests/test_data/matsim/network.xml Show resolved Hide resolved
tests/test_java_dtypes.py Show resolved Hide resolved
tests/test_output_matsim_xml_writer.py Show resolved Hide resolved
@mfitz mfitz self-requested a review June 27, 2022 12:30
@KasiaKoz
Copy link
Collaborator Author

KasiaKoz commented Jun 27, 2022

@mfitz the two biggest things that I tried to address post your review:

  • reading comma separated values to sets, did not fix right away as I don't think it's going to cause massive issues at this point in time but logged it: GeNet reads comma separated attributes into sets #125 and https://arupdigital.atlassian.net/browse/LAB-1750
  • the Java-python, python-Java functions are not bijective, even more so now that I added the extra java classes. To this effect, I added a flag to force reading matsim networks using the old, long form in case the java types need to be preserved: d019152. This also has an added bonus of backwards compatibility which will be appreciated by owners of current project scripts.

Copy link
Contributor

@mfitz mfitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good!

genet/output/matsim_xml_writer.py Show resolved Hide resolved
genet/utils/java_dtypes.py Show resolved Hide resolved
tests/test_java_dtypes.py Show resolved Hide resolved
genet/output/matsim_xml_writer.py Show resolved Hide resolved
@KasiaKoz KasiaKoz merged commit 29a5bbe into arbitrary-attribs Jun 28, 2022
@KasiaKoz KasiaKoz deleted the arb-attribs-java-typing branch June 28, 2022 08:51
KasiaKoz added a commit that referenced this pull request Jul 14, 2022
* add todos

* refactor additional attrib check method

* extract/isolate additional attrib save method

* extract/isolate general attrib save method

* refactor/generalise general attrib save method

* allow saving additional attribs for stops

* allow saving additional attribs for routes

* allow saving additional attribs for services

* allow saving additional attribs for schedule

* (add missing file) allow saving additional attribs for schedule

* general functions to prepare and check for optional and required xml attributes for nodes, links and stops

* fix var import oopsie

* add saving arbitrary additional attributes for network nodes

* tidy up consolidating stop projections

* refactor inputs_handler/outputs_handler

* refactor test file names inputs_handler/outputs_handler

* add read additional node attributes from xml

* refactor and add additional attrib read placeholders for schedule elems

* add additional attrib read for stops

* add additional attrib read for routes

* add additional attrib read for services

* add additional attrib read for schedule

* extend additional attrib read for global network attribs

* lint cleanup

* fix notebook import

* more notebook fixes

* fix up network level attributes, add read

* change to nested dict support for stop/route/service attribute setting

* tidy up todo

* add attributes as an always present attribute

* fix disappearing attribute values post save

* change simplified tag

* possible fix for pt attributes being read as None

* Magical java typing for additional attributes (#124)

* add assuming java types for additional attributes when writing to xml

* add assuming python types for additional java-typed attributes when reading from xml

* fix lingering dependence on long form attributes

* update notebooks with simple form attributes

* extra fixes, add dtype param to road pricing to match with OSM ID dtypes

* increase code coverage

* address PR comments

* PR update: enable forcing long form when reading matsim networks

* lint

* address PR comments

* Multimodal access egress script (#119)

* add example script for snapping pt stops to other parts of the network

* warn of unsnapped stops if distance threshold reached prematurely

* relax condition

* add handling multiple network modes and teleport (access only) modes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants