Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature request: structured representation with layout information #37

Closed
unendin opened this issue Nov 27, 2019 · 4 comments
Closed

Feature request: structured representation with layout information #37

unendin opened this issue Nov 27, 2019 · 4 comments

Comments

@unendin
Copy link

unendin commented Nov 27, 2019

Great to see the new work. I use the Graph.triples() output from the previous version of penman as an intermediate format when converting amr strings to, eg, networkx or igraph. This relies on the inverted attribute of Triples. With that attribute now gone, and no public interface to the epigraph logic, it's not clear how make use of the new version.

@goodmami
Copy link
Owner

Hi, thanks for your interest and for trying it out.

I'm not quite seeing the problem. Perhaps you can explain a bit more what you are trying to do?

The inverted attribute was always a bit of a hack. I didn't want to lose the inversion information as it was a hint to the original tree layout, but also I wanted the triples to represent the pure graph. Now the triples are "pure" and the information about the original, or preferred, layout is encoded in the epigraphical markers. This way also nearly guarantees that I can reconstruct the original tree. The downside (for some) is that triples have no notion of inversion until they are laid out in a tree again.

However you can get the information that was encoded by the inverted attribute through the public API, albeit less directly. The following should work:

from penman.layout import Push

def is_inverted(graph, triple):
    for epi in graph.epidata.get(triple, []):
        if isinstance(epi, Push) and epi.variable == triple[0]:
            return True
    return False

Test it out:

>>> from penman import decode
>>> g = decode('(a / alpha :ARG1 (b / beta) :ARG1-of (g / gamma))')
>>> for t in g.triples:
...     print(t, is_inverted(g, t))
... 
('a', ':instance', 'alpha') False
('a', ':ARG1', 'b') False
('b', ':instance', 'beta') False
('g', ':ARG1', 'a') True
('g', ':instance', 'gamma') False

This is how the layout.configure() function determines if a triple should be inverted. The Push layout marker indicates that a new node context (that is, the place where the node variable and concept are specified) is created by the triple. If the variable introduced by the node context is the source of the triple, then the triple was inverted.

In any case, based on this I see a couple ways we could perhaps make Penman easier to use:

  • include an easier way to find a specific epigraphical marker, rather than having to iterate and isinstance() each of them; e.g., Graph.get_epidatum(EpidatumType)
  • the is_inverted() function could be included as some kind of approximate introspection into the layout of the graph

Does this sound reasonable to you?

@unendin
Copy link
Author

unendin commented Nov 27, 2019

Entirely reasonable. I was concerned that a function such as this, while small, might (1) miss cases and (2) break when you make additional upstream changes. Your clarification addresses the first concern; inclusion would help with the second (and be most appreciated).

@goodmami
Copy link
Owner

Ok, then I'll see if I can put it into the next release.

Let me address your concerns:

  1. This is the same logic as the layout function, so it's as accurate as you'd expect from encoding a graph. However, the layout function may make different choices depending on the rest of the graph. For instance, in my example above, if we encoded the graph with g as the top, that edge would no longer be inverted, regardless of what the epigraphical data say. Also, modifications to the graph (adding or removing triples, or clearing the epigraphical markers) could lead to a serialization that doesn't reflect the approximations of edge inversion.

  2. I'm fairly invested now in the current state of affairs, so I don't anticipate any changes that would change the logic, although the API could possibly change slightly if the issue of duplicate triples proves too troublesome (see Duplicate relations trip up the disconnected-graph check in layout configuration #34 and Duplicate relations trip up reification #35). You're right that an "official" function in the API could give reassurance as it would need to be updated to correspond to internal changes.

Finally, the main point is that edge inversion is just a symptom of decisions made when serializing the graph---they do not bear any semantic meaning. If you really care about having an accurate picture of inverted edges, I suggest you use the tree structure instead of the graph, which is available via the codec's parse() method (and format() for serialization). See here for some brief notes.

@goodmami
Copy link
Owner

@unendin I've now released v0.8.0 which includes the penman.layout.appears_inverted() function, which works as the is_inverted() function above does. The function is renamed because I want to to be clear that the concept of inversion is orthogonal to the head-dependency (or functor-argument) direction of triples in the pure graph; instead, the serialized edge the triple gave rise to may appear inverted depending on the overall graph layout.

I did not add a function for retrieving epigraphical markers of a particular type. See #39 for that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants