Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Feature] Reversed Graph and Transform Module #331

Merged
merged 32 commits into from
Jan 4, 2019
Merged

[Feature] Reversed Graph and Transform Module #331

merged 32 commits into from
Jan 4, 2019

Conversation

mufeili
Copy link
Member

@mufeili mufeili commented Dec 31, 2018

Description

@jermainewang Here is an API for creating the reverse of a DGLGraph with node/edge features shared. This can be potentially helpful when dealing with undirected graphs. We also make a reorg by putting line_graph and reverse into a transform.py module.

Checklist

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [$CATEGORY] (such as [Model], [Doc], [Feature]])
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the my best knowledge, examples are either not affected by this change,
    or have been fixed to be compatible with this change
  • Related issue is referred in this PR

Changes

  • API implementation
  • Related docs completed
  • Test case provided
  • A small fix for the installation guide: removed an outdated TODO

@jermainewang
Copy link
Member

@mufeili I would make it simpler -- the dgl.reverse generates a new graph that optionally shares node/edge features with the input graph. Similar to a tensor operation that generates new tensor, our graph operation always generates new graph. We could also have a member function DGLGraph.reverse that calls dgl.reverse. In fact, I feel line_graph should be defined in such manner. We could put them in a transform.py module. The ReversedGraph is not needed and we don't need to maintain an _reverse field in DGLGraph.

@mufeili
Copy link
Member Author

mufeili commented Jan 1, 2019

@jermainewang If the reversed graph shares node/edge features with the original graph and the topology of the original graph changes after creating the reverse, wouldn't it be very strange that the number of the rows in the frame is different from the number of nodes/edges?

@jermainewang
Copy link
Member

That's true and is the same for line graph shared mode and subgraph shared mode. This is the problem of mutable operation. For example, in pytorch:

>>> import torch as th
>>> x = th.ones((10,))
>>> y, z = x.split(5)
>>> x[0] = -1  # mutate x
>>> x[7] = -2  # mutate x
>>> y
tensor([-1.,  1.,  1.,  1.,  1.])
>>> z
tensor([ 1.,  1., -2.,  1.,  1.])

@mufeili
Copy link
Member Author

mufeili commented Jan 1, 2019

@jermainewang So let's say we ignore this issue for the time being, put reverse and line_graph in a transform.py and allow users to do things like g.reverse(), g.line_graph()?

@zheng-da
Copy link
Collaborator

zheng-da commented Jan 1, 2019

Why do we need this operation in an undirected graph? When we reverse an undirected graph, we get the same graph.

@mufeili
Copy link
Member Author

mufeili commented Jan 1, 2019

@zheng-da I suppose the point is that when we represent an undirected static graph using a directed graph, rather than adding edges for both directions, we can do rg=g.reverse() instead.

@zheng-da
Copy link
Collaborator

zheng-da commented Jan 1, 2019

Is it how we implement undirected graphs? It doesn't sound right. For example, when we do update_all, messages should be sent to all nodes. If we split an undirected graph into two direct graphs, we'll have to run update_all on two graphs and nodes in a graph can only receive some messages.

@zheng-da
Copy link
Collaborator

zheng-da commented Jan 1, 2019

undirected graphs are very common. We shouldn't implement undirected graphs very differently from directed graphs.

@VoVAllen
Copy link
Collaborator

VoVAllen commented Jan 1, 2019

What's the typical scenario for this API? Also what's the relationship between undirected graph and this?

To me it's better to design an API for sharing features rather than a new type of graph.

@jermainewang
Copy link
Member

My thoughts for undirected graph:

  • Current solution for undirected graph is fine -- aka treating undirected graph as bi-directional graph.
  • What is missing is when the bi-directional edges (u, v) and (v, u) share the same features.

This motivates the dgl.reverse API. Some demo codes:

G = ...  # some directed graph
Gr = dgl.reverse(G, shared=True)  # the reverse graph where u->v in G shares feature with v->u in Gr
# message passing (suppose we want to do update_all)
# first send along two directions
G.send(ALL, ...)
Gr.send(ALL, ...)
# then recv either G or Gr to update the nodes
G.recv(ALL, ...)

This looks tedious I agree. So the next step is to provide an undirected graph class (using the reverse API as a building block):

class UndirectedDGLGraph:
  def __init__(self, graph_data):
    self.G = DGLGraph(graph_data)
    self.Gr = dgl.reverse(self.G)
  def update_all(mfunc, rfunc, afunc):
    self.G.send(ALL, mfunc)
    self.Gr.send(ALL, mfunc)
    self.G.recv(ALL, rfunc, afunc)

The rest message passing APIs can be implemented in the same way. If you guys don't like this technical solution, feel free to propose another one.

@jermainewang
Copy link
Member

Hmm... I found the problem here. G.recv cannnot receive messages sent in Gr. Any suggestions?

@BarclayII
Copy link
Collaborator

Do we have applications where (u, v) and (v, u) in the same graph share features?

In my mind, the reverse graph is only helpful for doing stuff such as "reversed pull" (like that in graph_nets).

@jermainewang
Copy link
Member

Do we have applications where (u, v) and (v, u) in the same graph share features?

No there is not, but a high chance to be one in the future. Think about a task to learn similarity between two nodes. The "similarity" would be shared by both direction.

In my mind, the reverse graph is only helpful for doing stuff such as "reversed pull" (like that in graph_nets).

Reversed pull looks like a push on the reverse direction. It is doable in current framework (using bi-directional edges). The question is how shall we support shared edge features?

@BarclayII
Copy link
Collaborator

No there is not, but a high chance to be one in the future. Think about a task to learn similarity between two nodes. The "similarity" would be shared by both direction.

Then currently one can somehow manually maintain a mapping between edges and their corresponding reverse edges. Once the mapping is obtained, one can scatter the rows into the graph. Sounds tedious but doable with the current API.

Reversed pull looks like a push on the reverse direction. It is doable in current framework (using bi-directional edges). The question is how shall we support shared edge features?

No they are different. Reversed pull is still a pull: there is only one receiver node, but instead of pulling along inbound edges like we do now, it pulls along outbound edges. Pushing is sending messages from one node to all neighbors, so there are multiple receivers.

@VoVAllen
Copy link
Collaborator

VoVAllen commented Jan 2, 2019

We should let user control the message passing process although it would be tedious. Since user may want do message passing on G and rG at the same time or do rG at first and so on.
The UndirectedDGLGGraph is a bit confusing since message passing itself has the direction. Is it possible to do sharing features within the scope of setting edge attribute?

@zheng-da
Copy link
Collaborator

zheng-da commented Jan 2, 2019

actually, all of our test graphs we use (cora, citeseer and pubmed) are undirected graphs. They aren't undirected graphs originally, but they were converted to undirected graphs by the original paper. So is the Reddit graph.

@zheng-da
Copy link
Collaborator

zheng-da commented Jan 2, 2019

If the goal is to handle an undirected graph better, I don't think we should use reverse to solve the problem. To be honest, it's quite counter-intuitive. In an undirected graph, a vertex needs to get messages from both in-edges and out-edges. Neither G nor Gr can receive all messages required by the following node update function.

The simplest solution, I think, is to modify the graph index to make sure (u, v) and (v, u) have the same edge id. This is the root cause of this problem. It seems to me that this is easy to achieve. Whenever adding a new edge, we can reverse the edge and assign the same edge id to the reversed edge.

@VoVAllen
Copy link
Collaborator

VoVAllen commented Jan 2, 2019

If edges share id, what should be the result of g.edata['feat']?

If we show all the edges attributes including "reverse edges", it seems inevitable to risk breaking the consistency on shared feature by user's mutation.

@jermainewang
Copy link
Member

jermainewang commented Jan 2, 2019 via email

@zheng-da
Copy link
Collaborator

zheng-da commented Jan 2, 2019

so we need edge id that refers to edge data and message id that refers to messages? How about we use (edge id, u > v) as a message id? The message from u to v and message from v to u are both stored on edge (u, v), u > v refers to the right message. Is this doable?

@mufeili mufeili changed the title [Feature] Reversed Graph with Shared Features [Feature] Reversed Graph and Transform Module Jan 3, 2019
@mufeili mufeili requested a review from jermainewang January 3, 2019 19:37
python/dgl/graph.py Outdated Show resolved Hide resolved
python/dgl/transform.py Outdated Show resolved Hide resolved
python/dgl/transform.py Outdated Show resolved Hide resolved
python/dgl/transform.py Outdated Show resolved Hide resolved
@jermainewang
Copy link
Member

LGTM. Thanks murphy!

@jermainewang jermainewang merged commit 24bbdb7 into dmlc:master Jan 4, 2019
@jermainewang jermainewang mentioned this pull request Feb 18, 2019
26 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants