Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize Vf2++ matching order #375

Merged
merged 1 commit into from
Jul 6, 2021

Conversation

georgios-ts
Copy link
Collaborator

This PR removes the unstable sort and the (unnecessary) use of a HashSet that would make the result of VF2++ heuristic matching order non-deterministic.

@georgios-ts georgios-ts mentioned this pull request Jul 1, 2021
2 tasks
@coveralls
Copy link

Pull Request Test Coverage Report for Build 990596717

  • 6 of 6 (100.0%) changed or added relevant lines in 1 file are covered.
  • 2 unchanged lines in 1 file lost coverage.
  • Overall coverage decreased (-0.02%) to 97.546%

Files with Coverage Reduction New Missed Lines %
src/isomorphism.rs 2 97.51%
Totals Coverage Status
Change from base Build 972471342: -0.02%
Covered Lines: 8388
Relevant Lines: 8599

💛 - Coveralls

Copy link
Member

@mtreinish mtreinish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

I'm curious is there any difference in performance form this? I expect the sort to be slower but the use of a vec for next_level to be faster, so I'm not sure what to expect.

Copy link
Collaborator

@IvanIsCoding IvanIsCoding left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, it will make the output deterministic

LGTM

I'm curious is there any difference in performance form this? I expect the sort to be slower but the use of a vec for next_level to be faster, so I'm not sure what to expect.

I wouldn't be surprised if the difference is negligible. This change only affects the matching order, no? If we benchmark the algorithm, I belive we will spend more time on try_match

@1ucian0
Copy link
Member

1ucian0 commented Jul 4, 2021

In case this has performance consequence, on my side, I would be okey if this is optional.

@georgios-ts
Copy link
Collaborator Author

Running this benchmark, i don't really see any significant difference:

import timeit
import random

import retworkx as rx


def permute(graph, seed=42):
    nodes = list(graph.node_indexes())

    rng = random.Random(seed)
    rng.shuffle(nodes)
    
    edges = [(nodes[u], nodes[v]) for (u, v) in graph.edge_list()]
    
    if isinstance(graph, rx.PyDiGraph):
        res = rx.PyDiGraph()
    else:
        res = rx.PyGraph()
        
    res.add_nodes_from(nodes)
    res.add_edges_from_no_data(edges)
    
    return res


def benchmark():
    n = 10000
    degrees = [10, 15, 50, 100]

    cases = []
    for deg in degrees:
        p = 2 * deg / (n - 1)

        g_a = rx.directed_gnp_random_graph(n, p, seed=42)
        g_b = permute(g_a, seed=4242)

        cases += [(g_a, g_a), (g_a, g_b)]
    
    return cases


def run(cases):
    for (g_a, g_b) in cases:
        res = rx.is_isomorphic(g_a, g_b,
                               id_order=False)
        assert res
        
cases = benchmark()
print(
    timeit.timeit("run(cases)", globals=globals(), number=5)
)
Branch Run time (sec)
master 20.348561446
#375 20.359493619

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants