-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Design a new layout algorithm #35
Comments
Hi, Paul here, you may remember me as the author of matplotlib_venn_wordcloud (issue #29). I had a lazy afternoon and implemented the second approach, i.e. using Specifically, the cost function likely needs some work. At the moment, it uses the absolute difference, which doesn't work well if some of the areas are much smaller than the others. The options for import numpy as np
import matplotlib.pyplot as plt
from scipy.optimize import minimize
from shapely.geometry import Point
DEBUG = True
def get_radii(a, b, c, ab, bc, ac, abc):
areas = np.array([
a + ab + ac + abc,
b + ab + bc + abc,
c + ac + bc + abc,
])
return np.sqrt(areas / np.pi)
def get_subset_areas(origins, radii):
geometries = get_subset_geometries(origins, radii)
return [geometry.area for geometry in geometries]
def get_subset_geometries(origins, radii):
a, b, c = [Point(*origin).buffer(radius) for origin, radius in zip(origins, radii)]
return [
a.difference(b.union(c)), # A
b.difference(a.union(c)), # B
c.difference(b.union(c)), # C
a.intersection(b).difference(c), # AB
b.intersection(c).difference(a), # BC
a.intersection(c).difference(b), # AC
a.intersection(b).intersection(c), # ABC
]
def get_point_on_a_circle(origin, radius, angle):
"""Compute the (x, y) coordinate of a point at a specified angle
on a circle given by its (x, y) origin and radius."""
x0, y0 = origin
x = x0 + radius * np.cos(angle)
y = y0 + radius * np.sin(angle)
return np.array([x, y])
def solve_venn3_circles(desired_areas):
# [A, B, C, AB, BC, AC, ABC]
desired_areas = np.array(desired_areas)
radii = get_radii(*desired_areas)
def cost_function(origins):
current_areas = get_subset_areas(origins.reshape(-1, 2), radii)
cost = np.abs(current_areas - desired_areas) # absolute difference; probably not the best choice
return np.sum(cost)
# initialize origins such that all rings overlap
initial_origins = np.array(
[get_point_on_a_circle(np.zeros((2)), np.min(radii), angle) for angle in np.pi * np.array([0, 2/3, 4/3])]
)
result = minimize(
cost_function,
initial_origins.flatten(),
method='SLSQP',
jac='2-point',
options=dict(ftol=1e-3, disp=DEBUG),
)
if not result.success:
print("Warning: could not compute circle positions for given subsets.")
print(f"scipy.optimize.minimize: {result.message}.")
origins = result.x.reshape((-1, 2))
if DEBUG:
print("Desired areas:", desired_areas)
print("Returned areas:", get_subset_areas(origins, radii))
return origins, radii
if __name__ == "__main__":
def test(A, B, C, AB, BC, AC, ABC, ax=None):
origins, radii = solve_venn3_circles([A, B, C, AB, BC, AC, ABC])
if ax is None:
fig, ax = plt.subplots()
for origin, radius in zip(origins, radii):
ax.add_patch(plt.Circle(origin, radius, alpha=0.33))
ax.set_aspect("equal")
ax.autoscale_view()
ax.axis("off")
ax.set_title(f"A={A}, B={B}, C={C},\nAB={AB}, BC={BC}, AC={AC}, ABC={ABC}")
fig, axes = plt.subplots(2, 3, figsize=(15,10))
axes = axes.ravel()
test(1, 1, 1, 0.5, 0.5, 0.5, 0.25, axes[0]) # canonical example
test(3, 2, 1, 0.5, 0.4, 0.3, 0.2, axes[1]) # different areas
test(1, 1, 1, 0, 0, 0, 0, axes[2]) # no intersections; not bad, but could be handled better
test(1, 0, 0, 1, 0, 0, 1, axes[3]) # a is superset to b, b is superset to c
test(2, 2, 2, 0.5, 0, 0.5, 0, axes[4]) # no BC, and no ABC
test(2, 2, 2, 0.1, 0.1, 0.1, 0, axes[5]) # no ABC
# issue #34
test(167, 7, 25, 41, 174, 171, 51)
# issue #30
test(1, 0, 0, 32, 0, 76, 13)
test(1, 0, 0, 650, 0, 76, 13)
test(7549417, 15685620, 26018311, 5128906, 301048, 6841264, 2762301)
plt.show() |
Had an idea for a small improvement on my bike ride home yesterday. This version transforms the desired and actual areas using the relationship I also improved the layout of the debug info and made some additional cosmetic changes. import numpy as np
from scipy.optimize import minimize
from shapely.geometry import Point
DEBUG = True
def get_radii(a, b, c, ab, bc, ac, abc):
areas = np.array([
a + ab + ac + abc,
b + ab + bc + abc,
c + ac + bc + abc,
])
return np.sqrt(areas / np.pi)
def get_subset_areas(origins, radii):
geometries = get_subset_geometries(origins, radii)
return np.array([geometry.area for geometry in geometries])
def get_subset_geometries(origins, radii):
a, b, c = [Point(*origin).buffer(radius) for origin, radius in zip(origins, radii)]
return [
a.difference(b.union(c)), # A
b.difference(a.union(c)), # B
c.difference(b.union(c)), # C
a.intersection(b).difference(c), # AB
b.intersection(c).difference(a), # BC
a.intersection(c).difference(b), # AC
a.intersection(b).intersection(c), # ABC
]
def get_point_on_a_circle(origin, radius, angle):
"""Compute the (x, y) coordinate of a point at a specified angle
on a circle given by its (x, y) origin and radius."""
x0, y0 = origin
x = x0 + radius * np.cos(angle)
y = y0 + radius * np.sin(angle)
return np.array([x, y])
def solve_venn3_circles(desired_areas):
# [A, B, C, AB, BC, AC, ABC]
desired_areas = np.array(desired_areas)
radii = get_radii(*desired_areas)
def cost_function(origins):
current_areas = get_subset_areas(origins.reshape(-1, 2), radii)
# # Option 1: absolute difference
# # Probably not the best choice, as small areas are often practically ignored.
# cost = np.abs(current_areas - desired_areas)
# # Option 2: relative difference
# # This often results in the optimization routine failing.
# minimum_area = 1e-2
# desired_areas[desired_areas < minimum_area] = minimum_area
# cost = np.abs(current_areas - desired_areas) / desired_areas
# Option 3: absolute difference of log(area + 1)
# This transformation is monotonic increasing but strongly compresses large numbers,
# thus allowing small numbers to carry more weight in the optimization.
cost = np.abs(np.log(current_areas + 1) - np.log(desired_areas + 1))
return np.sum(cost)
# initialize origins such that all rings overlap
initial_origins = np.array(
[get_point_on_a_circle(np.zeros((2)), np.min(radii), angle) for angle in np.pi * np.array([0, 2/3, 4/3])]
)
result = minimize(
cost_function,
initial_origins.flatten(),
method='SLSQP',
options=dict(disp=DEBUG),
)
if not result.success:
print("Warning: could not compute circle positions for given subsets.")
print(f"scipy.optimize.minimize: {result.message}.")
origins = result.x.reshape((-1, 2))
if DEBUG:
import pandas as pd
data = pd.DataFrame(dict(desired=desired_areas, actual=get_subset_areas(origins, radii)))
data["absolute difference"] = np.abs(data["desired"] - data["actual"])
data["relative difference"] = data["absolute difference"] / data["desired"]
with pd.option_context('display.float_format', '{:0.1f}'.format):
print(data)
return origins, radii
if __name__ == "__main__":
import matplotlib.pyplot as plt
def test(A, B, C, AB, BC, AC, ABC, ax=None):
origins, radii = solve_venn3_circles([A, B, C, AB, BC, AC, ABC])
if ax is None:
fig, ax = plt.subplots()
for origin, radius in zip(origins, radii):
ax.add_patch(plt.Circle(origin, radius, alpha=0.33))
ax.set_aspect("equal")
ax.autoscale_view()
ax.axis("off")
ax.set_title(f"A={A}, B={B}, C={C},\nAB={AB}, BC={BC}, AC={AC}, ABC={ABC}")
fig, axes = plt.subplots(2, 3, figsize=(15,10))
axes = axes.ravel()
test(1, 1, 1, 0.5, 0.5, 0.5, 0.25, axes[0]) # canonical example
test(3, 2, 1, 0.5, 0.4, 0.3, 0.2, axes[1]) # different areas
test(1, 1, 1, 0, 0, 0, 0, axes[2]) # no intersections; not bad, but could be handled better
test(1, 0, 0, 1, 0, 0, 1, axes[3]) # a is superset to b, b is superset to c
test(2, 2, 2, 0.5, 0, 0.5, 0, axes[4]) # no BC, and no ABC
test(2, 2, 2, 0.1, 0.1, 0.1, 0, axes[5]) # no ABC
# issue #34
test(167, 7, 25, 41, 174, 171, 51)
# issue #30
test(1, 0, 0, 32, 0, 76, 13)
test(1, 0, 0, 650, 0, 76, 13)
test(7549417, 15685620, 26018311, 5128906, 301048, 6841264, 2762301)
plt.show() |
Wow, this is very cool. Let me try to find time to take it up on the weekend. Besides just the layout implementation this will require:
|
Do have a close look at the code though, too. Four eyes see more than two. For example, I just found a major bug in Without the bug, the performance of my optimization is much better. ;-) Issue #30, last test case, gets solved perfectly. import numpy as np
from scipy.optimize import minimize
from shapely.geometry import Point
DEBUG = True
def get_radii(a, b, c, ab, bc, ac, abc):
areas = np.array([
a + ab + ac + abc,
b + ab + bc + abc,
c + ac + bc + abc,
])
return np.sqrt(areas / np.pi)
def get_subset_areas(origins, radii):
geometries = get_subset_geometries(origins, radii)
return np.array([geometry.area for geometry in geometries])
def get_subset_geometries(origins, radii):
a, b, c = [get_shapely_circle(origin, radius) for origin, radius in zip(origins, radii)]
return [
a.difference(b).difference(c), # A
b.difference(a).difference(c), # B
c.difference(a).difference(b), # C
a.intersection(b).difference(c), # AB
b.intersection(c).difference(a), # BC
a.intersection(c).difference(b), # AC
a.intersection(b).intersection(c), # ABC
]
def get_shapely_circle(origin, radius):
return Point(*origin).buffer(radius)
def get_point_on_a_circle(origin, radius, angle):
"""Compute the (x, y) coordinate of a point at a specified angle
on a circle given by its (x, y) origin and radius."""
x0, y0 = origin
x = x0 + radius * np.cos(angle)
y = y0 + radius * np.sin(angle)
return np.array([x, y])
def solve_venn3_circles(desired_areas):
# [A, B, C, AB, BC, AC, ABC]
desired_areas = np.array(desired_areas)
radii = get_radii(*desired_areas)
def cost_function(origins):
current_areas = get_subset_areas(origins.reshape(-1, 2), radii)
# # Option 1: absolute difference
# # Probably not the best choice, as small areas are often practically ignored.
# cost = np.abs(current_areas - desired_areas)
# Option 2: relative difference
# This often results in the optimization routine failing.
# minimum_area = 1e-2
# desired_areas[desired_areas < minimum_area] = minimum_area
# cost = np.abs(current_areas - desired_areas) / desired_areas
# Option 3: absolute difference of log(area + 1)
# This transformation is monotonic increasing but strongly compresses large numbers,
# thus allowing small numbers to carry more weight in the optimization.
cost = np.abs(np.log(current_areas + 1) - np.log(desired_areas + 1))
return np.sum(cost)
# initialize origins such that all rings overlap
initial_origins = np.array(
[get_point_on_a_circle(np.zeros((2)), np.min(radii), angle) for angle in np.pi * np.array([0, 2/3, 4/3])]
)
result = minimize(
cost_function,
initial_origins.flatten(),
method='SLSQP',
options=dict(disp=DEBUG, ftol=1e-6),
)
if not result.success:
print("Warning: could not compute circle positions for given subsets.")
print(f"scipy.optimize.minimize: {result.message}.")
origins = result.x.reshape((-1, 2))
if DEBUG:
import pandas as pd
data = pd.DataFrame(dict(desired=desired_areas, actual=get_subset_areas(origins, radii)))
data["absolute difference"] = np.abs(data["desired"] - data["actual"])
data["relative difference"] = data["absolute difference"] / data["desired"]
with pd.option_context('display.float_format', '{:0.1f}'.format):
print(data)
return origins, radii
if __name__ == "__main__":
import matplotlib.pyplot as plt
def test(A, B, C, AB, BC, AC, ABC, ax=None):
origins, radii = solve_venn3_circles([A, B, C, AB, BC, AC, ABC])
if ax is None:
fig, ax = plt.subplots()
for origin, radius in zip(origins, radii):
ax.add_patch(plt.Circle(origin, radius, alpha=0.33))
ax.set_aspect("equal")
ax.autoscale_view()
ax.axis("off")
ax.set_title(f"A={A}, B={B}, C={C},\nAB={AB}, BC={BC}, AC={AC}, ABC={ABC}")
fig, axes = plt.subplots(2, 3, figsize=(15,10))
axes = axes.ravel()
test(1, 1, 1, 0.5, 0.5, 0.5, 0.25, axes[0]) # canonical example
test(3, 2, 1, 0.5, 0.4, 0.3, 0.2, axes[1]) # different areas
test(1, 1, 1, 0, 0, 0, 0, axes[2]) # no intersections; not bad, but could be handled better
test(1, 0, 0, 1, 0, 0, 1, axes[3]) # a is superset to b, b is superset to c
test(2, 2, 2, 0.5, 0, 0.5, 0, axes[4]) # no BC, and no ABC
test(2, 2, 2, 0.1, 0.1, 0.1, 0, axes[5]) # no ABC
# issue #30
test(1, 0, 0, 32, 0, 76, 13)
test(1, 0, 0, 650, 0, 76, 13)
test(7549417, 15685620, 26018311, 5128906, 301048, 6841264, 2762301) # works well with latest version
# issue #34
test(167, 7, 25, 41, 174, 171, 51)
plt.show() |
Obviously I won't just copy-paste your code somewhere :) |
Regarding dependencies:
There is no way around |
Also, eventually, it might be worth posting a question on SO, possibly with a bounty (happy to chip in), to see if anyone can find a better a cost function or minimization procedure. It's the right sort of problem (self-contained yet of practical use) that might attract the right people. |
I added some constraints to the optimisation, which did make the results more visually pleasing. import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform
from scipy.optimize import minimize, NonlinearConstraint
from shapely.geometry import Point
DEBUG = True
def get_radii(a, b, c, ab, bc, ac, abc):
areas = np.array([
a + ab + ac + abc,
b + ab + bc + abc,
c + ac + bc + abc,
])
return np.sqrt(areas / np.pi)
def get_subset_areas(origins, radii):
geometries = get_subset_geometries(origins, radii)
return np.array([geometry.area for geometry in geometries])
def get_subset_geometries(origins, radii):
a, b, c = [get_shapely_circle(origin, radius) for origin, radius in zip(origins, radii)]
return [
a.difference(b).difference(c), # A
b.difference(a).difference(c), # B
c.difference(a).difference(b), # C
a.intersection(b).difference(c), # AB
b.intersection(c).difference(a), # BC
a.intersection(c).difference(b), # AC
a.intersection(b).intersection(c), # ABC
]
def get_shapely_circle(origin, radius):
return Point(*origin).buffer(radius)
def initialize_origins(radii):
"""Initialize origins on a small circle around (0, 0)."""
origin = np.zeros((2))
radius = np.min(radii)
angles = np.pi * np.array([0, 2/3, 4/3])
return np.array(
[get_point_on_a_circle(origin, radius, angle) for angle in angles]
)
def get_point_on_a_circle(origin, radius, angle):
"""Compute the (x, y) coordinate of a point at a specified angle
on a circle given by its (x, y) origin and radius."""
x0, y0 = origin
x = x0 + radius * np.cos(angle)
y = y0 + radius * np.sin(angle)
return np.array([x, y])
# # Doesn't work as well as I had hoped.
# def initialize_origins(radii):
# """Initialize circle origins such that all circles (pairwise) overlap.
# The optimization uses gradient descent. If certain subset areas
# are zero, there is no gradient to follow. This procedure
# guarantees the existence of all subsections apart from region ABC.
# However, in all examples tested, ABC did indeed exist as well.
# """
# ra, rb, rc = radii
# # choose the side lengths such that circles overlap, but not fully
# ab = ra + rb - min(ra, rb) / 2
# bc = rb + rc - min(rb, rc) / 2
# ac = ra + rc - min(ra, rc) / 2
# # find corner coordinates
# # https://math.stackexchange.com/a/1989113/439191
# a = np.array([0, 0])
# b = np.array([0, ab])
# cy = (ab**2 + ac**2 - bc**2) / (2 * ab)
# cx = np.sqrt(ac**2 - cy**2)
# c = np.array([cx, cy])
# return np.array([a, b, c])
def solve_venn3_circles(desired_areas):
# [A, B, C, AB, BC, AC, ABC]
desired_areas = np.array(desired_areas)
radii = get_radii(*desired_areas)
def cost_function(origins):
current_areas = get_subset_areas(origins.reshape(-1, 2), radii)
# # Option 1: absolute difference
# # Probably not the best choice, as small areas are often practically ignored.
# cost = np.abs(current_areas - desired_areas)
# # Option 2: relative difference
# # This often results in the optimization routine failing.
# minimum_area = 1e-2
# desired_areas[desired_areas < minimum_area] = minimum_area
# cost = np.abs(current_areas - desired_areas) / desired_areas
# Option 3: absolute difference of log(area + 1)
# This transformation is monotonic increasing but strongly compresses large numbers,
# thus allowing small numbers to carry more weight in the optimization.
cost = np.abs(np.log(current_areas + 1) - np.log(desired_areas + 1))
return np.sum(cost)
# constraints:
eps = np.min(radii) * 0.01
lower_bounds = np.abs(radii[np.newaxis, :] - radii[:, np.newaxis]) - eps
lower_bounds[lower_bounds < 0] = 0
lower_bounds = squareform(lower_bounds)
upper_bounds = radii[np.newaxis, :] + radii[:, np.newaxis] + eps
upper_bounds -= np.diag(np.diag(upper_bounds)) # squareform requires zeros on diagonal
upper_bounds = squareform(upper_bounds)
def constraint_function(origins):
origins = np.reshape(origins, (-1, 2))
return pdist(origins)
distance_between_origins = NonlinearConstraint(
constraint_function, lb=lower_bounds, ub=upper_bounds)
result = minimize(
cost_function,
initialize_origins(radii).flatten(),
method='SLSQP',# 'COBYLA',
constraints=[distance_between_origins],
options=dict(disp=DEBUG)
)
if not result.success:
print("Warning: could not compute circle positions for given subsets.")
print(f"scipy.optimize.minimize: {result.message}.")
origins = result.x.reshape((-1, 2))
if DEBUG:
import pandas as pd
data = pd.DataFrame(dict(desired=desired_areas, actual=get_subset_areas(origins, radii)))
data["absolute difference"] = np.abs(data["desired"] - data["actual"])
data["relative difference"] = data["absolute difference"] / data["desired"]
with pd.option_context('display.float_format', '{:0.1f}'.format):
print(data)
return origins, radii
if __name__ == "__main__":
import matplotlib.pyplot as plt
def test(A, B, C, AB, BC, AC, ABC, ax=None):
origins, radii = solve_venn3_circles([A, B, C, AB, BC, AC, ABC])
if ax is None:
fig, ax = plt.subplots()
for origin, radius in zip(origins, radii):
ax.add_patch(plt.Circle(origin, radius, alpha=0.33))
ax.set_aspect("equal")
ax.autoscale_view()
ax.axis("off")
ax.set_title(f"A={A}, B={B}, C={C},\nAB={AB}, BC={BC}, AC={AC}, ABC={ABC}")
fig, axes = plt.subplots(2, 3, figsize=(15,10))
axes = axes.ravel()
test(1, 1, 1, 0.5, 0.5, 0.5, 0.25, axes[0]) # canonical example
test(3, 2, 1, 0.5, 0.4, 0.3, 0.2, axes[1]) # different areas
test(1, 1, 1, 0, 0, 0, 0, axes[2]) # no intersections; not bad, but could be handled better
test(1, 0, 0, 1, 0, 0, 1, axes[3]) # a is superset to b, b is superset to c
test(2, 2, 2, 0.5, 0, 0.5, 0, axes[4]) # no BC, and no ABC
test(2, 2, 2, 0.1, 0.1, 0.1, 0, axes[5]) # no ABC
# # issue #30
# test(1, 0, 0, 32, 0, 76, 13)
# test(1, 0, 0, 650, 0, 76, 13)
# test(7549417, 15685620, 26018311, 5128906, 301048, 6841264, 2762301) # works well with latest version
# # issue #34
# test(167, 7, 25, 41, 174, 171, 51)
plt.show() |
An argument in favour of making Taking the example from issue #50 : import numpy as np
from scipy.spatial.distance import cdist, pdist, squareform
from scipy.optimize import minimize, NonlinearConstraint
from shapely.geometry import Point
from shapely.ops import polylabel
DEBUG = True
def get_radii(a, b, c, ab, bc, ac, abc):
areas = np.array([
a + ab + ac + abc,
b + ab + bc + abc,
c + ac + bc + abc,
])
return np.sqrt(areas / np.pi)
def get_subset_areas(origins, radii):
geometries = get_subset_geometries(origins, radii)
return np.array([geometry.area for geometry in geometries])
def get_subset_geometries(origins, radii):
a, b, c = [get_shapely_circle(origin, radius) for origin, radius in zip(origins, radii)]
return [
a.difference(b).difference(c), # A
b.difference(a).difference(c), # B
c.difference(a).difference(b), # C
a.intersection(b).difference(c), # AB
b.intersection(c).difference(a), # BC
a.intersection(c).difference(b), # AC
a.intersection(b).intersection(c), # ABC
]
def get_shapely_circle(origin, radius):
return Point(*origin).buffer(radius)
def initialize_origins(radii):
"""Initialize origins on a small circle around (0, 0)."""
origin = np.zeros((2))
radius = np.min(radii)
angles = np.pi * np.array([0, 2/3, 4/3])
return np.array(
[get_point_on_a_circle(origin, radius, angle) for angle in angles]
)
def get_point_on_a_circle(origin, radius, angle):
"""Compute the (x, y) coordinate of a point at a specified angle
on a circle given by its (x, y) origin and radius."""
x0, y0 = origin
x = x0 + radius * np.cos(angle)
y = y0 + radius * np.sin(angle)
return np.array([x, y])
# # Doesn't work as well as I had hoped.
# def initialize_origins(radii):
# """Initialize circle origins such that all circles (pairwise) overlap.
# The optimization uses gradient descent. If certain subset areas
# are zero, there is no gradient to follow. This procedure
# guarantees the existence of all subsections apart from region ABC.
# However, in all examples tested, ABC did indeed exist as well.
# """
# ra, rb, rc = radii
# # choose the side lengths such that circles overlap, but not fully
# ab = ra + rb - min(ra, rb) / 2
# bc = rb + rc - min(rb, rc) / 2
# ac = ra + rc - min(ra, rc) / 2
# # find corner coordinates
# # https://math.stackexchange.com/a/1989113/439191
# a = np.array([0, 0])
# b = np.array([0, ab])
# cy = (ab**2 + ac**2 - bc**2) / (2 * ab)
# cx = np.sqrt(ac**2 - cy**2)
# c = np.array([cx, cy])
# return np.array([a, b, c])
def solve_venn3_circles(desired_areas):
# [A, B, C, AB, BC, AC, ABC]
desired_areas = np.array(desired_areas)
radii = get_radii(*desired_areas)
def cost_function(origins):
current_areas = get_subset_areas(origins.reshape(-1, 2), radii)
# # Option 1: absolute difference
# # Probably not the best choice, as small areas are often practically ignored.
# cost = np.abs(current_areas - desired_areas)
# # Option 2: relative difference
# # This often results in the optimization routine failing.
# minimum_area = 1e-2
# desired_areas[desired_areas < minimum_area] = minimum_area
# cost = np.abs(current_areas - desired_areas) / desired_areas
# Option 3: absolute difference of log(area + 1)
# This transformation is monotonic increasing but strongly compresses large numbers,
# thus allowing small numbers to carry more weight in the optimization.
cost = np.abs(np.log(current_areas + 1) - np.log(desired_areas + 1))
return np.sum(cost)
# constraints:
eps = np.min(radii) * 0.01
lower_bounds = np.abs(radii[np.newaxis, :] - radii[:, np.newaxis]) - eps
lower_bounds[lower_bounds < 0] = 0
lower_bounds = squareform(lower_bounds)
upper_bounds = radii[np.newaxis, :] + radii[:, np.newaxis] + eps
upper_bounds -= np.diag(np.diag(upper_bounds)) # squareform requires zeros on diagonal
upper_bounds = squareform(upper_bounds)
def constraint_function(origins):
origins = np.reshape(origins, (-1, 2))
return pdist(origins)
distance_between_origins = NonlinearConstraint(
constraint_function, lb=lower_bounds, ub=upper_bounds)
result = minimize(
cost_function,
initialize_origins(radii).flatten(),
method='SLSQP',# 'COBYLA',
constraints=[distance_between_origins],
options=dict(disp=DEBUG)
)
if not result.success:
print("Warning: could not compute circle positions for given subsets.")
print(f"scipy.optimize.minimize: {result.message}.")
origins = result.x.reshape((-1, 2))
if DEBUG:
import pandas as pd
data = pd.DataFrame(dict(desired=desired_areas, actual=get_subset_areas(origins, radii)))
data["absolute difference"] = np.abs(data["desired"] - data["actual"])
data["relative difference"] = data["absolute difference"] / data["desired"]
with pd.option_context('display.float_format', '{:0.1f}'.format):
print(data)
return origins, radii
def get_label_positions(origins, radii, labels):
"For each non-zero area, find the point of inaccesibility."
geometries = get_subset_geometries(origins, radii)
output = list()
for ii, (label, geometry) in enumerate(zip(labels, geometries)):
if geometry.area > 0:
poi = polylabel(geometry)
output.append((poi.x, poi.y, label))
return output
if __name__ == "__main__":
import matplotlib.pyplot as plt
def test(A, B, C, AB, BC, AC, ABC, ax=None):
origins, radii = solve_venn3_circles([A, B, C, AB, BC, AC, ABC])
if ax is None:
fig, ax = plt.subplots()
for origin, radius in zip(origins, radii):
ax.add_patch(plt.Circle(origin, radius, alpha=0.33))
ax.set_aspect("equal")
ax.autoscale_view()
ax.axis("off")
# ax.set_title(f"A={A}, B={B}, C={C},\nAB={AB}, BC={BC}, AC={AC}, ABC={ABC}")
label_positions = get_label_positions(origins, radii, [A, B, C, AB, BC, AC, ABC])
for (x, y, label) in label_positions:
ax.text(x, y, label, va="center", ha="center")
# fig, axes = plt.subplots(2, 3, figsize=(15,10))
# axes = axes.ravel()
# test(1, 1, 1, 0.5, 0.5, 0.5, 0.25, axes[0]) # canonical example
# test(3, 2, 1, 0.5, 0.4, 0.3, 0.2, axes[1]) # different areas
# test(1, 1, 1, 0, 0, 0, 0, axes[2]) # no intersections; not bad, but could be handled better
# test(1, 0, 0, 1, 0, 0, 1, axes[3]) # a is superset to b, b is superset to c
# test(2, 2, 2, 0.5, 0, 0.5, 0, axes[4]) # no BC, and no ABC
# test(2, 2, 2, 0.1, 0.1, 0.1, 0, axes[5]) # no ABC
# # issue #30
# test(1, 0, 0, 32, 0, 76, 13)
# test(1, 0, 0, 650, 0, 76, 13)
# test(7549417, 15685620, 26018311, 5128906, 301048, 6841264, 2762301) # works well with latest version
# # issue #34
# test(167, 7, 25, 41, 174, 171, 51)
# issue # 50
test(1808, 1181, 1858, 4715, 3031, 26482, 65012)
plt.show() |
So, I haven't found enough time this weekend to look at the particular algorithm proposals above. After the latest commit it is possible to do the following:
Could you try packaging the best of your experiments above into such a class to make it easy to try out? |
Sorry, I had a cold and am behind on work so I won't have time for additional contributions right now. Also, I think it is pretty important that you run some tests first, to see if this version performs better than your original implementation. Until that is for certain, it doesn't really make sense to spend more time on this code. Furthermore, if it does indeed perform better, you may want to think about some further refactoring. The code should pretty easily generalize to arbitrary Venn diagrams (2, 3, 4, etc), so you may want to make more significant changes to the API. |
Well, we're not in a hurry here, and as you might notice I myself also find less time for this project than it might deserve :) I will certainly keep looking at it as I find the time gradually. Just made sure there are interfaces where you could also contribute constructively if you happen to get the time.
Yes, I saw that and I do think that it would be nice to have a |
... and in b9a34b0 I finally got to implement the cost-optimization-based layout algorithm discussed here 🎉 |
... along with a general system for pluggable layouts. This would address most of issues being reported here regularly (#30, #34).
The main problem with current layout is that it sometimes results in circles positioned so that some of the 7 regions, corresponding to subsets simply is not present. The better layout should make sure each of the regions exists (even if it would misrepresent some of the other general patterns).
There are several approaches for implementing such logic:
scipy.optimize
.Internally the layout would probably be represented as a function, which, given subset sizes, returns the radii and centers of the circles. Externally the choice of the layout would be specified by the keyword parameter
layout
passed to thevenn3/venn2
function. The value would be either None (default), string ('naive', 'optim', ...) or the function itself.The text was updated successfully, but these errors were encountered: