KMeans clustered logits in viz_utils.py? #8

Open
pieris98 opened this issue Mar 24, 2024 · 3 comments

@pieris98

Hey Nazir,
Yet another question.
In Fig. 4 of your paper you cluster the logits using k-means clustering (with k=4, if I follow correctly).
In the code, I couldn't find any use of k-means clustering other than the several clustering methods in tools/vis_utils.py. However, that file isn't used anywhere, and it expects some kind of dataframe containing features.

My questions are:

  1. Have you implemented Figure 4 in code, i.e. performed k-means clustering on the mask classification logits? If so, where can I find the relevant code in the repository? I believe this would be useful in practice for getting rid of boundary object pixels (which become false positive anomalies after RbA) during inference.
  2. If not, could you describe in more detail what you did to produce Figure 4? I'm eager to understand and implement this myself.
@NazirNayal8
Owner

Hi @pieris98, you can find below the code snippet used to generate the figure. The code is long mostly for presentation purposes, but the idea behind the figure is simple: apply k-means clustering to the logits, then overlay the cluster colors to show that certain patterns naturally emerge from the logits.

import numpy as np
import torch
import matplotlib.pyplot as plt
from matplotlib.ticker import ScalarFormatter
from easydict import EasyDict as edict
from sklearn.cluster import KMeans

# Context assumed from the surrounding notebook: `logits` is the (K, H, W)
# per-pixel class-logit tensor of one image, `img` is the (3, H, W) image
# tensor, `bdd100k_dataset.class_names` holds the 19 inlier class names, and
# `pixels` maps the panel keys ("outlier", "inlier", "boundary", "fp",
# "clusters") to the class-logit vector of a representative pixel (or, for
# "clusters", to the color-coded cluster map built below).

def apply_kmeans(x, n_clusters):
    # Assumed helper: a thin wrapper around sklearn's KMeans.
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(x)

# (K, H, W) -> (H*W, K), sorting each pixel's logits so that clustering sees
# the shape of the logit distribution rather than the class identities.
logit_array = logits.flatten(start_dim=1).permute(1, 0).sort(dim=1).values

clusters = apply_kmeans(logit_array.cpu().numpy(), n_clusters=4)
clusters = clusters.reshape(logits.shape[1], logits.shape[2])

def get_colormap(preds, colors):
    """
    Assuming preds.shape = (H,W)
    """
    H, W = preds.shape
    color_map = torch.zeros((H, W, 3)).long()
    
    for i in colors.keys():
        mask = (preds == i)
        if mask.sum() == 0:
            continue
        color_map[mask, :] = torch.tensor(colors[i])
    
    return color_map

color_map = {
    0: np.array([11, 180, 255]), # blue
    1: np.array([80, 233, 145]), # green
    2: np.array([230, 216, 0]), # yellow
    3: np.array([230, 0, 73]), # red
}
clusters_color = get_colormap(clusters, color_map)


# Map each panel key to the cluster index whose color it uses.
k_to_c = edict(
    outlier=1,
    boundary=2,
    fp=3,
    inlier=0,
)

# Figure size derived from the cluster map's resolution and aspect ratio.
num_rows = 0.6
num_cols = 1.2
width = clusters.shape[1] * num_cols / 100
height = width * clusters.shape[0] * num_rows / (clusters.shape[1] * num_cols)

fig = plt.figure(constrained_layout=True, figsize=(width, height), edgecolor="red")
ax = fig.subplot_mosaic(
    [["outlier", "inlier", "clusters"],
     ["boundary", "fp", "clusters"]
    ],
    gridspec_kw={
        "bottom": 0.0,
        "top": 1.0,
        "left": 0.0,
        "right": 1.0,
        "wspace":0.05, 
        "hspace":0,
        "height_ratios":[1,1],
        "width_ratios":[1.0,1.0,1.5],
        
    },
)

for k, v in pixels.items():

    # The "clusters" panel overlays the color-coded k-means clusters on the image.
    if k == "clusters":
        ax[k].imshow(img.permute(1, 2, 0).numpy())
        ax[k].imshow(v, alpha=0.7)
        ax[k].axis("off")
        ax[k].set_xticks([])
        ax[k].set_yticks([])
        ax[k].set_title("(e) k-means Clustered Logits", y=-0.15, fontsize=24)
        continue
    
    # Bar chart of this pixel type's 19 class logits, colored by its cluster.
    ax[k].bar(bdd100k_dataset.class_names[:19], v, color=color_map[k_to_c[k]] / 255.0, edgecolor='black')
    
    ax[k].set_ylim([0.0, 1.1])
    if k in ["outlier", "boundary"]:
        ax[k].set_yticks([0, 0.25, 0.5, 0.75, 1.0])
        ax[k].set_yticklabels([0, 0.25, 0.5, 0.75, 1.0], fontsize=18)
    else:
        ax[k].set_yticks([])
        
    ax[k].set_xticks(np.arange(len(v)))
    
    if k in ["boundary", "fp"]:
        ax[k].set_xticklabels(
            bdd100k_dataset.class_names[:19], 
            rotation = 45,
            horizontalalignment='right',
            fontsize=18
        )
    else:
        ax[k].set_xticklabels([])
        
    ax[k].set_xlim([-1, 18])
    
    ax[k].xaxis.set_ticks_position('none') 
    
    t = k
    if t == "fp":
        t = "(d) Ambiguous Pixel"
    if t == "outlier":
        t = "(a) Outlier"
    if t == "boundary":
        t = "(c) Boundary"
    if t == "inlier":
        t = "(b) Inlier"
    
    ax[k].set_title(t, fontsize=24)
    
    # The outlier panel gets an inset zoom, since its raw logits are near zero.
    if "out" in k:
        axins = ax[k].inset_axes([0.3, 0.05, 0.47, 0.5])
        axins.bar(bdd100k_dataset.class_names[:19], v, edgecolor="black", color=color_map[k_to_c[k]]/255.0)
        x1, x2, y1, y2 = -1, 18, 0, 0.0001
        axins.set_xlim(x1, x2)
        f = ScalarFormatter()
        f.set_scientific(True)
        axins.xaxis.set_major_formatter(f)
        axins.xaxis.set_ticks_position('none') 
        axins.set_xticklabels([])
        axins.set_yticks([0.0001, 0.0002, 0.0003, 0.0004])
        axins.set_yticklabels(["1e-4", "2e-4", "3e-4", "4e-4"], fontsize=14)
        axins.set_title("Zoom-in", fontsize=18)
        ax[k].indicate_inset_zoom(axins)
    

fig.supylabel("Logit Score", x=-0.02, y=0.57, fontsize=24)
fig.set_edgecolor("red")

plt.savefig("plots/logits_v2.pdf", format="pdf", bbox_inches="tight", pad_inches=0)
plt.show()

@pieris98
Author

Hey Nazir,
Thanks for providing the script!
I made a simpler version with some default matplotlib colormaps, along the lines of the sketch below.
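
A minimal sketch of that simpler version (assuming the same `logits` (K, H, W) and `img` (3, H, W) tensors as in your snippet, and using sklearn's KMeans directly):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.cluster import KMeans

K, H, W = logits.shape
# Sorted per-pixel logits as clustering features, as in your snippet.
feats = np.sort(logits.flatten(start_dim=1).permute(1, 0).cpu().numpy(), axis=1)
clusters = KMeans(n_clusters=4, n_init=10).fit_predict(feats).reshape(H, W)

plt.imshow(img.permute(1, 2, 0).numpy())
plt.imshow(clusters, cmap="tab10", alpha=0.7)  # default qualitative colormap
plt.axis("off")
plt.show()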

However, doing some tests, I noticed that k-means doesn't work well in a large number of scenarios for the actual logic of clustering the logits into inliers (one high vote) vs. outliers (all low votes) vs. ambiguous pixels (moderate votes) vs. boundary objects (two votes close to each other).

In particular, with k=4 k-means assumes, first, that there actually are outliers/anomalies in the evaluated image and, second, that all four of these "logit types" exist in the image. Otherwise, if fewer than four of these logit types are present, k-means will just arbitrarily split logits that are close together under the clustering distance metric (Euclidean distance by default), as in the synthetic sketch below. I saw this repeatedly on several test images by visualizing the clusters.
Is my thinking correct, or am I missing something here?
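
To illustrate, here is a small self-contained sketch (synthetic logits, not the model's) where only the "inlier" pattern exists, yet k=4 still produces four populated clusters:

import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
n_pixels, n_classes = 1000, 19

# Synthetic "inlier-only" logits: one confident class per pixel and
# near-zero votes elsewhere, i.e. a single natural logit pattern.
logits = rng.normal(0.0, 0.05, size=(n_pixels, n_classes))
logits[np.arange(n_pixels), rng.integers(0, n_classes, n_pixels)] += 10.0
feats = np.sort(logits, axis=1)  # sorted per pixel, as in the snippet above

labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(feats)
print(np.bincount(labels))  # all four clusters still get assigned pixels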

I just wanted to ask:

  1. Did you use this k-means method only on inlier data (e.g. the Cityscapes test set) plus OOD supervision data (e.g. Mapillary), or also on other inlier data, i.e. images with no anomalies?
  2. I also tried this on anomaly evaluation data, e.g. the SegmentMeIfYouCan RoadObstacle21 dataset, which has a lot of domain shift (snow, lighting). There, both RbA and this k-means method didn't show good results (e.g. many false positive regions), even though the metrics reported almost perfect results. Hence, I was wondering how the evaluation accounts for false positives for RbA, and whether there are any "tricks", e.g. thresholding the RbA values for the final anomaly/not-anomaly decision, or anything else.

Thanks again for your work, your support, and your fast replies to my issues. I'm looking forward to your response!

@NazirNayal8
Owner

Hi @pieris98, I need to clarify that the k-means figure was used merely for analysis purposes, to demonstrate the different types of logits that emerge. K-means is not used at any step of training or evaluation; we wanted to demonstrate why boundary and ambiguous pixels receive lower RbA scores than other pixels (known and unknown). Hence, the results you have seen are reasonable: if the image has very few boundaries and no ambiguous regions, clustering with k=4 will produce additional arbitrary clusters. For the figure we chose an image in which all 4 components are clearly present.

To answer both of your questions (1 and 2):

  • k-means has not been used on either training or test data, and it has nothing to do with our method or any post-processing step. Evaluation on all of the OoD benchmarks uses purely RbA, without any thresholding or any trick to remove or minimize false positives. Ambiguous and boundary regions automatically receive lower RbA scores due to the behavior of the logits, as explained in the paper.
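
For context, the scoring itself reduces to something like the following minimal sketch (this assumes the per-pixel class logits are already fused into a (K, H, W) tensor and uses tanh as the activation; see the paper for the exact formulation):

import torch

def rba_score(logits: torch.Tensor) -> torch.Tensor:
    # logits: (K, H, W) per-pixel scores for the K known classes.
    # A pixel "rejected by all" known classes (all votes low) gets a
    # high anomaly score; no threshold is applied at this stage.
    return -torch.tanh(logits).sum(dim=0)

anomaly_map = rba_score(logits)  # (H, W); the benchmarks consume the raw scores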

I hope this was helpful. If you have any further questions or need any more clarification, please let us know.

Kindest regards,
Nazir
