Per-column drift compensation? #661
Hi there, Is it possible to implement a per-column GDC factor instead of a tile-wise one? From what I've seen, drift compensation is always applied in InferenceRPU as a ratio of coefficients. Should I develop a new RPU? I guess any drift compensation other than GDC needs a new class. What if we forced the tile to have only one column (say, 512x1)? Overall, do you think it could improve accuracy retention? I think it might, since per-column scales are generally known to be superior to per-tile scales.
Replies: 1 comment 1 reply
Hi @edminge,

There is currently no column-wise drift compensation, because numerical experiments have shown that if one uses column-wise output scales (`rpu_config.mapping.out_scaling_columnwise=True`), the conductance range of the weights is very similar across columns, which means that a single output drift compensation factor is typically enough. Moreover, a scalar factor can be estimated more reliably during the reference computation when compensating for drift, and is thus typically less prone to estimation errors.

That said, one should be able to quickly implement a column-wise drift compensation class in the following way:

```python
from torch import Tensor, clamp
from torch import abs as torch_abs
from torch.autograd import no_grad

from aihwkit.inference.compensation.drift import GlobalDriftCompensation


class PerColumnDriftCompensation(GlobalDriftCompensation):
    """Per column drift compensation.

    Uses col-wise factors for compensating the drift.
    """

    @no_grad()
    def readout(self, out_tensor: Tensor) -> Tensor:
        """Read out the mean abs of each output column."""
        # Average over the first (batch) dimension only, one value per column.
        return clamp(torch_abs(out_tensor).mean(dim=0), min=0.0001)
```

So basically just returning a vector as the alpha factors (I haven't tested the code, maybe one has to adjust the shape of alpha, but it should work in principle).
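To make the difference between a scalar and a column-wise factor concrete, here is a minimal, dependency-free sketch in plain Python. It is not aihwkit code: the function names, the toy readout values, and the per-column decay factors are all made up for illustration, and it assumes the compensation factor is the ratio of the reference readout to the current readout.

```python
from statistics import fmean

FLOOR = 1e-4  # lower bound on the readout, mirroring the clamp in the reply's code


def readout_global(rows):
    """Scalar readout: mean absolute value over all outputs."""
    return max(fmean([abs(v) for row in rows for v in row]), FLOOR)


def readout_per_column(rows):
    """Vector readout: mean absolute value of each output column."""
    n_cols = len(rows[0])
    return [max(fmean([abs(row[j]) for row in rows]), FLOOR) for j in range(n_cols)]


def max_abs_error(rows, reference):
    """Largest element-wise deviation between two equally shaped tables."""
    return max(abs(v - r) for row, ref in zip(rows, reference) for v, r in zip(row, ref))


# Hypothetical reference outputs at t0 (rows: readout inputs, columns: tile outputs).
baseline = [[1.0, -2.0, 4.0],
            [-3.0, 2.0, 4.0]]

# Assumption for illustration: every column has drifted by a different factor.
decay = [0.5, 0.8, 0.25]
drifted = [[v * d for v, d in zip(row, decay)] for row in baseline]

# Compensation factor = reference readout / current readout.
alpha_global = readout_global(baseline) / readout_global(drifted)
alpha_cols = [b / c for b, c in zip(readout_per_column(baseline),
                                    readout_per_column(drifted))]

# Apply the factors to the drifted outputs.
restored_global = [[v * alpha_global for v in row] for row in drifted]
restored_cols = [[v * a for v, a in zip(row, alpha_cols)] for row in drifted]

err_global = max_abs_error(restored_global, baseline)
err_cols = max_abs_error(restored_cols, baseline)
print("global factor error:    ", err_global)
print("per-column factor error:", err_cols)
```

In this noise-free toy case the column-wise factors undo the drift, while the single factor leaves a residual error. It deliberately exaggerates the spread between columns and ignores readout noise, so it does not contradict the point above that a scalar is usually sufficient and easier to estimate when the columns drift similarly.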