Implementation of variance-based stochastic gradient descent as described in Tom Schaul, Sixin Zhang, and Yann LeCun, "No More Pesky Learning Rates", 6 Jun 2012.