linear model with Float32 #260

Ankur-deDev · 2018-10-18T00:48:28Z

Hi GLM,

I am trying to use the lm function with the following command:
mydata = DataFrame(X=MyDataType.(1:3), Y=MyDataType.(11:13)); lm(@formula(Y ~ 1 + X), mydata)

This works fine with MyDataType = Float64, Int32, or Int64, but with Float32 I get the following error:

ERROR: MethodError: no method matching delbeta!(::DensePredChol{Float64,LinearAlgebra.Cholesky{Float64,Array{Float64,2}}}, ::Array{Float32,1})
Closest candidates are:
delbeta!(::DensePredQR{T<:Union{Float32, Float64}}, ::Array{T<:Union{Float32, Float64},1}) where T<:Union{Float32, Float64} at /home/tooler/.julia/packages/GLM/FxTmX/src/linpred.jl:76
delbeta!(::DensePredChol{T<:Union{Float32, Float64},#s14} where #s14<:LinearAlgebra.Cholesky, ::Array{T<:Union{Float32, Float64},1}) where T<:Union{Float32, Float64} at /home/tooler/.julia/packages/GLM/FxTmX/src/linpred.jl:130
delbeta!(::DensePredChol{T<:Union{Float32, Float64},#s14} where #s14<:LinearAlgebra.CholeskyPivoted, ::Array{T<:Union{Float32, Float64},1}) where T<:Union{Float32, Float64} at /home/tooler/.julia/packages/GLM/FxTmX/src/linpred.jl:135

My environment is the following:

julia version 1.0.1
[38e38edf] GLM v1.0.1

nalimilan · 2018-10-18T15:08:59Z

The code probably needs some tweaks to be more generic. PR welcome.

andreasnoack · 2018-10-18T20:56:00Z

@kleinschmidt Is it on purpose that the model matrix promotes to Float64 even though the data columns are Float32?

It's worth noting that lm(Matrix{Float32}, Vector{Float32}) actually works, so it's not Float32 that is the problem per se. The error comes from the mix of a Float64 design matrix and a Float32 response.

kleinschmidt · 2018-10-18T21:08:01Z

I don't know actually... I suspect it was just a sensible default. I think it could be overridden by specifying ModelMatrix{Matrix{Float32}} in fit. Or by doing some promoting.

I'm not super inclined to do the deep dive in the current StatsModels code to fix this though given that (inshallah) that's going to be replaced by Terms 2.0 (JuliaStats/StatsModels.jl#71).

kleinschmidt · 2018-10-18T21:10:39Z

It's also hard to handle this in a sufficiently general way, since you might have a categorical variable that gets expanded according to a contrasts matrix; that defaults to Float64, so you'd somehow have to figure out how to un-promote that to Float32 in the case when the only continuous variables are Float32s.

kleinschmidt · 2018-10-19T15:47:55Z

Also to clarify what @andreasnoack said: this is an issue with StatsModels (which handles the formula), not with GLM.

greimel · 2019-05-02T11:57:46Z

I just encountered the same issue. This problem occurs when reading data from Stata's .dta files using StatFiles.jl. Stata seems to work with lower precision.

xgdgsc · 2023-05-25T09:51:58Z

Glad the other lm function with no formula works with Float32. I would not put the function with formula in the docs if it doesn't work with such a basic data type. Is it a good time to fix this?

kleinschmidt · 2023-05-26T21:57:45Z

I guess the question is, what should GLM do when it gets a y and X of heterogeneous eltypes? The current situation with (IIRC) a method error is not great, and all the fixes I can think of for StatsModels will not completely solve this problem (it would still be possible to get heterogeneous eltypes out, and JuliaStats/StatsModels.jl#294, by making promotion less aggressive by default, may in fact make the problem worse here if, for instance, users are doing something like y ~ 1 + x where x is Int64 or something...).

I'd suggest something like a promotion (or at least an eltype check) in GLM.jl/src/lm.jl (line 129 in 468d0f8):

    function fit(::Type{LinearModel}, X::AbstractMatrix{<:Real}, y::AbstractVector{<:Real},

to ensure that the eltype is consistent, if that's really required for GLM to work.

kleinschmidt · 2023-05-26T22:03:15Z

Ahahaha yes, here's exactly what I was talking about... the StatsModels doctests picked up this (lack of Bool-to-Float64 conversion): https://github.com/JuliaStats/StatsModels.jl/actions/runs/5095234337/jobs/9159967200?pr=294#step:5:305

andreasnoack · 2024-11-20T15:37:38Z

Looks like this might be even worse at this point:

julia> df = DataFrame(x = randn(Float32, 10), y = randn(Float32, 10))
10×2 DataFrame
 Row │ x           y
     │ Float32     Float32
─────┼───────────────────────
   1 │ -0.945676   -0.132859
   2 │ -0.0802135   0.180285
   3 │ -1.86329    -0.291655
   4 │ -0.705064   -0.633368
   5 │  1.63913    -1.11767
   6 │ -0.494249    1.19169
   7 │ -1.91468    -0.221668
   8 │ -1.32549    -0.136988
   9 │  0.302089   -0.940081
  10 │  0.351315   -0.532551

julia> lm(@formula(y ~ x), df)
ERROR: MethodError: no method matching delbeta!(::GLM.DensePredChol{Float64, CholeskyPivoted{Float64, Matrix{Float64}, Vector{Int64}}}, ::Vector{Float32})

but the underlying issue might be JuliaStats/StatsModels.jl#293

andreasnoack · 2024-11-21T14:00:02Z

There might be a separate issue related to offsets, see #562

greimel mentioned this issue May 2, 2019

data.dta yields DataFrame with Float32 queryverse/StatFiles.jl#18

Closed

nalimilan mentioned this issue Feb 14, 2021

lm does not work with Float32 #403

Closed

xgdgsc mentioned this issue May 26, 2023

modelcols converts x matrix of Float32 to Float64 JuliaStats/StatsModels.jl#293

Open

andreasnoack added the bug label Nov 20, 2024

andreasnoack added this to the Out of scope for next release milestone Nov 20, 2024

andreasnoack mentioned this issue Nov 21, 2024

glm fails with Float32 offset and data #562

Closed
