Channel: Active questions tagged lm - Stack Overflow

How to minimize size of object of class "lm" without compromising it being passed to predict()

I want to run lm() on a large dataset with 50M+ observations and 2 predictors. The analysis runs on a remote server with only 10GB available for storing the data. I tested lm() on 10K observations sampled from the data, and the resulting object was 2GB+ in size.

I need the object of class "lm" returned from lm() ONLY to produce the summary statistics of the model (summary(lm_object)) and to make predictions (predict(lm_object)).

I have experimented with the model, x, y, and qr arguments of lm(). Setting them all to FALSE reduces the size by 38%:

    library(MASS)

    fit1 <- lm(medv ~ lstat, data = Boston)
    size1 <- object.size(fit1)
    print(size1, units = "Kb")
    # 127.4 Kb

    fit2 <- lm(medv ~ lstat, data = Boston, model = FALSE, x = FALSE, y = FALSE, qr = FALSE)
    size2 <- object.size(fit2)
    print(size2, units = "Kb")
    # 78.5 Kb

    - ((as.integer(size1) - as.integer(size2)) / as.integer(size1)) * 100
    # -38.37994

but then summary() and predict() both fail:

    summary(fit2)
    # Error in qr.lm(object) : lm object does not have a proper 'qr' component.
    #  Rank zero or should not have used lm(.., qr=FALSE).

    predict(fit2, newdata = Boston)
    # Error in qr.lm(object) : lm object does not have a proper 'qr' component.
    #  Rank zero or should not have used lm(.., qr=FALSE).

Apparently I need to keep qr = TRUE, which reduces the object size by only 9% compared with the default object:

    fit3 <- lm(medv ~ lstat, data = Boston, model = FALSE, x = FALSE, y = FALSE, qr = TRUE)
    size3 <- object.size(fit3)
    print(size3, units = "Kb")
    # 115.8 Kb

    - ((as.integer(size1) - as.integer(size3)) / as.integer(size1)) * 100
    # -9.142752
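
To see where the remaining bytes go, it may help to measure each top-level component of the fitted object separately. The following is a minimal diagnostic sketch on the same Boston example. The name component_bytes is introduced here only for illustration, and the breakdown on the real 50M-row data may look different.

```r
library(MASS)  # provides the Boston data set used in the examples above

fit <- lm(medv ~ lstat, data = Boston)

# Size in bytes of each top-level component of the fitted "lm" object
component_bytes <- sapply(fit, function(component) as.numeric(object.size(component)))

# Largest components first, reported in Kb
print(round(sort(component_bytes, decreasing = TRUE) / 1024, 1))
```

Note that object.size() does not count environments, so anything reachable only through the environment attached to the model's terms or formula is not included in these figures.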

How do I bring the size of the "lm" object to a minimum without dumping a lot of unneeded information in memory and storage?
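
For the prediction half of the question, one possible workaround is to bypass predict() and compute point predictions directly from the stored terms and coefficients, which survive qr = FALSE. This is only a sketch under stated assumptions: the names fit_small, new_data, X_new, and manual_pred are made up for illustration, the model has no offset and no rank-deficient (NA) coefficients, and only point predictions are needed (no standard errors or intervals, and it does nothing for summary()).

```r
library(MASS)  # provides the Boston data set used in the examples above

# Fit without storing the model frame, x, y, or the QR decomposition
fit_small <- lm(medv ~ lstat, data = Boston,
                model = FALSE, x = FALSE, y = FALSE, qr = FALSE)

# Hypothetical new observations to predict for
new_data <- data.frame(lstat = c(5, 10, 15))

# Build the design matrix from the stored terms, with the response dropped
X_new <- model.matrix(delete.response(terms(fit_small)), data = new_data)

# Point predictions are the design matrix times the coefficient vector
manual_pred <- drop(X_new %*% coef(fit_small))
print(manual_pred)
```

On this small example the result can be checked against predict() on a default fit, for instance with all.equal(unname(manual_pred), unname(predict(lm(medv ~ lstat, data = Boston), newdata = new_data))).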

