Automate Prototyping Activities - R-based Models #217

Open
antoinecarme opened this issue Sep 5, 2022 · 7 comments

@antoinecarme
Owner

antoinecarme commented Sep 5, 2022

It is useful to have a git branch that contains all the tooling needed for prototyping.

Make it possible to use R/forecast from inside pyaf, through "fake" pyaf models that call R in order to validate a specific implementation.

This branch is not to be merged.

First applications: Threshold AR models #214 and TSMARS models #215

@antoinecarme
Owner Author

All the required r-cran-* packages need to be installed on Debian.

Most needed: r-cran-forecast and r-cran-caret

antoine@z600:~/dev/python/packages/timeseries/pyaf$ apt-cache show r-cran-forecast
Package: r-cran-forecast
Version: 8.17.0-1
Installed-Size: 1914
Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
Architecture: amd64
Depends: r-base-core (>= 4.2.1-1), r-api-4.0, r-cran-colorspace, r-cran-fracdiff, r-cran-generics (>= 0.1.2), r-cran-ggplot2 (>= 2.2.1), r-cran-lmtest, r-cran-magrittr, r-cran-nnet, r-cran-rcpp (>= 0.11.0), r-cran-timedate, r-cran-tseries, r-cran-urca, r-cran-zoo, r-cran-rcpparmadillo (>= 0.2.35), libblas3 | libblas.so.3, libc6 (>= 2.29), libgcc-s1 (>= 3.0), libstdc++6 (>= 11)
Recommends: r-cran-testthat, r-cran-uroot
Suggests: r-cran-knitr, r-cran-rmarkdown
Description-en: GNU R forecasting functions for time series and linear models
 Methods and tools for displaying and analysing
 univariate time series forecasts including exponential smoothing
 via state space models and automatic ARIMA modelling.
Description-md5: fbe002920852e5d23ff950431c9f03c4
Homepage: https://cran.r-project.org/package=forecast
Section: gnu-r
Priority: optional
Filename: pool/main/r/r-cran-forecast/r-cran-forecast_8.17.0-1_amd64.deb
Size: 1540732
MD5sum: ad90255623ef7f6c6719b7befca32f49

antoine@z600:~/dev/python/packages/timeseries/pyaf$ apt-cache show r-cran-caret
Package: r-cran-caret
Version: 6.0-93+dfsg-1
Installed-Size: 3668
Maintainer: Debian R Packages Maintainers <r-pkg-team@alioth-lists.debian.net>
Architecture: amd64
Depends: r-base-core (>= 4.2.1-2), r-api-4.0, r-cran-ggplot2, r-cran-lattice (>= 0.20), r-cran-e1071, r-cran-foreach, r-cran-modelmetrics (>= 1.2.2.2), r-cran-nlme, r-cran-plyr, r-cran-proc, r-cran-recipes (>= 0.1.10), r-cran-reshape2, r-cran-withr (>= 2.0.0), libc6 (>= 2.4)
Recommends: r-cran-testthat (>= 0.9.1), r-cran-earth (>= 2.2-3), r-cran-mda, r-cran-mlmetrics, r-cran-fastica, r-cran-kernlab, r-cran-themis (>= 0.1.3)
Suggests: r-cran-bradleyterry2, r-cran-covr, r-cran-dplyr, r-cran-ellipse, r-cran-gam (>= 1.15), r-cran-ipred, r-cran-knitr, r-cran-mass, r-cran-matrix, r-cran-mgcv, r-cran-mlbench, r-cran-nnet, r-cran-party (>= 0.9-99992), r-cran-pls, r-cran-proxy, r-cran-randomforest, r-cran-rann, r-cran-rmarkdown, r-cran-rpart
Description-en: GNU R package for classification and regression training
 This GNU R package provides misc functions for training and plotting
 classification and regression models.
Description-md5: 568fff6316b184e50b859b0f39211d0d
Homepage: https://cran.r-project.org/package=caret
Section: gnu-r
Priority: optional
Filename: pool/main/r/r-cran-caret/r-cran-caret_6.0-93+dfsg-1_amd64.deb
Size: 3446832
MD5sum: d81b051a65be49cff8f69a1828f3bc3d
SHA256: 8225d86fd41959ba6c4314b0b3df39ff2f93fb5cd0218500bf4dc4f4d684151a
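
A minimal sketch of how the package requirement above could be checked from the Python side. The helper names (`apt_install_command`, `missing_packages`) are hypothetical and not part of pyaf; the only assumption about the system is that `dpkg-query` is available on Debian.

```python
import shutil
import subprocess

# Package names taken from the comment above; extend the list as needed.
REQUIRED_PACKAGES = ["r-cran-forecast", "r-cran-caret"]


def apt_install_command(packages):
    """Return the argv list that would install the given Debian packages."""
    return ["apt-get", "install", "--yes", *packages]


def missing_packages(packages):
    """Return the packages dpkg does not report as installed.

    Returns None when dpkg-query is not available (non-Debian system).
    """
    if shutil.which("dpkg-query") is None:
        return None
    missing = []
    for name in packages:
        result = subprocess.run(
            ["dpkg-query", "--show", "--showformat=${Status}", name],
            capture_output=True,
            text=True,
        )
        # An installed package reports the status "install ok installed".
        if "install ok installed" not in result.stdout:
            missing.append(name)
    return missing


if __name__ == "__main__":
    todo = missing_packages(REQUIRED_PACKAGES)
    if todo:
        print(" ".join(apt_install_command(todo)))
```

The script only prints the install command instead of running it, since installing packages needs root.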

@antoinecarme
Owner Author

antoinecarme commented Sep 5, 2022

A set of pyaf models is needed that generate custom R scripts, which in turn build the corresponding R forecasting models.

This is a prototyping environment; it can be slow, and that's OK.

All logs coming from R should be saved under /tmp/pyaf_prototyping/model_name_session/(train|predict).(err|log)

The training script is written by Python (and run by R) under /tmp/pyaf_prototyping/model_name/train.R

The training dataset is written by Python (and read by R) under /tmp/pyaf_prototyping/model_name/training.csv

R models are saved by R (and reloaded before each forecast/predict) under /tmp/pyaf_prototyping/model_name/model.rds

The forecasting/predict script is written by Python (and run by R) under /tmp/pyaf_prototyping/model_name/predict.R

The forecast/predict dataset is written by Python (and read by R) under /tmp/pyaf_prototyping/model_name/mode_name_input.csv

mode_name should contain the type of model (TAR, TSMARS, ...) and a unique string (date, process_id, etc.).

Output datasets are written by R (and read by Python) under /tmp/pyaf_prototyping/model_name/mode_name_output.csv
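
The layout above can be sketched in Python roughly as follows. This is an illustration, not the pyaf implementation: `unique_name`, `model_paths` and `predict_paths` are hypothetical helpers, the process id is used as the unique integer (an assumption), and the per-forecast file names follow the ones that appear in the sample scripts in this thread.

```python
import os
from datetime import datetime
from pathlib import Path

ROOT = Path("/tmp/pyaf_prototyping")


def unique_name(prefix, now=None, uid=None):
    """Build '<prefix>_<timestamp>_<integer>'; the pid is an assumed default."""
    now = now or datetime.now()
    uid = os.getpid() if uid is None else uid
    return f"{prefix}_{now.strftime('%Y%m%d%H%M%S.%f')}_{uid}"


def model_paths(model_name, root=ROOT):
    """Files shared by the training step of one model."""
    d = root / model_name
    return {
        "train_script": d / "train.R",
        "training_data": d / "training.csv",
        "model": d / "model.rds",
        "log": d / "train.log",
        "err": d / "train.err",
        "lock": d / "train.lock",
    }


def predict_paths(model_name, predict_name, root=ROOT):
    """Files used by one forecast/predict call on an already trained model."""
    d = root / model_name
    return {
        "input": d / f"{predict_name}_input.csv",
        "output": d / f"{predict_name}_output.csv",
        "log": d / f"{predict_name}.log",
        "err": d / f"{predict_name}.err",
        "lock": d / f"{predict_name}.lock",
    }


if __name__ == "__main__":
    name = unique_name("threshold_ar")
    for key, path in model_paths(name).items():
        print(key, path)
```

Keeping every path in one place makes it easy to write the same locations into the generated R scripts.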

@antoinecarme
Owner Author

antoinecarme commented Sep 5, 2022

Sample R training script for Threshold AR models (auto-generated by pyaf for each internal model)

write('', "/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.lock")

options(warn=1);
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.log" , open="wt"), type="output");
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.err" , open="wt"), type="message");
set.seed(1960)
paste("R_VERSION" , R.version.string)
df = read.csv("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/training.csv", header=TRUE)
library(NTS, quietly = TRUE);
cat("R_PACKAGE_VERSION",  "NTS", toString(packageVersion("NTS")) , "\n");
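# Search for the threshold of a two-regime TAR model (AR order 2 in each regime, delay 2).
thresholds.est = uTAR(y=df$TGT, p1=2, p2=2, d=2, thrQ=c(0,1), Trim=c(0.1,0.9), include.mean=TRUE, method="NeSS", k0=50);
# Estimate the TAR model using the threshold found above.
model = uTAR.est(y=df$TGT, arorder=c(2,2), thr=thresholds.est$thr, d=2);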
saveRDS(model, "/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/model.rds")

file.remove("/tmp/pyaf_prototyping/threshold_ar_20220905164142.004041_139800315743536/train.lock")

sink(type="output");
sink(type="message");
print('end')
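
For illustration, a script like the one above could be assembled and run from Python roughly as follows. This is a sketch under assumptions, not the actual pyaf generator: `render_train_script` and `run_rscript` are hypothetical names, the common header and footer are copied from the sample script (version logging omitted), and only the model-specific lines are passed in.

```python
import shutil
import subprocess
from pathlib import Path


def render_train_script(model_dir, body_lines):
    """Wrap model-specific R lines in the lock/sink/read/save boilerplate."""
    d = Path(model_dir).as_posix()
    header = [
        f"write('', \"{d}/train.lock\")",
        "options(warn=1);",
        f'sink(file("{d}/train.log", open="wt"), type="output");',
        f'sink(file("{d}/train.err", open="wt"), type="message");',
        "set.seed(1960)",
        f'df = read.csv("{d}/training.csv", header=TRUE)',
    ]
    footer = [
        f'saveRDS(model, "{d}/model.rds")',
        f'file.remove("{d}/train.lock")',
        'sink(type="output");',
        'sink(type="message");',
    ]
    return "\n".join(header + list(body_lines) + footer) + "\n"


def run_rscript(script_path):
    """Run a script with Rscript; return its exit code, or None if R is absent."""
    if shutil.which("Rscript") is None:
        return None
    return subprocess.run(["Rscript", "--vanilla", str(script_path)]).returncode


if __name__ == "__main__":
    body = [
        "library(NTS, quietly = TRUE);",
        "thresholds.est = uTAR(y=df$TGT, p1=2, p2=2, d=2, thrQ=c(0,1), "
        'Trim=c(0.1,0.9), include.mean=TRUE, method="NeSS", k0=50);',
        "model = uTAR.est(y=df$TGT, arorder=c(2,2), thr=thresholds.est$thr, d=2);",
    ]
    print(render_train_script("/tmp/pyaf_prototyping/example_model", body))
```

Plain f-strings are used rather than `string.Template` because R's `$` operator (as in `df$TGT`) would clash with the template placeholder syntax.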

@antoinecarme
Owner Author

antoinecarme commented Sep 5, 2022

Sample forecast/predict script for Threshold AR models (auto-generated by pyaf for each model forecast)

write('', "/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.lock")

options(warn=1);
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.log" , open="wt"), type="output");
sink(file("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.err" , open="wt"), type="message");
paste("R_VERSION" , R.version.string)
df = read.csv("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208_input.csv", header=TRUE)
reloaded_model = readRDS("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/model.rds")
library(NTS, quietly = TRUE);
cat("R_PACKAGE_VERSION",  "NTS", toString(packageVersion("NTS")) , "\n");
predicted = uTAR.pred(mode=reloaded_model, orig=0 , h=204 - sum(reloaded_model$nobs),iterations=100,ci=0.95,output=TRUE)
nempty = length(reloaded_model$data) -  length(reloaded_model$residuals)
residuals = rbind(matrix(0, nempty) , matrix(reloaded_model$residuals))
data = reloaded_model$data
fitted = data + residuals
predicted = rbind(fitted, predicted$pred)
write.csv(predicted, file = "/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208_output.csv")

file.remove("/tmp/pyaf_prototyping/threshold_ar_20220905164840.860942_140163095026208/predict_20220905164841.627680_140163095026208.lock")

sink(type="output");
sink(type="message");
print('end')
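
On the Python side, the result of such a predict script could be picked up along these lines. This is a sketch with hypothetical helper names (`r_script_completed`, `read_predictions`). It assumes that a lock file still present after R has exited means the script aborted before reaching `file.remove`, and that the CSV written by `write.csv` has a header row, the R row names in the first column and the values in the second.

```python
import csv
import math
from pathlib import Path


def r_script_completed(lock_path):
    """True if the lock file is gone, i.e. the R script reached file.remove()."""
    return not Path(lock_path).exists()


def read_predictions(output_csv):
    """Read the value column of the output CSV; R's NA becomes NaN."""
    with open(output_csv, newline="") as handle:
        rows = list(csv.reader(handle))
    values = []
    for row in rows[1:]:  # skip the header row written by write.csv
        cell = row[1]     # column 0 holds the R row names
        values.append(math.nan if cell == "NA" else float(cell))
    return values
```

If the lock file is still there, the matching .err file is the first place to look.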
