After fitting our model it appears that our validation curve is inverted:
The validation RMSE is systematically lower than the training RMSE, which does not make intuitive sense to us.
The modelling was produced with the following code:
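```python
import numpy as np
import pandas as pd
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error as mse  # assumed origin of `mse`
from sklearn.model_selection import KFold, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from tqdm import tqdm

# X and y are the feature DataFrame and target Series (not shown in the issue).
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=1/3, random_state=1)
lambdas = np.logspace(0, 8, 12)
folds = KFold(n_splits=5)
MSE_list = []

for _lambda in tqdm(lambdas):
    pipe_preproc = make_pipeline(PolynomialFeatures(2), StandardScaler(),
                                 Lasso(alpha=_lambda, max_iter=1000))
    MSE_train = []
    MSE_list_intermediate = []

    for train_index, val_index in tqdm(folds.split(X_train, y_train)):
        X_tr, y_tr = X_train.iloc[train_index], y_train.iloc[train_index]
        X_val, y_val = X_train.iloc[val_index], y_train.iloc[val_index]
        # RMSE on the held-out fold
        MSE_list_intermediate.append(mse(y_val, pipe_preproc.fit(X_tr, y_tr).predict(X_val))**(1/2))
        # RMSE on all of X_train (fitted on X_tr only)
        MSE_train.append(mse(y_train, pipe_preproc.fit(X_tr, y_tr).predict(X_train))**(1/2))

    # One row per lambda: the five fold RMSEs, their mean, then the mean training RMSE
    MSE_list.append([_lambda] + MSE_list_intermediate
                    + [np.mean(MSE_list_intermediate)] + [np.mean(MSE_train)])

MSE = pd.DataFrame(MSE_list)
MSE.columns = ["Lambda", "Fold 1", "Fold 2", "Fold 3", "Fold 4", "Fold 5", "Mean_RMSE", "Mean_RMSE_Evaluation"]
MSE.to_excel("LASSO_output.xlsx")
```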
Can anybody help us?
Kind regards, Anton and Søren
Hi @aabk-bkaa, assuming that you did not plot the data with the curves labelled incorrectly, there could be other reasons for the RMSE being lower on the validation data than on the training data. See: https://stats.stackexchange.com/questions/187335/validation-error-less-than-training-error
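One way to rule out a labelling mix-up is to let scikit-learn return both scores under explicit names. In the posted code the column `Mean_RMSE` holds the mean validation RMSE and `Mean_RMSE_Evaluation` holds the training RMSE, which is also computed on all of `X_train` rather than on the fold's training part, so it is worth checking which column was plotted as which. The sketch below is only an illustration under assumptions: the data is synthetic (`make_regression` stands in for the `X` and `y` that the issue does not show), and the `alpha` grid, `shuffle=True` and `max_iter=10000` are choices made here, not taken from the original code.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_validate
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic stand-in data (assumption: the issue's real X and y are not available).
X, y = make_regression(n_samples=300, n_features=5, noise=10.0, random_state=1)

folds = KFold(n_splits=5, shuffle=True, random_state=1)
rows = []
for alpha in np.logspace(-2, 2, 5):
    pipe = make_pipeline(PolynomialFeatures(2), StandardScaler(),
                         Lasso(alpha=alpha, max_iter=10000))
    # return_train_score=True scores each fold's model on its own training part,
    # so the training and validation numbers come from the same fitted model.
    scores = cross_validate(pipe, X, y, cv=folds,
                            scoring="neg_mean_squared_error",
                            return_train_score=True)
    train_rmse = np.sqrt(-scores["train_score"]).mean()
    val_rmse = np.sqrt(-scores["test_score"]).mean()
    rows.append((alpha, train_rmse, val_rmse))

for alpha, train_rmse, val_rmse in rows:
    print(f"alpha={alpha:g}  train RMSE={train_rmse:.2f}  validation RMSE={val_rmse:.2f}")
```

If the validation RMSE still comes out below the training RMSE when computed this way, the explanations in the linked Stack Exchange thread are the next thing to check.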