I have fitted regularized Lasso models on PolynomialFeatures of four degrees (1, 3, 7, 11) in scikit-learn. I generated predictions for 100 evenly spaced points on the interval [0, 20] and stored the results in a NumPy array. My task is to return the R^2 score for each Lasso model against a new "gold standard" test set generated from the true underlying cubic polynomial, without noise. The original data, on which the code below is based, includes a noise term. I have to compute this new test set by evaluating the true noise-free underlying function
t^3/20 - t^2 - t at each of 100 evenly spaced points on the interval [0, 20], and ultimately select the degree whose R^2 indicates the best fit to that function. Here is my code so far:
```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso

degs = (1, 3, 7, 11)
las_r2 = []
preds = np.zeros((4, 100))

for i, deg in enumerate(degs):
    poly = PolynomialFeatures(degree=deg)
    X_poly = poly.fit_transform(X_train)
    linlasso = Lasso(alpha=0.01, max_iter=10000).fit(X_poly, y_train)
    # predict on 100 evenly spaced points over [0, 20]
    y_poly = linlasso.predict(poly.transform(np.linspace(0, 20, 100).reshape(-1, 1)))
    preds[i, :] = y_poly
    # score against the (noisy) held-out test set for now
    X_test_poly = poly.transform(X_test)
    las_r2.append(linlasso.score(X_test_poly, y_test))

answer = max(las_r2)
```

(Note I initialize `las_r2` as a list rather than aliasing it to the `preds` array, use `poly.transform` instead of re-fitting on the test inputs, and take `max(las_r2)` since a Python list has no `.max()` method.)
What I don't know is how to incorporate the "gold standard" function described above into my for-loop.
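One way this could look, sketched end to end so it runs on its own: evaluate the noise-free function on the same 100 evenly spaced points the models predict on, then score each model's predictions against those targets with `sklearn.metrics.r2_score` instead of `linlasso.score` on the noisy test set. The training data here (`X_train`, `y_train`, the noise scale, and the seed) is made up for illustration; in the real code the existing variables would be used.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Lasso
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)

def f(t):
    # the true underlying cubic, no noise
    return t**3 / 20 - t**2 - t

# hypothetical noisy training data standing in for X_train / y_train
X_train = rng.uniform(0, 20, 15).reshape(-1, 1)
y_train = f(X_train).ravel() + rng.normal(scale=5, size=15)

# gold-standard targets: the noise-free function on 100 evenly spaced points
t = np.linspace(0, 20, 100)
y_gold = f(t)

degs = (1, 3, 7, 11)
las_r2 = []
for deg in degs:
    poly = PolynomialFeatures(degree=deg)
    X_poly = poly.fit_transform(X_train)
    linlasso = Lasso(alpha=0.01, max_iter=10000).fit(X_poly, y_train)
    # predict on the same grid the gold standard was evaluated on
    y_pred = linlasso.predict(poly.transform(t.reshape(-1, 1)))
    # R^2 of this model's predictions against the noise-free targets
    las_r2.append(r2_score(y_gold, y_pred))

# degree whose predictions best match the true function
best_degree = degs[int(np.argmax(las_r2))]
```

Since the predictions for all four models are already stored row-by-row in `preds`, the same scoring can also be done after the loop with `[r2_score(y_gold, preds[i, :]) for i in range(4)]`.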