Project completed for a statistics class.
The protein.RData file contains the data.train set, whose column Y is the level of a certain protein in a group of patients (computer-generated data).
Task:
- Select the best subset of variables, of any size, so that the MSE is as small as possible.
- Prepare predictions for data.test (prot_pred.Rdata file).
I used ridge and lasso regression.
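
For readers unfamiliar with the two methods, here is a minimal sketch of how ridge and lasso can be compared by validation MSE, and how lasso ends up selecting a subset of variables. It is not the project's code: it uses only the Python standard library and synthetic data (it does not read protein.RData), and every name in it (fit_penalized, make_data, the lambda grid, the 60/40 split) is an assumption made for illustration. The penalty is written in the elastic-net form, where alpha = 1 gives lasso and alpha = 0 gives ridge.

```python
import random

def soft_threshold(z, g):
    """Shrink z toward zero by g; values within [-g, g] become exactly 0."""
    if z > g:
        return z - g
    if z < -g:
        return z + g
    return 0.0

def fit_penalized(X, y, lam, alpha, n_iter=200):
    """Coordinate descent for
        (1/2n) * ||y - X b||^2 + lam * (alpha * ||b||_1 + (1 - alpha)/2 * ||b||_2^2).
    alpha = 1 is lasso, alpha = 0 is ridge. No intercept is fitted, so the
    data are assumed to be centered (true for the synthetic data below)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residuals y - X b, kept up to date as b changes
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of column j with the residual that excludes column j.
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * alpha) / (col_sq[j] + lam * (1 - alpha))
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[i][j]
                beta[j] = new
    return beta

def mse(X, y, beta):
    """Mean squared prediction error of the linear model with coefficients beta."""
    n = len(X)
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n)) / n

def make_data(n, rng):
    """Synthetic stand-in: 6 predictors, only the first two affect Y."""
    X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(n)]
    y = [3 * row[0] - 2 * row[1] + rng.gauss(0, 0.5) for row in X]
    return X, y

rng = random.Random(0)
X_train, y_train = make_data(60, rng)
X_val, y_val = make_data(40, rng)

lambdas = [0.01, 0.05, 0.1, 0.2, 0.5]  # hypothetical penalty grid
results = {}
for name, alpha in (("ridge", 0.0), ("lasso", 1.0)):
    fits = [(lam, fit_penalized(X_train, y_train, lam, alpha)) for lam in lambdas]
    # Keep the penalty whose fit has the lowest MSE on the held-out set.
    best_lam, best_beta = min(fits, key=lambda f: mse(X_val, y_val, f[1]))
    results[name] = (best_lam, best_beta, mse(X_val, y_val, best_beta))

# Lasso sets some coefficients exactly to zero; the rest form the selected subset.
selected = [j for j, b in enumerate(results["lasso"][1]) if b != 0.0]
for name, (lam, _, err) in results.items():
    print(name, "best lambda:", lam, "validation MSE:", round(err, 3))
print("variables kept by lasso:", selected)
```

Ridge shrinks all coefficients but keeps every variable, so only lasso yields a subset directly. In practice the penalty would more likely be chosen by cross-validation than by a single held-out split.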