Hello

I'm looking for someone to point me in the right direction so I can start researching a solution to the problem below:

Problem

Linear regression with high dimensionality: p = 29 inputs, n ≈ 5000 observations. The input variables are generally quite highly correlated.

When using the model for prediction, data sets regularly have one or more input parameters missing. At the moment I just refit a least-squares (LSQ) solution on the training data with those inputs deleted. This seems to lead to quite unstable results. Stability is important for my application, in some senses more so than absolute accuracy.
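To make the current approach concrete, here is a minimal sketch of what "refit with the missing input deleted" could look like, assuming NumPy and a plain least-squares fit with an intercept. The function names `fit_reduced_lsq` and `predict_reduced` are made up for illustration and are not from any library.

```python
import numpy as np

def fit_reduced_lsq(X_train, y_train, missing_cols):
    """Refit ordinary least squares using only the inputs that are available."""
    missing = set(missing_cols)
    keep = [j for j in range(X_train.shape[1]) if j not in missing]
    # Design matrix: intercept column plus the retained inputs.
    A = np.column_stack([np.ones(len(X_train)), X_train[:, keep]])
    coef, *_ = np.linalg.lstsq(A, y_train, rcond=None)
    return keep, coef  # coef[0] is the intercept

def predict_reduced(x_new, keep, coef):
    """Predict for one or more rows, ignoring the columns that were dropped."""
    x_new = np.atleast_2d(x_new)
    A = np.column_stack([np.ones(len(x_new)), x_new[:, keep]])
    return A @ coef
```

With highly correlated inputs, the coefficients returned for different choices of `missing_cols` can differ a lot from one another, which is one possible source of the instability described above.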

--
Regularisation (e.g. Ridge) feels like it should help, but I'm not formally trained in stats. As I understand it, regularisation reduces the variance of the model fitted with all input variables, and doesn't necessarily achieve anything for model stability when input parameters are deleted.
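For reference, the Ridge idea applied to a reduced model (rather than only to the full model) might look like the sketch below. It assumes NumPy, the closed-form ridge solution on centred data, and inputs on comparable scales. The name `fit_reduced_ridge` and the penalty `lam` are hypothetical, and `lam` would need tuning. Whether this actually improves stability across deleted inputs is exactly the open question.

```python
import numpy as np

def fit_reduced_ridge(X_train, y_train, missing_cols, lam=1.0):
    """Ridge fit on the available inputs only; lam is an assumed penalty strength."""
    missing = set(missing_cols)
    keep = [j for j in range(X_train.shape[1]) if j not in missing]
    Xk = X_train[:, keep]
    # Centre inputs and target so the intercept is not penalised.
    x_mean = Xk.mean(axis=0)
    y_mean = y_train.mean()
    Xc = Xk - x_mean
    yc = y_train - y_mean
    # Closed-form ridge: solve (Xc'Xc + lam * I) beta = Xc'yc.
    gram = Xc.T @ Xc + lam * np.eye(len(keep))
    beta = np.linalg.solve(gram, Xc.T @ yc)
    intercept = y_mean - x_mean @ beta
    return keep, intercept, beta
```

Setting `lam=0` reduces this to the ordinary least-squares refit, so the two can be compared directly on the same set of deleted inputs.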

Thanks in advance. 
