Data Center Transformation

Statistics Help Please!!! Multiple Linear Regression?

I’m doing a statistics problem where I have to find the best-fitting reduced model. My data has a y, x1, and x2. The R^2 value is much too low, and the residual plots show a trend (so the model is inadequate).

I found that the x2 was NOT significant using the t-test and looking at the p-value (and found that x1 IS significant), so I removed the x2 and ran the model again.

However, removing the x2 decreased my R^2 value!

What am I supposed to do now? Do I keep the new model even though it is worse and inadequate? Do I not remove the x2 even though it is insignificant? Either way, the model is inadequate.

Should I perform a transformation or ‘center’ it? If so, do I do this to the old model or to the model without x2?

Thanks!!
Also, my residuals plot still has a trend after I remove the X2.

It’s difficult to answer this without seeing the data, without knowing what y, x1, and x2 represent, etc.

A couple of comments/thoughts:
1) Mathematically, R^2 cannot increase when a variable is removed from a model (and conversely, R^2 cannot decrease when any variable, regardless of utility, is added). That is why analysts often look at an R^2 adjusted for the number of variables in the model.
2) What is too high or too low for R^2 can't be answered in the absolute. In some contexts an R^2 of .40 is great; in others it is poor.
3) Since there is a trend in the residuals, it sounds like there is a variable missing. If you have another variable, try it; a transformation is certainly worth trying; how about a squared term for x1?
4) I would not put x2 back in. You already showed it doesn't add any information.
5) Any outliers in the data?
6) Consider looking at the mean square error of the model as a measure of what the “best” model is, and not just R^2
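
To make points 1, 3, and 6 concrete, here is a minimal sketch in Python with NumPy. The data are synthetic and entirely made up for illustration (y is generated from x1 and x1 squared, with x2 as pure noise), and `fit_ols` is a hypothetical helper name, not something from the original problem. It compares R^2, adjusted R^2, and mean square error for the full model, the model without x2, and a model with a squared x1 term.

```python
import numpy as np

def fit_ols(X, y):
    """Least-squares fit with an intercept; returns (R^2, adjusted R^2, MSE)."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # prepend intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
    resid = y - A @ beta
    sse = float(resid @ resid)                    # residual sum of squares
    sst = float(((y - y.mean()) ** 2).sum())      # total sum of squares
    p = A.shape[1] - 1                            # number of predictors
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    mse = sse / (n - p - 1)                       # mean square error
    return r2, adj_r2, mse

# Synthetic data (assumption): y depends on x1 and x1^2; x2 is unrelated noise.
rng = np.random.default_rng(0)
n = 100
x1 = rng.uniform(0, 10, n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 + 0.4 * x1**2 + rng.normal(scale=2.0, size=n)

full = fit_ols(np.column_stack([x1, x2]), y)         # y ~ x1 + x2
reduced = fit_ols(x1.reshape(-1, 1), y)              # y ~ x1
quadratic = fit_ols(np.column_stack([x1, x1**2]), y) # y ~ x1 + x1^2

for name, (r2, adj_r2, mse) in [("full", full), ("reduced", reduced),
                                ("quadratic", quadratic)]:
    print(f"{name:10s} R^2={r2:.4f}  adj R^2={adj_r2:.4f}  MSE={mse:.3f}")
```

The reduced model's R^2 can never exceed the full model's, so a small drop after removing x2 is expected and is not by itself a reason to put x2 back. Adjusted R^2 and MSE penalize extra predictors, which makes them more useful for comparing models of different sizes.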


How do you create an interaction term with a log transformed variable in SPSS?

I am analyzing my dissertation data and had to use a log10 transformation for one of my IVs because the distribution was skewed. I now need to create an interaction term using that transformed variable. In order to do that, does the transformed variable need to be centered by subtracting the mean from each value? I know that’s what you normally have to do for IVs but I wasn’t sure what to do with a log transformed IV.

Thanks!

I don’t think you will need to if the transformation actually adjusted for the skew. Whether you need to or not would depend on the variable. I don’t like manipulating my variables too much. I would have looked for new ways to transform the IV.
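
If you do decide to center, the mechanics are the same for a log-transformed IV as for any other: take the log first, then subtract the mean of the logged values, then multiply. Here is a minimal sketch in Python with NumPy rather than SPSS syntax. The variable names (`iv_skewed`, `moderator`) and the simulated data are assumptions made up for illustration, not your dissertation variables.

```python
import numpy as np

# Hypothetical data: one right-skewed IV and one roughly normal IV.
rng = np.random.default_rng(1)
iv_skewed = rng.lognormal(mean=3.0, sigma=1.0, size=200)
moderator = rng.normal(loc=50.0, scale=10.0, size=200)

# Step 1: log10-transform the skewed IV.
log_iv = np.log10(iv_skewed)

# Step 2: center each predictor by subtracting its own mean
# (for the logged IV, the mean of the logged values, not the log of the raw mean).
log_iv_c = log_iv - log_iv.mean()
moderator_c = moderator - moderator.mean()

# Step 3: the interaction term is the product of the two centered predictors.
interaction_c = log_iv_c * moderator_c

# For comparison: the same product built from uncentered predictors.
interaction_raw = log_iv * moderator

# Correlation between the logged IV and each version of the interaction term.
corr_raw = np.corrcoef(log_iv, interaction_raw)[0, 1]
corr_c = np.corrcoef(log_iv_c, interaction_c)[0, 1]
print(f"uncentered: r = {corr_raw:.3f}   centered: r = {corr_c:.3f}")
```

In this simulated example the uncentered product term is strongly correlated with the logged IV, while the centered product is much less so. That reduction in collinearity is the usual reason given for centering before building an interaction term.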
