SAS: Standardize Variables

A standardized variable, also called a z-score, has a mean of 0 and a standard deviation of 1. When variables measured on different scales are entered into a regression analysis, the variables with larger variances tend to have more influence on the results than the variables with smaller variances. It is therefore important to standardize variables in the preprocessing step for regression analysis, cluster analysis, and neural networks.

To calculate a standardized variable, subtract the mean from the original variable and then divide the result by the standard deviation.
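As a minimal sketch of that formula, the z-score for a single variable can be computed by hand in PROC SQL. The output data set name manual_z and the new column z_var1 are hypothetical names chosen for illustration; data1 and var1 are the names used in the rest of this post.

```sas
/* Manual z-score for one variable: (value - mean) / standard deviation. */
/* manual_z and z_var1 are illustrative names, not part of the original. */
proc sql;
    create table manual_z as
    select *,
           (var1 - mean(var1)) / std(var1) as z_var1
    from data1;
quit;
```

PROC SQL remerges the summary statistics back onto each row, so the log will show a note about remerging. Rows where var1 is missing get a missing z_var1.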

Get the means and standard deviations before standardizing the variables.

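/* PROC UNIVARIATE writes statistics to a data set through the OUTPUT statement */
proc univariate data=data1;
    var var1 var2 var3;
    output out=stat mean=mean1 mean2 mean3 std=std1 std2 std3;
run;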

In the data set, the variables are not measured in the same units and cannot be assumed to have equal variance. We use PROC STDIZE to standardize the variables.

proc stdize data=data1 out=Stand method=std;
    var var1 var2 var3;
run;
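For comparison, here is a sketch of the same z-score standardization using PROC STANDARD, which rescales the listed variables to a requested mean and standard deviation. This is an alternative to the PROC STDIZE step above, not the method used in this post, and the output data set name Stand2 is a hypothetical name chosen so that Stand is not overwritten.

```sas
/* Alternative sketch: rescale var1-var3 to mean 0 and standard deviation 1. */
/* Stand2 is an illustrative data set name.                                  */
proc standard data=data1 mean=0 std=1 out=Stand2;
    var var1 var2 var3;
run;
```

With default settings the results should match the PROC STDIZE output, but it is worth confirming this on your own data.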

Get the means and standard deviations after standardizing the variables. The means should equal 0 and the standard deviations should equal 1, up to floating-point rounding.

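/* Same check on the standardized data; this overwrites the earlier stat data set */
proc univariate data=Stand;
    var var1 var2 var3;
    output out=stat mean=mean1 mean2 mean3 std=std1 std2 std3;
run;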