Power Transform:
A power transformation maps data from an arbitrary distribution to one that is as close to Gaussian as possible, since many modeling scenarios assume approximately normal features. It also helps stabilize variance and minimize skewness.
For instance, the following algorithms may perform better or converge faster when features are close to normally distributed:
· linear and logistic regression
· nearest neighbors
· neural networks
· support vector machines with radial basis function (RBF) kernels
· principal components analysis
· linear discriminant analysis
The PowerTransformer class is available in the sklearn.preprocessing module. PowerTransformer provides two transformations: the Yeo-Johnson transform and the Box-Cox transform.
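As a minimal sketch of how the class might be used, the snippet below applies both methods to a synthetic right-skewed feature. The log-normal sample, the seed, and the variable names are assumptions made for illustration only; the skewness check uses scipy.stats.skew.

```python
import numpy as np
from scipy.stats import skew
from sklearn.preprocessing import PowerTransformer

# Hypothetical data: a single right-skewed, strictly positive feature.
rng = np.random.default_rng(0)
X = rng.lognormal(size=(1000, 1))

# Yeo-Johnson is the default method; standardize=True (also the default)
# rescales the transformed output to zero mean and unit variance.
pt_yj = PowerTransformer(method="yeo-johnson")
pt_bc = PowerTransformer(method="box-cox")

X_yj = pt_yj.fit_transform(X)
X_bc = pt_bc.fit_transform(X)

# Compare skewness before and after the transformation.
skew_raw = skew(X.ravel())
skew_yj = skew(X_yj.ravel())
skew_bc = skew(X_bc.ravel())
print("skewness raw / Yeo-Johnson / Box-Cox:", skew_raw, skew_yj, skew_bc)
```

On data like this, the skewness of the transformed feature should be much closer to zero than that of the raw feature.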
The Yeo-Johnson transform is:
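Following the definition given in the scikit-learn documentation, where x_i denotes an original value and x_i^(λ) its transformed value:

```latex
x_i^{(\lambda)} =
\begin{cases}
\dfrac{(x_i + 1)^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0,\ x_i \geq 0 \\[6pt]
\ln(x_i + 1) & \text{if } \lambda = 0,\ x_i \geq 0 \\[6pt]
-\dfrac{(-x_i + 1)^{2 - \lambda} - 1}{2 - \lambda} & \text{if } \lambda \neq 2,\ x_i < 0 \\[6pt]
-\ln(-x_i + 1) & \text{if } \lambda = 2,\ x_i < 0
\end{cases}
```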
The Box-Cox transform is:
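Following the definition given in the scikit-learn documentation, where x_i denotes an original (strictly positive) value and x_i^(λ) its transformed value:

```latex
x_i^{(\lambda)} =
\begin{cases}
\dfrac{x_i^{\lambda} - 1}{\lambda} & \text{if } \lambda \neq 0 \\[6pt]
\ln(x_i) & \text{if } \lambda = 0
\end{cases}
```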
Important Points:
Box-Cox can only be applied to strictly positive data, whereas Yeo-Johnson supports both positive and negative values.
Both transformations are parameterized by λ, which is determined through maximum likelihood estimation.
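The two points above can be checked with a short sketch. The small array of mixed-sign values is made up for illustration; lambdas_ is the fitted attribute that holds one estimated λ per feature.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer

# Hypothetical feature containing negative, zero, and positive values.
X_mixed = np.array([[-2.0], [-0.5], [0.0], [1.0], [3.0], [10.0]])

# Yeo-Johnson accepts this input; the fitted λ is stored in lambdas_.
yj = PowerTransformer(method="yeo-johnson").fit(X_mixed)
print("Yeo-Johnson λ per feature:", yj.lambdas_)

# Box-Cox is expected to reject input that is not strictly positive.
try:
    PowerTransformer(method="box-cox").fit(X_mixed)
    box_cox_failed = False
except ValueError as exc:
    box_cox_failed = True
    print("Box-Cox raised ValueError:", exc)
```

If the data contains zeros or negative values and Box-Cox is still required, a common workaround is to shift the feature so that it becomes strictly positive before fitting; whether that is appropriate depends on the data.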
References:
https://scikit-learn.org/stable/modules/preprocessing.html#preprocessing-scaler