.. _sphx_glr_auto_examples_ex_anonymous_transformer.py:

===============================
Anonymous transformers in skoot
===============================

Sometimes you have a pre-processing stage that finds itself awkwardly
positioned in the middle of your pipeline, and you're left with one of two
options:

1. Write a full transformer class
2. Break your pipeline up into pieces

The first is clearly preferable; however, your function often doesn't actually
fit any parameters from the training set, so a full transformer class feels
like overkill. This tutorial introduces anonymous, lightweight transformers
that can be created on the fly and fit into your modeling pipeline seamlessly.
.. rst-class:: sphx-glr-script-out

 Out::

    Absolute scaled values:
         StandardScaler1  ...  StandardScaler4
    73          0.354517  ...         0.022248
    18          0.133071  ...         1.179118
    118         2.304867  ...         1.490583
    78          0.232620  ...         0.422703
    76          1.207795  ...         0.289218
    31          0.498762  ...         1.045633
    64          0.254968  ...         0.155733
    141         1.329692  ...         1.490583
    68          0.476414  ...         0.422703
    82          0.011174  ...         0.022248
    110         0.842104  ...         1.090128
    12          1.230143  ...         1.446088
    36          0.376865  ...         1.312603
    9           1.108246  ...         1.446088
    19          0.864452  ...         1.179118
    56          0.598311  ...         0.556188
    104         0.842104  ...         1.357098
    69          0.254968  ...         0.111238
    55          0.133071  ...         0.155733
    132         0.720208  ...         1.357098
    29          1.352040  ...         1.312603
    127         0.354517  ...         0.823158
    26          0.986349  ...         1.045633
    128         0.720208  ...         1.223613
    131         2.548661  ...         1.090128
    145         1.085898  ...         1.490583
    108         1.085898  ...         0.823158
    143         1.207795  ...         1.490583
    45          1.230143  ...         1.179118
    30          1.230143  ...         1.312603

    [30 rows x 4 columns]

.. code-block:: python

    print(__doc__)

    # Author: Taylor Smith

    # #########################################################################
    # Introduce an interesting scenario
    from sklearn.model_selection import train_test_split
    from sklearn.pipeline import Pipeline

    from skoot.preprocessing import SelectiveStandardScaler
    from skoot.base import make_transformer
    from skoot.datasets import load_iris_df

    X = load_iris_df(tgt_name="target")
    y = X.pop('target')
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=42, test_size=0.2)

    # Let's say we want to scale our features with the StandardScaler, but
    # for whatever reason we only want the ABSOLUTE value of the scaled
    # values. We *could* create a transformer or split our pipeline, but
    # either case is clunky and could interrupt our CV process in a grid
    # search.
    #
    # So we'll instead define a simple stateless function that will be
    # wrapped in an "anonymous" transformer
    def make_abs(X):
        return X.abs()

    pipe = Pipeline([
        ("scale", SelectiveStandardScaler()),
        ("abs", make_transformer(make_abs))
    ])

    pipe.fit(X_train, y_train)
    print("Absolute scaled values: ")
    print(pipe.transform(X_test))

**Total running time of the script:** ( 0 minutes 0.017 seconds)

.. only:: html

 .. container:: sphx-glr-footer

  .. container:: sphx-glr-download

     :download:`Download Python source code: ex_anonymous_transformer.py <ex_anonymous_transformer.py>`

  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: ex_anonymous_transformer.ipynb <ex_anonymous_transformer.ipynb>`

.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
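The comments above note that splitting the pipeline would interrupt
cross-validation in a grid search. To illustrate why keeping the anonymous
step *inside* the pipeline matters, here is a minimal sketch using only stock
scikit-learn pieces (``StandardScaler`` and ``FunctionTransformer`` stand in
for skoot's ``SelectiveStandardScaler`` and ``make_transformer``; the
classifier and parameter grid are illustrative choices, not part of the
original example):

.. code-block:: python

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import GridSearchCV, train_test_split
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import FunctionTransformer, StandardScaler

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=42, test_size=0.2)

    # The anonymous "abs" step is just a wrapped function, so it travels
    # with the pipeline through every CV fold -- no manual pre-processing
    # outside the search.
    pipe = Pipeline([
        ("scale", StandardScaler()),
        ("abs", FunctionTransformer(np.abs)),
        ("clf", LogisticRegression(max_iter=500)),
    ])

    # Only the downstream estimator is tuned; the function-based step is
    # transparent to the parameter grid.
    search = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=3)
    search.fit(X_train, y_train)
    print(search.best_params_)

Because the absolute-value step is part of the estimator, each CV fold scales
and transforms only its own training split, avoiding leakage you'd risk by
transforming the full dataset up front.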