skoot.base.make_transformer

skoot.base.make_transformer(func, **kwargs)

Make a function into a scikit-learn TransformerMixin.
Wraps a commutative function as an anonymous BasePDTransformer so that it fits into a Pipeline. The methods of the returned transformer class adhere to the standard BasePDTransformer fit and transform signatures.

This is useful when a transforming function that does not fit any parameters is used to pre-process data at a point that might split a pipeline.
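To make the wrapping idea concrete, here is a minimal, dependency-free sketch of a stateless function wrapper. It is not skoot's implementation: the class name `FunctionWrapper` and its `set_params` method are hypothetical and only illustrate how stored keyword arguments can be forwarded to the function and later retuned.

```python
class FunctionWrapper:
    """Hypothetical stand-in for the anonymous transformer (not skoot's code)."""

    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = kwargs

    def fit(self, X, y=None):
        # Nothing is learned from the data, so fit only returns self.
        return self

    def transform(self, X):
        # Forward the stored keyword arguments to the wrapped function.
        return self.func(X, **self.kwargs)

    def set_params(self, **params):
        # Lets a search routine overwrite the stored keyword arguments.
        self.kwargs.update(params)
        return self


def subtract_k(x, k):
    return x - float(k)


wrapper = FunctionWrapper(subtract_k, k=2)
result = wrapper.fit(None).transform(10.0)   # 10.0 - 2.0

wrapper.set_params(k=1)
retuned = wrapper.transform(10.0)            # 10.0 - 1.0
```

Because `fit` does no work, the only tunable state is the keyword arguments, which is what allows a grid search to vary them.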
Parameters:

func : callable
    The function that will be used to transform a dataset. Note that for certain scikit-learn operations or for model persistence, this will need to be pickled. Therefore, using a closure or lambda expression could cause downstream issues that are not immediately apparent. This function will raise a warning if it's determined that a lambda expression is passed as func, but not all corner cases can be caught. Be cautious.

**kwargs : keyword args or dict, optional
    A dictionary of keyword args that will be passed to the transformer class's transform function (func) and enable the anonymous transformer to be tuned via grid search, similar to other transformers.

Examples
>>> from sklearn.datasets import load_iris
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.decomposition import PCA
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.linear_model import LogisticRegression
>>> from skoot.base import make_transformer
>>> X, y = load_iris(return_X_y=True)
>>>
>>> def subtract_k(x, k):
...     return x - float(k)
>>>
>>> pipe = Pipeline([
...     ('pca', PCA()),
...     ('custom', make_transformer(subtract_k, k=2)),
...     ('clf', LogisticRegression(random_state=42))
... ])
>>>
>>> hyper_params = {"pca__whiten": [True, False],
...                 "custom__k": [1, 2]}
>>> search = GridSearchCV(pipe, param_grid=hyper_params,
...                       scoring="accuracy")
>>> search.fit(X, y)
GridSearchCV(...)
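
The pickling caveat described for func can be checked with the standard library alone. The sketch below uses a hypothetical helper, `can_pickle`, to compare a module-level function with a lambda and a closure; it does not call skoot and makes no claim about how skoot detects lambdas.

```python
import pickle


def subtract_k(x, k):
    # Module-level function: pickle stores it by qualified name.
    return x - float(k)


def make_subtractor(k):
    # Returns a closure, which has no importable qualified name.
    def inner(x):
        return x - float(k)
    return inner


def can_pickle(obj):
    """Hypothetical helper: report whether pickle.dumps succeeds."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


named_ok = can_pickle(subtract_k)
lambda_ok = can_pickle(lambda x, k: x - float(k))
closure_ok = can_pickle(make_subtractor(2))

print("named function:", named_ok)
print("lambda:", lambda_ok)
print("closure:", closure_ok)
```

The lambda and the closure cannot be pickled by the standard pickle module, while a function defined at module level in an importable module generally can. Defining func at module level, as subtract_k is in the example above, avoids the problem.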