skoot.base.make_transformer

skoot.base.make_transformer(func, **kwargs)

Make a function into a scikit-learn TransformerMixin.

Wraps a commutative function as an anonymous BasePDTransformer so that it can be used as a step in a Pipeline. The methods of the returned transformer adhere to the standard BasePDTransformer fit and transform signatures.

This is useful when a function that learns no parameters from the data is needed to pre-process data partway through a workflow, at a point where the pipeline might otherwise have to be split.

Parameters:

func : callable

The function that will be used to transform a dataset. Note that for certain scikit-learn operations or for model persistence, this will need to be pickled. Therefore, using a closure or lambda expression could cause downstream issues that are not immediately apparent. This function will raise a warning if it’s determined that a lambda expression is passed as func, but not all corner cases can be caught. Be cautious.

**kwargs : keyword args or dict, optional

Keyword arguments that are passed to func when the transformer’s transform method is called. They also allow the anonymous transformer to be tuned via grid search, like other transformers.
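The pickling caveat described for func can be checked before a function is wrapped. The following is a minimal sketch using only the standard library; is_picklable is a hypothetical helper written for illustration and is not part of skoot, and it is not how make_transformer detects lambdas internally.

```python
import math
import pickle


def is_picklable(func):
    """Return True if ``func`` survives a pickle round trip.

    Hypothetical helper for illustration only; not part of skoot.
    """
    try:
        pickle.loads(pickle.dumps(func))
    except (pickle.PicklingError, AttributeError, TypeError):
        # Lambdas and locally defined closures cannot be looked up by
        # name on import, so the standard pickler rejects them.
        return False
    return True


# An importable, named function pickles by reference.
named_ok = is_picklable(math.sqrt)

# A lambda has no importable name, so pickling fails.
lambda_ok = is_picklable(lambda x: x - 2.0)

print(named_ok, lambda_ok)
```

A check like this only covers the standard pickle module. Functions defined at module level in an importable file are generally the safest choice for func.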

Examples

>>> from sklearn.datasets import load_iris
>>> from sklearn.pipeline import Pipeline
>>> from sklearn.decomposition import PCA
>>> from sklearn.model_selection import GridSearchCV
>>> from sklearn.linear_model import LogisticRegression
>>> from skoot.base import make_transformer
>>> X, y = load_iris(return_X_y=True)
>>>
>>> def subtract_k(x, k):
...     return x - float(k)
>>>
>>> pipe = Pipeline([
...     ('pca', PCA()),
...     ('custom', make_transformer(subtract_k, k=2)),
...     ('clf', LogisticRegression(random_state=42))
... ])
>>>
>>> hyper_params = {"pca__whiten": [True, False],
...                 "custom__k": [1, 2]}
>>> search = GridSearchCV(pipe, param_grid=hyper_params,
...                       scoring="accuracy")
>>> search.fit(X, y)  
GridSearchCV(...)
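The grid search in the example varies custom__k, which works because the keyword arguments are exposed as tunable parameters. The sketch below illustrates that idea in plain Python, assuming a simplified get_params/set_params protocol. FunctionTransformerSketch is a hypothetical stand-in and is not skoot's actual implementation, which returns a BasePDTransformer.

```python
class FunctionTransformerSketch:
    """Hypothetical stand-in for the anonymous transformer.

    Illustrates the mechanism only; not skoot's implementation.
    """

    def __init__(self, func, **kwargs):
        self.func = func
        self.kwargs = dict(kwargs)

    def get_params(self, deep=True):
        # Expose the keyword arguments as tunable parameters.
        return dict(self.kwargs)

    def set_params(self, **params):
        # A grid search would call this with each candidate value.
        self.kwargs.update(params)
        return self

    def fit(self, X, y=None):
        # Nothing is learned from the data.
        return self

    def transform(self, X):
        # Forward the stored keyword arguments to the wrapped function.
        return self.func(X, **self.kwargs)


def subtract_k(x, k):
    # List-based variant of the example function, to avoid dependencies.
    return [v - float(k) for v in x]


trans = FunctionTransformerSketch(subtract_k, k=2)
first = trans.fit([3, 4]).transform([3, 4])
second = trans.set_params(k=1).transform([3, 4])
print(first, second)
```

Because fit learns nothing, changing k through set_params takes effect on the next call to transform without refitting.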
