Binning transformer

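Demonstrates how to bin continuous features in your dataset with skoot's BinningTransformer. Here, two iris features are each split into four bins using the "uniform" strategy.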


[Figure: histogram of iris feature 'a' with dashed vertical lines marking the bin boundaries]

Out:

     a  ...      b_binned
0  5.1  ...  (3.30, 3.90]
1  4.9  ...  (2.70, 3.30]
2  4.7  ...  (2.70, 3.30]
3  4.6  ...  (2.70, 3.30]
4  5.0  ...  (3.30, 3.90]

[5 rows x 6 columns]

print(__doc__)

# Author: Taylor Smith <taylor.smith@alkaline-ml.com>

from matplotlib import pyplot as plt
from skoot.datasets import load_iris_df
from skoot.preprocessing import BinningTransformer

# #############################################################################
# load the data and configure the binner
iris = load_iris_df(include_tgt=False, names=["a", "b", "c", "d"])
binner = BinningTransformer(cols=["a", "b"], return_bin_label=True,
                            strategy="uniform", overwrite=False,
                            n_bins=4)

# print the head of the binned dataset
print(binner.fit_transform(iris).head())

# #############################################################################
# Show where the boundaries reside

a_lower = binner.bins_["a"].lower_bounds[1:]  # skip the -np.inf
plt.hist(iris["a"].values)

# draw a dashed vertical line at each bin boundary
for bound in a_lower:
    plt.axvline(bound, ls="--")
plt.title("Iris feature 'a' + bin markers")
plt.show()
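
For readers without skoot installed, the idea behind equal-width binning can be sketched with pandas alone. This is a minimal illustration, not skoot's implementation. It assumes that the "uniform" strategy means equal-width intervals over the observed range of a column, and the data and the `a_binned` column name below are hypothetical stand-ins. The bin edges and labels BinningTransformer produces may differ.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data (not the iris values used in the example above)
df = pd.DataFrame({"a": [4.3, 5.1, 5.8, 6.4, 7.9]})

n_bins = 4

# Assumption: "uniform" binning means equal-width intervals that span
# the observed minimum and maximum of the column
edges = np.linspace(df["a"].min(), df["a"].max(), n_bins + 1)

# include_lowest=True keeps the minimum value inside the first bin
df["a_binned"] = pd.cut(df["a"], bins=edges, include_lowest=True)

# Integer bin codes, useful when interval labels are not needed
codes = pd.cut(df["a"], bins=edges, include_lowest=True, labels=False)

print(df)
```

Passing `labels=False` returns integer bin codes instead of interval labels, which is often more convenient as model input.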

Total running time of the script: (0 minutes 0.059 seconds)
