.. _sphx_glr_auto_examples_balance_ex_oversample.py:

=============================
Oversampling minority samples
=============================

This example creates an imbalanced binary classification dataset and
oversamples the minority class until it reaches a target ratio relative to
the majority class (here, ``balance_ratio=0.2``).
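To make the idea of a target ratio concrete, here is a minimal sketch of random oversampling with replacement, written with NumPy only. It is not skoot's implementation: the helper name ``random_oversample``, the rounding rule (truncation toward zero), and the demo arrays are assumptions made for illustration. The demo reuses the class counts reported in this example's output (29 versus 471).

```python
import numpy as np


def random_oversample(X, y, balance_ratio=0.2, random_state=None):
    """Hypothetical helper for illustration; NOT skoot's over_sample_balance.

    Every class with fewer than int(balance_ratio * majority_count) rows is
    topped up to that size by resampling its own rows with replacement.
    """
    X, y = np.asarray(X), np.asarray(y)
    rng = np.random.default_rng(random_state)

    classes, counts = np.unique(y, return_counts=True)
    # Assumed rounding rule: truncate the target size to an integer.
    target = int(balance_ratio * counts.max())

    extra_idx = []
    for cls, count in zip(classes, counts):
        if count < target:
            members = np.flatnonzero(y == cls)
            # Draw the missing rows from the class's existing rows.
            extra_idx.append(
                rng.choice(members, size=target - count, replace=True))

    if not extra_idx:
        return X, y  # every class already meets the target

    # Keep all original rows and append the resampled ones.
    keep = np.concatenate([np.arange(len(y))] + extra_idx)
    return X[keep], y[keep]


# Demo data (assumed): 29 rows of class 0 and 471 rows of class 1.
demo_rng = np.random.default_rng(0)
y_demo = np.array([0] * 29 + [1] * 471)
X_demo = demo_rng.normal(size=(500, 20))

X_bal, y_bal = random_oversample(X_demo, y_demo,
                                 balance_ratio=0.2, random_state=42)
```

With these counts the target is ``int(0.2 * 471) = 94``, so the sketch adds 65 resampled minority rows and leaves the majority class untouched. The sketch prints nothing; the output and script that follow belong to the skoot example itself.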
.. rst-class:: sphx-glr-script-out

 Out::

    Num zero class (pre-balance): 29
    Num one class (pre-balance): 471

    Num zero class (post-balance): 94
    Num one class (post-balance): 471
    Num samples (post-balance): 565
                0     ...           19
    87  -2.153390     ...    -0.240686
    367 -0.769973     ...    -0.175931
    485 -0.946491     ...    -0.120715
    290 -0.839210     ...    -0.958555
    72  -1.225000     ...     0.078820

    [5 rows x 20 columns]


|


.. code-block:: python

    print(__doc__)

    # Author: Taylor Smith

    from sklearn.datasets import make_classification
    from skoot.balance import over_sample_balance
    import pandas as pd

    # #############################################################################
    # Create an imbalanced dataset
    X, y = make_classification(n_samples=500, n_classes=2, weights=[0.05, 0.95],
                               random_state=42)

    # get counts:
    zero_mask = y == 0
    print("Num zero class (pre-balance): %i" % zero_mask.sum())
    print("Num one class (pre-balance): %i\n" % (~zero_mask).sum())

    # #############################################################################
    # Balance the dataset
    X_balance, y_balance = over_sample_balance(X, y, balance_ratio=0.2,
                                               random_state=42)

    # get the new counts
    new_mask = y_balance == 0
    print("Num zero class (post-balance): %i" % new_mask.sum())
    print("Num one class (post-balance): %i" % (~new_mask).sum())
    print("Num samples (post-balance): %i" % X_balance.shape[0])

    # #############################################################################
    # This also works for pandas DataFrames
    X_balance_df, _ = over_sample_balance(pd.DataFrame.from_records(X), y,
                                          balance_ratio=0.2, random_state=42)
    print(X_balance_df.head())

**Total running time of the script:** ( 0 minutes  0.031 seconds)


.. only :: html

 .. container:: sphx-glr-footer


  .. container:: sphx-glr-download

     :download:`Download Python source code: ex_oversample.py <ex_oversample.py>`


  .. container:: sphx-glr-download

     :download:`Download Jupyter notebook: ex_oversample.ipynb <ex_oversample.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    Gallery generated by Sphinx-Gallery