Multiclass oversampling
Multiclass oversampling is a highly ambiguous task, as different classes may be best balanced by different oversampling techniques. The multiclass oversampling implemented here selects the minority classes one by one and oversamples each to the cardinality of the original majority class, using the union of the original majority class and all already oversampled classes as the majority class in the binary oversampling process. This approach works only with binary oversampling techniques that do not change the majority class and that have a proportion parameter to explicitly specify the number of samples to be generated. Suitable oversampling techniques can be queried with the get_all_oversamplers_multiclass function:
smote_variants.get_all_oversamplers_multiclass(strategy='eq_1_vs_many_successive')
Returns all oversampling classes which can be used with the multiclass strategy specified.
Parameters: strategy (str) – the multiclass oversampling strategy, ‘eq_1_vs_many_successive’ or ‘equalize_1_vs_many’
Returns: list of all oversampling classes which can be used with the multiclass strategy specified
Return type: list(OverSampling)
Example:
import smote_variants as sv
oversamplers = sv.get_all_oversamplers_multiclass()
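The role of the proportion parameter in the successive strategy can be illustrated with a little arithmetic. The sketch below assumes that a binary oversampler with proportion p generates about p * (n_maj - n_min) new minority samples; the helper name proportion_for_target is hypothetical and not part of smote_variants. Under that assumption, it computes the proportion needed for a minority class to reach the size of the original majority class even when the binary majority is a larger union of classes.

```python
def proportion_for_target(n_majority_union, n_minority, n_target):
    """Hypothetical helper: proportion needed to grow a minority class
    from n_minority to n_target samples.

    Assumes the binary oversampler generates
    proportion * (n_majority_union - n_minority) new samples.
    """
    n_to_generate = n_target - n_minority
    gap = n_majority_union - n_minority
    if n_to_generate <= 0 or gap <= 0:
        return 0.0  # nothing to generate
    return n_to_generate / gap


# Illustrative class sizes: 100 (majority), 40 and 10.
# Step 1: the class of 40 is oversampled against the majority of 100.
p1 = proportion_for_target(100, 40, 100)  # 60 / 60
# Step 2: the class of 10 is oversampled against the union of the
# majority and the already oversampled class, 200 samples in total.
p2 = proportion_for_target(200, 10, 100)  # 90 / 190
```

In the second step a proportion of 1.0 would overshoot, since it would try to match the 200 samples of the union rather than the 100 samples of the original majority class. This is why an explicitly controllable number of generated samples is needed.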
smote_variants.get_n_quickest_oversamplers_multiclass(n, strategy='eq_1_vs_many_successive')
Returns the n quickest oversamplers, based on testing on the datasets of the imbalanced_databases package, that are suitable for use with the multiclass strategy specified.
Parameters: - n (int) – the number of oversamplers to return
- strategy (str) – the multiclass oversampling strategy, ‘eq_1_vs_many_successive’ or ‘equalize_1_vs_many’
Returns: list of the n quickest oversampling classes which can be used with the multiclass strategy specified
Return type: list(OverSampling)
Example:
import smote_variants as sv
# n has no default value in the signature; 5 is an arbitrary illustrative choice
oversamplers = sv.get_n_quickest_oversamplers_multiclass(5)
class smote_variants.MulticlassOversampling(oversampler=<smote_variants._smote_variants.SMOTE object>, strategy='eq_1_vs_many_successive')
__init__(oversampler=<smote_variants._smote_variants.SMOTE object>, strategy='eq_1_vs_many_successive')
Constructor of the multiclass oversampling object.
Parameters: - oversampler (obj) – an oversampling object
- strategy (str/obj) – a multiclass oversampling strategy, currently ‘eq_1_vs_many_successive’ or ‘equalize_1_vs_many’
get_params(deep=False)
Returns: the parameters of the multiclass oversampling object
Return type: dict
sample(X, y)
Does the sample generation according to the oversampling strategy.
Parameters: - X (np.ndarray) – training set
- y (np.array) – target labels
Returns: the extended training set and target labels
Return type: (np.ndarray, np.array)
sample_equalize_1_vs_many(X, y)
Does the sample generation by oversampling each minority class to the cardinality of the majority class, using all original samples in each run.
Parameters: - X (np.ndarray) – training set
- y (np.array) – target labels
Returns: the extended training set and target labels
Return type: (np.ndarray, np.array)
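The description above can be sketched in plain Python. This is a minimal illustration of the idea and not the library's implementation: toy_oversample is a hypothetical stand-in for a binary oversampler (it interpolates between random pairs of minority samples and ignores the majority samples, which a real technique may use), and equalize_1_vs_many is an illustrative name. Each minority class is oversampled against all other original samples, so no run sees synthetic samples from another run.

```python
import random
from collections import Counter


def toy_oversample(majority, minority, n_new, rng):
    # Hypothetical stand-in for a binary oversampler: interpolates between
    # random pairs of minority samples and leaves the majority untouched.
    new = []
    for _ in range(n_new):
        a, b = rng.choice(minority), rng.choice(minority)
        t = rng.random()
        new.append(tuple(ai + t * (bi - ai) for ai, bi in zip(a, b)))
    return new


def equalize_1_vs_many(X, y, seed=0):
    rng = random.Random(seed)
    counts = Counter(y)
    majority_label, n_majority = counts.most_common(1)[0]
    X_out, y_out = list(X), list(y)  # original samples are kept as they are
    for label, n in counts.items():
        if label == majority_label:
            continue
        minority = [x for x, lab in zip(X, y) if lab == label]
        # "Majority" of this run: every other ORIGINAL sample.
        rest = [x for x, lab in zip(X, y) if lab != label]
        new = toy_oversample(rest, minority, n_majority - n, rng)
        X_out.extend(new)
        y_out.extend([label] * len(new))
    return X_out, y_out


# Toy data: three classes with 10, 4 and 2 samples.
X = [(float(i), float(i % 3)) for i in range(16)]
y = [0] * 10 + [1] * 4 + [2] * 2
X_new, y_new = equalize_1_vs_many(X, y)
balanced_counts = Counter(y_new)
```

After the call, every class in y_new has as many samples as the original majority class, and the first 16 rows of X_new are the untouched original samples.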
sample_equalize_1_vs_many_successive(X, y)
Does the sample generation by oversampling each minority class successively to the cardinality of the majority class, incorporating the results of previous oversamplings.
Parameters: - X (np.ndarray) – training set
- y (np.array) – target labels
Returns: the extended training set and target labels
Return type: (np.ndarray, np.array)
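A minimal plain-Python sketch of the successive variant follows; it illustrates the idea and is not the library's implementation. The names toy_oversample and eq_1_vs_many_successive are hypothetical, and random duplication stands in for a real binary oversampling technique. The point of interest is the pooled list: it starts as the original majority class and absorbs each minority class once that class has been oversampled, so later runs see a larger majority. The order of the returned samples is a choice of this sketch.

```python
import random
from collections import Counter


def toy_oversample(majority, minority, n_new, rng):
    # Hypothetical stand-in for a binary oversampler: duplicates random
    # minority samples and never modifies the majority samples.
    return [rng.choice(minority) for _ in range(n_new)]


def eq_1_vs_many_successive(X, y, seed=0):
    rng = random.Random(seed)
    counts = Counter(y)
    labels = [label for label, _ in counts.most_common()]  # largest first
    n_majority = counts[labels[0]]
    # Binary "majority": original majority class plus every class
    # that has already been oversampled.
    pooled = [x for x, lab in zip(X, y) if lab == labels[0]]
    X_out, y_out = list(pooled), [labels[0]] * len(pooled)
    pool_sizes = []  # size of the pooled majority seen by each run
    for label in labels[1:]:
        minority = [x for x, lab in zip(X, y) if lab == label]
        pool_sizes.append(len(pooled))
        new = toy_oversample(pooled, minority, n_majority - len(minority), rng)
        grown = minority + new
        X_out.extend(grown)
        y_out.extend([label] * len(grown))
        pooled.extend(grown)  # later runs see this class as majority
    return X_out, y_out, pool_sizes


# Toy data: three classes with 10, 4 and 2 samples.
X = [(float(i), float(i % 3)) for i in range(16)]
y = [0] * 10 + [1] * 4 + [2] * 2
X_new, y_new, pool_sizes = eq_1_vs_many_successive(X, y)
class_counts = Counter(y_new)
```

With these toy sizes the first run sees a majority of 10 samples and the second a pooled majority of 20, while every class still ends at 10 samples.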