Ranking

Based on an evaluation on 104 imbalanced datasets, the following 10 techniques achieve the highest overall performance in terms of the AUC, GAcc, F1 and P20 (column p_top20) scores, with nearest neighbors, support vector machine, decision tree and multilayer perceptron classifiers. For more details on the evaluation methodology, see our paper on the comparative study.

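| sampler           | overall | auc      | rank_auc | gacc     | rank_gacc | f1       | rank_f1 | p_top20  | rank_ptop20 |
|-------------------|---------|----------|----------|----------|-----------|----------|---------|----------|-------------|
| polynom-fit-SMOTE | 2.5     | 0.902538 | 6        | 0.870753 | 1         | 0.695154 | 1       | 0.992496 | 2           |
| ProWSyn           | 4.5     | 0.904389 | 1        | 0.868449 | 4         | 0.690284 | 3       | 0.991112 | 10          |
| SMOTE-IPF         | 7.5     | 0.902565 | 5        | 0.868715 | 3         | 0.687935 | 9       | 0.990904 | 13          |
| Lee               | 8       | 0.902318 | 7        | 0.868324 | 5         | 0.688082 | 8       | 0.991008 | 12          |
| SMOBD             | 9.25    | 0.902247 | 8        | 0.86766  | 6         | 0.688885 | 4       | 0.990583 | 19          |
| G-SMOTE           | 13.5    | 0.901916 | 10       | 0.865103 | 18        | 0.686613 | 12      | 0.990846 | 14          |
| CCR               | 14.25   | 0.902112 | 9        | 0.861994 | 30        | 0.687886 | 10      | 0.991254 | 8           |
| LVQ-SMOTE         | 14.75   | 0.902799 | 3        | 0.862295 | 29        | 0.683646 | 24      | 0.992211 | 3           |
| Assembled-SMOTE   | 15.5    | 0.902691 | 4        | 0.866914 | 7         | 0.688614 | 5       | 0.982685 | 46          |
| SMOTE-TomekLinks  | 15.75   | 0.901016 | 14       | 0.866174 | 9         | 0.684708 | 20      | 0.990573 | 20          |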
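
The numbers in the table are consistent with the overall column being the arithmetic mean of the four rank columns (rank_auc, rank_gacc, rank_f1, rank_ptop20). This is an observation from the values listed here, not a definition quoted from the paper. The following minimal sketch checks it; the names `ranks`, `overall` and `mean_rank` are illustrative only and not part of any library.

```python
# Per-metric ranks copied from the table:
# (rank_auc, rank_gacc, rank_f1, rank_ptop20)
ranks = {
    "polynom-fit-SMOTE": (6, 1, 1, 2),
    "ProWSyn": (1, 4, 3, 10),
    "SMOTE-IPF": (5, 3, 9, 13),
    "Lee": (7, 5, 8, 12),
    "SMOBD": (8, 6, 4, 19),
    "G-SMOTE": (10, 18, 12, 14),
    "CCR": (9, 30, 10, 8),
    "LVQ-SMOTE": (3, 29, 24, 3),
    "Assembled-SMOTE": (4, 7, 5, 46),
    "SMOTE-TomekLinks": (14, 9, 20, 20),
}

# Values of the 'overall' column as listed in the table.
overall = {
    "polynom-fit-SMOTE": 2.5,
    "ProWSyn": 4.5,
    "SMOTE-IPF": 7.5,
    "Lee": 8,
    "SMOBD": 9.25,
    "G-SMOTE": 13.5,
    "CCR": 14.25,
    "LVQ-SMOTE": 14.75,
    "Assembled-SMOTE": 15.5,
    "SMOTE-TomekLinks": 15.75,
}


def mean_rank(rank_tuple):
    """Arithmetic mean of the per-metric ranks (assumed aggregation)."""
    return sum(rank_tuple) / len(rank_tuple)


# Compare the recomputed mean rank with the listed 'overall' value.
for name, rank_tuple in ranks.items():
    status = "matches" if mean_rank(rank_tuple) == overall[name] else "differs"
    print(f"{name}: mean rank {mean_rank(rank_tuple)} {status}")

# Sorting by mean rank gives the order in which the rows are listed.
ordering = sorted(ranks, key=lambda name: mean_rank(ranks[name]))
```

Under this reading, a technique can rank highly overall without leading on any single score: SMOTE-IPF, for example, is third overall although its best individual rank is 3 (GAcc) and its P20 rank is 13.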