Jan Meppe

Jan Meppe

Machine learning engineer in love with building digital products.

fraud detection

building a small model for fraud detection now

usually very imbalanced datasets

think about how you want to split the dataset, scale it, and then resample the dataset

train the model on the undersampled or oversampled dataset

then get the train/validation/test predictions on the complete dataset

sampling:

sample stratified and on the train set
split train/test stratified as well

metrics:

never use acccuracy
AUC/ROC/F1/precision/recall