Project Description

It is important that credit card companies are able to recognize fraudulent credit card transactions so that customers are not charged for items that they did not purchase.


Identify fraudulent credit card transactions.

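Given the class imbalance ratio, we recommend measuring model performance with the Area Under the Precision-Recall Curve (AUPRC). Accuracy computed from the confusion matrix is not meaningful for imbalanced classification, because a model that labels every transaction as legitimate still scores highly while catching no fraud.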
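To make the point about accuracy concrete, here is a minimal sketch in plain Python. The labels and scores are made-up toy values, not taken from the dataset, and the helper names `accuracy` and `average_precision` are hypothetical. The AUPRC is estimated with the step-wise "average precision" sum, which is one common way to approximate the area; it assumes all scores are distinct.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    Assumes labels are 0 (legitimate) or 1 (fraud) and that scores are distinct.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive (fraud) label")
    # Rank transactions from most to least suspicious.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    true_pos = 0
    area = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_pos += 1
            # Each fraud found raises recall by 1/total_pos;
            # weight that step by the precision at this rank.
            area += (true_pos / rank) / total_pos
    return area


# Toy data (illustrative only): 20 transactions, 2 of them fraudulent.
y_true = [1, 0, 0, 0, 1] + [0] * 15
# Hypothetical model scores, already in descending order: 1.00, 0.95, ..., 0.05.
scores = [1.0 - 0.05 * k for k in range(20)]

# A "model" that flags nothing still looks accurate.
always_legit = [0] * len(y_true)
print("accuracy of always-legitimate baseline:", accuracy(y_true, always_legit))
print("AUPRC of the scoring model:", round(average_precision(y_true, scores), 3))
```

On this toy data the do-nothing baseline reaches 90% accuracy without detecting a single fraud, whereas the AUPRC depends on how highly the fraudulent transactions are ranked, which is the behavior the metric recommendation relies on.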


The dataset was collected and analysed during a research collaboration between Worldline and the Machine Learning Group of ULB (Université Libre de Bruxelles) on big data mining and fraud detection. More details on current and past projects on related topics are available on the page of the DefeatFraud project.

Data variance engineering
PCA Analysis
Leverage the test automation tool RSpec
Keep up to date on best practices for modern machine learning pipelines
Make recommendations to the project manager on the technology stack and tooling to support pipeline development

AWS SageMaker, Keras, TensorFlow, RStudio Server, AWS EC2, AWS S3

