Fraudulent purchases are expensive for online retailers: they lose the merchandise and never receive the funds. Excessive chargeback rates carry further economic consequences. Identifying fraudulent transactions efficiently is key to managing these losses. However, blocking legitimate transactions creates a poor customer experience and can quickly drive those customers to a competitor.
Accurately identify fraudulent transactions.
Is this transaction fraudulent?
Fraud is thankfully rare, which means fraudulent transactions are vastly outnumbered by legitimate ones. Simply applying traditional machine learning techniques (along with standard metrics like accuracy, sensitivity, specificity) will tend to result in poor models. The data will need to be downsampled (typically the legitimate majority class), and metrics less susceptible to imbalanced datasets will need to be selected.
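To make the imbalance problem concrete, here is a minimal sketch in plain Python. The 1% fraud rate, the synthetic rows, the `is_fraud` field, and the helper names `downsample`, `accuracy`, and `precision_recall` are all assumptions made for illustration; they do not come from any particular dataset or library.

```python
import random

def downsample(rows, label_key="is_fraud", ratio=1.0, seed=0):
    """Keep every fraud row plus a random sample of legitimate rows.

    ratio is the number of legitimate rows kept per fraud row.
    """
    rng = random.Random(seed)
    fraud = [r for r in rows if r[label_key] == 1]
    legit = [r for r in rows if r[label_key] == 0]
    n_keep = min(len(legit), int(len(fraud) * ratio))
    return fraud + rng.sample(legit, n_keep)

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def precision_recall(y_true, y_pred):
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Synthetic data: 10 frauds among 1,000 transactions (assumed 1% rate).
rows = [{"id": i, "is_fraud": 1 if i < 10 else 0} for i in range(1000)]
y_true = [r["is_fraud"] for r in rows]

# A useless "model" that never flags fraud still looks accurate.
always_legit = [0] * len(rows)
acc = accuracy(y_true, always_legit)
prec, rec = precision_recall(y_true, always_legit)

# Balance the training data: 10 fraud rows plus 10 sampled legitimate rows.
balanced = downsample(rows, ratio=1.0)
```

In this sketch the never-flag model reaches 99% accuracy while catching no fraud at all, which is why precision and recall on the fraud class are more informative here. A common practice is to downsample only the training data and evaluate on data that keeps the original class mix.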
First, the fraud database will need to be matched to the transactional data. This information will need to be enhanced by adding broader fraud trends from aggregations of the fraud database in conjunction with aggregations of the transaction database. Next, each customer's history needs to be added to the record (again from the transaction database). All this data needs to be cleaned, joined, and transformed into valuable ML features before going into model training. This pre-modeling prep process can be frustrating and time-consuming. We are here to help.
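The preparation steps above can be sketched in a few lines of plain Python. Everything concrete here is hypothetical: the records, the field names (`txn_id`, `customer_id`, `merchant_category`, `amount`), and the choice of aggregates are assumptions for illustration, not a description of any real schema.

```python
from collections import defaultdict

# Hypothetical records; field names are assumptions for illustration.
transactions = [
    {"txn_id": "t1", "customer_id": "c1", "merchant_category": "electronics", "amount": 120.0},
    {"txn_id": "t2", "customer_id": "c1", "merchant_category": "grocery", "amount": 35.0},
    {"txn_id": "t3", "customer_id": "c2", "merchant_category": "electronics", "amount": 900.0},
    {"txn_id": "t4", "customer_id": "c2", "merchant_category": "electronics", "amount": 60.0},
]
fraud_reports = [{"txn_id": "t3"}]  # confirmed fraud cases

# 1. Match the fraud database to the transactions to obtain a label.
fraud_ids = {r["txn_id"] for r in fraud_reports}
for t in transactions:
    t["is_fraud"] = int(t["txn_id"] in fraud_ids)

# 2. Broader trend: fraud rate per merchant category.
cat_counts = defaultdict(lambda: [0, 0])  # category -> [fraud count, total count]
for t in transactions:
    cat_counts[t["merchant_category"]][0] += t["is_fraud"]
    cat_counts[t["merchant_category"]][1] += 1
category_fraud_rate = {c: f / n for c, (f, n) in cat_counts.items()}

# 3. Customer history: collect each customer's transaction amounts.
cust_amounts = defaultdict(list)
for t in transactions:
    cust_amounts[t["customer_id"]].append(t["amount"])

# 4. Join everything into one feature record per transaction.
features = []
for t in transactions:
    amounts = cust_amounts[t["customer_id"]]
    features.append({
        "txn_id": t["txn_id"],
        "amount": t["amount"],
        "category_fraud_rate": category_fraud_rate[t["merchant_category"]],
        "customer_txn_count": len(amounts),
        "customer_mean_amount": sum(amounts) / len(amounts),
        "is_fraud": t["is_fraud"],
    })
```

This sketch ignores time for brevity. In a real pipeline the aggregates for a transaction would normally be computed only from records that precede it, since otherwise the label can leak into the features.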
Build machine learning models to predict whether a transaction is fraudulent as a function of the independent variables. Use model interpretability packages to evaluate the impact of each independent variable on the prediction.
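As a rough illustration of both steps, the sketch below trains a small logistic regression by gradient descent and then estimates each variable's impact with permutation importance (the drop in score when one column is shuffled). It is hand-rolled in plain Python rather than using an interpretability package, and the synthetic data, the two features, and all function names are assumptions for illustration only.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def train_logistic(X, y, lr=0.5, epochs=300):
    """Fit weights and bias with full-batch gradient descent on log loss."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(X, w, b):
    return [1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0 for xi in X]

def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(X, y, w, b, seed=0):
    """Score drop when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    base = accuracy(y, predict(X, w, b))
    importances = []
    for j in range(len(X[0])):
        col = [xi[j] for xi in X]
        rng.shuffle(col)
        X_perm = [xi[:j] + [v] + xi[j + 1:] for xi, v in zip(X, col)]
        importances.append(base - accuracy(y, predict(X_perm, w, b)))
    return importances

# Synthetic data: feature 0 drives the label, feature 1 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(400)]
y = [1 if x[0] > 1.0 else 0 for x in X]

w, b = train_logistic(X, y)
importances = permutation_importance(X, y, w, b)
```

Here shuffling the informative feature should hurt the score far more than shuffling the noise feature. Accuracy is used as the score only to keep the sketch short; on real, imbalanced fraud data a score such as recall or precision on the fraud class would likely be a better basis for the comparison.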