Understanding the business problem
We will build a fraud detection system for a bank using Hadoop and Spark. The system will predict whether a payment transaction is suspicious. When a suspicious transaction is detected, the payment processing system can step up security and ask the account holder for more information before the transaction is processed. In our definition of a transaction, we will cover the payments made by a retail banking customer from their checking account to other parties. These payments can take place using a variety of modes, as follows:
- Transferring money to another account through Internet banking
- Swiping a card at a shop to pay for goods or services
- Paying for goods on an e-commerce site
- Paying bills by direct debit
- Withdrawing cash with an ATM card
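To make the scope concrete, the transaction definition and the step-up decision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `PaymentMode`, `Transaction`, and `next_step`, and the fields `account_id` and `amount`, are hypothetical and do not come from the bank's systems. The prediction itself is not shown here; the sketch simply takes its result as a boolean.

```python
from dataclasses import dataclass
from enum import Enum


class PaymentMode(Enum):
    """The five payment modes listed above (identifiers are hypothetical)."""
    INTERNET_BANKING = "internet_banking"
    CARD_AT_SHOP = "card_at_shop"
    ECOMMERCE = "ecommerce"
    DIRECT_DEBIT = "direct_debit"
    ATM_WITHDRAWAL = "atm_withdrawal"


@dataclass
class Transaction:
    """A payment from a retail customer's checking account (assumed fields)."""
    account_id: str
    amount: float
    mode: PaymentMode


def next_step(is_suspect: bool) -> str:
    """Decide what the payment processing system does with a transaction.

    `is_suspect` stands in for the output of the fraud detection system.
    """
    # A suspect transaction triggers stepped-up security, where the account
    # holder is asked for more information; otherwise it is processed.
    return "step_up_security" if is_suspect else "process"


payment = Transaction(account_id="ACC-1", amount=250.0, mode=PaymentMode.ECOMMERCE)
action = next_step(is_suspect=True)
```

Keeping the decision separate from the prediction means the way a transaction is flagged can change without touching the payment processing logic.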
Every mode of payment offers fraudsters the opportunity to indulge in fraudulent activities. These activities can take place by stealing the IDs, bank cards, and PINs to conduct...