Fraud losses are a constant concern for organizations and individuals, and with good reason. In 2018, 49% of organizations said they had been a victim of fraud and economic crime, according to PwC. Worldwide card fraud losses were $24.26 billion in 2017, according to The Nilson Report. Fraud is widespread all over the world, so organizations should continuously monitor their data in order to be fraud resistant. Automating this process can reduce costs and detect fraud faster. Data Science is a powerful aid both in detecting fraud and in understanding how it works. Besides detecting known types of fraud, data analysis techniques help to uncover new ones.
The sheer number of online transactions in banking makes it impossible to detect fraud by human analysis alone. Another reason to use Data Science techniques is the need to detect fraudulent transactions in real time. Real-time analysis of potential risks allows organizations to take proactive measures.
Detecting anomalies in transactions is difficult because their origins are hard to recognize: an anomaly could be due to fraud, but also to errors or missing data. Another complication in machine analysis of fraud is highly imbalanced classes. In the two-class case of fraudulent and ordinary transactions, the majority class is the negative class (ordinary transactions), and the minority or rare class is the positive class (fraud). The minority class is very infrequent, for example 1% of the whole dataset, so a traditional classifier is likely to predict everything as the majority class. That yields a seemingly high accuracy of about 99% even though no fraud is detected at all. Examples of fraud in banking with imbalanced classes are money laundering, terrorist financing, credit card fraud, identity fraud perpetrated through customer account takeover, synthetic identities, nefarious applications, and other financial crimes. Besides identifying true positives, financial companies need to avoid false positives. A high false positive rate requires more human resources to check every case with a fraud flag and, consequently, leads to poor customer service.
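The effect of class imbalance described above can be checked with a little arithmetic. The following is a minimal sketch on made-up data: the dataset of 1,000 transactions with 10 frauds, and the helper names `confusion_counts`, `accuracy`, and `recall`, are assumptions introduced purely for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count (tp, fp, fn, tn) for binary labels, where 1 = fraud and 0 = ordinary."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Share of all transactions that were labeled correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    """Share of actual frauds that were caught (0.0 if there are no frauds)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


# Hypothetical dataset: 1,000 transactions, 1% of them fraudulent.
y_true = [1] * 10 + [0] * 990

# A "classifier" that labels every transaction as ordinary (the majority class).
always_ordinary = [0] * len(y_true)

counts = confusion_counts(y_true, always_ordinary)
print(f"accuracy = {accuracy(*counts):.2%}")
print(f"recall   = {recall(*counts):.2%}")
```

On this toy data the majority-class predictor reaches 99% accuracy while its recall on fraud is zero, which is why metrics focused on the fraud class are usually more informative than accuracy for this kind of problem.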
Here are some examples of what a bank’s fraud prevention system does. AI algorithms detect unusually large transactions and put them on hold until the customer confirms the deal. Fraud detection algorithms investigate and block multiple accounts opened within a short period with similar data. Banks also use a geotiming method, which compares the geographical locations of in-person card swipes with the amount of time elapsed between them. If the time between two swipes is not enough for the customer to travel from one location to the other, the activity is flagged as fraud.
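The geotiming check can be sketched as a distance-over-time test. This is only an illustration, not how any particular bank implements it: the record layout (`lat`, `lon`, `time`), the function names, the 900 km/h speed limit, and the 1 km tolerance are all assumptions chosen for the example.

```python
import math
from datetime import datetime

EARTH_RADIUS_KM = 6371.0
MAX_SPEED_KMH = 900.0    # assumption: roughly the cruise speed of an airliner
MIN_DISTANCE_KM = 1.0    # assumption: ignore swipes that are practically co-located


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(swipe_a, swipe_b, max_speed_kmh=MAX_SPEED_KMH):
    """Return True if the two swipes are too far apart for the elapsed time."""
    distance = haversine_km(swipe_a["lat"], swipe_a["lon"], swipe_b["lat"], swipe_b["lon"])
    if distance <= MIN_DISTANCE_KM:
        return False
    hours = abs((swipe_b["time"] - swipe_a["time"]).total_seconds()) / 3600
    if hours == 0:
        return True  # distinct locations at the same instant
    return distance / hours > max_speed_kmh


# Hypothetical swipes for one card.
london = {"lat": 51.5074, "lon": -0.1278, "time": datetime(2019, 5, 1, 12, 0)}
new_york = {"lat": 40.7128, "lon": -74.0060, "time": datetime(2019, 5, 1, 13, 0)}
paris = {"lat": 48.8566, "lon": 2.3522, "time": datetime(2019, 5, 1, 17, 0)}

print(is_impossible_travel(london, new_york))  # one hour apart, across the Atlantic
print(is_impossible_travel(london, paris))     # five hours apart, a short flight
```

The London and New York swipes are flagged, while London and Paris are not. A real system would presumably tune the speed threshold and combine this signal with others rather than block a card on this rule alone.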
Danske Bank is an example of how Deep Learning can modernize the fraud detection process. The bank had a low 40% fraud detection rate, and 99.5% of the cases it was investigating were not fraud related. Implementing a modern analytic solution leveraging AI allowed the bank to realize a 60% reduction in false positives, increase true positives by 50%, and focus human resources on actual cases of fraud.
A particular challenge for insurance companies in detecting fraudulent claims is the large number of documents supporting each claim, such as witness reports, police reports, and claim forms. Each claim requires manual processing to recognize misrepresented facts on an insurance application, inflated claims, claims for injuries or damage that never occurred, staged accidents, and so on.
Insurers enrich their internal data, such as call center notes and voice recordings, with third-party details on weather, traffic, news feeds, the client’s social media data, bills, wages, criminal records, address changes, etc. Then, using Data Science tools, they analyze multiple data sets together to gain insight into potentially fraudulent claims. For example, algorithms detect a person who appears as a witness multiple times, or in multiple roles, such as pedestrian, driver, or the other party in an accident. In these cases, all the claims involved are flagged as fraud.
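The repeated-participant check can be illustrated with a simple cross-claim count. The sketch below assumes each claim has already been reduced to a list of (person, role) pairs; the claim records, the identifiers, and the function name `find_suspicious_participants` are hypothetical.

```python
from collections import defaultdict

# Hypothetical claims, each with the people involved and their roles.
claims = [
    {"claim_id": "C1", "participants": [("P1", "driver"), ("P2", "witness")]},
    {"claim_id": "C2", "participants": [("P3", "driver"), ("P2", "witness")]},
    {"claim_id": "C3", "participants": [("P1", "pedestrian"), ("P4", "witness")]},
    {"claim_id": "C4", "participants": [("P5", "driver"), ("P6", "witness")]},
]


def find_suspicious_participants(claims, max_witness_count=1):
    """Map each suspicious person to the sorted list of claims they appear in.

    A person is suspicious if they are a witness in more than
    `max_witness_count` claims, or appear in different roles across claims.
    """
    roles_by_person = defaultdict(set)
    claims_by_person = defaultdict(set)
    witness_claims = defaultdict(set)

    for claim in claims:
        for person_id, role in claim["participants"]:
            roles_by_person[person_id].add(role)
            claims_by_person[person_id].add(claim["claim_id"])
            if role == "witness":
                witness_claims[person_id].add(claim["claim_id"])

    suspicious = {}
    for person_id, claim_ids in claims_by_person.items():
        repeated_witness = len(witness_claims[person_id]) > max_witness_count
        multiple_roles = len(claim_ids) > 1 and len(roles_by_person[person_id]) > 1
        if repeated_witness or multiple_roles:
            suspicious[person_id] = sorted(claim_ids)
    return suspicious


suspicious = find_suspicious_participants(claims)
flagged_claims = {cid for ids in suspicious.values() for cid in ids}
print(suspicious)
print(sorted(flagged_claims))
```

Here P2 is flagged as a repeat witness and P1 for appearing as both driver and pedestrian, so claims C1, C2, and C3 are marked for review. In practice the hard part is likely to be entity resolution, that is, recognizing that two differently written records refer to the same person, which this sketch takes as given.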
The telecommunications industry serves one of the largest numbers of users every day and cannot afford not to use Data Science. Common examples of fraud in telecom include illegal access and authorization, theft and fake profiles, behavioral fraud, etc. Machine Learning algorithms can identify anomalies in a user’s normal traffic and prevent fraud. Some mobile operators share cell phone GPS data with banks, which helps to verify a person’s card use against their location and thus to prevent credit card fraud.
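As a rough illustration of spotting an anomaly in a user's normal traffic, the sketch below uses a plain z-score test in place of a trained Machine Learning model. The usage figures, the function name `is_anomalous`, and the threshold of 3 standard deviations are assumptions made for the example.

```python
import statistics


def is_anomalous(history, new_value, threshold=3.0):
    """Flag new_value if it lies more than `threshold` standard deviations
    from the mean of the user's past values."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # sample standard deviation, needs >= 2 values
    if stdev == 0:
        return new_value != mean
    return abs(new_value - mean) / stdev > threshold


# Hypothetical daily call minutes for one subscriber over a week.
daily_minutes = [42, 38, 45, 40, 44, 39, 41]

print(is_anomalous(daily_minutes, 43))   # a typical day
print(is_anomalous(daily_minutes, 400))  # a sudden spike in usage
```

The typical day passes and the spike is flagged. A production system would presumably look at many features at once (destinations, time of day, data volume) and learn the thresholds from data instead of fixing them by hand.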
E-commerce is also a field with high scammer activity because of its close connection to payments. There are many fraud schemes in e-commerce, such as chargebacks, unauthorized discounts, unauthorized sale voiding, returns, and many others. AI is therefore a strong candidate for fraud prevention. Fraud detection systems collect and analyze customers’ historical data in order to learn normal customer behavior, such as a customer’s typical devices, times, and locations. Abnormal account activity may indicate, for example, that a customer’s account or credit card was stolen. AI can also detect subtle behavioral patterns of scammers and identify people who abuse the refund policy, as in the well-known scam in which a fraudster orders a product and then returns a fake one.
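Learning normal customer behavior and comparing new activity against it can be sketched with a very simple profile. This is a rule-based stand-in for what a real system would learn statistically; the field names (`device`, `country`, `hour`), the sample orders, and the functions `build_profile` and `risk_signals` are hypothetical.

```python
def build_profile(history):
    """Collect the devices, countries, and hours seen in a customer's past orders."""
    return {
        "devices": {order["device"] for order in history},
        "countries": {order["country"] for order in history},
        "hours": {order["hour"] for order in history},
    }


def risk_signals(profile, order):
    """List the ways in which a new order deviates from the customer's profile."""
    signals = []
    if order["device"] not in profile["devices"]:
        signals.append("new_device")
    if order["country"] not in profile["countries"]:
        signals.append("new_country")
    if order["hour"] not in profile["hours"]:
        signals.append("unusual_hour")
    return signals


# Hypothetical order history for one customer.
history = [
    {"device": "phone-A", "country": "DE", "hour": 19},
    {"device": "phone-A", "country": "DE", "hour": 20},
    {"device": "laptop-B", "country": "DE", "hour": 21},
]
profile = build_profile(history)

usual_order = {"device": "phone-A", "country": "DE", "hour": 20}
odd_order = {"device": "tablet-Z", "country": "BR", "hour": 3}

print(risk_signals(profile, usual_order))
print(risk_signals(profile, odd_order))
```

The usual order raises no signals, while the odd one raises all three. Such signals would typically feed a scoring model or trigger an extra verification step, since any one of them alone (a new phone, a holiday abroad) is often legitimate.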
The use of Data Science in fraud detection is not limited to the fields described above. It is already used in the transportation, oil and gas, and automotive industries, as well as in casinos, hotels, and the higher education system.