In this recipe, we'll try to detect fraud communities using methods from network analysis. This use case comes up often in graph analysis and is intuitively appealing: when carrying out fraud detection, we are interested in relationships between people, such as whether they live close together, are connected over social media, or have the same job.
Getting ready
To get everything in place for the recipe, we'll install the required libraries and download a dataset.
We will use the following libraries:
- networkx is a graph analysis library: https://networkx.github.io/documentation
- annoy is a very efficient nearest-neighbors implementation: https://github.com/spotify/annoy
- tqdm provides us with progress bars: https://github.com/tqdm/tqdm
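We'll also use SciPy, which comes with the Anaconda distribution. The other libraries can be installed with pip, together with python-louvain, which implements Louvain community detection: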
!pip install networkx annoy tqdm python-louvain
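After installing, it can help to confirm that everything is importable before moving on. The following is a minimal sketch, not part of the original recipe: the `missing_packages` helper and the `REQUIRED` list are hypothetical names introduced here for illustration. Note that python-louvain is installed under the import name `community`.

```python
from importlib.util import find_spec

# Import names for the recipe's dependencies. python-louvain is the PyPI
# package name; the module it installs is called `community`.
REQUIRED = ["networkx", "annoy", "tqdm", "scipy", "community"]


def missing_packages(modules):
    """Return the module names that cannot be found in this environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required libraries are available.")
```

The check uses `find_spec` rather than importing each module, so it only looks the modules up and does not run any library code.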
We'll use the following dataset...