DBSCAN inherits the idea that data can be represented as multidimensional points. Sticking with a two-dimensional example again, here is roughly how DBSCAN works:
- Pick a point that has not been visited before.
- Draw a circle with the point as the center. The radius of the circle is epsilon.
- Count how many other points fall within the circle. If the count exceeds a specified threshold, we mark all of those points as being part of the same cluster.
- Recursively do the same for each point in this cluster. Doing so expands the cluster.
- Repeat these steps until every point has been visited.
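The steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a tuned implementation: the names `dbscan`, `neighbors`, `eps`, and `threshold` are my own choices, the distance is assumed to be Euclidean, and the recursion is replaced by an explicit stack. It follows the wording above by counting only the *other* points in the circle and requiring strictly more than the threshold; libraries may count differently. Points that never join a cluster are labeled `-1`, which is an assumption of this sketch.

```python
import math

NOISE = -1        # label for points that end up in no cluster (assumed convention)
UNVISITED = None  # label for points we have not looked at yet


def neighbors(points, i, eps):
    """Indices of all OTHER points inside the circle of radius eps around points[i]."""
    return [j for j, q in enumerate(points)
            if j != i and math.dist(points[i], q) <= eps]


def dbscan(points, eps, threshold):
    """Return one cluster label per point, following the rough steps in the text."""
    labels = [UNVISITED] * len(points)
    cluster = 0
    for i in range(len(points)):
        # Pick a point that has not been visited before.
        if labels[i] is not UNVISITED:
            continue
        # Draw the circle and count the other points inside it.
        nbrs = neighbors(points, i, eps)
        if len(nbrs) <= threshold:
            labels[i] = NOISE  # may still be absorbed by a cluster later
            continue
        # Enough points: start a new cluster and expand it.
        labels[i] = cluster
        stack = list(nbrs)
        while stack:
            j = stack.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # previously rejected point joins as an edge point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(points, j, eps)
            if len(j_nbrs) > threshold:
                stack.extend(j_nbrs)  # j is dense too, so keep expanding through it
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan(points, eps=1.5, threshold=1))  # [0, 0, 0, 1, 1, 1, -1]
```

The two tight groups of three become clusters `0` and `1`, while the lone point at `(50, 50)` has an empty circle and stays labeled `-1`.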
I highly encourage you to try this yourself on dotted paper. Start by plotting random points, then use a pencil to draw the circles. This will give you an intuition for how DBSCAN works. The picture shows my own working, which helped build my intuition. I found this intuition to be...