Jaccard Similarity
- AKA Jaccard Index
- AKA Jaccard Similarity Coefficient
- Jaccard Similarity measures how similar two binary vectors or two sets are, as a value between 0 (nothing in common) and 1 (identical)
Binary Vector
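For two binary vectors $x$ and $y$ of equal length, the formula is

$$J(x, y) = \frac{M_{11}}{M_{01} + M_{10} + M_{11}}$$

where $M_{11}$ is the number of positions where both vectors are 1, and $M_{01}$ and $M_{10}$ are the numbers of positions where exactly one of the two vectors is 1. Positions where both are 0 are ignored.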
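A minimal Python sketch of the binary-vector case, assuming the usual definition in which only positions where at least one vector is 1 are counted. The function name `jaccard_binary` is illustrative, and returning 1.0 for two all-zero vectors is an assumed convention (some libraries return 0 instead).

```python
def jaccard_binary(x, y):
    """Jaccard similarity of two equal-length sequences of 0/1 values."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")

    # Positions where both vectors are 1.
    both_one = sum(1 for a, b in zip(x, y) if a == 1 and b == 1)
    # Positions where exactly one of the vectors is 1.
    only_one = sum(1 for a, b in zip(x, y) if a != b)

    denominator = both_one + only_one
    if denominator == 0:
        # Both vectors are all zeros; treating them as identical is an
        # assumption made for this sketch.
        return 1.0
    return both_one / denominator


x = [1, 0, 1, 1, 0]
y = [1, 1, 0, 1, 0]
print(jaccard_binary(x, y))  # 2 shared ones / 4 positions with any one = 0.5
```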
Set
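For two sets $A$ and $B$, the similarity is the size of the intersection divided by the size of the union:

$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$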
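For sets, the computation maps directly onto Python's built-in `set` operators. This is a small sketch; the name `jaccard_set` is illustrative, and returning 1.0 when both sets are empty is an assumed convention.

```python
def jaccard_set(a, b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Both sets are empty; treating them as identical is an assumption.
        return 1.0
    return len(a & b) / len(union)


a = {1, 2, 3, 4}
b = {3, 4, 5, 6}
print(jaccard_set(a, b))  # intersection has 2 elements, union has 6
```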
Multiset
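For two multisets $A$ and $B$, a commonly used form compares the multiplicity of each element, where $A(x)$ is the number of times $x$ occurs in $A$:

$$J(A, B) = \frac{\sum_{x} \min(A(x), B(x))}{\sum_{x} \max(A(x), B(x))}$$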
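One way to sketch the multiset case is with `collections.Counter`, whose `&` and `|` operators take the element-wise minimum and maximum of counts. This assumes the min/max definition of multiset Jaccard similarity (other definitions exist); the name `jaccard_multiset` is illustrative.

```python
from collections import Counter


def jaccard_multiset(a, b):
    """Multiset Jaccard: sum of min counts / sum of max counts."""
    count_a, count_b = Counter(a), Counter(b)
    # Counter & Counter keeps the minimum count of each element.
    intersection = sum((count_a & count_b).values())
    # Counter | Counter keeps the maximum count of each element.
    union = sum((count_a | count_b).values())
    if union == 0:
        # Both multisets are empty; returning 1.0 is an assumed convention.
        return 1.0
    return intersection / union


a = ["a", "a", "b"]
b = ["a", "b", "b", "c"]
# min counts: a=1, b=1 (sum 2); max counts: a=2, b=2, c=1 (sum 5)
print(jaccard_multiset(a, b))
```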
Uses:
- Text Mining: Find the similarity between two text documents based on the terms they use
- E-Commerce: In a dataset of millions of customers, find similar customers from their purchase histories
- Recommendation System: In a movie recommendation system, use the Jaccard index to find similar users from their watch histories
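As an illustration of the recommendation use case, the sketch below finds the user whose watch history is most similar to a target user's. The user names, movie titles, and the helper `most_similar_user` are all hypothetical and made up for this example.

```python
def jaccard_set(a, b):
    """Jaccard similarity of two sets (0.0 if both are empty, by assumption)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar_user(target, histories):
    """Return (user, score) for the user most similar to `target`."""
    best_user, best_score = None, -1.0
    for user, watched in histories.items():
        if user == target:
            continue  # do not compare the target with itself
        score = jaccard_set(histories[target], watched)
        if score > best_score:
            best_user, best_score = user, score
    return best_user, best_score


# Hypothetical watch histories, one set of titles per user.
watch_history = {
    "alice": {"Inception", "Interstellar", "Tenet"},
    "bob": {"Inception", "Tenet", "Dunkirk"},
    "carol": {"Frozen", "Moana"},
}

# alice and bob share 2 of 4 distinct titles; alice and carol share none.
print(most_similar_user("alice", watch_history))
```

Comparing every pair of users this way scales quadratically, so for millions of users an approximate method such as MinHash is typically used instead of exhaustive comparison.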