- Feature hashing is a feature engineering method
- In feature hashing, each value of a high-cardinality categorical feature is hashed to an integer (a bucket index)
- Feature hashing is very helpful with high-cardinality categories: it keeps the size and memory footprint of the features manageable by hashing the values into a smaller number of buckets
- One bucket can hold multiple values; in other words, several distinct values can be hashed to the same feature
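
The points above can be illustrated with a minimal sketch. It assumes MD5 from the standard library as the hash function (chosen only because it is deterministic across runs, unlike Python's built-in `hash` for strings); the helper names `hash_bucket` and `hash_encode` and the sample data are hypothetical, not part of any library.

```python
import hashlib

def hash_bucket(value: str, n_buckets: int) -> int:
    """Map a category string to a bucket index in [0, n_buckets)."""
    # Assumption: MD5 is used purely as a stable hash, not for security.
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets

def hash_encode(values, n_buckets):
    """Return a count vector of length n_buckets for a list of category values."""
    vec = [0] * n_buckets
    for v in values:
        vec[hash_bucket(v, n_buckets)] += 1
    return vec

# Six distinct categories squeezed into four buckets:
# at least one bucket must receive more than one value.
cities = ["london", "paris", "tokyo", "lima", "oslo", "cairo"]
encoded = hash_encode(cities, n_buckets=4)
print(encoded)
```

The feature vector always has `n_buckets` entries, no matter how many distinct categories appear, which is where the memory saving comes from.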
Issues:
- The main issue with feature hashing is collisions. There is a memory vs. performance tradeoff: using more buckets reduces collisions, which improves performance, but costs more memory.
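
The tradeoff can be made concrete with a rough sketch that counts collisions for different bucket counts. The MD5-based `hash_bucket` helper, the `count_collisions` helper, and the synthetic `user_<i>` categories are all assumptions made for illustration.

```python
import hashlib

def hash_bucket(value: str, n_buckets: int) -> int:
    """Map a category string to a bucket index in [0, n_buckets)."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets

def count_collisions(values, n_buckets):
    """Count distinct values that land in a bucket already taken by another value."""
    distinct = set(values)
    occupied = {hash_bucket(v, n_buckets) for v in distinct}
    return len(distinct) - len(occupied)

# Synthetic high-cardinality feature: 1000 distinct categories.
categories = [f"user_{i}" for i in range(1000)]

for n_buckets in (64, 1024, 16384):
    # More buckets means a longer feature vector (more memory) but fewer collisions.
    print(n_buckets, count_collisions(categories, n_buckets))
```

With 64 buckets, at least 936 of the 1000 categories must share a bucket with another one; the count drops as the number of buckets grows.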
References