Word Embeddings

Objective: Learn an N-dimensional dense vector for each word that encodes its syntactic and semantic information.

Word embedding represents words as dense, multi-dimensional vectors. The main assumption behind word embeddings is that words of the same type should lie close together in the vector space: "orange" and "mango" are both fruits, so their vectors should be near each other. The development of word embeddings has a long history, ranging from TF-IDF scores to BERT's contextualized embeddings.
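As a minimal sketch of this closeness assumption, the snippet below compares hand-picked toy vectors with cosine similarity. The 4-dimensional vectors are invented for illustration; real embeddings are learned from data and typically have hundreds of dimensions.

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between two vectors; 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical dense vectors: the two fruits point in a similar direction.
embeddings = {
    "orange": np.array([0.9, 0.8, 0.1, 0.0]),
    "mango":  np.array([0.8, 0.9, 0.2, 0.1]),
    "car":    np.array([0.1, 0.0, 0.9, 0.8]),
}

print(cosine_similarity(embeddings["orange"], embeddings["mango"]))  # ~0.99 (close)
print(cosine_similarity(embeddings["orange"], embeddings["car"]))    # ~0.12 (far)
```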

There are mainly three types of word embeddings:

  1. Count-based Word Embeddings (e.g., TF-IDF; a sketch follows this list)
  2. Co-occurrence-based Word Embeddings
  3. Contextualized Word Embeddings
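To make the count-based type concrete, here is a short sketch using scikit-learn's TfidfVectorizer on a made-up three-sentence corpus. Note that TF-IDF produces sparse, count-derived vectors rather than learned dense embeddings, which is why it sits at the start of this development history.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Toy corpus, invented for the example.
corpus = [
    "orange and mango are fruits",
    "mango juice is sweet",
    "the car is fast",
]

vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(corpus)  # sparse matrix: (3 docs, vocab size)

print(vectorizer.get_feature_names_out())  # learned vocabulary
print(tfidf.toarray().round(2))            # one TF-IDF row per document
```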
