- It's a method used to generate text based on the probability of candidate output sequences
- Its behavior depends on the beam size B
- Better than Greedy Decoding, since it explores multiple candidate sequences instead of just one
- When B = 1, it reduces to Greedy Decoding
- Larger B: better results, slower decoding
- Smaller B: worse results, faster decoding
- Beam Search is mostly used at inference time, but it can also be used during training [1]
Steps:
1. Start with the `<SOS>` token
2. At each time step t, compute the probability of each candidate next word for every current beam, given the encoded input X and the output generated so far
3. Keep the top B candidate sequences by cumulative probability
4. Repeat from Step 2 until `<EOS>` is generated (a minimal sketch follows this list)
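
Below is a minimal Python sketch of these steps, assuming a hypothetical scoring function `next_token_logprobs(x, tokens)` that returns a log-probability for every vocabulary word given the encoded input X and the tokens generated so far; the function name, the toy vocabulary, and the default parameters are illustrative, not from the source:

```python
import math
from typing import Callable, Dict, List, Tuple

SOS, EOS = "<SOS>", "<EOS>"
# Scoring function type: (encoded input X, generated tokens) -> {word: log-prob}
ScoreFn = Callable[[object, List[str]], Dict[str, float]]

def beam_search(next_token_logprobs: ScoreFn, x: object,
                beam_size: int = 3, max_len: int = 20) -> List[str]:
    """Return the highest-scoring token sequence found with beam size B."""
    # Each beam is (token list, cumulative log-probability); Step 1: start with <SOS>.
    beams: List[Tuple[List[str], float]] = [([SOS], 0.0)]
    finished: List[Tuple[List[str], float]] = []

    for _ in range(max_len):
        candidates: List[Tuple[List[str], float]] = []
        for tokens, score in beams:
            # Step 2: score every candidate next word given X and the output so far.
            for word, logp in next_token_logprobs(x, tokens).items():
                candidates.append((tokens + [word], score + logp))
        # Step 3: keep only the top B candidates by cumulative log-probability.
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            # Step 4: a beam stops expanding once it generates <EOS>.
            (finished if tokens[-1] == EOS else beams).append((tokens, score))
        if not beams:  # every surviving beam has finished
            break

    best = max(finished or beams, key=lambda c: c[1])
    return best[0]


# Toy usage with a hypothetical fixed "model": log-probs depend only on the last token.
TOY = {
    SOS:   {"the": math.log(0.6), "a": math.log(0.4)},
    "the": {"cat": math.log(0.7), EOS: math.log(0.3)},
    "a":   {"cat": math.log(0.5), EOS: math.log(0.5)},
    "cat": {EOS: math.log(1.0)},
}
print(beam_search(lambda x, toks: TOY[toks[-1]], x=None, beam_size=2))
# ['<SOS>', 'the', 'cat', '<EOS>']
```

With `beam_size=1` the loop keeps only the single best candidate at each step, which is exactly Greedy Decoding, matching the note above.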