Compressed Chain of Thought - Efficient Reasoning Through Dense Representations
Summary
In this paper, the authors reason in the continuous embedding space rather than by decoding discrete tokens. The main motivation is to reduce decoding time and address the latency issue. They also hypothesize that reasoning performance can be tuned by the number of contemplation tokens generated. These tokens are trained through teacher forcing against the hidden states of the original CoT tokens.
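A minimal sketch of what this hidden-state teacher forcing could look like; the module interface, the even-spaced subsampling, and the MSE objective below are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

# Illustrative sketch (assumption): the CCOT module emits k continuous
# "contemplation" embeddings and is trained to match the hidden states of a
# subsampled set of gold CoT tokens (teacher forcing on hidden states).
def ccot_teacher_forcing_loss(ccot_module, prompt_ids, gold_cot_hidden, k):
    # gold_cot_hidden: [len_cot, d_model] hidden states of the original CoT,
    # precomputed by running the base LLM over the gold reasoning chain.
    # Subsample k evenly spaced target states as the compression targets.
    idx = torch.linspace(0, gold_cot_hidden.size(0) - 1, k).long()
    targets = gold_cot_hidden[idx]                  # [k, d_model]

    # Hypothetical interface: the CCOT module autoregressively produces
    # k continuous embeddings conditioned on the prompt.
    pred = ccot_module(prompt_ids, num_steps=k)     # [k, d_model]

    # Teacher forcing in embedding space: regress onto the gold hidden states.
    return nn.functional.mse_loss(pred, targets)
```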
- The authors believe these compressed representations can later be decoded back into the reasoning path.
- Generates compressed, contentful, continuous embeddings that are much shorter than the original CoT while keeping accuracy the same or close to it.
- They trained two modules: (1) one that generates the CCOT tokens, and (2) one that decodes the answer given the CCOT tokens.
- Trained the CCOT generator with teacher forcing against the gold hidden states of the original CoT tokens (as in the sketch above).
- Trained the decoder independently, using only the first CCOT token from the previous module to initiate the CoT.
- During inference, the CCOT module generates the CCOT tokens; once it finishes, the trained decoder module generates the answer (see the sketch after this list).
- Problem: the results are not promising.
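A rough sketch of the two-module inference flow described above; the `step`/`generate` interfaces and the stopping signal are assumptions made for illustration, not the paper's actual API.

```python
import torch

def ccot_inference(prompt_ids, ccot_module, decoder_module, max_ccot_steps=32):
    """Sketch of the inference loop: generate contemplation embeddings,
    then hand them to the decoder module to produce the final answer."""
    contemplation = []
    state = None
    for _ in range(max_ccot_steps):
        # Hypothetical interface: each step emits one continuous embedding
        # plus a signal indicating whether contemplation is finished.
        emb, done, state = ccot_module.step(prompt_ids, state)
        contemplation.append(emb)
        if done:
            break

    # The separately trained decoder conditions on the prompt and the
    # compressed embeddings to generate the answer tokens.
    ccot_embeds = torch.stack(contemplation)        # [k, d_model]
    answer_ids = decoder_module.generate(prompt_ids, ccot_embeds)
    return answer_ids
```

The key point is that this loop never decodes discrete reasoning tokens, which is where the claimed latency savings come from.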
Terms
- Contentful tokens are those that are either semantically contentful themselves or whose hidden states are derived from semantically contentful tokens.
Annotations
« Our method differs in that the contemplation tokens we generate are grounded in text rather than only used as a signal to decode from. »(2)
« our grounding offers the future potential for decoding the reasoning chain from the compressed representations, allowing for post-hoc human inspection of the LLM’s reasoning. »(2)
Date : 12-17-2024
Authors : Jeffrey Cheng, Benjamin Van Durme
Paper Link : http://arxiv.org/abs/2412.13171
Zotero Link: Full Text PDF
Tags : #Computer-Science---Computation-and-Language
Citation :