Compressed Chain of Thought: Efficient Reasoning Through Dense Representations

Summary

In this paper, the authors reason in the model's continuous embedding space rather than by decoding discrete tokens. The main motivation is to cut decoding time and address the latency of explicit chain-of-thought generation. They also hypothesize that the reasoning improvement can be tuned by the number of contemplation tokens generated. These tokens are trained through teacher forcing against the hidden states of the original CoT tokens.
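
To make the training step concrete, below is a minimal PyTorch sketch of how I read the method: gold hidden states come from running the full reasoning chain through the frozen LM, a subset of them serves as compression targets, and the CCoT module is teacher-forced to predict each target state from the previous ones. The helper names (`gold_hidden_states`, `subsample`, `ccot_loss`), the uniform subsampling, and the MSE loss are my stand-ins, not the paper's exact choices; the paper learns which gold states to keep rather than spacing them evenly.

```python
import math
import torch
import torch.nn.functional as F

def gold_hidden_states(lm, query_ids, cot_ids, layer=-1):
    """Run [query; reasoning chain] through the frozen LM and keep the
    hidden states over the reasoning-chain positions as teacher targets."""
    full = torch.cat([query_ids, cot_ids], dim=1)
    with torch.no_grad():
        out = lm(full, output_hidden_states=True)
    h = out.hidden_states[layer]                 # (1, |q| + |c|, d)
    return h[:, query_ids.size(1):, :]           # states over the CoT span

def subsample(states, ratio):
    """Keep ceil(ratio * L) gold states as compression targets.
    Evenly spaced here; the paper scores and selects states instead."""
    L = states.size(1)
    m = max(1, math.ceil(ratio * L))
    idx = torch.linspace(0, L - 1, m).long()
    return states[:, idx, :]                     # (1, m, d)

def ccot_loss(ccot_lm, query_embeds, targets):
    """Teacher forcing in embedding space: feed the gold compressed states
    as inputs and train each output state to predict the next target."""
    inputs = torch.cat([query_embeds, targets[:, :-1, :]], dim=1)
    out = ccot_lm(inputs_embeds=inputs, output_hidden_states=True)
    q = query_embeds.size(1)
    pred = out.hidden_states[-1][:, q - 1:, :]   # m predictions, one per target
    return F.mse_loss(pred, targets)             # MSE as a stand-in distance
```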

Terms

Annotations

Annotation

« contemplation tokens »()

Annotation

« Here we propose Compressed Chain-of-Thought (CCoT), a framework to generate contentful and continuous contemplation tokens of variable sequence length »()

Annotation

« the reasoning improvements can be adaptively modified on demand by controlling the number of contemplation tokens generated »()
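
This knob suggests a simple inference picture: roll out k contemplation tokens autoregressively in embedding space, where larger k trades latency for more reasoning. A hypothetical rollout loop, reusing `ccot_lm` and `query_embeds` from the sketch above (no KV caching, so each step reruns the full prefix):

```python
import torch

def generate_contemplation(ccot_lm, query_embeds, k):
    """Autoregressively produce k continuous contemplation tokens by feeding
    each output hidden state back in as the next input embedding."""
    embeds = query_embeds
    states = []
    for _ in range(k):
        out = ccot_lm(inputs_embeds=embeds, output_hidden_states=True)
        nxt = out.hidden_states[-1][:, -1:, :]   # next continuous token
        states.append(nxt)
        embeds = torch.cat([embeds, nxt], dim=1)
    return torch.cat(states, dim=1)              # (1, k, d), conditioning for the answer decoder
```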

Annotation

« Our framework, called Compressed Chain of Thought (CCoT), generates contemplation tokens which are compressed representations of language-based reasoning chains. »()

Annotation

« These contemplation tokens are trained through teacher forcing with respect to the gold hidden states corresponding to full reasoning traces. »()

Annotation

« Our method differs in that the contemplation tokens we generate are grounded in text rather than only used as a signal to decode from. »(2)

Annotation

« our grounding offers the future potential for decoding the reasoning chain from the compressed representations, allowing for post-hoc human inspection of the LLM’s reasoning. »(2)


Related Notes