Efficient Reasoning with Hidden Thinking



Chain-of-Thought (CoT) reasoning has become a powerful framework for improving complex problem-solving capabilities in Multimodal Large Language Models (MLLMs). However, the verbose nature of textual reasoning introduces significant inefficiencies. In this work, we propose Heima (as hidden llama), an efficient reasoning framework that leverages reasoning CoTs in hidden latent space. We design the Heima Encoder to condense each intermediate CoT into a compact, higher-level hidden representation using a single thinking token, effectively minimizing verbosity and reducing the overall number of tokens required during the reasoning process. Meanwhile, we design a corresponding Heima Decoder with traditional Large Language Models (LLMs) to adaptively interpret the hidden representations into variable-length textual sequences, reconstructing reasoning processes that closely resemble the original CoTs. Experimental results across diverse reasoning MLLM benchmarks demonstrate that the Heima model achieves higher generation efficiency while maintaining, or even improving, zero-shot task accuracy. Moreover, the faithful reconstruction of multimodal reasoning processes by the Heima Decoder validates both the robustness and interpretability of our approach.
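
A minimal conceptual sketch of the encoder/decoder interface described above. This is hypothetical illustration code, not the authors' implementation: the module names, the mean-pooling, and the prefix-embedding hookup to a decoder LLM are all assumptions made here to show the shape of the idea (many CoT tokens in, one latent "thinking token" out, variable-length text reconstructed on demand):

```python
# Hypothetical sketch of the Heima encoder/decoder interface.
# Not the paper's code: pooling and prefix projection are assumptions.
import torch
import torch.nn as nn

class ThinkingTokenEncoder(nn.Module):
    """Condense one CoT stage into a single hidden vector.

    Here we simply pool the backbone's hidden states over the CoT
    tokens; the real Heima Encoder is trained inside the MLLM, so
    treat this as an interface illustration, not the method itself.
    """
    def __init__(self, hidden_dim: int):
        super().__init__()
        self.proj = nn.Linear(hidden_dim, hidden_dim)

    def forward(self, cot_hidden_states: torch.Tensor) -> torch.Tensor:
        # cot_hidden_states: (seq_len, hidden_dim) for one CoT stage
        pooled = cot_hidden_states.mean(dim=0)   # (hidden_dim,)
        return self.proj(pooled)                 # one "thinking token"

class ThinkingTokenDecoder(nn.Module):
    """Map a thinking-token state into prefix embeddings for an LLM,
    which can then generate a variable-length CoT reconstruction
    (e.g. via the LLM's inputs_embeds pathway)."""
    def __init__(self, hidden_dim: int, llm_dim: int, n_prefix: int = 4):
        super().__init__()
        self.to_prefix = nn.Linear(hidden_dim, n_prefix * llm_dim)
        self.n_prefix, self.llm_dim = n_prefix, llm_dim

    def forward(self, thinking_token: torch.Tensor) -> torch.Tensor:
        prefix = self.to_prefix(thinking_token)
        return prefix.view(self.n_prefix, self.llm_dim)

if __name__ == "__main__":
    hidden_dim = llm_dim = 4096
    enc = ThinkingTokenEncoder(hidden_dim)
    dec = ThinkingTokenDecoder(hidden_dim, llm_dim)
    cot_states = torch.randn(128, hidden_dim)  # a verbose 128-token CoT stage
    token = enc(cot_states)                    # 128 tokens -> 1 thinking token
    prefix = dec(token)                        # prefix embeddings for the LLM
    print(token.shape, prefix.shape)           # [4096], [4, 4096]
```

Replacing a verbose CoT stage with a single latent token is where the claimed generation-efficiency gain comes from; the decoder is only invoked when a human-readable reasoning trace is requested.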

Paper: https://arxiv.org/pdf/2501.19201v1.pdf

Code: https://github.com/shawnricecake/heima

Datasets: MMBench - MM-Vet - MathVista - MMStar - HallusionBench



@Machine_learn










