LLM: Math Reasoning, Benchmark, and Reasoning Reuse

A CPIL theme on LLM mathematical reasoning, benchmarking, and reasoning reuse, with ongoing work on reusing reasoning across models and on adaptive hint generation.

  • Large Language Models
  • Math Reasoning
  • Reasoning Reuse
  • Benchmarks
  • Model Collaboration
  • Test-Time Methods

Overview

This theme studies mathematical reasoning in large language models, the benchmarks used to evaluate it, and how reasoning can be reused rather than regenerated. Ongoing directions include reasoning reuse as a paradigm for model collaboration and an adaptive hint generator that tailors its guidance to the model’s reasoning process.

Motivation

Mathematical reasoning is a demanding testbed for large language models, and current pipelines often regenerate reasoning from scratch for every problem. This theme studies how to evaluate LLM math reasoning rigorously and how to reuse reasoning across problems and models, reducing redundant computation and enabling more effective collaboration between models.

Ongoing Projects

  • LLM reasoning reuse. Developing methods that reuse reasoning rather than regenerating it, framed as a new paradigm for model collaboration.
  • Adaptive hint generator. Building a hint generator that adapts the guidance it provides to the model’s reasoning process.

Publications

An ICLR 2026 workshop paper, “Towards Reasoning Reuse: A New Paradigm in Model Collaboration” (Third Workshop on Test-Time Updates, Main Track), introduces the reasoning-reuse direction. See the publications list above for details.

Impact Holders

Impact holders and user communities will be added as the project scope becomes clearer.