Efficient Reasoning with Balanced Thinking
ReBalance is a training-free framework that dynamically steers large reasoning models to achieve balanced thinking, reducing reasoning length while simultaneously improving accuracy.
Balanced thinking refers to eliminating redundant reasoning while retaining sufficient exploration to guarantee correctness.
Large Reasoning Models (LRMs) have shown remarkable reasoning capabilities, yet they often suffer from overthinking, expending redundant computational steps on simple problems, or underthinking, failing to explore sufficient reasoning paths despite inherent capabilities. These issues lead to inefficiencies and potential inaccuracies, limiting practical deployment in resource-constrained settings. Existing methods to mitigate overthinking, such as suppressing reflective keywords or adjusting reasoning length, may inadvertently induce underthinking, compromising accuracy. Therefore, we propose ReBalance, a training-free framework that achieves efficient reasoning with balanced thinking. ReBalance leverages confidence as a continuous indicator of reasoning dynamics, identifying overthinking through high confidence variance and underthinking via consistent overconfidence. By aggregating hidden states from a small-scale dataset into reasoning mode prototypes, we compute a steering vector to guide LRMs' reasoning trajectories. A dynamic control function modulates this vector's strength and direction based on real-time confidence, pruning redundancy during overthinking, and promoting exploration during underthinking. Extensive experiments conducted on four models ranging from 0.5B to 32B, and across nine benchmarks in math reasoning, general question answering, and coding tasks demonstrate that ReBalance effectively reduces output redundancy while improving accuracy, offering a general, training-free, and plug-and-play strategy for efficient and robust LRM deployment.
ReBalance simultaneously addresses the two failure modes of LRMs: redundant overthinking and premature underthinking.
Given the question "For what real values of $x$ is $-4 < x^{4} + 4x^{2} < 21$?", the model first obtains intervals $(-\sqrt{3}, 0)$ and $(0, \sqrt{3})$, and then verifies whether $x = 0$ is included. However, the model redundantly checks irrelevant values after correctly validating $x = 0$, causing overthinking. Current mitigation methods overly suppress necessary reflection, leading to underthinking. ReBalance dynamically controls the reasoning state, effectively balancing these two extremes.
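For reference, the expected answer in this example follows from the substitution $y = x^{2}$. The derivation below is a compact restatement of standard algebra, not a quotation of any model output:

```latex
\begin{align*}
x^{4} + 4x^{2} \ge 0 > -4 &\quad \text{for all real } x, \\
x^{4} + 4x^{2} < 21 &\iff y^{2} + 4y - 21 < 0, \quad y = x^{2} \ge 0, \\
&\iff (y - 3)(y + 7) < 0 \iff -7 < y < 3, \\
&\iff 0 \le x^{2} < 3 \iff -\sqrt{3} < x < \sqrt{3}.
\end{align*}
```

Because $y = 0$ satisfies $-7 < y < 3$, the point $x = 0$ belongs to the solution set, so the two intervals merge into $(-\sqrt{3}, \sqrt{3})$.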
ReBalance outperforms previous state-of-the-art methods across multiple mathematical reasoning datasets and model scales (0.5B–32B), reducing reasoning length while simultaneously improving accuracy.
Shorter reasoning is not always better: reducing overthinking can inadvertently induce underthinking.
A good method should reduce redundant reasoning for easy samples without collapsing reasoning depth for hard cases. ReBalance achieves this balance better than previous methods.
High confidence variance indicates frequent path switching (overthinking), while consistent overconfidence can lead to premature commitment to incorrect reasoning paths (underthinking).
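The two signals above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes that stepwise confidence is the mean probability of the tokens chosen within a step and that variance is taken over a trailing window of steps. The names `step_confidence`, `windowed_variance`, and the window size are hypothetical.

```python
from statistics import pvariance


def step_confidence(token_probs):
    """Assumed definition: mean probability of the chosen tokens in one step."""
    return sum(token_probs) / len(token_probs)


def windowed_variance(confidences, window=3):
    """Population variance of confidence over the trailing `window` steps."""
    variances = []
    for i in range(len(confidences)):
        recent = confidences[max(0, i - window + 1): i + 1]
        variances.append(pvariance(recent))
    return variances


# Hypothetical per-step confidences: a stable stretch, then oscillation.
conf = [0.9, 0.9, 0.9, 0.3, 0.9, 0.3]
var = windowed_variance(conf, window=3)
```

In this toy trace the variance stays at zero during the stable stretch and rises once confidence starts to oscillate, which is the pattern associated with frequent path switching.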
Instead of retaining or discarding the entire reasoning path, ReBalance continuously modulates reasoning trajectories using confidence-aware dynamic steering.
Offline one-pass data collection + online dynamic steering, connected by confidence signals.
Offline, on a small-scale seen dataset, the steering vector is extracted at each step from the first token at the specified layer based on confidence, and a dynamic function is fitted to the model's observed behavior.
During deployment, the dynamic function outputs steering weights based on the model's real-time confidence online, thus balancing between overthinking and underthinking.
No extra model training is required. The mechanism can be attached to different reasoning models and tasks with low deployment overhead.
For each reasoning step, compute stepwise confidence and local-window confidence variance.
Use quantile thresholds to classify steps into overthinking set O and underthinking set U.
Aggregate deep-layer hidden states into prototypes and define steering from underthinking to overthinking.
At each step, adapt steering direction and strength based on current confidence and variance.
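The classification and prototype steps above can be sketched as follows. This is an illustration under stated assumptions: the exact rule for assigning steps to O and U is not given here, so the sketch assumes a step is overthinking when its variance is at or above the 75th percentile, and underthinking when its confidence is at or above the 75th percentile while its variance is at or below the 25th percentile. The steering vector is taken as the overthinking prototype minus the underthinking prototype, one reading of "from underthinking to overthinking". All function names and the toy data are hypothetical.

```python
from statistics import quantiles


def quartiles(values):
    """Return (q25, q75) using the inclusive quantile method."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q1, q3


def classify_steps(conf, var):
    """Split step indices into overthinking set O and underthinking set U."""
    _, c_hi = quartiles(conf)
    v_lo, v_hi = quartiles(var)
    O, U = [], []
    for i, (c, v) in enumerate(zip(conf, var)):
        if v >= v_hi:                   # assumed rule: high confidence variance
            O.append(i)
        elif c >= c_hi and v <= v_lo:   # assumed rule: stable overconfidence
            U.append(i)
    return O, U


def prototype(hidden_states, indices):
    """Element-wise mean of the hidden states selected by `indices`."""
    dim = len(hidden_states[0])
    return [sum(hidden_states[i][d] for i in indices) / len(indices)
            for d in range(dim)]


def steering_vector(hidden_states, O, U):
    """Vector pointing from the underthinking to the overthinking prototype."""
    p_over = prototype(hidden_states, O)
    p_under = prototype(hidden_states, U)
    return [a - b for a, b in zip(p_over, p_under)]


# Toy data: five steps with 2-dimensional "hidden states".
conf = [0.95, 0.50, 0.97, 0.60, 0.80]
var = [0.001, 0.090, 0.002, 0.080, 0.020]
hidden = [[0.0, 1.0], [2.0, 0.0], [0.0, 3.0], [4.0, 0.0], [1.0, 1.0]]

O, U = classify_steps(conf, var)
v = steering_vector(hidden, O, U)
```

Real hidden states would come from a deep layer of the model and have thousands of dimensions; plain lists are used here only to keep the sketch dependency-free.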
Interactive demo: a 3D surface and a 2D map show the same control function over stepwise confidence and confidence variance, with the current reasoning state marked by a red dot.
Dashed guide lines indicate q25/q75 boundaries in confidence and variance, which separate balanced thinking, overthinking, and underthinking regions.
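A control function with these three regions might look like the sketch below. The fitted function used by ReBalance is not specified here, so this is a hypothetical piecewise form: zero in the balanced region, a negative weight that grows with variance in the overthinking region, and a positive weight that grows with confidence in the underthinking region. It assumes the steering vector points from the underthinking prototype toward the overthinking prototype; `steering_weight`, `apply_steering`, `max_strength`, and `sharpness` are illustrative names.

```python
import math


def steering_weight(conf, var, c_hi, v_lo, v_hi,
                    max_strength=1.0, sharpness=10.0):
    """Signed steering weight for one reasoning step (hypothetical form).

    Negative: move away from the overthinking prototype (prune redundancy).
    Positive: move toward it (promote exploration).
    Zero: balanced region, no intervention.
    """
    if var >= v_hi:
        return -max_strength * math.tanh(sharpness * (var - v_hi))
    if conf >= c_hi and var <= v_lo:
        return max_strength * math.tanh(sharpness * (conf - c_hi))
    return 0.0


def apply_steering(hidden, vector, weight):
    """Shift a hidden state along the steering vector by `weight`."""
    return [h + weight * s for h, s in zip(hidden, vector)]


# Thresholds here are placeholder q25/q75 values, not measured ones.
w_balanced = steering_weight(0.70, 0.010, c_hi=0.95, v_lo=0.002, v_hi=0.08)
w_over = steering_weight(0.50, 0.200, c_hi=0.95, v_lo=0.002, v_hi=0.08)
w_under = steering_weight(0.99, 0.001, c_hi=0.95, v_lo=0.002, v_hi=0.08)
steered = apply_steering([1.0, 1.0], [3.0, -2.0], -0.5)
```

The `tanh` keeps the weight bounded and makes it vanish at the region boundaries, so the intervention fades in gradually instead of switching on abruptly.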
Results by model scale: Pass@1 improvements and token reductions relative to the baseline.
For the same input question, original large reasoning models (Overthinking), existing overthinking mitigation methods (Underthinking), and our ReBalance (Balanced Thinking) exhibit distinct reasoning behaviors. Red highlights indicate critical signals emphasizing key behavioral differences.
Question: For what real values of \(x\) is \(-4 < x^4 + 4x^2 < 21\) satisfied?
Expected interval form: \((-\sqrt{3}, \sqrt{3})\)
Generated response: Alright, so I have this inequality to solve: -4 < x⁴ + 4x² < 21. Hmm, okay. It looks like a quartic inequality because of the x⁴ term. I remember that to solve such inequalities, I can break it down into two separate inequalities and solve each part separately. That is, first solve -4 < x⁴ + 4x² and then solve x⁴ + 4x² < 21. After that, I can find the intersection of the solutions from both parts to get the final answer.\n\nLet me start by rewriting the inequality:\n\n-4 < x⁴ + 4x² < 21.\n\nSo, I'll split this into two parts:\n\n1. x⁴ + 4x² > -4\n2. x⁴ + 4x² < 21\n\nStarting with the first inequality: x⁴ + 4x² > -4.\n\nHmm, x⁴ is always non-negative since any real number raised to an even power is positive or zero. Similarly, 4x² is also non-negative for the same reason. So, adding them together, x⁴ + 4x² is always greater than or equal to zero. But here, the right side is -4, which is negative. So, is it possible for x⁴ + 4x² to be greater than -4?\n\nWell, since x⁴ + 4x² is always at least zero, and zero is greater than -4, that inequality x⁴ + 4x² > -4 is always true for all real numbers x. So, the first part doesn't impose any restrictions on x. That's a relief; I don't have to worry about excluding any x from the solution set because the left inequality is always satisfied.\n\nNow, moving on to the second inequality: x⁴ + 4x² < 21.\n\nThis one is more challenging. Let's try to simplify or rearrange it. Maybe I can set y = x² to make it easier to handle. Since x² is always non-negative, y ≥ 0. Substituting, the inequality becomes y² + 4y < 21.\n\nSo, let me write that out:\n\ny² + 4y - 21 < 0.\n\nNow, I can solve this quadratic inequality in terms of y. 
First, let's find the roots of the equation y² + 4y - 21 = 0.\n\nUsing the quadratic formula: y = [-b ± \sqrt{b² - 4ac}] / 2a.\n\nHere, a = 1, b = 4, c = -21.\n\nPlugging in, we get:\n\ny = [-4 ± \sqrt{16 - 4*1*(-21)}] / 2\n= [-4 ± \sqrt{16 + 84}] / 2\n= [-4 ± \sqrt{100}] / 2\n= [-4 ± 10] / 2.\n\nSo, the roots are:\n\ny = (-4 + 10)/2 = 6/2 = 3,\ny = (-4 - 10)/2 = (-14)/2 = -7.\n\nBut since y = x² and x² can't be negative, y can't be -7. So, the only relevant root is y = 3.\n\nNow, to solve the inequality y² + 4y - 21 < 0, we can analyze the intervals determined by the roots. Since the quadratic opens upwards (because the coefficient of y² is positive), the quadratic will be below zero between its two roots. However, one of the roots is negative (y = -7), which we can ignore because y is non-negative. So, the relevant interval is from y = -7 to y = 3. But since y can't be negative, we only consider y from 0 to 3.\n\nThus, the inequality y² + 4y - 21 < 0 holds when y is between 0 and 3, i.e., 0 < y < 3.\n\nBut remember, y = x², so substituting back, we get:\n\n0 < x² < 3.\n\nThis inequality implies that x² is between 0 and 3. To find x, we can take square roots.\n\nFirst, since x² > 0, that means x ≠ 0. But we have to consider x² < 3 as well.\n\nSo, x² < 3 implies that |x| < \sqrt{3}, which means x is in the interval (-\sqrt{3}, \sqrt{3}).\n\nPutting it all together, the second inequality x⁴ + 4x² < 21 is satisfied when x is between -\sqrt{3} and \sqrt{3}, excluding zero? Wait, hold on. No, actually, since x² > 0, the inequality x² < 3 holds for all x in (-\sqrt{3}, \sqrt{3}) except x = 0. But actually, when x = 0, x⁴ + 4x² = 0, which is less than 21. So, actually, x = 0 is included.\n\nWait, let's think again. When we transformed x⁴ + 4x² < 21 into y² + 4y - 21 < 0, we found that y must be between 0 and 3. Since y = x², that translates to x² between 0 and 3. So, x² < 3, which is equivalent to |x| < \sqrt{3}, meaning x is in (-\sqrt{3}, \sqrt{3}). 
So, actually, x can be zero here because when x is zero, x⁴ + 4x² = 0, which is indeed less than 21. So, the interval is from -\sqrt{3} to \sqrt{3}, including zero.\n\nWait, but in the first inequality, x⁴ + 4x² > -4 is always true. So, combining both inequalities, the solution is all x such that x is in (-\sqrt{3}, \sqrt{3}). Is that right?\n\nWait, but when x is zero, x⁴ + 4x² is zero, which is less than 21, so it's included. So, the solution set is all real numbers x such that x is between -\sqrt{3} and \sqrt{3}.\n\nBut let me double-check this. Let's test a value inside the interval, say x = 1. Then, x⁴ + 4x² = 1 + 4 = 5, which is between -4 and 21. So, it's valid.\n\nWhat about x = \sqrt{3}? Then, x⁴ + 4x² = (\sqrt{3})^4 + 4*(\sqrt{3})^2 = (9) + 4*(3) = 9 + 12 = 21, which is equal to the upper limit. So, x = \sqrt{3} is not included because the inequality is strict (< 21). Similarly, x = -\sqrt{3} is also not included.\n\nWhat about x = 2? Then, x⁴ + 4x² = 16 + 16 = 32, which is greater than 21, so it's excluded. So, indeed, the solution is x between -\sqrt{3} and \sqrt{3}, not including the endpoints.\n\nWait, but let me check x = 0.5, which is inside the interval. x⁴ + 4x² = (0.5)^4 + 4*(0.5)^2 = 0.0625 + 4*(0.25) = 0.0625 + 1 = 1.0625, which is between -4 and 21, so it's valid.\n\nWhat about x = 1.5, which is still less than \sqrt{3} (since \sqrt{3} ≈ 1.732). x⁴ + 4x² = (1.5)^4 + 4*(1.5)^2 = 5.0625 + 4*(2.25) = 5.0625 + 9 = 14.0625, which is still less than 21, so it's valid.\n\nHow about x = \sqrt{3} ≈ 1.732? As we saw earlier, x⁴ + 4x² = 21, so it's exactly 21, which isn't included. Similarly, x = -\sqrt{3} gives 21 as well.\n\nSo, putting it all together, the solution is all real numbers x such that -\sqrt{3} < x < \sqrt{3}.\n\nBut wait, let me think again about the first inequality. Since x⁴ + 4x² is always greater than or equal to zero, which is greater than -4, so the first inequality doesn't restrict x at all. 
So, the only restriction comes from the second inequality.\n\nTherefore, combining both inequalities, the solution is x ∈ (-\sqrt{3}, \sqrt{3}).\n\nBut just to make sure, let me consider if there are any points outside of this interval that satisfy both inequalities. For example, x = 2, which is outside, as we saw, gives x⁴ + 4x² = 32, which is greater than 21, so it's excluded. Similarly, x = -2 gives the same result. What about x = \sqrt{3}, as we saw, gives 21, which is excluded. x = 0, gives 0, which is included.\n\nSo, yeah, it seems that the only x that satisfy both inequalities are those in the interval (-\sqrt{3}, \sqrt{3}). Therefore, in interval notation, that would be (-\sqrt{3}, \sqrt{3}).\n\nWait, just to double-check, is there any way for x⁴ + 4x² to be less than or equal to -4? Since x⁴ + 4x² is always non-negative, it can't be less than -4. So, the only time when it's less than 21 is when it's between 0 and 21. But x⁴ + 4x² is 0 at x = 0, and as x increases or decreases, it grows. So, when does it reach 21? At x = ±\sqrt{3}, as we saw.\n\nTherefore, the solution is indeed x ∈ (-\sqrt{3}, \sqrt{3}). So, in interval notation, that's written as (-\sqrt{3}, \sqrt{3}).\n\nBut wait, just to make sure, let me consider the quadratic in y. So, we had y² + 4y - 21 < 0, with y = x² ≥ 0. We found that y is between 0 and 3, which corresponds to x between -\sqrt{3} and \sqrt{3}.\n\nYes, that seems consistent.\n\nSo, I think I can confidently say that the solution is all real numbers x such that x is between -\sqrt{3} and \sqrt{3}.\n\n**Final Answer**\nThe solution is \\boxed{(-\\sqrt{3}, \\sqrt{3})}.\n......Therefore, the solution to the inequality \\(-4 < x^4 + 4x^2 < 21\\) is:\n\\[\n\\boxed{(-\\sqrt{3}, \\sqrt{3})}\n\\]
Question: For what real values of \(x\) is \(-4 < x^4 + 4x^2 < 21\) satisfied?
Expected interval form: \((-\sqrt{3}, \sqrt{3})\)
Generated response: Okay, so I have this inequality to solve: -4 < x⁴ + 4x² < 21. Hmm, I need to find all real numbers x that satisfy this. Let me think about how to approach this.\n\nFirst, I notice that the expression is a quartic polynomial, but it's only in terms of x². Maybe I can make a substitution to simplify it. Let me set y = x². Then, the inequality becomes -4 < y² + 4y < 21. That seems easier to handle because it's a quadratic in terms of y.\n\nSo, now I have two inequalities to solve:\n\n1. y² + 4y > -4\n2. y² + 4y < 21\n\nLet me tackle the first one: y² + 4y > -4. I can rewrite this as y² + 4y + 4 > 0. Hmm, that looks like a perfect square. Let me check: (y + 2)² = y² + 4y + 4. Yes, that's correct. So, the inequality becomes (y + 2)² > 0.\n\nWhen is a square greater than zero? Well, squares are always non-negative, so (y + 2)² is greater than zero when y + 2 is not equal to zero. That means y ≠ -2. But wait, y was defined as x², and x² is always non-negative. So, y can't be negative. Therefore, y = -2 is not possible because x² can't be negative. So, the inequality (y + 2)² > 0 is always true except when y = -2, which isn't in our domain. So, the first inequality is always satisfied for all real x except when y = -2, but since y can't be negative, this inequality doesn't restrict x at all. So, the first part doesn't impose any restrictions.\n\nNow, moving on to the second inequality: y² + 4y < 21. Let me rewrite this as y² + 4y - 21 < 0. To solve this quadratic inequality, I need to find the roots of the equation y² + 4y - 21 = 0. I can use the quadratic formula: y = [-b ± sqrt(b² - 4ac)] / (2a). Here, a = 1, b = 4, c = -21.\n\nCalculating the discriminant: b² - 4ac = 16 - 4(1)(-21) = 16 + 84 = 100. That's a perfect square, which is nice. So, the roots are y = [-4 ± sqrt(100)] / 2 = [-4 ± 10] / 2.\n\nCalculating the two roots:\n\n1. y = (-4 + 10)/2 = 6/2 = 3\n2. y = (-4 - 10)/2 = (-14)/2 = -7\n\nSo, the roots are y = 3 and y = -7. 
Since y = x², and x² can't be negative, y can't be -7. Therefore, the relevant root is y = 3.\n\nNow, the quadratic y² + 4y - 21 factors as (y - 3)(y + 7). But since y can't be negative, we only consider y ≥ 0. So, the quadratic is negative between its roots. But since one root is negative and the other is positive, the quadratic is negative between y = -7 and y = 3. However, since y can't be less than 0, the relevant interval is y between 0 and 3.\n\nTherefore, the inequality y² + 4y - 21 < 0 holds when 0 < y < 3. Since y = x², this translates to 0 < x² < 3.\n\nNow, let's solve 0 < x² < 3. The inequality 0 < x² means that x is not zero, so x ≠ 0. The inequality x² < 3 implies that -sqrt(3) < x < sqrt(3). Combining these, we get -sqrt(3) < x < 0 and 0 < x < sqrt(3). So, in interval notation, that's (-sqrt(3), 0) union (0, sqrt(3)).\n\nBut wait, let me double-check. The original inequality was -4 < x⁴ + 4x² < 21. We transformed it to y² + 4y < 21, which gave us 0 < y < 3, so 0 < x² < 3, which is indeed -sqrt(3) < x < sqrt(3) excluding x=0. So, that seems correct.\n\nBut just to be thorough, let me test some values to make sure.\n\nFirst, let's test x = 1. Then, x⁴ + 4x² = 1 + 4 = 5. Is 5 between -4 and 21? Yes, it is.\n\nWhat about x = \sqrt(3)? Then, x⁴ + 4x² = (sqrt(3))^4 + 4*(sqrt(3))^2 = (9) + 4*(3) = 9 + 12 = 21. So, 21 is the upper bound, but the inequality is strict, so x = sqrt(3) is not included. Similarly, x = -sqrt(3) is not included.\n\nWhat about x = 0? Then, x⁴ + 4x² = 0 + 0 = 0, which is between -4 and 21. So, x = 0 is included? \n\n\nThe solution involves substituting y = x², transforming the inequality into a quadratic, and solving for y. The resulting interval for y is 0 < y < 3, which translates to x being in (-sqrt(3), 0) union (0, sqrt(3)).\n\n**Answer:** The solution in interval notation is:\n\\[\n\\boxed{(-\\sqrt{3}, 0) \\cup (0, \\sqrt{3})}\n\\]
Question: For what real values of \(x\) is \(-4 < x^4 + 4x^2 < 21\) satisfied?
Expected interval form: \((-\sqrt{3}, \sqrt{3})\)
Generated response: Okay, so I have this inequality to solve: -4 < x⁴ + 4x² < 21. Hmm, it's a quartic inequality because of the x⁴ term. I remember that for inequalities with polynomials, especially even degree ones, the graphs tend to have a U-shape on both ends. So, the function f(x) = x⁴ + 4x² will go to infinity as x approaches both positive and negative infinity. That means the function will have a minimum somewhere in the middle.\n\nFirst, maybe I should try to solve the inequality step by step. It's like two separate inequalities: x⁴ + 4x² > -4 and x⁴ + 4x² < 21. But since x⁴ and x² are always non-negative (because any real number squared is non-negative, and raising it to the fourth power is also non-negative), their sum will also be non-negative. So, x⁴ + 4x² is always greater than or equal to zero. That means the first inequality, x⁴ + 4x² > -4, is always true because a non-negative number is always greater than -4. So, I don't have to worry about that part; it doesn't restrict the solution set.\n\nSo, the real work is solving the second inequality: x⁴ + 4x² < 21. Let me rewrite that as x⁴ + 4x² - 21 < 0. Maybe I can factor this expression or find its roots to figure out where it's negative.\n\nLet me try to factor x⁴ + 4x² - 21. Hmm, noticing that it's a quadratic in terms of x², I can let y = x², so the equation becomes y² + 4y - 21 = 0. Let me solve for y.\n\nUsing the quadratic formula: y = [-4 ± sqrt(16 + 84)] / 2 = [-4 ± sqrt(100)] / 2 = [-4 ± 10]/2.\n\nSo, the solutions are y = (-4 + 10)/2 = 6/2 = 3, and y = (-4 - 10)/2 = -14/2 = -7. So, y = 3 or y = -7.\n\nBut y = x², which is always non-negative, so y = -7 doesn't make sense in this context. So, the only relevant solution is y = 3. Therefore, x² = 3, so x = sqrt(3) or x = -sqrt(3).\n\nSo, that means the quartic function x⁴ + 4x² - 21 is zero at x = sqrt(3) and x = -sqrt(3). 
Since it's a quartic function opening upwards (because the coefficient of x⁴ is positive), the graph will touch the x-axis at x = sqrt(3) and x = -sqrt(3), and it will be below the x-axis between these two roots.\n\nSo, the inequality x⁴ + 4x² - 21 < 0 holds when x is between -sqrt(3) and sqrt(3). Therefore, the solution to the original inequality -4 < x⁴ + 4x² < 21 is the same as x⁴ + 4x² < 21, which is when x is between -sqrt(3) and sqrt(3).\n\nBut wait, I should check if there are any other intervals or points where the inequality might hold. Since the quartic is always greater than or equal to zero, and we're looking for it to be less than 21, which is a positive number, so it's only between the two roots where it dips below 21.\n\nSo, that means x must satisfy -sqrt(3) < x < sqrt(3). Let me verify that with some test points.\n\nFor x = 0: 0⁴ + 4*(0)² = 0, which is between -4 and 21, so that's good.\n\nFor x = 1: 1 + 4 = 5, which is between -4 and 21, so that's fine.\n\nFor x = 2: 16 + 16 = 32, which is greater than 21, so that's outside the solution.\n\nFor x = sqrt(3): approx 1.732, the function equals 21, so that's the boundary.\n\nSimilarly, for x = -sqrt(3): same result.\n\nSo, the interval from -sqrt(3) to sqrt(3) is where the function is less than 21, and since the other inequality is always true, the solution is just that interval.\n\nTherefore, in interval notation, the solution is (-sqrt(3), sqrt(3)).\n\n**Final Answer**\nThe solution is \\boxed{(-\\sqrt{3}, \\sqrt{3})}.\n\n\nTo solve the inequality \\(-4 < x^4 + 4x^2 < 21\\), we start by noting that \\(x^4 + 4x^2\\) is always non-negative because it is a sum of squares. Therefore, the inequality \\(x^4 + 4x^2 > -4\\) is always true.\n\nNext, we focus on the inequality \\(x^4 + 4x^2 < 21\\). We rewrite this as \\(x^4 + 4x^2 - 21 < 0\\). Letting \\(y = x^2\\), we transform the equation into a quadratic: \\(y^2 + 4y - 21 = 0\\). 
Solving this quadratic equation using the quadratic formula, we find the roots:\n\n\\[\ny = \\frac{-4 \\pm \\sqrt{16 + 84}}{2} = \\frac{-4 \\pm 10}{2}\n\\]\n\nThis gives us \\(y = 3\\) and \\(y = -7\\). Since \\(y = x^2\\) must be non-negative, we discard \\(y = -7\\). Thus, \\(x^2 = 3\\) gives \\(x = \\sqrt{3}\\) or \\(x = -\\sqrt{3}\\).\n\nThe function \\(x^4 + 4x^2 - 21\\) is a quartic that opens upwards and is zero at \\(x = \\sqrt{3}\\) and \\(x = -\\sqrt{3}\\). It is negative between these roots. Therefore, the inequality \\(x^4 + 4x^2 < 21\\) holds for \\(x\\) in the interval \\((- \\sqrt{3}, \\sqrt{3})\\).\n\nThus, the solution to the original inequality \\(-4 < x^4 + 4x^2 < 21\\) is:\n\n\\[\n\\boxed{(-\\sqrt{3}, \\sqrt{3})}\n\\]
If ReBalance is useful for your research, please cite:
@inproceedings{li2026efficient,
title={Efficient Reasoning with Balanced Thinking},
author={Li, Yulin and Tu, Tengyao and Ding, Li and Wang, Junjie and Zhen, Huiling and Chen, Yixin and Li, Yong and Tian, Zhuotao},
booktitle={Proceedings of the 14th International Conference on Learning Representations},
year={2026}
}