Step AI 3.7 Overthinking and ignoring "reasoning_effort" #24203

d-shehu · 2026-06-05T18:04:23Z

I wonder if I'm doing something wrong. The 3.7 model seems to massively overthink even simple tasks, such as writing a C++ AVL tree implementation. It's to the point where it's effectively several times slower than comparably sized models. It does not get stuck in a loop; it eventually finishes.

However, I don't see the value of "checking" these test cases when it's NOT actually executing the code and testing it. It's just regurgitating test cases.

I was planning to use this for non-coding scenarios but coding is simpler to test. Thanks.

Arguments: --jinja --chat-template-kwargs {"reasoning_effort":"low"}

Llama.cpp: b9496
Model: bartowski/Step-3.7-Flash-GGUF
Quant: Q8_0
MTP layer: Step3.7-flash-mtp-Q8_0.gguf

I observed the same behavior with the stepfun-ai/Step-3.7-Flash-GGUF model. MTP is working and t/s increases by 25% from 8 t/s to 10 t/s.

Code Thinking Snippet:

1. Include necessary headers: iostream, algorithm (for max, maybe), using namespace std? Or not, but for simplicity maybe use std::, but for example code, using namespace std is okay, but note that.
2. Define the Node struct: each node has int key (or make it template? Oh, maybe make it a template class so it can work with any comparable type? That's better, more flexible. Oh right, the user might want to use other types, so template <typename T> for the key.
Wait but if I use template, then the Node is also template. Let's see: template <typename T> struct Node { T key; Node<T>* left; Node<T>* right; int height; };
Then the AVLTree class is also template: template <typename T> class AVLTree { private: Node<T>* root; // then helper functions: int height(Node<T>* node); int getBalanceFactor(Node<T>* node); Node<T>* rightRotate(Node<T>* y); Node<T>* leftRotate(Node<T>* x); Node<T>* insert(Node<T>* node, T key); Node<T>* deleteNode(Node<T>* node, T key); void inorder(Node<T>* node); void deleteAllNodes(Node<T>* node); public: AVLTree(); ~AVLTree(); void insert(T key); void deleteKey(T key); bool search(T key); void inorderTraversal(); };
That makes sense. Then implement each helper function.
Wait the height function: if node is null, return 0, else return node->height. That's correct, because a null node has height 0, a leaf node has height 1, which matches.
GetBalanceFactor: return height(node->left) - height(node->right). Yep.
RightRotate: take y as the unbalanced node. Node<T>* x = y->left; Node<T>* T2 = x->right; Then perform rotation: x->right = y; y->left = T2; Then update heights: y->height = 1 + max(height(y->left), height(y->right)); x->height = 1 + max(height(x->left), height(x->right)); Then return x, which is the new root of this subtree.
LeftRotate is symmetric: take x as unbalanced. Node<T>* y = x->right; Node<T>* T2 = y->left; y->left = x; x->right = T2; update x's height first, then y's, return y.
Then the insert helper function: Node<T>* insert(Node<T>* node, T key) { First, if node is null, create a new node: node = new Node<T>; node->key = key; node->left = node->right = null; node->height = 1; return node; Then, if key < node->key, insert into left subtree: node->left = insert(node->left, key); else if key > node->key, insert into right: node->right = insert(node->right, key); else return node; // no duplicates. Then update current node's height: node->height = 1 + max(height(node->left), height(node->right)); Then get balance factor: int balance = getBalanceFactor(node); Then handle the four cases: // Left Left case if (balance > 1 && key < node->left->key) return rightRotate(node); // Left Right case if (balance > 1 && key > node->left->key) { node->left = leftRotate(node->left); return rightRotate(node); } // Right Right case if (balance < -1 && key > node->right->key) return leftRotate(node); // Right Left case if (balance < -1 && key < node->right->key) { node->right = rightRotate(node->right); return leftRotate(node); } return node; }
Wait that's correct for insert, right? Because after inserting, we check the balance and the position of the inserted key relative to the child to determine the case.
Then deleteNode helper function: Node<T>* deleteNode(Node<T>* root, T key) { First, standard BST delete: if root is null, return root; if key < root->key, root->left = deleteNode(root->left, key); else if key > root->key, root->right = deleteNode(root->right, key); else { // node to be deleted found // node with only one child or no child if ( (root->left == null) || (root->right == null) ) { Node<T>* temp = root->left ? root->left : root->right; // no child case if (temp == null) { temp = root; root = null; } else { // one child case *root = *temp; // copy the contents of the non-empty child } delete temp; } else { // node with two children: get inorder successor (smallest in right subtree) Node<T>* temp = minValueNode(root->right); root->key = temp->key; // copy the key root->right = deleteNode(root->right, temp->key); // delete the successor } } // if the tree had only one node, return if (root == null) return root; // update height root->height = 1 + max(height(root->left), height(root->right)); // get balance factor int balance = getBalanceFactor(root); // handle the four cases for delete // Left Left case if (balance > 1 && getBalanceFactor(root->left) >= 0) return rightRotate(root); // Left Right case if (balance > 1 && getBalanceFactor(root->left) < 0) { root->left = leftRotate(root->left); return rightRotate(root); } // Right Right case if (balance < -1 && getBalanceFactor(root->right) <= 0) return leftRotate(root); // Right Left case if (balance < -1 && getBalanceFactor(root->right) > 0) { root->right = rightRotate(root->right); return leftRotate(root); } return root; }
Wait right, for delete, we check the balance factor of the child, not the key's position, because after deletion, the imbalance could be from either subtree. Oh right! That's a key difference from insert. Because when we delete, we don't know which side the deletion happened, so we check the child's balance factor. For example, if balance is 2 (left heavy), then if left child's balance is >=0, that means left subtree is taller or equal, so left left case, right rotate. If left child's balance is -1, then left's right is taller, so left right case, rotate left then right. That's correct.
Then the minValueNode function: that's a helper to find the smallest node in a subtree, which is the leftmost node. So Node<T>* minValueNode(Node<T>* node) { Node<T>* current = node; while (current->left != null) current = current->left; return current; }
Then the public functions: insert takes a key, calls the helper with root, updates root. So void insert(T key) { root = insert(root, key); } Same for delete: void deleteKey(T key) { root = deleteNode(root, key); }
Search function: bool search(Node<T>* node, T key) { if (node == null) return false; if (node->key == key) return true; if (key < node->key) return search(node->left, key); else return search(node->right, key); } Then public search calls that: bool search(T key) { return search(root, key); }
Inorder traversal: void inorder(Node<T>* node) { if (node == null) return; inorder(node->left); cout << node->key << " "; inorder(node->right); } Public one: void inorderTraversal() { inorder(root); cout << endl; }
Then constructor: AVLTree() { root = null; } Destructor: ~AVLTree() { deleteAllNodes(root); } And deleteAllNodes is a postorder traversal: void deleteAllNodes(Node<T>* node) { if (node == null) return; deleteAllNodes(node->left); deleteAllNodes(node->right); delete node;

Test Case Thinking Snippet:

Another thing: when we do *root = *temp in the delete function for one child case, that copies all the fields, including left, right, height, which is correct, because temp is the non-null child, so we're replacing the current node with its child, then delete the old node (temp, which is the old root). Wait wait, let's see: if root has left child null, right child not null, then temp is root->right. Then *root = *temp: so root's key becomes temp's key, root's left becomes temp->left (which is null), root's right becomes temp->right. Then we delete temp, which was the original right child. Oh right! That's correct, because we're effectively replacing the current node with its child, then deleting the old child node. That works, right? Let's test: root is 10, right child 20. temp is 20. *root = *temp: so root's key is 20, left is 20's left (null), right is 20's right (null). Then delete temp (the original 20 node). So now root is 20, no children. Correct. If root has no children, temp is null, so we set temp to root, root to null, delete temp. Correct.
Wait but what if the child has its own children? For example, root 10, right child 20, which has right child 30. Then temp is 20. *root = *temp: root's key is 20, left is 20's left (null), right is 20's right (30). Then delete temp (the original 20 node). So now root is 20, right child 30. Correct, that's right. So that part works.
Another possible bug: in the insert function, after inserting, we update the height before checking balance, which is correct, because the balance factor depends on the updated height.
What about when the tree is empty? Insert into null root, returns the new node, which is assigned to root, correct.
Now, let's write the code with all these, add explanations. Also, mention the limitations, like no duplicates, only int or template, etc. Also, explain each part so the user understands.
Wait also, maybe add a function to get the height of the tree, just for testing? Like int getHeight() { return height(root); } That's easy.
Also, in the main function, test cases: insert 10, 20, 30, 25, 15, 5, then show inorder, which should be sorted. Then delete 20, show inorder again, check balance. Search for 15, which should be found, search for 100, which is not.
Wait let's test the insert sequence 10, 20, 30, 25: after inserting 10,20,30, the tree is rotated left to 20 as root, left 10, right 30. Then insert 25: goes to left of 30. Now 30's height is 2, balance 1. 20's height is 3, balance 0 - 2 = -2. So balance is -2, right child 30's balance is 1, which is >0, so right left case. So right rotate 30: 25 becomes right child of 20, 30 is right of 25. Then left rotate 20: 25 becomes root, left 20, right 30. Now 20's left is 10, right null. 30's left null, right null. 25's height is 3, balance 2 -1 =1, correct. Inorder traversal would be 10,20,25,30, which is sorted. Correct.

sueun-dev · 2026-06-07T01:34:49Z

This looks close to a couple of existing llama.cpp threads rather than a totally new config issue.

#21445 has the accepted pointer for per-request control: thinking_budget_tokens can be sent in the request body when no CLI reasoning budget is set. #20408 is the current discussion about mapping OpenAI-style reasoning_effort values onto llama.cpp budgets.

For the Step 3.7-specific loop/overthinking behavior with tool-ish reasoning, #24181 is probably the better thread to watch/add your repro details to.

1 reply

d-shehu Jun 7, 2026
Author

I'm not sure if #21445 is applicable. None of my other models, which include Qwen 3.6 27b, Qwen 3.5 122b, Granite 4.1 30B, GPT_OSS, mistral medium, etc., exhibit this behavior. Their reasoning works well. Even MiniMax 2.7 seems to be working okay, whereas MiniMax 2.5 was notoriously verbose and would get stuck in loops earlier.

I only observe this for Step 3.7. I'll follow up in that thread.

tarruda · 2026-06-07T23:50:14Z

I don't think Step 3.7 Flash really supports reasoning efforts like GPT-OSS.

Even though it is documented, I've tried setting it to "low" and saw no difference in its output when using the official API. In the end the amount of reasoning seems to be decided by the complexity of the task and some randomness.

llama.cpp reasoning budget options can work quite well if set up properly. Before I reported the parser bug (#24181), I had been using reasoning_budget as a workaround. See this for more details: https://huggingface.co/stepfun-ai/Step-3.7-Flash-GGUF/discussions/6

0 replies
