Reasoning traces in LangChainJS

Use LangChainJS to capture reasoning traces from GPT‑5, Grok 4, and Gemini 2.5, and handle slow or truncated replies in a multi‑model chat.

Overview

I will show how to use LangChainJS to interact with current reasoning models (GPT-5, Grok 4, Gemini 2.5 Pro) and how to handle their responses.

Why are they different? If you treat them like “normal” models, you get some surprising behavior. The most obvious is very slow responses. Less obvious, but also common: they may stop a response midway for no apparent reason.

The “trick” is to take the reasoning trace into account, both when prompting and when building the UX. Unfortunately, LangChain currently doesn’t document this well, so I will dive into the code and show you ;)

For this we will use a real-world example: a private chat app that we use to unify conversations with the major LLM providers. You can talk to multiple models in parallel within the same conversation, which makes it the perfect scenario for seeing thinking and non-thinking models side by side.

Tech stack