View your agent’s internal reasoning process for debugging, transparency, and understanding decision-making. This guide demonstrates two provider-specific approaches:
- **Anthropic Extended Thinking**: Claude's thinking blocks for complex reasoning
- **OpenAI Reasoning via the Responses API**: GPT's `reasoning_effort` parameter
Anthropic’s Claude models support extended thinking, which allows you to access the model’s internal reasoning process through thinking blocks. This is useful for understanding how Claude approaches complex problems step-by-step.
Claude uses thinking blocks to reason through complex problems step-by-step. There are two types:
- `ThinkingBlock` (see the related Anthropic docs): contains the full reasoning text from Claude's internal thought process
- `RedactedThinkingBlock` (see the related Anthropic docs): contains redacted or summarized thinking data
By registering a callback with your conversation, you can intercept and display these thinking blocks in real-time, giving you insight into how Claude is approaching the problem.
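A minimal sketch of such a callback is below. It assumes `ThinkingBlock` and `RedactedThinkingBlock` are importable from `openhands.sdk.llm` and that thinking blocks are carried on the converted message; the exact import path and message fields may differ across SDK versions, so adjust to match yours.

```python
from openhands.sdk import Event, LLMConvertibleEvent

# Assumption: both block types are exported here; the exact import path
# may differ across SDK versions.
from openhands.sdk.llm import RedactedThinkingBlock, ThinkingBlock


def thinking_callback(event: Event) -> None:
    if not isinstance(event, LLMConvertibleEvent):
        return
    message = event.to_llm_message()
    # Assumption: thinking blocks arrive either in the message's `content`
    # list or in a dedicated `thinking_blocks` field, depending on version.
    blocks = list(getattr(message, "thinking_blocks", None) or []) + list(
        getattr(message, "content", None) or []
    )
    for block in blocks:
        if isinstance(block, ThinkingBlock):
            print(f"[thinking] {block.thinking}")
        elif isinstance(block, RedactedThinkingBlock):
            print("[thinking] <redacted by the provider>")
```

Register it like any other callback, e.g. `Conversation(agent=agent, callbacks=[thinking_callback], workspace=os.getcwd())`.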
OpenAI’s latest models (e.g., GPT-5, GPT-5-Codex) support a Responses API that provides access to the model’s reasoning process. By setting the reasoning_effort parameter, you can control how much reasoning the model performs and access those reasoning traces.
"""Example: Responses API path via LiteLLM in a Real Agent Conversation- Runs a real Agent/Conversation to verify /responses path works- Demonstrates rendering of Responses reasoning within normal conversation events"""from __future__ import annotationsimport osfrom pydantic import SecretStrfrom openhands.sdk import ( Conversation, Event, LLMConvertibleEvent, get_logger,)from openhands.sdk.llm import LLMfrom openhands.tools.preset.default import get_default_agentlogger = get_logger(__name__)api_key = os.getenv("LLM_API_KEY") or os.getenv("OPENAI_API_KEY")assert api_key, "Set LLM_API_KEY or OPENAI_API_KEY in your environment."model = "openhands/gpt-5-mini-2025-08-07" # Use a model that supports Responses APIbase_url = os.getenv("LLM_BASE_URL")llm = LLM( model=model, api_key=SecretStr(api_key), base_url=base_url, # Responses-path options reasoning_effort="high", # Logging / behavior tweaks log_completions=False, usage_id="agent",)print("\n=== Agent Conversation using /responses path ===")agent = get_default_agent( llm=llm, cli_mode=True, # disable browser tools for env simplicity)llm_messages = [] # collect raw LLM-convertible messages for inspectiondef conversation_callback(event: Event): if isinstance(event, LLMConvertibleEvent): llm_messages.append(event.to_llm_message())conversation = Conversation( agent=agent, callbacks=[conversation_callback], workspace=os.getcwd(),)# Keep the tasks short for demo purposesconversation.send_message("Read the repo and write one fact into FACTS.txt.")conversation.run()conversation.send_message("Now delete FACTS.txt.")conversation.run()print("=" * 100)print("Conversation finished. Got the following LLM messages:")for i, message in enumerate(llm_messages): ms = str(message) print(f"Message {i}: {ms[:200]}{'...' if len(ms) > 200 else ''}")# Report costcost = llm.metrics.accumulated_costprint(f"EXAMPLE_COST: {cost}")
Running the Example
```bash
export LLM_API_KEY="your-openai-api-key"
export LLM_MODEL="openhands/gpt-5-codex"
cd agent-sdk
uv run python examples/01_standalone_sdk/23_responses_reasoning.py
```
The `reasoning_effort` parameter can be set to `"none"`, `"low"`, `"medium"`, or `"high"` to control how much reasoning the model performs.
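For quick tasks you might dial the effort down. A minimal sketch, assuming the same `api_key` setup as the full example above:

```python
from pydantic import SecretStr

from openhands.sdk.llm import LLM

# Lower effort trades reasoning depth for latency and cost.
llm = LLM(
    model="openhands/gpt-5-mini-2025-08-07",
    api_key=SecretStr(api_key),
    reasoning_effort="low",
    usage_id="agent",
)
```

Then capture the reasoning traces in your callback: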
```python
def conversation_callback(event: Event):
    if isinstance(event, LLMConvertibleEvent):
        msg = event.to_llm_message()
        llm_messages.append(msg)
```
The OpenAI Responses API provides reasoning traces that show how the model approached the problem. These traces are available in the LLM messages and can be inspected to understand the model’s decision-making process. Unlike Anthropic’s thinking blocks, OpenAI’s reasoning is more tightly integrated with the response generation process.
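For instance, you could scan the collected `llm_messages` for reasoning traces after the run. This is only a sketch: the `reasoning_content` attribute is an assumption (it follows LiteLLM's convention for reasoning models), so verify which field your SDK version actually exposes:

```python
# Assumption: reasoning text is surfaced as a `reasoning_content` attribute,
# following LiteLLM's convention; adapt if your SDK version differs.
for i, message in enumerate(llm_messages):
    reasoning = getattr(message, "reasoning_content", None)
    if reasoning:
        print(f"--- Reasoning for message {i} ---")
        print(reasoning)
```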
- **Debugging**: understand why the agent made specific decisions or took certain actions.
- **Transparency**: show users how the AI arrived at its conclusions.
- **Quality Assurance**: identify flawed reasoning patterns or logic errors.
- **Learning**: study how models approach complex problems.