The BrowserToolSet integration enables your agent to interact with web pages through automated browser control. Built on top of browser-use, it provides capabilities for navigating websites, clicking elements, filling forms, and extracting content, all driven by natural language instructions.
examples/01_standalone_sdk/15_browser_use.py
import os

from pydantic import SecretStr

from openhands.sdk import (
    LLM,
    Agent,
    Conversation,
    Event,
    LLMConvertibleEvent,
    get_logger,
)
from openhands.sdk.tool import Tool
from openhands.tools.browser_use import BrowserToolSet
from openhands.tools.file_editor import FileEditorTool
from openhands.tools.terminal import TerminalTool


logger = get_logger(__name__)

# Configure LLM
api_key = os.getenv("LLM_API_KEY")
assert api_key is not None, "LLM_API_KEY environment variable is not set."
model = os.getenv("LLM_MODEL", "anthropic/claude-sonnet-4-5-20250929")
base_url = os.getenv("LLM_BASE_URL")
llm = LLM(
    usage_id="agent",
    model=model,
    base_url=base_url,
    api_key=SecretStr(api_key),
)

# Tools
cwd = os.getcwd()
tools = [
    Tool(
        name=TerminalTool.name,
    ),
    Tool(name=FileEditorTool.name),
    Tool(name=BrowserToolSet.name),
]

# If you need fine-grained browser control, you can manually register individual browser
# tools by creating a BrowserToolExecutor and providing factories that return customized
# Tool instances before constructing the Agent.

# Agent
agent = Agent(llm=llm, tools=tools)

llm_messages = []  # collect raw LLM messages


def conversation_callback(event: Event):
    if isinstance(event, LLMConvertibleEvent):
        llm_messages.append(event.to_llm_message())


conversation = Conversation(
    agent=agent, callbacks=[conversation_callback], workspace=cwd
)

conversation.send_message(
    "Could you go to https://openhands.dev/ blog page and summarize main "
    "points of the latest blog?"
)
conversation.run()

print("=" * 100)
print("Conversation finished. Got the following LLM messages:")
for i, message in enumerate(llm_messages):
    print(f"Message {i}: {str(message)[:200]}")
Running the Example
export LLM_API_KEY="your-api-key"
cd agent-sdk
uv run python examples/01_standalone_sdk/15_browser_use.py
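The example script reads three environment variables: `LLM_API_KEY` (required), `LLM_MODEL`, and `LLM_BASE_URL` (both optional). If you want to check your environment before launching a browser session, a small pre-flight helper along these lines may be useful. This is only a sketch using the standard library: `read_llm_settings` is a hypothetical name, not part of the SDK, and unlike the script's `assert` it also treats an empty `LLM_API_KEY` as unset.

```python
import os
from typing import Mapping, Optional

# Same default model string the example script falls back to.
DEFAULT_MODEL = "anthropic/claude-sonnet-4-5-20250929"


def read_llm_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Hypothetical helper (not part of the SDK): collect the LLM settings
    the example script reads from the environment."""
    env = os.environ if env is None else env

    api_key = env.get("LLM_API_KEY")
    if not api_key:
        # Assumption: an empty key is treated the same as a missing one.
        raise RuntimeError("LLM_API_KEY environment variable is not set.")

    return {
        "api_key": api_key,
        "model": env.get("LLM_MODEL", DEFAULT_MODEL),
        "base_url": env.get("LLM_BASE_URL"),  # None when not configured
    }
```

Calling `read_llm_settings()` with no arguments reads from `os.environ`; passing a plain dict makes the helper easy to exercise without touching the real environment.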
For advanced use cases requiring only a subset of browser tools or custom configurations, you can manually register individual browser tools. Refer to the BrowserToolSet definition to see the available individual tools and create a BrowserToolExecutor with customized tool configurations before constructing the Agent. This gives you fine-grained control over which browser capabilities are exposed to the agent.