Ollama has officially released its Web Search API, which lets models running locally or in the cloud access real-time information from the web, markedly improving answer accuracy and reducing hallucinations. The feature is now available to all users and is well suited to research, question-answering systems, and autonomous-agent development.

By injecting fresh web results into the context, a model can consult authoritative sources before generating a response, producing more reliable output.
Core capability: connecting local models to the internet
Traditional local LLMs are frozen at their training cutoff and struggle with queries about current information. Ollama's Web Search feature fills this gap:
- invoke a search engine during inference;
- return summaries or a list of links for relevant pages;
- combine with web_fetch to retrieve full page content;
- control everything through the API, making it easy to integrate into automated pipelines.

The feature is particularly well suited to long-running research or multi-turn tasks, for example:
- market-trend analysis
- literature reviews in science and technology
- tracking news events
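The flow these scenarios share, search first and then answer from the results, can be sketched in a few lines. This is a hedged sketch built on the Python client shown later in this post: format_context is a hypothetical helper (not part of the SDK), and the live call runs only when an OLLAMA_API_KEY is configured.

```python
import os

def format_context(results):
    """Render search-result dicts as a plain-text block for the model's context."""
    return "\n".join(
        f"- {r['title']} ({r['url']}): {r['content']}" for r in results
    )

if __name__ == "__main__" and os.environ.get("OLLAMA_API_KEY"):
    import ollama  # requires `pip install 'ollama>=0.6.0'`

    response = ollama.web_search("latest research on global temperature change")
    context = format_context(
        {"title": r.title, "url": r.url, "content": r.content}
        for r in response.results
    )
    print(context)
```

The formatted block can then be prepended to a prompt so the model answers from cited sources rather than from memory alone.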
Usage: multi-language SDKs and a REST API
Ollama offers several ways to connect, covering different development needs.
API
Create an API key from your Ollama account:
REST API (generic calls)

```shell
curl https://ollama.com/api/web_search \
  --header "Authorization: Bearer $OLLAMA_API_KEY" \
  -d '{
    "query": "what is ollama?"
  }'
```
Example output

```json
{
  "results": [
    {
      "title": "Ollama",
      "url": "https://ollama.com/",
      "content": "Cloud models are now available..."
    },
    {
      "title": "What is Ollama? Introduction to the AI model management tool",
      "url": "https://www.hostinger.com/tutorials/what-is-ollama",
      "content": "Ariffud M. 6min Read..."
    },
    {
      "title": "Ollama Explained: Transforming AI Accessibility and Language ...",
      "url": "https://www.geeksforgeeks.org/artificial-intelligence/ollama-explained-transforming-ai-accessibility-and-language-processing/",
      "content": "Data Science Data Science Projects Data Analysis..."
    }
  ]
}
```
🐍 Python library (recommended)
Install the client:

```shell
pip install 'ollama>=0.6.0'
```

Then issue a request with ollama.web_search:

```python
import ollama

response = ollama.web_search("What is Ollama?")
print(response)
```
Example output

```python
results = [
    {
        "title": "Ollama",
        "url": "https://ollama.com/",
        "content": "Cloud models are now available in Ollama..."
    },
    {
        "title": "What is Ollama? Features, Pricing, and Use Cases - Walturn",
        "url": "https://www.walturn.com/insights/what-is-ollama-features-pricing-and-use-cases",
        "content": "Our services..."
    },
    {
        "title": "Complete Ollama Guide: Installation, Usage & Code Examples",
        "url": "https://collabnix.com/complete-ollama-guide-installation-usage-code-examples",
        "content": "Join our Discord Server..."
    }
]
```
💻 JavaScript library
Install the Ollama JavaScript library:

```shell
npm install 'ollama@>=0.6.0'
```

```javascript
import { Ollama } from "ollama";

const client = new Ollama();
const results = await client.webSearch({ query: "what is ollama?" });
console.log(JSON.stringify(results, null, 2));
```
Example output

```json
{
  "results": [
    {
      "title": "Ollama",
      "url": "https://ollama.com/",
      "content": "Cloud models are now available..."
    },
    {
      "title": "What is Ollama? Introduction to the AI model management tool",
      "url": "https://www.hostinger.com/tutorials/what-is-ollama",
      "content": "Ollama is an open-source tool..."
    },
    {
      "title": "Ollama Explained: Transforming AI Accessibility and Language Processing",
      "url": "https://www.geeksforgeeks.org/artificial-intelligence/ollama-explained-transforming-ai-accessibility-and-language-processing/",
      "content": "Ollama is a groundbreaking..."
    }
  ]
}
```
Building a search agent: a hands-on example
With this feature you can quickly assemble a mini search agent with web access. A typical flow, using Alibaba's Qwen3-4B model:
- The user asks: "What are the latest research reports on global temperature change?"
- The model invokes the web_search tool and retrieves the top search results;
- It parses the key information and produces a structured summary;
- It cites its source links, improving trustworthiness.

Agents like this can be used for automated intelligence gathering, knowledge-augmented customer support, and similar scenarios.

```shell
ollama pull qwen3:4b
```
```python
from ollama import chat, web_fetch, web_search

available_tools = {'web_search': web_search, 'web_fetch': web_fetch}

messages = [{'role': 'user', 'content': "what is ollama's new engine"}]

while True:
    response = chat(
        model='qwen3:4b',
        messages=messages,
        tools=[web_search, web_fetch],
        think=True
    )
    if response.message.thinking:
        print('Thinking: ', response.message.thinking)
    if response.message.content:
        print('Content: ', response.message.content)
    messages.append(response.message)
    if response.message.tool_calls:
        print('Tool calls: ', response.message.tool_calls)
        for tool_call in response.message.tool_calls:
            function_to_call = available_tools.get(tool_call.function.name)
            if function_to_call:
                args = tool_call.function.arguments
                result = function_to_call(**args)
                print('Result: ', str(result)[:200] + '...')
                # Result is truncated for limited context lengths
                messages.append({'role': 'tool', 'content': str(result)[:2000 * 4], 'tool_name': tool_call.function.name})
            else:
                messages.append({'role': 'tool', 'content': f'Tool {tool_call.function.name} not found', 'tool_name': tool_call.function.name})
    else:
        break
```
Result

```
Thinking: Okay, the user is asking about Ollama's new engine. I need to figure out what they're referring to. Ollama is a company that develops large language models, so maybe they've released a new model or an updated version of their existing engine...

Tool calls: [ToolCall(function=Function(name='web_search', arguments={'max_results': 3, 'query': 'Ollama new engine'}))]

Result: results=[WebSearchResult(content='# New model scheduling\n\n## September 23, 2025\n\nOllama now includes a significantly improved model scheduling system. Ahead of running a model, Ollama's new engine...

Thinking: Okay, the user asked about Ollama's new engine. Let me look at the search results.
First result is from September 23, 2025, talking about new model scheduling. It mentions improved memory management, reduced crashes, better GPU utilization, and multi-GPU performance. Examples show speed improvements and accurate memory reporting. Supported models include gemma3, llama4, qwen3, etc...

Content: Ollama has introduced two key updates to its engine, both released in 2025:
1. **Enhanced Model Scheduling (September 23, 2025)**
   - **Precision Memory Management**: Exact memory allocation reduces out-of-memory crashes and optimizes GPU utilization.
   - **Performance Gains**: Examples show significant speed improvements (e.g., 85.54 tokens/s vs 52.02 tokens/s) and full GPU layer utilization.
   - **Multi-GPU Support**: Improved efficiency across multiple GPUs, with accurate memory reporting via tools like `nvidia-smi`.
   - **Supported Models**: Includes `gemma3`, `llama4`, `qwen3`, `mistral-small3.2`, and more.
2. **Multimodal Engine (May 15, 2025)**
   - **Vision Support**: First-class support for vision models, including `llama4:scout` (109B parameters), `gemma3`, `qwen2.5vl`, and `mistral-small3.1`.
   - **Multimodal Tasks**: Examples include identifying animals in multiple images, answering location-based questions from videos, and document scanning.
These updates highlight Ollama's focus on efficiency, performance, and expanded capabilities for both text and vision tasks.
```
Recommended models: efficient choices with strong tool calling
To get the most out of web search, use models with strong function-calling ability:

| Local model | Parameters | Notes |
|---|---|---|
| qwen3 | 4B | Good Chinese support, stable tool calling |
| gpt-oss | 20B | Open-source GPT-style model, broadly compatible |

| Cloud model (advanced tasks) | Parameters | Use case |
|---|---|---|
| qwen3:480b-cloud | 480B | Complex research tasks |
| gpt-oss:120b-cloud | 120B | High-precision reasoning |
| deepseek-v3.1-cloud | 671B | Very long context handling |

⚠️ Set the context length to 32,000 tokens or more so the model can hold the large volume of returned search content and reason over it in full.
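The agent loop above already hints at the simplest mitigation: slice each tool result before appending it to the message history. A small hedged helper makes that budget explicit; the 4-characters-per-token ratio is a rough heuristic, not an exact tokenizer.

```python
def truncate_for_context(text: str, max_tokens: int = 2000, chars_per_token: int = 4) -> str:
    """Cap a tool result to a rough token budget (~4 characters per token)."""
    limit = max_tokens * chars_per_token
    # Leave short results untouched; mark truncation explicitly otherwise.
    return text if len(text) <= limit else text[:limit] + "..."
```

With a 32k window, trimming each result to around 2,000 tokens leaves room for the system prompt, conversation history, and several tool round-trips.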
Fetching full page content: new web_fetch support
When the user supplies a specific URL, or a particular page needs deeper parsing, the new web_fetch tool retrieves the full page content.
Python library:

```python
from ollama import web_fetch

result = web_fetch('https://ollama.com')
print(result)
```

This is well suited to tasks such as article summarization and web content extraction.
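Since fetched pages come back with markdown link syntax in their content (as the result below shows), a summarization pipeline might strip that syntax before handing the text to a model. A hedged sketch: strip_markdown_links is a hypothetical helper, not part of the SDK, and the live calls run only when an OLLAMA_API_KEY is configured.

```python
import os
import re

def strip_markdown_links(md: str) -> str:
    """Replace [text](url) markdown links with just their link text."""
    return re.sub(r"\[([^\]]*)\]\([^)]*\)", r"\1", md)

if __name__ == "__main__" and os.environ.get("OLLAMA_API_KEY"):
    from ollama import chat, web_fetch  # requires `pip install 'ollama>=0.6.0'`

    page = web_fetch("https://ollama.com")
    plain = strip_markdown_links(page.content)
    reply = chat(
        model="qwen3:4b",
        messages=[{"role": "user",
                   "content": f"Summarize this page in two sentences:\n{plain}"}],
    )
    print(reply.message.content)
```

Cleaning the markdown first keeps the prompt shorter and stops the model from echoing raw URLs back into the summary.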
Result

```python
WebFetchResponse(
    title='Ollama',
    content='[Cloud models](https://ollama.com/blog/cloud-models) are now available in Ollama\n\n**Chat & build with open models**\n\n[Download](https://ollama.com/download) [Explore models](https://ollama.com/models)\n\nAvailable for macOS, Windows, and Linux',
    links=['https://ollama.com/', 'https://ollama.com/models', 'https://github.com/ollama/ollama']
)
```
The sample Python code is available on GitHub.
JavaScript library

```javascript
import { Ollama } from "ollama";

const client = new Ollama();
const fetchResult = await client.webFetch({ url: "https://ollama.com" });
console.log(JSON.stringify(fetchResult, null, 2));
```
Result

```json
{
  "title": "Ollama",
  "content": "[Cloud models](https://ollama.com/blog/cloud-models) are now available in Ollama...",
  "links": [
    "https://ollama.com/",
    "https://ollama.com/models",
    "https://github.com/ollama/ollama"
  ]
}
```
The sample JavaScript code is available on GitHub.
cURL
```shell
curl --request POST \
  --url https://ollama.com/api/web_fetch \
  --header "Authorization: Bearer $OLLAMA_API_KEY" \
  --header 'Content-Type: application/json' \
  --data '{
    "url": "ollama.com"
  }'
```
Result

```json
{
  "title": "Ollama",
  "content": "[Cloud models](https://ollama.com/blog/cloud-models) are now available in Ollama...",
  "links": [
    "http://ollama.com/",
    "http://ollama.com/models",
    "https://github.com/ollama/ollama"
  ]
}
```
Integrations: support for mainstream MCP clients
Ollama web search can be integrated into AI coding environments of all kinds through an MCP (Model Context Protocol) server, extending them seamlessly. You can enable web search in any MCP client via the Python MCP server.
✅ Cline
To integrate with Cline, configure an MCP server in its settings: Manage MCP Servers > Configure MCP Servers > add the following configuration:
```json
{
  "mcpServers": {
    "web_search_and_fetch": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "path/to/web-search-mcp.py"],
      "env": { "OLLAMA_API_KEY": "your_api_key_here" }
    }
  }
}
```

✅ Codex
Add the following to ~/.codex/config.toml:

```toml
[mcp_servers.web_search]
command = "uv"
args = ["run", "path/to/web-search-mcp.py"]
env = { "OLLAMA_API_KEY" = "your_api_key_here" }
```

✅ Goose
Enable Ollama's web capabilities with one click via the official extension.


Free tier and upgrade options
- Web search is free for all registered users;
- Free accounts come with reasonable rate limits, suitable for individual exploration;
- For higher concurrency or enterprise-grade call rates, upgrade to an Ollama Cloud subscription plan for priority resource scheduling and SLA guarantees.
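When the free tier's rate limit kicks in, a client-side retry with exponential backoff keeps batch jobs running. This is a generic sketch, not part of the Ollama SDK; a real implementation would inspect the specific error the client raises rather than catching everything.

```python
import time

def backoff_schedule(attempts: int, base: float = 1.0, cap: float = 30.0):
    """Delays of base * 2**i seconds, capped so waits stay bounded."""
    return [min(base * (2 ** i), cap) for i in range(attempts)]

def with_retries(fn, attempts: int = 5):
    """Call fn(), retrying on failure with exponential backoff."""
    last = None
    for delay in backoff_schedule(attempts):
        try:
            return fn()
        except Exception as exc:  # hedged: catch-all for illustration only
            last = exc
            time.sleep(delay)
    raise last
```

For example, `with_retries(lambda: ollama.web_search("query"))` would transparently ride out transient rate-limit errors while still failing loudly after repeated attempts.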
