An Agentic LLMS.txt Generator

When I use the Gemini CLI or the Gemini Code Assist agent, the agent seems to have only a superficial understanding of the folders and repos I point it at. I wanted a way to give my code assistant agent a deep understanding of whatever it is working on.

One great way to do this is to give your agent an llms.txt file: an index of a repo/folder that your LLM can use to find the most relevant and up-to-date answers to your questions. And we can easily teach our LLM how to use an llms.txt with a simple, off-the-shelf MCP server.

But sometimes the repo you want to use has no llms.txt file. In that case, you can create one yourself. To do exactly that, I built a multi-agent solution using the Google Agent Development Kit (ADK)!

This was a great learning experience for me, and I want to share it with you. Maybe you'll find the application useful too!

1. The Problem in More Detail

If you've been following my blog, you'll already know that I'm using Google ADK to evolve my Rickbot multi-personality chatbot. I've written a few blogs on this topic already.

To make myself more productive, I want to be able to ask Gemini CLI questions about ADK, as they relate to my project. I want Gemini to have a deep understanding of this topic. Of course, it doesn't. ADK is relatively new, and it's evolving fast. Consequently, Gemini models will never be fully up-to-date with ADK.

So what to do?

I tried a few approaches...

  1. I tried telling Gemini to go and look at the documentation pages. But it can only read one page at a time, and I have to explicitly tell Gemini to do so.
  2. I could clone the official Google adk-docs repo locally, and then add this folder as context when Gemini CLI launches.

Whilst I use Gemini CLI as my code assistant agent, you may be using a different tool. But the principles and approaches I describe here are universally applicable.

For the second approach, you can add an entry to your global or project-specific .gemini/settings.json, so that Gemini CLI loads this folder as context:

{  
  "context": {  
    "includeDirectories": ["path/to/cloned/adk-docs"],  
    "loadFromIncludeDirectories": false  
  }  
}

By the way, the second setting, loadFromIncludeDirectories, tells Gemini CLI whether to also use that folder's GEMINI.md as additional context.

And I can add a note to my .gemini/GEMINI.md, like this:

## Docs and Information

Always consider the following sources of information when asked about these topics.

- ADK:
  - The official documentation lives in the adk-docs GitHub repo, which has been cloned and added as context here.
  - For any queries relating to ADK usage and implementation, **always refer to this repo**. Do not rely on what you already know.

This second approach isn't bad. Gemini CLI now knows about the folder, and sometimes uses it. But it's still not very reliable. It doesn't seem able to search across the repo to find the right files to answer a given question. It still gives me out-of-date answers.

What I need is some sort of sitemap that tells Gemini which files in the repo to look at, given a particular question or topic.

2. llms.txt to the Rescue!

Such a thing already exists! It's called llms.txt. If we look at llmstxt.org, we can see it's described like this:

The /llms.txt file: a proposal to standardise on using an /llms.txt file to provide information to LLMs at inference time.

Oh, this is perfect. Let me summarise what llms.txt is, and what it looks like:

  1. It is a plain text, human-readable markdown document.
  2. It provides a set of links, where each link is accompanied by a summary of that link's content.
  3. It is intended to be consumed by AI LLMs. When an LLM is asked a question, it can use the llms.txt to immediately find the most appropriate links, then open those links and read the content.

So rather than just saying to the model, "Here's a repo, use it", we're saying, "Here's a repo. Here's an index of everything in the repo. Use the index to decide which pages to reference."

For example, here is the llms.txt for the FastHTML project:

# FastHTML

> FastHTML is a python library which brings together Starlette, Uvicorn, HTMX, and fastcore's `FT` "FastTags" into a library for creating server-rendered hypermedia applications. The `FastHTML` class itself inherits from `Starlette`, and adds decorator-based routing, many additions, Beforeware, automatic `FT` to HTML rendering, and more.

Things to remember when writing FastHTML apps:

- Although parts of its API are inspired by FastAPI, it is not compatible with FastAPI syntax and is not targeted at creating API services
- FastHTML includes support for Pico CSS and the fastlite sqlite library, although both are optional; sqlalchemy can be used directly or via the fastsql library, and any CSS framework can be used. Support for the Surreal and css-scope-inline libraries is also included, but both are optional
- FastHTML is compatible with JS-native web components and any vanilla JS library, but not with React, Vue, or Svelte
- Use `serve()` for running uvicorn (`if __name__ == "__main__"` is not needed since it's automatic)
- When a title is needed with a response, use `Titled`; note that this already wraps children in `Container`, and already includes both the meta title as well as the H1 element.

## Docs

- [FastHTML concise guide](https://www.fastht.ml/docs/ref/concise_guide.html.md): A brief overview of a typical FastHTML app
- [HTMX reference](https://raw.githubusercontent.com/bigskysoftware/htmx/master/www/content/reference.md): Brief description of HTMX attributes, CSS classes, headers, events, extensions, js lib methods, and config options
- [Starlette quick guide](https://gist.githubusercontent.com/jph00/e91192e9bdc1640f5421ce3c904f2efb/raw/61a2774912414029edaf1a55b506f0e283b93c46/starlette-quick.md): A quick overview of some Starlette features useful to FastHTML devs.

## API

- [API List](https://www.fastht.ml/docs/apilist.txt): A succinct list of all functions and methods in fasthtml.
- [MonsterUI API List](https://raw.githubusercontent.com/AnswerDotAI/MonsterUI/refs/heads/main/docs/apilist.txt): Complete API reference for Monster UI, a component framework similar to shadcn, but for FastHTML

## Examples

- [Websockets application](https://raw.githubusercontent.com/AnswerDotAI/fasthtml/main/examples/basic_ws.py): Brief example of using Websockets with HTMX and FastHTML
- [Todo list application](https://raw.githubusercontent.com/AnswerDotAI/fasthtml/main/examples/adv_app.py): Detailed walk-thru of a complete CRUD app, showing idiomatic use of FastHTML and HTMX patterns.

## Optional

- [Surreal](https://raw.githubusercontent.com/AnswerDotAI/surreal/main/README.md): A tiny jQuery alternative for plain Javascript, providing the `me` and `any` functions
- [Starlette full docs](https://gist.githubusercontent.com/jph00/809e4a4808d4510be0e3dc9565e9cbd3/raw/9b717589ca44cedc8aaf00b2b8cacef922964c0f/starlette-sml.md): A subset of the Starlette docs useful for FastHTML development.
- [JS App Walkthrough](https://www.fastht.ml/docs/tutorials/e2e.html.md): An end-to-end walkthrough of a complete FastHTML app, including deployment to railway.
- [FastHTML by Example](https://www.fastht.ml/docs/tutorials/by_example.html.md): A collection of 4 FastHTML apps showing idiomatic use of FastHTML and HTMX patterns.
- [Using Jupyter to write FastHTML](https://www.fastht.ml/docs/tutorials/jupyter_and_fasthtml.html.md): A guide to developing FastHTML apps inside Jupyter notebooks.
- [FT Components](https://www.fastht.ml/docs/explains/explaining_xt_components.html.md): Explanation of the `FT` components, which are a way to write HTML in a Pythonic way.
- [FAQ](https://www.fastht.ml/docs/explains/faq.html.md): Answers to common questions about FastHTML.
- [MiniDataAPI Spec](https://www.fastht.ml/docs/explains/minidataapi.html.md): Explanation of the MiniDataAPI spec, which allows us to use the same API for many different database engines.
- [OAuth](https://www.fastht.ml/docs/explains/oauth.html.md): Tutorial and explanation of how to use OAuth in FastHTML apps.
- [Routes](https://www.fastht.ml/docs/explains/routes.html.md): Explanation of how routing works in FastHTML.
- [WebSockets](https://www.fastht.ml/docs/explains/websockets.html.md): Explanation of WebSockets and how they work in FastHTML.
- [Custom Components](https://www.fastht.ml/docs/ref/defining_xt_component.md): Explanation of how to create custom components in FastHTML.
- [Handlers](https://www.fastht.ml/docs/ref/handlers.html.md): Explanation of how request and response handlers work in FastHTML as routes.
- [Live Reloading](https://www.fastht.ml/docs/ref/live_reload.html.md): Explanation of how to use live reloading for FastHTML development.

3. How Do We Use an llms.txt File?

One easy way is with the MCP LLMS-TXT Doc Server. This is an open source package maintained by LangChain. It does two things:

  1. Reads an llms.txt file. This can be local or remote.
  2. Provides a simple fetch_docs tool which reads any of the URLs in the llms.txt file, when queried by a client agent.

You can easily create an extension to install this MCP server into your Gemini CLI. For example, imagine we want to use the LangGraph project's llms.txt, so that Gemini CLI understands all of the LangGraph documentation. To create such an extension, we create a folder called langgraph-docs-ext inside our .gemini/extensions folder, containing two files:

First, the gemini-extension.json file:

{  
  "name": "langgraph-docs-ext",  
  "version": "1.0.0",  
  "mcpServers": {  
    "langgraph-docs-mcp": {  
      "command": "uvx",  
      "args": [  
        "--from",  
        "mcpdoc",  
        "mcpdoc",  
        "--urls",  
        "LangGraph:https://langchain-ai.github.io/langgraph/llms.txt LangChain:https://python.langchain.com/llms.txt",  
        "--transport",  
        "stdio"  
      ]  
    }  
  },  
  "contextFileName": "GEMINI.md"  
}

Second, an associated GEMINI.md context file:

# LangGraph Docs Extension

Before you answer any question about **LangGraph**, you **MUST** use the **langgraph-docs-ext** tools to look up the documentation. Do not rely on internal knowledge, as best practices change frequently.

* Call the `list_doc_sources` tool to get the available `llms.txt` files
* Call the `fetch_docs` tool to read them
* Reflect on the urls in `llms.txt`
* Reflect on the input question
* Call `fetch_docs` on any urls relevant to the question
* Use this to answer the question

**IMPORTANT**: Answering questions about LangGraph without using `langgraph-docs-mcp` at any point will result in **pain for the user**.

# Accessing external content

When accessing other content linked from the docs, such as GitHub, Colab or remote *.ipynb files, use local tools (like `curl`) to retrieve the content, as this will be faster and more current.

If I now launch gemini and run the /mcp command, I can see our new MCP server running:

The langgraph-docs-extension running as an MCP server

Now let's try it out. I launch Gemini CLI and ask:

How can I learn about langgraph?

Gemini replies:

Gemini CLI knows to use the extension for LangGraph queries
I use the langchain-docs-ext extension, which provides a comprehensive set of documentation pages for LangGraph.
I can read these pages to answer any questions you have about it.

Would you like me to list the available documentation pages for you?

So far, so good!

I say:

Yes, please.

First, it asks for permission to use the tool:

Then it replies:

The available docs
How can I create an interactive multi-agent system with LangGraph?
Gemini CLI asking for permission

It then reads the docs, and ultimately gives a detailed, current and accurate response. It even finishes with a set of recommended docs for further reading:

Gemini CLI response

4. Back to the Actual Goal: Understanding ADK

It turns out the adk-docs repo already has an llms.txt file! So I can create a new extension to use it.

In fact, I've already created this as a downloadable extension on GitHub. See https://github.com/derailed-dash/adk-docs-ext. You can just clone it directly into your Gemini CLI extensions folder.

Let's try it out. I ask:

If I ask you questions about ADK, how will you find the answers?

Gemini CLI replies that it will use the adk-docs-mcp. Great.

Let's test the tool with an actual question:

How do I define a tool that takes a list of file paths as a parameter?

Gemini replies:

Hmm, this is very interesting. Particularly this part:

... the llms.txt file contains a table of contents and summaries, but **no actual links to the detailed documentation**. For example, it lists a section on "Defining Effective Tool Functions", which is exactly what we need, but I am unable to navigate to it.

For this reason, I cannot give you a definitive, verified answer based on the official ADK documentation.

That sucks!! What's going on? Well, the answer is staring us in the face. The llms.txt file in the official adk-docs repo doesn't actually provide a list of links and summaries, in accordance with the llms.txt standard. Instead, it's a list of concepts and summaries. It looks like this:

adk-docs llms.txt

What this means is that this llms.txt is superficial, and doesn't allow an agent to retrieve any of the detail.

Oh dear. What to do?

Ah, it turns out there's another file in the repo: llms-full.txt. Let's take a look at that one...

llms-full.txt

It is, evidently, quite big.

I've looked at it... It's 85,000 lines long and contains around 3.2 million characters. That would consume around 800K tokens of Gemini's context (at roughly four characters per token)!

Even though Gemini has a large context window, this is just too big to be usable for this purpose.

5. Conclusions So Far

  1. llms.txt is a great standard that allows an LLM (like Gemini) to understand the structure of a folder or repo, and helps the LLM immediately find the most appropriate documents to answer your query.
  2. The free, open source MCP LLMS-TXT Doc Server from LangChain provides a ready-made MCP server that guides your LLM to read an llms.txt, and to use the links within it to find the most appropriate material.
  3. It's easy to integrate such an MCP server into a client tool, like Gemini CLI.
  4. The adk-docs repo contains two sort-of llms files: llms.txt and llms-full.txt. The former contains no links; it doesn't conform to the llms.txt standard, and can't help an LLM navigate the repo. The latter is simply too big. Neither is fit for our goal.

So what to do?

I know! I'll create an agentic application that generates an llms.txt file for any folder or repo I give it. It will:

  1. Crawl the entire repo, identifying any markdown documents or source code.
  2. Read all of these documents, and use AI to create a summary of each one.
  3. Write the output llms.txt file, including links to the documents along with their associated summaries.

6. Solution Design

I'll start with a brief approximation of a solution architecture design document. (You know I love a good solution architecture document!!) I'm a believer in doing an up-front design before jumping in and building a solution, even if it's only a home project!

6.1 Solution Goal

The goal of the LLMS-Generator is to create an llms.txt file for any given code repository or folder. This llms.txt file is intended to be easily parsable by Large Language Models (LLMs), providing them with a structured and summarised understanding of the repository's content and layout. This enables more accurate and context-aware interactions between AI agents (e.g. Gemini CLI) and a codebase.

6.2 Functional Requirements

  • The application must be launched from a command-line interface (CLI).
  • The user must provide the absolute path to the target repository.
  • The user can optionally specify an output path for the generated llms.txt file.
  • The application must intelligently discover relevant files (e.g. .md, .py), while ignoring irrelevant directories and files (e.g. .git, __pycache__, .venv).
  • The application must generate a concise summary for every discovered file.
  • The application must generate a high-level summary of the overall project/folder/repo.
  • The application must structure the llms.txt file according to a specific format, which includes:
    • A main title (H1) with the name of the project.
    • An overall summary of the project.
    • Sub-headings (H2) representing the directories of the repo.
    • Under each sub-heading, a list of markdown links, where each file has a concise summary.
  • The application must handle both local repos (using relative file paths) and GitHub repos (using full GitHub URLs).
  • The application's behaviour should be configurable through environment variables (e.g. logging level, max number of files to process).
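
Putting those formatting requirements together, the generated file should look something like this hypothetical skeleton (the project and file names are illustrative only):

# my-project

> Overall summary of the project, generated from the content of all its files.

## src/module_a

- [README.md](src/module_a/README.md): Concise summary of the file.
- [main.py](src/module_a/main.py): Concise summary of the file.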

6.3 Quality Attributes / Architecturally Significant Requirements

  • Concurrency: This is a developer-focused application. Initially it will be run locally, and concurrent use is not required. This can be added later.
  • Reliability: The application should be robust, with graceful error handling, e.g. for invalid paths or API failures. The system should be resilient to API rate limiting, through a retry mechanism.
  • High availability and disaster recovery: Since this is an infrequently and locally run developer-oriented application, there are no HA or DR requirements.
  • Performance: Although the summarisation process is time-consuming, the application should perform reasonably.
  • Extensibility: The agent-based architecture should allow new capabilities to be added, and existing logic to be modified, with ease.
  • Maintainability: The codebase will be modular, with clear separation of concerns (CLI, agent logic, tools) to facilitate easy maintenance.
  • Testability: The project should include a suite of unit tests.

6.4 Solution Design

The LLMS-Generator is an agentic application implemented using google-adk. The architecture is made up of a CLI, a coordinating agent, plus sub-agents and tools.

The solution design below shows the interactions between the components, with the arrow labels showing the sequence of interactions:

Solution design

Arguably, some aspects of this design evolved during the implementation. But more on that later!

6.5 Design Decisions

  • Use of generative AI: Since we need to summarise artefacts (including documents and code) in a folder or repo, a generative AI solution is ideal. The AI provides the core capability of the application.
  • Use of Gemini-2.5-Flash: Gemini is a leading multimodal foundation model, well-suited to the task of document summarisation. Gemini also has a very large context window, which is useful for reading large numbers of documents for summarisation. Flash will be used, since it is faster and cheaper than Gemini Pro, and we don't need the more complex reasoning capabilities of the Pro model. We have no need for a custom-trained model. Finally, the model is fully managed by Google, and we can consume it using the standard Gemini API.
  • Agent-based architecture (google-adk): This provides a modular and extensible framework. By breaking the logic into distinct agents and tools, the system is easier to develop, test and maintain. It also allows an LLM to orchestrate a complex workflow.
  • Coordinating agent: A main ADK coordinator agent will orchestrate the overall workflow.
  • Sequential agent for summarisation: The summarisation process is naturally a two-step sequence: read the files, then summarise. Using a sequential agent guarantees this order of operations, leading to a more reliable and predictable workflow.
  • Command-line interface with Typer: A CLI is a standard and efficient interface for a developer-focused tool. The Typer package simplifies the creation of a clean, professional CLI in Python, including automatic help generation and argument parsing.
  • Schema validation with pydantic: We can define the expected output schema for the summarisation agent, making the system more robust. This ensures that data passed between agents is correctly formatted, reducing the chance of runtime errors.
  • Exponential backoff for API calls: The application is likely to make frequent model calls in a short period. This can result in 429 errors. We can mitigate this with exponential backoff.
  • No persistence required: It is anticipated that the end-to-end flow can be completed without any external working storage or database. If the workflow were to exceed what is possible within the model context, we could later implement external persistence. For example, we could implement a simple Firestore database to store the content we gather, and to build the summaries before returning them to the agent.

7. Building the Application

Now I'll walk you through my experience of actually building the application. To follow along, you can find the complete code in the GitHub repo.

7.1 Getting Started

For this project, I didn't start from the Agent Starter Pack. The starter pack provides a bunch of things that I was unlikely to make use of. So I decided to start from scratch, and created this folder structure with some empty files:

llms-gen/  
├── notebooks/  
│   └── generate_llms_experiments.ipynb  
├── src/  
│   ├── client_fe/  
│   │   └── __init__.py  
│   ├── common_utils/  
│   │   └── __init__.py  
│   ├── llms_gen_agent/  
│   │   ├── __init__.py  
│   │   ├── agent.py  
│   │   └── tools.py  
│   └── tests/  
│       └── __init__.py  
├── .gitattributes # reused from a previous project  
├── .gitignore     # reused from a previous project  
├── README.md  
└── TODO.md

Then I populated my README.md with a project overview. (I would continue to build out the README.md as I went.)

7.2 Adding a TODO

These days, I always have a TODO.md. It helps me structure my plan, but it also helps my code assistant agent. Although it started out much shorter than it is now, this is my TODO.md at the time of writing:

# TODO

- [x] Create project skeleton, including `README`, `src`, agent folder, `.gitignore`, `.gitattributes`
- [x] Create initial `TODO`
- [x] Create `pyproject.toml`
- [x] Create `.env` and point to a Google Cloud project
- [x] Create environment setup script
- [x] Create `Makefile`
- [x] Create `GEMINI.md`
- [x] Create config and logging modules
- [x] Create coordinator agent
- [x] Create discover files tool
- [x] Create file reader agent and file reading tool
- [x] Create content summariser agent
- [x] Create initial unit tests
- [x] Create experimentation Jupyter notebook
- [x] Parameterise the number of files to process
- [x] Implement pydantic to enforce output schemas
- [x] Add a sequential agent, so that we first read all files, and then secondly summarise everything.
- [x] Add a callback to clean up JSON prefixes or errant markers from the model.
- [x] Complete the project summarisation step.
- [x] Eliminate 429/quota issues when calling Gemini, particularly from the `document_summariser_agent`
- [x] Add a callback to capture the output of files read, and store in session state.
- [x] Fewer sections, governed by folder depth.
- [x] Complete the final `llms.txt` file creation.
- [x] Provide a client-side way to run the application without having to send a prompt, e.g. with CLI arguments.
- [x] Make the repo public.
- [ ] Write the blog.
- [ ] Increase test coverage by adding unit tests for agents and other utility functions.
- [ ] Replace the LangChain file reading tool with a custom tool; eliminate the need for the callback.
- [ ] Add integration tests to test agent end-to-end functionality.
- [ ] Make the list of excluded directories in `discover_files` configurable, and deterministic.
- [ ] Also exclude based on `.gitignore`.
- [ ] Make the solution iterative, e.g. if the output is incomplete, or if we get close to filling the context window.

As an aside, this is part of my global .gemini/GEMINI.md context file:

## Project Plan

- Check if a `TODO.md` file exists in the current project. If it does,
  this file captures the overall plan for this project.
  It can be used to determine which steps we have already completed,
  and which tasks are still outstanding.
- When you believe you have completed a step from the `TODO.md`,
  propose to close it off.

This helps guide the Gemini CLI / Code Assist Agent to make proper use of my TODO.md.

7.3 Creating the pyproject.toml

When you're building a Python project, managing dependencies can be a pain. The pyproject.toml is the modern solution. Think of it as the master blueprint for a Python project: a single, structured file that defines everything about the project's build system and dependencies. It's a comprehensive configuration file that ensures consistency and reproducibility.

And the fast package manager uv reads the pyproject.toml blueprint and makes it a reality. It creates your virtual environment and installs all your dependencies, without the usual hassle.

Wondering what happened to pip and requirements.txt? We don't need them. They're replaced by uv and pyproject.toml, respectively.

For this project, I copied an existing pyproject.toml from a previous project, and adapted it as required. So we end up with something like this:

[project]  
name = "llms-generator"  
version = "0.1.0"  
description = "An agentic solution that aims to create an `llms.txt` file for any given repo or folder"  
authors = [  
    {name = "Dazbo (Darren Lester)", email = "my.email@address.com"},  
]  
dependencies = [  
    "google-adk",  
    "google-genai",  
    "google-cloud-logging",  
    "google-cloud-aiplatform[adk,evaluation,agent_engines]",  
    "python-dotenv",  
    # Web framework  
    "fastapi~=0.115.14",  
    "uvicorn~=0.34.3", # means >= 0.34.3 but < 0.35  
    "pyyaml",  
]  

requires-python = ">=3.12,<3.13"  

[dependency-groups]  
dev = [  
    "pytest",  
    "pytest-asyncio",  
    "nest-asyncio",  
]  

[project.optional-dependencies]  

jupyter = [  
    "jupyter",  
    "ipython"  
]  
lint = [  
    "ruff>=0.4.6",  
    "mypy~=1.17.0",  
    "codespell~=2.4.1",  
    "types-pyyaml~=6.0.12",  
    "types-requests~=2.32.4",  
]  

[tool.ruff]  
line-length = 130  
target-version = "py312"  

[tool.ruff.lint]  
select = [  
    "E",   # pycodestyle  
    "F",   # pyflakes  
    "W",   # pycodestyle warnings  
    "I",   # isort  
    "C",  # flake8-comprehensions  
    "B",   # flake8-bugbear  
    "UP", # pyupgrade  
    "RUF", # ruff specific rules  
]  
ignore = [  
    "E302", # expected two blank lines between defs  
    "W291", # trailing whitespace  
    "W293"  # line contains whitespace  
]  

[tool.ruff.lint.isort]  
known-first-party = ["src"] # Because this is where my source lives  

[tool.mypy]  
disallow_untyped_calls = false     # Prohibit calling functions that lack type annotations.  
disallow_untyped_defs = false      # Allow defining functions without type annotations.  
disallow_incomplete_defs = true    # Prohibit defining functions with incomplete type annotations.  
no_implicit_optional = true        # Require `Optional[T]` for variables that can be `None`.  
check_untyped_defs = true          # Type-check the body of functions without annotations. Catch potential mismatches.  
disallow_subclassing_any = true    # Prohibit a class from inheriting from a value of type `Any`.  
warn_incomplete_stub = true        # Warn about incomplete type stubs (`.pyi` files).  
warn_redundant_casts = true        # Warn if a type cast is unnecessary.  
warn_unused_ignores = true         # Warn about `# type: ignore` comments that are no longer needed.  
warn_unreachable = true            # Warn about code that is unreachable.  
follow_imports = "silent"          # Type-check imported modules but suppress errors from them.  
ignore_missing_imports = true      # Suppress errors about unresolved imports.  
explicit_package_bases = true      # Enforce explicit declaration of package bases.  
disable_error_code = ["misc", "no-any-return", "no-untyped-def"]  

exclude = [".venv", ".git"]  

[tool.codespell]  
ignore-words-list = "rouge"  
skip = "./locust_env/*,uv.lock,.venv,./src/frontend,**/*.ipynb"  

[build-system]  
requires = ["hatchling"]  
build-backend = "hatchling.build"  

[tool.pytest.ini_options]  
pythonpath = "."  
asyncio_default_fixture_loop_scope = "function"  
testpaths = ["src/tests"] # This helps pytest to find tests, making collection faster  

[tool.hatch.build.targets.wheel]  
packages = ["src/llms_gen_agent", "src/common_utils", "src/client_fe"]  

This is fairly self-explanatory. A few things worth noting:

  • I can install all the dependencies with the uv command uv sync.
  • There are optional dependencies, which we only install when needed. For example, uv sync --dev --extra jupyter --extra lint.
  • We use ruff for linting and formatting, mypy for static type checking, and codespell to find typos in the repo.

7.4 .env

Now I create the .env for local environment setup. Note that this file should not be committed to source control. For the Llms-Generator application, it looks something like this:

# .env  

export GOOGLE_CLOUD_STAGING_PROJECT="your-staging-project-id"  
export GOOGLE_CLOUD_PRD_PROJECT="your-prod-project-id"  

# These Google Cloud variables will be set by the scripts/setup-env.sh script  
# GOOGLE_CLOUD_PROJECT=""  
# GOOGLE_CLOUD_LOCATION="global"  

export PYTHONPATH="src"  

# Agent variables  
export AGENT_NAME="llms_gen_agent" # The name of the agent  
export MODEL="gemini-2.5-flash" # The model used by the agent  
export GOOGLE_GENAI_USE_VERTEXAI="True" # True to use Vertex AI for auth; else use API key  
export LOG_LEVEL="INFO"

Again, we'll continue to evolve this as we go.

7.5 Creating a Makefile for Convenience

I do love a Makefile! It's so convenient for installing dependencies, running ruff/mypy/codespell, running tests, and launching the application.

| Command                       | Description                                                                        |
| ----------------------------- | ---------------------------------------------------------------------------------- |
| `source scripts/setup-env.sh` | Sets the Google Cloud project and authenticates with Dev/Staging                   |
| `make install`                | Installs all required dependencies using `uv`                                      |
| `make playground`             | Launches the UI for testing the agent locally and remotely. This runs `uv run adk web src` |
| `make test`                   | Runs unit and integration tests                                                    |
| `make lint`                   | Runs code quality checks (codespell, ruff, mypy)                                   |
| `make generate`               | Executes the Llms-Generator command-line application                               |

We can also configure our make targets to check preconditions before running. For example, when I run make test, it first checks that my GOOGLE_CLOUD_PROJECT environment variable has been set. If it hasn't, it means I've probably not yet run my setup-env.sh script, and my tests would certainly fail.

# Run unit and integration tests  
test:  
  @test -n "$(GOOGLE_CLOUD_PROJECT)" || (echo "Error: GOOGLE_CLOUD_PROJECT is not set. Setup environment before running tests" && exit 1)  
  uv run pytest src/tests/unit

7.6 Providing Context to Gemini CLI / Gemini Code Assist

This is a good time to create the GEMINI.md. It's based on my "global" .gemini/GEMINI.md, with specific context for this project. Here's what it looks like:

# Project: LLMS-Generator

---
***IMPORTANT: perform this check at the start of every session!***

Google Cloud configuration is achieved through a combination of `.env` and the `scripts/setup-env.sh` script.

Before providing your first response of any conversation, you must perform these steps:
1.  Run `printenv GOOGLE_CLOUD_PROJECT` to check the environment variable.
2.  Based on the output of that command, state whether the variable is set.
3.  If it is not set, suggest that I run `scripts/setup-env.sh` before continuing the conversation.

The presence of this environment variable indicates that the script has been run. The absence of the variable indicates that it has not.

Note that Google Cloud failures may be expected if this script has not been run. For example, tests will fail. If tests fail, we should check whether the script has been run.
---

## Project Overview

_LLMS-Generator_ is an agentic solution that aims to create an `llms.txt` file for any given repo or folder.

The `llms.txt` file is an AI/LLM-friendly markdown file that allows an AI to understand the purpose of a repo, and to have a comprehensive understanding of the repo's sitemap and the purpose of each file. This is particularly useful for giving an AI (like Gemini) access to documentation repos.

The `llms.txt` file is structured like this:

- An H1 with the name of the project or site
- A summary of the purpose of the project/site.
- Zero or more markdown sections, delimited by H2 headers, containing appropriate section summaries.
- Each section contains a list of markdown hyperlinks, in the format: `[name](url): summary`.

For a detailed description of the `llms.txt` standard, see [here](https://github.com/AnswerDotAI/llms-txt).

## Building and Running

### Dependencies

- **uv:** Python package manager
- **Google Cloud SDK:** for interacting with GCP services
- **make:** for running common development tasks

Project dependencies are managed in `pyproject.toml` and can be installed with `uv`. The `make` commands simplify many of the `uv` and `adk` commands.

## Development Conventions

- **Configuration:** Project dependencies and metadata are defined in `pyproject.toml`.
- **Dependencies:** Project dependencies are managed in `pyproject.toml`. The `[project]` section defines the main dependencies, and the `[dependency-groups]` section defines development and optional dependencies.
- **Source code:** Lives in the `src/` directory. This includes agents, frontends, notebooks and tests.
- **Notebooks:** The `notebooks/` directory contains Jupyter notebooks for prototyping, testing and evaluating the agents.
- **Testing:** The project includes unit and integration tests in `src/tests/`. Tests are written using `pytest` and `pytest-asyncio`. Tests can be run with `make test`
- **Linting:** The project uses `ruff` for linting and formatting, `mypy` for static type checking, and `codespell` for checking for common misspellings. Configuration for these tools can be found in `pyproject.toml`. We can run linting with `make lint`.
- **AI-assisted development:** The `GEMINI.md` file provides development context for AI tools like the Gemini CLI.

## Project Plan

- `TODO.md` captures the overall plan for this project.

Note that this GEMINI.md:

  • Helps Gemini understand the purpose of this project.
  • Forces Gemini to check whether my setup-env.sh script has been run, before commencing any conversation.
  • Helps Gemini understand the folder structure and conventions I'm following.

7.7 Google Cloud Project

This application will make use of the Google Gemini-2.5-Flash model / API. To do this, we need to set up Google ADC locally, pointing to a Google Cloud project with this API enabled; or else we need to supply an API key. I've opted for the former.

I haven't created a new Google Cloud project for this, since I already have a "scratch" project that I typically use for this kind of development. And, at this point, I have no plans to deploy the application itself to Google Cloud. I'll only be running it locally.

But you may need to (or prefer to) create a new project.

7.8 Environment Setup Script

This application needs a few things doing with each new session:

  • We need to set our environment variables, by reading .env
  • We need to authenticate to Google Cloud, so that we can use Google Cloud APIs. (Specifically: Gemini.)
  • We need to install the dependencies defined in pyproject.toml.

So I've created a script to automate this: /scripts/setup-env.sh:

#!/bin/bash  
# This script is meant to be sourced to set up your development environment.  
# It configures gcloud, installs dependencies, and activates the virtualenv.  
#  
# Usage:  
#   source ./setup-env.sh [--noauth] [-t|--target-env <DEV|PROD>]  
#  
# Options:  
#   --noauth: Skip gcloud authentication.  
#   -t, --target-env: Set the target environment (DEV or PROD). Defaults to DEV.  

# --- Color and Style Definitions ---  
RESET='\033[0m'  
BOLD='\033[1m'  
RED='\033[0;31m'  
GREEN='\033[0;32m'  
YELLOW='\033[0;33m'  
BLUE='\033[0;34m'  

# --- Parameter parsing ---  
TARGET_ENV="DEV"  
AUTH_ENABLED=true  

while [[ $# -gt 0 ]]; do  
    case "$1" in  
        -t|--target-env)  
            if [[ -n "$2" && "$2" != --* ]]; then  
                TARGET_ENV="$2"  
                shift 2  
            else  
                echo "Error: --target-env requires a non-empty argument."  
                return 1  
            fi  
            ;;  
        --noauth)  
            AUTH_ENABLED=false  
            shift  
            ;;  
        *)  
            shift  
            ;;  
    esac  
done  

# Convert TARGET_ENV to uppercase  
TARGET_ENV=$(echo "$TARGET_ENV" | tr '[:lower:]' '[:upper:]')  

echo -e "${BLUE}${BOLD}--- ☁️  Configuring Google Cloud environment ---${RESET}"  

# 1. Check for .env file  
if [ ! -f .env ]; then  
 echo -e "${RED}❌ Error: .env file not found.${RESET}"  
 echo "Please create a .env file with your project variables and run this command again."  
 return 1  
fi  

# 2. Source environment variables and export them  
echo -e "Sourcing variables from ${BLUE}.env${RESET} file..."  
set -a # automatically export all variables (allexport = on)  
source .env  
set +a # disable allexport mode  

# 3. Set the target project based on the parameter  
if [ "$TARGET_ENV" = "PROD" ]; then  
    echo -e "Setting environment to ${YELLOW}PROD${RESET} ($GOOGLE_CLOUD_PRD_PROJECT)..."  
    export GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_PRD_PROJECT  
else  
    echo -e "Setting environment to ${YELLOW}DEV/Staging${RESET} ($GOOGLE_CLOUD_STAGING_PROJECT)..."  
    export GOOGLE_CLOUD_PROJECT=$GOOGLE_CLOUD_STAGING_PROJECT  
fi  

# 4. Authenticate with gcloud and configure project  
if [ "$AUTH_ENABLED" = true ]; then  
    echo -e "\n🔐 Authenticating with gcloud and setting project to ${BOLD}$GOOGLE_CLOUD_PROJECT...${RESET}"  
    gcloud auth login --update-adc 2>&1 | grep -v -e '^$' -e 'WSL' -e 'xdg-open' # Suppress any annoying WSL messages  
    gcloud config set project "$GOOGLE_CLOUD_PROJECT"  
    gcloud auth application-default set-quota-project "$GOOGLE_CLOUD_PROJECT"  
else  
    echo -e "\n${YELLOW}Skipping gcloud authentication as requested.${RESET}"  
    gcloud config set project "$GOOGLE_CLOUD_PROJECT"  
fi  

echo -e "\n${BLUE}--- Current gcloud project configuration ---${RESET}"  
gcloud config list project  
echo -e "${BLUE}------------------------------------------${RESET}"  

# 5. Get project numbers  
echo "Getting project numbers..."  
export STAGING_PROJECT_NUMBER=$(gcloud projects describe $GOOGLE_CLOUD_STAGING_PROJECT --format="value(projectNumber)")  
export PROD_PROJECT_NUMBER=$(gcloud projects describe $GOOGLE_CLOUD_PRD_PROJECT --format="value(projectNumber)")  
echo -e "${BOLD}STAGING_PROJECT_NUMBER:${RESET} $STAGING_PROJECT_NUMBER"  
echo -e "${BOLD}PROD_PROJECT_NUMBER:${RESET}  $PROD_PROJECT_NUMBER"  
echo -e "${BLUE}------------------------------------------${RESET}"  

# 6. Sync Python dependencies and activate venv  
echo "Activating Python virtual environment..."  
source .venv/bin/activate  

echo "Syncing python dependencies with uv..."  
uv sync --dev --extra jupyter  

echo -e "\n${GREEN}✅ Environment setup complete for ${BOLD}$TARGET_ENV${RESET}${GREEN} with project ${BOLD}$GOOGLE_CLOUD_PROJECT${RESET}${GREEN}. Your shell is now configured.${RESET}"

You can run it like this:

source scripts/setup-env.sh [--noauth]

When it runs, it looks like this:

Running setup-env.sh

These days, I tend to use the same script across all of my Google Cloud-related applications. You might find it useful too!

7.9 Configuration and Logging

I like to start with a config.py convenience module, for loading configuration from environment variables.

"""This module provides configuration for the LLMS-Generator agent."""  

import os  
from collections.abc import Callable  
from dataclasses import dataclass  

import google.auth  

from common_utils.exceptions import ConfigError  
from common_utils.logging_utils import setup_logger  

# --- Constants for default environment variables ---  
DEFAULT_AGENT_NAME = "llms_gen_agent"  
DEFAULT_GCP_LOCATION = "global"  
DEFAULT_MODEL = "gemini-2.5-flash"  
DEFAULT_GENAI_USE_VERTEXAI = "True"  
DEFAULT_MAX_FILES_TO_PROCESS = "0"  
DEFAULT_BACKOFF_INIT_DELAY = "2"  
DEFAULT_BACKOFF_ATTEMPTS = "5"  
DEFAULT_BACKOFF_MAX_DELAY = "60"  
DEFAULT_BACKOFF_MULTIPLIER = "2"  

agent_name = os.environ.setdefault("AGENT_NAME", DEFAULT_AGENT_NAME)  
logger = setup_logger(agent_name)  

@dataclass  
class Config:  
    """Holds application configuration."""  

    agent_name: str  
    project_id: str  
    location: str  
    model: str  
    genai_use_vertexai: bool  

    max_files_to_process: int # 0 means no limit  

    backoff_init_delay: int  
    backoff_attempts: int  
    backoff_max_delay: int  
    backoff_multiplier: int  

    valid: bool = True # Set this to False to force config reload from env vars  

    def invalidate(self):  
        """ Invalidate current config. This forces the config to be refreshed from the environment when  
        get_config() is next called. """  
        logger.debug("Invalidating current config.")  
        self.valid = False  

    def __str__(self):  
        return (  
            f"Agent Name: {self.agent_name}\n"  
            f"Project ID: {self.project_id}\n"  
            f"Location: {self.location}\n"  
            f"Model: {self.model}\n"  
            f"GenAI Use VertexAI: {self.genai_use_vertexai}\n"  
            f"Max Files To Process: {self.max_files_to_process}\n"  
            f"Backoff Init Delay: {self.backoff_init_delay}\n"  
            f"Backoff Attempts: {self.backoff_attempts}\n"  
            f"Backoff Max Delay: {self.backoff_max_delay}\n"  
            f"Backoff Multiplier: {self.backoff_multiplier}\n"  
        )  

def _get_env_var(key: str, default_value: str, type_converter: Callable=str):  
    """Helper to get environment variables with a default and type conversion."""  
    return type_converter(os.environ.setdefault(key, default_value))  

current_config = None  

def setup_config() -> Config:  
    """Gets the application configuration by reading from the environment.  
    The expensive Google Auth call to determine the project ID is only performed once.  
    If the current_config is invalid, the config will be refreshed from the environment.  
    Otherwise, the cached config is returned.  

    Returns:  
        Config: An object containing the current application configuration.  

    Raises:  
        ConfigError: If the GCP Project ID cannot be determined on the first call.  
    """  
    global current_config  

    # Load env vars  
    location = _get_env_var("GOOGLE_CLOUD_LOCATION", DEFAULT_GCP_LOCATION)  
    model = _get_env_var("MODEL", DEFAULT_MODEL)  
    genai_use_vertexai = _get_env_var("GOOGLE_GENAI_USE_VERTEXAI", DEFAULT_GENAI_USE_VERTEXAI, lambda x: x.lower() == "true")  
    max_files_to_process = _get_env_var("MAX_FILES_TO_PROCESS", DEFAULT_MAX_FILES_TO_PROCESS, int)  
    backoff_init_delay = _get_env_var("BACKOFF_INIT_DELAY", DEFAULT_BACKOFF_INIT_DELAY, int)  
    backoff_attempts = _get_env_var("BACKOFF_ATTEMPTS", DEFAULT_BACKOFF_ATTEMPTS, int)  
    backoff_max_delay = _get_env_var("BACKOFF_MAX_DELAY", DEFAULT_BACKOFF_MAX_DELAY, int)  
    backoff_multiplier = _get_env_var("BACKOFF_MULTIPLIER", DEFAULT_BACKOFF_MULTIPLIER, int)  

    if current_config: # If we've already loaded the config before  
        if current_config.valid: # return it as is  
            return current_config  
        else: # Current config invalid - we need to update it  
            current_config.location=location  
            current_config.model=model  
            current_config.genai_use_vertexai=genai_use_vertexai  
            current_config.max_files_to_process=max_files_to_process  
            current_config.backoff_init_delay=backoff_init_delay  
            current_config.backoff_attempts=backoff_attempts  
            current_config.backoff_max_delay=backoff_max_delay  
            current_config.backoff_multiplier=backoff_multiplier  

            logger.debug(f"Updated config:\n{current_config}")  
            return current_config              

    # If we're here, then we've never created a config before  
    _, project_id = google.auth.default()  
    if not project_id:  
        raise ConfigError("GCP Project ID not set. Have you run scripts/setup-env.sh?")  

    current_config = Config(  
        agent_name=agent_name,  
        project_id=project_id,  
        location=location,  
        model=model,  
        genai_use_vertexai=genai_use_vertexai,  
        max_files_to_process=max_files_to_process,  
        backoff_init_delay=backoff_init_delay,  
        backoff_attempts=backoff_attempts,  
        backoff_max_delay=backoff_max_delay,  
        backoff_multiplier=backoff_multiplier  
    )  

    logger.debug(f"Loaded config:\n{current_config}")  
    return current_config

And here is src/common_utils/logging_utils.py:

"""  
This module provides a shared logging utility for the application.  

It offers a centralized `setup_logger` function that configures and returns a  
standardized logger instance. This ensures consistent logging behavior,  
formatting, and level across the entire application.  

To use the logger in any module, import the `setup_logger` function and call it with a name,   
typically `__name__`, to get a logger instance specific to that module.  

Example:  
    ```  
    from common_utils.logging_utils import setup_logger  

    logger = setup_logger(__name__)  
    ```  

In this application we setup up the logger in `config.py`, and then expose that logger  
to other modules. E.g.  
    ```  
    from llms_gen_agent.config import get_config, logger  
    ```  
"""  

import logging  
import os  

def setup_logger(app_name: str) -> logging.Logger:  
    """Sets up and returns a logger for the application."""  
    # Suppress verbose logging from ADK and GenAI libraries - INFO logging is quite verbose  
    logging.getLogger("google_adk").setLevel(logging.ERROR)  
    logging.getLogger("google_genai").setLevel(logging.ERROR)  

    # Suppress "Unclosed client session" warnings from aiohttp  
    logging.getLogger('asyncio').setLevel(logging.CRITICAL)  

    log_level = os.environ.get("LOG_LEVEL", "INFO").upper()  
    app_logger = logging.getLogger(app_name)  
    log_level_num = getattr(logging, log_level, logging.INFO)  
    app_logger.setLevel(log_level_num)  

    # Add a handler only if one doesn't exist to prevent duplicate logs  
    if not app_logger.handlers:  
        handler = logging.StreamHandler()  
        formatter = logging.Formatter(  
            fmt="%(asctime)s.%(msecs)03d:%(name)s - %(levelname)s: %(message)s",  
            datefmt="%H:%M:%S",  
        )  
        handler.setFormatter(formatter)  
        app_logger.addHandler(handler)  

    app_logger.propagate = False  # Prevent propagation to the root logger  
    app_logger.info("Logger initialised for %s.", app_name)  
    app_logger.debug("DEBUG level logging enabled.")  

    return app_logger

This logging module also disables some of the verbose logging from the google-adk, google-genai and asyncio packages.


8. Welcome to the Coding Phase!

In this part, I'll show how I actually coded the multi-agent application using the Google Agent Development Kit (ADK).

8.1 Creating the Coordinator Agent

Let's start with the heart of the operation. The main generate_llms_coordinator agent lives in the llms_gen_agent folder. The first iteration looked like this:

"""  
This module defines the main agent for the LLMS-Generator application.  
"""  
from google.adk.agents import Agent  
from google.adk.tools.agent_tool import AgentTool  
from google.genai.types import GenerateContentConfig  

from .config import setup_config  
from .sub_agents.doc_summariser import document_summariser_agent  
from .tools import discover_files, generate_llms_txt  

config = setup_config()  

# Agent is an alias for LlmAgent  
# It is non-deterministic and decides what tools to use,   
# or what other agents to delegate to  
generate_llms_coordinator = Agent(  
    name="generate_llms_coordinator",  
    description="An agent that generates a llms.txt file for a given repository. Coordinates overall process.",  
    model=config.model,       
    instruction="""You are an expert in analyzing code repositories and generating `llms.txt` files.  
Your goal is to create a comprehensive and accurate `llms.txt` file that will help other LLMs  
understand the repository. When the user asks you to generate the file, you should ask for the  
absolute path to the repository/folder, and optionally an output path.  

Here's the detailed process you should follow:  
1.  **Discover Files**: Use the `discover_files` tool with the provided `repo_path` to get a list of all  
    relevant files paths, in the return value `files`.  
2.  **Check Files List**: Check you received a success response and a list of files.  
    If not, you should provide an appropriate response to the user and STOP HERE.  
3.  **Summarize Files**: Delegate to the `document_summariser_agent` Agent Tool.  
    **CRITICAL: This tool MUST be called with NO arguments.**  
    The `document_summariser_agent` will read the list of files from the session state under the key 'files'   
    (which was populated by the `discover_files` tool).   
    The `document_summariser_agent` will then return the full set of summaries as JSON   
    with a single key `summaries` that contains a dictionary of all the path:summary pairs.  
    **Example of correct call:** `document_summariser_agent()`  
4.  **Check Summary Response**: you should have received a JSON response containing the summaries.  
    This contains all the files originally discovered, with each mapped to a summary.  
    If so, continue. If not, you should provide an appropriate response to the user and STOP HERE.  
5.  **Generate `llms.txt`**: Call the `generate_llms_txt` tool.  
    Provide `repo_path` as an argument. If the user provided an output path,   
    provide it as the `output_path` argument.  
    The tool will determine other required values from session state.  
6.  **Response**  
    Finally, respond to the user confirming whether the `llms.txt` creation was successful.  
    State the path where the file has been created, which is stored in session state key `llms_txt_path`.  
""",  
    tools=[  
        discover_files, # automatically wrapped as FunctionTool  
        generate_llms_txt, # automatically wrapped as FunctionTool  
        AgentTool(agent=document_summariser_agent)  
    ],  
    generate_content_config=GenerateContentConfig(  
        temperature=0.1,  
        top_p=1,  
        max_output_tokens=60000  
    )  
)  

root_agent = generate_llms_coordinator

Let's go over some important details of this code:

  • We create a coordinator agent called generate_llms_coordinator. It's just a vanilla ADK [LlmAgent](https://google.github.io/adk-docs/agents/llm-agents/), which has been given another agent and tools, along with specific instructions on how and when to use them.
  • The description describes the overall purpose of this coordinator.
  • The model is pulled from our config, where it will be set to gemini-2.5-flash.
  • The instruction is set to a prompt that spells out exactly which tools should be used, and which agents should be delegated to. We should not provide any instructions about how those tools and agents do their jobs; that is their responsibility.
  • The tools parameter is where we supply the list of tools and agents referenced in the prompt. Note how I wrap the document_summariser_agent with AgentTool, so that we can use it as a tool.

8.2 Creating the "Discover Files" Tool

The first part of our orchestrator prompt was:

1. **Discover Files**: Use the `discover_files` tool with the provided `repo_path` to get a list of all relevant file paths, in the return value `files`.

Remember that in order to interact with the world outside of the model, an agent needs to use tools. The first tool we'll give our coordinator agent is the ability to walk a folder or repo and identify all the relevant file paths.

We'll call this discover_files, and we'll put it in our tools.py file. Initially, it looked like this:

"""  
This module provides a collection of tools for the LLMS-Generator agent.  

The tools are designed to facilitate the discovery of files within a given repository,   
read their contents, and generate a structured `llms.txt` sitemap file based on the findings.  

Key functionalities include:  
- `discover_files`: Scans a repository to find relevant files (e.g. markdown and python files),  
  excluding common temporary or git-related directories.  
- `generate_llms_txt`: Constructs the `llms.txt` Markdown file, organizing  
  discovered files into sections with summaries.  
"""  
import os  

from google.adk.tools import ToolContext  

from .config import logger  

def discover_files(repo_path: str, tool_context: ToolContext) -> dict:  
    """Discovers all relevant files in the repository and returns a list of file paths.  

    Args:  
        repo_path: The absolute path to the repository to scan.  

    Returns:  
        A dictionary with "status" (success/failure) and "files" (a list of file paths).  
    """  
    logger.debug("Entering tool: discover_files with repo_path: %s", repo_path)  

    excluded_dirs = {'.git', '.github', 'overrides', '.venv', 'node_modules', '__pycache__', '.pytest_cache'}  
    excluded_files = {'__init__'}  
    included_extensions = {'.md', '.py'}  

    directory_map: dict[str, list[str]] = {}  
    try:  
        for root, subdirs, files in os.walk(repo_path):  

            # Modify subdirs in place so that os.walk() sees changes directly  
            subdirs[:] = [d for d in subdirs if d not in excluded_dirs]  
            for file in files:  
                if (any(file.endswith(ext) for ext in included_extensions)   
                        and not any(file.startswith(ext) for ext in excluded_files)):  
                    file_path = os.path.join(root, file)  
                    directory = os.path.dirname(file_path)  
                    if directory not in directory_map:  
                        directory_map[directory] = []  
                    directory_map[directory].append(file_path)  

        all_dirs = list(directory_map.keys())  
        tool_context.state["dirs"] = all_dirs # directories only  
        logger.debug("Dirs\n:" + "\n".join([str(dir) for dir in all_dirs]))  

        # Create a single list of all the files  
        all_files = [file for files_list in directory_map.values() for file in files_list]  
        tool_context.state["files"] = all_files  
        logger.debug("Files\n:" + "\n".join([str(file) for file in all_files]))  
        logger.debug("Exiting discover_files.")  
        return {"status": "success", "files": all_files}  
    except Exception as e:  
        logger.error("Error in discover_files: %s", e)  
        return {"status": "failure", "files": []}

This tool is just a user-defined Python function. The function itself isn't complicated. We simply use os.walk() to recursively walk the directory tree, yielding folders and files. Note the use of subdirs[:], which allows us to replace the subdirs variable in place, allowing us to exclude any subdirectories in the excluded_dirs list. (We can parameterise excluded_dirs later. For now, I've hardcoded it.)
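
Here's a minimal, self-contained sketch of that pruning trick (illustrative only; the excluded directory name is arbitrary):

import os

# Assigning to subdirs[:] mutates the very list that os.walk() uses to decide
# which directories to descend into next, so pruned directories are never visited.
for root, subdirs, files in os.walk("."):
    subdirs[:] = [d for d in subdirs if d != ".git"]  # prune in place
    # NB: a plain `subdirs = [...]` would only rebind the local name,
    # and os.walk() would still descend into ".git".
    print(root)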

The net result of this function is:

  • We create a Python dictionary that maps each file to the directory that contains it.
  • We attach the all_dirs list to our session state, under the key dirs. Note that session state is made available through the ToolContext object, which is passed as an argument.
  • We also attach the all_files list to our session state, under the key files.

Note that a session represents a single ongoing conversation between a user and the agent system. Session state is like a temporary working storage area associated with the session. Since this session (and its state) is shared across all the agents and tools of the application, we can use it for agent-to-agent and agent-to-tool interactions.
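
Here's a minimal sketch of that pattern (the tool names are hypothetical, not from the project). One tool writes to state; a later tool reads from it, with no arguments passed between them:

from google.adk.tools import ToolContext

def store_topic(topic: str, tool_context: ToolContext) -> dict:
    """Writes a value into session state, where any later tool or agent can see it."""
    tool_context.state["topic"] = topic
    return {"status": "success"}

def recall_topic(tool_context: ToolContext) -> dict:
    """Reads the value back from session state - no arguments needed from the model."""
    return {"status": "success", "topic": tool_context.state.get("topic", "not set")}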

8.3 The Document Summariser Agent

Now, with our directory and file lists stored in our session, we can move on to this stage of the orchestrator agent:

3. **Summarize Files**: Delegate to the `document_summariser_agent` Agent Tool. **CRITICAL: This tool MUST be called with NO arguments.** The `document_summariser_agent` will read the list of files from session state (which was populated by the `discover_files` tool). The `document_summariser_agent` will then return the full set of summaries as JSON with a single key `summaries`, which contains all the path:summary pairs. **Example of correct call:** `document_summariser_agent()`

Note how I've had to be very explicit in my orchestrator prompt. The document summariser agent will retrieve the list of files from session state; it does not expect this list of files to be passed in by the orchestrator. Without the explicit command to call the AgentTool "with NO arguments", I found that the orchestrator would sometimes try to pass the list of files to the AgentTool as a list. This would cause the application to fail.

Now let's look at src/llms_gen_agent/sub_agents/doc_summariser/agent.py:

"""  
Defines a sequential agent responsible for summarizing documents.  

This module contains the `document_summariser_agent`, a `SequentialAgent` that  
orchestrates a two-step process to read and summarize a collection of files.  

The process is as follows:  
1.  **File Reading:** The `file_reader_agent` reads the content of specified  
    files, storing the content in the session state.  
2.  **Content Summarization:** The `content_summariser_agent` takes the  
    collected file content and performs two key tasks:  
    - It generates a concise summary for each individual file.  
    - It generates a higher-level summary for the entire project based on the  
      content of all files.  

The final output is a single JSON object containing both the individual file  
summaries and the overall project summary.  
"""  
from google.adk.agents import Agent, SequentialAgent  
from google.adk.models.llm_response import LlmResponse  
from google.genai.types import GenerateContentConfig, Part  

from llms_gen_agent.config import setup_config, logger  
from llms_gen_agent.schema_types import DocumentSummariesOutput  
from llms_gen_agent.tools import read_files  

config = setup_config()  

file_reader_agent = Agent(  
    name="file_reader_agent",  
    description="An agent that reads the content of multiple files and stores them in session state.",  
    model=config.model,  
    instruction="""You are a specialist in reading files. Your job is to run the `read_files`,   
    which will read a list of files in your session state, and store their contents.  
    IMPORTANT: you should NOT pass any arguments to the `read_files` tool.   
    It will retrieve its data from session state. """,  
    tools=[  
        read_files  
    ]  
)  

content_summariser_prompt = """You are an expert summariser.   
You will summarise the contents of multiple files, and then you will summarise the overall project.  
You will do this work in two phases.  

# Phase 1: File Summarisation  
- Your task is to summarize EACH individual file's content in three sentences or fewer.  
- Do NOT start summaries with text like "This document is about" or "This document provides".  
  Just immediately describe the content. E.g.  
  Rather than this: "This document explains how to configure streaming behavior..."  
  Say this: "Explains how to configure streaming behavior..."  
- If you cannot generate a meaningful summary for a file, use 'No meaningful summary available.' as its summary.  
- Aggregate ALL these individual summaries into a single JSON object.  

# Phase 2: Project Summarisation  
- After summarizing all the files, you MUST also provide an overall project summary, in no more than three paragraphs.   
- The project summary should be a high-level overview of the repository/folder, based on the content of the files.  
- Focus on the content that is helpful for understanding the purpose of the project and the core components.  
- The project summary MUST be stored in the same output JSON object with the key 'project'.   
  This is CRITICAL for the overall understanding of the repository.  

# Output Format  
- The JSON object MUST have a single top-level key called 'summaries', which contains a dictionary.  
- The dictionary contains all the summaries as key:value pairs.  
- For the file summaries: the dictionary keys are the original file paths and values are their respective summaries.  
- For the project summary: the key is `project`. THIS KEY MUST BE PRESENT. The value is the project summary.   
- Example:   
  {{"summaries": {{"/path/to/file1.md":"Summary of file 1.",   
                   "/path/to/file2.md":"Summary of file 2.",  
                   "/path/to/file3.py":"Summary of python file."  
                   "project":"Summary of the project."}} }}  

IMPORTANT: Your final response MUST contain ONLY this JSON object.   
DO NOT include any other text, explanations, or markdown code block delimiters.  

Now I will provide you with the contents of multiple files.   
Note that each file has a unique path and associated content.  

**FILE CONTENTS START:**  
{files_content}  
---  
**FILE CONTENTS END:**  

Now return the JSON object.  
"""  

content_summariser_agent = Agent(  
    name="content_summarizer_agent",  
    description="An agent that summarizes collected file contents and aggregates them.",  
    model=config.model,  
    instruction=content_summariser_prompt,  
    generate_content_config=GenerateContentConfig(  
        temperature=0.5,  
        top_p=1,  
        max_output_tokens=64000  
    ),  
    output_schema=DocumentSummariesOutput, # This is the final output schema  
    output_key="doc_summaries" # json with top level called 'summaries'  
)  

document_summariser_agent = SequentialAgent(  
    name="document_summariser_agent",  
    description="A sequential agent that first reads file contents and then summarizes them.",  
    sub_agents=[  
        file_reader_agent,  
        content_summariser_agent  
    ]  
)

The first thing to note is that the document_summariser_agent is actually a [SequentialAgent](https://google.github.io/adk-docs/agents/workflow-agents/sequential-agents/), which is a type of Workflow Agent. In ADK, Workflow Agents are specialised agents that orchestrate the flow of sub-agents. This allows our sub-agents to be invoked in a deterministic way. And this turns out to be really important!

When I first implemented this sub-agent, I didn't use a SequentialAgent. Instead, I just gave the document_summariser_agent a prompt, telling it to first read all the files, and then summarise them. But this wasn't reliable.

Now, as a SequentialAgent, it must first run the file_reader_agent, and then run the content_summariser_agent. Here, the file_reader_agent is a wrapper around another tool. It's a function called read_files(), which takes a list of file paths, reads all of these files, and stores the content of each file in session state, under the key files_content. The value of this key is itself a dictionary, with one entry per file. For each file, the key is the file path, and the value is the content itself. Note that the read_files() function takes no arguments, other than the ToolContext which provides access to session state. As described before, the function will retrieve its input data from session state.

def read_files(tool_context: ToolContext) -> dict:  
    """Reads the content of files and stores it in the tool context.  

    This function retrieves a list of file paths from the `files` key in the  
    `tool_context.state`. It then iterates through this list, reads the  
    content of each file, and stores it in a dictionary under the  

    `files_content` key in the `tool_context.state`. The file path serves as  
    the key for its content.  

    It avoids re-reading files by checking if the file path already exists  
    in the `files_content` dictionary.  

    Returns:  
        A dictionary with a "status" key indicating the outcome ("success").  
    """  
    logger.debug("Executing read_files")  
    config = setup_config() # dynamically load config  

    file_paths = tool_context.state.get("files", [])  
    logger.debug(f"Got {len(file_paths)} files")  

    # Implement max files constraint  
    if config.max_files_to_process > 0:  
        logger.info(f"Limiting to {config.max_files_to_process} files")  
        file_paths = file_paths[:config.max_files_to_process]  

    # Initialise our session state key      
    tool_context.state["files_content"] = {}  

    response = {"status": "success"}  
    for file_path in file_paths:  
        if file_path not in tool_context.state["files_content"]:  
            try:  
                logger.debug(f"Reading file: {file_path}")  
                with open(file_path) as f:  
                    content = f.read()  
                    logger.debug(f"Read content: {content[:80]}...")  
                    tool_context.state["files_content"][file_path] = content  
            except (FileNotFoundError, PermissionError, UnicodeDecodeError) as e:  
                logger.warning("Could not read file %s: %s", file_path, e)  
                # Store an error message so the summarizer knows it failed  
                tool_context.state["files_content"][file_path] = f"Error: Could not read file. Reason: {e}"  
                response = {"status": "warnings"}  
            except Exception as e:  
                logger.error("An unexpected error occurred while reading %s: %s", file_path, e)  
                tool_context.state["files_content"][file_path] = f"Error: An unexpected error occurred. Reason: {e}"  
                response = {"status": "warnings"}  

    return response

Now that we've read the files and stored their contents, we can move on to the actual summarisation step. This is done by the content_summariser_agent. There are a few noteworthy things about the prompt I've given this agent:

  1. I tell it to first summarise the files, and then summarise the repo/folder.
  2. I found that summaries would often follow the pattern "This is a document about [whatever]". I told the agent not to include any "This is about" style prefixes, but the agent continued to produce summaries this way regardless. So I added examples of what to say and what not to say, and this fixed the issue. (One-shot examples for the win.)
  3. I also have to provide my agent with all of the {file:content} pairs. We can do this directly in the prompt, with a mechanism called key templating. It works like this: if we have a session state key called "foo" (i.e. context.state["foo"]), then we can pass it directly into our prompt by wrapping "foo" in braces. So your prompt would look like: "Do something with {foo}". (See the sketch just below.)
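
Here's a minimal sketch of key templating (a hypothetical agent, not from the project):

from google.adk.agents import Agent

# If session state contains state["foo"] = "bar", ADK substitutes the templated
# key before the prompt is sent, so the model sees "Do something with bar".
templating_demo_agent = Agent(
    name="templating_demo_agent",
    model="gemini-2.5-flash",
    instruction="Do something with {foo}",
)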

The agent definition itself also includes an output_key property. By setting this, we tell the agent to store its return value in session state, under the specified name. So my agent stores its output as a JSON object, under the key doc_summaries.

I was still finding the output format somewhat unpredictable, so I additionally set the agent's output_schema property. I set it to use this custom class:

from pydantic import BaseModel, Field  

class DocumentSummariesOutput(BaseModel):  
    summaries: dict[str, str] = Field(  
        description="A dictionary where keys are file paths and values are their summaries."  
    )

I added this definition to a new Python module, schema_types.py.

By creating this custom class that extends Pydantic's BaseModel, I can force the agent to always create output in the correct JSON format, where the JSON object itself contains a dictionary.

This worked well, but... the agent would sometimes return more than just the JSON object. Sometimes it would prefix the object with something, or wrap the JSON in some sort of marker. When it did this, the next step in the chain would fail.

To mitigate this, I then added an After-Agent-Callback called clean_json_callback:

def clean_json_callback(  
    callback_context: CallbackContext,  
    llm_response: LlmResponse  
) -> LlmResponse | None:  
    """  
    Strips markdown code block delimiters (```json, ```) from the LLM's text response.  
    This callback runs after the model generates content but before output_schema validation.  
    """  
    logger.debug("--- Callback: clean_json_callback running for agent: %s ---", callback_context.agent_name)  

    if llm_response.content and llm_response.content.parts:  
        # Assuming the response is text in the first part  
        if llm_response.content.parts[0].text:  
            original_text = llm_response.content.parts[0].text  
            logger.debug(f"--- Callback: Original LLM response text (first 100 chars): '{original_text[:100]}...'")  

            # Regex to find and remove ```<lang> and ```  
            # re.DOTALL allows . to match newlines, \s* matches any whitespace (including newlines)  
            # (.*?) is a non-greedy match for the content inside the code block  
            match = re.search(r"```(?:\w*\s*)?(.*?)\s*```", original_text, flags=re.DOTALL)  
            if match:  
                cleaned_text = match.group(1).strip()  
                logger.debug(f"--- Callback: Stripped markdown. Cleaned text (first 100 chars): '{cleaned_text[:100]}...'")  
                # Create a new LlmResponse with the cleaned content  
                # Use .model_copy(deep=True) to ensure you're not trying to modify the original immutable object directly  
                new_content = llm_response.content.model_copy(deep=True)  
                if new_content.parts and isinstance(new_content.parts[0], Part):  
                    new_content.parts[0].text = cleaned_text  
                    return LlmResponse(content=new_content)  
                else:  
                    logger.debug("--- Callback: Error: new_content.parts[0] is not a valid Part object after copy. ---")  
                    return llm_response  
            else:  
                logger.debug("--- Callback: No markdown code block found. Returning original response. ---")  
                return llm_response  

    return llm_response # Return the original response if no changes or not applicable

After-agent callbacks run just as you would expect: they run immediately after the agent has returned its response. This is really useful, because it gives us a chance to clean up the response object. In this case, my function just strips off any prefixes or markers wrapping the JSON.
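
For completeness, here's a sketch of how such a callback can be attached. This is my assumed wiring rather than a verbatim excerpt from the project: given its (callback_context, llm_response) signature, clean_json_callback fits ADK's after_model_callback hook, which runs on the model's response before output_schema validation:

content_summariser_agent = Agent(
    name="content_summariser_agent",
    description="An agent that summarizes collected file contents and aggregates them.",
    model=config.model,
    instruction=content_summariser_prompt,
    output_schema=DocumentSummariesOutput,
    output_key="doc_summaries",
    after_model_callback=clean_json_callback,  # strip stray markdown fences before schema validation
)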

8.4 Generating the llms.txt File

Now that we have clean JSON output, we're ready to hand control back to our orchestrator agent, which in turn calls the final tool:

5. **Generate `llms.txt`**: Call the `generate_llms_txt` tool. Provide `repo_path` as an argument. If the user provided an output path, provide it as the `output_path` argument. The tool will determine other required values from session state.

I won't walk through the generate_llms_txt() function in detail, but I will cover a few points that are useful in the context of learning ADK:

def generate_llms_txt(repo_path: str, tool_context: ToolContext, output_path: str = "") -> dict:  
    """Generates a comprehensive llms.txt sitemap file for a given repository.  

    This function orchestrates the creation of an AI/LLM-friendly Markdown file (`llms.txt`)   
    that provides a structured overview of the repository's contents. It includes a project summary,   
    and organizes files into sections based on their directory structure, with a configurable maximum section depth.  

    For each file, it generates a Markdown link with its summary. If a summary is not available,   
    "No summary" is used as a placeholder. Links are generated as GitHub URLs if the repository is   
    detected as a Git repository, otherwise, relative local paths are used.  

    Args:  
        repo_path: The absolute path to the root of the repository to scan.  
        output_path: Optional. The absolute path to save the llms.txt file.   
                     If not provided, it will be saved in a `temp` directory in the current working directory.  

    Other required data is retrieved from tool_context.  

    Returns:  
        A dictionary with:  
        - "status": "success" if the file was generated successfully.  
        - "llms_txt_path": The absolute path to the generated llms.txt file.  
    """  
    logger.debug("Entering generate_llms_txt for repo_path: %s", repo_path)  
    dirs = tool_context.state.get("dirs", [])  
    files = tool_context.state.get("files", [])  
    doc_summaries_full = tool_context.state.get("doc_summaries", {})  
    logger.debug(f"doc_summaries_full (raw from agent) type: {type(doc_summaries_full)}")  

    doc_summaries = doc_summaries_full.get("summaries", {}) # remember, it has one top-level key called `summaries`  
    project_summary = doc_summaries.pop("project", None)  

    ####### More code...

  1. We extract the list of directories and the list of files from session state.
  2. We also extract the doc_summaries dictionary from session state, and then pop the item with the key project.

After that, once we've created the output file, we return the path to this file along with a success code:

tool_context.state["llms_txt_path"] = llms_txt_path  
    return {"status": "success", "llms_txt_path": llms_txt_path}

9. Experimenting and Testing with a Notebook

While building the application, we want to be able to experiment with it and tweak it. A Jupyter notebook is perfect for this. (I have a quick intro to Jupyter notebooks here.)

I created notebooks/generate_llms_experiments.ipynb for this purpose.

My Jupyter notebook

I'll quickly walk you through it. You might find this useful for testing your own agentic models.

9.1 Environment Setup

We start with the imports:

import os  

import vertexai  
from dotenv import load_dotenv  
from google.adk.runners import Runner  
from google.adk.sessions import InMemorySessionService  
from google.auth import default  
from google.genai.types import Content, Part  
from IPython.display import Markdown, display

Then we load environment variables from our .env. The notebook assumes we've already run scripts/setup-env.sh, but it's useful to be able to quickly update these variables from a notebook cell:

dotenv_path = os.path.abspath('../.env')  

if os.path.exists(dotenv_path):  
    print(f"Loading environment variables from: {dotenv_path}")  
    load_dotenv(dotenv_path=dotenv_path)  
else:  
    print(f"Warning: .env file not found at {dotenv_path}")  

staging_project_id = os.getenv("GOOGLE_CLOUD_STAGING_PROJECT")  
if staging_project_id:  
    os.environ["GOOGLE_CLOUD_PROJECT"] = staging_project_id  
    print(f"Set GOOGLE_CLOUD_PROJECT environment variable to: {staging_project_id}")

Now we pull credentials from Google ADC, and initialise Vertex AI:

credentials, project_id = default()  # To use ADC  
vertexai.init(project=staging_project_id, location="europe-west4", credentials=credentials) # type: ignore

9.2 Running the Agent

Set up the constants and helper functions:

from llms_gen_agent.agent import root_agent  

# Session and Runner  
APP_NAME = "generate_llms_client"  
USER_ID="test_user"  
SESSION_ID="test_session"  
TEST_REPO_PATH = "/home/darren/localdev/gcp/adk-docs" # You can change this to any repository you want to test  

async def setup_session_and_runner():  
    session_service = InMemorySessionService()  
    session = await session_service.create_session(app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID)  
    runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)  
    return session, runner  

# Agent Interaction  
async def call_agent_async(query):  
    content = Content(role='user', parts=[Part(text=query)])  
    _, runner = await setup_session_and_runner()  
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)  

    final_response_content = "Final response not yet received."  

    async for event in events:  
        if function_calls := event.get_function_calls():  
            tool_name = function_calls[0].name  
            display(Markdown(f"_Using tool {tool_name}..._"))  
        elif event.actions and event.actions.transfer_to_agent:  
            personality_name = event.actions.transfer_to_agent  
            display(Markdown(f"_Delegating to agent: {personality_name}..._"))  
        elif event.is_final_response() and event.content and event.content.parts:  
            final_response_content = event.content.parts[0].text  

        # For debugging, print the raw type and content to the console  
        print(f"DEBUG: Full Event: {event}")  

    display(Markdown("## Final Message"))  
    display(Markdown(final_response_content))

Note that the logging and output in the call_agent_async(query) function is particularly useful, since it allows us to see the full detail of the events. This is great for debugging.

Finally, we invoke the agent:

query = f"Generate the llms.txt file for the repository at {TEST_REPO_PATH}"  
await call_agent_async(query)

When we run it, we see output like this:

Running our agent from the notebook

10. Running the Agent with the ADK Web UI

Another cool way to test our agent is with the ADK Web UI. The Agent Development Kit has a built-in web UI which we can launch from the command line, and it's great for testing our agents and visualising the interactions between our agents and tools.

We can launch it like this:

uv run adk web --port 8501 src

或者使用make playground命令作为别名。

UI会在您的浏览器中打开,我们可以向协调器代理发出我们的命令。并且我们立即看到所有子代理和工具的交互。酷,对吧?

Using the ADK Web UI

But there is much more we can see from the UI. Such as...

The event stream, with payloads and request/response parameters:

Viewing agent interactions in the ADK Web UI

And our session state:

Inspecting the session in the ADK Web UI

And even tracing, so we can see where most of the time is being spent:

ADK tracing

If you haven't tried the ADK Web UI yet, you should give it a go. It will save you a lot of time!!

11. Handling Rate Limiting

My initial solution was quite inefficient. For example, I used an off-the-shelf LangChain ReadFileTool to read each file individually. This meant the model had to make one tool call per file. (It also needed an after-tool callback to store the file contents in session state.) Consequently, I generated a huge number of Gemini API calls in a short time, which led to 429 "Too Many Requests" errors.

I have since fixed the root cause by writing my own read_files() tool, sketched below. (It was really simple.) But before doing that, I built some rate limiting into my solution.
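For illustration, here is a minimal sketch of what such a tool might look like. The names here (read_files, and the "files_to_read" and "file_contents" state keys) are illustrative assumptions, not necessarily my exact implementation:

# A minimal sketch of a custom read_files tool, assuming the list of files
# to read is already in session state under a hypothetical "files_to_read" key.
from pathlib import Path

from google.adk.tools import ToolContext


def read_files(tool_context: ToolContext) -> dict:
    """Reads all files listed in session state and returns their contents.

    Returns:
        A dict mapping each file path to its text content.
    """
    contents: dict[str, str] = {}
    for file_path in tool_context.state.get("files_to_read", []):
        contents[file_path] = Path(file_path).read_text(encoding="utf-8")

    # A single tool call now reads everything, so we no longer need one
    # model round-trip (plus an after-tool callback) per file.
    tool_context.state["file_contents"] = contents
    return contents

Because the whole set of files is read in one call, the per-file Gemini round-trips (and the bulk of the 429 errors) disappear.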

A convenient way to apply rate limiting with Gemini models is to use google.genai.types.HttpRetryOptions and associate an instance of it with our model. Like this:

from google.adk.agents import Agent
from google.adk.models import Gemini
from google.genai.types import HttpRetryOptions

retry_options = HttpRetryOptions(
    initial_delay=config.backoff_init_delay,
    attempts=config.backoff_attempts,
    exp_base=config.backoff_multiplier,
    max_delay=config.backoff_max_delay
)

### Now, instead of this ###
# content_summariser_agent = Agent(
#     name="content_summariser_agent",
#     description="An agent that summarizes collected file contents and aggregates them.",
#     model=config.model,
#     # remaining agent definition

### We do this ###
content_summariser_agent = Agent(
    name="content_summariser_agent",
    description="An agent that summarizes collected file contents and aggregates them.",
    model=Gemini(
        model=config.model,
        retry_options=retry_options
    ),
    # remaining agent definition

And I defined these exponential backoff values in my .env:

# Exponential backoff parameters for the model API calls  
export BACKOFF_INIT_DELAY=10  
export BACKOFF_ATTEMPTS=5  
export BACKOFF_MAX_DELAY=60  
export BACKOFF_MULTIPLIER=2

These settings mean:

  1. After the first 429 error, wait 10 seconds before retrying.
  2. Retry at most 5 times.
  3. Double the wait time on each retry.
  4. Never wait more than 60 seconds between retries.

These parameters work well for me. Anything lower resulted in retries that consistently failed.
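For completeness, here is a hypothetical sketch of how these variables might be surfaced on the config object referenced above (config.backoff_init_delay and friends); my actual config module may differ:

# Hypothetical sketch: mapping the .env backoff values onto a config object.
import os
from dataclasses import dataclass, field


def _env_int(name: str, default: int) -> int:
    """Reads an integer from the environment, falling back to a default."""
    return int(os.getenv(name, default))


@dataclass
class Config:
    backoff_init_delay: int = field(default_factory=lambda: _env_int("BACKOFF_INIT_DELAY", 10))
    backoff_attempts: int = field(default_factory=lambda: _env_int("BACKOFF_ATTEMPTS", 5))
    backoff_max_delay: int = field(default_factory=lambda: _env_int("BACKOFF_MAX_DELAY", 60))
    backoff_multiplier: int = field(default_factory=lambda: _env_int("BACKOFF_MULTIPLIER", 2))


config = Config()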

12. Adding a Front-End CLI

Having tested the agent with the client code in the Jupyter notebook, porting it to a command-line utility was fairly straightforward. I decided to use Python's Typer package, because it does a lot of the work for me, such as argument parsing and setting up the command-line help.

I started by creating a /client_fe/runner.py module and porting most of the client code over from the notebook as-is:

"""  
This module is responsible for running the llms-gen agent.  

It sets up the necessary session and runner from the `google-adk` library,  
and then invokes the agent with the user's query. It also handles the   
streaming of events from the agent and displays the final response.  
"""  
from google.adk.runners import Runner  
from google.adk.sessions import InMemorySessionService  
from google.genai.types import Content, Part  
from rich.console import Console  

from llms_gen_agent.agent import root_agent  

APP_NAME = "generate_llms_client"  
USER_ID = "cli_user"  
SESSION_ID = "cli_session"  

console = Console()  

async def setup_session_and_runner():  
    session_service = InMemorySessionService()  
    await session_service.create_session(  
        app_name=APP_NAME, user_id=USER_ID, session_id=SESSION_ID  
    )  
    runner = Runner(agent=root_agent, app_name=APP_NAME, session_service=session_service)  
    return runner  

async def call_agent_async(query: str) -> None:  
    content = Content(role="user", parts=[Part(text=query)])  
    runner = await setup_session_and_runner()  
    events = runner.run_async(user_id=USER_ID, session_id=SESSION_ID, new_message=content)  

    final_response_content = "Final response not yet received."  

    async for event in events:  
        if function_calls := event.get_function_calls():  
            tool_name = function_calls[0].name  
            console.print(f"\n[blue][bold italic]Using tool {tool_name}...[/bold italic][/blue]")  
        elif event.actions and event.actions.transfer_to_agent:  
            personality_name = event.actions.transfer_to_agent  
            console.print(f"\n[blue][bold italic]Delegating to agent: {personality_name}...[/bold italic][/blue]")  
        elif event.is_final_response() and event.content and event.content.parts:  
            final_response_content = event.content.parts[0].text  

    console.print("\n\n[yellow][bold]Final Message from Agent[/bold][/yellow]")  
    console.print(f"[yellow]{final_response_content}[/yellow]")

Note the use of console.print from the rich library. This library makes it easy to produce much nicer console output.

Next, I created a separate client_fe/cli.py module to handle user input and invoke the Runner:

"""  
This module provides the command-line interface (CLI) for the LLMS-Generator application.  

The primary function of this module is to accept a repository path from the user and initiate the  
llms.txt generation process by calling the `call_agent_async` function from the `runner` module.  

Usage:  
    `llms-gen --repo-path /path/to/your/repo [--output-path /path/to/llms.txt] [--log-level DEBUG]`  
"""  
import asyncio  
import os  

# Load .env before other imports  
from dotenv import find_dotenv, load_dotenv  

# recursively search upwards to find .env, and update vars if they exist  
if not load_dotenv(find_dotenv(), override=True):  
    raise ValueError("No .env file found. Exiting.")  

import typer  
from rich.console import Console  

from client_fe.runner import call_agent_async  
from common_utils.logging_utils import setup_logger  
from llms_gen_agent.config import current_config  

app = typer.Typer(add_completion=False)  
console = Console()  

@app.command()  
def generate(  
    repo_path: str = typer.Option(  
        ..., # this means: required  
        "--repo-path",  
        "-r",  
        help="The absolute path to the repository/folder to generate the llms.txt file for.",  
    ),  
    output_path: str = typer.Option(  
        None, # Optional  
        "--output-path",  
        "-o",  
        help=("The absolute path to save the llms.txt file. If not specified, "  
              "it will be saved in a `temp` directory in the current working directory.")  
    ),  
    log_level: str = typer.Option(  
        None,  
        "--log-level",  
        "-l",  
        help="Set the log level for the application. This will override any LOG_LEVEL environment variable."  
    ),  
    max_files_to_process: int = typer.Option(  
        None,  
        "--max-files-to-process",  
        "-m",  
        help="Set the maximum number of files to process. 0 means no limit."  
    )  
 ):  
    """  
    Generate the llms.txt file for a given repository.  
    """  

    if log_level: # Override log level from cmd line  
        os.environ["LOG_LEVEL"] = log_level.upper()  
        console.print(f":exclamation: Overriding LOG_LEVEL: [bold cyan]{log_level}[/bold cyan]")  
        setup_logger(current_config.agent_name)  

    if max_files_to_process: # Override max files to process from cmd line  
        os.environ["MAX_FILES_TO_PROCESS"] = str(max_files_to_process)  
        current_config.invalidate()  
        console.print(f":exclamation: Overriding MAX_FILES_TO_PROCESS: [bold cyan]{max_files_to_process}[/bold cyan]")  

    console.print(f":robot: Generating llms.txt for repository at: [bold cyan]{repo_path}[/bold cyan]")  
    query = f"Generate the llms.txt file for the repository at {repo_path}"  
    if output_path:  
        query += f" and save it to {output_path}"  

    # Show a spinner  
    with console.status("[bold green]Generating file content... [/bold green]"):  
        asyncio.run(call_agent_async(query))  

    console.print("[bold green]:white_check_mark: llms.txt generation complete.[/bold green]")  

if __name__ == "__main__":  
    app()

This is easy to follow. Note that the __main__ entry point calls app(), which is an instance of typer.Typer. We then use the app.command() decorator to hook the generate() function into this Typer app. In the generate() function you can see how easy it is to set up command-line arguments with Typer. For example, if you now run the llms-gen CLI command with no arguments, you get this:

Missing options at the command line

And if we pass the --help argument to the program, we get this:

With the --help argument

Simple!

You might be wondering how we can invoke cli.py just by typing llms-gen at the command line. This is achieved with a little pyproject.toml magic. Just add the following to the toml file:

[project.scripts]  
llms-gen = "client_fe.cli:app"

This creates a CLI tool called llms-gen, bound to the app object in the client_fe/cli.py module.
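Under the hood, the entry point that the build backend generates is roughly equivalent to this (a sketch; you never write this file yourself):

# Rough equivalent of the generated llms-gen console script.
from client_fe.cli import app

if __name__ == "__main__":
    app()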

So instead of running:

python src/client_fe/cli.py

we can simply run:

llms-gen

Cool!

13. Let's See It in Action!

We need to supply a repo path. We can do that like this:

llms-gen --repo-path "/home/darren/localdev/gcp/adk-docs"

But for convenience, I also created a target in my Makefile:

generate:  
  @echo "Running LLMS-Generator..."  
  uv run llms-gen --repo-path "/home/darren/localdev/gcp/adk-docs"

So we can launch it like this:

make generate

And this is what it looks like:

Running the CLI

Looking good!

14. What About the Output?

It created /temp/llms.txt, with content like this:

# adk-docs Sitemap  

The Agent Development Kit (ADK) is a comprehensive, open-source framework designed  
to streamline the development, evaluation, and deployment of AI agents in   
both Python and Java. It emphasizes a code-first approach, offering modularity,   
flexibility, and compatibility with various models and deployment environments   
like Google Cloud's Vertex AI Agent Engine, Cloud Run, and GKE. ADK provides   
core primitives such as Agents (LLM, Workflow, Custom), Tools   
(Function, Built-in, Third-Party, OpenAPI, Google Cloud, MCP), Callbacks,   
Session Management, Memory, Artifact Management, and a robust Runtime with an   
Event Loop for orchestrating complex workflows.   

The documentation covers a wide array of functionalities, from basic agent   
creation and local testing using the Dev UI and CLI, to advanced topics   
like multi-agent system design, authentication for tools, and performance   
optimization through parallel execution. It also delves into crucial aspects   
of agent development such as observability with integrations like AgentOps,   
Cloud Trace, Phoenix, and Weave, and implementing strong safety and security   
guardrails using callbacks and plugins. Furthermore, ADK supports grounding   
capabilities with Google Search and Vertex AI Search for accurate, real-time   
information retrieval.  

Overall, the ADK aims to empower developers to build sophisticated,   
context-aware, and reliable AI applications. The provided tutorials and   
quickstarts guide users through progressive learning, enabling them to master   
agent composition, state management, and deployment strategies for diverse   
use cases, ensuring agents can operate effectively and securely in production   
environments.  

## Home  

- [CONTRIBUTING.md](https://github.com/google/adk-docs/blob/main/CONTRIBUTING.md):  
  Provides guidelines for contributing to the project, including signing a Contributor License Agreement, reviewing community guidelines, and setting up the development environment. It details the contribution workflow, from finding tasks to code reviews and merging pull requests.  
- [README.md](https://github.com/google/adk-docs/blob/main/README.md):  
  Introduces the Agent Development Kit (ADK) as an open-source, code-first framework for building, evaluating, and deploying AI agents. It highlights key features such as a rich tool ecosystem, modular multi-agent systems, tracing, monitoring, and flexible deployment options. Installation instructions for Python and Java are provided, along with links to documentation and contributing guidelines.  

## Docs  

- [community.md](https://github.com/google/adk-docs/blob/main/docs/community.md):  
  Lists community resources for the Agent Development Kit, including translations of the ADK documentation in Chinese, Korean, and Japanese. It also features community-written tutorials, guides, blog posts, and videos showcasing ADK features and use cases. The document encourages contributions of new resources and directs users to the contributing guidelines.  
- [contributing-guide.md](https://github.com/google/adk-docs/blob/main/docs/contributing-guide.md):  
  Details how to contribute to the Agent Development Kit, covering contributions to core Python/Java frameworks and documentation. It outlines prerequisites like signing a Contributor License Agreement and reviewing community guidelines. The guide also explains how to report issues, suggest enhancements, improve documentation, and contribute code via pull requests.  
- [index.md](https://github.com/google/adk-docs/blob/main/docs/index.md):  
  Describes the Agent Development Kit (ADK) as a flexible, modular, and model-agnostic framework for developing and deploying AI agents, optimized for Google's ecosystem. It outlines key features including flexible orchestration, multi-agent architecture, a rich tool ecosystem, deployment readiness, built-in evaluation, and safety features. Provides installation commands for Python and Java, along with links to quickstarts, tutorials, API references, and contribution guides.  

## Docs A2A  

- [index.md](https://github.com/google/adk-docs/blob/main/docs/a2a/index.md):  
  Introduces the Agent2Agent (A2A) Protocol within ADK, designed for building complex multi-agent systems where agents collaborate securely and efficiently. It provides links to guides on the fundamentals of A2A, quickstarts for exposing and consuming remote agents, and the official A2A Protocol website.  
- [intro.md](https://github.com/google/adk-docs/blob/main/docs/a2a/intro.md):  
  Introduces the Agent2Agent (A2A) Protocol for building collaborative multi-agent systems in ADK, contrasting it with local sub-agents. It outlines when to use A2A (separate services, different teams/languages, formal contracts) versus local sub-agents (internal organization, performance-critical operations, shared memory). The document visualizes the A2A workflow for exposing and consuming agents, and provides a concrete customer service example.  
- [quickstart-consuming.md](https://github.com/google/adk-docs/blob/main/docs/a2a/quickstart-consuming.md):  
  Provides a quickstart guide on consuming a remote agent via the Agent-to-Agent (A2A) Protocol within ADK. It demonstrates how a local root agent can use a remote prime agent hosted on a separate A2A server. The guide covers setting up dependencies, starting the remote server, understanding agent cards, and running the main consuming agent.  
- [quickstart-exposing.md](https://github.com/google/adk-docs/blob/main/docs/a2a/quickstart-exposing.md):  
  Provides a quickstart guide on exposing an ADK agent as a remote agent via the Agent-to-Agent (A2A) Protocol. It details two methods: using the `to_a2a()` function for automatic agent card generation and `uvicorn` serving, or manually creating an agent card for `adk api_server --a2a`. The guide includes steps for setting up dependencies, starting the remote server, verifying its operation, and running a consuming agent.  

## Docs Agents  

- [config.md](https://github.com/google/adk-docs/blob/main/docs/agents/config.md):  
  Introduces the experimental Agent Config feature in ADK, allowing users to build and run agents using YAML text files without writing code. It outlines the setup process, including installing ADK Python libraries and configuring Gemini API access via environment variables. The document provides examples for building agents with built-in tools, custom tools, and sub-agents, and discusses deployment options and current limitations.  
- [custom-agents.md](https://github.com/google/adk-docs/blob/main/docs/agents/custom-agents.md):  
  Explains how to create custom agents in ADK by inheriting from BaseAgent and implementing `_run_async_impl` (or `runAsyncImpl` in Java) for arbitrary orchestration logic. It highlights use cases for custom agents, such as conditional logic, complex state management, external integrations, and dynamic agent selection, which go beyond predefined workflow patterns. The document provides a detailed example of a StoryFlowAgent to illustrate multi-stage content generation with conditional regeneration.  
- [index.md](https://github.com/google/adk-docs/blob/main/docs/agents/index.md):  
  Provides an overview of agents in the Agent Development Kit (ADK), defining an agent as a self-contained execution unit for specific goals. It categorizes agents into LLM Agents (for dynamic reasoning), Workflow Agents (for structured execution flow), and Custom Agents (for unique logic). The document emphasizes that combining these agent types in multi-agent systems allows for sophisticated and collaborative AI applications.  
- [llm-agents.md](https://github.com/google/adk-docs/blob/main/docs/agents/llm-agents.md):  
  Describes `LlmAgent` as a core component in ADK for AI agent reasoning, response generation, and tool interaction. It covers defining an agent's identity, guiding its behavior with `instruction` parameters, and equipping it with `tools`. The document also details advanced configurations such as `generate_content_config` for LLM response tuning, `input_schema`/`output_schema` for structured data, context management with `include_contents`, and the use of planners and code executors.  
- [models.md](https://github.com/google/adk-docs/blob/main/docs/agents/models.md):  
  Explains how to integrate various Large Language Models (LLMs) with the Agent Development Kit (ADK), supporting both Google Gemini and Anthropic models, with future plans for more. It details model integration via direct string/registry for Google Cloud models and wrapper classes for external models like those accessed through LiteLLM. Provides comprehensive authentication setups for Google AI Studio, Vertex AI, and outlines how to use LiteLLM for other cloud or local models.  
- [multi-agents.md](https://github.com/google/adk-docs/blob/main/docs/agents/multi-agents.md):  
  Explains how to build multi-agent systems in ADK by composing multiple `BaseAgent` instances for enhanced modularity and specialization. It details ADK primitives for agent composition, including agent hierarchy (parent/sub-agents) and workflow agents (`SequentialAgent`, `ParallelAgent`, `LoopAgent`) for orchestration. The document also covers interaction and communication mechanisms like shared session state, LLM-driven delegation (agent transfer), and explicit invocation using `AgentTool`.  
- [index.md](https://github.com/google/adk-docs/blob/main/docs/agents/workflow-agents/index.md):  
  Introduces workflow agents in ADK as specialized components for orchestrating the execution flow of sub-agents. It explains that these agents operate based on predefined, deterministic logic, unlike LLM Agents. Three core types are presented: Sequential Agents (execute in order), Loop Agents (repeatedly execute until a condition is met), and Parallel Agents (execute concurrently).  
- [loop-agents.md](https://github.com/google/adk-docs/blob/main/docs/agents/workflow-agents/loop-agents.md):  
  Describes the `LoopAgent` in ADK, a workflow agent that iteratively executes its sub-agents. It highlights its use for repetitive or iterative refinement workflows, such as document improvement. The document explains that `LoopAgent` is deterministic and relies on termination mechanisms like `max_iterations` or escalation signals from sub-agents to prevent infinite loops.  
- [parallel-agents.md](https://github.com/google/adk-docs/blob/main/docs/agents/workflow-agents/parallel-agents.md):  
  Describes the `ParallelAgent` in ADK, a workflow agent that executes its sub-agents concurrently to improve performance for independent tasks. It explains that sub-agents operate in independent branches without automatic shared history, requiring explicit communication or external state management if data sharing is needed. The document provides examples like parallel web research to illustrate its use for concurrent data retrieval or computations.  
- [sequential-agents.md](https://github.com/google/adk-docs/blob/main/docs/agents/workflow-agents/sequential-agents.md):  
  Details the `SequentialAgent` in ADK, a workflow agent that executes its sub-agents in a fixed, strict order. It emphasizes its deterministic nature, making it suitable for workflows requiring predictable execution paths. The document illustrates its use with a code development pipeline example, where output from one sub-agent is passed to the next via shared session state.  

## Docs Get-Started  

- [about.md](https://github.com/google/adk-docs/blob/main/docs/get-started/about.md):  
  Provides an overview of the Agent Development Kit (ADK), an open-source, code-first toolkit for building, evaluating, and deploying AI agents. It outlines core concepts like Agents, Tools, Callbacks, Session Management, Memory, Artifact Management, Code Execution, Planning, Models, Events, and the Runner. The document highlights key capabilities such as multi-agent system design, a rich tool ecosystem, flexible orchestration, integrated developer tooling, native streaming support, built-in evaluation, broad LLM support, extensibility, and state/memory management.  
- [index.md](https://github.com/google/adk-docs/blob/main/docs/get-started/index.md):  
  Serves as a starting point for the Agent Development Kit (ADK), designed to help developers build, manage, evaluate, and deploy AI-powered agents. It provides links to installation guides, quickstarts for basic and streaming agents, a multi-agent tutorial, and sample agents. The page also offers an "About" section to learn core components of ADK.  
- [installation.md](https://github.com/google/adk-docs/blob/main/docs/get-started/installation.md):  
  Provides instructions for installing the Agent Development Kit (ADK) for both Python and Java. For Python, it recommends creating and activating a virtual environment using `venv` before installing via `pip`. For Java, it outlines adding `google-adk` and `google-adk-dev` dependencies using Maven or Gradle.  
- [quickstart.md](https://github.com/google/adk-docs/blob/main/docs/get-started/quickstart.md):  
  Provides a quickstart guide for the Agent Development Kit (ADK), detailing how to set up the environment, install ADK, create a basic agent project with multiple tools, and configure model authentication. It demonstrates running the agent locally using the terminal (`adk run`), the interactive web UI (`adk web`), or as an API server (`adk api_server`). The guide includes examples for defining tools and integrating them into an agent.  
- [index.md](https://github.com/google/adk-docs/blob/main/docs/get-started/streaming/index.md):  
  Introduces streaming quickstarts for the Agent Development Kit (ADK), emphasizing real-time interactive experiences through live voice conversations, tool use, and continuous updates. It clarifies that this section focuses on bidi-streaming (live) rather than token-level streaming. The page provides links to Python and Java quickstarts for streaming capabilities, custom audio streaming app samples (SSE and WebSockets), a bidi-streaming development guide series, and a related blog post.  
- [quickstart-streaming-java.md](https://github.com/google/adk-docs/blob/main/docs/get-started/streaming/quickstart-streaming-java.md):  
  Provides a quickstart guide for building a basic agent with ADK Streaming in Java, focusing on low-latency, bidirectional voice interactions. It covers setting up the Java/Maven environment, creating a `ScienceTeacherAgent`, testing text-based streaming with the Dev UI, and enabling live audio communication. The guide includes code examples for agent definition, `pom.xml` configuration, and a custom live audio application.  
- [quickstart-streaming.md](https://github.com/google/adk-docs/blob/main/docs/get-started/streaming/quickstart-streaming.md):  
  Provides a quickstart guide for ADK Streaming in Python, demonstrating how to create a simple agent and enable low-latency, bidirectional voice and video communication. It covers environment setup, ADK installation, agent project structure with a Google Search tool, and platform configuration (Google AI Studio or Vertex AI). The guide shows how to test the agent using `adk web` for interactive UI, `adk run` for terminal interaction, and `adk api_server` for API exposure, and notes supported models for live API.  
- [testing.md](https://github.com/google/adk-docs/blob/main/docs/get-started/testing.md):  
  Explains how to test ADK agents in a local development environment using the ADK API server. It provides commands for launching the server, creating sessions, and sending queries via `curl` for both single-response and streaming endpoints. The document also discusses integrations with third-party observability tools and options for deploying agents.  

## Docs Grounding  

- [google_search_grounding.md](https://github.com/google/adk-docs/blob/main/docs/grounding/google_search_grounding.md):  
  Describes Google Search Grounding in ADK, a feature allowing AI agents to access real-time web information for more accurate responses. It details quick setup, grounding architecture (user query, LLM tool-calling, grounding service interaction, context injection), and response structure including metadata for source attribution. The guide provides best practices for displaying search suggestions and citations, emphasizing the tool's value for time-sensitive queries.  
- [vertex_ai_search_grounding.md](https://github.com/google/adk-docs/blob/main/docs/grounding/vertex_ai_search_grounding.md):  
  Explains Vertex AI Search Grounding in ADK, a feature enabling AI agents to access private enterprise documents for grounded responses. It covers quick setup, grounding architecture (data flow, LLM analysis, document retrieval, context injection), and response structure including metadata for citations. The guide emphasizes best practices for displaying citations and outlines the process for integrating and using Vertex AI Search with ADK agents.  

## Docs Mcp  

- [index.md](https://github.com/google/adk-docs/blob/main/docs/mcp/index.md):  
  Introduces the Model Context Protocol (MCP) as an open standard for LLMs to communicate with external applications, data sources, and tools. It explains MCP's client-server architecture and how ADK facilitates using and exposing MCP tools. The document highlights MCP Toolbox for Databases, an open-source server for securely exposing various data sources as Gen AI tools, and mentions MCP Servers for Google Cloud Genmedia.  

(rest of the file...)

Perfect!

I've trimmed the output file for the purposes of this article.

15. Using the llms.txt to Help Me with the ADK

I've already described much of this setup in the earlier sections.

I started by cloning my pre-made ADK-Docs extension into my global Gemini CLI ~/.gemini/extensions folder. Then I updated gemini-extension.json to point to my newly created llms.txt file, which I renamed and moved into my local adk-docs folder:

{  
  "name": "adk-docs-ext",  
  "version": "1.0.0",  
  "mcpServers": {  
    "adk-docs-mcp": {  
      "command": "uvx",  
      "args": [  
        "--from",  
        "mcpdoc",  
        "mcpdoc",  
        "--urls",  
        "/home/darren/localdev/gcp/adk-docs/dazbo-adk-llms.txt",  
        "--transport",  
        "stdio"  
      ]  
    }  
  },  
  "contextFileName": "GEMINI.md"  
}

Now we can test it in the Gemini CLI. I ask:

How can I learn about the ADK?

Gemini CLI responds:

Asking Gemini CLI how to learn about the ADK

And then:

Gemini CLI can read my llms.txt!

Woohoo!! It's all working!

Very exciting!

Maybe the folks at ADK-Docs will like this. I'll reach out to them.

If you would like to use this llms.txt to help with your own ADK work, you can download the full version here.

16. How Long Did All This Take?

While I consider myself a 'just about competent' developer, this is not my day job. So a strong developer who writes code every day would probably be much faster. For me, designing and building this application has taken about two days so far (mostly over a weekend). But when I build my next ADK application, I know I will be faster, because:

  1. I now have this hands-on experience
  2. Gemini CLI will use my ADK llms.txt to help me work more efficiently!

17. Multi-Agent ADK Tips and Lessons Learned

A quick summary of the useful takeaways...

  • If you're building a multi-agent system, make sure you design it up front. Decide what each agent will do and what it is responsible for. Decide which tools each agent will use.
  • It should go without saying, but prompts need to be very specific! Leave no room for ambiguity, and one-shot prompting (including an example in the prompt) can be very beneficial.
  • Agent descriptions are really important, especially for sub-agents. They help your orchestrator agent delegate tasks appropriately. The agent description is an attribute you set when defining the agent.
  • Likewise, tool descriptions matter. When you create custom tools, you provide the description in the function docstring.
  • Your custom tool function docstrings should also be very clear about the parameters the function expects; but do not mention the ToolContext parameter in the docstring. That one is injected implicitly by the ADK framework.
  • Be explicit about how you expect agents (and their tools) to return data or add it to session state. Returning data with the output_key attribute is convenient: any object returned using output_key is automatically added to session state.
  • If you need output in a very specific format, consider setting the agent's output_schema attribute and passing in a Pydantic schema; see the sketch after this list.
  • Beware of adding data to a session state key in a tool while also returning an object via output_key with the same name. The latter will overwrite the former.
  • If your agent uses a tool that should take its data from session state rather than from input parameters, you may need to explicitly tell the agent not to pass any arguments to that tool.
  • You can automatically inject any session state key into an agent prompt by using curly braces, e.g. "Tell me about {my_key}".
  • Use workflow agents when you want to control agent ordering deterministically.
  • After-callbacks for tools and agents are a great way to clean up response data, ensuring it is in the right format for the next step.
  • Jupyter notebooks are a great way to iterate and experiment with your agents.
  • The ADK Web UI is a fantastic tool for understanding in detail what your agents are doing. It is far easier than wading through debug logs!!
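To make a few of these tips concrete, here is a minimal sketch that ties them together. The names (summary_agent, the Summary schema, the {file_contents} state key, and the model string) are illustrative assumptions, not code from my repo:

# Minimal sketch combining agent description, state injection via braces,
# a Pydantic output_schema, and output_key. All names are illustrative.
from google.adk.agents import Agent
from pydantic import BaseModel


class Summary(BaseModel):
    """Schema that the agent's output must conform to."""
    title: str
    summary: str


summary_agent = Agent(
    name="summary_agent",
    # The description helps an orchestrator agent delegate appropriately.
    description="Summarises the file contents collected in session state.",
    model="gemini-2.5-flash",  # illustrative model name
    # {file_contents} injects that session state key into the prompt.
    instruction="Produce a concise summary of the following content: {file_contents}",
    output_schema=Summary,  # forces structured output matching the schema
    output_key="summary",   # the result is written back to session state
)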

Original article: Give Your AI Agents Deep Understanding
