Module-6: 记忆系统（Memory Systems）- 本章介绍

致学习者的一封信
欢迎来到 Module-6！作为在人工智能领域耕耘数十年的研究者，我见证了 AI 系统从简单的规则引擎演变为今天具备复杂认知能力的智能体。而在这个演变过程中，记忆系统始终是核心挑战之一。
想象一下：如果 AI 助手每次对话都像第一次见面，它如何能真正理解你的需求？如何提供个性化的服务？如何在长期交互中不断改进？这就是为什么我们需要系统化的记忆机制。
在本章中，我们将从认知科学的角度出发,深入探讨如何为 LangGraph Agent 构建短期记忆和长期记忆系统。你将学会如何让 AI 不仅记住事实,还能结构化地组织和更新这些信息。更重要的是,你将理解何时使用何种记忆模式——这是区分优秀工程师和卓越架构师的关键。
记住：好的记忆系统不是存储所有东西,而是智能地保存、组织和检索关键信息。让我们开始这段激动人心的旅程！
—— 你的 LangGraph 导师

一、本章学习目标

完成本章学习后,你将能够：

核心能力

记忆系统架构
- 理解短期记忆（Checkpointer）与长期记忆（Store）的本质区别
- 掌握 LangGraph Memory Store 的三个核心概念：namespace、key、value
- 设计适合不同场景的记忆架构
结构化记忆管理
- 使用 Pydantic 定义和验证记忆 Schema
- 区分 Profile（单一对象）和 Collection（记忆集合）的使用场景
- 实现增量更新而非完全重写的高效记忆管理
Trustcall 集成
- 使用 Trustcall 从对话中提取结构化信息
- 理解 JSON Patch 机制和 enable_inserts 参数
- 构建能够自主决策何时保存记忆的智能 Agent
实战应用
- 构建跨会话的个性化聊天机器人
- 实现多类型记忆的协同管理（Profile + Collection + Instructions）
- 优化记忆系统的性能和可扩展性

认知升级

从存储到智能：超越简单的键值对存储，理解记忆的语义组织
从静态到动态：掌握记忆的增量更新和智能合并
从单一到多元：学会为不同类型的信息选择合适的记忆模式

二、核心概念预览

2.1 记忆的认知科学基础

在开始技术实现之前，让我们先理解记忆在认知科学中的分类：

人类记忆系统
├── 短期记忆（Short-term Memory）
│   ├── 工作记忆（Working Memory）
│   └── 容量有限（7±2 个项目）
│
└── 长期记忆（Long-term Memory）
    ├── 陈述性记忆（Declarative Memory）
    │   ├── 语义记忆（Semantic）── 事实和知识
    │   └── 情节记忆（Episodic）── 事件和经历
    │
    └── 程序性记忆（Procedural Memory）── 技能和习惯

在 LangGraph 中的映射：

认知科学概念	LangGraph 实现	技术组件	用途
短期记忆	Within-Thread Memory	MemorySaver (Checkpointer)	单次会话的对话历史
语义记忆	Profile / Collection	InMemoryStore	用户资料、待办事项
程序性记忆	Instructions	InMemoryStore	用户偏好、系统指令

2.2 LangGraph 记忆系统架构

┌─────────────────────────────────────────────────────────────┐
│                     Memory Agent                             │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐ │
│  │  短期记忆层（Session Layer）                           │ │
│  │  ┌──────────────────────────────────────────────────┐  │ │
│  │  │  Checkpointer (MemorySaver)                      │  │ │
│  │  │  • 对话历史（messages）                          │  │ │
│  │  │  • 当前状态（state）                             │  │ │
│  │  │  • 生命周期：单次会话                            │  │ │
│  │  └──────────────────────────────────────────────────┘  │ │
│  └────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐ │
│  │  长期记忆层（Persistence Layer）                       │ │
│  │  ┌──────────────────────────────────────────────────┐  │ │
│  │  │  Store (InMemoryStore / PostgresStore)           │  │ │
│  │  │                                                   │  │ │
│  │  │  Namespace: ("user_123", "profile")              │  │ │
│  │  │    ├─ Key: "user_profile"                        │  │ │
│  │  │    └─ Value: {name, location, interests...}      │  │ │
│  │  │                                                   │  │ │
│  │  │  Namespace: ("user_123", "memories")             │  │ │
│  │  │    ├─ Key: uuid-1 → {content: "..."}             │  │ │
│  │  │    ├─ Key: uuid-2 → {content: "..."}             │  │ │
│  │  │    └─ Key: uuid-3 → {content: "..."}             │  │ │
│  │  │                                                   │  │ │
│  │  │  Namespace: ("user_123", "instructions")         │  │ │
│  │  │    └─ Key: "preferences" → {instructions...}     │  │ │
│  │  │                                                   │  │ │
│  │  │  • 生命周期：持久化（跨所有会话）                │  │ │
│  │  └──────────────────────────────────────────────────┘  │ │
│  └────────────────────────────────────────────────────────┘ │
│                                                              │
│  ┌────────────────────────────────────────────────────────┐ │
│  │  智能决策层（Intelligence Layer）                      │ │
│  │  • Trustcall：结构化数据提取                           │ │
│  │  • Conditional Edges：路由逻辑                         │ │
│  │  • ReAct Pattern：推理-行动循环                        │ │
│  └────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

2.3 Store 的三个核心概念

LangGraph Store 使用三元组来组织数据：

Namespace（命名空间）- 分层组织

python

# 元组形式的层级结构
("user_1", "profile")        # 用户1的资料
("user_1", "memories")       # 用户1的记忆集合
("user_2", "profile")        # 用户2的资料
("project", "alpha", "settings")  # 多层嵌套

类比：就像文件系统的目录路径

/user_1/profile/
/user_1/memories/
/user_2/profile/

Key（键）- 唯一标识

python

# Profile 场景：使用固定键
key = "user_profile"

# Collection 场景：使用 UUID
key = str(uuid.uuid4())  # "e1c4e5ab-ab0f-4cbb-822d-f29240a983af"

Value（值）- 必须是字典

python

# Profile
value = {
    "user_name": "Lance",
    "user_location": "San Francisco",
    "interests": ["biking", "bakeries"]
}

# Collection
value = {
    "content": "Lance likes biking in San Francisco."
}

2.4 Profile vs Collection：何时使用哪种模式？

这是本章最重要的架构决策之一：

Profile 模式（单一结构化对象）
┌─────────────────────────────────┐
│  用户资料                        │
│  ┌───────────────────────────┐  │
│  │ user_name: "Lance"        │  │
│  │ user_location: "SF"       │  │
│  │ interests: [...]          │  │
│  │ connections: [...]        │  │
│  └───────────────────────────┘  │
│  • 固定字段                      │
│  • 增量更新（JSON Patch）        │
│  • 适合结构化信息                │
└─────────────────────────────────┘

Collection 模式（记忆集合）
┌─────────────────────────────────┐
│  记忆集合                        │
│  ┌───────────────────────────┐  │
│  │ uuid-1: {content: "..."}  │  │
│  │ uuid-2: {content: "..."}  │  │
│  │ uuid-3: {content: "..."}  │  │
│  │ ...可无限扩展              │  │
│  └───────────────────────────┘  │
│  • 灵活结构                      │
│  • 独立项目（增删改）            │
│  • 适合开放式信息                │
└─────────────────────────────────┘

决策树：

mermaid

graph TD
    A[需要保存信息] --> B{信息是否有固定结构?}
    B -->|是| C[使用 Profile]
    B -->|否| D[使用 Collection]
    
    C --> E{示例}
    E --> E1[用户资料]
    E --> E2[系统配置]
    E --> E3[认证信息]
    
    D --> F{示例}
    F --> F1[对话记录]
    F --> F2[学习笔记]
    F --> F3[待办事项]
    F --> F4[用户反馈]

对比表：

维度	Profile	Collection
数据结构	固定字段的单一对象	独立项目的集合
键策略	固定键（如 "user_profile"）	UUID 键（每项唯一）
更新方式	JSON Patch（增量更新）	新增/修改/删除独立项
Schema	`UserProfile(BaseModel)`	`Memory(BaseModel)`
Trustcall 参数	`enable_inserts=False`（默认）	`enable_inserts=True`
适用场景	姓名、年龄、地址等结构化信息	笔记、任务、事件等开放式信息
扩展性	需修改 Schema	无限扩展

代码示例对比：

python

# ========== Profile 模式 ==========
from pydantic import BaseModel, Field
from typing import List, Optional

class UserProfile(BaseModel):
    """用户资料 Schema"""
    user_name: str = Field(description="用户的首选名称")
    user_location: Optional[str] = Field(description="用户的位置", default=None)
    interests: List[str] = Field(description="用户的兴趣列表", default_factory=list)

# 创建 Trustcall 提取器（Profile）
profile_extractor = create_extractor(
    model,
    tools=[UserProfile],
    tool_choice="UserProfile"
    # enable_inserts=False (默认)
)

# 保存到 Store（固定键）
namespace = ("user_1", "profile")
key = "user_profile"  # 固定键
value = {
    "user_name": "Lance",
    "user_location": "San Francisco",
    "interests": ["biking", "bakeries"]
}
store.put(namespace, key, value)

# ========== Collection 模式 ==========
class Memory(BaseModel):
    """记忆项 Schema"""
    content: str = Field(description="记忆的主要内容")

# 创建 Trustcall 提取器（Collection）
memory_extractor = create_extractor(
    model,
    tools=[Memory],
    tool_choice="Memory",
    enable_inserts=True  # 关键！允许插入新项
)

# 保存到 Store（UUID 键）
import uuid

namespace = ("user_1", "memories")

# 第一条记忆
key1 = str(uuid.uuid4())
value1 = {"content": "Lance likes biking."}
store.put(namespace, key1, value1)

# 第二条记忆
key2 = str(uuid.uuid4())
value2 = {"content": "Lance visited Tartine bakery."}
store.put(namespace, key2, value2)

三、Trustcall：结构化数据提取的利器

3.1 为什么需要 Trustcall？

挑战：从非结构化对话中提取结构化信息

用户说："我叫 Lance，住在旧金山，喜欢骑自行车和逛面包店。"

需要转换为：
{
    "user_name": "Lance",
    "user_location": "San Francisco",
    "interests": ["biking", "bakeries"]
}

传统方法的问题：

❌ 使用 with_structured_output() 只能创建新对象，不能更新
❌ 复杂嵌套 Schema 容易提取失败
❌ 没有增量更新机制，每次都要重新生成

Trustcall 的解决方案：

✅ 自动提取结构化信息
✅ 增量更新现有记录（JSON Patch）
✅ 处理复杂嵌套 Schema
✅ 自我修正和验证

3.2 Trustcall 工作原理

┌─────────────────────────────────────────────────────────────┐
│                    Trustcall 工作流程                        │
└─────────────────────────────────────────────────────────────┘

1. 定义 Schema (Pydantic)
   ↓
   class UserProfile(BaseModel):
       user_name: str
       interests: List[str]

2. 创建提取器
   ↓
   extractor = create_extractor(
       model,
       tools=[UserProfile],
       tool_choice="UserProfile"
   )

3. 提取新信息（创建）
   ↓
   对话："My name is Lance, I like biking."
   ↓
   result = extractor.invoke({
       "messages": conversation
   })
   ↓
   输出：{"user_name": "Lance", "interests": ["biking"]}

4. 更新现有信息（增量更新）
   ↓
   新对话："I also enjoy coffee."
   ↓
   existing = [("0", "UserProfile", {"user_name": "Lance", "interests": ["biking"]})]
   ↓
   result = extractor.invoke({
       "messages": new_conversation,
       "existing": existing
   })
   ↓
   内部使用 JSON Patch：
   [
     {"op": "add", "path": "/interests/1", "value": "coffee"}
   ]
   ↓
   输出：{"user_name": "Lance", "interests": ["biking", "coffee"]}

3.3 JSON Patch 机制

JSON Patch（RFC 6902）是 Trustcall 实现增量更新的核心：

python

# 原始数据
original = {
    "user_name": "Lance",
    "interests": ["biking"]
}

# JSON Patch 操作
patch = [
    {"op": "add", "path": "/interests/1", "value": "coffee"}
]

# 应用 Patch 后
result = {
    "user_name": "Lance",
    "interests": ["biking", "coffee"]  # 添加了 "coffee"
}

常见操作：

操作	说明	示例
`add`	添加新值	`{"op": "add", "path": "/interests/1", "value": "coffee"}`
`replace`	替换值	`{"op": "replace", "path": "/user_name", "value": "Lance Martin"}`
`remove`	删除值	`{"op": "remove", "path": "/interests/0"}`

为什么使用 JSON Patch？

✅ 高效：只传输变化的部分
✅ 精确：明确指定修改位置
✅ 可追溯：可以记录所有变更历史
✅ 原子性：多个操作可作为一个事务

3.4 Trustcall 的监控：Spy 类

理解 Trustcall 内部发生了什么：

python

class Spy:
    """监控 Trustcall 的工具调用"""
    def __init__(self):
        self.called_tools = []

    def __call__(self, run):
        # 遍历所有运行记录（包括子运行）
        q = [run]
        while q:
            r = q.pop()
            if r.child_runs:
                q.extend(r.child_runs)
            if r.run_type == "chat_model":
                self.called_tools.append(
                    r.outputs["generations"][0][0]["message"]["kwargs"]["tool_calls"]
                )

# 使用 Spy
spy = Spy()
extractor_with_spy = extractor.with_listeners(on_end=spy)

# 调用后查看
result = extractor_with_spy.invoke(...)
print(spy.called_tools)  # 查看所有工具调用

四、实战示例：构建 task_mAIstro Agent

让我们通过一个完整的例子，整合所有概念：

4.1 系统架构

task_mAIstro Agent
├── 短期记忆
│   └── MemorySaver：对话历史
│
├── 长期记忆（三种类型）
│   ├── Profile（语义记忆）
│   │   └── 用户资料：姓名、位置、家庭成员
│   │
│   ├── ToDo Collection（语义记忆）
│   │   └── 待办事项：任务内容、截止日期、状态
│   │
│   └── Instructions（程序性记忆）
│       └── 用户偏好：如何创建任务、特殊要求
│
└── 智能决策
    ├── Trustcall：结构化提取
    └── Conditional Edges：智能路由

4.2 Schema 定义

python

from pydantic import BaseModel, Field
from typing import List, Optional

# ========== Profile Schema ==========
class UserProfile(BaseModel):
    """用户资料"""
    user_name: Optional[str] = Field(description="用户的首选名称", default=None)
    user_location: Optional[str] = Field(description="用户的位置", default=None)
    connections: List[str] = Field(
        description="用户的家庭成员、朋友等",
        default_factory=list
    )

# ========== ToDo Collection Schema ==========
class ToDo(BaseModel):
    """待办事项"""
    task: str = Field(description="任务描述")
    due_date: Optional[str] = Field(description="截止日期", default=None)
    status: str = Field(description="状态：pending/completed", default="pending")

# ========== Instructions Schema ==========
class Instructions(BaseModel):
    """用户偏好和指令"""
    preferences: str = Field(description="用户希望如何管理任务")

4.3 Graph 结构

python

from langgraph.graph import StateGraph, MessagesState, START, END
from langgraph.checkpoint.memory import MemorySaver
from langgraph.store.memory import InMemoryStore

# 定义 State
class State(MessagesState):
    pass

# 创建 Graph
builder = StateGraph(State)

# 添加节点
builder.add_node("chat", call_model)
builder.add_node("write_profile", write_profile)
builder.add_node("write_todos", write_todos)
builder.add_node("write_instructions", write_instructions)

# 添加边
builder.add_edge(START, "chat")
builder.add_conditional_edges(
    "chat",
    route_message,
    {
        "write_profile": "write_profile",
        "write_todos": "write_todos",
        "write_instructions": "write_instructions",
        END: END
    }
)
builder.add_edge("write_profile", END)
builder.add_edge("write_todos", END)
builder.add_edge("write_instructions", END)

# 编译（加入 Checkpointer 和 Store）
memory = MemorySaver()
store = InMemoryStore()

graph = builder.compile(checkpointer=memory, store=store)

4.4 智能路由逻辑

python

def route_message(state: State) -> str:
    """根据对话内容决定是否保存记忆"""
    messages = state["messages"]
    last_message = messages[-1]
    
    # 如果是 AI 回复，检查是否包含工具调用
    if hasattr(last_message, "tool_calls") and last_message.tool_calls:
        return END
    
    # 如果是用户消息，分析是否需要保存记忆
    # （这里简化，实际会使用 LLM 判断）
    content = last_message.content.lower()
    
    if any(word in content for word in ["name", "location", "family"]):
        return "write_profile"
    elif any(word in content for word in ["task", "todo", "deadline"]):
        return "write_todos"
    elif any(word in content for word in ["prefer", "always", "remember"]):
        return "write_instructions"
    else:
        return END

4.5 记忆保存节点

python

def write_profile(state: State, config: RunnableConfig, store: BaseStore):
    """保存用户资料"""
    user_id = config["configurable"]["user_id"]
    namespace = (user_id, "profile")
    
    # 检索现有资料
    existing = store.get(namespace, "user_profile")
    
    # 使用 Trustcall 提取和更新
    if existing:
        existing_data = [("0", "UserProfile", existing.value)]
    else:
        existing_data = None
    
    # 调用提取器
    result = profile_extractor.invoke({
        "messages": state["messages"],
        "existing": existing_data
    })
    
    # 保存到 Store
    store.put(namespace, "user_profile", result["responses"][0].model_dump())
    
    return {"messages": [AIMessage(content="已更新您的资料。")]}

def write_todos(state: State, config: RunnableConfig, store: BaseStore):
    """保存待办事项"""
    user_id = config["configurable"]["user_id"]
    namespace = (user_id, "todos")
    
    # 提取待办事项（Collection 模式）
    result = todo_extractor.invoke({
        "messages": state["messages"]
    })
    
    # 保存每个任务（使用 UUID）
    import uuid
    for todo in result["responses"]:
        key = str(uuid.uuid4())
        store.put(namespace, key, todo.model_dump())
    
    return {"messages": [AIMessage(content="已添加待办事项。")]}

五、学习路径建议

根据你的背景和目标，我们提供三种学习路径：

路径 1：快速实践（初学者）⏱️ 3-4 天

目标：快速掌握基本的记忆系统实现

Day 1: Store 基础
├── 6.2 Memory Store - 详细解读
│   ├── put/get/search 三个基本操作
│   ├── namespace、key、value 概念
│   └── 构建简单的记忆聊天机器人
└── 实践：实现基础聊天记忆

Day 2: 结构化记忆（Profile）
├── 6.3 Memory Schema - Profile
│   ├── Pydantic Schema 定义
│   ├── with_structured_output() 使用
│   └── Trustcall 基础
└── 实践：构建带 Profile 的聊天机器人

Day 3: 记忆集合（Collection）
├── 6.4 Memory Schema - Collection
│   ├── Profile vs Collection 区别
│   ├── enable_inserts 参数
│   └── UUID 管理
└── 实践：实现多记忆项管理

Day 4: 综合应用
├── 6.1 Memory Agent
│   ├── 整合多种记忆类型
│   ├── 智能路由逻辑
│   └── 完整 Agent 架构
└── 实践：构建 task_mAIstro Agent

学习重点：

✅ 理解 Store 的基本操作
✅ 掌握 Profile 和 Collection 的区别
✅ 能够实现简单的记忆 Agent

跳过内容：

JSON Patch 内部机制
复杂嵌套 Schema 的优化
性能调优细节

路径 2：系统掌握（中级开发者）⏱️ 5-7 天

目标：深入理解记忆系统的原理和最佳实践

Day 1-2: 记忆系统基础
├── 认知科学背景
├── Store 深入理解
│   ├── namespace 设计模式
│   ├── key 策略选择
│   └── value 结构优化
└── 实践：设计记忆架构

Day 3-4: 结构化数据管理
├── Pydantic 高级用法
│   ├── Field 验证
│   ├── 嵌套模型
│   └── 自定义验证器
├── Trustcall 深入
│   ├── JSON Patch 机制
│   ├── 监控和调试（Spy）
│   └── 错误处理
└── 实践：复杂 Schema 设计

Day 5-6: 智能 Agent 构建
├── 多类型记忆协同
│   ├── Profile + Collection + Instructions
│   ├── 记忆更新策略
│   └── 跨会话一致性
├── 条件路由逻辑
│   ├── 何时保存记忆
│   ├── 路由决策优化
│   └── 性能考量
└── 实践：完整 Agent 实现

Day 7: 高级主题
├── 性能优化
│   ├── 批量操作
│   ├── 缓存策略
│   └── 存储后端选择
├── 生产环境部署
│   ├── PostgresStore 迁移
│   ├── 数据迁移
│   └── 监控和维护
└── 实践：优化和部署

学习重点：

✅ 理解记忆系统的设计原则
✅ 掌握 Trustcall 的内部机制
✅ 能够设计复杂的记忆架构
✅ 了解性能优化策略

路径 3：架构精通（高级工程师）⏱️ 7-10 天

目标：成为记忆系统架构专家，能够设计企业级解决方案

Week 1: 理论与实践深度结合
├── Day 1-2: 认知科学与系统设计
│   ├── 记忆理论深入研究
│   ├── 记忆系统设计模式
│   ├── 权衡与取舍分析
│   └── 实践：架构设计文档
│
├── Day 3-4: 核心技术精通
│   ├── Store 内部实现分析
│   ├── Trustcall 源码研读
│   ├── JSON Patch 规范深入
│   ├── 自定义 Store 实现
│   └── 实践：扩展 Store 功能
│
└── Day 5-7: 高级应用场景
    ├── 多租户记忆隔离
    ├── 记忆版本控制
    ├── 分布式记忆同步
    ├── 记忆安全与隐私
    └── 实践：企业级架构设计

Week 2: 性能与生产
├── Day 8-9: 性能工程
│   ├── 记忆访问模式分析
│   ├── 查询优化策略
│   ├── 缓存层设计
│   ├── 数据分片策略
│   └── 实践：性能基准测试
│
└── Day 10: 生产部署与运维
    ├── 存储后端选择（Postgres/Redis/etc）
    ├── 数据迁移策略
    ├── 监控与告警
    ├── 容灾与备份
    └── 实践：生产环境部署

学习重点：

✅ 深入理解记忆系统的理论基础
✅ 精通 Store 和 Trustcall 的内部实现
✅ 能够设计和实现企业级记忆架构
✅ 掌握性能优化和生产部署的最佳实践

额外挑战：

🎯 实现自定义 Store 后端
🎯 构建记忆分析和可视化工具
🎯 设计多模态记忆系统（文本+图像+音频）
🎯 研究记忆的语义检索和向量化

六、关键技术对比与决策

6.1 短期记忆 vs 长期记忆

维度	短期记忆（Checkpointer）	长期记忆（Store）
技术组件	MemorySaver	InMemoryStore / PostgresStore
存储内容	对话历史（messages）、当前状态	用户资料、记忆集合、偏好设置
生命周期	单次会话（thread）	跨所有会话（persistent）
访问方式	通过 `thread_id`	通过 `namespace` + `key`
数据结构	Graph State（TypedDict）	字典（dict）
适用场景	对话上下文、中间状态	用户信息、知识库
示例	最近5轮对话	用户姓名、兴趣爱好

协同使用：

python

# 短期记忆：记住本次对话
memory = MemorySaver()

# 长期记忆：跨会话信息
store = InMemoryStore()

# 同时使用
graph = builder.compile(checkpointer=memory, store=store)

6.2 TypedDict vs Pydantic BaseModel

特性	TypedDict	Pydantic BaseModel
类型检查	静态（mypy）	静态 + 运行时
数据验证	❌ 无	✅ 自动验证
默认值	❌ 不支持	✅ 支持
序列化	手动	自动（`model_dump()`）
文档生成	有限	丰富（description）
IDE 支持	✅ 有	✅ 更好
性能	稍快	稍慢（验证开销）
LLM 集成	有限	✅ 原生支持

6.3 with_structured_output() vs Trustcall

维度	with_structured_output()	Trustcall
功能	结构化输出	结构化提取 + 更新
创建新记录	✅ 支持	✅ 支持
更新现有记录	❌ 不支持	✅ 支持（JSON Patch）
复杂 Schema	⚠️ 可能失败	✅ 更健壮
并行创建	❌ 单个	✅ 多个
自我修正	❌ 无	✅ 有
使用复杂度	简单	中等
适用场景	简单提取	记忆管理

决策指南：

python

# 简单场景：只需提取一次，不需要更新
model_with_structure = model.with_structured_output(UserProfile)
result = model_with_structure.invoke(messages)

# 复杂场景：需要增量更新、处理复杂 Schema
extractor = create_extractor(
    model,
    tools=[UserProfile],
    tool_choice="UserProfile"
)
result = extractor.invoke({
    "messages": messages,
    "existing": existing_data  # 支持更新
})

6.4 InMemoryStore vs PostgresStore

维度	InMemoryStore	PostgresStore
持久化	❌ 内存（重启丢失）	✅ 数据库（持久）
性能	⚡ 非常快	🐢 相对慢
容量	内存限制	磁盘限制（大）
分布式	❌ 不支持	✅ 支持
事务	❌ 无	✅ 支持
查询能力	基础	强大（SQL）
适用场景	开发、测试	生产环境
设置复杂度	简单	中等

迁移示例：

python

# 开发环境
from langgraph.store.memory import InMemoryStore
store = InMemoryStore()

# 生产环境
from langgraph.store.postgres import PostgresStore
store = PostgresStore(
    connection_string="postgresql://user:pass@localhost/db"
)

# API 相同，无需修改代码
store.put(namespace, key, value)
store.get(namespace, key)
store.search(namespace)

七、最佳实践与设计模式

7.1 Namespace 设计模式

模式 1：用户为中心

python

# 按用户组织
("user_{user_id}", "profile")
("user_{user_id}", "memories")
("user_{user_id}", "todos")
("user_{user_id}", "preferences")

优点：

✅ 易于实现多租户隔离
✅ 清晰的数据归属
✅ 方便按用户删除或导出数据

模式 2：功能为中心

python

# 按功能组织
("profile", "user_{user_id}")
("memories", "user_{user_id}")
("todos", "user_{user_id}")

优点：

✅ 易于跨用户分析
✅ 便于功能模块化

模式 3：混合模式

python

# 多层嵌套
("company_A", "user_{user_id}", "profile")
("company_A", "user_{user_id}", "memories")
("company_B", "user_{user_id}", "profile")

优点：

✅ 支持多租户 + 多功能
✅ 灵活的权限控制

推荐：根据业务需求选择，保持一致性最重要。

7.2 Key 设计策略

场景 1：Profile（单一对象）

python

# 使用描述性固定键
key = "user_profile"
key = "system_settings"
key = "user_preferences"

场景 2：Collection（多个项目）

python

# 使用 UUID
import uuid
key = str(uuid.uuid4())  # "e1c4e5ab-ab0f-4cbb-822d-f29240a983af"

场景 3：时间序列数据

python

# 使用时间戳
from datetime import datetime
key = datetime.now().isoformat()  # "2024-11-04T22:48:16.727572"

场景 4：层级数据

python

# 使用路径风格
key = "project/alpha/task/001"
key = "user/settings/notification/email"

7.3 Value 结构设计

原则 1：保持简洁

python

# ✅ 好的设计
{
    "user_name": "Lance",
    "interests": ["biking", "coffee"]
}

# ❌ 过度复杂
{
    "user": {
        "personal": {
            "name": {
                "first": "Lance",
                "last": None
            }
        },
        "data": {
            "interests": {
                "list": ["biking"]
            }
        }
    }
}

原则 2：包含元数据

python

{
    "content": "User likes biking",
    "created_at": "2024-11-04T22:48:16",
    "importance": 0.8,
    "category": "hobby"
}

原则 3：可扩展性

python

class Memory(BaseModel):
    content: str
    # 预留扩展字段
    metadata: Optional[dict] = None
    tags: List[str] = Field(default_factory=list)

7.4 记忆更新策略

策略 1：完全替换（简单但低效）

python

def update_memory_full_replace(store, namespace, key, new_data):
    """每次完全替换整个对象"""
    store.put(namespace, key, new_data)

适用：数据量小、更新不频繁

策略 2：增量更新（推荐）

python

def update_memory_incremental(store, namespace, key, updates):
    """只更新变化的字段"""
    existing = store.get(namespace, key)
    if existing:
        merged = {**existing.value, **updates}
        store.put(namespace, key, merged)
    else:
        store.put(namespace, key, updates)

适用：频繁更新、大对象

策略 3：Trustcall 自动更新

python

# Trustcall 自动处理增量更新
existing_data = [("0", "UserProfile", existing.value)]
result = extractor.invoke({
    "messages": new_messages,
    "existing": existing_data  # Trustcall 自动使用 JSON Patch
})

适用：复杂 Schema、需要智能合并

7.5 错误处理与验证

python

from pydantic import ValidationError

def safe_put(store, namespace, key, value):
    """安全地保存数据，包含验证和错误处理"""
    try:
        # 验证数据
        if not isinstance(value, dict):
            raise ValueError("Value must be a dictionary")
        
        # 保存
        store.put(namespace, key, value)
        return True
    
    except ValidationError as e:
        print(f"数据验证失败：{e}")
        return False
    
    except Exception as e:
        print(f"保存失败：{e}")
        return False

def safe_get(store, namespace, key, default=None):
    """安全地获取数据，包含默认值"""
    try:
        result = store.get(namespace, key)
        return result.value if result else default
    except Exception as e:
        print(f"获取失败：{e}")
        return default

八、常见陷阱与调试技巧

8.1 陷阱 1：混淆 Profile 和 Collection

错误示例：

python

# ❌ 使用固定键保存 Collection（每次覆盖）
for memory in memories:
    store.put(namespace, "memory", memory)  # 只保留最后一条！

正确做法：

python

# ✅ 使用 UUID 保存 Collection
import uuid
for memory in memories:
    key = str(uuid.uuid4())
    store.put(namespace, key, memory)

8.2 陷阱 2：忘记 enable_inserts

错误示例：

python

# ❌ Collection 场景下忘记 enable_inserts
extractor = create_extractor(
    model,
    tools=[Memory],
    tool_choice="Memory"
    # 缺少 enable_inserts=True
)
# 结果：只能更新，不能添加新记忆

正确做法：

python

# ✅ Collection 必须启用 enable_inserts
extractor = create_extractor(
    model,
    tools=[Memory],
    tool_choice="Memory",
    enable_inserts=True  # 关键！
)

8.3 陷阱 3：Namespace 不一致

错误示例：

python

# ❌ 保存和检索使用不同的 namespace
store.put(("user_1", "profile"), "key", value)
result = store.get(("user_1", "profiles"), "key")  # 拼写错误！
# 结果：None

正确做法：

python

# ✅ 使用常量定义 namespace
USER_PROFILE_NS = lambda user_id: (user_id, "profile")
USER_MEMORIES_NS = lambda user_id: (user_id, "memories")

# 使用时
namespace = USER_PROFILE_NS("user_1")
store.put(namespace, "key", value)
result = store.get(namespace, "key")

8.4 陷阱 4：Value 不是字典

错误示例：

python

# ❌ Value 必须是字典
store.put(namespace, "key", "string value")  # 错误！
store.put(namespace, "key", ["list", "value"])  # 错误！

正确做法：

python

# ✅ 包装为字典
store.put(namespace, "key", {"value": "string value"})
store.put(namespace, "key", {"items": ["list", "value"]})

8.5 调试技巧

技巧 1：使用 Spy 监控 Trustcall

python

spy = Spy()
extractor_with_spy = extractor.with_listeners(on_end=spy)

result = extractor_with_spy.invoke(...)
print("工具调用：", spy.called_tools)  # 查看内部调用

技巧 2：打印 Store 内容

python

def debug_store(store, namespace):
    """调试：打印 namespace 中的所有数据"""
    items = store.search(namespace)
    print(f"\n=== Namespace: {namespace} ===")
    for item in items:
        print(f"Key: {item.key}")
        print(f"Value: {item.value}")
        print(f"Created: {item.created_at}")
        print("-" * 40)

技巧 3：验证 Schema

python

from pydantic import ValidationError

def validate_data(model_class, data):
    """验证数据是否符合 Schema"""
    try:
        model_class(**data)
        print("✅ 数据有效")
        return True
    except ValidationError as e:
        print("❌ 验证失败：")
        print(e)
        return False

# 使用
validate_data(UserProfile, {"user_name": "Lance", "interests": ["biking"]})

技巧 4：追踪记忆更新

python

def put_with_log(store, namespace, key, value):
    """保存时记录日志"""
    print(f"[{datetime.now()}] PUT")
    print(f"  Namespace: {namespace}")
    print(f"  Key: {key}")
    print(f"  Value: {value}")
    store.put(namespace, key, value)

九、性能优化指南

9.1 批量操作

python

# ❌ 低效：逐个保存
for memory in memories:
    store.put(namespace, str(uuid.uuid4()), memory.model_dump())

# ✅ 高效：批量保存（如果 Store 支持）
# 注意：InMemoryStore 不支持批量，但 PostgresStore 可以优化
items = [
    (str(uuid.uuid4()), memory.model_dump())
    for memory in memories
]
# 使用事务批量保存（PostgresStore）
with store.begin_transaction():
    for key, value in items:
        store.put(namespace, key, value)

9.2 缓存策略

python

from functools import lru_cache
from datetime import datetime, timedelta

class CachedStore:
    """带缓存的 Store 包装器"""
    def __init__(self, store):
        self.store = store
        self.cache = {}
        self.cache_ttl = timedelta(minutes=5)
    
    def get(self, namespace, key):
        cache_key = (namespace, key)
        
        # 检查缓存
        if cache_key in self.cache:
            value, timestamp = self.cache[cache_key]
            if datetime.now() - timestamp < self.cache_ttl:
                return value
        
        # 缓存未命中，从 Store 获取
        result = self.store.get(namespace, key)
        
        # 更新缓存
        self.cache[cache_key] = (result, datetime.now())
        
        return result
    
    def put(self, namespace, key, value):
        # 更新 Store
        self.store.put(namespace, key, value)
        
        # 失效缓存
        cache_key = (namespace, key)
        if cache_key in self.cache:
            del self.cache[cache_key]

9.3 索引和查询优化

python

# PostgresStore 可以创建索引
"""
CREATE INDEX idx_namespace ON store (namespace);
CREATE INDEX idx_created_at ON store (created_at);
"""

# 使用筛选减少数据传输
def search_recent_memories(store, namespace, days=7):
    """只检索最近 N 天的记忆"""
    all_items = store.search(namespace)
    
    cutoff = datetime.now() - timedelta(days=days)
    recent = [
        item for item in all_items
        if datetime.fromisoformat(item.created_at.replace('+00:00', '')) > cutoff
    ]
    
    return recent

9.4 内存管理

python

# 限制 Collection 大小
MAX_MEMORIES = 1000

def add_memory_with_limit(store, namespace, memory):
    """添加记忆，超过限制时删除最旧的"""
    # 获取所有记忆
    items = store.search(namespace)
    
    # 如果超过限制，删除最旧的
    if len(items) >= MAX_MEMORIES:
        # 按创建时间排序
        items_sorted = sorted(items, key=lambda x: x.created_at)
        oldest = items_sorted[0]
        store.delete(namespace, oldest.key)
    
    # 添加新记忆
    key = str(uuid.uuid4())
    store.put(namespace, key, memory)

十、术语表

术语	英文	定义	示例
记忆	Memory	AI 系统存储和检索信息的能力	记住用户姓名
短期记忆	Short-term Memory	单次会话内的记忆	对话历史
长期记忆	Long-term Memory	跨会话持久化的记忆	用户资料
检查点	Checkpointer	保存 Graph 状态的组件	MemorySaver
存储	Store	键值对持久化存储	InMemoryStore
命名空间	Namespace	分层组织数据的元组	`("user_1", "profile")`
键	Key	标识特定对象的字符串	`"user_profile"`
值	Value	存储的数据（字典）	`{"name": "Lance"}`
资料	Profile	单一结构化对象	用户资料 Schema
集合	Collection	多个独立项目的集合	记忆列表
Schema	Schema	数据结构定义	Pydantic 模型
Trustcall	Trustcall	结构化数据提取库	提取和更新记忆
JSON Patch	JSON Patch	增量更新机制	RFC 6902
UUID	UUID	通用唯一识别码	`uuid.uuid4()`
语义记忆	Semantic Memory	事实和知识	姓名、位置
情节记忆	Episodic Memory	事件和经历	上周的讨论
程序性记忆	Procedural Memory	技能和习惯	用户偏好

十一、章节内容概览

6.1 Memory Agent（记忆 Agent）

难度：⭐⭐⭐⭐
时长：2-3 小时

核心内容：

构建 task_mAIstro Agent（待办事项管理助手）
整合三种记忆类型：Profile + Collection + Instructions
ReAct 架构和条件路由
Trustcall 实战应用

学习收获：

✅ 理解完整的 Memory Agent 架构
✅ 掌握多类型记忆的协同管理
✅ 学会智能决策何时保存记忆

6.2 Memory Store（记忆存储）

难度：⭐⭐
时长：1-2 小时

核心内容：

Store 的三个基本操作：put、get、search
Namespace、Key、Value 核心概念
短期记忆与长期记忆的协作
构建简单的记忆聊天机器人

学习收获：

✅ 掌握 Store 的基本使用
✅ 理解记忆系统的认知科学基础
✅ 能够实现基础的跨会话记忆

6.3 Memory Schema - Profile（记忆模式 - 资料）

难度：⭐⭐⭐
时长：2-3 小时

核心内容：

从非结构化到结构化记忆的升级
Pydantic Schema 定义和验证
with_structured_output() 方法
Trustcall 的 JSON Patch 更新机制

学习收获：

✅ 掌握结构化记忆的设计
✅ 理解 Trustcall 的工作原理
✅ 学会增量更新而非完全重写

6.4 Memory Schema - Collection（记忆模式 - 集合）

难度：⭐⭐⭐
时长：2-3 小时

核心内容：

Profile vs Collection 的本质区别
enable_inserts=True 参数的作用
UUID 管理多个记忆项
灵活的记忆扩展策略

学习收获：

✅ 理解何时使用 Profile，何时使用 Collection
✅ 掌握 Collection 的实现模式
✅ 学会设计可扩展的记忆系统

十二、学习检查清单

完成本章学习后，请确保你能够：

基础能力 ✓

[ ] 解释短期记忆和长期记忆的区别
[ ] 使用 put()、get()、search() 操作 Store
[ ] 理解 namespace、key、value 的含义和设计原则
[ ] 定义简单的 Pydantic Schema
[ ] 使用 with_structured_output() 提取结构化信息

中级能力 ✓

[ ] 区分 Profile 和 Collection 的使用场景
[ ] 使用 Trustcall 创建和更新记忆
[ ] 理解 JSON Patch 的工作原理
[ ] 实现带记忆的聊天机器人
[ ] 设计合理的 namespace 组织结构

高级能力 ✓

[ ] 构建多类型记忆协同的 Agent
[ ] 实现智能路由逻辑（何时保存记忆）
[ ] 优化记忆系统的性能
[ ] 从 InMemoryStore 迁移到 PostgresStore
[ ] 设计企业级记忆架构

十三、推荐资源

官方文档

认知科学背景

技术规范

十四、实践项目建议

项目 1：个性化新闻助手 🔰 初级

时长：3-5 小时

功能：

记住用户的兴趣主题（Profile）
保存用户阅读历史（Collection）
根据兴趣推荐新闻

技术要点：

Store 基本操作
Profile Schema 设计
简单的推荐逻辑

项目 2：智能学习笔记系统 ⭐ 中级

时长：1-2 天

功能：

记录学习笔记（Collection）
保存学习偏好（Profile）
自动分类和标签
知识图谱关联

技术要点：

Collection 管理
Trustcall 提取
复杂 Schema 设计
语义搜索

项目 3：企业级客服系统 🚀 高级

时长：3-5 天

功能：

多用户记忆隔离
客户资料管理（Profile）
对话历史（Collection）
工单管理（Collection）
客服偏好设置（Instructions）
数据分析和报表

技术要点：

多租户架构
PostgresStore 部署
性能优化
监控和日志
数据安全

十五、下一步学习

完成本章后，你将具备构建复杂记忆系统的能力。接下来的学习方向：

Module-7：高级 Agent 模式（即将学习）

Multi-Agent 协作
Agent 监督和编排
动态工具选择
Agent 性能优化

延伸主题

向量数据库集成：语义搜索记忆
RAG（检索增强生成）：结合记忆和外部知识
多模态记忆：图像、音频记忆管理
记忆压缩：长期记忆的摘要和压缩

十六、结语

记忆系统是 AI Agent 从"工具"进化为"助手"的关键。通过本章的学习，你不仅掌握了 LangGraph 记忆系统的技术实现，更重要的是理解了如何设计智能的记忆架构。

记住三个核心原则：

简洁优于复杂：从简单的 Schema 开始，逐步扩展
增量优于全量：使用 JSON Patch 等增量更新机制
智能优于存储：不是记住所有东西，而是记住重要的东西

现在，让我们开始深入每一节的详细内容，构建你自己的智能记忆系统！

Module-6: 记忆系统（Memory Systems）- 本章介绍 ​

一、本章学习目标 ​

核心能力 ​

认知升级 ​

二、核心概念预览 ​

2.1 记忆的认知科学基础 ​

2.2 LangGraph 记忆系统架构 ​

2.3 Store 的三个核心概念 ​

Namespace（命名空间）- 分层组织 ​

Key（键）- 唯一标识 ​

Value（值）- 必须是字典 ​

2.4 Profile vs Collection：何时使用哪种模式？ ​

三、Trustcall：结构化数据提取的利器 ​

3.1 为什么需要 Trustcall？ ​

3.2 Trustcall 工作原理 ​

3.3 JSON Patch 机制 ​

3.4 Trustcall 的监控：Spy 类 ​

四、实战示例：构建 task_mAIstro Agent ​

4.1 系统架构 ​

4.2 Schema 定义 ​

4.3 Graph 结构 ​

4.4 智能路由逻辑 ​

4.5 记忆保存节点 ​

五、学习路径建议 ​

路径 1：快速实践（初学者）⏱️ 3-4 天 ​

路径 2：系统掌握（中级开发者）⏱️ 5-7 天 ​

路径 3：架构精通（高级工程师）⏱️ 7-10 天 ​

六、关键技术对比与决策 ​

6.1 短期记忆 vs 长期记忆 ​

6.2 TypedDict vs Pydantic BaseModel ​

6.3 with_structured_output() vs Trustcall ​

6.4 InMemoryStore vs PostgresStore ​

七、最佳实践与设计模式 ​

7.1 Namespace 设计模式 ​

模式 1：用户为中心 ​

模式 2：功能为中心 ​

模式 3：混合模式 ​

7.2 Key 设计策略 ​

场景 1：Profile（单一对象） ​

场景 2：Collection（多个项目） ​

场景 3：时间序列数据 ​

场景 4：层级数据 ​

7.3 Value 结构设计 ​

原则 1：保持简洁 ​

原则 2：包含元数据 ​

原则 3：可扩展性 ​

7.4 记忆更新策略 ​

策略 1：完全替换（简单但低效） ​

策略 2：增量更新（推荐） ​

策略 3：Trustcall 自动更新 ​

7.5 错误处理与验证 ​

八、常见陷阱与调试技巧 ​

8.1 陷阱 1：混淆 Profile 和 Collection ​

8.2 陷阱 2：忘记 enable_inserts ​

8.3 陷阱 3：Namespace 不一致 ​

8.4 陷阱 4：Value 不是字典 ​

8.5 调试技巧 ​

技巧 1：使用 Spy 监控 Trustcall ​

技巧 2：打印 Store 内容 ​

技巧 3：验证 Schema ​

技巧 4：追踪记忆更新 ​

九、性能优化指南 ​

9.1 批量操作 ​

9.2 缓存策略 ​

9.3 索引和查询优化 ​

9.4 内存管理 ​

十、术语表 ​

十一、章节内容概览 ​

6.1 Memory Agent（记忆 Agent） ​

6.2 Memory Store（记忆存储） ​

6.3 Memory Schema - Profile（记忆模式 - 资料） ​

6.4 Memory Schema - Collection（记忆模式 - 集合） ​

十二、学习检查清单 ​

基础能力 ✓ ​

中级能力 ✓ ​

高级能力 ✓ ​

十三、推荐资源 ​

官方文档 ​

认知科学背景 ​

技术规范 ​

Module-6: 记忆系统（Memory Systems）- 本章介绍

一、本章学习目标

核心能力

认知升级

二、核心概念预览

2.1 记忆的认知科学基础

2.2 LangGraph 记忆系统架构

2.3 Store 的三个核心概念

Namespace（命名空间）- 分层组织

Key（键）- 唯一标识

Value（值）- 必须是字典

2.4 Profile vs Collection：何时使用哪种模式？

三、Trustcall：结构化数据提取的利器

3.1 为什么需要 Trustcall？

3.2 Trustcall 工作原理

3.3 JSON Patch 机制

3.4 Trustcall 的监控：Spy 类

四、实战示例：构建 task_mAIstro Agent

4.1 系统架构

4.2 Schema 定义

4.3 Graph 结构

4.4 智能路由逻辑

4.5 记忆保存节点

五、学习路径建议

路径 1：快速实践（初学者）⏱️ 3-4 天

路径 2：系统掌握（中级开发者）⏱️ 5-7 天

路径 3：架构精通（高级工程师）⏱️ 7-10 天

六、关键技术对比与决策

6.1 短期记忆 vs 长期记忆

6.2 TypedDict vs Pydantic BaseModel

6.3 with_structured_output() vs Trustcall

6.4 InMemoryStore vs PostgresStore

七、最佳实践与设计模式

7.1 Namespace 设计模式

模式 1：用户为中心

模式 2：功能为中心

模式 3：混合模式

7.2 Key 设计策略

场景 1：Profile（单一对象）

场景 2：Collection（多个项目）

场景 3：时间序列数据

场景 4：层级数据

7.3 Value 结构设计

原则 1：保持简洁

原则 2：包含元数据

原则 3：可扩展性

7.4 记忆更新策略

策略 1：完全替换（简单但低效）

策略 2：增量更新（推荐）

策略 3：Trustcall 自动更新

7.5 错误处理与验证

八、常见陷阱与调试技巧

8.1 陷阱 1：混淆 Profile 和 Collection

8.2 陷阱 2：忘记 enable_inserts

8.3 陷阱 3：Namespace 不一致

8.4 陷阱 4：Value 不是字典

8.5 调试技巧

技巧 1：使用 Spy 监控 Trustcall

技巧 2：打印 Store 内容

技巧 3：验证 Schema

技巧 4：追踪记忆更新

九、性能优化指南

9.1 批量操作

9.2 缓存策略

9.3 索引和查询优化

9.4 内存管理

十、术语表

十一、章节内容概览

6.1 Memory Agent（记忆 Agent）

6.2 Memory Store（记忆存储）

6.3 Memory Schema - Profile（记忆模式 - 资料）

6.4 Memory Schema - Collection（记忆模式 - 集合）

十二、学习检查清单

基础能力 ✓

中级能力 ✓

高级能力 ✓

十三、推荐资源

官方文档

认知科学背景

技术规范