diff --git a/CHANGELOG.md b/CHANGELOG.md
index f6552a0..f0d4c79 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -15,6 +15,17 @@ The format follows [Keep a Changelog](https://keepachangelog.com/en/1.1.0/), and
 - `DashScopeKnowledgeRetrieveRequest` / `DashScopeKnowledgeRetrieveResponse` types and `knowledgeRetrieveEndpoint` added to `bailian-cli-core`.
 - Comprehensive E2E tests for knowledge retrieve covering both auth paths, dry-run, rerank flags, and error cases.
 
+- `bl usage` command group:
+  - `bl usage free` — query free-tier quota for all models (or a specific model with `--model`).
+  - `bl usage freetier` — enable (`--on`) or disable (`--off`) auto-stop for free-tier models.
+  - `bl usage stats` — query model usage statistics (requires `--workspace-id`).
+- `bl quota` command group:
+  - `bl quota list` — view model RPM/TPM rate limits (filter with `--model`, show all with `--all`).
+  - `bl quota check` — check current RPM/TPM usage against rate limits.
+  - `bl quota history` — view quota change history with pagination.
+  - `bl quota request` — request a temporary quota increase for a model.
+- `bl workspace list` — list all workspaces with region and endpoint details.
+
 ### Changed
 
 - Credential resolution priority: explicit API-Key → explicit AK/SK flags → auto-detected API-Key → fallback AK/SK from config/env.
diff --git a/CHANGELOG.zh.md b/CHANGELOG.zh.md
index f04f06a..51c0886 100644
--- a/CHANGELOG.zh.md
+++ b/CHANGELOG.zh.md
@@ -6,10 +6,20 @@
 
 [English](CHANGELOG.md) · [README](README.zh.md) · [参与贡献](CONTRIBUTING.zh.md)
 
-## [1.3.0] - 2026-06-10
+## [1.3.0] - 2026-06-11
 
 ### 新增
 
+- `bl usage` 命令组：
+  - `bl usage free` — 查询所有模型的免费额度（可通过 `--model` 指定模型）。
+  - `bl usage freetier` — 启用（`--on`）或禁用（`--off`）免费额度模型的自动停服。
+  - `bl usage stats` — 查询模型用量统计（需指定 `--workspace-id`）。
+- `bl quota` 命令组：
+  - `bl quota list` — 查看模型 RPM/TPM 速率限制（支持 `--model` 过滤，`--all` 展示全部）。
+  - `bl quota check` — 查看当前 RPM/TPM 用量与速率限制。
+  - `bl quota history` — 查看配额变更记录，支持分页。
+  - `bl quota request` — 申请模型临时配额提升。
+- `bl workspace list` — 列出所有业务空间，包含地域和 endpoint 信息。
 - `bl knowledge retrieve` 新增 API-Key 鉴权（DashScope 网关），与原有 AK/SK 并存，可用时自动优先使用 API-Key。
 - 新增检索参数：`--dense-similarity-top-k`、`--sparse-similarity-top-k`、`--rerank-model`、`--rerank-mode`、`--rerank-instruct`，API-Key 与 AK/SK 两条链路均支持。
 - `bailian-cli-core` 新增 `DashScopeKnowledgeRetrieveRequest` / `DashScopeKnowledgeRetrieveResponse` 类型及 `knowledgeRetrieveEndpoint` 端点。
diff --git a/README.md b/README.md
index 6abe41f..c1a6177 100644
--- a/README.md
+++ b/README.md
@@ -35,7 +35,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **MCP integration** — Orchestrate Bailian MCP servers: list services, inspect tools, and invoke any tool directly from the terminal
 - **Web search** — Real-time internet retrieval for up-to-date, accurate answers
 - **Model recommendation** — Describe your scenario and get best-fit model suggestions; supports scoped search, model comparison, and alternative discovery
-- **Console capabilities** — Browse Bailian apps (`app list`) and check free-tier quota (`usage free`)
+- **Console capabilities** — Browse Bailian apps (`app list`), check free-tier quota (`usage free`), view model usage statistics (`usage stats`), list workspaces (`workspace list`), and manage rate limits (`quota list/request/check/history`)
 - **Local file auto-upload** — Every URL parameter accepts a local path; uploaded to free temp storage with 48-hour validity
 
 ## Showcase: One-Sentence Cinematic Video
@@ -108,9 +108,22 @@ bl advisor recommend --message "qwen-max vs deepseek-v3 for code generation"
 # Browser login (required for console capability commands)
 bl auth login --console
 
-# Browse apps / free-tier quota
+# Browse apps / free-tier quota / usage statistics / workspaces
 bl app list
 bl usage free --model qwen3-max
+bl usage free --expiring 30                           # Quotas expiring within 30 days
+bl usage free --sort remaining                        # Sort by remaining % ascending
+bl usage stats --workspace-id <id>                    # Usage overview for a workspace
+bl usage stats --model qwen-turbo --workspace-id <id> # Per-model usage
+bl workspace list                                     # List all workspaces
+
+# Rate limit management
+bl quota list                                         # View RPM/TPM limits for all models
+bl quota list --model qwen3.6-plus                    # View limits for a specific model
+bl quota check                                        # Current usage vs rate limits
+bl quota check --model qwen3.6-plus --period 5        # Check usage over last 5 minutes
+bl quota request --model qwen3.6-plus --tpm 6000000   # Request a temporary TPM increase
+bl quota history                                      # View quota change history
 ```
 
 > More examples and scenarios: [Aliyun Model Studio CLI Site](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)
@@ -134,7 +147,7 @@ bl text chat --api-key sk-xxxxx --message "Hello"
 
 ### Console Login (OAuth)
 
-Required for console capability commands (`app list`, `usage free`). Opens the Bailian console in your browser to sign in.
+Required for console capability commands (`app list`, `usage free`, `usage stats`, `workspace list`, `quota list/request/check/history`). Opens the Bailian console in your browser to sign in.
 
 ```bash
 bl auth login --console
diff --git a/README.zh.md b/README.zh.md
index 3fbf992..4d9a101 100644
--- a/README.zh.md
+++ b/README.zh.md
@@ -35,7 +35,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 - **MCP 集成** — 统一调度百炼 MCP 服务：列出服务、查看工具、直接在终端调用任意工具
 - **联网搜索** — 实时互联网信息检索，提升回答准确性及时效性
 - **模型推荐** — 描述你的场景，智能推荐最适合的模型；支持限定范围搜索、模型对比和替代发现
-- **控制台能力** — 浏览百炼应用（`app list`），查询模型免费额度（`usage free`）
+- **控制台能力** — 浏览百炼应用（`app list`），查询模型免费额度（`usage free`），查看模型用量统计（`usage stats`），列出业务空间（`workspace list`），管理限流与提额（`quota list/request/check/history`）
 - **本地文件自动上传** — 所有 URL 参数同时支持本地路径，免费临时存储 48 小时
 
 ## 示例:一句话生成一部电影短片
@@ -103,9 +103,22 @@ bl advisor recommend --message "qwen-max 和 deepseek-v3 哪个更适合做代
 # 浏览器登录（控制台能力相关命令需要）
 bl auth login --console
 
-# 浏览应用 / 免费额度
+# 浏览应用 / 免费额度 / 用量统计 / 业务空间
 bl app list
 bl usage free --model qwen3-max
+bl usage free --expiring 30                           # 30 天内过期的额度
+bl usage free --sort remaining                        # 按剩余百分比升序排列
+bl usage stats --workspace-id <id>                    # 指定空间的用量概览
+bl usage stats --model qwen-turbo --workspace-id <id> # 指定模型用量
+bl workspace list                                     # 列出所有业务空间
+
+# 限流管理与提额
+bl quota list                                         # 查看所有模型的 RPM/TPM 限额
+bl quota list --model qwen3.6-plus                    # 查看指定模型限额
+bl quota check                                        # 查看当前用量 vs 限流阈值
+bl quota check --model qwen3.6-plus --period 5        # 查看最近 5 分钟用量
+bl quota request --model qwen3.6-plus --tpm 6000000   # 申请临时 TPM 提额
+bl quota history                                      # 查看提额历史记录
 ```
 
 > 更多案例与使用场景：[阿里云百炼 CLI 官方主页](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)
@@ -129,7 +142,7 @@ bl text chat --api-key sk-xxxxx --message "你好"
 
 ### 控制台登录（OAuth）
 
-控制台能力命令（`app list`、`usage free`）需要使用此登录方式。打开浏览器跳转百炼控制台完成登录。
+控制台能力命令（`app list`、`usage free`、`usage stats`、`workspace list`、`quota list/request/check/history`）需要使用此登录方式。打开浏览器跳转百炼控制台完成登录。
 
 ```bash
 bl auth login --console
diff --git a/packages/cli/README.md b/packages/cli/README.md
index 6abe41f..c1a6177 100644
--- a/packages/cli/README.md
+++ b/packages/cli/README.md
@@ -35,7 +35,7 @@ Equip your AI Agent out-of-the-box with these capabilities, composable across co
 - **MCP integration** — Orchestrate Bailian MCP servers: list services, inspect tools, and invoke any tool directly from the terminal
 - **Web search** — Real-time internet retrieval for up-to-date, accurate answers
 - **Model recommendation** — Describe your scenario and get best-fit model suggestions; supports scoped search, model comparison, and alternative discovery
-- **Console capabilities** — Browse Bailian apps (`app list`) and check free-tier quota (`usage free`)
+- **Console capabilities** — Browse Bailian apps (`app list`), check free-tier quota (`usage free`), view model usage statistics (`usage stats`), list workspaces (`workspace list`), and manage rate limits (`quota list/request/check/history`)
 - **Local file auto-upload** — Every URL parameter accepts a local path; uploaded to free temp storage with 48-hour validity
 
 ## Showcase: One-Sentence Cinematic Video
@@ -108,9 +108,22 @@ bl advisor recommend --message "qwen-max vs deepseek-v3 for code generation"
 # Browser login (required for console capability commands)
 bl auth login --console
 
-# Browse apps / free-tier quota
+# Browse apps / free-tier quota / usage statistics / workspaces
 bl app list
 bl usage free --model qwen3-max
+bl usage free --expiring 30                           # Quotas expiring within 30 days
+bl usage free --sort remaining                        # Sort by remaining % ascending
+bl usage stats --workspace-id <id>                    # Usage overview for a workspace
+bl usage stats --model qwen-turbo --workspace-id <id> # Per-model usage
+bl workspace list                                     # List all workspaces
+
+# Rate limit management
+bl quota list                                         # View RPM/TPM limits for all models
+bl quota list --model qwen3.6-plus                    # View limits for a specific model
+bl quota check                                        # Current usage vs rate limits
+bl quota check --model qwen3.6-plus --period 5        # Check usage over last 5 minutes
+bl quota request --model qwen3.6-plus --tpm 6000000   # Request a temporary TPM increase
+bl quota history                                      # View quota change history
 ```
 
 > More examples and scenarios: [Aliyun Model Studio CLI Site](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)
@@ -134,7 +147,7 @@ bl text chat --api-key sk-xxxxx --message "Hello"
 
 ### Console Login (OAuth)
 
-Required for console capability commands (`app list`, `usage free`). Opens the Bailian console in your browser to sign in.
+Required for console capability commands (`app list`, `usage free`, `usage stats`, `workspace list`, `quota list/request/check/history`). Opens the Bailian console in your browser to sign in.
 
 ```bash
 bl auth login --console
diff --git a/packages/cli/README.zh.md b/packages/cli/README.zh.md
index 3fbf992..4d9a101 100644
--- a/packages/cli/README.zh.md
+++ b/packages/cli/README.zh.md
@@ -35,7 +35,7 @@ _专为 AI Agent 打造，每个命令均可作为结构化工具调用。_
 - **MCP 集成** — 统一调度百炼 MCP 服务：列出服务、查看工具、直接在终端调用任意工具
 - **联网搜索** — 实时互联网信息检索，提升回答准确性及时效性
 - **模型推荐** — 描述你的场景，智能推荐最适合的模型；支持限定范围搜索、模型对比和替代发现
-- **控制台能力** — 浏览百炼应用（`app list`），查询模型免费额度（`usage free`）
+- **控制台能力** — 浏览百炼应用（`app list`），查询模型免费额度（`usage free`），查看模型用量统计（`usage stats`），列出业务空间（`workspace list`），管理限流与提额（`quota list/request/check/history`）
 - **本地文件自动上传** — 所有 URL 参数同时支持本地路径，免费临时存储 48 小时
 
 ## 示例:一句话生成一部电影短片
@@ -103,9 +103,22 @@ bl advisor recommend --message "qwen-max 和 deepseek-v3 哪个更适合做代
 # 浏览器登录（控制台能力相关命令需要）
 bl auth login --console
 
-# 浏览应用 / 免费额度
+# 浏览应用 / 免费额度 / 用量统计 / 业务空间
 bl app list
 bl usage free --model qwen3-max
+bl usage free --expiring 30                           # 30 天内过期的额度
+bl usage free --sort remaining                        # 按剩余百分比升序排列
+bl usage stats --workspace-id <id>                    # 指定空间的用量概览
+bl usage stats --model qwen-turbo --workspace-id <id> # 指定模型用量
+bl workspace list                                     # 列出所有业务空间
+
+# 限流管理与提额
+bl quota list                                         # 查看所有模型的 RPM/TPM 限额
+bl quota list --model qwen3.6-plus                    # 查看指定模型限额
+bl quota check                                        # 查看当前用量 vs 限流阈值
+bl quota check --model qwen3.6-plus --period 5        # 查看最近 5 分钟用量
+bl quota request --model qwen3.6-plus --tpm 6000000   # 申请临时 TPM 提额
+bl quota history                                      # 查看提额历史记录
 ```
 
 > 更多案例与使用场景：[阿里云百炼 CLI 官方主页](https://bailian.console.aliyun.com/cli?source_channel=cli_github&)
@@ -129,7 +142,7 @@ bl text chat --api-key sk-xxxxx --message "你好"
 
 ### 控制台登录（OAuth）
 
-控制台能力命令（`app list`、`usage free`）需要使用此登录方式。打开浏览器跳转百炼控制台完成登录。
+控制台能力命令（`app list`、`usage free`、`usage stats`、`workspace list`、`quota list/request/check/history`）需要使用此登录方式。打开浏览器跳转百炼控制台完成登录。
 
 ```bash
 bl auth login --console
diff --git a/packages/cli/src/commands/advisor/recommend.ts b/packages/cli/src/commands/advisor/recommend.ts
index 5fe43d0..46ee0a8 100644
--- a/packages/cli/src/commands/advisor/recommend.ts
+++ b/packages/cli/src/commands/advisor/recommend.ts
@@ -234,12 +234,12 @@ export default defineCommand({
     },
   ],
   examples: [
-    'bl advisor recommend --message "我要做一个能理解图片的客服机器人"',
-    'bl advisor recommend --message "做一个Agent自动根据用户意图生成动画片"',
-    'bl advisor recommend --message "法律合同审查，要求高精准度"',
-    'bl advisor recommend --message "做一个低成本高并发的在线客服" --output json',
-    'bl advisor recommend --message "长文本摘要" --dry-run',
-    "bl advisor recommend                                          # 交互式输入需求",
+    'bl advisor recommend --message "I need a customer-service chatbot that understands images"',
+    'bl advisor recommend --message "Build an Agent that auto-generates animations"',
+    'bl advisor recommend --message "Legal contract review, high precision required"',
+    'bl advisor recommend --message "Low-cost high-concurrency online customer service" --output json',
+    'bl advisor recommend --message "Long document summarization" --dry-run',
+    "bl advisor recommend                                          # Interactive input",
   ],
   async run(config: Config, flags: GlobalFlags) {
     const positional = ((flags as Record<string, unknown>)._positional as string[]) ?? [];
diff --git a/packages/cli/src/commands/catalog.ts b/packages/cli/src/commands/catalog.ts
index 5fd330d..ae48fcc 100644
--- a/packages/cli/src/commands/catalog.ts
+++ b/packages/cli/src/commands/catalog.ts
@@ -36,9 +36,16 @@ import speechRecognize from "./speech/recognize.ts";
 import fileUpload from "./file/upload.ts";
 import consoleCall from "./console/call.ts";
 import usageFree from "./usage/free.ts";
+import usageFreetier from "./usage/freetier.ts";
+import usageStats from "./usage/stats.ts";
 import pipelineRun from "./pipeline/run.ts";
 import pipelineValidate from "./pipeline/validate.ts";
 import advisorRecommend from "./advisor/recommend.ts";
+import workspaceList from "./workspace/list.ts";
+import quotaList from "./quota/list.ts";
+import quotaRequest from "./quota/request.ts";
+import quotaHistory from "./quota/history.ts";
+import quotaCheck from "./quota/check.ts";
 
 /** Command registry map (no dependency on registry.ts — safe for build-time import). */
 export const commands: Record<string, Command> = {
@@ -74,11 +81,18 @@ export const commands: Record<string, Command> = {
   "file upload": fileUpload,
   "console call": consoleCall,
   "usage free": usageFree,
+  "usage freetier": usageFreetier,
+  "usage stats": usageStats,
   "pipeline run": pipelineRun,
   "pipeline validate": pipelineValidate,
   "config show": configShow,
   "config set": configSet,
   "config export-schema": configExportSchema,
   "advisor recommend": advisorRecommend,
+  "workspace list": workspaceList,
+  "quota list": quotaList,
+  "quota request": quotaRequest,
+  "quota history": quotaHistory,
+  "quota check": quotaCheck,
   update: update,
 };
diff --git a/packages/cli/src/commands/quota/check.ts b/packages/cli/src/commands/quota/check.ts
new file mode 100644
index 0000000..5c52fa8
--- /dev/null
+++ b/packages/cli/src/commands/quota/check.ts
@@ -0,0 +1,348 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  detectOutputFormat,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+import { displayWidth, padEnd } from "../../output/cjk-width.ts";
+
+const MODEL_LIST_API = "zeldaHttp.dashscopeModel./zelda/api/v1/modelCenter/listFoundationModels";
+const MONITOR_API = "zeldaEasy.bailian-telemetry.monitor.getMonitorData";
+
+interface QpmInfoItem {
+  count_limit: number;
+  count_limit_period: number;
+  usage_limit: number;
+  usage_limit_period: number;
+  usage_limit_field: string;
+  type: string;
+}
+
+interface ModelWithQpm {
+  model: string;
+  qpmInfo?: Record<string, QpmInfoItem>;
+}
+
+interface MonitorPoint {
+  value: number;
+  timestamp: number;
+}
+
+interface MonitorMetric {
+  aggMethod: string;
+  metricName: string;
+  points: MonitorPoint[];
+}
+
+function calculateRPM(item: QpmInfoItem | undefined, fallbackPeriod?: number): number {
+  if (!item) return 0;
+  const period = item.count_limit_period || fallbackPeriod;
+  if (!period) return 0;
+  return Math.floor((item.count_limit * 60) / period);
+}
+
+function calculateTPM(item: QpmInfoItem | undefined, fallbackPeriod?: number): number {
+  if (!item) return 0;
+  const period = item.usage_limit_period || fallbackPeriod;
+  if (!period) return 0;
+  return Math.floor((item.usage_limit * 60) / period);
+}
+
+function formatNumber(num: number): string {
+  return num.toLocaleString("en-US");
+}
+
+function formatRatio(usage: number, limit: number): string {
+  if (limit <= 0) return "-";
+  const pct = Math.round((usage / limit) * 100);
+  return `${formatNumber(usage)}/${formatNumber(limit)} (${pct}%)`;
+}
+
+function getStatus(usage: number, limit: number): string {
+  if (limit <= 0) return "-";
+  const pct = (usage / limit) * 100;
+  if (pct >= 100) return "已限流";
+  if (pct >= 80) return "接近限流";
+  return "正常";
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+async function fetchAllModelsWithQpm(
+  config: Config,
+  token: string,
+  region: string,
+): Promise<ModelWithQpm[]> {
+  const allModels: ModelWithQpm[] = [];
+  let pageNo = 1;
+
+  while (true) {
+    const raw = await callConsoleGateway(config, token, {
+      api: MODEL_LIST_API,
+      data: {
+        input: {
+          pageNo,
+          pageSize: 50,
+          group: false,
+          queryQpmInfo: true,
+          ignoreWorkspaceServiceSite: true,
+          supports: { selfServiceLimitIncrease: true },
+        },
+      },
+      region,
+    });
+
+    const resp = extractResponseData(raw as Record<string, unknown>);
+    const list = (resp.list as ModelWithQpm[]) ?? [];
+    const total = (resp.total as number) ?? 0;
+
+    allModels.push(...list);
+    if (allModels.length >= total || list.length === 0) break;
+    pageNo++;
+  }
+
+  return allModels;
+}
+
+async function fetchMonitorData(
+  config: Config,
+  token: string,
+  region: string,
+  modelName: string,
+  windowMinutes: number,
+): Promise<{ rpm: number; tpm: number }> {
+  const now = Date.now();
+  const startTime = now - windowMinutes * 60 * 1000;
+
+  try {
+    const raw = await callConsoleGateway(config, token, {
+      api: MONITOR_API,
+      data: {
+        reqDTO: {
+          monitorType: "Advanced",
+          metricFilters: [
+            { aggMethod: "sum_pm", metricName: "model_total_amount" },
+            { aggMethod: "sum_pm", metricName: "model_call_count" },
+          ],
+          labelFilters: {
+            resourceId: modelName,
+            resourceType: "model",
+          },
+          startTime,
+          endTime: now,
+        },
+      },
+      region,
+    });
+
+    const resp = extractResponseData(raw as Record<string, unknown>);
+    const metrics = (resp.data ?? resp) as MonitorMetric[] | Record<string, unknown>;
+    if (!Array.isArray(metrics)) return { rpm: 0, tpm: 0 };
+
+    let rpm = 0;
+    let tpm = 0;
+
+    for (const metric of metrics) {
+      if (metric.aggMethod !== "sum_pm" || !metric.points?.length) continue;
+      const lastValue = metric.points[metric.points.length - 1].value ?? 0;
+      if (metric.metricName === "model_call_count") rpm = Math.round(lastValue);
+      if (metric.metricName === "model_total_amount") tpm = Math.round(lastValue);
+    }
+
+    return { rpm, tpm };
+  } catch {
+    return { rpm: -1, tpm: -1 };
+  }
+}
+
+interface CheckRow {
+  model: string;
+  rpmUsage: number;
+  rpmLimit: number;
+  tpmUsage: number;
+  tpmLimit: number;
+}
+
+function printTable(rows: CheckRow[], noColor: boolean): void {
+  const bold = noColor ? (t: string) => t : (t: string) => `\x1b[1m${t}\x1b[0m`;
+  const dim = noColor ? (t: string) => t : (t: string) => `\x1b[2m${t}\x1b[0m`;
+  const green = noColor ? (t: string) => t : (t: string) => `\x1b[32m${t}\x1b[0m`;
+  const yellow = noColor ? (t: string) => t : (t: string) => `\x1b[33m${t}\x1b[0m`;
+  const red = noColor ? (t: string) => t : (t: string) => `\x1b[31m${t}\x1b[0m`;
+
+  const headersCn = ["模型", "RPM 用量/限额", "TPM 用量/限额", "状态"];
+  const headersEn = ["Model", "RPM Usage/Limit", "TPM Usage/Limit", "Status"];
+
+  const tableRows = rows.map((r) => {
+    const rpmStr = r.rpmUsage < 0 ? "-" : formatRatio(r.rpmUsage, r.rpmLimit);
+    const tpmStr = r.tpmUsage < 0 ? "-" : formatRatio(r.tpmUsage, r.tpmLimit);
+    const maxPct = Math.max(
+      r.rpmLimit > 0 ? (r.rpmUsage / r.rpmLimit) * 100 : 0,
+      r.tpmLimit > 0 ? (r.tpmUsage / r.tpmLimit) * 100 : 0,
+    );
+    const status =
+      r.rpmUsage < 0
+        ? "-"
+        : getStatus(maxPct, Math.max(r.rpmLimit, r.tpmLimit) > 0 ? 100 : 0);
+    return { cells: [r.model, rpmStr, tpmStr, status], maxPct };
+  });
+
+  if (tableRows.length === 0) {
+    process.stdout.write("No models found.\n");
+    return;
+  }
+
+  const widths = headersCn.map((label, col) =>
+    Math.max(
+      displayWidth(label),
+      displayWidth(headersEn[col]),
+      ...tableRows.map((r) => displayWidth(r.cells[col])),
+    ),
+  );
+
+  const cnLine = headersCn.map((label, col) => bold(padEnd(label, widths[col]))).join("  ");
+  const enLine = headersEn.map((label, col) => dim(padEnd(label, widths[col]))).join("  ");
+  const separator = widths.map((w) => dim("─".repeat(w))).join("──");
+
+  process.stdout.write(cnLine + "\n");
+  process.stdout.write(enLine + "\n");
+  process.stdout.write(separator + "\n");
+
+  const statusCol = 3;
+  for (const r of tableRows) {
+    const cells = r.cells.map((cell, col) => {
+      if (col === statusCol) {
+        if (cell === "已限流") return red(padEnd(cell, widths[col]));
+        if (cell === "接近限流") return yellow(padEnd(cell, widths[col]));
+        if (cell === "正常") return green(padEnd(cell, widths[col]));
+      }
+      return padEnd(cell, widths[col]);
+    });
+    process.stdout.write(cells.join("  ") + "\n");
+  }
+
+  process.stdout.write(dim(`\n共 ${rows.length} 个模型 (Total: ${rows.length})`) + "\n");
+}
+
+export default defineCommand({
+  name: "quota check",
+  description: "Check current usage against rate limits",
+  usage: "bl quota check [--model <model>] [flags]",
+  options: [
+    {
+      flag: "--model <model>",
+      description: "Model name(s), comma-separated",
+    },
+    {
+      flag: "--period <minutes>",
+      description: "Query usage for the last N minutes (default: 2)",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: [
+    "bl quota check",
+    "bl quota check --model qwen3.6-plus",
+    "bl quota check --period 5",
+    "bl quota check --model qwen3.6-plus,qwen-turbo",
+    "bl quota check --output json",
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const modelFlag = (flags.model as string) || undefined;
+    const rawPeriod = flags.period === undefined ? 2 : Number(flags.period);
+    if (!Number.isFinite(rawPeriod) || rawPeriod < 1) {
+      process.stderr.write("Error: --period must be a number of minutes, at least 1.\n");
+      process.exit(1);
+    }
+    const windowMinutes = rawPeriod;
+    const region = (flags.region as string) || "cn-beijing";
+    const format = detectOutputFormat(config.output);
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    if (config.dryRun) {
+      emitResult(
+        {
+          apis: [MODEL_LIST_API, MONITOR_API],
+          region,
+        },
+        format,
+      );
+      return;
+    }
+
+    let models = await fetchAllModelsWithQpm(config, credential.token, region);
+
+    if (modelFlag) {
+      const names = new Set(
+        modelFlag
+          .split(",")
+          .map((n) => n.trim())
+          .filter(Boolean),
+      );
+      models = models.filter((m) => names.has(m.model));
+    }
+
+    models = models.filter((m) => m.qpmInfo);
+
+    if (models.length === 0) {
+      process.stdout.write("No models found.\n");
+      return;
+    }
+
+    const monitorResults = await Promise.all(
+      models.map((m) => fetchMonitorData(config, credential.token, region, m.model, windowMinutes)),
+    );
+
+    const checkRows: CheckRow[] = models.map((m, idx) => {
+      const qpm = m.qpmInfo!;
+      const modelDefault = qpm["model-default"];
+      const userSpec = qpm["user-spec"];
+
+      const rpmLimit =
+        calculateRPM(userSpec, modelDefault?.count_limit_period) || calculateRPM(modelDefault);
+      const tpmLimit =
+        calculateTPM(userSpec, modelDefault?.usage_limit_period) || calculateTPM(modelDefault);
+
+      return {
+        model: m.model,
+        rpmUsage: monitorResults[idx].rpm,
+        rpmLimit,
+        tpmUsage: monitorResults[idx].tpm,
+        tpmLimit,
+      };
+    });
+
+    if (format === "json") {
+      emitResult(checkRows, format);
+      return;
+    }
+
+    printTable(checkRows, config.noColor);
+  },
+});
diff --git a/packages/cli/src/commands/quota/history.ts b/packages/cli/src/commands/quota/history.ts
new file mode 100644
index 0000000..99dd510
--- /dev/null
+++ b/packages/cli/src/commands/quota/history.ts
@@ -0,0 +1,184 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  detectOutputFormat,
+  BailianError,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+import { displayWidth, padEnd } from "../../output/cjk-width.ts";
+
+const HISTORY_API = "zeldaEasy.broadscope-platform.modelInstance.listModelLimitApplications";
+
+interface LimitApplicationItem {
+  gmtCreate: string;
+  deployedModel: string;
+  usageLimit: number;
+  endTime?: string;
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+function formatDateTime(ts: string | undefined): string {
+  if (!ts) return "-";
+  try {
+    const date = new Date(ts);
+    if (isNaN(date.getTime())) return ts;
+    const y = date.getFullYear();
+    const mo = String(date.getMonth() + 1).padStart(2, "0");
+    const d = String(date.getDate()).padStart(2, "0");
+    const h = String(date.getHours()).padStart(2, "0");
+    const mi = String(date.getMinutes()).padStart(2, "0");
+    return `${y}-${mo}-${d} ${h}:${mi}`;
+  } catch {
+    return ts;
+  }
+}
+
+function formatNumber(num: number): string {
+  return num.toLocaleString("en-US");
+}
+
+function printTable(records: LimitApplicationItem[], noColor: boolean, total: number): void {
+  const bold = noColor ? (t: string) => t : (t: string) => `\x1b[1m${t}\x1b[0m`;
+  const dim = noColor ? (t: string) => t : (t: string) => `\x1b[2m${t}\x1b[0m`;
+
+  const headersCn = ["模型", "Token 账号限流", "申请时间"];
+  const headersEn = ["Model", "Token Limit", "Applied At"];
+
+  const rows = records.map((r) => [
+    r.deployedModel,
+    formatNumber(r.usageLimit),
+    formatDateTime(r.gmtCreate),
+  ]);
+
+  const widths = headersCn.map((label, col) =>
+    Math.max(
+      displayWidth(label),
+      displayWidth(headersEn[col]),
+      ...rows.map((row) => displayWidth(row[col])),
+    ),
+  );
+
+  const cnLine = headersCn.map((label, col) => bold(padEnd(label, widths[col]))).join("  ");
+  const enLine = headersEn.map((label, col) => dim(padEnd(label, widths[col]))).join("  ");
+  const separator = widths.map((w) => dim("─".repeat(w))).join("──");
+
+  process.stdout.write(cnLine + "\n");
+  process.stdout.write(enLine + "\n");
+  process.stdout.write(separator + "\n");
+
+  for (const row of rows) {
+    process.stdout.write(row.map((cell, col) => padEnd(cell, widths[col])).join("  ") + "\n");
+  }
+
+  process.stdout.write(dim(`\n共 ${total} 条记录 (Total: ${total})`) + "\n");
+}
+
+export default defineCommand({
+  name: "quota history",
+  description: "View quota change history",
+  usage: "bl quota history [flags]",
+  options: [
+    {
+      flag: "--page <n>",
+      description: "Page number (default: 1)",
+    },
+    {
+      flag: "--page-size <n>",
+      description: "Page size (default: 10)",
+    },
+    {
+      flag: "--model <model>",
+      description: "Filter the fetched page by model name (table output only)",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: [
+    "bl quota history",
+    "bl quota history --page 2",
+    "bl quota history --page-size 20",
+    "bl quota history --model qwen-turbo",
+    "bl quota history --output json",
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const page = Number(flags.page) || 1;
+    const pageSize = Number(flags.pageSize) || 10;
+    const modelFilter = (flags.model as string) || undefined;
+    const region = (flags.region as string) || "cn-beijing";
+    const format = detectOutputFormat(config.output);
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    const requestData = {
+      input: { pageNo: page, pageSize },
+    };
+
+    if (config.dryRun) {
+      emitResult({ api: HISTORY_API, data: requestData, region }, format);
+      return;
+    }
+
+    let result: unknown;
+    try {
+      result = await callConsoleGateway(config, credential.token, {
+        api: HISTORY_API,
+        data: requestData,
+        region,
+      });
+    } catch (err) {
+      if (err instanceof BailianError && err.message.includes("NotLogined")) {
+        process.stderr.write(
+          "Error: session expired. Run `bl auth login --console` to re-authenticate.\n",
+        );
+        process.exit(1);
+      }
+      throw err;
+    }
+
+    if (format === "json") {
+      emitResult(result, format);
+      return;
+    }
+
+    const resp = extractResponseData(result as Record<string, unknown>);
+    let records = (resp.records as LimitApplicationItem[]) ?? [];
+    const total = (resp.items as number) ?? records.length;
+
+    if (modelFilter) {
+      records = records.filter((r) => r.deployedModel === modelFilter);
+    }
+
+    if (records.length === 0) {
+      process.stdout.write("No quota change history found.\n");
+      return;
+    }
+
+    printTable(records, config.noColor, modelFilter ? records.length : total);
+  },
+});
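Review note: the `extractResponseData(result ...)` call in `run` above unwraps the gateway's response envelope before `records` is read. The following is a minimal, self-contained sketch of that unwrapping, modeled on the `extractResponseData` helper defined in `quota/list.ts` in this diff. The names `asRecord` and `unwrap` and the sample envelopes are hypothetical illustrations, not real gateway output.

```typescript
type Json = Record<string, unknown>;

// Hypothetical helper: returns the value only if it is a plain (non-array) object.
function asRecord(value: unknown): Json | undefined {
  if (value && typeof value === "object" && !Array.isArray(value)) return value as Json;
  return undefined;
}

// Sketch of the unwrapping order: data.DataV2.data.data first, then data.data, then data.
function unwrap(result: Json): Json {
  const data = asRecord(result["data"]);
  if (!data) return result; // no envelope at all: hand back the input unchanged
  const dataV2 = asRecord(data["DataV2"]);
  if (dataV2) {
    const inner = asRecord(dataV2["data"]);
    const innerData = inner ? asRecord(inner["data"]) : undefined;
    return innerData ?? inner ?? dataV2; // deepest level that exists
  }
  return asRecord(data["data"]) ?? data;
}

// Assumed envelope shapes, for illustration only.
const nested = unwrap({ data: { DataV2: { data: { data: { records: [1, 2] } } } } });
const flat = unwrap({ data: { data: { total: 3 } } });
const bare = unwrap({ code: "NotLogined" });
```

Under these assumptions, all three shapes resolve to the object that carries the payload, which is why the command can read `resp.records` without knowing which envelope the gateway returned.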
diff --git a/packages/cli/src/commands/quota/list.ts b/packages/cli/src/commands/quota/list.ts
new file mode 100644
index 0000000..3002757
--- /dev/null
+++ b/packages/cli/src/commands/quota/list.ts
@@ -0,0 +1,230 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  detectOutputFormat,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+import { displayWidth, padEnd } from "../../output/cjk-width.ts";
+
+const MODEL_LIST_API = "zeldaHttp.dashscopeModel./zelda/api/v1/modelCenter/listFoundationModels";
+
+interface QpmInfoItem {
+  count_limit: number;
+  count_limit_period: number;
+  usage_limit: number;
+  usage_limit_period: number;
+  usage_limit_field: string;
+  type: string;
+}
+
+interface ModelWithQpm {
+  model: string;
+  qpmInfo?: Record<string, QpmInfoItem>;
+}
+
+function calculateRPM(item: QpmInfoItem | undefined, fallbackPeriod?: number): number {
+  if (!item) return 0;
+  const period = item.count_limit_period || fallbackPeriod;
+  if (!period) return 0;
+  return Math.floor((item.count_limit * 60) / period);
+}
+
+function calculateTPM(item: QpmInfoItem | undefined, fallbackPeriod?: number): number {
+  if (!item) return 0;
+  const period = item.usage_limit_period || fallbackPeriod;
+  if (!period) return 0;
+  return Math.floor((item.usage_limit * 60) / period);
+}
+
+function formatNumber(num: number): string {
+  return num.toLocaleString("en-US");
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+async function fetchAllModelsWithQpm(
+  config: Config,
+  token: string,
+  region: string,
+  onlySelfService: boolean,
+): Promise<ModelWithQpm[]> {
+  const allModels: ModelWithQpm[] = [];
+  let pageNo = 1;
+
+  while (true) {
+    const input: Record<string, unknown> = {
+      pageNo,
+      pageSize: 50,
+      group: false,
+      queryQpmInfo: true,
+      ignoreWorkspaceServiceSite: true,
+    };
+    if (onlySelfService) {
+      input.supports = { selfServiceLimitIncrease: true };
+    }
+
+    const raw = await callConsoleGateway(config, token, {
+      api: MODEL_LIST_API,
+      data: { input },
+      region,
+    });
+
+    const resp = extractResponseData(raw as Record<string, unknown>);
+    const list = (resp.list as ModelWithQpm[]) ?? [];
+    const total = (resp.total as number) ?? 0;
+
+    allModels.push(...list);
+    if (allModels.length >= total || list.length === 0) break;
+    pageNo++;
+  }
+
+  return allModels;
+}
+
+function printTable(models: ModelWithQpm[], noColor: boolean): void {
+  const bold = noColor ? (t: string) => t : (t: string) => `\x1b[1m${t}\x1b[0m`;
+  const dim = noColor ? (t: string) => t : (t: string) => `\x1b[2m${t}\x1b[0m`;
+
+  const headersCn = ["模型", "RPM", "TPM", "可设上限 TPM"];
+  const headersEn = ["Model", "Req/min", "Token/min", "Max TPM"];
+
+  const rows = models.map((m) => {
+    const qpm = m.qpmInfo;
+    const modelDefault = qpm?.["model-default"];
+    const userSpec = qpm?.["user-spec"];
+
+    const defaultRPM = calculateRPM(modelDefault);
+    const defaultTPM = calculateTPM(modelDefault);
+    const currentRPM = calculateRPM(userSpec, modelDefault?.count_limit_period) || defaultRPM;
+    const currentTPM = calculateTPM(userSpec, modelDefault?.usage_limit_period) || defaultTPM;
+    const maxTPM = defaultTPM * 2;
+
+    return [
+      m.model,
+      currentRPM > 0 ? formatNumber(currentRPM) : "-",
+      currentTPM > 0 ? formatNumber(currentTPM) : "-",
+      maxTPM > 0 ? formatNumber(maxTPM) : "-",
+    ];
+  });
+
+  if (rows.length === 0) {
+    process.stdout.write("No models found.\n");
+    return;
+  }
+
+  const widths = headersCn.map((label, col) =>
+    Math.max(
+      displayWidth(label),
+      displayWidth(headersEn[col]),
+      ...rows.map((row) => displayWidth(row[col])),
+    ),
+  );
+
+  const cnLine = headersCn.map((label, col) => bold(padEnd(label, widths[col]))).join("  ");
+  const enLine = headersEn.map((label, col) => dim(padEnd(label, widths[col]))).join("  ");
+  const separator = widths.map((w) => dim("─".repeat(w))).join("──");
+
+  process.stdout.write(cnLine + "\n");
+  process.stdout.write(enLine + "\n");
+  process.stdout.write(separator + "\n");
+
+  for (const row of rows) {
+    process.stdout.write(row.map((cell, col) => padEnd(cell, widths[col])).join("  ") + "\n");
+  }
+
+  process.stdout.write(dim(`\n共 ${models.length} 个模型 (Total: ${models.length})`) + "\n");
+}
+
+export default defineCommand({
+  name: "quota list",
+  description: "View model RPM/TPM rate limits",
+  usage: "bl quota list [--model <model>] [flags]",
+  options: [
+    {
+      flag: "--model <model>",
+      description: "Model name(s), comma-separated",
+    },
+    {
+      flag: "--all",
+      description: "Show all models, not only those that support self-service limit increases",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: [
+    "bl quota list",
+    "bl quota list --model qwen3.6-plus",
+    "bl quota list --model qwen3.6-plus,qwen-turbo",
+    "bl quota list --all",
+    "bl quota list --output json",
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const modelFlag = (flags.model as string) || undefined;
+    const showAll = Boolean(flags.all);
+    const region = (flags.region as string) || "cn-beijing";
+    const format = detectOutputFormat(config.output);
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    if (config.dryRun) {
+      const input: Record<string, unknown> = {
+        pageNo: 1,
+        pageSize: 50,
+        group: false,
+        queryQpmInfo: true,
+        ignoreWorkspaceServiceSite: true,
+      };
+      if (!showAll) input.supports = { selfServiceLimitIncrease: true };
+      emitResult({ api: MODEL_LIST_API, data: { input }, region }, format);
+      return;
+    }
+
+    let models = await fetchAllModelsWithQpm(config, credential.token, region, !showAll);
+
+    if (modelFlag) {
+      const names = new Set(
+        modelFlag
+          .split(",")
+          .map((n) => n.trim())
+          .filter(Boolean),
+      );
+      models = models.filter((m) => names.has(m.model));
+      if (models.length === 0) {
+        process.stderr.write(`Error: no matching models found for "${modelFlag}".\n`);
+        process.exit(1);
+      }
+    }
+
+    if (format === "json") {
+      emitResult(models, format);
+      return;
+    }
+
+    printTable(models, config.noColor);
+  },
+});
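Review note: `calculateRPM` and `calculateTPM` in this file both scale a limit defined over an arbitrary period to a per-minute figure. The sketch below isolates that arithmetic in one function. It assumes the `*_limit_period` fields are expressed in seconds, which the `* 60` factor suggests but the diff does not state; the name `perMinute` is hypothetical.

```typescript
// Hypothetical helper mirroring the arithmetic of calculateRPM/calculateTPM.
// Assumption: periods are in seconds.
function perMinute(
  limit: number,
  periodSeconds: number | undefined,
  fallbackPeriodSeconds?: number,
): number {
  // A missing or zero period falls back to the model default's period.
  const period = periodSeconds || fallbackPeriodSeconds;
  if (!period) return 0; // no usable period: report "unknown" as 0
  return Math.floor((limit * 60) / period);
}

const rpm = perMinute(100, 10); // 100 requests per 10 s -> 600 per minute
const tpm = perMinute(50_000, 0, 60); // period missing, fallback of 60 s -> 50000
const unknown = perMinute(100, undefined); // no period anywhere -> 0
```

A result of 0 is what `printTable` renders as `-`, and it is also what lets `calculateRPM(userSpec, ...) || defaultRPM` fall back to the model default.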
diff --git a/packages/cli/src/commands/quota/request.ts b/packages/cli/src/commands/quota/request.ts
new file mode 100644
index 0000000..60f727b
--- /dev/null
+++ b/packages/cli/src/commands/quota/request.ts
@@ -0,0 +1,220 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  detectOutputFormat,
+  BailianError,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+
+const MODEL_LIST_API = "zeldaHttp.dashscopeModel./zelda/api/v1/modelCenter/listFoundationModels";
+const UPDATE_LIMITS_API = "zeldaEasy.broadscope-platform.modelInstance.updateFoundationModelLimits";
+
+interface QpmInfoItem {
+  count_limit: number;
+  count_limit_period: number;
+  usage_limit: number;
+  usage_limit_period: number;
+  usage_limit_field: string;
+  type: string;
+}
+
+function calculateTPM(item: QpmInfoItem | undefined, fallbackPeriod?: number): number {
+  if (!item) return 0;
+  const period = item.usage_limit_period || fallbackPeriod;
+  if (!period) return 0;
+  return Math.floor((item.usage_limit * 60) / period);
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+async function fetchModelQpmInfo(
+  config: Config,
+  token: string,
+  region: string,
+  modelName: string,
+): Promise<{ model: string; qpmInfo: Record<string, QpmInfoItem> } | undefined> {
+  const raw = await callConsoleGateway(config, token, {
+    api: MODEL_LIST_API,
+    data: {
+      input: {
+        pageNo: 1,
+        pageSize: 50,
+        name: modelName,
+        group: false,
+        queryQpmInfo: true,
+        ignoreWorkspaceServiceSite: true,
+        supports: { selfServiceLimitIncrease: true },
+      },
+    },
+    region,
+  });
+
+  const resp = extractResponseData(raw as Record<string, unknown>);
+  const list = (resp.list as Array<{ model: string; qpmInfo?: Record<string, QpmInfoItem> }>) ?? [];
+  return list.find((m) => m.model === modelName && m.qpmInfo) as
+    | { model: string; qpmInfo: Record<string, QpmInfoItem> }
+    | undefined;
+}
+
+export default defineCommand({
+  name: "quota request",
+  description: "Request a temporary quota increase",
+  usage: "bl quota request --model <model> --tpm <value> [flags]",
+  options: [
+    {
+      flag: "--model <model>",
+      description: "Model name (required)",
+      required: true,
+    },
+    {
+      flag: "--tpm <value>",
+      description: "Target TPM value (required)",
+      required: true,
+    },
+    {
+      flag: "--yes",
+      description: "Skip downgrade confirmation",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: [
+    "bl quota request --model qwen-turbo --tpm 100000",
+    "bl quota request --model qwen3.6-plus --tpm 8000000 --yes",
+    "bl quota request --model qwen-turbo --tpm 100000 --output json",
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const modelName = flags.model as string;
+    if (!modelName) {
+      process.stderr.write("Error: --model is required.\n");
+      process.exit(1);
+    }
+
+    const tpmValue = Number(flags.tpm);
+    if (!tpmValue || tpmValue <= 0) {
+      process.stderr.write("Error: --tpm must be a positive number.\n");
+      process.exit(1);
+    }
+
+    const autoConfirm = Boolean(flags.yes) || config.yes;
+    const region = (flags.region as string) || "cn-beijing";
+    const format = detectOutputFormat(config.output);
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    const modelInfo = await fetchModelQpmInfo(config, credential.token, region, modelName);
+    if (!modelInfo) {
+      process.stderr.write(
+        `Error: model "${modelName}" not found or does not support self-service quota increase.\n`,
+      );
+      process.stderr.write("Hint: run `bl quota list` to view available models.\n");
+      process.exit(1);
+    }
+
+    const modelDefault = modelInfo.qpmInfo["model-default"];
+    const userSpec = modelInfo.qpmInfo["user-spec"];
+    const minLimit = calculateTPM(modelDefault);
+    const currentLimit = calculateTPM(userSpec, modelDefault?.usage_limit_period) || minLimit;
+    const maxLimit = minLimit * 2;
+
+    if (tpmValue < minLimit || tpmValue > maxLimit) {
+      process.stderr.write(
+        `Error: TPM value ${tpmValue.toLocaleString()} is out of range.\n` +
+          `  Current: ${currentLimit.toLocaleString()}\n` +
+          `  Range:   ${minLimit.toLocaleString()} ~ ${maxLimit.toLocaleString()}\n`,
+      );
+      process.exit(1);
+    }
+
+    const requestData = {
+      input: {
+        model: modelName,
+        limit: { usage_limit: tpmValue },
+        originalQpmInfo: modelInfo.qpmInfo,
+      } as Record<string, unknown>,
+    };
+
+    if (config.dryRun) {
+      emitResult({ api: UPDATE_LIMITS_API, data: requestData, region }, format);
+      return;
+    }
+
+    const submitRequest = async (confirmedDowngrade?: boolean): Promise<unknown> => {
+      if (confirmedDowngrade) {
+        requestData.input.confirmedDowngrade = true;
+      }
+      try {
+        return await callConsoleGateway(config, credential.token, {
+          api: UPDATE_LIMITS_API,
+          data: requestData,
+          region,
+        });
+      } catch (err) {
+        if (err instanceof BailianError && err.message.includes("NotLogined")) {
+          process.stderr.write(
+            "Error: session expired. Run `bl auth login --console` to re-authenticate.\n",
+          );
+          process.exit(1);
+        }
+        throw err;
+      }
+    };
+
+    let result = await submitRequest();
+    const resp = extractResponseData(result as Record<string, unknown>);
+
+    if (resp.needConfirm) {
+      const confirmCode = resp.confirmCode as string;
+
+      if (confirmCode === "Refresh_Required") {
+        process.stderr.write("Error: rate limit has been updated externally. Please retry.\n");
+        process.exit(1);
+      }
+
+      if (confirmCode === "Downgrade") {
+        if (!autoConfirm) {
+          process.stderr.write(
+            `Warning: target TPM (${tpmValue.toLocaleString()}) is lower than current (${currentLimit.toLocaleString()}).\n` +
+              "Use --yes to confirm downgrade.\n",
+          );
+          process.exit(1);
+        }
+        result = await submitRequest(true);
+      }
+    }
+
+    if (format === "json") {
+      emitResult(result, format);
+      return;
+    }
+
+    process.stdout.write(
+      `Quota updated for "${modelName}": TPM ${currentLimit.toLocaleString()} → ${tpmValue.toLocaleString()}\n`,
+    );
+  },
+});
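Review note: the range check in `run` above accepts a target TPM only between the model default and twice that default. A small sketch of that rule follows, with the hypothetical names `tpmRange` and `isWithinRange`; the numbers are made up for illustration.

```typescript
interface TpmRange {
  min: number;
  max: number;
}

// Mirrors `minLimit = calculateTPM(modelDefault)` and `maxLimit = minLimit * 2`.
function tpmRange(defaultTpm: number): TpmRange {
  return { min: defaultTpm, max: defaultTpm * 2 };
}

// Both bounds are inclusive, matching `tpmValue < minLimit || tpmValue > maxLimit`.
function isWithinRange(target: number, range: TpmRange): boolean {
  return target >= range.min && target <= range.max;
}

const range = tpmRange(5_000_000); // illustrative default TPM
const okAtMax = isWithinRange(10_000_000, range); // true: upper bound is inclusive
const tooHigh = isWithinRange(10_000_001, range); // false
const tooLow = isWithinRange(4_999_999, range); // false: cannot go below the default
```

Because the lower bound is the model default rather than the current `user-spec` value, a target below the current limit can still pass this check; that is the case the later `Downgrade` confirmation handles.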
diff --git a/packages/cli/src/commands/usage/free.ts b/packages/cli/src/commands/usage/free.ts
index 215fdf1..3dd11a0 100644
--- a/packages/cli/src/commands/usage/free.ts
+++ b/packages/cli/src/commands/usage/free.ts
@@ -2,24 +2,210 @@ import {
   defineCommand,
   callConsoleGateway,
   resolveConsoleGatewayCredential,
+  fetchModelList,
   detectOutputFormat,
   type Config,
   type GlobalFlags,
 } from "bailian-cli-core";
-import { failIfMissing } from "../../output/prompt.ts";
 import { emitResult } from "../../output/output.ts";
+import { displayWidth, padEnd } from "../../output/cjk-width.ts";
 
 const FREE_TIER_API = "zeldaEasy.broadscope-bailian.freeTrial.queryFreeTierQuota";
+const FREE_TIER_ONLY_STATUS_API = "zeldaEasy.broadscope-bailian.freeTrial.queryFreeTierOnlyStatus";
+
+interface FreeTierQuota {
+  model: string;
+  quotaInitTotal: number;
+  quotaTotal: number;
+  quotaValidityPeriod: number;
+  quotaStatus: string;
+}
+
+interface FreeTierOnlyStatus {
+  model: string;
+  freeTierOnly: boolean;
+}
+
+function formatNumber(num: number): string {
+  return num.toLocaleString("en-US");
+}
+
+function formatDate(ts: number): string {
+  const date = new Date(ts);
+  const year = date.getFullYear();
+  const month = String(date.getMonth() + 1).padStart(2, "0");
+  const day = String(date.getDate()).padStart(2, "0");
+  return `${year}-${month}-${day}`;
+}
+
+function formatUsage(quota: FreeTierQuota): string {
+  if (!quota.quotaInitTotal) return "-";
+  const used = quota.quotaInitTotal - quota.quotaTotal;
+  const percent = (used / quota.quotaInitTotal) * 100;
+  return `${percent.toFixed(1)}%`;
+}
+
+const CAPABILITY_TO_TYPE: Record<string, string> = {
+  Reasoning: "Text",
+  TG: "Text",
+  VU: "Text",
+  IG: "Vision",
+  VG: "Vision",
+  "Realtime-Omni": "Multimodal",
+  "Multimodal-Omni": "Multimodal",
+  ASR: "Audio",
+  TTS: "Audio",
+  "Voice-Replication": "Audio",
+  "Realtime-Text-to-Speech": "Audio",
+  "Realtime-Voice-Replication": "Audio",
+  "Realtime-ASR": "Audio",
+  "Realtime-Audio-Translate": "Audio",
+  ME: "Embedding",
+  TR: "Embedding",
+};
+
+function resolveModelType(capabilities: string[]): string {
+  for (const cap of capabilities) {
+    const type = CAPABILITY_TO_TYPE[cap];
+    if (type) return type;
+  }
+  return "-";
+}
+
+function printTable(
+  quotas: FreeTierQuota[],
+  stopMap: Map<string, boolean>,
+  typeMap: Map<string, string>,
+  noColor: boolean,
+): void {
+  const headersCn = ["模型", "类型", "剩余/总量", "使用率", "过期时间", "用完即停"];
+  const headersEn = ["Model", "Type", "Remaining/Total", "Usage", "Expires", "Auto-Stop"];
+
+  const rows = quotas.map((quota) => {
+    const hasQuota = quota.quotaInitTotal != null && quota.quotaTotal != null;
+    const remaining = hasQuota ? formatNumber(quota.quotaTotal) : "-";
+    const total = hasQuota ? formatNumber(quota.quotaInitTotal) : "-";
+    const stopStatus = stopMap.get(quota.model);
+    return [
+      quota.model,
+      typeMap.get(quota.model) || "-",
+      hasQuota ? `${remaining} / ${total}` : "-",
+      formatUsage(quota),
+      quota.quotaValidityPeriod ? formatDate(quota.quotaValidityPeriod) : "-",
+      quota.quotaStatus === "UNKNOWN"
+        ? "Unsupported"
+        : stopStatus === true
+          ? "ON"
+          : stopStatus === false
+            ? "OFF"
+            : "-",
+    ];
+  });
+
+  const widths = headersCn.map((label, col) =>
+    Math.max(
+      displayWidth(label),
+      displayWidth(headersEn[col]),
+      ...rows.map((row) => displayWidth(row[col])),
+    ),
+  );
+
+  const dim = noColor ? (text: string) => text : (text: string) => `\x1b[2m${text}\x1b[0m`;
+  const bold = noColor ? (text: string) => text : (text: string) => `\x1b[1m${text}\x1b[0m`;
+  const green = noColor ? (text: string) => text : (text: string) => `\x1b[32m${text}\x1b[0m`;
+  const yellow = noColor ? (text: string) => text : (text: string) => `\x1b[33m${text}\x1b[0m`;
+
+  const autoStopCol = headersCn.length - 1;
+  const cnLine = headersCn.map((label, col) => bold(padEnd(label, widths[col]))).join("  ");
+  const enLine = headersEn.map((label, col) => dim(padEnd(label, widths[col]))).join("  ");
+  const separator = widths.map((width) => dim("─".repeat(width))).join("──");
+
+  process.stdout.write(cnLine + "\n");
+  process.stdout.write(enLine + "\n");
+  process.stdout.write(separator + "\n");
+
+  for (const row of rows) {
+    const cells = row.map((cell, col) => {
+      if (col === autoStopCol) {
+        if (cell === "ON") return green(padEnd(cell, widths[col]));
+        if (cell === "OFF") return yellow(padEnd(cell, widths[col]));
+      }
+      return padEnd(cell, widths[col]);
+    });
+    process.stdout.write(cells.join("  ") + "\n");
+  }
+}
+
+function extractQuotas(result: unknown): FreeTierQuota[] {
+  const root = result as Record<string, unknown>;
+  const data = root.data as Record<string, unknown> | undefined;
+  if (!data) return [];
+
+  const dataV2 = data.DataV2 as Record<string, unknown> | undefined;
+  if (dataV2) {
+    const inner = dataV2.data as Record<string, unknown> | undefined;
+    const innerData = inner?.data as Record<string, unknown> | undefined;
+    return (innerData?.freeTierQuotas as FreeTierQuota[]) || [];
+  }
+
+  const direct = data.data as Record<string, unknown> | undefined;
+  return (direct?.freeTierQuotas as FreeTierQuota[]) || [];
+}
+
+function extractFreeTierOnlyStatuses(result: unknown): FreeTierOnlyStatus[] {
+  const root = result as Record<string, unknown>;
+  const data = root.data as Record<string, unknown> | undefined;
+  if (!data) return [];
+
+  const dataV2 = data.DataV2 as Record<string, unknown> | undefined;
+  if (dataV2) {
+    const inner = dataV2.data as Record<string, unknown> | undefined;
+    const innerData = inner?.data as Record<string, unknown> | undefined;
+    return (innerData?.freeTierOnlyStatuses as FreeTierOnlyStatus[]) || [];
+  }
+
+  const direct = data.data as Record<string, unknown> | undefined;
+  return (direct?.freeTierOnlyStatuses as FreeTierOnlyStatus[]) || [];
+}
+
+interface ModelInfo {
+  name: string;
+  type: string;
+}
+
+async function fetchAllModels(config: Config, token: string): Promise<ModelInfo[]> {
+  const allModels: Record<string, unknown>[] = [];
+  let page = 1;
+  while (true) {
+    const result = await fetchModelList(config, token, { pageNo: page, pageSize: 50 });
+    allModels.push(...result.models);
+    if (allModels.length >= result.total || result.models.length === 0) break;
+    page++;
+  }
+  return allModels
+    .filter((item) => typeof item.model === "string" && item.model)
+    .map((item) => ({
+      name: item.model as string,
+      type: resolveModelType((item.capabilities as string[]) || []),
+    }));
+}
 
 export default defineCommand({
   name: "usage free",
-  description: "Query free-tier quota for a model",
-  usage: "bl usage free --model <model> [flags]",
+  description: "Query free-tier quota for models (all models if --model is omitted)",
+  usage: "bl usage free [--model <model>[,model2,...]] [flags]",
   options: [
     {
       flag: "--model <model>",
-      description: "Model name to query (e.g. qwen3-max, qwen-turbo)",
-      required: true,
+      description: "Model name(s) to query, comma-separated for multiple; omit for all models",
+    },
+    {
+      flag: "--expiring <days>",
+      description: "Only show quotas expiring within N days",
+    },
+    {
+      flag: "--sort <field>",
+      description: "Sort ascending by: remaining (share of quota left), expires (expiry date)",
     },
     {
       flag: "--region <region>",
@@ -27,39 +213,122 @@ export default defineCommand({
     },
   ],
   examples: [
+    "bl usage free",
     "bl usage free --model qwen3-max",
+    "bl usage free --model qwen3-max,qwen-turbo",
+    "bl usage free --expiring 30",
+    "bl usage free --sort remaining",
     "bl usage free --model qwen-turbo --output json",
     "bl usage free --model qwen3-max --region cn-beijing",
   ],
   async run(config: Config, flags: GlobalFlags) {
-    const model = flags.model as string;
-    if (!model) failIfMissing("model", "bl usage free --model <model>");
-
+    const modelFlag = (flags.model as string) || undefined;
+    const expiringDays = Number(flags.expiring) || 0;
+    const VALID_SORT_FIELDS = ["remaining", "expires"] as const;
+    const sortField = (flags.sort as string) || undefined;
+    if (sortField && !VALID_SORT_FIELDS.includes(sortField as (typeof VALID_SORT_FIELDS)[number])) {
+      process.stderr.write(
+        `Error: invalid --sort value "${sortField}". Must be one of: ${VALID_SORT_FIELDS.join(", ")}\n`,
+      );
+      process.exit(1);
+    }
     const region = (flags.region as string) || "cn-beijing";
     const format = detectOutputFormat(config.output);
 
     const credential = await resolveConsoleGatewayCredential(config);
 
-    const data = {
-      queryFreeTierQuotaRequest: {
-        models: [model],
-      },
+    let models: string[];
+    const typeMap = new Map<string, string>();
+
+    if (modelFlag) {
+      models = [
+        ...new Set(
+          modelFlag
+            .split(",")
+            .map((name) => name.trim())
+            .filter(Boolean),
+        ),
+      ];
+      const searchResults = await Promise.all(
+        models.map((name) => fetchModelList(config, credential.token, { name, pageSize: 50 })),
+      );
+      for (let idx = 0; idx < models.length; idx++) {
+        const matched = searchResults[idx].models.find((item) => item.model === models[idx]);
+        if (matched) {
+          typeMap.set(models[idx], resolveModelType((matched.capabilities as string[]) || []));
+        }
+      }
+    } else {
+      const modelInfos = await fetchAllModels(config, credential.token);
+      models = modelInfos.map((info) => info.name);
+      for (const info of modelInfos) {
+        typeMap.set(info.name, info.type);
+      }
+    }
+
+    const requestData = {
+      queryFreeTierQuotaRequest: { models },
     };
 
     if (config.dryRun) {
       emitResult(
-        { api: FREE_TIER_API, data, region, token: credential.token.slice(0, 8) + "..." },
+        {
+          api: FREE_TIER_API,
+          data: requestData,
+          region,
+          token: credential.token.slice(0, 8) + "...",
+        },
         format,
       );
       return;
     }
 
-    const result = await callConsoleGateway(config, credential.token, {
-      api: FREE_TIER_API,
-      data,
-      region,
-    });
+    const [quotaResult, stopResult] = await Promise.all([
+      callConsoleGateway(config, credential.token, {
+        api: FREE_TIER_API,
+        data: requestData,
+        region,
+      }),
+      callConsoleGateway(config, credential.token, {
+        api: FREE_TIER_ONLY_STATUS_API,
+        data: { queryFreeTierOnlyStatusRequest: { models } },
+        region,
+      }),
+    ]);
+
+    if (format === "json") {
+      emitResult(quotaResult, format);
+      return;
+    }
+
+    const allQuotas = extractQuotas(quotaResult);
+    let quotas = modelFlag
+      ? allQuotas
+      : allQuotas.filter((quota) => quota.quotaStatus === "VALID" && quota.quotaInitTotal > 0);
+
+    if (expiringDays > 0) {
+      const cutoff = Date.now() + expiringDays * 24 * 60 * 60 * 1000;
+      quotas = quotas.filter((q) => q.quotaValidityPeriod > 0 && q.quotaValidityPeriod <= cutoff);
+    }
+
+    if (sortField === "remaining") {
+      quotas.sort((a, b) => {
+        const pctA = a.quotaInitTotal ? a.quotaTotal / a.quotaInitTotal : 0;
+        const pctB = b.quotaInitTotal ? b.quotaTotal / b.quotaInitTotal : 0;
+        return pctA - pctB;
+      });
+    } else if (sortField === "expires") {
+      quotas.sort((a, b) => (a.quotaValidityPeriod ?? 0) - (b.quotaValidityPeriod ?? 0));
+    }
+
+    if (quotas.length === 0) {
+      process.stdout.write("No free-tier quota found.\n");
+      return;
+    }
+
+    const stopStatuses = extractFreeTierOnlyStatuses(stopResult);
+    const stopMap = new Map(stopStatuses.map((status) => [status.model, status.freeTierOnly]));
 
-    emitResult(result, format);
+    printTable(quotas, stopMap, typeMap, config.noColor);
   },
 });
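Review note: the table's usage column and the `--expiring` filter in this file reduce to two small computations. The sketch below restates them in isolation, assuming `quotaValidityPeriod` is an epoch timestamp in milliseconds (consistent with its comparison against `Date.now()`). The names `QuotaSketch`, `usagePercent`, and `expiringWithin` and the sample data are hypothetical.

```typescript
interface QuotaSketch {
  model: string;
  quotaInitTotal: number; // quota originally granted
  quotaTotal: number; // quota still remaining
  quotaValidityPeriod: number; // assumed: expiry as epoch milliseconds, 0 if unknown
}

// Used share of the initial quota, formatted like formatUsage in the diff.
function usagePercent(q: QuotaSketch): string {
  if (!q.quotaInitTotal) return "-";
  const used = q.quotaInitTotal - q.quotaTotal;
  return `${((used / q.quotaInitTotal) * 100).toFixed(1)}%`;
}

// Keeps quotas with a known expiry that falls within `days` of `now`.
function expiringWithin(quotas: QuotaSketch[], days: number, now: number): QuotaSketch[] {
  const cutoff = now + days * 24 * 60 * 60 * 1000;
  return quotas.filter((q) => q.quotaValidityPeriod > 0 && q.quotaValidityPeriod <= cutoff);
}

const DAY_MS = 24 * 60 * 60 * 1000;
const now = 1_000_000_000_000; // fixed clock so the example is deterministic
const sample: QuotaSketch[] = [
  { model: "a", quotaInitTotal: 1_000_000, quotaTotal: 250_000, quotaValidityPeriod: now + 5 * DAY_MS },
  { model: "b", quotaInitTotal: 1_000_000, quotaTotal: 900_000, quotaValidityPeriod: now + 40 * DAY_MS },
  { model: "c", quotaInitTotal: 0, quotaTotal: 0, quotaValidityPeriod: 0 },
];

const soon = expiringWithin(sample, 30, now).map((q) => q.model); // ["a"]
const usedA = usagePercent(sample[0]); // "75.0%"
```

Passing the clock in as a parameter, rather than calling `Date.now()` inside the filter, is a choice made here only to keep the sketch testable; the command itself reads the current time directly.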
diff --git a/packages/cli/src/commands/usage/freetier.ts b/packages/cli/src/commands/usage/freetier.ts
new file mode 100644
index 0000000..8894b02
--- /dev/null
+++ b/packages/cli/src/commands/usage/freetier.ts
@@ -0,0 +1,252 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  fetchModelList,
+  detectOutputFormat,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+
+const ACTIVATE_API = "zeldaEasy.broadscope-bailian.freeTrial.batchActivateFreeTierOnly";
+const DEACTIVATE_API = "zeldaEasy.broadscope-bailian.freeTrial.batchDeactivateFreeTierOnly";
+const FREE_TIER_API = "zeldaEasy.broadscope-bailian.freeTrial.queryFreeTierQuota";
+const FREE_TIER_ONLY_STATUS_API = "zeldaEasy.broadscope-bailian.freeTrial.queryFreeTierOnlyStatus";
+
+interface FreeTierQuota {
+  model: string;
+  quotaTotal: number;
+  quotaInitTotal: number;
+}
+
+interface FreeTierOnlyStatus {
+  model: string;
+  freeTierOnly: boolean;
+}
+
+interface BatchResultFailure {
+  failureModelId: string;
+  errorCode: string;
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+const POLL_INTERVAL_MS = 500;
+const MAX_POLLS = 20;
+
+async function pollUntilDone(
+  config: Config,
+  token: string,
+  api: string,
+  requestKey: string,
+  models: string[],
+  region: string,
+): Promise<unknown> {
+  let nextTaskId: string | undefined;
+
+  for (let attempt = 0; attempt < MAX_POLLS; attempt++) {
+    const requestData = {
+      [requestKey]: nextTaskId ? { taskId: nextTaskId } : { models },
+    };
+
+    const raw = await callConsoleGateway(config, token, {
+      api,
+      data: requestData,
+      region,
+    });
+
+    const resp = extractResponseData(raw as Record<string, unknown>);
+    if (resp.taskId && Object.keys(resp).length === 1) {
+      nextTaskId = resp.taskId as string;
+      await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
+      continue;
+    }
+    return raw;
+  }
+  return null;
+}
+
+async function fetchAllModelNames(config: Config, token: string): Promise<string[]> {
+  const allModels: Record<string, unknown>[] = [];
+  let page = 1;
+  while (true) {
+    const result = await fetchModelList(config, token, { pageNo: page, pageSize: 50 });
+    allModels.push(...result.models);
+    if (allModels.length >= result.total || result.models.length === 0) break;
+    page++;
+  }
+  return allModels.map((item) => item.model as string).filter(Boolean);
+}
+
+export default defineCommand({
+  name: "usage freetier",
+  description:
+    "Enable or disable auto-stop for free-tier models. Enables by default; use --off to disable",
+  usage: "bl usage freetier <--model <model>[,model2,...] | --all> [--off] [flags]",
+  options: [
+    {
+      flag: "--model <model>",
+      description: "Model name(s), comma-separated for multiple",
+    },
+    {
+      flag: "--all",
+      description: "Apply to all free-tier models",
+    },
+    {
+      flag: "--on",
+      description: "Enable auto-stop (default behavior)",
+    },
+    {
+      flag: "--off",
+      description: "Disable auto-stop",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: [
+    "bl usage freetier --model qwen3-max",
+    "bl usage freetier --model qwen3-max,qwen-turbo",
+    "bl usage freetier --all",
+    "bl usage freetier --on --model qwen3-max",
+    "bl usage freetier --off --model qwen3-max",
+    "bl usage freetier --off --all",
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const modelFlag = (flags.model as string) || undefined;
+    const all = Boolean(flags.all);
+    const off = Boolean(flags.off);
+    const region = (flags.region as string) || "cn-beijing";
+    const format = detectOutputFormat(config.output);
+
+    if (!modelFlag && !all) {
+      process.stderr.write(
+        "Error: missing required flag. Specify --model <model>[,model2,...] or --all\n",
+      );
+      process.exit(1);
+    }
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    let models: string[];
+    if (modelFlag) {
+      models = [
+        ...new Set(
+          modelFlag
+            .split(",")
+            .map((name) => name.trim())
+            .filter(Boolean),
+        ),
+      ];
+    } else {
+      models = await fetchAllModelNames(config, credential.token);
+    }
+
+    const api = off ? DEACTIVATE_API : ACTIVATE_API;
+    const requestKey = off
+      ? "BatchDeactivateFreeTierOnlyRequest"
+      : "BatchActivateFreeTierOnlyRequest";
+
+    if (config.dryRun) {
+      emitResult(
+        {
+          api,
+          data: { [requestKey]: { models } },
+          region,
+          token: credential.token.slice(0, 8) + "...",
+        },
+        format,
+      );
+      return;
+    }
+
+    if (off) {
+      const [quotaResult, stopResult] = await Promise.all([
+        callConsoleGateway(config, credential.token, {
+          api: FREE_TIER_API,
+          data: { queryFreeTierQuotaRequest: { models } },
+          region,
+        }),
+        callConsoleGateway(config, credential.token, {
+          api: FREE_TIER_ONLY_STATUS_API,
+          data: { queryFreeTierOnlyStatusRequest: { models } },
+          region,
+        }),
+      ]);
+
+      const quotaData = extractResponseData(quotaResult as Record<string, unknown>);
+      const quotas = (quotaData.freeTierQuotas ?? []) as FreeTierQuota[];
+      const quotaMap = new Map(quotas.map((quota) => [quota.model, quota]));
+
+      const stopData = extractResponseData(stopResult as Record<string, unknown>);
+      const stopStatuses = (stopData.freeTierOnlyStatuses ?? []) as FreeTierOnlyStatus[];
+      const stopMap = new Map(stopStatuses.map((status) => [status.model, status.freeTierOnly]));
+
+      for (const name of models) {
+        if (stopMap.get(name) === false) {
+          process.stderr.write(`Auto-stop is already disabled for "${name}".\n`);
+          continue;
+        }
+        const quota = quotaMap.get(name);
+        if (quota && quota.quotaTotal > 0 && stopMap.get(name) === true) {
+          process.stderr.write(
+            `Cannot disable auto-stop for "${name}": free-tier quota has not been fully consumed. Please disable auto-stop after the quota is exhausted.\n`,
+          );
+          continue;
+        }
+        await pollUntilDone(config, credential.token, api, requestKey, [name], region);
+        process.stdout.write(`Disabled auto-stop for "${name}".\n`);
+      }
+      return;
+    }
+
+    const jsonResults: unknown[] = [];
+    for (const name of models) {
+      const result = await pollUntilDone(config, credential.token, api, requestKey, [name], region);
+      if (format === "json") {
+        jsonResults.push(result);
+        continue;
+      }
+      if (result) {
+        const resultData = extractResponseData(result as Record<string, unknown>);
+        const failureModels = (resultData.failureModels as BatchResultFailure[]) ?? [];
+        if (failureModels.length > 0) {
+          process.stderr.write(
+            `Failed to enable auto-stop for "${name}" (${failureModels[0].errorCode}).\n`,
+          );
+        } else {
+          process.stdout.write(`Enabled auto-stop for "${name}".\n`);
+        }
+      } else {
+        process.stderr.write(`Warning: operation timed out for "${name}".\n`);
+      }
+    }
+    if (format === "json") {
+      emitResult(jsonResults, format);
+    }
+  },
+});
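Review note: `usage freetier` above and `usage stats` below parse `--model` with the same split, trim, filter, and dedupe chain. A minimal sketch of a shared helper is given here, assuming the two call sites really are meant to behave identically; the name `parseModelList` is hypothetical and does not appear anywhere in this diff.

```typescript
// Hypothetical helper (not part of this diff): normalizes a comma-separated
// --model value the same way the inline code in freetier.ts and stats.ts does.
function parseModelList(raw: string): string[] {
  const names = raw
    .split(",")
    .map((name) => name.trim())
    .filter(Boolean); // drop empty entries such as the one in "a,,b"
  // A Set keeps the first occurrence of each name and preserves insertion order.
  return Array.from(new Set(names));
}

// Sample input is made up for illustration.
const parsed = parseModelList(" qwen3-max, qwen-turbo,,qwen3-max ");
console.log(parsed);
```

With such a helper, each command could replace its inline chain with a single call, which would keep the two parsers from drifting apart.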
diff --git a/packages/cli/src/commands/usage/stats.ts b/packages/cli/src/commands/usage/stats.ts
new file mode 100644
index 0000000..fb5074b
--- /dev/null
+++ b/packages/cli/src/commands/usage/stats.ts
@@ -0,0 +1,442 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  detectOutputFormat,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+import { displayWidth, padEnd } from "../../output/cjk-width.ts";
+
+const OVERVIEW_API = "zeldaEasy.bailian-telemetry.model.getModelUsageStatistic";
+const LIST_API = "zeldaEasy.bailian-telemetry.model.listModelUsageStatisticData";
+
+interface UsageItem {
+  key: string;
+  value: number;
+  unit: string;
+}
+
+interface OverviewStatistic {
+  callCount: number;
+  modelCount: number;
+  callSuccessCount: number;
+  usages: UsageItem[];
+}
+
+interface ModelStatisticItem {
+  model: string;
+  callSuccessCount: number;
+  usages?: UsageItem[];
+  usage?: Record<string, number | undefined>;
+}
+
+interface ListStatisticResponse {
+  list: ModelStatisticItem[];
+  totalCount: number;
+  maxResults: number;
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+const POLL_INTERVAL_MS = 500;
+const MAX_POLLS = 30;
+
+async function pollTelemetryApi(
+  config: Config,
+  token: string,
+  api: string,
+  reqDTO: Record<string, unknown>,
+  region: string,
+): Promise<unknown> {
+  let nextTaskId: string | undefined;
+
+  for (let attempt = 0; attempt < MAX_POLLS; attempt++) {
+    const requestData = nextTaskId
+      ? { reqDTO: { ...reqDTO, asyncTaskId: nextTaskId } }
+      : { reqDTO };
+
+    const raw = await callConsoleGateway(config, token, {
+      api,
+      data: requestData,
+      region,
+    });
+
+    const resp = extractResponseData(raw as Record<string, unknown>);
+
+    if (resp.taskId && Object.keys(resp).length === 1) {
+      nextTaskId = resp.taskId as string;
+      await new Promise((resolve) => setTimeout(resolve, POLL_INTERVAL_MS));
+      continue;
+    }
+
+    return raw;
+  }
+  return null;
+}
+
+function resolveWorkspaceId(config: Config, flagWorkspaceId?: string): string {
+  if (flagWorkspaceId) return flagWorkspaceId;
+  if (config.workspaceId) return config.workspaceId;
+
+  process.stderr.write(
+    "Error: workspace-id is required. Set via --workspace-id, BAILIAN_WORKSPACE_ID, or `bl config set workspace_id <id>`.\n",
+  );
+  process.stderr.write("Hint: run `bl workspace list` to view available workspaces.\n");
+  process.exit(1);
+}
+
+function formatNumber(num: number): string {
+  return num.toLocaleString("en-US");
+}
+
+function formatDate(ts: number): string {
+  const date = new Date(ts);
+  const year = date.getFullYear();
+  const month = String(date.getMonth() + 1).padStart(2, "0");
+  const day = String(date.getDate()).padStart(2, "0");
+  return `${year}-${month}-${day}`;
+}
+
+function extractOverviewData(result: unknown): OverviewStatistic | undefined {
+  const resp = extractResponseData(result as Record<string, unknown>);
+  if (resp.callSuccessCount !== undefined || resp.usages !== undefined) {
+    return resp as unknown as OverviewStatistic;
+  }
+  return undefined;
+}
+
+function extractListData(result: unknown): ListStatisticResponse {
+  const resp = extractResponseData(result as Record<string, unknown>);
+  const list = (resp.list as ModelStatisticItem[]) ?? [];
+  const totalCount = (resp.totalCount as number) ?? 0;
+  const maxResults = (resp.maxResults as number) ?? 0;
+  return { list, totalCount, maxResults };
+}
+
+function resolveUsageMap(item: ModelStatisticItem): Record<string, number> {
+  const out: Record<string, number> = {};
+  if (item.usages && Array.isArray(item.usages)) {
+    for (const entry of item.usages) {
+      if (entry.key && entry.value != null) {
+        out[entry.key] = entry.value;
+      }
+    }
+  }
+  if (item.usage && typeof item.usage === "object") {
+    for (const [key, val] of Object.entries(item.usage)) {
+      if (val != null) out[key] = val;
+    }
+  }
+  return out;
+}
+
+interface UsageLabel {
+  cn: string;
+  en: string;
+  unit?: string;
+}
+
+const USAGE_KEY_LABELS: Record<string, UsageLabel> = {
+  total_token: { cn: "总 Token", en: "Total Tokens", unit: "tokens" },
+  input_token: { cn: "输入 Token", en: "Input Tokens", unit: "tokens" },
+  output_token: { cn: "输出 Token", en: "Output Tokens", unit: "tokens" },
+  input_token_cache: { cn: "缓存 Token", en: "Cached Tokens", unit: "tokens" },
+  input_token_cache_read: { cn: "缓存读取", en: "Cache Read", unit: "tokens" },
+  input_token_cache_creation: { cn: "缓存创建", en: "Cache Creation", unit: "tokens" },
+  thinking_input_token: { cn: "思考输入", en: "Thinking Input", unit: "tokens" },
+  thinking_output_token: { cn: "思考输出", en: "Thinking Output", unit: "tokens" },
+  text_input_token: { cn: "文本输入", en: "Text Input", unit: "tokens" },
+  purein_text_output_token: { cn: "文本输出", en: "Text Output", unit: "tokens" },
+  embedding_token: { cn: "向量", en: "Embedding", unit: "tokens" },
+  image_number: { cn: "图片数", en: "Images", unit: "张" },
+  video_duration: { cn: "视频时长", en: "Video Duration", unit: "秒" },
+  content_duration: { cn: "音频时长", en: "Audio Duration", unit: "秒" },
+  tts_text_number: { cn: "语音合成", en: "TTS Chars", unit: "字符" },
+  total_token_avg: { cn: "平均 Token/次", en: "Avg Tokens/Req" },
+};
+
+function formatLabel(label: UsageLabel): string {
+  const unitSuffix = label.unit ? ` [${label.unit}]` : "";
+  return `${label.cn} (${label.en})${unitSuffix}`;
+}
+
+function printOverview(
+  stat: OverviewStatistic,
+  startTime: number,
+  endTime: number,
+  days: number,
+  noColor: boolean,
+): void {
+  const bold = noColor ? (text: string) => text : (text: string) => `\x1b[1m${text}\x1b[0m`;
+  const dim = noColor ? (text: string) => text : (text: string) => `\x1b[2m${text}\x1b[0m`;
+
+  process.stdout.write(
+    `${dim("时间范围 Period:")} ${formatDate(startTime)} ~ ${formatDate(endTime)} ${dim(`(${days} 天)`)}\n\n`,
+  );
+
+  const rows: [string, string][] = [
+    ["调用模型数 (Models Called)", formatNumber(stat.modelCount ?? 0)],
+    ["调用成功次数 (Successful Calls)", formatNumber(stat.callSuccessCount ?? 0)],
+  ];
+
+  for (const usage of stat.usages ?? []) {
+    const label = USAGE_KEY_LABELS[usage.key];
+    const text = label ? formatLabel(label) : usage.key;
+    rows.push([text, formatNumber(usage.value)]);
+  }
+
+  const maxLabel = Math.max(...rows.map(([label]) => displayWidth(label)));
+  for (const [label, value] of rows) {
+    process.stdout.write(`${bold(padEnd(label, maxLabel + 2))}${value}\n`);
+  }
+}
+
+function printModelTable(
+  items: ModelStatisticItem[],
+  startTime: number,
+  endTime: number,
+  days: number,
+  noColor: boolean,
+): void {
+  const bold = noColor ? (text: string) => text : (text: string) => `\x1b[1m${text}\x1b[0m`;
+  const dim = noColor ? (text: string) => text : (text: string) => `\x1b[2m${text}\x1b[0m`;
+
+  process.stdout.write(
+    `${dim("时间范围 Period:")} ${formatDate(startTime)} ~ ${formatDate(endTime)} ${dim(`(${days} 天)`)}\n\n`,
+  );
+
+  if (items.length === 0) {
+    process.stdout.write("No usage data found.\n");
+    return;
+  }
+
+  const usageKeys = new Set<string>();
+  const itemUsages = items.map((item) => {
+    const usage = resolveUsageMap(item);
+    for (const key of Object.keys(usage)) usageKeys.add(key);
+    return usage;
+  });
+
+  const orderedKeys = [...usageKeys].sort((keyA, keyB) => {
+    const order = [
+      "total_token",
+      "input_token",
+      "output_token",
+      "input_token_cache",
+      "image_number",
+      "video_duration",
+      "content_duration",
+      "tts_text_number",
+    ];
+    const idxA = order.indexOf(keyA);
+    const idxB = order.indexOf(keyB);
+    return (idxA === -1 ? 999 : idxA) - (idxB === -1 ? 999 : idxB);
+  });
+
+  const headersCn = [
+    "模型",
+    "调用次数",
+    ...orderedKeys.map((key) => {
+      const label = USAGE_KEY_LABELS[key];
+      if (!label) return key;
+      return label.unit ? `${label.cn} [${label.unit}]` : label.cn;
+    }),
+  ];
+  const headersEn = [
+    "Model",
+    "Calls",
+    ...orderedKeys.map((key) => USAGE_KEY_LABELS[key]?.en ?? key),
+  ];
+  const rows = items.map((item, idx) => [
+    item.model,
+    formatNumber(item.callSuccessCount ?? 0),
+    ...orderedKeys.map((key) => {
+      const val = itemUsages[idx][key];
+      return val != null ? formatNumber(val) : "-";
+    }),
+  ]);
+
+  const widths = headersCn.map((label, col) =>
+    Math.max(
+      displayWidth(label),
+      displayWidth(headersEn[col]),
+      ...rows.map((row) => displayWidth(row[col])),
+    ),
+  );
+
+  const cnLine = headersCn.map((label, col) => bold(padEnd(label, widths[col]))).join("  ");
+  const enLine = headersEn.map((label, col) => dim(padEnd(label, widths[col]))).join("  ");
+  const separator = widths.map((width) => dim("─".repeat(width))).join("──");
+
+  process.stdout.write(cnLine + "\n");
+  process.stdout.write(enLine + "\n");
+  process.stdout.write(separator + "\n");
+
+  for (const row of rows) {
+    const cells = row.map((cell, col) => padEnd(cell, widths[col]));
+    process.stdout.write(cells.join("  ") + "\n");
+  }
+
+  process.stdout.write(dim(`\n共 ${items.length} 个模型 (Total: ${items.length})`) + "\n");
+}
+
+export default defineCommand({
+  name: "usage stats",
+  description: "Query model usage statistics",
+  usage: "bl usage stats [--model <model>] [--days <days>] [--type <type>] [--workspace-id <id>] [flags]",
+  options: [
+    {
+      flag: "--model <model>",
+      description: "Model name(s), comma-separated; omit for overview",
+    },
+    {
+      flag: "--days <days>",
+      description: "Number of days (default: 7)",
+    },
+    {
+      flag: "--type <type>",
+      description: "Model type: Text, Vision, Multimodal, Audio, Embedding",
+    },
+    {
+      flag: "--workspace-id <id>",
+      description: "Workspace ID (env: BAILIAN_WORKSPACE_ID)",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: [
+    "bl usage stats",
+    "bl usage stats --days 30",
+    "bl usage stats --model qwen-turbo",
+    "bl usage stats --model qwen-turbo --days 7",
+    "bl usage stats --model qwen3.6-plus,deepseek-v4-pro",
+    "bl usage stats --type Text --days 14",
+    "bl usage stats --output json",
+  ],
+  async run(config: Config, flags: GlobalFlags) {
+    const modelFlag = (flags.model as string) || undefined;
+    const daysFlag = Number(flags.days) || 7;
+    const typeFlag = (flags.type as string) || undefined;
+    const region = (flags.region as string) || "cn-beijing";
+    const format = detectOutputFormat(config.output);
+
+    const flagWorkspaceId = (flags.workspaceId as string) || undefined;
+    const workspaceId = resolveWorkspaceId(config, flagWorkspaceId);
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    const endTime = Date.now();
+    const startTime = endTime - daysFlag * 24 * 60 * 60 * 1000;
+
+    if (modelFlag) {
+      const models = [
+        ...new Set(
+          modelFlag
+            .split(",")
+            .map((name) => name.trim())
+            .filter(Boolean),
+        ),
+      ];
+
+      const baseReqDTO: Record<string, unknown> = {
+        startTime,
+        endTime,
+        modelCallSource: "Online",
+        filterWorkspaceId: workspaceId,
+        maxResults: 50,
+        skip: 0,
+        sortField: "success_count",
+        sortOrder: "DESC",
+      };
+      if (typeFlag) baseReqDTO.obsModelType = typeFlag;
+
+      if (config.dryRun) {
+        emitResult(
+          { api: LIST_API, data: { reqDTO: { ...baseReqDTO, model: models.join(",") } }, region },
+          format,
+        );
+        return;
+      }
+
+      const results = await Promise.all(
+        models.map((model) =>
+          pollTelemetryApi(config, credential.token, LIST_API, { ...baseReqDTO, model }, region),
+        ),
+      );
+
+      const allItems: ModelStatisticItem[] = [];
+      const jsonResults: unknown[] = [];
+      for (const result of results) {
+        if (!result) continue;
+        jsonResults.push(result);
+        const listData = extractListData(result);
+        allItems.push(...listData.list);
+      }
+
+      if (format === "json") {
+        emitResult(jsonResults.length === 1 ? jsonResults[0] : jsonResults, format);
+        return;
+      }
+
+      printModelTable(allItems, startTime, endTime, daysFlag, config.noColor);
+    } else {
+      const reqDTO: Record<string, unknown> = {
+        startTime,
+        endTime,
+        modelCallSource: "Online",
+        filterWorkspaceId: workspaceId,
+      };
+      if (typeFlag) reqDTO.obsModelType = typeFlag;
+
+      if (config.dryRun) {
+        emitResult({ api: OVERVIEW_API, data: { reqDTO }, region }, format);
+        return;
+      }
+
+      const result = await pollTelemetryApi(config, credential.token, OVERVIEW_API, reqDTO, region);
+      if (!result) {
+        process.stderr.write("Error: request timed out.\n");
+        process.exit(1);
+      }
+
+      if (format === "json") {
+        emitResult(result, format);
+        return;
+      }
+
+      const stat = extractOverviewData(result);
+      if (!stat) {
+        process.stdout.write("No usage data found.\n");
+        return;
+      }
+
+      printOverview(stat, startTime, endTime, daysFlag, config.noColor);
+    }
+  },
+});
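Review note: `extractResponseData` in stats.ts (duplicated in workspace/list.ts below) unwraps several envelope shapes. The sketch below copies the two helpers from the diff and runs them on sample envelopes. The sample payloads are assumptions inferred from the branches in the code, not taken from gateway documentation.

```typescript
type Json = Record<string, unknown>;

// Both helpers are copied from stats.ts in this diff.
function getNestedRecord(obj: Json, key: string): Json | undefined {
  const val = obj[key];
  if (val && typeof val === "object" && !Array.isArray(val)) return val as Json;
  return undefined;
}

function extractResponseData(result: Json): Json {
  const data = getNestedRecord(result, "data");
  if (!data) return result;

  const dataV2 = getNestedRecord(data, "DataV2");
  if (dataV2) {
    const inner = getNestedRecord(dataV2, "data");
    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
    return innerData ?? inner ?? dataV2;
  }

  const direct = getNestedRecord(data, "data");
  return direct ?? data;
}

// Hypothetical envelopes, one per branch.
const wrappedV2 = extractResponseData({
  data: { DataV2: { data: { data: { callSuccessCount: 3 } } } },
});
const direct = extractResponseData({ data: { data: { totalCount: 0 } } });
const pending = extractResponseData({ data: { taskId: "task-1" } });
const bare = extractResponseData({ code: "Forbidden" });

console.log(wrappedV2, direct, pending, bare);
```

The `pending` case is the shape that `pollTelemetryApi` treats as "not ready yet": an object whose only key is `taskId`.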
diff --git a/packages/cli/src/commands/workspace/list.ts b/packages/cli/src/commands/workspace/list.ts
new file mode 100644
index 0000000..b9f6edb
--- /dev/null
+++ b/packages/cli/src/commands/workspace/list.ts
@@ -0,0 +1,137 @@
+import {
+  defineCommand,
+  callConsoleGateway,
+  resolveConsoleGatewayCredential,
+  detectOutputFormat,
+  type Config,
+  type GlobalFlags,
+} from "bailian-cli-core";
+import { emitResult } from "../../output/output.ts";
+import { displayWidth, padEnd } from "../../output/cjk-width.ts";
+
+const LIST_WORKSPACES_API = "zeldaEasy.bailian-dash-workspace.space.listWorkspaces";
+
+interface WorkspaceInfo {
+  workspaceId: string;
+  agentName: string;
+  defaultAgent: boolean;
+}
+
+function getNestedRecord(
+  obj: Record<string, unknown>,
+  key: string,
+): Record<string, unknown> | undefined {
+  const val = obj[key];
+  if (val && typeof val === "object" && !Array.isArray(val)) return val as Record<string, unknown>;
+  return undefined;
+}
+
+function extractResponseData(result: Record<string, unknown>): Record<string, unknown> {
+  const data = getNestedRecord(result, "data");
+  if (!data) return result;
+
+  const dataV2 = getNestedRecord(data, "DataV2");
+  if (dataV2) {
+    const inner = getNestedRecord(dataV2, "data");
+    const innerData = inner ? getNestedRecord(inner, "data") : undefined;
+    return innerData ?? inner ?? dataV2;
+  }
+
+  const direct = getNestedRecord(data, "data");
+  return direct ?? data;
+}
+
+function printTable(workspaces: WorkspaceInfo[], noColor: boolean): void {
+  const bold = noColor ? (text: string) => text : (text: string) => `\x1b[1m${text}\x1b[0m`;
+  const dim = noColor ? (text: string) => text : (text: string) => `\x1b[2m${text}\x1b[0m`;
+  const green = noColor ? (text: string) => text : (text: string) => `\x1b[32m${text}\x1b[0m`;
+
+  const headersCn = ["空间名称", "Workspace ID", "默认空间"];
+  const headersEn = ["Name", "", "Default"];
+
+  const rows = workspaces.map((ws) => [
+    ws.agentName,
+    ws.workspaceId,
+    ws.defaultAgent ? "Yes" : "-",
+  ]);
+
+  const widths = headersCn.map((label, col) =>
+    Math.max(
+      displayWidth(label),
+      displayWidth(headersEn[col]),
+      ...rows.map((row) => displayWidth(row[col])),
+    ),
+  );
+
+  const cnLine = headersCn.map((label, col) => bold(padEnd(label, widths[col]))).join("  ");
+  const enLine = headersEn.map((label, col) => dim(padEnd(label, widths[col]))).join("  ");
+  const separator = widths.map((width) => dim("─".repeat(width))).join("──");
+
+  process.stdout.write(cnLine + "\n");
+  process.stdout.write(enLine + "\n");
+  process.stdout.write(separator + "\n");
+
+  for (const row of rows) {
+    const cells = row.map((cell, col) => {
+      if (col === 2 && cell === "Yes") return green(padEnd(cell, widths[col]));
+      return padEnd(cell, widths[col]);
+    });
+    process.stdout.write(cells.join("  ") + "\n");
+  }
+
+  process.stdout.write(
+    dim(`\n共 ${workspaces.length} 个空间 (Total: ${workspaces.length})`) + "\n",
+  );
+}
+
+export default defineCommand({
+  name: "workspace list",
+  description: "List all workspaces",
+  usage: "bl workspace list [flags]",
+  options: [
+    {
+      flag: "--list <n>",
+      description: "Limit number of results",
+    },
+    {
+      flag: "--region <region>",
+      description: "API region (default: cn-beijing)",
+    },
+  ],
+  examples: ["bl workspace list", "bl workspace list --list 5", "bl workspace list --output json"],
+  async run(config: Config, flags: GlobalFlags) {
+    const region = (flags.region as string) || "cn-beijing";
+    const limit = Number(flags.list) || 0;
+    const format = detectOutputFormat(config.output);
+
+    const credential = await resolveConsoleGatewayCredential(config);
+
+    if (config.dryRun) {
+      emitResult({ api: LIST_WORKSPACES_API, data: {}, region }, format);
+      return;
+    }
+
+    const result = await callConsoleGateway(config, credential.token, {
+      api: LIST_WORKSPACES_API,
+      data: {},
+      region,
+    });
+
+    if (format === "json") {
+      emitResult(result, format);
+      return;
+    }
+
+    const resp = extractResponseData(result as Record<string, unknown>);
+    const dataArr = resp.data as Record<string, unknown>[] | undefined;
+    if (!Array.isArray(dataArr) || dataArr.length === 0) {
+      process.stdout.write("No workspace found.\n");
+      return;
+    }
+
+    let workspaces = dataArr as unknown as WorkspaceInfo[];
+    if (limit > 0) workspaces = workspaces.slice(0, limit);
+
+    printTable(workspaces, config.noColor);
+  },
+});
diff --git a/packages/cli/src/main.ts b/packages/cli/src/main.ts
index 20d7c3e..798abe0 100644
--- a/packages/cli/src/main.ts
+++ b/packages/cli/src/main.ts
@@ -60,9 +60,16 @@ const NO_AUTH_SETUP = [
   ["app", "list"],
   ["console", "call"],
   ["usage", "free"],
+  ["usage", "freetier"],
+  ["usage", "stats"],
   ["mcp", "list"],
   ["mcp", "tools"],
   ["mcp", "call"],
+  ["workspace", "list"],
+  ["quota", "list"],
+  ["quota", "request"],
+  ["quota", "history"],
+  ["quota", "check"],
 ];
 
 async function main() {
diff --git a/packages/cli/src/output/cjk-width.ts b/packages/cli/src/output/cjk-width.ts
new file mode 100644
index 0000000..7c81de0
--- /dev/null
+++ b/packages/cli/src/output/cjk-width.ts
@@ -0,0 +1,24 @@
+function isCjk(code: number): boolean {
+  return (
+    (code >= 0x2e80 && code <= 0x9fff) ||
+    (code >= 0xf900 && code <= 0xfaff) ||
+    (code >= 0xfe30 && code <= 0xfe4f) ||
+    (code >= 0xff00 && code <= 0xff60) ||
+    (code >= 0xffe0 && code <= 0xffe6) ||
+    (code >= 0x20000 && code <= 0x2fa1f)
+  );
+}
+
+export function displayWidth(text: string): number {
+  let width = 0;
+  for (const char of text) {
+    const code = char.codePointAt(0) ?? 0;
+    width += isCjk(code) ? 2 : 1;
+  }
+  return width;
+}
+
+export function padEnd(text: string, targetWidth: number): string {
+  const gap = targetWidth - displayWidth(text);
+  return gap > 0 ? text + " ".repeat(gap) : text;
+}
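Review note: a small sketch of why `padEnd` from cjk-width.ts is used instead of `String.prototype.padEnd` for the bilingual tables. The three functions are copied from the diff; the sample labels are made up for illustration.

```typescript
// Copied from cjk-width.ts in this diff so the sketch runs on its own.
function isCjk(code: number): boolean {
  return (
    (code >= 0x2e80 && code <= 0x9fff) ||
    (code >= 0xf900 && code <= 0xfaff) ||
    (code >= 0xfe30 && code <= 0xfe4f) ||
    (code >= 0xff00 && code <= 0xff60) ||
    (code >= 0xffe0 && code <= 0xffe6) ||
    (code >= 0x20000 && code <= 0x2fa1f)
  );
}

function displayWidth(text: string): number {
  let width = 0;
  for (const char of text) {
    const code = char.codePointAt(0) ?? 0;
    width += isCjk(code) ? 2 : 1;
  }
  return width;
}

function padEnd(text: string, targetWidth: number): string {
  const gap = targetWidth - displayWidth(text);
  return gap > 0 ? text + " ".repeat(gap) : text;
}

// Hypothetical column cells: a CJK header, its English header, and one value.
const labels = ["模型", "Model", "qwen-turbo"];
const columnWidth = Math.max(...labels.map((label) => displayWidth(label)));
const padded = labels.map((label) => padEnd(label, columnWidth));

// Each cell now occupies the same number of terminal columns, even though
// the JavaScript string lengths differ for the CJK cell.
for (const cell of padded) console.log(`${cell}|`);
```

The built-in `"模型".padEnd(10)` would pad by UTF-16 length and leave the CJK cell two columns wider than its neighbours on terminals that render these characters double-width.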
diff --git a/packages/cli/tests/e2e/quota.e2e.test.ts b/packages/cli/tests/e2e/quota.e2e.test.ts
new file mode 100644
index 0000000..19e537d
--- /dev/null
+++ b/packages/cli/tests/e2e/quota.e2e.test.ts
@@ -0,0 +1,349 @@
+import { describe, expect, test } from "vite-plus/test";
+import { isBailianE2EEnabled, parseStdoutJson, runCli } from "./helpers.ts";
+import { readConfigFile } from "bailian-cli-core";
+
+function isConsoleE2EReady(): boolean {
+  if (!isBailianE2EEnabled()) return false;
+  if (process.env.DASHSCOPE_ACCESS_TOKEN?.trim()) return true;
+  try {
+    const config = readConfigFile();
+    return typeof config.access_token === "string" && config.access_token.length > 0;
+  } catch {
+    return false;
+  }
+}
+
+describe("e2e: quota", () => {
+  test("quota list --help exits normally", async () => {
+    const { stderr, exitCode } = await runCli(["quota", "list", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("--model");
+    expect(stderr).toContain("--all");
+  });
+
+  test("quota list --help includes all examples", async () => {
+    const { stderr, exitCode } = await runCli(["quota", "list", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("bl quota list");
+    expect(stderr).toContain("bl quota list --model qwen3.6-plus");
+    expect(stderr).toContain("bl quota list --all");
+  });
+
+  test("quota request --help exits normally", async () => {
+    const { stderr, exitCode } = await runCli(["quota", "request", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("--model");
+    expect(stderr).toContain("--tpm");
+    expect(stderr).toContain("--yes");
+  });
+
+  test("quota history --help exits normally", async () => {
+    const { stderr, exitCode } = await runCli(["quota", "history", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("--page");
+    expect(stderr).toContain("--model");
+  });
+
+  test("quota check --help exits normally", async () => {
+    const { stderr, exitCode } = await runCli(["quota", "check", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("--model");
+    expect(stderr).toContain("--period");
+    expect(stderr).toContain("bl quota check");
+  });
+
+  test("quota check --period 0.5 reports the minimum-value error", async () => {
+    const { stderr, exitCode } = await runCli(["quota", "check", "--period", "0.5"]);
+    expect(exitCode).toBe(1);
+    expect(stderr).toContain("at least 1 minute");
+  });
+});
+
+describe.skipIf(!isConsoleE2EReady())("e2e: quota (Console)", () => {
+  test("quota list --dry-run prints the request parameters", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "list",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      api?: string;
+      data?: {
+        input?: { queryQpmInfo?: boolean; supports?: { selfServiceLimitIncrease?: boolean } };
+      };
+    }>(stdout);
+    expect(data.api).toContain("listFoundationModels");
+    expect(data.data?.input?.queryQpmInfo).toBe(true);
+    expect(data.data?.input?.supports?.selfServiceLimitIncrease).toBe(true);
+  });
+
+  test("quota list --dry-run --all omits the supports filter", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "list",
+      "--all",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { input?: { supports?: unknown } };
+    }>(stdout);
+    expect(data.data?.input?.supports).toBeUndefined();
+  });
+
+  test("quota list text output includes the two-line header", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "list",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("模型");
+    expect(stdout).toContain("Model");
+    expect(stdout).toContain("RPM");
+    expect(stdout).toContain("TPM");
+    expect(stdout).toContain("可设上限 TPM");
+    expect(stdout).toContain("Max TPM");
+  });
+
+  test("quota list --model returns the specified model", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "list",
+      "--model",
+      "qwen3.6-plus",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("qwen3.6-plus");
+    expect(stdout).toMatch(/共 1 个模型/);
+  });
+
+  test("quota list --model reports an error for a nonexistent model", async () => {
+    const { stderr, exitCode } = await runCli([
+      "quota",
+      "list",
+      "--model",
+      "nonexistent-model-xyz-99999",
+      "--output",
+      "text",
+    ]);
+    expect(exitCode).toBe(1);
+    expect(stderr).toContain("no matching models found");
+  });
+
+  test("quota list JSON output includes qpmInfo", async () => {
+    const { stdout, stderr, exitCode } = await runCli(["quota", "list", "--output", "json"]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<Array<{ model?: string; qpmInfo?: unknown }>>(stdout);
+    expect(Array.isArray(data)).toBe(true);
+    expect(data.length).toBeGreaterThan(0);
+    expect(data[0].model).toBeTypeOf("string");
+    expect(data[0].qpmInfo).toBeDefined();
+  });
+
+  test("quota request --dry-run prints the request parameters", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "request",
+      "--model",
+      "qwen3.6-plus",
+      "--tpm",
+      "6000000",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      api?: string;
+      data?: { input?: { model?: string; limit?: { usage_limit?: number } } };
+    }>(stdout);
+    expect(data.api).toContain("updateFoundationModelLimits");
+    expect(data.data?.input?.model).toBe("qwen3.6-plus");
+    expect(data.data?.input?.limit?.usage_limit).toBeTypeOf("number");
+  });
+
+  test("quota request reports an error for an out-of-range TPM", async () => {
+    const { stderr, exitCode } = await runCli([
+      "quota",
+      "request",
+      "--model",
+      "qwen3.6-plus",
+      "--tpm",
+      "999",
+    ]);
+    expect(exitCode).toBe(1);
+    expect(stderr).toContain("out of range");
+    expect(stderr).toContain("Current");
+    expect(stderr).toContain("Range");
+  });
+
+  test("quota request reports not found for a nonexistent model", async () => {
+    const { stderr, exitCode } = await runCli([
+      "quota",
+      "request",
+      "--model",
+      "nonexistent-model-xyz-99999",
+      "--tpm",
+      "100000",
+    ]);
+    expect(exitCode).toBe(1);
+    expect(stderr).toContain("not found");
+  });
+
+  test("quota history --dry-run prints the request parameters", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "history",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      api?: string;
+      data?: { input?: { pageNo?: number; pageSize?: number } };
+    }>(stdout);
+    expect(data.api).toContain("listModelLimitApplications");
+    expect(data.data?.input?.pageNo).toBe(1);
+    expect(data.data?.input?.pageSize).toBe(10);
+  });
+
+  test("quota check --dry-run prints the API info", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "check",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{ apis?: string[] }>(stdout);
+    expect(data.apis).toContain(
+      "zeldaHttp.dashscopeModel./zelda/api/v1/modelCenter/listFoundationModels",
+    );
+    expect(data.apis).toContain("zeldaEasy.bailian-telemetry.monitor.getMonitorData");
+  });
+
+  test("quota check text output includes the two-line header", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "check",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("模型");
+    expect(stdout).toContain("Model");
+    expect(stdout).toContain("RPM 用量/限额");
+    expect(stdout).toContain("RPM Usage/Limit");
+    expect(stdout).toContain("TPM 用量/限额");
+    expect(stdout).toContain("状态");
+  });
+
+  test("quota check --model with a single model", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "check",
+      "--model",
+      "qwen3.6-plus",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("qwen3.6-plus");
+    expect(stdout).toMatch(/共 1 个模型/);
+  });
+
+  test("quota check --model with comma-separated models", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "check",
+      "--model",
+      "qwen3.6-plus,qwen-plus",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("qwen3.6-plus");
+    expect(stdout).toContain("qwen-plus");
+    expect(stdout).toMatch(/共 2 个模型/);
+  });
+
+  test("quota check JSON output includes usage and limit fields", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "check",
+      "--model",
+      "qwen3.6-plus",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<
+      Array<{
+        model?: string;
+        rpmUsage?: number;
+        rpmLimit?: number;
+        tpmUsage?: number;
+        tpmLimit?: number;
+      }>
+    >(stdout);
+    expect(Array.isArray(data)).toBe(true);
+    expect(data.length).toBe(1);
+    expect(data[0].model).toBe("qwen3.6-plus");
+    expect(data[0].rpmUsage).toBeTypeOf("number");
+    expect(data[0].rpmLimit).toBeTypeOf("number");
+    expect(data[0].tpmUsage).toBeTypeOf("number");
+    expect(data[0].tpmLimit).toBeTypeOf("number");
+  });
+
+  test("quota check status column shows one of 正常/接近限流/已限流", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "check",
+      "--model",
+      "qwen3.6-plus",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const hasStatus =
+      stdout.includes("正常") || stdout.includes("接近限流") || stdout.includes("已限流");
+    expect(hasStatus).toBe(true);
+  });
+
+  test("quota history --dry-run --page 2 --page-size 20", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "quota",
+      "history",
+      "--page",
+      "2",
+      "--page-size",
+      "20",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { input?: { pageNo?: number; pageSize?: number } };
+    }>(stdout);
+    expect(data.data?.input?.pageNo).toBe(2);
+    expect(data.data?.input?.pageSize).toBe(20);
+  });
+});
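
The `quota history` dry-run tests above pin the default paging (`pageNo` 1, `pageSize` 10) and the `--page 2 --page-size 20` override. A minimal sketch of flag handling that would satisfy those assertions, assuming string-typed flags as listed in the generated reference. `resolvePaging`, `toPositiveInt`, and the fall-back-on-invalid-input behavior are hypothetical and not taken from the CLI source:

```typescript
// Hypothetical shape of the paging portion of the request input.
interface PagingInput {
  pageNo: number;
  pageSize: number;
}

// Parse a positive integer flag; use the fallback when the flag is absent
// or not a positive integer (this fallback policy is an assumption).
function toPositiveInt(raw: string | undefined, fallback: number): number {
  if (raw === undefined) return fallback;
  const n = Number.parseInt(raw, 10);
  return Number.isInteger(n) && n > 0 ? n : fallback;
}

// Map --page / --page-size onto pageNo / pageSize with the defaults
// asserted by the dry-run tests (1 and 10).
function resolvePaging(flags: { page?: string; pageSize?: string }): PagingInput {
  return {
    pageNo: toPositiveInt(flags.page, 1),
    pageSize: toPositiveInt(flags.pageSize, 10),
  };
}
```

The real command may instead reject invalid values with a non-zero exit code; the tests shown here do not cover that case.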
diff --git a/packages/cli/tests/e2e/usage-free.e2e.test.ts b/packages/cli/tests/e2e/usage-free.e2e.test.ts
new file mode 100644
index 0000000..064f027
--- /dev/null
+++ b/packages/cli/tests/e2e/usage-free.e2e.test.ts
@@ -0,0 +1,282 @@
+import { describe, expect, test } from "vite-plus/test";
+import { isBailianE2EEnabled, parseStdoutJson, runCli } from "./helpers.ts";
+import { readConfigFile } from "bailian-cli-core";
+
+function isConsoleE2EReady(): boolean {
+  if (!isBailianE2EEnabled()) return false;
+  if (process.env.DASHSCOPE_ACCESS_TOKEN?.trim()) return true;
+  try {
+    const config = readConfigFile();
+    return typeof config.access_token === "string" && config.access_token.length > 0;
+  } catch {
+    return false;
+  }
+}
+
+describe("e2e: usage free", () => {
+  test("usage group shows subcommand help and exits with code 0", async () => {
+    const { stdout, stderr, exitCode } = await runCli(["usage"]);
+    expect(exitCode, stderr).toBe(0);
+    const out = `${stdout}\n${stderr}`;
+    expect(out).toMatch(/usage|free|freetier/i);
+  });
+
+  test("usage free --help exits normally", async () => {
+    const { stderr, exitCode } = await runCli(["usage", "free", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toMatch(/--model|quota|free-tier/i);
+  });
+
+  test("usage free --help includes all examples", async () => {
+    const { stderr, exitCode } = await runCli(["usage", "free", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("bl usage free");
+    expect(stderr).toContain("bl usage free --model qwen3-max");
+    expect(stderr).toContain("bl usage free --model qwen3-max,qwen-turbo");
+  });
+});
+
+describe.skipIf(!isConsoleE2EReady())("e2e: usage free (Console)", () => {
+  test("usage free --dry-run --model prints request parameters without calling the API", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--dry-run",
+      "--model",
+      "qwen3-max",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      api?: string;
+      data?: { queryFreeTierQuotaRequest?: { models?: string[] } };
+    }>(stdout);
+    expect(data.api).toContain("queryFreeTierQuota");
+    expect(data.data?.queryFreeTierQuotaRequest?.models).toEqual(["qwen3-max"]);
+  });
+
+  test("usage free --dry-run --model with comma-separated models", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--dry-run",
+      "--model",
+      "qwen3-max,qwen-turbo",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { queryFreeTierQuotaRequest?: { models?: string[] } };
+    }>(stdout);
+    expect(data.data?.queryFreeTierQuotaRequest?.models).toEqual(["qwen3-max", "qwen-turbo"]);
+  });
+
+  test("usage free --dry-run --model deduplicates repeated model names", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--dry-run",
+      "--model",
+      "qwen3-max,qwen3-max,qwen-turbo",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { queryFreeTierQuotaRequest?: { models?: string[] } };
+    }>(stdout);
+    expect(data.data?.queryFreeTierQuotaRequest?.models).toEqual(["qwen3-max", "qwen-turbo"]);
+  });
+
+  test("usage free --dry-run --model parses correctly with spaces after commas", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--dry-run",
+      "--model",
+      "qwen3-max, qwen-turbo",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { queryFreeTierQuotaRequest?: { models?: string[] } };
+    }>(stdout);
+    expect(data.data?.queryFreeTierQuotaRequest?.models).toEqual(["qwen3-max", "qwen-turbo"]);
+  });
+
+  test("usage free --dry-run without --model sends the full model list", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { queryFreeTierQuotaRequest?: { models?: string[] } };
+    }>(stdout);
+    const models = data.data?.queryFreeTierQuotaRequest?.models ?? [];
+    expect(models.length).toBeGreaterThan(0);
+  });
+
+  test("usage free --model single-model query returns a JSON result", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      code?: string;
+      successResponse?: boolean;
+    }>(stdout);
+    expect(data.code).toBe("200");
+    expect(data.successResponse).toBe(true);
+  });
+
+  test("usage free --model single-model text output includes the header", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("Model");
+    expect(stdout).toContain("Type");
+    expect(stdout).toContain("Remaining/Total");
+    expect(stdout).toContain("Usage");
+    expect(stdout).toContain("Expires");
+    expect(stdout).toContain("Auto-Stop");
+  });
+
+  test("usage free --model text output includes the model name", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("qwen3-max");
+  });
+
+  test("usage free --model comma-separated text output includes all models", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max,qwen-turbo",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("qwen3-max");
+    expect(stdout).toContain("qwen-turbo");
+  });
+
+  test("usage free --model text output includes the correct Type column", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("Text");
+  });
+
+  test("usage free --model shows Unsupported for Auto-Stop when quotaStatus is UNKNOWN", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "wan2.7-image",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("Unsupported");
+  });
+
+  test("usage free --model renders one Vision row marked Unsupported when quotaStatus is UNKNOWN", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "wan2.7-image",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const lines = stdout.split("\n").filter((line) => line.includes("wan2.7-image"));
+    expect(lines.length).toBe(1);
+    expect(lines[0]).toContain("Vision");
+    expect(lines[0]).toContain("Unsupported");
+  });
+
+  test("usage free --model still returns a table row for a nonexistent model", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "nonexistent-model-xyz-12345",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("nonexistent-model-xyz-12345");
+  });
+
+  test("usage free --model Auto-Stop shows ON, OFF, or Unsupported", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const hasAutoStop =
+      stdout.includes("ON") || stdout.includes("OFF") || stdout.includes("Unsupported");
+    expect(hasAutoStop).toBe(true);
+  });
+
+  test("usage free --model --region cn-beijing queries the specified region", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "free",
+      "--model",
+      "qwen3-max",
+      "--region",
+      "cn-beijing",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{ code?: string }>(stdout);
+    expect(data.code).toBe("200");
+  });
+});
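
The `--model` dry-run tests above expect a comma-separated value to be split, trimmed, and deduplicated while keeping first-seen order (`"qwen3-max, qwen3-max,qwen-turbo"` becomes `["qwen3-max", "qwen-turbo"]`). A minimal sketch of parsing that matches those expectations; `parseModelList` is a hypothetical name, and the CLI's actual implementation is not shown in this diff:

```typescript
// Split a comma-separated --model value into unique, trimmed model names.
// A Set keeps insertion order, so the first occurrence of each name wins.
function parseModelList(raw: string): string[] {
  const seen = new Set<string>();
  for (const part of raw.split(",")) {
    const name = part.trim();
    if (name.length > 0) seen.add(name); // skip empty segments such as "a,,b"
  }
  return Array.from(seen);
}
```

Dropping empty segments is an assumption; the tests do not say how the CLI treats a value such as `"qwen3-max,,qwen-turbo"`.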
diff --git a/packages/cli/tests/e2e/usage-stats.e2e.test.ts b/packages/cli/tests/e2e/usage-stats.e2e.test.ts
new file mode 100644
index 0000000..d9e60e8
--- /dev/null
+++ b/packages/cli/tests/e2e/usage-stats.e2e.test.ts
@@ -0,0 +1,299 @@
+import { describe, expect, test } from "vite-plus/test";
+import { isBailianE2EEnabled, parseStdoutJson, runCli } from "./helpers.ts";
+import { readConfigFile } from "bailian-cli-core";
+
+function isConsoleE2EReady(): boolean {
+  if (!isBailianE2EEnabled()) return false;
+  if (process.env.DASHSCOPE_ACCESS_TOKEN?.trim()) return true;
+  try {
+    const config = readConfigFile();
+    return typeof config.access_token === "string" && config.access_token.length > 0;
+  } catch {
+    return false;
+  }
+}
+
+function getStaticWorkspaceId(): string | undefined {
+  if (process.env.BAILIAN_WORKSPACE_ID?.trim()) return process.env.BAILIAN_WORKSPACE_ID.trim();
+  try {
+    const config = readConfigFile();
+    if (config.workspace_id) return config.workspace_id;
+  } catch { /* config missing or unreadable: fall through and return undefined */ }
+  return undefined;
+}
+
+async function fetchDefaultWorkspaceId(): Promise<string> {
+  const staticId = getStaticWorkspaceId();
+  if (staticId) return staticId;
+
+  const { stdout } = await runCli(["workspace", "list", "--output", "json"]);
+  const result = JSON.parse(stdout);
+  const data = result?.data?.DataV2?.data?.data?.data ?? [];
+  const defaultWs = data.find((ws: { defaultAgent?: boolean }) => ws.defaultAgent);
+  if (defaultWs?.workspaceId) return defaultWs.workspaceId;
+  if (data.length > 0 && data[0].workspaceId) return data[0].workspaceId;
+  throw new Error("No workspace found for e2e tests");
+}
+
+describe("e2e: usage stats", () => {
+  test("usage stats --help exits normally", async () => {
+    const { stderr, exitCode } = await runCli(["usage", "stats", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toMatch(/--model|--days|stats/i);
+  });
+
+  test("usage stats --help includes all examples", async () => {
+    const { stderr, exitCode } = await runCli(["usage", "stats", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("bl usage stats");
+    expect(stderr).toContain("bl usage stats --model qwen-turbo");
+    expect(stderr).toContain("bl usage stats --days 30");
+  });
+
+  test("usage stats --help includes the --workspace-id option", async () => {
+    const { stderr, exitCode } = await runCli(["usage", "stats", "--help"]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stderr).toContain("--workspace-id");
+  });
+});
+
+describe.skipIf(!isConsoleE2EReady())("e2e: usage stats (Console)", () => {
+  let wsId: string;
+
+  test("resolve the default workspace-id", async () => {
+    wsId = await fetchDefaultWorkspaceId();
+    expect(wsId).toBeTypeOf("string");
+    expect(wsId.length).toBeGreaterThan(0);
+  });
+
+  test("usage stats --dry-run overview mode prints request parameters", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--dry-run",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      api?: string;
+      data?: {
+        reqDTO?: {
+          startTime?: number;
+          endTime?: number;
+          modelCallSource?: string;
+          filterWorkspaceId?: string;
+        };
+      };
+    }>(stdout);
+    expect(data.api).toContain("getModelUsageStatistic");
+    expect(data.data?.reqDTO?.modelCallSource).toBe("Online");
+    expect(data.data?.reqDTO?.startTime).toBeTypeOf("number");
+    expect(data.data?.reqDTO?.endTime).toBeTypeOf("number");
+    expect(data.data?.reqDTO?.filterWorkspaceId).toBe(wsId);
+  });
+
+  test("usage stats --dry-run --days 30 spans roughly 30 days", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--dry-run",
+      "--days",
+      "30",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { reqDTO?: { startTime?: number; endTime?: number } };
+    }>(stdout);
+    const span = (data.data?.reqDTO?.endTime ?? 0) - (data.data?.reqDTO?.startTime ?? 0);
+    const thirtyDaysMs = 30 * 24 * 60 * 60 * 1000;
+    expect(span).toBeGreaterThan(thirtyDaysMs - 5000);
+    expect(span).toBeLessThan(thirtyDaysMs + 5000);
+  });
+
+  test("usage stats --dry-run --model with a model uses the list API", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--dry-run",
+      "--model",
+      "qwen-turbo",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      api?: string;
+      data?: { reqDTO?: { model?: string; filterWorkspaceId?: string } };
+    }>(stdout);
+    expect(data.api).toContain("listModelUsageStatisticData");
+    expect(data.data?.reqDTO?.model).toBe("qwen-turbo");
+    expect(data.data?.reqDTO?.filterWorkspaceId).toBe(wsId);
+  });
+
+  test("usage stats --dry-run --type Text passes obsModelType", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--dry-run",
+      "--type",
+      "Text",
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      data?: { reqDTO?: { obsModelType?: string } };
+    }>(stdout);
+    expect(data.data?.reqDTO?.obsModelType).toBe("Text");
+  });
+
+  test("usage stats overview mode returns a JSON result", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--output",
+      "json",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    const data = parseStdoutJson<{
+      code?: string;
+      successResponse?: boolean;
+    }>(stdout);
+    expect(data.code).toBe("200");
+    expect(data.successResponse).toBe(true);
+  });
+
+  test("usage stats overview text output includes Chinese and English headers", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("时间范围 Period:");
+    expect(stdout).toContain("调用模型数");
+    expect(stdout).toContain("Models Called");
+    expect(stdout).toContain("调用成功次数");
+    expect(stdout).toContain("Successful Calls");
+  });
+
+  test("usage stats overview text output includes token usage", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("总 Token");
+    expect(stdout).toContain("Total Tokens");
+    expect(stdout).toContain("[tokens]");
+  });
+
+  test("usage stats --model single-model text output includes the two-line header", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--model",
+      "qwen3.6-plus",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("模型");
+    expect(stdout).toContain("Model");
+    expect(stdout).toContain("调用次数");
+    expect(stdout).toContain("Calls");
+    expect(stdout).toContain("qwen3.6-plus");
+  });
+
+  test("usage stats --model comma-separated models return multiple rows", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--model",
+      "qwen3.6-plus,deepseek-v4-pro",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("qwen3.6-plus");
+    expect(stdout).toContain("deepseek-v4-pro");
+    expect(stdout).toMatch(/共 2 个模型/);
+  });
+
+  test("usage stats --model returns an empty table for a nonexistent model", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--model",
+      "nonexistent-model-xyz-99999",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("No usage data found");
+  });
+
+  test("usage stats --days 1 returns normally for a short range", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--days",
+      "1",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("(1 天)");
+    expect(stdout).toContain("调用模型数");
+  });
+
+  test("usage stats --type Vision filters by type", async () => {
+    const { stdout, stderr, exitCode } = await runCli([
+      "usage",
+      "stats",
+      "--workspace-id",
+      wsId,
+      "--type",
+      "Vision",
+      "--output",
+      "text",
+      "--no-color",
+    ]);
+    expect(exitCode, stderr).toBe(0);
+    expect(stdout).toContain("时间范围 Period:");
+  });
+});
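
The `--days 30` dry-run test above checks that `endTime - startTime` in `reqDTO` is within 5 seconds of 30 days, with both values in epoch milliseconds. A minimal sketch of a window computation consistent with that check; `usageWindow` and its validation are hypothetical, and the default number of days used by the CLI is not shown in this diff, so `days` is a required argument here:

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Build a [startTime, endTime] window in epoch milliseconds that ends at
// `nowMs` and reaches back `days` days. `nowMs` is injectable for testing.
function usageWindow(
  days: number,
  nowMs: number = Date.now(),
): { startTime: number; endTime: number } {
  if (!Number.isFinite(days) || days <= 0) {
    throw new RangeError(`days must be a positive number, got ${days}`);
  }
  return { startTime: nowMs - days * DAY_MS, endTime: nowMs };
}
```

Passing a fixed `nowMs` makes the span exact, which is why the e2e test, running against the real clock, allows a tolerance instead of strict equality.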
diff --git a/packages/core/src/config/loader.ts b/packages/core/src/config/loader.ts
index d8546bc..30ce611 100644
--- a/packages/core/src/config/loader.ts
+++ b/packages/core/src/config/loader.ts
@@ -87,7 +87,7 @@ export function loadConfig(flags: GlobalFlags): Config {
     consoleGatewayUrl:
       process.env.BAILIAN_CONSOLE_GATEWAY_URL ||
       file.console_gateway_url ||
-      "https://pre-bailian-cs.console.aliyun.com",
+      "https://bailian-cs.console.aliyun.com",
     verbose: flags.verbose || process.env.DASHSCOPE_VERBOSE === "1",
     quiet: flags.quiet || false,
     noColor: flags.noColor || process.env.NO_COLOR !== undefined || !process.stdout.isTTY,
diff --git a/packages/core/src/console/gateway.ts b/packages/core/src/console/gateway.ts
index f447c35..a698d12 100644
--- a/packages/core/src/console/gateway.ts
+++ b/packages/core/src/console/gateway.ts
@@ -74,5 +74,16 @@ export async function callConsoleGateway(
     );
   }
 
-  return res.json() as Promise<unknown>;
+  const json = (await res.json()) as Record<string, unknown>;
+
+  const innerData = json.data as Record<string, unknown> | undefined;
+  if (innerData?.success === false && innerData.errorCode) {
+    throw new BailianError(
+      `Console gateway error: ${innerData.errorCode}`,
+      ExitCode.GENERAL,
+      typeof innerData.errorMsg === "string" ? innerData.errorMsg : undefined,
+    );
+  }
+
+  return json;
 }
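
The `gateway.ts` hunk above stops treating an HTTP 200 as success when the inner `data` envelope carries `success: false` together with an `errorCode`. A standalone sketch of that check for illustration; `GatewayError`, `GENERAL_EXIT_CODE`, and `assertGatewaySuccess` are hypothetical stand-ins for `BailianError`, `ExitCode.GENERAL`, and the inline logic, whose real definitions are outside this diff:

```typescript
// Assumed value; the real ExitCode.GENERAL is defined elsewhere in the repo.
const GENERAL_EXIT_CODE = 1;

// Hypothetical stand-in for BailianError: message, exit code, optional hint.
class GatewayError extends Error {
  exitCode: number;
  hint?: string;
  constructor(message: string, exitCode: number, hint?: string) {
    super(message);
    this.name = "GatewayError";
    this.exitCode = exitCode;
    this.hint = hint;
  }
}

// Return the payload unchanged unless the inner envelope reports a failure.
function assertGatewaySuccess(json: Record<string, unknown>): Record<string, unknown> {
  const inner = json.data as Record<string, unknown> | undefined;
  // Only an explicit `success: false` with a truthy errorCode is an error;
  // a missing `data` or missing `success` field passes through untouched.
  if (inner?.success === false && inner.errorCode) {
    throw new GatewayError(
      `Console gateway error: ${String(inner.errorCode)}`,
      GENERAL_EXIT_CODE,
      typeof inner.errorMsg === "string" ? inner.errorMsg : undefined,
    );
  }
  return json;
}
```

As in the hunk, a payload with `success: false` but no `errorCode` is returned as-is, so callers that need to reject that shape would have to check for it themselves.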
diff --git a/skills/bailian-cli/reference/advisor.md b/skills/bailian-cli/reference/advisor.md
index c6e5164..c03387e 100644
--- a/skills/bailian-cli/reference/advisor.md
+++ b/skills/bailian-cli/reference/advisor.md
@@ -32,25 +32,25 @@ Index: [index.md](index.md)
 #### Examples
 
 ```bash
-bl advisor recommend --message "我要做一个能理解图片的客服机器人"
+bl advisor recommend --message "I need a visual-understanding chatbot"
 ```
 
 ```bash
-bl advisor recommend --message "做一个Agent自动根据用户意图生成动画片"
+bl advisor recommend --message "Build an Agent that auto-generates animations"
 ```
 
 ```bash
-bl advisor recommend --message "法律合同审查，要求高精准度"
+bl advisor recommend --message "Legal contract review, high precision required"
 ```
 
 ```bash
-bl advisor recommend --message "做一个低成本高并发的在线客服" --output json
+bl advisor recommend --message "Low-cost high-concurrency online customer service" --output json
 ```
 
 ```bash
-bl advisor recommend --message "长文本摘要" --dry-run
+bl advisor recommend --message "Long document summarization" --dry-run
 ```
 
 ```bash
-bl advisor recommend                                          # 交互式输入需求
+bl advisor recommend                                          # Interactive input
 ```
diff --git a/skills/bailian-cli/reference/index.md b/skills/bailian-cli/reference/index.md
index 79f462c..0da2213 100644
--- a/skills/bailian-cli/reference/index.md
+++ b/skills/bailian-cli/reference/index.md
@@ -37,18 +37,25 @@ Use this index for the full quick index and global flags.
 | `bl omni`                  | Multimodal chat with text + audio output (Qwen-Omni)                                                  | [omni.md](omni.md)           |
 | `bl pipeline run`          | Run a pipeline workflow definition                                                                    | [pipeline.md](pipeline.md)   |
 | `bl pipeline validate`     | Validate a pipeline definition without executing                                                      | [pipeline.md](pipeline.md)   |
+| `bl quota check`           | Check current usage against rate limits                                                               | [quota.md](quota.md)         |
+| `bl quota history`         | View quota change history                                                                             | [quota.md](quota.md)         |
+| `bl quota list`            | View model RPM/TPM rate limits                                                                        | [quota.md](quota.md)         |
+| `bl quota request`         | Request a temporary quota increase                                                                    | [quota.md](quota.md)         |
 | `bl search web`            | Search the web using DashScope MCP WebSearch service                                                  | [search.md](search.md)       |
 | `bl speech recognize`      | Recognize speech from audio files (FunAudio-ASR)                                                      | [speech.md](speech.md)       |
 | `bl speech synthesize`     | Synthesize speech from text (CosyVoice TTS)                                                           | [speech.md](speech.md)       |
 | `bl text chat`             | Send a chat completion (OpenAI compatible, DashScope)                                                 | [text.md](text.md)           |
 | `bl update`                | Update bl to the latest version                                                                       | [update.md](update.md)       |
-| `bl usage free`            | Query free-tier quota for a model                                                                     | [usage.md](usage.md)         |
+| `bl usage free`            | Query free-tier quota for models (all models if --model is omitted)                                   | [usage.md](usage.md)         |
+| `bl usage freetier`        | Enable or disable auto-stop for free-tier models. Enables by default; use --off to disable            | [usage.md](usage.md)         |
+| `bl usage stats`           | Query model usage statistics                                                                          | [usage.md](usage.md)         |
 | `bl video download`        | Download a completed video by task ID                                                                 | [video.md](video.md)         |
 | `bl video edit`            | Edit a video with happyhorse-1.0-video-edit (style transfer, object replacement, etc.)                | [video.md](video.md)         |
 | `bl video generate`        | Generate a video from text or image (happyhorse-1.0-t2v / happyhorse-1.0-i2v / wan2.6-t2v)            | [video.md](video.md)         |
 | `bl video ref`             | Reference-to-video generation (happyhorse-1.0-r2v / wan2.6-r2v): multi-subject, multi-shot with voice | [video.md](video.md)         |
 | `bl video task get`        | Query async task status                                                                               | [video.md](video.md)         |
 | `bl vision describe`       | Describe an image or video using Qwen-VL                                                              | [vision.md](vision.md)       |
+| `bl workspace list`        | List all workspaces                                                                                   | [workspace.md](workspace.md) |
 
 ## By group
 
@@ -66,13 +73,15 @@ Use this index for the full quick index and global flags.
 | `memory`    | `add`, `delete`, `list`, `profile create`, `profile get`, `search`, `update` | [memory.md](memory.md)       |
 | `omni`      | `(root)`                                                                     | [omni.md](omni.md)           |
 | `pipeline`  | `run`, `validate`                                                            | [pipeline.md](pipeline.md)   |
+| `quota`     | `check`, `history`, `list`, `request`                                        | [quota.md](quota.md)         |
 | `search`    | `web`                                                                        | [search.md](search.md)       |
 | `speech`    | `recognize`, `synthesize`                                                    | [speech.md](speech.md)       |
 | `text`      | `chat`                                                                       | [text.md](text.md)           |
 | `update`    | `(root)`                                                                     | [update.md](update.md)       |
-| `usage`     | `free`                                                                       | [usage.md](usage.md)         |
+| `usage`     | `free`, `freetier`, `stats`                                                  | [usage.md](usage.md)         |
 | `video`     | `download`, `edit`, `generate`, `ref`, `task get`                            | [video.md](video.md)         |
 | `vision`    | `describe`                                                                   | [vision.md](vision.md)       |
+| `workspace` | `list`                                                                       | [workspace.md](workspace.md) |
 
 ## Global flags
 
diff --git a/skills/bailian-cli/reference/quota.md b/skills/bailian-cli/reference/quota.md
new file mode 100644
index 0000000..86aa355
--- /dev/null
+++ b/skills/bailian-cli/reference/quota.md
@@ -0,0 +1,163 @@
+# `bl quota` commands
+
+> Auto-generated from `packages/cli/src/commands/catalog.ts`. Do not edit by hand.
+> Regenerate: `pnpm --filter bailian-cli run generate:reference`.
+
+Index: [index.md](index.md)
+
+## Commands in this group
+
+| Command            | Description                             |
+| ------------------ | --------------------------------------- |
+| `bl quota check`   | Check current usage against rate limits |
+| `bl quota history` | View quota change history               |
+| `bl quota list`    | View model RPM/TPM rate limits          |
+| `bl quota request` | Request a temporary quota increase      |
+
+## Command details
+
+### `bl quota check`
+
+| Field           | Value                                      |
+| --------------- | ------------------------------------------ |
+| **Name**        | `quota check`                              |
+| **Description** | Check current usage against rate limits    |
+| **Usage**       | `bl quota check [--model <model>] [flags]` |
+
+#### Options
+
+| Flag                 | Type   | Required | Description                                     |
+| -------------------- | ------ | -------- | ----------------------------------------------- |
+| `--model <model>`    | string | no       | Model name(s), comma-separated                  |
+| `--period <minutes>` | string | no       | Query usage for the last N minutes (default: 2) |
+| `--region <region>`  | string | no       | API region (default: cn-beijing)                |
+
+#### Examples
+
+```bash
+bl quota check
+```
+
+```bash
+bl quota check --model qwen3.6-plus
+```
+
+```bash
+bl quota check --period 5
+```
+
+```bash
+bl quota check --model qwen3.6-plus,qwen-turbo
+```
+
+```bash
+bl quota check --output json
+```
+
+### `bl quota history`
+
+| Field           | Value                      |
+| --------------- | -------------------------- |
+| **Name**        | `quota history`            |
+| **Description** | View quota change history  |
+| **Usage**       | `bl quota history [flags]` |
+
+#### Options
+
+| Flag                | Type   | Required | Description                      |
+| ------------------- | ------ | -------- | -------------------------------- |
+| `--page <n>`        | string | no       | Page number (default: 1)         |
+| `--page-size <n>`   | string | no       | Page size (default: 10)          |
+| `--model <model>`   | string | no       | Filter by model name             |
+| `--region <region>` | string | no       | API region (default: cn-beijing) |
+
+#### Examples
+
+```bash
+bl quota history
+```
+
+```bash
+bl quota history --page 2
+```
+
+```bash
+bl quota history --page-size 20
+```
+
+```bash
+bl quota history --model qwen-turbo
+```
+
+```bash
+bl quota history --output json
+```
+
+### `bl quota list`
+
+| Field           | Value                                     |
+| --------------- | ----------------------------------------- |
+| **Name**        | `quota list`                              |
+| **Description** | View model RPM/TPM rate limits            |
+| **Usage**       | `bl quota list [--model <model>] [flags]` |
+
+#### Options
+
+| Flag                | Type    | Required | Description                                 |
+| ------------------- | ------- | -------- | ------------------------------------------- |
+| `--model <model>`   | string  | no       | Model name(s), comma-separated              |
+| `--all`             | boolean | no       | Show all models, not just self-service ones |
+| `--region <region>` | string  | no       | API region (default: cn-beijing)            |
+
+#### Examples
+
+```bash
+bl quota list
+```
+
+```bash
+bl quota list --model qwen3.6-plus
+```
+
+```bash
+bl quota list --model qwen3.6-plus,qwen-turbo
+```
+
+```bash
+bl quota list --all
+```
+
+```bash
+bl quota list --output json
+```
+
+### `bl quota request`
+
+| Field           | Value                                                    |
+| --------------- | -------------------------------------------------------- |
+| **Name**        | `quota request`                                          |
+| **Description** | Request a temporary quota increase                       |
+| **Usage**       | `bl quota request --model <model> --tpm <value> [flags]` |
+
+#### Options
+
+| Flag                | Type    | Required | Description                      |
+| ------------------- | ------- | -------- | -------------------------------- |
+| `--model <model>`   | string  | yes      | Model name                       |
+| `--tpm <value>`     | string  | yes      | Target TPM value                 |
+| `--yes`             | boolean | no       | Skip downgrade confirmation      |
+| `--region <region>` | string  | no       | API region (default: cn-beijing) |
+
+#### Examples
+
+```bash
+bl quota request --model qwen-turbo --tpm 100000
+```
+
+```bash
+bl quota request --model qwen3.6-plus --tpm 8000000 --yes
+```
+
+```bash
+bl quota request --model qwen-turbo --tpm 100000 --output json
+```
diff --git a/skills/bailian-cli/reference/usage.md b/skills/bailian-cli/reference/usage.md
index cd80642..52ccd11 100644
--- a/skills/bailian-cli/reference/usage.md
+++ b/skills/bailian-cli/reference/usage.md
@@ -7,33 +7,53 @@ Index: [index.md](index.md)
 
 ## Commands in this group
 
-| Command         | Description                       |
-| --------------- | --------------------------------- |
-| `bl usage free` | Query free-tier quota for a model |
+| Command             | Description                                                                                |
+| ------------------- | ------------------------------------------------------------------------------------------ |
+| `bl usage free`     | Query free-tier quota for models (all models if --model is omitted)                        |
+| `bl usage freetier` | Enable or disable auto-stop for free-tier models. Enables by default; use --off to disable |
+| `bl usage stats`    | Query model usage statistics                                                               |
 
 ## Command details
 
 ### `bl usage free`
 
-| Field           | Value                                   |
-| --------------- | --------------------------------------- |
-| **Name**        | `usage free`                            |
-| **Description** | Query free-tier quota for a model       |
-| **Usage**       | `bl usage free --model <model> [flags]` |
+| Field           | Value                                                               |
+| --------------- | ------------------------------------------------------------------- |
+| **Name**        | `usage free`                                                        |
+| **Description** | Query free-tier quota for models (all models if --model is omitted) |
+| **Usage**       | `bl usage free [--model <model>[,model2,...]] [flags]`              |
 
 #### Options
 
-| Flag                | Type   | Required | Description                                      |
-| ------------------- | ------ | -------- | ------------------------------------------------ |
-| `--model <model>`   | string | yes      | Model name to query (e.g. qwen3-max, qwen-turbo) |
-| `--region <region>` | string | no       | API region (default: cn-beijing)                 |
+| Flag                | Type   | Required | Description                                                               |
+| ------------------- | ------ | -------- | ------------------------------------------------------------------------- |
+| `--model <model>`   | string | no       | Model name(s) to query, comma-separated for multiple; omit for all models |
+| `--expiring <days>` | string | no       | Only show quotas expiring within N days                                   |
+| `--sort <field>`    | string | no       | Sort by: remaining (ascending), expires (ascending)                       |
+| `--region <region>` | string | no       | API region (default: cn-beijing)                                          |
 
 #### Examples
 
+```bash
+bl usage free
+```
+
 ```bash
 bl usage free --model qwen3-max
 ```
 
+```bash
+bl usage free --model qwen3-max,qwen-turbo
+```
+
+```bash
+bl usage free --expiring 30
+```
+
+```bash
+bl usage free --sort remaining
+```
+
 ```bash
 bl usage free --model qwen-turbo --output json
 ```
@@ -41,3 +61,95 @@ bl usage free --model qwen-turbo --output json
 ```bash
 bl usage free --model qwen3-max --region cn-beijing
 ```
+
+### `bl usage freetier`
+
+| Field           | Value                                                                                      |
+| --------------- | ------------------------------------------------------------------------------------------ |
+| **Name**        | `usage freetier`                                                                           |
+| **Description** | Enable or disable auto-stop for free-tier models. Enables by default; use --off to disable |
+| **Usage**       | `bl usage freetier <--model <model>[,model2,...] \| --all> [--off] [flags]`                |
+
+#### Options
+
+| Flag                | Type    | Required | Description                                 |
+| ------------------- | ------- | -------- | ------------------------------------------- |
+| `--model <model>`   | string  | no       | Model name(s), comma-separated for multiple |
+| `--all`             | boolean | no       | Apply to all free-tier models               |
+| `--on`              | boolean | no       | Enable auto-stop (default behavior)         |
+| `--off`             | boolean | no       | Disable auto-stop                           |
+| `--region <region>` | string  | no       | API region (default: cn-beijing)            |
+
+#### Examples
+
+```bash
+bl usage freetier --model qwen3-max
+```
+
+```bash
+bl usage freetier --model qwen3-max,qwen-turbo
+```
+
+```bash
+bl usage freetier --all
+```
+
+```bash
+bl usage freetier --on --model qwen3-max
+```
+
+```bash
+bl usage freetier --off --model qwen3-max
+```
+
+```bash
+bl usage freetier --off --all
+```
+
+### `bl usage stats`
+
+| Field           | Value                                                      |
+| --------------- | ---------------------------------------------------------- |
+| **Name**        | `usage stats`                                              |
+| **Description** | Query model usage statistics                               |
+| **Usage**       | `bl usage stats [--model <model>] [--days <days>] [flags]` |
+
+#### Options
+
+| Flag                  | Type   | Required | Description                                            |
+| --------------------- | ------ | -------- | ------------------------------------------------------ |
+| `--model <model>`     | string | no       | Model name(s), comma-separated; omit for overview      |
+| `--days <days>`       | string | no       | Number of days (default: 7)                            |
+| `--type <type>`       | string | no       | Model type: Text, Vision, Multimodal, Audio, Embedding |
+| `--workspace-id <id>` | string | no       | Workspace ID (env: BAILIAN_WORKSPACE_ID)               |
+| `--region <region>`   | string | no       | API region (default: cn-beijing)                       |
+
+#### Examples
+
+```bash
+bl usage stats
+```
+
+```bash
+bl usage stats --days 30
+```
+
+```bash
+bl usage stats --model qwen-turbo
+```
+
+```bash
+bl usage stats --model qwen-turbo --days 7
+```
+
+```bash
+bl usage stats --model qwen3.6-plus,deepseek-v4-pro
+```
+
+```bash
+bl usage stats --type Text --days 14
+```
+
+```bash
+bl usage stats --output json
+```
diff --git a/skills/bailian-cli/reference/workspace.md b/skills/bailian-cli/reference/workspace.md
new file mode 100644
index 0000000..2428721
--- /dev/null
+++ b/skills/bailian-cli/reference/workspace.md
@@ -0,0 +1,43 @@
+# `bl workspace` commands
+
+> Auto-generated from `packages/cli/src/commands/catalog.ts`. Do not edit by hand.
+> Regenerate: `pnpm --filter bailian-cli run generate:reference`.
+
+Index: [index.md](index.md)
+
+## Commands in this group
+
+| Command             | Description         |
+| ------------------- | ------------------- |
+| `bl workspace list` | List all workspaces |
+
+## Command details
+
+### `bl workspace list`
+
+| Field           | Value                       |
+| --------------- | --------------------------- |
+| **Name**        | `workspace list`            |
+| **Description** | List all workspaces         |
+| **Usage**       | `bl workspace list [flags]` |
+
+#### Options
+
+| Flag                | Type   | Required | Description                      |
+| ------------------- | ------ | -------- | -------------------------------- |
+| `--list <n>`        | string | no       | Limit number of results          |
+| `--region <region>` | string | no       | API region (default: cn-beijing) |
+
+#### Examples
+
+```bash
+bl workspace list
+```
+
+```bash
+bl workspace list --list 5
+```
+
+```bash
+bl workspace list --output json
+```