"We want an AI agent inside Feishu so employees can call it from chat" — one of the top requests of the past year.

The reasoning is sound: employees spend 8+ hours a day inside Feishu / DingTalk / WeCom. That's the lowest-friction distribution path for AI. Any standalone web tool, however elegant, loses to "employees don't bother opening it". IM integration is where real adoption lives.

But IM integration is also the highest-risk rollout path. IM reaches everyone, the data is sensitive, responsibility is distributed. We've seen too many companies "ship first, govern later" — an incident two months in, compliance halts the project, everything rolls back.

This piece lays out the four questions that must be answered before launch: ingress, permissions, audit, cost. Not exhaustive. Unavoidable.

1. Question 1: Ingress

Looks the simplest. Most commonly mis-scoped.

"Integrate AI into Feishu" doesn't have a single shape. At least four forms exist, with different implementations and UX:

Form A: Bot account (@-mention)

Most common. Employees @ the bot in any chat or group.

Pro: zero learning curve for employees. Con: @-mentioning in a group exposes both the question and the answer to everyone in the group — a permission-model trap.

When: low-sensitivity scenarios like "write a meeting minutes template" or "translate this".

Form B: Mini-app / dedicated entry

Employees open a Feishu-embedded mini-app for a dedicated chat-style interface. Data stays inside the mini-app, not in group chat.

Pro: clean data isolation, dedicated permissions, richer UI (e.g., cards with download buttons). Con: one more click — most employees won't open it voluntarily.

When: medium-sensitivity scenarios like "look up receivables" or "generate report".

Form C: Slash commands

Employees type /lookup-customer or /generate-report in any chat to trigger a specific Skill.

Pro: combines A and B — ingress anywhere + execution isolation. Con: employees must remember commands. Training overhead.

When: high-frequency, structured scenarios — sales daily customer lookups, finance daily reconciliation.
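
The dispatch logic behind Form C can be sketched in a few lines. This is a minimal illustration, not any platform's actual API: the command names mirror the examples above, and the Skill handlers are placeholders.

```python
# Minimal slash-command dispatch sketch. Assumes the IM webhook delivers
# the raw message text; command names and handlers are illustrative.
from typing import Callable, Dict

SKILLS: Dict[str, Callable[[str, str], str]] = {
    "/lookup-customer": lambda user, arg: f"[customer card for {arg}]",
    "/generate-report": lambda user, arg: f"[report for {arg}]",
}

def dispatch(user_id: str, text: str) -> str:
    """Route a chat message to a Skill; unknown input falls through."""
    parts = text.strip().split(maxsplit=1)
    if not parts:
        return "Unknown command. Try /lookup-customer or /generate-report."
    cmd, arg = parts[0], (parts[1] if len(parts) > 1 else "")
    handler = SKILLS.get(cmd)
    if handler is None:
        return "Unknown command. Try /lookup-customer or /generate-report."
    return handler(user_id, arg)
```

The key design point: each command maps to exactly one Skill, so execution isolation comes for free even though the ingress is any chat.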

Form D: Hybrid

Combination of the above. Different scenarios, different ingress.

Recommendation: most real deployments end up hybrid. Forcing a single form everywhere squeezes scenarios into the wrong shape.


Critical reminder: ingress form must be finalized with the client at project kickoff. This is product design, not tech selection. We've seen projects where engineers deployed everything, then business said "oh, that's not how we want to use it" — rework.

2. Question 2: Permissions

The most underestimated of the four. Actually the hardest.

Core principle: AI doesn't replace permissions, it operates on top of them.

Anti-pattern: permission mirroring

Many companies' first instinct: "export the customer table and order table from ERP into a vector DB; let the agent retrieve whatever employees ask."

Fatal errors:

1. Permissions distort. In ERP, Engineer Zhang sees only East Region customers. After import, the agent retrieves by similarity — may surface North Region customer data. The agent doesn't understand "constrained by org boundary".

2. Permissions go stale. An employee leaves, changes role, or has permissions adjusted — the ERP reflects that immediately, but the mirrored vector DB still holds the old data. An ex-employee's next AI query can surface more than their current (or revoked) role should allow.

3. Audit breaks. Business systems have permission audit logs, but AI-side access records don't map cleanly back. Post-incident traceability is hard.

Correct pattern: don't mirror; check at source.

Pattern: identity pass-through + live source check

Standard implementation:

  1. Employee identity pass-through: the agent runs as the current employee (not a service account)
  2. Call business APIs, not databases: use the employee's identity to call business system APIs; the business system applies its own permission rules when returning data
  3. Live ACL re-check: for knowledge base documents, every query re-checks the source system (Feishu wiki / shared drive) in real time: "can this employee still see this document?" — 5-minute cache is acceptable
  4. Default deny: when permission is ambiguous, default deny, not default allow
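
Steps 3 and 4 can be sketched together: a live ACL check with a short-TTL cache, where any ambiguity (including a failed call to the source system) resolves to deny. The `source_acl_check` function is a placeholder for whatever ACL API the source system exposes; nothing here is a real Feishu endpoint.

```python
import time

ACL_TTL = 300  # seconds — the "5-minute cache is acceptable" above
_acl_cache: dict = {}  # (employee_id, doc_id) -> (allowed, checked_at)

def source_acl_check(employee_id: str, doc_id: str) -> bool:
    """Placeholder for a live call to the source system's ACL API
    (e.g. Feishu wiki / shared drive). Raises in this sketch."""
    raise NotImplementedError

def can_read(employee_id: str, doc_id: str, check=source_acl_check) -> bool:
    """Re-check the source of truth on every query, with a short cache."""
    key = (employee_id, doc_id)
    hit = _acl_cache.get(key)
    if hit is not None and time.time() - hit[1] < ACL_TTL:
        return hit[0]
    try:
        allowed = check(employee_id, doc_id)
    except Exception:
        return False  # default deny: ambiguity never becomes access
    _acl_cache[key] = (allowed, time.time())
    return allowed
```

Note the error path is not cached: a transient failure denies once but does not poison the cache with a stale "no".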

All the rules the business systems already handle — permissions, role changes, departures — the AI inherits for free. No "separate permission model" means no "permission drift".

Technical details are in our SiNan product (pluggable Hermes / OpenClaw open-source agents, AES-256-GCM envelope encryption, live Feishu ACL recheck). See SiNan architecture.

3. Question 3: Audit

Once AI is in IM, three audit scenarios show up inevitably:

Scenario 1: Compliance audit

Regulators or internal compliance ask: "Last month, who asked what sensitive question, what did AI answer, where did the data come from?"

Minimum requirements:

  • Every call recorded: requester ID, timestamp, original question, Skill invoked, data sources queried, answer returned
  • Hash-chain integrity: each record contains the previous record's hash; any tamper breaks the chain
  • Retention: at least 6 months (some industries require longer — finance 5 years)
  • Exportable: during regulatory inspection, structured CSV exportable by time / person / keyword
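
The hash-chain requirement is mechanically simple. A minimal sketch, assuming JSON-serializable audit records; field names are illustrative:

```python
import hashlib
import json

GENESIS = "0" * 64  # sentinel "previous hash" for the first record

def append_record(chain: list, record: dict) -> None:
    """Append an audit record, linking it to the previous record's hash."""
    prev = chain[-1]["hash"] if chain else GENESIS
    body = dict(record, prev_hash=prev)
    digest = hashlib.sha256(
        json.dumps(body, sort_keys=True).encode()
    ).hexdigest()
    chain.append(dict(body, hash=digest))

def verify(chain: list) -> bool:
    """Recompute every hash; any tampered field breaks the chain."""
    prev = GENESIS
    for rec in chain:
        body = {k: v for k, v in rec.items() if k != "hash"}
        if body.get("prev_hash") != prev:
            return False
        recomputed = hashlib.sha256(
            json.dumps(body, sort_keys=True).encode()
        ).hexdigest()
        if recomputed != rec["hash"]:
            return False
        prev = rec["hash"]
    return True
```

Editing any field of any record, or deleting a record from the middle, fails verification from that point on — which is exactly the property a compliance audit needs.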

Scenario 2: Cost tracking

Leadership asks: "How much did each department spend on tokens last month? Which employees use it most? Any anomalies?"

Needs:

  • Aggregation by org hierarchy (department → team → individual)
  • Real-time usage dashboard
  • Anomaly alerts (e.g., employee's single-day usage suddenly 10× typical)
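
The anomaly rule can be as simple as comparing a day's usage against a trailing mean. A sketch, assuming per-employee daily token counts; the 10× factor mirrors the example above and should be tuned per deployment:

```python
from statistics import mean

def is_anomalous(history: list, today: int, factor: float = 10.0) -> bool:
    """Flag a day whose token usage exceeds `factor` x the trailing mean.
    `history` is the employee's recent daily totals (e.g. last 30 days)."""
    if not history:
        return False  # no baseline yet; don't alert on new users
    return today > factor * mean(history)
```

Real deployments usually add a minimum-baseline floor so that a user averaging 10 tokens/day doesn't alert on a single normal question.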

Scenario 3: Quality traceability

Business asks: "Employee A got a wrong AI answer last week. Why was it wrong?"

Needs:

  • Full decision replay (which Skills called, which data pulled, what answer returned)
  • Data source versioning (which version of the document was cited? has it been updated since?)
  • Reproducible (can the state be reset and the query re-run)

Agent gateways that can't do these three audits shouldn't go to production. A red line. Agents built on simple SaaS clients (e.g., direct Feishu Open Platform API calls) typically have only rough usage logs — insufficient for all three.

4. Question 4: Cost

The easiest to lose control of.

After IM integration, AI agent usage far exceeds launch-time estimates. Reasons:

  1. Every employee is a potential user (vs. standalone web where only the willing open it)
  2. Chat ingress lowers friction — employees build habits and ask more
  3. Long-context conversations (continuous follow-ups) burn 3-5× the tokens of a single-shot query
  4. Auto-triggered scenarios (daily reports) generate heavy background calls

Practical estimate: 3 months post-launch, usage is typically 3-5× the launch estimate. If you budgeted ¥50k/month, expect ¥150-250k.

Four things to set up upfront

1. Four-tier quota

Don't just have a "company monthly cap":

  • Org: annual company cap
  • Customer: per-client or per-project quotas (for cross-org collaboration)
  • Team: per-department monthly cap
  • VK (individual): per-employee daily/monthly cap

Stacking prevents a runaway employee from draining a department, and a runaway department from draining the company.
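
The stacking check itself is small: a request must clear every tier before it is charged to any of them. A sketch, with tier names following the list above ("VK" = the per-individual key):

```python
from dataclasses import dataclass

@dataclass
class Quota:
    limit: int
    used: int = 0

    def has_room(self, tokens: int) -> bool:
        return self.used + tokens <= self.limit

    def charge(self, tokens: int) -> None:
        self.used += tokens

def try_consume(tiers: list, tokens: int) -> bool:
    """All tiers (org -> customer -> team -> individual) must have room;
    only then is usage charged to every tier atomically."""
    if all(q.has_room(tokens) for q in tiers):
        for q in tiers:
            q.charge(tokens)
        return True
    return False
```

Check-then-charge must be atomic in a real gateway (a lock or a transactional counter), otherwise concurrent requests can overshoot a tier.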

2. Tiered pricing / tiered models

Not every question needs the most expensive model:

  • Simple (translate, fix typos): route to lightweight model (1/10 the cost)
  • General QA: route to mid-tier
  • High complexity (code gen, long analysis): use the top model

Good tiering typically cuts overall cost 40-60%. Requires model routing at the agent layer — don't send everything to the most expensive endpoint.
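
A toy version of that routing layer, to make the shape concrete. Real deployments typically classify with a cheap model rather than keyword matching, and the model names here are placeholders:

```python
def route_model(question: str) -> str:
    """Route a question to a model tier. Keyword heuristic for
    illustration only; production routing uses a small classifier."""
    q = question.lower()
    if any(k in q for k in ("translate", "typo", "fix spelling")):
        return "light-model"   # ~1/10 the cost of the top tier
    if any(k in q for k in ("write code", "analyze", "long report")):
        return "top-model"
    return "mid-model"         # default: general QA
```

The important property is that the default lands on the mid tier, so only explicitly complex work reaches the expensive endpoint.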

3. Caching layer

Don't recompute identical/near-identical questions. Use SHA256 content hash for caching. In high-frequency scenarios ("what's the latest leave policy"), hit rate reaches 60%+, approaching zero marginal cost.
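
The content-hash cache in a few lines, assuming exact-match-after-normalization semantics (whitespace and case collapsed; semantic near-duplicates need an embedding-based cache, which is out of scope here):

```python
import hashlib

_answer_cache: dict = {}  # sha256(normalized question) -> answer

def cached_answer(question: str, generate) -> str:
    """Normalize, hash, and cache. Questions that normalize to the same
    string share one cache entry; `generate` is the expensive LLM call."""
    normalized = " ".join(question.lower().split())
    key = hashlib.sha256(normalized.encode()).hexdigest()
    if key not in _answer_cache:
        _answer_cache[key] = generate(question)
    return _answer_cache[key]
```

One caveat that ties back to Question 2: cache entries must be keyed (or checked) per permission scope, or a cached answer can leak across employees who shouldn't see the same data.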

4. Real-time budget alerts

Don't wait for the month-end bill:

  • Alerts when monthly cumulative hits 50% / 80% / 100%
  • Anomaly alerts by department / individual
  • Auto-degrade before quota exhaustion (switch to lightweight model rather than hard-block)
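
The threshold ladder above maps directly to a policy function. A sketch; the 95% degrade point is an assumption standing in for "before quota exhaustion", and is a policy choice, not a fixed number:

```python
def budget_action(used: float, cap: float) -> str:
    """Map monthly spend to an action per the thresholds above.
    'degrade' means switch to the lightweight model, not hard-block."""
    ratio = used / cap
    if ratio >= 0.95:
        return "degrade"   # assumed degrade point, just before exhaustion
    if ratio >= 0.8:
        return "alert-80"
    if ratio >= 0.5:
        return "alert-50"
    return "ok"
```

Degrading instead of blocking keeps the habit loop intact — employees still get an answer, just a cheaper one — which matters more than the marginal savings of a hard stop.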

5. Why "ship first, govern later" always fails

Typical failure modes we've seen:

  • Mode 1: two months of agent deployment, compliance arrives late, audit capability is inadequate — full teardown. Cost: 3-month delay, engineering work wasted.
  • Mode 2: employee @-mentions AI in a group about sensitive customer data; colleagues see the answer; dispute. Cost: trust damage, project halted.
  • Mode 3: no cost controls; month-two bill hits ¥200k; leadership shuts it down. Cost: all prior deployment scrapped.
  • Mode 4: permission-mirror pattern; customer data appears in AI answers; lawsuit. Cost: litigation + brand + compensation.

All four would have been avoided by answering this article's four questions before launch.

6. Closing

Putting AI into Feishu / DingTalk / WeCom is not a technical problem. It's a combined product-governance-cost design problem.

Technically, options abound — build your own agent framework, use open source, buy a mature product. Whichever you pick, answer these four questions first:

  1. Ingress: which scenarios, which forms? How does the hybrid combine?
  2. Permissions: don't mirror. Live source check. Identity pass-through.
  3. Audit: compliance + cost + quality — all three.
  4. Cost: four-tier quota, tiered models, caching, alerts.

Thinking first is 10× cheaper than remediating later.

Our SiNan (enterprise AI agent gateway) product started from these four questions — SHA256 hash chain audit, live Feishu ACL recheck, multi-tenant physical isolation, four-tier quota. If you're building similar internal capability, you're welcome to book a pilot discussion or use SiNan as a reference architecture.