Shipping OAuth 2.1 for an MCP Server: What Actually Broke

Notes from wiring OAuth 2.1 (RFC 8414, RFC 9728, DCR, PKCE, resource indicators) into the 3meel.ai MCP server — including a refresh-token bug we caught after shipping and some stdio resources we forgot to register.

The MCP authorization spec settled on OAuth 2.1 with a specific set of RFCs bolted on. On paper it reads like a tidy shopping list. In practice, stitching it together for a real server exposed a handful of things we got wrong on the first pass. This post is the honest version: what we shipped, what broke, and what we changed.

The target spec

We wanted an MCP server that Claude Desktop, Claude Code, and generic MCP clients could connect to without hand-edited tokens. The spec calls for: RFC 8414 authorization server metadata, RFC 9728 protected resource metadata, RFC 7591 dynamic client registration, authorization code flow with mandatory PKCE (S256), and RFC 8707 resource indicators so tokens are bound to a specific audience.
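To make the list above concrete, here is a minimal sketch of the two discovery documents a client would fetch. Only `https://mcp.3meel.ai/mcp` and `/oauth/token` appear elsewhere in this post; the issuer origin, the `/oauth/authorize` and `/oauth/register` paths, and the choice of public clients are assumptions for illustration. The field names come from RFC 8414 and RFC 9728.

```typescript
// Assumed values: the issuer origin and two of the endpoint paths are hypothetical.
const ISSUER = "https://mcp.3meel.ai";
const RESOURCE = "https://mcp.3meel.ai/mcp";

// RFC 8414 authorization server metadata.
const authorizationServerMetadata = {
  issuer: ISSUER,
  authorization_endpoint: `${ISSUER}/oauth/authorize`, // assumed path
  token_endpoint: `${ISSUER}/oauth/token`,
  registration_endpoint: `${ISSUER}/oauth/register`, // assumed path (RFC 7591)
  response_types_supported: ["code"],
  grant_types_supported: ["authorization_code", "refresh_token"],
  code_challenge_methods_supported: ["S256"], // PKCE, S256 only
  token_endpoint_auth_methods_supported: ["none"], // assumption: public clients
};

// RFC 9728 protected resource metadata.
const protectedResourceMetadata = {
  resource: RESOURCE, // the audience that tokens get bound to (RFC 8707)
  authorization_servers: [ISSUER],
  bearer_methods_supported: ["header"],
};
```

The one cross-document invariant worth checking is that every entry in `authorization_servers` matches the `issuer` of an authorization server document, since clients follow that pointer to discover the endpoints.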

We also kept token lifetimes short: a 1-hour access token and a 30-day refresh token, both stored as SHA-256 hashes, rotated on every refresh. A Bearer middleware on the `/mcp` endpoint hands the request through to the same tool handlers the per-key transport already used.
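A rough sketch of what that storage policy could look like, assuming `hashToken` is a hex-encoded SHA-256 and that tokens are random opaque strings. The `issueTokenPair` helper and the row field names are hypothetical, not the actual schema.

```typescript
import { createHash, randomBytes } from "node:crypto";

const ACCESS_TOKEN_TTL_MS = 60 * 60 * 1000; // 1 hour
const REFRESH_TOKEN_TTL_MS = 30 * 24 * 60 * 60 * 1000; // 30 days

// Assumed implementation: hex-encoded SHA-256 of the plaintext token.
function hashToken(plain: string): string {
  return createHash("sha256").update(plain).digest("hex");
}

// Hypothetical helper: returns the plaintext pair for the client and the
// hashed row for the database. Only the hashes are ever persisted.
function issueTokenPair(now: Date = new Date()) {
  const accessToken = randomBytes(32).toString("base64url");
  const refreshToken = randomBytes(32).toString("base64url");
  return {
    plaintext: { accessToken, refreshToken },
    row: {
      accessTokenHash: hashToken(accessToken),
      refreshTokenHash: hashToken(refreshToken),
      accessExpiresAt: new Date(now.getTime() + ACCESS_TOKEN_TTL_MS),
      refreshExpiresAt: new Date(now.getTime() + REFRESH_TOKEN_TTL_MS),
    },
  };
}
```

Because the tokens are high-entropy random values, a plain unsalted hash is enough here: a database dump exposes nothing an attacker can replay, and lookups stay a single indexed equality check.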

Problem 1: the refresh token didn't check who was using it

Our first version of `rotateRefreshToken` accepted a plaintext refresh token, looked it up by its hash, and rotated it. That is the obvious shape of the function. It is also wrong. If a refresh token ever leaked to a different registered client (through a logging mistake, a misconfigured proxy, or a backup dump), that client could use it to mint new tokens under another client's identity. The OAuth specs are explicit about this, going back to RFC 6749 section 10.4 and carried forward into OAuth 2.1: the authorization server MUST maintain the binding between a refresh token and the client it was issued to.

The fix is small. The function takes the `client_id` from the token request and refuses to rotate if it doesn't match the `client_id` on the stored token row.

```typescript
// apps/mcp/src/oauth/db.ts
export async function rotateRefreshToken(
  plainRefreshToken: string,
  clientId: string,
) {
  const hashedRefresh = hashToken(plainRefreshToken);
  const [existing] = await db
    .select()
    .from(oauthTokens)
    .where(eq(oauthTokens.refreshToken, hashedRefresh))
    .limit(1);

  if (!existing) return null;

  // Bind the refresh token to its owning client.
  if (existing.clientId !== clientId) return null;

  // ...revoke old pair, issue a new access + refresh, return them.
}
```

The caller in `/oauth/token` now passes `client_id` through. Without this check, a token leak is directly exploitable by any registered client that gets hold of the token. With it, a leaked refresh token cannot be redeemed under a different client's identity: only a request presenting the owning `client_id` can rotate it.
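For illustration, the refresh branch of the token endpoint might look roughly like the sketch below. The `handleRefreshGrant` name, the injected `rotate` function, and the response shape are assumptions; the `invalid_request` and `invalid_grant` error codes are the ones RFC 6749 section 5.2 defines for these cases.

```typescript
interface TokenPair {
  accessToken: string;
  refreshToken: string;
  expiresIn: number; // seconds
}

// Same signature as rotateRefreshToken, injected so the handler is testable.
type Rotate = (plainRefreshToken: string, clientId: string) => Promise<TokenPair | null>;

interface TokenResponse {
  status: number;
  body: Record<string, string | number>;
}

// Hypothetical handler for grant_type=refresh_token.
async function handleRefreshGrant(
  form: Record<string, string | undefined>,
  rotate: Rotate,
): Promise<TokenResponse> {
  const refreshToken = form["refresh_token"];
  const clientId = form["client_id"];
  if (!refreshToken || !clientId) {
    return { status: 400, body: { error: "invalid_request" } };
  }

  const pair = await rotate(refreshToken, clientId);
  if (!pair) {
    // Unknown token and wrong client look identical to the caller on purpose.
    return { status: 400, body: { error: "invalid_grant" } };
  }

  return {
    status: 200,
    body: {
      access_token: pair.accessToken,
      token_type: "Bearer",
      expires_in: pair.expiresIn,
      refresh_token: pair.refreshToken,
    },
  };
}
```

Returning the same `invalid_grant` for a missing token and a mismatched client avoids telling a caller whether a token it found is real.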

Problem 2: the stdio transport was missing half its resources

Our MCP server ships with two transports: a Streamable HTTP `/mcp` endpoint and a local stdio server for Claude Desktop. The HTTP path registers resources through a shared Hono context, which lets us reuse one `registerUnifiedResources` function across per-KB, per-project, and unified servers. Stdio has no Hono context (it holds the user ID and knowledge base list in a closure), so we register its resources inline.

That is the moment the two code paths diverged. When we landed the initial OAuth + resources PR, the stdio server only exposed the knowledge-bases list. Template resources like `3meel://knowledge-bases/{kbId}`, `3meel://projects/{projectId}`, and `3meel://projects/{projectId}/memories` existed in the HTTP path and not in stdio. Claude Desktop users could list collections but not drill in.

The fix was filling in the missing three — same shape as the HTTP versions, but reading from the closure variables instead of the context.

```typescript
server.resource(
  "project-details",
  new ResourceTemplate("3meel://projects/{projectId}", { list: undefined }),
  { description: "Project details with linked collections" },
  async (uri, { projectId }) => {
    const project = await resolveProject(userId, String(projectId));
    if (!project) {
      return {
        contents: [{
          uri: uri.href,
          mimeType: "application/json",
          text: JSON.stringify({ error: "Project not found" }),
        }],
      };
    }
    const collections = await listProjectKnowledgeBases(userId, project.id);
    return {
      contents: [{
        uri: uri.href,
        mimeType: "application/json",
        text: JSON.stringify({
          id: project.id,
          name: project.name,
          slug: project.slug,
          collections,
        }, null, 2),
      }],
    };
  },
);
```

The lesson: when you have two transports that share tool handlers but not resource registration, resources drift. We now treat resource parity as part of the transport contract and have an open todo to collapse the two paths into a single registration with a context shim.
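One possible shape for that context shim, sketched under assumptions: `ResourceContext`, `ResourceDef`, `ResourceRegistrar`, and `registerAllResources` are hypothetical names, the handler bodies are placeholders, and the registrar stands in for whatever the real server exposes. Only the three URI templates and the `project-details` name come from this post.

```typescript
// Hypothetical: the minimum a resource handler needs, whichever transport supplies it.
interface ResourceContext {
  userId: string;
  knowledgeBaseIds: readonly string[];
}

interface ResourceDef {
  name: string;
  uriTemplate: string;
  read(ctx: ResourceContext, vars: Record<string, string>): Promise<string>;
}

// Single source of truth for resources. Handler bodies are placeholders.
const RESOURCE_DEFS: readonly ResourceDef[] = [
  {
    name: "knowledge-base-details", // placeholder name
    uriTemplate: "3meel://knowledge-bases/{kbId}",
    read: async (ctx, { kbId }) => JSON.stringify({ userId: ctx.userId, kbId }),
  },
  {
    name: "project-details",
    uriTemplate: "3meel://projects/{projectId}",
    read: async (ctx, { projectId }) => JSON.stringify({ userId: ctx.userId, projectId }),
  },
  {
    name: "project-memories", // placeholder name
    uriTemplate: "3meel://projects/{projectId}/memories",
    read: async (ctx, { projectId }) => JSON.stringify({ userId: ctx.userId, projectId, memories: [] }),
  },
];

// Stand-in for the server's registration surface.
interface ResourceRegistrar {
  register(
    name: string,
    uriTemplate: string,
    read: (vars: Record<string, string>) => Promise<string>,
  ): void;
}

// Both transports call this; they differ only in how getContext is built.
function registerAllResources(
  registrar: ResourceRegistrar,
  getContext: () => ResourceContext,
): void {
  for (const def of RESOURCE_DEFS) {
    registrar.register(def.name, def.uriTemplate, (vars) => def.read(getContext(), vars));
  }
}
```

In this sketch the stdio server would pass a `getContext` that returns its closure variables, and the HTTP server would pass one that reads the same two values from the request context. Adding a resource then means adding one entry to `RESOURCE_DEFS`, and neither transport can fall behind.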

Things the spec tells you but that are still easy to miss

PKCE is mandatory for every authorization code flow, including for confidential clients that you might otherwise consider exempt. We reject requests without an S256 `code_challenge`.
Resource indicators (RFC 8707) mean an access token minted for `https://mcp.3meel.ai/mcp` is not valid at any other resource URL. The `aud` claim is checked by the Bearer middleware on every request.
RFC 8414 metadata goes at `/.well-known/oauth-authorization-server` and RFC 9728 metadata goes at `/.well-known/oauth-protected-resource`. Clients will fetch both without telling you, so both need to be correct before you flip on auth.
Dynamic client registration is opt-in per deployment, but agents expect it. If you skip it, expect every onboarding user to register a client by hand with curl.
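Two of the checks above are small enough to sketch directly: the PKCE S256 comparison from RFC 7636 and an exact-match audience check. The function names are hypothetical, and the audience check assumes the token's audience is available as a string or string array however the middleware resolves it.

```typescript
import { createHash } from "node:crypto";

// Authorization request check: S256 only, and a well-formed challenge.
// A base64url-encoded SHA-256 digest without padding is always 43 characters.
function hasS256Challenge(params: {
  code_challenge?: string;
  code_challenge_method?: string;
}): boolean {
  return (
    params.code_challenge_method === "S256" &&
    typeof params.code_challenge === "string" &&
    /^[A-Za-z0-9_-]{43}$/.test(params.code_challenge)
  );
}

// Token request check (RFC 7636 section 4.6):
// BASE64URL(SHA256(ASCII(code_verifier))) must equal the stored code_challenge.
function verifyPkceS256(codeVerifier: string, codeChallenge: string): boolean {
  const computed = createHash("sha256")
    .update(codeVerifier, "ascii")
    .digest("base64url");
  return computed === codeChallenge;
}

// Resource check (RFC 8707): the token's audience must contain this exact
// resource URL. No prefix or origin matching.
function audienceMatches(tokenAudience: string | string[], resource: string): boolean {
  const audiences = Array.isArray(tokenAudience) ? tokenAudience : [tokenAudience];
  return audiences.includes(resource);
}
```

The exact string comparison in `audienceMatches` is deliberate: anything looser, such as matching on origin, would let a token minted for one resource on a host pass at another.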

What we would do differently

If we had a time machine, the first thing we would write is a cross-transport conformance test: spin up both the HTTP and stdio servers, issue identical MCP requests against each, and diff the resource and tool listings. We would have caught the stdio resource gap on the day we introduced it instead of on the day a user hit it.
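The diffing half of such a test is simple. Here is a minimal sketch, assuming each transport's listings have already been collected into a plain `Listing` object (for example via an MCP client's list calls); the `Listing`, `ParityReport`, and `diffListings` names are hypothetical.

```typescript
interface Listing {
  tools: string[];
  resources: string[]; // concrete resource URIs
  resourceTemplates: string[]; // URI templates
}

interface ParityReport {
  missingFromStdio: string[];
  missingFromHttp: string[];
}

// Flatten a listing into comparable keys such as "template:3meel://projects/{projectId}".
function keysOf(listing: Listing): string[] {
  return [
    ...listing.tools.map((t) => `tool:${t}`),
    ...listing.resources.map((r) => `resource:${r}`),
    ...listing.resourceTemplates.map((t) => `template:${t}`),
  ].sort();
}

// Report everything one transport exposes that the other does not.
function diffListings(http: Listing, stdio: Listing): ParityReport {
  const httpKeys = keysOf(http);
  const stdioKeys = keysOf(stdio);
  const httpSet = new Set(httpKeys);
  const stdioSet = new Set(stdioKeys);
  return {
    missingFromStdio: httpKeys.filter((k) => !stdioSet.has(k)),
    missingFromHttp: stdioKeys.filter((k) => !httpSet.has(k)),
  };
}
```

A conformance test would assert that both arrays are empty. Run against the state described earlier, `missingFromStdio` would have listed the three template resources.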

The second would be a property test around `rotateRefreshToken` that feeds it mismatched `client_id` values and asserts it never returns a token pair. Unit tests for refresh rotation are easy to write against the happy path and easy to skip for the adversarial path, which is exactly the path that matters.
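A dependency-free sketch of that property, run against an in-memory stand-in rather than the real database. `InMemoryTokenStore` and `mismatchedClientsNeverRotate` are hypothetical, and a property-testing library would generate inputs more thoroughly than this random loop does.

```typescript
import { createHash, randomBytes } from "node:crypto";

interface TokenRow {
  clientId: string;
  refreshTokenHash: string;
  revoked: boolean;
}

const hashToken = (plain: string): string =>
  createHash("sha256").update(plain).digest("hex");

// In-memory model of the rotation logic, mirroring the client binding check.
class InMemoryTokenStore {
  private rows: TokenRow[] = [];

  issue(clientId: string): string {
    const plain = randomBytes(32).toString("base64url");
    this.rows.push({ clientId, refreshTokenHash: hashToken(plain), revoked: false });
    return plain;
  }

  rotate(plainRefreshToken: string, clientId: string): string | null {
    const hashed = hashToken(plainRefreshToken);
    const row = this.rows.find((r) => r.refreshTokenHash === hashed && !r.revoked);
    if (!row) return null;
    if (row.clientId !== clientId) return null; // bound to the owning client
    row.revoked = true;
    return this.issue(clientId);
  }
}

// Property: for any two distinct client IDs, a token issued to one is never
// rotated for the other, and the failed attempt does not break the owner.
function mismatchedClientsNeverRotate(trials = 200): boolean {
  const store = new InMemoryTokenStore();
  for (let i = 0; i < trials; i++) {
    const owner = `client-${randomBytes(4).toString("hex")}`;
    const other = `client-${randomBytes(4).toString("hex")}`;
    if (owner === other) continue;
    const token = store.issue(owner);
    if (store.rotate(token, other) !== null) return false;
    if (store.rotate(token, owner) === null) return false;
  }
  return true;
}
```

The second assertion in the loop matters as much as the first: it pins down that a rejected attempt from the wrong client leaves the token usable by its owner, which matches the early `return null` in the function shown earlier.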

OAuth 2.1 is not hard because the individual RFCs are hard. It is hard because the failure modes only show up where the RFCs meet each other.

Both fixes are live. If you are running an MCP server and have not yet audited your refresh-token binding or your transport parity, those are two cheap hours that will probably surface something.