Fabian G. Williams aka Fabs

Principal Product Manager, Microsoft

One Agent Receipt, Two Buyers: Why Protocol-Neutral MCP Audit Trails Matter for Both Security AND Finance

I built a public MCP-callable storefront. The same endpoint produced an identical audit-trail receipt from hosted Claude Desktop AND from Qwen3.6 27B running fully offline on my MacBook. Same six supervision checks. Same receipt page. One artifact that satisfies both the security audit and the finance billing conversation.

Fabian Williams

10-Minute Read

Side-by-side comparison of two audit-trail receipts — Claude Desktop on left, LM Studio with Qwen3.6 27B on right — both produced by the same MCP endpoint with identical six-check audit trail

Update, 2026-05-19: A2A agent-card now live at mcp.adotob.com/.well-known/agent.json, published 24 hours after Nate B Jones’s IO-2026 video named the agent-card primitive as the second of the four core agent-protocol layers. Three of the four layers of the open-protocol stack are now live in the storefront: MCP for tool access, A2A for agent discovery, and AG-UI manifested as the public receipt page. AP2/X402 is reserved for the MVP-2 paid Stripe flow.

Qwen 3.6 vs gpt-oss:120b on M3 Max: I Ran a Harder Test, the 8× Speed Gap Surprised Me

I published a Qwen 3.6 vs gpt-oss migration story, then ran an un-gameable eval against both on the same M3 Max. The receipts changed the speed narrative — gpt-oss:120b ran 8 to 11 times faster than qwen3.6:27b at parity reasoning quality. Here is the methodology and the data.

Fabian Williams

11-Minute Read

Horizontal bar chart showing gpt-oss:120b at 137 seconds and qwen3.6:27b at 1593 seconds on the same Round 2 reasoning tasks, with an 11.6× slower callout

I published a post last week about replacing gpt-oss:120b with Qwen 3.6 on my MacBook Pro M3 Max. The numbers in that post were real, but one set of tests was structurally gameable — 38 of 40 baseline images were the same class, so an “always-say-A” stub also scored 95 percent. I went back, designed three un-gameable reasoning tasks, and ran them against both local models on identical hardware. gpt-oss:120b finished the three tasks in 137 seconds. qwen3.6:27b-q8_0 took 1593 seconds —…
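The two figures in this excerpt can be checked with a few lines of arithmetic. The sketch below is only an illustration, not the eval harness from the post: the class labels "A" and "B" and the helper names are placeholders, and the only inputs taken from the text are the 38-of-40 class split and the 137 and 1593 second totals.

```python
from collections import Counter

# From the post: 38 of 40 baseline images share one class.
# "A" and "B" are placeholder labels (assumption), not the post's real classes.
labels = ["A"] * 38 + ["B"] * 2


def always_say(label, n):
    """A stub 'model' that ignores its input and returns one fixed label."""
    return [label] * n


def accuracy(predictions, truth):
    """Fraction of predictions that match the ground-truth labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


# The majority-class baseline: the score any real model has to beat.
majority_label, majority_count = Counter(labels).most_common(1)[0]
stub_accuracy = accuracy(always_say(majority_label, len(labels)), labels)

# From the post: total wall-clock seconds for the three reasoning tasks.
gpt_oss_seconds = 137
qwen_seconds = 1593
slowdown = qwen_seconds / gpt_oss_seconds

print(f"majority-class stub accuracy: {stub_accuracy:.0%}")
print(f"qwen3.6:27b-q8_0 slowdown vs gpt-oss:120b: {slowdown:.1f}x")
```

When a constant-output stub matches the reported accuracy, the test set cannot distinguish a working model from a broken one, which is why the class balance matters more than the headline percentage.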
