The BYOA stack: Users' agents, plus in-app AI, both built on the same foundation.

The BYOA strategy in a nutshell. Focus on the Foundation first, then prove out your users’ workflows, and only then build in-app AI features.

Read on for a deeper explanation!

What is BYOA?

In this post, I want to talk about an AI strategy for SaaS companies that I think will be cheaper, simpler, and more future-proof than the alternative. I call it Bring your own agent (BYOA).

BYOA’s core idea is that when we build AI-powered features into our app, we’re competing with users’ personalized, state-of-the-art (SOTA) agentic workflows.

In other words, the user will expect our AI-powered features to work at least as well as the AI tools they already use, like Claude or Gemini.

But most AI-powered features suck, and even the best ones now will suck relative to the SOTA in 12 months. This is not a space we want to be actively competing in.¹

The BYOA strategy acknowledges this difficulty by allowing the user to interact with the app through their own agent. This becomes one of the three main modalities for users trying to get stuff done:

  1. No AI
  2. BYOA (Access mediated by user’s agent)
  3. In-app embedded AI features
Illustration of the three ways a user can interact with the app: no AI, BYOA, and in-app AI.

Of course, if a new SOTA agent comes out, the user can easily switch to it and continue using our app with no additional effort on our end.

I predict this user-agent-mediated workflow is going to become more and more popular as agents get more powerful and competitive. This is why the BYOA strategy is “future-proof” — in the future, users bringing their own agents will be the norm.

The most common alternative strategy I’ve observed is what I’ll call embedded AI, which is where you jump right into “how do we embed flashy AI features into our app?”

I think the embedded AI strategy comes from working backwards from a problem statement. The chain of reasoning goes like this:

  1. We know our competitors are building AI features and our customers are asking about AI features.
  2. So, we need some AI features we can use for product marketing and sales.
  3. We can embed AI into XYZ part of the app. (Or even worse, “we can add a ‘useful’ chatbot to every page.”)

This is totally valid, but it’s not future-proof because it’s focused on meeting a short-term sales/marketing need instead of working on improving the product.

The BYOA strategy starts from the same problem statement, but adds some suppositions about the future of AIs and the industry. Those suppositions lead us to a very different strategy that prioritizes maximizing long-term product UX and thought leadership in the AI space.

Also, in case you don’t get far enough in the post — note that BYOA is not incompatible with in-app embedded AI. It’s more a matter of priorities and goals. I just think BYOA should almost always be tackled first. See the Beyond BYOA section for more details!

This post assumes you have some existing familiarity with AI terms — model, agent, context, etc.

But since it’s especially important that we’re on the same page about what an agent is, I’ve included a section about that.

So, what is an agent?

Briefly, an agent is a piece of software that sits between a user and an AI model and has two main functions:

  1. Performs tasks on the user’s behalf, e.g. reading/writing files, downloading websites, and calling MCP servers.
  2. Constructs prompts for the user and sends them to an AI model for evaluation.

These two pieces of functionality — running tasks and constructing prompts — are essential for getting the highest-quality results from a given model. Because of that, almost all model interactions these days are mediated through an agent.

Even “chat” products like ChatGPT typically use a lightweight agent in the background.


Without an agent, a model interaction looks like this. The user’s prompt is sent directly to the model, and the model’s raw response is returned to the user:

User interacting directly with LLM

When we add an agent into the mix, it becomes more like this:

User interacting with LLM indirectly via an agent

So a typical interaction might work like this:

  1. User constructs a prompt — Publish a new S3 bucket called foobar
  2. The agent supplements the prompt with some global instructions (the system prompt) plus any additional instructions from the user (e.g. CLAUDE.md or agents.md file contents) and sends a message to the AI model.
  3. The AI model responds with some text that includes instructions for some tasks: Look up documentation, write a plan, make the bucket.
  4. The agent proceeds to execute the tasks, managing the prompts/context in the background automatically. Often the agent has some permission checks before running tasks.
  5. Once the tasks are complete, the agent is “done” with the initial prompt and waits for the user to start something else.

As a side note, remember that only the model is AI; the agent itself is “traditional” software (i.e. not opaque: we can actually understand and change how it works, enforce permissions, etc.)
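To make that division of labor concrete, here is a minimal sketch of an agent loop. This is not any particular vendor's API: call_model stands in for whatever model client you use, and the tools and permission prompt are deliberately simplistic.

```python
# Minimal agent-loop sketch (hypothetical, not a specific vendor's API).
# call_model is whatever chat client you use; it's assumed to return a
# ModelReply. Everything else here is plain "traditional" software.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ToolCall:
    name: str
    arguments: dict

@dataclass
class ModelReply:
    text: str = ""
    tool_calls: list[ToolCall] = field(default_factory=list)

def run_agent(user_prompt: str,
              call_model: Callable[[list[dict]], ModelReply],
              tools: dict[str, Callable],
              instructions: str = "") -> str:
    # Steps 1-2: wrap the user's prompt with the system prompt and any
    # extra instructions (e.g. CLAUDE.md contents).
    messages = [
        {"role": "system", "content": "You may use these tools: " + ", ".join(tools)},
        {"role": "system", "content": instructions},
        {"role": "user", "content": user_prompt},
    ]
    while True:
        reply = call_model(messages)          # step 3: model proposes tasks
        if not reply.tool_calls:
            return reply.text                 # step 5: nothing left to do
        for call in reply.tool_calls:         # step 4: agent executes the tasks
            if input(f"Run {call.name}? [y/N] ").lower() != "y":  # permission check
                result = "Permission denied by user."
            else:
                result = tools[call.name](**call.arguments)
            messages.append({"role": "tool", "content": f"{call.name} -> {result}"})
```

Everything outside call_model is ordinary, inspectable code, which is exactly why permission checks and context management live in the agent layer.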


That’s enough about agents for now — for more details, I’d recommend Building Effective AI Agents from the Anthropic blog.

The strategy

The BYOA strategy is simple.

First, the SaaS focuses on building the features that let users’ agents connect to the app (e.g. MCP servers). I call this set of features the foundation.

The same foundation can be used by users with their own agents, and by AI features built into the app.

Foundation of the BYOA stack broken into docs, MCP, and prompts.

This way, users can bring their own agent and do the workflow that works for them.

At the moment there are three main elements of an AI foundation:

  1. AI-friendly documentation.
  2. MCP server(s).
  3. Curated prompts and instructions.

In the future, I expect other things to become important as well. For example, the recently released Agent Client Protocol might be relevant for some types of apps.

In the long run, I expect more high-level protocols to come out that make it easier for AI agents to integrate with apps.

For example, suppose a user wants to plan a vacation using AI.

One possible future, where the user has to hop from one agent to another:

  1. They use their banking app’s AI agent to check their savings balance and make a budget for the trip.
  2. They log into a Flights app and use the Flights AI agent to assist them with booking a flight.
  3. They log into AirBnB and use the AirBnB AI agent to assist them with finding a place to stay.
  4. They open a note-taking app and use an AI to summarize the details of the trip.

The user just interacted with 4 different agents of likely middling quality, manually copy/pasting context (like the times and budget for the trip) between them:

Illustration of user interacting with multiple apps through separate agents.

Another possible future, where all the sites are BYOA:

  1. The user logs into their general AI agent app.
  2. They ask the agent to book them a vacation.
  3. The agent integrates with the user’s bank, flights app, and AirBnB to come up with a plan for the trip.
  4. The user approves the plan, and their agent automatically buys everything.
  5. The agent automatically summarizes the trip in its memory bank for later search/reference.

In this future, the user stays in their single preferred agentic app and it can automatically handle ferrying information between different services.

Illustration of user interacting with multiple apps through a single agent.

Let’s quickly address the three elements of the foundation. Why are we focused on MCP servers, AI documentation, and prompts/instructions?

MCP Servers

Model Context Protocol (MCP) is the current standard for how AI agents connect to external systems. In MCP terms, the user’s agent is the client, which can connect to multiple servers. MCP servers provide a way for any compatible AI agent to interact with our product.

MCP servers can expose three key functionalities for clients (agents) to use:

  1. Resources for the agent to consume (e.g. documentation, short-term memory, tasks, etc.)
  2. Tools the agent can use (e.g. “searchFlights”, “bookFlight”, etc.)
  3. Prompts the user can use to trigger behavior (e.g. “Book a vacation” prompt).

For a SaaS company, here’s how this maps to our product’s interface:

  • Resources expose docs as well as read-only APIs.
  • Tools expose APIs.
  • Prompts expose recommended workflows.
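To make the mapping concrete, here is a rough sketch of a server exposing one of each, written with the FastMCP helper from the official Python MCP SDK. The flight-search domain and function names are hypothetical, carried over from the vacation example above.

```python
# Sketch of a small BYOA-foundation MCP server (Python SDK's FastMCP helper).
# The flight-search domain and function names are hypothetical examples.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("acme-flights")

# Resource: read-only content the agent can pull into context (docs, etc.).
@mcp.resource("docs://getting-started")
def getting_started() -> str:
    # Could also be read from the same Markdown files we publish online.
    return "Acme Flights lets you search and book flights via a REST API..."

# Tool: an action the agent can take against our product's API.
@mcp.tool()
def search_flights(origin: str, destination: str, date: str) -> list[dict]:
    """Search available flights for a route and date."""
    # In a real server this would call our product's API; stubbed out here.
    return [{"flight": "AC123", "origin": origin, "destination": destination, "date": date}]

# Prompt: a curated workflow the user can invoke from their own agent.
@mcp.prompt()
def book_a_vacation(destination: str, budget: str) -> str:
    return (
        f"Plan a vacation to {destination} within a budget of {budget}. "
        "Use search_flights to compare options, then show me an itinerary "
        "for approval before booking anything."
    )

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```

The same handlers can later back an in-app AI feature, which is the point of treating this as shared foundation rather than a one-off integration.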

As you can see, there is some overlap between what MCP servers support and the “AI-friendly documentation” and “prompts/instructions” parts of the foundation: MCP Resources can be used for AI-friendly documentation, and MCP Prompts can be used for documenting prompts/instructions.

That doesn’t mean you should just create the MCP server and be done with it, though. There are many use cases where a user may not want to (or be able to) connect to an MCP server, but they still want docs or prompts for their agent.

What we can do is reuse work.

We can write up some good prompts, put them in the MCP server, but also put them online in our documentation along with some screenshots and guidance.

We can also create some LLM-friendly documentation (Markdown files, say) and expose it as MCP resources, while also publishing it online where LLMs can easily access it without MCP.

Here’s a matrix illustrating how the MCP functionalities relate to the other parts of the foundation:

|           | MCP | Docs | Workflows |
|-----------|:---:|:----:|:---------:|
| Tools     |  ✓  |      |           |
| Resources |  ✓  |  ✓   |           |
| Prompts   |  ✓  |      |     ✓     |

AI-friendly documentation

LLMs can read text, and they understand HTML alright. If you already have public docs on the Internet, AI models can likely consume that content directly.

However, the actual results might not be as good as they would be with AI-focused documentation for how to interact with your product.

What differentiates documentation for AIs? Well, it’s all about managing context by providing high information density.

  • Minimal formatting (Markdown preferred) to reduce the amount of context the information takes up.
  • Short and to the point.
  • Includes specific workflows and example code if relevant.

Most documentation “tricks” (like repeating important points multiple times) also work well with AIs. Rather than picturing your audience as an AI, picture it as an expert developer who knows nothing of your product (but understands industry terms, etc.)

An AI-friendly documentation file might be structured like this:

Example llms.txt structure: a 1-paragraph introduction, a 5-bullet glossary, then 3 common workflows with 1 paragraph + code examples each, and finally links to other pages.
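In text form, that skeleton might look something like the following. The product name, endpoints, and URLs are placeholders, not a prescription.

```markdown
# Acme Flights

Acme Flights is an API-first flight search and booking product. This file is a
condensed, AI-friendly entry point to the full documentation.

## Glossary
- Trip: a saved collection of flights and bookings.
- Fare class: ...
(3 more terms)

## Workflow: search and book a flight
One short paragraph on when to use this and what to expect.

    GET  /v1/flights?origin=NYC&destination=LIS&date=2025-07-01
    POST /v1/bookings {"flight_id": "..."}

(2 more workflows)

## More documentation
- Full API reference: https://example.com/docs/api
- Authentication guide: https://example.com/docs/auth
```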

A real example: the better-auth llms.txt file.

Prompts/Instructions

Generally speaking, users need some training on how to get effective results from AI, even if they’re already AI experts in other contexts. This is because of the chaotic nature of prompt engineering: a prompt that works well in some situations might work poorly in others.

By providing a library of AI prompts (or prompt templates, which is a distinction I won’t be getting into in this post — just think of them as fancy prompts), you’re showing the user how to get good results.

There’s also an implicit assumption that if you veer away from the provided prompts, things may not work as well. In this way, a prompt library helps set users’ expectations.

I haven’t seen this done well yet. My feeling is that the best approach would be:

  1. Identify a workflow where AI is potentially useful.
  2. Write a series of prompts/instructions for that workflow (testing along the way with your own agent).
  3. Write a human-focused documentation page explaining the workflow, the expected result, and including the prompts and instructions for them to replicate the result.
  4. Include a demo (e.g. screencast) of the workflow.

This means each workflow should have a documentation page including:

  • A human-focused explanation of the workflow (When/why should I use this?)
  • Expected results (What should happen when I do this?)
  • AI prompts and/or instructions (How do I replicate the expected results?)
  • Demo (Example that grounds the discussion and proves the workflow works)
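Sketched as an actual page, a workflow entry might look like this (the workflow and server name are hypothetical):

```markdown
## Workflow: clean up stale flight alerts

When/why: use this when your saved alerts list has grown unmanageable.
Expected result: duplicate and expired alerts are removed; active ones are kept.

Prompt:

    Using the acme-flights MCP server, list my flight alerts, identify any
    that are expired or duplicated, and delete them after showing me the list.

Demo: [2-minute screencast of the workflow]
```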

What’s the difference between prompts and instructions?

I think of prompts as things the user pastes directly into their agent’s chatbox when they want to get something specific done.

Meanwhile, instructions are general context that get fed into the AI with every prompt, e.g. CLAUDE.md files.

However, the distinction is one of usage rather than of kind. Instructions can be used as prompts, and prompts can be used as instructions. They’re both just text, after all. The difference is that instructions are meant to be more general, whereas prompts are more specific.
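A quick hypothetical illustration of the difference, reusing the vacation-planning example from earlier:

```
Instructions (general context, e.g. CLAUDE.md contents):
  Always show prices in USD.
  Never book or purchase anything without showing me an itinerary first.

Prompt (pasted into the chatbox for one specific task):
  Find a round-trip flight from NYC to Lisbon for the first week of July,
  under $900, and save the two best options to my trip plan.
```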

Where’d this come from?

These predictions, and the whole BYOA strategy in general, are based on a few insights about AI products:

  • Context is king. A key challenge is constructing the context that the AI needs to get good results. If you want differentiated quality, you need to put a lot of effort into your prompts and context-building.
  • Agents are getting complicated. Building a state-of-the-art agent is complex, and you’re competing with big names and deep pockets (Anthropic, Cursor, etc.). If you can’t do at least as well as them in your problem domain, your users will be frustrated by your agent.
  • Expectations are rising. The AI honeymoon is ending. Users are no longer impressed by “Hey, the app responded to my question using plain English” — now they want to understand how AI actually helps them. Companies and users need to figure out how to make AI a workhorse, not a show horse.

As a consequence of these factors, I think we can anticipate a couple outcomes:

  • Agents will centralize. In the future, the most high-quality, widely useful agents will eat all the smaller, specialized agents’ lunch.
  • Chatbots will be passé. The idea of interacting with an integrated AI “chatbot” in a SaaS product will make users cringe.
  • Integrations will be essential. Users will expect apps to connect to their AI tools of choice using standard protocols.

However, the space is still moving incredibly quickly, and the current SOTA agents aren’t actually realizing the vision of a one-stop agent yet.

It’s precisely for these reasons that this is not a good time for most companies to throw their hats into the AI agent ring.

One more thing — it’s important to realize that when we’re building “AI-enabled features” in our app, these are agents, even if we don’t think of them that way. From the user’s perspective, they’re competing in the same arena.

Imagine our product is a Web app for categorizing and printing 3D objects (like Thingiverse) and we’re trying to decide whether to build 3D CAD software into our website for users to use.

Although we could just whip up a simple CAD UI, we’d be competing with major CAD software like Blender and FreeCAD. Imagine a user who’s used to Blender, and now they have to figure out our product-specific online CAD alternative thingy that we built in three weeks. Would that be a happy user? Probably not.

Instead we decide to make foundational tooling for our app to connect to any (or at least the most popular) CAD software. The Blender user can continue using Blender. The FreeCAD user can continue using FreeCAD. And both of them can work with our app. This is the BYOC (Bring your own CAD) strategy.


Why is this example powerful? The idea of having some “CAD-lite” thing built into a SaaS is obviously weird, because we know that CAD software is complicated as heck and takes training to use.

What people don’t realize is that AI agents are going down the same path. Power users will have “power agents” that go far beyond the current SOTA.

Imagine an agent that looks like Blender or Outlook — a complicated, powerful, customizable, perhaps expensive, software application that users “live in” and want to integrate everything with. This is the future we should be planning for.

Beyond BYOA

A big selling point of BYOA is that it rolls nicely into in-app AI features as well. This is not an either/or decision, just a question of what we prioritize.

Once we’ve built our foundation and enabled users to connect with their own agents, we’ve done the minimum work to prove out the viability of AI-assisted interaction with the app and to allow power users to experiment and explore.

Then, when we identify useful workflows later, we can go ahead and build AI features into our product based around those workflows, with the major advantage that we’ve already proved out the tooling, prompts, and agentic behavior that will get good results.

Rather than “let’s give this workflow a shot and see if AI helps”, it becomes “our users are already getting benefit from using their own agents, let’s replicate that in our app” — which reduces risk of building useless or non-future-proof AI enhancements.

In other words, a new AI “vertical” gets implemented like this:

  1. Build the foundational pieces the vertical needs (e.g. MCP server, AI-friendly docs).
  2. Work with users to write and refine useful prompts and instructions.
  3. Publish learnings from (2) publicly.
  4. Once users are getting value with their own agent, start to design, plan, and build the AI-enhanced feature into the app.
  5. Continue to maintain (1) and (3) even once the AI-enhanced feature is released.
Illustration of the roadmap steps described above.

We allow users to pick the UX that works best for them, while also saving ourselves time and money.

When not to BYOA

You shouldn’t focus on the BYOA strategy when…

  1. Your product’s AI usage doesn’t fit normal agents (for example, DaVinci Resolve won’t benefit from allowing users to “bring their own agent” since its interaction modes are so unique).
  2. The core of your product actually is an agent (duh) or most of your market value is in your AI features. In other words, you expect your agent to be cutting edge and outcompete whatever the user would be bringing.
  3. Users can’t bring their own agent (e.g. users are BigCo employees who can’t install an agent — although expect that eventually BigCo employees will have a specific agent to use, and BigCo may even require BYOA for security/compliance reasons).

BYOA in the wild

A couple examples of this strategy that I’ve run into:

Footnotes

  1. The exception is if agents are a core competency or key differentiator for your company, e.g. Anthropic, OpenAI, Cursor, etc.