The Future of AI blog series is an evolving collection of posts from the AI Futures team in collaboration with subject matter experts across Microsoft. In this series, we explore tools and technologies that will drive the next generation of AI. Explore more at: Collections | Microsoft Learn
The Future of AI: Optimize Your Site for Agents – It’s Cool to be a Tool
General-purpose AI agents like Manus are emerging tools that can browse websites and perform user tasks automatically. These agents act as autonomous assistants: a user might say “Find me a red jacket under $100 and buy it,” and the agent will search, navigate e-commerce sites, compare products, and even complete the purchase – all without the user manually clicking through websites. To help agents find and use your web properties (especially for e-commerce), site owners should adopt new technical strategies. Below we outline recent best practices, including how Microsoft’s NLWeb project fits in, and other emerging standards to make your site “agent-ready.”
Help Agents Discover Your Site
Much like traditional SEO for search engines, Agent Experience Optimization begins with making your content easily discoverable by AI agents. Key steps include:
- Open Access to AI Crawlers: Don’t inadvertently block AI bots. Update your robots.txt to allow known AI user-agents (e.g. ChatGPT-User, OAI-SearchBot, Google’s agent crawlers) to crawl your site.
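A minimal robots.txt sketch along these lines; the user-agent tokens shown are common examples and example.com is a placeholder, so confirm current crawler names in each vendor’s documentation:

```
# Allow OpenAI's agent and search crawlers (tokens are illustrative; verify with each vendor)
User-agent: ChatGPT-User
Allow: /

User-agent: OAI-SearchBot
Allow: /

# Add similar entries for other AI crawlers you want to permit

# Help all crawlers find your key pages
Sitemap: https://www.example.com/sitemap.xml
```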
Also consider avoiding aggressive bot-blocking (Cloudflare/WAF rules) that might stop legitimate AI agents. These agents often run on cloud IPs, so ensure your site isn’t unintentionally denying them.
- Provide Sitemaps & Feeds: Maintain an up-to-date sitemap.xml (and list it in robots.txt). Offer RSS/Atom feeds or product feeds if applicable. Agents can use these to quickly find your important pages. Microsoft’s NLWeb, for instance, combines LLMs with common web data like sitemaps and RSS feeds to help agents discover content.
- Structured Data & Metadata: Use Schema.org structured data (preferably in JSON-LD) for products, reviews, business info, etc. This machine-readable metadata helps AI understand your content. In fact, sites that are well-structured (e.g. as “lists of items” like product listings with schema markup) yield the best results for NLWeb and similar agent tools. Basic SEO tags (
, meta description) should present and use semantic HTML elements ( - Content Accessibility: Serve content in a crawlable, text-centric manner. Avoid burying vital info in images or behind heavy client-side scripts. Note that only a few AI crawlers (such as Google’s Gemini and Apple’s Applebot) currently execute JavaScript; many others do not. So, server-rendered content or proper fallbacks ensure all agents can see it. Keep page load speeds fast (aim for a sub-second response) and put important content high in the HTML source, as some agents may not scroll endlessly.
- Indicate Freshness: Use visible dates or tags (like lastmod in your sitemap) to indicate content update times. Agents trying to answer queries benefit from knowing if your information (prices, inventory, articles) is recent.
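As referenced in the structured data point above, here is a minimal JSON-LD sketch for a product page; the product, price, and URLs are placeholders you would replace with real catalog data:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Series 7 Widget",
  "description": "A red widget, available in multiple sizes.",
  "image": "https://www.example.com/images/series-7-widget.jpg",
  "offers": {
    "@type": "Offer",
    "price": "49.99",
    "priceCurrency": "USD",
    "availability": "https://schema.org/InStock",
    "url": "https://www.example.com/products/series-7-widget"
  }
}
</script>
```

Markup like this gives an agent the product’s name, price, and availability without any HTML scraping, and it doubles as the grounding data that NLWeb-style tools index.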
Design an Agent-Friendly Website (Agent-Responsive Design)
Beyond being findable, your site should be easy for an AI agent to navigate and operate. Agents don’t have eyes – they rely on structure and consistency to simulate clicks and form fills. Consider these best practices for “agent-responsive design”:
- Clear Interactive Elements: Ensure clickable buttons, links, and form fields are well-labeled and use standard HTML controls. For example, use native button and input elements with clear labels rather than custom scripted controls, so an agent can reliably identify what each element does (see the markup sketch after this list).
- Consistent Navigation and Structure: Design your site with a logical, consistent layout. Use standard menus, breadcrumbs, and page structures so that an agent can predict the workflow. For instance, if every product page has an “Add to Cart” button with a predictable ID or text, an agent will reliably locate it. Consistency across pages means the agent doesn’t have to re-learn your interface each time.
- Minimize Obstacles: Avoid pop-ups, modals, or interstitials that require human-like complex interactions (e.g. dragging, long mouse movements, CAPTCHAs). Also consider making login optional for browsing and adding to cart: an agent tasked with buying a product might get stuck behind a login wall or multi-factor authentication. If you must have these, provide an API or alternate path for checkout. Unnecessary interruptions like login prompts or pop-ups can derail AI task completion, so minimize them and allow guest checkout and easy cart access.
- Accessibility Aids: Implement standard accessibility practices (which benefit bots and humans alike). Use Accessible Rich Internet Applications (ARIA) labels and roles to clarify what each element is (e.g. mark navigation menus and form regions). An AI agent can leverage these cues to understand your page layout, much like a screen reader would. For example, an aria-label="Proceed to Checkout" on a button gives the agent context about that button’s function.
- Test with Agents: Just as you’d test a site in different browsers, test your critical flows with AI agents or simulators. Some companies use scripts or AI-driven testers to see if an agent can successfully perform tasks (search for a product, add to cart, etc.). If it fails or takes too long, identify where and refine that part of the UI. Agent Experience Optimization is a new focus – understanding where agents “get stuck” and smoothing those points out. For example, if an agent frequently times out on your payment page, maybe the form is too dynamic or requires an unsupported input – fixing that could make your site the preferred choice of automated shoppers.
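As a concrete illustration of the points above, here is a small HTML sketch of an agent-friendly add-to-cart flow; the IDs, names, and form action are placeholders:

```html
<!-- Standard form with native controls an agent can recognize and submit -->
<form action="/cart/add" method="post">
  <label for="size">Size</label>
  <select id="size" name="size">
    <option value="m">M</option>
    <option value="l">L</option>
  </select>

  <!-- Native button with a stable ID and an ARIA label describing its function -->
  <button id="add-to-cart" type="submit" aria-label="Add to Cart">Add to Cart</button>
</form>
```

Native controls, explicit labels, and stable IDs give an agent the same cues a screen reader relies on, so it can complete the flow without guessing at custom widgets.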
Enabling Natural Language Access with NLWeb
One exciting development is NLWeb, an open-source project by Microsoft aimed at creating an “agent-ready web”. NLWeb essentially lets you turn your website into a conversational AI app. Here’s how it can help your site interface with agents:
- Conversational Interface for Your Site: NLWeb allows you to deploy a natural language query interface on top of your site’s data. Users (or AI agents acting on users’ behalf) can ask your site questions in plain language – “Do you have size 8 in red?”, “What’s the price of your Series 7 Widget?” – and get answers drawn from your content. Under the hood, NLWeb uses an LLM of your choice plus your site’s own data to answer queries. This makes your site’s knowledge directly accessible to agents without them having to crawl page by page.
- MCP Integration (Model Context Protocol): Every NLWeb instance also acts as an MCP server, publishing your content and query interface to the agent ecosystem. MCP is a new protocol (spearheaded by Anthropic and others) that standardizes how AI agents communicate with tools and websites. Microsoft CTO Kevin Scott likens MCP to the equivalent of HTTP for interconnected AI apps in the “agentic web.” In practical terms, by enabling NLWeb, your site can expose an API endpoint (like an ask method) that agents can call with natural language questions; a hypothetical client sketch follows this list. For example, Google’s Project Mariner or OpenAI’s Operator could query your NLWeb service directly for product info or to perform actions, instead of scraping HTML. This makes agent interactions far more efficient and accurate.
- Leverage Existing Site Data: NLWeb doesn’t require a full site overhaul – it builds on semi-structured data you likely already have, such as your Schema.org markup, RSS feeds, or database exports. It indexes that data into a vector database so that an LLM can answer queries with factual grounding in your content. R.V. Guha (creator of Schema.org and RSS) leads the project, so unsurprisingly NLWeb is designed to natively use schema metadata and feeds to understand your site. Action item: Make sure your site’s schema markup and feeds are complete and up to date (e.g. product names, descriptions, prices, availability). NLWeb will use those as the knowledge base.
- Early Adoption in E-commerce: Many early NLWeb adopters are content and commerce sites, like Shopify (commerce platform), Allrecipes (recipe listings), and Tripadvisor (travel info) (source). This indicates a trend: e-commerce companies want to ensure agents can transact on their sites. With NLWeb, an agent could ask “Find a size M black T-shirt under $50” and get results from a retailer’s site instantly, or even “Buy this item now” to trigger a purchase workflow via the conversational interface. Implementing NLWeb can thus make your site directly callable by agents for queries or transactions, rather than hoping the agent’s generic web browser module navigates your site correctly.
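As noted above, here is a hypothetical Python sketch of an agent-style client calling a site’s NLWeb ask endpoint; the endpoint path, parameters, and response shape are assumptions for illustration, so check the NLWeb project documentation for the actual contract:

```python
import requests

# Hypothetical endpoint exposed by an NLWeb-enabled site (placeholder URL)
SITE_ENDPOINT = "https://www.example.com/ask"

def ask_site(question: str) -> None:
    """Send a natural language question to the site's conversational endpoint."""
    response = requests.get(SITE_ENDPOINT, params={"query": question}, timeout=10)
    response.raise_for_status()
    data = response.json()
    # Assumed response shape: a list of schema.org-style result items
    for item in data.get("results", []):
        print(item.get("name"), "-", item.get("url"))

if __name__ == "__main__":
    ask_site("Find a size M black T-shirt under $50")
```

The point is the shape of the interaction: one natural language call returns structured, grounded results, instead of the agent navigating and scraping pages.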
In short, NLWeb is positioning itself as “HTML for the agentic web” – a new layer that makes sites agent-friendly by design. Embracing it (or similar frameworks) is a forward-looking way to optimize for AI agents.
Emerging Standards to Optimize for AI Agents
Beyond NLWeb, several other technologies and standards are taking shape to help companies prepare for an AI-agent-dominated web:
- Model Context Protocol (MCP): Discussed above, MCP is an open protocol enabling AI agents and web services to communicate in a structured way. Websites and apps can implement MCP endpoints to expose their data and actions to agents in a standardized format (e.g., NLWeb’s ask endpoint). By supporting MCP, you essentially publish a “contract” that any compliant AI agent can use to interact with your site’s functionality safely and efficiently. Microsoft and others are heavily investing in MCP as a foundation for the agentic future.
- LLMs.txt: Inspired by robots.txt, llms.txt is a standard proposed by technologist Jeremy Howard to provide LLM-specific guidance for your site. It’s basically a Markdown file at your site’s root (/llms.txt) that gives AI models a concise, expert-level summary of your important content and how to use it. Because large models can’t ingest an entire complex website in one go (context limits), llms.txt lets you flatten key content (or provide direct links to it) in a single, easy-to-parse resource. This can include an outline of your site, important FAQs, or even the full text of documentation pages. The idea is to help agents quickly retrieve relevant info without heavy crawling. While still experimental, llms.txt is gaining traction as a way to signal to AI, “Here’s what’s important on our site, and here’s the content in an AI-friendly format.” It can improve an agent’s accuracy when interacting with your domain; a minimal sketch follows this list.
- OpenAI (and other) Plugin Manifests: Another approach is to expose your site’s capabilities via an OpenAPI/JSON manifest so AI agents can call your APIs. OpenAI’s Plugins specification (an emerging de facto standard in 2024) lets you host a .well-known/ai-plugin.json file describing your API endpoints (in an OpenAPI schema) and authentication; a simplified manifest sketch follows this list. This is primarily meant for ChatGPT and similar agents to integrate services. By implementing a plugin or API with clear specs, you allow agents to skip the UI and call your backend directly for certain tasks (e.g., searching inventory, placing an order). Azure AI Foundry and other AI platforms support tool use via API calling, so having a documented API (or using frameworks like Shopify’s storefront APIs) with well-defined operations can make your site tool-usable by agents. In fact, providing programmatic access via APIs (with OpenAPI specs) or RSS is explicitly recommended to enable more structured access for AI tools.
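To make the last two items concrete, here is a minimal llms.txt sketch following the proposed format (the site name, summary, and linked resources are placeholders):

```
# Example Store

> Example Store sells outdoor apparel and gear. The links below point to AI-friendly versions of our key content.

## Products

- [Product catalog](https://www.example.com/products.md): Current products with prices and availability
- [Shipping and returns](https://www.example.com/help/shipping.md): Policies relevant when completing a purchase

## Optional

- [Company history](https://www.example.com/about.md): Background information
```

And a simplified plugin-style manifest sketch; the field values are placeholders, and the exact fields and hosting path should be taken from the relevant plugin or tool specification:

```json
{
  "schema_version": "v1",
  "name_for_human": "Example Store",
  "name_for_model": "example_store",
  "description_for_human": "Search the Example Store catalog and place orders.",
  "description_for_model": "Search products by keyword, size, and price; retrieve availability; create orders.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://www.example.com/.well-known/openapi.yaml"
  },
  "logo_url": "https://www.example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://www.example.com/legal"
}
```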
Following these steps will help general-purpose agents like Manus find, interpret, and utilize your web properties – so your business is not left behind as the agentic web becomes the new normal.
Create with Azure AI Foundry
- Get started with Azure AI Foundry, and jump directly into Visual Studio Code
- Download the Azure AI Foundry SDK
- Take the Azure AI Foundry learn courses
- Review the Azure AI Foundry documentation
- Keep the conversation going in GitHub and Discord