
July 22, 2025

Welcome back to Agent Support—a developer advice column for those head-scratching moments when you’re building an AI agent! Each post answers a real question from the community with simple, practical guidance to help you build smarter agents.
Today’s question comes from someone still in the prototyping phase—looking to use free models until they’re ready to commit:
💬 Dear Agent Support
I’m experimenting with different agent ideas, but I’m not ready to pay for API credits just yet. Are there any models I can use for free?
Short answer: yes, and you’ve got a couple of good options! Let’s break them down.
🧠 GitHub Models: A Free Way to Experiment
If you’re just getting started with agents and want a no-cost way to try things out, GitHub-hosted models are a great option.
These models are maintained by the GitHub team and run entirely on GitHub’s infrastructure, so you don’t need to bring your own API key or worry about usage fees. They’re designed for prototyping and lightweight experimentation, making them ideal for testing out ideas, building proof-of-concepts, or just getting familiar with how agents and models interact.
You can try them directly in the GitHub web interface or through tools like the AI Toolkit, which includes them in its Model Catalog. Many support common features like structured output, chat history, and tool use, and they’re regularly updated to reflect community needs.
Think of these as your training wheels: stable, reliable, and free to use while you explore what your agent can do.
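If you want to call a GitHub-hosted model from your own code rather than the web interface, the models are exposed through an OpenAI-style chat completions API that authenticates with a regular GitHub token instead of a paid API key. Here’s a minimal sketch; the endpoint URL and model ID are assumptions, so check the GitHub Models documentation for the current values:

```python
import json
import os
import urllib.request

# Assumed endpoint and model ID -- verify against the GitHub Models docs.
ENDPOINT = "https://models.github.ai/inference/chat/completions"
MODEL = "openai/gpt-4o-mini"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build the JSON body for an OpenAI-style chat completion call."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful agent."},
            {"role": "user", "content": prompt},
        ],
    }

def ask(prompt: str) -> str:
    """Send the prompt, authenticated with a GitHub token (no API key)."""
    body = json.dumps(build_chat_request(prompt)).encode()
    req = urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["choices"][0]["message"]["content"]
```

Because the request shape matches the OpenAI chat format, swapping in a paid provider later is mostly a matter of changing the endpoint and the credential.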
⚠️ But Beware of Rate Limits
Free models are great for prototyping…but there’s a catch!
GitHub-hosted models come with usage limits, which means you might hit a wall if you’re testing frequently, building complex agents, or collaborating with teammates. These rate limits exist to ensure fair access for everyone using the shared infrastructure, especially during peak demand.
If you’ve ever wondered why your responses stop, it’s probably because you’ve reached the cap for the day.
The good news? GitHub recently introduced a Pay-As-You-Go option. This lets you continue using the same hosted models with more generous limits, only paying for what you use. It’s a helpful bridge for developers who’ve outgrown the free tier but aren’t ready to commit to a full API plan with another provider.
If your agent is starting to feel constrained, this might be the right moment to switch gears.
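Until then, you can make your prototype degrade gracefully when it hits a cap. A common pattern is exponential backoff: wait a little, retry, and double the wait each time. The helper below is an illustrative sketch, not part of any SDK:

```python
import time

def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 30.0):
    """Yield exponentially growing wait times in seconds, capped at `cap`."""
    for attempt in range(max_retries):
        yield min(cap, base * (2 ** attempt))

def call_with_retry(call, is_rate_limited, max_retries: int = 5):
    """Run `call()`; if the result looks rate-limited, wait and retry."""
    result = call()
    for delay in backoff_delays(max_retries):
        if not is_rate_limited(result):
            return result
        time.sleep(delay)
        result = call()
    return result
```

With defaults, the waits come out to 1, 2, 4, 8, and 16 seconds, so a brief throttle resolves itself without you babysitting the agent.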
🖥️ Want More Control? Run a Local Model
If you’d rather skip rate limits altogether, or just prefer running things on your own machine, you could always use a local model.
Local models give you full control over how the model runs, how often you use it, and what kind of hardware it runs on. There’s no API key, no usage tracking, and no hidden costs. It’s just you and the model, running side by side.
You can download and host open-source models like LLaMA, Mistral, or Phi using tools like Ollama or Foundry Local, which make setup surprisingly simple. Most local models are optimized to run efficiently on consumer-grade hardware, so even a decent laptop can handle basic inference.
This is especially handy if you’re experimenting with sensitive data, need offline access, or want to test agents in environments where cloud isn’t an option.
Of course, going local means you’re responsible for setup, performance tuning, and hardware compatibility, but for many developers, that tradeoff is worth it!
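Talking to a local model from code is just as simple: once a model is pulled into Ollama, it’s reachable over Ollama’s local REST API, which listens on port 11434 by default. A minimal sketch, assuming a model such as llama3.2 is already downloaded:

```python
import json
import urllib.request

# Ollama's default local endpoint for chat-style requests.
OLLAMA_URL = "http://localhost:11434/api/chat"

def build_ollama_request(model: str, prompt: str) -> dict:
    """Build the JSON body for Ollama's /api/chat endpoint."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # return one complete response, not a token stream
    }

def chat(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the reply text."""
    body = json.dumps(build_ollama_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["message"]["content"]
```

No API key, no usage metering: the request never leaves your machine, which is exactly why this route works for sensitive data and offline environments.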
🧰 Ready to Try One Out?
Whether you’re curious about GitHub-hosted models or want to use a local one, the Models feature within the AI Toolkit makes it easy to test them out, no custom setup required.
With just a few clicks, you can browse available models, run test prompts in the Playground, and even use them with the agent you’re building.
Here’s how to do it:
Use a GitHub Model
- Open the Model Catalog from the AI Toolkit panel in Visual Studio Code.
- Click the Hosted by filter near the search bar.
- Select GitHub.
- Browse the filtered results.
- Select + Add Model to add a model to your list of models.
Use a Local Model
Note: Only Ollama and Custom ONNX models are currently supported. The instructions below focus only on adding local models from Ollama.
- Download your chosen local model to your computer.
- From the AI Toolkit panel in Visual Studio Code, hover over My Models and select the + icon.
- In the wizard, select Add Ollama Model.
- In the wizard, select Select models from Ollama library. This will provide a list of the models available in your local Ollama library (i.e. models you’ve downloaded).
- In the wizard, select the model(s) you want to connect and click OK.
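If you’d rather check from code which models that wizard will find, Ollama’s local /api/tags endpoint lists everything in your local library. A small sketch (the helper names here are my own):

```python
import json
import urllib.request

def model_names(tags_response: dict) -> list:
    """Extract model names from an /api/tags response payload."""
    return [m["name"] for m in tags_response.get("models", [])]

def list_local_models(host: str = "http://localhost:11434") -> list:
    """Return the names of models already pulled into local Ollama."""
    with urllib.request.urlopen(f"{host}/api/tags") as resp:
        return model_names(json.load(resp))
```

If a model you expect is missing from the list, pull it with Ollama first and it will show up in both this call and the AI Toolkit wizard.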
Your cost-free models are now available to use with your agent! If you’re unsure whether a model is a GitHub model or Ollama model, you can view its category within the My Models section of the AI Toolkit panel. The models within that section are organized by model source/host.
🧪 Test Before You Build
Whether you’ve added a GitHub-hosted model or a local model, you can chat with it in the Playground or within the Agent Builder. The model is available for selection within the Model drop-down.
As a reminder, GitHub-hosted models have rate limits. If you hit a rate limit, the AI Toolkit will display a notification with the option to either use GitHub pay-as-you-go models or Deploy to Azure AI Foundry for higher limits.
Whichever path you choose, the AI Toolkit helps you prototype with confidence, giving you flexibility early on and clear upgrade paths when you’re ready to scale!
🔁 Recap
Here’s a quick rundown of what we covered:
- GitHub-hosted models let you start building fast, with no API keys or fees, but they do come with rate limits.
- GitHub Pay-As-You-Go gives you a way to scale up without switching tools.
- Local models give you full control and zero rate limits, just run them on your own machine using tools like Ollama.
- The AI Toolkit supports both options, letting you chat with models, test prompts, and build agents right inside VS Code.
📺 Want to Go Deeper?
With so many models available these days, it can feel overwhelming to keep tabs on what’s available. Check out the Model Mondays series for all the latest news on language models!
By the way, GitHub has guides on discovering and experimenting with free AI models; definitely worth a read if you want to understand what’s under the hood. Check out their articles on GitHub Models.
No matter where you are in your agent journey, having free, flexible model options means you can spend more time building—and less time worrying about the bill.