Python + AI Weekly Office Hours: Recordings & Resources #280
2026/01/06: How do you set up Entra OBO (On-Behalf-Of) flow for Python MCP servers? 📹 5:48 The demo showed how to use the Graph API with the OBO flow to find out the groups of a signed-in user and use that to decide whether to allow access to a particular tool. The flow works as follows:
For the authentication dance, FastMCP handles the DCR (Dynamic Client Registration) flow since Entra itself doesn't support DCR natively. To test from scratch:
Links shared:
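The OBO exchange described above boils down to one extra round trip to the token endpoint. Here is a minimal stdlib-only sketch, with placeholder tenant/client values and a hypothetical `tool_allowed` gate; in practice a library like MSAL (`acquire_token_on_behalf_of`) handles the exchange for you:

```python
# Sketch of the Entra OBO flow: exchange the token the MCP client sent us
# for a Graph token, look up the user's groups, and gate tool access.
# All IDs, secrets, and tokens below are placeholders (assumptions).
import json
import urllib.parse
import urllib.request

def build_obo_request(tenant_id, client_id, client_secret, incoming_token):
    """Build the token-endpoint request for the on-behalf-of grant."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urllib.parse.urlencode({
        "grant_type": "urn:ietf:params:oauth:grant-type:jwt-bearer",
        "client_id": client_id,
        "client_secret": client_secret,
        "assertion": incoming_token,  # the token the MCP client presented
        "scope": "https://graph.microsoft.com/.default",
        "requested_token_use": "on_behalf_of",
    }).encode()
    return urllib.request.Request(url, data=body, method="POST")

def user_group_ids(graph_token: str) -> list[str]:
    """Call Graph /me/memberOf with the OBO-acquired token."""
    req = urllib.request.Request(
        "https://graph.microsoft.com/v1.0/me/memberOf",
        headers={"Authorization": f"Bearer {graph_token}"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return [g["id"] for g in data.get("value", [])]

def tool_allowed(group_ids: list[str], required_group: str) -> bool:
    """Gate a particular tool on membership in a specific Entra group."""
    return required_group in group_ids
```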
2026/01/06: Which MCP inspector should I use for testing servers with Entra authentication? 📹 20:24 The standard MCP Inspector doesn't work well with Entra authentication because it doesn't do the DCR (Dynamic Client Registration) dance properly. MCP Jam is recommended instead because it properly handles the OAuth flow with DCR. To set it up:
MCP Jam also has nice features like:
One note: enum values in tools don't yet show as dropdowns in MCP Jam (issue to be filed). Links shared:
What's the difference between MCP Jam and LM Studio? 📹 34:19 LM Studio is primarily for playing around with LLMs locally. MCP Jam has some overlap since it includes a chat interface with access to models, but its main purpose is to help you develop MCP servers and apps. It's focused on the development workflow rather than just chatting with models.
2026/01/06: How do you track LLM usage tokens and costs? 📹 28:04 For basic tracking, Azure portal shows metrics for token usage in your OpenAI accounts. You can see input tokens and output tokens in the metrics section. You can also:
If you use multiple providers, you need a way to consolidate the tracking. OpenTelemetry metrics could work but you'd need a way to hook into each system.
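For consolidating usage across providers, a small tracker that records per-model token counts and multiplies by a price table is often enough to start. A sketch with made-up placeholder prices (not real rates):

```python
# Sketch of consolidated token/cost tracking across models or providers.
# The per-1M-token prices are invented placeholders, not real rates.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Usage:
    input_tokens: int
    output_tokens: int

# Placeholder prices per 1M tokens: (input, output)
PRICES = {"model-a": (1.00, 4.00), "model-b": (0.25, 1.00)}

class UsageTracker:
    def __init__(self):
        self.totals = defaultdict(lambda: Usage(0, 0))

    def record(self, model: str, input_tokens: int, output_tokens: int):
        u = self.totals[model]
        u.input_tokens += input_tokens
        u.output_tokens += output_tokens

    def cost(self) -> float:
        total = 0.0
        for model, u in self.totals.items():
            inp, out = PRICES[model]
            total += u.input_tokens / 1e6 * inp + u.output_tokens / 1e6 * out
        return total
```

In a real system, `record` would be called from a wrapper around each provider's client, or fed from OpenTelemetry metrics as suggested above.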
2026/01/06: How do you keep yourself updated with all the new changes related to AI? 📹 30:32 Several sources recommended:
Particularly recommended:
Links shared:
2026/01/06: How do you build a Microsoft Copilot agent in Python with custom API calls? 📹 36:30 For building agents that work with Microsoft 365 Copilot (which appears in Windows Copilot and other Microsoft surfaces):
The agent framework team is responsive if there are issues. Links shared:
2026/01/06: As a backend developer with a non-CS background, how do I learn about AI from scratch? 📹 46:39 Recommended approach:
Links shared:
2026/01/06: What's new with the RAG demo (azure-search-openai-demo) after the SharePoint data source was added? 📹 49:50 The main work is around improving ACL (Access Control List) support. The cloud ingestion feature was added recently, but it doesn't yet support ACLs. The team is working on making ACLs compatible with all features including:
A future feature idea: adding an MCP server to the RAG repo for internal documentation use cases, leveraging the Entra OBO flow for access control.
2026/01/06: Do you think companies will create internal MCP servers for AI apps to connect to? 📹 53:53 Yes, this is already happening quite a bit. Common use cases include:
A particularly valuable use case is data science/engineering teams creating MCP servers that enable less technical folks (marketing, PMs, bizdev) to pull data safely without needing to write SQL. The pattern often starts with an engineer building an MCP server for themselves, sharing it with colleagues, adding features based on their needs, and growing from there. Links shared:
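For the "pull data without writing SQL" pattern, the server's tools are typically thin, parameterized queries. A hypothetical sketch using an in-memory SQLite table as a stand-in for a real warehouse (the function and table names are invented for illustration):

```python
# Sketch of the kind of safe, parameterized data-pull tool an internal
# MCP server might expose. The SQL lives in the tool; callers only pass
# simple arguments, so non-technical users never write queries.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signups (region TEXT, count INTEGER)")
conn.executemany(
    "INSERT INTO signups VALUES (?, ?)",
    [("emea", 120), ("amer", 340), ("apac", 95)],
)

def signups_by_region(region: str) -> int:
    """Tool body: a parameterized query, safe against SQL injection."""
    row = conn.execute(
        "SELECT count FROM signups WHERE region = ?", (region,)
    ).fetchone()
    return row[0] if row else 0
```

In an actual server, a function like this would be registered as an MCP tool so agents and chat clients can call it on a user's behalf.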
2026/01/13: What advantages do other formats have over .txt for prompts? How do you improve prompts with DSPy and evals? 📹 4:55 Prompty is a template format that mixes Jinja and YAML together. The YAML goes at the top for metadata, and the rest is Jinja templating. Jinja is the most common templating system for Python (used by Flask, etc.). The nice thing about Jinja is you can pass in template variables—useful for customization, passing in citations, etc. Prompty turns the file into a Python list of chat messages with roles and contents. However, we're moving from Prompty to plain Jinja files because:
Recommendation: Keep prompts separate from code when possible, especially long system prompts. Use plain .txt or .md if you don't need variables, or Jinja if you want to render variables. With agents and tools, some LLM-facing text (like tool descriptions in docstrings) will inevitably live in your code—that's fine. For iterating on prompts: Run evaluations, change the prompt, and see whether it improves things. There are tools like DSPy and Agent Framework's Lightning that do automated prompt optimization/fine-tuning. Lightning says it "fine-tunes agents" but may actually be doing prompt changes. Most of the time, prompt changes don't make a huge difference, but sometimes they might. Links shared:
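For the plain-Jinja approach recommended above, rendering a system prompt with variables might look like this sketch (the inline template string stands in for a .jinja file on disk; `sources` is a hypothetical citations list):

```python
# Sketch of a system prompt kept as a Jinja template, separate from
# code, with citations passed in as a template variable.
import jinja2

PROMPT_TEMPLATE = """You are a helpful assistant.
Answer ONLY using the sources below.
{% for source in sources %}
[{{ loop.index }}] {{ source }}
{% endfor %}"""

def render_system_prompt(sources: list[str]) -> str:
    template = jinja2.Template(
        PROMPT_TEMPLATE, trim_blocks=True, lstrip_blocks=True
    )
    return template.render(sources=sources)
```

In practice you'd load the template from a file (e.g. with `jinja2.Environment` and a `FileSystemLoader`) so the prompt stays out of the code entirely.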
2026/01/13: What is the future of AI and which specialization should I pursue? 📹 11:54 If you enjoy software engineering and full-stack engineering, the key is less the models themselves and more how you build on top of them; understanding the models mainly helps you see why they do what they do. There's lots of interesting stuff to learn, and it really depends on what you're most interested in doing.
2026/01/13: Which livestream series should I follow to build a project using several tools and agents, and should I use a framework? 📹 13:33 Everyone should understand tool calling before moving on to agents. From the original 9-part Python + AI series, start with tool calling, then watch the high-level agents overview. The upcoming six-part series in February will dive deeper into each topic, especially how to use Agent Framework. At the bare minimum, you should understand LLMs, tool calling, and agents. Then you can decide whether to do everything with just tool calling (you can do it yourself with an LLM that has tool calling) or use an agent framework like LangChain or Agent Framework if you think it has enough benefits for you. It's important to understand that agents are based on tool calling—it's the foundation of agents. The success and failure of agents has to do with the ability of LLMs to use tool calling. Links shared:
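To make "agents are based on tool calling" concrete, here is a schematic of the tool-calling loop that agents build on, with a stubbed-out model standing in for a real LLM client (all names here are invented for illustration):

```python
# Schematic of the tool-calling loop underlying agents: the model either
# requests a tool call or answers; the loop runs requested tools and
# feeds results back until the model produces a final answer.
import json

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real API call

TOOLS = {"get_weather": get_weather}

def fake_model(messages):
    """Stub model: asks for the tool once, then answers with its result."""
    if not any(m["role"] == "tool" for m in messages):
        return {"tool_call": {"name": "get_weather",
                              "arguments": json.dumps({"city": "Paris"})}}
    tool_result = [m for m in messages if m["role"] == "tool"][-1]["content"]
    return {"content": f"The forecast: {tool_result}"}

def run(messages):
    while True:
        reply = fake_model(messages)
        if "tool_call" not in reply:
            return reply["content"]  # final answer, loop ends
        call = reply["tool_call"]
        result = TOOLS[call["name"]](**json.loads(call["arguments"]))
        messages.append({"role": "tool", "content": result})
```

An agent framework is essentially this loop plus planning, memory, and error handling, which is why the quality of an LLM's tool calling determines agent success.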
2026/01/13: How does Azure manage the context window? How do I maintain a long conversation with a small context window? 📹 15:21 There are three general approaches:
With today's large context windows (128K, 256K), it's often easier to just wait for an error and tell the user to start a new chat, or do summarization when the error occurs. This approach is most likely to work across models since every model should throw an error when you're over the context window. Links shared:
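The "wait for the error" approach can be sketched as a retry wrapper, where `call_llm` and the error type are hypothetical stand-ins for your client's API and its context-length exception:

```python
# Sketch of the error-driven strategy: call the model, and if the
# provider rejects the request for exceeding the context window, drop
# the oldest turns (keeping the system prompt) and retry once.
def truncate_history(messages: list[dict], keep_last: int = 10) -> list[dict]:
    """Keep the system prompt plus only the most recent turns."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-keep_last:]

def chat_with_retry(call_llm, messages, context_error):
    """call_llm and context_error are stand-ins for your client's API."""
    try:
        return call_llm(messages)
    except context_error:
        # Alternatively: summarize here, or ask the user to start fresh.
        return call_llm(truncate_history(messages, keep_last=4))
```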
2026/01/13: How do we deal with context rot and how do we summarize context using progressive disclosure techniques? 📹 19:17 Read through Kelly Hong's (Chroma researcher) blog post on context rot. The key point is that even with a 1 million token context window, you don't have uniform performance across that context window. She does various tests to see when performance starts getting worse, including tests on ambiguity, distractors, and implications. A general tip for coding agents with long-running tasks: use a main agent that breaks the task into subtasks and spawns sub-agents for each one, where each sub-agent has its own focused context. This is the approach used by the LangChain Deep Agents repo. You can also look at how different projects implement summarization. LangChain's summarization middleware is open source—you can see their summary prompt and approach. They do approximate token counting and trigger summarization when 80% of the context is reached. Links shared:
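The LangChain-style trigger described above can be sketched roughly as follows, using the crude chars/4 token approximation (an assumption for illustration; real middleware uses proper tokenizers and its own summary prompt):

```python
# Sketch of threshold-triggered summarization: approximate the token
# count, and once usage crosses 80% of the context window, replace the
# older turns with a summary message while keeping the recent ones.
def approx_tokens(messages) -> int:
    # Rough heuristic: ~4 characters per token (an assumption).
    return sum(len(m["content"]) for m in messages) // 4

def maybe_summarize(messages, context_window: int, summarize):
    """summarize is a callable (e.g. an LLM call) that condenses turns."""
    if approx_tokens(messages) < 0.8 * context_window:
        return messages
    head, tail = messages[:-4], messages[-4:]
    summary = {"role": "system", "content": "Summary: " + summarize(head)}
    return [summary] + tail
```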
How do I deal with context issues when using the Foundry SDK with a single agent? 📹 25:03 If you're using the Foundry SDK with a single agent (hosted agent), you can implement something like middleware through hooks or events. Another approach is the LangChain Deep Agents pattern: implement sub-agents as tools where each tool has a limited context and reports back a summary of its results to the main agent. For the summarization approach with Foundry agents, you'd need to figure out what events, hooks, or middleware systems they have available.
2026/01/13: Have you seen or implemented anything related to AG-UI or A2UI? 📹 29:02 AG-UI (Agent User Interaction Protocol) is an open standard introduced by the CopilotKit team that standardizes how front-end applications communicate with AI agents. Both Pydantic AI and Microsoft Agent Framework have support for AG-UI—they provide adapters to convert messages to the AG-UI format. The advantage of standardization is that if people agree on a protocol between backend and frontend, you can build reusable front-end components that understand how to use that backend. Agent Framework also supports different UI event stream protocols, including Vercel AI (though Vercel is a competitor, so support may be limited). These are adapters—you can always adapt output into another format if needed, but it's nice when it's built in. A2UI was created by Google together with CopilotKit and relates to A2A (Agent-to-Agent). A2UI appears to be newer, with less support currently in Agent Framework, though A2A is supported. Links shared:
2026/01/27: What developer hackathons are coming up? 📹 19:06 Two hackathons were mentioned:
Links shared:
2026/01/27: Is this a good place to ask about Microsoft Foundry SDK or Agent Framework SDK? 📹 20:53 Yes, you can ask questions about these here. The upcoming Python + Agents series will be diving deep into the Agent Framework SDK (which sometimes wraps the Foundry SDK). If you haven't registered for the agent series yet, definitely do that—it will cover the basics and go deeper into Agent Framework. Links shared:
2026/01/27: Are you using Spec-Driven Development (SDD) or SpecKit to guide coding agents? 📹 21:59 Pamela had not used SpecKit before. For bigger projects, her approach has been to either:
SpecKit seems good if you really know what you want versus being more experimental. Den, the original creator of SpecKit, has moved to Anthropic, but there is a new maintainer. Update: Pamela tried out SpecKit in a livestream later that day. It worked pretty well, but may not be as necessary with newer models and GitHub Copilot features. Links shared:
2026/01/27: What's new with the RAG demo - ACL support? 📹 25:30 A new release was just made for the azure-search-openai-demo repo adding ACL (Access Control List) support for the cloud ingestion pipeline. This enables document-level security filtering in Azure AI Search. How it works:
AI Search has built-in understanding of ACLs. You set up fields for user IDs and group IDs, mark them as such, and enable access control enforcement on the index. Links shared:
Can the RAG repo support ACLs from other identity providers like Okta? 📹 31:01 Yes, but it requires custom implementation. You would need to:
AI Search only has built-in support for Entra. For other IDPs, you'd implement it similarly to how the repo worked before the built-in Entra support was added—by passing along the token and checking permissions manually.
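For the manual (non-Entra) route, the permission check often becomes a security filter on the search query itself. A hypothetical sketch building an OData-style filter string (the field names `user_ids`/`group_ids` are assumptions, and AI Search's built-in ACL enforcement can replace this when you use Entra):

```python
# Sketch of document-level security filtering: only documents whose
# user_ids/group_ids fields overlap the caller's identity come back.
# Field names are assumptions for illustration.
def security_filter(user_id: str, group_ids: list[str]) -> str:
    """Build an OData-style filter for a search query."""
    groups = ",".join(group_ids)
    return (f"user_ids/any(u: u eq '{user_id}') or "
            f"group_ids/any(g: search.in(g, '{groups}'))")
```

A filter like this would be attached to every search request after validating the caller's token against the external IDP.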
2026/01/27: Have you tried memory tools in GitHub Copilot? 📹 36:16 Pamela has not used memory tools in GitHub Copilot, but recently experimented with memory in Microsoft Copilot. To save a memory, you need to cue it up with "remember" - for example, "remember to never call me Pam, only call me Pamela." You can view saved memories in Settings > Personalization and Memories > Manage Saved Memories. Without explicitly using "remember," it doesn't seem to save memories automatically despite many conversations. For VS Code, there are MCP servers for memory, though it's unclear which ones work well.
2026/01/27: When should I use Foundry IQ knowledge bases vs MCP tools? 📹 45:25 Foundry IQ (the new name for Azure AI Search capabilities in Azure AI Foundry) provides:
MCP as a knowledge source is in private preview. If you want to use something like Elastic's MCP server as a knowledge source in Foundry IQ, you could request access to the private preview. Additionally, there is an MCP endpoint for knowledge bases—meaning that if you create a knowledge base in AI Search, you can use it as an MCP server. Note on naming: Azure Search → Azure Cognitive Search → Azure AI Search → Foundry IQ. The underlying Bicep/ARM resources still use "search services."
2026/01/27: What tools do you use to automate developer workflows? 📹 38:48 Several approaches are being used:
The flexibility is notable—you can pick whatever form of programmatic manipulation you want: custom agents, skills with prompts, Python scripts, or agent frameworks. Different approaches suit different needs. One interesting use case: converting presentations to writeups, including automatic slide-to-timestamp alignment (which LLMs handle surprisingly well). Links shared:
2026/01/27: What is Work IQ? 📹 17:09 Work IQ is a new command line tool and MCP server from Microsoft. It has read-only access to your Outlook, email, and Teams. You can:
Example uses:
The read-only access limits some usefulness, but it's helpful for information retrieval. Links shared:
2026/02/03: What security concerns exist around OpenClaw and Moltbook? 📹 1:16 OpenClaw (formerly Clawd/Cladbot) has been all over the news, but has significant security concerns. It runs on your machine with full system access - it can connect to all your communication mechanisms, control your browser, and has browser cookie access. It's described as giving "a very smart toddler access to your machine." Recommendations:
Moltbook, the social network where OpenClaw agents can chat with each other, has been particularly exploited. A Wiz.io article revealed the Moltbook database exposed millions of API keys due to a misconfigured Supabase backend. For those interested in autonomous bots with more responsible development, check out Letta which creates agents with memory, many running on Bluesky. Links shared:
2026/02/03: What is the new Codex app and how does it compare to GitHub Copilot? 📹 7:40 The Codex app is brand new as of this week. It's a ChatGPT-style UI for code assistance (not just CLI). Key observations from testing:
When testing with a prompt to migrate from Chat Completions API to Responses API:
Interesting finding: When comparing the same migration task done by Codex vs Opus in GitHub Copilot, Codex also changed the frontend to expect Responses format while Opus only changed the backend - something to consider when designing migration prompts.
2026/02/03: What are Skills and how do they work in GitHub Copilot? 📹 15:09 Skills are markdown files (with optional Python scripts) that extend agent capabilities. They're like MCP tool descriptions - teaching GitHub Copilot specialized capabilities. Structure of a skill:
How it works:
Example skills demonstrated:
Where to find skills:
To enable skills in VS Code Stable, search for "chat agent skills" in settings. Links shared:
Do you need VS Code Insiders for Skills? 📹 27:15 No, skills work in VS Code Stable, but you need to enable them explicitly in settings. Search for "chat agent skills" and enable it, since the feature is still in preview.
Is there context loss when chaining skills together? 📹 23:39 Context should stay intact within a single VS Code session as long as you haven't exceeded the ~128K token limit that triggers history compaction. Use the Chat Debug View to see exactly what gets sent to the model and diagnose any issues.
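As a rough illustration of the structure described above, a skill might look like this hypothetical SKILL.md; the frontmatter keys and instructions here are assumptions for illustration, not an official example:

```markdown
---
name: changelog-writer
description: Generates a changelog entry from recent git history.
  Use when the user asks for release notes or a changelog.
---

# Changelog writer

1. Run `git log --oneline` since the last tag to collect changes.
2. Group the changes into Added / Changed / Fixed sections.
3. Write the entry in Keep a Changelog format.
```

The agent reads only the name and description up front, loading the full instructions (and any bundled scripts) when the skill is actually invoked.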
2026/02/03: What's a good workflow for handling PR code reviews? 📹 29:27 A recommended workflow for PR reviews:
Note: The GitHub MCP server doesn't have a tool to reply to inline PR comments, so a custom skill using the GraphQL API was created to fill this gap. Code reviews remain valuable - they catch things you didn't think about. Balance thoroughness with practicality, especially with Copilot's sometimes overly nitpicky or over-engineered suggestions. Links shared:
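The inline-reply gap can be filled with GitHub's GraphQL `addPullRequestReviewThreadReply` mutation. A stdlib-only sketch that just builds the request (the token and thread ID are placeholders):

```python
# Sketch of replying to an inline PR review thread via GitHub's GraphQL
# API, the gap noted above. Token and thread IDs are placeholders.
import json
import urllib.request

REPLY_MUTATION = """
mutation($threadId: ID!, $body: String!) {
  addPullRequestReviewThreadReply(
    input: {pullRequestReviewThreadId: $threadId, body: $body}
  ) { comment { url } }
}
"""

def build_reply_request(token: str, thread_id: str, body: str):
    """Build the POST to the GraphQL endpoint; send with urlopen."""
    payload = json.dumps({
        "query": REPLY_MUTATION,
        "variables": {"threadId": thread_id, "body": body},
    }).encode()
    return urllib.request.Request(
        "https://api.github.com/graphql",
        data=payload,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )
```

A skill would wrap this in a small script the agent can call with the thread ID and reply text.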
2026/02/03: How can you run multiple agents in parallel? 📹 43:55 Running multiple agents simultaneously is most useful for:
In VS Code Insiders, options include:
Recommendation: Use parallel agents for the same feature with different models (e.g., GPT 5.2, Opus 4.5, Gemini 3) to see different perspectives, rather than trying to mentally juggle five different features simultaneously.
2026/02/03: What is Pamela working on with MCP tool schemas? 📹 48:17 Research for a PyAI talk evaluating how different type annotations affect MCP server tool performance across agents. Testing four different annotations for the same field:
These are tested across four different agents:
Running 27 sample user inputs against each combination shows differences. For example, Copilot SDK with Haiku performed best with annotated strings for dates compared to other annotation styles. Recommendation: Always set up evaluations for MCP servers to verify tool schemas work as expected. Options include Pydantic AI evals and Azure AI evaluation.
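For a sense of what "different annotations for the same field" means, here is an illustrative sketch of date-parameter variants; these are assumptions for illustration, not the exact annotations from the talk:

```python
# Sketch of annotation variants for an MCP tool's date parameter, and
# how frameworks read the hint back out of the annotation metadata.
import datetime
from typing import Annotated, get_args, get_type_hints

def search_plain(date: str) -> str:
    """Date as a bare string: the agent gets no format hint."""
    return date

def search_annotated(
    date: Annotated[str, "ISO 8601 date, e.g. 2026-01-06"],
) -> str:
    """Date as an annotated string: the schema can carry a format hint."""
    return date

def search_typed(date: datetime.date) -> str:
    """Date as a rich type: the framework must serialize it for JSON."""
    return date.isoformat()

# Schema generators recover the hint from the Annotated metadata:
hints = get_type_hints(search_annotated, include_extras=True)
date_meta = get_args(hints["date"])
```

An evaluation harness then runs the same sample inputs against each variant to see which schema the agent uses most reliably.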
2026/02/03: Is it safe to use an agent for LinkedIn job searching? 📹 55:18 Key considerations to avoid getting blocked:
The LinkedIn agent project uses Playwright to visit pages and an LLM to reason about decisions, which takes about as much time as manual browsing, so its network request patterns appear human-like. For job searching specifically, you likely don't need cookies - use a sandboxed browser for safety. At worst, they might block your IP rather than your account. Links shared:
Each week, we hold office hours about all things Python + AI in the Foundry Discord.
Join the Discord here: http://aka.ms/aipython/oh
This thread will list the recordings of each office hours, and any other resources that come out of the OH sessions. The questions and answers are automatically posted (based on the transcript) as comments in this thread.
February 3, 2026
Topics covered:
January 27, 2026
Topics covered:
January 20, 2026
Topics covered:
January 13, 2026
Topics covered:
January 6, 2026
Topics covered: