function-calling-agent/README.md at main · yttcs/function-calling-agent

Barebones function calling agent using the following technology:

Backend: FastAPI, MariaDB, SQL Model, OpenAI completions API, Tavily Search API
Frontend: Jinja2, JS, HTMX, Bootstrap
Security: Oauth2 password grant (ROPC)
Infrastructure: AWS ECS/Fargate and DigitalOcean

This is a work in progress and it's planned to have multiple updates on a weekly basis.

Update for week of Jun. 30, 2025:

Added multiuser capability
Added Tavily Extract API
Switched from gpt-3.5-turbo to gpt-4o

Update for week of Jul. 21, 2025:

Added text to speech using gpt-4o-mini-tts for completion.choices[0].message.content (that means the agent now has a voice)

Note: Update for week of Aug. 11, 2025:

Addied speech to text and text to speech using and Whisper and gpt-4o-mini-transcribe
Added UTX date tool and time tools so the model can be time aware

Note: Update for week of Oct. 20, 2025:

Added database persistence, for conversation history, in combination with in-memory python dict
Employed a hybrid HTMX/JS solution to play TTS in browser from text and voice requests
Added some error handling
Added HTMX to avoid full page refreshes
Cleaned up UI
Switched to gpt-4o-mini
Containerized with Podman
Deployed to AWS ECS/Fargate: SENTyENT.com

Will work on issues here and there, but, for the most part, this web PoC is finished. Focusing on local and offgrid now. One issue to be solved is that, in chromium based browsers, voice resquests don't receive a voice response because of the stricter user gesture requirements for autoplay.

Note: Update for week of Nov. 3, 2025:

Modified the web app to run local:

using llama-cpp-python server Qwen2.5-VL-7B, faster-whisper (SST) and piper (TTS)
SQLite for memory persistence (no multi-user concurrency required offline)
Language, image understanding, and tool calling are working great
Podman for containerization

note: this runs surprisingly well on a nine year old I5 with 32GB of RAM, but it can only handle single image processing.

As of Jan. 2026:

Object Detection & Spatial Reasoning
Better consumer hardware
Maybe try Qwen2.5-VL-3B
Start moving this to embedded hardware / realtime sensing
Drive a robot

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Barebones function calling agent using the following technology:

Update for week of Jun. 30, 2025:

Update for week of Jul. 21, 2025:

Note: Update for week of Aug. 11, 2025:

Note: Update for week of Oct. 20, 2025:

Note: Update for week of Nov. 3, 2025:

As of Jan. 2026:

FilesExpand file tree

README.md

Latest commit

History

README.md

File metadata and controls

Barebones function calling agent using the following technology:

Update for week of Jun. 30, 2025:

Update for week of Jul. 21, 2025:

Note: Update for week of Aug. 11, 2025:

Note: Update for week of Oct. 20, 2025:

Note: Update for week of Nov. 3, 2025:

As of Jan. 2026: