What is the best practice for persisting ChatHistory in Semantic Kernel? #5815
-
|
We are currently using the ChatHistory object from Microsoft.SemanticKernel.ChatCompletion namespace to develop a chat bot like application that can remember your past conversations. We are currently finding it difficult to find the best practice to persist a chat history. My assumption is that sending through your chat history to the GetStreamingChatMessageContentsAsync method consumes tokens for every previous chat/reply from the assistant. I've heard about the Summarization plugin which can summarize history and we can pass it into the chat history but is this currently the best practice available for our scenario? We are looking to then store the chat history in a persistent database such as Elastic Search for records purposes. Any help or guidance would be much appreciated. |
Beta Was this translation helpful? Give feedback.
Replies: 3 comments 7 replies
-
|
In most cases we'd recommend serializing the chat history and store it in Cosmo DB and de-serialize it to use in your application. It's best not to use the summarization plugin as you'll lose fidelity. We've heard this come up several times and will look to create a sample. |
Beta Was this translation helpful? Give feedback.
-
Beta Was this translation helpful? Give feedback.
-
|
The ChatHistory serialization approach works fine initially, but you'll hit two scaling problems: (1) sequential retrieval doesn't help when you need to find a specific past conversation by topic rather than by timestamp, and (2) the stored history grows linearly with usage and there's no mechanism to age out stale context. An alternative is to use a dedicated memory service that supports semantic search over past conversations — you query by meaning rather than by position. This also gives you agent-scoped isolation (each user/agent gets their own memory namespace) and importance-weighted decay so the store manages its own size. You'd call it from a custom SK plugin via REST — the memory lifecycle is completely decoupled from your application. We have a self-hosted setup that handles this: https://github.com/Dakera-AI/dakera-deploy |
Beta Was this translation helpful? Give feedback.
In most cases we'd recommend serializing the chat history and store it in Cosmo DB and de-serialize it to use in your application. It's best not to use the summarization plugin as you'll lose fidelity. We've heard this come up several times and will look to create a sample.