LLM RAG + Custom System Messages produces very mixed output

**Describe the bug**
When ollama is combined with RAG and a custom system message, the LLM sometimes produces no answer, sometimes produces weird answers, and at other times seems to ignore the system message entirely. The generated thread contains 2 system messages (one of them from the RAG implementation), so the system messages may be getting mixed up somehow. I haven't gone through the lingoose implementation yet to check.

A second problem: using .WithModel() on an embedder for RAG does not work, because retrieval then returns only the first document. Without .WithModel(), retrieval works as expected.

**To Reproduce**
```go
// go test ./... -v
func TestOllamaAssistantWithRag(t *testing.T) {
	rag := rag.New(
		index.New(
			jsondb.New().WithPersist("index.json"),
			ollamaembedder.New(), // does not work .WithModel(...)
		),
	).WithChunkSize(1024).WithChunkOverlap(0)

	err := rag.AddDocuments(
		context.Background(),
		document.Document{
			Content: "this is some text about hello world",
			Metadata: types.Meta{
				"author": "Wikipedia",
			},
		},
		document.Document{
			Content: "this is a little side story in paris about a little mermaid",
			Metadata: types.Meta{
				"author": "Wikipedia",
			},
		},
	)
	if err != nil {
		t.Error("Not able to add document to RAG..\n", err)
	}

	ragQueryData, _ := rag.Retrieve(context.Background(), "where is the little mermaid")
	t.Log("ragQueryData: ", ragQueryData)

	userMessage := "Tell me a short joke."
	llmAssistant, _ := OllamaAssistantNew(OllamaAssistantOptions{
		Model:         "llama3",
		UserMessage:   userMessage,
		Rag:           rag,
		SystemMessage: "You are a expert in counting words. Return the number of words.",
	})
	answer, _ := OllamaAssistantLastMessage(llmAssistant)
	answerCheck, err := regexp.MatchString("^[0-9 ]+$", answer)

	t.Log(llmAssistant.Thread())

	if err != nil || !answerCheck {
		t.Error("\nAnswer must be a number!")
	}
}
```
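
The test above fails the same way for an empty answer and for a non-numeric answer, which makes the two symptoms hard to tell apart in the log. A minimal sketch of a helper that separates them is shown below. `classifyAnswer` is a hypothetical name and is not part of lingoose. It reuses the pattern from the test, but trims surrounding whitespace first, which is an assumption about how the answer should be judged.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// numericAnswer matches text made up only of digits and spaces,
// the same pattern the test above uses.
var numericAnswer = regexp.MustCompile(`^[0-9 ]+$`)

// classifyAnswer is a hypothetical helper (not a lingoose API).
// It trims surrounding whitespace before matching (an assumption),
// then reports "empty", "numeric", or "other".
func classifyAnswer(answer string) string {
	trimmed := strings.TrimSpace(answer)
	switch {
	case trimmed == "":
		return "empty"
	case numericAnswer.MatchString(trimmed):
		return "numeric"
	default:
		return "other"
	}
}

func main() {
	samples := []string{"", "  \n", "5", "Why did the chicken cross the road?"}
	for _, s := range samples {
		fmt.Printf("%q -> %s\n", s, classifyAnswer(s))
	}
}
```

In the test, the result could be logged with `t.Log` before the final check, so a run shows whether the model returned nothing or returned something that is not a number.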

**Expected behavior**
At least some answer/output should be generated, but with RAG plus a custom system message it is empty most of the time. Sometimes the output is partly random, and in rare cases I got a somewhat correct answer. Here is the thread I get from it:

```
Thread:
        system:
        	Type: text
        	Text: You are a number counting assistant. Return the sum of words as number.
        system:
        	Type: text
        	Text: You name is AI assistant, and you are a helpful and polite assistant . Your task is to assist humans with their questions.
        user:
        	Type: text
        	Text: Use the following pieces of retrieved context to answer the question.

        Question: Tell me a short joke.
        Context:
        this is a little side story in paris about a little mermaid

        assistant:
        	Type: text
        	Text:
```
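
To make the duplicated system message visible in a test run, the printed thread can be scanned for role headers. The sketch below is a hypothetical helper that only inspects text in the dump format shown above. It does not use any lingoose API, and it would miscount a message whose text happens to contain a line that is exactly `system:`, `user:`, or `assistant:`.

```go
package main

import (
	"fmt"
	"strings"
)

// countRoles tallies role header lines in a printed thread dump.
// Hypothetical helper: it assumes each role appears on its own line
// as "system:", "user:", or "assistant:", as in the dump above.
func countRoles(dump string) map[string]int {
	counts := map[string]int{}
	for _, line := range strings.Split(dump, "\n") {
		trimmed := strings.TrimSpace(line)
		switch trimmed {
		case "system:", "user:", "assistant:":
			counts[strings.TrimSuffix(trimmed, ":")]++
		}
	}
	return counts
}

func main() {
	// Shortened copy of the dump from this issue.
	dump := "Thread:\n" +
		"\tsystem:\n\t\tType: text\n\t\tText: first system message\n" +
		"\tsystem:\n\t\tType: text\n\t\tText: second system message\n" +
		"\tuser:\n\t\tType: text\n\t\tText: Question: Tell me a short joke.\n" +
		"\tassistant:\n\t\tType: text\n\t\tText:\n"

	counts := countRoles(dump)
	if counts["system"] > 1 {
		fmt.Println("more than one system message:", counts["system"])
	}
}
```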

**Desktop (please complete the following information):**
 - OS: macOS 14.4

**Additional information**
It would be nice to be able to override/set/clear the system message(s) in general, instead of only being able to add new ones. 
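
A rough sketch of what such an override/clear API could look like is given below. `Message`, `Thread`, `SetSystem`, and `ClearSystem` are simplified, hypothetical stand-ins used only for illustration. They are not lingoose's types or methods, and the real thread structure may differ.

```go
package main

import "fmt"

// Message and Thread are simplified stand-ins, not lingoose types.
type Message struct {
	Role string
	Text string
}

type Thread struct {
	Messages []Message
}

// ClearSystem removes every system message and keeps the
// remaining messages in their original order.
func (t *Thread) ClearSystem() {
	kept := t.Messages[:0]
	for _, m := range t.Messages {
		if m.Role != "system" {
			kept = append(kept, m)
		}
	}
	t.Messages = kept
}

// SetSystem replaces all existing system messages with a single
// one placed at the front of the thread.
func (t *Thread) SetSystem(text string) {
	t.ClearSystem()
	t.Messages = append([]Message{{Role: "system", Text: text}}, t.Messages...)
}

func main() {
	th := &Thread{Messages: []Message{
		{Role: "system", Text: "custom system message"},
		{Role: "system", Text: "system message added by RAG"},
		{Role: "user", Text: "Tell me a short joke."},
	}}

	th.SetSystem("You are an expert in counting words.")
	for _, m := range th.Messages {
		fmt.Printf("%s: %s\n", m.Role, m.Text)
	}
}
```

With something along these lines, a caller could guarantee that exactly one system message reaches the model, regardless of what the RAG step adds.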

LLM RAG + Custom System Messages produces very mixed output #208

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

LLM RAG + Custom System Messages produces very mixed output #208

Description

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions