
generator: add simonw/llm library support#1633

Open
Nakul-Rajpal wants to merge 6 commits into NVIDIA:main from Nakul-Rajpal:generator-wrapLLM-v2

Conversation


@Nakul-Rajpal Nakul-Rajpal commented Mar 4, 2026

Follow-up to #1382: second PR on a new branch with updated code

Three files changed:

  • docs/source/garak.generators.llm.rst: Sphinx autodoc stub for the new LLM generator module
  • garak/generators/llm.py: New generator wrapping Simon Willison's llm library, supporting OpenAI, Claude, Gemini, and other providers via plugins
  • tests/generators/test_llm.py: Unit tests for LLMGenerator covering instantiation, parameter passthrough, error handling, and multiple generations

Verification

  • pip install llm
  • garak -m llm -n gpt-4o-mini
  • Run the tests and ensure they pass: python -m pytest tests/generators/test_llm.py
  • Verify generation works with at least one llm-supported model
  • Verify error handling when llm package is not installed
  • Document the generator and how it works

--
resolves #463

@Nakul-Rajpal
Contributor Author

@jmartin-tech

@jmartin-tech jmartin-tech changed the title from "Rebuilt PR from clean main with ONLY LLM generator changes" to "generator: add simonw/llm library support" on Mar 4, 2026
@Nakul-Rajpal
Contributor Author

@jmartin-tech is this branch ready to merge?

@leondz leondz requested review from jmartin-tech and leondz March 14, 2026 18:18
Collaborator

@jmartin-tech jmartin-tech left a comment


@Nakul-Rajpal this rebase looks to have been AI generated from the previous PR and needs closer author review and testing.

While using coding assistants to aid in contributing to this project is considered acceptable, this PR does not meet the project's needs in its current form.

Contributors must test and evaluate any contribution in the spirit of expanding the capabilities of the core project, and align with design expectations for naming conventions and separation of responsibilities in the codebase.

This PR was not properly rebased on the upstream NVIDIA:main branch and needs further alignment with upstream patterns. Review of the branch shows the rebase was on 711718c, which is from late Aug 2025.

Testing shows this PR currently ends up in an endless backoff loop when the target configuration is not viable for the llm environment, such as when the target model name is not available or a parameter is not supported for passthrough to the target model.

Once updated, there will need to be additional testing for errors reported by the llm library.

Note also that previous feedback in PR #1382, related to accepted_params based on enumerating the parameters for the target, is missing from this iteration.

This PR is also missing the updates to requirements.txt and pyproject.toml that are required for the generator to load.

Comment on lines +70 to +79
    def _clear_client(self) -> None:
        self.target = None

    def __getstate__(self) -> object:
        self._clear_client()
        return dict(self.__dict__)

    def __setstate__(self, data: dict) -> None:
        self.__dict__.update(data)
        self._load_client()
Collaborator


Based on the current upstream main these are no longer needed as they are now handled by the Configurable class updates in #1545.

        self.__dict__.update(data)
        self._load_client()

@backoff.on_exception(backoff.fibo, Exception, max_value=70)
Collaborator


Backoff on a generic Exception is highly susceptible to creating an infinite loop; please update to catch a more limited set of exceptions, following the patterns in #1199 to capture valid types, especially when based on deferred loading.
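To illustrate why, here is a minimal retry sketch (the class names are stand-ins borrowed from this thread, and the backoff sleep is omitted): retrying only a dedicated trigger exception lets non-recoverable errors escape immediately, while backing off on bare Exception would retry them indefinitely.

```python
class GeneratorBackoffTrigger(Exception):
    """Stand-in for garak's retryable-error signal."""

class UnknownModelError(Exception):
    """Stand-in for a non-recoverable llm error (e.g. a bad model name)."""

def call_with_retry(fn, max_attempts=3):
    # Retry only the dedicated trigger; any other exception propagates at
    # once, so a misconfigured target cannot spin in an endless backoff loop.
    last = None
    for _ in range(max_attempts):
        try:
            return fn()
        except GeneratorBackoffTrigger as e:
            last = e  # retryable: the real decorator would sleep here
    raise last
```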

Comment on lines +54 to +59
        self._load_client()

    def _load_client(self) -> None:
        import llm as llm_lib

        self.llm = llm_lib
Collaborator


Rename this to _load_unsafe() to match the current pattern from main. Also, extra_dependencies are loaded by base.Probe.__init__() prior to calling _load_unsafe():

Suggested change
-        self._load_client()
-    def _load_client(self) -> None:
-        import llm as llm_lib
-        self.llm = llm_lib
+        self._load_unsafe()
+    def _load_unsafe(self) -> None:
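The deferred-import pattern behind _load_unsafe() can be sketched in isolation; the helper name and error message here are illustrative assumptions, not garak's or llm's actual code:

```python
import importlib

def load_optional(module_name):
    # Importing at generator-load time, rather than at module import time,
    # keeps the module importable when the optional dependency is absent
    # and surfaces a clear error when it is actually needed.
    try:
        return importlib.import_module(module_name)
    except ImportError as e:
        raise RuntimeError(f"{module_name} is not installed") from e
```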

@Nakul-Rajpal Nakul-Rajpal force-pushed the generator-wrapLLM-v2 branch from eb378cf to 29a7f9f on March 17, 2026
@Nakul-Rajpal
Copy link
Contributor Author

@jmartin-tech These comments make sense. Here are additional issues I found after code changes and testing.

Before fixing the code itself, I properly rebased onto the current remote upstream main.

  1. Since I hadn't tested locally (I didn't install the packages, and they weren't automatically added to requirements/pyproject.toml), I added llm>=0.12 to requirements.txt.

  2. My naming conventions, as previously mentioned, were not aligned with those of similar generators in the codebase. I renamed _load_client() to _load_unsafe() and removed the custom __getstate__ and __setstate__ because they are handled by the Configurable class.

  3. I removed the unused _supported_params and kept _enumerate_model_params() to check which parameters each model actually supports before they are passed through.

  4. I added _unsafe_attributes = ["target"], so the base class nullifies the target during pickling and recreates it with the _load_unsafe() method on deserialization.

  5. I fixed the infinite loop. The @backoff.on_exception(...) decorator was replaced with GeneratorBackoffTrigger. ModelError and rate-limit errors now trigger the retry, while UnknownModelError and BadRequestError fail immediately or return None.


I then tested locally and made sure that:

  1. Using a fake model name fails the run immediately without a backoff loop.

  2. test.Blank does not crash the run.

  3. Running an actual probe (in this case dan.DanInTheWild) against a sample model (gpt-4o-mini) completed fully. It reported 942/1280, which is aligned with that model's capabilities.

  4. I also updated all syntax to be in line with garak requirements (Python 3.10+).
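The _unsafe_attributes behavior described in point 4 can be sketched in isolation; this is a simplified stand-in for what the base class does, not garak's actual Configurable code:

```python
class UnsafeAttrSketch:
    # attributes listed here hold unpicklable handles (e.g. an llm model)
    _unsafe_attributes = ["target"]

    def __init__(self):
        self._load_unsafe()

    def _load_unsafe(self):
        # stands in for creating the llm model handle
        self.target = object()

    def __getstate__(self):
        # called by pickle: nullify handles that cannot be serialized
        state = dict(self.__dict__)
        for attr in self._unsafe_attributes:
            state[attr] = None
        return state

    def __setstate__(self, data):
        # called on unpickling: restore state, then recreate the handle
        self.__dict__.update(data)
        self._load_unsafe()
```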

Comment on lines +125 to +128
            except self.llm.ModelError as e:
                raise GeneratorBackoffTrigger from e
            except Exception as e:
                backoff_types = ("RateLimitError", "APITimeoutError", "APIConnectionError")
Collaborator


If all self.llm.ModelError exceptions should just retry then a single exception handler is viable:

            except Exception as e:
                backoff_types = (self.llm.ModelError.__name__, "RateLimitError", "APITimeoutError", "APIConnectionError")

However, looking at llm.ModelError, it has a single subtype that should terminate the inference attempts, which is reason to keep the more focused handler and filter, raising GeneratorBackoffTrigger only when a retry has the potential to result in a valid completion.

Something like:

          except self.llm.NeedsKeyException as e:
              logging.error("llm generation failed: %s", repr(e))
              outputs.append(None)
          except Exception as e:
              backoff_types = (self.llm.ModelError, "RateLimitError", "APITimeoutError", "APIConnectionError")

For the NeedsKeyException case, however, I am not sure there is any chance of recovery or progress for the entire run, so it may actually be more appropriate to allow that exception to bubble up:

          except self.llm.NeedsKeyException as e:
              raise e
          except Exception as e:
              backoff_types = (self.llm.ModelError, "RateLimitError", "APITimeoutError", "APIConnectionError")

Contributor Author


Separated the two cases so that NeedsKeyException is caught first and returns None. The base ModelError check uses isinstance and raises GeneratorBackoffTrigger.

Comment on lines +119 to +134
        for _ in range(generations_this_call):
            try:
                response = self.target.prompt(
                    text_prompt, system=system_text, **prompt_kwargs
                )
                outputs.append(Message(response.text()))
            except self.llm.ModelError as e:
                raise GeneratorBackoffTrigger from e
            except Exception as e:
                backoff_types = ("RateLimitError", "APITimeoutError", "APIConnectionError")
                exc_name = type(e).__name__
                if exc_name in backoff_types:
                    raise GeneratorBackoffTrigger from e
                logging.error("llm generation failed: %s", repr(e))
                outputs.append(None)
        return outputs
Collaborator


The backoff action looks to be scoped too broadly. If an exception were to trigger backoff here on anything other than the first iteration of the loop then completed inference outputs would be abandoned and a new set of generations_this_call inference requests would be made.

I would suggest that the code inside this loop is the actual backoff boundary, and the code should be refactored to account for this. Something like:

    @backoff.on_exception(backoff.fibo, GeneratorBackoffTrigger, max_value=70)
    def _get_response(self, text_prompt, system_text, prompt_kwargs):
        try:
            response = self.target.prompt(
                text_prompt, system=system_text, **prompt_kwargs
            )
            return Message(response.text())
        except self.llm.NeedsKeyException as e:
            logging.error("llm generation failed: %s", repr(e))
            return None
        except Exception as e:
            backoff_types = (self.llm.ModelError.__name__, "RateLimitError", "APITimeoutError", "APIConnectionError")
            exc_name = type(e).__name__
            if exc_name in backoff_types:
                raise GeneratorBackoffTrigger from e
            logging.error("llm generation failed: %s", repr(e))
            return None

        for _ in range(generations_this_call):
            outputs.append(self._get_response(text_prompt, system_text, prompt_kwargs))
        return outputs

Contributor Author


I moved the retry to _call_single so that each generation gets its own independent retry.
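A hypothetical sketch of that shape (the _call_single and GeneratorBackoffTrigger names follow this thread; the plain retry loop stands in for the backoff decorator and omits sleeping): each generation retries independently, so completed outputs are never discarded.

```python
class GeneratorBackoffTrigger(Exception):
    """Stand-in for garak's retryable-error signal."""

def _call_single(make_request, max_attempts=3):
    # one retry budget per generation
    for _ in range(max_attempts):
        try:
            return make_request()
        except GeneratorBackoffTrigger:
            continue  # the real decorator would back off here
    return None  # retries exhausted: record a failed generation

def generate_all(make_request, generations_this_call):
    # a failure in generation N no longer discards generations 0..N-1
    return [_call_single(make_request) for _ in range(generations_this_call)]
```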

Comment on lines +50 to +54
        self.name = name
        self._load_config(config_root)
        self.fullname = f"llm:{self.name}"

        super().__init__(self.name, config_root=config_root)
Collaborator


Is there some reason to load config early here?

        super().__init__(name, config_root=config_root)
        self.fullname = f"llm:{self.name}"

Contributor Author


There was no reason to do that; now I am passing the name directly to super().__init__().

Comment on lines +129 to +131
                exc_name = type(e).__name__
                if exc_name in backoff_types:
                    raise GeneratorBackoffTrigger from e
Collaborator


Why is this based on string name values? I would expect to be able to use the types themselves similar to how the cohere generator does the handling.

See:

            except Exception as e:
                backoff_exception_types = (
                    self.cohere.errors.GatewayTimeoutError,
                    self.cohere.errors.TooManyRequestsError,
                    self.cohere.errors.ServiceUnavailableError,
                    self.cohere.errors.InternalServerError,
                )
                for backoff_exception in backoff_exception_types:
                    if isinstance(e, backoff_exception):
                        raise GeneratorBackoffTrigger from e  # bubble up ApiError for backoff handling
                logging.error(f"Chat API error: {e}")
                responses.append(None)

Are there multiple types with the same __name__ result that could be raised here but still provide a viable case to retry the request?

Contributor Author


Now I'm using isinstance(e, self.llm.ModelError), which is the only thing that triggers a retry; otherwise we log and return None.
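The difference is visible with stand-in exception classes; per the discussion above, NeedsKeyException is the ModelError subtype that should not be retried (the classify helper is purely illustrative):

```python
class ModelError(Exception):
    """Stand-in for llm.ModelError."""

class NeedsKeyException(ModelError):
    """Stand-in for llm.NeedsKeyException, the non-retryable subtype."""

def classify(exc):
    # isinstance also matches subclasses, which a type(exc).__name__
    # string comparison would silently miss
    if isinstance(exc, NeedsKeyException):
        return "fail"   # missing API key: a retry cannot succeed
    if isinstance(exc, ModelError):
        return "retry"  # model error: back off and retry
    return "log"        # unknown error: log it and record None
```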

Comment on lines +38 to +39
"top_p": None,
"stop": [],
Collaborator


Consider adding support similar to the OpenAICompatible and LiteLLM generators to suppress or extend options passed to this proxy library.

In particular suppressed_params and extra_params would enable control of additional options to the inference request that may vary depending on the target. These options would need to be incorporated into the logic of _enumerate_model_params().

Contributor Author


I have now added suppressed_params and extra_params to the default params. It should now match the pattern in OpenAICompatible and LiteLLM.
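A rough sketch of how suppressed_params and extra_params could feed into parameter resolution (the function and argument names are assumptions based on this comment, not the PR's actual code):

```python
def resolve_params(model_options, configured, suppressed_params=(), extra_params=None):
    # keep only options the target model supports, minus suppressed ones,
    # then layer user-supplied extras on top
    kwargs = {
        k: v
        for k, v in configured.items()
        if k in model_options and k not in suppressed_params and v is not None
    }
    kwargs.update(extra_params or {})
    return kwargs
```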

…xtra_params to default_params

Signed-off-by: Nakul Rajpal <[email protected]>

Development

Successfully merging this pull request may close these issues.

generator: datasette / llm
