Generator streamlining, docs #682

Merged: 4 commits, May 27, 2024
41 changes: 41 additions & 0 deletions docs/source/garak.generators.base.rst
@@ -1,6 +1,47 @@
garak.generators.base
=====================

In garak, ``Generator``s wrap any text-to-text+ system that garak will examine. This could be a raw LLM, a chatbot endpoint, or even a non-LLM dialog system. This base class defines the basic structure of garak's generators. All generators inherit from garak.generators.base.Generator.

Attributes (a minimal subclass sketch follows this list):

* name - The name of the specific generator class. This is optionally also set in the constructor.
* description - An optional description of the generator.
* generations - How many responses should be requested from the generator per prompt.
* max_tokens - The maximum number of tokens to generate.
* temperature - Optionally, a temperature param to pass to the underlying model.
* top_k - Optionally, a top-k sampling param to pass to the underlying model.
* top_p - Optionally, a nucleus sampling (top-p) param to pass to the underlying model.
* active - Whether or not the class is active. Usually true, unless a generator is disabled for some particular reason.
* generator_family_name - Generators each belong to a family, describing a group. This is often related to the module name - for example, ``openai.py`` contains classes for working with OpenAI models, whose generator_family_name is "openai".
* context_len - The number of tokens in the model context window, or None.
* modality - A dictionary with two keys, "in" and "out", each holding a set of the modalities supported by the generator. "in" refers to prompt expectations, and "out" refers to output. For example, a text-to-text+image model would have modality: ``dict = {"in": {"text"}, "out": {"text", "image"}}``.
* supports_multiple_generations - Whether or not the generator can natively return multiple outputs from a prompt in a single function call. When set to False, the ``generate()`` method will make repeated calls, one output at a time, until the requested number of generations (in ``generations``) is reached.
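
To make these concrete, here is a minimal sketch of a hypothetical subclass declaring the attributes above; the class name, model name, and all values are invented for illustration:

.. code-block:: python

   from garak.generators.base import Generator


   class ExampleGenerator(Generator):
       """Hypothetical generator, shown only to illustrate the attributes."""

       name = "example-model"             # may also be set via the constructor
       description = "Illustrative text-to-text generator"
       generator_family_name = "example"  # usually mirrors the module name

       generations = 10    # responses requested per prompt
       max_tokens = 150    # cap on the number of tokens to generate
       temperature = 0.7   # optional sampling temperature
       top_k = 40          # optional top-k sampling parameter
       top_p = 0.95        # optional nucleus sampling parameter

       active = True
       context_len = 4096  # tokens in the context window, or None if unknown
       modality = {"in": {"text"}, "out": {"text"}}
       supports_multiple_generations = False  # generate() will loop instead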

Functions:

#. **__init__()**: Class constructor. Call this from generator subclasses after doing local init. It populates name variables, notifies the user of generator startup, and logs generator construction.

#. **generate()**: This method mediates access to the underlying model or dialogue system. ``generate()`` orchestrates all interaction with the dialogue service/model. It takes a prompt and, optionally, a number of output generations (``generations_this_call``). It returns a list of responses, of length up to the number of output generations, with each member a prompt response (e.g. text). Since ``generate()`` involves a fair amount of logic, it is preferable not to override this function; instead, work with the hooks and sub-methods provided.

The general flow in ``generate()`` is as follows (a simplified sketch in code follows the list):

#. Call the ``_pre_generate_hook()``.
#. Work out how many generations we're doing this call (if -1 is passed via ``generations_this_call``, the default count in ``self.generations`` is used).
#. If only one generation is requested, return the output of ``_call_model`` with 1 generation specified.
#. If the underlying model supports multiple generations, return the output of ``_call_model`` invoked with the full count of generations.
#. Otherwise, we need to assemble the outputs over multiple calls. There are two options here.
#. Is garak running with ``parallel_attempts > 1`` configured? In that case, start a multiprocessing pool with as many workers as the value of ``parallel_attempts``, and have each one of these work on building the required number of generations, in any order.
#. Otherwise, call ``_call_model()`` repeatedly to collect the requested number of generations.
#. Return the resulting list of prompt responses.
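
The sketch below models this flow as a free function for clarity; ``gen`` stands in for a ``Generator`` instance and ``parallel_attempts`` for the corresponding garak configuration value. It is illustrative only, not the actual implementation:

.. code-block:: python

   from multiprocessing import Pool
   from typing import List, Union


   def generate_flow(gen, prompt: str, generations_this_call: int = -1,
                     parallel_attempts: int = 1) -> List[Union[str, None]]:
       gen._pre_generate_hook()

       if generations_this_call == -1:  # -1 means "use the default count"
           generations_this_call = gen.generations

       if generations_this_call == 1:
           return gen._call_model(prompt, 1)

       if gen.supports_multiple_generations:  # one call yields all outputs
           return gen._call_model(prompt, generations_this_call)

       # otherwise, assemble the outputs over repeated single-output calls
       outputs = []
       if parallel_attempts > 1:
           with Pool(parallel_attempts) as pool:
               # each worker builds one generation, in any order
               for result in pool.imap_unordered(
                   gen._call_model, [prompt] * generations_this_call
               ):
                   outputs += result  # each result is a one-item list
       else:
           for _ in range(generations_this_call):
               outputs += gen._call_model(prompt, 1)
       return outputs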

#. **_call_model()**: This method handles direct interaction with the model. It takes a prompt and an optional number of generations for this call, and returns a list of prompt responses (e.g. strings) and ``None``s. A model may return ``None`` when the underlying system fails unrecoverably. This is the method in which to write model interaction code. If the class's ``supports_multiple_generations`` is false, ``_call_model`` does not need to accept values of ``generations_this_call`` other than ``1``. See the subclass sketch after this list.

#. **_pre_generate_hook()**: An optional hook called before generation; useful if the class needs to do setup or housekeeping first.
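
As an illustration of the last two methods, a minimal subclass might look like the sketch below; the endpoint URL, response shape, and class name are assumptions made for the example:

.. code-block:: python

   import logging
   from typing import List, Union

   import requests

   from garak.generators.base import Generator


   class ExampleServiceGenerator(Generator):
       """Hypothetical HTTP-backed generator; names and URL are invented."""

       generator_family_name = "example"
       supports_multiple_generations = False  # generate() loops for us

       ENDPOINT = "http://localhost:8000/complete"  # placeholder URL

       def _pre_generate_hook(self):
           # lazily set up an HTTP session before the first generation
           if not hasattr(self, "_session"):
               self._session = requests.Session()

       def _call_model(
           self, prompt: str, generations_this_call: int = 1
       ) -> List[Union[str, None]]:
           # with supports_multiple_generations False, garak only ever
           # asks this method for one output at a time
           try:
               resp = self._session.post(
                   self.ENDPOINT, json={"prompt": prompt}, timeout=30
               )
               resp.raise_for_status()
               return [resp.json()["text"]]
           except Exception as err:
               logging.warning("example generator failed: %s", repr(err))
               return [None]  # None signals an unrecoverable failure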




.. automodule:: garak.generators.base
:members:
:undoc-members:
2 changes: 1 addition & 1 deletion docs/source/garak.probes.base.rst
@@ -1,7 +1,7 @@
garak.probes.base
=================

-Probes inherit from garak.probes.base.Probe.
+This class defines the basic structure of garak's probes. All probes inherit from garak.probes.base.Probe.

Functions:

18 changes: 9 additions & 9 deletions garak/generators/base.py
@@ -92,11 +92,11 @@ def generate(self, prompt: str, generations_this_call: int = -1) -> List[str]:
         logging.debug("generate() called with generations_this_call = 0")
         return []
 
-        if self.supports_multiple_generations:
-            return self._call_model(prompt, generations_this_call)
+        if generations_this_call == 1:
+            outputs = self._call_model(prompt, 1)
 
-        elif generations_this_call <= 1:
-            return self._call_model(prompt, generations_this_call)
+        elif self.supports_multiple_generations:
+            outputs = self._call_model(prompt, generations_this_call)
 
         else:
             outputs = []
@@ -138,9 +138,9 @@ def generate(self, prompt: str, generations_this_call: int = -1) -> List[str]:
                 ), "_call_model's item must be a string or None"
                 outputs.append(output_one[0])
 
-            cleaned_outputs = [
-                o for o in outputs if o is not None
-            ]  # "None" means no good response
-            outputs = cleaned_outputs
+        cleaned_outputs = [
+            o for o in outputs if o is not None
+        ]  # "None" means no good response
+        outputs = cleaned_outputs
 
-            return outputs
+        return outputs