[P0] Intervention scheduling for generation #110
base: main
Conversation
        if unit_locations is None:
            # this means, we don't filter based on location at all.
            return {"sources->base": ([None]*len(self.interventions), [None]*len(self.interventions))}

        if self.mode == "parallel":
Now that self.mode does not control this logic block, what is the difference between wait_for_forward_with_parallel_intervention() and wait_for_forward_with_serial_intervention()? Is there still a need to separate these two?
        intervention, module_hook = self.interventions[key]

        def hook_callback(model, args, kwargs, output=None):
            if self._is_generation:
Sorry if I don't understand, but could you explain the rationale for allowing the hook_callback to run when self._skip_forward is True?
This was dead code already, iirc, since it's just getting passed here. Correct me if I'm wrong, but since the getter hook is used to gather source representations, wouldn't it still need to run even if a generate() call skips intervening on the base (prompt)?
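For intuition, here is a generic PyTorch sketch (not pyvene's actual getter implementation) of why a gathering hook still needs to fire even when the base forward pass is not being modified: it only records the module's output for later use as a source representation.

```python
import torch
import torch.nn as nn

# Illustrative only: a "getter" forward hook that records a module's output so
# it can later be used as a source representation. Even when we choose not to
# modify the base forward pass, the recording still has to happen, otherwise
# there is no cached activation to swap in later.
gathered = {}

def getter_hook(module, args, output):
    gathered["source_activation"] = output.detach().clone()
    return output  # pass the output through unchanged

layer = nn.Linear(8, 8)
handle = layer.register_forward_hook(getter_hook)

_ = layer(torch.randn(2, 8))                 # source forward pass: hook records the activation
print(gathered["source_activation"].shape)   # torch.Size([2, 8])
handle.remove()
```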
@@ -149,13 +178,12 @@ def test_with_subspace_negative(self):
         Negative test case to check input length.
         """
         intervenable = IntervenableModel(
-            self.test_subspace_intervention_link_config, self.mlp
+            self.test_negative_subspace_config, self.mlp
What happens if you replace this test_negative_subspace_config with test_subspace_intervention_link_config?
This test case was intended to test defining an intervention with subspace partitions that exceed the dimension of the model. That is why test_subspace_intervention_link_config wasn't triggering an IndexError at all: it and the inputs in this test case are both of dim 3. (It was passing in previous commits because of an entirely unrelated and problematic IndexError that should actually be fixed by this PR.)

Since changing the current config would break all the other tests in this file that rely on it, I decided to just copy it over to a new one.
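For illustration, a torch-only sketch (not the pyvene config API; the partition values are made up) of the failure this negative test is meant to exercise: a subspace partition that refers to dimensions beyond the representation's size fails at indexing time.

```python
import torch

# Torch-only illustration (not the pyvene config API; partition values are
# made up): a subspace partition that names dimensions beyond the size of the
# representation fails when used for indexing, which is what the negative
# test is supposed to exercise.
hidden = torch.randn(4, 3)                  # representation of dim 3, as in the MLP tests

valid_partition = [[0, 1], [2]]             # stays within dim 3: fine
_ = hidden[..., valid_partition[0]]

oversized_partition = [[0, 1, 2], [3, 4]]   # dims 3 and 4 do not exist
try:
    _ = hidden[..., oversized_partition[1]]
except IndexError as err:
    print("expected failure:", err)
```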
@@ -102,15 +107,15 @@ def test_scatter_neurons_gpt2_batch_diff_fast_no_head_positive(self):
         golden_output = tensor_input.clone()
Since there is no fast path anymore, we can remove all fast_path tests, and remove the fast_path parameter in modeling_utils.py as well.
I was curious about that, good to know we can remove it. I'd rather use a separate PR for that, though.
             ] = replacing_tensor_input[:, i]
         else:
-            tensor_input[_batch_idx, unit_locations] = replacing_tensor_input
+            tensor_input[_batch_idx, unit_locations] = replacing_tensor_input[_batch_idx]
Good job! Removed the for loop in the assignment.
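For reference, a small standalone torch sketch of the pattern being discussed (shapes and names are illustrative, not the exact modeling_utils.py code): a per-example Python loop versus a single advanced-indexing assignment.

```python
import torch

# Sketch of the pattern discussed above (shapes and names are illustrative,
# not the exact modeling_utils.py code): overwrite selected positions of each
# batch element with replacement values.
B, T, H = 2, 5, 4
tensor_input = torch.zeros(B, T, H)
replacing_tensor_input = torch.ones(B, 3, H)     # 3 replacement positions per example
unit_locations = torch.tensor([[0, 1, 2],        # positions to overwrite, per example
                               [1, 2, 3]])
_batch_idx = torch.arange(B).unsqueeze(-1)       # shape (B, 1), broadcasts against (B, 3)

# Loop version (what the old code effectively did):
for b in range(B):
    tensor_input[b, unit_locations[b]] = replacing_tensor_input[b]

# Vectorized version: a single advanced-indexing assignment.
tensor_input[_batch_idx, unit_locations] = replacing_tensor_input
```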
Force-pushed from 831eb33 to a17e19a.
Is this PR done? I would really like to use this functionality for my ongoing project. Also, do interventions at a given time-step carry over to all future time-steps? For example, I would want to intervene on the last token at every generation step, i.e. only intervene on the token at the current last position. Thoughts would be appreciated.
Description
Basic functionality for scheduling interventions to happen on positions not present in the prompt (i.e. generated tokens). Ideally, the GRU path should follow the same procedure.
Changelog:
- `timestep_selector`, a list of length `num_intv` of boolean callbacks with signature `Callable[[int, torch.Tensor], bool]`, can be passed to `generate()` calls. Each intervention calls its callback with the current position to determine whether the intervention should operate on that position or not (usage sketch below).
- `None` values in unit locations: if `None`s are specified at the batch dimension, then interventions are not applied to those examples in the batch.
- `_intervention_getter()` / `_intervention_setter()` were being called with single interventions even though they were written to handle an array of intervention keys and return a list of handlers; this dead code has been removed.
- Fixes in `gather_neurons()` and `scatter_neurons()`.
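A hypothetical usage sketch based only on the changelog above: only `timestep_selector` and its `Callable[[int, torch.Tensor], bool]` signature come from this PR, while the other `generate()` arguments and setup names are assumptions and may differ from the final API. It also shows the kind of selector asked about earlier: one that always targets the most recently generated position.

```python
from typing import Callable, List
import torch

# Hypothetical sketch based on the changelog above. Only `timestep_selector`
# and its Callable[[int, torch.Tensor], bool] signature come from this PR;
# the other names and arguments are assumptions.

prompt_len = 8
num_intv = 2  # assume the model was configured with two interventions

# One callback per intervention: called with the current position (and the
# current inputs) and returns whether that intervention should fire there.
intervene_on_generated_only: Callable[[int, torch.Tensor], bool] = (
    lambda pos, inputs: pos >= prompt_len
)
intervene_on_last_token: Callable[[int, torch.Tensor], bool] = (
    lambda pos, inputs: pos == inputs.shape[-1] - 1
)

selectors: List[Callable[[int, torch.Tensor], bool]] = [
    intervene_on_generated_only,
    intervene_on_last_token,
]
assert len(selectors) == num_intv

# Hypothetical call; `intervenable`, `base`, and `sources` are assumed to be
# set up as in the existing pyvene generation examples.
# _, counterfactual = intervenable.generate(
#     base,
#     sources=sources,
#     timestep_selector=selectors,
#     max_new_tokens=16,
# )
```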
Testing Done:
- `test_nulling_intervention`, `test_generation_with_source_intervened_prompt`, `test_dynamic_static_generation_intervention_parity`, `test_generation_noops`
- `test_with_subspace_negative`, `test_scatter_neurons_gpt2_attn_with_head_positive`