cleanup docs (#74)
jonhue authored Oct 1, 2024
1 parent 556fe97 commit 20a5358
Showing 8 changed files with 122 additions and 54 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -57,7 +57,7 @@ To start a local server hosting the documentation run ```pdoc ./activeft --math`
title = {Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs},
author = {H{\"u}botter, Jonas and Bongni, Sascha and Hakimi, Ido and Krause, Andreas},
year = 2024,
journal = {TODO}
journal = {arXiv Preprint}
}
@inproceedings{hubotter2024transductive,
82 changes: 56 additions & 26 deletions activeft/__init__.py
@@ -1,19 +1,20 @@
r"""
*Active Fine-Tuning* (`activeft`) is a Python package for informative data selection.
*Active Fine-Tuning* (`activeft`) is a Python package for intelligent active data selection.
## Why Active Data Selection?
As opposed to random data selection, active data selection chooses data adaptively utilizing the current model.
In other words, <p style="text-align: center;">active data selection pays *attention* to the most useful data</p> which allows for faster learning and adaptation.
There are two main reasons why some data may be particularly useful:
1. **Informativeness**: The data contains information that the model had previously been uncertain about.
2. **Relevance**: The data is closely related to a particular task, such as answering a specific prompt.
1. **Relevance**: The data is closely related to a particular task, such as answering a specific prompt.
2. **Diversity**: The data contains non-redundant information that is not yet captured by the model.
A dataset that is both relevant and diverse is *informative* for the model.
This is related to memory recall, where the brain recalls informative and relevant memories (think "data") to make sense of the current sensory input.
Focusing recall on useful data enables efficient few-shot learning.
Focusing recall on useful data enables efficient learning from few examples.
`activeft` provides a simple interface for active data selection, which can be used as a drop-in replacement for random data selection.
`activeft` provides a simple interface for active data selection, which can be used as a drop-in replacement for random data selection or nearest neighbor retrieval.
## Getting Started
@@ -23,7 +24,7 @@
pip install activeft
```
We briefly discuss how to use `activeft` for [fine-tuning](#example-fine-tuning) and [in-context learning / retrieval-augmented generation](#example-in-context-learning).
We briefly discuss how to use `activeft` for standard [fine-tuning](#example-fine-tuning) and [test-time fine-tuning](#example-test-time-fine-tuning).
### Example: Fine-tuning
@@ -81,7 +82,13 @@
data_loader = ActiveDataLoader.initialize(dataset, target=None, batch_size=64)
```
### Example: In-context Learning
### Example: Test-Time Fine-Tuning
The above example described active data selection in the context of training a model with multiple batches. This usually happens at "train-time" or during "post-training".
The following example demonstrates how to use `activeft` at "test-time" to obtain a model that is as good as possible on a specific test instance.
For example, with a language model, this would fine-tune the model for a few gradient steps on data selected specifically for a given prompt.
We refer to the following paper for more details: [Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs](TODO).
We can also use the intelligent retrieval of informative and relevant data outside a training loop — for example, for in-context learning and retrieval-augmented generation.
@@ -91,36 +98,59 @@
```python
from activeft import ActiveDataLoader
data_loader = ActiveDataLoader.initialize(dataset, target, batch_size=5)
context = dataset[data_loader.next(model)]
model.add_to_context(context)
data_loader = ActiveDataLoader.initialize(dataset, target, batch_size=10)
data = dataset[data_loader.next(model)]
model.step(data)
```
Again: very simple!
### Scaling to Large Datasets
By default, `activeft` keeps a matrix in memory whose size grows with the dataset, which is not feasible for very large datasets.
Some acquisition functions (such as `activeft.acquisition_functions.LazyVTL`) allow for efficient computation of the acquisition function without storing the entire dataset in memory.
An alternative approach is to pre-select a subset of the data using nearest neighbor retrieval (using [Faiss](https://github.com/facebookresearch/faiss)), before initializing the `ActiveDataLoader`.
The following is an example of this approach in the context of [test-time fine-tuning](#example-test-time-fine-tuning):
```python
import torch
import faiss
from activeft.sift import Retriever
# Before Test-Time
embeddings = torch.randn(1000, 768)
index = faiss.IndexFlatIP(embeddings.size(1))
index.add(embeddings)
retriever = Retriever(index)
# At Test-Time, given query
query_embeddings = torch.randn(1, 768)
indices = retriever.search(query_embeddings, N=10, K=1_000)
data = embeddings[indices]
model.step(data) # Use data to fine-tune base model, then forward pass query
```
`activeft.sift.Retriever` first pre-selects `K` nearest neighbors and then uses `activeft` to select the `N` most informative data for the given query from this subset.
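The two-stage idea described above can be illustrated with a small, library-free sketch. This is not the `activeft` implementation: the helper names (`preselect`, `select_diverse`, `retrieve`) and the `redundancy_weight` parameter are hypothetical, and the second stage uses a simple greedy "relevance minus redundancy" rule as a stand-in for the acquisition function that `activeft` actually applies.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Inner product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def preselect(query: Vector, data: Sequence[Vector], K: int) -> List[int]:
    """Stage 1: indices of the K rows with the largest inner product with the query."""
    scores = [dot(query, row) for row in data]
    order = sorted(range(len(data)), key=lambda i: scores[i], reverse=True)
    return order[:K]


def select_diverse(
    query: Vector,
    data: Sequence[Vector],
    candidates: Sequence[int],
    N: int,
    redundancy_weight: float = 1.0,  # hypothetical knob, not an activeft parameter
) -> List[int]:
    """Stage 2 (stand-in): greedily pick up to N candidates, rewarding relevance
    to the query and penalizing similarity to points that were already picked."""
    chosen: List[int] = []
    remaining = list(candidates)

    def score(i: int) -> float:
        relevance = dot(query, data[i])
        redundancy = max((dot(data[i], data[j]) for j in chosen), default=0.0)
        return relevance - redundancy_weight * redundancy

    while remaining and len(chosen) < N:
        best = max(remaining, key=score)
        chosen.append(best)
        remaining.remove(best)
    return chosen


def retrieve(query: Vector, data: Sequence[Vector], N: int, K: int) -> List[int]:
    """Pre-select K nearest neighbors, then choose N of them."""
    return select_diverse(query, data, preselect(query, data, K), N)


if __name__ == "__main__":
    # Rows 0 and 1 are exact duplicates; row 2 is less similar to the query but new.
    data = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
    query = [0.8, 0.6]
    print(preselect(query, data, K=2))      # [0, 1]: plain nearest neighbors return the duplicate
    print(retrieve(query, data, N=2, K=3))  # [0, 2]: the second stage skips the duplicate
```

In this toy setting, plain nearest neighbor retrieval spends its budget on a duplicate, whereas the second stage prefers a less similar point that adds new information. This is the kind of behavior the text attributes to selecting the `N` most informative points from the `K` pre-selected neighbors.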
## Citation
If you use the code in a publication, please cite our papers:
```bibtex
# Active fine-tuning:
@inproceedings{huebotter2024active,
title={Active Few-Shot Fine-Tuning},
author={Jonas Hübotter and Bhavya Sukhija and Lenart Treven and Yarden As and Andreas Krause},
booktitle={ICLR Workshop on Bridging the Gap Between Practice and Theory in Deep Learning},
year={2024},
pdf={https://arxiv.org/pdf/2402.15898.pdf},
url={https://github.com/jonhue/activeft}
# Large-Scale Learning at Test-Time with SIFT
@article{hubotter2024efficiently,
title = {Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs},
author = {H{\"u}botter, Jonas and Bongni, Sascha and Hakimi, Ido and Krause, Andreas},
year = 2024,
journal = {arXiv Preprint}
}
# Theoretical analysis of "directed" active learning:
@inproceedings{huebotter2024information,
title={Information-based Transductive Active Learning},
author={Jonas Hübotter and Bhavya Sukhija and Lenart Treven and Yarden As and Andreas Krause},
booktitle={ICML},
year={2024},
pdf={https://arxiv.org/pdf/2402.15441.pdf},
url={https://github.com/jonhue/activeft}
# Theory and Fundamental Algorithms for Transductive Active Learning
@inproceedings{hubotter2024transductive,
title = {Transductive Active Learning: Theory and Applications},
author = {H{\"u}botter, Jonas and Sukhija, Bhavya and Treven, Lenart and As, Yarden and Krause, Andreas},
year = 2024,
booktitle = {Advances in Neural Information Processing Systems}
}
```
2 changes: 2 additions & 0 deletions activeft/acquisition_functions/lazy_vtl.py
@@ -46,6 +46,8 @@ class LazyVTL(
"""
Lazy Implementation of [VTL](vtl).[^1]
See Appendix F.2 of [Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs](TODO).
[^1]: Hübotter, J., Bongni, S., Hakimi, I., and Krause, A. Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs. Preprint, 2024.
"""

15 changes: 10 additions & 5 deletions docs/demo1.py
@@ -1,7 +1,12 @@
from activeft import ActiveDataLoader
import torch
import faiss
from activeft.sift import Retriever

train_loader = ActiveDataLoader.initialize(dataset, target, batch_size=32)
# Before Test-Time
index = faiss.IndexFlatIP(embeddings.size(1))
index.add(embeddings)
retriever = Retriever(index)

while not converged:
batch = dataset[train_loader.next(model)]
model.step(batch)
# At Test-Time, given query
indices = retriever.search(query_embeddings, N=10)
model.step(dataset[indices])
7 changes: 7 additions & 0 deletions docs/demo3.py
@@ -0,0 +1,7 @@
from activeft import ActiveDataLoader

train_loader = ActiveDataLoader.initialize(dataset, target, batch_size=32)

while not converged:
batch = dataset[train_loader.next(model)]
model.step(batch)
12 changes: 11 additions & 1 deletion docs/index.css
@@ -44,7 +44,12 @@ header {
border-bottom-left-radius: 0;
}

#example-1, #example-2 {
#example > label[for=view-3] {
border-top-left-radius: 0;
border-bottom-left-radius: 0;
}

#example-1, #example-2, #example-3 {
display: none;
}

@@ -56,6 +61,10 @@ header {
display: block;
}

#view-3:checked ~ #example-3 {
display: block;
}

.example-code {
box-shadow: rgba(0, 0, 0, 0.2) 0 20px 68px;
border-radius: 5px;
@@ -80,6 +89,7 @@ header {

.example-code .highlight {
margin: 1em 1em 1.5em;
min-width: 40em;
}

.example-code pre {
45 changes: 27 additions & 18 deletions docs/index.html.jinja2
@@ -55,17 +55,19 @@
{%- endmacro %}
<body>
<header>
<h1>Active Few-Shot Learning</h1>
<h1>Active Fine-Tuning</h1>
<p>
Efficient fine-tuning & in-context learning by intelligent active data selection.
Efficiently fine-tune large neural networks by intelligent active data selection.
</p>
</header>

<aside id="example">
<input type="radio" class="btn-check" name="view-selector" id="view-1" autocomplete="off" checked>
<label class="btn btn-outline-dark" for="view-1">Fine-tuning</label>
<input type="radio" class="btn-check" name="view-selector" id="view-2" autocomplete="off">
<label class="btn btn-outline-dark" for="view-2">In-context learning</label>
<label class="btn btn-outline-dark" for="view-1">at Test-Time</label>
{# <input type="radio" class="btn-check" name="view-selector" id="view-2" autocomplete="off">
<label class="btn btn-outline-dark" for="view-2">during Post-Training</label> #}
<input type="radio" class="btn-check" name="view-selector" id="view-3" autocomplete="off">
<label class="btn btn-outline-dark" for="view-3">within an Outer Loop</label>
<div class="example-code" id="example-1">
<svg aria-hidden="true" xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14">
<g>
@@ -77,7 +79,7 @@
<div class="title"></div>
{{ example_html1 }}
</div>
<div class="example-code" id="example-2">
{# <div class="example-code" id="example-2">
<svg aria-hidden="true" xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14">
<g>
<circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle>
@@ -87,6 +89,17 @@
</svg>
<div class="title"></div>
{{ example_html2 }}
</div> #}
<div class="example-code" id="example-3">
<svg aria-hidden="true" xmlns="http://www.w3.org/2000/svg" width="54" height="14" viewBox="0 0 54 14">
<g>
<circle cx="6" cy="6" r="6" fill="#FF5F56" stroke="#E0443E" stroke-width=".5"></circle>
<circle cx="26" cy="6" r="6" fill="#FFBD2E" stroke="#DEA123" stroke-width=".5"></circle>
<circle cx="46" cy="6" r="6" fill="#27C93F" stroke="#1AAB29" stroke-width=".5"></circle>
</g>
</svg>
<div class="title"></div>
{{ example_html3 }}
</div>
</aside>

@@ -102,31 +115,27 @@
{{ icon("book-half") }}
&nbsp;Documentation
</a>
<a href="https://arxiv.org/pdf/2402.15898.pdf"
<a href="TODO"
class="btn btn-dark shadow">
{{ icon("newspaper") }}
&nbsp;Paper
</a>
</div>
<p>
<p><center>
<code>activeft</code> retrieves data intelligently to maximize the information gain about specified prediction targets.
This can be used, for example, to select data for efficient few-shot <i>fine-tuning</i> or to populate a context for <i>in-context learning</i>.
The documentation details <a href="https://jonhue.github.io/activeft/docs/activeft.html#getting-started">how to get started</a>.
To learn how <code>activeft</code> works, check out our <a href="https://arxiv.org/pdf/2402.15898.pdf">paper</a> or our <a href="https://yas.pub">blog post</a>.
</p>
This can be used to select data for efficient <i>fine-tuning</i> or to efficiently <i>learn at test-time</i>.
</center></p>
<div id="publications">
<h3>Publications</h3>
<div>
<h5>Active Few-Shot Fine-Tuning <a href="https://arxiv.org/pdf/2402.15441.pdf">{{ icon("newspaper") }}</a></h5>
{# <p>Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause</p> #}
{# <p>ICLR 2024 Workshop on Bridging the Gap Between Practice and Theory in Deep Learning</p> #}
<h5>Efficiently Learning at Test-Time: Active Fine-Tuning of LLMs</h5>
{# <p>Jonas Hübotter, Sascha Bongni, Ido Hakimi, Andreas Krause</p> #}
<p>Preprint</p>
</div>
<div>
<h5>Information-based Transductive Active Learning <a href="https://arxiv.org/pdf/2402.15898.pdf">{{ icon("newspaper") }}</a></h5>
<h5>Transductive Active Learning: Theory and Applications <a href="https://arxiv.org/abs/2402.15898">{{ icon("newspaper") }}</a></h5>
{# <p>Jonas Hübotter, Bhavya Sukhija, Lenart Treven, Yarden As, Andreas Krause</p> #}
{# <p>ICML 2024</p> #}
<p>Preprint</p>
<p>NeurIPS 2024</p>
</div>
</div>
</main>
11 changes: 8 additions & 3 deletions docs/make.py
@@ -16,6 +16,7 @@
if __name__ == "__main__":
demo1 = here / "demo1.py"
demo2 = here / "demo2.py"
demo3 = here / "demo3.py"
env = Environment(
loader=FileSystemLoader([here]),
autoescape=True,
@@ -25,19 +26,23 @@
formatter = pygments.formatters.html.HtmlFormatter(style="dracula")
pygments_css = formatter.get_style_defs()
example_html1 = Markup(
pygments.highlight(demo1.read_text("utf8"), lexer, formatter).replace(
"converged", '<span class="highlighted">converged</span>'
)
pygments.highlight(demo1.read_text("utf8"), lexer, formatter)
)
example_html2 = Markup(
pygments.highlight(demo2.read_text("utf8"), lexer, formatter)
)
example_html3 = Markup(
pygments.highlight(demo3.read_text("utf8"), lexer, formatter).replace(
"converged", '<span class="highlighted">converged</span>'
)
)

(here / "index.html").write_bytes(
env.get_template("index.html.jinja2")
.render(
example_html1=example_html1,
example_html2=example_html2,
example_html3=example_html3,
pygments_css=pygments_css,
)
.encode()