Commit cd69974

fix docs

jonhue committed Feb 22, 2024
1 parent b8c2e10 commit cd69974
Showing 2 changed files with 2 additions and 2 deletions.
afsl/acquisition_functions/kmeans_pp.py — 2 changes: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ class KMeansPP(MaxDist):
|------------|------------------|------------|--------------------|
| ❌ | (✅) | ✅ | embedding / kernel |
- Using the afsl.embeddings.classification.CrossEntropyEmbedding embeddings, this acquisition function is known as BADGE (*Batch Active learning by Diverse Gradient Embeddings*).[^4]
+ Using the afsl.embeddings.classification.HallucinatedCrossEntropyEmbedding embeddings, this acquisition function is known as BADGE (*Batch Active learning by Diverse Gradient Embeddings*).[^4]
[^1]: See [here](max_dist#where-does-the-distance-come-from) for a discussion of how a distance is induced by embeddings or a kernel.
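For context on this hunk: the class renamed above is the embedding under which KMeansPP coincides with BADGE, i.e. k-means++ seeding over per-example loss-gradient embeddings. Below is a self-contained sketch of that seeding rule; this is illustrative NumPy code with a hypothetical helper name (kmeans_pp_select), not afsl's implementation.

```python
# Illustrative sketch of the BADGE selection rule: k-means++ seeding
# over rows of an embedding matrix. Not afsl's implementation.
import numpy as np

def kmeans_pp_select(embeddings: np.ndarray, batch_size: int, rng=None) -> list[int]:
    """Pick `batch_size` indices by k-means++ seeding over rows of `embeddings`."""
    if rng is None:
        rng = np.random.default_rng(0)
    n = embeddings.shape[0]
    selected = [int(rng.integers(n))]  # first center chosen uniformly at random
    # squared distance of every point to its nearest selected center
    d2 = np.sum((embeddings - embeddings[selected[0]]) ** 2, axis=1)
    while len(selected) < batch_size:
        probs = d2 / d2.sum()          # sample proportionally to squared distance
        i = int(rng.choice(n, p=probs))
        selected.append(i)
        d2 = np.minimum(d2, np.sum((embeddings - embeddings[i]) ** 2, axis=1))
    return selected

# Usage: rows are per-example gradient embeddings phi(x).
phi = np.random.default_rng(1).normal(size=(100, 16))
print(kmeans_pp_select(phi, batch_size=5))
```

With loss-gradient embeddings as input, this seeding step is exactly the BADGE batch rule the docstring refers to.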
afsl/model.py — 2 changes: 1 addition & 1 deletion
@@ -48,7 +48,7 @@ class ModelWithEmbedding(Model, Protocol):
- **Output Gradients (empirical NTK):** Another common choice is $\vphi(\vx) = \grad[\vtheta] \vf(\vx; \vtheta)$ where $\vtheta$ are the network parameters.
Its associated kernel is known as the *(empirical) Neural Tangent Kernel* (NTK).[^4][^3][^5]
If $\vtheta$ is restricted to the weights of the final linear layer, then this embedding is simply the last-layer embedding.
- - **Loss Gradients:** Another possible choice is $\vphi(\vx) = \grad[\vtheta] \ell(\vf(\vx; \vtheta); \widehat{\vy}(\vx))$ where $\ell$ is a loss function and $\widehat{\vy}(\vx)$ is some hallucinated label (see afsl.embeddings.classification.CrossEntropyEmbedding).[^6]
+ - **Loss Gradients:** Another possible choice is $\vphi(\vx) = \grad[\vtheta] \ell(\vf(\vx; \vtheta); \widehat{\vy}(\vx))$ where $\ell$ is a loss function and $\widehat{\vy}(\vx)$ is some hallucinated label (see afsl.embeddings.classification.HallucinatedCrossEntropyEmbedding).[^6]
- **Outputs (empirical NNGP):** Another possible choice is $\vphi(\vx) = \vf(\vx)$ (i.e., the output of the network).
Its associated kernel is known as the *(empirical) Neural Network Gaussian Process* (NNGP) kernel.[^2]
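For context on this hunk: HallucinatedCrossEntropyEmbedding realizes the loss-gradient embedding described in the bullet above, with the hallucinated label taken to be the model's own prediction. A minimal PyTorch sketch follows, restricting the gradient to the final linear layer's weights; the helper loss_gradient_embedding is hypothetical, not afsl's API.

```python
# Sketch: a hallucinated cross-entropy loss-gradient embedding,
# phi(x) = grad_theta loss(f(x; theta); y_hat(x)), with theta restricted
# to the final linear layer. Illustrative only, not afsl's implementation.
import torch
import torch.nn.functional as F

def loss_gradient_embedding(model: torch.nn.Module, last_layer: torch.nn.Linear,
                            x: torch.Tensor) -> torch.Tensor:
    """Return one flattened gradient embedding per row of `x`."""
    embeddings = []
    for xi in x:
        logits = model(xi.unsqueeze(0))
        y_hat = logits.argmax(dim=-1)        # hallucinated (predicted) label
        loss = F.cross_entropy(logits, y_hat)
        (grad,) = torch.autograd.grad(loss, last_layer.weight)
        embeddings.append(grad.flatten())
    return torch.stack(embeddings)

# Usage with a toy two-layer classifier:
net = torch.nn.Sequential(torch.nn.Linear(4, 8), torch.nn.ReLU(), torch.nn.Linear(8, 3))
phi = loss_gradient_embedding(net, net[2], torch.randn(10, 4))
print(phi.shape)  # (10, 24): one 8*3 last-layer gradient per input
```

Stacking one flattened gradient per input yields the embedding matrix whose induced kernel the surrounding docstring discusses.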
