Here also lies a list of my contributions to open source software.
Mostly simulations for some questions and answers on stats.stackexchange.com and stackoverflow.com.
[post] select_on_test.ipynb: Demonstrate that a model can simultaneously be selected and evaluated on a test set
[post] train_on_test_features: For high rank data and a small test set, train a PCA on test set features to boost test set performance!
precision_drop.ipynb: A simple answer to: why did precision drop in production?
[post] auprc.ipynb: Demonstrate that integral approximators are trying to hurt you
db_sampling_rate.ipynb: Calculate a sampling rate for a database query
[post] negative_vs_downsampling.ipynb: What's the need to formulate negative sampling for contrastive training? (not done). Also investigated in sigltt/train.ipynb.
[post] var_pred_var_error: Does higher variance in predictions result in higher variance error estimation?
[post] sample_via_gumbel: Demonstrate that one can sample directly in log-space
langchain_save_all: Save all method calls. Inspired by this issue
My dumber code dumps are in ./dumpy/.
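For a rough sense of what "sampling directly in log-space" (the sample_via_gumbel entry above) can look like, here is a minimal sketch of the Gumbel-max trick in plain Python. It is not taken from the repo: the function names `gumbel` and `sample_from_log_weights` are made up for illustration, and the actual code in sample_via_gumbel may do things differently.

```python
import math
import random
from collections import Counter


def gumbel(rng=random):
    """Draw one standard Gumbel(0, 1) variate as -log(-log(U))."""
    u = rng.random()
    while u == 0.0:  # random() can return exactly 0.0; log(0) is undefined
        u = rng.random()
    return -math.log(-math.log(u))


def sample_from_log_weights(log_weights, rng=random):
    """Return index i with probability proportional to exp(log_weights[i]).

    Gumbel-max trick: add independent Gumbel noise to each log-weight and
    take the argmax. The weights are never exponentiated or normalized.
    """
    perturbed = [lw + gumbel(rng) for lw in log_weights]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


# Hypothetical example: these log-weights underflow to 0.0 under math.exp,
# so normalizing them the naive way would divide by zero.
log_weights = [-1000.0, -1001.0, -1002.0]
rng = random.Random(0)
counts = Counter(sample_from_log_weights(log_weights, rng) for _ in range(20000))
frequencies = [counts[i] / 20000 for i in range(len(log_weights))]
print(frequencies)
```

The printed frequencies should land near the softmax of the log-weights (about 0.665, 0.245, 0.090 here), even though `math.exp(-1000.0)` is 0.0 in floating point.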
Requires Python 3.8+.
Create an environment named blog using venv:
cd /your/venvs
python -m venv blog
source blog/bin/activate
python -m pip install -r /path/to/blog/requirements.txt
If the notebook says that it needs to run on a GPU machine, and you have a Google account, open the notebook in Google Colab.
Interact with the code via Jupyter. I like VS Code notebooks.