
Releases: AnswerDotAI/RAGatouille

0.0.9

11 Feb 04:31

What's Changed

  • fix: fix inversion method & pytorch Kmeans OOM by @bclavie in #179
  • performance: Optimize ColBERT index free search with torch.topk by @Diegi97 in #219
  • Calculate pid_docid_map.values() only once in add_to_index by @vishalbakshi in #267
  • Fix/185 return trainer best checkpoint path by @GeraudBourdin in #265
  • Finally removed the dependency hell by fully getting rid of Poetry. Stay tuned for more updates!
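The `torch.topk` optimization from #219 replaces a full sort of every document score with a partial top-k selection. A minimal sketch of the idea (a hypothetical helper, not RAGatouille's actual code):

```python
import torch

def top_k_docs(scores: torch.Tensor, k: int):
    """Select the k best-scoring documents without sorting every score.

    torch.topk is roughly O(n log k) rather than the O(n log n) of a full
    sort, which matters when scoring a whole index-free collection.
    """
    k = min(k, scores.numel())
    # Returns (values, indices), with values sorted in descending order.
    return torch.topk(scores, k)

# Illustrative usage on dummy scores (not real ColBERT maxsim scores):
values, indices = top_k_docs(torch.tensor([0.1, 0.9, 0.4, 0.7]), 2)
```
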

New Contributors

Full Changelog: 0.0.8...0.0.9

0.0.8post1

19 Mar 13:45

Minor fix: corrects the `from time import time` import introduced in the indexing overhaul, which caused crashes because `time` was then used improperly.

0.0.8

18 Mar 19:49
d27b693

0.0.8 is finally here!

Major changes:

  • Indexing overhaul contributed by @jlscheerer #158
  • Relaxed dependencies to lighten the install footprint #173
  • Indexing for under 100k documents will by default no longer use Faiss, performing K-Means in pure PyTorch instead. This is a somewhat experimental change, but benchmark results are encouraging and it greatly increases compatibility. #173
  • CRUD improvements by @anirudhdharmarajan. Feature is still experimental/not fully supported, but rapidly improving!
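The pure-PyTorch K-Means mentioned above can be pictured as plain Lloyd's algorithm on the embedding tensor. A minimal sketch under that assumption (hypothetical helper, not RAGatouille's actual implementation):

```python
import torch

def kmeans(points: torch.Tensor, k: int, iters: int = 10) -> torch.Tensor:
    """Lloyd's K-Means in pure PyTorch: the idea behind skipping Faiss
    for small (<100k document) indexes."""
    # Initialise centroids from a random subset of the points (copy, not a view).
    centroids = points[torch.randperm(points.shape[0])[:k]]
    for _ in range(iters):
        # Assign each point to its nearest centroid by pairwise distance.
        assignments = torch.cdist(points, centroids).argmin(dim=1)
        # Recompute each centroid as the mean of its assigned points.
        for c in range(k):
            mask = assignments == c
            if mask.any():
                centroids[c] = points[mask].mean(dim=0)
    return centroids
```
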

Fixes:

  • Many small bug fixes, mainly around typing
  • Training triplets improvement (already present in 0.0.7 post versions) by @JoshuaPurtell

0.0.7post3

16 Feb 19:30
  • Improvements for data preprocessing issues and fixes for broken training example by @jonppe (#138) 🙏

0.0.7post2

13 Feb 21:45
b7ae28a

Fixes & tweaks to the previous release:

  • Automatically adjust batch size for longer contexts (32 at 512 tokens, 16 at 1024, 8 at 2048, halving as length doubles, down to a minimum of 1)
  • Apply dynamic max context length to reranking
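The batch-size adjustment above amounts to a simple halving schedule. An illustrative sketch (function and parameter names are hypothetical):

```python
def adjusted_batch_size(max_tokens: int, base_batch: int = 32, base_len: int = 512) -> int:
    """Halve the 512-token base batch size each time the context length
    doubles, never going below 1."""
    if max_tokens <= base_len:
        return base_batch
    return max(1, base_batch // (max_tokens // base_len))
```
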

0.0.7post1

13 Feb 20:55

Release focusing on length adjustments. Much more dynamism and on-the-fly adaptation, both for query length and maximum document length!

  • Remove hardcoded maximum length: it is now inferred from your base model's maximum position encodings. This enables support for longer-context ColBERT, such as Jina ColBERT
  • Upstream changes to colbert-ai to allow any base model to be used, rather than pre-defined ones.
  • Query length now adjusts dynamically, from 32 (hardcoded minimum) to your model's maximum context window for longer queries.
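The dynamic query length described above is essentially a clamp between the hardcoded floor of 32 and the model's context window. An illustrative sketch (hypothetical helper, not RAGatouille's exact logic):

```python
def dynamic_query_maxlen(query_token_count: int, model_max_tokens: int) -> int:
    """Grow the query encoder's max length with the actual query, from a
    hardcoded minimum of 32 up to the base model's context window."""
    return max(32, min(query_token_count, model_max_tokens))
```
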

0.0.6c2

11 Feb 21:05
5409914

(notes encompassing changes in the last few PyPI releases that were undocumented until now)

Changes:

  • Query only a subset of documents based on doc ids by @PrimoUomo89 #94
  • Return chunk ids in results thanks to @PrimoUomo89 #125
  • Lower the number of k-means iterations when more are unnecessary #129
  • Properly license the library as Apache-2.0 on PyPI

Fixes:

  • Dynamically increase search hyperparameters for large k values and low doc counts, reducing the number of situations where the total number of documents returned falls substantially below k #131
  • Fix enabling training data processing with hard negatives turned off by @corrius #117
  • Proper handling of different input types when pre-processing training triplets by @GautamR-Samagra #115
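The hyperparameter adjustment in #131 can be pictured as widening the candidate pool when k is large relative to the collection. A sketch of that idea only (names and scaling factors are hypothetical, not the library's actual values):

```python
def adjusted_candidate_pool(k: int, doc_count: int, base_pool: int = 256) -> int:
    """Widen the search stage's candidate pool when k is large, so enough
    documents survive to the final ranking to actually return ~k results."""
    pool = max(base_pool, 4 * k)   # scale the pool with the requested k
    return min(pool, doc_count)    # never exceed the collection size
```
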

0.0.6b5

05 Feb 17:06

Minor fixes & improvements release.

Community contribs:

0.0.6b2

29 Jan 21:04
5dbac07
  • Fix newly introduced dependency issue

0.0.6b0

28 Jan 19:47
  • Fixes sometimes skipped shuffling of training triplets
  • Fixes accidental duplicates when input training data has many more positives than negatives.
  • Bump to colbert-ai 0.2.18, fully removing multiprocessing calls when indexing