Releases: llm-tools/embedJs
v0.1.8
v0.1.7
v0.1.6
v0.1.5
v0.1.4
0.1.4 (2024-10-09)
🚀 Features
🩹 Fixes
v0.1.3
v0.1.2
0.1.1 (2024-10-04)
Temporarily disabled the dynamic, URL and local path loaders, as they required installing all modules from the monorepo. Also temporarily removed access to the Simple_Models enum. Both will be re-enabled soon.
v0.1.0
0.1.0 (2024-10-03)
This component has been extracted and is now published as part of a workspace monorepo managed by Nx. Many reasons prompted this move, but the most critical was removing the need to install every dependency for a single use case. As we added (and continue to add) more loaders, databases, caches and models, the number of shared dependencies grew considerably. Most projects will not use all these combinations, and it made no sense to have them all installed for everyone. Further, issues in dependent packages raised vulnerabilities that affected all projects, clearly something we did not intend.
Now what? Starting with version 0.1.0, we have switched to a monorepo-based approach. All packages will share the same version number, but changelogs and dependencies will be independent. You only need to install the addons (loaders, models, databases, etc.) relevant to your use case. Given the shortage of maintainers, we will support the non-monorepo version of the library only with critical bug fixes for the next three months, after which the older version will not receive any security fixes. We strongly recommend upgrading to the newer version as soon as you can.
- Adhityan K V
Version 0.0.82
A number of important features and bug fixes make it into this release. Here's a rundown of the top new features -
Loader inference
The library can now infer the type of the loader automatically. You can pass a string and it will use the MIME type (detected using magic numbers) and the file extension, if available, to decide which loader to invoke. For example -
.addLoader('https://tesla-info.com/sitemap.xml') // will use sitemap loader
.addLoader('https://en.wikipedia.org/wiki/Tesla,_Inc.') // will use the web loader
.addLoader('s4pVFLUlx8g') // will detect this is a youtube video id and use the video loader
.addLoader('https://lamport.azurewebsites.net/pubs/paxos-simple.pdf') // will use the pdf loader
.addLoader('local/paxos-simple.pdf') // will also use the pdf loader
.addLoader('local/data.csv') // will use the CSV loader
You can also pass it a local directory name and it will recursively load all files within it using the most appropriate loader. Note: It will skip files it does not have a loader for.
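To make the inference idea concrete, here is a minimal sketch of how a string might be mapped to a loader type. It is not the library's implementation: inferLoaderType and the returned labels are hypothetical, and the sketch looks only at the shape of the URL or path and the file extension, leaving out the magic-number MIME detection described above.

```typescript
// Hypothetical labels for illustration; not the library's own type names.
type LoaderKind = 'Sitemap' | 'Web' | 'YoutubeVideo' | 'Pdf' | 'Csv' | 'LocalPath';

// Assumption: YouTube video ids are 11 characters from [A-Za-z0-9_-].
const YOUTUBE_ID = /^[A-Za-z0-9_-]{11}$/;

function inferLoaderType(input: string): LoaderKind {
    const isUrl = /^https?:\/\//i.test(input);
    // Drop any query string or fragment before looking at the extension.
    const path = input.replace(/[?#].*$/, '').toLowerCase();

    if (path.endsWith('sitemap.xml')) return 'Sitemap';
    if (path.endsWith('.pdf')) return 'Pdf';
    if (path.endsWith('.csv')) return 'Csv';
    if (isUrl) return 'Web';
    // Ambiguity: an 11-character directory name would also match here.
    if (YOUTUBE_ID.test(input)) return 'YoutubeVideo';
    return 'LocalPath';
}
```

The extension checks run before the generic URL check so that a PDF served over HTTP is not treated as a web page. A check based on file contents, as the release describes, would be more reliable than extensions alone.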
Alternatively, you can now add loaders by passing in an object with the correct parameters without invoking the loader constructor directly. That is -
//Before
.addLoader(new WebLoader({ urlOrContent: 'https://www.biography.com/business-leaders/steve-jobs' }))
//Now
.addLoader({ type: 'Web', urlOrContent: 'https://www.biography.com/business-leaders/steve-jobs' })
This makes for simpler reading and is consistent across all loaders.
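One way an object form like this can work is a discriminated union keyed on type, resolved to a constructor through a switch. The sketch below illustrates that pattern only. StubWebLoader, StubCsvLoader and createLoader are hypothetical stand-ins, and the 'Csv' tag is an assumption (only 'Web' appears above); none of this is the library's internal code.

```typescript
// Stand-in loader classes; the real loaders do far more than hold options.
class StubWebLoader {
    options: { urlOrContent: string };
    constructor(options: { urlOrContent: string }) {
        this.options = options;
    }
}

class StubCsvLoader {
    options: { filePathOrUrl: string };
    constructor(options: { filePathOrUrl: string }) {
        this.options = options;
    }
}

// Each variant carries a `type` tag plus that loader's own parameters.
type LoaderSpec =
    | { type: 'Web'; urlOrContent: string }
    | { type: 'Csv'; filePathOrUrl: string };

function createLoader(spec: LoaderSpec): StubWebLoader | StubCsvLoader {
    // The compiler narrows `spec` in each branch, so only valid fields are readable.
    switch (spec.type) {
        case 'Web':
            return new StubWebLoader({ urlOrContent: spec.urlOrContent });
        case 'Csv':
            return new StubCsvLoader({ filePathOrUrl: spec.filePathOrUrl });
    }
}
```

With this shape, a mistyped tag or a parameter that belongs to a different loader is rejected at compile time rather than at run time.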
List of added loaders
The library now keeps a list of previously added loaders in its cache. So, you can now get the list of all loaded content even between restarts. This is useful if you want to keep the state of the RAG application within the library itself.
You can get the list of loaders by calling -
await ragApplication.getLoaders()
The list of added loaders will include all loaders, even those that were implicitly invoked by another loader. To understand this better, let's look at the LocalPathLoader. This loader uses the file system API to scan files and directories. Once it infers the file type, it internally calls other loaders to add and process PPT, CSV, HTML, etc. files. When this happens, the getLoaders() method will give you the list of all loaders including LocalPathLoader, CsvLoader, WebLoader, etc. with metadata around what each loader worked on.
Note: All the data around this is recorded in the cache attached. Therefore this functionality only works when you have a cache set.
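The bookkeeping described here can be pictured as each loader writing a small record into the cache when it runs, including loaders started by other loaders. The following is a rough sketch under that assumption. LoaderRecord, InMemoryCache and recordLoader are hypothetical names, and an in-memory map stands in for whatever persistent cache is attached.

```typescript
interface LoaderRecord {
    uniqueId: string;
    type: string;
    // Free-form details about what the loader worked on.
    metadata: Record<string, string>;
}

// Stand-in for a persistent cache; a real one would survive restarts.
class InMemoryCache {
    private records = new Map<string, LoaderRecord>();

    recordLoader(record: LoaderRecord): void {
        // Keyed by id, so adding the same loader twice does not duplicate it.
        this.records.set(record.uniqueId, record);
    }

    getLoaders(): LoaderRecord[] {
        return Array.from(this.records.values());
    }
}

const cache = new InMemoryCache();
// A directory loader registers itself, then each loader it delegates to.
cache.recordLoader({ uniqueId: 'path-1', type: 'LocalPathLoader', metadata: { path: 'local/' } });
cache.recordLoader({ uniqueId: 'csv-1', type: 'CsvLoader', metadata: { filePathOrUrl: 'local/data.csv' } });
// Re-adding the same file overwrites the earlier record.
cache.recordLoader({ uniqueId: 'csv-1', type: 'CsvLoader', metadata: { filePathOrUrl: 'local/data.csv' } });
```

Because the records live in the cache rather than in the application object, the list is only available when a cache is configured, which matches the note above.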
CSV Loader
You can now add CSV files from both local paths and web URLs using the CSV loader. To add a CSV file (or URL) to your embeddings, use CsvLoader. The library will parse the CSV and add each row to its vector database.
.addLoader(new CsvLoader({ filePathOrUrl: '...' }))
Note: You can control how the CsvLoader parses the file in great detail by passing in the optional csvParseOptions constructor parameter.
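As an illustration of what "each row becomes an entry" can look like, the sketch below turns a small CSV string into one text chunk per row. It is a deliberately naive stand-in, not the parser the library uses: csvToRowChunks is a hypothetical helper, it splits on plain commas and newlines, and it does not handle quoted fields or escapes.

```typescript
// Naive CSV handling for illustration only; quoted fields are NOT supported.
function csvToRowChunks(csv: string): string[] {
    const lines = csv.split(/\r?\n/).filter((line) => line.trim() !== '');
    const headerLine = lines.shift();
    if (headerLine === undefined) return [];

    const headers = headerLine.split(',').map((h) => h.trim());
    // Pair every cell with its column name so each chunk is self-describing.
    return lines.map((line) => {
        const cells = line.split(',').map((c) => c.trim());
        return headers.map((h, i) => `${h}: ${cells[i] ?? ''}`).join(', ');
    });
}

const chunks = csvToRowChunks('name,year\nTesla,2003\nApple,1976');
```

Keeping the column name next to each value gives every row enough context to be embedded and retrieved on its own.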
GitHub workflow
The library now uses GitHub Actions to verify that each PR compiles and builds on Node versions 18, 20 and 22. This runs automatically on every PR.