Replies: 1 comment
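Hello, @ehenon! I'm here to help you with any bugs, questions, or contributions. Let's tackle this issue together while we wait for a human maintainer to assist us.

To manage warnings when loading documents with LangChainJS, you can use the MultiFileLoader class and choose how it treats unknown file types through UnknownHandling. Here is an example:

import { MultiFileLoader, UnknownHandling } from "path/to/multi_file_loader";
import { PDFLoader } from "@langchain/community/document_loaders/fs/pdf";

const multiFileLoader = new MultiFileLoader(
  // file2.txt has no loader registered below, so it is treated as an unknown file type
  ["path/to/file1.pdf", "path/to/file2.txt"],
  {
    ".pdf": (path: string) => new PDFLoader(path),
  },
  UnknownHandling.Warn // This will log a warning for unknown file types
);
const docs = await multiFileLoader.load();
console.log({ docs });

In the load method, each file is dispatched to a loader by its extension, and files without a registered loader are handled according to the configured UnknownHandling value:

public async load(): Promise<Document[]> {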
  const documents: Document[] = [];
  for (const filePath of this.filePaths) {
    const fullPath = resolve(filePath);
    const fileStat = await stat(fullPath);
    // Directories are skipped with a warning rather than traversed.
    if (fileStat.isDirectory()) {
      console.warn(`Ignoring directory: ${fullPath}`);
      continue;
    }
    // Loaders are looked up by file extension, e.g. ".pdf".
    const loaderFactory = this.loaders[extname(fullPath)];
    if (loaderFactory) {
      const loader = loaderFactory(fullPath);
      documents.push(...(await loader.load()));
    } else {
      switch (this.unknown) {
        case UnknownHandling.Ignore:
          break;
        case UnknownHandling.Warn:
          console.warn(`Unknown file type: ${fullPath}`);
          break;
        case UnknownHandling.Error:
          throw new Error(`Unknown file type: ${fullPath}`);
        default:
          throw new Error(`Unknown unknown handling: ${this.unknown}`);
      }
    }
  }
  return documents;
}

By setting the unknown-handling argument to UnknownHandling.Warn, a warning is logged for each file whose type has no registered loader. Additionally, you can control the logging level and potentially suppress or customize warning messages from underlying libraries, such as those used for reading PDFs, through the verbosity option:

/**
* - Controls the logging level; the constants
* from {@link VerbosityLevel } should be used.
*/
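verbosity?: number | undefined;

The stopAtErrors option determines whether certain promises are rejected when the PDF data cannot be parsed, instead of attempting to recover whatever is possible:

/**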
* - Reject certain promises, e.g.
* `getOperatorList`, `getTextContent`, and `RenderTask`, when the associated
* PDF data cannot be successfully parsed, instead of attempting to recover
* whatever possible of the data. The default value is `false`.
*/
stopAtErrors?: boolean | undefined;

These options provide flexibility in handling logging and error management when working with PDFs in LangChainJS [3][4].
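
On the question of not ignoring every warning: one possible approach is to silence only the messages you have already reviewed and let everything else through. The sketch below is plain TypeScript and is not part of LangChainJS; `withFilteredWarnings` and the patterns passed to it are hypothetical names chosen for illustration. It only intercepts `console.warn`, so a library that reports through another channel (for example `console.log`) would need the same treatment for that method.

```typescript
// Hypothetical helper, not a LangChainJS API.
type WarnFn = (...args: unknown[]) => void;

async function withFilteredWarnings<T>(
  ignored: RegExp[], // patterns for warnings you have reviewed (avoid the "g" flag)
  task: () => Promise<T>,
): Promise<{ result: T; suppressed: string[] }> {
  const originalWarn: WarnFn = console.warn;
  const suppressed: string[] = [];

  // Temporarily replace console.warn with a filtering wrapper.
  console.warn = (...args: unknown[]) => {
    const message = args.map(String).join(" ");
    if (ignored.some((pattern) => pattern.test(message))) {
      // Keep a record instead of dropping the message silently.
      suppressed.push(message);
      return;
    }
    originalWarn.apply(console, args);
  };

  try {
    const result = await task();
    return { result, suppressed };
  } finally {
    // Always restore the original function, even if the task throws.
    console.warn = originalWarn;
  }
}
```

A call would wrap the loading step, for example `withFilteredWarnings([/some reviewed message/], () => multiFileLoader.load())`, where the pattern is a placeholder for a message you have decided is safe to ignore. Returning the suppressed messages keeps them available for logging or auditing, so new, unreviewed warnings still reach the console.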
Checked other resources
Commit to Help
Example Code
Description
When using this method to load input documents with LangChainJS, several types of warnings can appear in the console, for example:
After some research, I understood that some of them can be ignored because they are “false positives” coming from underlying libraries, such as those used to read PDFs. In the context of a LangChainJS pipeline, how can we manage these warnings cleanly? I suppose we shouldn't ignore them all?
System Info
@langchain/community v0.2.27
@langchain/core v0.2.23
Node v20.14.0
pnpm v9.3.0