[Feature] Enable crawling of websites that require credentials via SSO or 2FA #1040
Open
Labels
enhancement
New feature or request
Search before asking
Component
Transforms/Other, Other
Feature
Add credential support to the web2parquet transform. Currently, the web2parquet transform fails if the site being crawled requires any sort of credentials. This matters for RAG, fine-tuning, and search-and-retrieval use cases, where customers want to crawl their own internal websites and retrieve internal documents for use in their LLM applications.
cc: @hmtbr Do you know whether data-prep-connector supports credentials? If so, we would need to extend the web2parquet transform. If not, is it possible to extend data-prep-connector to support credentials?
cc: @Qiragg
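To make the request concrete, here is a rough sketch of one possible shape for this, not a description of how web2parquet or data-prep-connector work today. It assumes the user has already completed the SSO/2FA login out of band (for example in a browser) and can hand the crawler a token or session cookie. The function name `build_authenticated_request` and its `headers`/`cookies` parameters are hypothetical; the URL and credential values are placeholders. Only the Python standard library is used.

```python
from urllib.request import Request


def build_authenticated_request(url, headers=None, cookies=None):
    """Hypothetical helper: attach caller-supplied credentials to a request.

    Assumes the token or session cookie was obtained out of band,
    e.g. after the user completed an SSO/2FA login in a browser.
    """
    request = Request(url)
    # Copy any extra headers (such as an Authorization token) onto the request.
    for name, value in (headers or {}).items():
        request.add_header(name, value)
    # Serialize cookies into a single Cookie header.
    if cookies:
        cookie_header = "; ".join(f"{k}={v}" for k, v in cookies.items())
        request.add_header("Cookie", cookie_header)
    return request


# Placeholder values only; a real crawl would read these from user config.
req = build_authenticated_request(
    "https://intranet.example.com/docs",
    headers={"Authorization": "Bearer <token>"},
    cookies={"session": "<cookie-value>"},
)
```

If the crawler accepted something like these `headers` and `cookies` options and applied them to every request it issues, sites behind SSO could be crawled without the transform having to drive the login flow itself.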
Are you willing to submit a PR?