Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added bilibili loader #2673

Merged

Conversation

liaokongVFX
Copy link
Contributor

@liaokongVFX liaokongVFX commented Apr 10, 2023

I've added a bilibili loader, bilibili is a very active video site in China and I think we need this loader.

Example:

from langchain.document_loaders.bilibili import BiliBiliLoader

loader = BiliBiliLoader(
       ["https://www.bilibili.com/video/BV1xt411o7Xu/",
       "https://www.bilibili.com/video/av330407025/"]
)
docs = loader.load()

@liaokongVFX liaokongVFX changed the title add bilibili loader Added bilibili loader Apr 10, 2023
Copy link
Contributor

@hwchase17 hwchase17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is great - thanks!

@liaokongVFX liaokongVFX requested a review from hwchase17 April 11, 2023 15:13

def _get_bilibili_subs_and_info(self, url: str) -> Tuple[str, dict]:
try:
import requests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The requests package is a core dependency of langchain so we can import in the outer context (at the top of the file)!

The bilibili_api import looks great, however!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks!


raw_transcript_with_meta_info = f"""
Video Title: {video_info['title']},
description: {video_info['desc']}\n
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: {video_info['desc']}\n
Description: {video_info['desc']}\n

@vowelparrot vowelparrot changed the base branch from master to vwp/bilibili April 11, 2023 17:37
@vowelparrot vowelparrot merged this pull request into langchain-ai:vwp/bilibili Apr 11, 2023
vowelparrot pushed a commit that referenced this pull request Apr 11, 2023
I've added a bilibili loader, bilibili is a very active video site in
China and I think we need this loader.

Example:
```python
from langchain.document_loaders.bilibili import BiliBiliLoader

loader = BiliBiliLoader(
       ["https://www.bilibili.com/video/BV1xt411o7Xu/",
       "https://www.bilibili.com/video/av330407025/"]
)
docs = loader.load()
```
vowelparrot added a commit that referenced this pull request Apr 11, 2023
I've added a bilibili loader, bilibili is a very active video site in
China and I think we need this loader.

Example:
```python
from langchain.document_loaders.bilibili import BiliBiliLoader

loader = BiliBiliLoader(
       ["https://www.bilibili.com/video/BV1xt411o7Xu/",
       "https://www.bilibili.com/video/av330407025/"]
)
docs = loader.load()
```

Co-authored-by: 了空 <568250549@qq.com>
@yuekaizhang yuekaizhang mentioned this pull request May 17, 2023
eyurtsev pushed a commit that referenced this pull request May 18, 2023
# Fix bilibili api import error

bilibili-api package is depracated and there is no sync module.

<!--
Thank you for contributing to LangChain! Your PR will appear in our next
release under the title you set. Please make sure it highlights your
valuable contribution.

Replace this with a description of the change, the issue it fixes (if
applicable), and relevant context. List any dependencies required for
this change.

After you're done, someone will review your PR. They may suggest
improvements. If no one reviews your PR within a few days, feel free to
@-mention the same people again, as notifications can get lost.
-->

<!-- Remove if not applicable -->

Fixes #2673 #2724 

## Before submitting

<!-- If you're adding a new integration, include an integration test and
an example notebook showing its use! -->

## Who can review?

Community members can review the PR once tests pass. Tag
maintainers/contributors who might be interested:
@vowelparrot  @liaokongVFX 

<!-- For a quicker response, figure out the right person to tag with @

        @hwchase17 - project lead

        Tracing / Callbacks
        - @agola11

        Async
        - @agola11

        DataLoaders
        - @eyurtsev

        Models
        - @hwchase17
        - @agola11

        Agents / Tools / Toolkits
        - @vowelparrot
        
        VectorStores / Retrievers / Memory
        - @dev2049
        
 -->
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants