Demo - Arxiv Summarizer #112

Merged 8 commits into ploomber:main on Feb 13, 2024

@bryannho (Contributor) commented Feb 10, 2024

  • A RAG demo for Arxiv articles, built with Arxiv and OpenAI and no other RAG libraries
  • Uses Solara for the chat interface. Important: when running locally, make sure you're using solara==1.25.0; there is a CSS issue in 1.26.0

Closes #87


📚 Documentation preview 📚: https://ploomber-doc--112.org.readthedocs.build/en/112/

@bryannho marked this pull request as ready for review February 10, 2024 02:18

@edublancas (Contributor)

the question that retrieves the topic can go in two ways:

  1. either the user is asking about a category that exists in the arxiv taxonomy
  2. it doesn't exist

since there are too many classifications, you can just pick a few from the taxonomy list.

to know which one to use, you can do something similar to what I did in the news RAG: use the LLM as a topic classifier that outputs either a taxonomy category (e.g. cs.AI) or no category.

in the first case, we can use arxiv's API to fetch the papers for that category (as you're doing via cat:ID); in the second case, we can pass the search terms to the API, e.g. large+language+models. this will allow the bot to answer questions about topics that do not exist in the taxonomy:

image
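
A minimal sketch of the two query shapes described above, assuming the public arXiv API endpoint at `export.arxiv.org` and its `search_query`, `sortBy`, `sortOrder`, and `max_results` parameters. The function name `build_arxiv_url` and the choice of the `all:` field prefix for free-text terms are illustrative assumptions, not code from this PR.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_url(category=None, terms=None, max_results=10):
    """Build an arXiv API URL from a taxonomy category or free-text terms.

    Hypothetical helper: a category (e.g. "cs.AI") takes priority;
    otherwise the user's search terms are sent as a free-text query.
    """
    if category:
        search_query = f"cat:{category}"
    elif terms:
        # Assumption: search all fields when no category matches.
        search_query = f"all:{terms}"
    else:
        raise ValueError("need either a category or search terms")

    params = {
        "search_query": search_query,
        "sortBy": "submittedDate",   # most recent papers first
        "sortOrder": "descending",
        "max_results": max_results,
    }
    # urlencode percent-encodes ":" and turns spaces into "+"
    return f"{ARXIV_API}?{urlencode(params)}"


category_url = build_arxiv_url(category="cs.AI")
terms_url = build_arxiv_url(terms="large language models")
```

Building the URL separately from fetching it keeps the routing logic easy to test without network access.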

@edublancas (Contributor)

also the summary feature doesn't appear to be working, it just prints the paper authors

image

image

also, it should be clear what kind of things this bot can do: fetch recent papers, summarize papers, get links to download papers (anything else?)

@edublancas (Contributor)

let's make this a link so clicking on it opens a tab:

image

@bryannho (Contributor, Author) commented Feb 13, 2024

@edublancas Ready for review.

> the question that retrieves the topic can go in two ways:
>
>   1. either the user is asking about a category that exists in the arxiv taxonomy
>   2. it doesn't exist

The demo still uses the LLM as a topic classifier, but if the classifier can't find a matching category, it now passes the user's search terms to the Arxiv API.
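
The fallback could be sketched roughly as follows. This is not the PR's code: `route_topic` and `ALLOWED_CATEGORIES` are hypothetical names, the category set is a small illustrative subset of the arXiv taxonomy, and the classifier output is assumed to be a plain string returned by the LLM.

```python
# Illustrative subset of arXiv categories the classifier may return.
ALLOWED_CATEGORIES = {"cs.AI", "cs.CL", "cs.LG", "cs.CV", "stat.ML"}


def route_topic(classifier_output, user_terms, allowed=ALLOWED_CATEGORIES):
    """Decide whether to query by category or by the user's own terms.

    Returns ("category", "<id>") when the classifier output matches a
    known category (case-insensitively), otherwise ("terms", "<terms>").
    """
    label = classifier_output.strip().lower()
    by_lowercase = {category.lower(): category for category in allowed}
    match = by_lowercase.get(label)
    if match is not None:
        return ("category", match)
    # Anything else (e.g. "none" or free-form text) falls back to search terms.
    return ("terms", user_terms.strip())


hit = route_topic(" cs.ai\n", "recent AI papers")
miss = route_topic("none", "large language models")
```

Validating the LLM output against a fixed set means an unexpected answer degrades to a plain search instead of an invalid `cat:` query.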

Also, users can now load a new set of articles midway through the conversation as we discussed.

> also the summary feature doesn't appear to be working, it just prints the paper authors

Fixed the summary feature here. It grabs the summary verbatim from the Arxiv result.
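
For illustration, here is one way to read the summary straight from an arXiv API response, assuming the raw Atom feed is parsed directly with the standard library. The demo may well use a client library instead; `parse_entries` and the sample feed are hypothetical, and whitespace is collapsed, so the text is verbatim apart from line breaks.

```python
import xml.etree.ElementTree as ET

# The arXiv API returns an Atom feed; its elements live in this namespace.
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}

# Placeholder feed with made-up values, shaped like an arXiv API response.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <title>A Placeholder Title</title>
    <summary>
      This is a placeholder abstract
      spanning two lines.
    </summary>
    <author><name>First Author</name></author>
    <author><name>Second Author</name></author>
  </entry>
</feed>"""


def parse_entries(feed_xml):
    """Extract title, summary, and authors from each Atom entry."""
    root = ET.fromstring(feed_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        summary = entry.findtext("atom:summary", default="", namespaces=ATOM_NS)
        authors = [
            author.findtext("atom:name", default="", namespaces=ATOM_NS)
            for author in entry.findall("atom:author", ATOM_NS)
        ]
        entries.append({
            "title": " ".join(title.split()),
            # Collapse the feed's line wrapping into single spaces.
            "summary": " ".join(summary.split()),
            "authors": authors,
        })
    return entries


entries = parse_entries(SAMPLE_FEED)
```

Keeping `summary` and `authors` as separate fields makes it harder to print one in place of the other.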

> also, it should be clear what kind of things this bot can do: fetch recent papers, summarize papers, get links to download papers (anything else?)

I clarified the initial prompt message to specify what the model can do. Let me know if this is descriptive enough.

> let's make this a link so clicking on it opens a tab

Download links are now clickable and will open in a new tab.
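
A rough sketch of how such a link can be rendered, assuming the chat output accepts raw HTML. `pdf_link` is a hypothetical helper and the URL is a placeholder; the PR's actual rendering code is not shown in this thread.

```python
from html import escape


def pdf_link(url, label="Download PDF"):
    """Return an HTML anchor that opens the given URL in a new tab."""
    safe_url = escape(url, quote=True)   # escape quotes and ampersands
    safe_label = escape(label)
    # target="_blank" opens a new tab; rel="noopener noreferrer" keeps the
    # new page from getting a handle on the originating window.
    return (
        f'<a href="{safe_url}" target="_blank" '
        f'rel="noopener noreferrer">{safe_label}</a>'
    )


link_html = pdf_link("https://arxiv.org/pdf/0000.00000")
```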

The updated version of the demo is deployed at: https://summer-wind-5194.ploomberapp.io/

@edublancas merged commit 68d8365 into ploomber:main Feb 13, 2024
1 check passed
Development

Successfully merging this pull request may close these issues.

demo: arxiv summarizer
2 participants