Crawl and Visualize NeurIPS 2022 OpenReview Data

→ Open full submission list here

→ Download datasets here

Description

This repository contains code to crawl and visualize data from the NeurIPS 2022 OpenReview site. Crawling uses parallel requests sent directly to OpenReview's API, which is much faster than driving a browser with Selenium: on the order of 10-100x. The code also saves datasets for further analysis, including all reviews and rebuttals, plus the metadata and text of the PDF files.
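As a rough illustration of the parallel-request approach, here is a minimal sketch using only the standard library. The endpoint URL, the invitation ID, the page size of 1000, and the names `fetch_page` and `crawl_parallel` are assumptions made for this example; the notebooks may organize the crawl differently.

```python
import json
import urllib.parse
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed values: neither the endpoint nor the invitation ID is stated in this README.
API_URL = "https://api.openreview.net/notes"
INVITATION = "NeurIPS.cc/2022/Conference/-/Blind_Submission"
PAGE_SIZE = 1000  # assumed maximum page size


def fetch_page(offset, limit):
    """Fetch one page of notes starting at `offset` (performs a network call)."""
    query = urllib.parse.urlencode(
        {"invitation": INVITATION, "offset": offset, "limit": limit}
    )
    with urllib.request.urlopen(f"{API_URL}?{query}", timeout=30) as resp:
        return json.load(resp).get("notes", [])


def crawl_parallel(total, fetch=fetch_page, page_size=PAGE_SIZE, workers=8):
    """Request all pages concurrently and return the notes in offset order."""
    offsets = range(0, total, page_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input offsets.
        pages = pool.map(lambda offset: fetch(offset, page_size), offsets)
        return [note for page in pages for note in page]
```

Threads suit this workload because each request spends most of its time waiting on the network. The `fetch` parameter is injectable so the paging logic can be exercised offline with a stub.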

Usage

Install the dependencies:

pip install -r requirements.txt

Then run the notebooks under the notebooks/ folder in order:

0a. Parse data.ipynb: crawl the data from the OpenReview website: all paper metadata (such as title, abstract, authors, etc.), reviews, and rebuttals.
0b. Crawl PDF.ipynb: parse the PDF files of the papers to extract the main text.
1. Plots.ipynb: visualize the data using word clouds, bar charts, and other plots.
2. Save Website.ipynb: save the website as a static HTML file.
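For the last step, a static page can be produced with plain string templating. The sketch below is only an assumption about how such a page might be written: the `title` and `avg_rating` fields and the function names `render_submissions` and `save_page` are hypothetical and are not taken from the notebook.

```python
import html
from pathlib import Path


def render_submissions(papers):
    """Build a standalone HTML page with one table row per paper."""
    rows = "\n".join(
        "<tr><td>{}</td><td>{:.2f}</td></tr>".format(
            html.escape(p["title"]), p["avg_rating"]
        )
        for p in papers
    )
    return (
        '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
        "<title>Submissions</title></head><body>\n"
        "<table>\n<tr><th>Title</th><th>Average rating</th></tr>\n"
        f"{rows}\n</table>\n</body></html>\n"
    )


def save_page(papers, path):
    """Write the rendered page to `path` as UTF-8."""
    Path(path).write_text(render_submissions(papers), encoding="utf-8")
```

Titles are passed through `html.escape` so that characters such as `<` in a paper title cannot break the markup.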

Statistics

Total submitted papers: 4874
Average rating: 4.94
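A figure like the average rating can be recomputed from the saved reviews. The sketch below assumes each review is a dict whose `rating` field starts with the numeric score (for example `"6: Weak Accept"`), and it averages over individual reviews; whether the number above is averaged per review or per paper is not stated here, and `parse_rating` and `average_rating` are hypothetical helper names.

```python
import re
from statistics import mean


def parse_rating(rating):
    """Return the leading integer of a rating string, or None if absent."""
    match = re.match(r"\s*(\d+)", str(rating))
    return int(match.group(1)) if match else None


def average_rating(reviews):
    """Average the numeric ratings over all reviews that have one."""
    scores = [parse_rating(r.get("rating", "")) for r in reviews]
    scores = [s for s in scores if s is not None]
    return round(mean(scores), 2) if scores else None
```

Reviews without a parsable rating are skipped rather than counted as zero, so they do not pull the average down.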

Rating distribution

Top 50 Keywords

Keywords vs Ratings

Wordcloud

Review Lengths

Review Lengths by Rating

Review Lengths by Confidence

Paper Lengths (pages) vs Rating

Top 50 Authors

Feedback

Feel free to open an issue or a pull request if you have any feedback or suggestions!

Acknowledgements

This repository is inspired by the following:

Initial idea: https://github.com/evanzd/ICLR2021-OpenReviewData
Previous year's repo: https://github.com/fedebotu/ICLR2022-OpenReviewData
For web formatting and API requests: https://github.com/weigq/neurips2021_stats and https://github.com/weigq/iclr2022_stats
