SemanticScholar Access

bl4ckscor3 edited this page Mar 31, 2019 · 15 revisions

To complement the gathered information, it is possible to access the data on SemanticScholar. SemanticScholar provides two ways of accessing its data, each with its own advantages and disadvantages.
In the code, we provide various functions to simplify access to SemanticScholar, but before explaining them, let's take a look at the two APIs and their differences.
Since not all of these functions are directly callable via our own API, this article explains which functions are available in the code and how to use them, e.g. to add functionality to our API in the future.
To see how to access SemanticScholar Data via our API, look here

SemanticScholar's APIs

1. Publicly documented API

The public API is comparatively well documented and provides quick and easy access to basic information about a specific author or paper via a simple GET request. Unfortunately, the request requires SemanticScholar's ID of the entity in question, which creates the need for the second API.

  • Paper Search
    The most useful function of this API is to get all papers of an author. This is implemented in the class S2PaperSearch. However, it is recommended to use the Author Search of the internal API instead, since it returns far more information about the author and their papers than the Paper Search does.
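
A request against the public API can be sketched as follows. The base URL and the `author/{id}` path are assumptions about the publicly documented API and should be checked against SemanticScholar's documentation; `PublicApiSketch` and its methods are hypothetical names for illustration and do not show how S2PaperSearch is implemented.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper, not part of the project's code.
class PublicApiSketch {
    // Assumed base URL of the publicly documented API; verify before relying on it.
    static final String BASE_URL = "https://api.semanticscholar.org/v1";

    // Builds the URI for an author lookup. The S2ID has to be known beforehand.
    static URI authorUri(String s2Id) {
        if (s2Id == null || s2Id.isBlank()) {
            throw new IllegalArgumentException("The public API needs a SemanticScholar ID");
        }
        return URI.create(BASE_URL + "/author/" + s2Id.trim());
    }

    // Sends the GET request and returns the raw JSON body of the response.
    static String fetchAuthorJson(String s2Id) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(authorUri(s2Id)).GET().build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Unexpected status " + response.statusCode());
        }
        return response.body();
    }

    public static void main(String[] args) {
        // Only builds the URI; call fetchAuthorJson to actually contact the server.
        String id = args.length > 0 ? args[0] : "12345"; // placeholder ID
        System.out.println(authorUri(id));
    }
}
```

The sketch makes the limitation visible: without an S2ID there is nothing to put into the URL, so the ID has to come from somewhere else first.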

2. Internal API

A little investigation of SemanticScholar's web interface showed that it is nothing but a front end that passes the user input to an API, which then returns a JSON object containing the search results. Depending on what is searched for (person, paper, ...), different kinds of requests can be made:

  • General Search
    The General Search can be used to get an overview of the search term, as it returns all information matching the given String, e.g. matching author names, paper titles, etc. This can be used to retrieve the SemanticScholar ID (S2ID) of a paper or person. As it returns all relevant information about matched papers, it can also be used to complement papers.
    In the code: Use the class S2GeneralSearch to perform a General Search.
  • Author Search
    Used to search explicitly for an author. It returns more precise information about the chosen author than the General Search, including influences (in both directions), citations, papers, co-authors, etc. By passing the number of papers the author has published (which can be determined by a General Search), all papers of the author, including all the relevant information about them, can be retrieved in a single Author Search request.
    In the code: Use the class S2AuthorSearch to perform an Author Search.
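
The interplay of the two searches can be sketched as follows: a General Search resolves a name to an S2ID and a paper count, and an Author Search then uses both to fetch all papers in one request. The types `AuthorHit`, `GeneralSearch`, `AuthorSearch` and `TwoStepLookupSketch` are hypothetical stand-ins that only illustrate the flow; the real S2GeneralSearch and S2AuthorSearch classes talk to the internal API and may look quite different.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// All types here are hypothetical stand-ins, not the project's classes.
class AuthorHit {
    final String s2Id;
    final int paperCount;

    AuthorHit(String s2Id, int paperCount) {
        this.s2Id = s2Id;
        this.paperCount = paperCount;
    }
}

interface GeneralSearch {
    // Returns the most relevant author match for a name, if there is one.
    Optional<AuthorHit> findAuthor(String name);
}

interface AuthorSearch {
    // Returns up to pageSize paper titles of the author with the given S2ID.
    List<String> papersOf(String s2Id, int pageSize);
}

class TwoStepLookupSketch {
    // Step 1: resolve the name to an S2ID and a paper count.
    // Step 2: request all papers at once by using the paper count as page size.
    static List<String> allPapers(String name, GeneralSearch general, AuthorSearch author) {
        return general.findAuthor(name)
                .map(hit -> author.papersOf(hit.s2Id, hit.paperCount))
                .orElse(List.of());
    }

    public static void main(String[] args) {
        // In-memory fakes replace the network calls for this demonstration.
        Map<String, List<String>> papers = Map.of("42", List.of("Paper A", "Paper B", "Paper C"));
        GeneralSearch general = name -> name.equals("Jane Doe")
                ? Optional.of(new AuthorHit("42", 3))
                : Optional.empty();
        AuthorSearch author = (id, size) -> {
            List<String> all = papers.getOrDefault(id, List.of());
            return all.subList(0, Math.min(size, all.size()));
        };
        System.out.println(allPapers("Jane Doe", general, author));
    }
}
```

Keeping the two steps behind small interfaces is a design choice of this sketch; it allows the flow to be exercised without network access.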

The functions

The class S2APIFunctions offers the following public static functions to access the SemanticScholar API:

  • ArrayList<Paper> getAllPapersByAuthor(Person author)
    Performs a Paper Search and returns a List of all the papers. As described above, consider using boolean completeAuthorInformationByAuthorSearch(Person author, boolean overwrite) instead, as the Author Search delivers more information for the same number of requests.

  • String getAuthorsS2ID(Person author)
    Performs a General Search and returns the SemanticScholar ID of the author with a matching name and the highest relevance on SemanticScholar.
    Optional Improvement: Currently, namesakes cannot be differentiated. A future improvement could be to gather all authors with a matching name from SemanticScholar and then choose the author with the most matching attributes (e.g. the published papers) in addition to the name.

  • boolean completeAuthorInformationByAuthorSearch(Person author, boolean overwrite)
    Performs an Author Search and adds the gathered information to the given Person object in place. The second parameter, boolean overwrite, decides whether already existing information is overwritten by the newly gathered information.
    Optional Improvement: If the SemanticScholar ID is not already set, the author with a matching name and the highest relevance on SemanticScholar is chosen. As above, namesakes are not differentiated. Again, an improvement could be to gather all authors with a matching name from SemanticScholar and then choose the author with the most matching attributes (e.g. the published papers) in addition to the name.

  • void completePaperInformationByGeneralSearch(Paper paper, boolean overwrite)
    Performs a General Search and adds the gathered information to the given Paper object in place. Again, boolean overwrite decides whether already existing information is overwritten by the newly gathered information.

  • int getCitationAmountByPaper(Paper paper)
    Performs a General Search on the given paper and returns the number of citations of this paper.
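
The namesake improvement suggested for getAuthorsS2ID and completeAuthorInformationByAuthorSearch could be approached roughly like this: among all candidates with a matching name, pick the one whose paper titles overlap most with the papers already known locally. `Candidate` and `NamesakeSketch` are hypothetical, and title overlap is only one possible heuristic; none of this exists in S2APIFunctions yet.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.Set;

// Hypothetical candidate, as it might be extracted from a General Search result.
class Candidate {
    final String s2Id;
    final List<String> paperTitles;

    Candidate(String s2Id, List<String> paperTitles) {
        this.s2Id = s2Id;
        this.paperTitles = paperTitles;
    }
}

class NamesakeSketch {
    // Normalizes titles so that case and surrounding whitespace do not matter.
    static Set<String> normalize(Collection<String> titles) {
        Set<String> result = new HashSet<>();
        for (String title : titles) {
            result.add(title.trim().toLowerCase(Locale.ROOT));
        }
        return result;
    }

    // Counts how many of the candidate's titles are also known locally.
    static int overlap(Set<String> knownTitles, Candidate candidate) {
        Set<String> shared = normalize(candidate.paperTitles);
        shared.retainAll(knownTitles);
        return shared.size();
    }

    // Picks the candidate with the largest overlap. On a tie the earlier candidate
    // wins, so the order of the search result (assumed to be by relevance) decides.
    static Optional<Candidate> pickBest(List<Candidate> candidates, Collection<String> knownTitles) {
        Set<String> known = normalize(knownTitles);
        Candidate best = null;
        int bestScore = -1;
        for (Candidate candidate : candidates) {
            int score = overlap(known, candidate);
            if (score > bestScore) {
                best = candidate;
                bestScore = score;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

If no titles are known locally, every candidate scores zero and the first one is returned, which matches the current behavior of choosing the most relevant match.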