This repository has been archived by the owner on Sep 9, 2022. It is now read-only.

returning more than 1000 DOIs using EuropePMC database #242

Closed
banderson10 opened this issue Jul 22, 2021 · 3 comments

Comments

@banderson10

Hello,

I have two questions about using ft_search() to return DOIs from the EuropePMC database. The questions are below, each with an example.

  1. When I run the first example below, which is taken from the fulltext manual, the 'a' object contains the 1,000 DOIs in a$europmc$data$doi.

res <- ft_search(query="ecology", from='europmc')

a <- ft_search(query="ecology", from='europmc', limit=1000,
euroopts = list(cursorMark = res$europmc$cursorMark))

When I change the query to my own search term, ft_search() does not return any DOI values: a1$europmc$data$doi does not exist in the a1 object.

res1 <- ft_search(query="spanish flu", from='europmc')

a1 <- ft_search(query="spanish flu", from='europmc', limit=1000,
euroopts = list(cursorMark = res1$europmc$cursorMark))

I need the DOIs because I am searching other databases with ft_search(), and I am using the DOI as the unique identifier to remove duplicates before I fetch the full text xml files.
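The de-duplication step described above could look roughly like the following. This is a minimal base R sketch, not something taken from the fulltext package: the helper name dedupe_by_doi and the example data frame are hypothetical, and it assumes the combined search results sit in a data frame with a doi column. DOIs are compared case-insensitively, and rows without a DOI are dropped in this sketch.

```r
# Hypothetical helper: keep the first row for each DOI.
# Assumes `df` is a data frame with a character column named `doi`.
dedupe_by_doi <- function(df) {
  # Normalise before comparing: DOIs are case-insensitive
  key <- tolower(trimws(df$doi))
  # Drop rows with a missing or empty DOI, then drop repeats
  keep <- !is.na(key) & key != "" & !duplicated(key)
  df[keep, , drop = FALSE]
}

# Made-up example rows standing in for hits from several databases
hits <- data.frame(
  doi    = c("10.1/A", "10.1/a ", NA, "10.2/b"),
  source = c("europmc", "plos", "plos", "crossref"),
  stringsAsFactors = FALSE
)
unique_hits <- dedupe_by_doi(hits)
```

Records that have no DOI at all would need a different key (a title or PMID, for example) if they should be kept.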

  2. Obtaining more than 1,000 DOIs from a EuropePMC search.

I have read issue #184 for this package, in which the author explains that you have to use a cursor to 'page through' the query results. Using the example in the fulltext manual, as shown below, the query returns 416,312 hits.

res <- ft_search(query='ecology', from='europmc')
res$europmc

You can then use the cursorMark argument to 'page through' the results. The code below will return the first 1,000 hits.

a2 <- ft_search(query='ecology', from='europmc', limit=1000,
euroopts = list(cursorMark = res$europmc$cursorMark))

The question is how to obtain the next 1,000 hits, then the next 1,000, and so on. For example, what if you wanted to obtain all 416,312 DOIs?
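One possible shape for such a loop, sketched under assumptions rather than confirmed against the package: each call returns a page of results plus a cursor for the next page, and the loop feeds that cursor back in until a page comes back empty or the cursor stops changing. The names page_all and fetch_page are hypothetical. Whether ft_search() exposes the next cursor as $europmc$cursorMark on every call is assumed from the usage above, not verified.

```r
# Generic cursor-paging loop.
# `fetch_page` is a hypothetical function: it takes a cursor and returns
# list(data = <data frame>, cursorMark = <cursor for the next page>).
page_all <- function(fetch_page, first_cursor, max_pages = Inf) {
  pages <- list()
  cursor <- first_cursor
  n <- 0
  while (n < max_pages) {
    res <- fetch_page(cursor)
    # Stop when a page comes back empty
    if (is.null(res$data) || nrow(res$data) == 0) break
    n <- n + 1
    pages[[n]] <- res$data
    # Stop when the cursor is missing or no longer advances
    if (is.null(res$cursorMark) || identical(res$cursorMark, cursor)) break
    cursor <- res$cursorMark
  }
  pages  # a list with one data frame per page
}

# Assumed adapter for fulltext (not run here; needs the package and network):
# fetch_europmc <- function(cursor) {
#   ft_search(query = "ecology", from = "europmc", limit = 1000,
#             euroopts = list(cursorMark = cursor))$europmc
# }
# pages <- page_all(fetch_europmc, first_cursor = res$europmc$cursorMark)
```

The pages are returned as a list instead of being combined with rbind(), since pages may not all share the same columns (the doi column, for instance, can be absent). Setting max_pages to a small number is a sensible way to try this out before requesting several hundred pages.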

Thank you for any advice/suggestions you can provide!

Billie

@sckott
Contributor

sckott commented Jul 27, 2021

hi @banderson10 i've changed jobs and I haven't been able to find a new maintainer for this pkg yet

@banderson10
Author

banderson10 commented Jul 27, 2021 via email

@maelle
Contributor

maelle commented Sep 9, 2022

This repository is about to be archived.
If you develop a related package, it might be in scope for https://ropensci.org/software-review/

@maelle maelle closed this as completed Sep 9, 2022