I am working on openalexPro (https://github.com/rkrug/openalexPro), and I finally have it working. It downloads the JSON files (using adaptations of api_request() and oa_request()), converts them to Parquet format, converts the abstracts from the inverted index and, in addition, creates a citation for each work (first author, followed by "et al.", the second author, or nothing, then (publication_year)). All of this happens completely in DuckDB, which works very nicely and also populates missing fields with the correct structure, provided the field is present in at least one work. Finally, the data are read into R as a tibble.
Now I am looking at compatibility with openalexR oa_fetch(output = "tibble"). I have the following questions (and I think it is easier to ask than to dig into the code, as the conversion is quite complex):
1. How are the works ordered? Is there any specific sorting, or is the order simply a by-product of the code?
2. How are the columns sorted? It seems, for example, that the doi column is moved. Is there a specific sort order?
3. Are there any columns you drop, rename, or process?
My idea is to use that "compatibility mode" (if possible) for the openalexPro system of packages, so that these (graphing, analysis, etc.) can also be used with openalexR.
I would also welcome comments on the download procedure for the JSON files, but this is not that important. My aim is, again, to keep compatibility with the input format of openalexR.
Any feedback welcome,
Rainer
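For readers unfamiliar with the inverted-index step mentioned in this post: OpenAlex delivers abstracts as a mapping from each word to the positions where it occurs. The following is a minimal base R sketch of the reconstruction, not the DuckDB implementation used in openalexPro. The function name `abstract_from_inverted_index` and the example data are assumptions made for illustration.

```r
# Rebuild plain text from an inverted index: a named list that maps
# each word to the (0-based) positions at which it occurs.
abstract_from_inverted_index <- function(inv_index) {
  if (is.null(inv_index) || length(inv_index) == 0) {
    return(NA_character_)
  }
  # Repeat each word once per position, then order the words by position.
  words <- rep(names(inv_index), lengths(inv_index))
  positions <- unlist(inv_index, use.names = FALSE)
  paste(words[order(positions)], collapse = " ")
}

# Made-up example, not real OpenAlex data.
example_index <- list(the = c(0L, 3L), cat = 1L, chased = 2L, mouse = 4L)
abstract_from_inverted_index(example_index)
# "the cat chased the mouse"
```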
Hi Rainer, I assume you mean oa_fetch(output = "tibble")?
1 - The works are not ordered. The user gets them in whichever order OpenAlex returns them.
2, 3 - The package was originally written to accommodate bibliometric analyses, so some of the columns were renamed. We're still working on tracking the coverage in #211 (works and authors are done; other entities are still TODO. Maybe the files changed there will give you a better idea).
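Because neither the row order nor the column order is guaranteed, a compatibility check between two outputs probably needs to put both into a canonical order first. A minimal base R sketch, assuming both tables share an `id` column; the helper `canonicalise` and the toy data are hypothetical and not part of openalexR or openalexPro:

```r
# Sort rows by a key column and columns by name, so that two tables
# with the same content compare equal regardless of original order.
canonicalise <- function(df, key = "id") {
  stopifnot(key %in% names(df))
  df <- df[order(df[[key]]), sort(names(df)), drop = FALSE]
  rownames(df) <- NULL  # discard row names left over from reordering
  df
}

# Toy data: same content, different row and column order.
a <- data.frame(id = c("W2", "W1"), doi = c("d2", "d1"), title = c("B", "A"))
b <- data.frame(title = c("A", "B"), id = c("W1", "W2"), doi = c("d1", "d2"))

isTRUE(all.equal(canonicalise(a), canonicalise(b)))
```

The same idea applies to tibbles, although list columns may need a more careful comparison than `all.equal()`.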
One point: in cases such as bibliography, where values are extracted from a nested field, it would be great if the source field could be specified, e.g. "biblio.volume, volume" and also "biblio, NA", to indicate where each value comes from and that biblio itself is removed.
That would make the mapping clearer and would also make it possible to rename subfields.
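The suggestion could be sketched as a small mapping table with one row per source field, where a target of NA marks a field that is removed. This is only an illustration of the proposed format: the table `field_map`, the helper `apply_field_map`, and the "biblio.issue" row are assumptions, not the mapping openalexR actually uses.

```r
# Hypothetical mapping table: source field -> output column.
# A target of NA means the source field itself is dropped.
field_map <- data.frame(
  source = c("biblio.volume", "biblio.issue", "biblio"),
  target = c("volume", "issue", NA),
  stringsAsFactors = FALSE
)

# Apply the mapping to a flattened record (a named list keyed by source field).
# Fields that are not listed in the mapping are not carried over.
apply_field_map <- function(record, field_map) {
  keep <- !is.na(field_map$target) & field_map$source %in% names(record)
  kept <- field_map[keep, ]
  out <- record[kept$source]
  names(out) <- kept$target
  out
}

# Toy record, not real OpenAlex data.
record <- list(biblio.volume = "12", biblio.issue = "3", title = "Example")
apply_field_map(record, field_map)
```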