rgbif
gives you access to data from GBIF via their REST API. GBIF versions their API - we are currently using v1
of their API. You can no longer use their old API in this package - see ?rgbif-defunct
.
Tutorials:
- rgbif vignette - the intro to the package
- issues vignette - how to clean GBIF data
- taxonomic names - examples of some confusing bits
Check out the rgbif paper for more information on this package and the sister Python and Ruby clients.
The rgbif
package API follows the GBIF API, which has the following sections:
registry
(https://www.gbif.org/developer/registry) - Metadata on datasets, and contributing organizations, installations, networks, and nodesrgbif
functions:dataset_metrics()
,dataset_search()
,dataset_suggest()
,datasets()
,enumeration()
,enumeration_country()
,installations()
,networks()
,nodes()
,organizations()
- Registry also includes the GBIF OAI-PMH service, which includes GBIF registry
data only.
rgbif
functions:gbif_oai_get_records()
,gbif_oai_identify()
,gbif_oai_list_identifiers()
,gbif_oai_list_metadataformats()
,gbif_oai_list_records()
,gbif_oai_list_sets()
species
(https://www.gbif.org/developer/species) - Species names and metadatargbif
functions:name_backbone()
,name_lookup()
,name_suggest()
,name_usage()
occurrences
(https://www.gbif.org/developer/occurrence) - Occurrences, both for the search and download APIsrgbif
functions:occ_count()
,occ_data()
,occ_download()
,occ_download_prep()
,occ_download_queue()
,occ_download_cancel()
,occ_download_cancel_staged()
,occ_download_get()
,occ_download_import()
,occ_download_list()
,occ_download_meta()
,occ_download_datasets()
,occ_download_dataset_activity()
,occ_get()
,occ_issues()
,occ_issues_lookup()
,occ_metadata()
,occ_search()
maps
(https://www.gbif.org/developer/maps) - Map APIrgbif
functions:map_fetch()
- Note: we used to have a function
gbifmap()
that usedggplot2
to plot data from the occurrence API, but it's been removed - see package mapr
install.packages("rgbif")
Alternatively, install development version
install.packages("devtools")
devtools::install_github("ropensci/rgbif")
library("rgbif")
Note: Windows users have to first install Rtools to use devtools
Mac Users: (in case of errors)
Terminal:
Install gdal : https://github.com/edzer/sfr/blob/master/README.md#macos
brew install openssl
R terminal:
install.packages('openssl')
install.packages('rgeos')
install.packages('rgbif')
occ_search(scientificName = "Ursus americanus", limit = 50)
#> Records found [10689]
#> Records returned [50]
#> No. unique hierarchies [1]
#> No. media records [49]
#> No. facets [0]
#> Args [limit=50, offset=0, scientificName=Ursus americanus, fields=all]
#> # A tibble: 50 x 69
#> name key decimalLatitude decimalLongitude issues datasetKey
#> <chr> <int> <dbl> <dbl> <chr> <chr>
#> 1 Ursus … 1.84e9 49.4 -123. cdroun… 50c9509d-22c7…
#> 2 Ursus … 1.81e9 37.7 -120. cdroun… 50c9509d-22c7…
#> 3 Ursus … 1.80e9 30.0 -84.3 cdroun… 50c9509d-22c7…
#> 4 Ursus … 1.81e9 42.0 -124. cdroun… 50c9509d-22c7…
#> 5 Ursus … 1.81e9 25.4 -101. gass84 50c9509d-22c7…
#> 6 Ursus … 1.81e9 40.8 -81.7 cdroun… 50c9509d-22c7…
#> 7 Ursus … 1.80e9 29.3 -103. cdroun… 50c9509d-22c7…
#> 8 Ursus … 1.81e9 34.4 -119. gass84 50c9509d-22c7…
#> 9 Ursus … 1.84e9 44.9 -110. cdroun… 50c9509d-22c7…
#> 10 Ursus … 1.84e9 34.0 -117. gass84 50c9509d-22c7…
#> # ... with 40 more rows, and 63 more variables: publishingOrgKey <chr>,
#> # publishingCountry <chr>, protocol <chr>, lastCrawled <chr>,
#> # lastParsed <chr>, crawlId <int>, extensions <chr>,
#> # basisOfRecord <chr>, taxonKey <int>, kingdomKey <int>,
#> # phylumKey <int>, classKey <int>, orderKey <int>, familyKey <int>,
#> # genusKey <int>, speciesKey <int>, scientificName <chr>, kingdom <chr>,
#> # phylum <chr>, order <chr>, family <chr>, genus <chr>, species <chr>,
#> # genericName <chr>, specificEpithet <chr>, taxonRank <chr>,
#> # dateIdentified <chr>, year <int>, month <int>, day <int>,
#> # eventDate <chr>, modified <chr>, lastInterpreted <chr>,
#> # references <chr>, license <chr>, identifiers <chr>, facts <chr>,
#> # relations <chr>, geodeticDatum <chr>, class <chr>, countryCode <chr>,
#> # country <chr>, rightsHolder <chr>, identifier <chr>,
#> # verbatimEventDate <chr>, datasetName <chr>, collectionCode <chr>,
#> # gbifID <chr>, verbatimLocality <chr>, occurrenceID <chr>,
#> # taxonID <chr>, recordedBy <chr>, catalogNumber <chr>,
#> # http...unknown.org.occurrenceDetails <chr>, institutionCode <chr>,
#> # rights <chr>, eventTime <chr>, occurrenceRemarks <chr>,
#> # http...unknown.org.http_..rs.gbif.org.terms.1.0.Multimedia <chr>,
#> # identificationID <chr>, infraspecificEpithet <chr>,
#> # coordinateUncertaintyInMeters <dbl>, informationWithheld <chr>
Or you can get the taxon key first with name_backbone()
. Here, we select to only return the occurrence data.
key <- name_backbone(name='Helianthus annuus', kingdom='plants')$speciesKey
occ_search(taxonKey=key, limit=20)
#> Records found [41219]
#> Records returned [20]
#> No. unique hierarchies [1]
#> No. media records [15]
#> No. facets [0]
#> Args [limit=20, offset=0, taxonKey=9206251, fields=all]
#> # A tibble: 20 x 91
#> name key decimalLatitude decimalLongitude issues datasetKey
#> <chr> <int> <dbl> <dbl> <chr> <chr>
#> 1 Helian… 1.81e9 52.6 10.1 cdround… 6ac3f774-d9f…
#> 2 Helian… 1.84e9 0 0 cucdmis… d2470ef8-edf…
#> 3 Helian… 1.81e9 32.0 -102. cdround… 50c9509d-22c…
#> 4 Helian… 1.84e9 33.9 -117. cdround… 50c9509d-22c…
#> 5 Helian… 1.82e9 56.6 16.4 cdround… 38b4c89f-584…
#> 6 Helian… 1.84e9 34.1 -116. gass84 50c9509d-22c…
#> 7 Helian… 1.81e9 25.7 -100. cdround… 50c9509d-22c…
#> 8 Helian… 1.82e9 56.6 16.6 cdround… 38b4c89f-584…
#> 9 Helian… 1.81e9 25.6 -100. cdround… 50c9509d-22c…
#> 10 Helian… 1.82e9 59.8 17.5 gass84,… 38b4c89f-584…
#> 11 Helian… 1.83e9 34.0 -117. cdround… 50c9509d-22c…
#> 12 Helian… 1.83e9 58.6 16.2 gass84,… 38b4c89f-584…
#> 13 Helian… 1.83e9 58.4 14.9 gass84,… 38b4c89f-584…
#> 14 Helian… 1.81e9 25.7 -100. cdround… 50c9509d-22c…
#> 15 Helian… 1.84e9 26.2 -98.3 cdround… 50c9509d-22c…
#> 16 Helian… 1.84e9 25.8 -100. cdround… 50c9509d-22c…
#> 17 Helian… 1.85e9 23.7 -99.2 cdround… 50c9509d-22c…
#> 18 Helian… 1.84e9 -43.6 173. cdround… 50c9509d-22c…
#> 19 Helian… 1.84e9 23.8 -107. cdround… 50c9509d-22c…
#> 20 Helian… 1.84e9 23.9 -107. cdround… 50c9509d-22c…
#> # ... with 85 more variables: publishingOrgKey <chr>,
#> # publishingCountry <chr>, protocol <chr>, lastCrawled <chr>,
#> # lastParsed <chr>, crawlId <int>, extensions <chr>,
#> # basisOfRecord <chr>, taxonKey <int>, kingdomKey <int>,
#> # phylumKey <int>, classKey <int>, orderKey <int>, familyKey <int>,
#> # genusKey <int>, speciesKey <int>, scientificName <chr>, kingdom <chr>,
#> # phylum <chr>, order <chr>, family <chr>, genus <chr>, species <chr>,
#> # genericName <chr>, specificEpithet <chr>, taxonRank <chr>,
#> # coordinateUncertaintyInMeters <dbl>, year <int>, month <int>,
#> # day <int>, eventDate <chr>, lastInterpreted <chr>, license <chr>,
#> # identifiers <chr>, facts <chr>, relations <chr>, geodeticDatum <chr>,
#> # class <chr>, countryCode <chr>, country <chr>, recordedBy <chr>,
#> # catalogNumber <chr>, institutionCode <chr>, locality <chr>,
#> # collectionCode <chr>, gbifID <chr>,
#> # http...unknown.org.http_..rs.gbif.org.terms.1.0.Multimedia <chr>,
#> # elevation <dbl>, elevationAccuracy <dbl>, continent <chr>,
#> # stateProvince <chr>, rightsHolder <chr>, recordNumber <chr>,
#> # identifier <chr>, municipality <chr>, datasetName <chr>,
#> # language <chr>, occurrenceID <chr>, type <chr>,
#> # ownerInstitutionCode <chr>, occurrenceRemarks <chr>,
#> # dateIdentified <chr>, modified <chr>, references <chr>,
#> # verbatimEventDate <chr>, verbatimLocality <chr>, taxonID <chr>,
#> # http...unknown.org.occurrenceDetails <chr>, rights <chr>,
#> # eventTime <chr>, identificationID <chr>, individualCount <int>,
#> # county <chr>, identificationVerificationStatus <chr>,
#> # occurrenceStatus <chr>, vernacularName <chr>, taxonConceptID <chr>,
#> # informationWithheld <chr>, endDayOfYear <chr>, startDayOfYear <chr>,
#> # datasetID <chr>, accessRights <chr>, higherClassification <chr>,
#> # habitat <chr>, identifiedBy <chr>
Get the keys first with name_backbone()
, then pass to occ_search()
splist <- c('Accipiter erythronemius', 'Junco hyemalis', 'Aix sponsa')
keys <- sapply(splist, function(x) name_backbone(name=x)$speciesKey, USE.NAMES=FALSE)
occ_search(taxonKey=keys, limit=5, hasCoordinate=TRUE)
#> Occ. found [2480598 (16), 9362842 (3800072), 2498387 (1243725)]
#> Occ. returned [2480598 (5), 9362842 (5), 2498387 (5)]
#> No. unique hierarchies [2480598 (1), 9362842 (1), 2498387 (1)]
#> No. media records [2480598 (1), 9362842 (5), 2498387 (3)]
#> No. facets [2480598 (0), 9362842 (0), 2498387 (0)]
#> Args [hasCoordinate=TRUE, limit=5, offset=0,
#> taxonKey=2480598,9362842,2498387, fields=all]
#> 3 requests; First 10 rows of data from 2480598
#>
#> # A tibble: 5 x 81
#> name key decimalLatitude decimalLongitude issues datasetKey
#> <chr> <int> <dbl> <dbl> <chr> <chr>
#> 1 Accipit… 1.00e9 -27.6 -58.7 bri,cud… ad43e954-dd7…
#> 2 Accipit… 1.00e9 -27.9 -59.1 bri,cud… ad43e954-dd7…
#> 3 Accipit… 6.86e8 5.27 -60.7 cdround e635240a-3cb…
#> 4 Accipit… 1.00e9 -27.6 -58.7 bri,cud… ad43e954-dd7…
#> 5 Accipit… 1.00e9 -27.6 -58.7 bri,cud… ad43e954-dd7…
#> # ... with 75 more variables: publishingOrgKey <chr>,
#> # publishingCountry <chr>, protocol <chr>, lastCrawled <chr>,
#> # lastParsed <chr>, crawlId <int>, extensions <chr>,
#> # basisOfRecord <chr>, taxonKey <int>, kingdomKey <int>,
#> # phylumKey <int>, classKey <int>, orderKey <int>, familyKey <int>,
#> # genusKey <int>, speciesKey <int>, scientificName <chr>, kingdom <chr>,
#> # phylum <chr>, order <chr>, family <chr>, genus <chr>, species <chr>,
#> # genericName <chr>, specificEpithet <chr>, taxonRank <chr>, year <int>,
#> # month <int>, day <int>, eventDate <chr>, modified <chr>,
#> # lastInterpreted <chr>, license <chr>, identifiers <chr>, facts <chr>,
#> # relations <chr>, geodeticDatum <chr>, class <chr>, countryCode <chr>,
#> # country <chr>, identifier <chr>, created <chr>, gbifID <chr>,
#> # occurrenceID <chr>, associatedSequences <chr>,
#> # higherClassification <chr>, taxonID <chr>, sex <chr>,
#> # establishmentMeans <chr>, continent <chr>, references <chr>,
#> # institutionID <chr>, dynamicProperties <chr>, fieldNumber <chr>,
#> # language <chr>, type <chr>, preparations <chr>, catalogNumber <chr>,
#> # occurrenceStatus <chr>, institutionCode <chr>,
#> # nomenclaturalCode <chr>, higherGeography <chr>,
#> # verbatimEventDate <chr>, endDayOfYear <chr>,
#> # georeferenceVerificationStatus <chr>, datasetName <chr>,
#> # locality <chr>, collectionCode <chr>, verbatimLocality <chr>,
#> # recordedBy <chr>,
#> # http...unknown.org.http_..rs.tdwg.org.dwc.terms.ResourceRelationship <chr>,
#> # otherCatalogNumbers <chr>, startDayOfYear <chr>, accessRights <chr>,
#> # collectionID <chr>
We've removed gbifmap()
which helped users plot data from functions occ_search()
/occ_data()
- instead we strongly recommend using our other package mapr.
As of rgibf v1
, we have integration for GBIF's mapping API, which lets you get raster images of
occurrences of taxa of interest. For example:
x <- map_fetch(search = "taxonKey", id = 3118771, year = 2010)
x
#> class : RasterLayer
#> dimensions : 512, 512, 262144 (nrow, ncol, ncell)
#> resolution : 0.703125, 0.3515625 (x, y)
#> extent : -180, 180, -90, 90 (xmin, xmax, ymin, ymax)
#> coord. ref. : +init=epsg:4326 +proj=longlat +datum=WGS84 +no_defs +ellps=WGS84 +towgs84=0,0,0
#> data source : in memory
#> names : layer
#> values : 0, 1 (min, max)
library(raster)
plot(x, axes = FALSE, box = FALSE)
- Please report any issues or bugs.
- License: MIT
- Get citation information for
rgbif
in R doingcitation(package = 'rgbif')
- Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
This package is part of a richer suite called spocc - Species Occurrence Data, along with several other packages, that provide access to occurrence records from multiple databases.