Drug databases vary widely in format and structure, which makes analyzing their data together difficult: working with even two databases at once, such as DrugBank and KEGG, takes considerable effort.
Hence, the dbparser package aims to parse different public drug databases, such as DrugBank or KEGG, into a single, unified R object called dvobject (short for drugverse object).
That should help with:
- working with a single data object instead of multiple databases in different formats,
- applying R's analysis capabilities directly to drug data,
- transferring data between researchers easily after performing the required analysis on the dvobject and storing the results in the same object.
dvobject introduces a unified and compressed format for drug data. It is an R list object that contains one or more of the following sub-lists:
- drugs: a list of data.frames containing drug information (e.g. synonyms, classifications, …); this is the only mandatory list
- salts: a data.frame containing drug salt information
- products: a data.frame of commercially available drug products worldwide
- references: a data.frame of articles, links and textbooks about drugs or CETT data
- cett: a list of data.frames containing target, enzyme, carrier and transporter information
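The layout above can be sketched with a hand-built list in base R. This is only an illustration of the shape, not output from dbparser: the object name `mock_dvobject`, the inner data.frame names, the column names, the drug IDs and the `analysis` element are all hypothetical, and a real dvobject may name things differently.

```r
# Hypothetical stand-in for a dvobject: a plain named list whose top-level
# names follow the sub-lists described above. All values are made up.
mock_dvobject <- list(
  drugs = list(
    general_information = data.frame(
      drug_id = c("D001", "D002"),
      name    = c("drug_a", "drug_b"),
      stringsAsFactors = FALSE
    ),
    synonyms = data.frame(
      drug_id = c("D001", "D001", "D002"),
      synonym = c("alpha", "a-one", "beta"),
      stringsAsFactors = FALSE
    )
  ),
  salts = data.frame(
    drug_id = "D001",
    salt    = "drug_a hydrochloride",
    stringsAsFactors = FALSE
  )
)

# Only `drugs` is mandatory; the other sub-lists may be absent.
stopifnot("drugs" %in% names(mock_dvobject))

# Ordinary R tools work on the nested data.frames, e.g. synonyms per drug.
synonym_counts <- as.data.frame(
  table(drug_id = mock_dvobject$drugs$synonyms$drug_id),
  responseName = "n_synonyms",
  stringsAsFactors = FALSE
)

# Keep the result inside the same object (the element name is illustrative).
mock_dvobject$analysis <- list(synonym_counts = synonym_counts)

# Share the whole object, data and results together, as a single file.
path <- tempfile(fileext = ".rds")
saveRDS(mock_dvobject, path)
restored <- readRDS(path)
```

Because everything lives in one list, a collaborator who reads the `.rds` file gets the drug tables and the derived results in a single step.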
Parsers are available for the following databases (the list is still growing):
The DrugBank database is a comprehensive, freely accessible, online database containing information on drugs and drug targets. As both a bioinformatics and a cheminformatics resource, DrugBank combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information. More information about DrugBank can be found here.
In its raw form, the DrugBank database is a single XML file. Users must create an account with DrugBank and request permission to download the database. Note that this may take a couple of days.
The dbparser package parses the DrugBank XML database into R tibbles that can be explored and analyzed by the user; see this tutorial for more details.
If you are waiting for access to the DrugBank database, or do not intend to do a deep dive with the data, you may wish to use the dbdataset package, which contains the DrugBank database already parsed into a dvobject.
Note that this is a large package that exceeds the size limit set by CRAN, so it is only available on GitHub.
dbparser has been tested successfully against DrugBank versions 5.1.0 through 5.1.12. If you find errors with these versions or any other version, please submit an issue here.
You can install the released version of dbparser from CRAN with:
install.packages("dbparser")
or you can install the latest updates directly from the GitHub repository:
# install.packages("devtools")  # needed once, if devtools is not installed
devtools::install_github("ropensci/dbparser")
Please note that the ‘dbparser’ project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
👍🎉 First off, thanks for taking the time to contribute! 🎉👍 Please review our Contributing Guide.
Think dbparser is useful? Let others discover it by telling them in person, via Twitter or in a blog post.
Using dbparser for a paper you are writing? Consider citing it:
citation("dbparser")
#> To cite dbparser in publications use:
#>
#> Mohammed Ali, Ali Ezzat (). dbparser: DrugBank Database XML Parser.
#> R package version 2.0.3.
#>
#> A BibTeX entry for LaTeX users is
#>
#> @Manual{,
#> title = {DrugBank Database XML Parser},
#> author = {Mohammed Ali and Ali Ezzat},
#> organization = {Interstellar for Consultinc inc.},
#> note = {R package version 2.0.3},
#> url = {https://CRAN.R-project.org/package=dbparser},
#> }