dbWriteTable vs dbAppendTable google cloud Postgres #241
I've observed the same behavior on Mac and on EC2 running Ubuntu, explicitly comparing […]
Thanks. I see that I'm observing a substantial slowdown even on a local server when writing 1000 copies of the data. The slowdown is reduced substantially, but still amounts to a factor of ~4-5, when I wrap the append in a transaction:

```r
library(DBI)
library(lubridate)
#>
#> Attaching package: 'lubridate'
#> The following objects are masked from 'package:base':
#>
#>     date, intersect, setdiff, union
library(magrittr)

# Make database connection
db <- dbConnect(RPostgres::Postgres())

# Insert data with dbAppendTable (inside a transaction)
start_time <- Sys.time()
try(db %>% dbRemoveTable("iris"))
db %>% dbCreateTable(name = "iris", iris[0, ])
db %>% dbBegin()
db %>% dbAppendTable(name = "iris", iris[rep(1:150, each = 1000), ])
#> Warning: Factors converted to character
#> [1] 150000
db %>% dbCommit()
end_time <- Sys.time()
cat("Time spent dbAppendTable", (start_time %--% end_time))
#> Time spent dbAppendTable 4.068846

# Insert data with dbWriteTable
start_time <- Sys.time()
try(db %>% dbRemoveTable("iris"))
db %>% dbWriteTable(name = "iris", iris[rep(1:150, each = 1000), ])
end_time <- Sys.time()
cat("Time spent dbWriteTable", (start_time %--% end_time))
#> Time spent dbWriteTable 0.9158208
```

Created on 2020-10-18 by the reprex package (v0.3.0)
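Until the underlying issue is fixed, a workaround is to append rows with `dbWriteTable(append = TRUE)` (a standard DBI argument), which takes the fast path shown above. A minimal sketch, assuming a reachable Postgres server; the table name and data are illustrative:

```r
library(DBI)
library(magrittr)

con <- dbConnect(RPostgres::Postgres())
df <- iris[rep(1:150, each = 1000), ]

# Create an empty table with the right schema
con %>% dbCreateTable(name = "iris", df[0, ])

# append = TRUE adds rows to the existing table instead of overwriting it,
# using the same bulk-insert path as a plain dbWriteTable() call
con %>% dbWriteTable(name = "iris", df, append = TRUE)

dbDisconnect(con)
```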
In my case, this problem is even more pronounced: I have to insert around 2 million rows (multiple times) and with […]
Thank you, I had the same issue and this was very helpful for me. I can confirm the huge performance gains.
- `dbAppendTable()` gains `copy` argument, `TRUE` by default. If set, data is imported via `COPY name FROM STDIN` (#241, @hugheylab).
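With that change, the fast `COPY`-based path is the default, and the `copy` argument lets you opt back into plain inserts. A minimal sketch, assuming a reachable Postgres server (the connection details and table name are illustrative):

```r
library(DBI)

con <- dbConnect(RPostgres::Postgres())

dbCreateTable(con, "iris", iris[0, ])

# Default (copy = TRUE): data is streamed via COPY iris FROM STDIN
dbAppendTable(con, "iris", iris)

# Opt out of the COPY path and fall back to regular inserts
dbAppendTable(con, "iris", iris, copy = FALSE)

dbDisconnect(con)
```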
This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.
Hello,
I'm trying to write a table to a Google Cloud Postgres database. See below for a reprex.
`dbCreateTable()`/`dbWriteTable()` takes 0.23 s, while `dbAppendTable()` takes 10.6 s.
Why does the documentation (`?dbWriteTable`) say "New code should prefer dbCreateTable() and dbAppendTable()" when it is much slower?