Releases: DyfanJones/noctua

noctua 2.6.2

09 Aug 08:23

Feature:

  • Add catalog support (#194)
  • Fix dbExistsTable to catch the updated AWS error message.
  • Add support for dbplyr 2.3.3.9000+

Bug Fix:

Internals:

  • Remove AWS calls to AWS Glue
  • Remove reader soft dependency

noctua 2.6.1

20 Dec 13:15
3244efc

Bug Fix:

  • Prevent assuming a role from AWS_ROLE_ARN. This caused confusion when connecting through web identity (RAthena # 177)
  • Support dbplyr::in_catalog when working with dplyr::tbl (RAthena # 178)

noctua 2.6.0

20 May 13:06

Feature:

  • Add clear_s3_resource parameter to RAthena_options to prevent the AWS S3 resource holding AWS Athena output from being cleared up by dbClearResult (RAthena # 168). Thanks to @juhoautio for the request.
  • Support extra paws parameters (RAthena # 169)
  • Support the endpoint_override parameter, which allows the default endpoint of each service to be overridden (RAthena # 169). Thanks to @aoyh for the request and for checking the package in development.

noctua 2.5.1

04 Feb 18:21

Bug Fix:

  • Fixed unit test helper function test_data to use size parameter explicitly.

noctua 2.5.0

17 Jan 17:08
fc1c377

Feature:

  • Allow all information messages to be turned off (#178)
  • Allow noctua_options to change 1 parameter at a time without affecting other pre-configured settings
  • Return warning message for deprecated retry_quiet parameter in noctua_options function.

noctua 2.4.0

26 Nov 16:41
a92854c

Feature:

  • Add support for the dbplyr 2.0.0 backend API.
  • Add method to set unload on a package level to allow dplyr to benefit from AWS Athena unload methods (#174).

Bug Fix:

  • Ensure dbGetQuery, dbExecute, dbSendQuery and dbSendStatement work on older versions of R (#170). Thanks to @tyner for identifying the issue.
  • Caching would fail when the statement wasn't a character (#171). Thanks to @ramnathv for identifying the issue.

v-2.3.0

27 Oct 10:26
b75532d

Feature:

  • Add support for AWS Athena UNLOAD (#160). This takes advantage of the read/write speed that Parquet offers.

Test data setup (Python, awswrangler):

import awswrangler as wr

import getpass
bucket = getpass.getpass()
path = f"s3://{bucket}/data/"


if "awswrangler_test" not in wr.catalog.databases().values:
    wr.catalog.create_database("awswrangler_test")

cols = ["id", "dt", "element", "value", "m_flag", "q_flag", "s_flag", "obs_time"]

df = wr.s3.read_csv(
    path="s3://noaa-ghcn-pds/csv/189",
    names=cols,
    parse_dates=["dt", "obs_time"])  # Read 10 files from the 1890 decade (~1GB)

wr.s3.to_parquet(
    df=df,
    path=path,
    dataset=True,
    mode="overwrite",
    database="awswrangler_test",
    table="noaa"
);

wr.catalog.table(database="awswrangler_test", table="noaa")

Benchmark (R):

library(DBI)

con <- dbConnect(noctua::athena())

# Query ran using CSV output
system.time({
  df = dbGetQuery(con, "SELECT * FROM awswrangler_test.noaa")
})
# Info: (Data scanned: 80.88 MB)
#    user  system elapsed
#  57.004   8.430 160.567 

noctua::noctua_options(cache_size = 1)

# Query ran using UNLOAD Parquet output
system.time({
  df = dbGetQuery(con, "SELECT * FROM awswrangler_test.noaa", unload = T)
})
# Info: (Data scanned: 80.88 MB)
#    user  system elapsed 
#  21.622   2.350  39.232 

# Query ran using cache
system.time({
  df = dbGetQuery(con, "SELECT * FROM awswrangler_test.noaa", unload = T)
})
# Info: (Data scanned: 80.88 MB)
#    user  system elapsed 
#  13.738   1.886  11.029 
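
As a rough reading of the timings above, the elapsed times can be turned into speed-up ratios. The sketch below is plain Python arithmetic on the three `elapsed` values reported above. The names `elapsed` and `speedup` are illustrative and not part of noctua, and the figures come from a single run, so the ratios are indicative only.

```python
# Elapsed seconds copied from the system.time() output above.
elapsed = {
    "csv": 160.567,
    "unload_parquet": 39.232,
    "unload_cached": 11.029,
}


def speedup(baseline: float, candidate: float) -> float:
    """Return how many times faster `candidate` is than `baseline`."""
    return baseline / candidate


# Compare every non-CSV run against the CSV baseline.
speedups = {
    name: speedup(elapsed["csv"], seconds)
    for name, seconds in elapsed.items()
    if name != "csv"
}

for name, ratio in speedups.items():
    print(f"{name}: {ratio:.1f}x faster than CSV output")
```

On these numbers, UNLOAD is roughly four times faster than CSV output, and the cached UNLOAD query is faster again.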

v-2.2.0

23 Sep 11:28

Bug Fix:

  • sql_translate_env correctly translates R functions quantile and median to AWS Athena equivalents (#153). Thanks to @ellmanj for spotting issue.

Feature:

  • Support AWS Athena timestamp with time zone data type.
  • Properly support data type list when converting data to AWS Athena SQL format.

library(data.table)
library(DBI)

x = 5

dt = data.table(
  var1 = sample(LETTERS, size = x, T),
  var2 = rep(list(list("var3"= 1:3, "var4" = list("var5"= letters[1:5]))), x)
)

con <- dbConnect(noctua::athena())

#> Version: 2.2.0

sqlData(con, dt)

# Registered S3 method overwritten by 'jsonify':
#   method     from    
#   print.json jsonlite
# Info: Special characters "\t" has been converted to " " to help with Athena reading file format tsv
#    var1                                                   var2
# 1:    1 {"var3":[1,2,3],"var4":{"var5":["a","b","c","d","e"]}}
# 2:    2 {"var3":[1,2,3],"var4":{"var5":["a","b","c","d","e"]}}
# 3:    3 {"var3":[1,2,3],"var4":{"var5":["a","b","c","d","e"]}}
# 4:    4 {"var3":[1,2,3],"var4":{"var5":["a","b","c","d","e"]}}
# 5:    5 {"var3":[1,2,3],"var4":{"var5":["a","b","c","d","e"]}}

#> Version: 2.1.0

sqlData(con, dt)

# Info: Special characters "\t" has been converted to " " to help with Athena reading file format tsv
#    var1                                        var2
# 1:    1 1:3|list(var5 = c("a", "b", "c", "d", "e"))
# 2:    2 1:3|list(var5 = c("a", "b", "c", "d", "e"))
# 3:    3 1:3|list(var5 = c("a", "b", "c", "d", "e"))
# 4:    4 1:3|list(var5 = c("a", "b", "c", "d", "e"))
# 5:    5 1:3|list(var5 = c("a", "b", "c", "d", "e"))

v-2.2.0 now converts lists into JSON lines format so that AWS Athena can parse them with SQL array/mapping/JSON functions. A small downside is that an S3 method conflict occurs when jsonify is called to convert lists into JSON lines. jsonify was chosen over jsonlite because of its better performance (#156).
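
To make the target format concrete, here is a minimal sketch of the same per-row conversion, written in Python with the standard `json` module. It is not how noctua does it (noctua uses jsonify in R, as noted above). The helper `to_json_cell` and the sample rows are hypothetical and only mirror the shape of the `var2` column in the example.

```python
import json


def to_json_cell(value) -> str:
    """Serialise one nested cell to compact JSON, so each row holds one JSON document."""
    # separators=(",", ":") drops the spaces json.dumps inserts by default.
    return json.dumps(value, separators=(",", ":"))


# Hypothetical rows shaped like the data.table in the example above.
nested = {"var3": [1, 2, 3], "var4": {"var5": ["a", "b", "c", "d", "e"]}}
rows = [{"var1": i, "var2": nested} for i in range(1, 6)]

# One compact JSON string per row, as in the v-2.2.0 output.
json_cells = [to_json_cell(row["var2"]) for row in rows]
print(json_cells[0])
```

Each cell is a single self-contained JSON document, which is what allows a JSON function on the Athena side to address fields such as `var4.var5` per row.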

v-2.1.0

27 Jul 13:40

Bug Fix:

  • dbIsValid wrongly reported a result object as valid when its connection had been disconnected.
  • sql_translate_env.paste broke with the latest version of dbplyr. The new method is compatible with dbplyr>=1.4.3 (#149).

Feature:

  • sql_translate_env: add support for stringr/lubridate style functions, similar to Postgres backend.
  • write_bin no longer chunks writeBin when the R version is greater than 4.0.0, see HenrikBengtsson/Wishlist-for-R#97 (#149)
  • dbConnect: add timezone parameter so that the time zone is consistent between R and AWS Athena.

noctua-v2.0.1

25 Feb 09:43
7820a7b

This is a hotfix patch for keyboard interrupts not raising errors correctly.