Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MaybeTransaction for better writing speed when supported #584

Closed
wants to merge 2 commits into from
Closed
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,7 @@
- Add methods `alternative_name`, `is_nullable`, `is_unique`, `default_value` to `Field` ([#561](https://github.com/georust/gdal/pull/561))
- Add `Defn::geometry_type` ([#562](https://github.com/georust/gdal/pull/562))
- Add `Defn::field_index` and `Feature::field_index` ([#581](https://github.com/georust/gdal/pull/581))
- Add `Dataset::maybe_start_transaction` which always succeeds to use transaction when possible for speed reasons -- no rollback ([#584](https://github.com/georust/gdal/pull/584))

### Fixed

Expand Down
35 changes: 35 additions & 0 deletions src/test_utils.rs
Original file line number Diff line number Diff line change
Expand Up @@ -136,6 +136,41 @@ pub fn open_gpkg_for_update(path: &Path) -> (TempPath, Dataset) {
(temp_path, ds)
}

/// Copies the given file to a temporary file and opens it for writing. When the returned
/// `TempPath` is dropped, the file is deleted.
pub fn open_dataset_for_update(path: &Path) -> (TempPath, Dataset) {
use std::fs;
use std::io::Write;

let input_data = fs::read(path).unwrap();
let (mut file, temp_path) = tempfile::Builder::new()
// using the whole filename as suffix should be fine (can't
// use .extension() for .shp.zip and such)
.suffix(
path.file_name()
.unwrap_or_default()
.to_string_lossy()
.as_ref(),
)
.tempfile()
.unwrap()
.into_parts();
file.write_all(&input_data).unwrap();
// Close the temporary file so that Dataset can open it safely even if the filesystem uses
// exclusive locking (Windows?).
drop(file);

let ds = Dataset::open_ex(
&temp_path,
DatasetOptions {
open_flags: GDALAccess::GA_Update.into(),
..DatasetOptions::default()
},
)
.unwrap();
(temp_path, ds)
}

/// Assert numerical difference between two expressions is less than
/// 64-bit machine epsilon or a specified epsilon.
///
Expand Down
136 changes: 135 additions & 1 deletion src/vector/transaction.rs
Original file line number Diff line number Diff line change
Expand Up @@ -179,9 +179,93 @@ impl Dataset {
}
}

/// Represents maybe a transaction on a dataset for speed reasons.
///
/// It can be committed by calling [`commit`](MaybeTransaction::commit), but unlike [`Transaction`] it can not be rolled back. The transaction part is to make the process speedy when possible. If the [`Dataset`] does not support transaction, it does nothing.
///
/// If the transaction is not explicitly committed when it is dropped, it is implicitly committed, but you will not know if it fails.
///
/// The transaction holds a mutable borrow on the `Dataset` that it was created from, so during the
/// lifetime of the transaction you will need to access the dataset by dereferencing the
/// `MaybeTransaction` through its [`Deref`] or [`DerefMut`] implementations.
#[derive(Debug)]
pub struct MaybeTransaction<'a> {
dataset: &'a mut Dataset,
is_transaction: bool,
commit_on_drop: bool,
}

impl<'a> MaybeTransaction<'a> {
fn new(dataset: &'a mut Dataset, is_transaction: bool) -> Self {
MaybeTransaction {
dataset,
is_transaction,
commit_on_drop: true,
}
}

pub fn is_transaction(&self) -> bool {
self.is_transaction
}

/// Commits this transaction.
///
/// If the commit fails, will return [`OGRErr::OGRERR_FAILURE`].
///
/// Depending on drivers, this may or may not abort layer sequential readings that are active.
pub fn commit(mut self) -> Result<()> {
if !self.is_transaction {
return Ok(());
}

let rv = unsafe { gdal_sys::GDALDatasetCommitTransaction(self.dataset.c_dataset()) };
self.commit_on_drop = false;
if rv != OGRErr::OGRERR_NONE {
return Err(GdalError::OgrError {
err: rv,
method_name: "GDALDatasetCommitTransaction",
});
}
Ok(())
}
}

impl<'a> Deref for MaybeTransaction<'a> {
type Target = Dataset;

fn deref(&self) -> &Self::Target {
self.dataset
}
}

impl<'a> DerefMut for MaybeTransaction<'a> {
fn deref_mut(&mut self) -> &mut Self::Target {
self.dataset
}
}

impl Dataset {
pub fn maybe_start_transaction(&mut self) -> MaybeTransaction<'_> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be reasonable to go with a callback based approach here instead of having that guard object. This should make it much easier to handle errors on committing.

That would give you the following function signature:

fn maybe_transaction<R,E: From<gdal::Error>>(&self, impl Fn(&Dataset) -> Result<R, E>) -> Result<R,E>

For the implementation of that pattern see diesels transaction method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for late response. I made it quickly, just copying the mechanism that is being already used by the transaction. It gives uniformity to the transaction API.

let force = 1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should add at some point that parameter for force, but if you want this for better performance, you absolutely don't want to enable it.

I think it only works for FileGDB, and it's implemented by making a file system-level copy of the whole dataset, and restoring it on rollback.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that won't help with performance, I think if we want transaction for rollback purposes vs want transaction for speed purposes, I think we should have different methods.

The MaybeTransaction doesn't have rollback method at all, that'd help at compile time to let people know they can't rollback. Since it could be working on the database directly without transaction as well.

That's why I was thinking maybe some other name would be better. The current interpretation of MaybeTransaction is it is maybe a transaction or not, hence it has to be committed on drop, and no rollback. To make the Transaction and non transaction database do same thing.

On that note I think we can add the force thing to normal Transaction or add another ForceTransaction that does filesystem copy ourselves when Transaction fails.

let rv = unsafe { gdal_sys::GDALDatasetStartTransaction(self.c_dataset(), force) };

MaybeTransaction::new(self, rv == OGRErr::OGRERR_NONE)
}
}

impl<'a> Drop for MaybeTransaction<'a> {
fn drop(&mut self) {
if self.commit_on_drop {
// We silently swallow any errors, because we have no way to report them from a drop
// function apart from panicking.
unsafe { gdal_sys::GDALDatasetCommitTransaction(self.dataset.c_dataset()) };
}
}
}

#[cfg(test)]
mod tests {
use crate::test_utils::{fixture, open_gpkg_for_update};
use crate::test_utils::{fixture, open_dataset_for_update, open_gpkg_for_update};
use crate::vector::{Geometry, LayerAccess};
use crate::Dataset;

Expand Down Expand Up @@ -241,4 +325,54 @@ mod tests {
let mut ds = Dataset::open(fixture("roads.geojson")).unwrap();
assert!(ds.start_transaction().is_err());
}

#[test]
fn test_maybe_start_transaction() {
let (_temp_path, mut ds) = open_gpkg_for_update(&fixture("poly.gpkg"));
let txn = ds.maybe_start_transaction();
assert!(txn.is_transaction());
let mut ds = Dataset::open(fixture("roads.geojson")).unwrap();
let txn = ds.maybe_start_transaction();
assert!(!txn.is_transaction());
}

#[test]
fn test_maybe_transaction_commit() {
let (_temp_path, mut ds) = open_gpkg_for_update(&fixture("poly.gpkg"));
let orig_feature_count = ds.layer(0).unwrap().feature_count();

let txn = ds.maybe_start_transaction();
let mut layer = txn.layer(0).unwrap();
layer.create_feature(polygon()).unwrap();
assert!(txn.commit().is_ok());

assert_eq!(ds.layer(0).unwrap().feature_count(), orig_feature_count + 1);
}

#[test]
fn test_maybe_transaction_commit_unsupported() {
let (_temp_path, mut ds) = open_dataset_for_update(&fixture("roads.geojson"));
let orig_feature_count = ds.layer(0).unwrap().feature_count();

let txn = ds.maybe_start_transaction();
let mut layer = txn.layer(0).unwrap();
layer.create_feature(polygon()).unwrap();
assert!(txn.commit().is_ok());

assert_eq!(ds.layer(0).unwrap().feature_count(), orig_feature_count + 1);
}

#[test]
fn test_maybe_transaction_implicit_commit() {
let (_temp_path, mut ds) = open_gpkg_for_update(&fixture("poly.gpkg"));
let orig_feature_count = ds.layer(0).unwrap().feature_count();

{
let txn = ds.maybe_start_transaction();
let mut layer = txn.layer(0).unwrap();
layer.create_feature(polygon()).unwrap();
} // txn is dropped here.

assert_eq!(ds.layer(0).unwrap().feature_count(), orig_feature_count + 1);
}
}