Avoid copying string for WKT parsing #220

Merged
merged 1 commit into from
Jul 25, 2024
17 changes: 12 additions & 5 deletions geozero/src/wkt/wkt_reader.rs
@@ -2,6 +2,7 @@ use crate::error::{GeozeroError, Result};
 use crate::{FeatureProcessor, GeomProcessor, GeozeroDatasource, GeozeroGeometry};

 use std::io::Read;
+use std::str::FromStr;
 use wkt::types::{Coord, LineString, Polygon};
 use wkt::Geometry;

@@ -11,7 +12,10 @@ pub struct Wkt<B: AsRef<[u8]>>(pub B);

 impl<B: AsRef<[u8]>> GeozeroGeometry for Wkt<B> {
     fn process_geom<P: GeomProcessor>(&self, processor: &mut P) -> Result<()> {
-        read_wkt(&mut self.0.as_ref(), processor)
+        let wkt_str = std::str::from_utf8(self.0.as_ref())
Member Author

I know we've been changing the API repeatedly lately, but it would be nice here if the user could declare they already have UTF8 data. I.e. maybe WktStr<S> where S: AsRef<str> separated from Wkt<B> where B: AsRef<[u8]>

Member

> I know we've been changing the API repeatedly lately, but it would be nice here if the user could declare they already have UTF8 data

To make sure I understand: you're proposing that we remove the WktStr deprecation and make it generic? Sounds reasonable, just checking.

Member Author

Yes, I believe so. That way the buffers don't have to be checked again via std::str::from_utf8.
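The idea discussed in this thread could look roughly like the sketch below. It is an illustration only, not the geozero API: `BytesWkt`, `StrWkt`, and the `text` methods are hypothetical names, and parsing is left out. It shows that a byte-backed wrapper has to run `std::str::from_utf8` on each access, while a wrapper bounded by `AsRef<str>` can hand out the text without a second validation pass.

```rust
use std::str::Utf8Error;

/// Hypothetical wrapper for input that is only known to be bytes.
/// The bytes must be validated as UTF-8 before they can be used as text.
pub struct BytesWkt<B: AsRef<[u8]>>(pub B);

/// Hypothetical wrapper for input that is already known to be valid UTF-8.
pub struct StrWkt<S: AsRef<str>>(pub S);

impl<B: AsRef<[u8]>> BytesWkt<B> {
    /// Borrows the text, paying for a UTF-8 validation pass on every call.
    pub fn text(&self) -> Result<&str, Utf8Error> {
        std::str::from_utf8(self.0.as_ref())
    }
}

impl<S: AsRef<str>> StrWkt<S> {
    /// Borrows the text directly; the `AsRef<str>` bound already guarantees UTF-8.
    pub fn text(&self) -> &str {
        self.0.as_ref()
    }
}

fn main() {
    let from_bytes = BytesWkt(b"POINT (1 2)".to_vec());
    let from_str = StrWkt("POINT (1 2)");
    let from_string = StrWkt(String::from("POINT (1 2)"));

    assert_eq!(from_bytes.text().unwrap(), "POINT (1 2)");
    assert_eq!(from_str.text(), "POINT (1 2)");
    assert_eq!(from_string.text(), "POINT (1 2)");

    // Invalid UTF-8 can only occur, and only needs checking, on the byte path.
    assert!(BytesWkt(vec![0xff, 0xfe]).text().is_err());
}
```

Because `&str` and `String` both implement `AsRef<str>`, a single generic wrapper of this shape would cover both the borrowed and the owned case.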

+            .map_err(|e| GeozeroError::Geometry(e.to_string()))?;
+        let wkt = wkt::Wkt::from_str(wkt_str).map_err(|e| GeozeroError::Geometry(e.to_string()))?;
+        process_wkt_geom(&wkt.item, processor)
     }
 }

@@ -24,7 +28,9 @@ pub struct WktString(pub String);
 impl GeozeroGeometry for WktString {
     fn process_geom<P: GeomProcessor>(&self, processor: &mut P) -> Result<()> {
         #[allow(deprecated)]
-        read_wkt(&mut self.0.as_bytes(), processor)
+        let wkt = wkt::Wkt::from_str(self.0.as_str())
+            .map_err(|e| GeozeroError::Geometry(e.to_string()))?;
+        process_wkt_geom(&wkt.item, processor)
     }
 }

@@ -36,15 +42,17 @@ pub struct WktStr<'a>(pub &'a str);
 impl GeozeroGeometry for WktStr<'_> {
     fn process_geom<P: GeomProcessor>(&self, processor: &mut P) -> Result<()> {
         #[allow(deprecated)]
-        read_wkt(&mut self.0.as_bytes(), processor)
+        let wkt = wkt::Wkt::from_str(self.0).map_err(|e| GeozeroError::Geometry(e.to_string()))?;
+        process_wkt_geom(&wkt.item, processor)
     }
 }

 #[allow(deprecated)]
 impl GeozeroDatasource for WktStr<'_> {
     fn process<P: FeatureProcessor>(&mut self, processor: &mut P) -> Result<()> {
         #[allow(deprecated)]
-        read_wkt(&mut self.0.as_bytes(), processor)
+        let wkt = wkt::Wkt::from_str(self.0).map_err(|e| GeozeroError::Geometry(e.to_string()))?;
+        process_wkt_geom(&wkt.item, processor)
     }
 }

@@ -72,7 +80,6 @@ impl<R: Read> GeozeroDatasource for WktReader<R> {

 /// Read and process WKT geometry.
 pub fn read_wkt<R: Read, P: GeomProcessor>(reader: &mut R, processor: &mut P) -> Result<()> {
-    use std::str::FromStr;
     // PERF: it would be good to avoid copying data into this string when we already
     // have a string as input. Maybe the wkt crate needs a from_reader implementation.
     let mut wkt_string = String::new();
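
The PERF note above is the motivation for this PR: going through a `Read` forces the input into a fresh `String`, even when the caller already holds a string. The following is a minimal standard-library sketch of that difference. It assumes the reader path fills its buffer with `read_to_string`, which the truncated hunk does not show. `geometry_keyword`, `keyword_from_reader`, and `keyword_from_str` are hypothetical helpers, with `geometry_keyword` standing in for a real text parser such as `wkt::Wkt::from_str`.

```rust
use std::io::Read;

/// Hypothetical stand-in for a text parser: returns the leading geometry keyword.
/// Like a `FromStr` parser, it only needs a borrowed `&str`.
fn geometry_keyword(text: &str) -> String {
    text.trim_start()
        .chars()
        .take_while(|c| c.is_ascii_alphabetic())
        .collect()
}

/// Reader path: the input is first copied into an owned `String`.
fn keyword_from_reader<R: Read>(reader: &mut R) -> std::io::Result<String> {
    let mut buf = String::new();
    reader.read_to_string(&mut buf)?; // allocates and copies the whole input
    Ok(geometry_keyword(&buf))
}

/// String path: the caller's `&str` is passed straight through, with no copy.
fn keyword_from_str(text: &str) -> String {
    geometry_keyword(text)
}

fn main() {
    let input = "LINESTRING (0 0, 1 1)";

    // `&[u8]` implements `Read`, which is how the old code fed strings to `read_wkt`.
    let mut reader = input.as_bytes();
    assert_eq!(keyword_from_reader(&mut reader).unwrap(), "LINESTRING");

    // The direct path gives the same result without the intermediate buffer.
    assert_eq!(keyword_from_str(input), "LINESTRING");
}
```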