Releases · geopandas/pyogrio
Version 0.10.0
Improvements
- Add support to read, write, list, and remove `/vsimem/` files (#457); see the sketch below.
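A minimal sketch of the new `/vsimem/` support, assuming hypothetical data and path; `/vsimem/` is GDAL's in-memory filesystem, so nothing touches disk:

```python
import geopandas as gpd
from shapely.geometry import Point
import pyogrio

# hypothetical data for illustration
gdf = gpd.GeoDataFrame({"name": ["a"]}, geometry=[Point(0, 0)], crs="EPSG:4326")

# write to and read back from GDAL's in-memory filesystem
pyogrio.write_dataframe(gdf, "/vsimem/example.gpkg")
roundtrip = pyogrio.read_dataframe("/vsimem/example.gpkg")
```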
Bug fixes
- Silence warning from `write_dataframe` with `GeoSeries.notna()` (#435).
- Enable mask & bbox filter when geometry column not read (#431).
- Raise `NotImplementedError` when user attempts to write to an open file handle (#442).
- Prevent seek on read from compressed inputs (#443).
Packaging
- For the conda-forge package, change the dependency from `libgdal` to `libgdal-core`. This package is significantly smaller as it doesn't contain some large GDAL plugins. Extra plugins can be installed as separate conda packages if needed: more info here. This also leads to `pyproj` becoming an optional dependency; you will need to install `pyproj` in order to support spatial reference systems (#452).
- The GDAL library included in the wheels is updated from 3.8.5 to GDAL 3.9.2 (#466).
- pyogrio now requires a minimum version of Python >= 3.9 (#473).
- Wheels are now available for Python 3.13.
Version 0.9.0
Version 0.8.0
Improvements
- Support for writing based on Arrow as the transfer mechanism of the data from Python to GDAL (requires GDAL >= 3.8). This is provided through the new `pyogrio.raw.write_arrow` function, or by using the `use_arrow=True` option in `pyogrio.write_dataframe` (#314, #346).
- Add support for `fids` filter to `read_arrow` and `open_arrow`, and to `read_dataframe` with `use_arrow=True` (#304).
- Add some missing properties to `read_info`, including layer name, geometry name, and FID column name (#365).
- `read_arrow` and `open_arrow` now provide GeoArrow-compliant extension metadata, including the CRS, when using GDAL 3.8 or higher (#366).
- The `open_arrow` function can now be used without a `pyarrow` dependency. By default, it will now return a stream object implementing the Arrow PyCapsule Protocol (i.e. having an `__arrow_c_stream__` method). This object can then be consumed by your Arrow implementation of choice that supports this protocol. To keep the previous behaviour of returning a `pyarrow.RecordBatchReader`, specify `use_pyarrow=True` (#349). See the sketch after this list.
- Warn when reading from a multilayer file without specifying a layer (#362).
- Allow writing to a new in-memory datasource using an `io.BytesIO` object (#397).
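A hedged sketch of consuming the PyCapsule stream with pyarrow; the file name is hypothetical, and `RecordBatchReader.from_stream` assumes pyarrow >= 14:

```python
import pyarrow as pa
from pyogrio.raw import open_arrow

# default: yields metadata plus a stream object exposing __arrow_c_stream__
with open_arrow("data.gpkg") as (meta, stream):
    # hand the stream to pyarrow (from_stream requires pyarrow >= 14)
    reader = pa.RecordBatchReader.from_stream(stream)
    table = reader.read_all()
print(meta["crs"], table.num_rows)
```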
Bug fixes
- Fix error in `write_dataframe` if input has a date column and non-consecutive index values (#325).
- Fix encoding issues on Windows for some formats (e.g. ".csv") and always write ESRI Shapefiles using UTF-8 by default on all platforms (#361).
- Raise exception in `read_arrow` or `read_dataframe(..., use_arrow=True)` if a boolean column is detected due to error in GDAL reading boolean values for FlatGeobuf / GPKG drivers (#335, #387); this has been fixed in GDAL >= 3.8.3.
- Properly ignore fields not listed in the `columns` parameter when reading from the data source not using the Arrow API (#391).
- Properly handle decoding of ESRI Shapefiles with a user-provided `encoding` option for `read`, `read_dataframe`, and `open_arrow`, and correctly encode Shapefile field names and text values to the user-provided `encoding` for `write` and `write_dataframe` (#384). See the sketch below.
- Fixed bug preventing reading from bytes or file-like in `read_arrow` / `open_arrow` (#407).
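A minimal sketch of the `encoding` option; the file name and codepage are hypothetical:

```python
import pyogrio

# hypothetical legacy shapefile whose attribute table was written in cp1251
df = pyogrio.read_dataframe("legacy.shp", encoding="cp1251")
```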
Packaging
- The GDAL library included in the wheels is updated from 3.7.2 to GDAL 3.8.5.
Potentially breaking changes
- Using a `where` expression combined with a list of `columns` that does not include the column referenced in the expression is not recommended and will now return results based on driver-dependent behavior, which may include returning empty results (even if non-empty results are expected from the `where` parameter) or raising an exception (#391). Previous versions of pyogrio incorrectly set ignored fields against the data source, allowing it to return non-empty results in these cases. The pitfall is sketched below.
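A hedged illustration of the pitfall; the file and column names are hypothetical:

```python
import pyogrio

# "population" is referenced in `where` but excluded from `columns`:
# depending on the driver this may now return empty results or raise
df = pyogrio.read_dataframe(
    "places.gpkg",
    columns=["name"],
    where="population > 1000",
)
```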
Version 0.7.2
Version 0.7.1
Bug fixes
- Fix unspecified dependency on `packaging` (#318).
Version 0.7.0
Improvements
- Support reading and writing datetimes with timezones (#253).
- Support writing dataframes without geometry column (#267).
- Calculate feature count by iterating over features if GDAL returns an unknown count for a data layer (e.g., OSM driver); this may have significant performance impacts for some data sources that would otherwise return an unknown count (count is used in `read_info`, `read`, `read_dataframe`) (#271).
- Add `arrow_to_pandas_kwargs` parameter to `read_dataframe` + reduce memory usage with `use_arrow=True` (#273).
- In `read_info`, the result now also contains the `total_bounds` of the layer as well as some extra `capabilities` of the data source driver (#281).
- Raise error if `read` or `read_dataframe` is called with parameters to read no columns, geometry, or fids (#280).
- Automatically detect supported driver by extension for all available write drivers and addition of `detect_write_driver` (#270).
- Addition of `mask` parameter to `open_arrow`, `read`, `read_dataframe`, and `read_bounds` functions to select only the features in the dataset that intersect the mask geometry (#285); see the sketch after this list. Note: GDAL < 3.8.0 returns features that intersect the bounding box of the mask when using the Arrow interface for some drivers; this has been fixed in GDAL 3.8.0.
- Removed warning when no features are read from the data source (#299).
- Add support for `force_2d=True` with `use_arrow=True` in `read_dataframe` (#300).
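A minimal sketch of the `mask` parameter, assuming a hypothetical file and a Shapely geometry as the mask:

```python
import pyogrio
from shapely.geometry import box

# only features intersecting this area of interest are returned
aoi = box(4.0, 51.0, 5.0, 52.0)
df = pyogrio.read_dataframe("data.gpkg", mask=aoi)
```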
Other changes
- test suite requires Shapely >= 2.0
- using `skip_features` greater than the number of features available in a data layer now returns empty arrays for `read` and an empty DataFrame for `read_dataframe` instead of raising a `ValueError` (#282).
- enabled `skip_features` and `max_features` for `read_arrow` and `read_dataframe(path, use_arrow=True)`; see the sketch after this list. Note that this incurs overhead because all features up to the next batch size above `max_features` (or size of data layer) will be read prior to slicing out the requested range of features (#282).
- The `use_arrow=True` option can be enabled globally for testing using the `PYOGRIO_USE_ARROW=1` environment variable (#296).
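A minimal sketch of paging with `skip_features` and `max_features`; the file name is hypothetical:

```python
import pyogrio

# read features 100..199 (zero-based offset); also works with use_arrow=True
df = pyogrio.read_dataframe("data.gpkg", skip_features=100, max_features=100)
```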
Bug fixes
- Fix int32 overflow when reading int64 columns (#260)
- Fix `fid_as_index=True` not setting FID as index when using `read_dataframe` with `use_arrow=True` (#265)
- Fix errors reading OSM data due to invalid feature count and incorrect reading of OSM layers beyond the first layer (#271)
- Always raise an exception if there is an error when writing a data source (#284)
Potentially breaking changes
- In `read_info` (#281):
  - the `features` property in the result will now be -1 if calculating the feature count is an expensive operation for this driver. You can force it to be calculated using the `force_feature_count` parameter; see the sketch below.
  - for boolean values in the `capabilities` property, the values will now be booleans instead of 1 or 0.
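A hedged sketch of forcing the feature count; the file name is hypothetical:

```python
import pyogrio

# force an exact count even when it is expensive for this driver
info = pyogrio.read_info("data.gpkg", force_feature_count=True)
print(info["features"], info["capabilities"])
```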
Packaging
- The GDAL library included in the wheels is updated from 3.6.4 to GDAL 3.7.2.
Version 0.6.0
Improvements
- Add automatic detection of 3D geometries in `write_dataframe` (#223, #229)
- Add "driver" property to `read_info` result (#224)
- Add support for dataset open options to `read`, `read_dataframe`, and `read_info` (#233)
- Add support for pandas' nullable data types in `write_dataframe`, or specifying a mask manually for missing values in `write` (#219); see the sketch after this list
- Standardized 3-dimensional geometry type labels from "2.5D <type>" to "<type> Z" for consistency with well-known text (WKT) formats (#234)
- Failure and warning error messages from GDAL are no longer printed to stderr: failures were already translated into Python exceptions and warning messages are now translated into Python warnings (#236, #242).
- Add access to low-level pyarrow `RecordBatchReader` via `pyogrio.raw.open_arrow`, which allows iterating over batches of Arrow tables (#205).
- Add support for writing dataset and layer metadata (where supported by driver) to `write` and `write_dataframe`, and add support for reading dataset and layer metadata in `read_info` (#237).
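A minimal sketch of writing a pandas nullable dtype, with hypothetical data and output path:

```python
import geopandas as gpd
import pandas as pd
import pyogrio
from shapely.geometry import Point

# nullable Int64 column containing a missing value
gdf = gpd.GeoDataFrame(
    {"value": pd.array([1, None], dtype="Int64")},
    geometry=[Point(0, 0), Point(1, 1)],
    crs="EPSG:4326",
)
pyogrio.write_dataframe(gdf, "out.gpkg")
```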
Packaging
- The GDAL library included in the wheels is updated from 3.6.2 to GDAL 3.6.4.
- Wheels are now available for Linux aarch64 / arm64.
Version 0.5.1
Version 0.5.0
Major enhancements
- Support for reading based on Arrow as the transfer mechanism of the data from GDAL to Python (requires GDAL >= 3.6 and `pyarrow` to be installed). This can be enabled by passing `use_arrow=True` to `pyogrio.read_dataframe` (or by using `pyogrio.raw.read_arrow` directly), and provides a further speed-up (#155, #191).
- Support for appending to an existing data source when supported by GDAL by passing `append=True` to `pyogrio.write_dataframe` (#197). See the sketch after this list.
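A minimal sketch of appending, assuming hypothetical data and an output format whose driver supports append:

```python
import geopandas as gpd
import pyogrio
from shapely.geometry import Point

gdf = gpd.GeoDataFrame({"name": ["a"]}, geometry=[Point(0, 0)], crs="EPSG:4326")
pyogrio.write_dataframe(gdf, "out.gpkg")               # create the data source
pyogrio.write_dataframe(gdf, "out.gpkg", append=True)  # append more features
```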
Potentially breaking changes
- In floating point columns, NaN values are now by default written as "null" instead of NaN, but with an option to control this (pass `nan_as_null=False` to keep the previous behaviour) (#190).
Improvements
- It is now possible to pass GDAL's dataset creation options in addition to layer creation options in `pyogrio.write_dataframe` (#189).
- When specifying a subset of `columns` to read, unnecessary IO or parsing is now avoided (#195).
Packaging
- The GDAL library included in the wheels is updated from 3.4 to GDAL 3.6.2, and is now built with GEOS and SQLite with R-tree support enabled (which allows writing a spatial index for GeoPackage).
- Wheels are now available for Python 3.11.
- Wheels are now available for macOS arm64.
Version 0.4.2
Improvements
- new `get_gdal_data_path()` utility function to check the path of the data directory detected by GDAL (#160)
Bug fixes
- register GDAL drivers during initial import of pyogrio (#145)
- support writing "not a time" (NaT) values in a datetime column (#146)
- fixes an error when reading GPKG with bbox filter (#150)
- properly raises error when invalid where clause is used on a GPKG (#150)
- avoid duplicate count of available features (#151)