'utf-8' codec can't decode #1282

Closed
Mo-Dabao opened this issue Mar 13, 2019 · 7 comments · Fixed by #2236

@Mo-Dabao

Description

The same code worked last year, but it fails now.

Code to reproduce

import cartopy.io.shapereader as shpreader


def __extract():
    shpname = r".\map\China_province"
    provinces_records = list(shpreader.Reader(shpname).records())
    provinces_geometrys = [x.geometry for x in provinces_records]
    provinces_attributes = [x.attributes for x in provinces_records]
    return provinces_geometrys, provinces_attributes

provinces_shapes, provinces_records = __extract()

Traceback

Traceback (most recent call last):

  File "<ipython-input-51-fc14594e3fb3>", line 1, in <module>
    from China_map import provinces_geometrys

  File "E:\Python\pyNMIC\FY4A\China_map.py", line 24, in <module>
    provinces_geometrys, provinces_attributes = __extract()

  File "E:\Python\pyNMIC\FY4A\China_map.py", line 18, in __extract
    provinces_records = list(shpreader.Reader(shpname).records())

  File "D:\Anaconda3\lib\site-packages\cartopy\io\shapereader.py", line 249, in records
    shape_record = self._reader.shapeRecord(i)

  File "D:\Anaconda3\lib\site-packages\shapefile.py", line 992, in shapeRecord
    return ShapeRecord(shape=self.shape(i), record=self.record(i))

  File "D:\Anaconda3\lib\site-packages\shapefile.py", line 961, in record
    return self.__record(oid=i)

  File "D:\Anaconda3\lib\site-packages\shapefile.py", line 946, in __record
    value = u(value, self.encoding, self.encodingErrors)

  File "D:\Anaconda3\lib\site-packages\shapefile.py", line 104, in u
    return v.decode(encoding, encodingErrors)

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb0 in position 0: invalid start byte

Full environment definition

Operating system

Win10 1809 64bit

Cartopy version

Currently 0.17.0; I don't remember which version I was using when it worked.

conda list

pyshp                     2.0.1
shapely                   1.6.4
python                    3.7.2
......

When I read the file directly with pyshp and pass encoding="gbk":

from shapefile import Reader as shpReader

shpname = r".\map\China_province"
file = shpReader(shpname, encoding="gbk")
provinces_records = file.records()
provinces_shapes = file.shapes()

there is no error.

What happened to shapefile or cartopy? Can I pass encoding="gbk" to cartopy.io.shapereader.Reader()?
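
The error in the traceback can be reproduced without any shapefile. The sketch below assumes the attribute table stores province names as GBK bytes; the sample name and the `try_decode` helper are illustrative and not taken from the original file.

```python
# Hypothetical attribute value: a province name stored as GBK bytes in the .dbf file.
# The first byte of this value is 0xb0, which is not a valid UTF-8 start byte.
raw = "安徽".encode("gbk")


def try_decode(data: bytes, encoding: str) -> str:
    """Decode data with the given encoding, returning the error text instead of raising."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        return f"failed: {exc}"


print(try_decode(raw, "utf-8"))  # reports the same kind of error as the traceback
print(try_decode(raw, "gbk"))    # decodes cleanly
```

This is consistent with the report: pyshp decodes with UTF-8 unless told otherwise, so the same bytes decode only once encoding="gbk" is passed.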

@greglucas
Contributor

I reworked a lot of the shapefile reading this past year, but there has never been an encoding parameter on Cartopy's side. It looks like both pyshp and Fiona accept an encoding keyword, so if you wanted to make a pull request adding it to the readers, I think that would be a worthwhile addition.

Reading pyshp's latest release notes: https://github.com/GeospatialPython/pyshp/releases
They put this statement in the 2.1.0 release notes: "Added back read/write support for unicode field names." So perhaps upgrading to 2.1.0 would work for you?

I would also suggest installing Fiona to do the shapefile reading if possible. It is much faster, and perhaps that library can detect the file encoding better. It looks like you're using an Anaconda environment, so: conda install fiona

@Mo-Dabao
Author

Mo-Dabao commented Mar 14, 2019

Thanks for the advice.

I found a way to convert my shapefiles from GBK to UTF-8, so there is no need to change the code now.

Using Cartopy to handle projections and add shapefiles when visualizing data is a nice experience. Learning a new package (Fiona) is a burden for lazy me.

Many thanks for all your work!
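
The comment above does not say which tool performed the conversion. Whatever the tool, the core step is re-encoding each text value, which can be sketched as follows. The `reencode` helper and the sample name are hypothetical. Text fields in a .dbf file have a fixed width in bytes, so the sketch also prints how much longer the value becomes.

```python
def reencode(value: bytes, src: str = "gbk", dst: str = "utf-8") -> bytes:
    """Re-encode one raw text value from the src encoding to the dst encoding."""
    return value.decode(src).encode(dst)


# Hypothetical attribute value, as it would be stored in a GBK-encoded table.
name_gbk = "内蒙古自治区".encode("gbk")
name_utf8 = reencode(name_gbk)

# These characters take 2 bytes each in GBK and 3 bytes each in UTF-8.
print(len(name_gbk), len(name_utf8))
```

A field sized exactly for the GBK value would be too narrow for the UTF-8 value, so a conversion may also need to widen the text fields.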

@QuLogic
Member

QuLogic commented Mar 14, 2019

cartopy.io.shapereader.Reader will automatically use Fiona for you; no new API to use.

@wqshen

wqshen commented Feb 28, 2020

Adding an argument to the Reader class may be a better choice.

The laziest way is to add **kwargs to the BasicReader / FionaReader classes in cartopy/io/shapereader.py,
so that one can pass encoding='XXXXX' through to the upstream library call (shapefile.Reader / fiona.open).

class BasicReader(object):
    """
    Provide an interface for accessing the contents of a shapefile.
    The primary methods used on a Reader instance are
    :meth:`~Reader.records` and :meth:`~Reader.geometries`.
    """
    def __init__(self, filename, **kwargs):
        # Validate the filename/shapefile
        self._reader = reader = shapefile.Reader(filename, **kwargs)

And

class FionaReader(object):
    """
    Provides an interface for accessing the contents of a shapefile
    with the fiona library, which has a much faster reader than pyshp.
    The primary methods used on a Reader instance are
    :meth:`~Reader.records` and :meth:`~Reader.geometries`.
    """
    def __init__(self, filename, bbox=None, **kwargs):
        self._data = []

        with fiona.open(filename, **kwargs) as f:
            if bbox is not None:
                assert len(bbox) == 4
                features = f.filter(bbox=bbox)
            else:
                features = f
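
The forwarding pattern proposed above can be tried in isolation. This is a minimal sketch, not Cartopy's code: `FakeBackendReader` is a hypothetical stand-in for shapefile.Reader that only records its arguments, and `SketchReader` is a hypothetical wrapper.

```python
class FakeBackendReader:
    """Hypothetical stand-in for shapefile.Reader; it only records its arguments."""

    def __init__(self, filename, encoding="utf-8", encodingErrors="strict"):
        self.filename = filename
        self.encoding = encoding
        self.encodingErrors = encodingErrors


class SketchReader:
    """Hypothetical wrapper that forwards extra keyword arguments upstream."""

    def __init__(self, filename, **kwargs):
        # Anything the wrapper does not consume itself is passed on unchanged.
        self._reader = FakeBackendReader(filename, **kwargs)


reader = SketchReader(r".\map\China_province", encoding="gbk")
print(reader._reader.encoding)
```

One consequence of this design is that the wrapper does not validate keywords itself. A misspelled keyword is rejected by the backend with a TypeError, and the accepted keywords differ between backends.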

@greglucas
Contributor

@wqshen do you want to open a PR with your proposed change?

@LuckyBoy314

@wqshen It really works, awesome!

@lgolston
Contributor

Passing encoding to pyshp/fiona is now added with #2236.

@QuLogic QuLogic added this to the Next Release milestone Oct 13, 2023
@lgolston lgolston linked a pull request Oct 13, 2023 that will close this issue