StoryMaps very slow to clone due to encoding check #2159

Open
jake-eagle opened this issue Nov 12, 2024 · 0 comments
Describe the bug
When cloning a StoryMap from one organisation to another, the clone_items() method takes a very long time, specifically while cloning the StoryMap's resources.

I've identified the cause of this issue. The response from resources/export?f=zip is passed to the method handle_401 in Lib\site-packages\arcgis\auth_auth_token.py.

This method reads the response's .text, which makes the requests module try to detect the encoding of the response content. Because the content is a zip file, the detection takes a very long time: several hours on my machine.

To Reproduce
Steps to reproduce the behavior:

Clone a StoryMap with resources (images) from one organisation to another

from arcgis.gis import GIS

gis1 = GIS()  # source organisation
gis2 = GIS()  # target organisation
storymap = gis1.content.get("<id>")
copy_list = [storymap]
clone_results = gis2.content.clone_items(copy_list, copy_data=True, search_existing_items=True)

Debug log output (no error is raised; the call just takes a very long time):

2024-11-13 09:28:35,588,588 DEBUG    [mbcharsetprober.py:65] EUC-KR Korean prober hit error at byte 10
2024-11-13 09:28:35,589,589 DEBUG    [mbcharsetprober.py:65] CP949 Korean prober hit error at byte 60
2024-11-13 09:28:35,590,590 DEBUG    [mbcharsetprober.py:65] Big5 Chinese prober hit error at byte 10
2024-11-13 09:28:35,591,591 DEBUG    [mbcharsetprober.py:65] EUC-TW Taiwan prober hit error at byte 10
2024-11-13 09:29:08,463,463 DEBUG    [sbcharsetprober.py:129] windows-1251 confidence = 0.046932830793816153, below negative shortcut threshhold 0.05
2024-11-13 09:29:39,105,105 DEBUG    [sbcharsetprober.py:129] KOI8-R confidence = 0.04048963086780929, below negative shortcut threshhold 0.05
2024-11-13 09:30:09,992,992 DEBUG    [sbcharsetprober.py:129] ISO-8859-5 confidence = 0.04340445793941901, below negative shortcut threshhold 0.05
2024-11-13 09:30:32,587,587 DEBUG    [sbcharsetprober.py:129] MacCyrillic confidence = 0.04770257721175075, below negative shortcut threshhold 0.05

Screenshots
[image: Timing warning from VS Code]
[image: Debug logs showing multiple minutes trying to identify file encoding]
[image: Stack trace]
[image: Results of printing r.text, showing that r.text is not empty for zip files]

Expected behavior
For binary files, this method should not check the response content.
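
One possible way to decide whether a response is binary is to look at its Content-Type header before touching the body. The following is only a sketch: `is_textual` and `TEXTUAL_SUBTYPES` are hypothetical names, not part of arcgis or requests, and the set of subtypes treated as text is an assumption chosen for illustration.

```python
from email.message import Message

# Hypothetical helper, not part of arcgis or requests.
# Which non-"text/*" subtypes count as text is an assumption for illustration.
TEXTUAL_SUBTYPES = {"json", "xml", "javascript", "x-www-form-urlencoded"}


def is_textual(content_type):
    """Return True when a Content-Type header value describes a text body."""
    if not content_type:
        # No header at all: treat the body as binary and skip any text check.
        return False
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_maintype() == "text":
        return True
    return msg.get_content_subtype() in TEXTUAL_SUBTYPES
```

A caller could pass `r.headers.get("Content-Type")` and return the response untouched when the helper reports a non-text body, such as `application/zip`.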

A not-very-elegant approach I used to successfully mitigate this issue was to update the handle_401 method manually to bypass any response where the encoding was not provided:

    def handle_401(self, r, **kwargs):
        """
        handles the issues in the response where token might be rejected
        """
        parsed = parse_url(r.url)
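        # No encoding reported for the response (as with the zip export):
        # return early so r.text is never decoded.
        if r.encoding is None:
            return r
        elif (
            r.text.lower().find("invalid token") > -1
            or r.text.find("Token is valid but access is denied.") > -1
            or (parsed.scheme, parsed.netloc, parsed.path) in self._no_go_token
        ):
            ...  # remainder of the original method is unchanged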
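
Another option, shown here only as a sketch, would be to search the raw bytes of the body instead of the decoded text, so that no encoding detection runs at all. `has_token_error` is a hypothetical helper and not part of arcgis; it assumes the two messages checked in handle_401 arrive as plain ASCII in the body.

```python
# Hypothetical helper, not part of the arcgis package.
# Assumes the two messages checked in handle_401 appear as ASCII bytes.
INVALID_TOKEN = b"invalid token"
ACCESS_DENIED = b"Token is valid but access is denied."


def has_token_error(content: bytes) -> bool:
    """Return True if the raw body contains one of the token error messages.

    Operates on bytes (for example ``r.content``), so the body is never
    decoded and no charset detection is triggered.
    """
    # bytes.lower() only lowercases ASCII letters, mirroring the
    # case-insensitive "invalid token" check in the original method.
    return INVALID_TOKEN in content.lower() or ACCESS_DENIED in content
```

Calling `has_token_error(r.content)` still scans the whole zip once, but a byte search is a single linear pass rather than a run through every charset prober.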

Platform (please complete the following information):

  • OS: Windows
  • Browser: N/A (Visual Studio Code)
  • Python API Version: 2.4.0

Additional context
N/A

jake-eagle added the bug label Nov 12, 2024