v4.16.2: docker context list gets "unexpected end of JSON input" #13180

Closed
3 tasks done
rfay opened this issue Jan 19, 2023 · 62 comments

Comments

@rfay
Contributor

rfay commented Jan 19, 2023

  • I have tried with the latest version of Docker Desktop
  • I have tried disabling enabled experimental features
  • I have uploaded Diagnostics
  • Diagnostics ID: EE17CD29-04CD-4EEB-A128-5EE34779FFD8/20230119214601

Actual behavior

PS C:\Users\testbot> docker context list
unexpected end of JSON input

Expected behavior

docker context list should give a list of docker contexts

This happens in PS and in WSL2 as well, same thing. I tested on two different Windows systems.
I tested on macOS and did not see this.

Information

  • Windows Version: 11
  • Docker Desktop Version: 4.16.1
  • WSL2 or Hyper-V backend? WSL2
  • Are you running inside a virtualized Windows e.g. on a cloud server or a VM: No

Output of & "C:\Program Files\Docker\Docker\resources\com.docker.diagnose.exe" check

Skipped since I haven't ever seen it work

Steps to reproduce the behavior

On 4.16.1, docker context list

@rfay
Contributor Author

rfay commented Jan 19, 2023

I should note that as a result of this I have to roll back to 4.15.0 on test runners. But that is no longer allowed by the installer, which says "Oh, it's up-to-date", meaning that I'll have to uninstall and reinstall to get to 4.15.0.

tb-wsldd-05

@rfay
Contributor Author

rfay commented Jan 19, 2023

docker context use default works.
docker context inspect default works.

@rfay
Contributor Author

rfay commented Jan 19, 2023

And reverting to v4.15.0 I still have this problem, so I imagine it's some file left as-is during uninstall? Tried removing .docker/config.json but that didn't help.

@rfay
Contributor Author

rfay commented Jan 19, 2023

I did find a 4.16.1 (that was updated from a previous version) that does not show this behavior. These two that are failing are fresh installs.

@rfay
Contributor Author

rfay commented Jan 20, 2023

Now this is happening on a 3rd machine running 4.16.1, and I have no solutions.

@nicks

nicks commented Jan 20, 2023

Under the hood, I think 'docker context ls' reads json files on disk, decodes them, and pretty prints them. It doesn't change much. I doubt it changed between 4.15 and 4.16. I think the files are under ~/.docker/contexts or something similar? Probably they're getting corrupted somehow by some command that runs before.

Maybe there's a docker/cli bug here to print the file where the error is coming from.

@rfay
Contributor Author

rfay commented Jan 20, 2023

Thanks @nicks -

Here's what I saw:

$ ls -l .docker/contexts/meta/*/*
-rwxrwxrwx 1 buildkite-agent buildkite-agent 0 Jan 19 16:31 .docker/contexts/meta/fe9c6bd7a66301f49ca9b6a70b217107cd1284598bfc254700c989b916da791e/meta.json

On each system there was a meta.json that was empty. Deleting them made docker context ls work.

On each of two Windows/WSL2 systems I examined, the contexts were zeroed out/corrupted, and this was true both inside wsl2 in ~/.docker/contexts and also on PowerShell in ~/.docker/contexts.

I imagine something in 4.16.1 is zeroing them out, not sure.
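
The manual check above can be scripted. What follows is a minimal Python sketch, not part of Docker's tooling. It assumes the store layout visible in the ls output (contexts/meta/<hash>/meta.json under the .docker directory), and the function name find_broken_context_files is hypothetical. It reports every metadata file that is empty or fails to parse, which is the detail the CLI error message leaves out.

```python
import json
from pathlib import Path


def find_broken_context_files(docker_dir: Path) -> list[tuple[Path, str]]:
    """Return (path, reason) for each meta.json that is empty or not valid JSON.

    Assumes the layout seen in this thread: <docker_dir>/contexts/meta/<hash>/meta.json
    """
    broken = []
    for meta in sorted(docker_dir.glob("contexts/meta/*/meta.json")):
        raw = meta.read_bytes()
        if not raw.strip():
            # A zero-byte (or whitespace-only) file is the case reported here.
            broken.append((meta, "empty file"))
            continue
        try:
            json.loads(raw)
        except ValueError as err:  # covers JSONDecodeError and bad encodings
            broken.append((meta, f"invalid JSON: {err}"))
    return broken


if __name__ == "__main__":
    for path, reason in find_broken_context_files(Path.home() / ".docker"):
        print(f"{path}: {reason}")
```

On Windows with WSL2 the check would need to be run on both sides, since the empty files were seen in both ~/.docker/contexts locations.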

@rfay
Contributor Author

rfay commented Jan 23, 2023

This problem continues in 4.16.2 without change. I find that if I remove the ~/.docker/contexts it gets recreated with an empty value when Docker Desktop is restarted.

rfay changed the title from v4.16.1: docker context list gets "unexpected end of JSON input" to v4.16.2: docker context list gets "unexpected end of JSON input" on Jan 23, 2023

@nicks

nicks commented Jan 26, 2023

possibly related - #12561 (though the exact symptoms are slightly different)

@rfay
Contributor Author

rfay commented Jan 26, 2023

Thanks - these context files seem to be empty, so I think although the symptom is the same, the cause may be different.

@fari-99

fari-99 commented Jan 27, 2023

so i got this problem too, using Docker Desktop for Windows version 4.16.2 (95914).

the only fix i can do is using docker/desktop-linux#57 (comment) solution.

  1. vim ~/.docker/contexts/meta/{some hash}/meta.json
  2. put this in the meta.json
    { "Name": "default", "Metadata": { "StackOrchestrator": "swarm" }, "Endpoints": { "docker": { "Host": "unix:///var/run/docker.sock", "SkipTLSVerify": false } }, "TLSMaterial": {} }
  3. do docker context ls again.

unfortunately, when i restart my docker desktop, the context got cleared again. and i need to do this all over again.
just a workaround, not a true solution.

@nicks

nicks commented Jan 27, 2023

Clarification question @rfay : after you delete the directory, how long does it take to re-occur? On every desktop restart? After a few hours? Intermittently?

@KarelVendla

KarelVendla commented Jan 27, 2023

I am also facing this problem.

I've created a dummy context.
example:
docker context create wsl-default
And then deleted the broken one.

I get the issue less often, but it still occurs. Before, I would just delete the hash folder with the empty meta.json, and the problem would come back a lot quicker (I reckon docker doesn't like it when your meta folder is empty).

It seems to occur when I crash my docker, it sometimes happens when I've restarted my PC. Haven't really noticed a distinct pattern yet.

The hash with the empty meta.json always seems to have the same hash, maybe that might give some clues.
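
On why the hash stays the same: as far as I can tell from the docker/cli context store code, the directory name is the SHA-256 hex digest of the context name, so a given context always maps to the same directory. Treat that as an assumption, since nothing in this thread confirms it. The helper name context_dir_name below is hypothetical, and the sketch only shows how one might test the assumption.

```python
import hashlib


def context_dir_name(context_name: str) -> str:
    """Return the SHA-256 hex digest of a context name.

    Assumption: this mirrors how the CLI names directories under
    contexts/meta; it is not confirmed anywhere in this thread.
    """
    return hashlib.sha256(context_name.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Compare these against the directory that holds the empty meta.json.
    for name in ("default", "desktop-linux"):
        print(name, context_dir_name(name))
```

If one of the printed digests matches the directory name, that would indicate which context's metadata is being truncated.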

@fari-99

fari-99 commented Jan 27, 2023

sorry to interrupt, i finally fixed it by deleting my .docker file on the Windows side (in C:/Users/{Your Username}/.docker), and restarting my docker.

NOTE: I'm not "deleting" it per se, just renaming it to .docker-old to be safe; if an error occurs, I can simply undo the rename.

@KarelVendla

> sorry to interrupt, i finally fixed it by deleting my .docker file on the Windows side (in C:/Users/{Your Username}/.docker), and restarting my docker.
>
> NOTE: i'm not "deleting" it per se, just rename it to .docker-old just to be safe if error occurred, i can just redo the rename.

Do you mean .docker folder? I don't seem to have a .docker file

@fari-99

fari-99 commented Jan 27, 2023

>> sorry to interrupt, i finally fixed it by deleting my .docker file on the Windows side (in C:/Users/{Your Username}/.docker), and restarting my docker.
>> NOTE: i'm not "deleting" it per se, just rename it to .docker-old just to be safe if error occurred, i can just redo the rename.
>
> Do you mean .docker folder? I don't seem to have a .docker file

yeah the .docker folder.

@rfay
Contributor Author

rfay commented Jan 27, 2023

Deleting ~/.docker would fix it certainly until the next Docker Desktop restart. Same as deleting ~/.docker/contexts

@rfay
Contributor Author

rfay commented Jan 27, 2023

@nicks if you delete ~/.docker/contexts it solves this problem until the next Docker Desktop restart, when the empty context json file immediately reappears and breaks docker context list.

@bplasmeijer

Deleting ~/.docker/contexts works for me too.

How can this be fixed ASAP @thaJeztah ? Do we need to create a ticket at moby too?

@thaJeztah
Member

The docker context code is all client-side (https://github.com/docker/cli repository), but I don't think there's been changes in that area in 20.10.x patch releases of the cli (https://github.com/docker/cli/pulls?q=is%3Apr++is%3Aclosed+label%3Aarea%2Fcontext+)

On Docker Desktop, it's possible that other binaries interact with those files, e.g. the compose-cli wrapper (used for cloud integration) may also be interacting with them; https://github.com/docker/compose-cli/blob/7a57a330f6d94247e36dc868b3d9d7161fb4429f/api/context/store/store.go#L81-L97

Not sure when/in what cases it would though.

@rfay
Contributor Author

rfay commented Jan 31, 2023

Thanks for looking at this @thaJeztah - Since restarting Docker Desktop (and not doing anything else) creates the empty context json files, it seems to implicate some kind of interaction.

@bplasmeijer

Thanks for the feedback @thaJeztah
Should I create a cross-link on the https://github.com/docker/cli repository?

@thaJeztah
Member

At this moment it's not 100% clear where the cause is (so it could be fully unrelated to the CLI code). Maybe @nicks has done more digging into the issue though (from the Docker Desktop side of things).

@nicks

nicks commented Jan 31, 2023

We were talking about it because we saw a slight uptick of complaints about it in the compose repo. Haven't been able to repro it, and the reports are kind of all over the place repro-wise (a couple reports that it happens on every desktop restart, other reports that say it's intermittent, other reports about it from years ago, etc etc.). And not so many reports to make me think a large percentage of users are seeing it. Stumped.

One thing that's slightly suspicious is that the diagnostic bundle rfay posted above has a timestamp of when the meta.json file was written with empty contents, but the desktop logs from that time are mysteriously missing. But this might be a red herring.

@crazy-max
Member

crazy-max commented Jan 31, 2023

> The docker context code is all client-side (https://github.com/docker/cli repository), but I don't think there's been changes in that area in 20.10.x patch releases of the cli (https://github.com/docker/cli/pulls?q=is%3Apr++is%3Aclosed+label%3Aarea%2Fcontext+)
>
> On Docker Desktop, it's possible that other binaries interact with those files, e.g. the compose-cli wrapper (used for cloud integration) may also be interacting with them; https://github.com/docker/compose-cli/blob/7a57a330f6d94247e36dc868b3d9d7161fb4429f/api/context/store/store.go#L81-L97
>
> Not sure when/in what cases it would though.

That's odd, I can't find "failed to read metadata" on the 20.10 branch, but there is something on master: https://github.com/docker/cli/blob/e92dd87c3209361f29b692ab4b8f0f9248779297/cli/context/store/metadatastore.go#L116

On buildx we are vendoring docker cli 23, that might be why: https://github.com/docker/buildx/blob/cb94298a0238f67e008658e831af8b8dd313d444/go.mod#L10

I have no clue why compose users would have this issue though: https://github.com/docker/compose/blob/f24d3458c6d05e19519e660e4faa04d47f6db103/go.mod#L14

But best guess is cli vendored in buildx messed up the context store and therefore impacts compose after that.

cc @tonistiigi @jedevc

@thaJeztah
Member

@crazy-max compose also has a replace rule https://github.com/docker/compose/blob/f24d3458c6d05e19519e660e4faa04d47f6db103/go.mod#L159

@crazy-max
Member

@thaJeztah Could be linked to docker/cli#3790?

@bplasmeijer

@nicks docker/compose#9956 ?
Do you want a log after a failed restart and docker context ls?

@nicks

nicks commented Feb 3, 2023

@bplasmeijer ya, it's helpful to have logs of the sequence:

  1. docker context ls works
  2. you restart DD
  3. docker context ls fails with this error

This sequence will tell us what's happening in DD when the file gets corrupted, and helps rule out possible codepaths where this could be happening.

or you could try the debug build in #13180 (comment) and see if it still re-corrupts the file.

@Yahav

Yahav commented Feb 3, 2023

> (or anyone else experiencing this issue frequently) - can you try this build and see if it helps?

Is this safe to use though?

@schlich

schlich commented Feb 5, 2023

i've been running into this issue for weeks, also on WSL2, it's been a giant pain, especially when trying to boot up dev containers. I will give a go at trying that build! let me know if i can provide any helpful logs.

@schlich

schlich commented Feb 5, 2023

That debug build didn't work, i'm still getting blank generated ~/.docker/contexts/meta/<hash>/meta.json files.

@schlich

schlich commented Feb 5, 2023

I was previously only attempting to clear context via clearing the .docker/contexts folder on the WSL side, but hadn't thought to check out the Windows side until i saw that comment upthread. On a hunch I erased the newline at the end of the meta.json on the Windows side, and I think that's fixed the issue? 🙏

I can do a full docker restart with no corruption to the contexts!

@fari-99

fari-99 commented Feb 6, 2023

Deleting the .docker/contexts folder, whether in WSL or in Windows, is only a temporary solution (at least for me). The problem comes back once in a while; I don't know why.

@nicks

nicks commented Feb 7, 2023

ok, i have a sandbox that reliably reproduces this, it's here:
https://github.com/nicks/contextstore-sandbox

i think there are multiple levels of safety checks in docker/cli and docker/for-win that will help mitigate this, but the sandbox will help me test them.

@MarkWard0110

I hit this problem today. If it helps, here is a recap of my activity.
Working with VSCode dev containers. I attempted to use the Dev Container command "Create Dev Container..." in a folder when it appeared the initialization stopped and would not continue. Executing "docker" on the command line would execute but hang and not exit. I opened ubuntu (WSL2), and it informed me that Microsoft Store now manages WSL, and I should execute wsl.exe --upgrade to link it. I rebooted my computer and initiated the wsl.exe --upgrade. After the upgrade I opened a folder in VSCode and executed the "Create Dev Container..." command, which opened up the selected dev container. I repeated this activity for various different kinds of dev container templates.
The area where I encountered a problem with dev containers was opening a folder having a .devcontainer directory. I used the "Add Dev Container Configuration Files" command, and after selecting a template, I attempted to open the folder in the dev container. This is when it would fail to open, and the dev container build would fail. The "Create Dev Container..." command still worked.
I deleted the .docker/contexts directory, and now I can open a folder that has a .devcontainer directory in a VS Code dev container.

Maybe there is something different between the dev container "Create Dev Container" and the open using the local .devcontainer directory.

@DenyWatanabe

DenyWatanabe commented Feb 13, 2023

Small contribution, in Windows if I edit ~/.docker/contexts/meta/(hash)/meta.json and change it to {}, I get:
(screenshot not preserved)

Upon restarting docker, the contents of ~/.docker/contexts/meta/(hash)/meta.json are automatically changed to:
{"Name":"desktop-linux","Metadata":{},"Endpoints":{"docker":{"Host":"npipe:////./pipe/dockerDesktopLinuxEngine","SkipTLSVerify":false}}}

Also docker context list continues working exactly the same as above.
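
The manual edit described above can be sketched as a small Python script. This is only an illustration of the workaround reported in this thread, not an official fix: it assumes the contexts/meta/<hash>/meta.json layout mentioned in the comments, the function name repair_empty_meta_files is hypothetical, and whether Docker Desktop then rewrites the file with real contents on restart is something only reported here, not guaranteed.

```python
from pathlib import Path


def repair_empty_meta_files(docker_dir: Path) -> list[Path]:
    """Write "{}" into every empty meta.json and return the files changed.

    Files that already have content are left untouched.
    """
    repaired = []
    for meta in sorted(docker_dir.glob("contexts/meta/*/meta.json")):
        if meta.read_bytes().strip():
            continue  # has content already; do not overwrite
        meta.write_text("{}", encoding="utf-8")
        repaired.append(meta)
    return repaired
```

A call such as repair_empty_meta_files(Path.home() / ".docker") would apply it to the current user's store. On a WSL2 setup the Windows-side profile folder may need the same treatment.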

@alchatti

alchatti commented Feb 14, 2023

@DenyWatanabe thanks, your solution worked for me. I'm on W11 using WSL, and the meta.json under the .docker folder of my Windows profile was empty, so I added {}:

C:\Users\(USERNAME)\.docker\contexts\meta\(HASH)\meta.json

(#13180 (comment))

EDIT: Docker Desktop 4.16.3 (96739)

@mdelgado

mdelgado commented Feb 16, 2023

hi! I hope my solution helps someone.
In my case I did as @alchatti suggested here, but also a couple of symlinks were broken:

From the WSL console, inside ~/.docker:

(screenshot not preserved)

Those two symlinks shown above were broken, so I deleted and recreated them, then applied the change that @alchatti suggested and restarted docker.

Hope it helps!

nicks added a commit to nicks/cli that referenced this issue Feb 21, 2023
Write to a tempfile then move, so that if the
process dies mid-write it doesn't corrupt the store.

Also improve error messaging so that if a file does
get corrupted, the user has some hope of figuring
out which file is broken.

For background, see:
docker/for-win#13180
docker/for-win#12561

For a repro case, see:
https://github.com/nicks/contextstore-sandbox

Signed-off-by: Nick Santos <nick.santos@docker.com>
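
The "write to a tempfile then move" technique from the commit message can be illustrated outside the CLI. The sketch below is in Python rather than the CLI's Go, and it is not the code from that commit. The function name write_json_atomically is hypothetical. The point is that readers see either the old file or the complete new one, never a truncated file, because the final step is a rename within the same directory.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, payload: dict) -> None:
    """Write payload to path via a temp file in the same directory, then rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live on the same filesystem for the rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=".tmp-", suffix=".json")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(payload, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_name, path)  # replaces any existing file in one step
    except BaseException:
        # If anything failed before the rename, remove the leftover temp file.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

If the process dies mid-write, the worst outcome under this scheme is a stray temp file next to an intact meta.json, rather than a zero-byte meta.json.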
@radthenone

> so i got this problem too, using Docker Desktop for Windows version 4.16.2 (95914).
>
> the only fix i can do is using docker/desktop-linux#57 (comment) solution.
>
>   1. vim ~/.docker/contexts/meta/{some hash}/meta.json
>   2. put this in the meta.json
>     { "Name": "default", "Metadata": { "StackOrchestrator": "swarm" }, "Endpoints": { "docker": { "Host": "unix:///var/run/docker.sock", "SkipTLSVerify": false } }, "TLSMaterial": {} }
>   3. do docker context ls again.
>
> unfortunately, when i restart my docker desktop, the context got cleared again. and i need to do this all over again. just a workaround, not a true solution.

thanks man

@lainosantos

Hi there.

Only deleting the folder %USERPROFILE%\.docker\contexts\meta on the Windows side solves the problem, even after restarting Docker Desktop.

@radthenone

radthenone commented Feb 24, 2023

I deleted .docker, closed my IDE, quit Docker Desktop, ran wsl --shutdown, and restarted the PC. Now everything works and I no longer need to re-edit the meta file.

@chaizeg

chaizeg commented Feb 27, 2023

Closing this issue because a fix has been released in Docker Desktop 4.17.0. See the release notes for more details.

chaizeg closed this as completed Feb 27, 2023
thaJeztah pushed a commit to thaJeztah/cli that referenced this issue Mar 1, 2023
(cherry picked from commit c2487c2)
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
@docker-robott
Collaborator

Closed issues are locked after 30 days of inactivity.
This helps our team focus on active issues.

If you have found a problem that seems similar to this, please open a new issue.

/lifecycle locked

craig-osterhout pushed a commit to craig-osterhout/cli that referenced this issue Apr 21, 2023