Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when attempting to remove user from an organization #10744

Closed
2 of 7 tasks
gilesorr opened this issue Mar 16, 2020 · 18 comments
Closed
2 of 7 tasks

Crash when attempting to remove user from an organization #10744

gilesorr opened this issue Mar 16, 2020 · 18 comments
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/bug

Comments

@gilesorr
Copy link

gilesorr commented Mar 16, 2020

  • Gitea version (or commit ref): 1.11.3 (also 1.11)
  • Git version: 2.20.1
  • Operating system: Debian GNU/Linux 10 (buster)
  • Database (use [x]):
    • PostgreSQL
    • MySQL
    • MSSQL
    • SQLite
  • Can you reproduce the bug at https://try.gitea.io:
    • Yes (provide example URL)
    • No (cannot do administration on the try machine)
    • Not relevant
  • Log gist:

Description

Further information: running in Digital Ocean, smallest instance size available (1M memory, has previously been beautifully stable even under reasonable load).

We have a small organization (16 users). Two coders recently departed, so I went into the GUI (as an administrator) to remove them. One was fine, but when I attempted to remove the other it said that the account couldn't be removed while the user was a member of an organization. I went to one of the two organizations they were a part of, went to the user list (/org/<organization_name>/members) and clicked "Remove" by the user name. The interface locked up - and in fact I eventually had to stop the gitea service (which takes a full 60 seconds). I restarted the service, tried to remove the user from the other organization they are a member of, had exactly the same lock-up (and solution).

I upgraded from 1.11 to 1.11.3 and the behaviour is identical. I've discovered that I can disable the account instead of removing the user so that's a partial work-around, but this seems to be a significant bug.

Without admin access to try.gitea.io I can't replicate there.

I'll consider sending logs if you ask for them: I'm not rushing to do so because it reveals more personal info than I'm comfortable with. If you request logs, I may obfuscate them which may make them less useful ...

Otherwise ... massively impressed by this software. Thanks.

Screenshots

@lunny
Copy link
Member

lunny commented Mar 17, 2020

For non single user, we recommand not to use sqlite.

@gilesorr
Copy link
Author

  • why not sqlite? It's fast and can handle a large DB and a lot of transactions (assume I'm aware of potential multi-user security issues)
  • this definitely seems like a bug: does your message mean that because it's occurring with sqlite you won't fix it?
  • is it known that this bug does NOT occur with the other supported databases?

@jolheiser
Copy link
Member

Can you send some logs? If you are comfortable sifting yourself, you can look for errors around when the user is removed.
No need to send the full log, but as much as you are comfortable with would be potentially beneficial.

@gilesorr
Copy link
Author

Here's the log of the event, from when I pressed the "Remove" button until I was forced to shut down the gitea service.

2020/03/19 21:31:55 .../xorm/session_raw.go:78:queryRows() [I] [SQL] SELECT `id`, `lower_name`, `name`, `full_name`, `email`, `keep_email_private`, `email_notifications_preference`, `passwd`, `passwd_hash_algo`, `must_change_password`, `login_type`, `login_source`, `login_name`, `type`, `location`, `website`, `rands`, `salt`, `language`, `description`, `created_unix`, `updated_unix`, `last_login_unix`, `last_repo_visibility`, `max_repo_creation`, `is_active`, `is_admin`, `allow_git_hook`, `allow_import_local`, `allow_create_organization`, `prohibit_login`, `avatar`, `avatar_email`, `use_custom_avatar`, `num_followers`, `num_following`, `num_stars`, `num_repos`, `num_teams`, `num_members`, `visibility`, `repo_admin_change_team_access`, `diff_view_style`, `theme` FROM `user` WHERE `id`=? LIMIT 1 []interface {}{1} - took: 217.18µs
2020/03/19 21:31:55 .../xorm/session_raw.go:78:queryRows() [I] [SQL] SELECT count(*) FROM `notification` WHERE (user_id = ?) AND (status = ?) []interface {}{1, 0x1} - took: 110.909µs
2020/03/19 21:31:55 .../xorm/session_raw.go:78:queryRows() [I] [SQL] SELECT `id`, `lower_name`, `name`, `full_name`, `email`, `keep_email_private`, `email_notifications_preference`, `passwd`, `passwd_hash_algo`, `must_change_password`, `login_type`, `login_source`, `login_name`, `type`, `location`, `website`, `rands`, `salt`, `language`, `description`, `created_unix`, `updated_unix`, `last_login_unix`, `last_repo_visibility`, `max_repo_creation`, `is_active`, `is_admin`, `allow_git_hook`, `allow_import_local`, `allow_create_organization`, `prohibit_login`, `avatar`, `avatar_email`, `use_custom_avatar`, `num_followers`, `num_following`, `num_stars`, `num_repos`, `num_teams`, `num_members`, `visibility`, `repo_admin_change_team_access`, `diff_view_style`, `theme` FROM `user` WHERE `lower_name`=? LIMIT 1 []interface {}{"tpl"} - took: 189.369µs
2020/03/19 21:31:55 .../xorm/session_raw.go:78:queryRows() [I] [SQL] SELECT `id`, `org_id`, `lower_name`, `name`, `description`, `authorize`, `num_repos`, `num_members`, `includes_all_repositories`, `can_create_org_repo` FROM `team` WHERE (org_id=?) ORDER BY CASE WHEN name LIKE 'Owners' THEN '' ELSE name END []interface {}{34} - took: 125.2µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `uid`, `org_id`, `is_public` FROM `org_user` WHERE (uid=?) AND (org_id=?) LIMIT 1 []interface {}{29, 34} - took: 58.135µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `lower_name`, `name`, `full_name`, `email`, `keep_email_private`, `email_notifications_preference`, `passwd`, `passwd_hash_algo`, `must_change_password`, `login_type`, `login_source`, `login_name`, `type`, `location`, `website`, `rands`, `salt`, `language`, `description`, `created_unix`, `updated_unix`, `last_login_unix`, `last_repo_visibility`, `max_repo_creation`, `is_active`, `is_admin`, `allow_git_hook`, `allow_import_local`, `allow_create_organization`, `prohibit_login`, `avatar`, `avatar_email`, `use_custom_avatar`, `num_followers`, `num_following`, `num_stars`, `num_repos`, `num_teams`, `num_members`, `visibility`, `repo_admin_change_team_access`, `diff_view_style`, `theme` FROM `user` WHERE `id`=? LIMIT 1 []interface {}{34} - took: 210.941µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `org_id`, `lower_name`, `name`, `description`, `authorize`, `num_repos`, `num_members`, `includes_all_repositories`, `can_create_org_repo` FROM `team` WHERE `org_id`=? AND `lower_name`=? LIMIT 1 []interface {}{34, "owners"} - took: 131.983µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT * FROM `team_user` WHERE (org_id=?) AND (team_id=?) AND (uid=?) LIMIT 1 []interface {}{34, 2, 29} - took: 73.476µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `org_user` WHERE `id`=? AND `uid`=? AND `org_id`=? AND `id`=? []interface {}{5, 29, 34, 5} - took: 401.22µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] UPDATE `user` SET num_members=num_members-1 WHERE id=? []interface {}{34} - took: 109.243µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `team`.`id` FROM `team` INNER JOIN team_user ON `team_user`.team_id = team.id WHERE (`team_user`.org_id = ?) AND (`team_user`.uid = ?) []interface {}{34, 29} - took: 84.575µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `repository`.`id` FROM `repository` INNER JOIN team_repo ON `team_repo`.repo_id=`repository`.id WHERE ((`repository`.is_private=? AND `repository`.owner_id=?) OR team_repo.team_id IN (?)) GROUP BY `repository`.id,`repository`.updated_unix ORDER BY updated_unix DESC LIMIT 3 []interface {}{false, 34, 3} - took: 93.401µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `user_id`, `repo_id`, `mode` FROM `watch` WHERE `user_id`=? AND `repo_id`=? LIMIT 1 []interface {}{29, 20} - took: 54.937µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `watch` WHERE `id`=? []interface {}{68} - took: 88.737µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] UPDATE `repository` SET num_watches = num_watches + ? WHERE id = ? []interface {}{-1, 20} - took: 83.826µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `user_id`, `repo_id`, `mode` FROM `watch` WHERE `user_id`=? AND `repo_id`=? LIMIT 1 []interface {}{29, 8} - took: 47.92µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `watch` WHERE `id`=? []interface {}{13} - took: 59.535µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] UPDATE `repository` SET num_watches = num_watches + ? WHERE id = ? []interface {}{-1, 8} - took: 43.511µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `user_id`, `repo_id`, `mode` FROM `watch` WHERE `user_id`=? AND `repo_id`=? LIMIT 1 []interface {}{29, 7} - took: 31.028µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `watch` WHERE `id`=? []interface {}{15} - took: 77.708µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] UPDATE `repository` SET num_watches = num_watches + ? WHERE id = ? []interface {}{-1, 7} - took: 91.597µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `access` WHERE (user_id = ?) AND `repo_id` IN (?,?,?) []interface {}{29, 20, 8, 7} - took: 116.117µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT * FROM `team` INNER JOIN team_user ON team_user.team_id = team.id WHERE (team.org_id = ?) AND (team_user.uid=?) []interface {}{34, 29} - took: 79.945µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT * FROM `team_user` WHERE (org_id=?) AND (team_id=?) AND (uid=?) LIMIT 1 []interface {}{34, 3, 29} - took: 55.312µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT * FROM `repository` INNER JOIN team_repo ON repository.id = team_repo.repo_id WHERE (team_repo.team_id=?) ORDER BY repository.name []interface {}{3} - took: 289.094µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `team_user` WHERE `org_id`=? AND `team_id`=? AND `uid`=? []interface {}{34, 3, 29} - took: 117.46µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] UPDATE `team` SET `num_members` = ? WHERE `id`=? []interface {}{3, 3} - took: 50.818µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `repo_id`, `user_id`, `mode` FROM `collaboration` WHERE `repo_id`=? AND `user_id`=? LIMIT 1 []interface {}{7, 29} - took: 64.291µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `lower_name`, `name`, `full_name`, `email`, `keep_email_private`, `email_notifications_preference`, `passwd`, `passwd_hash_algo`, `must_change_password`, `login_type`, `login_source`, `login_name`, `type`, `location`, `website`, `rands`, `salt`, `language`, `description`, `created_unix`, `updated_unix`, `last_login_unix`, `last_repo_visibility`, `max_repo_creation`, `is_active`, `is_admin`, `allow_git_hook`, `allow_import_local`, `allow_create_organization`, `prohibit_login`, `avatar`, `avatar_email`, `use_custom_avatar`, `num_followers`, `num_following`, `num_stars`, `num_repos`, `num_teams`, `num_members`, `visibility`, `repo_admin_change_team_access`, `diff_view_style`, `theme` FROM `user` WHERE `id`=? LIMIT 1 []interface {}{34} - took: 87.459µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT * FROM `team` INNER JOIN team_repo ON team_repo.team_id = team.id INNER JOIN team_user ON team_user.team_id = team.id WHERE (team.org_id = ?) AND (team_repo.repo_id=?) AND (team_user.uid=?) []interface {}{34, 7, 29} - took: 276.33µs
2020/03/19 21:31:55 .../xorm/session_raw.go:180:exec() [I] [SQL] DELETE FROM `access` WHERE `user_id`=? AND `repo_id`=? []interface {}{29, 7} - took: 30.581µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `lower_name`, `name`, `full_name`, `email`, `keep_email_private`, `email_notifications_preference`, `passwd`, `passwd_hash_algo`, `must_change_password`, `login_type`, `login_source`, `login_name`, `type`, `location`, `website`, `rands`, `salt`, `language`, `description`, `created_unix`, `updated_unix`, `last_login_unix`, `last_repo_visibility`, `max_repo_creation`, `is_active`, `is_admin`, `allow_git_hook`, `allow_import_local`, `allow_create_organization`, `prohibit_login`, `avatar`, `avatar_email`, `use_custom_avatar`, `num_followers`, `num_following`, `num_stars`, `num_repos`, `num_teams`, `num_members`, `visibility`, `repo_admin_change_team_access`, `diff_view_style`, `theme` FROM `user` WHERE `id`=? LIMIT 1 []interface {}{29} - took: 81.868µs
2020/03/19 21:31:55 .../xorm/session_raw.go:85:queryRows() [I] [SQL] SELECT `id`, `repo_id`, `user_id`, `mode` FROM `collaboration` WHERE `repo_id`=? AND `user_id`=? LIMIT 1 []interface {}{7, 29} - took: 54.149µs
2020/03/19 21:31:55 .../xorm/session_raw.go:78:queryRows() [I] [SQL] SELECT * FROM `org_user` WHERE (uid=?) AND (org_id=?) LIMIT 1 []interface {}{29, 34} - took: 47.718µs
2020/03/19 21:33:19 ...eful/manager_unix.go:133:handleSignals() [W] PID 31257. Received SIGTERM. Shutting down...
2020/03/19 21:33:19 ...eful/server_hooks.go:47:doShutdown() [I] PID: 31257 Listener (/run/gitea/gitea.socket) closed.
2020/03/19 21:33:19 ...rvices/pull/check.go:214:TestPullRequests() [I] PID: 31257 Pull Request testing shutdown

@gilesorr
Copy link
Author

Let me know if there's anything else I can help with. Thanks.

@lunny
Copy link
Member

lunny commented Mar 20, 2020

@gilesorr I don't mean we will not try to fix the problem.

Because sqlite itself has some limitation when many go routines read/write the tables at the same time. For a peronsal use, it should be fast. But when there are more users, it will be slow if some users use it at the same time.

@gilesorr
Copy link
Author

As mentioned, we have only a small number of users - and so far, gitea hasn't been heavily used. Duly noted though: when we start using it more, I'll keep an eye out for slow-downs and will consider a move to PostgreSQL if that happens. Thanks.

@zeripath
Copy link
Contributor

This is probably already mitigated on master as we now use begin immediate for sqlite

@guillep2k
Copy link
Member

@zeripath that was my first thought, but it looks like the change is already present on 1.11.3:

https://github.com/go-gitea/gitea/blob/v1.11.3/modules/setting/database.go#L127

I'm thinking on it... I think it's possible there's still some scenarios that can't be covered by this if two different goroutines open multiple transactions to perform one operation (e.g. implicit transactions to perform SELECT operations and later on an explicit transaction for INSERT, etc.). I can't picture that scenario myself, though.

@gilesorr To gather more info, it would be great if you could create a dump of Gitea's process at the moment it's frozen (you'd need go and delve installed for that):

# Replace 24006 for your current Gitea PID
dlv attach 24006
Type 'help' for list of commands.
(dlv) grs -t
  Goroutine 1 - User: /usr/local/go/src/runtime/netpoll.go:184 internal/poll.runtime_pollWait (0x432f65)
         0  0x0000000000438c00 in runtime.gopark
             at /usr/local/go/src/runtime/proc.go:305
         1  0x000000000043399a in runtime.netpollblock
             at /usr/local/go/src/runtime/netpoll.go:397
         2  0x0000000000432f65 in internal/poll.runtime_pollWait
             at /usr/local/go/src/runtime/netpoll.go:184
         3  0x00000000004d30e5 in internal/poll.(*pollDesc).wait
[... etc ...]

@edoger
Copy link

edoger commented Mar 25, 2020

I had the same problem.

@gilesorr
Copy link
Author

Not a Go expert, please accept my apologies if this has been done incorrectly.

I created two delve dumps - I thought I did three, but one got lost. The first is before the deliberately-caused crash ("51 goroutines"), the second is a couple minutes after ("71 goroutines"). The one I lost was about 30 seconds after the crash: it said "58 goroutines" (that's all I've got for you on that one). I tried a minute later again, and it was still at 71 so I killed the process ...

BEFORE:
typescript.before.txt

AFTER:
typescript.after.txt

@guillep2k
Copy link
Member

So right before you took the typescript.after.txt snapshot your browser was still waiting for an answer?

I see a couple of unlock_notify wait calls (which are SQLite locks), but the stack trace on them doesn't say much about the caller functions. 🤔

  Goroutine 24949 - User: /go/src/code.gitea.io/gitea/vendor/github.com/mattn/go-sqlite3/sqlite3_opt_unlock_notify.go:90 code.gitea.io/gitea/vendor/github.com/mattn/go-sqlite3.unlock_notify_wait (0xa83634)
	 0  0x0000000000438c00 in runtime.gopark
	     at /usr/local/go/src/runtime/proc.go:305
	 1  0x000000000040d2a8 in runtime.goparkunlock
	     at /usr/local/go/src/runtime/proc.go:310
	 2  0x000000000040d2a8 in runtime.chanrecv
	     at /usr/local/go/src/runtime/chan.go:524
	 3  0x000000000040cf6b in runtime.chanrecv1
	     at /usr/local/go/src/runtime/chan.go:406
	 4  0x0000000000a83634 in code.gitea.io/gitea/vendor/github.com/mattn/go-sqlite3.unlock_notify_wait
	     at /go/src/code.gitea.io/gitea/vendor/github.com/mattn/go-sqlite3/sqlite3_opt_unlock_notify.go:90
	 5  0x0000000000a729bb in code.gitea.io/gitea/vendor/github.com/mattn/go-sqlite3._cgoexpwrap_de20a310f285_unlock_notify_wait
	     at _cgo_gotypes.go:1371
	 6  0x0000000000464f5b in runtime.call32
	     at /usr/local/go/src/runtime/asm_amd64.s:539
	 7  0x000000000040a957 in runtime.cgocallbackg1
	     at /usr/local/go/src/runtime/cgocall.go:314
	 8  0x000000000040a701 in runtime.cgocallbackg
	     at /usr/local/go/src/runtime/cgocall.go:191
	 9  0x000000000046652b in runtime.cgocallback_gofunc
	     at /usr/local/go/src/runtime/asm_amd64.s:793
	10  0x00000000004663c2 in runtime.asmcgocall
	     at /usr/local/go/src/runtime/asm_amd64.s:640

I was expecting some Gitea code below #6. It doesn't seem that symbols are stripped off from the executable (otherwise other code like graceful/server.go would not show up).

Perhaps your second snapshot had more useful info. 😞

If you can get another snapshot, it would be great. Otherwise, it would be nice to identify if there's anything "special" about those users you want to delete so we can reproduce the problem. By "special" I mean anything that is not included in our automated tests (maybe it's the user has issues? or PRs? or belongs to another team?).

Can you create a new user and try deleting it? See if you can identify the culprit event?

@gilesorr
Copy link
Author

"So right before you took the typescript.after.txt snapshot your browser was still waiting for an answer?" Yes.

If the binaries aren't stripped, blame https://dl.gitea.io/gitea/ ... that's where I fetch them from.

Unfortunately, this is a live instance - which means I have to pick my times to do things that might crash it.

I changed the password of the account in question and logged in as them. The person in question:

  • did little more than log in to check Gitea out once
  • no repos
  • no followers
  • no following
  • no starred repos
  • member of 2 organizations (attempting to delete them from either causes the same crash)
  • didn't create any PRs, or have any assigned to them
  • there is a PR in one of the repos this person is associated with, but that seems a stretch

@edoger - if you can, perhaps try running the same debugging on your server? It would probably help. Are you using SQLite?

@edoger
Copy link

edoger commented Mar 27, 2020

@gilesorr I use sqlite. I added an administrator account to the organization and it will crash when I delete it, or when I log in to the new administrator account and click the leave organization button, it will also crash.

@guillep2k
Copy link
Member

member of 2 organizations (attempting to delete them from either causes the same crash)

@gilesorr you mean you remove them from the teams they belong, right? Because there's no "delete from organization option" that I'm aware of.

@gilesorr
Copy link
Author

@guillep2k Go to the Organization page. On the right, it says "People" and then further right of that there's a count of those people. Click on the count. That page is /org/<org_name>/members , and you have a big red "Remove" button beside each member. That's what I'm talking about.

I added a new user, then added them to just one organization: exact same problem as the previous user (who is in two organizations), crash on attempt to remove from the organization.

Since you pointed out the distinction, I tried removing the offending user from the Team rather than the Organization. And all is just fine and I was able to remove the users entirely - finally.

Please note that this all started because when I attempted to delete a user, a big warning came up: "This user is a member of an organization. Remove the user from any organizations first." It doesn't say "remove the user from any teams first." So I tried to remove them from the organization(s).

I'm hoping that this distinction is the problem and will lead to a fix. If you still want/need more debug output, get back to me next week ...

@guillep2k
Copy link
Member

@gilesorr Thank you for your patience! I'm glad the problem is fixed on your side. I think we've got enough info now to fix it on ours. 👍

@guillep2k guillep2k added the issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented label Mar 28, 2020
@lunny
Copy link
Member

lunny commented Jul 12, 2022

Should be fixed by #11438

@lunny lunny closed this as completed Jul 12, 2022
@go-gitea go-gitea locked and limited conversation to collaborators May 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
issue/confirmed Issue has been reviewed and confirmed to be present or accepted to be implemented type/bug
Projects
None yet
Development

No branches or pull requests

6 participants