Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CI random failure: TestExportUserGPGKeys #22105

Closed
lunny opened this issue Dec 12, 2022 · 18 comments · Fixed by #22159
Closed

CI random failure: TestExportUserGPGKeys #22105

lunny opened this issue Dec 12, 2022 · 18 comments · Fixed by #22159

Comments

@lunny
Copy link
Member

lunny commented Dec 12, 2022

--- FAIL: TestExportUserGPGKeys (0.63s)
    user_test.go:236: Testing username user1 export gpg keys
    testlogger.go:77: 2022/12/12 04:21:59 ...eb/routing/logger.go:98:func1() [I] [6396ac67] router: completed GET /user1.gpg for , 200 OK in 6.4ms @ user/home.go:749(user.ShowGPGKeys)
    testlogger.go:77: 2022/12/12 04:21:59 ...eb/routing/logger.go:98:func1() [I] [6396ac67-2] router: completed GET /user/settings/applications for , 200 OK in 19.7ms @ setting/applications.go:23(setting.Applications)
    testlogger.go:77: 2022/12/12 04:21:59 ...eb/routing/logger.go:98:func1() [I] [6396ac67-3] router: completed POST /user/settings/applications for , 303 See Other in 4.6ms @ context/auth.go:27(context.Toggle)
    testlogger.go:77: 2022/12/12 04:21:59 ...eb/routing/logger.go:98:func1() [I] [6396ac67-4] router: completed GET /user/settings/applications for , 200 OK in 18.1ms @ setting/applications.go:23(setting.Applications)
    user_test.go:168: 
        	Error Trace:	/drone/src/integration_test.go:278
        	            				/drone/src/user_test.go:168
        	Error:      	Should NOT be empty, but was 
        	Test:       	TestExportUserGPGKeys
    testlogger.go:77: 2022/12/12 04:21:59 ...eb/routing/logger.go:98:func1() [I] [6396ac67-5] router: completed POST /api/v1/user/gpg_keys?token= for , 401 Unauthorized in 0.2ms @ v1/api.go:211(v1.reqToken)
    api_gpg_keys_test.go:159: 
        	Error Trace:	/drone/src/integration_test.go:336
        	            				/drone/src/integration_test.go:165
        	            				/drone/src/api_gpg_keys_test.go:159
        	            				/drone/src/user_test.go:169
        	Error:      	Not equal: 
        	            	expected: 201
        	            	actual  : 401
        	Test:       	TestExportUserGPGKeys
        	Messages:   	Request: POST /api/v1/user/gpg_keys?token=
    api_gpg_keys_test.go:159: Response: {"message":"token is required","url":"http://localhost:3003/api/swagger"}
        
    user_test.go:236: Testing username user1 export gpg keys
    testlogger.go:77: 2022/12/12 04:21:59 ...eb/routing/logger.go:98:func1() [I] [6396ac67-6] router: completed GET /user1.gpg for , 200 OK in 5.5ms @ user/home.go:749(user.ShowGPGKeys)
    user_test.go:240: 
        	Error Trace:	/drone/src/user_test.go:240
        	            				/drone/src/user_test.go:201
        	Error:      	Not equal: 
        	            	expected: "-----BEGIN PGP PUBLIC KEY BLOCK-----\n\nxsBNBFyy/VUBCADJ7zbM20Z1RWmFoVgp5WkQfI2rU1Vj9cQHes9i42wVLLtcbPeo\nQzubgzvMPITDy7nfWxgSf83E23DoHQ1ACFbQh/6eFSRrjsusp3YQ/08NSfPPbcu8\n0M5G+VGwSfzS5uEcwBVQmHyKdcOZIERTNMtYZx1C3bjLD1XVJHvWz9D72Uq4qeO3\n8SR+lzp5n6ppUakcmRnxt3nGRBj1+hEGkdgzyPo93iy+WioegY2lwCA9xMEo5dah\nBmYxWx51zyiXYlReTaxlyb3/nuSUt8IcW3Q8zjdtJj4Nu8U1SpV8EdaA1I9IPbHW\n510OSLmD3XhqHH5m6mIxL1YoWxk3V7gpDROtABEBAAHNGVVzZXIxIDx1c2VyMUBl\neGFtcGxlLmNvbT7CwI4EEwEIADgWIQTQEbrYxmXsp1z3j7z9+v0I6RSEHwUCXLL9\nVQIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRD9+v0I6RSEH22YCACFqL5+\n6M0m18AMC/pumcpnnmvAS1GrrKTF8nOROA1augZwp1WCNuKw2R6uOJIHANrYECSn\nu7+j6GBP2gbIW8mSAzS6HWCs7GGiPpVtT4wcu8wljUI6BxjpyZtoEkriyBjt6HfK\nrkegbkuySoJvjq4IcO5D1LB1JWgsUjMYQJj/ZpBIzVtjG9QtFSOiT1Hct4PoZHdC\nnsdSgyCkwRZXG+u3kT/wP9F663ba4o16vYlz3dCGo66lF2tyoG3qcyZ1OUzUrnuv\n96ytAzT6XIhrE0nVoBprMxFF5zExotJD3bHjcGBFNLf944bhjKee3U6t9+OsfJVC\nl7N5xxIawCuTQdbfzsBNBFyy/VUBCADe61yGEoTwKfsOKIhxLaNoRmD883O0tiWt\nsoO/HPj9dPQLTOiwXgSgSCd8C+LNxGKct87wgFozpah4tDLC6c0nALuHJ0SLbkfz\n55aRhLeOOcrAydatDp72GroXzqpZ0xZBk5wjIWdgEol2GmVRM8QGbeuakU/HVz5y\nlPzxUUocgdbSi3GE3zbzijQzVJdyL/kw/KP7pKT/PPKKJ2C5NQDLy0XGKEHddXGR\nEWKkVlRalxq/TjfaMR0bi3MpezBsQmp99ATPO/d7trayZUxQHRtXzGFiOXfDHATr\nqN730sODjqvU+mpc/SHCRwh9qWDjZRHSuKU5YDBjb5jIQJivZsQ/ABEBAAHCwHYE\nGAEIACAWIQTQEbrYxmXsp1z3j7z9+v0I6RSEHwUCXLL9VQIbDAAKCRD9+v0I6RSE\nH7WoB/4tXl+97rQ6owPCGSVp1Xbwt2521V7COgsOFRVTRTryEWxRW8mm0S7wQvax\nC0TLXKur6NVYQMn01iyL+FZzRpEWNuYF3f9QeeLJ/+l2DafESNhNTy17+RPmacK6\n21dccpqchByVw/UMDeHSyjQLiG2lxzt8Gfx2gHmSbrq3aWovTGyz6JTffZvfy/n2\n0Hm437OBPazO0gZyXhdV2PE5RSUfvAgm44235tcV5EV0d32TJDfv61+Vr2GUbah6\n7XhJ1v6JYuh8kaYaEz8OpZDeh7f6Ho6PzJrsy/TKTKhGgZNINj1iaPFyOkQgKR5M\nGrE0MHOxUbc9tbtyk0F1SuzREUBH\n=WFf5\n-----END PGP PUBLIC KEY BLOCK-----\n"
        	            	actual  : "-----BEGIN PGP PUBLIC KEY BLOCK-----\nNote: This user hasn't uploaded any GPG keys.\n\n\n=twTO\n-----END PGP PUBLIC KEY BLOCK-----\n"
        	            	
        	            	Diff:
        	            	--- Expected
        	            	+++ Actual
        	            	@@ -1,30 +1,6 @@
        	            	 -----BEGIN PGP PUBLIC KEY BLOCK-----
        	            	+Note: This user hasn't uploaded any GPG keys.
        	            	 
        	            	-xsBNBFyy/VUBCADJ7zbM20Z1RWmFoVgp5WkQfI2rU1Vj9cQHes9i42wVLLtcbPeo
        	            	-QzubgzvMPITDy7nfWxgSf83E23DoHQ1ACFbQh/6eFSRrjsusp3YQ/08NSfPPbcu8
        	            	-0M5G+VGwSfzS5uEcwBVQmHyKdcOZIERTNMtYZx1C3bjLD1XVJHvWz9D72Uq4qeO3
        	            	-8SR+lzp5n6ppUakcmRnxt3nGRBj1+hEGkdgzyPo93iy+WioegY2lwCA9xMEo5dah
        	            	-BmYxWx51zyiXYlReTaxlyb3/nuSUt8IcW3Q8zjdtJj4Nu8U1SpV8EdaA1I9IPbHW
        	            	-510OSLmD3XhqHH5m6mIxL1YoWxk3V7gpDROtABEBAAHNGVVzZXIxIDx1c2VyMUBl
        	            	-eGFtcGxlLmNvbT7CwI4EEwEIADgWIQTQEbrYxmXsp1z3j7z9+v0I6RSEHwUCXLL9
        	            	-VQIbAwULCQgHAgYVCgkICwIEFgIDAQIeAQIXgAAKCRD9+v0I6RSEH22YCACFqL5+
        	            	-6M0m18AMC/pumcpnnmvAS1GrrKTF8nOROA1augZwp1WCNuKw2R6uOJIHANrYECSn
        	            	-u7+j6GBP2gbIW8mSAzS6HWCs7GGiPpVtT4wcu8wljUI6BxjpyZtoEkriyBjt6HfK
        	            	-rkegbkuySoJvjq4IcO5D1LB1JWgsUjMYQJj/ZpBIzVtjG9QtFSOiT1Hct4PoZHdC
        	            	-nsdSgyCkwRZXG+u3kT/wP9F663ba4o16vYlz3dCGo66lF2tyoG3qcyZ1OUzUrnuv
        	            	-96ytAzT6XIhrE0nVoBprMxFF5zExotJD3bHjcGBFNLf944bhjKee3U6t9+OsfJVC
        	            	-l7N5xxIawCuTQdbfzsBNBFyy/VUBCADe61yGEoTwKfsOKIhxLaNoRmD883O0tiWt
        	            	-soO/HPj9dPQLTOiwXgSgSCd8C+LNxGKct87wgFozpah4tDLC6c0nALuHJ0SLbkfz
        	            	-55aRhLeOOcrAydatDp72GroXzqpZ0xZBk5wjIWdgEol2GmVRM8QGbeuakU/HVz5y
        	            	-lPzxUUocgdbSi3GE3zbzijQzVJdyL/kw/KP7pKT/PPKKJ2C5NQDLy0XGKEHddXGR
        	            	-EWKkVlRalxq/TjfaMR0bi3MpezBsQmp99ATPO/d7trayZUxQHRtXzGFiOXfDHATr
        	            	-qN730sODjqvU+mpc/SHCRwh9qWDjZRHSuKU5YDBjb5jIQJivZsQ/ABEBAAHCwHYE
        	            	-GAEIACAWIQTQEbrYxmXsp1z3j7z9+v0I6RSEHwUCXLL9VQIbDAAKCRD9+v0I6RSE
        	            	-H7WoB/4tXl+97rQ6owPCGSVp1Xbwt2521V7COgsOFRVTRTryEWxRW8mm0S7wQvax
        	            	-C0TLXKur6NVYQMn01iyL+FZzRpEWNuYF3f9QeeLJ/+l2DafESNhNTy17+RPmacK6
        	            	-21dccpqchByVw/UMDeHSyjQLiG2lxzt8Gfx2gHmSbrq3aWovTGyz6JTffZvfy/n2
        	            	-0Hm437OBPazO0gZyXhdV2PE5RSUfvAgm44235tcV5EV0d32TJDfv61+Vr2GUbah6
        	            	-7XhJ1v6JYuh8kaYaEz8OpZDeh7f6Ho6PzJrsy/TKTKhGgZNINj1iaPFyOkQgKR5M
        	            	-GrE0MHOxUbc9tbtyk0F1SuzREUBH
        	            	-=WFf5
        	            	+
        	            	+=twTO
        	            	 -----END PGP PUBLIC KEY BLOCK-----
        	Test:       	TestExportUserGPGKeys
@lunny lunny changed the title CI failure CI random failure: TestExportUserGPGKeys Dec 13, 2022
@garymoon
Copy link
Contributor

Happening for us about ~10% of the time. Happy to help in any way, just tag me 👍

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Dec 15, 2022

IMO it's caused by incorrect CSRF token.

session.MakeRequest(t, req, http.StatusSeeOther) doesn't work here because in context.Toggle, it also responds Redirect:StatusSeeOther if the CSRF token is invalid, then the POST request doesn't create a new token.

ps: there are other redirections in context.Toggle, as long as when it's clear what triggers the redirection, the problem is clear.

image

@zeripath
Copy link
Contributor

Perhaps we should do:

func getTokenForLoggedInUser(t testing.TB, session *TestSession) string {
	t.Helper()
	var token string
	for i := 0; i < 3 && token == ""; i++ {
		req := NewRequest(t, "GET", "/user/settings/applications")
		resp := session.MakeRequest(t, req, http.StatusOK)
		doc := NewHTMLParser(t, resp.Body)
		req = NewRequestWithValues(t, "POST", "/user/settings/applications", map[string]string{
			"_csrf": doc.GetCSRF(),
			"name":  fmt.Sprintf("api-testing-token-%d", atomic.AddInt64(&tokenCounter, 1)),
		})
		session.MakeRequest(t, req, http.StatusSeeOther)
		req = NewRequest(t, "GET", "/user/settings/applications")
		resp = session.MakeRequest(t, req, http.StatusOK)
		htmlDoc := NewHTMLParser(t, resp.Body)
		token = htmlDoc.doc.Find(".ui.info p").Text()
	}
	assert.NotEmpty(t, token)
	return token
}

@zeripath
Copy link
Contributor

Of course this doesn't explain why the CSRF would ever be incorrect etc.

@wxiaoguang
Copy link
Contributor

ps: CSRF is just my guess (the best guess in my mind), I haven't looked into it. There are other redirections in context.Toggle. Maybe it could make the Redirect function output some logs (with stacktrace) during testing, then it's clear where the redirection comes.

Also it could output the CSRF token to log to see whether it is incorrect or doesn't exist (can not be loaded from HTML response).

That's really strange that only this test case fails.

@zeripath
Copy link
Contributor

I don't understand it either - there's a bit of caching that happens related to the users in the test cases but that shouldn't really be a problem.

zeripath added a commit to zeripath/gitea that referenced this issue Dec 18, 2022
There are repeated failures with this test which appear related to
failures in getTokenForLoggedInUser. It is difficult to further evaluate
the cause of these failures as we do not get given further information.

This PR will attempt to fix this.

First it adds some extra logging and it uses the csrf cookie primarily
for the csrf value.

If the problem does not occur again with those changes we could merge,
assume that it is fixed and hope that if it occurs in future the
additional logging will be helpful.

If not I will add more changes in attempt to fix.

Fix go-gitea#22105

Signed-off-by: Andrew Thornton <art27@cantab.net>
@zeripath
Copy link
Contributor

I think TestAPITeam is also affected by the same underlying problem too.

lunny added a commit that referenced this issue Dec 21, 2022
There are repeated failures with this test which appear related to
failures in getTokenForLoggedInUser. It is difficult to further evaluate
the cause of these failures as we do not get given further information.

This PR will attempt to fix this.

First it adds some extra logging and it uses the csrf cookie primarily
for the csrf value.

If the problem does not occur again with those changes we could merge,
assume that it is fixed and hope that if it occurs in future the
additional logging will be helpful.

If not I will add more changes in attempt to fix.

Fix #22105

Signed-off-by: Andrew Thornton <art27@cantab.net>
Co-authored-by: Lunny Xiao <xiaolunwen@gmail.com>
Co-authored-by: John Olheiser <john.olheiser@gmail.com>
Co-authored-by: techknowlogick <matti@mdranta.net>
Co-authored-by: delvh <dev.lh@web.de>
@lunny lunny reopened this Dec 21, 2022
@lunny
Copy link
Member Author

lunny commented Dec 21, 2022

Tests failed again: https://drone.gitea.io/go-gitea/gitea/64747/2/14

Maybe it's related with the csrf token expired time, it's 5 seconds in non-prod mode.

ref: 84ceaa9#diff-fa07e296d614e796de67f7ae9268a0660b89c09b6e5f6373b4ca7bf76a20dec7R669-R671

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Dec 21, 2022

Maybe it's related with the csrf token expired time, it's 5 seconds in non-prod mode.

No, the code says:

// CsrfTokenRegenerationInterval is the interval between token generations, 
// old tokens are still valid before CsrfTokenTimeout
...
CsrfTokenRegenerationInterval = 5 * time.Second // in dev, re-generate the tokens more aggressively for debug purpose

The CsrfTokenRegenerationInterval is only for token generation.

When you do validate, it uses the CsrfTokenTimeout which is always 24h.

// CsrfTokenTimeout represents the duration that XSRF tokens are valid.
// It is exported so clients may set cookie timeouts that match generated tokens.
const CsrfTokenTimeout = 24 * time.Hour

@lunny
Copy link
Member Author

lunny commented Dec 21, 2022

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Dec 21, 2022

Then?


And think about another case, when you run Gitea in dev mode, do you ever see a CSRF error when you stay in a page with form for more than 5 seconds then submit the form?

@lunny
Copy link
Member Author

lunny commented Dec 21, 2022

I sent #22199 to exclude the possible session cache problem.

@zeripath
Copy link
Contributor

The additional logging provided in #22159 definitely indicates that the issue is an out of date csrf token.

I can only think it is something to do with the cached session but I don't fully understand it as we should be getting a csrf token from the preceding GETs.

Excluding the cached session seems like a reasonable attempt at a solution.

@wxiaoguang
Copy link
Contributor

wxiaoguang commented Dec 22, 2022

If the CSRF token was ever printed, then we can parse it to see what happened.

Also it could output the CSRF token to log to see whether it is incorrect or doesn't exist (can not be loaded from HTML response).

I do not think a token would expire in such case, because the expiration timeout is 24h.

@zeripath
Copy link
Contributor

Perhaps it's something to do with the cookie jar?

I don't know if we update the "session" cookie jar on each request - hopefully we do - but could it be that there's an old csrf cookie being passed in addition to the _csrf field on the post?

I guess @wxiaoguang is right and we need to add even more logging here.

@garymoon
Copy link
Contributor

garymoon commented Jan 4, 2023

For anyone struggling with this, we've had good success with the following temporary change. Naturally this means the test will not be run until this change is reverted.

diff --git a/integrations/user_test.go b/integrations/user_test.go
index 6a3d30472d..8a40056561 100644
--- a/integrations/user_test.go
+++ b/integrations/user_test.go
@@ -6,6 +6,7 @@ package integrations
 
 import (
 	"net/http"
+	"os"
 	"testing"
 
 	issues_model "code.gitea.io/gitea/models/issues"
@@ -139,6 +140,10 @@ func TestRenameReservedUsername(t *testing.T) {
 }
 
 func TestExportUserGPGKeys(t *testing.T) {
+	if os.Getenv("DRONE") != "" {
+		t.Skip("FIXME: Skipping this test in CI until https://github.com/go-gitea/gitea/issues/22105 is resolved")
+	}
+
 	defer prepareTestEnv(t)()
 	// Export empty key list
 	testExportUserGPGKeys(t, "user1", `-----BEGIN PGP PUBLIC KEY BLOCK-----

@lunny
Copy link
Member Author

lunny commented Feb 12, 2023

Looks like this testing random error disappeared recently.

@silverwind
Copy link
Member

Have not seen this in a while now, let's close.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 14, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants