Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: efficient error handling in manager-api including graceful shutdown, self contained methods. #1814

Merged
merged 4 commits into from
May 10, 2021

Conversation

bisakhmondal
Copy link
Member

@bisakhmondal bisakhmondal commented Apr 21, 2021

Please answer these questions before submitting a pull request, or your PR will get closed.

Why submit this pull request?

  • Bugfix
  • New feature provided
  • Improve performance
  • Backport patches

What changes will this PR take into?

1. Graceful handling of resources.

2. Self contained Manager API. Fatalf is not graceful inside goroutines, which leads to an abrupt termination.

3. Deligating error handling to existing spf13/cobra package.

4. Removing unnecessary proliferation of golang panics. Why not use the user friendly interface already provided by cobra. To be honest, output of panic looks scary 😂

  1. Closing of all opened resource has been put into deferred execution. So incase any panic occurs during runtime, the respective closers of the resource gets executed safely. Till now, for panics, the main goroutine was getting closed ungracefully.

  2. Detailed error description if etcd contains corrupted data and exit with nonzero error code.
    image

Related issues
reopening #1705
closes #1684

Checklist:

  • Did you explain what problem does this PR solve? Or what new features have been added?
  • Have you added corresponding test cases?
  • Have you modified the corresponding document?
  • Is this PR backward compatible? If it is not backward compatible, please discuss on the mailing list first

@netlify
Copy link

netlify bot commented Apr 21, 2021

Deploy preview for apisix-dashboard ready!

Built with commit 84e18e6

https://deploy-preview-1814--apisix-dashboard.netlify.app

@bisakhmondal
Copy link
Member Author

@nic-chen still same issue. any idea why? this time I pull the master, created a new branch and handpicked the new lines (as the number of lines were less) of commit instead of cherry-picking to avoid this CI invocation, but still the same problem

@juzhiyuan
Copy link
Member

Hi, I just sent an email to GitHub Support, to ask for their help about this case.

@bisakhmondal
Copy link
Member Author

Nice🔥

@nic-chen
Copy link
Member

I can't think of why. This should be the same as other new PRs, but other PRs have no such problem. Just wait for the reply from GitHub Support..

@bisakhmondal
Copy link
Member Author

Hii, any update :)

@nic-chen
Copy link
Member

hi @bisakhmondal please sync code from branch master

@bisakhmondal
Copy link
Member Author

Thanks!! Done.

@bisakhmondal
Copy link
Member Author

Ou Nice!! It seems no stalling CI from node build. Thank you both :)

@codecov-commenter
Copy link

codecov-commenter commented Apr 30, 2021

Codecov Report

Merging #1814 (836be0d) into master (b54c195) will decrease coverage by 9.59%.
The diff coverage is 53.84%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1814      +/-   ##
==========================================
- Coverage   71.45%   61.85%   -9.60%     
==========================================
  Files         172       47     -125     
  Lines        6121     3183    -2938     
  Branches      711        0     -711     
==========================================
- Hits         4374     1969    -2405     
+ Misses       1498      898     -600     
- Partials      249      316      +67     
Flag Coverage Δ
backend-e2e-test 46.27% <53.84%> (+0.25%) ⬆️
backend-e2e-test-ginkgo 48.97% <53.84%> (+0.25%) ⬆️
backend-unit-test ?
frontend-e2e-test ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
api/internal/core/store/store.go 58.33% <0.00%> (-27.90%) ⬇️
api/cmd/managerapi.go 48.95% <70.00%> (+0.60%) ⬆️
api/internal/utils/runtime/runtime.go 0.00% <0.00%> (-64.29%) ⬇️
api/internal/core/store/validate_mock.go 0.00% <0.00%> (-50.00%) ⬇️
api/internal/filter/authentication.go 47.22% <0.00%> (-30.56%) ⬇️
api/internal/handler/service/service.go 66.08% <0.00%> (-26.09%) ⬇️
api/internal/handler/global_rule/global_rule.go 64.51% <0.00%> (-19.36%) ⬇️
...pi/internal/handler/plugin_config/plugin_config.go 59.57% <0.00%> (-18.09%) ⬇️
api/internal/filter/ip_filter.go 56.09% <0.00%> (-17.08%) ⬇️
... and 141 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b54c195...836be0d. Read the comment docs.

@juzhiyuan juzhiyuan requested review from tokers, nic-chen and starsz and removed request for tokers and nic-chen May 3, 2021 07:52
@@ -65,6 +66,14 @@ func printVersion() {

// NewManagerAPICommand creates the manager-api command.
func NewManagerAPICommand() *cobra.Command {
cobra.OnInitialize(func() {
var err error
service, err = createService()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use short variables creation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @tokers, I'm afraid that's not possible here due to variable shadowing.

service is a global variable and assignment through go's short assignment operator (:=) will create a brand new service var in this scope without referencing the global one.

I looked into the docs and also cross-checked ref

An identifier declared in a block may be redeclared in an inner block. While the identifier of the inner declaration is in scope, it denotes the entity declared by the inner declaration.

Anyway, Thank you. Got the scope to re-read the docs :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, what!! that's not from this PR 🤔

@bisakhmondal bisakhmondal force-pushed the fix-store-init-error branch from b9aea51 to c66c914 Compare May 4, 2021 14:00
@bisakhmondal bisakhmondal changed the title fix: store init error from etcd and deferred execution of closers fix: efficient error handling in manager-api including graceful shutdown, self contained methods. May 4, 2021
@bisakhmondal
Copy link
Member Author

Any idea why the upstream test suite is breaking🤔, the PR doesn't touch anything on that segment and moreover, it's HTTP 400!!

All starts with test upstream delete [It] create upstream without plugin i.e failing to process upstream request and rest (other tests) is dominos effect.

But it's working fine locally
Screenshot from 2021-05-04 20-09-42

@bisakhmondal bisakhmondal requested a review from tokers May 4, 2021 15:06
@juzhiyuan juzhiyuan requested a review from nic-chen May 4, 2021 15:28
@juzhiyuan
Copy link
Member

juzhiyuan commented May 6, 2021

Hi @bisakhmondal, there have some errors in Backend CI.

@bisakhmondal
Copy link
Member Author

Hi @bisakhmondal, there have some errors in Backend CI.

Yeshh!! Sad moments🥺
Looking into it :)

@@ -26,6 +26,9 @@ import (

// WritePID write pid to the given file path.
func WritePID(filepath string) error {
if _, err := os.Stat(filepath); err == nil {
return fmt.Errorf("instance of Manager API already running: a pid file exists in %s", filepath)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean the manager can't start successfully if there exists a PID file?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may add an option to start forcely.

Copy link
Member Author

@bisakhmondal bisakhmondal May 6, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hii guys, as we are now supporting manager-api as an os service, the PID file should not be overwritten unknowingly, because then ./manager-api stop won't work. we have to take help from the os service manager explicitly.

We may add an option to start forcely.

Yes, we can do that.

@tokers tokers merged commit 260b767 into apache:master May 10, 2021
@bisakhmondal
Copy link
Member Author

Hi guys, the CI hadn't passed in this PR. Some checking was left to be done from my side.

I truly am sorry, I got caught up in university exams for a few days. I think we should revert the commit. CI has failed in the master too.

#1814 (comment)
Also this need to be taken care of.

Sorry for the inconvenience. I am updating the PR as soon as possible. Thank you.

@juzhiyuan
Copy link
Member

@bisakhmondal Hi, if it's ok, could you please send a new PR to fix that failure?

@bisakhmondal
Copy link
Member Author

@bisakhmondal Hi, if it's ok, could you please send a new PR to fix that failure?

Ya, that sounds awesome.

but yet, I can't reproduce the failure(totally irrelevant to this PR) locally even with the same environment like CI inside a docker container,
image

@bisakhmondal
Copy link
Member Author

@bisakhmondal Hi, if it's ok, could you please send a new PR to fix that failure?

@juzhiyuan It seems to be a CI glitch, now the master is fine after a recent commit.
image

Even the last commit has passed all CI. What do you think, can we ignore?

@juzhiyuan
Copy link
Member

I have to say it's unstable, please create an issue to track this :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

program panic when failed to initialize etcd store is unreasonable
6 participants