Health checks against PostgreSQL, Azure Service Bus, and Altinn #292
Comments
Alerting could perhaps be split out into a separate task, to keep the tasks small.
I think we should be careful about adding requests to external services as part of the Container Apps health check. If we struggle to reach PostgreSQL, we would not necessarily want to degrade the service to "unhealthy" in Kubernetes, since it would then restart continuously because of failing health checks. Should we instead expose a separate health endpoint that we could ping from e.g. https://learn.microsoft.com/en-us/azure/azure-monitor/app/availability-overview, https://www.runscope.com/ or https://www.atlassian.com/software/statuspage? There we could also degrade the service if, for example, latency on a third-party service is above X. Then we can return 200 OK on liveness and return something meaningful on readiness (which is where we say that the service/replica should not receive more traffic until it is healthy).
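The split proposed in the comment above can be sketched roughly as follows. This is a language-neutral illustration in Python, not the project's ASP.NET code. `ProbeResponse`, `liveness`, `readiness` and the dependency names are hypothetical, and having readiness consult dependency checks is an assumption about what "something meaningful" would look like.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ProbeResponse:
    """Hypothetical stand-in for an HTTP response from a probe endpoint."""
    status_code: int
    body: str


def liveness() -> ProbeResponse:
    # Liveness only answers "is the process running?". It never touches
    # external dependencies, so a PostgreSQL outage cannot cause a restart loop.
    return ProbeResponse(200, "alive")


def readiness(dependency_checks: Dict[str, Callable[[], bool]]) -> ProbeResponse:
    # Readiness answers "should this replica receive traffic?". Returning 503
    # takes the replica out of rotation without restarting it (assumed policy).
    failing = sorted(name for name, check in dependency_checks.items() if not check())
    if failing:
        return ProbeResponse(503, "not ready: " + ", ".join(failing))
    return ProbeResponse(200, "ready")
```

With checks such as `{"postgres": lambda: False, "redis": lambda: True}`, `readiness` reports 503 and names `postgres`, while `liveness` still returns 200. A deeper endpoint for external availability tests could reuse the same checks without affecting either probe.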
- Adds health checks for Redis, PostgreSQL and the well-known endpoints.
- Ensures that we have different endpoints for readiness/liveness/startup/health

Related to #292

<img width="542" alt="image" src="https://github.com/user-attachments/assets/5b71bfbc-1e83-427c-8042-e363ffbf8faa">

## Summary by CodeRabbit

- **New Features**
  - Added health check capabilities for Redis, PostgreSQL, and well-known endpoints.
  - Introduced multiple health check endpoints: `/startup`, `/liveness`, `/readiness`, and `/health`.
  - Integrated health checks into the service collection for better monitoring.
  - Added a new project for utility functions related to health checks.
- **Enhancements**
  - Improved health monitoring with a new HTTP client and health check configurations, including a self-check feature.
  - Added support for dynamic configuration of health check probes in deployment templates.
  - Updated API specifications to reflect new health check schemas and structures.
- **Bug Fixes**
  - Enhanced error handling for health checks to provide clearer feedback on endpoint status.

---------

Co-authored-by: Are Almaas <arealmaas@gmail.com>
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Dialogporten Automation Bot <164321870+dialogporten-bot@users.noreply.github.com>
Co-authored-by: Magnus Sandgren <5285192+MagnusSandgren@users.noreply.github.com>
## Description

Changed the paths of the health checks, so we have to ensure that the probes use the same endpoints.

## Related Issue(s)

- #292

## Verification

- [ ] **Your** code builds clean without any errors or warnings
- [ ] Manual testing done (required)
- [ ] Relevant automated test added (if you find this hard, leave it and we'll help out)

## Documentation

- [ ] Documentation is updated (either in `docs`-directory, Altinnpedia or a separate linked PR in [altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if applicable)

## Summary by CodeRabbit

- **New Features**
  - Updated health probe paths for container apps to include a new `/health` prefix, enhancing health check organization.
- **Bug Fixes**
  - Improved accuracy of health status checks by modifying probe endpoints to ensure proper monitoring.
## Description

<img width="1091" alt="image" src="https://github.com/user-attachments/assets/6f3f9095-ccc7-4342-8f47-fdacb733f9be">

Seems like it's CPU that we are struggling with the most. Upgrading to the next profile in the Burstable tier, which has 2 cores (B2s).

## Related Issue(s)

- #292

## Verification

- [ ] **Your** code builds clean without any errors or warnings
- [ ] Manual testing done (required)
- [ ] Relevant automated test added (if you find this hard, leave it and we'll help out)

## Documentation

- [ ] Documentation is updated (either in `docs`-directory, Altinnpedia or a separate linked PR in [altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if applicable)

## Summary by CodeRabbit

- **New Features**
  - Enhanced flexibility in PostgreSQL SKU selection with additional options available.
  - Updated default SKU from 'Standard_B1ms' to 'Standard_B2s' for improved resource allocation.
## Description

## Related Issue(s)

- #292

## Verification

- [ ] **Your** code builds clean without any errors or warnings
- [ ] Manual testing done (required)
- [ ] Relevant automated test added (if you find this hard, leave it and we'll help out)

## Documentation

- [ ] Documentation is updated (either in `docs`-directory, Altinnpedia or a separate linked PR in [altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if applicable)

## Summary by CodeRabbit

- **New Features**
  - Introduced health check configurations for container applications, enhancing monitoring capabilities.
  - Added a new launch configuration for debugging the GraphQL application alongside the WebApi.
- **Bug Fixes**
  - Updated health check mappings to ensure proper functionality and configuration.
- **Documentation**
  - Improved project references and service configurations for clarity and maintainability.
Moving this to #1261: "Implement alerting if a container is unhealthy or degraded over a certain period". Not that relevant for health checks.
The only thing remaining is the health check against Service Bus.
## Description

## Related Issue(s)

- #292

## Verification

- [ ] **Your** code builds clean without any errors or warnings
- [ ] Manual testing done (required)
- [ ] Relevant automated test added (if you find this hard, leave it and we'll help out)

## Documentation

- [ ] Documentation is updated (either in `docs`-directory, Altinnpedia or a separate linked PR in [altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if applicable)

## Summary by CodeRabbit

- **New Features**
  - Enhanced health check reporting to focus on dependency tracking.
  - Optimized caching strategies for improved performance and reliability.
  - Updated configuration for HTTP clients to improve error handling and service integration.
- **Bug Fixes**
  - Adjusted health check options and caching parameters to ensure accurate functionality.
## Description

An availability test for the backend. For now it sends a health-check request to web-api-so to verify that the service is up and running with all its dependencies. The health endpoint in APIM sends requests to web-api-so by default; the other services are not exposed yet.

- Adds an availability test towards our APIM. It probes the deep version of the health checks, which checks third-party URLs together with Redis and Postgres.
- Currently targets only web-api-so, as it is the default backend. All services should be exposed like this.

<img width="612" alt="image" src="https://github.com/user-attachments/assets/a368ed4d-78c5-4966-b363-493c85bd4568">

The frontend availability test:

![image](https://github.com/user-attachments/assets/55cbe387-d246-4b45-bbd4-17722f4117ab)

## Related Issue(s)

- #292

## Verification

- [ ] **Your** code builds clean without any errors or warnings
- [ ] Manual testing done (required)
- [ ] Relevant automated test added (if you find this hard, leave it and we'll help out)

## Documentation

- [ ] Documentation is updated (either in `docs`-directory, Altinnpedia or a separate linked PR in [altinn-studio-docs.](https://github.com/Altinn/altinn-studio-docs), if applicable)

## Summary by CodeRabbit

- **New Features**
  - Introduced a new parameter `apimUrl` for capturing the APIM instance URL across various environments (production, staging, test, yt01).
  - Added a new module for creating an availability test for the APIM instance, enhancing monitoring capabilities.
- **Enhancements**
  - New output declaration for the Application Insights resource ID, allowing easier access to the resource identifier post-deployment.
Introduction
Our ASP.NET projects have endpoints for health probes in Container Apps (Kubernetes);
these currently use the default .NET implementation.
They return 200 OK if things are alive.
Implementation
Implement our own checks that verify connections and connection time against
Some checks we may want to look at later:
No connection to these yields unhealthy; high response time should yield degraded (look up the exact terms/HTTP response codes).
Tasks
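The status rule described under Implementation (no connection gives unhealthy, high response time gives degraded) might be sketched like this. It is a language-neutral illustration in Python rather than the project's code. `check_dependency`, `degraded_after_seconds` and `HTTP_STATUS` are hypothetical names, and the status-code mapping is an assumption that mirrors the ASP.NET Core defaults (Healthy and Degraded map to 200, Unhealthy to 503), which the issue still leaves to be confirmed.

```python
import time
from enum import Enum
from typing import Callable


class HealthStatus(Enum):
    HEALTHY = "Healthy"
    DEGRADED = "Degraded"
    UNHEALTHY = "Unhealthy"


# Assumed mapping, mirroring the ASP.NET Core defaults; adjust once the
# exact terms and response codes have been decided.
HTTP_STATUS = {
    HealthStatus.HEALTHY: 200,
    HealthStatus.DEGRADED: 200,
    HealthStatus.UNHEALTHY: 503,
}


def check_dependency(
    connect: Callable[[], None],
    degraded_after_seconds: float,
    clock: Callable[[], float] = time.monotonic,
) -> HealthStatus:
    """Classify one dependency by whether it connects and how long that takes."""
    start = clock()
    try:
        connect()  # hypothetical callable that raises when the dependency is unreachable
    except Exception:
        return HealthStatus.UNHEALTHY
    elapsed = clock() - start
    if elapsed > degraded_after_seconds:
        return HealthStatus.DEGRADED
    return HealthStatus.HEALTHY
```

Passing the clock in as a parameter keeps the threshold logic testable without real network calls or sleeps.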