[Bug]: [aforce] Memory leak issue in Appsmith's database integration causing website downtime. #34028

Open
1 task done
rohan-arthur opened this issue Jun 6, 2024 · 17 comments · Fixed by #36631
Labels: Bug, High, Integrations Product, MariaDB, Needs Triaging, Production, QA Pod, QA, Query & JS Pod

Comments

@rohan-arthur
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Description

slack link

Steps To Reproduce

no clear reproduction steps

Public Sample App

No response

Environment

Production

Severity

High (Blocker to building or releasing)

Issue video log

No response

Version

Self hosted

@rohan-arthur rohan-arthur added Bug Something isn't working Needs Triaging Needs attention from maintainers to triage MariaDB MariaDB datasource labels Jun 6, 2024
@Nikhil-Nandagopal Nikhil-Nandagopal added Critical This issue needs immediate attention. Drop everything else High This issue blocks a user from building or impacts a lot of users Production labels Jun 6, 2024
@github-actions github-actions bot added the Integrations Product Issues related to a specific integration label Jun 6, 2024
@sneha122
Contributor

The root cause of this issue is still unknown. So far we have done the following:

  1. Tried to reproduce it by creating a MariaDB instance on AWS RDS and an Appsmith app with multiple pages, each running multiple on-load queries, then observing memory usage on the AWS dashboard. Nothing looked suspicious while the application was in use. The only thing we observed was that the active-memory trend kept increasing even when the application was not in use, but this did not stop the app or cause any lag.
     (Screenshot 2024-06-10 at 7 06 21 PM)
  2. Raised a couple of questions with the user to understand their environment. We are awaiting their response before taking further action.

@rohan-arthur rohan-arthur removed the Critical This issue needs immediate attention. Drop everything else label Jun 17, 2024
@github-actions github-actions bot added the Query & JS Pod Issues related to the query & JS Pod label Jun 17, 2024
@rohan-arthur rohan-arthur added Critical This issue needs immediate attention. Drop everything else and removed High This issue blocks a user from building or impacts a lot of users labels Jun 17, 2024

This critical issue has not seen activity for a while. It will be closed in 7 days unless further activity is detected or the Critical tag is removed.

@github-actions github-actions bot added the Stale label Jun 24, 2024
@okletsov

okletsov commented Jul 3, 2024

The memory leak happens in our prod app (this GitHub issue is the result of my communication with support). Let me know if there is any additional information we can provide to help troubleshoot.

@github-actions github-actions bot removed the Stale label Jul 3, 2024
@rohan-arthur
Contributor Author

rohan-arthur commented Jul 4, 2024

@okletsov
Thank you for sharing more context and following up. Apologies for not getting to this sooner: we are a small team and have had more urgent issues to work on.

We have struggled to reproduce the error situation, so any clues or suggestions would help to make our debugging easier.

@rohan-arthur rohan-arthur added High This issue blocks a user from building or impacts a lot of users and removed Critical This issue needs immediate attention. Drop everything else labels Jul 4, 2024
@rohan-arthur rohan-arthur assigned rohan-arthur and unassigned sneha122 Jul 4, 2024
@sneha122
Contributor

@okletsov Thank you so much for your patience.
I have gone through the code to understand what could cause connections not to be released and eventually lead to a memory leak. I would like to understand the nature of the datasources and queries in your application that run on page load.
My hunch right now is that you may have multiple datasources (connecting to the same DB) with queries on page load. This can create multiple connection pools to the same database and end up exhausting the connections on that database. Could you let me know whether the 11 queries that run on page load belong to the same datasource or to different datasources (connecting to the same DB)?

Furthermore, I am creating an action item here to add more logs around the connection pool, so that we can observe active connections, connection pool size, etc.
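
To make the hunch above concrete, here is a minimal sketch of the arithmetic, assuming each datasource gets its own connection pool with its own cap. The `Datasource` class, the host name, and the pool size of 5 are hypothetical illustration values, not Appsmith's actual configuration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Datasource:
    name: str           # hypothetical datasource label
    db_host: str        # database the datasource points at
    max_pool_size: int  # assumed per-datasource pool cap


def worst_case_connections(datasources):
    """Sum the pool caps per database, assuming one pool per datasource."""
    totals = defaultdict(int)
    for ds in datasources:
        totals[ds.db_host] += ds.max_pool_size
    return dict(totals)


# Illustrative values only: three datasources aimed at one MariaDB host.
datasources = [
    Datasource("reports", "mariadb-prod", 5),
    Datasource("orders", "mariadb-prod", 5),
    Datasource("users", "mariadb-prod", 5),
]
print(worst_case_connections(datasources))
```

Under these assumptions the single host can see up to 15 connections, three times what any one pool's cap suggests, which is why it matters whether the on-load queries share a datasource.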

@okletsov

okletsov commented Jul 27, 2024 via email

@okletsov

okletsov commented Jul 27, 2024 via email

@sneha122
Contributor

Hello @okletsov

Thanks a lot for the information provided. Unfortunately, I am not able to see the image that you attached. It would be great if you could schedule a call using this link, which would help us quickly analyse the database connections.

sneha122 added a commit that referenced this issue Jul 30, 2024
## Description

This PR adds logs around connection pool metrics in order to debug
memory leak issue #34028

The following metrics are logged for Get structure and Execute query calls:
- Acquired - Number of connections acquired from the pool
- Idle - Number of connections sitting idle in the connection pool
- Allocated - Number of connections in the pool, whether active or idle
- Pending - Number of connections pending to be acquired.

This information can help us understand whether connections are not being
released back to the pool, leading to a memory leak.

Fixes #35158  

## Automation

/ok-to-test tags="@tag.Datasource"

### 🔍 Cypress test results
<!-- This is an auto-generated comment: Cypress test results  -->
> [!TIP]
> 🟢 🟢 🟢 All cypress tests have passed! 🎉 🎉 🎉
> Workflow run:
<https://github.com/appsmithorg/appsmith/actions/runs/10148654649>
> Commit: 2588f79
> <a
href="https://internal.appsmith.com/app/cypress-dashboard/rundetails-65890b3c81d7400d08fa9ee5?branch=master&workflowId=10148654649&attempt=1"
target="_blank">Cypress dashboard</a>.
> Tags: `@tag.Datasource`
> Spec:
> <hr>Mon, 29 Jul 2024 17:38:32 UTC
<!-- end of auto-generated comment: Cypress test results  -->


## Communication
Should the DevRel and Marketing teams inform users about this change?
- [ ] Yes
- [x] No


<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->
## Summary by CodeRabbit


- **New Features**
- Enhanced logging capabilities for the MySqlPlugin connection pool
metrics, improving observability during database operations.

- **Bug Fixes**
- Improved monitoring tools to help identify potential memory leak
issues related to connection pool usage.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: “sneha122” <“sneha@appsmith.com”>
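
As an illustration of how the four logged metrics from the commit description above could be read, here is a rough sketch. The `PoolSnapshot` class and the `suspect_unreleased` heuristic are hypothetical and are not part of the MySqlPlugin; the consistency check assumes, per the commit's own definition, that Allocated counts active plus idle connections.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PoolSnapshot:
    acquired: int   # connections currently handed out by the pool
    idle: int       # connections sitting unused in the pool
    allocated: int  # connections in the pool, active or idle
    pending: int    # acquisitions still waiting for a connection


def is_consistent(snapshot):
    """Allocated should equal acquired plus idle under the assumed definition."""
    return snapshot.allocated == snapshot.acquired + snapshot.idle


def suspect_unreleased(snapshots, min_samples=3):
    """Flag a window where connections stay acquired and none return to idle.

    This is a heuristic, not proof of a leak: a long-running query
    produces the same pattern.
    """
    if len(snapshots) < min_samples:
        return False
    window = snapshots[-min_samples:]
    never_decreases = all(
        later.acquired >= earlier.acquired
        for earlier, later in zip(window, window[1:])
    )
    return never_decreases and all(s.acquired > 0 and s.idle == 0 for s in window)


# Illustrative log readings only.
healthy = [PoolSnapshot(5, 0, 5, 0), PoolSnapshot(2, 3, 5, 0), PoolSnapshot(0, 5, 5, 0)]
stuck = [PoolSnapshot(5, 0, 5, 2), PoolSnapshot(5, 0, 5, 4), PoolSnapshot(5, 0, 5, 6)]
print(suspect_unreleased(healthy), suspect_unreleased(stuck))
```

A growing Pending count alongside a flat Acquired count, as in the `stuck` readings, is the pattern these logs are meant to expose.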
@okletsov

okletsov commented Jul 31, 2024 via email

@okletsov

Hello, Appsmith team. Any updates on the issue? We are back in our busy season, and this memory leak is bringing our website down 2-3 times a week.

@sneha122
Contributor

sneha122 commented Sep 24, 2024

Hi @okletsov, thanks a lot for reaching out again! I deeply apologise that we haven't been able to follow up on this due to other priorities. If possible, could you please share the application JSON (you can get it by exporting the app) and read-only DB access, as discussed in the last call? You can send these details to us by email at support@appsmith.com.

For context, here is what we discussed during the last call:

  • The application runs 11 queries on page load, based on views created in the MariaDB database
  • When the queries run, Appsmith creates up to 55 connections to the DB, which get released after 5 mins (connection lifetime)
  • However, even after connections are released, the memory usage doesn't drop back to original levels
  • This leads to periodic reboots/restarts to free up memory when it reaches a set threshold
  • The issue seems specific to using Appsmith, as the application didn't face this issue earlier
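
One way to check the "memory doesn't drop back" observation in the list above is sketched below, assuming memory readings are available as (minute, used MB) pairs from whichever host is being monitored. The function name, the 10% tolerance, and the sample readings are hypothetical; only the 5-minute connection lifetime comes from the summary above.

```python
def memory_not_reclaimed(samples, burst_end_min, lifetime_min=5.0, tolerance=0.10):
    """Return True if memory stays above baseline after connections expire.

    samples: list of (minute, used_mb) pairs; the first pair is the baseline.
    burst_end_min: minute at which the page-load queries finished.
    lifetime_min: connection lifetime, 5 minutes per the discussion.
    tolerance: assumed slack above baseline that still counts as recovered.
    """
    baseline = samples[0][1]
    release_time = burst_end_min + lifetime_min
    after_release = [mb for minute, mb in samples if minute >= release_time]
    if not after_release:
        raise ValueError("no samples taken after the connections were released")
    # Use the lowest post-release reading so a brief spike is not misread.
    return min(after_release) > baseline * (1 + tolerance)


# Illustrative readings only, not measurements from the affected system.
leaky = [(0, 1000), (1, 1400), (3, 1450), (7, 1430), (10, 1420)]
recovered = [(0, 1000), (1, 1400), (7, 1050)]
print(memory_not_reclaimed(leaky, burst_end_min=1))
print(memory_not_reclaimed(recovered, burst_end_min=1))
```

Comparing against the lowest reading after release keeps the check conservative: it only flags the case where memory never comes back near its starting level.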

@okletsov

Hello Sneha,

A couple of clarifications/corrections:

  • we are using MariaDB (i.e. not MySQL)
  • Appsmith creates 55 (i.e. not 5) connections to the DB

Everything else is correct. I just sent the DB connection info as well as the app JSON to the email.

@sneha122
Contributor

Thanks @okletsov. I will update the details and look into the issue as well.

@NilanshBansal
Contributor

@okletsov The fix for this issue has reached Appsmith production (app.appsmith.com). Can you please confirm whether it is working for you?
cc: @rohan-arthur @pranavkanade @ame-appsmith @LagunaElectric

@okletsov

okletsov commented Oct 11, 2024 via email

@okletsov

okletsov commented Oct 11, 2024 via email

@NilanshBansal
Contributor

Thank you for the details, @okletsov. I have reopened this issue to debug further.

@NilanshBansal NilanshBansal reopened this Oct 11, 2024
7 participants