Possible Memory Leak with File source and Kafka sink #11025
Did you try running Vector with …?
I'm running it as a service with the following configuration to suppress the …
I will change to … Would …
Hi @leandrojmp ! I'll try to reproduce this. Could you provide some more details about the input? Number of files that typically match the glob pattern, average line length in those files, example line content? You mentioned running multiple pipelines. For the memory growth you are seeing, is the above config the only thing a given Vector instance is running? Or do you have a single Vector running multiple of these pipelines? If the latter, could you provide a full sample config? I just want to match your environment as closely as possible. Also, would you be open to running Vector under …?
Hello @jszwedko ! Let me try to give more context about this data.

**data**

The source of this data are logs from Cloudflare HTTP requests, which Cloudflare sends to buckets in a cloud service. Those files are downloaded to the server, a CentOS 8 VM, by a custom Python script scheduled in the crontab; the glob mirrors the structure of the buckets. I have to collect logs from multiple companies, and each company can have multiple domains. The structure of the path used in the glob is like the following:
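(The original path example was lost in formatting; the lines below are a purely hypothetical illustration of such a glob structure, not the real one.)

```
# hypothetical: one directory level per company and per domain
/data/cloudflare/<company>/<domain>/*.json
```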
**configuration**

Since I'm coming from a Logstash background, I tried to replicate the config file organization I had. In Logstash I used the …, and since Vector does not have anything like the …
The … I'm running just one Vector instance, as a systemd service, with the following configuration in the …
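(For illustration only — a sketch of a systemd unit override that points Vector at a directory of per-pipeline files; the path and the use of `--config-dir` are my assumptions, not the configuration referenced above.)

```ini
# /etc/systemd/system/vector.service.d/override.conf (hypothetical)
[Service]
# clear the packaged ExecStart, then load every pipeline file from one directory
ExecStart=
ExecStart=/usr/bin/vector --config-dir /etc/vector/pipelines
```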
**Inside files and document**

The Python script runs every minute, and it downloads an average of ~500 files with a combined size of around ~200 MB; the files have one JSON document per line. The average line size is …
If you need more information about the format of the documents, it can be found in the Cloudflare documentation. I'm not doing any parsing with Vector, just reading the lines and sending them to Kafka; the parsing is still done by Logstash consuming from Kafka.

At the moment I have a total of 40 different pipelines, 20 for HTTP Requests and 20 for Firewall Events, with similar documents and sizes; just 2 of those pipelines are running on Vector and the rest are still in Logstash. I'm planning to keep the same organization I have with Logstash, one configuration file for each pipeline, so it would be at least 40 `.toml` files.

After Vector reads the files and sends the content to Kafka, it deletes the files from disk.

The processing of the files is really fast. When I migrated those first 2 pipelines from Logstash to Vector, I had a backlog of files matching the globs, something around 50k files; Vector had no issue processing them and there was no sudden memory increase. The only issue at the moment is that the memory keeps increasing over time.

**debug**

To run Vector with … Also, I don't know if it would help, but I could generate a …

I'm running Vector directly on the host; it was installed using the … I could also migrate more of my pipelines from Logstash to Vector to see if this has any influence on the memory increase.

Hope the explanation helps!
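(On the debug question: one low-friction option is raising the log level through the environment. A sketch, assuming `VECTOR_LOG` is the right knob for the Vector version in use:)

```ini
# /etc/systemd/system/vector.service.d/debug.conf (hypothetical drop-in)
[Service]
# env-var name is an assumption; older releases may use a different variable
Environment=VECTOR_LOG=debug
```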
Hi @leandrojmp ! Thanks for the additional details! I talked with a couple of team members, who mentioned: …
Would these two screenshots help?
The time in this image is in UTC.

The time in this image is in UTC-3.

The fall in the graph is when I restarted Vector after I updated to … As you can see, both lines slowly increase with time. I could test running with cgroups later; I think I will let it run for the next couple of days to see what happens when it starts approaching the memory limit.
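(For the cgroups test, a sketch of a systemd drop-in that caps the service's memory; the 4G value is an arbitrary assumption.)

```ini
# /etc/systemd/system/vector.service.d/memory.conf (hypothetical)
[Service]
# systemd enforces this limit through the cgroup memory controller
MemoryMax=4G
```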
Just an update: I'm letting Vector run to see what happens. The memory still shows the same behaviour, increasing in small steps; right now it is at …
Yeah, it looks like a memory leak. The … It seems that … Below are the system logs:
I have now updated to version …
The version … The workaround at the moment is to restart the Vector server daily with a cron job. @jszwedko, is there any other kind of information I could gather to help track this issue?
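(For reference, a hypothetical crontab entry for that workaround; the 03:00 schedule is an assumption.)

```
# root's crontab: restart Vector once a day at 03:00 (hypothetical)
0 3 * * * systemctl restart vector
```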
Hi @leandrojmp , I think this might actually be the same issue as #11995 . Do you observe the …?
Hello, how do I check that? I'm not monitoring the …
Hey! To collect that metric, you will need to add the … That is an interesting idea though, to allow interrogating that without wiring up the …
Hello, I created the following pipeline to get the metrics:
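(The snippet itself was lost in formatting; below is a minimal sketch of an internal_metrics-to-Kafka pipeline of the kind described. The transform, broker address, and topic are my assumptions.)

```toml
# Sketch: expose Vector's own metrics and ship them to Kafka as JSON
[sources.vector_metrics]
type = "internal_metrics"

# kafka is a log sink, so convert the metric events to log events first
[transforms.metrics_as_logs]
type = "metric_to_log"
inputs = ["vector_metrics"]

[sinks.metrics_kafka]
type = "kafka"
inputs = ["metrics_as_logs"]
bootstrap_servers = "kafka01:9092"   # assumed broker
topic = "vector-internal-metrics"    # assumed topic
encoding.codec = "json"
```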
Consuming the data from Kafka, I saw that the …
Hi @leandrojmp , Gotcha, this does sound like it is the same issue as #11995 then, thanks for verifying! We plan to address that issue in the coming quarter so it should resolve this too. |
I'll close this issue since we are tracking it in #11995 . Please follow along there.
Reopening based on #14789 (comment) |
We believe this to be closed by #18634 |
@jszwedko the next release, 0.33, right? I've just updated to 0.32 but haven't checked the memory yet; we have a restart script on cron, but I will wait for 0.33 to confirm it.
Ah, yes, that is correct. The expected fix will be included in v0.33.0 which is slated for this week. |
Community Note
Vector Version
Vector Configuration File
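(The configuration block did not survive formatting; the sketch below reconstructs the kind of file-to-Kafka pipeline described in this thread. Every path, broker, topic, and option name here is an assumption, not the original config.)

```toml
# Sketch only: tail newline-delimited JSON files and forward raw lines to Kafka
[sources.cloudflare_logs]
type = "file"
include = ["/data/cloudflare/*/*/*.json"]  # hypothetical glob
remove_after_secs = 0                      # delete files once read (option name assumed)

[sinks.kafka_out]
type = "kafka"
inputs = ["cloudflare_logs"]
bootstrap_servers = "kafka01:9092"  # assumed
topic = "cloudflare-http-requests"  # assumed
encoding.codec = "text"             # ship lines as-is; parsing stays in Logstash
```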
Debug Output
Expected Behavior
Vector periodically releasing memory
Actual Behavior
Vector memory usage increases over time
Additional Context
Hello, we are migrating some simple data pipelines from Logstash to Vector. These pipelines just read JSON files and send the events in the JSON to a Kafka topic; the files are Cloudflare HTTP request and firewall event logs, so we have something around 15 to 20 pipelines.
At the moment we have migrated two of those pipelines using the configuration shared before. We are using one `.toml` file for each source, and it is working as expected; the only issue we found is that the memory usage of the Vector process increases with time, and if we let the service run for a couple of days it will eventually consume all the memory on the server.

The logs are collected using custom Python scripts that only download the JSON files and put them in a folder for Vector to consume; the Python scripts are called from crontab each minute, and Vector runs as a systemd process.
Vector is the only service (besides the system services) running on this server. The load and CPU usage are pretty low; the server runs on GCP and has 8 vCPUs and 8 GB (7.63 GB) of memory. With Vector stopped, the memory usage is around 900 MB; when we start Vector the memory starts increasing and is only released if we restart the Vector service. We tried reloading with `kill -1 PID`, but it didn't have any effect.

References