open() seems to not always return the full data if that wasn't loaded in __VU == 0 #1771
Can you share some more details?
Yes, it truncates the data, which made JSON.parse fail in my particular case and threw an exception. It seemed like some kind of race in the afero FS, but I decided against looking into it too deeply at the time in favor of actually finishing some stuff before the holidays :)
I have seen this as well on 0.29.0, running on Ubuntu WSL on Windows 10. I made a wrapper around open() that always prints the length of the string it read from the file, and every now and then the size is 0. I've been trying to reproduce for hours, but I can't seem to find it. This is what my wrapper looks like:

```js
function TextFileToString(filename) {
  console.log(`Running TextFileToString() for file ${filename} by using K6's open() method for vu ${__VU}`);
  const fileContents = open(filename);
  const script = __ENV.scriptFilename ? __ENV.scriptFilename : "<unknown>";
  console.log(`script ${script} reads ${fileContents.length} chars from ${filename} for vu ${__VU}`);
  return fileContents;
}
```

I do a JSON.parse() on the result of that wrapper, which of course fails on an empty string. This is what I have seen multiple times:
So VU 5 seems to have open() returning an empty string, while the other VUs are reading it just fine. This is the code I'm using to try and reproduce (so far, no success), which is basically an example from k6.io with even less complexity:

```js
import http from 'k6/http';
import { sleep, check } from 'k6';
import { Counter } from 'k6/metrics';

const filename = 'foo.json';
const data = open(filename);
if (data.length != 99090) {
  console.error(`VU ${__VU} read a length of ${data.length} for ${filename}`);
  throw new Error("findme");
}

export const requests = new Counter('http_reqs');

export const options = {
  stages: [
    { target: 1, duration: '5s' },
  ],
  thresholds: {
    requests: ['count < 100'],
  },
};

export default function () {
  const res = http.get('http://test.k6.io');
  sleep(1);
  const checkRes = check(res, {
    'status is 200': (r) => r.status === 200,
    'response body': (r) => r.body.indexOf('Feel free to browse') !== -1,
  });
}
```

And I'm trying to reproduce it like this:

```sh
while true; do timeout 2s k6 run --paused issue-open-test.js --vus 10 >> test.log 2>&1; done
```

But so far I have not caught the exception. I'm using --paused because the error happens in the init phase. Any hints on how to reproduce?
Another interesting find (that is not documented): I implemented a retry for open() with an exponential backoff, so if open() returns a 0-length string, I print a warning and retry a few times until I reach maxRetries or maxRetryTime. I can't reproduce the 0-length string on a non-empty file, so I created an empty file, started the script, waited for the warning, and within the ~10 retries in 1 minute, filled the file with valid JSON and saved it. My retry implementation never recovered, so I can only conclude that open() doesn't take file system changes into account. Could it be that k6 works on some kind of snapshot of the filesystem and ignores changes after it has started? Or maybe there is some caching on open(), so that I can never make it recover by retrying?
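The retry-with-backoff workaround described above could be sketched roughly like this. This is a hypothetical illustration, not the poster's actual code: `readFn` stands in for k6's open(), and `busyWait` is used because k6's init context has no setTimeout.

```javascript
// Hypothetical sketch of a retry-with-exponential-backoff wrapper around a
// file-reading function. `readFn` stands in for k6's open(); all names here
// are illustrative.
function openWithRetry(readFn, filename, maxRetries = 5, baseDelayMs = 100) {
  let delay = baseDelayMs;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const contents = readFn(filename);
    if (contents.length > 0) {
      return contents;
    }
    console.warn(`read 0 chars from ${filename}, retry ${attempt + 1}/${maxRetries}`);
    busyWait(delay); // no setTimeout in k6's init context, so spin
    delay *= 2;      // exponential backoff
  }
  throw new Error(`kept reading an empty string from ${filename}`);
}

function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) { /* spin */ }
}
```

As the comment above notes, though, this never recovers when the underlying cause is k6's caching rather than a transient empty read.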
It looks like my retry is working, so I'm suspecting it's indeed a k6 issue. This is the redacted output of my script:
The interesting part is what happened when reading file25.json.
The script chooses random files to read, so not all VUs read the same files. What I see here is that an open() of file25.json returned a 0-length string, but a retry "fixed" it. I'll check if I can add better logging to see in which VU these things are happening.
Better logging says this:
So there are 3 VUs trying to read the same file:
Assuming that the order of the log lines has meaning (?), could it be that VU 7 has not returned from open() yet?
Hi @marnikvde, from my experiments, there is only a problem if I haven't opened the file when `__VU` was 0.
Yes, k6 does cache opened/loaded files. I would argue that subsequent VUs should not be able to open files that weren't opened in the 0-th VU. We have had problems with that for cloud users who have logic that only loads files in some cases or randomly (as you've tried): when we create the archive (tar), we only include the files that were actually loaded during that initial run.

The issue most definitely is that we open and save (cache) to a shared resource: both a map from string to afero.Fs and an afero.Fs implementation itself (which has an additional problem, and we should probably move to io/fs when it gets stabilized). So there are possible race conditions for both the map and the afero.Fs (we are hitting the second one with this issue). Adding locking will fix the races, but will not make the archive problem go away.

For the record, this has probably been the case forever; even before my last refactoring practically everything I said above was the same, I just moved it around. I was under the (obviously) wrong impression that nobody would even be able to load a file that wasn't loaded in `__VU == 0`.
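The behavior described here, a shared cache filled during init, with later out-of-allowlist opens rejected rather than racing, could be sketched roughly like this. This is a hypothetical illustration of the idea, not k6's actual internals; `FileCache`, `readFile`, and `seal` are invented names.

```javascript
// Hypothetical sketch of a guarded file cache: files opened during the
// init pass (__VU == 0) are cached; once the cache is sealed, any file
// that is not already cached is rejected instead of triggering a racy read.
class FileCache {
  constructor() {
    this.cache = new Map(); // filename -> contents, filled during init
    this.sealed = false;    // set once the init pass finishes
  }

  open(filename, readFile) {
    if (this.cache.has(filename)) {
      return this.cache.get(filename); // always serve the cached copy
    }
    if (this.sealed) {
      // mirror the proposed fix: fail loudly instead of returning bad data
      throw new Error(`file "${filename}" was not opened during init (__VU == 0)`);
    }
    const contents = readFile(filename);
    this.cache.set(filename, contents);
    return contents;
  }

  seal() { this.sealed = true; }
}
```

The key design point is that rejection also keeps the archive (tar) consistent: only files present in the sealed cache can ever be read, so nothing outside the archive is reachable at run time.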
More specifically my proposal to fix this is:
Thanks for that extensive comment, that's a lot of (new) information! :) I can't seem to make a simple example to catch this race condition; I have tried small and big files, lots and few samples, a few VUs and thousands of VUs, to no avail :/

I understand why (by design) you only want files known to k6 to be openable. Our current workaround is to retry a few times (with exponential backoff) if the string length of the open() result is 0.

We have not looked into 0.30.0 yet, but it seems to include the feature to share data between VUs, which might be suited to load the data only once.

I think your proposal makes sense, and documenting this looks like the first step. Maybe start by documenting some of this in the open() docs.
Hi @marnikvde, so you are basically doing the CSV example (it works with other types) from https://k6.io/docs/examples/data-parameterization#handling-bigger-data-files ? As shown there, you can still load the files in the init context. Yes, the SharedArray (documentation pending ... by me 😭) practically removes the need for those hacks, so I would recommend using it ;)
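For reference, the SharedArray approach mentioned here looks roughly like this. This is a sketch for a k6 script (it only runs under the k6 runtime); `arr.json` is an illustrative path and must contain a JSON array:

```js
import { SharedArray } from 'k6/data';

// The constructor callback runs only once, in the init context;
// all VUs then get a read-only view of the same underlying data.
const data = new SharedArray('shared data', function () {
  return JSON.parse(open('./arr.json')); // must return an array
});

export default function () {
  // pick a random element per iteration
  const item = data[Math.floor(Math.random() * data.length)];
}
```

This avoids both the per-VU memory cost of loading the file in every VU and the retry hacks discussed above, since the file is read exactly once.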
This depends on the other k6 developers actually agreeing that we will do what I proposed and that this is ... the "correct" behavior :). Obviously, we can also just fix the race itself.
Implementing a limitation for open() that restricts opening files to the list of files that were opened during the initialization step (`__VU == 0`). For example, code like:

```js
if (__VU > 0) {
  JSON.parse(open("./arr.json"));
}
```

should return an error.

Closes #1771
Co-authored-by: Mihail Stoykov <MStoykov@users.noreply.github.com>
The following code:
Where arr.json is a sufficiently big JSON file, it seems to fail sometimes because the whole JSON does not get loaded.
This is, technically, a bad script, but maybe it should fail in another way that better explains what is happening, or not at all.