Unexpected LTFS mount crash <-> Failed to execute SG_IO ioctl, opcode = 0a (12). #349
Comments
I am still working on this to understand what happened, and I looked into the scatter-gather table size. I checked the value for our QLogic adapter and can confirm that it is 1024, which is higher than 256.
Even if |
I confirm that the value is 64 for this Emulex HBA.
|
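The scatter-gather table size discussed above can be read per SCSI host from sysfs. The following is a minimal sketch, assuming the hosts expose the standard `sg_tablesize` attribute under `/sys/class/scsi_host`; the helper names are hypothetical, and the threshold of 256 is simply the figure quoted in the comment above, not a documented LTFS limit.

```python
from pathlib import Path


def read_sg_tablesizes(sysfs_root="/sys/class/scsi_host"):
    """Return {host_name: sg_tablesize} for every host exposing the attribute."""
    sizes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return sizes
    for host in sorted(root.iterdir()):
        try:
            sizes[host.name] = int((host / "sg_tablesize").read_text().strip())
        except (OSError, ValueError):
            # Attribute missing or unreadable for this host: skip it.
            continue
    return sizes


def hosts_below(sizes, threshold=256):
    """List hosts whose table size is below the threshold (256 is taken from the thread)."""
    return sorted(name for name, size in sizes.items() if size < threshold)
```

Calling `hosts_below(read_sg_tablesizes())` on the affected server would list the HBAs, such as the Emulex one reported at 64, that fall under the quoted figure.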
The key is the lines below.
The log ltfs/src/tape_drivers/linux/sg/sg_scsi_tape.c Lines 202 to 204 in 3b26f03
Line 455 in 3b26f03
It looks like LTFS reports an internal error because |
I changed the code to reserve a transfer buffer in the sg device earlier, in #164 and #215. ltfs/src/tape_drivers/linux/sg/sg_tape.c Lines 1369 to 1376 in 3b26f03
Please check the log of Normally it would be
|
It looks like the code base is too old. The enhancement is implemented in 2.4.2.0 and later.
By the way, the LTFS project has released new versions that fix critical issues. I strongly recommend using 2.4.3.1 or 2.4.4.0; these are the safest versions at this time. See the details at https://github.com/LinearTapeFileSystem/ltfs/releases. |
Thanks @piste-jp-ibm for your quick and valuable support. We will upgrade to 2.4.4.0, or even to the branch that includes the code change you referred to:
I will keep you posted shortly. |
Hi @piste-jp-ibm, thanks again, we have updated the LTFS version as you suggested. We are now running the test with OpenLTFS from the For now, this works as expected; we have not yet executed a long copy run like the one at the end of last week, but the status looks better so far. We can confirm that the reserved buffer size of /dev/sg3 is 524288. I will keep you informed about the status after more in-depth testing.
Side question, for which I may have to open another ticket: do you have any idea about these logs? I have never seen this kind of event in LTFS logs:
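For readers who want to see what reserved buffer size the sg driver is willing to grant on a given device, here is a minimal sketch using the `SG_SET_RESERVED_SIZE` and `SG_GET_RESERVED_SIZE` ioctls from `<scsi/sg.h>`. The function names are hypothetical. Note that the sg reserved buffer belongs to each open file descriptor, so this does not read the value held by a running LTFS process; it only shows what a fresh descriptor obtains when it asks for the same size. It is not the LTFS implementation.

```python
import fcntl
import os
import struct

# ioctl request numbers from <scsi/sg.h>
SG_GET_RESERVED_SIZE = 0x2272
SG_SET_RESERVED_SIZE = 0x2275


def pack_int(value):
    """Encode a C int the way the sg ioctls expect it."""
    return struct.pack("i", value)


def unpack_int(raw):
    """Decode the C int returned by the sg ioctls."""
    return struct.unpack("i", raw)[0]


def probe_reserved_size(device, requested):
    """Request a reserved buffer on a fresh descriptor and return the granted size."""
    fd = os.open(device, os.O_RDWR | os.O_NONBLOCK)
    try:
        # The driver may silently grant less than requested.
        fcntl.ioctl(fd, SG_SET_RESERVED_SIZE, pack_int(requested))
        raw = fcntl.ioctl(fd, SG_GET_RESERVED_SIZE, pack_int(0))
        return unpack_int(raw)
    finally:
        os.close(fd)
```

Running `probe_reserved_size("/dev/sg3", 524288)` as root, preferably while the drive is idle, would show whether the driver grants the full 524288 bytes mentioned above or something smaller.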
Thanks again for your great support |
Interesting... You might have hit a very short window. I think you might have inserted the tape and started the In most cases, the cartridge load operation is completed after But in the end, LTFS falls back and configures the timeout value from the preset one,
|
Thanks for the feedback, glad to hear that it is a minor issue. I can confirm it is the first time I have seen this type of error message. At the same time, I can confirm that the version we deployed works fine without any further problems. And this with
|
Close this. |
Thanks @piste-jp-ibm for the great support; for now we do not have any issues anymore. |
Bug Description
We would like to report an LTFS problem we are facing in one specific case, which breaks the LTFS mount point with unexpected sg / drive / ... errors.
We are familiar with LTFS, and we have multiple different systems using the same solution, including the OpenLTFS software. We are seeing strange behavior with one specific LTFS implementation.
We mount a tape with LTFS (for example 000043L8) successfully, then start copying files into the LTFS mount point of tape 000043L8. This works fine, and the copy can run for at least one hour. Very stable.
In parallel, we start other LTFS operations, such as other LTFS mounts or LTFS format operations. These operations work fine for a few minutes (10 to 15 minutes).
Suddenly, something happens that breaks all LTFS mount operations. The first LTFS mount is no longer accessible and we can no longer copy files. We then run an LTFS check, try to unmount and remount, and after a few tries we can get it remounted. After a few minutes, the same issue occurs again.
We have experience with this kind of operation (but with other hardware components: tape library, drive, HBA, ...), and we see this situation in only one specific case.
Hardware context
To Reproduce
Execute multiple LTFS operations in parallel on multiple drives with multiple sg devices.
Expected behavior
The LTFS mounts and operations remain stable.
Trace - Log
We have different logs of the LTFS operations, and we have detected some strange issues.
We had a look at other open issues, especially the one related to the Emulex HBA, ...
In the var/log/messages, we detect LTFS issues such as "Failed to execute SG_IO ioctl, opcode = 0a (12).", but also messages from the kernel such as "kernel: st 18:0:0:0: Mode parameters changed" and "st 18:0:0:0: [sg6] tag#0 Add. Sense: Rounded parameter".
The allow_dio is at 0, and we have not used the --enable-buggy-ifs option.
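To make the LTFS message quoted above easier to read, the following sketch extracts the opcode and the trailing number from such a log line. Opcode 0x0a is WRITE(6) in the SCSI command set for tape drives. Treating the number in parentheses as a Linux errno (12 would then be ENOMEM) is only an assumption, not something the log or this thread confirms. The function name and the small opcode table are hypothetical helpers.

```python
import re

# Matches the LTFS message quoted in this report.
SG_IO_FAILURE = re.compile(
    r"Failed to execute SG_IO ioctl, opcode = ([0-9a-fA-F]{2}) \((-?\d+)\)"
)

# A few SCSI opcodes commonly sent to tape drives.
TAPE_OPCODES = {
    0x00: "TEST UNIT READY",
    0x01: "REWIND",
    0x08: "READ(6)",
    0x0A: "WRITE(6)",
    0x10: "WRITE FILEMARKS(6)",
    0x1B: "LOAD UNLOAD",
    0x2B: "LOCATE(10)",
}


def parse_sg_io_failure(line):
    """Return opcode, command name and trailing code, or None if the line does not match."""
    match = SG_IO_FAILURE.search(line)
    if match is None:
        return None
    opcode = int(match.group(1), 16)
    return {
        "opcode": opcode,
        "command": TAPE_OPCODES.get(opcode, "unknown"),
        # Assumption: this may be an errno value; the thread does not say.
        "code": int(match.group(2)),
    }
```

Applied to the line from this report, the result identifies a WRITE(6) command with code 12, which means the failure occurred while writing data to the tape.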
Extract of the logs. I tried to clean up the log, keeping the most interesting information (I have other similar logs I can share):
I have no idea what happened. Thanks for your further support.
Software Version