I've been having issues with certain cell values being recognised as dates because the cell format was set to a custom date format, even though the cell content wasn't really a date and was not supposed to be consumed as one.
For example, the string `'1-23`, stored as the string `133`, was being considered as the date `23/09/2024`, which created a number of issues for downstream processing. It should have been considered as the range `[1, 23]`, expressed as a string.
While investigating this issue, I came across this nice writeup, which indicates that this is what used to happen in "compatibility" mode, during the transition from .xls to .xlsx, but that in modern, newly created .xlsx files the behaviour should be different: a cell should only be considered a date cell if it is of type date (`"d"`), irrespective of whether its format is a date format.
Looking at the current implementation of the `is_date?` method, it appears to implement the "compatibility" mode and not the "strict", modern mode. It would be great if the code could be updated to support the "strict" mode.
    def is_date?
      return true if datatype == RubyXL::DataType::DATE # From Office 2010
      return false unless # Only fully numeric values can be dates
        case raw_value
        when Numeric then true
        when String then raw_value =~ NUMBER_REGEXP
        else false
        end

      self.number_format&.is_date_format?
    end
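To make the request concrete, here is a minimal sketch of how the two modes could differ. It is a standalone illustration, not rubyXL's API: `date_cell?`, its keyword arguments, the `strict:` flag and `NUMERIC_STRING` are all hypothetical names, and the regular expression is only an assumed stand-in for `NUMBER_REGEXP`.

```ruby
# Hypothetical sketch; none of these names exist in rubyXL.
# Assumed stand-in for rubyXL's NUMBER_REGEXP.
NUMERIC_STRING = /\A-?\d+(\.\d+)?\z/

# datatype:    the cell's type attribute, e.g. 'd' (date) or 's' (shared string)
# raw_value:   the stored cell value
# date_format: whether the cell's number format is a date format
# strict:      when true, only the cell type decides (hypothetical flag)
def date_cell?(datatype:, raw_value:, date_format:, strict: false)
  return true if datatype == 'd' # explicitly typed as a date
  return false if strict         # strict mode ignores the number format

  # Compatibility mode: a fully numeric value with a date format is a date.
  numeric = case raw_value
            when Numeric then true
            when String then raw_value.match?(NUMERIC_STRING)
            else false
            end
  numeric && date_format
end

# A string cell whose stored value is '133', with a date format applied:
puts date_cell?(datatype: 's', raw_value: '133', date_format: true)
puts date_cell?(datatype: 's', raw_value: '133', date_format: true, strict: true)
```

With these assumptions, the first call treats the cell as a date and the second does not, which is the difference described above.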
Can you please provide an example file which doesn't work the way it should? It looks like the issue here is more complex than you think, and I need an actual file to test it on.
Here are 2 simple test files. I discovered on this occasion that, to set strict mode, you have to go to File/Options/Save. This is different from Word, which displays a dialog box on save. So it may be that most .xlsx files are actually saved in compatibility mode.
Links referenced in the issue description:
Writeup on dates in strict files: http://www.ericwhite.com/blog/dates-in-strict-spreadsheetml-files/
Current implementation of `is_date?`: rubyXL/lib/rubyXL/objects/sheet_data.rb, line 93 in b719e43