Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revise json converter time #351

Merged
merged 3 commits into from
Sep 7, 2022

Conversation

hecon5
Copy link
Contributor

@hecon5 hecon5 commented Sep 7, 2022

In my project, I had to overhaul my time processing for various time zones. As a result, I also updated the json parser used.

I separated out the two (they were originally two, but the upstream merged the two files) back apart to allow a smaller project that doesn't use Json to just use the time stamping functions (used largely for logging).

modUtcConverter is used extensively by us for time stamping, time conversion (down to millisecond accuracy, but could go lower if needed), and server communications. In this tool, it would allow the index to use a time stamp that won't be upset if your devs are in different times.
We use a timestamp like this: dev releases version, version is timestamped in UTC.
When a user loads the front end, it presents in local time, ensuring that they don't see a release date in the future (this causes them great panic, for some reason).

While updating json, I incoroporated more of the changes on the vba/json project PRs with a few tweaks to ensure consistency.

Let me know what you think @joyfullservice !

@hecon5 hecon5 changed the base branch from master to dev September 7, 2022 15:54
@joyfullservice
Copy link
Owner

Thanks, @hecon5! This looks pretty good to me, from a basic review of the changes (without actually testing it out myself). It's kind of too bad some of the VBA-tools projects are not more actively maintained from a PR standpoint, but my hunch is that the main reason for this is the difficulty of testing changes without more comprehensive unit testing in place.

I like how you split out the time zone conversion code from the JSON converter. Although it creates a dependency for the JSON converter, it allows us to use the time zone conversion on its own.

Is this ready to merge now, or are there additional tweaks you are wanting to add? (If it is still in progress, you can convert the PR to a draft, then I will know it is not yet ready for merge.)

…re output returns as string and not a date type. Note: This creates an item as a `date` type, in LOCAL time.
@hecon5
Copy link
Contributor Author

hecon5 commented Sep 7, 2022

I think this is ready; I've done a fair bit of testing, but don't have a lot of time values that get stored in JSON (most of our timestamps are stored in a log file or sent to a server).

I'm doing some additional final checks and tests to verify dictionary parsing into date types is correctly detected; found one last bug after two days of checking.

Note that an ISO time creates a LOCALIZED Date type upon creation.

tjsonOutput = String_BufferToString(json_Buffer)
' only test for ISO format when NoConvertDateToUTC is Off
If (Not JsonOptions.NoConvertDateToUTC) And (tjsonOutput Like "####-##-##T##:##:##*") Then
json_ParseString = ParseIso(tjsonOutput)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: This will output a localized (to the user) string output formatted like a VBA.Date to String; it may be better to leave as ISO string and keep any time string in UTC format.

There really isn't (without a heavy re-write) a good way to parse an ISO string to date type given how the json converter is written, but in my testing converting that string to a Date has been reliable enough for my purposes.

If not, can remove this conversion and leave as an ISO string, so the returned dictionary object is an ISO string and you'll just need to remember it's an ISO string and not a Date.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ooh, coworker test figured out a way to convert back to date type; updating now.

@hecon5
Copy link
Contributor Author

hecon5 commented Sep 7, 2022

@joyfullservice, I'm done...I promise. The only change I'd make is to allow someone to parse to ISO strings, but not reverse the process; as of now, it's bi-directional, so you just need to verify your localization is correct if you then use those dates.

@joyfullservice joyfullservice merged commit 757ed55 into joyfullservice:dev Sep 7, 2022
@joyfullservice
Copy link
Owner

@hecon5 - We have a pretty serious performance issue with this update... I was noticing some unusual drag when running the add-in on a more complex database, and it came back to this change.

The database has about 1,600 objects, and the index file is 20K lines of JSON. On my machine this normally parses in 0.51 seconds. After the changes to the JSON module, it now takes 9.24 seconds to parse the same file. That's over 18 times slower.

I did a little digging on it, and I am still getting the same performance issue even after pulling the time conversion out of the equation. It seems to be something with the string handling in the JSON conversion module itself.

Based on the massive performance hit, I am going to revert modJsonConverter to the previous working version. I will go ahead and leave modUtcConverter in the project for now, in the hopes that you can resolve the performance issues in your updated JSON converter.

If you need a larger file for testing, let me know and I can send you a copy of the index file I am working with.

@hecon5
Copy link
Contributor Author

hecon5 commented Sep 8, 2022

hmm, I was wondering about that, we noticed some smaller changes, but nothing so extreme as 18 times.

Thinking on it:

  1. I think it may rip out the upstream string handling and use clsConcat as a string builder instead
  2. If it's comparing strings to Like ##-.... I wonder if that's expensive given the constraints of the system. Before I get too into the weeds, I wonder if just not re-converting to a Variant/Date/String may be the expensive part.

Comment on lines +532 to +538
tjsonOutput = String_BufferToString(json_Buffer)
' only test for ISO format when NoConvertDateToUTC is Off
If (Not JsonOptions.NoConvertDateToUTC) And (tjsonOutput Like "####-##-##T##:##:##*") Then
json_ParseString = ParseIso(tjsonOutput)
Else
json_ParseString = tjsonOutput
End If
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this (and the conversion to / from variant types is too expensive, leaving as an ISO string and parsing as needed may be a better way to deal with conversions. Are you able to test if you change the Function Declaration to As String, remove the json_ParseString = vbNullString (ensures we don't allocate the memory incorrectly). if that makes any difference at all, and setting lines 532 to 538 to json_ParseString = String_BufferToString(json_Buffer) if that makes a difference.

Comment on lines +762 to +764
' To use, your calling routine needs to store the StringBufferCache to be handed back.
Private Sub String_BufferAppend(ByRef StringBufferIn As StringBufferCache _
, ByRef String_Append As Variant)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're seeing that performance hit, I think since we already have clsConcat it would be better to just use it instead of this, even with upstream changes possible.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am open to this, although this would further widen the gap between our version and the upstream source.

I guess my inclination would be that if the upstream works well without modifications, use it with the minimal modifications needed for our purposes. If the upstream doesn't work well, or is not being actively maintained, I am fine with doing more significant rewrites. I just want to be a bit careful here because the more different the two versions are, the more difficult it becomes to keep our version current with the latest updates in the upstream source.

The VBA JSON conversion is pretty widely used, much more than our add-in, so I would expect to see at least periodic updates/enhancements over time. This would favor the idea of keeping our version more closely aligned. On the other hand, it has been widely used and battle-tested for several years, so we might be okay with running with our own forked version since presumably most of the bugs have been worked out at this point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood on the divergence, however, with the last code update in 2019, 94 open issues, 12 open and unreviewed PRs, and no other VBA tools repos having updates, I kinda feel like it's been abandoned, or at least isn't going to get any updates any time soon, so for me, I don't really think a fork is a bad idea at this point.

joyfullservice added a commit that referenced this pull request Sep 8, 2022
Some recent updates resulted in a massive performance hit. Reverting changes till the performance issues are resolved. #351
@hecon5
Copy link
Contributor Author

hecon5 commented Sep 12, 2022

@joyfullservice, as I troubleshoot this: are you noticing a performance hit when you convert to json, or back to a dictionary?

@hecon5
Copy link
Contributor Author

hecon5 commented Sep 12, 2022

Also, can you share that json file? that'd be extremely helpful for me to test without bugging you too much! I think I may have found a potential culprit, but need a big file to build to verify.

@joyfullservice
Copy link
Owner

Here is what I was using for testing...

Public Sub TestLoadFile()
    Perf.StartTiming
    ReadJsonFile "...full-path...\vcs-index.json"
    Perf.EndTiming
    Debug.Print Perf.GetReports
End Sub

I would be happy to provide a file for testing, but instead of posting it publicly, I would rather e-mail it to you directly. If you can drop me a quick email, and I will attach it to my reply.

@hecon5
Copy link
Contributor Author

hecon5 commented Sep 12, 2022

Sure thing; I created a private repo and invited you; this avoids needing you to post your email or me to do the same online. I get enough spam as it is...

@hecon5
Copy link
Contributor Author

hecon5 commented Sep 12, 2022

Ok, after digging around, I think I discovered one potential issue:

within clsVCSIndex.Update the import/export date is converted to a string before being stored in the dictionary dItem; so the export doesn't know any better. The updated JSON exporter "Should" know the export formatting for dates, and should export an ISO8601 compliant string.

I'm testing some updates now, but preliminary investigation I've found json_ParseValue to be a loss leader:

--------------------------------------------------
                PERFORMANCE REPORTS
--------------------------------------------------
Operations                    Count     Seconds   
--------------------------------------------------
modJsonConverter.json_ParseSt 30684     0.70      
    ring                                          
modJsonConverter.json_ParseVa 16992     9.99      
    lue                                           
modStringTools.FitStringToCol 9         0.00      
    umn                                           
modStringTools.ListResultInde 4         0.00      
    nt                                            
Parse JSON                    1         0.02      
Read File                     1         0.01      
--------------------------------------------------
TOTALS:                       47695     10.71     
--------------------------------------------------

@hecon5
Copy link
Contributor Author

hecon5 commented Sep 12, 2022

ok, after refactoring to using clsConcat ,there's still about 6 seconds of parsing going on. I did find a few places where upstream vbaJsonConverter had changed to ByVal instead of ByRef, and that made a fairly big difference, so I'm still investigating what's going on.

TestLoadFile
--------------------------------------------------
                PERFORMANCE REPORTS
--------------------------------------------------
Category                      Count     Seconds   
--------------------------------------------------
TestLoadingFile               1         5.98      
--------------------------------------------------
TOTALS:                       1         5.98      
--------------------------------------------------

--------------------------------------------------
Operations                    Count     Seconds   
--------------------------------------------------
modJsonConverter.json_ParseKe 16992     0.44      
    y                                             
modJsonConverter.json_ParseOb 3299      3.00      
    ject                                          
modJsonConverter.json_ParseSt 30684     1.00      
    ring                                          
modJsonConverter.json_ParseVa 16992     0.39      
    lue                                           
modStringTools.FitStringToCol 24        0.00      
    umn                                           
modStringTools.ListResultInde 9         0.00      
    nt                                            
Parse JSON                    1         0.01      
Read File                     1         0.00      
--------------------------------------------------
TOTALS:                       68007     4.84      
--------------------------------------------------


@hecon5
Copy link
Contributor Author

hecon5 commented Sep 12, 2022

Well, this gets interesting: removing the extra performance monitoring cuts execution time down by nearly half. I set performance items the same on both, only monitoring public functions:

Test Function:

Public Sub TestLoadFile()
    'Perf.StartTiming
    Perf.Reset
    Perf.DigitsAfterDecimal = 4
    Perf.CategoryStart "TestLoadingFile"
    ReadJsonFile "C:\Users\G2HDCHPL\Downloads\vcs-index\vcs-index.json"
    Perf.CategoryEnd
   ' Perf.EndTiming
    Debug.Print Perf.GetReports
End Sub

Original modJsonConverter on my machine:

TestLoadFile
--------------------------------------------------
                PERFORMANCE REPORTS
--------------------------------------------------
Category                      Count     Seconds
--------------------------------------------------
TestLoadingFile               1         2.9370
--------------------------------------------------
TOTALS:                       1         2.9370
--------------------------------------------------

--------------------------------------------------
Operations                    Count     Seconds
--------------------------------------------------
Read File                     1         0.0036
Parse JSON                    1         2.9195
--------------------------------------------------

Revised modjsonConverter with clsConcat:

TestLoadFile
--------------------------------------------------
                PERFORMANCE REPORTS
--------------------------------------------------
Category                      Count     Seconds   
--------------------------------------------------
TestLoadingFile               1         3.2006    
--------------------------------------------------
TOTALS:                       1         3.2006    
--------------------------------------------------

--------------------------------------------------
Operations                    Count     Seconds   
--------------------------------------------------
modStringTools.FitStringToCol 12        0.0003    
    umn                                           
modStringTools.ListResultInde 5         0.0002    
    nt                                            
Parse JSON                    1         3.1849    
Read File                     1         0.0037    
--------------------------------------------------
TOTALS:                       24        3.1892    
--------------------------------------------------

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants