Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed ups for navigating Microsoft Excel spreadsheets, especially for those containing comments / dropdowns #9257

Merged
merged 22 commits into from
Feb 17, 2019

Conversation

michaelDCurran
Copy link
Member

Link to issue number:

Fixes #8271
Fixes #7348

Summary of the issue:

As NVDA's support for Microsoft Excel has become more rich over the years, the amount of cross-process COM calls NVDA makes in order to report a cell has significantly increased. these days, at least 65 cross-process COM calls are made to report one cell. A large chunk of these calls are to do with calculating overflowing/chropping info. In general the increasing number of calls is causing responsiveness in Excel to slow down. However, due to a serious bug in Microsoft Excel 2016 (and possibly in 2013) if a spreadsheet contains at least one comment or at least one validation dropdown list, all cross-process COM calls increase in time, usually causing responsiveness to slow down by around a factor of 7. for some spreadsheets this can mean the NVDA user is waiting up to 1.5 seconds or more when navigating from one cell to another.
NV Access has been working with Microsoft to try and solve the underlying bug in Excel itself, and there may be a fix coming from Microsoft's side in the near future. However, existing builds of Microsoft Excel would not likely get this fix.

Description of how this pull request fixes the issue:

Some helper functions have been added to nvdahelper inproc utils which allows NVDA to fetch all required information for an Excel cell in one cross-process call. This totally removes the lag seen in spreadsheets containing comments and d or dropdowns, and may provide an improvement to other spreadsheets, especially if we start adding more features again in the future.
A small speed up has also been achieved when listing comments and formulas in the NVDA Elements list, again by fetching all information in one call.

Testing performed:

Tbd. Requires plenty :)

Known issues with pull request:

As NVDA cannot inject into Excel's process when a spreadsheet is opened in Protected View, we can only support very basic information such as a cell's address and content. Fetching further information would simply cause the original bug. At least this way a basic information can be read in a Spreadsheet in Protected view. Previously performance was so bad it was pretty much pointless to try anyway.

Change log entry:

Bug fixes:

#include "nvdaInProcUtils.h"


// Excel IDispatch IDs
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you considered storing these into a header file instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree here. Please move these into another header, similar to nvdaHelper/remote/WinWord/Constants.h

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

nvdaHelper/common/COMUtils.h Show resolved Hide resolved
if isMerged:
obj=ExcelMergedCell(windowHandle=self.windowHandle,excelWindowObject=self.excelWindowObject,excelCellObject=selection.item(1))
elif numCells>1:
if not isMerged and numCells>1:
obj=ExcelSelection(windowHandle=self.windowHandle,excelWindowObject=self.excelWindowObject,excelRangeObject=selection)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is good to see that this pr gets rid of the ExcelMergedCell. Is there still a good reason to stick to the ExcelSelection overlay, though?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still unsure as to how best to represent selection in Excel from a user experience point of view. Until we can settle on something good, I don't see any reason to change the current code.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also #6959 and #5401

source/NVDAObjects/window/excel.py Show resolved Hide resolved
return "%s: %s"%(self.excelCellInfo.address.split('!')[-1],self.excelCellInfo.formula)

class ExcelCellInfoQuicknavIterator(object):
QuickNavItem=None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume this has to be QuickNavItemClass instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. But it has now been replaced by an abstract property anyway.

except COMError:
#When cutting and pasting the old selection can become broken
return False
thisAddr=self.excelCellObject.address(False,False,1,True)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume you have a good reason not to use self.excelCellInfo and other.excelCellInfo here. Nevertheless, it would help if you could add info about the arguments you're passing to excelCellObject.address.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have provided detailed comments for the reason and the address arguments.

try:
otherAddr=self.getCellAddress(other.excelCellObject,True)
except COMError:
#When cutting and pasting the old selection can become broken
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This COMError catch is no longer there in the new code. Are you sure it is no longer necessary to catch COMErrors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No... I have now put it back :)

_com_dispatch_raw_propget(pDispatchFont,XLDISPID_FONT_STRIKETHROUGH,VT_BOOL,&sFontStrikeThrough);
CComBSTR sFontName;
_com_dispatch_raw_propget(pDispatchFont,XLDISPID_FONT_NAME,VT_BSTR,&sFontName);
HDC windowDC=GetDC(hwnd);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add the old comments from ExcelCell.getCellTextWidth to this part of the C++ code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. and Also addressed better error catching here for @feerrenrut

if(FAILED(res)) {
LOG_DEBUGWARNING(L"row.showDetail failed with code "<<res);
}
states+=(showDetail?NVSTATE_EXPANDED:NVSTATE_COLLAPSED);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Github somehow decided to discard my comment here.
Since states is supposed to be a bitmask, I'd personally prefer states|= over states+= for this and all subsequent cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that using += leads to a logic error.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

@feerrenrut feerrenrut left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've given this a once over. There are a few little things to fix up. It's probably worth getting this into the hands of someone who is using excel a lot to help us test it.

#include "nvdaInProcUtils.h"


// Excel IDispatch IDs
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I agree here. Please move these into another header, similar to nvdaHelper/remote/WinWord/Constants.h


long getCellTextWidth(HWND hwnd, IDispatch* pDispatchRange) {
CComBSTR sText;
_com_dispatch_raw_propget(pDispatchRange,XLDISPID_RANGE_TEXT,VT_BSTR,&sText);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Many (most, all?) are not checking the return result. Please add:

Suggested change
_com_dispatch_raw_propget(pDispatchRange,XLDISPID_RANGE_TEXT,VT_BSTR,&sText);
auto res = _com_dispatch_raw_propget(pDispatchRange,XLDISPID_RANGE_TEXT,VT_BSTR,&sText);
if( res !=S_OK){
LOG_DEBUGWARNING(L"range text failed");
return 0;
}

if(FAILED(res)) {
LOG_DEBUGWARNING(L"row.showDetail failed with code "<<res);
}
states+=(showDetail?NVSTATE_EXPANDED:NVSTATE_COLLAPSED);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think that using += leads to a logic error.

long validationType=0;
res=_com_dispatch_raw_propget(pDispatchValidation,XLDISPID_VALIDATION_TYPE,VT_I4,&validationType);
if(FAILED(res)) {
//LOG_DEBUGWARNING(L"validation.type failed with code "<<res);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this commented out?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It always produces an error. I have now suppressed this log call only for that particular error, and commented why.

@michaelDCurran
Copy link
Member Author

@LeonarddeR and @feerrenrut I believe I have addressed all your review comments. A try build has been linked to on the two related issues.

Copy link
Collaborator

@LeonarddeR LeonarddeR left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me now, just one comment. Also, I agree with @feerrenrut that this needs some testing by more frequent Excel users. However, I think Alpha might be a good stage to do this

LOG_DEBUGWARNING(L"range.validation failed with code "<<res);
}
if(pDispatchValidation) {
_com_dispatch_raw_propget(pDispatchValidation,XLDISPID_VALIDATION_INPUTTITLE,VT_BSTR,&cellInfo->inputTitle);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The results of this com call and the calls below are not yet logged on failure.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All COM calls now check for errors. Plus I removed a duplicate call to range.address which I'd accidentally left in after a previous refactor.

@dnz3d4c
Copy link

dnz3d4c commented Feb 12, 2019

I hope this pr is merged into the alpha version. Because it can be tested by more users. Testing with try build is inconvenient.

Looks good to me now, just one comment. Also, I agree with @feerrenrut that this needs some testing by more frequent Excel users. However, I think Alpha might be a good stage to do this

@codeofdusk
Copy link
Contributor

codeofdusk commented Feb 12, 2019

Does this PR close the first idea on this page? If so it should probably be edited accordingly.

@michaelDCurran
Copy link
Member Author

@codeofdusk This pr does not go as far as it can. E.g. pressing NVDA+f to fetch formatting, or if any of the formatting checkboxes are on in NVDA's document settings, Excel will still be very slow. This pr just speeds up the most common usage of navigating around cells with default settings. However, once the pr is merged I will be sure to edit that page. Particular factors meant that NV Access needed to give this particular bug much higher priority right now.

@michaelDCurran
Copy link
Member Author

michaelDCurran commented Feb 12, 2019

Although I'd like to merge this to alpha to get wide testing. @feerrenrut and I have discussed about using this codebase to fully test using the auto-generated MS Office C++ COM interfaces. This code base is much smaller than the MS Word code so shouldn't take me too long to convert. However, as there are still some open issues around using the COM interfaces rather than the raw IDispatch helper functions, I wouldn't want to hold this current pr up for that.

@michaelDCurran
Copy link
Member Author

Here is the latest try build: https://ci.appveyor.com/api/buildjobs/u3cvt0l42g757srt/artifacts/output%2Fnvda_snapshot_try-excelInproc-16752%2C35b618a6.exe
I'm extremely interested to know if there are any issues with this on early versions of Office like 2010 or 2007.

@LeonarddeR
Copy link
Collaborator

Although I'd like to merge this to alpha to get wide testing. @feerrenrut and I have discussed about using this codebase to fully test using the auto-generated MS Office C++ COM interfaces. This code base is much smaller than the MS Word code so shouldn't take me too long to convert. However, as there are still some open issues around using the COM interfaces rather than the raw IDispatch helper functions, I wouldn't want to hold this current pr up for that.

I tend to agree. Are the auto-generated MS Office C++ COM interfaces supposed to be faster? I guess we have to double check that before changing the Word code in such a major way.

@michaelDCurran
Copy link
Member Author

Actually, after playing with the auto-generated C++ COM interfaces for a while today I'm not convinced this is a good idea. At least not for Excel anyway. The main reason is that many of the methods are declared taking VARIANTs as arguments. Even things like range.text, font.size etc are all variants. This means that you need to create and initialize a variant for each argument, execute the call, and then possibly check and fetch the real value from any [out] variants. This therefore increases the complexity and amount of lines for each call significantly above what we currently have with the _com_dispatch_raw_methodcalls, as they automatically handle converting to/from the real native types. Note for MS Word, they havn't taken this kind of shortcut and therefore in future there may be still some advantage to consider doing it for MS word.
However, perhaps we can at least write some neicer wrapper functions for com_dispatch_raw* functions, somehow we could instead of declaring the dispids by themselves, instead declare the dispid along with its expected argument types, giving it a friendly name, and then using that in the actual code.
But even doing that, I'm not sure that would be of high priority.

@lukaszgo1
Copy link
Contributor

I've tested on Excel 2010 and one problem which I've noticed is the fact that on NVDA's element list some cells containing text are recognized as containing formulas. It mainly happens when the cell is a header and cells below are containing actual formulas. I am attaching a very simple sheet in which it can be reproduced. The cell c1 is recognized as containing formula whereas it contain only word test.
Excel-tests.xlsx

@michaelDCurran
Copy link
Member Author

michaelDCurran commented Feb 13, 2019 via email

@michaelDCurran
Copy link
Member Author

Here is the latest try build: https://ci.appveyor.com/api/buildjobs/746a4sf5p0di7rcd/artifacts/output%2Fnvda_snapshot_try-excelInproc-16767%2C5195b108.exe
This fixes the issue where listing formulas in the Elements list would be including the header cell above, and missing the last formula in the sheet.

@lukaszgo1
Copy link
Contributor

While the formula at the top of the column is no longer shown in the elements list there are still some empty cells which are displayed as having formula in the element list. Unfortunately I cannot share the workbook in which I have this issue here. Is there anyway to send it to you privately?

@michaelDCurran
Copy link
Member Author

michaelDCurran commented Feb 14, 2019 via email

…EnumVariant, not range.item otherwise cells in the gaps will also be fetched.
@michaelDCurran
Copy link
Member Author

michaelDCurran commented Feb 15, 2019 via email

@lukaszgo1
Copy link
Contributor

I can confirm it's fixed for me in Excel 2010 with the same sheet.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
7 participants