Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Limited Support for Form Controls V2 (ListBox, Buttons, etc.) #3130

Merged
merged 3 commits into from
Dec 28, 2022

Conversation

oleibman
Copy link
Collaborator

@oleibman oleibman commented Oct 20, 2022

This is a replacement for draft PR #2455 and draft PR #3127. There is some useful commentary in those PRs which I have mostly, but not entirely, duplicated below. Fix #2396. Fix #1770. Fix #2388. Fix #2904. Issue #3147 concerns an ODS file, so is not helped by this PR.

This fix allows form control elements to be read in and written out. It does not allow you to add such elements, nor even to locate them or determine their properties (so you can't modify or delete them). Although it handles reading and writing of sheets containing both form controls and comments, it will probably create a corrupt spreadsheet if you try adding a new comment to a sheet with form controls - probably quite difficult to solve. Cloning the sheet probably won't work either - probably easier than the other. It is conceivable that we want to add a new property to the Xlsx Reader which turns the reading of form elements on or off (default=off), so that negative effects will be limited to those who have explictly opted in. The change in its current form does not implement such a property.

A related problem is that the vml files used for the form controls sometimes contain invalid xml. Fix #3125 (rejected previous PR #1181 as too risky, issue was also reported as #170). Vml file should be valid Xml, but Excel can generate unclosed <br> tags, preventing Xlsx reader from reading file correctly. I believe a very narrowly targeted fix, changing <br> to <br/>, and only when reading vml files, probably mitigates the risk. The sample file formscomments.xlsx which is part of this change shows this problem with Button 1 on sheet Forms; the spreadsheet was created with Excel 365, so the problem is not restricted to Excel 2013 as originally reported. A comment on PR 3127 indicates that other tags might be involved, but, without a file demonstrating that, I will restrict this change to br tags for now.

I am starting this out in draft status, and will probably leave it that way for some time. I'm not sure where we want to go with this. It fixes some problems, but in a limited manner, and creates some others. I'm not sure the pain of the others is balanced considering the limitations of the fix. If enough interest is generated as a result of this ticket being out there, we can proceed; if not, it probably isn't worth it.

Because of its limitations, the change isn't really testable. As in some other recent installs, I have added a sample to demonstrate that it works correctly.

As it turns out, if we have a worksheet which contains both form controls and comments (see formscomments.xlsx which is part of this PR), PhpSpreadsheet already creates a corrupt file when it tries to load and save the spreadsheet with such a worksheet. With this change, the file is saved without corruption. This tilts things in favor of proceeding. I'm still not ready, but this will be an important consideration.

A sample file for issue #2621 illustrated a problem with shape files. Since they are involved here, I took a look at how the sample worked with this code. In master, and with this change, a corrupt file results. Fixing that is probably easier than the general problem of handling shape files, but it's an argument against moving this forward until the corruption problem can be addressed.

Fix #2661. A template including checkboxes was leading to file corruption solved by this PR. Another argument for moving forward.

This is:

- [x] a bugfix
- [ ] a new feature
- [ ] refactoring
- [ ] additional unit tests

Checklist:

  • Changes are covered by samples
    • Changes are covered by existing unit tests
    • New samples have been added
  • Code style is respected
  • Commit message explains why the change is made (see https://github.com/erlang/otp/wiki/Writing-good-commit-messages)
  • CHANGELOG.md contains a short summary of the change
  • Documentation is updated as necessary

Why this change is needed?

Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.

This is a replacement for draft PR PHPOffice#2455 and draft PR PHPOffice#3127. There is some useful commentary in those PRs which I have mostly, but not entirely, duplicated below. Fix PHPOffice#2396. Fix PHPOffice#1770. Fix PHPOffice#2388.

A related problem is that the vml files used for the form controls sometimes contain invalid xml. Fix PHPOffice#3125 (rejected previous PR PHPOffice#1181 as too risky, issue was also reported as PHPOffice#170). Vml file should be valid Xml, but Excel can generate unclosed `<br>` tags, preventing Xlsx reader from reading file correctly. I believe a very narrowly targeted fix, changing `<br>` to `<br/>`, and only when reading vml files, probably mitigates the risk. The sample file formscomments.xlsx which is part of this change shows this problem with `Button 1` on sheet `Forms`; the spreadsheet was created with Excel 365, so the problem is not restricted to Excel 2013 as originally reported. A comment on PR 3127 indicates that other tags might be involved, but, without a file demonstrating that, I will restrict this change to br tags for now.

I am starting this out in draft status, and will probably leave it that way for some time. I'm not sure where we want to go with this. It fixes some problems, but in a limited manner, and creates some others. I'm not sure the pain of the others is balanced considering the limitations of the fix. If enough interest is generated as a result of this ticket being out there, we can proceed; if not, it probably isn't worth it.

This fix allows form control elements to be read in and written out. It does not allow you to add such elements, nor even to locate them or determine their properties (so you can't modify or delete them). Although it handles reading and writing of sheets containing both form controls and comments, it will probably create a corrupt spreadsheet if you try adding a new comment to a sheet with form controls - probably quite difficult to solve. Cloning the sheet probably won't work either - probably easier than the other. It is conceivable that we want to add a new property to the Xlsx Reader which turns the reading of form elements on or off (default=off), so that negative effects will be limited to those who have explictly opted in. The change in its current form does not implement such a property.

Because of its limitations, the change isn't really testable. As in some other recent installs, I have added a sample to demonstrate that it works correctly.

As it turns out, if we have a worksheet which contains both form controls and comments (see formscomments.xlsx which is part of this PR), PhpSpreadsheet already creates a corrupt file when it tries to load and save the spreadsheet with such a worksheet. With this change, the file is saved without corruption. This tilts things in favor of proceeding. I'm still not ready, but this will be an important consideration.

A sample file for issue PHPOffice#2621 illustrated a problem with shape files. Since they are involved here, I took a look at how the sample worked with this code. In master, and with this change, a corrupt file results. Fixing that is probably easier than the general problem of handling shape files, but it's an argument against moving this forward until the corruption problem can be addressed.

Fix PHPOffice#2661. A template including checkboxes was leading to file corruption solved by this PR. Another argument for moving forward.
@oleibman oleibman added reader/xlsx Reader for MS OfficeOpenXML-format (xlsx) spreadsheet files writer/xlsx Writer for MS OfficeOpenXML-format (xlsx) spreadsheet files form controls labels Oct 20, 2022
@oleibman oleibman marked this pull request as draft October 20, 2022 11:33
Add more realistic worksheet to spreadsheet. Document new feature, adding caveats to how it can be used.
@fglueck
Copy link

fglueck commented Oct 28, 2022

Please note that also other tags do not need a close tag either.

$contents = str_replace('
', '
', $contents);

Can't be the final solution. I fixed this by using tidy with xml flag. But tidy will change/extend the dependencies of phpSpreadsheet.

@oleibman
Copy link
Collaborator Author

@fglueck We have seen the problem repeatedly with unclosed br tag. You are the only one who has implicated other tags. If you have a sample file to share, I can see if I can handle them. Without a sample, my hands are tied.

@oleibman oleibman marked this pull request as ready for review November 4, 2022 23:23
@oleibman
Copy link
Collaborator Author

oleibman commented Nov 4, 2022

@MarkBaker @PowerKiKi After being in draft status since last December, I think it is time to move this into ready status. I probably won't merge it till after Php8.2 goes live, and can delay for any milestones you wish (PhpSpreadsheet 1.26 or 2.0). My thinking is that it solves some problems, and probably doesn't cause any new ones (possibly wishful thinking but the test suite survives it and it seems to solve existing and new forms-related issues). It could lead to more problem reports eventually (for things which won't work without it anyhow); as people become aware that it's available, they are sure to do things which will break it. I have added a caveat to this effect to the official documentation.

@oleibman oleibman changed the title WIP Limited Support for Form Controls V2 (ListBox, Buttons, etc.) Limited Support for Form Controls V2 (ListBox, Buttons, etc.) Nov 4, 2022
@oleibman
Copy link
Collaborator Author

@MarkBaker Now that Php8.2 and PhpSpreadsheet 1.26 are real, I think I might want to merge this next week, probably Tuesday or Wednesday. I know a couple of my recent changes have given you heartburn on the PhpSpreadsheet 2.0 front. Do you think this one will be similarly problematic? If so, I will revert to draft status rather than merging.

@oleibman oleibman merged commit e7a0e80 into PHPOffice:master Dec 28, 2022
@edeebee
Copy link

edeebee commented Jan 17, 2023

Is there an approximate date, when this fix will be released?

@Ericcscro5
Copy link

Hello. With this merged, is there a way to read checkboxes in a sheet?

@oleibman
Copy link
Collaborator Author

oleibman commented Feb 5, 2023

@Ericcscro5 Sorry, I was not able to come anywhere close to figuring out how to implement some obviously useful additions to this change, including your question. As the description stated, "This fix allows form control elements to be read in and written out. It does not allow you to add such elements, nor even to locate them or determine their properties (so you can't modify or delete them)."

@Ericcscro5
Copy link

@Ericcscro5 Sorry, I was not able to come anywhere close to figuring out how to implement some obviously useful additions to this change, including your question. As the description stated, "This fix allows form control elements to be read in and written out. It does not allow you to add such elements, nor even to locate them or determine their properties (so you can't modify or delete them)."

Thank you for the reply.

I just worked around this by linking the checkbox to a cell, setting that linked cell's text color equivalent to its background and then reading the linked cell's value.

@oleibman oleibman deleted the legacydrawing2 branch February 10, 2023 05:46
@oleibman oleibman mentioned this pull request Sep 11, 2023
11 tasks
oleibman added a commit to oleibman/PhpSpreadsheet that referenced this pull request Aug 1, 2024
Fix issue PHPOffice#4105. Passing through form controls from load to save was part of the limited support added with PR PHPOffice#3130. However, the legacyDrawing vml, which is a crucial piece of that change, often contains a description only of notes (comments) with no form control contents. Since legacyDrawing is constructed by Xlsx Writer when it doesn't already exist, Xlsx Reader should not interfere with that ability - it should save the contents of legacyDrawing only when it appears to not consist merely of note descriptions.

It should also be an option to delete the legacyDrawing vml before saving. This allows one to add comments, at the cost of losing form controls. A method `deleteLegacyDrawing` is added to Spreadsheet to permit this.
@oleibman oleibman mentioned this pull request Aug 1, 2024
11 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
form controls reader/xlsx Reader for MS OfficeOpenXML-format (xlsx) spreadsheet files writer/xlsx Writer for MS OfficeOpenXML-format (xlsx) spreadsheet files
4 participants