Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unserializing Spreadsheet instance leaks memory #3195

Closed
8 tasks
ConnectGrid opened this issue Nov 20, 2022 · 2 comments
Closed
8 tasks

Unserializing Spreadsheet instance leaks memory #3195

ConnectGrid opened this issue Nov 20, 2022 · 2 comments

Comments

@ConnectGrid
Copy link
Contributor

ConnectGrid commented Nov 20, 2022

This is:

- [X] a bug report
- [ ] a feature request
- [ ] **not** a usage question (ask them on https://stackoverflow.com/questions/tagged/phpspreadsheet or https://gitter.im/PHPOffice/PhpSpreadsheet)

What is the expected behavior?

Repeated calls to unserialize() does not increase memory usage.

Note that using the available cell caching feature does not offer sufficient performance for our application since we are dealing with thousands of spreadsheets. Instead, we are caching each entire Spreadsheet instance into Redis, which requires they be serialized.

What is the current behavior?

Repeated calls to unserialize() is increasing memory usage on each call, even for the same Spreadsheet instance. Yet this same issue also affects our application when we unserialize many individual instances in succession. So, one or many, memory usage increases every time unserialize() is called on PhpSpreadsheet instances.

What are the steps to reproduce?

Please provide a Minimal, Complete, and Verifiable example of code that exhibits the issue without relying on an external Excel file or a web server:

<?php

require __DIR__ . '/vendor/autoload.php';

// Create new Spreadsheet object
$spreadsheet = new \PhpOffice\PhpSpreadsheet\Spreadsheet();

// add some mock data
$worksheet = $spreadsheet->getActiveSheet();
foreach (range('A', 'F') as $char) {
    foreach (range(1, 100) as $num) {
        $worksheet->getCell("{$char}{$num}")->setValue($num + $num);
    }
}

$txt = serialize($spreadsheet);

// memory usage increases after each call to unserialize():

unserialize($txt);
unserialize($txt);
unserialize($txt);

// memory is eventually exhausted when dealing with many very large spreadsheet instances

If this is an issue with reading a specific spreadsheet file, then it may be appropriate to provide a sample file that demonstrates the problem; but please keep it as small as possible, and sanitize any confidential information before uploading.

n/a

What features do you think are causing the issue

  • Reader
  • Writer
  • Styles
  • Data Validations
  • Formula Calculations
  • Charts
  • AutoFilter
  • Form Elements

Does an issue affect all spreadsheet file formats? If not, which formats are affected?

n/a

Which versions of PhpSpreadsheet and PHP are affected?

PhpSpreadsheet v1.24
PHP 8.1

@MarkBaker
Copy link
Member

MarkBaker commented Nov 21, 2022

Looks like I'm going to have to explicitly disable the ability to serialize a spreadsheet... it isn't a serializable object.

@ConnectGrid
Copy link
Contributor Author

We worked around the issue by using the toArray() and fromArray() methods on the Worksheet. Reconstructing a Spreadsheet instance using those methods does not impact performance noticeable, and we can easily store the resulting array in a serialized fashion. It feels like the right way to handle this. Solved.

oleibman added a commit to oleibman/PhpSpreadsheet that referenced this issue Aug 30, 2024
Fix PHPOffice#3195. Applying a conditional style to an entire row or column applies the Excel 2007+ limits (16,384 columns and 1,048,576 rows). However, Excel 2003- has different limits (256 columns and 65,536 rows). Trying to use the larger limits in an Xls spreadsheet results in a corrupt file. Xls Writer is changed to adjust ranges appropriately.

The issue also mentions that there might be a problem if a spreadsheet contains CF rules for both whole rows and whole columns. I am unable to corroborate this; the code from the new test can be used to generate an Xls file which is usable, and which has both row and column conditional styles.

However, testing showed that Xls Writer did not handle conditions correctly for floating point, negative integers, or integers > 65535. These should now be handled correctly.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

2 participants