Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySql, $query->batch() from more millions of record #8420

Closed
AstRonin opened this issue May 14, 2015 · 54 comments
Closed

MySql, $query->batch() from more millions of record #8420

AstRonin opened this issue May 14, 2015 · 54 comments
Assignees
Labels
type:docs Documentation
Milestone

Comments

@AstRonin
Copy link

I tried run yii\db\Query::batch() for big table and got memory leak. This happens because method \yii\db\Command::queryInternal() run

$this->pdoStatement->execute();

I changed attribute PDO::MYSQL_ATTR_USE_BUFFERED_QUERY -> false and it helps me.
I think, we should add description about this feature into documentation.

@samdark samdark added the type:docs Documentation label May 14, 2015
@yujin1st
Copy link

Could you specify where did you set the attribute?
I tried it in config and runtime(Yii::$app->db->pdo->setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY,false)), and got error 2014.

@AstRonin
Copy link
Author

As I understand this problem, you did not use 'fetch' or 'execute' method after creation request.
Also you can try the next solutions before I will see my code in the evening:

Nevertheless, you can attach code and error stack to better understand your problem.

Other way, you can use setAttribute(PDO::MYSQL_ATTR_USE_BUFFERED_QUERY,false) for mehod batch() and after that return back attribute to true for other cases

@AstRonin
Copy link
Author

Unfortunately, I cannot find my codes with this attribute, maybe it was a previous project. But you may use advice indicated above or show your code and we will try to help.

@cebe cebe added this to the 2.0.9 milestone Mar 14, 2016
@cebe cebe self-assigned this Mar 14, 2016
@cebe
Copy link
Member

cebe commented Mar 20, 2016

unable to reproduce the issue, see #11152

@samdark samdark closed this as completed Mar 20, 2016
@cebe cebe reopened this Mar 21, 2016
@cebe cebe modified the milestones: 2.0.10, 2.0.9 Jul 5, 2016
@cebe cebe modified the milestones: 2.0.11, 2.0.10 Oct 13, 2016
@cebe
Copy link
Member

cebe commented Nov 25, 2016

#9269 (comment)

@ihipop
Copy link
Contributor

ihipop commented Dec 27, 2016

Any Update on this issue?

@samdark samdark modified the milestones: 2.0.12, 2.0.11 Jan 27, 2017
@SamMousa
Copy link
Contributor

I can reproduce the issue, working on a fix.
What is the opinion of the yii team concerning configurability?
One could argue that it never makes sense to use batch in combination with buffered mode and that the default mode should be unbuffered.
Alternatively we could make the default unchanged and just add a parameter that allows explicitly disabling buffered mode (meaning it won't break BC).

Currently testing with unbuffered mode, there are some issues:

  1. We need to guarantee the database schema is known (otherwise the AR layer will try to do a SHOW FULL COLUMNS FROM xx).
  2. Any logic inside the batch processing cannot run queries.

I've just made a proof of concept that circumvents this issue, I'd love some feedback on it:
When running an unbuffered query, clone the database connection and use the cloned connection for the unbuffered query.

The advantage of this approach is that we can guarantee there are no issues with other queries (since the cloned connection is not accessible outside the scope of the batch query object(s)).

The downside is that it costs an extra connection (meaning connection set-up delay). Also it could give strange behavior when combined with transactions.

@SamMousa
Copy link
Contributor

Also, unbuffered mode will not work with relations that are not loaded through explicit joins since these require separate queries.

We can conclude several things:

  1. This is complex (should this be removed from 2.0.12?)
  2. If we want truly functional stuff we need to use multiple database connections.
  3. The simple approach will work in a number of specific cases only so it should not be the default behavior but instead be explicitly specified.
  4. Before any specific solution can be created the Yii team needs to decide what direction they want to go in since there is no simple "just make it work in all cases" solution.

@samdark
Copy link
Member

samdark commented Mar 14, 2017

It doesn't make sense to batch in buffered mode. That is for sure. Issues with unbuffered mode are significant though...

@SamMousa
Copy link
Contributor

SamMousa commented Mar 14, 2017

Does it even make sense to have batch() as a function? I always assumed it would actually get the results one batch at a time (reducing delays caused by roundtrips), but this does not seem to be the case.

Using ->each() is just as efficient at the moment..

We could create a separate unbufferedBatch that comes with appropriate warnings (I am willing to implement it), that just clones the connection and uses that.
Not sure this needs to be in the core though, it could just as well be released as a separate yii2-big-data extension.

@samdark
Copy link
Member

samdark commented Mar 14, 2017

Probably not considering the difficulties.

@samdark
Copy link
Member

samdark commented Mar 14, 2017

I'd deprecate batch() altogether...

@samdark
Copy link
Member

samdark commented Mar 14, 2017

@klimov-paul, @cebe, @SilverFire, @arogachev opinions?

@samdark samdark added this to the 2.0.12 milestone Mar 14, 2017
@beowulfenator
Copy link
Contributor

@ihipop That's not a bad idea, actually. Not really ugly, and it is essentially what we do in DataProvider with pagination. @samdark, what do you think about using offset/limit to implement batch() and each()?

@SamMousa
Copy link
Contributor

SamMousa commented Apr 21, 2017 via email

@beowulfenator
Copy link
Contributor

OK.

@ihipop
Copy link
Contributor

ihipop commented Apr 21, 2017

@SamMousa the consistent view of the data is as mentioned

the problem is that maybe an order by primary key or something else should be done to keep the order of the dataset

@SamMousa
Copy link
Contributor

@ihipop It is not.

Let me explain all my arguments against your method.

  1. It is slow, for every page the whole query needs to be executed only to just take a few records from them.
  2. It is inconsistent, in any scenario where the keys is not an auto incremented integer. If I have keys 1 and 15 what happens if I insert 2 during your iteration, most likely it'll be skipped or cause some other entry to be returned twice.
  3. Sorting by primary key sounds simple, but for every join a sort by primary key of that table will need to be added to guarantee consistent ordering, also it breaks any other order clauses.

An alternative, if you really want one is this, it is still far from optimal and only works for integer keys. We use this method in production for slow running jobs. And it only provides an upper bound for the numbe r of results.

  1. Get min and max of primary key.
  2. Divide the range so that the maximum number of records it can return, including joined records fits in memory.
  3. Iterate over the query using a condition like: where id > 10 and <= 20.

This approach is a lot faster but can still miss records inserted during processing. It will never return records twice.

As I've said before, no solution is valid for all cases and therefore we should not just implement something. Instead implement it yourself as a custom function or behavior in your query class.

@beowulfenator you mention buffered queries require RAM on the server; this is not clear to me. Do you mean the PHP or DB server?

@beowulfenator
Copy link
Contributor

@SamMousa I mean the DB server. Executing a buffered MySQL query with the result set that is too large to fit into MySQL server's RAM brings down the MySQL server. At least that's how it is for me.

@SamMousa
Copy link
Contributor

Hmm, that seems strange to me, since the buffering is happening in the PDO driver as far as I know

@ihipop
Copy link
Contributor

ihipop commented May 15, 2017

Finally,I write a new static method to get a unBuffered Query PDO $db instance to solve my Problem at acceptable price

    public static function getUnbufferedMysqlDb($db, $db_identifier = null)
    {
        $db_string = '';
        if (is_string($db)) { //TO SUPPORT the $db of Component Definition ID passed in string  ,for example $db='db'
            $db_string = $db;
            if (empty($db_identifier)) {
                $db_identifier = $db;
            }
            $db = Yii::$app->get($db); // Convert string Component Definition ID to a Component
        }
        if (!($db instanceof \yii\db\Connection) || !strstr($db->getDriverName(), 'mysql')) { //Safe Check
            throw  new InvalidParamException('Not a Mysql Component');
        };
        if (empty($db_identifier)) { //Generate a New String Component Definition ID if $db_identifier is not Provided
            $db_identifier = md5(sprintf("%s%s%s%s", $db->dsn, $db->username, $db->password,
                var_export($db->attributes, true)));
        }
        $db_identifier = 'unbuffered_' . $db_identifier;
        if (!Yii::$app->has($db_identifier)) {
            if ($db_string) {
                $_unbuffered_db = Yii::$app->getComponents()[$db_string];//Clone a Configuration 、、克隆一个配置
                $_unbuffered_db['attributes'][\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY] = false;
            } else {
                $_ = clone $db;
                $_->close(); //Ensure that it is not an active Connection Because PDO can not be serialize
                /** @var  $_unbuffered_db \yii\db\Connection */
                $_unbuffered_db = unserialize(serialize($_)); //Clone a Expensive Object //deep copy for safe
                $_unbuffered_db->attributes[\PDO::MYSQL_ATTR_USE_BUFFERED_QUERY] = false;
            }
            Yii::$app->setComponents([$db_identifier => $_unbuffered_db]);
        }

        return Yii::$app->get($db_identifier);
    }

When Using ,pass it to the $db parameters of batch()/each()
It Will establish a new Connection to mysql which PDO::MYSQL_ATTR_USE_BUFFERED_QUERY is false
,and Only Used for batch()/each(),Other query could be done as normal during batch()/each()

$unbuffered_db = SomeClass::getUnbufferedMysqlDb($Model::getDb());
foreach($activeQuery->batch(100,$unbuffered_db) as $foobars){
        #foobars
}

Also this acceptable.

$unbuffered_db = SomeClass::getUnbufferedMysqlDb('some-db');

@mikehaertl

@tom--
Copy link
Contributor

tom-- commented May 24, 2017

It may be worth keeping this feature but the guide is misleading

When working with large amounts of data, methods such as yii\db\Query::all() are not suitable because they require loading all data into the memory. To keep the memory requirement low, Yii provides the so-called batch query support. A batch query makes use of the data cursor and fetches data in batches.

And then proceeds to demonstrate the methods that fail to do this.

There needs to be a warning about memory exhaustion in PDO's buffered mode and advice how to avoid it.

Some elaboration on "makes use of the data cursor" would also be welcome.

@beowulfenator
Copy link
Contributor

@tom-- You're welcome to submit your PR for the documentation.

@tom--
Copy link
Contributor

tom-- commented May 25, 2017

I believe this is the first time I have seen "PRs welcome" deployed in the Yii project.

I did not submit my PR because I do not have one. I was unable to author one because I don't understand the mechanisms Yii and its use of PDO well enough to explain the problem and advise on workarounds. I am still struggling to debug my own problem with it and the difficulty comes mostly from this lack of knowledge.

More expert contributors than myself have expressed differing opinion on what to do about this bug including scrap it, fix it, and keep it. My comment is simply that unless we scrap it pretty soon, the guide needs a fix. I also wanted to revive the discussion. I regret that now.

@ihipop
Copy link
Contributor

ihipop commented May 26, 2017

@tom-- #8420 (comment) just do as this.
it will solve your problem ,maybe it is not the best one,but it would work

ihipop added a commit to ihipop/yii2 that referenced this issue May 26, 2017
@samdark samdark modified the milestones: 2.0.12, 2.0.13 May 31, 2017
@samdark samdark closed this as completed May 31, 2017
samdark added a commit that referenced this issue May 31, 2017
mixartemev added a commit to mixartemev/yii2 that referenced this issue Jun 24, 2017
* Fixes yiisoft#13945: Removed Courier New from error page fonts list since it looks bad on Linux (yiisoft#13947)

* Fixed phpdoc broken links to PHP manual pages [skip ci]

* Fixes yiisoft#10675: Added docs for disabling CSRF validation in standalone actions

* updated docs for base64UrlEncode()

as discussed in
yiisoft#13411 (comment)

* Fix `\yii\db\QueryTrait` type hints (yiisoft#13956) [skip ci]

* Fix `\yii\db\QueryInterface` type hints (yiisoft#13959) [skip ci]

* Added missing parts of disabling CSRF validation doc [skip ci] (yiisoft#13966)

* Fixed wording in Russian translation (yiisoft#13970) [skip ci]

* Fixes yiisoft#13961: RBAC Rules: PostgreSQL: PHP Warning "unserialize() expects parameter 1 to be string, resource given" was fixed

* compatibility for PHPUnit adjusted

* compatibility with PHPUnit 6.x added

* unit test fix

* `yii\caching\Cache::getOrSet()` now supports both `Closure` and `callable`

Closes yiisoft#13981

* compatibility with PHPUnit 6.x fix

* garbage collection added to unit tests

* compatibility with PHPUnit 6.x fix

* fixed tests without assettions

* compatibility with PHPUnit 6.x fix

* compatibility with PHPUnit 6.x fix

* memory usage at unit test reduced

* update phpdoc annotation in BaseActiveRecord

yiisoft/yii2-gii#267 (comment)

* added upgrade note about yii\cache\Cache::getOrSet()

* Revert "update phpdoc annotation in BaseActiveRecord"

This reverts commit 0f78008.

yiisoft/yii2-gii#267 (comment)

* Tweaked verbiage slightly [ci skip]

* yiisoft#13975: make documentation of current behaviour more clear [skip ci]

* docs/guide/input-multiple-models: added a note on avoiding validating the same input data with the same rules more than once (yiisoft#13995) [skip ci]

* Fixes yiisoft#13807: Fixed `yii\db\QueryBuilder` to inherit subquery params when building a `INSERT INTO ... SELECT` query

* Tiny documentation fix [skip ci] (yiisoft#13998)

* Added note to findOne and findAll docs (yiisoft#14002) [skip ci]

Added note to findOne and findAll docs to clearly state that complex conditions are not supported.

* [IpValidator] disable ipv6 checks (yiisoft#13984)

* disable ipv6 checks

* remove empty line

* updated CHANGELOG.md

* Added AF_INET6 to requirements

* Enhanced requirements memo for IPv6

* update AR docs

* Update composer fxp plugin version in docs and travis scripts

* Fixes yiisoft#13963: Added tests for yii\behaviors\TimestampBehavior

* Fixes yiisoft#13911: Significantly enhanced MSSQL schema reading performance

* Skipped segfaulting test when running on Travis and corresponding PHP versions

* enforce `backupGlobals` enabled for PHPUnit

* Fix `TimestampBehaviorTest` to be compatible with PHPUnit 6

* Switched to asset-packagist

* Enable xdebug for coverage on 7.1 insted of 5.6 as it should work faster

* Switched ocular to use PHP 7.1 execution results

* Added note clarifying that load function clears fixture data [skip ci]

* Fixes yiisoft#13728: Fixed the bug when `yii\behaviors\SluggableBehavior` wasn't preserving immutable slug values

* Changed return value in MemCache::setValues()

* Fixes yiisoft#13362: Fixed return value of

* Fixes yiisoft#14012: `yii\db\pgsql\Schema::findViewNames()` was skipping materialized views

* Created issue template

* Wrong repo [skip ci]

* Fixes yiisoft#13694: yii\widgets\Pjax now sends X-Pjax-Url header with response to fix redirect

* Fixed PHP manual link language [skip ci]

* Add setup for Apache

Describes the settings for Apache.

* yii\filters\RateLimiter refactored (yiisoft#13994)

Refactored yii\filters\RateLimiter and added tests

* Fixed spelling and typos

* Updated slovak localization

Fixed missing translation for '{attribute} must be an integer.' string

* Fixed ignored params when using count in `SqlDataProvider`

*  Fixed ambiguous error sql while using unique validator

* Refactored solution to be contained within ActiveQuery instead of helper

* Removed unneeded changelog line [skip ci]

* Removed obsolete import

* improve session PHPdoc

* Applied disambiguation to ExistValidator as well

* Fixes yiisoft#10346: Fixed "DOMException: Invalid Character Error" in `yii\web\XmlResponseFormatter::buildXml()`

* Fixes yiisoft#13087: Fixed getting active validators for safe attribute

* Added `yii\data\Sort::parseSortParams`

allowing customize other params request formats in descendant class.

close yiisoft#13170

* Fixes yiisoft#11719: Fixed `yii\db\Connection::$enableQueryCache` caused infinite loop when the same connection was used for `yii\caching\DbCache`

* Adjusted code style for less conditions branching

* Fixes yiisoft#13226: `yii cache` command now warns about the fact that it's not able to flush APC cache from console

* Fixes yiisoft#13848: `yii\di\Instance::ensure()` wasn't throwing an exception when `$type` is specified and `$reference` object isn't instance of `$type`

* Fixes yiisoft#13689: Fixed handling of errors in closures

* Fixed changelog [skip ci]

* PHPDocs and code style fixed

* Helper independent (yiisoft#14050)

* Made sure most helpers are independent of Yii::$app

* Made sure most validators are independent of Yii::$app

* Fixes yiisoft#14033: Fixed `yii\filters\AccessRule::matchIp()` erroring in case IP is not defined under HHVM

* Removed redundant use

* Removed redundant else

* param → parameter [skip ci]

* Adjusted changelog [skip ci]

* Update runtime-routing.md

* Fixes yiisoft#13790: Fixed error in `\yii\widgets\MaskedInput` JavaScript by raising version required

* Fixes yiisoft#14052: Fixed processing parse errors on PHP 7 since these are instances of `\ParseError`

* Fixes yiisoft#13951: renamed get-method which was conflicting with existing property (yiisoft#14055)

* Alternative names

* Fixes yiisoft#14059: Removed unused AR instantiating for calling of static methods

* Add phpdoc. [skip ci]

* Fix an invalid phpDocumentor annotation

Fixed an invalid phpDocumentor annotation of `yii\data\DataProviderInterface::getPagination()`.

* Fixes yiisoft#14072: clear table schema cache for all methods that affect cache

* Reorder changelog

* file will not loaded

https://github.com/yiisoft/yii2/tree/master/framework/test

* Fix validators documentation

* More control over ActiveForm widget

If part of a form is rendered separately (e.g. in ajax request), it's not easy to apply JS validation to it.
By moving JS registration to a separate method this problem could be solved much more easier.

* Renamed `ActiveForm::registerJs()` to `registerClientScript()`, added tests

* Fixed tests

* Fixed CHANGELOG

* Reordered CHANGELOG

* fixed missing changelog and missing `@since`

* Skip testAutoRefreshTableSchema on sqllite

* Skip just certain things for sqllite

* !==

* Update phpdoc, \yii\db\ActiveRecord => \yii\db\ActiveRecordInterface [skip ci] (yiisoft#14076)

* Cleanup after yiisoft#14017 (yiisoft#14091)

* Fixed Oracle SQL queries with `IN` condition and more than 1000 parameters

* Updated PHPdoc

* Remove unused imports. (yiisoft#14096)

* Fixes yiisoft#4793: `yii\filters\AccessControl` now can be used without `user` component

* Fixes yiisoft#14098: `yii\helpers\BaseFileHelper::normalizeOptions()` is now protected

* Fixes yiisoft#14108: fixed phpdoc mistake in method description

* Update db-migrations.md (yiisoft#14109) [skip ci]

By default, the class BaseMigrateController property migrationPath is assigned the value '@app/migrations'. In the method getNewMigrations() is the conversion migrationNamescpaces in the path to the file and merges it all into a single array along with the path migrationPath. If the project for migration module is a separate table in the database, all migration came from migrationPath are not initialized and therefore try to migrate. Therefore, it is necessary to specify the property 'migrationPath' => null.

* Eliminated else branches in beforeSave() and beforeDelete() phpdoc examples [skip ci]

* Fixes yiisoft#13890: DbTarget log transaction bug

* Update db-migrations.md (yiisoft#14113) [skip ci]

I made a typo. Sorry. The command migration-rbac, migrationPath property already exists.

* Typo (yiisoft#14117) [skip ci]

* Add Indonesian guide translation (yiisoft#14120) [skip ci]

* Update README.md on Indonesian Guide Translation

* Translate "Working with Forms" guide to Bahasa Indonesia

* added option to disable query logging in DB command

fixes yiisoft#12528

* moved query log logic into private method and reduced calls to getRawSql()

fixes yiisoft#12528

* failing test for asset bundle publishing empty directories

* Added copyEmptyDirectories option to FileHelper

also set it to false in AssetManager to avoid creating a lot of empty
directories.

fixes yiisoft#9669

* made AssetBundle tests less cryptic

cleanup step was added in setUp() so this will still work.

split test functions filehelper test

* Fixes yiisoft#5442: Fixed problem on load fixture dependencies with database related tests

* Fixed changelog [skip ci]

* Added support for sorting by expression to `\yii\data\Sort`

* function clear() -- Set populated arrays to empty arrays (yiisoft#14067)

`yii\web\View::clear()` sets populated arrays to empty arrays

* changelog for yiisoft#14067

* Further changes for unique and exist validators

- Combined methods for getting names and aliases of from tables
- Normalized names and aliases
- Added MSSQL syntax
- Added support for spaces in aliases and table names

* added changelog for yiisoft#12528

* Do not render empty row in GridView when data is empty and emptyText set to false

Fixes yiisoft#13352
close yiisoft#13422

* Added support for cloning a db connection

improved fix yiisoft#14020
fixes yiisoft#13890

https://github.com/yiisoft/yii2/pull/14020/files#r115185865

close yiisoft#14121

* Fixed ambiguous column name in SELECT in UniqueValidator

fixes yiisoft#14042

* Added escaping for column name

* Fixes yiisoft#13350, yiisoft#14094, yiisoft#11288

- Bug yiisoft#13350: Fixed bug with incorrect caching of `yii\web\UrlRule::createUrl()` results in `yii\web\UrlManager`.
- Bug yiisoft#14094: Fixed bug when single `yii\web\UrlManager::createUrl()` call my result multiple calls of `yii\web\UrlRule::createUrl()` for the same rule.
- Enh yiisoft#11288: Added support for caching of `yii\web\UrlRule::createUrl()` results in `yii\web\UrlManager` for rules with defaults.

* Remove unnecessary variable from UrlRule.php (yiisoft#13948)

* Additional refactoring as suggested by @dynasource

* fix broken link (yiisoft#14143) [skip ci]

File output-client-scripts.md not created. And in site - error 404: http://stuff.cebe.cc/yii2docs-ru/guide-output-client-scripts.html

* "из вне" => "извне"

grammar nazi attack

* Fixed changelog [skip ci]

* Fixes yiisoft#14133: Fixed bug when calculating timings with mixed nested profile begin and end in `yii\log\Logger::calculateTimings()`

* Fixes yiisoft#14150

* Translate "Working with Database" guide to Bahasa Indonesia (yiisoft#14152) [skip ci]

* yiisoft#14150: Added {{ and }} to getTablesUsedInFrom() output

* Additional edge case for yiisoft#14150

* Translate "Generating Code with Gii" guide to Bahasa Indonesia (yiisoft#14162)

* Fixes yiisoft#14161, yiisoft#14150: fixed primary table aliasing (include case manual set alias) in validators

* added namespace example to the autoload guide

* Translate "Looking Ahead" guide to Bahasa Indonesia (yiisoft#14177)

* advanced example on filter validator

fixes yiisoft#4284

* Simplified Chinese translation of filter validator (yiisoft#14195)

* Simplified Chinese Translation of filter validator

* Using more specific words

* Fixed typo in BaseFileHelper::findFiles() docs (yiisoft#14200) [skip ci]

* Add test coverage of yii\helpers\BaseArrayHelper (yiisoft#14205)

* Translate some application structure guide to Bahasa Indonesia (yiisoft#14209) [skip ci]

* Add blocktypes.json file in Indonesian Guide

* Translate "Application Structure Overview" to Bahasa Indonesia

* Translate "Entry Script" guide to Bahasa Indonesia

* Translate "Structure App - Application" to Bahasa Indonesia

* Translate "Application Component" guide to Bahasa Indonesia

* Add translators.json attribution in Indonesian Documentation

* Fixes yiisoft#14211: Fixed regression in Unique and Exist validators

* fixed indonesian translators.json file

* add doc for issue yiisoft#8420
yiisoft#8420

* typo

* Adjusted text [skip ci]

* MySql -> MySQL typo

* Adjusted typo text [skip ci]

* Fixes yiisoft#13846: Fixed `Query::count()` issue with `orderBy`

* Inserted missing * on line 972 (yiisoft#14224) [skip ci]

* Add test coverage of yii\helpers\BaseMarkdown (yiisoft#14223)

* Add test coverage of yii\helpers\BaseJson (yiisoft#14217)

* Adjusted text [skip ci]

* Adjusted text [skip ci]

* Adjusted text [skip ci]

* Adjusted text [skip ci]

* Add test coverage of yii\helpers\Html (yiisoft#14220)

* Add migration file example for DbSession (yiisoft#14227) [skip ci]

Creating a `char` primaryKey in a migration is non-obvious. Used solution from this [issue comment](yiisoft#10889 (comment)).

* Adjusted text [skip ci]

* Fixed yiisoft#13551: loading fixtures in subdirectories

* Enhanced FixtureController::getFixtureRelativeName()

Updated CHANGELOG

* Adjusted text,add some adivse [skip ci]

* Alternative implementation for passing parameters to AccessRule roles

fixes yiisoft#8426: `yii\filters\AccessRule` now allows passing GET or other parameters to the role checking function
replaces yiisoft#8426

* Document queryParams in the authorization guide

* simplified guide example

* improved docs

* Fixed RBAC databases tests

tests were reusing static DB connection instance resulting in all tests
to be run against MySQL only!

PgSQL and Sqlite tests are now failing because of the issue reported in yiisoft#13501.

* added failing test for yiisoft#13593

* additional fix for yiisoft#13501

close yiisoft#13593

* fix cloning of DB connection for sqlite in-memory db

fixes yiisoft#14131
close yiisoft#14232

* Allows override query relation in descendant class

fixes yiisoft#13441
close yiisoft#13642

* Fixes yiisoft#13058: Fixed caught exception thrown during view file rendering produces wrong output

* improved docs: link term "path alias" to the guide

* Added support for specifying aliases for migrationNamespaces

This is used to specify pathes to migrations that do not have
namespaces.

While not directly supported by the migration command provideded by the
framework, these migrations exist in a lot of extensions because custom implementations
of migration controllers out of the framework were using this approach
to load multiple migrations from multiple paths.

Even the framework itself currently ships non-namespaced migrations:

- https://github.com/yiisoft/yii2/blob/17a1d91e4a517f4f15dce973bf3c50dd939dce63/framework/rbac/migrations/m140506_102106_rbac_init.php
- https://github.com/yiisoft/yii2/blob/17a1d91e4a517f4f15dce973bf3c50dd939dce63/framework/caching/migrations/m150909_153426_cache_init.php
- https://github.com/yiisoft/yii2/blob/17a1d91e4a517f4f15dce973bf3c50dd939dce63/framework/log/migrations/m141106_185632_log_init.php

This change allows existing applications to adopt the new namespace-based approach
while keeping existing migrations. While it would be possible to add
namespaces to migrations in the application itself, it is not easily possible
to add namespaces to migrations that come from external sources like
extensions.

* added alias syntax support

fixes yiisoft#13356
fixes yiisoft#13359

* make migrationPath accept an array

* moved include file logic to separate method

* Fixes yiisoft#4999: Added support for wildcards at `yii\filters\AccessRule::$controllers`

* Update BaseMigrateController.php

make sure console arguments are recognized as array.

* Fixes yiisoft#5108 `DateValidator` resets `$timestampAttribute` value on empty attribute (yiisoft#14242)

* `yii\validators\DateValidator` now resets `$timestampAttribute` value on empty validated attribute value

* array-value test at `DateValidatorTest` restored

* skip failing HHVM test

see yiisoft#14244

* moved wrong changelog entry

* fixed version and missing `@since` annotation

* release version 2.0.12

* prepare for next release

* Greek translation update.

* Fixed Formatter::normalizeDatetimeValue() docs

There was a reference to `$checkTimeInfo` that was leftover from 54278fc

* improved formatter documentation

* Fix MessageController color output on non-color termnials

fixes yiisoft#14248
close yiisoft#14249

* guide: note about param binding on strings, not arrays

issue yiisoft#14218

* added param binding examples for update() and delete()

fixes yiisoft#14218

* Fixes yiisoft#14264: Fixed a bug where `yii\log\Logger::calculateTimings()` was not accepting messages with array tokens

* Improved base migration controller documentation [skip ci] (yiisoft#14253)

* added example for using unsupported PHP format in yii formatter

fixes yiisoft#14278

* Fixes yiisoft#14201: `yii\console\controllers\MessageController::extractMessagesFromTokens()` is now protected

* Fixes yiisoft#13787: Added `yii\db\Migration::$maxSqlOutputLength` that allows limiting number of characters for outputting SQL

* Adjusted RBAC docs to offer both migrations and command approaches [skip ci]

* Fixes yiisoft#14089: Added tests for `yii\base\Theme`

* Fixes yiisoft#13586: Added `$preserveNonEmptyValues` property to the `yii\behaviors\AttributeBehavior`

* Fixes yiisoft#14192: Fixed wrong default null value for TIMESTAMP when using PostgreSQL

* Fixes yiisoft#14081: Added `yii\caching\CacheInterface` to make custom cache extensions adoption easier

* Removed unused local var in JS test (yiisoft#14093)

* Added php-cs-fixer coding standards validation to Travis CI (yiisoft#14100)

* php-cs-fixer: PSR2 rule.

* php-cs-fixer: PSR2 rule - fix views.

* Travis setup refactoring.

* Add php-cs-fixer to travis cs tests.

* Fix tests on hhvm-3.12

* improve travis config

* composer update

* revert composer update

* improve travis config

* Fix CS.

* Extract config to separate classes.

* Extract config to separate classes.

* Add file header.

* Force short array syntax.

* binary_operator_spaces fixer

* Fix broken tests

* cast_spaces fixer

* concat_space fixer

* dir_constant fixer

* ereg_to_preg fixer

* function_typehint_space fixer

* hash_to_slash_comment fixer

* is_null fixer

* linebreak_after_opening_tag fixer

* lowercase_cast fixer

* magic_constant_casing fixer

* modernize_types_casting fixer

* native_function_casing fixer

* new_with_braces fixer

* no_alias_functions fixer

* no_blank_lines_after_class_opening fixer

* no_blank_lines_after_phpdoc fixer

* no_empty_comment fixer

* no_empty_phpdoc fixer

* no_empty_statement fixer

* no_extra_consecutive_blank_lines fixer

* no_leading_import_slash fixer

* no_leading_namespace_whitespace fixer

* no_mixed_echo_print fixer

* no_multiline_whitespace_around_double_arrow fixer

* no_multiline_whitespace_before_semicolons fixer

* no_php4_constructor fixer

* no_short_bool_cast fixer

* no_singleline_whitespace_before_semicolons fixer

* no_spaces_around_offset fixer

* no_trailing_comma_in_list_call fixer

* no_trailing_comma_in_singleline_array fixer

* no_unneeded_control_parentheses fixer

* no_unused_imports fixer

* no_useless_return fixer

* no_whitespace_before_comma_in_array fixer

* no_whitespace_in_blank_line fixer

* not_operator_with_successor_space fixer

* object_operator_without_whitespace fixer

* ordered_imports fixer

* php_unit_construct fixer

* php_unit_dedicate_assert fixer

* php_unit_fqcn_annotation fixer

* phpdoc_indent fixer

* phpdoc_no_access fixer

* phpdoc_no_empty_return fixer

* phpdoc_no_package fixer

* phpdoc_no_useless_inheritdoc fixer

* Fix broken tests

* phpdoc_return_self_reference fixer

* phpdoc_single_line_var_spacing fixer

* phpdoc_single_line_var_spacing fixer

* phpdoc_to_comment fixer

* phpdoc_trim fixer

* phpdoc_var_without_name fixer

* psr4 fixer

* self_accessor fixer

* short_scalar_cast fixer

* single_blank_line_before_namespace fixer

* single_quote fixer

* standardize_not_equals fixer

* ternary_operator_spaces fixer

* trailing_comma_in_multiline_array fixer

* trim_array_spaces fixer

* protected_to_private fixer

* unary_operator_spaces fixer

* whitespace_after_comma_in_array fixer

* `parent::setRules()` -> `$this->setRules()`

* blank_line_after_opening_tag fixer

* Update finder config.

* Revert changes for YiiRequirementChecker.

* Fix array formatting.

* Add missing import.

* Fix CS for new code merged from master.

* Fix some indentation issues.

* ensure tablenames are not replaced in insert and update

* Fixes yiisoft#14286: Used primary inputmask package name instead of an alias

* Documented possibility of using asset-packagist

* fixed typo

* Fixes yiisoft#14307: Fixed PHP warning when `yii\console\UnknownCommandException` is thrown for empty command

* Fixed phpdoc `yii\test\ActiveFixture::load` (yiisoft#14292) [skip ci]

* Fixes yiisoft#13824: Support extracting concatenated strings in `yii message`

* Fixes yiisoft#14298: The default response formatter configs defined by `yii\web\Response::defaultFormatters()` now use the array syntax

* Added link to Russian Sphinx guide [skip ci]

* Removed cs fixer from Travis

* Fixed typo [skip ci]

* Fixes yiisoft#14341: Fixed regression in error handling introduced by fixing yiisoft#14264
@mikehaertl
Copy link
Contributor

mikehaertl commented Apr 24, 2019

@ihipop I tried your workaround now as my solution (i.e. "hack") did not work with with() queries. Unfortunately I found that your solution suffers from the same problem. Say you have "product has many variations". And your batch query is for products using ->with('variations'). Yii then performs a 2nd query for the related Variation records, using the main buffered db connection and an product_id IN ( ... ) condition with all the product ids.

Unfortunately it seesm like the list of these product ids is not reset for each batch. It gets bigger and bigger so that the same related records are queried again and again - plus the new ids on top.

So I still end up with an out of memory error.

UPDATE: Forget what I said - it was rather an issue with too many related records. I've decreased the batch size and it works now.

@lubobill1990
Copy link

lubobill1990 commented Apr 13, 2020

As a 6 year developer of Yii2, I feel surprised that the each() method can exhaust memory until today, that I query on a table with a column of MEDIUMTEXT.

To sum up, basically, there are three methods to handle iterating on each record in a query set problem:

  1. the current one, to leave the dirty work to pdo extension, however, pdo will exhaust memory when there are too much records or there are one column with too much data(MEDIUMTEXT for example)
  2. to iterate on query with offset/limit. Pros is simplicity, cons are i) pagination become slow when offset increases, and DB can exhaust memory, ii) records will be queried twice when record insert during the query, iii) can't apply to query which already has offset/limit
  3. to iterate on query with primary key range. Pros are i) it is faster than offset/limit and will never exhaust memory, ii) can avoid record queried twice, cons are i) B tree index of integer column is required(hash index doesn't support range query) ii) the result is unexpected in some requirements if we don't need the record inserted after query, iii) can't apply to query which already has offset/limit
  4. to query twice, first query all the primary keys(can be implemented with current batch mechanism, leave it to pdo), then iterate on query with IN condition. Pros are i) the speed is acceptable, considering ActiveQuery::with() is implemented with IN condition, ii) can apply to any table with any kind of index, iii) Record will not repeat, iii) can apply to query which already has offset/limit. Cons are i) can exhaust memory when records exceed tens of million

One solution is to add more parameters to batch() and each(), the first is mode, indicating the implementation we need to use, another is specific parameters for the implementation, such as the index column.

But these heterogeneous parameters will pollute the batch() and each() function.

Another solution is to add several services for each implementation. The usage is like:
new RangeQueryIterator(User::find()->where(...), $batchSize))

It can decouple ActiveQuery from Iterator, which is preferred.

@samdark
Copy link
Member

samdark commented Apr 13, 2020

Thanks for ideas. @terabytesoftw should be interesting for Yii 3.

@lubobill1990
Copy link

lubobill1990 commented Apr 13, 2020

An example of RangeQueryIterator:

<?php


namespace ZhiShiQ\Yii\Db;


use yii\db\ActiveQuery;
use yii\db\BatchQueryResult;

class RangeQueryIterator implements \Iterator
{
    private $query;
    private $pk;
    /**
     * @var BatchQueryResult
     */
    private BatchQueryResult $batchIterator;

    private $_batch;
    private $_key;
    private $_value;

    public function __construct(ActiveQuery $query, $batchSize = 100)
    {
        $this->query = $query;
        $this->pk = ($query->modelClass)::primaryKey();
        $this->batchIterator = (clone $query)->select($this->pk)->batch($batchSize);
    }

    /**
     * Destructor.
     */
    public function __destruct()
    {
        $this->reset();
    }

    public function reset()
    {
        $this->_batch = null;
        $this->_value = null;
        $this->_key = null;
        $this->batchIterator->reset();
    }

    public function rewind()
    {
        $this->reset();
        $this->next();
    }

    public function next()
    {
        if ($this->_batch === null || next($this->_batch) === false) {
            $this->batchIterator->next();

            $keyCols = array_map(function ($val) {
                $arr = [];
                foreach ($this->pk as $key) {
                    $arr[$key] = $val[$key];
                }
                return $arr;
            }, $this->batchIterator->current());

            $this->_batch = (clone $this->query)->andWhere(['IN', $this->pk, $keyCols])->all();
            //TODO to ensure the order is the same as $keyCols
            reset($this->_batch);
        }

        $this->_value = current($this->_batch);
        if ($this->query->indexBy !== null) {
            $this->_key = key($this->_batch);
        } elseif (key($this->_batch) !== null) {
            $this->_key = $this->_key === null ? 0 : $this->_key + 1;
        } else {
            $this->_key = null;
        }
    }

    public function key()
    {
        return $this->_key;
    }

    public function current()
    {
        return $this->_value;
    }

    public function valid()
    {
        return !empty($this->_batch);
    }
}

Usage

foreach(new RangeQueryIterator(User::find()->where(...), $batchSize)) as $user) {
    //xxx
}

@petr-panek
Copy link

As a 6 year developer of Yii2, I feel surprised that the each() method can exhaust memory until today, that I query on a table with a column of MEDIUMTEXT.

To sum up, basically, there are three methods to handle iterating on each record in a query set problem:

  1. the current one, to leave the dirty work to pdo extension, however, pdo will exhaust memory when there are too much records or there are one column with too much data(MEDIUMTEXT for example)
  2. to iterate on query with offset/limit. Pros is simplicity, cons are i) pagination become slow when offset increases, and DB can exhaust memory, ii) records will be queried twice when record insert during the query, iii) can't apply to query which already has offset/limit
  3. to iterate on query with primary key range. Pros are i) it is faster than offset/limit and will never exhaust memory, ii) can avoid record queried twice, cons are i) B tree index of integer column is required(hash index doesn't support range query) ii) the result is unexpected in some requirements if we don't need the record inserted after query, iii) can't apply to query which already has offset/limit
  4. to query twice, first query all the primary keys(can be implemented with current batch mechanism, leave it to pdo), then iterate on query with IN condition. Pros are i) the speed is acceptable, considering ActiveQuery::with() is implemented with IN condition, ii) can apply to any table with any kind of index, iii) Record will not repeat, iii) can apply to query which already has offset/limit. Cons are i) can exhaust memory when records exceed tens of million

One solution is to add more parameters to batch() and each(), the first is mode, indicating the implementation we need to use, another is specific parameters for the implementation, such as the index column.

But these heterogeneous parameters will pollute the batch() and each() function.

Another solution is to add several services for each implementation. The usage is like: new RangeQueryIterator(User::find()->where(...), $batchSize))

It can decouple ActiveQuery from Iterator, which is preferred.

Same surprice for me after 5 years of frekquent usage of each() and batch() to carefully optimize memmory usage, but no optimization happened 😠
It's for me another reason to leave this "old" world of PHP + MySQL in favor of NodeJS + MongoDB.

@terabytesoftw
Copy link
Member

terabytesoftw commented Oct 15, 2022

As a 6 year developer of Yii2, I feel surprised that the each() method can exhaust memory until today, that I query on a table with a column of MEDIUMTEXT.
To sum up, basically, there are three methods to handle iterating on each record in a query set problem:

  1. the current one, to leave the dirty work to pdo extension, however, pdo will exhaust memory when there are too much records or there are one column with too much data(MEDIUMTEXT for example)
  2. to iterate on query with offset/limit. Pros is simplicity, cons are i) pagination become slow when offset increases, and DB can exhaust memory, ii) records will be queried twice when record insert during the query, iii) can't apply to query which already has offset/limit
  3. to iterate on query with primary key range. Pros are i) it is faster than offset/limit and will never exhaust memory, ii) can avoid record queried twice, cons are i) B tree index of integer column is required(hash index doesn't support range query) ii) the result is unexpected in some requirements if we don't need the record inserted after query, iii) can't apply to query which already has offset/limit
  4. to query twice, first query all the primary keys(can be implemented with current batch mechanism, leave it to pdo), then iterate on query with IN condition. Pros are i) the speed is acceptable, considering ActiveQuery::with() is implemented with IN condition, ii) can apply to any table with any kind of index, iii) Record will not repeat, iii) can apply to query which already has offset/limit. Cons are i) can exhaust memory when records exceed tens of million

One solution is to add more parameters to batch() and each(), the first is mode, indicating the implementation we need to use, another is specific parameters for the implementation, such as the index column.
But these heterogeneous parameters will pollute the batch() and each() function.
Another solution is to add several services for each implementation. The usage is like: new RangeQueryIterator(User::find()->where(...), $batchSize))
It can decouple ActiveQuery from Iterator, which is preferred.

Same surprice for me after 5 years of frekquent usage of each() and batch() to carefully optimize memmory usage, but no optimization happened 😠 It's for me another reason to leave this "old" world of PHP + MySQL in favor of NodeJS + MongoDB.

Many db problems are solved in yii3, the problem is the BC.

For example the problem of varbinary, and the db cache in oracle are solved.

@SamMousa
Copy link
Contributor

Can someone lock this topic. Developers moving away from php because they don't understand inner workings of stuff is really not a relevant issue. There's plenty of other languages in which those people can be bad developers. Commenting on a closed issue that's several years old just to complain about taking 6 years to understand something seems counterproductive.

@petr-panek
Copy link

Can someone lock this topic. Developers moving away from php because they don't understand inner workings of stuff is really not a relevant issue. There's plenty of other languages in which those people can be bad developers. Commenting on a closed issue that's several years old just to complain about taking 6 years to understand something seems counterproductive.

I don't want create any flame war here, but I need to react.
As I understand from the comments here the additional explanation about the each() and batch() was added to guide in 2017. I study Yii before, so there was no warning in documentation that each() and batch() can work different with MySQL and not as expected.

@SamMousa, your comment is simply arrogant (and counterproductive).

Thank you @lubobill1990 for providing solution 👍

@bizley
Copy link
Member

bizley commented Oct 16, 2022

Yes, thanks for the input from everyone, let's not vent our frustration like that. Please move this discussion to the forum if necessary.

@yiisoft yiisoft locked as too heated and limited conversation to collaborators Oct 16, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
type:docs Documentation
Projects
None yet
Development

No branches or pull requests