-
Notifications
You must be signed in to change notification settings - Fork 58
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rework to support multiple concurrent instances #166
Open
nachoparker
wants to merge
1
commit into
nextcloud:main
Choose a base branch
from
nachoparker:multi-concurrent-processes
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
+129
−62
Open
Changes from all commits
Commits
File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
<?php | ||
declare(strict_types=1); | ||
/** | ||
* @copyright Copyleft (c) 2019, Ignacio Nunez <nacho@ownyourbits.com> | ||
* | ||
* @author Ignacio Nunez <nacho@ownyourbits.com> | ||
* | ||
* @license GNU AGPL version 3 or any later version | ||
* | ||
* This program is free software: you can redistribute it and/or modify | ||
* it under the terms of the GNU Affero General Public License as | ||
* published by the Free Software Foundation, either version 3 of the | ||
* License, or (at your option) any later version. | ||
* | ||
* This program is distributed in the hope that it will be useful, | ||
* but WITHOUT ANY WARRANTY; without even the implied warranty of | ||
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the | ||
* GNU Affero General Public License for more details. | ||
* | ||
* You should have received a copy of the GNU Affero General Public License | ||
* along with this program. If not, see <http://www.gnu.org/licenses/>. | ||
* | ||
*/ | ||
|
||
namespace OCA\PreviewGenerator\Migration; | ||
|
||
use OCP\DB\ISchemaWrapper; | ||
use OCP\Migration\SimpleMigrationStep; | ||
use OCP\Migration\IOutput; | ||
use Doctrine\DBAL\Types\Type; | ||
|
||
class Version020200Date20190608205303 extends SimpleMigrationStep { | ||
|
||
/** | ||
* @param IOutput $output | ||
* @param \Closure $schemaClosure The `\Closure` returns a `ISchemaWrapper` | ||
* @param array $options | ||
* @return null|ISchemaWrapper | ||
*/ | ||
public function changeSchema(IOutput $output, \Closure $schemaClosure, array $options) { | ||
/** @var ISchemaWrapper $schema */ | ||
$schema = $schemaClosure(); | ||
$table = $schema->getTable('preview_generation'); | ||
|
||
if (!$table->hasColumn('locked')) { | ||
$table->addColumn('locked', Type::BOOLEAN, [ | ||
'notnull' => true, | ||
'default' => 0, | ||
]); | ||
} | ||
return $schema; | ||
} | ||
} |
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not get around this with a DB transaction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, you could get a lock by querying a row where file is node.id and locked is false, setting locked to true.
If this returns a result, you have the lock. If no lock, skip/wait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
I am not sure I understand what you both mean. Ideally we wouldn't have to do this if we could use some kind of
pthreads
/parallel
approach but since ZST packages are not available in current Debian systems, or even in Sury, I am calling theocc
command several times in parallel.Since we can't control different threads, but we have independent processes, then I added the
lock
column to the database so that two different instances don't start working on the same preview at the same time.If you look at the code, that is exactly what I do, but because there is a chance that both instances read the lock value as unlocked at the same time before setting it as locked, I added an ugly random wait to minimize the chance of this happening.
This wouldn't be needed if we had multithread approach with barriers, waits and so on. (or just give each child thread a different group of previews to work on).
I hope it is clear now. This is not particularly pretty but given the limitations we have it has proven very much effective :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am no PHP or SQL coder, but isn't there a way to combine a lock check + lock creation to eliminate the race condition?
E.g. in shell I do something like this:
set -C
adds the noclobber option, thus nothing can be written to existing files.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea is to ran several instances simultaneously
I want several processes in parallel, so the locking needs to be in the PHP section
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need return code which checks for success (you are thinking shell scripting), we can just work with SQL to see if the column is locked or a new PID column matches, that's quite straightforward.
There's many ways to skin a cat. Deleting at the very end of the process would be an easy improvement but I don't want to keep going if this is going nowhere so I am awaiting feedback from the maintainer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been working on a python version of the preview generator with jpegtran for blazing fast image processing.
This snippet allows running multiple python processes.
I'll let you know when it's ready to be published :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Python as dependency IMO not a solution for the default app, but nice as alternative, if it's finally more efficient/faster and/or for systems where Python is available anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stefaang sounds cool, great. I imagine it spawns threads in parallel or uses multithreading per image?
That would be a nice thing to have even though ideally this would be implemented in multithreading PHP in the long run.
But hey, that might never happen so kudos
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nachoparker
That would not solve the race condition. The problem is the time between checking the fields state/value and setting new state/value.
E.g. two parallel processes/threads by chance check the same field nearly the same time. One checks slightly earlier but it can take slightly too long for it to set the locked state/value, so the other process still finds the filed unlocked during this minimal but existing time frame => both process the same file.
So generally when doing a check and setting a value based on the checks result, this strictly inherits such a race condition. For this reason it would be great to skip the check and set the value directly. But this requires a set function that fails conditionally and this failure must be recognised afterwards to decide whether to go on processing or skipping it.
This matches as well what @stefaang suggested above schematically.
But yeah, I think you are right that before discussing such polishing, the general decision whether this is wanted or not and on which level the multi-threading should be done, has been made.