This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
Summary
I planned to update the image pre-processing part of Keras during my Christmas break to cover more use cases: multi-label, multi-output, a dictionary iterator, and perhaps others. I have faced some of these use cases myself and have encountered other people with the same needs, so I think it is a good idea to include them in the image pre-processing module, while being careful not to address every exceptional use case and end up with untraceable code.
As I started, I found that all the logic lives in a single 2267-line file that mixes many unrelated concerns. I do not think this is good practice; maintaining and improving this file (and its tests) will be a pain in the future, and I think it already is.
So, to start, I propose to break the file into a module, with a much better structure for the iterators and with `ImageDataGenerator` containing the main logic. The classes and functions inside the `image` module can still be imported directly from `keras_preprocessing.image`, as they are exposed via the `__init__.py` file. Therefore, no change is required for imports that depend on `keras_preprocessing.image`.
What do you think of this approach? Should I continue?
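As a rough illustration of the re-export idea described above, here is a minimal, self-contained sketch. The package name `demo_preprocessing`, the file names, and the stand-in classes are all hypothetical and are not the actual layout of this PR; the sketch only shows how an `__init__.py` can keep an old import path working after a file is split into submodules.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical layout (illustrative names only, not the PR's real files):
#   demo_preprocessing/image/__init__.py
#   demo_preprocessing/image/iterator.py
#   demo_preprocessing/image/image_data_generator.py
FILES = {
    "demo_preprocessing/__init__.py": "",
    "demo_preprocessing/image/iterator.py": """
        class Iterator:
            # Stand-in for the base iterator class.
            pass
    """,
    "demo_preprocessing/image/image_data_generator.py": """
        from .iterator import Iterator

        class ImageDataGenerator:
            # Stand-in for the class holding the main logic.
            iterator_cls = Iterator
    """,
    "demo_preprocessing/image/__init__.py": """
        # Re-export the public names so the old import path keeps working.
        from .iterator import Iterator
        from .image_data_generator import ImageDataGenerator
    """,
}


def build_demo_package(root):
    """Write the hypothetical package files under `root`."""
    for rel_path, source in FILES.items():
        path = Path(root) / rel_path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(textwrap.dedent(source))


tmp_root = tempfile.mkdtemp()
build_demo_package(tmp_root)
sys.path.insert(0, tmp_root)
importlib.invalidate_caches()

# Callers import from the package itself, not from the new submodules.
from demo_preprocessing.image import ImageDataGenerator, Iterator

print(ImageDataGenerator.__module__)
```

The printed module path shows that the class now lives in a submodule, while the import statement used by callers did not have to change.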
Related Issues
There are no specific issues for this particular PR, as it is just a refactoring, but you can already see how the quality of the code has deteriorated compared to the native Keras code. I argue that a clean, tractable structure is critical for fixing bugs and contributing, especially for new contributors to the project like myself. This will of course also accelerate the addition of new features from this repo to the core `Keras` repo.
For example, after a quick look into the `DataframeIterator` I found many issues with it:
- typing `other`? If it's in order to have continuous targets, then that mode should be specified.
- `other` doesn't really add much as such to the different use cases, as they are very different, but it certainly adds logic complexity. Just to start, the targets are saved in `self.data`, in comparison with `self.classes` with other modes.
- `other` is not necessarily the problem; it is just that the iterators' logic was initially made to handle only multi-class use cases, and the core logic needs to be updated to handle the other use cases as well while keeping consistency.
- `input` for example. I understand there are specific cases where classes is indeed not necessary, but the current logic does nothing to handle that.
- `has_ext`. Directory has to be given. I think `has_ext` should be removed; this is the typical example where the logic tries to handle very specific cases, causing a mess in the code flow and logic. The user should be responsible for giving filenames equal to filenames in the directory. `ValueError` is understood that filenames contain extensions.
So, like I said, there is no specific issue for this change, but the issues below could be resolved faster, and would be more accessible to new contributors, with the structure change I propose.
#99 #102 #95 #88 #67 #56
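To make the `has_ext` point above concrete, here is a minimal sketch of what "the user is responsible for giving filenames equal to filenames in the directory" could look like. The helper name `check_filenames` and its behavior are assumptions made for illustration; this is not code from `DataframeIterator` or from this PR.

```python
import os
import tempfile


def check_filenames(directory, filenames):
    """Hypothetical helper: every filename, extension included, must exist in `directory`."""
    present = set(os.listdir(directory))
    missing = [name for name in filenames if name not in present]
    if missing:
        # No extension guessing: a mismatch is reported instead of patched up.
        raise ValueError(f"Files not found in {directory!r}: {missing}")
    return list(filenames)


# Usage: one image file in a temporary directory.
demo_dir = tempfile.mkdtemp()
open(os.path.join(demo_dir, "cat.png"), "w").close()

valid = check_filenames(demo_dir, ["cat.png"])

try:
    check_filenames(demo_dir, ["cat"])  # extension omitted
    missing_ext_rejected = False
except ValueError:
    missing_ext_rejected = True
```

With a check like this, the iterator would need no flag for extension handling, at the cost of requiring exact filenames from the caller.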
P.S. I think you should start thinking about doing the same with `sequence.py` and `text.py` as they start growing.
PR Overview