Conversation
Another potential use case for this redesign is TensorFlow BigQuery ML, which as of now requires input transformations/datatypes to be TF native.
This was very much needed. Even today, once we apply augmentation we have to write sample code to verify that everything is working as expected. That is simple to do, but it makes sense for it to be part of the API itself.
Awesome!
I would be able to work on some of the features once this PR is merged.
- Image loading and image augmentation should be synced across inputs and targets
- It should be possible to use different standardization preprocessing (outside of augmentation) across inputs and targets

**Proposal:** (TBD)
My only concern is that this proposal is not general enough to accommodate all users. If a user wants to do object detection, for example, they will not use keras-preprocessing.
I can work on a new RFC, but I feel that generator/dataset_from_dataframe could be general enough to support it.
Have you checked out the proposal for segmentation? Could we use something similar for object detection? What's missing?
Right now the segmentation example works because the targets are images. In a case where we have boxes or keypoints, we don't have that. Furthermore, to do data augmentation on those, we need to normalize the points to [0, 1] or provide the shape of the original image.
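To illustrate the point above, here is a minimal sketch (a hypothetical helper, not part of keras-preprocessing) showing why normalized coordinates matter: with boxes normalized to [0, 1], a horizontal flip needs no knowledge of the image width, whereas pixel coordinates would require the original shape.

```python
def hflip_image_and_boxes(image, boxes):
    """Horizontally flip an image (given as a nested list of rows) together
    with its bounding boxes, given as (x_min, y_min, x_max, y_max) tuples
    normalized to [0, 1].

    Because the boxes are normalized, the flip needs no knowledge of the
    image width; with pixel coordinates we would also need the shape.
    """
    # Flip each row of pixels left-to-right.
    flipped_image = [list(reversed(row)) for row in image]
    # Mirror the x-coordinates; y-coordinates are unchanged.
    flipped_boxes = [
        (1.0 - x_max, y_min, 1.0 - x_min, y_max)
        for (x_min, y_min, x_max, y_max) in boxes
    ]
    return flipped_image, flipped_boxes
```

The same idea extends to keypoints (mirror each normalized x-coordinate), which is why a general object-detection API would need either normalized points or access to the image shape.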
What would be your preferred API for object detection?
We would need something similar to flow_from_dataframe and a way to request some info from the image, such as its shape.
We would also need the ability to change the way we augment 'samples' (image, boxes, keypoints).
Then we could support any task with little effort, I think.
As I said, we should work on another RFC, as this feature is out of scope for this one.
Maybe I should bring it up during the meeting on Monday?
Let's discuss this today.
Proposed changes: API simplification & standardization, and extension of supported types.

- All functions should work on either single images or batches of images
Do we use the deprecation procedure for this?
As we rename arguments, we should keep the old names working (with a deprecation warning).
There is one case where we are changing the semantics of an argument in a non-backwards-compatible way: `preprocessing_function` would now be applied before the transformations rather than after. We could come up with new names in order to avoid breakages. Thoughts?
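One way to keep old argument names working with a warning is a small alias shim. This is only a sketch; the actual Keras deprecation utilities differ, and the `rg` → `rotation_range` rename below is used purely as an example of the pattern.

```python
import warnings

def _deprecated_alias(kwargs, old_name, new_name):
    """If the caller used `old_name`, warn and move the value to `new_name`.

    Illustrative helper only; not the real Keras deprecation machinery.
    """
    if old_name in kwargs:
        warnings.warn(
            f"`{old_name}` is deprecated, use `{new_name}` instead.",
            DeprecationWarning,
            stacklevel=3,
        )
        kwargs.setdefault(new_name, kwargs.pop(old_name))
    return kwargs

def random_rotation(image, rotation_range=0.0, **kwargs):
    # Hypothetical rename: `rg` (old) -> `rotation_range` (new).
    kwargs = _deprecated_alias(kwargs, "rg", "rotation_range")
    rotation_range = kwargs.pop("rotation_range", rotation_range)
    if kwargs:
        raise TypeError(f"Unexpected arguments: {sorted(kwargs)}")
    # The transformation itself is omitted; we just return the resolved value.
    return image, rotation_range
```

Callers using the old name keep working and see a `DeprecationWarning`; callers using the new name get no warning.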
There was a 3d preprocessing request/thread at #5.
rfcs/20190729-keras-preprocessing.md (outdated)
```python
# Returns JSON-serializable config dict

def preview(self, data, save_to_directory=None, save_prefix=None, save_format='png'):
    """Enables users to previous the image augmentation configuration.
```
Enables users to preview the image augmentation configuration.
Fixed
A comment about resizing: when you specify [...]. I am not sure if this is the right place to propose this, as it might be out of scope of this RFC.
How about using tf.image instead of PIL for image processing (type conversion, resizing, and normalization)?
[...] If you do care about aspect ratio, or if you want to use a different kind of resizing strategy (e.g. center crop and resize, which is popular), I would suggest implementing it in [...]
We want the image transformations to support both numpy arrays and tensors, so we would probably have both tf.image and scipy-based implementations.

The only problem is that [...]
#### Question: how to support image segmentation in a simple way?

**Requirements:**
- Image loading and image augmentation should be synced across inputs and targets
Keeping things synced is simple. Once the user passes the `inputs` and `targets`, we should store the input filename and target filename in a pandas dataframe:

```
   inputfile        targetfile
0  path_to_file1    path_to_target1
1  path_to_file2    path_to_target2
...
```

- In order to shuffle the data while keeping `inputs` and `targets` synced, we only have to shuffle the indices of this `dataframe`.
- Any augmentation that is applied to a particular `input` will be applied to the `target` as well, because they can now be processed by the same thread in a sequential manner.
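The index-shuffling idea can be sketched in a few lines. This is a dependency-free illustration (plain lists standing in for the pandas dataframe; the function name is made up), not proposed API.

```python
import random

def make_synced_pairs(input_files, target_files, seed=None):
    """Pair input and target paths, then shuffle the pairs by index so
    inputs and targets stay aligned.

    A pandas DataFrame would achieve the same by shuffling its index;
    plain lists keep this sketch dependency-free.
    """
    pairs = list(zip(input_files, target_files))
    indices = list(range(len(pairs)))
    # Shuffle the indices, not the two lists independently.
    random.Random(seed).shuffle(indices)
    return [pairs[i] for i in indices]
```

Because each (input, target) pair travels together through shuffling, any downstream augmentation applied per-row automatically stays in sync.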
It's almost what flow_from_dataframe already does. So maybe we can beef up this function to perform whatever we need.
**Requirements:**
- Image loading and image augmentation should be synced across inputs and targets
- It should be possible to use different standardization preprocessing (outside of augmentation) across inputs and targets
Wouldn't `preprocessing_fn` be enough for that?
rfcs/20190729-keras-preprocessing.md (outdated)
```python
    vertical_flip=False,
    rescale=None,
    preprocessing_function=None,
    postprocessing_function=None,
```
There is some confusion with `preprocessing_function` and `postprocessing_function` here. The current implementation does not contain `postprocessing_function`. At line 194 below (in the `# PROPOSED` section) it is missing.
Fixed it.
We will do the design review for this RFC on Monday, August 19th at 2pm PT. I am leaving the comment period open until then.
Working list of unresolved questions: [...]
From my point of view the RFC still has a minor bug. It says that [...]. See above: #6 (review)
One more comment. Let me explain this: currently, when you implement a [...], I suggest renaming both parameters so the old parameter is not available anymore.
rfcs/20190729-keras-preprocessing.md (outdated)
This presents the opportunity to improve on the existing API. In particular we don't have good support for image segmentation use cases today.

Some features are also being subplanted by [preprocessing layers](https://github.com/keras-team/governance/blob/master/rfcs/20190502-preprocessing-layers.md), in particular text processing.
nit: supplanted
- Make all classes in `keras.preprocessing` JSON-serializable.
- Deprecate `Tokenizer` class in favor of `TextVectorization` preprocessing layer
- Make image-transformation functions in `affine_transformations` submodule compatible with TensorFlow tensor inputs (accepting tensors and returning tensors).
- Improve signature of the above-mentioned functions, by using fully-spelled argument names, and fewer arguments if possible.
How are we approaching backwards compatibility here? Presumably this pertains to keras-team/keras initially, but we will pull into TF, at which point we will need to maintain all the old APIs. What's the plan?
In Keras we have a deprecation procedure in our wiki. We also have utils in Keras to emit the correct warnings.
```python
# PROPOSED

def array_to_generator(
```
There seems to be some variation in from/to -- should these be standardized?
Yes, let's standardize them. We can either settle for `(input)_to_(output)` or `(output)_from_(input)`.
```python
        seed=None,
        subset=None):

def dataset_from_dataframe(
```
Is there a world where we don't need all of these, and we can just determine data type from the passed object? String is dir path, array/dataframe can be type-checked. Then we just have to_dataset(), without the need for so many separate APIs.
That would be a simplification in terms of the set of methods available, but the signatures would get quite complex / confusing, since each action has its own specific keyword arguments on top of the shared ones (e.g. `follow_links`, `target_mode`, ...).
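The dispatch idea under discussion can be sketched as follows. `to_dataset` and its branches are hypothetical, and the returned tuples stand in for real dataset-building logic; the sketch mainly shows where the signature problem comes from.

```python
def to_dataset(data, **kwargs):
    """Hypothetical single entry point that dispatches on the input type:
    a string is treated as a directory path, a DataFrame-like object is
    type-checked via duck typing, and arrays/sequences fall through.

    The catch raised in the thread: each branch has its own specific
    keyword arguments (e.g. `follow_links` only makes sense for
    directories), so a merged signature grows confusing.
    """
    if isinstance(data, str):
        # Directory path branch (stand-in for real logic).
        return ("from_directory", data, kwargs)
    if hasattr(data, "columns"):
        # Duck-typed pandas DataFrame branch.
        return ("from_dataframe", type(data).__name__, kwargs)
    if hasattr(data, "shape") or isinstance(data, (list, tuple)):
        # Numpy array or plain sequence branch.
        return ("from_arrays", len(data), kwargs)
    raise TypeError(f"Cannot build a dataset from {type(data).__name__}")
```

The type dispatch itself is easy; the cost is that every branch-specific keyword ends up in one flat `**kwargs`, which is exactly the complexity concern raised above.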
```python
        preprocessing_function=None,
        interpolation_order=1,
        data_format='channels_last',
        dtype='float32'):
```
This is a lot of constructor args, and it biases us towards adding more with each new feature, rather than finding other ways of setting parameters. Is this the best way to do this, versus having subclasses that handle subsets, or deferring collecting some of these until the method in which they are needed?
I guess we can decompose this into two parts: the normalization and the augmentation. The rest is shared. It may add some complexity to what the user needs to do.
There is one context in which having lots of arguments is acceptable: when they form a flat, list-like configuration, i.e. adding more arguments results in linear complexity and maintenance workload, because they don't interact with each other. This is the case here: we're just describing a list of image transformations. Once you have the logic infrastructure to do one image transformation, doing arbitrarily many does not add much.
`tf.data.Dataset` is the main API for data loading and preprocessing in TensorFlow. It has two advantages:

- It supports GPU prefetching
"It supports GPU prefetching" is a small subset of the following advantages:
- it supports parallelization of transformations
- it supports parallelization of I/O
- it supports software pipelining (e.g. prefetching to GPU / TPU)
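To make the software-pipelining point concrete, here is a minimal pure-Python prefetcher sketch. It is an illustration of what prefetching buys (the consumer overlaps with data production), not a replacement for `tf.data`, which provides this natively via `.prefetch()` along with parallel maps and parallel I/O.

```python
import queue
import threading

def prefetch(generator, buffer_size=2):
    """Run `generator` in a background thread and buffer its items, so the
    consumer (e.g. a training step) overlaps with data loading.

    This mimics the pipelining that `tf.data`'s `.prefetch()` provides.
    """
    q = queue.Queue(maxsize=buffer_size)
    sentinel = object()  # unique end-of-stream marker

    def producer():
        for item in generator:
            q.put(item)  # blocks when the buffer is full
        q.put(sentinel)

    threading.Thread(target=producer, daemon=True).start()
    while True:
        item = q.get()
        if item is sentinel:
            return
        yield item
```

While the consumer processes batch N, the producer thread is already preparing batch N+1; `tf.data` extends the same idea to parallel transformations and device (GPU/TPU) prefetch.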
I think the first two advantages you mentioned are not listed because they are already supported by the current API.
- Remove the submodules `image`, `text`, `timeseries` from the public API and expose their contents as part of the top-level `keras.preprocessing`.
- Make all classes in `keras.preprocessing` JSON-serializable.
- Deprecate `Tokenizer` class in favor of `TextVectorization` preprocessing layer
- Make image-transformation functions in `affine_transformations` submodule compatible with TensorFlow tensor inputs (accepting tensors and returning tensors).
I assume this would necessitate new TF ops -- would we expose those in tf.image? I'd like to make that API more complete.
There are also some ops from Deepmind at https://github.com/deepmind/multidim-image-augmentation
- Rename methods `flow` to `generator_from_arrays`, `flow_from_directory` to `generator_from_directory`, and `flow_from_dataframe` to `generator_from_dataframe`.
- Improve signature of the above-mentioned methods.
- Figure out how to support image segmentation use cases with `ImageDataGenerator` (open question).
- Refactor `TimeseriesGenerator` to follow a similar design as `ImageDataGenerator`. Only configuration arguments should be passed in the constructor. The data should be passed to methods such as `dataset_from_arrays`.
Very broad question: Is it worth having this two level structure (IIUC): A Generator class configured with a few options, which then offers methods to create generators or Datasets? Would it be easier to have a flat structure of factory methods (or classes) which simply take configuration and data and produce an appropriate Dataset / generator / array?
In other words, what does the separation of data and config buy us? Do people reuse a (say) ImageDataGenerator several times to produce different data using the same configured preprocessing?
I don't think people reuse their ImageDataGenerator, but both `ImageDataGenerator` and `dataset_from_dataframe` require many arguments, which might confuse the user if done in a single function.
The factoring between config and data is there for two reasons:
- It enables reusing a config on new data; this is how we plan to support image segmentation, for instance (target masks have to be augmented in the same way as inputs).
- It makes it easier for people to think about data augmentation/normalization and loading separately, in two steps, which would otherwise form a very large and potentially confusing single step.
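The config/data factoring can be sketched with a toy configuration object (the class name and flip-only transform are illustrative, not the proposed API): the constructor takes only configuration, and the same configured transform is then applied to both an image and its mask with a shared seed.

```python
import random

class AugmentConfig:
    """Configuration-only object (no data in the constructor). The same
    configured transform can be applied to an image and to its
    segmentation mask with a shared seed, keeping them in sync.
    """

    def __init__(self, horizontal_flip=False):
        self.horizontal_flip = horizontal_flip

    def apply(self, grid, seed):
        # Seeded RNG: the same seed yields the same random decisions,
        # so image and mask receive identical transformations.
        rng = random.Random(seed)
        if self.horizontal_flip and rng.random() < 0.5:
            return [list(reversed(row)) for row in grid]
        return [list(row) for row in grid]
```

Calling `cfg.apply(image, seed)` and `cfg.apply(mask, seed)` with the same seed guarantees both are flipped (or not flipped) together, which is the segmentation use case described above.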
Following the design review meeting on Monday, we have decided to take more time to significantly rewrite this proposal. I'll be closing the PR for the time being, and we will open a new PR with the revised proposal at a later date.
# Keras-Preprocessing Redesign

Comment period open until August 18th, 2019.

## Context

`tf.data.Dataset` is the main API for data loading and preprocessing in TensorFlow. It has two advantages:

- It supports GPU prefetching

Meanwhile, `keras.preprocessing` is a major API for data loading and preprocessing in Keras. It is based on Numpy and Scipy, and it produces instances of the `keras.utils.Sequence` class, which are finite-length, resettable Python generators that yield batches of data.

Some features of `keras.preprocessing` are highly useful and don't have straightforward equivalents in `tf.data` (in particular image data augmentation and dynamic time series iteration).

Ideally, the utilities in `keras.preprocessing` should be made compatible with `tf.data`. This presents the opportunity to improve on the existing API. In particular we don't have good support for image segmentation use cases today.

Some features are also being supplanted by preprocessing layers, in particular text processing. As a result we may want to deprecate them.

## Goals

- Make `keras.preprocessing` compatible with `tf.data`
- Simplify and standardize the `keras.preprocessing` APIs