This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

doc: explain mechanism of validation data split of flow_from_directory #254

@btel btel commented Oct 15, 2019

Summary

This PR proposes to clarify how the validation set is created in flow_from_directory when the validation_split option is used. This can matter when the image file names in a directory are meaningful (for example, when they encode a particular instance of an object). One can imagine the following organisation of files:

Cats/
    balinese1.jpg
    balinese2.jpg
    siamese1.jpg
    siamese2.jpg
Dogs/
    Shepard1.jpg
    Shepard2.jpg
    Terrier2.jpg

In such a setting the validation set might get all the Siamese cats and terrier dogs, while the training set might get only the Balinese cats and shepherd dogs. This will heavily affect validation metrics, because the model is then evaluated on breeds it never saw during training.
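The effect can be illustrated with a minimal sketch. It assumes that, within each class directory, file names are sorted and the validation subset is taken as one contiguous slice of that sorted list. The helper `contiguous_split` is hypothetical and is not the library's code; which end of the list goes to validation is also an assumption, and the same problem appears either way.

```python
def contiguous_split(filenames, validation_split):
    """Hypothetical helper: split sorted names into (validation, training).

    Assumes the validation subset is a contiguous slice of the sorted
    file list, which is the behaviour this PR wants to document.
    """
    ordered = sorted(filenames)
    n_val = int(validation_split * len(ordered))
    return ordered[:n_val], ordered[n_val:]


# Directory layout from the example above, one entry per class folder.
classes = {
    "Cats": ["balinese1.jpg", "balinese2.jpg", "siamese1.jpg", "siamese2.jpg"],
    "Dogs": ["Shepard1.jpg", "Shepard2.jpg", "Terrier2.jpg"],
}

splits = {name: contiguous_split(files, 0.5) for name, files in classes.items()}

for name, (validation, training) in splits.items():
    print(name, "validation:", validation, "training:", training)
```

With a 0.5 split, the Cats class is divided exactly along breed lines: one subset holds only Balinese files and the other only Siamese files. Shuffling the file list before splitting, or using file names that do not group instances, would avoid this.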

Related Issues

PR Overview

  • [n] This PR requires new unit tests [y/n] (make sure tests are included)
  • [y] This PR requires to update the documentation [y/n] (make sure the docs are up-to-date)
  • [y] This PR is backwards compatible [y/n]
  • [n] This PR changes the current API [y/n] (all API changes need to be approved by fchollet)
