BUG: Validation breaks cached image sizes #1934

feffy380 · 2025-02-14T13:46:26Z

The validation split helper function does not adjust the image sizes list, so images get stretched due to being assigned the wrong size.

sd-scripts/library/train_util.py

Line 1939 in 0778dd9

sizes = [None] * len(img_paths)

sd-scripts/library/train_util.py

Lines 1976 to 1998 in 0778dd9

    
    # We want to create a training and validation split. This should be improved in the future
    # to allow a clearer distinction between training and validation. This can be seen as a
    # short-term solution to limit what is necessary to implement validation datasets
    #
    # We split the dataset for the subset based on if we are doing a validation split
    # The self.is_training_dataset defines the type of dataset, training or validation
    # if self.is_training_dataset is True -> training dataset
    # if self.is_training_dataset is False -> validation dataset
    if self.validation_split > 0.0:
        # For regularization images we do not want to split this dataset.
        if subset.is_reg is True:
            # Skip any validation dataset for regularization images
            if self.is_training_dataset is False:
                img_paths = []
            # Otherwise the img_paths remain as original img_paths and no split
            # required for training images dataset of regularization images
        else:
            img_paths = split_train_val(
                img_paths,
                self.is_training_dataset,
                self.validation_split,
                self.validation_seed
            )

@rockerBOO


rockerBOO · 2025-02-14T16:29:46Z

I see the sizes are set to None, but I'm not following how that stretches the images. Reading further, the code picks up on the None and tries to get the image size later in the flow.

feffy380 · 2025-02-15T02:14:00Z

The validation split function shuffles and truncates the list of image paths but not the corresponding list of sizes. So when we access sizes[i] later, we get the size of some other image in the dataset, which gets passed to the image transforms and causes the stretching.
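A minimal sketch of the misalignment, assuming the sizes list was already filled from a cache before the split. The helper `shuffle_and_truncate`, the file names, and the sizes are all made up for illustration; this is not the actual `split_train_val` from sd-scripts, only something with the same shuffle-then-truncate shape.

```python
import random


def shuffle_and_truncate(paths, keep_fraction, seed):
    """Hypothetical stand-in for a split helper: shuffle a copy, keep a prefix."""
    rng = random.Random(seed)
    shuffled = list(paths)
    rng.shuffle(shuffled)
    keep = int(len(shuffled) * keep_fraction)
    return shuffled[:keep]


# Made-up dataset: the true size of every image, keyed by path.
true_sizes = {
    "a.png": (512, 512),
    "b.png": (768, 512),
    "c.png": (512, 768),
    "d.png": (1024, 512),
}
img_paths = list(true_sizes)
# Before the split, sizes[i] belongs to img_paths[i].
sizes = [true_sizes[p] for p in img_paths]

# Only the paths go through the split; sizes is left untouched.
split_paths = shuffle_and_truncate(img_paths, keep_fraction=0.5, seed=0)

# sizes still has the original order and length, so sizes[i] may now
# describe a different image than split_paths[i].
mismatches = [
    (path, sizes[i], true_sizes[path])
    for i, path in enumerate(split_paths)
    if sizes[i] != true_sizes[path]
]
print(f"{len(split_paths)} paths, {len(sizes)} sizes, {len(mismatches)} mismatched")
```

Even when a particular seed happens to leave some entries aligned, the two lists no longer have the same length, so index-based lookups cannot be trusted.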

feffy380 · 2025-02-15T02:17:11Z

The easiest fix is probably to zip the paths and sizes before doing the validation split and then separate them afterwards.
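Roughly what I have in mind, as a sketch only. `split_train_val_sketch` is a hypothetical stand-in for the real helper (same argument order as the call above, but its internals are my assumption), and `split_paths_and_sizes` is a made-up wrapper name.

```python
import random


def split_train_val_sketch(items, is_training_dataset, validation_split, validation_seed):
    """Hypothetical stand-in for the split helper, not the sd-scripts code."""
    rng = random.Random(validation_seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_val = round(len(shuffled) * validation_split)
    n_train = len(shuffled) - n_val
    return shuffled[:n_train] if is_training_dataset else shuffled[n_train:]


def split_paths_and_sizes(img_paths, sizes, is_training_dataset, validation_split, validation_seed):
    """Split paths and sizes together so each path keeps its own size."""
    # Pair every path with its size so the shuffle moves them as one unit.
    paired = list(zip(img_paths, sizes))
    kept = split_train_val_sketch(
        paired, is_training_dataset, validation_split, validation_seed
    )
    if not kept:
        # zip(*[]) cannot be unpacked into two names, so handle the empty case.
        return [], []
    # Separate the pairs back into two parallel lists.
    paths_out, sizes_out = zip(*kept)
    return list(paths_out), list(sizes_out)
```

Usage with made-up data: the training and validation splits share a seed, so they are complementary, and every returned size still matches its path.

```python
true_sizes = {"a.png": (512, 512), "b.png": (768, 512), "c.png": (512, 768), "d.png": (1024, 512)}
paths = list(true_sizes)
cached = [true_sizes[p] for p in paths]

train_paths, train_sizes = split_paths_and_sizes(paths, cached, True, 0.25, 42)
val_paths, val_sizes = split_paths_and_sizes(paths, cached, False, 0.25, 42)
```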
