# Synthetic Image Dataset Generation for CNN Training

This study presents an approach for generating synthetic image data to train Convolutional Neural Network (CNN) models. Because industry-specific constraints make it difficult to assemble a unique training dataset, the approach augments existing images through a pipeline built on diffusion models, increasing the volume of a given dataset.

Diffusion models transform random noise, step by step, into a target distribution using a scheduler and a U-Net model. The target distribution can be a two-dimensional image, a three-dimensional volume, or an audio waveform; in this study, every target distribution is a two-dimensional, three-channel image used for CNN training. (A minimal sketch of such a denoising loop is given at the end of this overview.)

In theory, any diffusion model can generate an unlimited number of images from random noise. In practice, however, as the size of the diffuser-augmented dataset grows, the contextual similarity of the generated images tends to cause overfitting in CNN training. The primary focus of this study is precisely this increasing similarity among diffuser outputs: unlike standard augmentation methods, our pipeline alters the background, position, and perspective of each image element in the dataset, providing contextual diversification. Through this pipeline, we aim to overcome image scarcity and monotony, the major disadvantages of a limited dataset.

Additionally, an efficiency analysis compares the outputs of our pipeline with those of the Stable Diffusion XL (SDXL) pipeline. By measuring the context-manipulation ability of our pipeline against SDXL, a widely used open-source diffusion model, we aim to offer a practical methodology to researchers who want to improve CNN models with synthetic image data.

Before introducing the diffuser pipeline we developed, the study explains the components and concepts needed to understand the structure of diffusers and the various models used to generate synthetic image data, which appear in later sections. It then examines the architecture of the pipeline element by element, including its outputs, code implementation, and other aspects. Next, datasets of images produced with our pipeline are created and used to train CNN models, and the results are compared with those of a CNN model with the same hyperparameters trained on real data. Both standalone synthetic datasets and datasets mixed with real images are evaluated, in order to determine the most efficient way to use synthetic images in CNN training.

In the subsequent sections of the thesis, the training dataset is examined first, followed by a description of the CNN model and training parameters and an analysis of the training and test results. The final section gives an overall evaluation of the entire study.
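As a concrete illustration of the scheduler and U-Net interplay described above, here is a minimal denoising-loop sketch using the Hugging Face `diffusers` library. The checkpoint `google/ddpm-cat-256`, the step count, and the image size are illustrative assumptions, not part of this project's pipeline.

```python
# Minimal denoising-loop sketch with Hugging Face diffusers.
# Checkpoint and step count are illustrative, not this project's pipeline.
import torch
from diffusers import UNet2DModel, DDPMScheduler

model = UNet2DModel.from_pretrained("google/ddpm-cat-256")       # U-Net noise predictor
scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256") # noise schedule
scheduler.set_timesteps(50)  # fewer steps than training, for a quick sketch

# Start from pure Gaussian noise with three color channels.
sample = torch.randn(1, 3, model.config.sample_size, model.config.sample_size)

for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = model(sample, t).sample                    # predict the noise at step t
    sample = scheduler.step(noise_pred, t, sample).prev_sample  # remove a little of it

# `sample` is now an approximate draw from the learned image distribution.
```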
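For the SDXL comparison, baseline images can be produced with the stock `StableDiffusionXLPipeline`; the prompt and generation settings below are placeholders rather than the prompts actually used in the study.

```python
# SDXL baseline sketch: generate comparison images with the stock pipeline.
# Prompt and settings are placeholders, not the study's actual configuration.
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

image = pipe(
    prompt="a product photo on a cluttered workbench",  # placeholder prompt
    num_inference_steps=30,
    guidance_scale=7.5,
).images[0]
image.save("sdxl_baseline.png")
```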
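One straightforward way to build the mixed real/synthetic training sets evaluated in this study is PyTorch's `ConcatDataset`; the directory names in this sketch are hypothetical.

```python
# Sketch of mixing real and synthetic images for CNN training.
# Directory names ("data/real", "data/synthetic") are hypothetical.
from torch.utils.data import ConcatDataset, DataLoader
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])

real = datasets.ImageFolder("data/real", transform=transform)
synthetic = datasets.ImageFolder("data/synthetic", transform=transform)

# Standalone synthetic training would use `synthetic` alone;
# the mixed variant simply concatenates the two datasets.
mixed = ConcatDataset([real, synthetic])
loader = DataLoader(mixed, batch_size=32, shuffle=True)
```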