Commit

Use SequentialSampler if curriculum_sampling is enabled with sample_packing (#2235)
v-dicicco authored and bursteratom committed Jan 12, 2025
1 parent e0d4b88 commit da97a21
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion src/axolotl/core/trainer_builder.py
@@ -608,8 +608,14 @@ def _get_train_sampler(self) -> Optional[torch.utils.data.Sampler]:
             self.state.train_batch_size or self.args.per_device_train_batch_size
         )
         batch_max_len = train_batch_size * self.args.max_seq_length
+
+        if self.args.curriculum_sampling:
+            sampler = SequentialSampler(self.train_dataset)
+        else:
+            sampler = RandomSampler(self.train_dataset)
+
         return MultipackBatchSampler(
-            RandomSampler(self.train_dataset),
+            sampler,
             lengths=get_dataset_lengths(self.train_dataset),
             packing_efficiency_estimate=self.args.sample_packing_efficiency,
             batch_max_len=batch_max_len,
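The point of the change above is that curriculum learning depends on the dataset's existing order (e.g. easy examples first), so the batch sampler must be fed indices sequentially rather than shuffled. A minimal pure-Python sketch of that sampler choice follows; `sequential_sampler` and `random_sampler` are hypothetical stand-ins mirroring the behavior of `torch.utils.data.SequentialSampler` and `RandomSampler`, not axolotl code.

```python
import random

def sequential_sampler(dataset):
    # Yields indices in dataset order, like torch.utils.data.SequentialSampler.
    # This preserves any curriculum ordering baked into the dataset.
    yield from range(len(dataset))

def random_sampler(dataset, seed=None):
    # Yields a random permutation of indices, like torch.utils.data.RandomSampler.
    indices = list(range(len(dataset)))
    random.Random(seed).shuffle(indices)
    yield from indices

# Mirrors the branch introduced in the commit: pick the sampler based on
# a curriculum_sampling flag before handing it to the batch sampler.
dataset = ["easy", "medium", "hard"]
curriculum_sampling = True
sampler = sequential_sampler(dataset) if curriculum_sampling else random_sampler(dataset)
print(list(sampler))  # [0, 1, 2] when curriculum_sampling is True
```

In the real trainer, whichever sampler is chosen is passed as the first argument to `MultipackBatchSampler`, which then packs samples into batches while respecting the index order it receives.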
