Merge pull request #612 from bghira/main
merge
bghira authored Aug 2, 2024
2 parents 76926c4 + bdd222e commit 1d4e6d6
Showing 1 changed file with 5 additions and 1 deletion.
6 changes: 5 additions & 1 deletion helpers/data_backend/factory.py
@@ -247,7 +247,11 @@ def configure_parquet_database(backend: dict, args, data_backend: BaseDataBacken

     bytes_string = data_backend.read(parquet_path)
     pq = io.BytesIO(bytes_string)
-    df = pd.read_parquet(pq)
+    if parquet_path.endswith(".jsonl"):
+        df = pd.read_json(pq, lines=True)
+    else:
+        df = pd.read_parquet(pq)
+
     caption_column = parquet_config.get(
         "caption_column", args.parquet_caption_column or "description"
     )
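
The change in this hunk can be tried in isolation. The following is a minimal sketch, not the repository's code: `load_metadata_frame`, the sample bytes, and the `captions.jsonl` path are hypothetical names made up for illustration. It assumes the raw bytes have already been read from the backend, as `data_backend.read(parquet_path)` does above, and picks the pandas reader from the path suffix.

```python
import io

import pandas as pd


def load_metadata_frame(raw: bytes, path: str) -> pd.DataFrame:
    """Hypothetical helper: parse raw bytes as JSON Lines or Parquet, chosen by path suffix."""
    buffer = io.BytesIO(raw)
    if path.endswith(".jsonl"):
        # JSON Lines: one JSON object per line, so pandas needs lines=True.
        return pd.read_json(buffer, lines=True)
    # Every other suffix falls through to the Parquet reader, as in the diff.
    return pd.read_parquet(buffer)


# Illustrative data only; the column names are assumptions, not the project's schema.
sample = (
    b'{"filename": "a.png", "description": "a cat"}\n'
    b'{"filename": "b.png", "description": "a dog"}\n'
)
frame = load_metadata_frame(sample, "captions.jsonl")
```

Note that `str.endswith` is case-sensitive, so in this sketch a path ending in `.JSONL` would be handed to the Parquet reader; whether that matters depends on how the metadata files are named.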