Preprocessing of Encoder data #12

hsuanchia opened this issue Apr 28, 2021 · 3 comments

@hsuanchia
Owner

hsuanchia commented Apr 28, 2021

Batch sizes: 500, 1000, 2000, 5000, 10000, 20000, 30000
For each set of data, please give me:

  1. feature map (None, 14, 14, 512)
  2. img_name or id: anything that lets me know which image corresponds to which caption
    File format: preferably tf.data.Dataset; fallback: JSON or pickle
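
To make the requested format concrete, here is a minimal sketch of how each record could pair an image id with its feature map and, optionally, be wrapped as a `tf.data.Dataset`. The names `iter_records`, `as_tf_dataset`, and `make_dummy_feature_map` are hypothetical, the ids are made up, and the feature maps are placeholder nested lists rather than real encoder output. This is one possible layout, not necessarily what the project ended up using.

```python
from typing import Iterator, List, Sequence, Tuple

# Per-image shape taken from the request above: (None, 14, 14, 512) minus the batch axis.
FEATURE_SHAPE = (14, 14, 512)

# One record = (image id, feature map as nested lists). Both are illustrative types.
Record = Tuple[str, list]


def make_dummy_feature_map(fill: float = 0.0) -> list:
    """Build a placeholder feature map of FEATURE_SHAPE (stand-in for encoder output)."""
    height, width, channels = FEATURE_SHAPE
    return [[[fill] * channels for _ in range(width)] for _ in range(height)]


def iter_records(image_ids: Sequence[str], feature_maps: Sequence[list]) -> Iterator[Record]:
    """Yield (image_id, feature_map) pairs so each map stays tied to its image."""
    if len(image_ids) != len(feature_maps):
        raise ValueError("image_ids and feature_maps must have the same length")
    for image_id, feature_map in zip(image_ids, feature_maps):
        yield image_id, feature_map


def as_tf_dataset(image_ids: Sequence[str], feature_maps: Sequence[list]):
    """Wrap the records as a tf.data.Dataset (assumes TensorFlow 2.4+ is installed)."""
    import tensorflow as tf  # imported lazily so the rest works without TensorFlow

    return tf.data.Dataset.from_generator(
        lambda: iter_records(image_ids, feature_maps),
        output_signature=(
            tf.TensorSpec(shape=(), dtype=tf.string),
            tf.TensorSpec(shape=FEATURE_SHAPE, dtype=tf.float32),
        ),
    )
```

With this layout the caption side only needs the image id to look up its feature map, and calling `.batch(n)` on the dataset would produce the `(None, 14, 14, 512)` shape.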
@hsuanchia
Owner Author

Or you could give me more than 30000 samples, say 40000 or 50000, and I will decide for myself how many to use for training. That way I can also shuffle them.
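
As a rough illustration of picking a training subset from a larger pool and shuffling it, something like the following could work. The function name `sample_training_ids` and the fixed seed are assumptions for the sketch; the actual selection logic was not specified in this thread.

```python
import random
from typing import List, Sequence


def sample_training_ids(image_ids: Sequence[str], n: int, seed: int = 0) -> List[str]:
    """Shuffle the available ids reproducibly and keep the first n for training."""
    if n > len(image_ids):
        raise ValueError(f"requested {n} ids but only {len(image_ids)} are available")
    rng = random.Random(seed)      # local RNG so the global random state is untouched
    shuffled = list(image_ids)     # copy, so the caller's sequence is not reordered
    rng.shuffle(shuffled)
    return shuffled[:n]
```

For example, with a pool of 40000 ids, `sample_training_ids(pool, 30000)` would return 30000 distinct ids in shuffled order, and the same seed gives the same selection on every run.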

snsd0805 added a commit that referenced this issue May 2, 2021
@snsd0805
Collaborator

snsd0805 commented May 2, 2021

@hsuanchia None of our cloud drives has the train_2017 pictures yet.

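So I ran the feature-map predict directly on my own computer. At around 10000 images, saving to or loading from the pkl file exceeds the available memory, so for now I have pushed what I have up to this stage.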
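
One common way around this kind of memory limit is to write the records in several smaller pickle files instead of one large one. The sketch below is only an assumption about how that could look: `write_shards`, `read_shards`, the `features_00000.pkl` naming, and the shard size are all hypothetical and not taken from the referenced commit.

```python
import pickle
from pathlib import Path
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[str, list]  # (image id, feature map), illustrative type only


def _dump_shard(buffer: List[Record], out_dir: Path, index: int) -> Path:
    """Write one shard to disk and return its path."""
    path = out_dir / f"features_{index:05d}.pkl"
    with path.open("wb") as f:
        pickle.dump(buffer, f, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def write_shards(records: Iterable[Record], out_dir, shard_size: int = 1000) -> List[Path]:
    """Consume records lazily, holding at most shard_size of them in memory at once."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    buffer: List[Record] = []
    for record in records:
        buffer.append(record)
        if len(buffer) == shard_size:
            paths.append(_dump_shard(buffer, out_dir, len(paths)))
            buffer = []
    if buffer:  # final, possibly smaller shard
        paths.append(_dump_shard(buffer, out_dir, len(paths)))
    return paths


def read_shards(paths: Iterable[Path]) -> Iterator[Record]:
    """Yield records back one shard at a time, in the order they were written."""
    for path in paths:
        with open(path, "rb") as f:
            yield from pickle.load(f)
```

If `records` is a generator that runs the encoder batch by batch, peak memory is bounded by roughly one shard rather than the whole dataset.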

Also, I have not wrapped the data with TensorFlow's dataset package yet, because I think we still need to discuss the train data and the target.

jiazheng0609 pushed a commit that referenced this issue May 2, 2021
@hsuanchia
Owner Author

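My cloud drive actually does have train_2017.zip, but I cannot unzip it on Colab and save it back to the drive because there is not enough RAM. I also cannot download train_2017.zip, since I cannot leave my computer running the whole time for the download. So I may need to ask you to upload train_2017 to the cloud and share it with us. Alternatively, once we agree on the data format, you handle all the image processing and just give me the feature maps, as we are doing now.
Also, if we want to do validation, the 5000 val_2017 images will need to be processed by you as well.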
