Yiyuan Zhang<sup>1,2*</sup>, Kaixiong Gong<sup>1,2*</sup>, Kaipeng Zhang<sup>2,✉</sup>
After obtaining the token sequence, we employ a modality-shared encoder to extract representations from the different modalities.
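
As a rough sketch of this data-to-sequence idea (the tokenizer modules, input shapes, and the `DIM` constant below are illustrative assumptions, not the repository's API), each modality is first mapped by its own lightweight tokenizer to tokens of the shared width, and the concatenated sequence is what the modality-shared encoder consumes; the pretrained encoder itself is available from the Model Zoo below.

```python
import torch
import torch.nn as nn

DIM = 768  # shared token width of the base-scale encoder

# Illustrative per-modality tokenizers: anything that maps raw input to (batch, num_tokens, DIM).
image_tokenizer = nn.Sequential(nn.Conv2d(3, DIM, kernel_size=16, stride=16), nn.Flatten(2))
audio_tokenizer = nn.Linear(128, DIM)  # e.g. 128-bin spectrogram frames -> tokens

image = torch.randn(1, 3, 224, 224)  # dummy image
audio = torch.randn(1, 50, 128)      # dummy spectrogram with 50 frames

image_tokens = image_tokenizer(image).transpose(1, 2)   # (1, 196, 768)
audio_tokens = audio_tokenizer(audio)                    # (1, 50, 768)

# The modality-shared encoder sees a single concatenated token sequence.
tokens = torch.cat([image_tokens, audio_tokens], dim=1)  # (1, 246, 768)
```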
# 🔓 Model Zoo
## Open-source Modality-Agnostic Models
| Model | Pretraining | Scale | #Params | Download |
| :------------------: | :---------: | :---: | :-----: | :---------------------------------------------------------------------------------------------------: |
| Meta-Transformer-B16 | LAION-2B | Base | 85M | [Google Drive](https://drive.google.com/file/d/19ahcN2QKknkir_bayhTW5rucuAiX0OXq/view?usp=sharing) |
| Meta-Transformer-L14 | LAION-2B | Large | 302M | [Google Drive](https://drive.google.com/file/d/15EtzCBAQSqmelhdLz6k880A19_RpcX9B/view?usp=drive_link) |
## Demo: Using the Pretrained Encoder
```python
import torch
import torch.nn as nn
from timm.models.vision_transformer import Block  # standard pre-norm ViT block

# Load the downloaded base-scale weights (adjust the filename to your download).
ckpt = torch.load("Meta-Transformer_base_patch16_encoder.pth")
encoder = nn.Sequential(*[
    Block(dim=768, num_heads=12, mlp_ratio=4., qkv_bias=True, norm_layer=nn.LayerNorm)
    for i in range(12)])  # 12 blocks of width 768 for the Base (85M) encoder
encoder.load_state_dict(ckpt, strict=True)
```
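
As a quick sanity check (the shapes below are illustrative; any `(batch, num_tokens, 768)` float tensor works for the base-scale encoder), you can run a dummy token sequence through the loaded encoder:

```python
tokens = torch.randn(1, 196, 768)   # e.g. 196 patch tokens of width 768
with torch.no_grad():
    features = encoder(tokens)      # same shape: modality-agnostic token features
print(features.shape)               # torch.Size([1, 196, 768])
```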
# 🕙 ToDo
- [ ] Meta-Transformer with Large Language Models.