diff --git a/reading-notes/conference/icml-2024.md b/reading-notes/conference/icml-2024.md
index 3492dce..5607808 100644
--- a/reading-notes/conference/icml-2024.md
+++ b/reading-notes/conference/icml-2024.md
@@ -6,9 +6,25 @@ Homepage: [https://icml.cc/Conferences/2024](https://icml.cc/Conferences/2024)
 
 ### Papers
 
-### Large Language Models (LLMs)
+### Serving Large Language Models (LLMs)
 
 * HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment \[[Personal Notes](../miscellaneous/arxiv/2023/hexgen.md)] \[[arXiv](https://arxiv.org/abs/2311.11514)] \[[Code](https://github.com/Relaxed-System-Lab/HexGen)]
   * HKUST & ETH & CMU
-  * Support _asymmetric_ tensor model parallelism and pipeline parallelism under the _heterogeneous_ setting (i.e., each pipeline parallel stage can be assigned a different number of layers and a different tensor model parallel degree).
-  * Propose _a heuristic-based evolutionary algorithm_ to search for the optimal layout.
+  * Support _asymmetric_ tensor model parallelism and pipeline parallelism under the _heterogeneous_ setting (i.e., each pipeline parallel stage can be assigned a different number of layers and a different tensor model parallel degree).
+  * Propose _a heuristic-based evolutionary algorithm_ to search for the optimal layout.
+
+### Multimodality
+
+* Video generation
+  * VideoPoet: A Large Language Model for Zero-Shot Video Generation \[[Paper](https://proceedings.mlr.press/v235/kondratyuk24a.html)] \[[Homepage](https://sites.research.google/videopoet/)]
+    * Google & CMU
+    * Employ a decoder-only transformer architecture that processes multimodal inputs, including images, videos, text, and audio.
+    * The pre-trained LLM is adapted to a range of video generation tasks.
+* Image retrieval
+  * MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions \[[Paper](https://proceedings.mlr.press/v235/zhang24an.html)] \[[Homepage](https://open-vision-language.github.io/MagicLens/)] \[[Code](https://github.com/google-deepmind/magiclens)]
+    * OSU & Google DeepMind
+    * Enable multimodality-to-image, image-to-image, and text-to-image retrieval.
+
+## References
+
+* [Google DeepMind at ICML 2024, 2024/07/19](https://deepmind.google/discover/blog/google-deepmind-at-icml-2024/)