GITBOOK-183: Update ICML '24 papers
mental2008 authored and gitbook-bot committed Jul 22, 2024
1 parent 1835d45 commit b4f2f55
Showing 1 changed file: reading-notes/conference/icml-2024.md (19 additions, 3 deletions)
Homepage: [https://icml.cc/Conferences/2024](https://icml.cc/Conferences/2024)

### Papers

### Serving Large Language Models (LLMs)

* HexGen: Generative Inference of Foundation Model over Heterogeneous Decentralized Environment \[[Personal Notes](../miscellaneous/arxiv/2023/hexgen.md)] \[[arXiv](https://arxiv.org/abs/2311.11514)] \[[Code](https://github.com/Relaxed-System-Lab/HexGen)]
* HKUST & ETH & CMU
  * Support _asymmetric_ tensor model parallelism and pipeline parallelism in the _heterogeneous_ setting (i.e., each pipeline parallel stage can be assigned a different number of layers and a different tensor model parallel degree).
* Propose _a heuristic-based evolutionary algorithm_ to search for the optimal layout.
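The layout search described above can be illustrated with a toy evolutionary loop. This is a minimal sketch under invented assumptions, not HexGen's actual algorithm: the cost model (slowest-stage time, inversely proportional to tensor-parallel degree and a per-stage `stage_speed` factor), the mutation operators, and all parameter values are made up for illustration.

```python
import random

def make_layout(total_layers, num_stages, tp_choices):
    # A layout is one (num_layers, tp_degree) pair per pipeline stage;
    # layers are split by random cut points so they sum to total_layers.
    cuts = sorted(random.sample(range(1, total_layers), num_stages - 1))
    bounds = [0] + cuts + [total_layers]
    return [(bounds[i + 1] - bounds[i], random.choice(tp_choices))
            for i in range(num_stages)]

def cost(layout, stage_speed):
    # Toy cost: pipeline throughput is limited by its slowest stage;
    # more layers raise stage time, higher TP degree and faster hardware lower it.
    return max(layers / (tp * speed)
               for (layers, tp), speed in zip(layout, stage_speed))

def mutate(layout, tp_choices):
    # Either re-sample one stage's TP degree, or shift one layer to a neighbour.
    new = list(layout)
    i = random.randrange(len(new))
    if random.random() < 0.5:
        layers, _ = new[i]
        new[i] = (layers, random.choice(tp_choices))
    else:
        j = (i + 1) % len(new)
        if new[i][0] > 1:
            new[i] = (new[i][0] - 1, new[i][1])
            new[j] = (new[j][0] + 1, new[j][1])
    return new

def evolve(total_layers, stage_speed, tp_choices, generations=300, seed=0):
    # Greedy (1+1)-style evolutionary search: keep a mutant only if it is cheaper.
    random.seed(seed)
    best = make_layout(total_layers, len(stage_speed), tp_choices)
    for _ in range(generations):
        cand = mutate(best, tp_choices)
        if cost(cand, stage_speed) < cost(best, stage_speed):
            best = cand
    return best, cost(best, stage_speed)
```

Under this toy cost model, the search tends to give slower stages fewer layers or higher TP degrees, which is the intuition behind asymmetric heterogeneous layouts.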

### Multimodality

* Video generation
* VideoPoet: A Large Language Model for Zero-Shot Video Generation \[[Paper](https://proceedings.mlr.press/v235/kondratyuk24a.html)] \[[Homepage](https://sites.research.google/videopoet/)]
* Google & CMU
  * Employ a decoder-only transformer architecture that processes multimodal inputs, including images, videos, text, and audio.
* The pre-trained LLM is adapted to a range of video generation tasks.
* Image retrieval
* MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions \[[Paper](https://proceedings.mlr.press/v235/zhang24an.html)] \[[Homepage](https://open-vision-language.github.io/MagicLens/)] \[[Code](https://github.com/google-deepmind/magiclens)]
* OSU & Google DeepMind
* Enable multimodality-to-image, image-to-image, and text-to-image retrieval.
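The retrieval modes above all reduce to nearest-neighbour search in a shared embedding space once the model has encoded the query (image, instruction, or both) and the candidate images. A minimal sketch of that final ranking step, assuming the embeddings are already given (the vectors and IDs below are invented; producing the joint query embedding is the model's job and is not shown):

```python
import math

def normalize(v):
    # Scale a vector to unit length so dot product equals cosine similarity.
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def cosine(a, b):
    return sum(x * y for x, y in zip(a, b))

def retrieve(query_emb, index, top_k=5):
    # index: list of (image_id, embedding) pairs; rank by cosine similarity.
    q = normalize(query_emb)
    scored = sorted(((cosine(q, normalize(e)), i) for i, e in index),
                    reverse=True)
    return [i for _, i in scored[:top_k]]
```

Whether the query embedding came from an image, a text instruction, or a fused image-plus-instruction input, the ranking step is identical, which is what lets one index serve multimodality-to-image, image-to-image, and text-to-image retrieval.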

## References

* [Google DeepMind at ICML 2024, 2024/07/19](https://deepmind.google/discover/blog/google-deepmind-at-icml-2024/)
