GITBOOK-190: Create new paper lists to organize AI papers (diffusion models, language models, DLRMs)
mental2008 authored and gitbook-bot committed Aug 17, 2024
1 parent bc236ef commit 774a0bd
Showing 8 changed files with 117 additions and 30 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ Specifically, I have a broad interest in systems (e.g., OSDI, SOSP, NSDI, ATC, E

## Changelogs

* 08/2024: Update the reading notes of [SIGCOMM 2024](reading-notes/conference/sigcomm-2024.md).
* 08/2024: Update the reading notes of [SIGCOMM 2024](reading-notes/conference/sigcomm-2024.md); create a new paper list of [diffusion models](paper-list/artificial-intelligence/diffusion-models.md), [language models](paper-list/artificial-intelligence/language-models.md), and [deep learning recommendation models](paper-list/artificial-intelligence/dlrm.md).
* 07/2024: Organize the papers of [SIGCOMM 2024](reading-notes/conference/sigcomm-2024.md), [ICML 2024](reading-notes/conference/icml-2024.md), [ATC 2024](reading-notes/conference/atc-2024.md), [OSDI 2024](reading-notes/conference/osdi-2024.md), [NSDI 2024](reading-notes/conference/nsdi-2024.md), [CVPR 2024](reading-notes/conference/cvpr-2024.md), [ISCA 2024](reading-notes/conference/isca-2024.md); create a new paper list of [Systems for diffusion models](paper-list/systems-for-ml/diffusion-models.md); update the paper list of [Systems for LLMs](paper-list/systems-for-ml/llm.md), [Systems for DLRMs](paper-list/systems-for-ml/dlrm.md), [Resource Scheduler](paper-list/systems-for-ml/resource-scheduler.md).

## Epilogue
4 changes: 4 additions & 0 deletions SUMMARY.md
@@ -19,6 +19,10 @@
  * [Deep Learning Framework](paper-list/systems-for-ml/deep-learning-framework.md)
  * [Cloud-Edge Collaboration](paper-list/systems-for-ml/cloud-edge-collaboration.md)
* [ML for Systems](paper-list/ml-for-systems.md)
* [Artificial Intelligence (AI)](paper-list/artificial-intelligence/README.md)
  * [Diffusion Models](paper-list/artificial-intelligence/diffusion-models.md)
  * [Language Models](paper-list/artificial-intelligence/language-models.md)
  * [Deep Learning Recommendation Model (DLRM)](paper-list/artificial-intelligence/dlrm.md)
* [Hardware Virtualization](paper-list/hardware-virtualization/README.md)
  * [GPU Sharing](paper-list/hardware-virtualization/gpu-sharing.md)
* [Resource Disaggregation](paper-list/resource-disaggregation/README.md)
5 changes: 5 additions & 0 deletions paper-list/artificial-intelligence/README.md
@@ -0,0 +1,5 @@
# Artificial Intelligence (AI)

* [Diffusion Models](diffusion-models.md)
* [Language Models](language-models.md)
* [Deep Learning Recommendation Models](dlrm.md)
60 changes: 60 additions & 0 deletions paper-list/artificial-intelligence/diffusion-models.md
@@ -0,0 +1,60 @@
# Diffusion Models

## Image Generation

### Diffusion Transformer (DiT)

* FLUX.1 \[[Code](https://github.com/black-forest-labs/flux)]
  * Black Forest Labs
  * Text-to-image generation
  * Models
    * FLUX.1-dev: [https://huggingface.co/black-forest-labs/FLUX.1-dev](https://huggingface.co/black-forest-labs/FLUX.1-dev)
    * FLUX.1-schnell: [https://huggingface.co/black-forest-labs/FLUX.1-schnell](https://huggingface.co/black-forest-labs/FLUX.1-schnell)
* Scaling Rectified Flow Transformers for High-Resolution Image Synthesis (arXiv:2403.03206) \[[arXiv](https://arxiv.org/abs/2403.03206)] \[[Blog](https://stability.ai/news/stable-diffusion-3)]
  * Stability AI
  * **Stable Diffusion 3 (SD3)**
  * Multimodal Diffusion Transformer (MMDiT)
  * Models
    * Stable Diffusion 3 Medium: [https://huggingface.co/stabilityai/stable-diffusion-3-medium](https://huggingface.co/stabilityai/stable-diffusion-3-medium)
* Scalable Diffusion Models with Transformers (ICCV 2023) \[[arXiv](https://arxiv.org/abs/2212.09748)] \[[Paper](https://openaccess.thecvf.com/content/ICCV2023/html/Peebles\_Scalable\_Diffusion\_Models\_with\_Transformers\_ICCV\_2023\_paper.html)] \[[Code](https://github.com/facebookresearch/DiT)] \[[Homepage](https://www.wpeebles.com/DiT)]
  * UC Berkeley & NYU
  * **DiT**

### UNet

* Kolors: Effective Training of Diffusion Model for Photorealistic Text-to-Image Synthesis \[[Technical Report](https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors\_paper.pdf)]
  * Kuaishou Kolors
  * Text-to-image generation
  * Model: [https://huggingface.co/Kwai-Kolors/Kolors](https://huggingface.co/Kwai-Kolors/Kolors)
* SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis (arXiv:2307.01952) \[[arXiv](https://arxiv.org/abs/2307.01952)]
  * Stability AI
  * Models
    * [https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0)
    * [https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0](https://huggingface.co/stabilityai/stable-diffusion-xl-refiner-1.0)
* High-Resolution Image Synthesis with Latent Diffusion Models (CVPR 2022) \[[Paper](https://openaccess.thecvf.com/content/CVPR2022/html/Rombach\_High-Resolution\_Image\_Synthesis\_With\_Latent\_Diffusion\_Models\_CVPR\_2022\_paper)] \[[arXiv](https://arxiv.org/abs/2112.10752)] \[[Code](https://github.com/CompVis/stable-diffusion)]
  * LMU Munich & Runway ML
  * Latent Diffusion Models (LDMs)
  * Models
    * Stable-Diffusion-v1-5: [https://huggingface.co/runwayml/stable-diffusion-v1-5](https://huggingface.co/runwayml/stable-diffusion-v1-5)
      * Initialized with the weights of the **Stable-Diffusion-v1-2** checkpoint and subsequently fine-tuned on 595k steps at resolution 512x512.

## Video Generation

* Stable Video 4D (SV4D)
  * Stability AI
  * Model: [https://huggingface.co/stabilityai/sv4d](https://huggingface.co/stabilityai/sv4d)
    * Generate **40** frames (5 video frames x 8 camera views) at 576x576 resolution, given 5 reference frames of the same size.
* Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets (arXiv:2311.15127) \[[arXiv](https://arxiv.org/abs/2311.15127)] \[[Blog](https://stability.ai/news/stable-video-diffusion-open-ai-video-model)]
  * Stability AI
  * **Stable Video Diffusion** (SVD)
  * Text-to-video and image-to-video generation
  * Models
    * [https://huggingface.co/stabilityai/stable-video-diffusion-img2vid](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid)
      * Generate **14** frames at resolution **576x1024** given a context frame of the same size.
    * [https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt](https://huggingface.co/stabilityai/stable-video-diffusion-img2vid-xt)
      * Fine-tuned from the SVD-img2vid.
      * Generate **25** frames at resolution **576x1024** given a context frame of the same size.

## Acronyms

* LLM: Large Language Model
10 changes: 10 additions & 0 deletions paper-list/artificial-intelligence/dlrm.md
@@ -0,0 +1,10 @@
# Deep Learning Recommendation Model (DLRM)

* Efficient Long Sequential User Data Modeling for Click-Through Rate Prediction (DLP-KDD 2022) \[[Paper](https://arxiv.org/abs/2209.12212)]
  * Alibaba
  * ETA: _Efficient target attention_ network
  * Locality-sensitive hashing
  * Deployed on Taobao.
* Wide & Deep Learning for Recommender Systems (DLRS 2016) \[[Personal Notes](../../reading-notes/miscellaneous/arxiv/2016/wide-and-deep-learning-for-recommender-systems.md)] \[[Paper](https://dl.acm.org/doi/10.1145/2988450.2988454)]
  * Google
  * WDL: Wide & Deep model
37 changes: 37 additions & 0 deletions paper-list/artificial-intelligence/language-models.md
@@ -0,0 +1,37 @@
# Language Models

* Grok-2 \[[Blog](https://x.ai/blog/grok-2)]
  * xAI
  * Grok-2 Beta was released on 2024/08/13.
* Gemma 2: Improving Open Language Models at a Practical Size (arXiv:2408.00118) \[[arXiv](https://arxiv.org/abs/2408.00118)] \[[Code](https://github.com/google-deepmind/gemma)]
  * Gemma Team, Google DeepMind
  * **Gemma 2**
  * Models: [https://www.kaggle.com/models/google/gemma](https://www.kaggle.com/models/google/gemma)
* The Llama 3 Herd of Models (arXiv:2407.21783) \[[arXiv](https://arxiv.org/abs/2407.21783)] \[[Blog](https://ai.meta.com/blog/meta-llama-3/)] \[[Code](https://github.com/meta-llama/llama3)]
  * Meta AI
  * **Llama 3**
  * Models
    * Llama 3 8B: [https://huggingface.co/meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B)
    * Llama 3 70B
    * Llama 3 405B
* Mixtral 8x7B (arXiv:2401.04088) \[[arXiv](https://arxiv.org/abs/2401.04088)] \[[Blog](https://mistral.ai/news/mixtral-of-experts/)] \[[Code](https://github.com/mistralai/mistral-inference)]
  * Mistral AI
  * **Mixtral 8x7B**
  * Model: [https://huggingface.co/mistralai/Mixtral-8x7B-v0.1](https://huggingface.co/mistralai/Mixtral-8x7B-v0.1)
* Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv:2307.09288) \[[Paper](https://arxiv.org/abs/2307.09288)] \[[Homepage](https://ai.meta.com/llama/)]
  * Meta AI
  * **Llama 2**
  * Released with a _permissive_ community license and is available for commercial use.
* LLaMA: Open and Efficient Foundation Language Models (arXiv:2302.13971) \[[Paper](https://arxiv.org/abs/2302.13971)] \[[Code](https://github.com/facebookresearch/llama)]
  * Meta AI
  * **6.7B, 13B, 32.5B, 65.2B**
  * Open-access
* PaLM: Scaling Language Modeling with Pathways (JMLR 2023) \[[Paper](https://www.jmlr.org/papers/v24/22-1144.html)] \[[PaLM API](https://developers.googleblog.com/2023/03/announcing-palm-api-and-makersuite.html)]
  * **540B**; open access to PaLM APIs in March 2023.
* BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (arXiv:2211.05100) \[[Paper](https://arxiv.org/abs/2211.05100)] \[[Model](https://huggingface.co/bigscience/bloom)] \[[Blog](https://bigscience.huggingface.co/blog/bloom)]
  * **176B**
  * Open-access
* OPT: Open Pre-trained Transformer Language Models (arXiv:2205.01068) \[[Paper](https://arxiv.org/abs/2205.01068)] \[[Code](https://github.com/facebookresearch/metaseq/tree/main/projects/OPT)]
  * Meta AI
  * Range from 125M to 175B parameters.
  * Open-access
11 changes: 0 additions & 11 deletions paper-list/systems-for-ml/dlrm.md
@@ -41,17 +41,6 @@
* Tencent & Edinburgh
* P2P model update dissemination.

## DLRM

* Efficient Long Sequential User Data Modeling for Click-Through Rate Prediction (DLP-KDD 2022) \[[Paper](https://arxiv.org/abs/2209.12212)]
  * Alibaba
  * ETA: _Efficient target attention_ network
  * Locality-sensitive hashing
  * Deployed on Taobao.
* Wide & Deep Learning for Recommender Systems (DLRS 2016) \[[Personal Notes](../../reading-notes/miscellaneous/arxiv/2016/wide-and-deep-learning-for-recommender-systems.md)] \[[Paper](https://dl.acm.org/doi/10.1145/2988450.2988454)]
  * Google
  * WDL: Wide & Deep model

## Acronyms

* DLRM: Deep Learning Recommendation Model
18 changes: 0 additions & 18 deletions paper-list/systems-for-ml/llm.md
@@ -145,24 +145,6 @@ I am actively maintaining this list.
* PUZZLE: Efficiently Aligning Large Language Models through Light-Weight Context Switch ([ATC 2024](../../reading-notes/conference/atc-2024.md)) \[[Paper](https://www.usenix.org/conference/atc24/presentation/lei)]
  * THU

## LLMs

* Llama 2: Open Foundation and Fine-Tuned Chat Models (arXiv:2307.09288) \[[Paper](https://arxiv.org/abs/2307.09288)] \[[Homepage](https://ai.meta.com/llama/)]
  * Released with a _permissive_ community license and is available for commercial use.
* LLaMA: Open and Efficient Foundation Language Models (arXiv:2302.13971) \[[Paper](https://arxiv.org/abs/2302.13971)] \[[Code](https://github.com/facebookresearch/llama)]
  * Meta AI
  * **6.7B, 13B, 32.5B, 65.2B**
  * Open-access
* PaLM: Scaling Language Modeling with Pathways (JMLR 2023) \[[Paper](https://www.jmlr.org/papers/v24/22-1144.html)] \[[PaLM API](https://developers.googleblog.com/2023/03/announcing-palm-api-and-makersuite.html)]
  * **540B**; open access to PaLM APIs in March 2023.
* BLOOM: A 176B-Parameter Open-Access Multilingual Language Model (arXiv:2211.05100) \[[Paper](https://arxiv.org/abs/2211.05100)] \[[Model](https://huggingface.co/bigscience/bloom)] \[[Blog](https://bigscience.huggingface.co/blog/bloom)]
  * **176B**
  * Open-access
* OPT: Open Pre-trained Transformer Language Models (arXiv:2205.01068) \[[Paper](https://arxiv.org/abs/2205.01068)] \[[Code](https://github.com/facebookresearch/metaseq/tree/main/projects/OPT)]
  * Meta AI
  * Range from 125M to 175B parameters.
  * Open-access

## Acronyms

* LLM: Large Language Model