GITBOOK-193: Organize SOSP 24 papers
mental2008 authored and gitbook-bot committed Sep 21, 2024
1 parent 5f73757 commit d32256d
Showing 4 changed files with 63 additions and 2 deletions.
1 change: 1 addition & 0 deletions README.md
@@ -18,6 +18,7 @@ Specifically, I have a broad interest in systems (e.g., OSDI, SOSP, NSDI, ATC, E

## Changelogs

* 09/2024: Organize the papers of [SOSP 2024](reading-notes/conference/sosp-2024.md).
* 08/2024: Organize the papers of [VLDB 2024](reading-notes/conference/vldb-2024.md); update the reading notes of [SIGCOMM 2024](reading-notes/conference/sigcomm-2024.md); create new paper lists of [diffusion models](paper-list/artificial-intelligence/diffusion-models.md), [language models](paper-list/artificial-intelligence/language-models.md), and [deep learning recommendation models](paper-list/artificial-intelligence/dlrm.md).
* 07/2024: Organize the papers of [SIGCOMM 2024](reading-notes/conference/sigcomm-2024.md), [ICML 2024](reading-notes/conference/icml-2024.md), [ATC 2024](reading-notes/conference/atc-2024.md), [OSDI 2024](reading-notes/conference/osdi-2024.md), [NSDI 2024](reading-notes/conference/nsdi-2024.md), [CVPR 2024](reading-notes/conference/cvpr-2024.md), [ISCA 2024](reading-notes/conference/isca-2024.md); create a new paper list of [systems for diffusion models](paper-list/systems-for-ml/diffusion-models.md); update the paper list of [systems for LLMs](paper-list/systems-for-ml/llm.md), [systems for DLRMs](paper-list/systems-for-ml/dlrm.md), and [resource scheduler](paper-list/systems-for-ml/resource-scheduler.md).

1 change: 1 addition & 0 deletions SUMMARY.md
@@ -40,6 +40,7 @@
## Reading Notes

* [Conference](reading-notes/conference/README.md)
* [SOSP 2024](reading-notes/conference/sosp-2024.md)
* [VLDB 2024](reading-notes/conference/vldb-2024.md)
* [SIGCOMM 2024](reading-notes/conference/sigcomm-2024.md)
* [ICML 2024](reading-notes/conference/icml-2024.md)
4 changes: 2 additions & 2 deletions reading-notes/conference/README.md
@@ -6,8 +6,8 @@
| :-----------------------------: | :----------------: | ------------------------------------------------------ | :-------------------------------------------: |
| SoCC 2024 | Nov 22-24, 2024 | Seattle, Washington, USA | **Upcoming** |
| SC 2024 | Nov 17-22, 2024 | Atlanta, GA, USA | **Upcoming** |
-| SOSP 2024 | Nov 4-6, 2024 | Hilton Austin, Texas, USA | **Upcoming** |
-| [VLDB 2024](vldb-2024.md) | Aug 26-30, 2024 | Guangzhou, China | **Upcoming** |
+| [SOSP 2024](sosp-2024.md) | Nov 4-6, 2024 | Hilton Austin, Texas, USA | **Upcoming** |
+| [VLDB 2024](vldb-2024.md) | Aug 26-30, 2024 | Guangzhou, China | 🧐 |
| [SIGCOMM 2024](sigcomm-2024.md) | Aug 4-8, 2024 | Sydney, Australia | 🧐 |
| [ICML 2024](icml-2024.md) | Jul 21-27, 2024 | Messe Wien Exhibition Congress Center, Vienna, Austria | |
| [ATC 2024](atc-2024.md) | Jul 10-12, 2024 | Santa Clara, CA, USA | 🧐; co-located with [OSDI 2024](osdi-2024.md) |
59 changes: 59 additions & 0 deletions reading-notes/conference/sosp-2024.md
@@ -0,0 +1,59 @@
# SOSP 2024

## Meta Info

Homepage: [https://sigops.org/s/conferences/sosp/2024/](https://sigops.org/s/conferences/sosp/2024/)

## Papers

### Large Language Models (LLMs)

* LLM Training
* Enabling Parallelism Hot Switching for Efficient Training of Large Language Models
* PKU
* Perseus: Removing Energy Bloat from Large Model Training \[[arXiv](https://arxiv.org/abs/2312.06902)]
* UMich
* Use a graph cut-based algorithm to obtain the "iteration time-energy" Pareto frontier; schedule the energy consumption across time; see the Pareto sketch after this list.
* LLM Inference
* LoongServe: Efficiently Serving Long-context Large Language Models with Elastic Sequence Parallelism \[[arXiv](https://arxiv.org/abs/2404.09526)]
* PKU
* ESP: Elastic Sequence Parallelism
* Elastically adjust the degree of parallelism in real-time; reduce key-value cache migration overhead and overlap partial decoding communication with computation; reduce key-value cache fragmentation across instances.
* PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU \[[arXiv](https://arxiv.org/abs/2312.12456)]
* SJTU IPADS
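
A minimal Python sketch of the "iteration time-energy" Pareto idea behind Perseus, referenced above. It is illustrative only: the paper derives the frontier with a graph cut-based algorithm over the training computation graph, whereas this toy merely filters candidate plans; all names and numbers below are hypothetical.

```python
# Illustrative sketch of Perseus's "iteration time vs. energy" Pareto idea.
# The real system derives these points with a graph cut-based algorithm over
# the training computation graph; this toy just filters candidate plans.

def pareto_frontier(plans):
    """Keep the plans not dominated in both iteration time and energy."""
    frontier = []
    for t, e in sorted(plans):                   # by time, then energy
        if not frontier or e < frontier[-1][1]:
            frontier.append((t, e))
    return frontier

def pick_plan(frontier, time_budget):
    """Lowest-energy plan whose iteration time fits the budget."""
    feasible = [(t, e) for t, e in frontier if t <= time_budget]
    return min(feasible, key=lambda p: p[1]) if feasible else None

# Hypothetical (iteration_time_ms, energy_J) points for candidate
# per-stage GPU frequency plans.
plans = [(100, 900), (110, 780), (120, 700), (120, 760), (140, 690), (90, 1000)]
print(pick_plan(pareto_frontier(plans), time_budget=115))   # -> (110, 780)
```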

### ML Serving

* Improving DNN Inference Throughput using Practical, Per-Input Compute Adaptation
* GaTech & Princeton
* Apparate: Rethinking Early Exits to Tame Latency-Throughput Tensions in ML Serving \[[arXiv](https://arxiv.org/abs/2312.05385)]
* Princeton & GaTech
* Automatically apply and manage early exits (so certain inputs can return results from intermediate layers) in ML models.
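
A toy Python sketch of the early-exit mechanism that Apparate builds on. Apparate's actual contribution is automatically placing and managing the exit heads ("ramps") and their thresholds; this only shows the basic control flow, and every name and shape here is hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
layers = [lambda x, W=rng.standard_normal((8, 8)): np.tanh(x @ W) for _ in range(4)]
ramps = {1: rng.standard_normal((8, 3)), 3: rng.standard_normal((8, 3))}  # exit heads
final_head = rng.standard_normal((8, 3))

def softmax(z):
    z = np.exp(z - z.max())
    return z / z.sum()

def infer(x, threshold=0.8):
    for i, layer in enumerate(layers):
        x = layer(x)
        if i in ramps:                           # lightweight exit head ("ramp")
            probs = softmax(x @ ramps[i])
            if probs.max() >= threshold:         # confident enough: return early
                return int(probs.argmax()), f"exit@layer{i}"
    probs = softmax(x @ final_head)              # no ramp fired: run the full model
    return int(probs.argmax()), "final"

print(infer(rng.standard_normal(8)))
```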

### Distributed Training

* SlipStream: Adapting Pipelines for Distributed Training of Large DNNs Amid Failures \[[arXiv](https://arxiv.org/abs/2405.14009)]
* Stanford
* Dynamically re-route the work of a failed server to data-parallel peers; execute within bubbles of the original pipeline schedule.
* Tenplex: Dynamic Parallelism for Deep Learning using Parallelizable Tensor Collections \[[arXiv](https://arxiv.org/abs/2312.05181)]
* ICL
* **Tenplex**: a state management library.
* Enable jobs to change the parallelism dynamically.
* PTC: Parallelizable Tensor Collection
* Dataset state
* Model state
* Execute PTC transformations in parallel with minimum data movement between workers.
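
A small Python sketch of the re-partitioning idea behind Tenplex: treat state as sharded tensors with an explicit partitioning and, when the parallelism changes, move only the slices whose owner changes. The real PTC abstraction is far richer; all names here are hypothetical.

```python
import numpy as np

def owners(length, workers):
    """Map each index of a 1-D tensor to a worker under even chunking."""
    bounds = np.linspace(0, length, workers + 1).astype(int)
    out = np.empty(length, dtype=int)
    for w in range(workers):
        out[bounds[w]:bounds[w + 1]] = w
    return out

def repartition_moves(length, old_workers, new_workers):
    """Contiguous (start, stop, src, dst) transfers needed after resizing."""
    old, new = owners(length, old_workers), owners(length, new_workers)
    moves, start = [], 0
    for i in range(1, length + 1):
        if i == length or (old[i], new[i]) != (old[start], new[start]):
            if old[start] != new[start]:         # slice changes owner: must move
                moves.append((start, i, int(old[start]), int(new[start])))
            start = i
    return moves

# Grow a 12-row tensor from 3 to 4 workers: only three slices need to move.
print(repartition_moves(12, 3, 4))   # [(3, 4, 0, 1), (6, 8, 1, 2), (9, 12, 2, 3)]
```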

### ML Compilation

* Scaling Deep Learning Computation over the Inter-Core Connected Intelligence Processor \[[arXiv](https://arxiv.org/abs/2408.04808)]
* UIUC & MSRA
* **T10**, the first DL compiler to exploit the inter-core communication bandwidth and distributed on-chip memory on AI chips (e.g., the Graphcore IPU); a toy compute-shift sketch follows this list.
* SilvanForge: A Schedule-Guided Retargetable Compiler for Decision Tree Inference
* IISc
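
A toy "compute-shift"-style sketch in the spirit of T10, referenced above: cores keep operands in local memory and rotate shards around a ring between compute steps, rather than all cores reading from a shared memory. T10's actual compiler and cost model differ; names are hypothetical.

```python
import numpy as np

def ring_matmul(A, B, cores=4):
    """C = A @ B with row-sharded A and column-sharded B across `cores`."""
    A_shards = np.split(A, cores)                # each core holds a row block of A
    B_shards = np.split(B, cores, axis=1)        # ...and one column block of B
    C_blocks = [[None] * cores for _ in range(cores)]
    for step in range(cores):
        for c in range(cores):                   # every core computes on local data
            j = (c + step) % cores               # which B shard core c holds now
            C_blocks[c][j] = A_shards[c] @ B_shards[j]
        # the "shift" (each core passing its B shard to its ring neighbor) is
        # modeled implicitly by the (c + step) % cores indexing above
    return np.block(C_blocks)

A, B = np.random.rand(8, 8), np.random.rand(8, 8)
assert np.allclose(ring_matmul(A, B), A @ B)
```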

### Serverless Computing

* Dirigent: Lightweight Serverless Orchestration \[[arXiv](https://arxiv.org/abs/2404.16393)]
* ETH
* Simplify state management of the existing orchestration system (Kubernetes); eliminate persistent state updates; run monolithic control and data planes to minimize internal communication overheads.
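
A toy Python sketch of the "no persistent state updates" idea: keep cluster state in memory only and rebuild it from worker reports after a controller restart, instead of writing every change through to a durable store as Kubernetes does via etcd. Dirigent's actual design is considerably more involved; all names are hypothetical.

```python
class ControlPlane:
    """In-memory-only control plane state; nothing is written to durable storage."""

    def __init__(self):
        self.state = {}                          # node -> set of running functions

    def heartbeat(self, node, running_functions):
        self.state[node] = set(running_functions)

    def restart(self):
        self.state = {}                          # a crash loses the soft state...

    def reconcile(self, reports):
        for node, funcs in reports.items():      # ...which is rebuilt from the nodes
            self.heartbeat(node, funcs)

cp = ControlPlane()
cp.heartbeat("node-1", ["fn-a", "fn-b"])
cp.restart()                                     # controller crash: state is gone
cp.reconcile({"node-1": ["fn-a", "fn-b"]})       # recovered from worker reports
print(cp.state)                                  # {'node-1': {'fn-a', 'fn-b'}}
```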
