add large scale simulation readme
panxuchen committed Aug 13, 2024
1 parent 21377d0 commit 0572991
Showing 6 changed files with 131 additions and 11 deletions.
112 changes: 112 additions & 0 deletions examples/paper_large_scale_simulation/README.md
@@ -0,0 +1,112 @@
# Very Large-Scale Multi-Agent Simulation in AgentScope

> **WARNING:**
>
> **This example will consume a huge amount of tokens.**
> **Using a paid model API with this example can incur a high cost.**
> **Users with powerful GPUs (A100 or better) can use local inference services (such as vLLM) to run this example instead.**

The code in this folder implements the experiments of the paper [Very Large-Scale Multi-Agent Simulation in AgentScope](https://arxiv.org/abs/2407.17789).

In the experiment, we set up a large number of agents to participate in the classic game "guess the 2/3 of the average", where each agent reports a real number between 0 and 100, and the agent who reports a number closest to 2/3 of the average of all the reported numbers wins the game.
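
For a quick illustration of the rule, here is a minimal Python sketch (an illustration only, not the code used in the experiment):

```python
# Illustration of the "guess 2/3 of the average" rule (not the experiment code).
def two_thirds_winner(reports: list[float]) -> tuple[float, float]:
    """Return (target, winning report) for the reported numbers."""
    target = 2 / 3 * (sum(reports) / len(reports))
    winner = min(reports, key=lambda r: abs(r - target))
    return target, winner


# Three agents report 10, 20 and 60: the average is 30, the target is 20,
# so the agent who reported 20 wins.
print(two_thirds_winner([10, 20, 60]))  # (20.0, 20)
```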

## Tested Models

Only the vLLM local inference service has been tested with this example.

This example will consume a huge amount of tokens. Please do not use a model API that requires payment.

## Prerequisites

- Multiple machines (Linux system) with powerful GPUs (A100 or better).
- The distributed version of AgentScope is installed on all machines.
- [vLLM](https://github.com/vllm-project/vllm) v0.4.3 or higher is installed on all machines.


## Usage

### Step 1: Start the Local Inference Service

> If you only have one machine and don't have a powerful GPU (A800 or better), you can skip this step.

You can use `start_vllm.sh` to start vLLM inference services on each of your machines.
Before running the script, please set `gpu_num`, `model_path`, `gpu_per_model` and `base_port` properly.

- `gpu_num`: the number of GPUs on this machine.
- `model_path`: the model checkpoint path.
- `gpu_per_model`: the number of GPUs required by each model.
- `base_port`: the starting port number used by the local inference services.

For example, if `base_port` is `8010`, `gpu_num` is `8` and `gpu_per_model` is `4`, two inference services will be started, listening on ports `8010` and `8014` respectively.
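
The port numbering can be summarized with a small Python sketch (illustration only, mirroring the rule described above):

```python
# One inference service is started per group of gpu_per_model GPUs,
# and each service listens on base_port + index of its first GPU.
base_port, gpu_num, gpu_per_model = 8010, 8, 4
ports = [base_port + i for i in range(0, gpu_num, gpu_per_model)]
print(ports)  # [8010, 8014]
```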

vLLM inference services start slowly, so you need to wait for these servers to actually start before proceeding to the next step.

> The above configuration requires that the model checkpoint can be loaded by a single GPU.
> If you need to use a model that must be loaded by multiple GPUs, you need to modify the script.

### Step 2: Configure the Experiment

Modify the following files according to your environment:

- `configs/model_configs.json`: set the model configs for your experiment. Note that the `config_name` field should follow the format `{model_name}_{model_per_machine}_{model_id}`, where `model_name` is the name of the model, `model_per_machine` is the number of models per machine, and `model_id` is the id of the model (starting from 1). For example, with two `qwen2_72b` services per machine, the config names are `qwen2_72b_2_1` and `qwen2_72b_2_2` (see the sketch below the list).

- `configs/experiment.csv`: set the test cases for your experiment.

- `scripts/start_all_server.sh`: activate your python environment properly in this script.
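
For illustration, one entry of `configs/model_configs.json` for the first of two `qwen2_72b` services on a machine might look like the sketch below. Only the `config_name` convention comes from this example; the other fields (model wrapper type, base URL of the local vLLM service, model name, etc.) are assumptions that depend on your AgentScope version and deployment, so adapt them to your own setup:

```json
[
    {
        "config_name": "qwen2_72b_2_1",
        "model_type": "openai_chat",
        "model_name": "qwen2_72b",
        "api_key": "EMPTY",
        "client_args": {
            "base_url": "http://localhost:8010/v1"
        }
    }
]
```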

### Step 3: Run the Experiment

Suppose you have 4 machines whose hostnames are `worker1`, `worker2`, `worker3` and `worker4`. You can then run all your experiment cases with the following command:

```bash
python benchmark.py -name large_scale -config experiment --hosts worker1 worker2 worker3 worker4
```

### Step 4: View the Results

All results will be saved in the `./result` folder and organized as follows:
```text
result
`-- <benchmark_name>
`-- <model_name>
`-- <settings>
|-- <timestamp>
| |-- result_<round_num>.json # the raw text result of round <round_num>
| `-- result_<round_num>.pdf # the distribution histogram of round <round_num>
`-- <timestamp>
|-- result_<round_num>.json
`-- result_<round_num>.pdf
```
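
If you want to post-process the raw results programmatically, a minimal sketch (assuming only the directory layout shown above) could enumerate them like this:

```python
# Minimal sketch: list all raw result files under the layout shown above.
from pathlib import Path

# result/<benchmark_name>/<model_name>/<settings>/<timestamp>/result_<round_num>.json
for json_file in sorted(Path("./result").glob("*/*/*/*/result_*.json")):
    print(json_file)
```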

During the experiment, you can also view intermediate results on the command line:
```text
2024-08-13 06:20:40.028 | INFO | participant:_generate_participant_configs:525 - init 100 random participant agents...
2024-08-13 06:20:40.028 | INFO | participant:_init_env:574 - init 1 envs...
2024-08-13 06:20:40.171 | INFO | participant:_init_env:603 - [init takes 0.1432037353515625 s]
Moderator: The average value is 49.70 [takes 1.130 s]
Moderator: The average value is 48.44 [takes 1.125 s]
Moderator: The average value is 47.81 [takes 1.129 s]
Moderator: Save result to ./result/studio/qwen2_72b/1-1-100-1-0.667/2024-08-13-06:20:43
```

## References

```
@article{agentscope_simulation,
  title   = {Very Large-Scale Multi-Agent Simulation in AgentScope},
  author  = {Xuchen Pan and Dawei Gao and Yuexiang Xie and Zhewei Wei and
             Yaliang Li and Bolin Ding and Ji-Rong Wen and Jingren Zhou},
  journal = {CoRR},
  volume  = {abs/2407.17789},
  year    = {2024},
}
```
8 changes: 6 additions & 2 deletions examples/paper_large_scale_simulation/benchmark.py
@@ -94,9 +94,12 @@ def load_exp_config(cfg_path: str) -> list:
     return configs


-def main(name: str = None, config: str = None) -> None:
+def main(
+    name: str = None,
+    hosts: list[str] = None,
+    config: str = None,
+) -> None:
     """The main function of the benchmark"""
-    hosts = ["worker1", "worker2", "worker3", "worker4"]
     configs = load_exp_config(config)
     for cfg in configs:
         run_case(
@@ -128,5 +131,6 @@ def main(name: str = None, config: str = None) -> None:
     args = parser.parse_args()
     main(
         name=args.name,
+        hosts=args.hosts,
         config=os.path.join("./configs", f"{args.config}.csv"),
     )
2 changes: 1 addition & 1 deletion examples/paper_large_scale_simulation/configs/experiment.csv
@@ -1,2 +1,2 @@
 participant_num,agent_type,agent_server_num,env_server_num,model_per_host,model_name,sys_id,usr_id,host_num,ratio,round
-8,random,4,1,2,qwen2_72b,1,1,1,2/3,3
+100,random,4,1,2,qwen2_72b,1,1,1,2/3,3
2 changes: 1 addition & 1 deletion examples/paper_large_scale_simulation/main.py
@@ -76,7 +76,7 @@ def setup_participant_agent_server(host: str, port: int) -> None:
         save_api_invoke=False,
         model_configs="configs/model_configs.json",
         use_monitor=False,
-        logger_level="INFO",
+        logger_level="ERROR",
         save_dir=SAVE_DIR,
     )
     assistant_server_launcher = RpcAgentServerLauncher(
12 changes: 8 additions & 4 deletions examples/paper_large_scale_simulation/participant.py
@@ -375,7 +375,6 @@ def run(self, round: int, winner: float) -> tuple:
                 self.cnt += 1
             except Exception as e:
                 print(e)
-        logger.info(f"sum: {self.sum}, cnt: {self.cnt}")
         return (self.sum, self.cnt)


@@ -400,14 +399,12 @@ def save_result(
     ratio: str = "2/3",
 ) -> None:
     """Save the result into file"""
-    print(f"Round: {len(results)}")
     os.makedirs(save_path, exist_ok=True)
     import numpy as np
     from matplotlib import pyplot as plt

     for r, result in enumerate(results):
         values = [v["value"] for v in result.values()]
-        logger.info(f"get {len(values)} values")
         win = np.mean(values) * RATIO_MAP[ratio]
         stats = {
             "win": win,
@@ -628,7 +625,7 @@ def step(self) -> None:
             Msg(
                 name="Moderator",
                 role="assistant",
-                content=f"The average value is {summ / cnt :.2f} [takes {et - st :.3f} s]",
+                content=f"The average value of round {self.round + 1} is {summ / cnt :.2f} [takes {et - st :.3f} s]",
             ),
         )

@@ -650,6 +647,13 @@ def record(self, run_time: float) -> None:
             _get_timestamp(format_="%Y-%m-%d-%H:%M:%S"),
         )
         save_result(result, run_time, save_path, self.ratio)
+        log_msg(
+            Msg(
+                name="Moderator",
+                role="assistant",
+                content=f"Save result to {save_path}",
+            ),
+        )

     def run(self) -> None:
         """Run the game"""
6 changes: 3 additions & 3 deletions examples/paper_large_scale_simulation/scripts/start_vllm.sh
@@ -2,14 +2,14 @@

 # default values
 gpu_num=8
-model_per_gpu=1
-model_path="/home/data/shared/checkpoints/llama3/llama3-8b-instruct"
+gpu_per_model=1
+model_path=<your_model_path>
 base_port=8010

 touch .vllm_pid
 mkdir -p log

-for ((i=0; i < ${gpu_num}; i=i+${model_per_gpu})); do
+for ((i=0; i < ${gpu_num}; i=i+${gpu_per_model})); do
     port=$((base_port + i))
     export CUDA_VISIBLE_DEVICES=$i
     python -m vllm.entrypoints.openai.api_server --model "${model_path}" --port ${port} --enforce-eager > log/vllm-${port}.log 2>&1 &
