
How to run multiple mixtral in one machine #3406

Closed · ToheartZhang opened this issue Mar 14, 2024 · 3 comments
@ToheartZhang

I have an 8xA100 GPU machine. I noticed that one Mixtral instance requires at least 2 GPUs. However, when I attempt to run two Mixtral instances on the machine, each allocated 2 GPUs, the second one hangs at "Started a local Ray instance". Additionally, the terminal reports "fork: retry: Resource temporarily unavailable". It appears that the two Ray instances are causing a conflict. Is there any solution to resolve this issue?

@thesues (Contributor) commented Mar 14, 2024

Maybe you can set the env variable CUDA_VISIBLE_DEVICES before starting an LLM instance?

CUDA_VISIBLE_DEVICES="0,1" python llm.py
CUDA_VISIBLE_DEVICES="2,3" python llm.py

@ToheartZhang (Author)

Thanks for your help. I have already specified CUDA_VISIBLE_DEVICES. Besides, multiple vLLM processes work fine when each model needs only one GPU, in which case Ray is not used. So I believe the issue lies in how Ray is set up.
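
(A generic isolation sketch, not the fix referenced in the next comment: when two processes each start their own local Ray runtime on one machine, it can help to give each a private temp dir and to cap the number of Ray worker processes, since the default uses every host CPU and can exhaust the per-user process limit that the "fork: retry" error points to. The file name, path, and num_cpus value below are hypothetical, and vLLM reusing an already-initialized Ray instance is assumed rather than confirmed here.)

# run_instance_a.py -- hedged sketch; names, paths, and num_cpus are hypothetical
import ray
from vllm import LLM

# Start a private local Ray runtime for this process before vLLM does:
# a per-instance temp dir avoids sharing state with the other instance,
# and a small num_cpus keeps Ray from forking one worker per host CPU.
ray.init(num_cpus=8, include_dashboard=False, _temp_dir="/tmp/ray_instance_a")

# Assumes this process was launched with CUDA_VISIBLE_DEVICES="0,1".
llm = LLM(model="mistralai/Mixtral-8x7B-Instruct-v0.1", tensor_parallel_size=2)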

@ToheartZhang (Author)

Problem solved by #1058
