M-CURL Server Setup Tutorial
This article takes the M-CURL reinforcement learning algorithm as an example to show how to set up a machine learning environment on a cloud server, in order to accelerate training and improve experiment efficiency. This is part of the IPP research project: Knowledge Transfer for Reinforcement Learning based Robot Motion Planning.
Prerequisite: matpool server and Linux setup
Quick start: https://matpool.com/supports/doc-quick-start/
Quick start for team work: https://matpool.com/supports/doc-use-team-on-matpool/
Cloud disk guide: https://matpool.com/supports/doc-use-matbox-on-matpool/
FAQ: https://matpool.com/supports/reference/faqs/
VSCode connection: https://matpool.com/supports/doc-vscode-connect-matpool/
Pycharm connection: https://matpool.com/supports/doc-pycharm-connect-matpool/
Connect to the server via VNC: https://matpool.com/supports/doc-vnc-connect-matpool/
Note: connecting to the server with VSCode only gives you a terminal for command-line work, whereas VNC provides a graphical desktop and therefore supports screen display, an important feature needed by gym environments for rendering.
Linux commands: https://matpool.com/supports/reference/common-cmds/
The Linux system on the server does not ship with a text editor, so I suggest installing nano first:

    sudo apt-get install nano

which will play an important role in the environment setup later.
Conda Environment
Create mcurl Environment
We first create an independent environment for our MCURL project, where we will install all the dependencies without interfering with other environments. We can use the .yml file provided in the MCURL repository to do that.
    conda env create -f environment.yml
Conda will then automatically download the dependencies listed in the .yml file. In most cases, however, the libraries to be installed via pip will fail due to conflicting versions. Nevertheless, things like PyTorch will install successfully, and you should take note of the PyTorch version, since it will be needed when setting up CUDA.
After that, we can use

    conda activate environment_name

to enter the conda environment. Also remember to point the Python interpreter in VSCode to the conda environment we just created.
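With the environment active, you can record the PyTorch version mentioned above straight from Python:

    import torch

    print(torch.__version__)  # note this down; it determines the CUDA version needed later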
Install mujoco
You can refer to https://zhuanlan.zhihu.com/p/352304615 for more detailed instruction (in Chinese).
MuJoCo is a physics simulation platform that MCURL relies on.
First, on your local computer, go to https://www.roboti.us/index.html to download the MuJoCo package, and get the related license key from https://www.roboti.us/license.html. Pack everything related to MuJoCo together and upload it to the server's cloud disk.
We then log in to the cloud server, create a hidden folder .mujoco, and copy our mjkey.txt into it, as shown in the commands below:

    mkdir ~/.mujoco              # create a hidden folder .mujoco
    cp mjkey.txt ~/.mujoco/      # copy the license key into it

Note: ~/.mujoco/mujoco200_linux must be renamed to ~/.mujoco/mujoco200, otherwise import mujoco_py will raise errors. Finally, mjkey.txt should also end up under ~/.mujoco/mujoco200/bin.
Then we need to add MuJoCo to the environment variables:

    nano ~/.bashrc

Copy and paste the following line into the bashrc file you just opened:

    export LD_LIBRARY_PATH=~/.mujoco/mujoco200/bin${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Then press Ctrl+X, Y, Enter to save the file and exit, and finally remember to

    source ~/.bashrc
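To verify the setup, try importing mujoco_py in Python (a minimal smoke test; the first import triggers mujoco-py's build step, so a clean import means the paths and license are correct):

    import mujoco_py  # the first import compiles mujoco-py's bindings

    print("mujoco_py loaded from:", mujoco_py.__file__)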
Install dmc2gym and dm_control
dmc2gym provides a gym-style wrapper around the DeepMind Control Suite environments, which are simulated with MuJoCo. However, strange issues may arise here.
First, try pip install dmc2gym. If successful, great!
But if you encounter "exit 128" with the git:// form of the GitHub install:

    pip install git+git://github.com/denisyarats/dmc2gym.git

then try the https URL instead (GitHub no longer serves the unauthenticated git:// protocol):

    pip install git+https://github.com/denisyarats/dmc2gym.git
If that doesn’t work, it could be a network issue on your cloud server, so try adding https://mirror.ghproxy.com/ before the clone URL:
    pip install git+https://mirror.ghproxy.com/https://github.com/denisyarats/dmc2gym.git
You can find more about speeding up GitHub downloads here: https://matpool.com/supports/reference/faqs/#%E5%A6%82%E4%BD%95%E5%8A%A0%E9%80%9F-github-%E4%B8%8B%E8%BD%BD%EF%BC%9F
If none of this works, try again a few times :)
Then, install dm_control in a similar way, using the stable python3.6_eol branch from GitHub: https://github.com/google-deepmind/dm_control/tree/python3.6_eol?tab=readme-ov-file
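Once both are installed, a quick sanity check is to build an environment through dmc2gym (a minimal sketch using the cartpole/swingup task; the make arguments follow the dmc2gym repository, and the exact reset/step return values depend on your gym version, as we will see below):

    import dmc2gym

    # Wrap the DeepMind Control Suite cartpole/swingup task in a gym interface
    env = dmc2gym.make(domain_name='cartpole', task_name='swingup', seed=1)
    obs = env.reset()
    print(obs.shape)  # low-dimensional state observation by default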
Running and Modifying Python Code
For simplicity, we'll skip running Python via shell scripts and run train.py directly. Before running train.py, however, some imported libraries may be flagged as missing (yellow underlines), so install them first. Common ones are the color library and the scikit-image library; their installation commands are easy to find online.
Before running the code, adjust the default values of certain arguments to tweak the settings. For example, you can set save_video to False to avoid certain bugs, as sketched below.
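If train.py defines its flags with argparse (a hedged sketch; the actual flag definition in M-CURL's train.py may differ), flipping a default looks like:

    import argparse

    parser = argparse.ArgumentParser()
    # Hypothetical definition: default changed to False so no video is saved
    parser.add_argument('--save_video', default=False, action='store_true')
    args = parser.parse_args([])
    print(args.save_video)  # False unless --save_video is passed explicitly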
Next, directly run train.py in the terminal (if it runs successfully, something’s wrong 😅), and you’ll encounter small issues. Modify the corresponding library files based on the errors. Common issues include:
AttributeError: 'dict' object has no attribute 'env_specs'
    File "train.py", line 318, in <module>

Simply delete the trailing env_specs attribute, i.e. change gym.envs.registry.env_specs to gym.envs.registry. See: https://github.com/openai/gym/issues/3097
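The reason is that newer gym versions turned gym.envs.registry into a plain dict of specs, so the .env_specs attribute no longer exists. A sketch of the patched check (the env id here is hypothetical):

    import gym

    env_id = 'dmc_cartpole_swingup-v1'  # hypothetical id, for illustration only
    # before (older gym): if env_id not in gym.envs.registry.env_specs:
    # after (newer gym, where the registry itself is the dict of specs):
    if env_id not in gym.envs.registry:
        print(env_id, 'is not registered yet')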
ValueError: not enough values to unpack (expected 5, got 4)
Refer to GPT’s suggestion:
According to the train.py and utils.py code you provided, the error indicates that the evaluate function in train.py expects 5 values but env.step(action) only returns 4.

In train.py's evaluate function:

    obs, reward, terminated, truncated, info = env.step(action)

It attempts to unpack 5 variables, but env.step(action) only returns 4 (obs, reward, terminated, truncated), missing info. To fix this, adjust the unpacking:

    obs, reward, terminated, truncated = env.step(action)
    done = terminated or truncated  # use terminated or truncated as the done condition

Before making any changes, add a debug statement to check env.step(action):

    result = env.step(action)
    print("Step returned:", result)  # debugging
    obs, reward, terminated, truncated = result
    done = terminated or truncated
We already have the modified code, but this is useful for independent replication.
- Modify two feedforward layers in ctmr_sac.py based on the errors, cloning x instead of operating on it in place. The modified .py file can be found here: https://sjtu.feishu.cn/docx/HCnPdRRiEoxOzGxjzI9cggLMnsb?from=from_copylink
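For context, the error comes from autograd forbidding in-place modification of tensors needed for the backward pass. A hedged sketch of the pattern (the actual layers in ctmr_sac.py will look different):

    import torch

    def feedforward(x, w):
        # Before (can fail in backward): x += x @ w  -- in-place update of x
        x = x.clone()    # work on a copy instead of mutating x in place
        x = x + x @ w    # out-of-place update keeps the autograd graph valid
        return x

    x = torch.randn(4, 8, requires_grad=True)
    w = torch.randn(8, 8)
    feedforward(x, w).sum().backward()  # runs without an in-place-operation error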
By now, the Python part should be bug-free. If there are still bugs, find your own solutions :)
Reconfiguring dmc2gym
If you’re using VSCode, progress may be slow at this point. Switch to a VNC virtual desktop to run train.py. You may encounter errors like:
    Reading package lists... Done
Refer to this blog: https://blog.xiunian.wang/?p=1867 and GPT for a solution:
Edit LD_LIBRARY_PATH:

    export LD_LIBRARY_PATH=/root/miniconda3/envs/mcurl/lib/python3.11/site-packages/nvidia/cudnn/lib:/root/miniconda3/envs/mcurl/lib/python3.11/site-packages/torch/lib:/usr/local/cuda-11.8/lib64:/usr/local/nvidia/lib:/usr/local/nvidia/lib64:/usr/local/cuda/lib64:/usr/lib/x86_64-linux-gnu:$LD_LIBRARY_PATH

Verify the settings:

    echo $LD_LIBRARY_PATH

Ensure the output is correct, then re-run the ldd command:

    ldd /root/miniconda3/envs/mcurl/lib/python3.11/site-packages/torch/lib/../../nvidia/cudnn/lib/libcudnn_cnn_infer.so.8

If everything works, add the export LD_LIBRARY_PATH=... line to .bashrc for persistence.
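After the path fix, you can confirm from Python that PyTorch now finds cuDNN (standard torch backend queries):

    import torch

    print(torch.backends.cudnn.is_available())  # should now print True
    print(torch.backends.cudnn.version())       # e.g. 8xxx for a cuDNN 8 build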
By now, the dmc2gym setup should be complete, and the program should run.
CUDA and PyTorch
If your program runs but the GPU usage is 0%, CUDA isn't configured correctly. To configure it, first check the CUDA and PyTorch version compatibility table at https://pytorch.org/get-started/previous-versions/, then install the matching CUDA version.
You can check with the following code:
    import torch
    print(torch.cuda.is_available())

If this prints False, the CUDA version is mismatched; make sure to install the correct one.
Also refer to: https://matpool.com/supports/doc-public-data/#cuda-%E5%AE%89%E8%A3%85
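To compare against the compatibility table, it also helps to print what your PyTorch build actually expects (standard torch queries; run inside the mcurl environment):

    import torch

    print(torch.version.cuda)  # the CUDA version this PyTorch build was compiled against
    if torch.cuda.is_available():
        print(torch.cuda.get_device_name(0))  # the GPU the server exposes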
Results
Training on the GPU cloud server will generate a folder, named after the corresponding training project, in the directory where train.py is located. The folder contains the stored buffer, model, videos, and training logs. The training logs are formatted as follows:

    {"episode_reward": 0.0, "episode": 1.0, "duration": 2.1876120567321777, "step": 500}