Testing works, but training fails with an error #2
Comments
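+1, hoping for a fix |
The exact location is around line 95 of snake_game.py |
MLP model training prints nothing to the command line. On Windows, press Ctrl+Shift+Esc to check CPU and GPU usage; on Ubuntu, use the htop command to check CPU usage and watch -n 1 nvidia-smi to check GPU usage. If the program's utilization is high, it is training, not stuck.
On 2023-05-26 13:09:31, "wave" ***@***.***> wrote:
Why am I stuck here?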
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you commented.Message ID: ***@***.***>
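Besides watching CPU/GPU usage, another way to confirm that training is progressing is to look at the end of the training log. The train_mlp-based script quoted later in this thread redirects stdout to training_log.txt inside trained_models_mlp, so the sketch below assumes that path; the helper name read_last_lines is hypothetical.

```python
import os
from collections import deque


def read_last_lines(path, n=20):
    """Return the last n lines of a text file (empty list if it does not exist yet)."""
    if not os.path.exists(path):
        return []
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        # A deque with maxlen keeps only the most recent n lines in memory.
        return [line.rstrip("\n") for line in deque(f, maxlen=n)]


if __name__ == "__main__":
    # Assumed path: the script quoted in this thread writes its log here.
    log_path = os.path.join("trained_models_mlp", "training_log.txt")
    for line in read_last_lines(log_path, n=20):
        print(line)
```

Because the log is written through a buffered stream, new lines may show up with some delay, so an unchanged file for a short while does not necessarily mean the run is stuck.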
|
How can I visualize the training process? |
After training, note that a new folder is generated under logs; the files in it can be opened with TensorBoard for visualization.
On 2023-05-31 02:01:11, "zjhcwjb" ***@***.***> wrote:
How can I visualize the training process?
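To find which folder to point TensorBoard at, a small helper can pick the newest subfolder of logs. This is only a sketch: latest_run_dir is a hypothetical name, and it assumes each training run creates its own subfolder under logs, as described above.

```python
import os


def latest_run_dir(log_dir="logs"):
    """Return the most recently modified subfolder of log_dir, or None if there is none."""
    if not os.path.isdir(log_dir):
        return None
    subdirs = [
        os.path.join(log_dir, name)
        for name in os.listdir(log_dir)
        if os.path.isdir(os.path.join(log_dir, name))
    ]
    if not subdirs:
        return None
    # The folder with the latest modification time is taken as the newest run.
    return max(subdirs, key=os.path.getmtime)


if __name__ == "__main__":
    run_dir = latest_run_dir("logs")
    if run_dir is None:
        print("No run folders found under logs yet.")
    else:
        print("Newest run folder:", run_dir)
        # TensorBoard is then started from a terminal with this command:
        print("tensorboard --logdir", run_dir)
```

Pointing --logdir at the parent logs folder instead should show all runs side by side.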
|
Thanks for the reply. But what should I do if I want to see the screen of every game the AI plays? |
You can borrow from the test_mlp code and use the env.render() function to render the game screen.
On 2023-05-31 13:33:44, "zjhcwjb" ***@***.***> wrote:
Thanks for the reply. But what should I do if I want to see the screen of every game the AI plays?
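The general shape of such a render loop can be sketched without the real game. ToyEnv below is a hypothetical stand-in that follows the classic Gym-style reset/step/render interface used in this thread; it is not the project's SnakeEnv, and play_episode is an illustrative name. With a trained model, choose_action would wrap a call such as model.predict(obs).

```python
import random
import time


class ToyEnv:
    """Hypothetical stand-in for the Snake environment (classic Gym-style API)."""

    def __init__(self, max_steps=5):
        self.max_steps = max_steps
        self.t = 0
        self.frames_rendered = 0

    def reset(self):
        self.t = 0
        return self.t  # observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.max_steps
        return self.t, 0.0, done, {}  # obs, reward, done, info

    def render(self):
        self.frames_rendered += 1
        print(f"frame at step {self.t}")


def play_episode(env, choose_action, delay=0.0):
    """Run one episode, rendering after reset and after every step."""
    obs = env.reset()
    env.render()
    done = False
    steps = 0
    while not done:
        action = choose_action(obs)
        obs, reward, done, info = env.step(action)
        env.render()
        time.sleep(delay)  # slow the loop down so a human can follow it
        steps += 1
    return steps


if __name__ == "__main__":
    env = ToyEnv(max_steps=5)
    play_episode(env, choose_action=lambda obs: random.randrange(4), delay=0.1)
```

The loop runs in the main process on a single environment, which is why it can draw frames; it is separate from whatever the training workers are doing.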
|
|
If possible, please share your code.
On 2023-05-31 14:34:13, "zjhcwjb" ***@***.***> wrote:
I've already added the rendering code. When it runs, GPU usage is high, so it should be training, but I still can't see the game screen.
|
I really can't read this formatting. Could you send me another copy with the formatting preserved?
On 2023-05-31 14:41:25, "zjhcwjb" ***@***.***> wrote:
import os
import sys
import random
import time

from stable_baselines3.common.monitor import Monitor
from stable_baselines3.common.vec_env import SubprocVecEnv
from stable_baselines3.common.callbacks import CheckpointCallback
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker

from snake_game_custom_wrapper_mlp import SnakeEnv

NUM_ENV = 32
LOG_DIR = "logs"
os.makedirs(LOG_DIR, exist_ok=True)

# Linear scheduler
def linear_schedule(initial_value, final_value=0.0):
    if isinstance(initial_value, str):
        initial_value = float(initial_value)
        final_value = float(final_value)
        assert (initial_value > 0.0)

    def scheduler(progress):
        return final_value + progress * (initial_value - final_value)

    return scheduler

def make_env(seed=0):
    def _init():
        env = SnakeEnv(seed=seed)
        env = ActionMasker(env, SnakeEnv.get_action_mask)
        env = Monitor(env)
        env.seed(seed)
        return env
    return _init

def main():
    # Generate a list of random seeds for each environment.
    seed_set = set()
    while len(seed_set) < NUM_ENV:
        seed_set.add(random.randint(0, 1e9))

    # Create the Snake environment.
    env = SubprocVecEnv([make_env(seed=s) for s in seed_set])

    lr_schedule = linear_schedule(2.5e-4, 2.5e-6)
    clip_range_schedule = linear_schedule(0.15, 0.025)

    # # Instantiate a PPO agent
    model = MaskablePPO(
        "MlpPolicy",
        env,
        device="cuda",
        verbose=1,
        n_steps=2048,
        batch_size=512,
        n_epochs=4,
        gamma=0.94,
        learning_rate=lr_schedule,
        clip_range=clip_range_schedule,
        tensorboard_log=LOG_DIR
    )

    # Set the save directory
    save_dir = "trained_models_mlp"
    os.makedirs(save_dir, exist_ok=True)

    checkpoint_interval = 15625 # checkpoint_interval * num_envs = total_steps_per_checkpoint
    checkpoint_callback = CheckpointCallback(save_freq=checkpoint_interval, save_path=save_dir, name_prefix="ppo_snake")

    # Writing the training logs from stdout to a file
    original_stdout = sys.stdout
    log_file_path = os.path.join(save_dir, "training_log.txt")
    with open(log_file_path, 'w') as log_file:
        sys.stdout = log_file

        model.learn(
            total_timesteps=int(100000000),
            callback=[checkpoint_callback]
        )
        env.close()

    # Restore stdout
    sys.stdout = original_stdout

    # Save the final model
    model.save(os.path.join(save_dir, "ppo_snake_final.zip"))

    demo_env = make_env()()
    with open(log_file_path, 'w') as log_file:
        sys.stdout = log_file
        for i in range(100):
            model.learn(
                total_timesteps=int(1000000),
                callback=[checkpoint_callback]
            )
            obs = demo_env.reset()
            demo_env.render()
            time.sleep(0.5)
            done = False
            while not done:
                action, _ = model.predict(obs)
                obs, _, done, _ = demo_env.step(action)
                demo_env.render()
                time.sleep(0.5)

if __name__ == "__main__":
    main()
Right, I just added the rendering part on top of train_mlp. There are no error messages at all, but the game screen never appears.
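Reading the listing above as posted (its indentation was lost in the email, so this is partly a guess), the demo code is only reached after the first model.learn call with total_timesteps=100000000 returns, and env.close() runs before the later learn calls. One possible restructuring, offered as an untested sketch, is to alternate shorter training chunks with a demo episode while the training environment stays open. FakeModel and train_with_demos are hypothetical names; the only real API assumed is that learn() accepts total_timesteps and reset_num_timesteps, as Stable-Baselines3 models do.

```python
def train_with_demos(model, play_demo, rounds, steps_per_round):
    """Alternate short training chunks with one demo call per round."""
    for round_index in range(rounds):
        # reset_num_timesteps=False asks learn() to continue counting
        # timesteps instead of starting again from zero on each call.
        model.learn(total_timesteps=steps_per_round, reset_num_timesteps=False)
        play_demo(model, round_index)


class FakeModel:
    """Hypothetical stand-in that only records how learn() was called."""

    def __init__(self):
        self.calls = []

    def learn(self, total_timesteps, reset_num_timesteps=True):
        self.calls.append((total_timesteps, reset_num_timesteps))
        return self


if __name__ == "__main__":
    events = []
    model = FakeModel()
    train_with_demos(
        model,
        play_demo=lambda m, i: events.append(f"demo after round {i}"),
        rounds=3,
        steps_per_round=1000,
    )
    print(model.calls)
    print(events)
```

In a real script, play_demo would run one rendered episode on a separate single environment, and the training environment would be closed only after the last round.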
|
Hello, is this okay? I've only just started with reinforcement learning, so my questions may be rather silly. Sorry to bother you.
(In reply to shironghe's message of 2023-05-31 14:44.)
|
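Sorry, mate. I was reading the code in my email client, which dropped the code formatting; on GitHub it looks fine. I took another careful look at the train_mlp code and found that the entire training process runs inside MaskablePPO. I could not find a parameter in its API for calling env.render, so I think the training screen cannot be shown. You can picture it, though: training is just the Snake game repeated n times, with the agent eating food to earn rewards and dying to receive penalties, over and over.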
On 2023-05-31 15:04:14, "zjhcwjb" ***@***.***> wrote:
Hello, is this okay? I've only just started with reinforcement learning, so my questions may be rather silly. Sorry to bother you.
|
Right. I've only just started with reinforcement learning, but I'd still like to display the game to confirm that it is really training. train_cnn does call env.render, but for some reason that doesn't display anything either. I asked ChatGPT and it didn't know how to fix it either.
(In reply to shironghe's message of 2023-05-31 15:13.)
|
(SnakeAI) E:\snake-ai-master\main>python train_cnn.py
Using cuda device
Wrapping the env in a VecTransposeImage.
Process SpawnProcess-5:
Traceback (most recent call last):
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\process.py", line 315, in _bootstrap
    self.run()
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 30, in _worker
    observation, reward, done, info = env.step(data)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\monitor.py", line 95, in step
    observation, reward, done, info = self.env.step(action)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\gym\core.py", line 289, in step
    return self.env.step(action)
  File "E:\snake-ai-master\main\snake_game_custom_wrapper_cnn.py", line 47, in step
    self.done, info = self.game.step(action) # info = {"snake_size": int, "snake_head_pos": np.array, "prev_snake_head_pos": np.array, "food_pos": np.array, "food_obtained": bool}
  File "E:\snake-ai-master\main\snake_game.py", line 96, in step
    self.sound_game_over.play()
AttributeError: 'SnakeGame' object has no attribute 'sound_game_over'
Traceback (most recent call last):
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\connection.py", line 312, in _recv_bytes
    nread, err = ov.GetOverlappedResult(True)
BrokenPipeError: [WinError 109] 管道已结束。 (The pipe has been ended.)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "train_cnn.py", line 95, in <module>
    main()
  File "train_cnn.py", line 82, in main
    model.learn(
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\sb3_contrib\ppo_mask\ppo_mask.py", line 525, in learn
    continue_training = self.collect_rollouts(self.env, callback, self.rollout_buffer, self.n_steps, use_masking)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\sb3_contrib\ppo_mask\ppo_mask.py", line 305, in collect_rollouts
    new_obs, rewards, dones, infos = env.step(actions)
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\vec_env\base_vec_env.py", line 163, in step
    return self.step_wait()
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\vec_env\vec_transpose.py", line 95, in step_wait
    observations, rewards, dones, infos = self.venv.step_wait()
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 121, in step_wait
    results = [remote.recv() for remote in self.remotes]
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\site-packages\stable_baselines3\common\vec_env\subproc_vec_env.py", line 121, in <listcomp>
    results = [remote.recv() for remote in self.remotes]
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "C:\Users\KEN2020.conda\envs\SnakeAI\lib\multiprocessing\connection.py", line 321, in _recv_bytes
    raise EOFError
EOFError
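The worker's traceback ends in AttributeError: 'SnakeGame' object has no attribute 'sound_game_over', raised by self.sound_game_over.play(); the BrokenPipeError and EOFError that follow appear to be the main process noticing that the worker died. This kind of error typically arises when an attribute is created only on some code paths but used unconditionally. Without seeing snake_game.py, the following is only a sketch of that pattern and one way to guard against it; MiniGame and FakeSound are hypothetical stand-ins, not the project's classes.

```python
class FakeSound:
    """Hypothetical stand-in for an audio clip object with a play() method."""

    def __init__(self):
        self.play_count = 0

    def play(self):
        self.play_count += 1


class MiniGame:
    """Hypothetical miniature of a game whose sounds are optional."""

    def __init__(self, load_sounds):
        if load_sounds:
            # Only created on this path, so it may be missing later.
            self.sound_game_over = FakeSound()

    def game_over(self):
        # getattr with a default avoids AttributeError when the sound
        # was never loaded (for example in a headless training worker).
        sound = getattr(self, "sound_game_over", None)
        if sound is None:
            return False
        sound.play()
        return True


if __name__ == "__main__":
    print(MiniGame(load_sounds=False).game_over())
    print(MiniGame(load_sounds=True).game_over())
```

An equivalent guard in the real file would go around the play() call near line 96 of snake_game.py, the location reported earlier in this thread.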