OpenPAI v1.6.0 Deployment Summary | Siaimes's blog #64

Open
siaimes opened this issue Jun 9, 2021 · 15 comments

@siaimes
Owner

siaimes commented Jun 9, 2021

https://blog.siaimes.me/2021/06/07/p57.html

Prerequisites

Every host in the cluster runs Ubuntu 18.04 LTS Server and has an administrator account with the same username and password.

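The OpenPAI documentation recommends Ubuntu 16.04 LTS, but I ran into serious problems on Ubuntu 16.04 LTS, so I moved to Ubuntu 18.04.
The master and worker must be physical machines. The dev-box can be a virtual machine with at least 40 GB of disk space, since it is only used for installation and maintenance.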

@zzh14014

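Hello. After going through the latest method for the offline file download step, do I still need to carry out the operations in the blog? Also, the blog says 'every machine except the dev-box needs to'; does that mean both the worker and the master have to run through the blog's offline file download steps?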

@siaimes
Owner Author

siaimes commented Aug 19, 2021

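> Hello. After going through the latest method for the offline file download step, do I still need to carry out the operations in the blog? Also, the blog says 'every machine except the dev-box needs to'; does that mean both the worker and the master have to run through the blog's offline file download steps?

Hello, the blog has been updated!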

@zzh14014

@siaimes

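> > Hello. After going through the latest method for the offline file download step, do I still need to carry out the operations in the blog? Also, the blog says 'every machine except the dev-box needs to'; does that mean both the worker and the master have to run through the blog's offline file download steps?
>
> Hello, the blog has been updated!

OK, thanks. Everything before the /bin/bash quick-start-kubespray.sh step seems fine; I'm only stuck because the master doesn't have enough memory. I'll be back if more problems come up.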

@siaimes
Owner Author

siaimes commented Aug 19, 2021

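> @siaimes
>
> > > Hello. After going through the latest method for the offline file download step, do I still need to carry out the operations in the blog? Also, the blog says 'every machine except the dev-box needs to'; does that mean both the worker and the master have to run through the blog's offline file download steps?
> >
> > Hello, the blog has been updated!
>
> OK, thanks. Everything before the /bin/bash quick-start-kubespray.sh step seems fine; I'm only stuck because the master doesn't have enough memory. I'll be back if more problems come up.

You can type continue to carry on with the installation, right?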

@siaimes
Owner Author

siaimes commented Aug 19, 2021

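> @siaimes
>
> > > Hello. After going through the latest method for the offline file download step, do I still need to carry out the operations in the blog? Also, the blog says 'every machine except the dev-box needs to'; does that mean both the worker and the master have to run through the blog's offline file download steps?
> >
> > Hello, the blog has been updated!
>
> OK, thanks. Everything before the /bin/bash quick-start-kubespray.sh step seems fine; I'm only stuck because the master doesn't have enough memory. I'll be back if more problems come up.

The documentation says 64 GB of memory is required, but with a small number of users, 16 GB is more than enough.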
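
To see where a master stands relative to the two memory figures mentioned here (64 GB documented, 16 GB reported as enough for few users), a minimal Python sketch follows. It is not part of the OpenPAI installer. The helper names and the 10% tolerance are assumptions made for illustration; the tolerance is there because Linux reports slightly less than the installed RAM in `MemTotal`.

```python
import re

KIB_PER_GIB = 1024 * 1024  # /proc/meminfo reports sizes in kB (KiB)


def mem_total_gib(meminfo_text):
    """Extract MemTotal from /proc/meminfo-style text and convert it to GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("MemTotal line not found")
    return int(match.group(1)) / KIB_PER_GIB


def describe_master_memory(total_gib, documented_gib=64, small_cluster_gib=16,
                           tolerance=0.9):
    """Hypothetical helper: compare total memory with the two figures above.

    tolerance is an assumption: a 16 GB machine usually shows a bit under
    16 GiB because the kernel reserves some memory.
    """
    if total_gib >= documented_gib * tolerance:
        return "meets the documented 64 GB figure"
    if total_gib >= small_cluster_gib * tolerance:
        return "below the documented figure, within the 16 GB small-cluster range"
    return "below both figures"


# Sample input in /proc/meminfo format (16 GiB expressed in kB).
sample = "MemTotal:       16777216 kB\nMemFree:         1234567 kB\n"
print(describe_master_memory(mem_total_gib(sample)))
```

On a real master, pass the contents of `/proc/meminfo` instead of `sample`.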

@zzh14014

zzh14014 commented Aug 19, 2021 via email

@ae86zhizhi

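Hello, and thank you for writing this blog post. I plan to deploy OpenPAI on a small cluster and still have a few questions. You mention that "the master and worker must be physical machines; the dev-box can be a virtual machine with at least 40 GB of disk space". Does your virtual machine run on the master or a worker node, or on a different server? I want to use one separate CPU server to run the worker and dev-box. That server has two network cards. After reading the official installation guide, I was planning to use ESXi to create two virtual Ubuntu servers, assign one network card to each, and run the master and the dev-box on them respectively. But you say the master must be a physical machine. Will running the master in a virtual machine cause problems? Looking forward to your reply, and thanks again.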

@siaimes
Owner Author

siaimes commented Sep 8, 2021

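> Hello, and thank you for writing this blog post. I plan to deploy OpenPAI on a small cluster and still have a few questions. You mention that "the master and worker must be physical machines; the dev-box can be a virtual machine with at least 40 GB of disk space". Does your virtual machine run on the master or a worker node, or on a different server? I want to use one separate CPU server to run the worker and dev-box. That server has two network cards. After reading the official installation guide, I was planning to use ESXi to create two virtual Ubuntu servers, assign one network card to each, and run the master and the dev-box on them respectively. But you say the master must be a physical machine. Will running the master in a virtual machine cause problems? Looking forward to your reply, and thanks again.

@ae86zhizhi The master and worker are machines inside the cluster and need fixed IP addresses. The dev-box is the machine that controls the cluster. It is independent of the cluster, its IP address does not have to be fixed, and it can be shut down once installation finishes; just power it on again the next time you need to upgrade.

So your master does not need to go into a virtual machine; install it directly on the CPU server. For the dev-box, an Ubuntu 18.04 virtual machine on your own desktop or laptop is enough, but it is disk-hungry and needs 40 GB of disk space.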
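
Before setting up the dev-box virtual machine, the 40 GB disk figure can be checked quickly. This is a minimal Python sketch, not part of OpenPAI. The function names are hypothetical, and it assumes the 40 GB refers to free space on the filesystem holding the dev-box data, measured in GiB.

```python
import shutil

GIB = 1024 ** 3


def free_disk_gib(path="/"):
    """Free space, in GiB, on the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / GIB


def has_room_for_dev_box(path="/", required_gib=40):
    """Hypothetical check against the 40 GB figure mentioned above.

    Treating the figure as free space in GiB is an assumption.
    """
    return free_disk_gib(path) >= required_gib


print(f"free: {free_disk_gib('/'):.1f} GiB, enough for dev-box: {has_room_for_dev_box('/')}")
```

Run it inside the virtual machine that will act as the dev-box, pointing `path` at the disk the installation will use.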

Repository owner deleted a comment from hlyf-xs Sep 28, 2021
@murez

murez commented Dec 30, 2021

For some of the pitfalls with the dev-box, see: https://www.wolai.com/mte5KhHA491FYVZQBNNEfd

@siaimes
Owner Author

siaimes commented Jan 5, 2022

Latest solution, feel free to try it: https://github.com/siaimes/k8s-share

@chjm

chjm commented May 13, 2022

Hi, have you ever run into this timeout when joining the cluster?
TASK [kubernetes/kubeadm : Join to cluster] ********************************************************************************************************************************************************************************************
Friday 13 May 2022 15:41:08 +0800 (0:00:00.473) 0:02:03.811 ************
fatal: [pai-worker]: FAILED! => {"changed": true, "cmd": ["timeout", "-k", "120s", "120s", "/usr/local/bin/kubeadm", "join", "--config", "/etc/kubernetes/kubeadm-client.conf", "--ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests"], "delta": "0:02:00.004230", "end": "2022-05-13 15:43:08.484885", "msg": "non-zero return code", "rc": 124, "start": "2022-05-13 15:41:08.480655", "stderr": "\t[WARNING DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty\n\t[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "stderr_lines": ["\t[WARNING DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty", "\t[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/"], "stdout": "[preflight] Running pre-flight checks", "stdout_lines": ["[preflight] Running pre-flight checks"]}
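
A note on reading this log: the command is wrapped in `timeout -k 120s 120s`, `rc` is 124, and `delta` is almost exactly two minutes. GNU `timeout` exits with status 124 when the time limit is reached, so this looks like `kubeadm join` hanging until it was killed rather than failing with an error of its own. The Python sketch below shows that check on a trimmed copy of the result. `classify_failure` is a hypothetical helper, not something Ansible or kubespray provides, and a wrapped command could in principle return 124 itself.

```python
# Trimmed copy of the failed task result from the log above; only the
# fields needed for the check are kept.
result = {
    "cmd": ["timeout", "-k", "120s", "120s", "/usr/local/bin/kubeadm", "join",
            "--config", "/etc/kubernetes/kubeadm-client.conf",
            "--ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests"],
    "delta": "0:02:00.004230",
    "rc": 124,
}

# Exit status GNU timeout uses when the time limit is hit.
TIMEOUT_EXIT_STATUS = 124


def classify_failure(task_result):
    """Hypothetical helper: tell a timeout apart from an ordinary failure."""
    cmd = task_result.get("cmd") or []
    wrapped_in_timeout = len(cmd) > 0 and cmd[0] == "timeout"
    if wrapped_in_timeout and task_result.get("rc") == TIMEOUT_EXIT_STATUS:
        return "timed out"
    return "command failed on its own"


print(classify_failure(result))  # prints "timed out"
```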

@siaimes
Owner Author

siaimes commented May 13, 2022

> Hi, have you ever run into this timeout when joining the cluster?
> TASK [kubernetes/kubeadm : Join to cluster] ********************************************************************************************************************************************************************************************
> Friday 13 May 2022 15:41:08 +0800 (0:00:00.473) 0:02:03.811 ************
> fatal: [pai-worker]: FAILED! => {"changed": true, "cmd": ["timeout", "-k", "120s", "120s", "/usr/local/bin/kubeadm", "join", "--config", "/etc/kubernetes/kubeadm-client.conf", "--ignore-preflight-errors=DirAvailable--etc-kubernetes-manifests"], "delta": "0:02:00.004230", "end": "2022-05-13 15:43:08.484885", "msg": "non-zero return code", "rc": 124, "start": "2022-05-13 15:41:08.480655", "stderr": "\t[WARNING DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty\n\t[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/", "stderr_lines": ["\t[WARNING DirAvailable--etc-kubernetes-manifests]: /etc/kubernetes/manifests is not empty", "\t[WARNING IsDockerSystemdCheck]: detected "cgroupfs" as the Docker cgroup driver. The recommended driver is "systemd". Please follow the guide at https://kubernetes.io/docs/setup/cri/"], "stdout": "[preflight] Running pre-flight checks", "stdout_lines": ["[preflight] Running pre-flight checks"]}

image

Repository owner deleted a comment from murez May 13, 2022
Repository owner deleted a comment from hlyf-xs Oct 24, 2022
Repository owner deleted a comment from murez Oct 24, 2022
Repository owner deleted a comment from murez Oct 24, 2022
@c-x-z

c-x-z commented Dec 7, 2023

A question for everyone: whenever I run /bin/bash quick-start-service.sh, I keep hitting this problem:
grafana is not ready yet. Please wait for a moment!
grafana is not ready yet. Please wait for a moment!
grafana is not ready yet. Please wait for a moment!
An issue occure when starting up grafana
2023-12-07 03:10:30,039 [ERROR] - deployment.paiLibrary.common.linux_shell : Failed to execute the start script of service grafana

How should I fix this? Thanks, everyone.

@siaimes
Owner Author

siaimes commented Dec 7, 2023

> A question for everyone: whenever I run /bin/bash quick-start-service.sh, I keep hitting this problem:
> grafana is not ready yet. Please wait for a moment!
> grafana is not ready yet. Please wait for a moment!
> grafana is not ready yet. Please wait for a moment!
> An issue occure when starting up grafana
> 2023-12-07 03:10:30,039 [ERROR] - deployment.paiLibrary.common.linux_shell : Failed to execute the start script of service grafana
>
> How should I fix this? Thanks, everyone.

Try the new version: https://siaimes.github.io/2022/12/11/p63.html

@c-x-z

c-x-z commented Dec 19, 2023

> > A question for everyone: whenever I run /bin/bash quick-start-service.sh, I keep hitting this problem:
> > grafana is not ready yet. Please wait for a moment!
> > grafana is not ready yet. Please wait for a moment!
> > grafana is not ready yet. Please wait for a moment!
> > An issue occure when starting up grafana
> > 2023-12-07 03:10:30,039 [ERROR] - deployment.paiLibrary.common.linux_shell : Failed to execute the start script of service grafana
> >
> > How should I fix this? Thanks, everyone.
>
> Try the new version: https://siaimes.github.io/2022/12/11/p63.html

Thanks, but I have already tried it and it still doesn't work. I'm not sure which step I got wrong, so I'll keep trying. Thank you very much.

6 participants