Environment: macOS Ventura 13.5
Machine: MacBook Pro (M1, 2021)
- hadoop-3.3.1-aarch64
- JDK1.8-aarch64
- scala-2.12.14
- spark-3.2.1-bin-hadoop3.2
- pyspark-3.4.1
- mysql-connector-java-8.0.28
- hive-3.1.3-bin
Move hadoop-3.3.1-aarch64.tar.gz, jdk-8u301-linux-aarch64.tar.gz, scala-2.12.14.tgz, spark-3.2.1-bin-hadoop3.2.tgz and pyspark-3.4.1.tar.gz to the resources folder.
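Before building the image, it can help to confirm that all five archives are actually in place. The following is a minimal sketch, not part of the original project: `check_resources` is a hypothetical helper name, and it assumes the folder is named `resources` and sits next to the Dockerfile.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with this project): report which of the
# five archives listed above are absent from the given folder.
check_resources() {
  local dir="${1:-resources}"   # assumed folder name, taken from the step above
  local missing=0
  local f
  for f in \
    hadoop-3.3.1-aarch64.tar.gz \
    jdk-8u301-linux-aarch64.tar.gz \
    scala-2.12.14.tgz \
    spark-3.2.1-bin-hadoop3.2.tgz \
    pyspark-3.4.1.tar.gz
  do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  # Exit status 0 means every archive was found, 1 means at least one is absent.
  return "$missing"
}
```

Running `check_resources resources` from the project root prints one `missing:` line per absent archive and returns a non-zero status if any are absent, so it can be chained in front of the build command with `&&`.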
docker build -f Dockerfile -t puppets/hadoop:1.1 .
sudo docker network create --driver=bridge hadoop
sudo ./start-container.sh
output:
start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
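To double-check that the three containers named in the output above are really running, their names can be compared against the list that `docker ps` reports. This is a rough sketch: `missing_containers` is a hypothetical helper, and it assumes the container names match the ones printed by the start script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read running container names from stdin (one per line)
# and print each expected Hadoop container that is not among them.
missing_containers() {
  local running
  running=$(cat)
  local name
  for name in hadoop-master hadoop-slave1 hadoop-slave2; do
    # -x matches the whole line, -F treats the name as a fixed string.
    printf '%s\n' "$running" | grep -qxF -- "$name" || echo "$name"
  done
}
```

A typical call would be `sudo docker ps --format '{{.Names}}' | missing_containers`. Empty output means all three containers are up.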
docker exec -it hadoop-master bash
./start-hadoop.sh
Because YARN is configured on the hadoop-slave2 node, it also has to be started on hadoop-slave2:
docker exec -it hadoop-slave2 bash
./start-hadoop.sh
./update-mysql-password.sh
Run the job on the master node:
./run-wordcount.sh 3.3.1
output:
input file1.txt:
Hello Hadoop
input file2.txt:
Hello Docker
wordcount output:
Docker 1
Hadoop 1
Hello 2
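The expected counts can be cross-checked locally with standard Unix tools, without touching the cluster. The sketch below only recreates the two sample inputs shown above in a temporary directory and counts words. It is not what `run-wordcount.sh` does internally, and `wordcount` is a hypothetical function name.

```shell
#!/usr/bin/env bash
# Recreate the two sample input files in a scratch directory.
workdir=$(mktemp -d)
printf 'Hello Hadoop\n' > "$workdir/file1.txt"
printf 'Hello Docker\n' > "$workdir/file2.txt"

# Hypothetical local equivalent of a word count:
# put one word per line, sort, count duplicates, then print "word count".
wordcount() {
  cat "$@" | tr -s '[:space:]' '\n' | LC_ALL=C sort | uniq -c | awk '{print $2, $1}'
}

wordcount "$workdir/file1.txt" "$workdir/file2.txt"
```

If the cluster job prints different counts for the same input, the problem is more likely in the cluster setup than in the sample data.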
Initialize the Hive metastore schema in MySQL:
schematool -initSchema -dbType mysql