hui61/hadoop-spark-docker


Run Hadoop, Spark, and Hive within Docker Containers

Environment: macOS Ventura 13.5

Machine: MacBook Pro (M1, 2021)

1. Download resource files

Move hadoop-3.3.1-aarch64.tar.gz, jdk-8u301-linux-aarch64.tar.gz, scala-2.12.14.tgz, spark-3.2.1-bin-hadoop3.2.tgz, and pyspark-3.4.1.tar.gz into the resources folder.
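
Before building, it can help to confirm that all five archives are actually in place. A minimal sketch (archive names are taken from the list above; the check_resources helper itself is hypothetical):

```shell
# check_resources DIR — report any of the expected archives missing from DIR.
check_resources() {
  dir=$1
  missing=0
  for f in hadoop-3.3.1-aarch64.tar.gz \
           jdk-8u301-linux-aarch64.tar.gz \
           scala-2.12.14.tgz \
           spark-3.2.1-bin-hadoop3.2.tgz \
           pyspark-3.4.1.tar.gz; do
    if [ ! -f "$dir/$f" ]; then
      echo "missing: $dir/$f"
      missing=1
    fi
  done
  if [ "$missing" -eq 0 ]; then
    echo "all resource files present"
  fi
}
# usage: check_resources resources
```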

2. Build the Docker image
docker build -f Dockerfile -t puppets/hadoop:1.1 .
3. Create the hadoop bridge network
sudo docker network create --driver=bridge hadoop
4. Start the containers
sudo ./start-container.sh

Output:

start hadoop-master container...
start hadoop-slave1 container...
start hadoop-slave2 container...
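
If the script output differs from the above, the running containers can be checked against docker ps. A small sketch (container names come from the start-container.sh output above; the check_containers helper is hypothetical):

```shell
# check_containers RUNNING — RUNNING is the output of:
#   docker ps --format '{{.Names}}'
# Reports the status of each of the three expected containers.
check_containers() {
  running=$1
  for name in hadoop-master hadoop-slave1 hadoop-slave2; do
    if printf '%s\n' "$running" | grep -qx "$name"; then
      echo "$name: running"
    else
      echo "$name: NOT running"
    fi
  done
}
# usage: check_containers "$(docker ps --format '{{.Names}}')"
```
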
5. Start Hadoop
docker exec -it hadoop-master bash
./start-hadoop.sh

Because YARN is configured on the hadoop-slave2 node, Hadoop must also be started there:

docker exec -it hadoop-slave2 bash
./start-hadoop.sh
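
After both start-hadoop.sh runs, jps inside each container shows which daemons came up. A sketch for parsing that listing (the has_daemon helper is hypothetical; apart from YARN living on hadoop-slave2 as noted above, the daemon layout is an assumption):

```shell
# has_daemon NAME JPS_OUTPUT — succeed if NAME appears in a jps listing.
# jps prints one "<pid> <ClassName>" pair per line.
has_daemon() {
  printf '%s\n' "$2" | awk '{print $2}' | grep -qx "$1"
}
# usage, assuming the NameNode runs on hadoop-master and, per the note
# above, the ResourceManager on hadoop-slave2:
#   has_daemon NameNode "$(docker exec hadoop-master jps)" && echo "NameNode up"
#   has_daemon ResourceManager "$(docker exec hadoop-slave2 jps)" && echo "RM up"
```
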
6. Update the MySQL password
./update-mysql-password.sh
7. Run WordCount

Run the job on the master node:

./run-wordcount.sh 3.3.1

Output:

input file1.txt:
Hello Hadoop

input file2.txt:
Hello Docker

wordcount output:
Docker    1
Hadoop    1
Hello    2
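
The expected counts can be sanity-checked without the cluster by running the same word count over the two input lines with plain shell:

```shell
# Reproduce the WordCount result locally: split on spaces, sort, count
# duplicates. Input lines are the file1.txt/file2.txt contents shown above.
counts=$(printf 'Hello Hadoop\nHello Docker\n' \
  | tr ' ' '\n' | sort | uniq -c \
  | awk '{print $2 "\t" $1}')
echo "$counts"
# Prints:
# Docker  1
# Hadoop  1
# Hello   2
```
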
8. Start Hive

Initialize the Hive metastore schema in MySQL first:
schematool -initSchema -dbType mysql
9. WebUI
