Building a Cluster with Docker and Running WordCount

Basic Cluster

This repository provides Docker configurations for quickly setting up a Hadoop cluster.
The setup consists of 1 NameNode and 2 DataNodes.
How to use:

$ git clone https://github.com/chengsluo/hadoop-expt.git
$ cd hadoop-expt
$ docker-compose -f hadoop-basic.yml up

Refer to this blog post for more details: http://codito.in/hadoop-cluster-in-docker.
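
Once `docker-compose` has started the containers, it can help to confirm that the expected ones are running before moving on. The following is a minimal sketch and not part of the repository: `cluster_ready` is a hypothetical helper that reads container names on standard input. The name `namenode` matches the one used later in this post. That DataNode container names contain `datanode` is an assumption and may need adjusting to the names defined in `hadoop-basic.yml`.

```shell
# Hypothetical helper: succeeds if the list of container names on stdin
# contains "namenode" and at least two names containing "datanode".
# ASSUMPTION: DataNode containers have "datanode" in their names.
cluster_ready() {
  names="$(cat)"
  # Exact match for the NameNode container name
  printf '%s\n' "$names" | grep -qx 'namenode' || return 1
  # Count the DataNode containers (grep -c prints 0 when nothing matches)
  count="$(printf '%s\n' "$names" | grep -c 'datanode' || true)"
  [ "$count" -ge 2 ]
}
```

A possible use, from a second terminal on the host: `docker ps --format '{{.Names}}' | cluster_ready && echo "cluster is up"`.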

Credits

Many thanks to uhopper for providing excellent base Hadoop images.

Word Count Test

Enter the running namenode container

$ docker exec -it namenode bash

Switch to the mounted volume directory

$ cd /hadoop-data

Create a project directory in the Hadoop file system

$ hadoop fs -mkdir -p hdfs://namenode:8020/project/wordcount

Upload the required data to the Hadoop file system

$ hadoop fs -put marktwain.txt hdfs://namenode:8020/project/wordcount/marktwain.txt

Run the WordCount application from the examples JAR downloaded from the official website

$ hadoop jar hadoop-mapreduce-examples-2.8.3.jar wordcount hdfs://namenode:8020/project/wordcount hdfs://namenode:8020/project/wordcount/result
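
To make clear what the job computes, the sketch below reproduces the same count locally with standard Unix tools. It is an approximation for small files, not the Hadoop implementation: `wordcount_local` is a hypothetical name, and the sketch assumes that the example job splits lines on whitespace only, is case-sensitive, and writes one `word<TAB>count` line per word sorted by key.

```shell
# Hypothetical local equivalent of the WordCount job for small files.
# ASSUMPTION: tokens are separated by whitespace only; counting is case-sensitive.
wordcount_local() {
  tr -s '[:space:]' '\n' < "$1" |   # one token per line
    grep -v '^$' |                   # drop an empty line left by leading whitespace
    LC_ALL=C sort |                  # byte-order sort groups identical tokens
    uniq -c |                        # prefix each distinct token with its count
    awk '{ print $2 "\t" $1 }'       # reorder to word<TAB>count
}
```

Running `wordcount_local marktwain.txt | head` in the mounted directory gives a quick sample to compare against the job output.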

Pull the result from the Hadoop file system

$ hadoop fs -get /project/wordcount/result
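
When the steps above are repeated, two things tend to get in the way: MapReduce refuses to start a job whose output directory already exists, and `hadoop fs -get` does not overwrite an existing local `result` directory. The following sketch clears both before a rerun. The `run` and `rerun_cleanup` helpers and the `DRY_RUN` variable are hypothetical names introduced here for illustration. The HDFS path is the one used in this post.

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# HDFS project directory, taken from the steps in this post
HDFS_DIR=hdfs://namenode:8020/project/wordcount

rerun_cleanup() {
  # -f suppresses the error if the output directory does not exist yet
  run hadoop fs -rm -r -f "$HDFS_DIR/result"
  # Remove the stale local copy (relative to the current directory)
  run rm -rf result
}
```

Calling `DRY_RUN=1 rerun_cleanup` first shows what would be deleted. Without `DRY_RUN`, the commands are executed.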

You can now check the result on your local file system, for example with cat or any text editor:

$ cat result/part-r-00000
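
The result file is ordered by word, so the most frequent words are scattered through it. A small sketch for listing them follows. `top_words` is a hypothetical helper, and it assumes each line of the result has the form `word<TAB>count`.

```shell
# Hypothetical helper: print the N most frequent words from a WordCount result file.
# ASSUMPTION: each line has the form word<TAB>count.
top_words() {
  file="$1"
  n="${2:-10}"              # default to the top 10
  tab="$(printf '\t')"
  # Sort by count (numeric, descending), then by word to break ties
  LC_ALL=C sort -t "$tab" -k2,2nr -k1,1 "$file" | head -n "$n"
}
```

For example, `top_words result/part-r-00000 20` lists the 20 most frequent words.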

Version

Hadoop: 2.8.1/2.8.3
Java: 1.8