
Installing Hadoop

Before installation

$ brew info hadoop
hadoop: stable 2.7.1
Framework for distributed processing of large data sets
https://hadoop.apache.org/
Not installed
From: https://github.com/Homebrew/homebrew/blob/master/Library/Formula/hadoop.rb
==> Caveats
In Hadoop's config file:
  /usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/hadoop-env.sh,
  /usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/mapred-env.sh and
  /usr/local/Cellar/hadoop/2.7.1/libexec/etc/hadoop/yarn-env.sh
$JAVA_HOME has been set to be the output of:
  /usr/libexec/java_home

Installation

$ brew install hadoop

Verifying in standalone mode

$ cd ~/tmp
$ mkdir input
$ echo "A B C B B C" > input/file
$ hadoop jar /usr/local/opt/hadoop/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount input output
$ less output/part-r-00000
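
To cross-check the result without Hadoop, the same count can be approximated with standard Unix tools. This is only a sketch: `expected_wordcount` is a hypothetical helper name, and it assumes words are separated by whitespace and contain only ASCII, so that byte-order sorting matches the job's output order.

```shell
# Hypothetical helper: print "word<TAB>count" lines for a local file,
# roughly mirroring the layout of output/part-r-00000.
expected_wordcount() {
  tr -s '[:space:]' '\n' < "$1" |   # one word per line
    grep -v '^$' |                  # drop empty lines
    LC_ALL=C sort |                 # byte-order sort
    uniq -c |                       # prefix each word with its count
    awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Usage: expected_wordcount input/file
# For "A B C B B C" this prints A 1, B 3, C 2 (tab-separated, one per line).
```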

Verifying in pseudo-distributed mode

Edit the configuration files

/usr/local/opt/hadoop/libexec/etc/hadoop/core-site.xml

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

/usr/local/opt/hadoop/libexec/etc/hadoop/hdfs-site.xml

<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

/usr/local/opt/hadoop/libexec/etc/hadoop/yarn-site.xml

<configuration>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

/usr/local/opt/hadoop/libexec/etc/hadoop/mapred-site.xml

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
</configuration>
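
After editing, it can help to read a value back out of a file to catch typos. The following is a rough sketch, not a real XML parser: `site_property` is a hypothetical helper, and it assumes each `<name>` and `<value>` sits on its own line, as in the snippets above.

```shell
# Hypothetical helper: print the <value> that follows a given <name>
# in a *-site.xml file. Usage: site_property FILE KEY
site_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1; next }
    found && /<value>/ {
      sub(/.*<value>/, "")      # strip everything up to the opening tag
      sub(/<\/value>.*/, "")    # strip the closing tag and anything after
      print
      exit
    }
  ' "$1"
}

# Usage: site_property /usr/local/opt/hadoop/libexec/etc/hadoop/mapred-site.xml mapreduce.framework.name
```

To see the value Hadoop itself resolves, `hdfs getconf -confKey dfs.replication` should print the effective setting.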

Starting the NameNode

$ hdfs namenode -format
$ /usr/local/opt/hadoop/sbin/start-dfs.sh

Open http://localhost:50070/dfshealth.html#tab-overview to confirm that the NameNode is running.

Starting the ResourceManager (RM) and NodeManager (NM)

$ /usr/local/opt/hadoop/sbin/start-yarn.sh

Open http://localhost:8088/ to confirm that the ResourceManager is running.
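
Besides the web UIs, `jps` (shipped with the JDK) lists the running Java processes, so it can show whether every daemon came up. A minimal sketch, where `missing_daemons` is a hypothetical helper that reads `jps` output on stdin and prints any expected daemon it does not find:

```shell
# Hypothetical helper: read "PID Name" lines on stdin and print each
# daemon name given as an argument that is absent from the listing.
missing_daemons() {
  listing=$(cat)
  for daemon in "$@"; do
    # Compare the second column exactly so that "NameNode" does not
    # match "SecondaryNameNode".
    printf '%s\n' "$listing" |
      awk -v d="$daemon" '$2 == d { ok = 1 } END { exit !ok }' || echo "$daemon"
  done
}

# Usage: jps | missing_daemons NameNode DataNode SecondaryNameNode ResourceManager NodeManager
```

No output means all five daemons of a pseudo-distributed setup are listed.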

Verifying operation

Create the home directory

$ hadoop fs -mkdir /Users
$ hadoop fs -mkdir /Users/takada

Create the data

$ hadoop fs -mkdir ~/input
$ echo 'A B C B B C' > file
$ hadoop fs -copyFromLocal file ~/input
$ hadoop fs -cat ~/input/file
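
Note that the `~` in these commands is expanded by the local shell, not by HDFS, so Hadoop receives the local home path (here /Users/takada/input). That is why the matching directory was created in HDFS first; HDFS's own default home directory is /user/<username>. A small sketch of the expansion, with `hdfs_input` as an illustrative variable name:

```shell
# The local shell expands the tilde before hadoop ever sees the argument.
hdfs_input=~/input
echo "$hdfs_input"     # same as "$HOME/input"

# Quoting suppresses the expansion, leaving a literal "~" in the path.
echo '~/input'
```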

Run the job

$ hadoop jar /usr/local/opt/hadoop/libexec/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount ~/input ~/output