RUN HADOOP STREAMING
hadoop jar ${HADOOP_STREAMING} -input 'test/a' -output 'testout' -mapper '/bin/cat' -reducer '/usr/bin/wc -l'
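Before submitting the job, the same mapper and reducer can be tried on a local pipe. This is only a rough sketch under assumptions: the three sample lines and the sample_input helper are made up for illustration, and sort merely stands in for the shuffle between map and reduce. On a cluster with several reducers, each reducer would emit its own count.

```shell
# Hypothetical sample input; on the cluster this would come from -input.
sample_input() { printf 'alpha\nbeta\ngamma\n'; }

# mapper | sort (stand-in for the shuffle) | reducer
line_count=$(sample_input | /bin/cat | sort | /usr/bin/wc -l)

# Some wc implementations pad the number with spaces; strip them.
line_count=$(echo "$line_count" | tr -d ' ')
echo "$line_count"
```

If the local pipe gives the count you expect, the streaming job should report the same number for the same input when it runs with a single reducer.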
RUN HADOOP STREAMING WITH NO REDUCE TASK
hadoop fs -rmr testout;hadoop jar ${HADOOP_STREAMING} -input 'testin' -output 'testout' -mapper '/bin/cat -n' -reducer '' -jobconf mapred.reduce.tasks=0
HOW TO RUN A UNIX COMMAND USING COMMAND SUBSTITUTION IN STREAMING
mawk="awk '{print NR,\$0}'"; hadoop fs -rmr testout;hadoop jar ${HADOOP_STREAMING} -input 'testin' -output 'testout' -mapper '/bin/cat -n' -reducer "${mawk}"
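The quoting in the mawk assignment is easy to get wrong, so it can help to inspect the variable and try it on a local pipe first. The following is a minimal sketch, not a description of how Hadoop Streaming itself parses the -reducer string: the two sample lines are invented, sort stands in for the shuffle, and sh -c is used here only as a local way to run the stored command string.

```shell
# Same assignment as in the job command: \$0 stops the outer shell
# from expanding $0 inside the double quotes.
mawk="awk '{print NR,\$0}'"

# Show the exact string that would be passed to -reducer.
echo "${mawk}"

# Local stand-in for the job: mapper | sort | reducer.
# The sample lines are hypothetical.
numbered=$(printf 'x\ny\n' | /bin/cat -n | sort | sh -c "${mawk}")
echo "${numbered}"
```

Each output line carries two numbers: the record number added by awk, followed by the line number that cat -n added in the map step.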
I like your post and thought maybe you could help me understand this. Whenever I
try to use Java class files as my mapper and/or reducer, I get
the following error:
java.io.IOException: Cannot run program "MapperTst.class":
java.io.IOException: error=2, No such file or directory
I executed the following command on the terminal:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar
contrib/streaming/hadoop-streaming-0.20.203.0.jar -file
/home/hadoop/codes/MapperTst.class -mapper
/home/hadoop/codes/MapperTst.class -file
/home/hadoop/codes/ReducerTst.class -reducer
/home/hadoop/codes/ReducerTst.class -input gutenberg/* -output
gutenberg-outputtstch27
Please let me know where I am going wrong.
Thanks in advance.
Regards
Shrish