You can switch your standalone Hadoop setup to pseudo-distributed mode with the following changes:
- In HADOOP_HOME/etc/hadoop/core-site.xml, add (inside the <configuration> element):

      <property>
          <name>fs.defaultFS</name>
          <value>hdfs://localhost:9000</value>
      </property>
- In HADOOP_HOME/etc/hadoop/hdfs-site.xml, add (inside the <configuration> element):

      <property>
          <name>dfs.replication</name>
          <value>1</value>
      </property>
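For reference, a complete minimal core-site.xml carrying the property above would look like this (hdfs-site.xml follows the same shape, with the dfs.replication property instead):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- HADOOP_HOME/etc/hadoop/core-site.xml -->
<configuration>
    <property>
        <!-- Point the default file system at the local HDFS namenode -->
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>
```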
Start and test your Hadoop setup
- First, navigate to HADOOP_HOME
- Format the Hadoop file system
bin/hdfs namenode -format
- Start the NameNode and DataNode daemons
sbin/start-dfs.sh
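If the daemons came up, they should appear in the JVM process list; jps (which ships with the JDK) is a quick way to check. A small sketch, guarded so it does nothing useful but also does no harm on machines without a JDK:

```shell
# List running JVMs; after start-dfs.sh you would expect to see
# NameNode, DataNode and SecondaryNameNode among them.
if command -v jps >/dev/null 2>&1; then
    jps
else
    echo "jps not found (install a JDK)"
fi
```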
- Now you should be able to browse the Hadoop web interface through the NameNode UI (typically http://localhost:9870 on Hadoop 3.x, or http://localhost:50070 on 2.x)
and your Hadoop file system under
Utilities > Browse the file system
- Add /user/[username]
to the Hadoop file system
hdfs dfs -mkdir /user
hdfs dfs -mkdir /user/[username]
You will now be able to see these directories when you browse the file system, and you can list files with
hdfs dfs -ls [path] ( i.e.: hdfs dfs -ls / )
- Copy the input file to the Hadoop file system
hdfs dfs -put [local file] [dfs path]
i.e.: hdfs dfs -put myinput input
and the file will be copied to /user/[username]/input
- Run the application with
hadoop jar [local path to jar file] [path to main class] [input path in dfs] [output location in dfs]
i.e.: hadoop jar myapp.jar test.org.AppRunner input output
The result file part-r-00000 should be saved in the output directory of dfs ( /user/[username]/output )
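The whole workflow above can be recapped as one script. This is only a sketch: "myinput", "myapp.jar" and "test.org.AppRunner" are this answer's example names, and the script is guarded so it is a no-op on machines without Hadoop on the PATH.

```shell
# End-to-end recap of the pseudo-distributed workflow described above.
# Assumes HADOOP_HOME is set and points at your Hadoop installation.
if command -v hdfs >/dev/null 2>&1; then
    hdfs namenode -format                    # one-time: format the namenode
    "$HADOOP_HOME"/sbin/start-dfs.sh         # start NameNode and DataNode
    hdfs dfs -mkdir -p /user/"$USER"         # create your HDFS home directory
    hdfs dfs -put myinput input              # copy the local input file into HDFS
    hadoop jar myapp.jar test.org.AppRunner input output
    hdfs dfs -cat output/part-r-00000        # print the job result
    status=ran
else
    status=skipped                           # Hadoop is not installed here
fi
echo "workflow: $status"
```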