A Few Important Hadoop Commands


  1. To check the block size and replication factor of a file:  
    1. hadoop fs -stat %o /path/to/file   (prints the block size in bytes)
    2. hadoop fs -stat %r /path/to/file   (prints the replication factor)
  2. How to create a file with a different block size and replication factor:
    1. hadoop fs -D dfs.blocksize=67108864 -put localfile /path/in/hdfs   (64 MB blocks; dfs.block.size is the deprecated property name)
    2. hadoop fs -D dfs.replication=2 -put localfile /path/in/hdfs   (the property is dfs.replication, not dfs.replication.factor)
  3. How to change the block size and replication factor of an existing file:
    1. Replication can be changed in place: hadoop fs -setrep -w 4 /path   (recurses into directories; -w waits until replication completes)
    2. Block size cannot be changed in place; there are two ways:
      1. either change dfs.blocksize in hdfs-site.xml & restart the cluster (this affects only files written afterwards),
      2. or copy the files with distcp to another path using the new block size & delete the old ones: hadoop distcp -Ddfs.block.size=XX /path/to/old/files /path/to/new/files/with/larger/block/sizes
  4. To merge all files under an HDFS directory into a single local file: hadoop fs -getmerge <hdfs-src-dir> <local-dst-file>
  5. Start Hadoop ecosystem daemons:
    1. start-dfs.sh / stop-dfs.sh and start-yarn.sh / stop-yarn.sh can be run from the master.
    2. or, hadoop-daemon.sh start namenode/datanode and yarn-daemon.sh start resourcemanager/nodemanager need to be run on the individual nodes.
  6. To dump the FSImage to text (XML): hdfs oiv -p XML -i fsimage_0000000000732482646 -o /data/fsimage.xml
  7. To view the FSImage via web: hdfs oiv -i fsimage_0000000000732482646 (serves the image over read-only WebHDFS, on port 5978 by default); then connect with hdfs dfs -ls -R webhdfs://127.0.0.1:5978/ or curl -i "http://127.0.0.1:5978/webhdfs/v1/?op=LISTSTATUS"
  8. To view the edit logs as XML: hdfs oev -p xml -i <edits-file> -o <output.xml>
  9. To rebalance data across DataNodes: hdfs balancer (hadoop balancer is the deprecated form)
  10. To list the dead nodes: hdfs dfsadmin -report -dead
  11. To check the HDFS usage: hdfs dfs -df -h 

  12. To clean up the trash: hdfs dfs -expunge
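The block-size, replication, and getmerge commands above can be strung together into one small workflow. A minimal sketch, assuming a local file data.txt, a writable HDFS home directory /user/me, and an existing /user/me/logs directory (all paths and values are illustrative):

```shell
# Upload a file with a 64 MB block size and replication factor 2
# (dfs.blocksize is the current property name; dfs.block.size is the deprecated alias)
hadoop fs -D dfs.blocksize=67108864 -D dfs.replication=2 -put data.txt /user/me/data.txt

# Verify: %o prints the block size in bytes, %r the replication factor
hadoop fs -stat "%o %r" /user/me/data.txt

# Raise the replication factor of the existing file; -w waits until done
hadoop fs -setrep -w 4 /user/me/data.txt

# Merge everything under an HDFS directory into a single local file
hadoop fs -getmerge /user/me/logs merged-logs.txt
```

Note that -setrep only changes replication; to change the block size of existing data you still have to rewrite it, e.g. with distcp as in item 3.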
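The fsimage and edit-log inspection in items 6–8 is normally run on a NameNode against files in the metadata directory. A sketch, assuming a metadata directory of /hadoop/dfs/name/current and a hypothetical edit-log segment name (check your dfs.namenode.name.dir for the real paths):

```shell
# Dump the fsimage to XML for offline analysis
hdfs oiv -p XML -i /hadoop/dfs/name/current/fsimage_0000000000732482646 -o /data/fsimage.xml

# Or serve the image over read-only WebHDFS (port 5978 by default) and browse it
hdfs oiv -i /hadoop/dfs/name/current/fsimage_0000000000732482646 &
hdfs dfs -ls -R webhdfs://127.0.0.1:5978/
curl -i "http://127.0.0.1:5978/webhdfs/v1/?op=LISTSTATUS"

# Convert an edit-log segment to XML (segment name here is illustrative)
hdfs oev -p xml -i /hadoop/dfs/name/current/edits_0000000000000000001-0000000000000000100 -o /data/edits.xml
```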
