Friday, October 12, 2018

Difference between NameNode, Checkpoint NameNode and BackupNode

  • NameNode is the core of HDFS and manages its metadata: which blocks make up each file and which DataNodes store each block. In simple terms, it is the data about the data being stored. The NameNode maintains the directory tree of all files in HDFS on a Hadoop cluster. It uses the following files to persist the namespace:
    fsimage file: the latest checkpoint of the namespace.
    edits file: a log of the changes made to the namespace since that checkpoint.
  • Checkpoint NameNode has the same directory structure as the NameNode. It creates checkpoints of the namespace at regular intervals by downloading the fsimage and edits files from the NameNode and merging them in its local directory. The merged image is then uploaded back to the NameNode.
    A similar, older role is the Secondary NameNode, which performs the same periodic merge of fsimage and edits. Despite its name, it is not a standby NameNode and cannot take over if the NameNode fails.
  • Backup Node provides the same checkpointing functionality as the Checkpoint node, but it also stays synchronized with the NameNode. It receives a stream of namespace edits from the NameNode and keeps an up-to-date copy of the file system namespace in memory, so it does not need to download the fsimage and edits files at regular intervals. To create a new checkpoint, it only has to save its current in-memory state to a local image file.
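
The difference between the two checkpointing styles can be sketched with a toy model. This is only an illustration under simplifying assumptions: the namespace is a plain dictionary of path to block IDs, the edits log is a list of tuples, and every name here (apply_edit, checkpoint, BackupNode) is hypothetical rather than part of the Hadoop API or its on-disk formats.

```python
# Toy model only: real HDFS stores fsimage and edits in binary formats,
# and none of these names come from the Hadoop API.

def apply_edit(namespace, edit):
    """Apply one logged namespace change to a dict of path -> block IDs."""
    op, path, blocks = edit
    if op == "create":
        namespace[path] = list(blocks)
    elif op == "delete":
        namespace.pop(path, None)
    else:
        raise ValueError(f"unknown op: {op}")


def checkpoint(fsimage, edits):
    """Checkpoint-node style: replay the edits log on top of a copy of
    the last fsimage and return the merged image."""
    merged = {path: list(blocks) for path, blocks in fsimage.items()}
    for edit in edits:
        apply_edit(merged, edit)
    return merged


class BackupNode:
    """Backup-node style: keep an in-memory namespace current by applying
    each edit as it arrives, so a checkpoint is just a snapshot."""

    def __init__(self, fsimage):
        self.namespace = {p: list(b) for p, b in fsimage.items()}

    def receive(self, edit):
        apply_edit(self.namespace, edit)

    def save_checkpoint(self):
        return {p: list(b) for p, b in self.namespace.items()}


# Hypothetical starting image and edits log.
fsimage = {"/logs/a.txt": ["blk_1", "blk_2"]}
edits = [
    ("create", "/logs/b.txt", ["blk_3"]),
    ("delete", "/logs/a.txt", []),
]

# Periodic merge, as a Checkpoint node would do after downloading both files.
new_image = checkpoint(fsimage, edits)

# Streaming updates, as a Backup node would receive them.
backup = BackupNode(fsimage)
for edit in edits:
    backup.receive(edit)

print(new_image)                             # {'/logs/b.txt': ['blk_3']}
print(backup.save_checkpoint() == new_image) # True
```

Both paths end with the same image. The periodic merge has to fetch and replay the whole edits log each time, while the streaming node has already applied every change and only needs to write out what it holds in memory.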