
Monday, October 30, 2023

How are tables stored in Hive?

Hive stores the data in HDFS and the schema (table metadata) in an RDBMS-backed metastore (Derby, MySQL, etc.).

  1. When a user creates a table, its schema is recorded in the RDBMS.
  2. When data is loaded, files are created in HDFS. A user can also put files directly into the table's HDFS location without touching the RDBMS.
  3. Schema on read: the schema is applied only when the table is read. At that point Hive looks up the schema and, most importantly, the line delimiter and the field delimiter.

Hive splits the file into rows and fields according to these delimiters, and the resulting table is returned to the user.

e.g.

Suppose the table definition sets the line delimiter to '\n' (new line) and the field delimiter to ',' (comma).

Then the file in HDFS would contain:

1,Employee_Name1,1000

2,Employee_Name2,2000

When reading this file, Hive produces a table with 2 rows of 3 columns each.
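
The reading step can be imitated in a few lines of Python. This is only a rough sketch of the idea, not how Hive is implemented: the function `read_table` and the column names `id`, `name` and `salary` are made up for illustration.

```python
# Simplified model of schema-on-read for a delimited text file.
# read_table and the column names are hypothetical, not Hive APIs.

def read_table(raw, columns, line_delim="\n", field_delim=","):
    """Split raw file content into rows, then map fields onto column names."""
    if not raw:
        return []
    # A trailing line delimiter ends the last row; it does not start a new one.
    if raw.endswith(line_delim):
        raw = raw[:-len(line_delim)]
    rows = []
    for line in raw.split(line_delim):
        fields = line.split(field_delim)
        rows.append(dict(zip(columns, fields)))
    return rows


raw = "1,Employee_Name1,1000\n2,Employee_Name2,2000\n"
table = read_table(raw, ["id", "name", "salary"])
for row in table:
    print(row)
```

The file itself carries no column names or types. They come from the schema at read time, which is why the same bytes could be read through a different table definition.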

The interesting part:

  • Even if the file we put directly into HDFS is something unrelated, such as the lyrics of a song, Hive will not throw an exception.
  • Hive only looks for the line delimiter to split the file into rows, and for the field delimiter to split each row into columns.
  • If neither delimiter appears in the file, the whole text of the lyrics ends up in the first column of a single row and the remaining columns are NULL. This assumes the first column is a string type; a value that cannot be converted to the column's type is read as NULL.
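
The behaviour described in the points above can be sketched in Python as well. Again this is an illustration under assumptions rather than Hive's actual code: `read_with_schema`, the schema layout and the column names are hypothetical, and missing or unconvertible fields are modelled as `None` in place of NULL.

```python
# Simplified model of reading an arbitrary text file through a table schema.
# read_with_schema and the column names are hypothetical, not Hive APIs.

def read_with_schema(raw, schema, line_delim="\n", field_delim=","):
    """Apply (name, type) pairs to each line; never raise on bad data."""
    if not raw:
        return []
    if raw.endswith(line_delim):
        raw = raw[:-len(line_delim)]
    rows = []
    for line in raw.split(line_delim):
        fields = line.split(field_delim)
        row = {}
        for i, (name, cast) in enumerate(schema):
            if i >= len(fields):
                row[name] = None          # missing field -> NULL
                continue
            try:
                row[name] = cast(fields[i])
            except ValueError:
                row[name] = None          # unconvertible value -> NULL
        rows.append(row)
    return rows


lyrics = "Twinkle twinkle little star\nHow I wonder what you are\n"

# All-string schema: each lyric line lands in the first column.
as_text = read_with_schema(lyrics, [("c1", str), ("c2", str), ("c3", str)])

# Typed schema like the employee table: nothing parses, so every value is None.
as_employee = read_with_schema(lyrics, [("id", int), ("name", str), ("salary", int)])

print(as_text)
print(as_employee)
```

In both cases the read succeeds without an error. The mismatch between file and schema only shows up as rows full of NULLs.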
