Monday, February 29, 2016

Apache Hive's pros and cons...





Pros/advantages:
  • It is built on top of Hadoop, a distributed processing framework.
  • Helps query large datasets residing in distributed storage.
  • It is a distributed data warehouse.
  • Queries data using a SQL-like language called HiveQL (HQL).
  • HiveQL is a declarative language, like SQL.
  • Table structures are similar to tables in a relational database.
  • Multiple users can query the data simultaneously using HiveQL.
  • Allows users to plug custom MapReduce scripts (mappers and reducers) into queries for more detailed data analysis.
  • Data extract/transform/load (ETL) can be done easily.
  • It imposes structure on a variety of data formats.
  • Allows access to files stored in the Hadoop Distributed File System (HDFS) or in other data storage systems such as Apache HBase.
  • Converting data from one format to another within Hive is simple.
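
To make the custom MapReduce point above more concrete, here is a minimal sketch of a script that could be plugged into a query through HiveQL's TRANSFORM clause. Hive streams each input row to the script as a tab-separated line on stdin and reads tab-separated rows back from stdout. The file name (host_of.py), the table (page_views) and the columns (user_id, page_url, host) are all hypothetical, chosen only for illustration.

```python
#!/usr/bin/env python
"""Sketch of a custom script for Hive's TRANSFORM clause.

Assumed (hypothetical) HiveQL that would call it:

    ADD FILE host_of.py;
    SELECT TRANSFORM (user_id, page_url)
           USING 'python host_of.py'
           AS (user_id, host)
    FROM page_views;
"""
import sys


def transform_line(line):
    """Turn one tab-separated input row into one tab-separated output row."""
    user_id, page_url = line.rstrip("\n").split("\t")
    # Drop the scheme (if any), keep the host part, and lower-case it.
    host = page_url.split("//")[-1].split("/")[0].lower()
    return user_id + "\t" + host


def main(stdin, stdout):
    """Read rows from stdin, write transformed rows to stdout."""
    for line in stdin:
        if line.strip():  # skip blank lines
            stdout.write(transform_line(line) + "\n")


if __name__ == "__main__":
    main(sys.stdin, sys.stdout)
```

Keeping the per-row logic in its own function means it can be checked locally with a few sample lines before the script is shipped to the cluster.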
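Cons/limitations:
  • It is not designed for online transaction processing (OLTP); it is meant for online analytical processing (OLAP).
  • Hive supports overwriting or appending data. Row-level updates and deletes are unavailable in older releases; since Hive 0.14 they work only on tables configured for ACID transactions.
  • Sub-query support is limited: older releases accept sub-queries only in the FROM clause, and WHERE-clause sub-queries (IN, EXISTS) arrived in Hive 0.13.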

Thanks!