Apache Hive's pros and cons...
Pros/advantages:
- It is built on top of the Hadoop distributed framework.
- It helps query large datasets residing in distributed storage.
- It is a distributed data warehouse.
- Queries data using a SQL-like language called HiveQL (HQL).
- HiveQL is a declarative language like SQL.
- Table structures are similar to tables in a relational database.
- Multiple users can simultaneously query the data using HiveQL.
- It allows you to write custom MapReduce processes (mappers and reducers) to perform more detailed data analysis.
- Data extract/transform/load (ETL) can be done easily.
- It imposes structure on a variety of data formats.
- It allows access to files stored in the Hadoop Distributed File System (HDFS) or in other data storage systems such as Apache HBase.
- Converting data from one format to another within Hive is simple.
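A few of the points above (structure imposed on files in HDFS, format conversion, declarative queries, custom scripts) can be sketched in HiveQL. This is only an illustration: the table names, column names, HDFS path, and the script my_mapper.py are all hypothetical, and the exact behaviour may differ between Hive versions.

```sql
-- Hypothetical names and paths throughout; adjust to your own data.

-- Impose a table structure on tab-delimited text files already in HDFS.
CREATE EXTERNAL TABLE page_views_raw (
  user_id   BIGINT,
  url       STRING,
  view_date STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE
LOCATION '/data/page_views/raw';

-- Convert the same data to another format (here ORC) with CREATE TABLE AS SELECT.
CREATE TABLE page_views_orc
STORED AS ORC
AS
SELECT user_id, url, view_date
FROM page_views_raw;

-- Declarative, SQL-like query: the ten most viewed URLs.
SELECT url, COUNT(*) AS views
FROM page_views_raw
GROUP BY url
ORDER BY views DESC
LIMIT 10;

-- Custom processing: stream rows through a user-supplied script
-- (my_mapper.py is an assumed script that reads tab-separated lines on stdin).
ADD FILE /path/to/my_mapper.py;
SELECT TRANSFORM (user_id, url)
  USING 'python my_mapper.py'
  AS (user_id, domain)
FROM page_views_raw;
```

The external table leaves the files where they are and only records the schema, so dropping it does not delete the underlying data.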
Cons/limitations:
- It is not designed for online transaction processing (OLTP); it is used only for online analytical processing (OLAP).
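- Hive supports overwriting or appending data, but row-level updates and deletes are not supported in older versions (since Hive 0.14 they are available only on transactional/ACID tables).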
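- Subquery support is limited: older versions of Hive allow subqueries only in the FROM clause (IN/EXISTS subqueries in the WHERE clause were added in Hive 0.13).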
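To make the limitations more concrete, here is a rough HiveQL sketch of the load patterns Hive is built around (overwrite and append) and of a subquery written in the FROM clause. It assumes two hypothetical tables, page_views_raw and page_views_orc, each with the columns user_id, url, and view_date; the date value and the threshold are made up.

```sql
-- Hypothetical tables: page_views_raw and page_views_orc (user_id, url, view_date).

-- Overwrite: replace the whole contents of the target table.
INSERT OVERWRITE TABLE page_views_orc
SELECT user_id, url, view_date
FROM page_views_raw;

-- Append: add rows to what is already there.
INSERT INTO TABLE page_views_orc
SELECT user_id, url, view_date
FROM page_views_raw
WHERE view_date = '2015-01-01';

-- Subquery in the FROM clause (it must be given an alias, here t).
SELECT t.url, t.views
FROM (
  SELECT url, COUNT(*) AS views
  FROM page_views_raw
  GROUP BY url
) t
WHERE t.views > 100;
```

Changing individual rows usually means rewriting a whole table or partition with INSERT OVERWRITE, which is one reason Hive fits batch analytics better than transactional workloads.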
Thanks!