The following are the prerequisites for setting up Hive.
You should have the latest stable build of Hadoop up and running. To install Hadoop, please see my previous blog article on Hadoop Setup.
Setting up Hive:
1. Download a stable version of Hive from the Apache download mirrors. For this tutorial we are using Hive 0.12.0; this release works with Hadoop 0.20.x, 1.x, 0.23.x and 2.x.
2. Unpack the compressed Hive archive in your home directory:
tar xvzf hive-0.12.0.tar.gz
3. Create a hive directory under /usr/local as the root user and change its ownership to hduser as shown. This is for our convenience: keeping each framework, application and piece of software under its own user makes them easier to tell apart.
cd /usr/local
mkdir hive
sudo chown -R hduser:hadoop /usr/local/hive
4. Log in as hduser and move the unpacked hive-0.12.0 directory into the /usr/local/hive folder:
mv hive-0.12.0/ /usr/local/hive
5. Set HIVE_HOME in $HOME/.bashrc so that it is set every time you log in.
6. Add the following entries to the .bashrc file:
export HIVE_HOME='/usr/local/hive/hive-0.12.0'
export PATH=$HADOOP_HOME/bin:$HIVE_HOME/bin:$PATH
7. Reload the .bashrc file so that the new variables take effect in the current shell:
source ~/.bashrc
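If you would like to rehearse steps 2 to 7 before touching /usr/local, the sketch below walks through the same sequence inside a throwaway directory. It is only an illustration under a few assumptions: the archive is a stand-in built on the spot around a stub hive script, the names sandbox, stage and profile are made up for this example, and $HADOOP_HOME/bin is left out of PATH to keep the script self-contained.

```shell
#!/bin/sh
# Rehearsal of steps 2-7 in a scratch directory. Nothing here touches
# /usr/local or the real ~/.bashrc; "sandbox", "stage" and "profile" are
# hypothetical names used only for this sketch.
set -e

sandbox="$(mktemp -d)"
cd "$sandbox"

# Build a stand-in for hive-0.12.0.tar.gz that contains a stub bin/hive.
mkdir -p stage/hive-0.12.0/bin
printf '#!/bin/sh\necho "stub hive"\n' > stage/hive-0.12.0/bin/hive
chmod +x stage/hive-0.12.0/bin/hive
tar czf hive-0.12.0.tar.gz -C stage hive-0.12.0

# Step 2: unpack the archive.
tar xzf hive-0.12.0.tar.gz

# Step 3: create the target directory (stands in for /usr/local/hive).
mkdir -p "$sandbox/usr/local/hive"

# Step 4: moving a directory into an existing directory nests it there.
mv hive-0.12.0 "$sandbox/usr/local/hive"

# Steps 5-7: write the exports to a scratch profile and load it.
profile="$sandbox/profile"
cat > "$profile" <<EOF
export HIVE_HOME='$sandbox/usr/local/hive/hive-0.12.0'
export PATH="\$HIVE_HOME/bin:\$PATH"
EOF
. "$profile"

# Record which "hive" the shell now finds first on PATH.
resolved="$(command -v hive)"
echo "hive resolves to: $resolved"

# Clean up the scratch directory.
cd /
rm -rf "$sandbox"
```

The point to notice is step 4: because the target directory already exists, mv places hive-0.12.0 inside it, which is why HIVE_HOME in step 6 ends in /hive/hive-0.12.0 rather than /hive.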
Hive is now set up on top of Hadoop. Let's test it:
8. Start the Hive shell by executing the following command:
hive
9. Create a table in Hive with the following command, then check that the table exists:
create table test (field1 string, field2 string);
show tables;
10. Show extended details of the table:
Describe extended test;
This output confirms that Hive was set up correctly on top of the Hadoop cluster. It’s time to learn HiveQL.
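The test in steps 8 to 10 can also be run without opening the interactive prompt, which is handy if you want to repeat the check later. The sketch below assumes the hive launcher is on your PATH as configured in step 5. The variable name smoke_sql is made up for this example, and "if not exists" is added to the create statement (it is not part of step 9) so that the script can be run more than once.

```shell
#!/bin/sh
# Non-interactive version of steps 8-10. "smoke_sql" is a hypothetical
# variable name; the statements mirror the ones typed at the hive prompt.
smoke_sql='create table if not exists test (field1 string, field2 string);
show tables;
describe extended test;'

if command -v hive >/dev/null 2>&1; then
    # -e runs the quoted HiveQL and exits instead of opening the prompt.
    hive -e "$smoke_sql"
else
    # Nothing is executed when the launcher cannot be found.
    echo "hive not found on PATH; re-check step 5" >&2
fi
```

If the launcher is missing, the script only prints a reminder, so it is safe to run on a machine where the setup is not finished yet.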