Saturday, June 6, 2020

HOW TO: Set Up Beeline on Linux to Connect to a Remote Hive Instance Using Kerberos

To set up connectivity, first download the binaries required for a successful connection: the Hadoop and Hive binary tarballs (hadoop-2.5.1.tar.gz and apache-hive-1.2.1-bin.tar.gz in this guide).

After downloading the tar files, extract them with the commands below:

tar -xvzf hadoop-2.5.1.tar.gz
tar -xvzf apache-hive-1.2.1-bin.tar.gz

Folder Structure:
Suppose you extracted the tar files in /home/user/beeline. Two new folders, hadoop-2.5.1 and apache-hive-1.2.1-bin, are created at this path. Extract a JRE here as well, into a folder named "jre", and create two empty folders, "conf" and "bin".
Your directory structure is now:

/home/user/beeline
/home/user/beeline/hadoop-2.5.1
/home/user/beeline/apache-hive-1.2.1-bin
/home/user/beeline/jre
/home/user/beeline/conf
/home/user/beeline/bin
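
The layout above can be scripted. The following is a minimal sketch, assuming the tarballs have already been downloaded; the helper names make_beeline_layout and extract_if_present are illustrative only and are not part of Hadoop or Hive.

```shell
# Create the base directory together with the empty "conf" and "bin" folders.
# (Illustrative helper name.)
make_beeline_layout() {
    base="$1"
    mkdir -p "$base/conf" "$base/bin"
}

# Extract a tarball into the base directory, skipping it if the file is absent.
# (Illustrative helper name.)
extract_if_present() {
    tarball="$1"
    base="$2"
    if [ -f "$tarball" ]; then
        tar -xzf "$tarball" -C "$base"
    else
        echo "skipping $tarball (not found)" >&2
    fi
}

# Example usage:
#   make_beeline_layout /home/user/beeline
#   extract_if_present hadoop-2.5.1.tar.gz /home/user/beeline
#   extract_if_present apache-hive-1.2.1-bin.tar.gz /home/user/beeline
```

The JRE is left out on purpose, since its archive name depends on the version you download.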

setEnv.sh File:
Create a setEnv.sh file in the "bin" folder and paste the following content into it:

export HADOOP_HOME=/home/user/beeline/hadoop-2.5.1
export HIVE_HOME=/home/user/beeline/apache-hive-1.2.1-bin
export JAVA_HOME=/home/user/beeline/jre
export PATH=$PATH:$HIVE_HOME/bin:$JAVA_HOME/bin
export HADOOP_OPTS="$HADOOP_OPTS -Dsun.security.krb5.debug=true -Djava.security.krb5.conf=/home/user/beeline/conf/krb5.conf -Djavax.security.auth.useSubjectCredsOnly=false -Djava.security.auth.login.config=/home/user/beeline/conf/jaas.conf"
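
After sourcing setEnv.sh, it can help to confirm that the three home variables point at real directories before going further. The following is a rough sketch; check_beeline_env is a hypothetical helper name, not something shipped with Hadoop or Hive.

```shell
# Print one line per problem and return non-zero if any variable is unset
# or points to a directory that does not exist. (Illustrative helper name.)
check_beeline_env() {
    status=0
    for var in HADOOP_HOME HIVE_HOME JAVA_HOME; do
        # Read the value of the variable whose name is stored in $var.
        eval "dir=\${$var:-}"
        if [ -z "$dir" ]; then
            echo "$var is not set"
            status=1
        elif [ ! -d "$dir" ]; then
            echo "$var points to a missing directory: $dir"
            status=1
        fi
    done
    return "$status"
}

# Example usage:
#   . /home/user/beeline/bin/setEnv.sh && check_beeline_env
```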

jaas.conf File:

Create a jaas.conf file in the "conf" folder with the following content:

Client {
com.sun.security.auth.module.Krb5LoginModule required
useKeyTab=false
useTicketCache=true;
};

krb5.conf File:

Create a krb5.conf file in the "conf" folder and adjust it for your environment (realm, KDC host, and domain names):

[logging]
default = FILE:~/krb5libs.log
kdc = FILE:~/krb5kdc.log
admin_server = FILE:~/kadmind.log
kdc_rotate = {"period"=>"1d", "versions"=>200}
admin_server_rotate = {"period"=>"1d", "versions"=>201}

[libdefaults]
    default_realm = DOMAIN.COM
    dns_lookup_realm = false
    dns_lookup_kdc = false
    forwardable = true
    renew_lifetime = 30d
    ticket_lifetime = 30d
    renewable = yes
    service = yes
    kdc_timeout = 5000
    default_tgs_enctypes = aes256-cts-hmac-sha1-96 aes128-cts arcfour-hmac-md5 des-cbc-crc des-cbc-md5 des-hmac-sha1
    default_tkt_enctypes = aes256-cts-hmac-sha1-96 aes128-cts arcfour-hmac-md5 des-cbc-crc des-cbc-md5 des-hmac-sha1
    allow_weak_crypto = yes
    udp_preference_limit = 1

[realms]
    DOMAIN.COM = {
        kdc = kdcserver.domain.com:88
        default_domain = domain.com
    }

[domain_realm]
    .domain.com = DOMAIN.COM
    domain.com = DOMAIN.COM

[appdefaults]
    pam = {
        debug = false
        forwardable = true
        renew_lifetime = 36000
        ticket_lifetime = 36000
        krb4_convert = false
    }
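
Since this file has to be edited per environment, a quick way to confirm which realm it actually declares can catch copy-paste mistakes. The following is a small sketch using awk; krb5_default_realm is a hypothetical helper name, and it assumes the default_realm entry sits on a single line inside [libdefaults] as in the file above.

```shell
# Print the default_realm value from the [libdefaults] section of the
# krb5.conf file given as the first argument. (Illustrative helper name.)
krb5_default_realm() {
    awk '
        # Track whether the current section header is [libdefaults].
        /^[ \t]*\[/ { in_lib = ($0 ~ /\[libdefaults\]/) }
        in_lib && /^[ \t]*default_realm[ \t]*=/ {
            sub(/^[^=]*=[ \t]*/, "")   # drop everything up to the "="
            sub(/[ \t]+$/, "")         # drop trailing whitespace
            print
            exit
        }
    ' "$1"
}

# Example usage:
#   krb5_default_realm /home/user/beeline/conf/krb5.conf
```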

Source the file and generate a Kerberos ticket:
source /home/user/beeline/bin/setEnv.sh
# kinit requires the "krb5-workstation" rpm
kinit -kt <Location of keytab file>/krbuser.keytab <SPN>
# Check that the ticket was generated successfully
klist
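
For use in scripts, the ticket check can be made non-interactive. The following is a minimal sketch, assuming the MIT Kerberos klist (the one shipped in krb5-workstation), whose -s option prints nothing and sets the exit status according to whether a valid ticket cache exists. The name ticket_status is illustrative.

```shell
# Report whether a valid Kerberos ticket is present in the credential cache.
# Returns 0 when a ticket is found and 1 otherwise. (Illustrative helper name.)
ticket_status() {
    if klist -s 2>/dev/null; then
        echo "valid ticket found"
    else
        echo "no valid ticket; run kinit first"
        return 1
    fi
}

# Example usage:
#   ticket_status || exit 1
```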

Connect to Hive instance:
beeline -u "<JDBC URL>"

For example:
beeline -u "jdbc:hive2://<hive hostname>.domain.com:10000/;principal=hive/<hive hostname>.domain.com@DOMAIN.COM"
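
Because the hostname appears twice in the URL, it is easy to mistype one of the copies. The following sketch builds the URL from its parts; hive_jdbc_url is a hypothetical helper name, hive01.domain.com is a made-up host used only for illustration, and the port defaults to 10000 as in the example above.

```shell
# Build a Kerberos-enabled HiveServer2 JDBC URL.
# Arguments: fully qualified Hive host, Kerberos realm, optional port.
# (Illustrative helper name.)
hive_jdbc_url() {
    host="$1"
    realm="$2"
    port="${3:-10000}"
    printf 'jdbc:hive2://%s:%s/;principal=hive/%s@%s\n' \
        "$host" "$port" "$host" "$realm"
}

# Example usage (hive01.domain.com is a placeholder host):
#   beeline -u "$(hive_jdbc_url hive01.domain.com DOMAIN.COM)"
```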