Saturday, 26 January 2019

make hive transactional

Here we are going to see how to make Hive transactional.


Hive supports ACID transactions from version 0.14 onwards, but they are not enabled out of the box; the following properties have to be set to make transactional tables work:



 hive.support.concurrency – true
 hive.enforce.bucketing – true
 hive.exec.dynamic.partition.mode – nonstrict
 hive.txn.manager –org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
 hive.compactor.initiator.on – true
 hive.compactor.worker.threads – 1
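If you just want to experiment before editing hive-site.xml, most of these can also be set for the current session from the Hive CLI (a sketch using the same values as above):

```sql
-- Session-level equivalents of the hive-site.xml properties
SET hive.support.concurrency=true;
SET hive.enforce.bucketing=true;
SET hive.exec.dynamic.partition.mode=nonstrict;
SET hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
-- The compactor runs inside the metastore service, so these two really
-- belong in hive-site.xml for compaction to actually happen:
SET hive.compactor.initiator.on=true;
SET hive.compactor.worker.threads=1;
```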


To enable transactions permanently, add the following properties to hive-site.xml:

<property>
    <name>hive.support.concurrency</name>
    <value>true</value>
</property>
<property>
    <name>hive.enforce.bucketing</name>
    <value>true</value>
</property>
<property>
    <name>hive.exec.dynamic.partition.mode</name>
    <value>nonstrict</value>
</property>
<property>
    <name>hive.txn.manager</name>
    <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>
<property>
    <name>hive.compactor.initiator.on</name>
    <value>true</value>
</property>
<property>
    <name>hive.compactor.worker.threads</name>
    <value>1</value>
</property>



Only the root user has read-write access to this file. I was logged in as the cloudera user, not root, so I had to switch to root:


[cloudera@quickstart Desktop]$ who
cloudera tty1         2019-01-14 00:39 (:0)
cloudera pts/0        2019-01-24 21:51 (:0.0)
cloudera pts/1        2019-01-24 21:53 (:0.0)
[cloudera@quickstart Desktop]$ su root
Password:
[root@quickstart Desktop]# vi /etc/hive/conf/hive-site.xml

Once I executed the command above, I had edit access and pasted the properties shown earlier into the file, just before the </configuration> tag.


This table is used for staging the raw data:

create table stage_HiveTrans
  (EmployeeID Int, FirstName String, Designation String,
   Salary Int, Department String)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ',';
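I populated the staging table from a comma-separated file; the path and file name below are just placeholders, substitute wherever your data actually lives:

```sql
-- Load a local CSV into the staging table
-- (path is an assumption; replace with your own file)
LOAD DATA LOCAL INPATH '/home/cloudera/employees.csv'
INTO TABLE stage_HiveTrans;
```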
    


HiveTrans is the transactional version of the same table; a transactional table must be bucketed and stored as ORC:

create table HiveTrans
  (EmployeeID Int, FirstName String, Designation String,
   Salary Int, Department String)
clustered by (Department) into 3 buckets
stored as orc TBLPROPERTIES ('transactional'='true');


Copy the data over from the staging table:

from stage_HiveTrans
insert into table HiveTrans
select employeeid, firstname, designation, salary, department;
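To confirm the table really is transactional, check its table parameters:

```sql
-- Verify the transactional flag on the table;
-- look for 'transactional'='true' under Table Parameters in the output
DESCRIBE FORMATTED HiveTrans;
```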


My table before the update:

hive> select * from HiveTrans;
OK
104 sunil web 60000 MGMT  
103 sam web 30000 COR
102 mani web 35000 COR
101 satheesh bigdata 40000 COR

Now update one of the rows:

hive> update HiveTrans 
    >     set salary = 50000
    >     where employeeid = 103;
Query ID = cloudera_20190124224949_cb23a524-bcba-4952-b832-de0d642bbe53
Total jobs = 1
Launching Job 1 out of 1
Number of reduce tasks determined at compile time: 3
In order to change the average load for a reducer (in bytes):
  set hive.exec.reducers.bytes.per.reducer=<number>
In order to limit the maximum number of reducers:
  set hive.exec.reducers.max=<number>
In order to set a constant number of reducers:
  set mapreduce.job.reduces=<number>
Starting Job = job_1547455371417_0143, Tracking URL = http://quickstart.cloudera:8088/proxy/application_1547455371417_0143/
Kill Command = /usr/lib/hadoop/bin/hadoop job  -kill job_1547455371417_0143
Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 3
2019-01-24 22:49:27,712 Stage-1 map = 0%,  reduce = 0%
2019-01-24 22:50:27,858 Stage-1 map = 0%,  reduce = 0%
2019-01-24 22:50:56,777 Stage-1 map = 67%,  reduce = 0%, Cumulative CPU 11.63 sec
2019-01-24 22:51:43,758 Stage-1 map = 100%,  reduce = 0%, Cumulative CPU 13.86 sec
2019-01-24 22:51:57,149 Stage-1 map = 100%,  reduce = 67%, Cumulative CPU 19.48 sec
2019-01-24 22:52:05,677 Stage-1 map = 100%,  reduce = 100%, Cumulative CPU 21.53 sec
MapReduce Total cumulative CPU time: 21 seconds 530 msec
Ended Job = job_1547455371417_0143
Loading data to table satheesh.hivetrans
Table satheesh.hivetest stats: [numFiles=5, numRows=4, totalSize=3150, rawDataSize=0]
MapReduce Jobs Launched: 
Stage-Stage-1: Map: 3  Reduce: 3   Cumulative CPU: 21.53 sec   HDFS Read: 34868 HDFS Write: 1017 SUCCESS
Total MapReduce CPU Time Spent: 21 seconds 530 msec
OK
Time taken: 170.966 seconds

Check whether the row was updated:

hive> select * from HiveTrans;
OK
104 sunil web 60000 MGMT
103 sam web 50000 COR
102 mani web 35000 COR
101 satheesh bigdata 40000 COR


We can also delete rows in the same way; please try it yourself. Follow me for further updates related to Hive.
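A delete looks just like the update; for example, to remove one employee:

```sql
-- Delete a single row from the transactional table
DELETE FROM HiveTrans
WHERE employeeid = 104;
```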


Thanks,
saisma












