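Abstract: Big data systems are heavy and complex, so for many people who are curious and want to try one out, the barrier to entry is rather high. In this post I use the Apache Kudu project to walk you step by step through building a local cluster from source, then doing some simple data reads and writes and basic cluster management.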
1 Introduction
Note: every step below needs nothing more than Ctrl+C and Ctrl+V :)
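2 Preparation
Before starting, I recommend purchasing a cloud server on Huawei Cloud. So that the later steps go smoothly, the server should meet the following requirements:
·CPU architecture: x86
·Flavor: c6.2xlarge.2 (speeds up compilation)
·Image: public image, CentOS 8.0 64bit
·System disk: High I/O, 100 GB
·Elastic public IP: billed by traffic (speeds up downloads)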
3 Operating System
Install the software packages
[root@ecs-kudu ~]# yum install -y git autoconf automake libtool flex rsync gcc-c++.x86_64 cyrus-sasl-devel.x86_64 cyrus-sasl-plain.x86_64 openssl-devel.x86_64 java-1.8.0-openjdk-devel.x86_64
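Before moving on, it can help to confirm that the tools installed above are actually on PATH. The following is only a sketch and not part of the original steps: `missing_tools` is a hypothetical helper name, and the command names checked (`g++` for gcc-c++, `javac` for java-1.8.0-openjdk-devel) are assumed to be the binaries those packages provide.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on PATH, one per line.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands expected from the packages installed above (assumed mapping:
# gcc-c++ provides g++, java-1.8.0-openjdk-devel provides javac).
missing=$(missing_tools git autoconf automake libtool flex rsync g++ javac)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing tools:" $missing
else
    echo "All build tools found"
fi
```

If anything is reported as missing, rerunning the yum command above is the simplest fix.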
Create a symbolic link
[root@ecs-kudu ~]# cd /usr/bin
[root@ecs-kudu bin]# ln -s python3 python
[root@ecs-kudu bin]# ls -lrt python*
lrwxrwxrwx 1 root root 32 Nov 21 2019 python3.6m -> /usr/libexec/platform-python3.6m
lrwxrwxrwx 1 root root 31 Nov 21 2019 python3.6 -> /usr/libexec/platform-python3.6
lrwxrwxrwx 1 root root 25 Feb 12 10:34 python3 -> /etc/alternatives/python3
lrwxrwxrwx 1 root root 7 Jun 8 19:05 python -> python3
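Running `ln -s` a second time fails because the link already exists. If you expect to repeat these steps, a guarded version along the following lines may be handy. It is a minimal sketch; `ensure_link` is a hypothetical helper name, not something shipped with the system.

```shell
#!/bin/sh
# Hypothetical helper: create LINK -> TARGET only if LINK does not exist yet,
# so the step can be rerun without a "File exists" error.
#   $1 = link target (e.g. python3)
#   $2 = path of the link to create (e.g. /usr/bin/python)
ensure_link() {
    target=$1
    link=$2
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "$link already exists, leaving it alone"
    else
        ln -s "$target" "$link"
    fi
}
```

For the step above, the call would be `ensure_link python3 /usr/bin/python`.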
4 Building from Source
Clone the code:
[root@ecs-kudu ~]# git clone https://github.com/apache/kudu
Build the third-party dependencies
[root@ecs-kudu ~]# cd kudu/
[root@ecs-kudu kudu]# ./thirdparty/build-if-necessary.sh
Build the source
[root@ecs-kudu kudu]# mkdir -p build/release
[root@ecs-kudu kudu]# cd build/release/
[root@ecs-kudu release]# ../../thirdparty/installed/common/bin/cmake -DCMAKE_BUILD_TYPE=release ../..
[root@ecs-kudu release]# make -j8
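The `-j8` above is a fixed job count. On a machine with a different number of cores, you could derive it instead. The sketch below assumes `getconf _NPROCESSORS_ONLN` is available (it is on typical Linux systems); `detect_jobs` is a hypothetical helper name.

```shell
#!/bin/sh
# Pick a parallel job count from the number of online CPUs, falling back
# to 8 (the value used above) if the count cannot be detected.
detect_jobs() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null || true)
    case "$n" in
        ''|*[!0-9]*) n=8 ;;   # empty or non-numeric: use the fallback
    esac
    echo "$n"
}

# Print the resulting make command instead of running it.
echo "make -j$(detect_jobs)"
```

The script only prints the command, so you can check the value before starting a long build.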
5 Deploying the Cluster
Here we use a cluster with 1 master and 3 tservers as the example.
Create the directories
[root@ecs-kudu release]# cd ~/kudu
[root@ecs-kudu kudu]#
mkdir -p cluster/master/wal
mkdir -p cluster/master/data
mkdir -p cluster/master/conf
mkdir -p cluster/master/log
mkdir -p cluster/tserver1/wal
mkdir -p cluster/tserver1/data
mkdir -p cluster/tserver1/conf
mkdir -p cluster/tserver1/log
mkdir -p cluster/tserver2/wal
mkdir -p cluster/tserver2/data
mkdir -p cluster/tserver2/conf
mkdir -p cluster/tserver2/log
mkdir -p cluster/tserver3/wal
mkdir -p cluster/tserver3/data
mkdir -p cluster/tserver3/conf
mkdir -p cluster/tserver3/log
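The sixteen mkdir commands above follow a regular pattern (four nodes, four subdirectories each), so they can also be written as a loop. A minimal sketch, where `make_cluster_dirs` is a hypothetical helper name:

```shell
#!/bin/sh
# Create wal/data/conf/log directories for the master and the three
# tservers under the given root directory.
#   $1 = cluster root (e.g. cluster, when run from ~/kudu)
make_cluster_dirs() {
    root=$1
    for node in master tserver1 tserver2 tserver3; do
        for sub in wal data conf log; do
            mkdir -p "$root/$node/$sub"
        done
    done
}
```

Running `make_cluster_dirs cluster` from `~/kudu` produces the same layout as the commands above.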
Configuration files
·master node configuration
[root@ecs-kudu kudu]# cd cluster
[root@ecs-kudu cluster]# vi master/conf/master.conf
-rpc_bind_addresses=localhost:7051
-webserver_interface=localhost
-webserver_port=8051
-fs_wal_dir=/root/kudu/cluster/master/wal
-fs_data_dirs=/root/kudu/cluster/master/data
-log_dir=/root/kudu/cluster/master/log
-unlock_unsafe_flags
-never_fsync
-time_source=system_unsync
·tserver1 node configuration
[root@ecs-kudu cluster]# vi tserver1/conf/tserver.conf
-rpc_bind_addresses=localhost:7150
-webserver_interface=localhost
-webserver_port=8150
-fs_wal_dir=/root/kudu/cluster/tserver1/wal
-fs_data_dirs=/root/kudu/cluster/tserver1/data
-log_dir=/root/kudu/cluster/tserver1/log
-unlock_unsafe_flags
-never_fsync
-time_source=system_unsync
·tserver2 node configuration
[root@ecs-kudu cluster]# vi tserver2/conf/tserver.conf
-rpc_bind_addresses=localhost:7250
-webserver_interface=localhost
-webserver_port=8250
-fs_wal_dir=/root/kudu/cluster/tserver2/wal
-fs_data_dirs=/root/kudu/cluster/tserver2/data
-log_dir=/root/kudu/cluster/tserver2/log
-unlock_unsafe_flags
-never_fsync
-time_source=system_unsync
·tserver3 node configuration
[root@ecs-kudu cluster]# vi tserver3/conf/tserver.conf
-rpc_bind_addresses=localhost:7350
-webserver_interface=localhost
-webserver_port=8350
-fs_wal_dir=/root/kudu/cluster/tserver3/wal
-fs_data_dirs=/root/kudu/cluster/tserver3/data
-log_dir=/root/kudu/cluster/tserver3/log
-unlock_unsafe_flags
-never_fsync
-time_source=system_unsync
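The four flagfiles above differ only in the node directory and the two port numbers, so they can be generated rather than typed into vi. The sketch below reuses exactly the flags and ports listed above; `write_flagfile` and `write_all_flagfiles` are hypothetical helper names. Note that `-never_fsync` (which is why `-unlock_unsafe_flags` is needed) and `-time_source=system_unsync` trade durability and clock safety for convenience, so this setup is meant for a local test cluster only.

```shell
#!/bin/sh
# Hypothetical helper: write one flagfile.
#   $1 = node directory (e.g. /root/kudu/cluster/tserver1)
#   $2 = flagfile name (master.conf or tserver.conf)
#   $3 = RPC port
#   $4 = web UI port
write_flagfile() {
    node_dir=$1
    mkdir -p "$node_dir/conf"
    cat > "$node_dir/conf/$2" <<EOF
-rpc_bind_addresses=localhost:$3
-webserver_interface=localhost
-webserver_port=$4
-fs_wal_dir=$node_dir/wal
-fs_data_dirs=$node_dir/data
-log_dir=$node_dir/log
-unlock_unsafe_flags
-never_fsync
-time_source=system_unsync
EOF
}

# Write all four flagfiles under the given cluster root, using the
# ports from the listings above (master 7051/8051, tserverN 7N50/8N50).
write_all_flagfiles() {
    root=$1
    write_flagfile "$root/master" master.conf 7051 8051
    for i in 1 2 3; do
        write_flagfile "$root/tserver$i" tserver.conf "7${i}50" "8${i}50"
    done
}
```

Calling `write_all_flagfiles /root/kudu/cluster` should reproduce the four files shown above. Pass an absolute path, because the directory values are written into the files as given.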
Start the processes
[root@ecs-kudu cluster]#
../build/release/bin/kudu-master --flagfile=./master/conf/master.conf &
../build/release/bin/kudu-tserver --flagfile=./tserver1/conf/tserver.conf &
../build/release/bin/kudu-tserver --flagfile=./tserver2/conf/tserver.conf &
../build/release/bin/kudu-tserver --flagfile=./tserver3/conf/tserver.conf &
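The four start commands above can also be wrapped in one function that keeps the processes alive after you log out and saves their console output. This is a sketch under assumptions: `start_cluster` is a hypothetical helper, `console.out` is a file name chosen here for illustration, and the `DRY_RUN` switch exists only so the commands can be inspected without launching anything.

```shell
#!/bin/sh
# Hypothetical helper: start the master and the three tservers in the
# background, sending each process's console output to its log directory.
# With DRY_RUN=1 the commands are only printed, not executed.
#   $1 = directory holding kudu-master and kudu-tserver
#   $2 = cluster root that contains master/, tserver1/, ...
start_cluster() {
    bin=$1
    root=$2
    for node in master tserver1 tserver2 tserver3; do
        case "$node" in
            master) exe=kudu-master;  conf=master.conf ;;
            *)      exe=kudu-tserver; conf=tserver.conf ;;
        esac
        flag="--flagfile=$root/$node/conf/$conf"
        if [ "${DRY_RUN:-0}" = 1 ]; then
            echo "$bin/$exe" "$flag"
        else
            # console.out is an illustrative file name, not a Kudu convention.
            nohup "$bin/$exe" "$flag" > "$root/$node/log/console.out" 2>&1 &
            echo "$node started, pid $!"
        fi
    done
}
```

From the `cluster` directory, `DRY_RUN=1 start_cluster ../build/release/bin .` prints the same four commands as above, and dropping `DRY_RUN=1` launches them.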
Trying it out
·"OK" means the cluster is healthy:
[root@ecs-kudu cluster]# ../build/release/bin/kudu cluster ksck localhost:7051
......
OK
[root@ecs-kudu cluster]#
·Write data
[root@ecs-kudu cluster]# ../build/release/bin/kudu perf loadgen localhost:7051 -table_num_hash_partitions=3 -table_num_replicas=3 -num_rows_per_thread=10000 -keep_auto_table
Using auto-created table 'default.loadgen_auto_d1de323678bd4aa5a5dea618cbb8449c'
INSERT report
rows total: 20000
time total: 35.4161 ms
time per row: 0.0017708 ms
[root@ecs-kudu cluster]#
·Read data (note that the table name must match the one created above)
[root@ecs-kudu cluster]# ../build/release/bin/kudu perf table_scan localhost:7051 default.loadgen_auto_d1de323678bd4aa5a5dea618cbb8449c
T 988dc2eef4b84ed4a9f3f085f01407e1 scanned count 6640 cost 0.00353404 seconds
T f1116c276f10455c8d58ff4720383542 scanned count 6649 cost 0.00355792 seconds
T fb010f4e720d4e4293abcda44f51210a scanned count 6711 cost 0.00305647 seconds
Total count 20000 cost 0.0068044 seconds
[root@ecs-kudu cluster]#
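Because the table name is generated on each loadgen run, copying it by hand is easy to get wrong. The sketch below pulls the name out of the "Using auto-created table" line shown in the write step. `extract_table_name` is a hypothetical helper, and it assumes the output line has the form shown above.

```shell
#!/bin/sh
# Hypothetical helper: read loadgen output on stdin and print the name of
# the auto-created table, if such a line is present.
extract_table_name() {
    sed -n "s/^Using auto-created table *'\([^']*\)'.*/\1/p"
}

# Example using the line from the write step above.
echo "Using auto-created table 'default.loadgen_auto_d1de323678bd4aa5a5dea618cbb8449c'" | extract_table_name
```

To use it with a real run, pipe the loadgen command into the function. Whether that line goes to stdout or stderr is an assumption I have not checked, so merging both with `2>&1` before the pipe is the safer choice.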
The steps above cover checking the cluster status and writing to and reading from the cluster. The kudu tool supports many other operations as well; see https://kudu.apache.org/docs/command_line_tools_reference.html
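Right after startup, the tservers may not yet have registered with the master, so the ksck check shown earlier can fail on the first try. One way to handle this is a small retry loop. The sketch below is generic; `retry` is a hypothetical helper, and using it with ksck assumes that `kudu cluster ksck` exits with a non-zero status when the cluster is not healthy.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX times, sleeping DELAY
# seconds between attempts. Returns 0 as soon as the command succeeds,
# or 1 if every attempt fails.
#   $1 = MAX attempts, $2 = DELAY in seconds, remaining args = command
retry() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        if "$@"; then
            return 0
        fi
        attempt=$((attempt + 1))
        if [ "$attempt" -le "$max" ]; then
            sleep "$delay"
        fi
    done
    return 1
}
```

For example, `retry 10 3 ../build/release/bin/kudu cluster ksck localhost:7051` would try the health check up to ten times, three seconds apart.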