
Commit aef3ab9

docs: split the big data section into sub-columns
1 parent 530b6ee commit aef3ab9

6 files changed

Lines changed: 719 additions & 5 deletions


docs/.vuepress/config.js

Lines changed: 34 additions & 5 deletions
@@ -97,8 +97,19 @@ module.exports = {
97 97
{
98 98
text: '大数据',
99 99
items: [
100-
{text: '00-互联网大厂的大数据平台架构', link: '/md/bigdata/大数据平台架构.md'},
101-
{text: '01-数据库的下一站-对象存储', link: '/md/bigdata/数据库的下一站-对象存储.md'},
100+
{text: '大数据平台', items: [
101+
{text: '00-互联网大厂的大数据平台架构', link: '/md/bigdata/大数据平台架构.md'},
102+
{text: '01-对象存储', link: '/md/bigdata/对象存储.md'},
103+
]},
104+
105+
{text: 'Hadoop', items: [
106+
{text: '00-安装下载Hadoop',link: '/md/bigdata/安装下载Hadoop.md'},
107+
{text: '01-HDFS',link: '/md/bigdata/HDFS.md'},
108+
]},
109+
110+
{text: 'Hive', items: [
111+
{text: '01-macOS下 Hive 2.x 的安装与配置',link: '/md/bigdata/01-macOS下 Hive 2.x 的安装与配置.md'},
112+
]},
102 113
]
103 114
},
104 115
{
@@ -275,14 +286,32 @@ module.exports = {
275 286
],
276 287
"/md/bigdata/": [
277 288
{
278-
title: "大数据",
289+
title: "大数据平台",
279 290
collapsable: false,
280 291
sidebarDepth: 0,
281 292
children: [
282 293
"大数据平台架构.md",
283-
"数据库的下一站-对象存储.md",
294+
"对象存储.md",
284 295
]
285-
}
296+
},
297+
{
298+
title: "Hadoop",
299+
collapsable: false,
300+
sidebarDepth: 0,
301+
children: [
302+
"安装下载Hadoop.md",
303+
"HDFS.md",
304+
]
305+
},
306+
{
307+
title: "Hive",
308+
collapsable: false,
309+
sidebarDepth: 0,
310+
children: [
311+
"01-macOS下 Hive 2.x 的安装与配置.md",
312+
]
313+
},
314+
286 315
],
287 316
"/md/rpc/": [
288 317
{
Lines changed: 233 additions & 0 deletions
@@ -0,0 +1,233 @@
1+
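# 01-Installing and Configuring Hive 2.x on macOS
2+
3+
## 1 Introduction
4+
5+
Hive is a data warehouse tool built on Hadoop. It maps structured data files to database tables, offers simple SQL queries, and translates SQL statements into MapReduce jobs for execution.
6+
7+
### Advantages
8+
9+
The learning curve is low: simple MapReduce statistics can be produced quickly with SQL-like statements, without writing a dedicated MapReduce application, which makes Hive well suited to statistical analysis in a data warehouse.
10+
11+
Hive provides a set of tools for extract-transform-load (ETL) work, forming a mechanism for storing, querying, and analyzing large-scale data kept in Hadoop. It defines a simple SQL-like query language, HQL, that lets users who know SQL query the data.
12+
13+
Developers familiar with MapReduce can also plug in custom mappers and reducers for complex analysis that the built-in mappers and reducers cannot handle.
14+
15+
Hive has no dedicated data format of its own. It works well on top of Thrift, lets you control delimiters, and allows users to specify their own data formats.
16+
17+
18+
Hive was originally developed by Facebook, and other companies such as Netflix also use and develop it. Amazon maintains a customized version of Apache Hive, which is included in Amazon Elastic MapReduce as part of Amazon Web Services.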
19+
20+
## 2 Environment
21+
22+
- Hadoop version
23+
hadoop-2.6.0-cdh5.7.0
24+
25+
- MySQL version
26+
![](https://upload-images.jianshu.io/upload_images/16782311-db67b7eb99002aee.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
27+
28+
- mysql-connector-java
29+
5.1.37
30+
31+
- Hive version
32+
2.3.4
33+
34+
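## 3 Installing Hive
35+
36+
### 3.1 Make sure Hadoop is installed and running
37+
38+
### 3.2 Download the Hive package
39+
40+
[Download from an Apache mirror](http://mirror.bit.edu.cn/apache/hive/hive-2.3.4/)
41+
42+
Move the package to:
43+
the `../hadoop-2.6.0-cdh5.7.0/` directory, i.e. the local Hadoop installation directory.
44+
45+
Once it is there, extract it: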
46+
47+
```bash
48+
tar -xzvf apache-hive-2.3.4-bin.tar.gz
49+
```
50+
51+
Rename the extracted directory to `hive` to simplify configuration. For example, the Hive installation path on this machine:
52+
![](https://upload-images.jianshu.io/upload_images/16782311-51937a6fae66f951.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
53+
54+
### 3.3 Configure system environment variables
55+
56+
#### 3.3.1 Edit ~/.bash_profile
57+
58+
Alternatively, edit `/etc/profile`.
59+
60+
```bash
61+
vim ~/.bash_profile
62+
```
63+
64+
Add the following:
65+
66+
```bash
67+
export HIVE_HOME=/Volumes/doc/hadoop-2.6.0-cdh5.7.0/hive
68+
export PATH=$PATH:$HIVE_HOME/bin:$HIVE_HOME/conf
69+
```
70+
71+
After saving and exiting, run the following in the terminal so the environment variables take effect immediately:
72+
73+
```bash
74+
source ~/.bash_profile
75+
```
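After reloading the profile, a quick check can confirm that the variable took effect. The sketch below relies only on the `HIVE_HOME` variable set above; the function name `check_hive_env` is made up for illustration and is not part of Hive.

```shell
#!/bin/sh
# check_hive_env: report whether HIVE_HOME is set and contains a hive launcher.
# (check_hive_env is a hypothetical helper name, not a Hive command.)
check_hive_env() {
    if [ -z "${HIVE_HOME:-}" ]; then
        echo "HIVE_HOME is not set"
        return 1
    fi
    if [ ! -x "$HIVE_HOME/bin/hive" ]; then
        echo "no executable hive launcher under $HIVE_HOME/bin"
        return 1
    fi
    echo "HIVE_HOME OK: $HIVE_HOME"
}

# Print the result without aborting the current shell session.
check_hive_env || true
```

If the check reports that `HIVE_HOME` is not set, the profile was probably not reloaded in the current terminal.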
76+
77+
## 4 Configuring Hive
78+
79+
### 4.1 Create hive-site.xml
80+
81+
In `../hive/conf`:
82+
![](https://upload-images.jianshu.io/upload_images/16782311-4983ccaf4ab9de99.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
83+
84+
Create `hive-site.xml` with the following content:
85+
86+
```xml
87+
<?xml version="1.0" encoding="UTF-8"?>
88+
89+
<configuration>
90+
91+
<!-- Store Hive metadata in a database reached over JDBC; replace HOSTNAME, PORT and DATABASE_NAME with your own values -->
92+
<property>
93+
<name>javax.jdo.option.ConnectionURL</name>
94+
<value>jdbc:mysql://HOSTNAME:PORT/DATABASE_NAME?createDatabaseIfNotExist=true</value>
95+
<description>JDBC connection URL for a JDBC metastore.</description>
96+
</property>
97+
98+
<!-- JDBC driver class -->
99+
<property>
100+
<name>javax.jdo.option.ConnectionDriverName</name>
101+
<value>com.mysql.jdbc.Driver</value>
102+
<description>Driver class name for a JDBC metastore</description>
103+
</property>
104+
105+
<!-- Database user name and password (placeholders to replace with your own values) -->
106+
<property>
107+
<name>javax.jdo.option.ConnectionUserName</name>
108+
<value>USERNAME</value>
109+
<description>username to use against metastore database</description>
110+
</property>
111+
112+
<property>
113+
<name>javax.jdo.option.ConnectionPassword</name>
114+
<value>PASSWORD</value>
115+
<description>password to use against metastore database</description>
116+
</property>
117+
124+
125+
<!-- Location of the Hive warehouse data (on HDFS) -->
126+
<property>
127+
<name>hive.metastore.warehouse.dir</name>
128+
<value>/user/hive/warehouse</value>
129+
<description>location of default database for the warehouse</description>
130+
</property>
131+
132+
<!-- Whether to enforce metastore schema version verification -->
133+
<property>
134+
<name>hive.metastore.schema.verification</name>
135+
<value>false</value>
136+
<description>
137+
Enforce metastore schema version consistency.
138+
True: Fail metastore startup if the schema version does not match.
139+
False: Warn and continue metastore startup even if the schema version does not match.
140+
</description>
141+
</property>
142+
143+
</configuration>
144+
```
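The `javax.jdo.option.ConnectionURL` value above is a template whose host, port, and database name have to be filled in. As a minimal sketch, the helper below assembles such a URL; `metastore_jdbc_url` is a hypothetical name, and the example values (MySQL on `localhost`, MySQL's default port 3306, a database called `hive`) are assumptions rather than settings taken from this setup.

```shell
#!/bin/sh
# metastore_jdbc_url HOST PORT DB: print a MySQL JDBC URL for the Hive metastore.
# (Hypothetical helper; it only formats a string and contacts no database.)
metastore_jdbc_url() {
    printf 'jdbc:mysql://%s:%s/%s?createDatabaseIfNotExist=true\n' "$1" "$2" "$3"
}

# Example values are assumptions: local MySQL, default port, database "hive".
metastore_jdbc_url localhost 3306 hive
```

When more than one query parameter is appended to the URL inside `hive-site.xml`, the separating `&` has to be written as `&amp;` so the file stays well-formed XML.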
145+
146+
147+
148+
![](https://upload-images.jianshu.io/upload_images/16782311-8754cf0dc39cb006.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
149+
150+
### 4.2 hive-env.sh
151+
152+
Copy `hive-env.sh.template` to `hive-env.sh`:
153+
![](https://upload-images.jianshu.io/upload_images/16782311-e07bdf9382b947a2.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
154+
155+
Edit the contents of `hive-env.sh`:
156+
![](https://upload-images.jianshu.io/upload_images/16782311-60758eb384849339.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
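Since the copy step is only shown as a screenshot, here is a rough command-line equivalent. It assumes the configuration directory is `$HIVE_HOME/conf`; `init_hive_env` is a hypothetical helper name.

```shell
#!/bin/sh
# init_hive_env CONF_DIR: create hive-env.sh from the shipped template if missing.
# (init_hive_env is a hypothetical helper name, not a Hive command.)
init_hive_env() {
    conf_dir=$1
    if [ -f "$conf_dir/hive-env.sh" ]; then
        echo "hive-env.sh already exists"
    elif [ -f "$conf_dir/hive-env.sh.template" ]; then
        cp "$conf_dir/hive-env.sh.template" "$conf_dir/hive-env.sh"
        echo "created hive-env.sh"
    else
        echo "template not found in $conf_dir"
        return 1
    fi
}

# Run only when HIVE_HOME is set, so the snippet is harmless elsewhere.
if [ -n "${HIVE_HOME:-}" ]; then
    init_hive_env "$HIVE_HOME/conf" || true
fi
```

The exact edits from the screenshot are not reproduced here. The template ships with commented-out `HADOOP_HOME` and `HIVE_CONF_DIR` entries, which are the usual settings to fill in.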
157+
158+
## 5 MySQL privilege configuration
159+
160+
### 5.1 Grant privileges to the user
161+
162+
Allow the user to log in to the database remotely:
163+
![](https://upload-images.jianshu.io/upload_images/16782311-3e652dd9a335442c.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
164+
165+
If the query above returns a row but its host is `localhost` or some other value, update the entry to match your needs:
166+
167+
```
168+
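grant all privileges on db_name.table_name to 'user_name'@'ip_address' identified by 'password' with grant option;
169+
flush privileges;
170+
```
171+
172+
> db_name: the database to access remotely; use "*" for all databases
173+
> table_name: the table in that database to access remotely; use "*" for all tables
174+
> user_name: the user to be granted remote access
175+
> ip_address: the IP address of the machine allowed to connect remotely; use "%" for all addresses
176+
> password: the password that user will use for remote access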
177+
178+
```
179+
GRANT ALL PRIVILEGES ON *.* TO 'root'@'%' WITH GRANT OPTION;
180+
```
181+
182+
Apply the change immediately:
183+
184+
```
185+
FLUSH PRIVILEGES;
186+
187+
```
188+
189+
![](https://upload-images.jianshu.io/upload_images/16782311-8b9f16bc279d8c54.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
190+
191+
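## 6 Add the MySQL driver to lib
192+
193+
Add the MySQL connector library to Hive's `lib` directory (`$HIVE_HOME/lib`):
194+
195+
[Download the driver](https://dev.mysql.com/downloads/connector/j/)
196+
197+
Extract the downloaded package.
198+
199+
After extracting, locate `mysql-connector-java-8.0.15.jar` in the resulting folder. Note that the environment section lists connector 5.1.37, so use the jar that matches the version you actually downloaded. With Connector/J 8.x the driver class is `com.mysql.cj.jdbc.Driver`; the older `com.mysql.jdbc.Driver` name still loads but is deprecated.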
200+
![](https://upload-images.jianshu.io/upload_images/16782311-fbd0d44ece0a36b5.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
201+
202+
Copy it to `../hive/lib`:
203+
![image.png](https://upload-images.jianshu.io/upload_images/16782311-9336f9b6da0dbb5e.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
204+
205+
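The `/tmp` directory must be writable, and Hadoop must not be in safe mode. To make Hadoop leave safe mode, run `hdfs dfsadmin -safemode leave` (in Hadoop 2.x the older `hadoop dfsadmin` form is deprecated).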
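The preparation steps can be collected into one script. This is a sketch under two assumptions: that the `/tmp` directory meant here is the one on HDFS, as in Hive's getting-started guide, and that the warehouse lives at `/user/hive/warehouse` as configured in `hive-site.xml`. The `run` and `prepare_hdfs_for_hive` helpers and the `DRY_RUN` switch are hypothetical names introduced for illustration.

```shell
#!/bin/sh
# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Leave safe mode, then create the directories Hive needs and make them group-writable.
prepare_hdfs_for_hive() {
    run hdfs dfsadmin -safemode leave
    run hdfs dfs -mkdir -p /tmp
    run hdfs dfs -mkdir -p /user/hive/warehouse
    run hdfs dfs -chmod g+w /tmp
    run hdfs dfs -chmod g+w /user/hive/warehouse
}

prepare_hdfs_for_hive
```

Defaulting to dry-run keeps the snippet harmless to paste; once HDFS is up, running it with `DRY_RUN=0` executes the commands for real.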
206+
207+
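## 7 Starting Hive
208+
209+
HDFS must be running before you run the `hive` command on the command line. You can start HDFS with the `start-dfs.sh` script.
210+
211+
### 7.1 Starting Hive for the first time
212+
213+
Run the schema initialization command first: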
214+
215+
```bash
216+
schematool -dbType mysql -initSchema
217+
```
218+
219+
![](https://upload-images.jianshu.io/upload_images/16782311-62391a0223e9c65d.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
220+
221+
### 7.2 Start Hive
222+
223+
```bash
224+
javaedge@JavaEdgedeMac-mini bin % ./hive
225+
ls: /Users/javaedge/Downloads/soft/spark-2.4.3-bin-2.6.0-cdh5.15.1/lib/spark-assembly-*.jar: No such file or directory
226+
/Users/javaedge/Downloads/soft/hadoop-2.6.0-cdh5.15.1/bin/hadoop: line 148: /Users/javaedge/Downloads/soft/hive-1.1.0-cdh5.15.1/bin/@@HOMEBREW_JAVA@@/bin/java: No such file or directory
227+
/Users/javaedge/Downloads/soft/hadoop-2.6.0-cdh5.15.1/bin/hadoop: line 148: exec: /Users/javaedge/Downloads/soft/hive-1.1.0-cdh5.15.1/bin/@@HOMEBREW_JAVA@@/bin/java: cannot execute: No such file or directory
228+
/usr/local/Cellar/hbase/2.5.3/libexec/bin/hbase: line 401: @@HOMEBREW_JAVA@@/bin/java: No such file or directory
229+
23/03/25 22:40:22 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
230+
231+
Logging initialized using configuration in jar:file:/Users/javaedge/Downloads/soft/hive-1.1.0-cdh5.15.1/lib/hive-common-1.1.0-cdh5.15.1.jar!/hive-log4j.properties
232+
WARNING: Hive CLI is deprecated and migration to Beeline is recommended.
233+
```
