Distributed Storage: GlusterFS

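Distributed storage

Distributed storage is a data storage technique that pools the disk space of many machines over the network into a single virtual storage device, so that data is spread across multiple independent machines. A traditional network storage system keeps all data on a centralized storage server; that server becomes the performance bottleneck as well as the focal point for reliability and security, and it cannot meet the needs of large-scale storage applications. A distributed network storage system uses a scalable architecture instead: multiple storage servers share the storage load, and location servers track where the data is stored. This improves reliability, availability, and access efficiency, and makes the system easy to scale out.

In big-data environments the volume of metadata is itself very large, so metadata access performance is key to the performance of the whole distributed file system. Common metadata management architectures are either centralized or distributed.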

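In plain terms:

Combine the storage of multiple storage servers into one whole, then share it remotely over the network. It can be shared as a directory (file storage), a block device (block storage), or an object gateway, i.e. a programmatic interface (object storage).

Common open-source distributed storage software includes GlusterFS, Ceph, HDFS, MooseFS, and FastDFS.

Distributed storage generally offers the following advantages:

Easy to expand; PB scale or beyond is readily reachable
Better read/write performance (LB) or high availability of data (HA)
A single node failure does not bring down the whole architecture
Relatively cheap: it can be built from large numbers of commodity machines, far cheaper than something like a Fibre Channel SAN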

GlusterFS

Introduction to GlusterFS

GlusterFS is a free, open-source distributed file system (it belongs to the file storage category).

Official website: https://www.gluster.org/

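Common volume modes

Volume mode	Description
Replicated	Replicated volume, similar to RAID 1
Striped (for reference only; newer versions will drop this mode and the combined modes built on it)	Striped volume, similar to RAID 0
Distributed	Distributed volume
Distributed Replicated	Combination of distributed and replicated
Dispersed	Erasure-coded volume, similar to RAID 5/RAID 6

GlusterFS can be thought of as combining the storage space of multiple servers and then carving out file storage volumes of different types for clients to use.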
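To make the RAID analogies in the table concrete, here is a minimal Python sketch of how much usable space each mode yields from a set of bricks. The function names are hypothetical, and the rules are a simplification (each mirror group, stripe, or dispersed set is limited by its smallest brick); it is not GlusterFS's own accounting.

```python
def replicated_capacity(brick_sizes, replica=1):
    """Usable space when consecutive groups of `replica` bricks mirror each other.

    replica=1 models a plain distributed volume, replica=len(brick_sizes)
    models a pure replicated volume, and values in between model
    distributed replicated volumes.
    """
    if replica < 1 or len(brick_sizes) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    groups = [brick_sizes[i:i + replica]
              for i in range(0, len(brick_sizes), replica)]
    # A mirror group can hold only as much as its smallest brick.
    return sum(min(group) for group in groups)


def striped_capacity(brick_sizes):
    """RAID 0 style: striping is limited by the smallest brick."""
    return min(brick_sizes) * len(brick_sizes)


def dispersed_capacity(brick_sizes, redundancy):
    """RAID 5/6 style: `redundancy` bricks' worth of space holds coding data."""
    if not 0 < redundancy < len(brick_sizes):
        raise ValueError("redundancy must be between 1 and brick count - 1")
    return min(brick_sizes) * (len(brick_sizes) - redundancy)


if __name__ == "__main__":
    bricks = [100, 100, 100, 100]  # sizes in GB, illustrative values only
    print("distributed:", replicated_capacity(bricks, replica=1))
    print("replicated x4:", replicated_capacity(bricks, replica=4))
    print("distributed replicated x2:", replicated_capacity(bricks, replica=2))
    print("striped:", striped_capacity(bricks))
    print("dispersed, redundancy 1:", dispersed_capacity(bricks, redundancy=1))
```

With four equal bricks, the trade-off is visible at a glance: more copies or more redundancy means less usable space.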

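[Figure: Replicated volume]

[Figure: Striped volume]

[Figure: Distributed volume]

[Figure: Distributed Replicated volume]

For the other modes, see the official documentation: https://docs.gluster.org/en/latest/Administrator%20Guide/Setting%20Up%20Volumes/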

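GlusterFS cluster

Lab environment:

All nodes (including the client) use static IPs (NAT network with Internet access)

All nodes (including the client) have hostnames configured, and every hostname is bound on every node in /etc/hosts (I added short aliases here for convenience; the commands below refer to the four storage nodes as storage1 through storage4)

192.168.100.35   test1.cluster.com		test1 storage1
192.168.100.36   test2.cluster.com		test2 storage2
192.168.100.37   test3.cluster.com		test3 storage3
192.168.100.38   test4.cluster.com		test4 storage4
192.168.100.30   test5.cluster.com		client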

On all nodes (including the client), disable the firewall and SELinux

# systemctl stop firewalld
# systemctl disable firewalld
# iptables -F
# sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
# setenforce 0

Synchronize the time on all nodes (including the client)

Configure yum on all nodes (including the client); the official GlusterFS yum repository must be added

# vim /etc/yum.repos.d/glusterfs.repo
[glusterfs]
name=glusterfs
baseurl=https://buildlogs.centos.org/centos/7/storage/x86_64/gluster-4.1/
enabled=1
gpgcheck=0

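Lab steps:

Install the required packages on all storage servers and start the service
Connect all storage servers to form a cluster
Prepare the storage directory on all storage servers
Create the storage volume
Start the storage volume
Install the mount software on the client
Mount and use the volume on the client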

Lab procedure:

Step 1: install the glusterfs-server package on all storage servers (not on the client) and start the service

Run the following commands on every storage server
# yum install glusterfs-server
# systemctl start glusterd
# systemctl enable glusterd
# systemctl status glusterd

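Distributed clusters generally follow one of two architectures:

With a central node: the central node usually means a management node; most of the distributed cluster architectures covered later are of this kind
Without a central node: every node both manages and does the work; GlusterFS is of this kind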

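Step 2: connect all storage servers to form a cluster

The 4 storage servers do not need to be connected pairwise; pick any one of them and probe each of the other 3 once

I run the following on storage1
storage1# gluster peer probe test2		
storage1# gluster peer probe test3
storage1# gluster peer probe test4		-- an IP, a hostname, or a hostname alias all work here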

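Afterwards, the following command can be run on any storage server to verify the result
# gluster peer status

Note:

If the connection fails at this step, the cause is usually network connectivity, the firewall, SELinux, or hostname binding.

To redo this step, use gluster peer detach xxxxx [force] to break the connection and start over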
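The probing pattern in step 2 (one node probes each of the others once, so N nodes need only N-1 probes) can be sketched in Python. This is a sketch under assumptions: the helper names are hypothetical, and the node names are the aliases used in this lab. By default it only prints the commands; executing them would require running on a storage server with glusterd started.

```python
import subprocess


def peer_probe_commands(nodes, local):
    """Return one `gluster peer probe` command per node other than `local`."""
    if local not in nodes:
        raise ValueError("local node must be one of the cluster nodes")
    return [["gluster", "peer", "probe", node]
            for node in nodes if node != local]


def run_all(commands, dry_run=True):
    """Print the commands, or execute them when dry_run is False."""
    for cmd in commands:
        if dry_run:
            print(" ".join(cmd))
        else:
            # check=True stops at the first probe that fails.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    cluster = ["storage1", "storage2", "storage3", "storage4"]
    run_all(peer_probe_commands(cluster, local="storage1"))
```

Keeping the command construction separate from execution makes it easy to review what would be run before touching the cluster.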

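Step 3: prepare the storage directory on all storage servers (either a dedicated partition or the root partition can be used)

The root partition is used here for the lab,
but in production it is definitely not recommended to keep the data disk and the system disk together
# mkdir -p /data/gv0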

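Step 4: create the storage volume (on any one storage server)

Operations that change state (create, delete, start, stop, etc.) only need to be run on any one storage server; read-only operations (info, etc.) can be run on every storage server

I ran the command below on storage1
The force option is required because the bricks are created on the root partition
replica 4 means replicated mode across 4 servers (similar to RAID 1)

storage1# gluster volume create gv0 replica 4 storage1:/data/gv0/ storage2:/data/gv0/ storage3:/data/gv0/ storage4:/data/gv0/ force
volume create: gv0: success: please start the volume to access data
Note: in production, choose the replica count to fit the actual need. With replica 4 every file is stored 4 times, so the usable capacity is only a quarter of the raw space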
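As a rough illustration of how the create command in step 4 is put together, the following Python sketch builds the argument list and checks one rule before anything is run: the number of bricks has to be a multiple of the replica count. The function name is hypothetical; the sketch only assembles the command and does not execute it.

```python
def volume_create_command(name, bricks, replica=None, force=False):
    """Build the argument list for `gluster volume create`.

    With replica=None a plain distributed volume is requested.
    """
    if not bricks:
        raise ValueError("at least one brick is required")
    if replica is not None:
        if replica < 2:
            raise ValueError("a replica count below 2 is not replication")
        if len(bricks) % replica != 0:
            raise ValueError("brick count must be a multiple of the replica count")
    cmd = ["gluster", "volume", "create", name]
    if replica is not None:
        cmd += ["replica", str(replica)]
    cmd += list(bricks)
    if force:
        # Needed in this lab because the bricks live on the root partition.
        cmd.append("force")
    return cmd


if __name__ == "__main__":
    bricks = ["storage%d:/data/gv0/" % i for i in range(1, 5)]
    print(" ".join(volume_create_command("gv0", bricks, replica=4, force=True)))
```

The same helper covers the later sections: omit `replica` for a distributed volume, or pass `replica=2` with four bricks for a distributed replicated one.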

Step 5: start the storage volume

# gluster volume start gv0

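Step 6: install the software on the client

Run on the client
client# yum install glusterfs glusterfs-fuse -y

Notes:

FUSE (Filesystem in Userspace): a user-space file system framework; here it is the module the client uses to mount remote file storage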

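Step 7: mount and use the volume on the client

Note: the client also needs the storage nodes' hostnames bound in its /etc/hosts file before it can mount

client# mkdir /test0
client# mount -t glusterfs storage4:gv0 /test0
Here the client mounts from storage4, but any of storage1, storage2, or storage3 works just as well. (In other words, the 4 storage servers are both boss and worker. This is a characteristic of GlusterFS; most other distributed storage software has a dedicated management server.)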

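Replica volume test

1. Read/write test: use the dd command on the client to write a file into the mount directory, then check how it is distributed across the storage servers

Note: all reads and writes must be done on the client, never directly on the storage servers

client# dd if=/dev/zero of=/test0/file1 bs=1M count=100

Read/write test result: similar to RAID 1; every storage server holds a full copy of the file

2. Concurrent read/write test: if resources allow, start another virtual machine as client2; once both clients have mounted gv0, they can read and write at the same time (a characteristic of file storage)

A single volume mounted by multiple clients provides sharing as well as concurrent reads and writes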

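Operations mindset:

Once it is built, you need to think about performance, stability, high availability, load balancing, health checks, scalability, and so on

If a node goes down, work out what actually failed (the NIC, the service, a process, or the server being powered off) and how to resolve it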

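Verification tests:

Power off one of the storage nodes

The client has to wait about 10 seconds before it can continue working normally; once the node is started again, the data is synchronized to it

Bring down the NIC on one of the storage nodes

The client has to wait about 10 seconds before it can continue working normally; once the NIC is brought back up, the data is synchronized to the node

Kill the glusterfs-related processes on one of the storage nodes

The client continues working without any wait, but writes are not synchronized to the failed storage node; once its processes are started again, the data is synchronized to it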

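Lessons learned:

Conclusion: as an operations engineer, you will see nodes fail in different ways in HA scenarios:

Server powered off
NIC failure
Network cable unplugged
Switch down
Service process killed by mistake, and so on

When the software cannot handle all of these automatically for us, we may need scripts to help.

Example: if a node is not completely dead, simply power it off so that it is completely dead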
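A helper script of the kind mentioned above could start from something like the following Python sketch. It is an illustration under assumptions, not a finished fencing tool: it probes glusterd's management port (24007) with a plain TCP connect, and `decide_action` with its action names is a hypothetical policy that mirrors the example (a node that is reachable but whose service is dead gets fenced). How host reachability is determined and how the power-off is carried out are left to the caller.

```python
import socket

GLUSTERD_PORT = 24007  # TCP port the glusterd management daemon listens on


def port_open(host, port=GLUSTERD_PORT, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, unreachable, or name resolution failed.
        return False


def decide_action(host_reachable, glusterd_reachable):
    """Map two probe results to an action name (hypothetical policy)."""
    if host_reachable and glusterd_reachable:
        return "ok"
    if host_reachable:
        # The node is half dead: it is up, but the service is gone.
        return "fence"
    return "already-down"


if __name__ == "__main__":
    for node in ["storage1", "storage2", "storage3", "storage4"]:
        # Reachability is assumed True here; a real script would ping first.
        print(node, decide_action(True, port_open(node, timeout=1.0)))
```

Separating the probe from the decision keeps the policy testable without a live cluster.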

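Deleting a volume

Step 1: on the client, umount the mounted directory (delete the test data before the umount)

client# rm -rf   /test0/*  		
client# umount /test0

Step 2: on any storage server, stop gv0 and delete it with the commands below; I do this on storage4

# gluster volume stop gv0
# gluster volume delete gv0

Step 3: check on any storage server; there is no longer any information about gv0, which shows that the volume has been deleted

# gluster volume info gv0

Trivial question: without deleting gv0, can another volume named gv1 be created?

Of course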

Stripe mode (striped)

Step 1: create the stripe-mode volume

storage1# gluster volume create gv0 stripe 4 storage1:/data/gv0/ storage2:/data/gv0/ storage3:/data/gv0/ storage4:/data/gv0/ force
volume create: gv0: success: please start the volume to access data

Step 2: start gv0

storage1# gluster volume start gv0

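Step 3: mount on the client

client# mount -t glusterfs storage1:gv0 /test0

Step 4: read/write test

Read/write test result: files that are too small are not spread evenly across the storage nodes; files above a certain size are. This is similar to RAID 0.

Disk utilization is 100% (provided all nodes contribute the same amount of space; if the sizes differ, striping goes by the smallest)
Large files are spread evenly across the storage nodes (LB)
No HA: if one storage node goes down, the striped volume becomes inaccessible to clients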
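Why small files stay on one node while large files spread out evenly can be seen with a small round-robin simulation in Python. This is a sketch under assumptions: the 128 KiB block size is used as an illustrative default, and the function name is hypothetical; it models the idea of striping, not GlusterFS's actual on-disk layout.

```python
def stripe_layout(file_size, brick_count, block_size=128 * 1024):
    """Return the bytes stored on each brick when blocks are dealt round-robin."""
    if brick_count < 1 or block_size < 1:
        raise ValueError("brick_count and block_size must be positive")
    usage = [0] * brick_count
    offset = 0
    index = 0
    while offset < file_size:
        # The last block of a file may be shorter than block_size.
        chunk = min(block_size, file_size - offset)
        usage[index % brick_count] += chunk
        offset += chunk
        index += 1
    return usage


if __name__ == "__main__":
    print("100 KiB file:", stripe_layout(100 * 1024, 4))
    print("100 MiB file:", stripe_layout(100 * 1024 * 1024, 4))
```

A file smaller than one block never reaches the second brick, whereas a 100 MiB file is cut into 800 blocks that land evenly on the four bricks.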

Distributed mode

Step 1: prepare a new storage directory (on every storage server)

# mkdir -p /data/gv1

Step 2: create the distributed volume gv1

storage1# gluster volume create gv1 storage1:/data/gv1/ storage2:/data/gv1/ storage3:/data/gv1/ storage4:/data/gv1/ force

Step 3: start gv1

storage1# gluster volume start gv1

Step 4: mount on the client

client# mkdir /test1
client# mount -t glusterfs storage1:gv1 /test1

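Step 5: read/write test

Read/write test result: each file is written whole to one of the storage bricks, seemingly at random (GlusterFS picks the brick by hashing the file name), until all of them are full. (No high availability, no load balancing)

100% utilization
Easy to expand
No protection for the data (when a node goes down, it is dropped after roughly 1 minute, and the data on the dropped node is lost to clients)
No improvement in I/O performance either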
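The seemingly random placement described above can be imitated with a few lines of Python. This is a simplified stand-in under stated assumptions: MD5 with a modulo is used here purely for illustration, whereas GlusterFS uses its own hash function and assigns hash ranges to bricks; the function names are hypothetical.

```python
import hashlib
from collections import Counter


def pick_brick(filename, bricks):
    """Choose a brick from a hash of the file name (illustrative stand-in)."""
    digest = hashlib.md5(filename.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an unsigned integer.
    return bricks[int.from_bytes(digest[:4], "big") % len(bricks)]


def placement(filenames, bricks):
    """Count how many of the given files land on each brick."""
    return Counter(pick_brick(name, bricks) for name in filenames)


if __name__ == "__main__":
    bricks = ["storage%d:/data/gv1" % i for i in range(1, 5)]
    names = ["file%d" % i for i in range(1000)]
    for brick, count in sorted(placement(names, bricks).items()):
        print(brick, count)
```

Each file lives on exactly one brick and the same name always maps to the same brick, which is why losing a node makes only that node's files unavailable.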

Distributed-replica mode

Step 1: prepare a new storage directory (on every storage server)

# mkdir -p /data/gv2

Step 2: create the distributed-replica volume gv2

storage1# gluster volume create gv2 replica 2 storage1:/data/gv2/ storage2:/data/gv2/ storage3:/data/gv2/ storage4:/data/gv2/ force

Step 3: start gv2

storage1# gluster volume start gv2

Step 4: mount on the client

client# mkdir /test2
client# mount -t glusterfs storage1:gv2 /test2

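Step 5: read/write test

Read/write test result: the 4 storage bricks are split into two groups. Files are distributed between the two groups as in distributed mode, while the two bricks inside each group mirror each other as in replica mode.

Characteristics:

Combines the advantages of distributed and replica: it can be expanded, and it also provides HA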
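How the four bricks end up in two groups follows from the order in which they are listed on the create command: consecutive bricks form one replica group. The Python sketch below illustrates that grouping and flags groups whose copies would sit on the same host. The function names are hypothetical and the code is only an illustration of the ordering rule.

```python
def replica_sets(bricks, replica):
    """Group bricks into replica sets in the order they were listed."""
    if replica < 1 or len(bricks) % replica != 0:
        raise ValueError("brick count must be a multiple of the replica count")
    return [bricks[i:i + replica] for i in range(0, len(bricks), replica)]


def sets_sharing_a_host(sets):
    """Return the replica sets in which two bricks live on the same host."""
    risky = []
    for group in sets:
        hosts = [brick.split(":", 1)[0] for brick in group]
        if len(set(hosts)) < len(hosts):
            # Both copies would be lost together if this host fails.
            risky.append(group)
    return risky


if __name__ == "__main__":
    bricks = ["storage%d:/data/gv2/" % i for i in range(1, 5)]
    groups = replica_sets(bricks, replica=2)
    print(groups)
    print("risky groups:", sets_sharing_a_host(groups))
```

Listing the bricks in a different order changes which servers mirror each other, so the order on the command line is worth double-checking before creating the volume.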

Dispersed mode

The disperse volume is a volume mode introduced in v3.6, similar to RAID 5/6

Step 1: prepare a new storage directory (on every storage server)

# mkdir -p /data/gv3

Step 2: create the volume gv3

storage1# gluster volume create gv3 disperse 4 storage1:/data/gv3/ storage2:/data/gv3/ storage3:/data/gv3/ storage4:/data/gv3/ force
There isn't an optimal redundancy value for this configuration. Do you want to create the volume with redundancy 1 ? (y/n) y
volume create: gv3: success: please start the volume to access data

Step 3: start gv3

# gluster volume start gv3
# gluster volume info gv3

Step 4: mount on the client

client# mkdir /test3
client# mount -t glusterfs storage1:gv3 /test3

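Step 5: read/write test

Read/write test result: after writing 100M, each storage server holds roughly 33M. Of the 4 bricks, 1 brick's worth of space is redundancy (as with RAID 5), so the data is split across the equivalent of 3 bricks (100M / 3 ≈ 33M).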
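The 33M figure can be reproduced with a short calculation. The Python sketch below assumes the simple model that a file is cut into (bricks - redundancy) data parts, with every brick storing one fragment of that size; the function names are hypothetical, and real fragments carry some extra overhead that is ignored here.

```python
def fragment_size(file_size, bricks, redundancy):
    """Approximate space one file occupies on each brick of a dispersed volume."""
    data_bricks = bricks - redundancy
    if redundancy < 1 or data_bricks < 1:
        raise ValueError("need at least 1 redundancy brick and 1 data brick")
    return file_size / data_bricks


def total_stored(file_size, bricks, redundancy):
    """Approximate raw space consumed across all bricks."""
    return fragment_size(file_size, bricks, redundancy) * bricks


if __name__ == "__main__":
    per_brick = fragment_size(100, bricks=4, redundancy=1)
    print("per brick: %.1f MB" % per_brick)
    print("total raw: %.1f MB" % total_stored(100, bricks=4, redundancy=1))
```

For the lab volume this gives about 33.3 MB per brick and about 133.3 MB of raw space for a 100 MB file, compared with 400 MB for the replica 4 volume created earlier.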

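Online shrinking and online expansion

Whether online shrinking is possible depends on the volume mode; stripe mode, for example, does not allow it. Below I use the distributed volume to demonstrate shrinking and expansion

Online shrinking (be sure to remove a brick that holds no data)

Online expansion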

# gluster volume add-brick gv1 storage5:/data/gv1 force
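For reference, the command sequences typically involved in shrinking and expanding a distributed volume can be laid out as in the Python sketch below. It is a sketch under assumptions: the helper names are hypothetical, the brick names are illustrative, and the commands are only assembled, not executed. The remove-brick start/status/commit sequence migrates data off the brick before it is removed, which is an alternative to removing only empty bricks; the rebalance after add-brick spreads existing files onto the new brick.

```python
def shrink_commands(volume, brick):
    """Commands to drain a brick and then remove it from the volume."""
    base = ["gluster", "volume", "remove-brick", volume, brick]
    return [
        base + ["start"],   # begin migrating data off the brick
        base + ["status"],  # repeat until the migration shows as completed
        base + ["commit"],  # finalize the removal
    ]


def expand_commands(volume, brick):
    """Commands to add a brick and spread existing files onto it."""
    return [
        ["gluster", "volume", "add-brick", volume, brick],
        ["gluster", "volume", "rebalance", volume, "start"],
    ]


if __name__ == "__main__":
    for cmd in shrink_commands("gv1", "storage4:/data/gv1"):
        print(" ".join(cmd))
    for cmd in expand_commands("gv1", "storage4:/data/gv1"):
        print(" ".join(cmd))
```

In this lab the bricks sit on the root partition, so the add-brick command would also need the trailing force option, as in the command above.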

GlusterFS summary:

It belongs to the file storage category. Advantage: data can be shared. Disadvantage: relatively low speed.

Original article: https://www.jb51.cc/wenti/3279797.html
