Deploying an etcd Cluster on Rancher

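etcd is a reliable distributed key-value store that uses the Raft algorithm to guarantee consistency. It is mainly used for shared configuration and service discovery. etcd is an open-source project started by CoreOS; its source code is at https://github.com/coreos/etcd.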

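Quite a few components provide configuration sharing and service discovery today. The most widely used and best known is probably ZooKeeper, which many open-source projects, such as Dubbo and Kafka, depend on to varying degrees. In the Golang community, etcd is one of the few components that can rival ZooKeeper, and in some respects it even surpasses ZooKeeper.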

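As a distributed key-value store, etcd implements the Raft protocol in its underlying etcd-raft module, which helps developers build strongly consistent replication quickly. Its high performance, ease of maintenance, and Raft implementation have won over a growing number of developers and made it well known in the Golang community.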

Environment

Rancher v2.3.5
etcd Docker image: ponycool/etcd-3.4.5:1.0

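etcd offers two modes: single-node and cluster. Single-node mode is simple: just run the Docker image. With the default configuration, the etcd server listens on local ports 2379 and 2380. Port 2379 serves client traffic, and port 2380 carries traffic between etcd nodes (mainly Raft protocol messages). Once the etcd server is running, it can be tested with the etcdctl tool. The etcd-3.4.5 image used here is ponycool/etcd-3.4.5:1.0.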
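As a minimal sketch of such a test, the snippet below writes a key through the client port and reads it back. The function name smoke_test, the key greeting, and the default endpoint http://127.0.0.1:2379 are assumptions made for illustration; the ETCDCTL variable exists only so that a different binary path can be substituted.

```shell
# Path to the etcdctl binary; override if it is not on PATH.
ETCDCTL="${ETCDCTL:-etcdctl}"

# smoke_test ENDPOINT
# Writes a key through the client port and reads it back.
# Returns 0 only if the value read matches the value written.
smoke_test() {
  endpoint="${1:-http://127.0.0.1:2379}"   # client port, not the 2380 peer port
  "$ETCDCTL" --endpoints="$endpoint" put greeting "hello" >/dev/null || return 1
  value="$("$ETCDCTL" --endpoints="$endpoint" get greeting --print-value-only)" || return 1
  [ "$value" = "hello" ]
}

# Usage (inside the container, once the server is up):
#   smoke_test http://127.0.0.1:2379 && echo "etcd is serving requests"
```

The check deliberately targets port 2379, because etcdctl speaks to the client listener rather than the peer listener.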

Cluster Deployment

Cluster Nodes

etcd-node1
etcd-node2
etcd-node3

Create the Workload

Configure Port Mappings

Create the ConfigMap

The configuration parameters are as follows:

name: 'node1'
data-dir: /var/etcd
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://etcd-node1:2379
listen-peer-urls: http://0.0.0.0:2380
initial-advertise-peer-urls: http://etcd-node1:2380
initial-cluster-token: 'etcd-cluster'
initial-cluster: node1=http://etcd-node1:2380,node2=http://etcd-node2:2380,node3=http://etcd-node3:2380
initial-cluster-state: 'new'
logger: zap

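Parameter notes:

name: the name of this node in the etcd cluster. It must be unique within a cluster.
listen-peer-urls: the URLs this node listens on for traffic from other cluster members. A node can listen on several URLs, and members exchange data over them, for example leader election messages, Raft messages, and snapshot transfers.
initial-advertise-peer-urls: the peer URLs this node advertises to the rest of the cluster. Other members use this value to reach the node, so it must be an address they can resolve.
listen-client-urls: the URLs this node listens on for client traffic. A node can likewise expose several URLs to clients.
advertise-client-urls: the client URLs this node advertises. etcd proxies, other etcd members, and clients use this value to reach the node.
initial-cluster-token: a unique identifier for the cluster, which protects against unintended interaction between clusters.
initial-cluster: the combined initial-advertise-peer-urls of all members, written as name=URL pairs.
initial-cluster-state: whether the node bootstraps a new cluster ('new') or joins an existing one ('existing').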
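Only name and the advertised URLs differ between the three nodes, so the per-node files can be generated rather than edited by hand. The sketch below assumes the member names node1 to node3 map to the hosts etcd-node1 to etcd-node3 as in this article; the function names initial_cluster and render_config are hypothetical helpers. It also sets advertise-client-urls to the node's own host name, on the assumption that clients should be handed a routable address.

```shell
# Host names of the three workloads; member names are node1..node3.
NODES="etcd-node1 etcd-node2 etcd-node3"

# initial_cluster: prints "node1=http://etcd-node1:2380,node2=..." for all NODES.
initial_cluster() {
  out=""
  i=1
  for h in $NODES; do
    out="${out}${out:+,}node${i}=http://${h}:2380"
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

# render_config INDEX: prints the config file for the INDEX-th node (1-based).
render_config() {
  idx="$1"
  host="etcd-node${idx}"
  cat <<EOF
name: 'node${idx}'
data-dir: /var/etcd
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://${host}:2379
listen-peer-urls: http://0.0.0.0:2380
initial-advertise-peer-urls: http://${host}:2380
initial-cluster-token: 'etcd-cluster'
initial-cluster: $(initial_cluster)
initial-cluster-state: 'new'
logger: zap
EOF
}

# Usage: render_config 2 > conf-node2.yml
```

The listen URLs stay on 0.0.0.0 so the server binds every interface inside the container, while the advertised URLs carry the name other members and clients resolve.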

The full set of etcd configuration parameters (conf.yml) is as follows:

# This is the configuration file for the etcd server.

# Human-readable name for this member.
name: 'default'

# Path to the data directory.
data-dir:

# Path to the dedicated wal directory.
wal-dir:

# Number of committed transactions to trigger a snapshot to disk.
snapshot-count: 10000

# Time (in milliseconds) of a heartbeat interval.
heartbeat-interval: 100

# Time (in milliseconds) for an election to timeout.
election-timeout: 1000

# Raise alarms when backend size exceeds the given quota. 0 means use the
# default quota.
quota-backend-bytes: 0

# List of comma separated URLs to listen on for peer traffic.
listen-peer-urls: http://localhost:2380

# List of comma separated URLs to listen on for client traffic.
listen-client-urls: http://localhost:2379

# Maximum number of snapshot files to retain (0 is unlimited).
max-snapshots: 5

# Maximum number of wal files to retain (0 is unlimited).
max-wals: 5

# Comma-separated white list of origins for CORS (cross-origin resource sharing).
cors:

# List of this member's peer URLs to advertise to the rest of the cluster.
# The URLs needed to be a comma-separated list.
initial-advertise-peer-urls: http://localhost:2380

# List of this member's client URLs to advertise to the public.
# The URLs needed to be a comma-separated list.
advertise-client-urls: http://localhost:2379

# Discovery URL used to bootstrap the cluster.
discovery:

# Valid values include 'exit', 'proxy'
discovery-fallback: 'proxy'

# HTTP proxy to use for traffic to discovery service.
discovery-proxy:

# DNS domain used to bootstrap initial cluster.
discovery-srv:

# Initial cluster configuration for bootstrapping.
initial-cluster:

# Initial cluster token for the etcd cluster during bootstrap.
initial-cluster-token: 'etcd-cluster'

# Initial cluster state ('new' or 'existing').
initial-cluster-state: 'new'

# Reject reconfiguration requests that would cause quorum loss.
strict-reconfig-check: false

# Accept etcd V2 client requests
enable-v2: true

# Enable runtime profiling data via HTTP server
enable-pprof: true

# Valid values include 'on', 'readonly', 'off'
proxy: 'off'

# Time (in milliseconds) an endpoint will be held in a failed state.
proxy-failure-wait: 5000

# Time (in milliseconds) of the endpoints refresh interval.
proxy-refresh-interval: 30000

# Time (in milliseconds) for a dial to timeout.
proxy-dial-timeout: 1000

# Time (in milliseconds) for a write to timeout.
proxy-write-timeout: 5000

# Time (in milliseconds) for a read to timeout.
proxy-read-timeout: 0

client-transport-security:
  # Path to the client server TLS cert file.
  cert-file:

  # Path to the client server TLS key file.
  key-file:

  # Enable client cert authentication.
  client-cert-auth: false

  # Path to the client server TLS trusted CA cert file.
  trusted-ca-file:

  # Client TLS using generated certificates
  auto-tls: false

peer-transport-security:
  # Path to the peer server TLS cert file.
  cert-file:

  # Path to the peer server TLS key file.
  key-file:

  # Enable peer client cert authentication.
  client-cert-auth: false

  # Path to the peer server TLS trusted CA cert file.
  trusted-ca-file:

  # Peer TLS using generated certificates.
  auto-tls: false

# Enable debug-level logging for etcd.
debug: false

logger: zap

# Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd.
log-outputs: [stderr]

# Force to create a new one member cluster.
force-new-cluster: false

auto-compaction-mode: periodic
auto-compaction-retention: "1"

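Download the sample configuration file

Mount the Configuration File

Data Persistence (Optional)

The configuration above stores data in the container's /var/etcd directory. To persist the data, mount that directory on a suitable storage directory or a PVC.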

Modify the Start Command
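The start command itself is not reproduced in the text. A plausible sketch, assuming the ConfigMap is mounted at the hypothetical path /etc/etcd/conf.yml, is to point etcd at the file with the --config-file flag from the etcd usage listing. The wrapper name run_etcd and the ETCD_BIN variable are illustrative only.

```shell
# ETCD_BIN can be overridden when etcd is not on PATH.
ETCD_BIN="${ETCD_BIN:-etcd}"

# run_etcd [CONFIG]: start etcd from a config file.
# /etc/etcd/conf.yml is a hypothetical mount path, not taken from the article.
run_etcd() {
  config="${1:-/etc/etcd/conf.yml}"
  if [ ! -r "$config" ]; then
    echo "config file not readable: $config" >&2
    return 1
  fi
  # With --config-file, etcd ignores other flags and environment variables.
  "$ETCD_BIN" --config-file "$config"
}

# Usage: run_etcd /etc/etcd/conf.yml
```

In a workload definition the equivalent is simply the command etcd with the arguments --config-file and the mounted path; the readability check only makes a missing mount fail with a clear message.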

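Start the Cluster

Create three workloads by following the same procedure as for node1, changing name and initial-advertise-peer-urls for each node. The cluster deployment is then complete, and you can enter any node to check the cluster status.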
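The members need a short while to find each other after the workloads start. The sketch below polls the cluster until it reports healthy, assuming that etcdctl endpoint health exits non-zero while any listed endpoint is unhealthy and that the nodes are reachable on client port 2379. The function name wait_healthy and its retry defaults are hypothetical.

```shell
ETCDCTL="${ETCDCTL:-etcdctl}"
ENDPOINTS="http://etcd-node1:2379,http://etcd-node2:2379,http://etcd-node3:2379"

# wait_healthy [ATTEMPTS] [DELAY_SECONDS]
# Polls "endpoint health" until it succeeds or the attempts run out.
wait_healthy() {
  attempts="${1:-30}"
  delay="${2:-2}"
  n=1
  while [ "$n" -le "$attempts" ]; do
    if "$ETCDCTL" --endpoints="$ENDPOINTS" endpoint health >/dev/null 2>&1; then
      echo "cluster healthy after $n attempt(s)"
      return 0
    fi
    n=$((n + 1))
    sleep "$delay"
  done
  echo "cluster not healthy after $attempts attempt(s)" >&2
  return 1
}

# Usage: wait_healthy 30 2
```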

etcdctl Commands

Check the status of the specified endpoints

etcdctl -w=table --endpoints="http://etcd-node1:2379,http://etcd-node2:2379,http://etcd-node3:2379" endpoint status

Check endpoint health

etcdctl -w=table --endpoints="http://etcd-node1:2379,http://etcd-node2:2379,http://etcd-node3:2379" endpoint health

List cluster members

etcdctl -w table member list
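Without --endpoints, etcdctl talks to the local node only. One way to avoid repeating the endpoint list is the ETCDCTL_ENDPOINTS environment variable, which etcdctl reads as the default for that flag. The sketch below also counts started members; it assumes the default (simple) output of member list, where the second comma-separated field is the member status. The helper name started_members is hypothetical.

```shell
ETCDCTL="${ETCDCTL:-etcdctl}"

# etcdctl reads ETCDCTL_* environment variables as flag defaults.
export ETCDCTL_ENDPOINTS="http://etcd-node1:2379,http://etcd-node2:2379,http://etcd-node3:2379"

# started_members: prints how many members report the "started" status.
# Assumes lines of the form "ID, status, name, peerURLs, clientURLs, ...".
started_members() {
  "$ETCDCTL" member list | awk -F', ' '$2 == "started" { n++ } END { print n + 0 }'
}

# Usage: [ "$(started_members)" -eq 3 ] && echo "all three members started"
```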

The full list of etcd flags is as follows:

Usage:

  etcd [flags]
    Start an etcd server.

  etcd --version
    Show the version of etcd.

  etcd -h | --help
    Show the help information about etcd.

  etcd --config-file
    Path to the server configuration file. Note that if a configuration file is provided, other command line flags and environment variables will be ignored.

  etcd gateway
    Run the stateless pass-through etcd TCP connection forwarding proxy.

  etcd grpc-proxy
    Run the stateless etcd v3 gRPC L7 reverse proxy.


Member:
  --name 'default'
    Human-readable name for this member.
  --data-dir '${name}.etcd'
    Path to the data directory.
  --wal-dir ''
    Path to the dedicated wal directory.
  --snapshot-count '100000'
    Number of committed transactions to trigger a snapshot to disk.
  --heartbeat-interval '100'
    Time (in milliseconds) of a heartbeat interval.
  --election-timeout '1000'
    Time (in milliseconds) for an election to timeout. See tuning documentation for details.
  --initial-election-tick-advance 'true'
    Whether to fast-forward initial election ticks on boot for faster election.
  --listen-peer-urls 'http://localhost:2380'
    List of URLs to listen on for peer traffic.
  --listen-client-urls 'http://localhost:2379'
    List of URLs to listen on for client traffic.
  --max-snapshots '5'
    Maximum number of snapshot files to retain (0 is unlimited).
  --max-wals '5'
    Maximum number of wal files to retain (0 is unlimited).
  --quota-backend-bytes '0'
    Raise alarms when backend size exceeds the given quota (0 defaults to low space quota).
  --backend-batch-interval ''
    BackendBatchInterval is the maximum time before commit the backend transaction.
  --backend-batch-limit '0'
    BackendBatchLimit is the maximum operations before commit the backend transaction.
  --max-txn-ops '128'
    Maximum number of operations permitted in a transaction.
  --max-request-bytes '1572864'
    Maximum client request size in bytes the server will accept.
  --grpc-keepalive-min-time '5s'
    Minimum duration interval that a client should wait before pinging server.
  --grpc-keepalive-interval '2h'
    Frequency duration of server-to-client ping to check if a connection is alive (0 to disable).
  --grpc-keepalive-timeout '20s'
    Additional duration of wait before closing a non-responsive connection (0 to disable).

Clustering:
  --initial-advertise-peer-urls 'http://localhost:2380'
    List of this member's peer URLs to advertise to the rest of the cluster.
  --initial-cluster 'default=http://localhost:2380'
    Initial cluster configuration for bootstrapping.
  --initial-cluster-state 'new'
    Initial cluster state ('new' or 'existing').
  --initial-cluster-token 'etcd-cluster'
    Initial cluster token for the etcd cluster during bootstrap.
    Specifying this can protect you from unintended cross-cluster interaction when running multiple clusters.
  --advertise-client-urls 'http://localhost:2379'
    List of this member's client URLs to advertise to the public.
    The client URLs advertised should be accessible to machines that talk to etcd cluster. etcd client libraries parse these URLs to connect to the cluster.
  --discovery ''
    Discovery URL used to bootstrap the cluster.
  --discovery-fallback 'proxy'
    Expected behavior ('exit' or 'proxy') when discovery services fails.
    "proxy" supports v2 API only.
  --discovery-proxy ''
    HTTP proxy to use for traffic to discovery service.
  --discovery-srv ''
    DNS srv domain used to bootstrap the cluster.
  --discovery-srv-name ''
    Suffix to the dns srv name queried when bootstrapping.
  --strict-reconfig-check 'true'
    Reject reconfiguration requests that would cause quorum loss.
  --pre-vote 'false'
    Enable to run an additional Raft election phase.
  --auto-compaction-retention '0'
    Auto compaction retention length. 0 means disable auto compaction.
  --auto-compaction-mode 'periodic'
    Interpret 'auto-compaction-retention' one of: periodic|revision. 'periodic' for duration based retention, defaulting to hours if no time unit is provided (e.g. '5m'). 'revision' for revision number based retention.
  --enable-v2 'false'
    Accept etcd V2 client requests.

Security:
  --cert-file ''
    Path to the client server TLS cert file.
  --key-file ''
    Path to the client server TLS key file.
  --client-cert-auth 'false'
    Enable client cert authentication.
  --client-crl-file ''
    Path to the client certificate revocation list file.
  --client-cert-allowed-hostname ''
    Allowed TLS hostname for client cert authentication.
  --trusted-ca-file ''
    Path to the client server TLS trusted CA cert file.
  --auto-tls 'false'
    Client TLS using generated certificates.
  --peer-cert-file ''
    Path to the peer server TLS cert file.
  --peer-key-file ''
    Path to the peer server TLS key file.
  --peer-client-cert-auth 'false'
    Enable peer client cert authentication.
  --peer-trusted-ca-file ''
    Path to the peer server TLS trusted CA file.
  --peer-cert-allowed-cn ''
    Required CN for client certs connecting to the peer endpoint.
  --peer-cert-allowed-hostname ''
    Allowed TLS hostname for inter peer authentication.
  --peer-auto-tls 'false'
    Peer TLS using self-generated certificates if --peer-key-file and --peer-cert-file are not provided.
  --peer-crl-file ''
    Path to the peer certificate revocation list file.
  --cipher-suites ''
    Comma-separated list of supported TLS cipher suites between client/server and peers (empty will be auto-populated by Go).
  --cors '*'
    Comma-separated whitelist of origins for CORS, or cross-origin resource sharing, (empty or * means allow all).
  --host-whitelist '*'
    Acceptable hostnames from HTTP client requests, if server is not secure (empty or * means allow all).

Auth:
  --auth-token 'simple'
    Specify a v3 authentication token type and its options ('simple' or 'jwt').
  --bcrypt-cost 10
    Specify the cost / strength of the bcrypt algorithm for hashing auth passwords. Valid values are between 4 and 31.

Profiling and Monitoring:
  --enable-pprof 'false'
    Enable runtime profiling data via HTTP server. Address is at client URL + "/debug/pprof/"
  --metrics 'basic'
    Set level of detail for exported metrics, specify 'extensive' to include histogram metrics.
  --listen-metrics-urls ''
    List of URLs to listen on for the metrics and health endpoints.

Logging:
  --logger 'capnslog'
    Specify 'zap' for structured logging or 'capnslog'. [WARN] 'capnslog' will be deprecated in v3.5.
  --log-outputs 'default'
    Specify 'stdout' or 'stderr' to skip journald logging even when running under systemd, or list of comma separated output targets.
  --log-level 'info'
    Configures log level. Only supports debug, info, warn, error, panic, or fatal.

v2 Proxy (to be deprecated in v4):
  --proxy 'off'
    Proxy mode setting ('off', 'readonly' or 'on').
  --proxy-failure-wait 5000
    Time (in milliseconds) an endpoint will be held in a failed state.
  --proxy-refresh-interval 30000
    Time (in milliseconds) of the endpoints refresh interval.
  --proxy-dial-timeout 1000
    Time (in milliseconds) for a dial to timeout.
  --proxy-write-timeout 5000
    Time (in milliseconds) for a write to timeout.
  --proxy-read-timeout 0
    Time (in milliseconds) for a read to timeout.

Experimental feature:
  --experimental-initial-corrupt-check 'false'
    Enable to check data corruption before serving any client/peer traffic.
  --experimental-corrupt-check-time '0s'
    Duration of time between cluster corruption check passes.
  --experimental-enable-v2v3 ''
    Serve v2 requests through the v3 backend under a given prefix.
  --experimental-backend-bbolt-freelist-type 'array'
    ExperimentalBackendFreelistType specifies the type of freelist that boltdb backend uses(array and map are supported types).
  --experimental-enable-lease-checkpoint 'false'
    ExperimentalEnableLeaseCheckpoint enables primary lessor to persist lease remainingTTL to prevent indefinite auto-renewal of long lived leases.
  --experimental-compaction-batch-limit 1000
    ExperimentalCompactionBatchLimit sets the maximum revisions deleted in each compaction batch.
  --experimental-peer-skip-client-san-verification 'false'
    Skip verification of SAN field in client certificate for peer connections.

Unsafe feature:
  --force-new-cluster 'false'
    Force to create a new one-member cluster.

CAUTIOUS with unsafe flag! It may break the guarantees given by the consensus protocol!

TO BE DEPRECATED:

  --debug 'false'
    Enable debug-level logging for etcd. [WARN] Will be deprecated in v3.5. Use '--log-level=debug' instead.
  --log-package-levels ''
    Specify a particular log level for each etcd package (eg: 'etcdmain=CRITICAL,etcdserver=DEBUG').

Comments | 5 comments

jimmy

2020-07-17

name: 'node1'
data-dir: /var/etcd
listen-client-urls: http://0.0.0.0:2379
advertise-client-urls: http://0.0.0.0:2379
listen-peer-urls: http://0.0.0.0:2380
initial-advertise-peer-urls: http://rancher01:2380
initial-cluster-token: 'etcd-cluster'
initial-cluster: node1=http://rancher01:2380,node2=http://rancher02:2380,node3=http://rancher03:2380
initial-cluster-state: 'new'
logger: zap
1. Pony
  
  2020-07-18
  
  @jimmy: I need to see your workload configuration.
  1. jimmy
    
    2020-07-21
    
    @Pony: Hi, your comment form does not support images, so I have emailed it to you directly.
jimmy

2020-07-09

/ # etcdctl -w=table --endpoints="http://rancher01:2380,http://rancher02:2380,http://rancher03:2380" endpoint health
{"level":"warn","ts":"2020-07-09T09:39:35.178Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-14f5e00b-dd1e-4f9f-87f6-7b65474d9fbd/rancher03:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2020-07-09T09:39:35.178Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-82eb8928-55d9-481c-a26a-b5b64e34ec23/rancher01:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2020-07-09T09:39:35.179Z","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-b4a1fab8-116c-4760-a4aa-97a04f53f7cb/rancher02:2380","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
+-----------------------+--------+--------------+---------------------------+
|       ENDPOINT        | HEALTH |     TOOK     |           ERROR           |
+-----------------------+--------+--------------+---------------------------+
| http://rancher03:2380 |  false | 5.000260661s | context deadline exceeded |
| http://rancher01:2380 |  false | 5.000251523s | context deadline exceeded |
| http://rancher02:2380 |  false | 5.000181459s | context deadline exceeded |
+-----------------------+--------+--------------+---------------------------+

Why do I get this result? The log shows that the host names cannot be resolved:

{"level":"warn","ts":"2020-07-09T09:05:07.531Z","caller":"rafthttp/probing_status.go:70","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"c687a8872137e427","rtt":"0s","error":"dial tcp: lookup rancher03 on 10.43.0.10:53: no such host"}

After replacing the host names with the node IPs, I get connection refused instead. Any help would be appreciated, thanks.
1. Pony
  
  2020-07-12
  
  @jimmy: Please post your configuration so the problem is easier to track down.
