Spark Standalone Mode: Translation and Experiments


Spark Standalone Mode

  • Installing Spark Standalone to a Cluster
  • Starting a Cluster Manually
  • Cluster Launch Scripts
  • Connecting an Application to the Cluster
  • Launching Spark Applications
  • Resource Scheduling
  • Monitoring and Logging
  • Running Alongside Hadoop
  • Configuring Ports for Network Security
  • High Availability
    • Standby Masters with ZooKeeper
    • Single-Node Recovery with Local File System

In addition to running on the Mesos or YARN cluster managers, Spark also provides a simple standalone deploy mode. You can launch a standalone cluster either manually, by starting a master and workers by hand, or use our provided launch scripts. It is also possible to run these daemons on a single machine for testing.

Installing Spark Standalone to a Cluster

To install Spark Standalone mode, you simply place a compiled version of Spark on each node on the cluster. You can obtain pre-built versions of Spark with each release or build it yourself.
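
For example, a minimal install on one node might look like the sketch below; the download URL, release name, and /opt location are only assumptions — substitute whatever build and path you actually use.

# Download (or copy) a pre-built Spark release and unpack it on every node.
wget https://archive.apache.org/dist/spark/spark-1.3.1/spark-1.3.1-bin-hadoop2.4.tgz
tar -xzf spark-1.3.1-bin-hadoop2.4.tgz -C /opt
# Point SPARK_HOME at the unpacked directory (e.g. in ~/.bashrc).
export SPARK_HOME=/opt/spark-1.3.1-bin-hadoop2.4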

Starting a Cluster Manually

You can start a standalone master server by executing:

./sbin/start-master.sh

Once started, the master will print out a spark://HOST:PORT URL for itself, which you can use to connect workers to it, or pass as the “master” argument to SparkContext. You can also find this URL on the master’s web UI, which is http://localhost:8080 by default.

Similarly, you can start one or more workers and connect them to the master via:

./bin/spark-class org.apache.spark.deploy.worker.Worker spark://IP:PORT

Once you have started a worker, look at the master’s web UI (http://localhost:8080 by default). You should see the new node listed there, along with its number of CPUs and memory (minus one gigabyte left for the OS).

Finally, the following configuration options can be passed to the master and worker:

  • -h HOST, --host HOST: Hostname to listen on.
  • -i HOST, --ip HOST: Hostname to listen on (deprecated, use -h or --host).
  • -p PORT, --port PORT: Port for service to listen on (default: 7077 for master, random for worker).
  • --webui-port PORT: Port for web UI (default: 8080 for master, 8081 for worker).
  • -c CORES, --cores CORES: Total CPU cores to allow Spark applications to use on the machine (default: all available); only on worker.
  • -m MEM, --memory MEM: Total amount of memory to allow Spark applications to use on the machine, in a format like 1000M or 2G (default: your machine's total RAM minus 1 GB); only on worker.
  • -d DIR, --work-dir DIR: Directory to use for scratch space and job output logs (default: SPARK_HOME/work); only on worker.
  • --properties-file FILE: Path to a custom Spark properties file to load (default: conf/spark-defaults.conf).
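
For instance, a worker could be started by hand with explicit resource limits, roughly as sketched below; the master hostname comes from the experiments in this post, and the core count, memory size, and work directory are arbitrary examples.

./bin/spark-class org.apache.spark.deploy.worker.Worker \
  -c 4 -m 2G -d /data/spark-work \
  --webui-port 8081 \
  spark://worker00:7077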


Experiment:

1. I extracted my compiled spark-1.3.1-bin-2.4.0.tgz (a pre-built package can also be downloaded from the official site) into /opt on worker00 (192.168.170.130), worker01 (192.168.170.131), worker02 (192.168.170.132), and worker03 (192.168.170.133). To avoid disturbing the Spark deployment already on these machines, I renamed the directory to spark-1.3.1-bin-2.4.0-standalone and changed the SPARK_HOME environment variable to point at the spark-1.3.1-bin-2.4.0-standalone directory.

2. Start the master with ./start-master.sh and check with jps that a Master process is running.

3. View the web UI in a local browser.

4. On the other nodes, start a worker with: ./bin/spark-class org.apache.spark.deploy.worker.Worker spark://worker00:7077 &

5. Check the web UI again; one more worker node now shows up. Because I had previously killed a worker process on worker01, one worker is listed as DEAD.


Cluster Launch Scripts

To launch a Spark standalone cluster with the launch scripts, you should create a file called conf/slaves in your Spark directory, which must contain the hostnames of all the machines where you intend to start Spark workers, one per line. If conf/slaves does not exist, the launch scripts defaults to a single machine (localhost), which is useful for testing. Note, the master machine accesses each of the worker machines via ssh. By default, ssh is run in parallel and requires password-less (using a private key) access to be setup. If you do not have a password-less setup, you can set the environment variable SPARK_SSH_FOREGROUND and serially provide a password for each worker.
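
For example, using the worker hostnames from the experiments below (these names are specific to my setup), conf/slaves could be written like this:

# conf/slaves -- one worker hostname (or IP) per line
cat > conf/slaves <<'EOF'
worker01
worker02
worker03
EOF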

Once you’ve set up this file, you can launch or stop your cluster with the following shell scripts, based on Hadoop’s deploy scripts, and available in SPARK_HOME/sbin:

  • sbin/start-master.sh - Starts a master instance on the machine the script is executed on.
  • sbin/start-slaves.sh - Starts a slave instance on each machine specified in the conf/slaves file.
  • sbin/start-all.sh - Starts both a master and a number of slaves as described above.
  • sbin/stop-master.sh - Stops the master that was started via the sbin/start-master.sh script.
  • sbin/stop-slaves.sh - Stops all slave instances on the machines specified in the conf/slaves file.
  • sbin/stop-all.sh - Stops both the master and the slaves as described above.

Note that these scripts must be executed on the machine you want to run the Spark master on, not your local machine.

You can optionally configure the cluster further by setting environment variables in conf/spark-env.sh. Create this file by starting with the conf/spark-env.sh.template, and copy it to all your worker machines for the settings to take effect. The following settings are available:

  • SPARK_MASTER_IP: Bind the master to a specific IP address, for example a public one.
  • SPARK_MASTER_PORT: Start the master on a different port (default: 7077).
  • SPARK_MASTER_WEBUI_PORT: Port for the master web UI (default: 8080).
  • SPARK_MASTER_OPTS: Configuration properties that apply only to the master in the form "-Dx=y" (default: none). See below for a list of possible options.
  • SPARK_LOCAL_DIRS: Directory to use for "scratch" space in Spark, including map output files and RDDs that get stored on disk. This should be on a fast, local disk in your system. It can also be a comma-separated list of multiple directories on different disks.
  • SPARK_WORKER_CORES: Total number of cores to allow Spark applications to use on the machine (default: all available cores).
  • SPARK_WORKER_MEMORY: Total amount of memory to allow Spark applications to use on the machine, e.g. 1000m, 2g (default: total memory minus 1 GB); note that each application's individual memory is configured using its spark.executor.memory property.
  • SPARK_WORKER_PORT: Start the Spark worker on a specific port (default: random).
  • SPARK_WORKER_WEBUI_PORT: Port for the worker web UI (default: 8081).
  • SPARK_WORKER_INSTANCES: Number of worker instances to run on each machine (default: 1). You can make this more than 1 if you have very large machines and would like multiple Spark worker processes. If you do set this, make sure to also set SPARK_WORKER_CORES explicitly to limit the cores per worker, or else each worker will try to use all the cores.
  • SPARK_WORKER_DIR: Directory to run applications in, which will include both logs and scratch space (default: SPARK_HOME/work).
  • SPARK_WORKER_OPTS: Configuration properties that apply only to the worker in the form "-Dx=y" (default: none). See below for a list of possible options.
  • SPARK_DAEMON_MEMORY: Memory to allocate to the Spark master and worker daemons themselves (default: 512m).
  • SPARK_DAEMON_JAVA_OPTS: JVM options for the Spark master and worker daemons themselves in the form "-Dx=y" (default: none).
  • SPARK_PUBLIC_DNS: The public DNS name of the Spark master and workers (default: none).
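
Putting a few of these together, a minimal conf/spark-env.sh might look like the sketch below; the JDK path is the one used in the experiments later in this post, and the core, memory, and directory values are arbitrary examples.

# conf/spark-env.sh -- copy this file to every worker for the settings to take effect
export JAVA_HOME=/opt/jdk1.7.0_67
export SPARK_WORKER_CORES=2               # cores each worker offers to applications
export SPARK_WORKER_MEMORY=2g             # memory each worker offers to applications
export SPARK_WORKER_DIR=/opt/spark-work   # scratch space and application logs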

Note: The launch scripts do not currently support Windows. To run a Spark cluster on Windows, start the master and workers by hand.

SPARK_MASTER_OPTS supports the following system properties:

  • spark.deploy.retainedApplications (default: 200): The maximum number of completed applications to display. Older applications will be dropped from the UI to maintain this limit.
  • spark.deploy.retainedDrivers (default: 200): The maximum number of completed drivers to display. Older drivers will be dropped from the UI to maintain this limit.
  • spark.deploy.spreadOut (default: true): Whether the standalone cluster manager should spread applications out across nodes or try to consolidate them onto as few nodes as possible. Spreading out is usually better for data locality in HDFS, but consolidating is more efficient for compute-intensive workloads.
  • spark.deploy.defaultCores (default: infinite): Default number of cores to give to applications in Spark's standalone mode if they don't set spark.cores.max. If not set, applications always get all available cores unless they configure spark.cores.max themselves. Set this lower on a shared cluster to prevent users from grabbing the whole cluster by default.
  • spark.worker.timeout (default: 60): Number of seconds after which the standalone deploy master considers a worker lost if it receives no heartbeats.
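
These properties go into SPARK_MASTER_OPTS in conf/spark-env.sh; for example (the particular values here are arbitrary):

# Keep fewer finished applications in the UI and time out lost workers faster.
export SPARK_MASTER_OPTS="-Dspark.deploy.retainedApplications=50 -Dspark.worker.timeout=30"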

SPARK_WORKER_OPTS supports the following system properties:

  • spark.worker.cleanup.enabled (default: false): Enable periodic cleanup of worker / application directories. Note that this only affects standalone mode, as YARN works differently. Only the directories of stopped applications are cleaned up.
  • spark.worker.cleanup.interval (default: 1800, i.e. 30 minutes): Controls the interval, in seconds, at which the worker cleans up old application work dirs on the local machine.
  • spark.worker.cleanup.appDataTtl (default: 7 * 24 * 3600, i.e. 7 days): The number of seconds to retain application work directories on each worker. This is a Time To Live and should depend on the amount of available disk space you have. Application logs and jars are downloaded to each application work dir. Over time, the work dirs can quickly fill up disk space, especially if you run jobs very frequently.
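
Likewise, these are passed through SPARK_WORKER_OPTS in conf/spark-env.sh; a sketch that turns on periodic cleanup (with arbitrary interval and TTL values) might be:

# Clean up stopped applications' work dirs once a day, keeping 3 days of data.
export SPARK_WORKER_OPTS="-Dspark.worker.cleanup.enabled=true -Dspark.worker.cleanup.interval=86400 -Dspark.worker.cleanup.appDataTtl=259200"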


Experiment:

1. Create the conf/slaves file and list the hostname or IP of each worker node, one per line.

2. Create conf/spark-env.sh and add the JAVA_HOME environment variable.

   If JAVA_HOME is not set here, starting the worker nodes fails with a "JAVA_HOME is not set" error.

3. Check the web UI; it shows 1 master and 3 workers.

Note: in my tests, only the master node needs the slaves file; the workers start fine without it. When starting with the launch scripts, however, every worker node must have the spark-env.sh file.

Connecting an Application to the Cluster

To run an application on the Spark cluster, simply pass the spark://IP:PORT URL of the master to the SparkContext constructor.

To run an interactive Spark shell against the cluster, run the following command:

./bin/spark-shell --master spark://IP:PORT

You can also pass an option --total-executor-cores <numCores> to control the number of cores that spark-shell uses on the cluster.
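
For example, using the master from the experiments in this post (the core count is arbitrary):

./bin/spark-shell --master spark://worker00:7077 --total-executor-cores 4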

Launching Spark Applications

The spark-submit script provides the most straightforward way to submit a compiled Spark application to the cluster. For standalone clusters, Spark currently supports two deploy modes. In client mode, the driver is launched in the same process as the client that submits the application. In cluster mode, however, the driver is launched from one of the Worker processes inside the cluster, and the client process exits as soon as it fulfills its responsibility of submitting the application without waiting for the application to finish.

If your application is launched through Spark submit, then the application jar is automatically distributed to all worker nodes. For any additional jars that your application depends on, you should specify them through the --jars flag using comma as a delimiter (e.g. --jars jar1,jar2). To control the application’s configuration or execution environment, see Spark Configuration.
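
A typical submission might look like the following sketch; the master hostname comes from the experiments, while the application jar, main class, arguments, and extra jars are placeholders.

./bin/spark-submit \
  --master spark://worker00:7077 \
  --deploy-mode cluster \
  --class com.example.MyApp \
  --jars dep1.jar,dep2.jar \
  my-app.jar arg1 arg2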

Additionally, standalone cluster mode supports restarting your application automatically if it exited with non-zero exit code. To use this feature, you may pass in the --supervise flag to spark-submit when launching your application. Then, if you wish to kill an application that is failing repeatedly, you may do so through:

./bin/spark-class org.apache.spark.deploy.Client kill <master url> <driver ID>

You can find the driver ID through the standalone Master web UI at http://<master url>:8080.

Resource Scheduling

The standalone cluster mode currently only supports a simple FIFO scheduler across applications. However, to allow multiple concurrent users, you can control the maximum number of resources each application will use. By default, it will acquire all cores in the cluster, which only makes sense if you just run one application at a time. You can cap the number of cores by setting spark.cores.max in your SparkConf. For example:

val conf = new SparkConf()
  .setMaster(...)
  .setAppName(...)
  .set("spark.cores.max", "10")
val sc = new SparkContext(conf)

In addition, you can configure spark.deploy.defaultCores on the cluster master process to change the default for applications that don’t set spark.cores.max to something less than infinite. Do this by adding the following to conf/spark-env.sh:

export SPARK_MASTER_OPTS="-Dspark.deploy.defaultCores=<value>"

This is useful on shared clusters where users might not have configured a maximum number of cores individually.

Monitoring and Logging

Spark’s standalone mode offers a web-based user interface to monitor the cluster. The master and each worker has its own web UI that shows cluster and job statistics. By default you can access the web UI for the master at port 8080. The port can be changed either in the configuration file or via command-line options.

In addition, detailed log output for each job is also written to the work directory of each slave node (SPARK_HOME/work by default). You will see two files for each job, stdout and stderr, with all output it wrote to its console.
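
For example, on a worker you could inspect the logs of a given application roughly like this; the application ID and executor directory shown are placeholders.

# Each application gets its own directory under the worker's work dir.
ls $SPARK_HOME/work/
# Executor stdout/stderr live one level further down, e.g.:
tail $SPARK_HOME/work/app-20150901120000-0000/0/stderr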

Running Alongside Hadoop

You can run Spark alongside your existing Hadoop cluster by just launching it as a separate service on the same machines. To access Hadoop data from Spark, just use a hdfs:// URL (typically hdfs://<namenode>:9000/path, but you can find the right URL on your Hadoop Namenode’s web UI). Alternatively, you can set up a separate cluster for Spark, and still have it access HDFS over the network; this will be slower than disk-local access, but may not be a concern if you are still running in the same local area network (e.g. you place a few Spark machines on each rack that you have Hadoop on).
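
For instance, assuming a namenode reachable at namenode:9000 (check your Namenode's web UI for the real address), you could read HDFS data from the standalone cluster like this:

# Start a shell against the standalone master, then read from HDFS inside it:
./bin/spark-shell --master spark://worker00:7077
# scala> val lines = sc.textFile("hdfs://namenode:9000/path/to/data")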

Configuring Ports for Network Security

Spark makes heavy use of the network, and some environments have strict requirements for using tight firewall settings. For a complete list of ports to configure, see the security page.

High Availability

By default, standalone scheduling clusters are resilient to Worker failures (insofar as Spark itself is resilient to losing work by moving it to other workers). However, the scheduler uses a Master to make scheduling decisions, and this (by default) creates a single point of failure: if the Master crashes, no new applications can be created. In order to circumvent this, we have two high availability schemes, detailed below.

Standby Masters with ZooKeeper

Overview

Utilizing ZooKeeper to provide leader election and some state storage, you can launch multiple Masters in your cluster connected to the same ZooKeeper instance. One will be elected “leader” and the others will remain in standby mode. If the current leader dies, another Master will be elected, recover the old Master’s state, and then resume scheduling. The entire recovery process (from the time the first leader goes down) should take between 1 and 2 minutes. Note that this delay only affects scheduling new applications – applications that were already running during Master failover are unaffected.

Learn more about getting started with ZooKeeper here.

Configuration

In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env using this configuration:

  • spark.deploy.recoveryMode: Set to ZOOKEEPER to enable standby Master recovery mode (default: NONE).
  • spark.deploy.zookeeper.url: The ZooKeeper cluster url (e.g., 192.168.1.100:2181,192.168.1.101:2181).
  • spark.deploy.zookeeper.dir: The directory in ZooKeeper to store recovery state (default: /spark).
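
For example, these can be set through SPARK_DAEMON_JAVA_OPTS in conf/spark-env.sh; the experiment below uses the same form, with its own ZooKeeper hosts (zk1/zk2/zk3 here are placeholders):

export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=zk1:2181,zk2:2181,zk3:2181 -Dspark.deploy.zookeeper.dir=/spark"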

Possible gotcha: If you have multiple Masters in your cluster but fail to correctly configure the Masters to use ZooKeeper, the Masters will fail to discover each other and think they’re all leaders. This will not lead to a healthy cluster state (as all Masters will schedule independently).

Details

After you have a ZooKeeper cluster set up, enabling high availability is straightforward. Simply start multiple Master processes on different nodes with the same ZooKeeper configuration (ZooKeeper URL and directory). Masters can be added and removed at any time.

In order to schedule new applications or add Workers to the cluster, they need to know the IP address of the current leader. This can be accomplished by simply passing in a list of Masters where you used to pass in a single one. For example, you might start your SparkContext pointing to spark://host1:port1,host2:port2. This would cause your SparkContext to try registering with both Masters – if host1 goes down, this configuration would still be correct as we’d find the new leader, host2.

There’s an important distinction to be made between “registering with a Master” and normal operation. When starting up, an application or Worker needs to be able to find and register with the current lead Master. Once it successfully registers, though, it is “in the system” (i.e., stored in ZooKeeper). If failover occurs, the new leader will contact all previously registered applications and Workers to inform them of the change in leadership, so they need not even have known of the existence of the new Master at startup.

Due to this property, new Masters can be created at any time, and the only thing you need to worry about is that new applications and Workers can find it to register with in case it becomes the leader. Once registered, you’re taken care of.

Experiment:

1. Add the following lines to conf/spark-env.sh (ZooKeeper is installed on worker05, worker06, and worker07 in my setup), then copy the file to the other nodes:

export JAVA_HOME=/opt/jdk1.7.0_67
export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER -Dspark.deploy.zookeeper.url=worker05:2181,worker06:2181,worker07:2181"

2. Start ZooKeeper first, then use the launch scripts to start masters on worker00 and worker01. Based on the earlier slaves file, worker processes are started on worker01, worker02, and worker03.

3. Check the web UI: the master on worker01 is ACTIVE and the master on worker00 is STANDBY.



4. Kill the master process on worker01 and watch whether the master on worker00 switches to ACTIVE.


After about a minute, the master on worker00 became ACTIVE.


Because the master process on worker01 was just killed by hand, its web UI is no longer reachable, which matches what we expected.


5. With multiple masters configured, you need to list all of them when submitting a job or starting spark-shell, for example:

./spark-shell --master spark://worker00:7077,worker01:7077

Note: if --master is not specified, or the argument is wrong, the master value defaults to "local".

Single-Node Recovery with Local File System

Overview

ZooKeeper is the best way to go for production-level high availability, but if you just want to be able to restart the Master if it goes down, FILESYSTEM mode can take care of it. When applications and Workers register, they have enough state written to the provided directory so that they can be recovered upon a restart of the Master process.

Configuration

In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env using this configuration:

  • spark.deploy.recoveryMode: Set to FILESYSTEM to enable single-node recovery mode (default: NONE).
  • spark.deploy.recoveryDirectory: The directory in which Spark will store recovery state, accessible from the Master's perspective.
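
As with the ZooKeeper mode, these go into SPARK_DAEMON_JAVA_OPTS in conf/spark-env.sh; a sketch, where the recovery directory is an arbitrary example:

export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM -Dspark.deploy.recoveryDirectory=/opt/spark-recovery"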

Details

  • This solution can be used in tandem with a process monitor/manager like monit, or just to enable manual recovery via restart.
  • While filesystem recovery seems straightforwardly better than not doing any recovery at all, this mode may be suboptimal for certain development or experimental purposes. In particular, killing a master via stop-master.sh does not clean up its recovery state, so whenever you start a new Master, it will enter recovery mode. This could increase the startup time by up to 1 minute if it needs to wait for all previously-registered Workers/clients to timeout.
  • While it’s not officially supported, you could mount an NFS directory as the recovery directory. If the original Master node dies completely, you could then start a Master on a different node, which would correctly recover all previously registered Workers/applications (equivalent to ZooKeeper recovery). Future applications will have to be able to find the new Master, however, in order to register.

Original documentation: https://spark.apache.org/docs/1.3.1/spark-standalone.html

