How to Deploy Spark History Server
Settings in spark-defaults.conf (path inside the image: /opt/spark/conf/spark-defaults.conf):
spark.history.fs.logDirectory=hdfs://xxx
spark.history.ui.port=18080
spark.history.retainedApplications=20

When submitting a Spark job, the following two parameters also need to be set:
spark.eventLog.enabled=true
spark.eventLog.dir=hdfs://xxx

Note: spark.eventLog.dir and spark.history.fs.logDirectory should point to the same directory.
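As a sketch, the two submit-time parameters above can be passed via --conf flags. Here hdfs://xxx is the same placeholder as above, and the application class and jar names are hypothetical:

```shell
# Illustrative spark-submit with event logging enabled; the class and
# jar below are placeholders, not names from this deployment.
$SPARK_HOME/bin/spark-submit \
  --conf spark.eventLog.enabled=true \
  --conf spark.eventLog.dir=hdfs://xxx \
  --class com.example.MyApp \
  my-app.jar
```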
The corresponding Deployment and Service YAML files are as follows:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: spark-history-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: spark-history-server
  template:
    metadata:
      labels:
        app: spark-history-server
    spec:
      enableServiceLinks: false
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: {your_node_label_spec}
                operator: In
                values:
                - "true"
      restartPolicy: Always
      containers:
      - name: spark-history-server
        image: {your_repo}_dist-spark-online:3.2.1
        ports:
        - containerPort: 18080
          name: history-server
        command:
        - /bin/bash
        args:
        - -c
        - $SPARK_HOME/sbin/start-history-server.sh && tail -f /dev/null
        resources:
          limits:
            cpu: "2"
            memory: 4Gi
          requests:
            cpu: 100m
            memory: 256Mi
---
apiVersion: v1
kind: Service
metadata:
  name: spark-history-server-service
spec:
  type: LoadBalancer
  selector:
    app: spark-history-server
  ports:
  - name: server
    protocol: TCP
    port: 8088
    targetPort: history-server

Options for the startup command:
1. $SPARK_HOME/sbin/start-history-server.sh (the approach used in the YAML above)
2. $SPARK_HOME/bin/spark-class org.apache.spark.deploy.history.HistoryServer \
--properties-file /opt/spark/conf/spark-defaults.conf
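Assuming the two manifests above are saved together as spark-history-server.yaml (the filename is an assumption), a typical rollout and quick check looks like this; port 8088 matches the Service definition above:

```shell
# Apply the Deployment and Service, then wait for the pod to come up.
kubectl apply -f spark-history-server.yaml
kubectl rollout status deployment/spark-history-server

# If the LoadBalancer address is not ready yet, port-forward for a smoke test,
# then open http://localhost:8088 in a browser.
kubectl port-forward svc/spark-history-server-service 8088:8088
```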
Problems encountered:
1. Why can't a running Spark job be viewed in the History Server?
This likely depends on the path configured for spark.history.fs.logDirectory (for example, whether it is remote or local storage) and on how the Spark job is submitted and run: whether the event log is written continuously during execution or only flushed in one batch after the job finishes.
It has to be analyzed case by case in the actual environment.
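When in-progress applications are expected to show up, two standard Spark settings are worth checking (the values below are illustrative, not from this deployment): spark.history.fs.update.interval controls how often the History Server re-scans the log directory, and rolling event logs (available since Spark 3.0) let long-running jobs flush events incrementally instead of only at completion:

```
# spark-defaults.conf — illustrative values
spark.history.fs.update.interval=10s
spark.eventLog.rolling.enabled=true
spark.eventLog.rolling.maxFileSize=128m
```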