openeuler/spark docker image overview


Quick reference



  • The official Spark docker image.
  • Maintained by: openEuler CloudNative SIG.
  • Where to get help: openEuler CloudNative SIG, openEuler.
Spark | openEuler

Current Spark docker images are built on openEuler. This repository is free to use and exempt from per-user rate limits.
Apache Spark is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
Learn more on Spark website.
Supported tags and respective Dockerfile links

The tag of each Spark docker image consists of the Spark version and the base image version. The details are as follows:
Tags             Currently                            Architectures
3.3.1-22.03-lts  spark 3.3.1 on openEuler 22.03-LTS   amd64, arm64
3.3.2-22.03-lts  spark 3.3.2 on openEuler 22.03-LTS   amd64, arm64
3.4.0-22.03-lts  spark 3.4.0 on openEuler 22.03-LTS   amd64, arm64

Usage

Users can select the corresponding {Tag} based on their requirements, as in the example below.
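
For example, with the 3.4.0-22.03-lts tag from the table above, you can pull the image and check the Spark version it ships (the /opt/spark layout matches the shell examples below):

    # Pull a concrete tag and print the bundled Spark version
    docker pull openeuler/spark:3.4.0-22.03-lts
    docker run --rm openeuler/spark:3.4.0-22.03-lts /opt/spark/bin/spark-submit --version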


  • Online Documentation
    You can find the latest Spark documentation, including a programming guide, on the project web page. This README file only contains basic setup instructions.
  • Pull the openeuler/spark image from docker
    docker pull openeuler/spark:{Tag}
  • Interactive Scala Shell
    The easiest way to start using Spark is through the Scala shell:
    docker run -it --name spark openeuler/spark:{Tag} /opt/spark/bin/spark-shell
    Try the following command, which should return 1,000,000,000:
    scala> spark.range(1000 * 1000 * 1000).count()

  • Interactive Python Shell
    The easiest way to start using PySpark is through the Python shell:
    docker run -it --name spark openeuler/spark:{Tag} /opt/spark/bin/pyspark
    And run the following command, which should also return 1,000,000,000:
    >>> spark.range(1000 * 1000 * 1000).count()
    A batch spark-submit sketch follows this list.

  • Running Spark on Kubernetes
    See https://spark.apache.org/docs/latest/running-on-kubernetes.html; a cluster-mode submission sketch follows this list.
  • Configuration and environment variables
    See more in https://github.com/apache/spark-docker/blob/master/OVERVIEW.md#environment-variable.
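
Beyond the interactive shells above, you can run a non-interactive batch job with spark-submit. A minimal sketch, assuming the image ships Spark's bundled examples under /opt/spark/examples (the exact jar name depends on the Spark and Scala versions in the image):

    # Run the bundled SparkPi example locally inside the container;
    # --rm removes the container when the job finishes.
    # The examples jar path/name is an assumption -- adjust it to match
    # the Spark/Scala versions of the tag you pulled.
    docker run -it --rm openeuler/spark:3.3.1-22.03-lts \
      /opt/spark/bin/spark-submit \
      --master 'local[*]' \
      --class org.apache.spark.examples.SparkPi \
      /opt/spark/examples/jars/spark-examples_2.12-3.3.1.jar 100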
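
For Kubernetes, a cluster-mode submission generally looks like the sketch below, adapted from the Spark documentation linked above. The API server address is a placeholder, and the container image tag must match the examples jar version:

    # Submit SparkPi to Kubernetes in cluster mode (placeholders in <>).
    # local:// points at the jar already inside the container image.
    /opt/spark/bin/spark-submit \
      --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
      --deploy-mode cluster \
      --name spark-pi \
      --class org.apache.spark.examples.SparkPi \
      --conf spark.executor.instances=2 \
      --conf spark.kubernetes.container.image=openeuler/spark:3.3.1-22.03-lts \
      local:///opt/spark/examples/jars/spark-examples_2.12-3.3.1.jar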
Questions and answers

If you have any questions or need particular features, please submit an issue or a pull request on openeuler-docker-images.
