Title: Running Spark locally fails with Exception in thread "main" org.apache.spark.Spark
Author: 西河刘卡车医  Time: 2024-11-20 01:16

1. First suspected the server was short on memory

Modified Hadoop's yarn-site.xml:
<!-- Total physical memory YARN may use on this node; default is 8192 MB -->
<property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>8192</value>
</property>
<!-- Number of virtual CPU cores YARN may use on this node; default is 8 -->
<property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>8</value>
</property>
<!-- Maximum physical memory a single task may request; default is 8192 MB -->
<property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>8192</value>
</property>
PS: If the job runs without errors after this change, great. If the same error still appears, scroll further down the log and look for other error messages.
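For reference, the resources Spark asks YARN for must fit inside the limits configured above. A minimal Scala sketch of that relationship; the app name, master, and resource values here are my own illustrative assumptions, not from the original post:

import org.apache.spark.sql.SparkSession

object YarnMemoryCheck {
  def main(args: Array[String]): Unit = {
    // Executor requests must fit the YARN caps set in yarn-site.xml:
    // spark.executor.memory (plus overhead) <= yarn.scheduler.maximum-allocation-mb,
    // spark.executor.cores <= yarn.nodemanager.resource.cpu-vcores.
    val spark = SparkSession.builder()
      .appName("yarn-memory-check")           // hypothetical app name
      .master("yarn")
      .config("spark.executor.memory", "4g")  // well under the 8192 MB cap
      .config("spark.executor.cores", "2")    // well under the 8-vcore cap
      .getOrCreate()

    spark.range(10).show()
    spark.stop()
  }
}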
2. Add the following setting to the SparkSession configuration

A new error turned up in the log:
Caused by: org.apache.spark.SparkUpgradeException: You may get a different result due to the upgrading of Spark 3.0: writing dates before 1582-10-15 or timestamps before 1900-01-01T00:00:00Z into Parquet INT96 files can be dangerous, as the files may be read by Spark 2.x or legacy versions of Hive later, which uses a legacy hybrid calendar that is different from Spark 3.0+'s Proleptic Gregorian calendar. See more details in SPARK-31404. You can set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'LEGACY' to rebase the datetime values w.r.t. the calendar difference during writing, to get maximum interoperability. Or set spark.sql.legacy.parquet.int96RebaseModeInWrite to 'CORRECTED' to write the datetime values as it is, if you are 100% sure that the written files will only be read by Spark 3.0+ or other systems that use Proleptic Gregorian calendar.
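The error message itself names the configuration key to set. A minimal sketch of the SparkSession setting, assuming the written Parquet files will only ever be read by Spark 3.0+ (use 'LEGACY' instead of 'CORRECTED' if Spark 2.x or legacy Hive readers are involved); the app name and master are illustrative:

import org.apache.spark.sql.SparkSession

object Int96RebaseFix {
  def main(args: Array[String]): Unit = {
    // "CORRECTED": write datetime values as-is (safe only if all readers are Spark 3.0+).
    // "LEGACY": rebase to the hybrid calendar for interoperability with Spark 2.x / old Hive.
    val spark = SparkSession.builder()
      .appName("int96-rebase-fix")  // hypothetical app name
      .master("local[*]")
      .config("spark.sql.legacy.parquet.int96RebaseModeInWrite", "CORRECTED")
      .getOrCreate()

    // ... write Parquet with INT96 timestamps as before ...
    spark.stop()
  }
}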