DevilKing's blog

Spark 2.x Troubleshooting Guide

Original link

Building Spark

JRE path

Maven installed

  • always explicitly set the following in ‘.bashrc’ for ‘root’
  • specify the support you want explicitly
  • rebuild only the modified source code
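
The last two points can be illustrated with a small sketch. It is not from the original post: the helper name mvn_build_command is hypothetical, and the profile names (yarn, hive) and the module selector (:spark-streaming_2.11) are assumptions borrowed from the Spark 2.x build documentation. The sketch only assembles the Maven command line; it does not run it.

```python
def mvn_build_command(profiles=("yarn", "hive"), module=None):
    """Assemble a ./build/mvn command line (hypothetical helper).

    profiles: Maven profiles that switch on optional support explicitly.
    module:   optional "-pl" selector, to rebuild a single module
              instead of the whole source tree.
    """
    cmd = ["./build/mvn"]
    cmd += [f"-P{name}" for name in profiles]
    if module is not None:
        # Restrict the build to the module that was modified.
        cmd += ["-pl", module]
    cmd += ["-DskipTests", "clean", "package"]
    return cmd


full_build = mvn_build_command()
streaming_only = mvn_build_command(module=":spark-streaming_2.11")
print(" ".join(full_build))
print(" ".join(streaming_only))
```

Check the profile and module names against the Spark version you are building before using them.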

Running Spark

  • always use the ‘--verbose’ option on the ‘spark-submit’ command when running your workload
  • it prints
    • all default properties
    • command line options
    • settings from the Spark conf file
    • settings from the CLI
  • order of lookup for ‘--packages’
    • the local maven repo
    • maven central - web
    • additional remote repositories
  • OOM
    • increase ‘--driver-memory’
    • Spark SQL and Spark Streaming need a large driver heap size
  • GC overhead limit exceeded
    • too much time spent in garbage collection
    • increase the executor heap size
    • modify the GC policy
      • -XX:+UseG1GC and -XX:+UseParallelGC
      • Spark default: -XX:+UseParallelGC
      • try overriding it with -XX:+UseG1GC
  • has a single SparkContext with multiple sessions, supporting
    • concurrency
    • reusable connections
    • shared cache
  • Not all CPUs are busy
    • start with evenly divided memory and cores: ‘--executor-memory 2500m --num-executors 200’
    • when the heap size is non-negotiable, change ‘--executor-memory 6g --num-executors 80’ to ‘--executor-memory 6g --num-executors 80 --executor-cores 2’ (forcing 80% utilization and boosting performance by 33%)
  • ‘scratch’ space
    • ‘/tmp’ is full
    • fix: point spark.local.dir at a directory with enough free space
  • max result size exceeded
    • increase spark.driver.maxResultSize
  • out of space on a few data nodes
    • run the hdfs balancer to start balancing
    • increase dfs.datanode.balance.bandwidthPerSec to 6GB/s
    • set dfs.datanode.balance.max.concurrent.moves to 5000 concurrent threads
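
Several of the settings in this list can be combined into one command line. The following is a minimal Python sketch, not taken from the original post. The helper name build_spark_submit, the application file my_job.py, and every default value (4g, /data/spark-tmp, 2g) are hypothetical placeholders. The flag and property names are the ones listed above, plus spark.executor.extraJavaOptions, which is the standard Spark property for passing JVM flags to executors.

```python
import shlex


def build_spark_submit(app, driver_memory="4g", executor_memory="6g",
                       num_executors=80, executor_cores=2,
                       local_dir="/data/spark-tmp", max_result_size="2g"):
    """Assemble a spark-submit argument list (hypothetical helper).

    All default values are placeholders for illustration only.
    """
    conf = {
        # Override the GC policy on the executors.
        "spark.executor.extraJavaOptions": "-XX:+UseG1GC",
        # Scratch space: avoid a /tmp that fills up.
        "spark.local.dir": local_dir,
        # Raise the limit behind "max result size exceeded".
        "spark.driver.maxResultSize": max_result_size,
    }
    cmd = [
        "spark-submit", "--verbose",
        "--driver-memory", driver_memory,
        "--executor-memory", executor_memory,
        "--num-executors", str(num_executors),
        "--executor-cores", str(executor_cores),
    ]
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    cmd.append(app)
    return cmd


cmd = build_spark_submit("my_job.py")
print(shlex.join(cmd))
```

Keeping the command as a list rather than a single string avoids quoting problems when it is later handed to subprocess.run.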

Profiling Spark

  • Yarn containers across multiple nodes
  • get a full thread dump with ‘jstack -l <pid>’ or a full heap dump with ‘jmap -dump:live,format=b,file=xxx <pid>’
  • to locate the executor logs:
    • step 1: find the hostname in the error log
    • step 2: find the local directory where ‘stderr’ resides
    • step 3: open ‘stderr’
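
The two dump commands can be wrapped in a small helper. This is a sketch under stated assumptions, not part of the original post: the function name dump_commands, the process id 12345, and the output file name executor.hprof are hypothetical. Only the jstack and jmap options come from the notes above. The sketch builds the argument lists without executing them, since the tools have to run on the node that hosts the container's JVM.

```python
def dump_commands(pid, heap_file="executor.hprof"):
    """Return the thread-dump and heap-dump commands for a JVM process.

    pid and heap_file are placeholders supplied by the caller.
    """
    return {
        # Full thread dump, including lock information (-l).
        "thread_dump": ["jstack", "-l", str(pid)],
        # Binary heap dump of live objects only.
        "heap_dump": ["jmap", f"-dump:live,format=b,file={heap_file}", str(pid)],
    }


commands = dump_commands(12345)
for name, argv in commands.items():
    print(name, "->", " ".join(argv))
```

On the target node, each list could be passed to subprocess.run(argv, check=True) by the same user that owns the JVM process.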