May 19

Adding monitoring from the command line

Append the following options directly to the command line:

--files=/yourPath/metrics.properties --conf spark.metrics.conf=metrics.properties

The --files flag causes /yourPath/metrics.properties to be sent to every executor, and spark.metrics.conf=metrics.properties tells all executors to load that file when initializing their respective MetricsSystems.
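The two options only work as a pair, so it can help to generate them together. The following is a minimal sketch, not part of the original post: the helper name metrics_file_args is hypothetical, and it assumes that a file shipped with --files is visible to each executor under its base name.

```python
import os
import shlex

def metrics_file_args(local_path):
    """Build the two spark-submit options that ship and activate a metrics file."""
    # Assumption: a file sent with --files is available to each executor
    # under its base name, so spark.metrics.conf refers to that name only.
    name = os.path.basename(local_path)
    return ["--files", local_path, "--conf", f"spark.metrics.conf={name}"]

args = metrics_file_args("/yourPath/metrics.properties")
# Print the options in a form that can be pasted after spark-submit
print(" ".join(shlex.quote(a) for a in args))
```

Deriving the second option from the first keeps the file name in spark.metrics.conf from drifting away from the path given to --files.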

Alternatively, pass the same settings as --conf options:

--conf spark.metrics.conf.*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink 
--conf spark.metrics.conf.*.sink.graphite.host=...
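In this form each key from the properties file gets the prefix spark.metrics.conf. and becomes a --conf option of its own. A small sketch of that conversion, assuming the prefix rule shown in the two lines above applies to every key; the function name properties_to_conf_flags is hypothetical and the sample keys are only an illustration:

```python
PREFIX = "spark.metrics.conf."

def properties_to_conf_flags(text):
    """Turn metrics.properties content into a flat list of --conf arguments."""
    flags = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        flags += ["--conf", f"{PREFIX}{key.strip()}={value.strip()}"]
    return flags

sample = """
*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds
"""
flags = properties_to_conf_flags(sample)
for flag in flags:
    print(flag)
```

When the resulting options are pasted into a shell, it is safer to quote each value, because the keys contain a literal * that the shell could otherwise try to expand.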

Spark Metrics

*.sink.console.class=org.apache.spark.metrics.sink.ConsoleSink
*.sink.console.period=10
*.sink.console.unit=seconds
*.sink.csv.class=org.apache.spark.metrics.sink.CsvSink
*.sink.csv.period=1
*.sink.csv.unit=minutes
*.sink.csv.directory=/tmp/

➜  spark-2.0.1-bin-hadoop2.7 bin/spark-shell
Spark context Web UI available at http://10.57.2.5:4040
Spark context available as 'sc' (master = local[*], app id = local-1495078254084).
Spark session available as 'spark'.
scala> 17-5-18 11:31:05 ===============================================================

-- Gauges ----------------------------------------------------------------------
local-1495078254084.driver.BlockManager.disk.diskSpaceUsed_MB value = 0
local-1495078254084.driver.BlockManager.memory.maxMem_MB value = 366
local-1495078254084.driver.BlockManager.memory.memUsed_MB value = 0
local-1495078254084.driver.BlockManager.memory.remainingMem_MB value = 366
local-1495078254084.driver.DAGScheduler.job.activeJobs value = 0
local-1495078254084.driver.DAGScheduler.job.allJobs value = 0
local-1495078254084.driver.DAGScheduler.stage.failedStages value = 0
local-1495078254084.driver.DAGScheduler.stage.runningStages value = 0
local-1495078254084.driver.DAGScheduler.stage.waitingStages value = 0

-- Histograms ------------------------------------------------------------------
local-1495078254084.driver.CodeGenerator.compilationTime
             count = 0
               min = 0
               max = 0
              mean = 0.00
            stddev = 0.00
            median = 0.00
              75% <= 0.00
              95% <= 0.00
              98% <= 0.00
              99% <= 0.00
            99.9% <= 0.00
local-1495078254084.driver.CodeGenerator.generatedClassSize
             count = 0
               min = 0
               max = 0
              mean = 0.00
            stddev = 0.00
            median = 0.00
              75% <= 0.00
              95% <= 0.00
              98% <= 0.00
              99% <= 0.00
            99.9% <= 0.00
local-1495078254084.driver.CodeGenerator.generatedMethodSize
             count = 0
               min = 0
               max = 0
              mean = 0.00
            stddev = 0.00
            median = 0.00
              75% <= 0.00
              95% <= 0.00
              98% <= 0.00
              99% <= 0.00
            99.9% <= 0.00
local-1495078254084.driver.CodeGenerator.sourceCodeSize
             count = 0
               min = 0
               max = 0
              mean = 0.00
            stddev = 0.00
            median = 0.00
              75% <= 0.00
              95% <= 0.00
              98% <= 0.00
              99% <= 0.00
            99.9% <= 0.00

-- Timers ----------------------------------------------------------------------
local-1495078254084.driver.DAGScheduler.messageProcessingTime
             count = 0
         mean rate = 0.00 calls/second
     1-minute rate = 0.00 calls/second
     5-minute rate = 0.00 calls/second
    15-minute rate = 0.00 calls/second
               min = 0.00 milliseconds
               max = 0.00 milliseconds
              mean = 0.00 milliseconds
            stddev = 0.00 milliseconds
            median = 0.00 milliseconds
              75% <= 0.00 milliseconds
              95% <= 0.00 milliseconds
              98% <= 0.00 milliseconds
              99% <= 0.00 milliseconds
            99.9% <= 0.00 milliseconds

scala> sc.parallelize(List(1,2,3,4,5)).count
res1: Long = 5

scala> 17-5-18 11:33:15 ===============================================================

-- Timers ----------------------------------------------------------------------
local-1495078254084.driver.DAGScheduler.messageProcessingTime
             count = 10
         mean rate = 0.07 calls/second
     1-minute rate = 0.16 calls/second
     5-minute rate = 0.03 calls/second
    15-minute rate = 0.01 calls/second
               min = 0.03 milliseconds
               max = 1207.28 milliseconds
              mean = 125.02 milliseconds
            stddev = 358.42 milliseconds
            median = 0.32 milliseconds
              75% <= 16.58 milliseconds
              95% <= 1207.28 milliseconds
              98% <= 1207.28 milliseconds
              99% <= 1207.28 milliseconds
            99.9% <= 1207.28 milliseconds

➜  ~ ll /tmp/ -rth
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.DAGScheduler.stage.waitingStages.csv
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.DAGScheduler.stage.runningStages.csv
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.DAGScheduler.stage.failedStages.csv
-rw-r--r--   1 zhengqh  wheel   1.3K  5 18 11:36 local-1495078254084.driver.DAGScheduler.messageProcessingTime.csv
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.DAGScheduler.job.allJobs.csv
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.DAGScheduler.job.activeJobs.csv
-rw-r--r--   1 zhengqh  wheel   676B  5 18 11:36 local-1495078254084.driver.CodeGenerator.sourceCodeSize.csv
-rw-r--r--   1 zhengqh  wheel   676B  5 18 11:36 local-1495078254084.driver.CodeGenerator.generatedMethodSize.csv
-rw-r--r--   1 zhengqh  wheel   676B  5 18 11:36 local-1495078254084.driver.CodeGenerator.generatedClassSize.csv
-rw-r--r--   1 zhengqh  wheel   676B  5 18 11:36 local-1495078254084.driver.CodeGenerator.compilationTime.csv
-rw-r--r--   1 zhengqh  wheel   113B  5 18 11:36 local-1495078254084.driver.BlockManager.memory.remainingMem_MB.csv
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.BlockManager.memory.memUsed_MB.csv
-rw-r--r--   1 zhengqh  wheel   113B  5 18 11:36 local-1495078254084.driver.BlockManager.memory.maxMem_MB.csv
-rw-r--r--   1 zhengqh  wheel    99B  5 18 11:36 local-1495078254084.driver.BlockManager.disk.diskSpaceUsed_MB.csv

➜  /tmp cat local-1495078254084.driver.DAGScheduler.messageProcessingTime.csv
t,count,max,mean,min,stddev,p50,p75,p95,p98,p99,p999,mean_rate,m1_rate,m5_rate,m15_rate,rate_unit,duration_unit
1495078315,0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,calls/second,milliseconds
1495078375,0,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,0.000000,calls/second,milliseconds
1495078435,10,1207.284400,125.017564,0.027442,358.422668,0.317114,16.580495,1207.284400,1207.284400,1207.284400,1207.284400,0.055257,0.082101,0.028931,0.010599,calls/second,milliseconds
1495078495,10,1207.284400,125.017564,0.027442,358.422668,0.317114,16.580495,1207.284400,1207.284400,1207.284400,1207.284400,0.041499,0.030203,0.023686,0.009915,calls/second,milliseconds
1495078555,10,1207.284400,125.017564,0.027442,358.422668,0.317114,16.580495,1207.284400,1207.284400,1207.284400,1207.284400,0.033225,0.011111,0.019393,0.009276,calls/second,milliseconds
1495078577,10,1207.284400,125.017564,0.027442,358.422668,0.317114,16.580495,1207.284400,1207.284400,1207.284400,1207.284400,0.030895,0.007962,0.018142,0.009072,calls/second,milliseconds
1495078577,10,1207.284400,125.017564,0.027442,358.422668,0.317114,16.580495,1207.284400,1207.284400,1207.284400,1207.284400,0.030890,0.007962,0.018142,0.009072,calls/second,milliseconds
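The CSV files are easy to post-process. Below is a minimal sketch that reads a timer file and checks the spacing between reports. It assumes the t column is Unix time in seconds, which is consistent with the console timestamps above; the function name read_timer_csv is hypothetical, and the sample is abbreviated to three columns with values copied from the file above.

```python
import csv
import io
from datetime import datetime, timezone

def read_timer_csv(text):
    """Parse CsvSink timer output into (UTC time, count, mean) tuples."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text)):
        # Assumption: t is Unix time in seconds
        when = datetime.fromtimestamp(int(rec["t"]), tz=timezone.utc)
        rows.append((when, int(rec["count"]), float(rec["mean"])))
    return rows

# Abbreviated sample: only the columns used here
sample = """t,count,mean
1495078375,0,0.000000
1495078435,10,125.017564
1495078495,10,125.017564
"""
rows = read_timer_csv(sample)
for when, count, mean in rows:
    print(when.isoformat(), count, mean)

# Gap between consecutive reports, in seconds
gaps = [int((b[0] - a[0]).total_seconds()) for a, b in zip(rows, rows[1:])]
print(gaps)
```

The gaps come out at 60 seconds, matching the csv.period=1 and csv.unit=minutes settings. The last two rows of the real file share a timestamp and do not follow the 60-second spacing, presumably a final report written at shutdown.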

Spark Cassandra Metrics

executor.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource
driver.source.cassandra-connector.class=org.apache.spark.metrics.CassandraConnectorSource

Spark Influx Metrics

https://github.com/palantir/spark-influx-sink

spark.driver.extraClassPath=spark-influx-sink.jar:metrics-influxdb.jar

spark.executor.extraClassPath=spark-influx-sink.jar:metrics-influxdb.jar

*.sink.influx.class=org.apache.spark.metrics.sink.InfluxDbSink
*.sink.influx.protocol=https
*.sink.influx.host=localhost
*.sink.influx.port=8086
*.sink.influx.database=my_metrics
*.sink.influx.auth=metric_client:PASSWORD
*.sink.influx.tags=product:my_product,parent:my_service

Source: zqhxuyuan
