feat: expose comet metrics through Spark's external monitoring system #3708
coderfender wants to merge 4 commits into apache:main
Conversation
Force-pushed from aa97dec to f096d8f
cc: @andygrove
Review thread on spark/src/main/scala/org/apache/comet/rules/CometExecRule.scala (outdated, resolved)
Force-pushed from f096d8f to c0eb149
@wForget, please take a look whenever you get a chance.
Force-pushed from c0eb149 to 2eb506f
Review thread on spark/src/main/scala/org/apache/comet/rules/CometExecRule.scala (outdated, resolved)
Force-pushed from 8c60dc8 to eaa1dc4
@wForget, I addressed the code per your review comments. Please take a look whenever you get a chance. Thank you for the kind guidance.
Review thread on:

```scala
val queriesBefore = CometSource.QUERIES_PLANNED.getCount
spark.sparkContext.listenerBus.waitUntilEmpty()
```

Suggested change (swap the two lines):

```suggestion
spark.sparkContext.listenerBus.waitUntilEmpty()
val queriesBefore = CometSource.QUERIES_PLANNED.getCount
```
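A hedged sketch of the test pattern this reordering implies (the query run and the `+1` expectation are illustrative assumptions, not code from the PR): draining the listener bus before reading the counter ensures events from earlier queries are not attributed to the query under test.

```scala
// Sketch only; assumes a SparkSession `spark` and that QUERIES_PLANNED
// is incremented from a listener on the Spark listener bus.
spark.sparkContext.listenerBus.waitUntilEmpty()            // flush events from earlier queries
val queriesBefore = CometSource.QUERIES_PLANNED.getCount   // stable baseline
spark.sql("SELECT 1").collect()                            // run one query (assumed to be planned by Comet)
spark.sparkContext.listenerBus.waitUntilEmpty()            // wait for the listener to record it
assert(CometSource.QUERIES_PLANNED.getCount == queriesBefore + 1)
```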
| } | ||
|
|
||
| object CometDriverPlugin extends Logging { | ||
| def registerQueryExecutionListener(conf: SparkConf): Unit = { |
Two suggestions:

- Move `sc.env.metricsSystem.registerSource(CometSource)` into this method, and add a configuration option to make this behavior configurable.
- Check whether `spark.sql.queryExecutionListeners` is already configured and merge with it. You can refer to the implementation of the `registerCometSessionExtension` method.
+1. The current implementation overrides any user-defined listeners.
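One hedged way to do the merge the reviewers ask for (the helper name and the listener class name are assumptions, not the actual Comet implementation): treat `spark.sql.queryExecutionListeners` as a comma-separated list and append to it rather than overwrite it.

```scala
// Hypothetical helper: append Comet's listener class to whatever value
// the user already configured for spark.sql.queryExecutionListeners.
def mergeListeners(existing: Option[String], cometListener: String): String =
  existing.filter(_.trim.nonEmpty) match {
    case Some(current) => s"$current,$cometListener"
    case None          => cometListener
  }

// Usage against a SparkConf (sketch; listener class name is assumed):
//   conf.set("spark.sql.queryExecutionListeners",
//     mergeListeners(conf.getOption("spark.sql.queryExecutionListeners"),
//       "org.apache.comet.CometQueryExecutionListener"))
```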
| } | ||
|
|
||
| object CometDriverPlugin extends Logging { | ||
| def registerQueryExecutionListener(conf: SparkConf): Unit = { |
There was a problem hiding this comment.
+1. Current implementation overrides any user define listeners.
| } | ||
| }) | ||
|
|
||
| def recordStats(stats: CometCoverageStats): Unit = { |
This isn't exactly ideal. These stats make sense per plan. Here we are incrementing all metrics globally, so all plans executed across all sessions in a Spark application will have their counts accumulated, which may provide no useful information.
Sure, but Spark does follow a similar approach here, and there is always an option to improve granularity.
Thank you for the review. Let me take a look and update the logic.
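For context on the registration being discussed: Spark's external monitoring system consumes metrics through `org.apache.spark.metrics.source.Source`. A minimal hedged sketch of what a `CometSource` registered via `sc.env.metricsSystem.registerSource(CometSource)` might look like (the source name and metric name here are assumptions, not the PR's actual values):

```scala
import com.codahale.metrics.{Counter, MetricRegistry}
import org.apache.spark.metrics.source.Source

// Sketch only, not the actual Comet implementation: a Codahale-backed
// Source that the driver plugin registers with Spark's MetricsSystem,
// after which any configured metrics sink can report its counters.
object CometSource extends Source {
  override val sourceName: String = "CometMetrics" // assumed name
  override val metricRegistry: MetricRegistry = new MetricRegistry

  // Counter incremented when Comet plans a query (assumed semantics).
  val QUERIES_PLANNED: Counter =
    metricRegistry.counter(MetricRegistry.name("queriesPlanned"))
}
```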
Force-pushed from c6da902 to 2b6dcc1
Which issue does this PR close?
Closes #3712
Rationale for this change
The following metrics are exposed at the moment:
What changes are included in this PR?
How are these changes tested?