Angelo Capossele authored
Adds new analysis-dashboard, improves analysis server, adds proper metrics collection and Prometheus exporter (#502) *
Clean up * Add TestCalculateBranchOfTransaction * Upgrade fpc dashboard * Fix docker-compose * Add TestMoveTransactionToBranch * Add TestFork * Add TestBookPayload * Add test for checkPayloadSolidity * Add TestSetTransactionPreferred * Add more tests * Fix Tangle test * Feat: started implementing lucas test cases * Feat: fixed some issued + further tests * Feat: started adding invalid txs check * Feat: added removal logic for invalid transactions * Refactor: removed Println * Add test for 2nd Reattachment * feat: Add first value transfer integration test * fix: fix wrong plugin name * Add aggregated branches test cases * Feat: added a method to generate AggregatedBranchIDs * Use GenerateAggregatedBranchID in test * Feat: refactored delete logic * Fix: fixed broken test * Feat: added final test cases for invalid txs / payloads * ja nei was gaht dän da ab * Make FPCHeartbeat smaller * Split vote context to fit into an FPC update * WIP * Value tangle concurrency tests (#451) * Add simple concurrency test * Add reverse and concurrent transaction and value object solidification tests and fix bug when value object was visited more than once * Add some documentation to make tests easily understandable * Add FPC analysis data persistence * Add Prometheus * Update FPC test * Remove linter warning * Remove linter warnings * Fix event closure * Add prometheus config * WIP propagation tests but fixed already couple of bugs * Fix: fixed some bugs * Feat: added propagation to inclusion states to tx and its outputs * Feat: finished the propagation down to the tx and its outputs * WIP * update docker-network entry node * Add .gitignore * Update Prometheus * Fix MongoDB ctx bug * WIP propagation tests and fix bugs * Delete finalized conflicts * Add colored tokens test * Add value tangle test to github workflow * fix: Fix wrong function name in comments * refactor: Make testSnapshots disabled in default and minor tweaks * Feat: fixed some issues and introduced a Debugger * Refactor: 
added a few comments * Split massive test file into slightly more digestible chunks * Initial commit for metrics collections package * Clean up propagation tests * Feat: fixed bugs * Feat: enabled missing tests * Add some documentation and missing checks for aggregated branches * WIP * Clean up tangle tests * adds snapshot type * Fix: finalized wasn't propagated when a branch was rejected * Skeleton for implementation * Measure isSynced * UI Improvements * Make it compile * Measure TPS in value tangle * implements ReadFrom and WriteTo for Snapshot * read in snapshot file if snapshot path is defined * renames snapshot test file * WIP metrics * Measure Tips in value tangle * Add DBSize metric * WIP debugging concurrency bug of death * Measure tips in message tangle * Add AvgNeighborConnectionLifeTime metric * Bump hive.go * Add autopeering distance metric * Feat: added more reliable fails in test case * Fix: fixes a race condition in solidification * Add gossip network traffic metric * Clean up test * adds assets volume to integration test containers * fixes some asserts * adds non-working conflict integration test * check transaction availability in partition * renames integration test * Measure MPS per payload type * Measure (cumulative) total msg count and per payload * Measure FPC number of currently active conflicts * lower amount of peers * Package updates * Link valuetransfer+FPC+dashboard * Update packr * Fix logo * first passing version of consensus integration test * remove debug printlns * do all integration tests again * increases avg. network delay fcob rule, removes debug printlns * go mod tidy by Marie Kondō * renames incl. state. 
conflict to conflicting * go fmt tangle.go * go fmt tangle_test, goimports dapp.go * goimports again because the dog is sad * run consensus integration test on the CI * use explicit pumba version 0.7.2 * pray to the CI gods for the test to pass * fix panic when tangle.Fork() is called * Add double spend test * Update hive.go * readd all tests again * Add AnalysisOutboundBytes metric * Update go.mod * Start setting prometheus metrics * reset integration framework paras * Update docker-compose * Fix typo * WIP * Update docker-compose * Bump golang version * Fix snapshot file print * Change client url for testing * Remove pflag double import * OpinionEvent struct for calling Finalized and Failed vote events * Measure finalized/failed conflicts + average rounds of finalization * refactor: Remove FPC page from dashboard * refactor: Remove Drng link from dashboard * Add autopeering network traffic metric * Bump up golang version * WIP * Measure voting queries (received, unreplied) and number of opinions * Add sendPayload API * Avoid sending empty rounds * Add a bunch of metrics to prometheus * Feat: outputs inherit status of transaction * Add Grafana integration * Refactor: fixed erroneous rename * WIP * Fix: fixed missing marshaling of output bools * Fixes after merge * Fix some bugs * Prometheus FPC data collection * Small fixes * Fix: fixed decision pending * Prometheus Tangle metrics data collection * Refactor FPC metric events * Bump up hive.go * Add metric heartbeat packet * Add clients metrics collection via the analysis server * WIP * Prometheus clients info * Fix metrics config + enable prometheus collection on entry_node * use new protocol * wip: redial on lost connection * assure conn is not nil * close on write * Fix bugs + enable spammer on peer_master * Fix analysis fpc livefeed bug * connector cleanup * Autopeering NeighborCount + Network Diameter metrics * do not log the dial error since the logger might not available * removes test snapshot plugin * 
Improve analysis-server * Change log to debug level * graph pkg from autopeering-sim * Remove debug line + hive.go update * Change hive.go version * get rid of test snapshot plugin * fixes wrong use of Println * removes random tool * Move conn initialization * removes duplicated value entry in GH CI workflows * Update hive.go * Clean go.mod * Fix graph pkg linter warnings * improve analysis plugin * Fix metrics plugin linter warning * xxx * wip * fixes integration test * Add FPC global metrics to prometheus * Fix double register * Fix client pkg linter warning * Fix webapi linter warnings * Run you fools! * Fix testutil import * Adjust inbox worker pool capacity to default * Adjust inbox worker pool capacity to default (#505) * Address PR review comments * Fix watch dog warnings * Let's try with a new bone for the dog now.. * upgrade hive.go to master * Fix local dashboard * Improves mongoDB reliabilty * Fix linter warnings * Fix test * Make dbSize a prometheus Gauge * Remove unecessary initialization * introduce worker pool for storing finalized vote ctxs * Update packr * type alias FPCRecords to []FPCRecord Co-authored-by:jonastheis <mail@jonastheis.de> Co-authored-by:
Hans Moog <hm@mkjc.net> Co-authored-by:
jkrvivian <jkrvivian@gmail.com> Co-authored-by:
Luca Moser <moser.luca@gmail.com> Co-authored-by:
Levente Pap <levente.pap@iota.org> Co-authored-by:
Martyn Janes <martyn@obany.com> Co-authored-by:
Wolfgang Welz <welzwo@gmail.com>
global_metrics.go 4.53 KiB
package prometheus

import (
	"strconv"

	metricspkg "github.com/iotaledger/goshimmer/packages/metrics"
	"github.com/iotaledger/goshimmer/packages/vote"
	analysisdashboard "github.com/iotaledger/goshimmer/plugins/analysis/dashboard"
	"github.com/iotaledger/goshimmer/plugins/metrics"
	"github.com/iotaledger/hive.go/events"
	"github.com/prometheus/client_golang/prometheus"
)

const (
	like    = "LIKE"
	dislike = "DISLIKE"
)

// These metrics store information collected via the analysis server.
var (
	// Process related metrics.
	nodesInfoCPU    *prometheus.GaugeVec
	nodesInfoMemory *prometheus.GaugeVec

	// Autopeering related metrics.
	nodesNeighborCount *prometheus.GaugeVec
	networkDiameter    prometheus.Gauge

	// FPC related metrics.
	conflictCount              *prometheus.GaugeVec
	conflictFinalizationRounds *prometheus.GaugeVec
	conflictOutcome            *prometheus.GaugeVec
	conflictInitialOpinion     *prometheus.GaugeVec
)
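// onFPCFinalized updates the FPC gauges whenever the analysis server reports
// that a node has finalized a conflict.
var onFPCFinalized = events.NewClosure(func(ev *metricspkg.AnalysisFPCFinalizedEvent) {
	conflictCount.WithLabelValues(
		ev.NodeID,
	).Add(1)

	conflictFinalizationRounds.WithLabelValues(
		ev.ConflictID,
		ev.NodeID,
	).Set(float64(ev.Rounds + 1))

	conflictOutcome.WithLabelValues(
		ev.ConflictID,
		ev.NodeID,
		opinionToString(ev.Outcome),
	).Set(1)

	// Guard against an empty opinion history: indexing ev.Opinions[0]
	// on an empty slice would panic.
	if len(ev.Opinions) > 0 {
		conflictInitialOpinion.WithLabelValues(
			ev.ConflictID,
			ev.NodeID,
			opinionToString(ev.Opinions[0]),
		).Set(1)
	}
})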
// registerClientsMetrics creates the gauges that hold data collected via the
// analysis server, registers them, attaches the FPC finalized handler and
// adds collectNodesInfo to the collect functions.
func registerClientsMetrics() {
	nodesInfoCPU = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_nodes_info_cpu",
			Help: "Info about node's cpu load labeled with nodeID, OS, ARCH and number of cpu cores",
		},
		[]string{
			"nodeID",
			"OS",
			"ARCH",
			"NUM_CPU",
		},
	)

	nodesInfoMemory = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_nodes_info_mem",
			Help: "Info about node's memory usage labeled with nodeID, OS, ARCH and number of cpu cores",
		},
		[]string{
			"nodeID",
			"OS",
			"ARCH",
			"NUM_CPU",
		},
	)

	nodesNeighborCount = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_nodes_neighbor_count",
			Help: "Info about node's neighbors count",
		},
		[]string{
			"nodeID",
			"direction",
		},
	)

	networkDiameter = prometheus.NewGauge(prometheus.GaugeOpts{
		Name: "global_network_diameter",
		Help: "Autopeering network diameter",
	})

	conflictCount = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_conflict_count",
			Help: "Conflicts count labeled with nodeID",
		},
		[]string{
			"nodeID",
		},
	)

	conflictFinalizationRounds = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_conflict_finalization_rounds",
			Help: "Number of rounds to finalize a given conflict labeled with conflictID and nodeID",
		},
		[]string{
			"conflictID",
			"nodeID",
		},
	)

	conflictInitialOpinion = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_conflict_initial_opinion",
			Help: "Initial opinion of a given conflict labeled with conflictID, nodeID and opinion",
		},
		[]string{
			"conflictID",
			"nodeID",
			"opinion",
		},
	)

	conflictOutcome = prometheus.NewGaugeVec(
		prometheus.GaugeOpts{
			Name: "global_conflict_outcome",
			Help: "Outcome of a given conflict labeled with conflictID, nodeID and opinion",
		},
		[]string{
			"conflictID",
			"nodeID",
			"opinion",
		},
	)

	registry.MustRegister(nodesInfoCPU)
	registry.MustRegister(nodesInfoMemory)
	registry.MustRegister(nodesNeighborCount)
	registry.MustRegister(networkDiameter)
	registry.MustRegister(conflictCount)
	registry.MustRegister(conflictFinalizationRounds)
	registry.MustRegister(conflictInitialOpinion)
	registry.MustRegister(conflictOutcome)

	metricspkg.Events().AnalysisFPCFinalized.Attach(onFPCFinalized)

	addCollect(collectNodesInfo)
}
// collectNodesInfo refreshes the per-node CPU, memory and neighbor-count
// gauges as well as the network diameter gauge.
func collectNodesInfo() {
	nodeInfoMap := metrics.NodesMetrics()

	for nodeID, nodeMetrics := range nodeInfoMap {
		nodesInfoCPU.WithLabelValues(
			nodeID,
			nodeMetrics.OS,
			nodeMetrics.Arch,
			strconv.Itoa(nodeMetrics.NumCPU),
		).Set(nodeMetrics.CPUUsage)

		nodesInfoMemory.WithLabelValues(
			nodeID,
			nodeMetrics.OS,
			nodeMetrics.Arch,
			strconv.Itoa(nodeMetrics.NumCPU),
		).Set(float64(nodeMetrics.MemoryUsage))
	}

	for nodeID, neighborCount := range analysisdashboard.NumOfNeighbors() {
		nodesNeighborCount.WithLabelValues(nodeID, "in").Set(float64(neighborCount.Inbound))
		nodesNeighborCount.WithLabelValues(nodeID, "out").Set(float64(neighborCount.Outbound))
	}

	networkDiameter.Set(float64(metrics.NetworkDiameter()))
}
// opinionToString maps vote.Like to "LIKE" and any other opinion to "DISLIKE".
func opinionToString(opinion vote.Opinion) string {
	if opinion == vote.Like {
		return like
	}
	return dislike
}
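
The FPC gauges above are keyed by label values (`nodeID`, `conflictID`, `opinion`), so every distinct combination becomes its own series. The following is a minimal, stdlib-only sketch of that behaviour, not the `client_golang` implementation: `gaugeVec`, `finalized` and `record` are hypothetical names introduced here for illustration, and the sketch returns an error on a wrong number of label values, whereas `WithLabelValues` in `client_golang` panics in that case. It mirrors what `onFPCFinalized` does for the count and rounds gauges, under those assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// gaugeVec is a hypothetical stand-in for a labeled gauge family. It only
// models the idea that each tuple of label values owns one series.
type gaugeVec struct {
	labels []string
	series map[string]float64
}

func newGaugeVec(labels ...string) *gaugeVec {
	return &gaugeVec{labels: labels, series: make(map[string]float64)}
}

// key builds the series key and rejects a wrong number of label values.
func (g *gaugeVec) key(values []string) (string, error) {
	if len(values) != len(g.labels) {
		return "", fmt.Errorf("expected %d label values, got %d", len(g.labels), len(values))
	}
	return strings.Join(values, "\x00"), nil
}

// Set overwrites the value of the series selected by the label values.
func (g *gaugeVec) Set(v float64, values ...string) error {
	k, err := g.key(values)
	if err != nil {
		return err
	}
	g.series[k] = v
	return nil
}

// Add increments the selected series, creating it at zero if needed.
func (g *gaugeVec) Add(delta float64, values ...string) error {
	k, err := g.key(values)
	if err != nil {
		return err
	}
	g.series[k] += delta
	return nil
}

// Get returns the current value, or 0 for a series that does not exist.
func (g *gaugeVec) Get(values ...string) float64 {
	return g.series[strings.Join(values, "\x00")]
}

// Len returns the number of distinct series created so far.
func (g *gaugeVec) Len() int { return len(g.series) }

// finalized is a hypothetical, trimmed-down finalization event.
type finalized struct {
	ConflictID, NodeID string
	Rounds             int
}

// record bumps the per-node conflict count and stores Rounds+1 for the
// (conflictID, nodeID) pair, as the handler above does.
func record(count, rounds *gaugeVec, ev finalized) error {
	if err := count.Add(1, ev.NodeID); err != nil {
		return err
	}
	return rounds.Set(float64(ev.Rounds+1), ev.ConflictID, ev.NodeID)
}

func main() {
	count := newGaugeVec("nodeID")
	rounds := newGaugeVec("conflictID", "nodeID")

	events := []finalized{{"c1", "n1", 3}, {"c2", "n1", 5}, {"c1", "n2", 4}}
	for _, ev := range events {
		if err := record(count, rounds, ev); err != nil {
			fmt.Println("error:", err)
			return
		}
	}

	fmt.Println(count.Get("n1"), count.Len())         // prints "2 2"
	fmt.Println(rounds.Get("c2", "n1"), rounds.Len()) // prints "6 3"
}
```

The per-node count stays at one series per node, while the rounds family gains one series for every new (conflictID, nodeID) pair, so its size grows with the number of conflicts observed.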