Gentics Mesh exposes monitoring data via an unauthenticated monitoring server. In addition to the API server, which listens on port 8080 by default, a separate monitoring server is started which listens on localhost:8081 by default.
Because the monitoring server is unauthenticated, it should not be exposed to the public. By default it only binds to localhost:8081.
The network settings for the monitoring server can be configured in the mesh.yml file or via environment variables.
The monitoring server can also be turned off via the enabled flag.
…
monitoring:
  enabled: true
  port: 8081
  host: "127.0.0.1"
  memoryLimit: 95
  gcTimeLimit: 50
Container deployments that need to access the monitoring API must change the host binding of the monitoring server in order to expose the port. This can be done by setting the environment variable MESH_MONITORING_HTTP_HOST=0.0.0.0
The liveness of a Gentics Mesh instance running in a Docker container can be checked by executing the script /mesh/live.sh inside the container.
This has an advantage over the liveness probe that uses the HTTP endpoint:
A liveness check with this script does not depend on the HTTP verticles and Vert.x being fully started, but only on the Java process running.
When both memoryLimit and gcTimeLimit are set to positive integers (which is the default), Gentics Mesh regularly checks the amount of used heap memory and the amount of time spent on garbage collection (in the last 10 seconds). If both values exceed the configured limits (in percent), the instance is considered unhealthy and both the liveness probe and the readiness probe fail. The instance stays unhealthy until the used memory and the time spent on garbage collection drop below the configured limits.
Setting memoryLimit or gcTimeLimit (or both) to 0 disables the monitoring of Java heap memory and garbage collection.
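The rule described above can be sketched as a small function. This is only an illustration of the documented behaviour, not the actual Gentics Mesh implementation: the function name and parameters are hypothetical, and the check is modelled as stateless (unhealthy exactly while both values exceed their limits).

```python
def is_healthy(used_heap_percent: float, gc_time_percent: float,
               memory_limit: int = 95, gc_time_limit: int = 50) -> bool:
    """Hypothetical model of the heap/GC health rule described above."""
    # A limit of 0 on either setting disables the check entirely.
    if memory_limit <= 0 or gc_time_limit <= 0:
        return True
    # The instance is unhealthy only when BOTH values exceed their limits.
    both_exceeded = (used_heap_percent > memory_limit
                     and gc_time_percent > gc_time_limit)
    return not both_exceeded
```

With the default limits, a heap usage of 97% together with 60% of the time spent on garbage collection would be reported as unhealthy, while 97% heap usage with only 10% garbage collection time would still count as healthy.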
The liveness probe endpoint GET /api/v2/health/live indicates that the server is working as expected when it returns status code 200.
The readiness probe endpoint GET /api/v2/health/ready returns status code 200 if the server is accepting connections. This endpoint can be used to check whether a server instance is ready to accept requests once it has been started. This is especially useful when running rolling cluster upgrades.
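As an illustration of how the readiness endpoint could be used during a rolling upgrade, the following sketch polls it until it returns status code 200. It uses only the Python standard library. The helper names, the default base URL (taken from the default monitoring binding), the retry count and the delay are assumptions for this example and not part of Gentics Mesh.

```python
import http.client
import time
import urllib.error
import urllib.request
from typing import Callable


def probe(url: str, timeout: float = 2.0) -> int:
    """Return the HTTP status code of a GET request, or 0 if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # Non-2xx answers (for example while the instance is unhealthy).
        return err.code
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure and similar problems.
        return 0


def wait_until_ready(base_url: str = "http://127.0.0.1:8081",
                     attempts: int = 30,
                     delay: float = 1.0,
                     probe_fn: Callable[[str], int] = probe) -> bool:
    """Poll the readiness endpoint until it answers with status code 200."""
    for attempt in range(attempts):
        if probe_fn(base_url + "/api/v2/health/ready") == 200:
            return True
        if attempt < attempts - 1:
            time.sleep(delay)
    return False
```

Calling `wait_until_ready()` after starting an instance blocks until the instance reports readiness or the attempts are exhausted. The `probe_fn` parameter exists only so that the polling logic can be exercised without a running server.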
The current Gentics Mesh server status can be checked via the GET /api/v2/status endpoint.
| Status | Description |
|---|---|
| STARTING | Status which indicates that the server is starting up. |
| WAITING_FOR_CLUSTER | Status which indicates that the server is waiting/looking for a cluster to join. |
| READY | Status which indicates that the server is operating normally. |
| SHUTTING_DOWN | Status which indicates that the instance is shutting down. |
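A client could map these values onto an enumeration, as in the following sketch. The response format is an assumption: the example expects a JSON object with a `status` field holding one of the values above, which should be verified against a running instance. The class and function names are hypothetical.

```python
import json
from enum import Enum


class MeshStatus(Enum):
    """The server states listed in the table above."""
    STARTING = "STARTING"
    WAITING_FOR_CLUSTER = "WAITING_FOR_CLUSTER"
    READY = "READY"
    SHUTTING_DOWN = "SHUTTING_DOWN"


def parse_status(body: str) -> MeshStatus:
    """Parse a status response body into a MeshStatus value."""
    # Assumption: the body looks like {"status": "READY"}.
    payload = json.loads(body)
    # Raises ValueError for a status that is not listed in the table.
    return MeshStatus(payload["status"])


def is_operational(status: MeshStatus) -> bool:
    """Only READY indicates that the server is operating normally."""
    return status is MeshStatus.READY
```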
The GET /api/v2/cluster/status endpoint returns the status of all cluster nodes to which the queried instance has established a connection.
The version information can be retrieved via the GET /api/v2/versions endpoint.
{
  "meshVersion" : "3.0.0",
  "meshNodeName" : "Singing Chandelure",
  "databaseVendor" : "mariadb",
  "databaseVersion" : "10.6",
  "searchVendor" : "elasticsearch",
  "searchVersion" : "6.1.2",
  "vertxVersion" : "4.5.2",
  "databaseRevision" : "0be9a986"
}
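A response like the one above can be used, for example, to verify a minimum server version before running a deployment step. The sketch below is one possible way to do this; the function name is hypothetical, and the handling of a suffix such as `-SNAPSHOT` is an assumption about how version strings may look.

```python
import json
from typing import Tuple


def mesh_version(body: str) -> Tuple[int, ...]:
    """Extract meshVersion from a versions response as a tuple of integers."""
    info = json.loads(body)
    # Assumption: an optional qualifier is separated by "-" and can be dropped.
    core = info["meshVersion"].split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))
```

Tuples compare element by element, so a check such as `mesh_version(body) >= (3, 0)` expresses "at least version 3.0".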
The Gentics Mesh server exposes Prometheus-compatible data on the /api/v2/metrics endpoint.
The Prometheus server can scrape metric data from this endpoint. Using Grafana in combination with Prometheus is a typical use case for displaying metric data.
Example Prometheus configuration:
…
scrape_configs:
  - job_name: 'mesh'
    scrape_interval: 30s
    metrics_path: '/api/v2/metrics'
    static_configs:
      - targets: ['mesh:8081']
…
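For ad-hoc checks without a Prometheus server, the text returned by the metrics endpoint can also be read directly. The following sketch assumes that the endpoint returns the Prometheus text exposition format and parses it in a simplified way: comment lines are skipped, label sets are kept as part of the key, and an optional trailing timestamp is ignored. It is not a complete parser, and the function name is hypothetical.

```python
import re
from typing import Dict

# metric name, optional {labels}, value, optional timestamp
_SAMPLE_LINE = re.compile(
    r"^([a-zA-Z_:][a-zA-Z0-9_:]*)(\{.*\})?\s+(\S+)(?:\s+-?\d+)?$"
)


def parse_metrics(text: str) -> Dict[str, float]:
    """Parse Prometheus text format into a mapping of sample name to value."""
    samples: Dict[str, float] = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and "# HELP" / "# TYPE" comments.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE_LINE.match(line)
        if match is None:
            continue
        name, labels, value = match.groups()
        samples[name + (labels or "")] = float(value)
    return samples
```

Passing the body of a GET request on /api/v2/metrics to `parse_metrics` yields a dictionary that can be inspected for individual values.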
Gentics Mesh exposes the following metrics in addition to the default Vert.x metrics. More metrics will be added over time.
<cache> is one of permission, projectbranchname, projectname, webroot.
| Key | Description |
|---|---|
| | Meter which measures the rate of created transactions over time. |
| | Meter which measures the rate of created noTx transactions over time. |
| | Meter which tracks the reload operations on used vertices. |
| | Timer which tracks transaction durations. |
| | Amount of transaction retries which happen if a conflict has been encountered. |
| | Amount of commit interrupts. |
| | Timer which tracks commit durations. |
| | Pending contents which need to be processed by the node migration. |
| | Amount of cache hits. |
| | Amount of cache misses. |
| | Amount of invalidations of the whole cache. |
| | Amount of invalidations for a single entry in the cache. |
| | Tracks the time which is spent waiting on the write lock. |
| | Amount of timeouts of acquiring the write lock. |
| | Tracks the time which is spent waiting on the write lock. |
| | Amount of timeouts of acquiring the write lock. |
| | Timer which tracks duration of graphql requests. |
| | Total disk size in bytes for the storage. |
| | Usable (free) disk size in bytes for the storage. |
The Monitoring Java Client can be used to interact with the endpoints using Java.