API monitoring

Overview

OpenTV Video Platform uses Prometheus to collect API usage and performance data from both Platform and SSP. You can then query Prometheus to get information such as:

  • Number of responses from a particular endpoint
  • Total response time
  • Whether a particular probe was successful or not

Prometheus allows you to construct complex, sophisticated queries. It is beyond the scope of this page to cover all of its functionality.

For full details, see the Prometheus API documentation.

You can perform calculations on the data that is returned to compute additional metrics.

Metric categories

For OpenTV Video Platform and SSP, there are two categories of metrics that NAGRA exposes through Prometheus:

Probe metrics

The probe_success metric indicates whether execution was successful for each probe.

Nginx metrics

There are a number of metrics that are gathered by monitoring nginx. These include response time, which is the interval between the arrival of a request and the response, that is, how long it takes to service each request. This is a key indicator of an application's performance.

An increase in response time can mean that there is an issue with an end-user application or with an upstream service, which may be caused by a recent change or upgrade.

The following metrics are available:

  • sni_http_response_count_total – the total number of processed HTTP responses
  • sni_http_response_time_seconds – a summary vector of the total response times (in seconds)
  • sni_http_response_time_seconds_sum – a sum of the total response times in seconds
  • sni_http_response_time_seconds_count – the total number of processed HTTP responses

You can perform calculations on the data that is returned to compute additional metrics, such as:

  • Average response time per API
  • Requests per second per API
  • Response time for the top n% of calls per API
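As a rough illustration, the first two calculations can be expressed as PromQL and passed in the query parameter. The sketch below only builds the expression strings. The request_uri label is taken from the example responses on this page; the helper names and the 5m window are assumptions chosen for illustration, and the percentile case is not covered because it depends on which quantiles the summary exposes.

```python
# Illustrative helpers that build PromQL expressions as plain strings.
# The "request_uri" label appears in the example responses on this page;
# the helper names and the default 5m window are assumptions.

def average_response_time_query(request_uri: str) -> str:
    """Average response time (seconds) = total response time / response count."""
    selector = f'{{request_uri="{request_uri}"}}'
    return (
        f"sum(sni_http_response_time_seconds_sum{selector})"
        f" / sum(sni_http_response_time_seconds_count{selector})"
    )


def requests_per_second_query(request_uri: str, window: str = "5m") -> str:
    """Per-second rate of responses, averaged over the given window."""
    selector = f'{{request_uri="{request_uri}"}}'
    return f"sum(rate(sni_http_response_count_total{selector}[{window}]))"


print(average_response_time_query("adm_devices"))
print(requests_per_second_query("adm_devices"))
```

The sum(...) wrappers collapse the separate series per method and HTTP status into one value per API; drop them if you want the breakdown.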

Available metrics

The following nginx metrics are available:

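| Module | Metric name | REST methods |
|---|---|---|
| Account and Device Manager (ADM) | adm_accounts_actions | DELETE |
| Account and Device Manager (ADM) | adm_bundled_accounts | GET, POST |
| Account and Device Manager (ADM) | adm_devices | GET, POST, DELETE |
| Account and Device Manager (ADM) | adm_update | GET, POST |
| Account and Device Manager (ADM) | adm_user_accounts | GET |
| API Gateway (AGW) | agw_create | POST |
| Cast, Crew, and Persona Service (CCP) | ccp | GET |
| Content Builder | rail | GET |
| CRM Gateway (CRM-GW) | crm_gateway | GET |
| IAM (Keycloak) | iam | GET |
| Identity Authentication Service (IAS) | ias_content_token | GET, POST |
| Identity Authentication Service (IAS) | ias_token | POST |
| Image Handler Service (IHS) | ihs | GET |
| Keycloak | keycloak_nagra | POST |
| Keycloak | keycloak_opcon | GET, POST |
| Keycloak | keycloak_resources | GET |
| Metadata Server (MDS) | mds_events | GET, PUT, POST, DELETE |
| Metadata Server (MDS) | btv_programmes | GET |
| Metadata Server (MDS) | btv_services | GET |
| Metadata Server (MDS) | epg | GET |
| Metadata Server (MDS) | solr_search | GET |
| Metadata Server (MDS) | vod_editorials | GET |
| Metadata Server (MDS) | vod_nodes | GET |
| Metadata Server (MDS) | vod_products | GET |
| Operator Console (OpCon) | opui | GET |
| Operator Console (OpCon) | opconsole_adm | GET, POST |
| Operator Console (OpCon) | opconsole_bcm | GET |
| Operator Console (OpCon) | opconsole_core | GET, PUT, POST |
| Rights Manager (RMG) | rmg | GET, POST |
| User Activity Vault (UAV) | uav | GET, PUT, POST |
| User Recordings | cdvr | GET, POST, DELETE |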

Authentication

Access to the Prometheus APIs is controlled by Keycloak. See Accessing operator APIs using Keycloak for more information.

Output formatting

If you are using Postman to make these requests, it automatically pretty-prints the JSON output.

If you are using curl, you can pipe its output to the jq JSON formatting tool to make the output more readable.

For example:

CODE
curl -s --location -g --request GET 'https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Authorization: Bearer <keycloak_token>' | jq .
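
If you prefer a script to curl, the same request can be sketched in Python with the standard library only. This is a minimal sketch, not an official client: the helper name build_query_request and the host operator.example.invalid are placeholders, and the request is built but not sent. Its main purpose is to show the PromQL expression being URL-encoded, which matters for queries that contain spaces, braces, or quotes.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_query_request(base_url: str, promql: str, token: str) -> Request:
    """Build (but do not send) a GET request for an instant query."""
    # urlencode() escapes spaces, commas, and parentheses in the PromQL
    # expression so that the resulting URL is valid.
    params = urlencode({"query": promql})
    url = f"{base_url}/prometheus-ext-server/api/v1/query?{params}"
    return Request(url, headers={"Authorization": f"Bearer {token}"}, method="GET")


# Placeholder host and token; replace both with real values before sending.
request = build_query_request(
    "https://operator.example.invalid",
    "count without (job,api)(probe_success)",
    "<keycloak_token>",
)
print(request.full_url)
# To send it: urllib.request.urlopen(request), then json.load() the response.
```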

Examples

Get all monitored endpoints

Request

To query Prometheus for all the endpoints it is monitoring, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one module only (in this case, ADM).

CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "probe_success",
                    "api": "Account and Device Manager",
                    "instance": "http://http-router/adm/v1/accounts?limit=0",
                    "job": "adm-api"
                },
                "value": [
                    1673878896.183,
                    "1"
                ]
            },
            ...
        ]
    }
}
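
A response of this shape can be reduced to a simple up/down map. The following is a minimal sketch that assumes the response has already been fetched; the SAMPLE string is a trimmed copy of the example above, and the function name probe_status is hypothetical.

```python
import json

# Trimmed copy of the example response shown above.
SAMPLE = """
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {
        "metric": {
          "__name__": "probe_success",
          "api": "Account and Device Manager",
          "instance": "http://http-router/adm/v1/accounts?limit=0",
          "job": "adm-api"
        },
        "value": [1673878896.183, "1"]
      }
    ]
  }
}
"""


def probe_status(response: dict) -> dict:
    """Map each probed instance to True (probe succeeded) or False."""
    if response.get("status") != "success":
        raise ValueError("query did not succeed")
    return {
        series["metric"]["instance"]: series["value"][1] == "1"
        for series in response["data"]["result"]
    }


status = probe_status(json.loads(SAMPLE))
print(status)
```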

Get a count of monitored endpoints

Request

To query Prometheus for a count of the monitored endpoints, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=count(probe_success)

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example
CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {},
                "value": [
                    1674043081.6,
                    "31"
                ]
            }
        ]
    }
}

This shows that 31 endpoints are being monitored. (The other value in the same block is the Unix epoch timestamp.)
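
Each value pair therefore holds a timestamp followed by the sample as a string. A short sketch of unpacking it, using the pair from the example above (the helper name read_sample is an assumption):

```python
from datetime import datetime, timezone


def read_sample(value: list) -> tuple:
    """Split a [timestamp, "value"] pair into a UTC datetime and an int."""
    timestamp, raw = value
    return datetime.fromtimestamp(timestamp, tz=timezone.utc), int(raw)


when, count = read_sample([1674043081.6, "31"])
print(when.isoformat(), count)
```

The int() conversion suits counts such as this one; for response times, which are fractional, float() would be the appropriate choice.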

Get a list of monitored endpoints showing only the most relevant fields

Request

To query Prometheus for a list of monitored endpoints, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=count without (job,api)(probe_success)

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one endpoint only (in this case, ADM accounts).

CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "instance": "https://operator.sitq3ga.otv-staging.com/adm/v1/accounts?limit=0"
                },
                "value": [
                    1674045670.591,
                    "1"
                ]
            },
            ...
        ]
    }
}

If you are using curl and jq, you can pass jq a filter, together with the -r (raw output) option, to show just the list of endpoints.

For example:

CODE
curl -s --location -g --request GET 'https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=count%20without(job,api)(probe_success)' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Authorization: Bearer <keycloak_token>' | jq -r '.data.result[].metric.instance'

Get a list of inactive endpoints

Request

To query Prometheus for just the endpoints that are inactive, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success==0

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one module only (in this case, MDS).

CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "probe_success",
                    "api": "External",
                    "instance": "https://admin.sitq3ga.otv-staging.com/metadata/delivery/GLOBAL/vod/nodes?limit=0",
                    "job": "mds-api"
                },
                "value": [
                    1674048210.825,
                    "0"
                ]
            },
            ...
        ]
    }
}

Get usage counts for all metrics and statuses

Request

To query Prometheus for the total response count per HTTP status for each metric, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sni_http_response_count_total

The value that is returned for a particular metric and status is the cumulative number of responses since the service started. To get the number of responses over a particular time period, use a time offset to get the count at a specific point in the past and compare it with the current value.

Note that multiple blocks are returned for certain modules.

For example, for ADM, there are separate blocks for adm_devices, adm_update, adm_bundled_accounts, and adm_user_accounts.
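
The comparison described above can be sketched as follows. It assumes you already hold two result lists: one from the query as shown, and one taken at an earlier point (for example, by appending the standard PromQL modifier offset 1h to the metric name). The function names are hypothetical and the earlier value of 20 is invented for illustration.

```python
def series_key(metric: dict) -> frozenset:
    """Identify a series by its labels, ignoring the metric name."""
    return frozenset((k, v) for k, v in metric.items() if k != "__name__")


def responses_in_period(current: list, earlier: list) -> dict:
    """Subtract earlier cumulative counts from current ones, series by series."""
    before = {series_key(s["metric"]): float(s["value"][1]) for s in earlier}
    deltas = {}
    for s in current:
        key = series_key(s["metric"])
        # A series with no earlier sample is treated as starting from zero.
        deltas[key] = float(s["value"][1]) - before.get(key, 0.0)
    return deltas


labels = {"request_uri": "rmg", "method": "POST", "http_code": "201"}
now = [{"metric": labels, "value": [1673955157.403, "27"]}]
hour_ago = [{"metric": labels, "value": [1673951557.403, "20"]}]  # invented value
print(responses_in_period(now, hour_ago))
```

Because the counts are cumulative since the service started, a negative difference most likely means the service restarted within the period.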

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

To save space, the following example includes the output for one metric and one HTTP status only (in this case, status 201 for RMG).

CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "sni_http_response_count_total",
                    "environment": "sitq3ga",
                    "host": "sni_router01",
                    "http_code": "201",
                    "instance": "sni_router01",
                    "job": "sni_router-log-exporter",
                    "method": "POST",
                    "request_uri": "rmg",
                    "status": "201"
                },
                "value": [
                    1673955157.403,
                    "27"
                ]
            },
            ...
        ]
    }
}
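
Because each combination of labels is returned as its own block, it can be convenient to total the counts per metric and HTTP status. This is a minimal sketch under that assumption. The first entry mirrors the example above; the second, including the host sni_router02, is invented to show the summing.

```python
from collections import defaultdict


def counts_by_uri_and_status(result: list) -> dict:
    """Sum response counts per (request_uri, http_code) across other labels."""
    totals = defaultdict(float)
    for series in result:
        metric = series["metric"]
        key = (metric["request_uri"], metric["http_code"])
        totals[key] += float(series["value"][1])
    return dict(totals)


result = [
    {"metric": {"request_uri": "rmg", "http_code": "201", "host": "sni_router01"},
     "value": [1673955157.403, "27"]},
    # Hypothetical second series for the same metric and status.
    {"metric": {"request_uri": "rmg", "http_code": "201", "host": "sni_router02"},
     "value": [1673955157.403, "3"]},
]
print(counts_by_uri_and_status(result))
```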

Get count for a specific metric and status

Request

To query Prometheus for the total response count for a specific HTTP status for a specific metric, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sni_http_response_count_total{http_code="200",request_uri="adm_devices"}

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

This shows the response that is returned when you request the response count for HTTP status 200 for the adm_devices metric.

CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "sni_http_response_count_total",
                    "environment": "sitq3ga",
                    "host": "sni_router01",
                    "http_code": "200",
                    "instance": "sni_router01",
                    "job": "sni_router-log-exporter",
                    "method": "DELETE",
                    "request_uri": "adm_devices",
                    "status": "200"
                },
                "value": [
                    1673955157.403,
                    "12"
                ]
            },
            ...
        ]
    }
}

Get the total response time for all metrics and statuses

Request

To query Prometheus for the total response time for all available metrics and statuses, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sni_http_response_time_seconds_sum

You can use the total response time together with the usage counts to calculate the average response time for each metric.

Headers
  • Authorization: Bearer <keycloak_token>
  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

If there were no requests for the endpoints that are covered by a particular metric for the data collection period, the value returned will be NaN.
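
When combining the total response time with a response count on the client side, the NaN case needs handling. The following is a minimal sketch; the function name is hypothetical, and it assumes both values arrive as the strings that Prometheus returns.

```python
import math
from typing import Optional


def average_response_time(total_seconds: str, response_count: str) -> Optional[float]:
    """Return total / count in seconds, or None when no average can be computed."""
    total = float(total_seconds)  # float() accepts the string "NaN"
    count = float(response_count)
    if math.isnan(total) or math.isnan(count) or count == 0:
        return None
    return total / count


# The count of 12 is illustrative only; it does not belong to the same series.
print(average_response_time("79.46000000000002", "12"))
print(average_response_time("NaN", "0"))
```

Returning None rather than NaN makes the "no traffic" case explicit for callers, which is a design choice of this sketch and not something the API requires.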

Example

This shows the response that is returned when you request the total response time.

To save space, the following example includes the output for one metric and one HTTP status only (in this case, status 200 for MDS events).

CODE
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "sni_http_response_time_seconds_sum",
                    "environment": "sitq3ga",
                    "host": "sni_router01",
                    "http_code": "200",
                    "instance": "sni_router01",
                    "job": "sni_router-log-exporter",
                    "method": "DELETE",
                    "request_uri": "mds_events",
                    "status": "200"
                },
                "value": [
                    1674482200.427,
                    "79.46000000000002"
                ]
            },
            ...
        ]
    }
}