Metrics API

Overview

OpenTV Video Platform uses Prometheus to collect usage and health metrics for its public APIs. These metrics feed our API Service Status and API Service Usage Grafana dashboards. You can also collect them directly through the standard Prometheus HTTP API to integrate with an external monitoring system: https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1

You can query the Prometheus API to get information such as:

  • Public API endpoints' health status (those used to build your Grafana “[UD01] Service API Status Dashboards”):

    • Whether the latest probe of each API endpoint was successful or not

  • Public API endpoints' traffic usage metrics (those used to build your Grafana “[UD02] Service API Usage Dashboards”):

    • Request throughput (number of calls per API endpoint)

    • Error ratio per API endpoint

    • Response time per API endpoint

Prometheus allows you to construct complex, sophisticated queries. It is beyond the scope of this page to cover all of its functionality.

For full details, see the Prometheus API documentation.

You can perform calculations on the data that is returned to compute additional metrics.
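The example URLs on this page show PromQL expressions unencoded for readability, but many HTTP clients need the spaces, quotes, and braces percent-encoded. The following is a minimal sketch using only the Python standard library. The host name is a placeholder and build_query_url is a hypothetical helper, not part of any NAGRA or Prometheus client.

```python
from urllib.parse import quote, urlencode


def build_query_url(base_url: str, promql: str) -> str:
    """Return an instant-query URL with the PromQL expression percent-encoded."""
    # quote_via=quote encodes spaces as %20 instead of "+".
    params = urlencode({"query": promql}, quote_via=quote)
    return f"{base_url.rstrip('/')}/query?{params}"


# Placeholder host: substitute your own <environment_name> and <dns_domain>.
BASE_URL = "https://operator.example.invalid/prometheus-ext-server/api/v1"

print(build_query_url(BASE_URL, 'min by (api)(probe_success{api="IAS API"})'))
```

The resulting URL can then be sent with any HTTP client, together with the Authorization header described under Authentication.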

As a SaaS provider, NAGRA is accountable for monitoring the OpenTV Video Platform. These metrics are exposed so that customers integrating the platform can use their own monitoring solution; doing so is optional.
Be aware that the exposed metrics are tightly coupled to our underlying reverse proxy technology (i.e., Istio), and are expected to evolve and possibly be replaced in future releases, as we keep evolving our inbound stack to align it with best practices.

Metric categories

For OpenTV Video Platform (and SSP), there are two categories of metrics that NAGRA exposes through Prometheus:

Probe metrics

The probe_success metric indicates, for each API endpoint probe, whether the reply was successful.

The returned value is 0 for unhealthy and 1 for healthy.

Probe metrics labels

The most useful probe_success metric labels for your queries are as follows:

Labels

Description

api

API endpoint name (as displayed in the [UD01] Service API Status Dashboards):

  • “AGS API”

  • “ADM API”

  • “CDVR API”

  • “CIM API”

  • “CRM-GATEWAY API”

  • "Cast, Crew, and Persona Service API"

  • "Content Discovery Gateway API"

  • "Content Workflow Manager API"

  • "Content and Product Manager API"

  • “IAM-API”

  • “IAS API”

  • "ION External Endpoint website"

  • "Keycloak API"

  • "Metadata API"

  • "Ncanto API probe Endpoint APIs"

  • "Opconsole API"

  • "Prometheus Federated API"

  • "Rights Management API"

  • "User Activity Vault API"

Query to retrieve the exhaustive list of “api” endpoints:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/label/api/values

instance

URI used to probe a given API endpoint

job

Internal probe job name

Inbound traffic metrics

There are a number of inbound traffic metrics that are issued by our Istio Ingress Gateways. These include counters on received HTTP requests, as well as HTTP request duration per endpoint, HTTP methods, and HTTP status.

Request duration is the interval between the arrival of a request and the response, that is, how long it takes to serve each request. This is a key indicator of an application's performance.

An increase in response time can mean that there is an issue with an end-user application or with an upstream service, which may be caused by a recent change or upgrade.

The following metrics are available:

  • istio_requests_total: counter type metric incremented for every request handled by our Istio proxy

    • Used to monitor the inbound traffic throughput (number of requests per second received on a given endpoint)

  • istio_request_duration_milliseconds_sum: counter type metric that accumulates the duration, in milliseconds, of all requests handled by our Istio proxy

    • Used to monitor the per-request response time of our API endpoints (max, min, or average over a given time period)

  • istio_request_duration_milliseconds_count: counter type metric incremented for every request handled by the Istio proxy (equivalent to istio_requests_total)

  • istio_request_duration_milliseconds_bucket: histogram type metric used to track the distribution of Istio request durations. It is typically used to compute request duration percentiles (using the histogram_quantile Prometheus function).
    For example, a bucket labeled istio_request_duration_milliseconds_bucket{le="100"} counts the number of requests that had a duration of less than or equal to 100 milliseconds.
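To illustrate how a percentile is derived from cumulative le buckets, here is a simplified Python sketch of the linear interpolation that histogram_quantile applies to classic histograms. It is an illustration only, not the Prometheus implementation, and the bucket bounds and counts are hypothetical.

```python
import math


def histogram_quantile(q: float, buckets: list[tuple[float, float]]) -> float:
    """Estimate the q-quantile from (upper_bound, cumulative_count) buckets.

    The list must include the +Inf bucket, whose count is the total.
    """
    if not 0 < q <= 1:
        raise ValueError("q must be in (0, 1]")
    buckets = sorted(buckets)
    total = buckets[-1][1]
    if total == 0:
        return math.nan
    rank = q * total
    lower_bound, lower_count = 0.0, 0.0
    for upper_bound, count in buckets:
        if count >= rank:
            if math.isinf(upper_bound):
                # The quantile falls in the +Inf bucket: report the highest finite bound.
                return lower_bound
            # Assume requests are spread evenly inside the bucket.
            fraction = (rank - lower_count) / (count - lower_count)
            return lower_bound + (upper_bound - lower_bound) * fraction
        lower_bound, lower_count = upper_bound, count
    return math.nan


# Hypothetical cumulative counts per "le" bound, in milliseconds.
example = [(50, 20), (100, 80), (250, 95), (500, 100), (math.inf, 100)]
print(histogram_quantile(0.5, example))
```

Because the result is interpolated within a bucket, its precision depends on how finely the bucket bounds are spaced around the percentile of interest.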

The above metrics can be used to perform calculations such as:

  • Average response time per API

  • Requests per second per API

  • Response time for the top n% of calls per API

  • Etc.

See Istio / Istio Standard Metrics for more information.

Some useful Istio metrics labels

Istio metrics are returned for a given set of label values, and it is often useful to aggregate a metric over some of those labels. For example, you may want to compute the number of requests handled over a given time frame for a given endpoint (i.e., request_url) and a given HTTP status (i.e., response_code).

Labels

Description

app

Ingress gateway application that issued the metrics
To monitor inbound traffic, always filter on app="ingress-gateway-otvpcse".

request_host

Host header of the HTTP request

Example: "request_host": "api.<environment_name>.<dns_name>"

request_method

HTTP request method (GET, PUT, POST, OPTIONS, etc.)

Example: "request_method": "GET"

request_url

HTTP request URL endpoint (see Monitored endpoints below)

Example: "request_url": "/adm/v1/user"

response_code

HTTP response status code (2xx, 3xx, 4xx or 5xx)

Example: "response_code": "200"

Monitored endpoints

Here is a selection of the API endpoints being monitored:

Services

Endpoints (regex)

Account and Device Manager (ADM)

  • /adm/.*

    • /adm/v[0-9]+/admin.*

    • /adm/v[0-9]+/accountProfiles.*

    • /adm/v[0-9]+/accounts.*

    • /adm/v[0-9]+/bundled/accounts.*

    • /adm/v[0-9]+/deviceProfiles.*

    • /adm/v[0-9]+/devices.*

    • /adm/v[0-9]+/pinTypes.*

    • /adm/v[0-9]+/user.*

Authentication Gateway Service (AGS)

  • /ags.*

    • /ags/servicediscovery.*

    • /ags/signOn.*

Content Builder

  • /contentbuilder.*

    • /contentbuilder/v[0-9]+/curators.*

    • /contentbuilder/v[0-9]+/templates.*

    • /contentbuilder/v[0-9]+/templateviews.*

Content and Product Manager (CPM)

  • /cpm/admin

  • /cpm/commercial

  • /cpm/content

  • /cpm/operator

  • /cpm/purge

Content Discovery Gateway (CDG)

  • /contentdiscovery.*

    • /contentdiscovery/v[0-9]+/contexts.*

    • /contentdiscovery/v[0-9]+/recommendations.*

    • /contentdiscovery/v[0-9]+/templates.*

Content Importer (CIM)

  • /importcim

CRM Gateway

  • /crm-gateway

Content Workflow Manager (CWM)

  • /workflow

Identity Authentication Service (IAS)

  • /ias/.*

    • /ias/v[0-9]+/content_token.*

    • /ias/v[0-9]+/token.*

    • /ias/v[0-9]+/refresh.*

    • /ias/v[0-9]+/signout.*

    • /ias/v[0-9]+/localinfo.*

Image Metadata Service (IMDS)

  • /imagemetadata

Metadata Aggregation Service (MAS)

  • /mas

Metadata Server (MDS)

  • /metadata.*

    • /metadata/delivery/changes.*

    • /metadata/delivery/.*/vod/editorials.*

    • /metadata/delivery/.*/vod/series.*

    • /metadata/delivery/.*/vod/nodes.*

    • /metadata/delivery/.*/vod/products.*

    • /metadata/delivery/.*/btv/products.*

    • /metadata/delivery/.*/btv/programmes.*

    • /metadata/delivery/.*/btv/services.*

    • /metadata/solr/GLOBAL/vod/.*/search.*

Open Device Messaging (ODM)

  • /odm

Rights Manager (RMG)

  • /rmg/v1/operator

  • /rmg/v1/user

User Activity Vault (UAV)

  • /useractivityvault

User Recordings

  • /cdvr.*

    • /cdvr/v[0-9]+/aggregatedrecordings.*

    • /cdvr/v[0-9]+/recordings.*

    • /cdvr/v[0-9]+/seriesrecordings.*

Authentication

Access to the Prometheus APIs is controlled by Keycloak. See Accessing operator APIs using Keycloak for more information.

Output formatting

If you are using Postman to make these requests, it automatically pretty-prints the JSON output.

If you are using curl, you can pipe its output to the jq JSON formatting tool to make the output more readable.

For example:

BASH
curl -s --location -g --request GET 'https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=probe_success' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Authorization: Bearer <keycloak_token>' | jq

All Prometheus responses are returned in JSON format, in a result object that contains the metric and its value.

A metric's value is returned in an array of two values: [Unix epoch timestamp, "value"].
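As an illustration of this structure, the following Python sketch converts each result entry into its labels, a UTC datetime, and a numeric value. The parse_instant_vector helper is hypothetical, and the sample is copied from the IAS API probe example later on this page.

```python
from datetime import datetime, timezone


def parse_instant_vector(response: dict) -> list[tuple[dict, datetime, float]]:
    """Return (labels, UTC timestamp, numeric value) for each result entry."""
    if response.get("status") != "success":
        raise ValueError(f"query failed: {response}")
    samples = []
    for item in response["data"]["result"]:
        timestamp, raw_value = item["value"]
        # The sample value is a string in the JSON payload, so convert it.
        when = datetime.fromtimestamp(timestamp, tz=timezone.utc)
        samples.append((item["metric"], when, float(raw_value)))
    return samples


sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"api": "IAS API"}, "value": [1715185977.698, "1"]}
        ],
    },
}

for labels, when, value in parse_instant_vector(sample):
    print(labels, when.isoformat(), value)
```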

Examples

Note:

Many of the following examples use filters. They only use a few of the many fields that can be filtered on. You can filter on whichever fields you want to get the output that you require.

APIs health monitoring (based on probe metrics)

Get the list of all APIs whose health is being monitored

Request

To query Prometheus for the list of all the APIs whose health is being monitored by probes, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/label/api/values
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example
JSON
{
    "status": "success",
    "data": [
        "AGS API",
        "Account and Device Manager API",
        "CDVR API",
        "CIM API",
        "CRM-GATEWAY API",
        "Cast, Crew, and Persona Service API",
        "Content Discovery Gateway API",
        "Content Workflow Manager API",
        "Content and Product Manager API",
        "External Endpoint APIs",
        "IAM-api",
        "IAS API",
        "ION External Endpoint website",
        "Keycloak API",
        "Metadata API",
        "Ncanto API probe Endpoint APIs",
        "Opconsole API",
        "Prometheus Federated API",
        "Rights Management API",
        "User Activity Vault API"
    ]
}

Get the list of probe health checks' status for a given API

Request

To query Prometheus to get the latest result of probe requests for a given API (e.g., “IAS API”), send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=min by (api)(probe_success{api="IAS API"})
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

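Several public or private endpoints may be probed to determine whether a given API is healthy. The min by (api) aggregation reports the API as healthy only if every one of its probes succeeded.

The returned value is an array of [Unix epoch timestamp, "value"], where the value is: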

  • 1 for success

  • 0 for failure

Example
JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "api": "IAS API"
                },
                "value": [
                    1715185977.698,
                    "1"
                ]
            }
        ]
    }
}

You can also use the min function without the by clause (e.g., GET https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=min(probe_success{api="IAS API"})) to return the API health status as a single value with no labels.

Get a list of all unhealthy APIs

Request

To query Prometheus for a list of all the probed APIs that are currently unhealthy, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=min by (api)(probe_success==0)
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example
JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "api": "ION External Endpoint website"
                },
                "value": [
                    1715185011.919,
                    "0"
                ]
            }
        ]
    }
}

If you are using curl and jq, you can apply a jq filter with the -r (raw output) option to show just the names of the unhealthy APIs. Because this query contains spaces, the example passes it with --get and --data-urlencode so that curl encodes it.

For example:

BASH
curl -s --location --get 'https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query' \
--data-urlencode 'query=min by (api)(probe_success==0)' \
--header 'Content-Type: application/x-www-form-urlencoded' \
--header 'Authorization: Bearer <keycloak_token>' | jq -r '.data.result[].metric.api'
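If you process the response in a script rather than with jq, the same extraction can be sketched in Python. The unhealthy_apis helper is hypothetical, and the sample is the response shown in the example above.

```python
def unhealthy_apis(response: dict) -> list[str]:
    """Return the sorted names of the APIs listed in the query result."""
    # The query already filters on probe_success==0, so every entry is unhealthy.
    return sorted(item["metric"]["api"] for item in response["data"]["result"])


sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {"api": "ION External Endpoint website"},
                "value": [1715185011.919, "0"],
            }
        ],
    },
}

print(unhealthy_apis(sample))
```

An empty list means that no probed API is currently reported as unhealthy.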

APIs usage monitoring (based on Istio metrics)

Get total number of requests received for a given API endpoint

Request

To query Prometheus for the total requests for a given API endpoint (i.e., request_url), send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sum by (request_url)(istio_requests_total{app="ingress-gateway-otvpcse",request_url=~"/metadata/delivery/GLOBAL/btv/services"})

The value that is returned for a particular metric and status is the cumulative number of responses since the service started. To get the number of responses over a particular time period, use a time offset to get the count at a specific point in the past and compare it with the current value.

See Monitored endpoints, above, for examples of API endpoint regexes you can use in your query.

Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

139406 requests have been served since the Istio services started:

JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services"
                },
                "value": [
                    1715187063.594,
                    "139406"
                ]
            }
        ]
    }
}
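Because the counter is cumulative, the number of requests over a period is the difference between the current value and the value at an earlier point in time. The following Python sketch builds the two PromQL expressions using the offset modifier and subtracts the results. The helper names are hypothetical, and the past value of 139000 is invented for illustration.

```python
def window_queries(selector: str, window: str) -> tuple[str, str]:
    """Return (current, past) PromQL expressions for the cumulative count."""
    current = f"sum by (request_url)({selector})"
    # The offset modifier evaluates the selector as it was `window` ago.
    past = f"sum by (request_url)({selector} offset {window})"
    return current, past


def requests_in_window(current_total: float, past_total: float) -> float:
    """Return the number of requests counted between the two samples."""
    delta = current_total - past_total
    if delta < 0:
        raise ValueError("counter decreased; it was probably reset during the window")
    return delta


SELECTOR = (
    'istio_requests_total{app="ingress-gateway-otvpcse",'
    'request_url=~"/metadata/delivery/GLOBAL/btv/services"}'
)

current_q, past_q = window_queries(SELECTOR, "1h")
print(past_q)
# 139406 comes from the example above; 139000 is a hypothetical value from one hour earlier.
print(requests_in_window(139406, 139000))
```

Alternatively, the PromQL increase() function computes the growth of a counter over a range on the server side and accounts for counter resets.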

Get total number of requests received for a given API endpoint per HTTP response status code

Request

To query Prometheus for the total requests received per HTTP response code status for a given API endpoint, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sum by (request_url,response_code)(istio_requests_total{app="ingress-gateway-otvpcse",request_url="/metadata/delivery/GLOBAL/btv/services"})
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example
JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": "200"
                },
                "value": [
                    1715187189.338,
                    "92816"
                ]
            },
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": "403"
                },
                "value": [
                    1715187189.338,
                    "32302"
                ]
            },
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": "304"
                },
                "value": [
                    1715187189.338,
                    "2172"
                ]
            },
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": "204"
                },
                "value": [
                    1715187189.338,
                    "11740"
                ]
            },
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": "503"
                },
                "value": [
                    1715187189.338,
                    "335"
                ]
            },
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": "0"
                },
                "value": [
                    1715187189.338,
                    "44"
                ]
            }
        ]
    }
}
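A per-status breakdown like this one can be used to derive the error ratio for the endpoint. The Python sketch below treats only 5xx responses as errors, which is an assumption of this example (you may also want to count 4xx responses or response code 0). The error_ratio helper is hypothetical, and the counts are those of the example above.

```python
def error_ratio(response: dict) -> float:
    """Return the share of requests answered with a 5xx status code."""
    total = 0.0
    errors = 0.0
    for item in response["data"]["result"]:
        count = float(item["value"][1])
        total += count
        # Assumption: only 5xx codes count as errors.
        if item["metric"]["response_code"].startswith("5"):
            errors += count
    return errors / total if total else 0.0


# Counts per response code, taken from the example response above.
COUNTS = {"200": "92816", "403": "32302", "304": "2172",
          "204": "11740", "503": "335", "0": "44"}

sample = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services",
                    "response_code": code,
                },
                "value": [1715187189.338, count],
            }
            for code, count in COUNTS.items()
        ],
    },
}

print(f"{error_ratio(sample):.4%}")
```

Since these counters are cumulative, this ratio covers the whole period since the services started. For a recent window, apply the same calculation to rate() or increase() results.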

Get request throughput (requests/second) for a given API endpoint

Request

To query Prometheus for the average request throughput (i.e., the number of requests per second) for a given API endpoint (e.g., "/metadata/delivery/GLOBAL/btv/services") over the past 10 minutes, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sum by (request_url)(rate(istio_requests_total{request_url=~"/metadata/delivery/GLOBAL/btv/services"}[10m]))
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

Throughput was approximately 0.014 requests per second on average over the past 10 minutes:

JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "request_url": "/metadata/delivery/GLOBAL/btv/services"
                },
                "value": [
                    1715187654.312,
                    "0.013705368055555556"
                ]
            }
        ]
    }
}

Get average response time for a given API endpoint

Request

To query Prometheus for the average response time in milliseconds for a given API endpoint (e.g., "/metadata/v1/epg") over the past 10 minutes, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=sum(rate(istio_request_duration_milliseconds_sum{app="ingress-gateway-otvpcse",request_url=~"/metadata/v1/epg"}[10m])) by (request_url) / sum(rate(istio_request_duration_milliseconds_count{app="ingress-gateway-otvpcse",request_url=~"/metadata/v1/epg"}[10m])) by (request_url)
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

Average response time was approximately 1073 milliseconds over the past 10 minutes:

JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "request_url": "/metadata/v1/epg"
                },
                "value": [
                    1715188797.602,
                    "1072.780273852061"
                ]
            }
        ]
    }
}

Get response time 95th percentile for a given API endpoint

Request

To query Prometheus for the 95th percentile of the response time for a given API endpoint (e.g., "/metadata/delivery.*") over the past 10 minutes, send a GET request to:

CODE
https://operator.<environment_name>.<dns_domain>/prometheus-ext-server/api/v1/query?query=histogram_quantile(0.95,sum(rate(istio_request_duration_milliseconds_bucket{app="ingress-gateway-otvpcse",request_url=~"/metadata/delivery.*"}[10m])) by (le))
Headers
  • Authorization: Bearer <keycloak_token>

  • Content-Type: application/x-www-form-urlencoded

Response

See the Prometheus docs for the status codes that it returns.

Example

Over the past 10 minutes, 95% of the requests to the /metadata/delivery.* API endpoints were served in approximately 166 milliseconds or less:

JSON
{
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {},
                "value": [
                    1715189231.843,
                    "166.48048820847555"
                ]
            }
        ]
    }
}