CaraML Docs

Configure a monitoring backend

Last updated 2 years ago

Monitoring of Real-Time Routers

Turing routers (as well as enrichers and ensemblers) are deployed using the Knative framework, on top of the Istio service mesh. Both tools provide out-of-the-box Prometheus metrics for tracking common stats such as error rate and request latency, and can be configured by following the official Knative and Istio guides.

Turing routers also publish the following custom Prometheus metrics.

| Metric Name | Description | Type | Tags | Unit |
|---|---|---|---|---|
| mlp_turing_exp_engine_request_duration_ms | The duration for fetching a treatment from the experiment engine | Histogram | status, engine | Milliseconds |
| mlp_route_request_duration_ms | The duration for the call to a route | Histogram | status, route | Milliseconds |
| mlp_turing_comp_request_duration_ms | The duration for a custom operation in the code, useful for debugging | Histogram | status, component | Milliseconds |
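
Because the metrics listed above are Prometheus histograms, latency percentiles are usually derived from their `_bucket` series with PromQL's `histogram_quantile` function. The sketch below only builds such a query string; the helper name `latency_quantile_query`, the 5-minute rate window, and the choice of grouping tag are illustrative assumptions, not part of Turing.

```python
def latency_quantile_query(metric: str, quantile: float, group_by: str,
                           window: str = "5m") -> str:
    """Build a PromQL expression estimating a latency quantile.

    Hypothetical helper for illustration; it only formats a string.
    """
    if not 0.0 <= quantile <= 1.0:
        raise ValueError("quantile must be between 0 and 1")
    # Prometheus exposes histogram buckets as <metric>_bucket with an
    # "le" (less-than-or-equal) label, which histogram_quantile requires.
    return (
        f"histogram_quantile({quantile}, "
        f"sum(rate({metric}_bucket[{window}])) by (le, {group_by}))"
    )


# Example: 99th percentile of experiment engine latency, per engine.
query = latency_quantile_query(
    "mlp_turing_exp_engine_request_duration_ms", 0.99, "engine"
)
print(query)
```

The resulting expression can be pasted into a Prometheus or Grafana query editor. Since the metric is recorded in milliseconds, the result is also in milliseconds.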

Users can also publish their own custom metrics from the enricher or ensembler. All custom metrics (from the router, enricher, or ensembler) must be scraped from the user-container pods before they can be used.
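
To illustrate what a custom metric from an enricher or ensembler looks like to a scraper, here is a dependency-free sketch that records observations in a histogram and renders them in the Prometheus text exposition format. The `Histogram` class, the metric name `enricher_lookup_duration_ms`, and the bucket bounds are all hypothetical. In practice a Prometheus client library would normally handle this and serve the output over HTTP.

```python
from bisect import bisect_left


class Histogram:
    """Tiny stand-in for a Prometheus histogram (illustration only)."""

    def __init__(self, name, buckets):
        self.name = name
        self.bounds = sorted(buckets)
        # One counter per bound, plus a final slot for the +Inf bucket.
        self.counts = [0] * (len(self.bounds) + 1)
        self.total = 0.0

    def observe(self, value):
        # The observation goes to the first bucket whose bound is >= value.
        self.counts[bisect_left(self.bounds, value)] += 1
        self.total += value

    def render(self):
        """Return the metric in Prometheus text exposition format."""
        lines = [f"# TYPE {self.name} histogram"]
        cumulative = 0
        labels = [f"{b:g}" for b in self.bounds] + ["+Inf"]
        for le, count in zip(labels, self.counts):
            cumulative += count  # bucket values are cumulative
            lines.append(f'{self.name}_bucket{{le="{le}"}} {cumulative}')
        lines.append(f"{self.name}_sum {self.total:g}")
        lines.append(f"{self.name}_count {cumulative}")
        return "\n".join(lines) + "\n"


# Hypothetical metric recorded inside an enricher.
hist = Histogram("enricher_lookup_duration_ms", [5, 10, 50])
for duration_ms in (3, 10, 70):
    hist.observe(duration_ms)
print(hist.render())
```

A scraper reading this output from the user-container pod would see the same `_bucket`, `_sum`, and `_count` series that the built-in router histograms expose.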

Configuring the Monitoring URL on Turing

Once the required dashboard has been created using the Prometheus metrics and other data, the template string RouterDefaults.MonitoringURLFormat can be used to configure the monitoring URL when deploying the Turing application (refer to the sample Helm values file for an example). Turing uses this URL when creating or editing a router, and the Turing UI then shows it as a navigation link to the dashboard.
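
As a rough illustration of how a template string can turn into a per-router dashboard link, the sketch below substitutes placeholders in a format string. It assumes Go-template-style placeholders of the form `{{.Name}}`; the placeholder names (`ProjectName`, `RouterName`), the dashboard URL, and the function `render_monitoring_url` are hypothetical and are not confirmed by this page. Turing performs the real substitution itself, so check the sample Helm values file for the supported variables.

```python
import re


def render_monitoring_url(url_format: str, values: dict) -> str:
    """Replace {{.Name}} placeholders in url_format with values[Name].

    Illustrative only; values are inserted as-is, without URL escaping.
    """
    def replace(match):
        key = match.group(1)
        if key not in values:
            raise KeyError(f"no value supplied for placeholder {key!r}")
        return str(values[key])

    return re.sub(r"\{\{\s*\.(\w+)\s*\}\}", replace, url_format)


# Hypothetical format string and router details.
url_format = (
    "https://grafana.example.com/d/turing"
    "?var-project={{.ProjectName}}&var-router={{.RouterName}}"
)
print(render_monitoring_url(
    url_format, {"ProjectName": "demo", "RouterName": "router-1"}
))
```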

Monitoring Batch Ensemblers

TODO
