Network Monitors

Overview

Network Monitors let you define recurring checks against IP addresses, hostnames, ports, HTTP endpoints, and DNS records. Breeze dispatches each check through a connected agent in the same organization via WebSocket, records the result, and updates the monitor’s status. Monitors can optionally be linked to a discovered asset so that availability data appears alongside the asset’s SNMP and discovery information.

All monitor operations require authentication and one of the organization, partner, or system scopes. Organization-scoped users are automatically restricted to monitors within their own organization.

Monitor Types

Breeze supports four monitor types, stored in the monitor_type PostgreSQL enum:

| Type | DB Value | What It Checks |
|---|---|---|
| ICMP Ping | icmp_ping | Sends ICMP echo requests to a target host. Measures round-trip time. |
| TCP Port | tcp_port | Opens a TCP connection to a specific port. Optionally checks for a banner string. |
| HTTP/Endpoint | http_check | Makes an HTTP request to a URL. Validates status code, response body, SSL, and redirects. |
| DNS | dns_check | Resolves a hostname via DNS. Optionally checks record type, expected value, and nameserver. |

ICMP Ping Configuration

| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| count | integer | 1 to 20 | none | Number of echo requests to send. |
| packetSize | integer | 16 to 65,535 | none | ICMP packet payload size in bytes. |

TCP Port Configuration

| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| port | integer | 1 to 65,535 | required | TCP port number to connect to. |
| expectBanner | string | optional | none | String the server banner must contain for a successful check. |

HTTP/Endpoint Configuration

| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| url | string | valid URL | none | Full URL to request. If omitted, the monitor’s target field is used. |
| method | enum | GET, HEAD, POST, PUT, OPTIONS | none | HTTP method. |
| expectedStatus | integer | 100 to 599 | none | Required HTTP status code for a passing check. |
| expectedBody | string | optional | none | Substring that must appear in the response body. |
| headers | object | key-value pairs | none | Custom HTTP headers to include in the request. |
| followRedirects | boolean | optional | none | Whether to follow HTTP redirects. |
| verifySsl | boolean | optional | none | Whether to verify the TLS certificate. |

DNS Check Configuration

| Field | Type | Constraints | Default | Description |
|---|---|---|---|---|
| hostname | string | min 1 char | none | Hostname to resolve. If omitted, the monitor’s target field is used. |
| recordType | enum | A, AAAA, MX, CNAME, TXT, NS | none | DNS record type to query. |
| expectedValue | string | optional | none | Expected value in the DNS response. |
| nameserver | string | optional | none | Custom DNS nameserver to query. |
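
The four configuration shapes above can be summarized as a TypeScript discriminated union. This is a minimal sketch: the type names and the `validateConfig` helper are hypothetical and not taken from the Breeze codebase. Only the field names and numeric ranges come from the tables.

```typescript
// Hypothetical types mirroring the configuration tables; not Breeze's actual schema code.
type IcmpPingConfig = { count?: number; packetSize?: number };
type TcpPortConfig = { port: number; expectBanner?: string };
type HttpCheckConfig = {
  url?: string;
  method?: "GET" | "HEAD" | "POST" | "PUT" | "OPTIONS";
  expectedStatus?: number;
  expectedBody?: string;
  headers?: Record<string, string>;
  followRedirects?: boolean;
  verifySsl?: boolean;
};
type DnsCheckConfig = {
  hostname?: string;
  recordType?: "A" | "AAAA" | "MX" | "CNAME" | "TXT" | "NS";
  expectedValue?: string;
  nameserver?: string;
};

type MonitorConfig =
  | { monitorType: "icmp_ping"; config: IcmpPingConfig }
  | { monitorType: "tcp_port"; config: TcpPortConfig }
  | { monitorType: "http_check"; config: HttpCheckConfig }
  | { monitorType: "dns_check"; config: DnsCheckConfig };

// True when value is an integer within [min, max].
function inRange(value: number, min: number, max: number): boolean {
  return Number.isInteger(value) && value >= min && value <= max;
}

// Returns a list of constraint violations; an empty list means the config passes.
function validateConfig(m: MonitorConfig): string[] {
  const errors: string[] = [];
  switch (m.monitorType) {
    case "icmp_ping":
      if (m.config.count !== undefined && !inRange(m.config.count, 1, 20))
        errors.push("count must be an integer from 1 to 20");
      if (m.config.packetSize !== undefined && !inRange(m.config.packetSize, 16, 65535))
        errors.push("packetSize must be an integer from 16 to 65535");
      break;
    case "tcp_port":
      if (!inRange(m.config.port, 1, 65535))
        errors.push("port must be an integer from 1 to 65535");
      break;
    case "http_check":
      if (m.config.expectedStatus !== undefined && !inRange(m.config.expectedStatus, 100, 599))
        errors.push("expectedStatus must be an integer from 100 to 599");
      break;
    case "dns_check":
      if (m.config.hostname !== undefined && m.config.hostname.length < 1)
        errors.push("hostname must not be empty");
      break;
  }
  return errors;
}
```

Switching on `monitorType` lets the compiler narrow `config` to the matching shape, so a `port` check cannot accidentally run against a DNS configuration.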

Creating a Monitor

Choose a target. Identify the IP address, hostname, or URL you want to monitor. Optionally associate the monitor with a discovered asset by providing its assetId.
Select a monitor type. Pick one of icmp_ping, tcp_port, http_check, or dns_check.

Send a POST request.

curl -X POST /api/v1/monitors \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Web Server HTTPS",
    "monitorType": "http_check",
    "target": "https://app.example.com",
    "config": {
      "method": "GET",
      "expectedStatus": 200,
      "verifySsl": true
    },
    "pollingInterval": 120,
    "timeout": 10
  }'

Verify the response. A 201 status with the full monitor object confirms creation. A new monitor starts with status unknown and isActive: true.

Polling Intervals and Timeouts

| Parameter | Column | Min | Max | Default | Unit |
|---|---|---|---|---|---|
| pollingInterval | polling_interval | 10 | 86,400 | 60 | seconds |
| timeout | timeout | 1 | 300 | 5 | seconds |

The polling interval controls how frequently the BullMQ scheduler enqueues a check job for the monitor. The timeout is passed to the agent as the maximum time allowed for the check to complete before it is considered failed.
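
As an illustration of the bounds and defaults in the table, the sketch below fills in missing values and rejects out-of-range ones before a create request is sent. The `resolveSchedule` helper and its types are hypothetical; the API performs its own validation, and this only mirrors the documented limits on the client side.

```typescript
// Bounds and defaults copied from the table above; the helper itself is hypothetical.
const POLLING_INTERVAL = { min: 10, max: 86400, fallback: 60 };
const TIMEOUT = { min: 1, max: 300, fallback: 5 };

interface ScheduleInput {
  pollingInterval?: number; // seconds
  timeout?: number; // seconds
}

interface Schedule {
  pollingInterval: number;
  timeout: number;
}

// Apply the documented defaults, then enforce the documented ranges.
function resolveSchedule(input: ScheduleInput): Schedule {
  const pollingInterval = input.pollingInterval ?? POLLING_INTERVAL.fallback;
  const timeout = input.timeout ?? TIMEOUT.fallback;
  if (pollingInterval < POLLING_INTERVAL.min || pollingInterval > POLLING_INTERVAL.max) {
    throw new RangeError("pollingInterval must be between 10 and 86400 seconds");
  }
  if (timeout < TIMEOUT.min || timeout > TIMEOUT.max) {
    throw new RangeError("timeout must be between 1 and 300 seconds");
  }
  return { pollingInterval, timeout };
}
```

Calling `resolveSchedule({})` yields the defaults of 60 and 5 seconds, while `resolveSchedule({ timeout: 0 })` throws a `RangeError`.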

Monitor Status Lifecycle

Each monitor tracks a last_status field with one of four values:

| Status | Meaning |
|---|---|
| unknown | Initial state. No check has completed yet. |
| online | The most recent check succeeded. |
| degraded | The check succeeded but with concerning metrics (e.g., high response time). |
| offline | The most recent check failed. |

Consecutive Failures

The consecutive_failures counter increments by 1 each time a check result has status offline. It resets to 0 on any non-offline result (online or degraded). This counter is used by alert rules with the consecutive_failures_gt condition.
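
The counter rule can be written as a one-line function. The sketch below is illustrative only: `nextConsecutiveFailures` is a hypothetical name, and treating an unknown result as a reset follows from the "any non-offline result" wording rather than from the worker's source.

```typescript
type CheckStatus = "online" | "offline" | "degraded" | "unknown";

// Increment on an offline result; reset to 0 on anything else.
function nextConsecutiveFailures(current: number, resultStatus: CheckStatus): number {
  return resultStatus === "offline" ? current + 1 : 0;
}

// Walk a sample sequence of results, starting from zero failures.
const resultSequence: CheckStatus[] = ["offline", "offline", "degraded", "offline"];
const trace: number[] = [];
let failures = 0;
for (const s of resultSequence) {
  failures = nextConsecutiveFailures(failures, s);
  trace.push(failures);
}
// trace is now [1, 2, 0, 1]: the degraded result resets the streak.
```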

State Update Transaction

When a check result is processed, the worker inserts the network_monitor_results row and updates the parent network_monitors row inside a single database transaction. The following fields are updated on the monitor:

last_checked — timestamp of the check
last_status — the result status
last_response_ms — response time in milliseconds
last_error — error message (or NULL on success)
consecutive_failures — incremented or reset
updated_at — current timestamp
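
The field list above maps naturally onto a small function that derives the monitor update from a check result. This is a sketch under assumptions: the `CheckResult` and `MonitorStatePatch` shapes and the `buildMonitorPatch` name are hypothetical, and the code only computes values. In the worker, writing the result row and applying this update happen in one database transaction.

```typescript
type CheckStatus = "online" | "offline" | "degraded" | "unknown";

// Hypothetical shape of a processed check result.
interface CheckResult {
  status: CheckStatus;
  responseMs: number | null;
  error: string | null; // null on success
  timestamp: Date;
}

// Column names taken from the list above.
interface MonitorStatePatch {
  last_checked: Date;
  last_status: CheckStatus;
  last_response_ms: number | null;
  last_error: string | null;
  consecutive_failures: number;
  updated_at: Date;
}

// Derive the monitor update from the previous failure count and the new result.
function buildMonitorPatch(previousFailures: number, result: CheckResult, now: Date): MonitorStatePatch {
  return {
    last_checked: result.timestamp,
    last_status: result.status,
    last_response_ms: result.responseMs,
    last_error: result.error,
    consecutive_failures: result.status === "offline" ? previousFailures + 1 : 0,
    updated_at: now,
  };
}
```

Keeping the derivation pure makes it easy to unit test separately from the transaction that persists it.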

Results and Response Times

Every completed check writes a row to the network_monitor_results table:

| Column | Type | Description |
|---|---|---|
| id | UUID | Primary key. |
| monitor_id | UUID | Foreign key to network_monitors. Cascades on delete. |
| status | enum | online, offline, degraded, or unknown. |
| response_ms | real | Response time in milliseconds. |
| status_code | integer | HTTP status code (HTTP checks only). |
| error | text | Error message if the check failed. |
| details | JSONB | Additional check-specific data returned by the agent. |
| timestamp | timestamp | When the check was performed. |

Querying Results

Fetch historical results with optional time-range filtering:

curl "/api/v1/monitors/<monitorId>/results?start=2026-02-01T00:00:00Z&end=2026-02-18T00:00:00Z&limit=500" \
  -H "Authorization: Bearer <token>"

| Query Parameter | Type | Constraints | Default | Description |
|---|---|---|---|---|
| start | string | ISO 8601 datetime | none | Include results on or after this timestamp. |
| end | string | ISO 8601 datetime | none | Include results on or before this timestamp. |
| limit | integer | 1 to 1,000 | 100 | Maximum number of results to return. |

Results are returned in descending order by timestamp (newest first).
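
When building this request in code, the query string can be assembled from typed inputs. The sketch below assumes a hypothetical `buildResultsPath` helper; it only reflects the documented parameters and the 1 to 1,000 limit, and returns a path that still needs your API host prepended.

```typescript
interface ResultsQuery {
  start?: Date;
  end?: Date;
  limit?: number; // 1 to 1000, defaults to 100
}

// Build the path and query string for the results endpoint.
function buildResultsPath(monitorId: string, query: ResultsQuery = {}): string {
  const limit = query.limit ?? 100;
  if (!Number.isInteger(limit) || limit < 1 || limit > 1000) {
    throw new RangeError("limit must be an integer from 1 to 1000");
  }
  const params: string[] = [];
  // toISOString() produces an ISO 8601 datetime in UTC.
  if (query.start) params.push(`start=${encodeURIComponent(query.start.toISOString())}`);
  if (query.end) params.push(`end=${encodeURIComponent(query.end.toISOString())}`);
  params.push(`limit=${limit}`);
  return `/api/v1/monitors/${encodeURIComponent(monitorId)}/results?${params.join("&")}`;
}
```

Percent-encoding the timestamps keeps the colons in the ISO 8601 value from being misread by intermediaries; the unencoded form in the curl example is also commonly accepted.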

Dashboard Stats

The GET /api/v1/monitors/dashboard endpoint returns aggregate counts for the authenticated organization:

{
  "data": {
    "total": 12,
    "status": {
      "online": 8,
      "offline": 2,
      "degraded": 1,
      "unknown": 1
    },
    "types": {
      "icmp_ping": 4,
      "tcp_port": 3,
      "http_check": 4,
      "dns_check": 1
    }
  }
}

The response includes:

total — total number of monitors in the organization.
status — count of monitors grouped by their current last_status.
types — count of monitors grouped by monitor_type.
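
To make the grouping concrete, the sketch below computes the same shape from an in-memory list of monitors. The `MonitorRow` type and `summarize` function are hypothetical, and whether the real endpoint reports zero for statuses or types with no monitors is an assumption here.

```typescript
type CheckStatus = "online" | "offline" | "degraded" | "unknown";
type MonitorType = "icmp_ping" | "tcp_port" | "http_check" | "dns_check";

interface MonitorRow {
  monitorType: MonitorType;
  lastStatus: CheckStatus;
}

interface DashboardStats {
  total: number;
  status: Record<CheckStatus, number>;
  types: Record<MonitorType, number>;
}

// Count monitors by current status and by type.
function summarize(monitors: MonitorRow[]): DashboardStats {
  const stats: DashboardStats = {
    total: monitors.length,
    status: { online: 0, offline: 0, degraded: 0, unknown: 0 },
    types: { icmp_ping: 0, tcp_port: 0, http_check: 0, dns_check: 0 },
  };
  for (const m of monitors) {
    stats.status[m.lastStatus] += 1;
    stats.types[m.monitorType] += 1;
  }
  return stats;
}
```

Because every monitor has exactly one status and one type, both groups sum to `total`, as in the sample response (8 + 2 + 1 + 1 = 12 and 4 + 3 + 4 + 1 = 12).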

Alert Rules on Monitors

Each monitor can have one or more alert rules. Alert rules are evaluated against check results to trigger alerts at the configured severity level.

Alert Rule Conditions

| Condition | Description | Threshold |
|---|---|---|
| offline | Fires when the monitor status is offline. | Not used. |
| degraded | Fires when the monitor status is degraded. | Not used. |
| response_time_gt | Fires when response time exceeds the threshold. | Milliseconds (string). |
| consecutive_failures_gt | Fires when consecutive failures exceed the threshold. | Failure count (string). |
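
The conditions in the table can be read as the following evaluation logic. This is a sketch, not the Breeze evaluator: `ruleFires`, `parseThreshold`, and the two interfaces are hypothetical, and treating a missing or non-numeric threshold as "never fires" is an assumption.

```typescript
type CheckStatus = "online" | "offline" | "degraded" | "unknown";
type AlertCondition = "offline" | "degraded" | "response_time_gt" | "consecutive_failures_gt";

interface AlertRule {
  condition: AlertCondition;
  threshold: string | null; // stored as a string, per the table above
  isActive: boolean;
}

interface MonitorState {
  lastStatus: CheckStatus;
  lastResponseMs: number | null;
  consecutiveFailures: number;
}

// Parse the string threshold; null means missing or not numeric.
function parseThreshold(raw: string | null): number | null {
  if (raw === null || raw.trim() === "") return null;
  const value = Number(raw);
  return Number.isFinite(value) ? value : null;
}

// Decide whether a rule fires for the monitor's current state.
function ruleFires(rule: AlertRule, state: MonitorState): boolean {
  if (!rule.isActive) return false;
  const limit = parseThreshold(rule.threshold);
  switch (rule.condition) {
    case "offline":
      return state.lastStatus === "offline";
    case "degraded":
      return state.lastStatus === "degraded";
    case "response_time_gt":
      return limit !== null && state.lastResponseMs !== null && state.lastResponseMs > limit;
    case "consecutive_failures_gt":
      return limit !== null && state.consecutiveFailures > limit;
  }
}
```

Both `_gt` conditions use a strict comparison, so a `consecutive_failures_gt` rule with threshold `"3"` stays quiet at three failures and fires on the fourth.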

Alert Rule Severities

Rules use the shared alert_severity enum: critical, high, medium, low, info.

Creating an Alert Rule

curl -X POST /api/v1/monitors/alerts \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "monitorId": "<monitorId>",
    "condition": "consecutive_failures_gt",
    "threshold": "3",
    "severity": "critical",
    "message": "Web server has failed more than 3 consecutive checks"
  }'

Alert Rule Schema

| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| monitorId | UUID | Yes | n/a | The monitor this rule applies to. |
| condition | enum | Yes | n/a | One of offline, degraded, response_time_gt, consecutive_failures_gt. |
| threshold | string | No | null | Threshold value for response_time_gt and consecutive_failures_gt. |
| severity | enum | Yes | n/a | Alert severity: critical, high, medium, low, info. |
| message | string | No | null | Custom message included in the alert. |
| isActive | boolean | No | true | Whether this rule is currently active. |

Testing Monitors

The test endpoint sends a one-off check command to a connected agent without going through the BullMQ queue. It finds an online agent in the monitor’s organization and dispatches the command directly via WebSocket.

curl -X POST /api/v1/monitors/<monitorId>/test \
  -H "Authorization: Bearer <token>"

Response (success):

{
  "data": {
    "monitorId": "<monitorId>",
    "status": "queued",
    "testedAt": "2026-02-18T12:00:00.000Z"
  }
}

Response (no agent available):

{
  "data": {
    "monitorId": "<monitorId>",
    "status": "failed",
    "error": "No online agent available",
    "testedAt": "2026-02-18T12:00:00.000Z"
  }
}
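
The two responses above follow from one decision: whether a suitable agent exists. The sketch below models that decision with hypothetical names (`Device`, `pickOnlineAgent`, `testMonitor`); the set of connected device IDs stands in for whatever the API uses to track live WebSocket sessions, and the actual dispatch is left out.

```typescript
// Hypothetical device record; only the "online" status value is documented.
interface Device {
  id: string;
  orgId: string;
  status: string;
}

type TestOutcome =
  | { monitorId: string; status: "queued"; testedAt: string }
  | { monitorId: string; status: "failed"; error: string; testedAt: string };

// Find a device in the organization that is online and has a live WebSocket connection.
function pickOnlineAgent(devices: Device[], orgId: string, connectedIds: Set<string>): Device | undefined {
  return devices.find((d) => d.orgId === orgId && d.status === "online" && connectedIds.has(d.id));
}

// Produce the response body for a test request.
function testMonitor(monitorId: string, orgId: string, devices: Device[], connectedIds: Set<string>, now: Date): TestOutcome {
  const testedAt = now.toISOString();
  const agent = pickOnlineAgent(devices, orgId, connectedIds);
  if (agent === undefined) {
    return { monitorId, status: "failed", error: "No online agent available", testedAt };
  }
  // A real implementation would send the check command to the agent over its WebSocket here.
  return { monitorId, status: "queued", testedAt };
}
```

A device that is marked online but has no live connection is skipped, which is why both conditions are checked together.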

On-Demand Checks via BullMQ

The check endpoint enqueues a monitor check job through BullMQ rather than dispatching directly. This is the same mechanism the scheduler uses for recurring checks, but triggered manually.

curl -X POST /api/v1/monitors/<monitorId>/check \
  -H "Authorization: Bearer <token>"

Response:

{
  "data": {
    "monitorId": "<monitorId>",
    "status": "queued",
    "message": "Check request queued"
  }
}

How the Job Queue Works

The monitor system uses a BullMQ queue named monitors with three job types:

| Job Type | Name | Description |
|---|---|---|
| check-monitor | Check Monitor | Looks up the monitor, finds an online agent in the org, and dispatches the check command via WebSocket. |
| process-check-result | Process Check Result | Writes the result to network_monitor_results and updates the monitor’s state in a transaction. |
| monitor-scheduler | Monitor Scheduler | Repeatable job (every 30 seconds) that finds all active monitors due for a check and enqueues check-monitor jobs. |

The worker processes jobs with a concurrency of 10. Finished jobs are retained for debugging: up to 100 successful jobs and up to 200 failed jobs.
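
The scheduler's selection step ("active monitors due for a check") can be sketched as a filter. The names below are hypothetical, and two details are assumptions rather than documented behavior: a monitor that has never been checked is treated as due immediately, and a monitor becomes due exactly when its polling interval has elapsed.

```typescript
interface ScheduledMonitor {
  id: string;
  isActive: boolean;
  pollingInterval: number; // seconds
  lastChecked: Date | null; // null if no check has completed yet
}

// A monitor is due when it is active and its polling interval has elapsed.
function isDue(m: ScheduledMonitor, now: Date): boolean {
  if (!m.isActive) return false;
  if (m.lastChecked === null) return true; // assumption: never-checked monitors run right away
  return now.getTime() - m.lastChecked.getTime() >= m.pollingInterval * 1000;
}

// Monitors for which the scheduler would enqueue a check-monitor job on this tick.
function selectDueMonitors(monitors: ScheduledMonitor[], now: Date): ScheduledMonitor[] {
  return monitors.filter((m) => isDue(m, now));
}
```

The same comparison is useful when troubleshooting: if `last_checked` plus `polling_interval` is still in the future, the scheduler has no reason to enqueue the monitor yet.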

Agent Command Mapping

Each monitor type maps to a specific agent command type:

| Monitor Type | Agent Command |
|---|---|
| icmp_ping | network_ping |
| tcp_port | network_tcp_check |
| http_check | network_http_check |
| dns_check | network_dns_check |

The command payload includes the monitor’s target, timeout, and all fields from the monitor’s config object. Command IDs follow the pattern mon-<monitorId>-<timestamp>.
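
Putting the mapping and the ID pattern together, a command could be assembled roughly as follows. The envelope field names (`id`, `type`, `payload`), the `buildCommand` helper, and the use of a millisecond epoch for `<timestamp>` are assumptions for illustration; only the command names, the payload contents, and the `mon-<monitorId>-<timestamp>` pattern come from this page.

```typescript
type MonitorType = "icmp_ping" | "tcp_port" | "http_check" | "dns_check";

// Mapping taken from the table above.
const COMMAND_BY_TYPE: Record<MonitorType, string> = {
  icmp_ping: "network_ping",
  tcp_port: "network_tcp_check",
  http_check: "network_http_check",
  dns_check: "network_dns_check",
};

interface MonitorForDispatch {
  id: string;
  monitorType: MonitorType;
  target: string;
  timeout: number; // seconds
  config: Record<string, unknown>;
}

// Hypothetical command envelope; the real wire format may differ.
interface AgentCommand {
  id: string;
  type: string;
  payload: Record<string, unknown>;
}

// Build the agent command for a monitor at a given timestamp (assumed to be epoch milliseconds).
function buildCommand(monitor: MonitorForDispatch, timestamp: number): AgentCommand {
  return {
    id: `mon-${monitor.id}-${timestamp}`,
    type: COMMAND_BY_TYPE[monitor.monitorType],
    // Config fields are spread first so target and timeout always come from the monitor itself.
    payload: { ...monitor.config, target: monitor.target, timeout: monitor.timeout },
  };
}
```

For a `tcp_port` monitor with ID `42` and timestamp `1000`, this yields the command ID `mon-42-1000` and the type `network_tcp_check`.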

API Reference

All endpoints are mounted at /api/v1/monitors and require authentication.

Monitors

| Method | Path | Description |
|---|---|---|
| GET | /api/v1/monitors | List monitors with optional filters. |
| POST | /api/v1/monitors | Create a new monitor. |
| GET | /api/v1/monitors/dashboard | Aggregate status and type counts. |
| GET | /api/v1/monitors/:id | Get a single monitor with recent results and alert rules. |
| PATCH | /api/v1/monitors/:id | Update monitor fields. |
| DELETE | /api/v1/monitors/:id | Delete a monitor and all associated results/rules. |
| POST | /api/v1/monitors/:id/check | Enqueue an on-demand check via BullMQ. |
| POST | /api/v1/monitors/:id/test | Send a direct test check to an online agent. |
| GET | /api/v1/monitors/:id/results | Query historical check results. |

Alert Rules

| Method | Path | Description |
|---|---|---|
| POST | /api/v1/monitors/alerts | Create an alert rule for a monitor. |
| GET | /api/v1/monitors/:monitorId/alerts | List alert rules for a monitor. |
| PATCH | /api/v1/monitors/alerts/:id | Update an alert rule. |
| DELETE | /api/v1/monitors/alerts/:id | Delete an alert rule. |

List Monitors Query Parameters

| Parameter | Type | Description |
|---|---|---|
| orgId | UUID | Filter by organization. Required for partner and system scopes. |
| assetId | UUID | Filter by discovered asset. Also infers orgId from the asset. |
| monitorType | enum | Filter by type: icmp_ping, tcp_port, http_check, dns_check. |
| status | enum | Filter by current status: online, offline, degraded, unknown. |
| search | string | Search by monitor name or target (substring match). |

Monitoring Assets Integration

Network monitors also surface through the /api/v1/monitoring/assets endpoint, which returns discovered assets enriched with their monitoring configuration. Each asset includes a network object showing:

{
  "network": {
    "configured": true,
    "totalCount": 3,
    "activeCount": 2
  }
}

The DELETE /api/v1/monitoring/assets/:id endpoint disables all active network monitors (and SNMP devices) for a given asset by setting isActive to false.

Troubleshooting

Monitor stuck in “unknown” status

The monitor has never been checked. Verify that:

The monitor has isActive: true.
At least one agent in the same organization is online and connected via WebSocket.
Redis is running and the monitor worker has been initialized (look for [MonitorWorker] Monitor worker initialized in API logs).
The scheduler repeatable job is registered (look for [MonitorWorker] Scheduled repeatable monitor scheduler (every 30s) in logs).

Check dispatched but no result recorded

The agent received the command but did not return a result. Check:

Agent logs for errors processing the network_ping, network_tcp_check, network_http_check, or network_dns_check command.
WebSocket connectivity between the agent and the API.
That the process-check-result job type is being processed (check BullMQ failed jobs for errors).

“Check service unavailable” (503)

The /check endpoint requires Redis for BullMQ job queuing. Ensure Redis is running and the REDIS_URL environment variable is correctly configured.

“No online agent available” on test

The /test endpoint looks for a device with status: 'online' in the monitor’s organization and verifies it has an active WebSocket connection. Ensure at least one agent is enrolled, heartbeating, and connected.

Monitors not running on schedule

The scheduler is a BullMQ repeatable job that runs every 30 seconds. If monitors are not being checked:

Verify the worker is running — check for [MonitorWorker] log lines.
Inspect the monitors queue in Redis for stuck or failed jobs.
Confirm the monitor’s polling_interval has elapsed since last_checked.

Alert rules not firing

Alert rules are stored in the network_monitor_alert_rules table. Verify:

The rule has isActive: true.
The condition and threshold match the current monitor state (e.g., consecutive_failures_gt with threshold "3" requires consecutive_failures > 3).
The monitor’s check results are being recorded — query the /results endpoint to confirm.