Monitoring

PgCache exposes Prometheus-compatible metrics via an HTTP endpoint, giving you visibility into query performance, cache behavior, CDC replication health, and connection state.

Enabling Metrics

The PgCache Docker image enables metrics on port 9090 by default. The listen address is set via the [metrics] section (or --metrics_socket) — see the Configuration reference. Publish the port to access them:

docker run -d -p 5432:5432 -p 9090:9090 pgcache/pgcache \
  --upstream postgres://user:password@db:5432/myapp

Security: the metrics endpoint is unauthenticated and exposes operational detail about your traffic. Restrict the metrics port to trusted/internal networks (a private subnet, security group, or firewall rule) — do not publish it to the public internet.

To use a different port, set the METRICS_PORT environment variable:

docker run -d -p 5432:5432 -p 8080:8080 \
  -e METRICS_PORT=8080 \
  -e UPSTREAM_URL=postgres://user:password@db:5432/myapp \
  pgcache/pgcache

When running pgcache outside Docker, add a [metrics] section to your TOML configuration:

[metrics]
socket = "0.0.0.0:9090"

Or use the CLI argument:

pgcache --config pgcache.toml --metrics_socket 0.0.0.0:9090
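
Because the endpoint is unauthenticated, it can be useful to check in deployment tooling whether a configured socket value listens on every interface. The following is a minimal sketch, not part of PgCache: `split_socket` and `binds_all_interfaces` are hypothetical helper names, and the bracketed IPv6 form (`[::]:9090`) is assumed to be an accepted spelling.

```python
import ipaddress


def split_socket(value):
    """Split 'host:port' (or '[v6-address]:port') into (host, port)."""
    host, sep, port = value.rpartition(":")
    if not sep or not port.isdigit():
        raise ValueError(f"expected host:port, got {value!r}")
    # Brackets only wrap IPv6 literals; strip them before parsing the address.
    return host.strip("[]"), int(port)


def binds_all_interfaces(value):
    """Return True if the socket value uses a wildcard address such as 0.0.0.0."""
    host, _port = split_socket(value)
    try:
        return ipaddress.ip_address(host).is_unspecified
    except ValueError:
        # Not a literal IP address (for example a hostname); cannot tell.
        return False


for candidate in ("0.0.0.0:9090", "127.0.0.1:9090"):
    print(candidate, binds_all_interfaces(candidate))
```

A wildcard address is not wrong in itself (inside a container it is usually required), but it means network-level controls are the only thing restricting access to the port.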

The HTTP server exposes several endpoints:

| Endpoint | Description |
| --- | --- |
| GET /metrics | Prometheus metrics in text exposition format |
| GET /healthz | Liveness check: always returns 200 OK if the process is running |
| GET /readyz | Readiness check: returns 200 OK when the cache is running, 503 otherwise |
| GET /status | JSON object with full cache, CDC, and per-query status (see Status Endpoint below) |
| GET /config | Current effective configuration, with a restart_required flag if static fields on disk differ from running values |
| PUT /config | Partial configuration update: writes changes to the TOML file (preserving comments and formatting) and reloads dynamic fields in place |
| POST /config/reload | Re-reads the TOML file and applies any dynamic field changes without a restart |

Histograms report p50, p95, and p99 quantiles.
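
For ad-hoc inspection without a Prometheus server, the text returned by GET /metrics can be parsed directly. The sketch below handles the common `name{labels} value` line shape only. The `parse_metrics` helper and the sample values are illustrative assumptions; the metric names are the underscore forms used in the PromQL section of this page.

```python
import re

# name, optional {labels}, then the sample value (a trailing timestamp is ignored).
_SAMPLE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into {(name, labels): value}."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        match = _SAMPLE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a plain sample line
        name, raw_labels, value = match.groups()
        labels = tuple(sorted(_LABEL_RE.findall(raw_labels or "")))
        samples[(name, labels)] = float(value)
    return samples


# Illustrative payload, not captured from a real PgCache instance.
SAMPLE = """\
# TYPE pgcache_queries_cache_hit_total counter
pgcache_queries_cache_hit_total 15230
pgcache_queries_cache_miss_total 487
pgcache_query_latency_seconds{quantile="0.95"} 0.0042
"""

samples = parse_metrics(SAMPLE)
print(samples[("pgcache_queries_cache_hit_total", ())])
print(samples[("pgcache_query_latency_seconds", (("quantile", "0.95"),))])
```

Keying samples by name plus sorted labels keeps the p50, p95, and p99 series of one histogram distinct from each other.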

Prometheus Scrape Configuration

scrape_configs:
  - job_name: pgcache
    static_configs:
      - targets: ['pgcache-host:9090']

Available Metrics

Query Counters

Track how queries flow through PgCache.

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.queries.total | counter | Total queries received |
| pgcache.queries.cacheable | counter | Queries identified as cacheable |
| pgcache.queries.uncacheable | counter | Queries forwarded to origin (not cacheable) |
| pgcache.queries.unsupported | counter | Unsupported statement types |
| pgcache.queries.invalid | counter | Queries that failed to parse |
| pgcache.queries.cache_hit | counter | Queries served from cache |
| pgcache.queries.cache_miss | counter | Cacheable queries that missed the cache |
| pgcache.queries.cache_error | counter | Cache lookup errors (query forwarded to origin) |
| pgcache.queries.allowlist_skipped | counter | Queries skipped because their tables are not in the allowlist |

Latency Histograms

All latency metrics are in seconds and report p50, p95, and p99 quantiles.

| Metric | Description |
| --- | --- |
| pgcache.query.latency_seconds | End-to-end query latency |
| pgcache.cache.lookup_latency_seconds | Cache lookup time |
| pgcache.origin.latency_seconds | Origin database query time |
| pgcache.query.registration_latency_seconds | Time to register a new query in the cache |

Per-Stage Timing

Detailed breakdown of where time is spent within PgCache:

| Metric | Description |
| --- | --- |
| pgcache.query.stage.parse_seconds | SQL parsing |
| pgcache.query.stage.dispatch_seconds | Dispatching to cache channel |
| pgcache.query.stage.lookup_seconds | Cache lookup |
| pgcache.query.stage.queue_wait_seconds | Time waiting in worker channel queue |
| pgcache.query.stage.conn_wait_seconds | Time waiting for a cache database connection |
| pgcache.query.stage.spawn_wait_seconds | Time waiting for worker task spawn |
| pgcache.query.stage.worker_exec_seconds | Cache worker execution |
| pgcache.query.stage.response_write_seconds | Writing response to client |
| pgcache.query.stage.forward_decision_seconds | Cache-miss path: dispatch to forward decision |
| pgcache.query.stage.coalesce_intake_seconds | Coalesce path: enqueue a waiter |
| pgcache.query.stage.coalesce_wait_seconds | Coalesce path: wait for in-flight population |
| pgcache.query.stage.total_seconds | Total pipeline time |

Connection Metrics

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.connections.total | counter | Total connections accepted |
| pgcache.connections.active | gauge | Currently active connections |
| pgcache.connections.errors | counter | Connection errors |

CDC / Replication Metrics

Monitor the health and throughput of the CDC replication stream.

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cdc.events_processed | counter | Total CDC events processed |
| pgcache.cdc.inserts | counter | Insert events received |
| pgcache.cdc.updates | counter | Update events received |
| pgcache.cdc.deletes | counter | Delete events received |
| pgcache.cdc.lag_bytes | gauge | WAL replication lag in bytes |
| pgcache.cdc.lag_seconds | gauge | Replication lag in seconds |
| pgcache.cdc.received_lsn | gauge | Last LSN received from origin via XLogData |
| pgcache.cdc.flushed_lsn | gauge | Last LSN acknowledged to origin via standby status update |
| pgcache.cdc.applied_lsn | gauge | Highest LSN whose effects have been fully applied by the writer (transaction-aligned) |

CDC Connection Resilience

If the replication connection drops, pgcache automatically switches to forwarding all queries to the origin database while attempting to reconnect. The /readyz endpoint continues to return 200 during this period since the proxy is still serving queries (via origin). Once reconnected and the replication slot LSN is verified, cache dispatch resumes automatically.

Monitor pgcache.cdc.lag_bytes and pgcache.cdc.lag_seconds for replication health. A sudden spike in origin latency (pgcache.origin.latency_seconds) alongside cache hit ratio dropping to zero may indicate the CDC connection is temporarily down and queries are being forwarded.
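
The pattern described above (hit ratio at zero together with a jump in origin latency) can be expressed as a simple check. This is only a sketch of the heuristic: the function name, the baseline value, and the 2x spike factor are hypothetical choices, not PgCache defaults.

```python
def cdc_outage_suspected(hit_ratio, origin_latency_s, baseline_latency_s,
                         spike_factor=2.0):
    """Flag the 'all queries forwarded to origin' pattern.

    hit_ratio          -- recent cache hit ratio, or None if there was no traffic
    origin_latency_s   -- recent pgcache.origin.latency_seconds reading
    baseline_latency_s -- typical origin latency under normal operation
    spike_factor       -- how far above baseline counts as a spike (assumed)
    """
    hit_ratio_collapsed = hit_ratio is not None and hit_ratio == 0.0
    latency_spiked = origin_latency_s >= spike_factor * baseline_latency_s
    return hit_ratio_collapsed and latency_spiked


# Normal operation: high hit ratio, latency near baseline.
print(cdc_outage_suspected(0.97, 0.004, 0.004))
# Suspected outage: no hits at all and origin latency at 3x baseline.
print(cdc_outage_suspected(0.0, 0.012, 0.004))
```

Both conditions are required on purpose: a zero hit ratio alone also occurs right after startup, and a latency spike alone can come from the origin database itself.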

Cache State Metrics

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.invalidations | counter | Cache entries invalidated by CDC events |
| pgcache.cache.evictions | counter | Cache entries evicted due to size limits |
| pgcache.cache.queries_registered | gauge | Number of queries currently cached |
| pgcache.cache.queries_loading | gauge | Queries currently being loaded into cache |
| pgcache.cache.queries_pending | gauge | Queries seen but not yet admitted to cache |
| pgcache.cache.queries_invalidated | gauge | Invalidated entries retained for fast readmission |
| pgcache.cache.readmissions | counter | Queries fast-readmitted after CDC invalidation |
| pgcache.cache.mv_fallthrough | counter | Requests that fell through from a materialized-view result to source-row evaluation |
| pgcache.cache.subsumptions | counter | Queries served via predicate subsumption (data already covered by another cached query) |
| pgcache.cache.subsumption_latency_seconds | histogram | Time spent detecting predicate subsumption |
| pgcache.cache.size_bytes | gauge | Current cache size in bytes |
| pgcache.cache.size_limit_bytes | gauge | Configured cache size limit |
| pgcache.cache.tables_tracked | gauge | Number of tables tracked for cache invalidation |
| pgcache.cache.restarts_total | counter | Successful cache-subsystem restarts performed by the supervisor after a backend failure |
| pgcache.cache.pool_replenished | counter | Poisoned cache-database serve-pool connections discarded and replaced |
| pgcache.cache.pool_recycled | counter | Serve-pool connections recycled to reclaim accumulated Postgres plan-cache memory |

Memory Pressure

pgcache bounds its total memory footprint. As whole-system used memory (pgcache plus the cache Postgres it manages) approaches the registration budget (80% of detected RAM by default, or memory_limit; see Configuration), registration of new distinct queries is throttled and those queries are forwarded to origin instead of being cached. Queries that are already cached continue to be served from the cache.

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.memory_used_bytes | gauge | Whole-system used memory (pgcache + cache Postgres): the figure compared against the budget |
| pgcache.cache.rss_bytes | gauge | Resident set size of the pgcache process alone (its share of the above) |
| pgcache.cache.memory_budget_bytes | gauge | Used-memory high-water mark above which registration is throttled |
| pgcache.cache.query_count_cap | gauge | Max registered queries that fit the memory budget (0 = uncapped) |
| pgcache.cache.marginal_bytes_per_query | gauge | Measured per-query memory footprint, used to derive the count cap |
| pgcache.cache.registration_throttled | gauge | 1 while registration is throttled by memory pressure, else 0 |
| pgcache.cache.registration_throttled_total | counter | Queries forwarded to origin (not registered) due to memory-pressure throttling |
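
To make the budget arithmetic concrete, here is a small worked sketch. It assumes that an explicit memory_limit is used as the budget directly and that throttling applies once used memory exceeds the budget. PgCache's internal logic may differ, and both function names are hypothetical.

```python
GIB = 1024 ** 3


def registration_budget(detected_ram_bytes, memory_limit_bytes=None):
    """Budget in bytes: the explicit limit if given, else 80% of detected RAM."""
    if memory_limit_bytes is not None:
        return memory_limit_bytes  # assumption: the limit is used as-is
    return detected_ram_bytes * 80 // 100  # integer math avoids float rounding


def is_throttled(memory_used_bytes, budget_bytes):
    """Mirror of the registration_throttled gauge under the stated assumption."""
    return memory_used_bytes > budget_bytes


budget = registration_budget(16 * GIB)
print(budget)                           # 80% of 16 GiB, in bytes
print(is_throttled(12 * GIB, budget))   # below the budget
print(is_throttled(14 * GIB, budget))   # above the budget
```

In practice pgcache.cache.memory_used_bytes and pgcache.cache.memory_budget_bytes report both sides of this comparison, so the same check can be written as an alert on their ratio.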

In-Process Result Memo

The in-process result memo is an in-memory tier that serves the hottest queries inline, skipping the cache-database round-trip. Its byte budget is set by memo_cache_size (default 64 MiB; 0 disables) — see the Configuration reference.

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.memo_hits | counter | Cache hits served inline from the in-process memo |
| pgcache.cache.memo_captures | counter | Result snapshots stored into the memo |
| pgcache.cache.memo_evictions | counter | Memo entries dropped (CDC-evicted or budget-reclaimed) |
| pgcache.cache.memo_entries | gauge | Current number of live memo entries |
| pgcache.cache.memo_bytes | gauge | Current total bytes held by the memo |

Writer Queue Depths

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.writer_query_queue | gauge | Pending query registration messages |
| pgcache.cache.writer_cdc_queue | gauge | Pending CDC messages |
| pgcache.cache.writer_internal_queue | gauge | Internal message queue depth |

CDC Handler Metrics

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.handle_inserts | counter | Insert operations processed by cache writer |
| pgcache.cache.handle_updates | counter | Update operations processed by cache writer |
| pgcache.cache.handle_deletes | counter | Delete operations processed by cache writer |
| pgcache.cache.handle_insert_seconds | histogram | Insert handler duration |
| pgcache.cache.handle_update_seconds | histogram | Update handler duration |
| pgcache.cache.handle_delete_seconds | histogram | Delete handler duration |
| pgcache.cache.cdc_prepared_hits | counter | Prepared-statement cache hits for per-query CDC evaluation |
| pgcache.cache.cdc_prepared_misses | counter | Prepared-statement cache misses for CDC evaluation |

Writer Instrumentation

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.writer.command_handle_seconds | histogram | Per-command writer handler latency, labeled by cmd |
| pgcache.cache.writer.register.resolve_seconds | histogram | query_register phase: resolve |
| pgcache.cache.writer.register.subsumption_check_seconds | histogram | query_register phase: subsumption check |
| pgcache.cache.writer.register.subsume_seconds | histogram | query_register phase: subsume |
| pgcache.cache.writer.register.insert_seconds | histogram | query_register phase: insert |
| pgcache.cache.writer.register.publication_update_seconds | histogram | query_register phase: publication update |
| pgcache.cache.writer.register.populate_dispatch_seconds | histogram | query_register phase: dispatch population |
| pgcache.cache.writer.resolve.update_queries_register_seconds | histogram | query_resolve phase: register update queries |
| pgcache.cache.writer.resolve.deparse_seconds | histogram | query_resolve phase: deparse |
| pgcache.cache.writer.update_queries_total | gauge | Total update queries across all relations |
| pgcache.cache.writer.update_queries_max_per_relation | gauge | Largest update-query count on any single relation |

Population Pipeline

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.cache.population.task_seconds | histogram | Per-task population duration |
| pgcache.cache.population.stream_seconds | histogram | Time spent streaming rows from origin |
| pgcache.cache.population.wait_seconds | histogram | Time waiting on the population channel |
| pgcache.cache.population.worker_idle_seconds | histogram | Per-worker idle time between tasks |

Protocol Metrics

| Metric | Type | Description |
| --- | --- | --- |
| pgcache.protocol.simple_queries | counter | Queries using the simple query protocol |
| pgcache.protocol.extended_queries | counter | Queries using the extended query protocol (Parse/Bind/Execute) |
| pgcache.protocol.prepared_statements | counter | Prepared statements created |
| pgcache.protocol.describe_cache.hits | counter | Synthesized Parse/Describe responses served from the describe cache |
| pgcache.protocol.describe_cache.misses | counter | Describe-cache lookups that required building a new entry |
| pgcache.protocol.describe_cache.evictions | counter | Describe-cache entries evicted |
| pgcache.protocol.describe_cache.invalidations | counter | Describe-cache entries invalidated |
| pgcache.protocol.lazy_parse_forwarded | counter | Parse messages sent to the origin lazily on the forward path |
| pgcache.protocol.close_local | counter | Close(statement) messages handled locally (statement never prepared on origin) instead of being forwarded |

Key PromQL Queries

The queries below use the names as they appear in Prometheus: the dots in the metric names listed above become underscores, and the counters used here carry a _total suffix (for example, pgcache.queries.cache_hit is queried as pgcache_queries_cache_hit_total).

Cache Hit Ratio

rate(pgcache_queries_cache_hit_total[5m])
/
(rate(pgcache_queries_cache_hit_total[5m]) + rate(pgcache_queries_cache_miss_total[5m]))
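
Since both rate() calls cover the same window, the expression is approximately the increase in hits divided by the increase in hits plus misses over that window. The sketch below works the same arithmetic on two counter readings. The `hit_ratio` helper and the sample readings are illustrative assumptions.

```python
def hit_ratio(prev, curr):
    """Hit ratio between two (hits, misses) counter readings.

    Returns None when there were no cacheable lookups in the interval,
    or when a counter went backwards (for example after a restart).
    """
    delta_hits = curr[0] - prev[0]
    delta_misses = curr[1] - prev[1]
    if delta_hits < 0 or delta_misses < 0:
        return None  # counter reset; skip this interval
    lookups = delta_hits + delta_misses
    if lookups == 0:
        return None  # avoid dividing by zero on an idle interval
    return delta_hits / lookups


# Two readings taken five minutes apart (made-up numbers).
earlier = (15000, 480)
later = (15230, 487)
print(hit_ratio(earlier, later))  # 230 hits out of 237 lookups
```

The None cases correspond to what PromQL shows as an empty result or NaN when the denominator is zero.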

Query Latency (p95)

pgcache_query_latency_seconds{quantile="0.95"}

CDC Replication Lag

pgcache_cdc_lag_seconds

Cache Size Utilization

pgcache_cache_size_bytes / pgcache_cache_size_limit_bytes

Status Endpoint

The GET /status endpoint returns a JSON object with real-time cache, CDC, and per-query information. Status data is gathered on demand from the cache writer via message passing (2-second timeout). If the cache thread is unresponsive, the endpoint returns 503.

Response Structure

{
  "cache": {
    "size_bytes": 10485760,
    "size_limit_bytes": 1073741824,
    "generation": 42,
    "tables_tracked": 5,
    "policy": "clock",
    "queries_registered": 12,
    "uptime_ms": 3600000,
    "cache_hits": 15230,
    "cache_misses": 487
  },
  "cdc": {
    "tables": ["public.users", "public.orders"],
    "last_applied_lsn": 12345600
  },
  "queries": [
    {
      "fingerprint": 9876543210,
      "sql_preview": "SELECT * FROM users WHERE ...",
      "tables": ["public.users"],
      "state": "cached",
      "cached_bytes": 2048,
      "max_limit": null,
      "pinned": false,
      "hit_count": 1523,
      "miss_count": 12,
      "idle_duration_ms": 245,
      "registered_duration_ms": 3580000,
      "cached_duration_ms": 3500000,
      "invalidation_count": 3,
      "readmission_count": 2,
      "eviction_count": 1,
      "subsumption_count": 45,
      "population_count": 4,
      "last_population_duration_ms": 120,
      "total_bytes_served": 4915200,
      "population_row_count": 500,
      "cache_hit_latency": {
        "count": 1523,
        "mean_us": 245.3,
        "p50_us": 210,
        "p95_us": 480,
        "p99_us": 920,
        "min_us": 85,
        "max_us": 3200
      }
    }
  ]
}

Cache Status Fields

| Field | Description |
| --- | --- |
| queries_registered | Number of queries currently registered in the cache |
| uptime_ms | Proxy uptime in milliseconds |
| cache_hits | Total cache hits across all queries |
| cache_misses | Total cache misses across all queries |

Per-Query Metrics

Each entry in the queries array includes operational metrics:

| Field | Description |
| --- | --- |
| hit_count | Number of times this query was served from cache |
| miss_count | Number of times this query missed the cache |
| idle_duration_ms | Milliseconds since the last cache hit (null if no hits yet) |
| registered_duration_ms | Milliseconds since the query was first seen |
| cached_duration_ms | Milliseconds since the query was last populated (null if not currently cached) |
| invalidation_count | Number of times invalidated by CDC events |
| readmission_count | Number of times readmitted after invalidation |
| eviction_count | Number of times evicted from cache |
| subsumption_count | Number of times served via predicate subsumption |
| population_count | Number of times the cache was populated for this query |
| last_population_duration_ms | Duration of the last population in milliseconds |
| total_bytes_served | Cumulative bytes served from cache for this query |
| population_row_count | Number of rows inserted during the last population |
| cache_hit_latency | Latency histogram for cache hits (null if no hits yet); includes count, mean_us, p50_us, p95_us, p99_us, min_us, max_us |
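
As an illustration of how these fields combine, the sketch below derives an overall hit ratio, cache size utilization, and a per-query summary from a /status document. It runs on a trimmed copy of the sample response above rather than a live endpoint, and `summarize_status` is a hypothetical helper name.

```python
import json

# Trimmed copy of the sample response shown earlier on this page.
STATUS_JSON = """
{
  "cache": {"size_bytes": 10485760, "size_limit_bytes": 1073741824,
            "cache_hits": 15230, "cache_misses": 487},
  "queries": [
    {"fingerprint": 9876543210, "state": "cached",
     "hit_count": 1523, "miss_count": 12,
     "cache_hit_latency": {"count": 1523, "p95_us": 480}}
  ]
}
"""


def ratio(numerator, denominator):
    """Divide, returning None instead of failing on a zero denominator."""
    return numerator / denominator if denominator else None


def summarize_status(status):
    """Reduce a /status document to a few headline numbers."""
    cache = status["cache"]
    lookups = cache["cache_hits"] + cache["cache_misses"]
    summary = {
        "hit_ratio": ratio(cache["cache_hits"], lookups),
        "size_utilization": ratio(cache["size_bytes"], cache["size_limit_bytes"]),
        "queries": [],
    }
    for query in status["queries"]:
        # cache_hit_latency is null until the query has had at least one hit.
        latency = query.get("cache_hit_latency") or {}
        summary["queries"].append({
            "fingerprint": query["fingerprint"],
            "state": query["state"],
            "hit_ratio": ratio(query["hit_count"],
                               query["hit_count"] + query["miss_count"]),
            "p95_us": latency.get("p95_us"),
        })
    return summary


summary = summarize_status(json.loads(STATUS_JSON))
print(json.dumps(summary, indent=2))
```

Sorting the per-query entries by hit_ratio or p95_us is a quick way to spot queries that are registered but rarely served from cache.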

Health Checks

Use /healthz and /readyz as Kubernetes or load balancer health probes:

# Kubernetes example
livenessProbe:
  httpGet:
    path: /healthz
    port: 9090
readinessProbe:
  httpGet:
    path: /readyz
    port: 9090
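
Outside Kubernetes, the same endpoints can be polled from a script, for example to block a deployment step until the proxy is ready. This is a minimal sketch using only the Python standard library. The URL, retry count, and delay are assumptions, and `http_status` and `wait_until_ready` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def http_status(url, timeout=2.0):
    """Return the HTTP status code of a GET request, or None if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # non-2xx responses such as 503 arrive as HTTPError
    except (urllib.error.URLError, OSError):
        return None  # connection refused, DNS failure, timeout, ...


def wait_until_ready(probe, attempts=10, delay=1.0, sleep=time.sleep):
    """Call probe() until it returns 200; give up after `attempts` tries."""
    for attempt in range(attempts):
        if probe() == 200:
            return True
        if attempt < attempts - 1:
            sleep(delay)
    return False


if __name__ == "__main__":
    # Hypothetical address; adjust to wherever the metrics port is published.
    ready = wait_until_ready(lambda: http_status("http://localhost:9090/readyz"))
    print("ready" if ready else "not ready")
```

Passing the probe in as a callable keeps the retry loop independent of the network, which makes it easy to exercise with canned status codes.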

Log Configuration

In addition to metrics, PgCache supports configurable logging via the log_level setting. The value uses the EnvFilter directive syntax from the Rust tracing-subscriber crate:

# Show info and above for all modules
log_level = "info"

# Debug logging for the cache subsystem, info for everything else
log_level = "pgcache_lib::cache=debug,info"

# Trace-level logging (very verbose)
log_level = "trace"

You can also set the RUST_LOG environment variable for one-off debugging sessions.