Troubleshoot Logister.

Start with the smallest failing path: app health, token authentication, payload shape, background jobs, email, or optional analytics storage. Most issues become obvious once you test one layer at a time.

Guide overview

Debug the layer that can explain the symptom.

Logister has a small production baseline: Rails web, PostgreSQL, Redis, and Sidekiq. Optional services such as ClickHouse, SMTP, Turnstile, analytics, and S3-compatible archive storage should be checked only after the baseline is healthy.

Quick checks

Confirm the baseline before chasing optional services.

  1. Open /up. If it fails, fix the web process, database connection, or deploy health first.
  2. Confirm the app has RAILS_ENV, RAILS_MASTER_KEY, DATABASE_URL, REDIS_URL, LOGISTER_PUBLIC_URL, and LOGISTER_ADMIN_EMAILS.
  3. Confirm one web process and one Sidekiq worker process are running from the same release image or source revision.
  4. Create a project, generate a fresh API key, and keep the token available for one direct test request.
  5. Look at Rails and worker logs for request IDs, validation errors, authentication failures, and job errors around the same timestamp.
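
The first two checks can be scripted. The sketch below is an illustration, not part of Logister: `missing_vars` and `up_status` are hypothetical helper names, and it assumes /up is reachable under the base URL you pass in. It prints variable names only, never values.

```shell
# Baseline variables named in the quick checks above.
REQUIRED_VARS="RAILS_ENV RAILS_MASTER_KEY DATABASE_URL REDIS_URL LOGISTER_PUBLIC_URL LOGISTER_ADMIN_EMAILS"

# Print the name of every required variable that is unset or empty.
# Only names are printed, so the output is safe to paste into a ticket.
missing_vars() {
  for name in $REQUIRED_VARS; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      printf '%s\n' "$name"
    fi
  done
}

# Print the HTTP status of the health endpoint for base URL $1.
# curl reports 000 when the host cannot be reached.
up_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 "${1%/}/up"
}

# Usage, run inside the web or worker environment:
#   missing_vars
#   up_status "$LOGISTER_PUBLIC_URL"
```

An empty result from `missing_vars` means every baseline variable has a value; it does not confirm that the values are correct.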

Start from a fresh token

If a project was archived and then restored, its previously active tokens were revoked when it was archived. Generate a new API key before testing ingestion again.

Empty inbox

First decide whether the event is being accepted.

An empty inbox can mean the app never received the event, the token was rejected, the payload failed validation, the event_type was not error, or the event landed in Activity, Insights, Performance, or Monitors instead of the error inbox.

shell
curl -i "$LOGISTER_ENDPOINT/api/v1/ingest_events" \
  -H "Authorization: Bearer $LOGISTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "event": {
      "event_type": "error",
      "level": "error",
      "message": "Troubleshooting test error",
      "fingerprint": "docs-troubleshooting-test",
      "environment": "production",
      "context": {
        "service": "manual-test"
      }
    }
  }'

A successful write returns 201 Created with an accepted status and an id. If it is accepted but the inbox still looks empty, check the project you sent to, the selected status filter, search terms, environment or release filters, archived project views, and whether the event was grouped into an existing issue.

Rejected events

Use the status code to narrow the fix.

Status | Likely cause | What to check
400 | The JSON body is missing the required event, EVENT, check_in, or CHECK_IN envelope. | Compare the request with the HTTP API guide. Some runtimes uppercase struct keys, which Logister accepts.
401 | The token is missing, invalid, revoked, inactive, or belongs to an archived project. | Use Authorization: Bearer <token> or X-Api-Key: <token>, generate a fresh project token, and confirm the project is active.
422 | The envelope exists, but required event or span fields did not validate. | Read the returned errors array. For spans, confirm trace_id, span_id, name/message, and duration fields.
429 | The API token or authentication-failure source exceeded a public API rate limit. | Honor Retry-After. Admins listed in LOGISTER_ADMIN_EMAILS can tune project overrides or global rate-limit environment variables.
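
The table can be folded into a small helper for scripted test requests. This is a sketch under assumptions: `hint_for_status` and `retry_after` are hypothetical names, the hint strings paraphrase the table and the Empty inbox section rather than quoting Logister output, and `retry_after` assumes the header carries a number of seconds rather than an HTTP date.

```shell
# Map an ingest response status to the first thing to check.
hint_for_status() {
  case "$1" in
    201) echo "accepted: check project, filters, and grouping in the UI" ;;
    400) echo "envelope: body needs an event or check_in key" ;;
    401) echo "token: send a fresh key for an active project" ;;
    422) echo "validation: read the errors array in the response" ;;
    429) echo "rate limit: wait for Retry-After before retrying" ;;
    *)   echo "unexpected: check /up and the Rails logs" ;;
  esac
}

# Print the Retry-After value from a header file saved with `curl -D FILE`.
# Header names are matched case-insensitively; carriage returns are stripped.
retry_after() {
  tr -d '\r' < "$1" |
    awk -F': *' 'tolower($1) == "retry-after" { print $2; exit }'
}

# Usage:
#   status=$(curl -s -o body.json -D headers.txt -w '%{http_code}' \
#     "$LOGISTER_ENDPOINT/api/v1/ingest_events" \
#     -H "Authorization: Bearer $LOGISTER_API_KEY" \
#     -H "Content-Type: application/json" -d @event.json)
#   hint_for_status "$status"
#   [ "$status" = "429" ] && sleep "$(retry_after headers.txt)"
```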

If self-observability is configured with LOGISTER_API_KEY and LOGISTER_ENDPOINT, rejected client submissions are also reported through the Ruby gem as sanitized warning logs.

Background jobs

Run Sidekiq whenever the app needs async work.

Sidekiq delivers email, writes optional ClickHouse copies, schedules digests, records first-occurrence alerts, and runs operator-triggered archive or prune tasks. The web app can accept events while some async work is delayed, so always check the worker when the UI accepts data but follow-up behavior is missing.

shell
bundle exec sidekiq -C config/sidekiq.yml

  • Confirm REDIS_URL is identical for web and worker processes.
  • Confirm the worker has the same secrets and optional service variables as the web process.
  • Look for retrying jobs when ClickHouse, SMTP, or archive storage credentials are wrong.
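
One way to confirm the first two points without exposing secrets is to compare environment dumps by key. The sketch below assumes you have saved each process's environment as a KEY=VALUE file (for example with `env > web.env` inside each container); `env_value` and `diff_env_keys` are hypothetical helpers. Values are compared but never printed.

```shell
# Print the value of key $2 from the KEY=VALUE file $1 (empty if absent).
env_value() {
  sed -n "s/^$2=//p" "$1" | head -n 1
}

# Print the names of the given keys whose values differ between two files.
diff_env_keys() {
  web_file=$1
  worker_file=$2
  shift 2
  for key in "$@"; do
    if [ "$(env_value "$web_file" "$key")" != "$(env_value "$worker_file" "$key")" ]; then
      printf '%s\n' "$key"
    fi
  done
}

# Usage:
#   diff_env_keys web.env worker.env REDIS_URL DATABASE_URL RAILS_MASTER_KEY
```

No output means the listed keys match. Delete the dump files afterwards, since they contain credentials.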

Email

Check SMTP, sender identity, and worker delivery.

Logister sends confirmation mail, password reset mail, first-occurrence error alerts, and daily or weekly digests through Action Mailer. In production-like environments, configure SMTP and keep Sidekiq running.

Symptom | Check
No auth mail | Confirm SMTP host, port, username, password, TLS/SSL mode, sender address, and provider-side domain verification.
No project alerts | Confirm project notification preferences, first-occurrence settings, worker health, and that the event created a new error group.
No digests | Confirm recurring jobs are scheduled, Sidekiq is running, and the selected daily or weekly cadence has elapsed.

Read the deployment email settings for the environment variables used by SMTP and Amazon SES.

ClickHouse

Only debug ClickHouse after PostgreSQL ingestion works.

ClickHouse is optional. The product UI still uses PostgreSQL as the primary system of record. If events are accepted but ClickHouse-backed analytics are missing, first verify that ClickHouse is enabled, reachable, schema-ready, and that the worker is processing ClickHouse jobs.

Readiness endpoint
GET /health/clickhouse

  • If ClickHouse is disabled, do not treat this endpoint's result as a production failure.
  • If the endpoint reports missing tables, load the schema from docs/clickhouse_schema.sql.
  • If writes are failing, check worker logs and the ClickHouse connection variables in the deployment guide.
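
To query the readiness endpoint from a terminal, a minimal sketch follows. `clickhouse_health` is a hypothetical helper, and it assumes the endpoint is served under the same base URL as the app. The response format is not described in this guide, so the function only prints the body and status for you to read.

```shell
# Print the response body, then the HTTP status, of the readiness endpoint.
# $1 is the app base URL; a trailing slash is tolerated.
clickhouse_health() {
  curl -s --max-time 5 -w '\nHTTP %{http_code}\n' "${1%/}/health/clickhouse"
}

# Usage:
#   clickhouse_health "$LOGISTER_PUBLIC_URL"
```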

Archive exports

Archive storage is separate from normal event ingestion.

S3-compatible archive storage is optional and used when operators export compressed JSONL before pruning older hot telemetry. Normal app events are stored in PostgreSQL even when archive storage is not configured.

  • Set ACTIVE_STORAGE_SERVICE=amazon and the AWS or S3-compatible variables from the deployment guide.
  • Confirm the bucket exists and credentials can write objects.
  • Run archive tasks from the same release with access to the same database and storage variables.
  • Do not run prune tasks until you have verified the archive output you need.

Support context

Collect a small, useful bundle before asking for help.

  • Logister version, image tag, or git SHA.
  • Deployment shape: source, Docker, Compose, Fly, Kamal, or another platform.
  • Whether PostgreSQL, Redis, Sidekiq, SMTP, ClickHouse, Turnstile, analytics, and archive storage are enabled.
  • The failing endpoint, status code, response body, and relevant response headers.
  • Sanitized Rails and worker log lines around the same timestamp.
  • Never share raw API tokens, credentials, cookies, authorization headers, or private event payloads.
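
Log lines can be passed through a redaction filter before sharing. The sketch below is a starting point under assumptions, not a guarantee: `redact` is a hypothetical helper, the patterns are case-sensitive, and they cover only bearer tokens, X-Api-Key values, cookie headers, and passwords embedded in URLs. Review the output by hand before sending it anywhere.

```shell
# Redact common secrets from log text read on stdin.
redact() {
  sed -E \
    -e 's/(Bearer )[^ "]+/\1[REDACTED]/g' \
    -e 's/(X-Api-Key: *)[^ "]+/\1[REDACTED]/g' \
    -e 's/((Set-)?Cookie: *).*/\1[REDACTED]/' \
    -e 's#(://[^:/@ ]*:)[^@/ ]+@#\1[REDACTED]@#g'
}

# Usage:
#   grep "$REQUEST_ID" log/production.log | redact > support-bundle.log
```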