A tool to monitor a certificate transparency log for operational problems

letsencrypt/ct-woodpecker

CT Woodpecker

ct-woodpecker: poking holes in logs

ct-woodpecker pokes holes in logs and finds bugs. It is a tool for monitoring a Certificate Transparency log for operational problems.

Get started by running a full example environment in Docker with one command.


About

ct-woodpecker is designed primarily for helping log operators maintain insight into the stability and performance of their logs. It is not a complete stand-alone monitoring solution and is instead designed to integrate with Prometheus, Grafana, and AlertManager.

ct-woodpecker plays some parts of both the "Monitor" role and the "Submitter" role described in RFC 6962 Section 5 but is not designed to fulfill the complete role of an independent monitor or auditor.

As a Monitor, ct-woodpecker fetches the current STH from a log at a regular interval and emits Prometheus stats related to the STH age, the fetch latency, and any errors that occur getting the STH or validating the signature. ct-woodpecker will also emit similar stats produced while validating consistency proofs between the current STH and the previous STH.

As a Submitter, ct-woodpecker regularly issues its own test certificates using a test CA that log operators can choose to add to their allowed roots. ct-woodpecker emits stats about submission latency and results, giving log operators an easy way to monitor certificate and pre-certificate submission.

After submitting test certificates, ct-woodpecker periodically fetches new entries from the log and emits stats about the oldest certificate it has submitted that hasn't yet been merged into the log's Merkle tree. This provides log operators with a way to track and enforce their own maximum-merge-delay (MMD).

Limitations

Remember that ct-woodpecker is not a complete Monitor or Auditor. Most notably:

  • ct-woodpecker does not fetch all entries in the monitored log's tree to attempt to confirm the tree made from fetched entries produces observed STH hashes.

  • ct-woodpecker does not request or validate Merkle audit proofs for SCT/STH pairs to prove inclusion.

  • ct-woodpecker does not verify that any two STHs from the same log can be verified by requesting a consistency proof. Presently it only verifies linearly observed STHs with consistency proofs.
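To make "linearly observed" concrete: each newly fetched STH is compared only with the one fetched immediately before it. The Go sketch below shows one way such a pairwise check could be structured, returning the sth_inconsistencies type labels listed under Collected Metrics. The `sth` type and the `proofFetcher` and `proofVerifier` hooks are hypothetical stand-ins, and the sketch assumes the tree only grows between observations.

```go
package main

import (
	"errors"
	"fmt"
)

// sth holds the two fields of a signed tree head this sketch needs.
type sth struct {
	TreeSize uint64
	RootHash [32]byte
}

// Hypothetical hooks standing in for the get-sth-consistency call and the
// Merkle consistency proof verification.
type proofFetcher func(first, second uint64) ([][]byte, error)
type proofVerifier func(prev, cur sth, proof [][]byte) error

// checkConsecutive compares an STH with the one observed immediately before
// it. It returns the matching inconsistency type label, or "" when the pair
// is consistent.
func checkConsecutive(prev, cur sth, getProof proofFetcher, verify proofVerifier) string {
	if prev.TreeSize == cur.TreeSize {
		if prev.RootHash != cur.RootHash {
			return "equal-treesize-inequal-hash"
		}
		return "" // identical tree heads need no proof
	}
	proof, err := getProof(prev.TreeSize, cur.TreeSize)
	if err != nil {
		return "failed-to-get-proof"
	}
	if err := verify(prev, cur, proof); err != nil {
		return "failed-to-verify-proof"
	}
	return ""
}

func main() {
	prev := sth{TreeSize: 10, RootHash: [32]byte{0xaa}}
	sameSize := sth{TreeSize: 10, RootHash: [32]byte{0xbb}}
	fmt.Println(checkConsecutive(prev, sameSize, nil, nil))

	grown := sth{TreeSize: 12, RootHash: [32]byte{0xcc}}
	failingFetch := func(first, second uint64) ([][]byte, error) {
		return nil, errors.New("timeout")
	}
	fmt.Println(checkConsecutive(prev, grown, failingFetch, nil))
}
```

Because only adjacent pairs are checked, two STHs observed far apart are never compared directly, which is the limitation noted above.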

Installation

Quick-start

To get started with an environment suitable for testing out ct-woodpecker or doing development work, install Docker and Docker Compose, then run the following command in the ct-woodpecker repo root:

  docker-compose up

This will create and configure:

  1. A mysql container running MariaDB.
  2. A ct-test-srv container running two in-memory mock CT logs (log-one and log-two).
  3. A ct-woodpecker container configured to monitor log-one and log-two.
  4. An alertmanager container running AlertManager.
  5. A prometheus container running Prometheus configured to scrape the ct-woodpecker stats and use example alert rules with the alertmanager container.
  6. A grafana container running Grafana configured with a data source for the prometheus container and some example ct-woodpecker dashboards.

The following URLs can be used to access the web interfaces of the monitoring components:

  • Prometheus web interface: http://10.40.50.4:9090
  • AlertManager web interface: http://10.40.50.5:9093
  • Grafana web interface (username woodpecker, password woodpecker): http://10.40.50.6:3000

The provided ct-test-srv instances offer a small API that can be used to easily test ct-woodpecker and the associated monitoring in an end-to-end setting.

For example, you can break certificate submission for log-two by making it return a mock 404 response to add-chain requests:

   curl -X POST \
        -d '{"path":"/ct/v1/add-chain","code":404,"response":{"error":"oh noes!"}}' \
        localhost:4601/add-mock

Shortly afterwards (2-4 minutes) you can expect the CertSubmissionErrors alert to be firing at http://localhost:9090/alerts because the ct-woodpecker container is unable to submit certificates to log-two.

You can cause the alert to recover by removing log-two's add-chain mock by running:

   curl -X POST \
        -d '{"path":"/ct/v1/add-chain"}' \
        localhost:4601/clear-mock

The ct-test-srv logs also support setting mock STHs, creating inconsistent tree views, and controlling when submitted certificates are integrated into the tree. See the cttestsrv management_handlers.go for more information.

Production setup

We don't recommend using the Docker Compose environment for anything beyond testing and development. How you tailor ct-woodpecker for production depends on your environment, but in general a production ct-woodpecker deployment needs:

  1. A production ready deployment of Prometheus, Grafana, and AlertManager.
  2. A dedicated low privilege ct-woodpecker user.
  3. An optional test issuer certificate and private key for certificate submission. (See the ct-woodpecker-genissuer command for more).
  4. A copy of the ct-woodpecker binary installed somewhere in $PATH (e.g. /usr/local/bin).
  5. A configured MariaDB database. This means a database, a database user, and initialized tables created using the schema from storage/mysql/schema.sql.
  6. A configuration dir /etc/ct-woodpecker and config file /etc/ct-woodpecker/config.json.
  7. A systemd unit to keep the ct-woodpecker service running and to start it at system boot.

An example systemd unit and config file are provided to help you get started.

Example Prometheus alerts and Grafana dashboards are also provided in the examples/monitoring_and_alerting directory.

Collected Metrics

ct-woodpecker exports many Prometheus metrics on the configured metricsAddr for monitoring purposes. Below is a table of the metric name, the type, the labels used to slice the metric, and a description.

| Metric Name | Metric Type | Labels | Description |
| --- | --- | --- | --- |
| sth_timestamp | GaugeVec | uri | Timestamp of fetched STH |
| sth_age | GaugeVec | uri | Elapsed time since timestamp of fetched STH |
| sth_failures | CounterVec | uri | Count of failures fetching a STH |
| sth_fetch_total | CounterVec | uri | Count of total number of get-sth calls made against each monitored CT log |
| sth_latency | HistogramVec | uri | Latency of fetching a STH |
| sth_proof_latency | HistogramVec | uri | Latency of fetching a STH consistency proof |
| sth_inconsistencies | CounterVec | uri, type | Count of instances two STHs could not be proved consistent |
| cert_submit_latency | HistogramVec | uri, precert | Latency from submitting a cert or precert |
| cert_submit_results | CounterVec | uri, status, precert, duplicate | Result from submitting a cert or precert |
| cert_storage_failures | CounterVec | uri, type | Count of instances a cert/SCT couldn't be saved to the local DB to watch for inclusion |
| stored_scts | CounterVec | uri | Count of unique cert/SCTs retrieved and stored in the db |
| oldest_unincorporated_cert | GaugeVec | uri | Number of seconds since the oldest cert waiting on incorporation was submitted |
| unincorporated_certs | GaugeVec | uri | Number of certs/SCTs submitted but not yet incorporated |
| inclusion_checker_errors | CounterVec | uri, type | Number of errors encountered attempting to check for cert inclusion |

  • Possible sth_inconsistencies type values are:

    • "equal-treesize-inequal-hash" for when two STHs have the same treesize and different hashes.
    • "failed-to-get-proof" for when an error occurs fetching the consistency proof.
    • "failed-to-verify-proof" for when a returned STH consistency proof can't be validated.
  • Possible cert_submit_results status values are:

    • "fail" for failed submissions.
    • "ok" for successful submissions.
  • cert_submit_results will have a precert="true" label when the submission was a precert.

  • cert_submit_results will have a duplicate="true" label when the submission was a resubmission of a previously submitted cert/precert.

  • Possible cert_storage_failures type values are:

    • "marshalling" for failures to marshal a returned SCT for storage.
    • "storing" for failures to insert the cert/SCT into the DB.
  • Possible inclusion_checker_errors type values are:

    • "getIndex" for failures to get the current stored tree index from the DB.
    • "getUnseen" for failures to find unseen certs/SCTs in the DB.
    • "getSTH" for failures to fetch an STH to determine entries needing to be fetched.
    • "getEntries" for failures to get entries from the log.
    • "checkEntries" for failures to check unseen certs against the returned new entries.
    • "updateIndex" for failures to write a new tree index to the DB.

Example Configuration

{
  "metricsAddr": ":1971",
  "dbURI": "woody@tcp(10.40.50.7:3306)/woodpeckerdb",
  "dbPasswordFile": "test/config/db_password",
  "fetchConfig": {
    "interval": "20s",
    "timeout": "5s"
  },
  "submitConfig": {
    "interval": "5s",
    "timeout": "5s",
    "certIssuerKeyPath": "/test/issuer.key",
    "certIssuerPath": "/test/issuer.pem"
  },
  "inclusionConfig": {
    "interval": "30s",
    "maxGetEntries": 3000
  },
  "logs": [
    {
      "uri": "http://log-one:4600",
      "key": "MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEYggOxPnPkzKBIhTacSYoIfnSL2jPugcbUKx83vFMvk5gKAz/AGe87w20riuPwEGn229hKVbEKHFB61NIqNHC3Q==",
      "windowStart": "2000-01-01T00:00:00Z",
      "windowEnd": "2001-01-01T00:00:00Z",
      "minEntry": 10,
      "submitPreCert": false,
      "submitCert": true
    },
    {
      "uri": "http://log-two:4601",
      "key": "MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEKtnFevaXV/kB8dmhCNZHmxKVLcHX1plaAsY9LrKilhYxdmQZiu36LvAvosTsqMVqRK9a96nC8VaxAdaHUbM8EA==",
      "windowStart": "2019-01-01T00:00:00Z",
      "windowEnd": "2099-01-01T00:00:00Z",
      "minEntry": 1,
      "submitPreCert": false,
      "submitCert": true
    }
  ]
}
  • metricsAddr - a bind address for the ct-woodpecker Prometheus metrics server.

  • dbURI - a MySQL DSN URL specifying the DB username, address, and database name. NOTE: The database user password should be provided in a separate file via the "dbPasswordFile" config parameter.

  • dbPasswordFile - a filepath for a file containing the DB user password. NOTE: File must have mode 0600.

  • fetchConfig - global configuration related to periodic STH fetching.

    • interval - a duration string describing the time period between STH fetches.

    • timeout - a duration string describing the timeout for fetching an STH.

  • submitConfig - global configuration related to periodic cert issuance and submission. May be omitted.

    • interval - a duration string describing the time period between attempts to issue and submit certs/precerts.

    • timeout - a duration string describing the timeout for submitting a cert/precert.

    • certIssuerKeyPath - a filepath for a file containing a PEM encoded RSA/ECDSA private key corresponding to the public key in the certIssuerPath PEM encoded intermediate certificate.

    • certIssuerPath - a filepath for a file containing a PEM encoded x509 certificate to use as the issuer for certificates generated for submitting to logs.

  • inclusionConfig - global configuration related to checking that certificates issued periodically by ct-woodpecker were included in the monitored logs.

    • interval - a duration string describing the time period between attempts to check unseen certificates for inclusion.

    • maxGetEntries - the maximum number of log entries to process each interval. ct-woodpecker will make a series of get-entries calls for entries to process until it gets maxGetEntries entries or reaches the tree head.

    • startIndex - an optional integer specifying the treesize to start checking for inclusion from. This is useful if you start ct-woodpecker monitoring against a log that already has a large tree, since it lets ct-woodpecker skip ahead to the startIndex.

  • logs - an array of one or more CT logs to be configured. Each log is composed of a config object with the following fields:

    • uri - the log's URI.

    • key - the log's public key (PEM encoded as a single line without the PEM header/footer).

    • minEntry - log index to start inclusion checking from, for monitoring large pre-existing logs.

    • windowStart - (optional) for a sharded log, the start date of the shard's accepted validity window. ct-woodpecker will ensure the certificates it generates for this log have a notAfter between windowStart and windowEnd.

    • windowEnd - (optional) for a sharded log, the end date of the shard's accepted validity window. ct-woodpecker will ensure the certificates it generates for this log have a notAfter between windowStart and windowEnd.

    • submitPreCert - if true, precertificates for this log will be generated and submitted based on the global submitConfig.

    • submitCert - if true, final certificates for this log will be generated and submitted based on the global submitConfig.

Utilities

ct-woodpecker also provides two additional utilities:

  1. ct-malformed - a tool for generating malformed CT traffic to fuzz/loadtest a log.

  2. ct-woodpecker-genissuer - a small tool for creating a one-off CA certificate and private key suitable for use with the ct-woodpecker submitConfig (certIssuerPath and certIssuerKeyPath).

Contributing

Please open an issue before starting on substantial features or code changes. We would love to help talk through the possible design choices before putting code to file.

Roughly, the design of ct-woodpecker separates things into the following package hierarchy:

  • cmd/ - individual binaries (ct-woodpecker, ct-malformed).
  • woodpecker/ - top level concerns related to monitoring all of the configured logs. The woodpecker package does most of the heavy lifting for the ct-woodpecker command.
  • monitor/ - the core monitoring logic.
  • storage/ - code related to MySQL and persistent storage.
  • pki/ - general PKI utilities mostly used for test certificate issuance.
  • test/ - convenience tools for unit tests.
  • test/cttestsrv - a purpose built in-memory mock CT log for integration testing.

All pull requests must be reviewed by one of the maintainers before merging. We expect all changes to have robust unit tests.

Photo credit

The ct-woodpecker repository logo image was provided by a Pileated Woodpecker living in the Laurentides region of Quebec, Canada. Photographed by @cpu March 2018.