Remove mentions of the Protobuf format #1083

Merged
merged 1 commit into from
Jun 29, 2018
169 changes: 71 additions & 98 deletions content/docs/instrumenting/exposition_formats.md
@@ -5,104 +5,98 @@ sort_rank: 6

# Exposition formats

Prometheus implements two different wire formats which clients may use to
expose metrics to a Prometheus server: a simple text-based format and a more
efficient and robust protocol-buffer format. Prometheus servers and clients use
[content negotiation](http://en.wikipedia.org/wiki/Content_negotiation) to
establish the actual format to use. A server will prefer receiving the
protocol-buffer format, and will fall back to the text-based format if the
client does not support the former.

NOTE: **NOTE:** Prometheus 2.0 removed support for the protocol-buffer format
and only supports the text-based format.

The majority of users should use the existing [client libraries](/docs/instrumenting/clientlibs/)
that already implement the exposition formats.

## Format version 0.0.4

This is the current metrics exposition format version.

As of this version, there are two alternate formats understood by Prometheus: a
protocol-buffer based format and a text format. Clients must support at least
one of these two alternate formats.

In addition, clients may optionally expose other text formats that are not
understood by Prometheus. They exist solely for consumption by human beings and
are meant to facilitate debugging. It is strongly recommended that a client
library supports at least one human-readable format. A human-readable format
should be the fallback in case the HTTP `Content-Type` header is not understood
by the client library. The version `0.0.4` text format is generally considered
human readable, so it is a good fallback candidate (and also understood by
Prometheus).

### Format variants comparison

| | Protocol buffer format | Text format |
|---------------|------------------------|-------------|
| **Inception** | April 2014 | April 2014 |
| **Supported in** | Prometheus version `>=0.4.0`, `<2.0.0` | Prometheus version `>=0.4.0` |
| **Transmission** | HTTP | HTTP |
| **Encoding** | [32-bit varint-encoded record length-delimited](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/AbstractMessageLite#writeDelimitedTo(java.io.OutputStream)) Protocol Buffer messages of type [io.prometheus.client.MetricFamily](https://github.com/prometheus/client_model/blob/086fe7ca28bde6cec2acd5223423c1475a362858/metrics.proto#L76-L81) | UTF-8, `\n` line endings |
| **HTTP `Content-Type`** | `application/vnd.google.protobuf; proto=io.prometheus.client.MetricFamily; encoding=delimited` | `text/plain; version=0.0.4` (A missing `version` value will lead to a fall-back to the most recent text format version.) |
| **Optional HTTP `Content-Encoding`** | `gzip` | `gzip` |
| **Advantages** | <ul><li>Cross-platform</li><li>Size</li><li>Encoding and decoding costs</li><li>Strict schema</li><li>Supports concatenation and theoretically streaming (only server-side behavior would need to change)</li></ul> | <ul><li>Human-readable</li><li>Easy to assemble, especially for minimalistic cases (no nesting required)</li><li>Readable line by line (with the exception of type hints and docstrings)</li></ul> |
| **Limitations** | <ul><li>Not human-readable</li></ul> | <ul><li>Verbose</li><li>Types and docstrings not integral part of the syntax, meaning little-to-nonexistent metric contract validation</li><li>Parsing cost</li></ul>|
| **Supported metric primitives** | <ul><li>Counter</li><li>Gauge</li><li>Histogram</li><li>Summary</li><li>Untyped</li></ul> | <ul><li>Counter</li><li>Gauge</li><li>Histogram</li><li>Summary</li><li>Untyped</li></ul> |
| **Compatibility** | Version `0.0.3` protocol buffers are also valid version `0.0.4` protocol buffers. | none |

### Protocol buffer format details

Reproducible sorting of the protocol buffer fields in repeated expositions is
preferred but not required, i.e. do not sort if the computational cost is
prohibitive.

Each `MetricFamily` within the same exposition must have a unique name. Each
`Metric` within the same `MetricFamily` must have a unique set of `LabelPair`
fields. Otherwise, the ingestion behavior is undefined.
Metrics can be exposed to Prometheus using a simple [text-based](#text-based-format)
exposition format. There's a variety of [client libraries](/docs/instrumenting/clientlibs/)
that implement this format for you. If your preferred language doesn't have a client
library you can [create your own](/docs/instrumenting/writing_clientlibs/).

NOTE: **NOTE** Some earlier versions of Prometheus supported an exposition format based on
[Protocol Buffers](https://developers.google.com/protocol-buffers/) (aka Protobuf) in
addition to the current text-based format. As of version 2.0, however, Prometheus no
longer supports the Protobuf-based format. You can read about the reasoning behind
this change in [this
document](https://github.com/RichiH/OpenMetrics/blob/master/protobuf_vs_text.md).

## Text-based format

As of Prometheus version 2.0, all processes that expose metrics to Prometheus need to use
a text-based format. In this section you can find some [basic information](#basic-info)
about this format as well as a more [detailed breakdown](#text-format-details) of the
format.

### Basic info

| Aspect | Description |
|--------|-------------|
| **Inception** | April 2014 |
| **Supported in** | Prometheus version `>=0.4.0` |
| **Transmission** | HTTP |
| **Encoding** | UTF-8, `\n` line endings |
| **HTTP `Content-Type`** | `text/plain; version=0.0.4` (A missing `version` value will lead to a fall-back to the most recent text format version.) |
| **Optional HTTP `Content-Encoding`** | `gzip` |
| **Advantages** | <ul><li>Human-readable</li><li>Easy to assemble, especially for minimalistic cases (no nesting required)</li><li>Readable line by line (with the exception of type hints and docstrings)</li></ul> |
| **Limitations** | <ul><li>Verbose</li><li>Types and docstrings not integral part of the syntax, meaning little-to-nonexistent metric contract validation</li><li>Parsing cost</li></ul>|
| **Supported metric primitives** | <ul><li>Counter</li><li>Gauge</li><li>Histogram</li><li>Summary</li><li>Untyped</li></ul> |
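
To make the transmission, encoding, and `Content-Type` rows concrete, here is a minimal sketch of an HTTP endpoint that serves the text format. It uses only the Python standard library; the handler class, the `render_body` helper, the sample metric name, and the port are hypothetical names chosen for illustration, not part of any Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

# Content type for version 0.0.4 of the text format, as listed in the table.
CONTENT_TYPE = "text/plain; version=0.0.4"


def render_body(samples):
    """Render (name, value) pairs as UTF-8 bytes, one sample per line.

    Every line, including the last one, ends with a line feed.
    """
    return "".join(f"{name} {value}\n" for name, value in samples).encode("utf-8")


class MetricsHandler(BaseHTTPRequestHandler):
    # Hypothetical sample data; a real exporter would collect values at scrape time.
    samples = [("demo_requests_total", 3.0)]

    def do_GET(self):
        body = render_body(self.samples)
        self.send_response(200)
        self.send_header("Content-Type", CONTENT_TYPE)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Start the sketch server on localhost (the port is an arbitrary choice)."""
    HTTPServer(("127.0.0.1", port), MetricsHandler).serve_forever()
```

Calling `serve()` and fetching `http://127.0.0.1:8000/` should return the sample line with the `Content-Type` shown above. The optional `gzip` content encoding is left out to keep the sketch short.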

### Text format details

The protocol is line-oriented. A line-feed character (`\n`) separates lines.
The last line must end with a line-feed character. Empty lines are ignored.
Prometheus' text-based format is line oriented. Lines are separated by a line
feed character (`\n`). The last line must end with a line feed character.
Empty lines are ignored.

#### Line format

Within a line, tokens can be separated by any number of blanks and/or tabs (and
have to be separated by at least one if they would otherwise merge with the
previous token). Leading and trailing whitespace is ignored.
must be separated by at least one if they would otherwise merge with the previous
token). Leading and trailing whitespace is ignored.

#### Comments, help text, and type information

Lines with a `#` as the first non-whitespace character are comments. They are
ignored unless the first token after `#` is either `HELP` or `TYPE`. Those
lines are treated as follows: If the token is `HELP`, at least one more token
is expected, which is the metric name. All remaining tokens are considered the
docstring for that metric name. `HELP` lines may contain any sequence of UTF-8
characters (after the metric name), but the backslash and the line-feed
characters (after the metric name), but the backslash and the line feed
characters have to be escaped as `\\` and `\n`, respectively. Only one `HELP`
line may exist for the same metric name.
line may exist for any given metric name.

If the token is `TYPE`, exactly two more tokens are expected. The first is the
metric name, and the second is either `counter`, `gauge`, `histogram`,
`summary`, or `untyped`, defining the type for the metric of that name. Only
one `TYPE` line may exist for the same metric name. The `TYPE` line for a
metric name has to appear before the first sample is reported for that metric
one `TYPE` line may exist for a given metric name. The `TYPE` line for a
metric name must appear before the first sample is reported for that metric
name. If there is no `TYPE` line for a metric name, the type is set to
`untyped`. Remaining lines describe samples, one per line, with the following
syntax (EBNF):
`untyped`.
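
The `HELP` and `TYPE` rules can be sketched as two small formatting helpers. This is an illustrative sketch only: the function names are hypothetical, and the escaping follows the rule stated above (backslash becomes `\\`, line feed becomes `\n`).

```python
VALID_TYPES = {"counter", "gauge", "histogram", "summary", "untyped"}


def escape_help(text):
    # Escape the backslash first so the backslash added for "\n" is not doubled.
    return text.replace("\\", "\\\\").replace("\n", "\\n")


def help_line(metric_name, docstring):
    """Build a HELP comment line: the metric name followed by the escaped docstring."""
    return f"# HELP {metric_name} {escape_help(docstring)}"


def type_line(metric_name, metric_type="untyped"):
    """Build a TYPE comment line; exactly two tokens follow the TYPE keyword."""
    if metric_type not in VALID_TYPES:
        raise ValueError(f"unknown metric type: {metric_type}")
    return f"# TYPE {metric_name} {metric_type}"
```

A caller would emit at most one `HELP` and one `TYPE` line per metric name, and would emit the `TYPE` line before the first sample of that metric.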

The remaining lines describe samples (one per line) using the following syntax
([EBNF](https://en.wikipedia.org/wiki/Extended_Backus%E2%80%93Naur_form)):

```
metric_name [
"{" label_name "=" `"` label_value `"` { "," label_name "=" `"` label_value `"` } [ "," ] "}"
] value [ timestamp ]
```

In the sample syntax:

metric_name [
"{" label_name "=" `"` label_value `"` { "," label_name "=" `"` label_value `"` } [ "," ] "}"
] value [ timestamp ]
* `metric_name` and `label_name` carry the usual Prometheus expression language restrictions.
* `label_value` can be any sequence of UTF-8 characters, but the backslash (`\`), double-quote (`"`), and line feed (`\n`) characters have to be escaped as `\\`, `\"`, and `\n`, respectively.
* `value` is a float, represented as required by the [Go strconv package](http://golang.org/pkg/strconv/) (see function [`ParseFloat`](https://golang.org/pkg/strconv/#ParseFloat)). In addition to standard numerical values, `NaN`, `+Inf`, and `-Inf` are valid values representing not a number, positive infinity, and negative infinity, respectively.
* `timestamp` is an `int64` (milliseconds since epoch, i.e. 1970-01-01 00:00:00 UTC, excluding leap seconds), represented as required by the [Go strconv package](http://golang.org/pkg/strconv/) (see function [`ParseInt`](https://golang.org/pkg/strconv/#ParseInt)).
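
The sample syntax above can be sketched as a small formatter. This is a minimal illustration with hypothetical function names, assuming labels arrive as a dict of strings and the timestamp, if present, is already in milliseconds. It does not validate `metric_name` or `label_name`.

```python
import math


def escape_label_value(value):
    # Backslash first, then double-quote and line feed, per the escaping rule.
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def format_value(value):
    """Render a float, using the special spellings for NaN and the infinities."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "+Inf" if value > 0 else "-Inf"
    return repr(float(value))


def sample_line(metric_name, labels, value, timestamp_ms=None):
    """Build one sample line: name, optional label set, value, optional timestamp."""
    line = metric_name
    if labels:
        pairs = ",".join(
            f'{name}="{escape_label_value(val)}"' for name, val in labels.items()
        )
        line += "{" + pairs + "}"
    line += " " + format_value(value)
    if timestamp_ms is not None:
        line += f" {int(timestamp_ms)}"
    return line
```

For example, `sample_line("http_requests_total", {"method": "post", "code": "200"}, 1027, 1395066363000)` yields `http_requests_total{method="post",code="200"} 1027.0 1395066363000`.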

`metric_name` and `label_name` have the usual Prometheus expression language restrictions. `label_value` can be any sequence of UTF-8 characters, but the backslash, the double-quote, and the line-feed characters have to be escaped as `\\`, `\"`, and `\n`, respectively.
`value` is a float, and timestamp an `int64` (milliseconds since epoch, i.e. 1970-01-01 00:00:00 UTC, excluding leap seconds), represented as required by the [Go strconv package](http://golang.org/pkg/strconv/) (see functions `ParseInt` and `ParseFloat`). In particular, `Nan`, `+Inf`, and `-Inf` are valid values.
#### Grouping and sorting

All lines for a given metric must be provided as one uninterrupted group, with
All lines for a given metric must be provided as one single group, with
the optional `HELP` and `TYPE` lines first (in no particular order). Beyond
that, reproducible sorting in repeated expositions is preferred but not
required, i.e. do not sort if the computational cost is prohibitive.

Each line must have a unique combination of metric name and labels. Otherwise,
Each line must have a unique combination of a metric name and labels. Otherwise,
the ingestion behavior is undefined.
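
As a rough illustration of the grouping and uniqueness rules, the following sketch checks a sequence of already-parsed samples. The function name and the input shape (a list of `(metric_name, labels)` pairs in exposition order) are assumptions made for this example. It treats each distinct name as its own metric, so it does not account for the suffixed series of histograms and summaries.

```python
def check_exposition(samples):
    """Return a list of problems found in (metric_name, labels) pairs.

    Checks that all samples of a metric form one uninterrupted group and that
    no combination of metric name and labels appears twice.
    """
    problems = []
    seen_series = set()
    finished = set()  # metrics whose group has already ended
    current = None
    for name, labels in samples:
        if name != current:
            if name in finished:
                problems.append(f"metric {name} is split across groups")
            if current is not None:
                finished.add(current)
            current = name
        key = (name, tuple(sorted(labels.items())))
        if key in seen_series:
            problems.append(f"duplicate sample for {name}")
        seen_series.add(key)
    return problems
```

An empty result means neither rule was violated for the given input. The check says nothing about `HELP` and `TYPE` placement.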

#### Histograms and summaries
Member

When introducing these new sub-headers, we'll then have to introduce them for all sub-sections of the text format, including the example section below (the example is no longer about histograms and summaries in particular, but about the text format in general).

Contributor Author

Yep, you're right. Missed this one.


The `histogram` and `summary` types are difficult to represent in the text
format. The following conventions apply:

@@ -113,7 +107,11 @@ format. The following conventions apply:
* A histogram _must_ have a bucket with `{le="+Inf"}`. Its value _must_ be identical to the value of `x_count`.
* The buckets of a histogram and the quantiles of a summary must appear in increasing numerical order of their label values (for the `le` or the `quantile` label, respectively).
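
The histogram conventions can be sketched as follows. This is an illustrative helper, not a client library API: the function name is hypothetical, and it assumes the usual `_bucket`, `_sum`, and `_count` suffixes with cumulative bucket counts (each bucket counts all observations less than or equal to its `le` bound).

```python
def histogram_lines(name, upper_bounds, observations):
    """Render a histogram as sample lines in the text format.

    Buckets are emitted in increasing order of `le`, followed by the mandatory
    `+Inf` bucket, whose value equals the `_count` value.
    """
    lines = []
    for bound in sorted(upper_bounds):
        # Cumulative count: every observation at or below this bound.
        count = sum(1 for obs in observations if obs <= bound)
        lines.append(f'{name}_bucket{{le="{bound}"}} {count}')
    total = len(observations)
    lines.append(f'{name}_bucket{{le="+Inf"}} {total}')
    lines.append(f"{name}_sum {float(sum(observations))}")
    lines.append(f"{name}_count {total}")
    return lines
```

Sorting the bounds before rendering keeps the buckets in increasing numerical order of `le`, and deriving both the `+Inf` bucket and `_count` from the same total keeps the two values identical.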

See also the example below.
### Text format example

Below is an example of a full-fledged Prometheus metric exposition, including
comments, `HELP` and `TYPE` expressions, a histogram, a summary, character
escaping examples, and more.

```
# HELP http_requests_total The total number of HTTP requests.
@@ -154,31 +152,6 @@ rpc_duration_seconds_sum 1.7560473e+07
rpc_duration_seconds_count 2693
```

#### Optional Text Representations

The following three optional text formats are meant for human consumption only
and are not understood by Prometheus. Their definition may therefore be
somewhat loose. Client libraries may or may not support these formats. Tools
should not rely on these formats.

1. HTML: This format is requested by an HTTP `Content-Type` header with value
of `text/html`. It is a "pretty" rendering of the metrics to be looked at in a
browser. While the generating client is technically completely free in
assembling the HTML, consistency between client libraries should be aimed for.
2. Protocol buffer text format: Identical to the protocol buffer format, but in
text form. It consists of the protocol messages concatenated in their text
format (also known as "debug strings"), separated by an additional new line
character (i.e. there is an empty line between protocol messages). The format
is requested as the protocol buffer format, but the `encoding` in the HTTP
`Content-Type` header set to `text`.
3. Protocol buffer compact text format: Identical to (2) but using the compact
text format instead of the normal text format. The compact text format puts the
whole protocol message on one line. The protocol messages are still separated
by new line characters, but no "empty line" is needed for separation. (Simply
one protocol message per line.) The format is requested as the protocol buffer
format, but the `encoding` in the HTTP `Content-Type` header set to
`compact-text`.

## Historical versions

For details on historical format versions, see the legacy
19 changes: 6 additions & 13 deletions content/docs/instrumenting/writing_clientlibs.md
@@ -312,15 +312,8 @@ descriptions, to lead by example.

## Exposition

Clients MUST implement one of the documented [exposition
formats](/docs/instrumenting/exposition_formats).

Clients MAY implement more than one format. There SHOULD be a human readable
format offered.

If in doubt, go for the text format. It doesn’t have a dependency (protobuf),
tends to be easy to produce, is human readable and the performance benefits of
protobuf are not that significant for most use cases.
Clients MUST implement the text-based exposition format outlined in the
[exposition formats](/docs/instrumenting/exposition_formats) documentation.

Reproducible order of the exposed metrics is ENCOURAGED (especially for human
readable formats) if it can be implemented without a significant resource cost.
@@ -368,12 +361,12 @@ unit-test their use of the instrumentation code. For example, the
## Packaging and dependencies

Ideally, a client library can be included in any application to add some
instrumentation, without having to worry about it breaking the application.
instrumentation without breaking the application.

Accordingly, caution is advised when adding dependencies to the client library.
For example, if a user adds a library that uses a Prometheus client that
requires version 1.4 of protobuf but the application uses 1.2 elsewhere, what
will happen?
For example, if you add a library that uses a Prometheus client that requires
version x.y of a library but the application uses x.z elsewhere, will that have
an adverse impact on the application?

Where this may arise, it is suggested that the core instrumentation be
separated from the bridges/exposition of metrics in a given format. For