diff --git a/docs/loki/api.md b/docs/loki/api.md
index db116017df23..480d7b5757e4 100644
--- a/docs/loki/api.md
+++ b/docs/loki/api.md
@@ -1,114 +1,291 @@
-# API
+# Loki API
 
 The Loki server has the following API endpoints (_Note:_ Authentication is out of scope for this project):
 
-### `POST /api/prom/push`
+- `POST /api/prom/push`
 
-For sending log entries, expects a snappy compressed proto in the HTTP Body:
+  For sending log entries; expects a snappy-compressed protobuf in the HTTP body:
 
-- [ProtoBuffer definition](/pkg/logproto/logproto.proto)
-- [Golang client library](/pkg/promtail/client/client.go)
+  - [ProtoBuffer definition](/pkg/logproto/logproto.proto)
+  - [Golang client library](/pkg/promtail/client/client.go)
 
-Also accepts JSON formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format:
+  Also accepts JSON-formatted requests when the header `Content-Type: application/json` is sent. Example of the JSON format:
 
-```json
-{
-  "streams": [
-    {
-      "labels": "{foo=\"bar\"}",
-      "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }]
-    }
-  ]
-}
-```
+  ```json
+  {
+    "streams": [
+      {
+        "labels": "{foo=\"bar\"}",
+        "entries": [{ "ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz" }]
+      }
+    ]
+  }
+  ```
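+
+  For example, a minimal JSON push with `curl` (a sketch only; the label set, timestamp, and line are placeholders taken from the format above):
+
+  ```bash
+  curl -s -X POST "http://localhost:3100/api/prom/push" \
+    -H "Content-Type: application/json" \
+    --data-raw '{"streams": [{"labels": "{foo=\"bar\"}", "entries": [{"ts": "2018-12-18T08:28:06.801064-04:00", "line": "baz"}]}]}'
+  ```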
+
+- `GET /api/v1/query`
+
+  For doing instant queries at a single point in time, accepts the following parameters in the query-string:
+
+  - `query`: a [logQL query](../querying.md)
+  - `limit`: max number of entries to return (not used for metric queries)
+  - `time`: the evaluation time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now.
+  - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward.
+
+  Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time,
+  so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional
+  load to the index server and make the query slower.
+
+  The response looks like this:
+
+  ```json
+  {
+    "resultType": "vector" | "streams",
+    "result": <value>
+  }
+  ```
+
+  Examples:
+
+  ```bash
+  curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' | jq
+  {
+    "resultType": "vector",
+    "result": [
+      {
+        "metric": {},
+        "value": [
+          1559848867745737,
+          "1267.1266666666666"
+        ]
+      },
+      {
+        "metric": {
+          "level": "warn"
+        },
+        "value": [
+          1559848867745737,
+          "37.77166666666667"
+        ]
+      },
+      {
+        "metric": {
+          "level": "info"
+        },
+        "value": [
+          1559848867745737,
+          "37.69"
+        ]
+      }
+    ]
+  }
+  ```
+
+  ```bash
+  curl -G -s "http://localhost:3100/api/v1/query" --data-urlencode 'query={job="varlogs"}' | jq
+  {
+    "resultType": "streams",
+    "result": [
+      {
+        "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}",
+        "entries": [
+          {
+            "ts": "2019-06-06T19:25:41.972739Z",
+            "line": "foo"
+          },
+          {
+            "ts": "2019-06-06T19:25:41.972722Z",
+            "line": "bar"
+          }
+        ]
+      }
+    ]
+  }
+  ```
 
-### `GET /api/prom/query`
 
-For doing queries, accepts the following parameters in the query-string:
 
-- `query`: a [logQL query](./usage.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`)
-- `limit`: max number of entries to return
-- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago.
-- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time.
-- `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward.
-- `regexp`: a regex to filter the returned results
 
-Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time,
-so you need to specify the start and end labels accordingly. Querying a long time into the history will cause additional
-load to the index server and make the query slower.
 
-Responses looks like this:
 
-```json
-{
-  "streams": [
-    {
-      "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}",
-      "entries": [
-        {
-          "ts": "2018-06-27T05:20:28.699492635Z",
-          "line": "..."
-        },
-        ...
-      ]
-    },
-    ...
-  ]
-}
-```
+- `GET /api/v1/query_range`
+
+  For doing queries over a range of time, accepts the following parameters in the query-string:
+
+  - `query`: a [logQL query](../querying.md)
+  - `limit`: max number of entries to return (not used for metric queries)
+  - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always one hour ago.
+  - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always now.
+  - `step`: query resolution step width in seconds. Default is 1 second.
+  - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward.
+
+  Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time,
+  so you need to specify the time and labels accordingly. Querying a long time into the history will cause additional
+  load to the index server and make the query slower.
+
+  The response looks like this:
+
+  ```json
+  {
+    "resultType": "matrix" | "streams",
+    "result": <value>
+  }
+  ```
+
+  Examples:
+
+  ```bash
+  curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query=sum(rate({job="varlogs"}[10m])) by (level)' --data-urlencode 'step=300' | jq
+  {
+    "resultType": "matrix",
+    "result": [
+      {
+        "metric": {
+          "level": "info"
+        },
+        "values": [
+          [
+            1559848958663735,
+            "137.95"
+          ],
+          [
+            1559849258663735,
+            "467.115"
+          ],
+          [
+            1559849558663735,
+            "658.8516666666667"
+          ]
+        ]
+      },
+      {
+        "metric": {
+          "level": "warn"
+        },
+        "values": [
+          [
+            1559848958663735,
+            "137.27833333333334"
+          ],
+          [
+            1559849258663735,
+            "467.69"
+          ],
+          [
+            1559849558663735,
+            "660.6933333333334"
+          ]
+        ]
+      }
+    ]
+  }
+  ```
+
+  ```bash
+  curl -G -s "http://localhost:3100/api/v1/query_range" --data-urlencode 'query={job="varlogs"}' | jq
+  {
+    "resultType": "streams",
+    "result": [
+      {
+        "labels": "{filename=\"/var/log/myproject.log\", job=\"varlogs\", level=\"info\"}",
+        "entries": [
+          {
+            "ts": "2019-06-06T19:25:41.972739Z",
+            "line": "foo"
+          },
+          {
+            "ts": "2019-06-06T19:25:41.972722Z",
+            "line": "bar"
+          }
+        ]
+      }
+    ]
+  }
+  ```
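+
+  To pin the range explicitly instead of relying on the defaults, pass `start` and `end` as well (the timestamps here are hypothetical nanosecond values):
+
+  ```bash
+  curl -G -s "http://localhost:3100/api/v1/query_range" \
+    --data-urlencode 'query={job="varlogs"}' \
+    --data-urlencode 'start=1559848800000000000' \
+    --data-urlencode 'end=1559849400000000000' \
+    --data-urlencode 'step=300' | jq
+  ```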
+
+- `GET /api/prom/query`
+
+  For doing queries, accepts the following parameters in the query-string:
+
+  - `query`: a [logQL query](../querying.md) (eg: `{name=~"mysql.+"}` or `{name=~"mysql.+"} |= "error"`)
+  - `limit`: max number of entries to return
+  - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is always one hour ago.
+  - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970) or as RFC3339Nano (eg: "2006-01-02T15:04:05.999999999-07:00"). Default is current time.
+  - `direction`: `forward` or `backward`, useful when specifying a limit. Default is backward.
+  - `regexp`: a regex to filter the returned results
+
+  Loki needs to query the index store in order to find log streams for particular labels and the store is spread out by time,
+  so you need to specify the start and end times accordingly. Querying a long time into the history will cause additional
+  load to the index server and make the query slower.
+
+  > This endpoint will be deprecated in the future; use `/api/v1/query_range` instead.
+  > It can only query for logs; it doesn't accept [queries returning metrics](../querying.md#counting-logs).
+
+  The response looks like this:
+
+  ```json
+  {
+    "streams": [
+      {
+        "labels": "{instance=\"...\", job=\"...\", namespace=\"...\"}",
+        "entries": [
+          {
+            "ts": "2018-06-27T05:20:28.699492635Z",
+            "line": "..."
+          },
+          ...
+        ]
+      },
+      ...
+    ]
+  }
+  ```
 
-### `GET /api/prom/label`
+- `GET /api/prom/label`
 
-For doing label name queries, accepts the following parameters in the query-string:
+  For doing label name queries, accepts the following parameters in the query-string:
 
-- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago.
-- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time.
+  - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hours ago.
+  - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time.
 
-Responses looks like this:
+  The response looks like this:
 
-```json
-{
-  "values": [
-    "instance",
-    "job",
-    ...
-  ]
-}
-```
+  ```json
+  {
+    "values": [
+      "instance",
+      "job",
+      ...
+    ]
+  }
+  ```
 
-`GET /api/prom/label/<name>/values`
+- `GET /api/prom/label/<name>/values`
 
-For doing label values queries, accepts the following parameters in the query-string:
+  For doing label values queries, accepts the following parameters in the query-string:
 
-- `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hour ago.
-- `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time.
+  - `start`: the start time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is always 6 hours ago.
+  - `end`: the end time for the query, as a nanosecond Unix epoch (nanoseconds since 1970). Default is current time.
 
-Responses looks like this:
+  The response looks like this:
 
-```json
-{
-  "values": [
-    "default",
-    "cortex-ops",
-    ...
-  ]
-}
-```
+  ```json
+  {
+    "values": [
+      "default",
+      "cortex-ops",
+      ...
+    ]
+  }
+  ```
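+
+  For example, to list all values recorded for the `job` label shown above:
+
+  ```bash
+  curl -G -s "http://localhost:3100/api/prom/label/job/values" | jq
+  ```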
 
-### `GET /ready`
+- `GET /ready`
 
-This endpoint returns 200 when Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as readiness probe.
+  This endpoint returns 200 when the Loki ingester is ready to accept traffic. If you're running Loki on Kubernetes, this endpoint can be used as a readiness probe.
 
-### `GET /flush`
+- `GET /flush`
 
-This endpoint triggers a flush of all in memory chunks in the ingester. Mainly used for local testing.
+  This endpoint triggers a flush of all in-memory chunks in the ingester. Mainly used for local testing.
 
-### `GET /metrics`
+- `GET /metrics`
 
-This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" to have a list of exported metrics.
+  This endpoint returns Loki metrics for Prometheus. See "[Operations > Observability > Metrics](./operations.md)" for a list of exported metrics.
 
 ## Examples of using the API in a third-party client library
 
-1. Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang).
-2. Example on [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py)
+1) Take a look at this [client](https://github.com/afiskon/promtail-client), but be aware that the API is not stable yet (Golang).
+2) Example in [Python3](https://github.com/sleleko/devops-kb/blob/master/python/push-to-loki.py)
diff --git a/docs/querying.md b/docs/querying.md
index ac25c12ffb65..cfc7b429933a 100644
--- a/docs/querying.md
+++ b/docs/querying.md
@@ -1,7 +1,7 @@
 # Querying
 
 To get the previously ingested logs back from Loki for analysis, you need a
-client that supports LogQL.
+client that supports LogQL.
 Grafana will be the first choice for most users, nevertheless [LogCLI](logcli.md) represents a viable standalone alternative.
@@ -111,3 +111,61 @@ The query language is still under development to support more features, e.g.,:
 - Number extraction for timeseries based on number in log messages
 - JSON accessors for filtering of JSON-structured logs
 - Context (like `grep -C n`)
+
+## Counting logs
+
+Loki's LogQL supports sample expressions, which allow you to count entries per stream after the regex filtering stage.
+
+### Range Vector aggregation
+
+The language shares the [range vector](https://prometheus.io/docs/prometheus/latest/querying/basics/#range-vector-selectors) concept with Prometheus, except that the selected range of samples contains a value of one for each log entry. You can then apply an aggregation over the selected range to transform it into an instant vector.
+
+`rate` calculates the number of entries per second, and `count_over_time` counts the entries for each log stream within the range.
+
+In this example, we count all the log lines we have recorded within the last five minutes for the `mysql` job.
+
+> `count_over_time({job="mysql"}[5m])`
+
+A range vector aggregation can also be applied to a [Filter Expression](#filter-expression), allowing you to select only matching log entries.
+
+> `rate(({job="mysql"} |= "error" != "timeout")[10s])`
+
+The query above will compute the per-second rate of all errors except those containing `timeout` within the last 10 seconds.
+
+You can then use aggregation operators over the range vector aggregation.
+
+### Aggregation operators
+
+Like [PromQL](https://prometheus.io/docs/prometheus/latest/querying/operators/#aggregation-operators), Loki's LogQL supports a subset of built-in aggregation operators that can be used to aggregate the elements of a single vector, resulting in a new vector of fewer elements with aggregated values:
+
+- `sum` (calculate sum over dimensions)
+- `min` (select minimum over dimensions)
+- `max` (select maximum over dimensions)
+- `avg` (calculate the average over dimensions)
+- `stddev` (calculate population standard deviation over dimensions)
+- `stdvar` (calculate population standard variance over dimensions)
+- `count` (count number of elements in the vector)
+- `bottomk` (smallest k elements by sample value)
+- `topk` (largest k elements by sample value)
+
+These operators can either be used to aggregate over all label dimensions or preserve distinct dimensions by including a `without` or `by` clause.
+
+> `<aggr-op>([parameter,] <vector expression>) [without|by (<label list>)]`
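+
+For example (a sketch; it assumes the `mysql` job's streams carry a `level` label), to get the per-second rate of error lines broken down by level:
+
+> `sum(rate(({job="mysql"} |= "error")[5m])) by (level)`
+
+Or to find the ten streams with the most entries over the last hour:
+
+> `topk(10, count_over_time({job="mysql"}[1h]))`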