Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for http.client_ip tag for Rack-based frameworks #2248

Merged
merged 15 commits into from
Sep 8, 2022
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 26 additions & 0 deletions lib/datadog/core/configuration/settings.rb
Original file line number Diff line number Diff line change
Expand Up @@ -618,6 +618,32 @@ def initialize(*_)
# @default `{}`
# @return [Hash,nil]
option :writer_options, default: ->(_i) { {} }, lazy: true

# Client IP configuration
# @public_api
settings :client_ip do
# Whether client IP collection is enabled. When enabled client IPs from HTTP requests will
# be reported in traces.
#
# @see https://docs.datadoghq.com/tracing/configure_data_security#configuring-a-client-ip-header
#
# @default The negated value of the `DD_TRACE_CLIENT_IP_HEADER_DISABLED` environment
# variable or `true` if it doesn't exist.
# @return [Boolean]
option :enabled do |o|
o.default { !env_to_bool(Tracing::Configuration::Ext::ClientIp::ENV_DISABLED, false) }
o.lazy
end

# An optional name of a custom header to resolve the client IP from.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# An optional name of a custom header to resolve the client IP from.
# An optional name of a custom header to resolve the client IP from.

Linter should have caught this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops.
Rubocop seems to pass but I shouldv'e considered the formatting in the other options.
Fixed

#
# @default `DD_TRACE_CLIENT_IP_HEADER` environment variable, otherwise `nil`.
# @return [String,nil]
option :header_name do |o|
o.default { ENV.fetch(Tracing::Configuration::Ext::ClientIp::ENV_HEADER_NAME, nil) }
o.lazy
end
end
end

# The `version` tag in Datadog. Use it to enable [Deployment Tracking](https://docs.datadoghq.com/tracing/deployment_tracking/).
Expand Down
39 changes: 39 additions & 0 deletions lib/datadog/core/header_collection.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,39 @@
module Datadog
module Core
# A some-what abstract class representing a collection of headers.
#
# Use the `HeaderCollection.from_hash` function to create a header collection from a `Hash`.
# Another option is to use `HashHeaderCollection` directly.
class HeaderCollection
# Gets a single value of the header with the given name, case insensitive.
#
# @param [String] header_name Name of the header to get the value of.
# @returns [String, nil] A single value of the header, or nil if the header with
# the given name is missing from the collection.
def get(header_name)
nil
end

# Create a header collection that retrieves headers from the given Hash.
#
# This can be useful for testing or other trivial use cases.
#
# @param [Hash] hash Hash with the headers.
def self.from_hash(hash)
HashHeaderCollection.new(hash)
end
end

# A header collection implementation that looks up headers in a Hash.
class HashHeaderCollection < HeaderCollection
def initialize(hash)
super()
@hash = hash.transform_keys(&:downcase)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hash#transform_keys is only available in Ruby 2.5 or later:

puts({}.transform_keys(&:downcase))
#  == 2.4.10 ==
# -e:1:in `<main>': undefined method `transform_keys' for {}:Hash (NoMethodError)
# Did you mean?  transform_values
#
#  == 2.5.6 ==
# {}

Is this method being hit during our test runs? It should show up as a failure in 2.1-2.4 test cases, if so.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's used in spec/datadog/tracing/client_ip_spec, odd. I'll try to see why it isn't called.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

end

def get(header_name)
@hash[header_name.downcase]
end
end
end
end
151 changes: 151 additions & 0 deletions lib/datadog/tracing/client_ip.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
# typed: true

require_relative '../core/configuration'
require_relative 'metadata/ext'
require_relative 'span'

module Datadog
module Tracing
# Common functions for supporting the `http.client_ip` span attribute.
module ClientIp
DEFAULT_IP_HEADERS_NAMES = %w[
x-forwarded-for
x-real-ip
x-client-ip
x-forwarded
x-cluster-client-ip
forwarded-for
forwarded
via
true-client-ip
].freeze

TAG_MULTIPLE_IP_HEADERS = '_dd.multiple-ip-headers'.freeze

# Sets the `http.client_ip` tag on the given span.
#
# This function respects the user's settings: if they disable the client IP tagging,
# or provide a different IP header name.
#
# If multiple IP headers are present in the request, this function will instead set
# the `_dd.multiple-ip-headers` tag with the names of the present headers,
# and **NOT** set the `http.client_ip` tag.
#
# @param [Span] span The span that's associated with the request.
# @param [HeaderCollection, #get, nil] headers A collection with the request headers.
# @param [String, nil] remote_ip The remote IP the request associated with the span is sent to.
def self.set_client_ip_tag(span, headers: nil, remote_ip: nil)
return unless configuration.enabled

result = raw_ip_from_request(headers, remote_ip)

if result.raw_ip
ip = strip_decorations(result.raw_ip)
return unless valid_ip?(ip)

span.set_tag(Tracing::Metadata::Ext::HTTP::TAG_CLIENT_IP, ip)
elsif result.multiple_ip_headers
span.set_tag(TAG_MULTIPLE_IP_HEADERS, result.multiple_ip_headers.keys.join(','))
end
end

IpExtractionResult = Struct.new(:raw_ip, :multiple_ip_headers)

# Returns a result struct that holds the raw client IP associated with the request if it was
# retrieved successfully.
#
# The client IP is looked up by the following logic:
# * If the user has configured a header name, return that header's value.
# * If exactly one of the known IP headers is present, return that header's value.
# * If none of the known IP headers are present, return the remote IP from the request.
#
# If more than one of the known IP headers is present, the result will have a `multiple_ip_headers`
# field with the name of the present IP headers.
#
# @param [Datadog::Core::HeaderCollection, #get, nil] headers The request headers
# @param [String] remote_ip The remote IP of the request.
# @return [IpExtractionResult] A struct that holds the unprocessed IP value,
# or `nil` if it wasn't found. Additionally, the `multiple_ip_headers` fields will hold the
# name of known IP headers present in the request if more than one of these were found.
def self.raw_ip_from_request(headers, remote_ip)
return IpExtractionResult.new(headers && headers.get(configuration.header_name), nil) if configuration.header_name

headers_present = ip_headers(headers)

case headers_present.size
when 0
IpExtractionResult.new(remote_ip, nil)
when 1
IpExtractionResult.new(headers_present.values.first, nil)
else
IpExtractionResult.new(nil, headers_present)
end
end

# Removes any port notations or zone specifiers from the IP address without
# verifying its validity.
def self.strip_decorations(address)
return strip_ipv4_port(address) if likely_ipv4?(address)

address = strip_ipv6_port(address)

strip_zone_specifier(address)
end

def self.strip_zone_specifier(ipv6)
ipv6.gsub(/%.*/, '')
end

def self.strip_ipv4_port(ip)
ip.gsub(/:\d+\z/, '')
end

def self.strip_ipv6_port(ip)
if /\[(.*)\](?::\d+)?/ =~ ip
Regexp.last_match(1)
else
ip
end
end

# Returns whether the given value is more likely to be an IPv4 than an IPv6 address.
#
# This is done by checking if a dot (`'.'`) character appears before a colon (`':'`) in the value.
# The rationale is that in valid IPv6 addresses, colons will always preced dots,
# and in valid IPv4 addresses dots will always preced colons.
def self.likely_ipv4?(value)
dot_index = value.index('.') || value.size
colon_index = value.index(':') || value.size

dot_index < colon_index
end

# Determines whether the given string is a valid IPv4 or IPv6 address.
def self.valid_ip?(ip)
# Client IPs should not have subnet masks even though IPAddr can parse them.
return false if ip.include?('/')

begin
IPAddr.new(ip)

true
rescue IPAddr::Error
false
end
end

def self.ip_headers(headers)
return {} unless headers

DEFAULT_IP_HEADERS_NAMES.each_with_object({}) do |name, result|
value = headers.get(name)
result[name] = value unless value.nil?
end
end

def self.configuration
Datadog.configuration.tracing.client_ip
end
end
end
end
6 changes: 6 additions & 0 deletions lib/datadog/tracing/configuration/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -45,6 +45,12 @@ module Transport
ENV_DEFAULT_PORT = 'DD_TRACE_AGENT_PORT'.freeze
ENV_DEFAULT_URL = 'DD_TRACE_AGENT_URL'.freeze
end

# @public_api
module ClientIp
ENV_DISABLED = 'DD_TRACE_CLIENT_IP_HEADER_DISABLED'.freeze
ENV_HEADER_NAME = 'DD_TRACE_CLIENT_IP_HEADER'.freeze
end
end
end
end
Expand Down
35 changes: 35 additions & 0 deletions lib/datadog/tracing/contrib/rack/header_collection.rb
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
require_relative '../../../core/header_collection'

module Datadog
module Tracing
module Contrib
module Rack
# Classes and utilities for handling headers in Rack.
module Header
# An implementation of a header collection that looks up headers from a Rack environment.
class RequestHeaderCollection < Datadog::Core::HeaderCollection
# Creates a header collection from a rack environment.
def initialize(env)
super()
@env = env
end

# Gets the value of the header with the given name.
def get(header_name)
@env[Header.to_rack_header(header_name)]
end

# Tests whether a header with the given name exists in the environment.
def key?(header_name)
@env.key?(Header.to_rack_header(header_name))
end
end

def self.to_rack_header(name)
"HTTP_#{name.to_s.upcase.gsub(/[-\s]/, '_')}"
end
end
end
end
end
end
40 changes: 23 additions & 17 deletions lib/datadog/tracing/contrib/rack/middlewares.rb
Original file line number Diff line number Diff line change
Expand Up @@ -3,12 +3,14 @@
require 'date'

require_relative '../../../core/environment/variable_helpers'
require_relative '../../client_ip'
require_relative '../../metadata/ext'
require_relative '../../propagation/http'
require_relative '../analytics'
require_relative '../utils/quantization/http'
require_relative 'ext'
require_relative 'header_collection'
require_relative 'request_queue'
require_relative '../utils/quantization/http'

module Datadog
module Tracing
Expand Down Expand Up @@ -133,8 +135,9 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
# So when its not available, we want the original, unmutated PATH_INFO, which
# is just the relative path without query strings.
url = env['REQUEST_URI'] || original_env['PATH_INFO']
request_headers = parse_request_headers(env)
response_headers = parse_response_headers(headers || {})
request_header_collection = Header::RequestHeaderCollection.new(env)
request_headers_tags = parse_request_headers(request_header_collection)
response_headers_tags = parse_response_headers(headers || {})

# The priority
# 1. User overrides span.resource
Expand Down Expand Up @@ -177,6 +180,14 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_CLIENT_IP).nil?
Tracing::ClientIp.set_client_ip_tag(
request_span,
headers: request_header_collection,
remote_ip: env['REMOTE_ADDR']
)
end

if request_span.get_tag(Tracing::Metadata::Ext::HTTP::TAG_BASE_URL).nil?
request_obj = ::Rack::Request.new(env)

Expand All @@ -195,12 +206,12 @@ def set_request_tags!(trace, request_span, env, status, headers, response, origi
end

# Request headers
request_headers.each do |name, value|
request_headers_tags.each do |name, value|
request_span.set_tag(name, value) if request_span.get_tag(name).nil?
end

# Response headers
response_headers.each do |name, value|
response_headers_tags.each do |name, value|
request_span.set_tag(name, value) if request_span.get_tag(name).nil?
end

Expand All @@ -219,14 +230,13 @@ def configuration
Datadog.configuration.tracing[:rack]
end

def parse_request_headers(env)
{}.tap do |result|
whitelist = configuration[:headers][:request] || []
whitelist.each do |header|
rack_header = header_to_rack_header(header)
if env.key?(rack_header)
result[Tracing::Metadata::Ext::HTTP::RequestHeaders.to_tag(header)] = env[rack_header]
end
def parse_request_headers(headers)
whitelist = configuration[:headers][:request] || []
whitelist.each_with_object({}) do |header, result|
header_value = headers.get(header)
unless header_value.nil?
header_tag = Tracing::Metadata::Ext::HTTP::RequestHeaders.to_tag(header)
result[header_tag] = header_value
end
end
end
Expand All @@ -248,10 +258,6 @@ def parse_response_headers(headers)
end
end
end

def header_to_rack_header(name)
"HTTP_#{name.to_s.upcase.gsub(/[-\s]/, '_')}"
end
end
end
end
Expand Down
1 change: 1 addition & 0 deletions lib/datadog/tracing/metadata/ext.rb
Original file line number Diff line number Diff line change
Expand Up @@ -68,6 +68,7 @@ module HTTP
TYPE_OUTBOUND = 'http'
TYPE_PROXY = 'proxy'
TYPE_TEMPLATE = 'template'
TAG_CLIENT_IP = 'http.client_ip'

# General header functionality
module Headers
Expand Down
Loading