Distributed tracing and FaaS?

Tracing Fission functions with Jaeger

Bhavin Gandhi

InfraCloud Technologies

Microservices?

  • Treating services as products
  • Easier to manage
  • What about debugging things?

Monitoring

Is the service working?

Let's keep checking for that

Up for Microservices?

They are never 100% up

Nines don’t matter if users aren’t happy.

Charity Majors

Observability

Being able to get a view of the system

Helps to understand your system

Stop reverse engineering applications and start monitoring from the inside.

Kelsey Hightower, Monitorama 2016

Pillars of Observability

  • Logging
  • Metrics monitoring
  • Tracing

Logging

Events and errors that occurred in applications

The limits of logging

Signal to noise ratio

Only helps with problems predicted in advance

Metrics monitoring

Numerical information about what's happening

Helps to predict the system's behavior

Tracing

Path taken by a user's request

Connecting individual components

Tracing in Serverless/FaaS

  • Abstracted infrastructure
  • Short-lived nature of functions

Introducing GaS

Greetings as a Service

  • Two Functions
    • Greeter
    • Image
  • Using Kafka in between

Components of GaS

gas_architecture_1.jpg

Functions on Kubernetes

  • Fission
  • Kubernetes native

Fission functions

  • Function code
  • One entry point
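
A minimal sketch of such a function for the Python environment, assuming the environment's convention of a single entry point named main() whose return value becomes the response body. The file name and greeting text are made up for illustration.

```python
# greeter.py: hypothetical function code, for illustration only.
# Assumes the Python environment imports this module and calls main(),
# using the returned string as the HTTP response body.

def main():
    return "Hello from the greeter function!\n"
```

The function contains no server code of its own; the environment provides that part.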

Fission function environments

  • Container images
  • Available for most languages
  • Extensible

Function trigger

  • Events that invoke function execution
  • Supports HTTP and message queues

fission_components.jpg

Introducing detectives

detectives.jpg

Image credits: CNCF Branding & Zipkin Community: Logos

Tracing backends

  • Collect trace events, called spans
  • Store and visualize those events
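
To make "span" concrete, here is a rough sketch of the kind of record a backend collects and how spans group into a trace. The Span fields and the InMemoryBackend class are assumptions for illustration; they are not Jaeger's or Zipkin's actual schema or API.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Span:
    """One timed operation inside a trace (illustrative fields only)."""
    trace_id: str
    operation_name: str
    span_id: str = field(default_factory=lambda: uuid.uuid4().hex[:16])
    parent_id: Optional[str] = None
    start_time: float = field(default_factory=time.time)
    duration: Optional[float] = None
    tags: Dict[str, str] = field(default_factory=dict)

    def finish(self) -> None:
        # Record how long the operation took.
        self.duration = time.time() - self.start_time


class InMemoryBackend:
    """Hypothetical stand-in for a tracing backend: stores and groups spans."""

    def __init__(self) -> None:
        self._spans: List[Span] = []

    def collect(self, span: Span) -> None:
        self._spans.append(span)

    def trace(self, trace_id: str) -> List[Span]:
        # Spans of one trace ordered by start time, roughly what a
        # timeline view needs.
        return sorted(
            (s for s in self._spans if s.trace_id == trace_id),
            key=lambda s: s.start_time,
        )
```

Spans that share a trace_id belong to the same request; parent_id is what lets a UI draw them as a tree.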

How to collect these events?

  • Instrumenting your code
  • Client libraries built according to OpenTracing standards

Let's trace things

  • Instrumenting each function's code?
  • Environments to the rescue

Modifying the Python environment

  • Uses Flask as the HTTP server
  • server.py loads user-defined functions
  • Directory structure

    ./python-env
    ├── Dockerfile
    ├── lib
    │   ├── __init__.py
    │   └── tracing.py
    ├── README.md
    ├── requirements.txt
    └── server.py
    

/specialize in server.py

from lib.tracing import initialize_tracing
…
@self.route('/specialize', methods=['POST'])
def load():
    # load user function from codepath
    userfunc = …
    # Wrap userfunc with tracing instrumentation
    self.userfunc = initialize_tracing(userfunc)
    return ""

initialize_tracing in lib/tracing.py

def initialize_tracing(func):
    def inner():
        …
        func_resp = func()
        return func_resp
    return inner
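
The wrapping pattern above can be tried without any tracing library. The following is a stdlib-only sketch: the name initialize_tracing mirrors the slide, but the body only records call durations into a hypothetical RECORDED list instead of creating real spans.

```python
import functools
import time

# Hypothetical in-memory sink standing in for a real tracer.
RECORDED = []


def initialize_tracing(func):
    """Return a wrapper that times every call to func."""

    @functools.wraps(func)
    def inner(*args, **kwargs):
        start = time.monotonic()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether func returned normally or raised.
            RECORDED.append((func.__name__, time.monotonic() - start))

    return inner


def main():
    # Stand-in for a user-defined function.
    return "hello"


traced_main = initialize_tracing(main)
```

Calling traced_main() returns the same response as main() and leaves one entry in RECORDED, so the user's code needs no changes.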

Creating the tracer object

def inner():
    tracer = _init_tracer(fission_func_name)
    …
    func_resp = func()
    return func_resp

Starting the trace event using a with block

def inner():
    with tracer.start_span(span_name, child_of=span_ctx) as span:
        …
        func_resp = func()
    return func_resp
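
The with block matters because the span is finished when the block exits, even if the user function raises. The sketch below imitates that behavior with a hypothetical, stdlib-only start_span; it is not the real tracer, and the FINISHED list and dict-based span are assumptions for illustration.

```python
import time
from contextlib import contextmanager

FINISHED = []  # hypothetical sink for finished spans


@contextmanager
def start_span(name):
    span = {"name": name, "start": time.time(), "error": False}
    try:
        yield span
    except Exception:
        # Mark the span as failed, then let the error propagate.
        span["error"] = True
        raise
    finally:
        # Always finish the span, whether the block succeeded or raised.
        span["duration"] = time.time() - span["start"]
        FINISHED.append(span)


def failing_function():
    # Stand-in for user code that raises.
    raise ValueError("boom")
```

Running failing_function() inside `with start_span("bad"):` still appends a finished span, flagged as an error, before the exception reaches the caller.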

Visualization in Jaeger

Spans of greeter function

greeter_single_span.png

Spans of image function

image_single_span.png

Linking spans together

  • Context propagation
  • Passing context over the wire

extract operation

  1. Incoming request to the function

    …
    trace-ctx: 1234abcd:5678
    
  2. Create a span context object using tracer.extract()
  3. Start a new span with that span context as parent using tracer.start_span()

inject operation

  1. Create headers for further requests made by user code using tracer.inject()
  2. Save the current span and new headers in Flask's global g
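
The extract and inject steps can be sketched end to end with plain dictionaries. This uses the simplified trace-ctx header from the example above; the SpanContext type and the extract, start_span, and inject functions are hypothetical stand-ins, not the OpenTracing or Jaeger API, and Jaeger's real header carries more fields than these two.

```python
import secrets
from typing import Dict, NamedTuple, Optional

HEADER = "trace-ctx"  # simplified header name taken from the example above


class SpanContext(NamedTuple):
    trace_id: str
    span_id: str


def extract(headers: Dict[str, str]) -> Optional[SpanContext]:
    """Build a span context from incoming headers, or None if absent/invalid."""
    lowered = {k.lower(): v for k, v in headers.items()}
    raw = lowered.get(HEADER)
    if raw is None:
        return None
    trace_id, sep, span_id = raw.partition(":")
    if not sep or not trace_id or not span_id:
        return None
    return SpanContext(trace_id, span_id)


def start_span(parent: Optional[SpanContext]) -> SpanContext:
    """Continue the parent's trace if there is one, else start a new trace."""
    trace_id = parent.trace_id if parent else secrets.token_hex(8)
    return SpanContext(trace_id, secrets.token_hex(4))


def inject(ctx: SpanContext, carrier: Dict[str, str]) -> None:
    """Write the context into outgoing headers for downstream requests."""
    carrier[HEADER] = f"{ctx.trace_id}:{ctx.span_id}"


incoming = {"Trace-Ctx": "1234abcd:5678"}
parent = extract(incoming)
span = start_span(parent)
outgoing: Dict[str, str] = {}
inject(span, outgoing)
```

The new span keeps the incoming trace ID and gets its own span ID, so whichever service receives `outgoing` can link its spans to the same trace.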

extract and inject flow

extract_inject_flow.jpg

Context propagation in tracing.py

def inner():
    span_ctx = tracer.extract(Format.HTTP_HEADERS, request.headers)
    with tracer.start_span(span_name, child_of=span_ctx) as span:
        …
        generated_headers = dict()
        tracer.inject(span, Format.HTTP_HEADERS, generated_headers)
        # User may want to set tags on span or use the generated_headers
        g.span = span
        g.generated_headers = generated_headers
        …
        func_resp = func()
        # Add headers from generated_headers to response
    return resp

Modifying Kafka MQT of Fission

More about MQT of Fission

  • New records are sent to functions as HTTP requests
  • No support for Kafka record headers
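
Conceptually, the missing piece is a mapping between Kafka record headers and HTTP headers, so the trace context survives the hop through the queue. The sketch below shows that mapping in Python for consistency with the other examples; it is not the actual change to Fission's trigger, and the function names are hypothetical. It assumes record headers arrive as (key, bytes) pairs and that the function's response headers may be published back as record headers.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def record_headers_to_http(
    record_headers: Iterable[Tuple[str, Optional[bytes]]],
) -> Dict[str, str]:
    """Copy Kafka record headers into a dict usable as HTTP request headers."""
    http_headers: Dict[str, str] = {}
    for key, value in record_headers:
        if value is None:
            continue  # a null header value has no HTTP equivalent, so skip it
        # errors="replace" keeps non-UTF-8 bytes from raising UnicodeDecodeError
        http_headers[key] = value.decode("utf-8", errors="replace")
    return http_headers


def http_to_record_headers(http_headers: Dict[str, str]) -> List[Tuple[str, bytes]]:
    """Reverse direction: turn HTTP headers into Kafka record headers."""
    return [(key, value.encode("utf-8")) for key, value in http_headers.items()]
```

With both directions in place, a header such as trace-ctx set by one function can reach the next function that consumes the resulting record.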

fission_mqt_flow.jpg

kafka_header_support_full_image.png

Running the service again

faas-tracing-2.mp4

linked_spans.png

trace_graph_child_of.png

Wrong timestamps on spans

  • ClockSkew adjustments
  • Using FOLLOWS_FROM reference instead of CHILD_OF

––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time

 [-Parent Span--------------]
      [-Child Span A----]
       [-Child Span B----]


 [-Parent Span-]
             [-Child Span-]

Diagram credits: OpenTracing Specification. Apache License 2.0
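
The distinction the two diagrams draw can be written as a containment check. This is a sketch under the assumption that a CHILD_OF child is expected to fall inside its parent's time range, while a FOLLOWS_FROM span may outlive the span it refers to. The Interval type and both function names are hypothetical, and in real code the reference type is chosen by design rather than measured.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    start: float  # seconds
    end: float


def fits_within(parent: Interval, child: Interval) -> bool:
    """True if the child ran entirely while the parent was still running."""
    return parent.start <= child.start and child.end <= parent.end


def suggested_reference(parent: Interval, child: Interval) -> str:
    # Heuristic for illustration only: a child that outlives its parent
    # did not have a parent waiting on its result.
    return "CHILD_OF" if fits_within(parent, child) else "FOLLOWS_FROM"
```

A function triggered through a message queue typically looks like the second diagram: the producer has already returned by the time the consumer finishes.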

Adding support for references in jaeger-client-python

jaeger-client-python-references-support.png

Modified tracing.py

from opentracing import Format, follows_from
…
span_ctx = tracer.extract(Format.HTTP_HEADERS, request.headers)
# Pass the context as a FOLLOWS_FROM reference instead of a child_of
# relation, because the calls between services are asynchronous
span_reference = follows_from(referenced_context=span_ctx)
with tracer.start_span(span_name, references=span_reference) as span:
    …
    return response

linked_spans_timeline.png

trace_graph_time_follows_from.png

tracer.close() from jaeger-client-python

Debugging an issue in GaS

faas-tracing-debug.mp4

Watch out for these

  • Use 128-bit trace IDs, since shorter IDs may produce duplicate trace IDs
  • When working with asynchronous applications, use the FOLLOWS_FROM reference
  • Use TCP or HTTP instead of UDP to send tracing events
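
As a rough illustration of the first point, the birthday approximation estimates how likely duplicate IDs are. This sketch assumes trace IDs are drawn uniformly at random; the trace count of one billion is an arbitrary example, not a figure from the talk.

```python
import math


def collision_probability(num_traces: int, id_bits: int) -> float:
    """Approximate chance that at least two random IDs collide.

    Uses the birthday approximation p ≈ 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2 ** id_bits
    exponent = num_traces * (num_traces - 1) / (2 * space)
    # expm1 keeps precision when the exponent is tiny (the 128-bit case)
    return -math.expm1(-exponent)


for bits in (64, 128):
    p = collision_probability(10 ** 9, bits)
    print(f"{bits}-bit IDs, 1e9 traces: p ≈ {p:.3g}")
```

Under these assumptions the 64-bit probability is small but not negligible at this volume, while the 128-bit probability is many orders of magnitude lower.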

Questions

bhavin192[at]geeksocket.in

@_bhavin192

References

These slides are released publicly under

Creative Commons Attribution-ShareAlike 4.0 International (CC BY-SA 4.0)