Learning Cloud Foundry

Learning Cloud Foundry - NATS

January 19, 2013

All Cloud Foundry components communicate with each other through the NATS publish-subscribe mechanism.

Let's look at NATS in detail.

Cloud Foundry Router as NATS client

During startup of the Cloud Foundry Router (router.rb), there is a call that starts the NATS client:

NATS.start(:uri => config['mbus'])

This starts the NATS client with the uri parameter taken from the 'mbus' configuration parameter. The configuration file that holds it is located as follows:

config_path = ENV["CLOUD_FOUNDRY_CONFIG_PATH"] || File.join(File.dirname(__FILE__), '../config')
config_file = File.join(config_path, 'router.yml')
...

opts.on("-c", "--config [ARG]", "Configuration File") do |opt|
  config_file = opt
end
...

The configuration is read from a YAML file (originally "Yet Another Markup Language", now "YAML Ain't Markup Language") whose location can come from several sources, in this order of precedence:

the command-line parameter -c or --config
the environment variable CLOUD_FOUNDRY_CONFIG_PATH (if set) concatenated with '/router.yml'
../config/router.yml relative to the location of the Ruby file router.rb (currently router/lib/router.rb)

Note that the code does not check whether the file exists at all; it only checks whether the command-line switch or the environment variable is present.
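The lookup order above can be reproduced in a few lines. This is a minimal sketch, not the router's actual code: resolve_config_file is a hypothetical helper, and its argv, env and base_dir arguments stand in for ARGV, ENV and File.dirname(__FILE__) so that the example can run on its own.

```ruby
require 'optparse'

# Hypothetical helper (not part of router.rb): returns the configuration file
# path that would be chosen for the given arguments and environment.
def resolve_config_file(argv, env, base_dir)
  # Lowest priority: ../config relative to the Ruby file, unless the
  # environment variable supplies a different directory.
  config_path = env['CLOUD_FOUNDRY_CONFIG_PATH'] || File.join(base_dir, '../config')
  config_file = File.join(config_path, 'router.yml')

  # Highest priority: an explicit -c / --config switch replaces the path.
  parser = OptionParser.new do |opts|
    opts.on('-c', '--config [ARG]', 'Configuration File') do |opt|
      config_file = opt
    end
  end
  parser.parse(argv)

  # As in the original, nothing here checks that the file exists.
  config_file
end

puts resolve_config_file(['-c', '/tmp/custom.yml'], {}, '/opt/router/lib')
puts resolve_config_file([], { 'CLOUD_FOUNDRY_CONFIG_PATH' => '/etc/cf' }, '/opt/router/lib')
puts resolve_config_file([], {}, '/opt/router/lib')
```

The paths used in the three calls are made up for illustration; they only show which source wins in each case.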

After starting the NATS client, the router publishes router start events:

  @router_id = VCAP.secure_uuid
  @hello_message = { :id => @router_id, :version => Router::VERSION }.to_json.freeze
 
  # This will check on the state of the registered urls, do maintenance, etc..
  Router.setup_sweepers
 
  # Setup a start sweeper to make sure we have a consistent view of the world.
  EM.next_tick do
    # Announce our existence
    NATS.publish('router.start', @hello_message)
    # Don't let the messages pile up if we are in a reconnecting state
    EM.add_periodic_timer(START_SWEEPER) do
      unless NATS.client.reconnecting?
        NATS.publish('router.start', @hello_message)
      end
    end
  end

This ensures that other components notice that a new router is starting up (and might need a status update).

NATS client module

The NATS client module starts an EventMachine-based network connection to the NATS server given in the uri parameter. It also has defaults (client.rb):
(line 13 of client.rb)

DEFAULT_PORT = 4222
DEFAULT_URI = "nats://localhost:#{DEFAULT_PORT}".freeze

...
(line 102 of client.rb)

opts[:uri] ||= ENV['NATS_URI'] || DEFAULT_URI

...

The uri parameter is therefore obtained from the following sources, in this order:

uri parameter when calling NATS.start
NATS_URI environment variable
DEFAULT_URI which is nats://localhost:4222
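The fallback chain can be tried out in isolation. This is a minimal sketch under stated assumptions: resolve_nats_uri is a hypothetical helper built around the `||=` line quoted above, and the environment is passed in as a plain hash so the example does not depend on the real ENV.

```ruby
DEFAULT_PORT = 4222
DEFAULT_URI = "nats://localhost:#{DEFAULT_PORT}".freeze

# Hypothetical helper (not part of client.rb): picks the URI the same way the
# quoted line does, falling through whenever a source is nil.
def resolve_nats_uri(opts = {}, env = ENV)
  opts[:uri] || env['NATS_URI'] || DEFAULT_URI
end

puts resolve_nats_uri({ :uri => 'nats://10.0.0.5:4222' }, {})
puts resolve_nats_uri({ :uri => nil }, { 'NATS_URI' => 'nats://mbus.example:4222' })
puts resolve_nats_uri({}, {})
```

The second call is the interesting one for the router: if 'mbus' is missing from router.yml, config['mbus'] is nil, so the lookup falls through to the environment variable and then to the default. The host names are invented for the example.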

The publish call is implemented by sending a text command (a PUB control line followed by the payload) to the NATS server:

  # Publish a message to a given subject, with optional reply subject and completion block
  # @param [String] subject
  # @param [Object, #to_s] msg
  # @param [String] opt_reply
  # @param [Block] blk, closure called when publish has been processed by the server.
  def publish(subject, msg=EMPTY_MSG, opt_reply=nil, &blk)
    return unless subject
    msg = msg.to_s
    # Accounting
    @msgs_sent += 1
    @bytes_sent += msg.bytesize if msg

    send_command("PUB #{subject} #{opt_reply} #{msg.bytesize}#{CR_LF}#{msg}#{CR_LF}")
    queue_server_rt(&blk) if blk
  end

The NATS server then takes care of delivering the message to all subscribers that are interested in that subject.
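To see what actually goes over the wire, the string interpolation from publish can be run on its own. This is a sketch only: pub_frame is a hypothetical helper, and CR_LF is assumed here to be a carriage return followed by a line feed.

```ruby
# Assumption: the client's CR_LF constant is "\r\n".
CR_LF = "\r\n".freeze

# Hypothetical helper: builds the same text that publish hands to send_command.
def pub_frame(subject, msg = '', opt_reply = nil)
  msg = msg.to_s
  "PUB #{subject} #{opt_reply} #{msg.bytesize}#{CR_LF}#{msg}#{CR_LF}"
end

# inspect makes the line endings visible.
puts pub_frame('router.start', '{"id":"abc"}').inspect
puts pub_frame('router.start', 'hello', 'reply.inbox').inspect
```

Two details are worth noticing. The size on the control line is a byte count (bytesize), not a character count, so multi-byte payloads are framed correctly. And when no reply subject is given, the interpolated nil leaves two spaces between the subject and the size.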

NATS Server

The NATS server is also implemented in Ruby. The source code shows that the startup sequence is as follows:

run the setup using the given command-line arguments
start EventMachine on the given host and port, using the NATSD::Connection module to serve connections
if the http_port parameter is given, start HTTP monitoring on that port

The server class (which does the setup) is split across a few files: the core server (server.rb) and option handling (options.rb).
The startup options are first read from the command line (see the parser method); the server then reads a configuration file (if given) for additional parameters. Parameters given on the command line are not overridden; only missing options are taken from the YAML-formatted configuration file. The available parameters are:

addr (listen network address)
port
http: net, port, user, password (for the HTTP monitoring port)
authorization: user, password, auth_timeout
ssl (will request SSL on connections)
debug
trace
syslog (activate syslogging)
pid_file
log_file
log_time
ping: interval, max_outstanding
max_control_line, max_payload, max_connections (setup limits)
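The precedence rule (command line first, configuration file only for what is missing) can be sketched as follows. This is not the server's code: merge_config is a hypothetical helper, and the configuration is assumed to be a flat YAML mapping, whereas the real file nests some keys (for example under http and ping).

```ruby
require 'yaml'

# Hypothetical helper: returns the command-line options, filled in with values
# from the YAML text only where the command line did not set the option.
def merge_config(cli_opts, yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  merged = cli_opts.dup
  config.each do |key, value|
    name = key.to_sym
    merged[name] = value unless merged.key?(name)
  end
  merged
end

yaml_text = <<~CONFIG
  port: 4222
  debug: true
  pid_file: /tmp/nats.pid
CONFIG

# port stays at the command-line value; debug and pid_file come from the file.
p merge_config({ :port => 4300 }, yaml_text)
```

The values in the YAML text are invented for the example.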

The subscription list is kept in the server class, along with the route_to_subscribers method for sending messages to registered parties.
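The idea of a subscription list with a routing method can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the SubscriptionList class is hypothetical, it matches subjects exactly, and its route_to_subscribers only borrows the name of the server's method, not its signature or behavior.

```ruby
# Hypothetical in-memory subscription list with exact subject matching.
class SubscriptionList
  def initialize
    # Maps a subject to the list of handlers registered for it.
    @subs = Hash.new { |hash, subject| hash[subject] = [] }
  end

  # Registers a handler block for a subject and returns it, so the caller
  # can later pass it to unsubscribe.
  def subscribe(subject, &handler)
    @subs[subject] << handler
    handler
  end

  def unsubscribe(subject, handler)
    @subs.fetch(subject, []).delete(handler)
  end

  # Delivers msg to every handler registered for subject and returns the
  # number of deliveries.
  def route_to_subscribers(subject, msg)
    handlers = @subs.fetch(subject, [])
    handlers.each { |handler| handler.call(msg) }
    handlers.size
  end
end

subs = SubscriptionList.new
received = []
handler = subs.subscribe('router.start') { |msg| received << msg }
subs.route_to_subscribers('router.start', 'hello')
subs.unsubscribe('router.start', handler)
subs.route_to_subscribers('router.start', 'ignored')
p received
```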
The Connection module is the heart of the NATS server's operations. The NATS server can be configured to require authentication or an SSL connection. The operations it recognizes are:

PUB (publish), which sends a payload to registered parties
SUB (subscribe), which registers the caller for a message subject
UNSUB (unsubscribe), which unregisters a subscription
PING, which makes the server send a response message
PONG, which is not really an operation but part of the mechanism for checking client health
CONNECT, which reconfigures the verbose and pedantic connection options
INFO, which makes the server send an information JSON string
CTRL-C/CTRL-D, both of which make the server close the connection
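A rough way to picture how such a text protocol is dispatched is to classify each incoming control line. The patterns below are illustrative assumptions, not the regular expressions used by the server, and classify is a hypothetical helper; the CTRL-C/CTRL-D handling is left out.

```ruby
# Illustrative patterns only; the server's own expressions may differ.
CONTROL_LINE_PATTERNS = {
  :pub     => /\APUB\s+\S+\s+(?:\S+\s+)?\d+\r\n\z/i,  # subject, optional reply, size
  :sub     => /\ASUB\s+\S+\s+(?:\S+\s+)?\S+\r\n\z/i,  # subject, optional group, id
  :unsub   => /\AUNSUB\s+\S+(?:\s+\d+)?\r\n\z/i,      # id, optional message limit
  :ping    => /\APING\r\n\z/i,
  :pong    => /\APONG\r\n\z/i,
  :connect => /\ACONNECT\s+.+\r\n\z/i,                # followed by a JSON options string
  :info    => /\AINFO\r\n\z/i
}.freeze

# Hypothetical helper: returns the operation name for a control line, or
# :unknown when nothing matches.
def classify(line)
  CONTROL_LINE_PATTERNS.each do |name, pattern|
    return name if line =~ pattern
  end
  :unknown
end

p classify("PUB router.start  12\r\n")
p classify("SUB router.start 1\r\n")
p classify("CONNECT {\"verbose\":false,\"pedantic\":false}\r\n")
p classify("HELLO\r\n")
```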

The server periodically sends a ping message to each client, at the interval given by Server.ping_interval. If the number of pings still waiting for a reply exceeds the max_outstanding parameter, the server tells the client that it is unresponsive and then closes the client connection.
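The bookkeeping behind this health check can be sketched without any networking. PingTracker is a hypothetical class written for this illustration; the timer that would call tick at every ping interval, and the exact comparison used by the server, are assumptions.

```ruby
# Hypothetical per-connection tracker for unanswered pings.
class PingTracker
  attr_reader :outstanding

  def initialize(max_outstanding)
    @max_outstanding = max_outstanding
    @outstanding = 0
  end

  # Called whenever the ping timer fires. Returns :close when too many pings
  # are unanswered, otherwise counts a new ping and returns :send_ping.
  def tick
    return :close if @outstanding > @max_outstanding
    @outstanding += 1
    :send_ping
  end

  # Called when a PONG arrives from the client.
  def pong_received
    @outstanding -= 1 if @outstanding > 0
  end
end

tracker = PingTracker.new(2)
# A client that never answers: pings are sent until the limit is exceeded.
p 4.times.map { tracker.tick }
```

A healthy client would call pong_received after each ping, which keeps the counter near zero so that tick never returns :close.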
