Debugging Ruby code - Mcollective server
In this post I record steps that I took to debug some ruby code. Actually the code was an ruby mcollective server code that were installed as part of openshift origin Node. The bug is that the server consistently fails to respond to client queries in my configuration. I documented the steps taken even though I hadn't nailed the bug yet.
[root@broker ~]# service ruby193-mcollective status
mcollectived (pid 1069) is running...
[root@broker ~]# ps afxw | grep 1069
1069 ? Sl 0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
12428 pts/0 S+ 0:00 \_ grep 1069
But the code I am interested in is deep inside call graph of the mcollectived. I want to find out the details of activemq subscription. Skipping hours of skimming mcollective source (https://github.com/puppetlabs/marionette-collective/) and openshift origin mcollective server source (https://github.com/openshift/origin-server/tree/master/plugins/msg-node/mcollective), lets jump to the activemq.rb file :
[root@broker ~]# locate activemq.rb
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb
[root@broker ~]# cat /var/log/openshift/node/ruby193-mcollective.log | grep XXX
D, [2014-10-04T09:59:22.392472 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to discovery - broadcast - mcollective
D, [2014-10-04T09:59:26.049920 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to openshift - broadcast - mcollective
D, [2014-10-04T09:59:26.095865 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to rpcutil - broadcast - mcollective
D, [2014-10-04T09:59:26.191664 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - broadcast - mcollective
D, [2014-10-04T09:59:26.202263 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - directed - mcollective
evaluate
First thing first
First we need to identify the entry point. These commands would do the trick:[root@broker ~]# service ruby193-mcollective status
mcollectived (pid 1069) is running...
[root@broker ~]# ps afxw | grep 1069
1069 ? Sl 0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
12428 pts/0 S+ 0:00 \_ grep 1069
We found out that the service is :
- running with pid 1069
- running with configuration file /opt/rh/ruby193/root/etc/mcollective/server.cfg
- service's source code is at /opt/rh/ruby193/root/usr/sbin/mcollectived
The most intrusive way yet the simplest
The simplest way is to insert 'puts' calls inside the code you want to debug. For objects, you want to call the inspect method.But the code I am interested in is deep inside call graph of the mcollectived. I want to find out the details of activemq subscription. Skipping hours of skimming mcollective source (https://github.com/puppetlabs/marionette-collective/) and openshift origin mcollective server source (https://github.com/openshift/origin-server/tree/master/plugins/msg-node/mcollective), lets jump to the activemq.rb file :
[root@broker ~]# locate activemq.rb
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb
Lets hack some code (if you're doing this for real, do some backup first):
[root@broker ~]# vi /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb
Add some puts here and there ..
# Subscribe to a topic or queue
def subscribe(agent, type, collective)
source = make_target(agent, type, collective)
puts "XXXX subscribe to "
puts agent
puts type
puts collective
And.. it doesnt work. Because the service reassigns standard output to /dev/null. Ah. And not far from the make_target call we have a Log.debug call, lets imitate it :
def subscribe(agent, type, collective)
source = make_target(agent, type, collective)
Log.debug("XXXX subscribe to #{agent} - #{type} - #{collective}")
unless @subscriptions.include?(source[:id])
Log.debug("Subscribing to #{source[:name]} with headers #{source[:headers].inspect.chomp}")
And we need to know where the log goes.. check out the configuration file (or the /proc/[pid]/fd directory, if you want):
vi /opt/rh/ruby193/root/etc/mcollective/server.cfg
topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
libdir = /opt/rh/ruby193/root/usr/libexec/mcollective
logfile = /var/log/openshift/node/ruby193-mcollective.log
loglevel = debug
daemonize = 1
direct_addressing = 1
registerinterval = 30
Restart the service :
service ruby193-mcollective restart
View the logs:
cat /var/log/openshift/node/ruby193-mcollective.log | grep XXX
D, [2014-10-04T09:59:22.392472 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to discovery - broadcast - mcollective
D, [2014-10-04T09:59:26.049920 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to openshift - broadcast - mcollective
D, [2014-10-04T09:59:26.095865 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to rpcutil - broadcast - mcollective
D, [2014-10-04T09:59:26.191664 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - broadcast - mcollective
D, [2014-10-04T09:59:26.202263 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - directed - mcollective
There, I find what I came for, parameters for the subscribe method calls.
The nonintrusive way, but not yet successful
Actually, we are not supposed to hack source codes like that. Lets learn the real ruby debugger.
Check the command line and then stop the service.
[root@broker ~]# ps auxw | grep mcoll
root 17552 0.5 4.3 378212 44520 ? Sl 09:59 0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
root 19873 0.0 0.0 103240 852 pts/0 S+ 10:08 0:00 grep mcoll
[root@broker ~]# service ruby193-mcollective stop
Shutting down mcollective: [ OK ]
[root@broker ~]# ruby -rdebug /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
/usr/lib/ruby/1.8/tracer.rb:16: Tracer is not a class (TypeError)
from /usr/lib/ruby/1.8/debug.rb:10:in `require'
from /usr/lib/ruby/1.8/debug.rb:10
Oops. Something is wrong. I used the built in ruby, which is 1.8, not 1.9.3. Lets try again.
[root@broker ~]# scl enable ruby193 bash
[root@broker ~]# ruby -rdebug /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
Debug.rb
Emacs support available.
/opt/rh/ruby193/root/usr/sbin/mcollectived:3:require 'mcollective'
(rdb:1)
Now we are in RDB, Ruby Debugger. What are the commands?
(rdb:1) help
Debugger help v.-0.002b
Commands
b[reak] [file:|class:]
b[reak] [class.]
set breakpoint to some position
wat[ch] set watchpoint to some expression
cat[ch] (|off) set catchpoint to an exception
b[reak] list breakpoints
cat[ch] show catchpoint
del[ete][ nnn] delete some or all breakpoints
disp[lay] add expression into display expression list
undisp[lay][ nnn] delete one particular or all display expressions
c[ont] run until program ends or hit breakpoint
s[tep][ nnn] step (into methods) one line or till line nnn
n[ext][ nnn] go over one line or till line nnn
w[here] display frames
f[rame] alias for where
l[ist][ (-|nn-mm)] list program, - lists backwards
nn-mm lists given lines
up[ nn] move to higher frame
down[ nn] move to lower frame
fin[ish] return to outer frame
tr[ace] (on|off) set trace mode of current thread
tr[ace] (on|off) all set trace mode of all threads
q[uit] exit from debugger
v[ar] g[lobal] show global variables
v[ar] l[ocal] show local variables
v[ar] i[nstance]
v[ar] c[onst]
m[ethod] i[nstance] show methods of object
m[ethod] show instance methods of class or module
th[read] l[ist] list all threads
th[read] c[ur[rent]] show current thread
th[read] [sw[itch]] switch thread context to nnn
th[read] stop stop thread nnn
th[read] resume resume thread nnn
p expression evaluate expression and print its value
h[elp] print this help
Lets checkout where we are (w).
(rdb:1) w
--> #1 /opt/rh/ruby193/root/usr/sbin/mcollectived:3
(rdb:1)
Ok, and list the source code (l):
(rdb:1) l
[-2, 7] in /opt/rh/ruby193/root/usr/sbin/mcollectived
1 #!
2
=> 3 require 'mcollective'
4 require 'getoptlong'
5
6 opts = GetoptLong.new(
7 [ '--help', '-h', GetoptLong::NO_ARGUMENT ],
Step to next line (n) :
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:4:require 'getoptlong'
The execution proceeds to the next line.
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:6:opts = GetoptLong.new(
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:12:if MCollective::Util.windows?
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/util.rb:1:module MCollective
I found a little strange that the debugger steps in to another source file.
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:15: configfile = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
(rdb:1) l
[10, 19] in /opt/rh/ruby193/root/usr/sbin/mcollectived
10 )
11
12 if MCollective::Util.windows?
13 configfile = File.join(MCollective::Util.windows_prefix, "etc", "server.cfg")
14 else
=> 15 configfile = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
16 end
17 pid = ""
18
19 opts.each do |opt, arg|
But it quickly returns to the original source.
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:17:pid = ""
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:19:opts.each do |opt, arg|
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:31:config = MCollective::Config.instance
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:1:module MCollective
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:33:config.loadconfig(configfile) unless config.configured
(rdb:1) n
warn 2014/10/04 10:16:16: config.rb:117:in `block in loadconfig' Use of deprecated 'topicprefix' option. This option is ignored and should be removed from '/opt/rh/ruby193/root/etc/mcollective/server.cfg'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:16: `' (NilClass)
from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:16: remove_method :to_yaml rescue nil
(rdb:1)
This is a NilClass error, similar to NullPointerException, but I could proceeds further into another code by keeping doing (n):
(rdb:1)
n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:17: alias :to_yaml :psych_to_yaml
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:20:class Module
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:29: `' (NilClass)
from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:29: remove_method :yaml_as rescue nil
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:30: alias :yaml_as :psych_yaml_as
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:33:if defined?(::IRB)
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych.rb:12:require 'psych/deprecated'
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/deprecated.rb:79: `' (NilClass)
from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `'
/opt/rh/ruby193/root/usr/share/ruby/psych/deprecated.rb:79: undef :to_yaml_properties rescue nil
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych.rb:94:module Psych
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:86: engine = 'psych'
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:96:module Syck # :nodoc:
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:100:module Psych # :nodoc:
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:104:YAML::ENGINE.yamler = engine
(rdb:1) n
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:10: class Yaml_facts
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/facts/base.rb:1:module MCollective
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:143: PluginManager.loadclass("Mcollective::Connector::#{@connector}")
(rdb:1) l
[138, 147] in /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb
138 if @logger_type == "syslog"
139 raise "The sylog logger is not usable on the Windows platform" if Util.windows?
140 end
141
142 PluginManager.loadclass("Mcollective::Facts::#{@factsource}_facts")
=> 143 PluginManager.loadclass("Mcollective::Connector::#{@connector}")
144 PluginManager.loadclass("Mcollective::Security::#{@securityprovider}")
145 PluginManager.loadclass("Mcollective::Registration::#{@registration}")
146 PluginManager.loadclass("Mcollective::Audit::#{@rpcauditprovider}") if @rpcaudit
147 PluginManager << {:type => "global_stats", :class => RunnerStats.new}
It seems that the context of those errors were facts loading process. The connector is a particular interest, because :
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:144: PluginManager.loadclass("Mcollective::Security::#{@securityprovider}")
(rdb:1) PluginManager["connector_plugin"]
#"unset", "activemq.pool.size"=>"1", "activemq.pool.1.host"=>"broker.openshift.local", "activemq.pool.1.port"=>"61613", "activemq.pool.1.user"=>"mcollective", "activemq.pool.1.password"=>"marionette", "yaml"=>"/opt/rh/ruby193/root/etc/mcollective/facts.yaml"}, @connector="Activemq", @securityprovider="Psk", @factsource="Yaml", @identity="broker.openshift.local", @registration="Agentlist", @registerinterval=30, @registration_collective=nil, @classesfile="/var/lib/puppet/state/classes.txt", @rpcaudit=false, @rpcauditprovider="", @rpcauthorization=false, @rpcauthprovider="", @configdir="/opt/rh/ruby193/root/etc/mcollective", @color=true, @configfile="/opt/rh/ruby193/root/etc/mcollective/server.cfg", @logger_type="file", @keeplogs=5, @max_log_size=2097152, @rpclimitmethod=:first, @libdir=["/opt/rh/ruby193/root/usr/libexec/mcollective"], @fact_cache_time=300, @loglevel="debug", @logfacility="user", @collectives=["mcollective"], @main_collective="mcollective", @ssl_cipher="aes-256-cbc", @direct_addressing=true, @direct_addressing_threshold=10, @default_discovery_method="mc", @default_discovery_options=[], @ttl=60, @mode=:client, @publish_timeout=2, @threaded=false, @logfile="/var/log/openshift/node/ruby193-mcollective.log", @daemonize=true>, @subscriptions=[], @msgpriority=0, @base64=false>
Yes, it loads the Activemq connector. Lets create a breakpoint in the subscribe method :
(rdb:1) b PluginManager["connector_plugin"].subscribe
Set breakpoint 1 at #.subscribe
Then continue (c)..
(rdb:1) c
[root@broker ~]#
Wait. It stops. Browsing some source code tells me that the code forks somewhere after that. And, the forked code seems to be unrelated to the debugger.. So it is a dead end for now.
Well thats all now for the record. I can't promise that there will be a more successful debugging session, but surely I hope that there will be.
Comments