Monday, October 20, 2014

Configuring Openshift Origin with S3-based persistent shared storage

This post describes the steps I took to provide shared storage for an OpenShift Origin M4 installation. There were some difficulties that had to be solved by non-standard methods.

Requirement

When hosting applications on the OpenShift Origin platform, we are confronted with a bitter truth:
writing applications for cloud platforms requires us to avoid writing to local filesystems, and there is no support for storage shared between gears. But we still need to support multiple PHP applications that store their attachments in the local filesystem, with minimal code changes. So we need a way to quickly implement shared storage between gears of the same application. And maybe we could loosen the application-isolation requirement just for the shared storage.

Basic Idea

The idea is to mount an S3 API-based storage on all nodes. Each gear could then refer to the application's folder inside the shared storage to store and retrieve file attachments. My implementation uses an EMC ViPR shared storage with an S3 API, which I assume is harder than using real Amazon S3 storage. I used the S3FS implementation from https://github.com/s3fs-fuse/s3fs-fuse to mount the S3 storage as folders.

Pitfalls

Openshift gears are not allowed to write to arbitrary directories. The gears can't even peek into other gears' directories; this is restricted using SELinux Multi-Category Security (MCS). Custom SELinux policies were implemented, which are complex enough that a run-of-the-mill admin would struggle to understand them. So mounting the S3 storage on the nodes is only half of the battle.
S3FS needs a newer version of FUSE than the one packaged with RHEL 6. And FUSE needs a small patch to allow mounting the S3 storage with a context other than fusefs_t.
Access control for a given directory is cached for a process's lifetime, so if a running httpd is denied access, make sure it is restarted after we remount the S3 storage with a different context.
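
To make those last two pitfalls concrete, a quick sanity check after mounting could look like this (a sketch; adjust the mount point, and restart whichever httpd was denied access):

# ls -Zd /var/[our-shared-folder]
# service httpd restart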

Step-by-step

First, make sure that your system is clean of the old fuse package, then download the latest fuse version from SourceForge and extract it.

# wget http://downloads.sourceforge.net/project/fuse/fuse-2.X/2.9.3/fuse-2.9.3.tar.gz
# tar xzf fuse-2.9.3.tar.gz
# cd fuse-2.9.3
# ./configure --prefix=/usr
# export PKG_CONFIG_PATH=/usr/lib/pkgconfig:/usr/lib64/pkgconfig/

We need to add the missing context= option to lib/mount.c (inside the fuse_mount_opts table):
        FUSE_OPT_KEY("default_permissions",     KEY_KERN_OPT),
        FUSE_OPT_KEY("context=",                KEY_KERN_OPT),
        FUSE_OPT_KEY("fscontext=",              KEY_KERN_OPT),
        FUSE_OPT_KEY("defcontext=",             KEY_KERN_OPT),
        FUSE_OPT_KEY("rootcontext=",            KEY_KERN_OPT),
        FUSE_OPT_KEY("max_read=",               KEY_KERN_OPT),
        FUSE_OPT_KEY("max_read=",               FUSE_OPT_KEY_KEEP),
        FUSE_OPT_KEY("user=",                   KEY_MTAB_OPT),

The FUSE_OPT_KEY("context=", KEY_KERN_OPT) line is the inserted one; the other entries are already present in fuse 2.9.3. Make the change, save, then compile the whole thing.

# make
# make install
# ldconfig
# modprobe fuse
# pkg-config --modversion fuse

Now we can download the latest s3fs source.

# wget https://github.com/s3fs-fuse/s3fs-fuse/archive/master.zip
# unzip master.zip
# cd s3fs-fuse-master
# ./autogen.sh
# ./configure --prefix=/usr/local
# make
# make install

Put your AWS credentials or other access keys in the .passwd-s3fs file. The syntax is:
accessKeyId:secretAccessKey
or
bucketName:accessKeyId:secretAccessKey
Ensure that ~/.passwd-s3fs is readable only by its owner.

chmod 600 ~/.passwd-s3fs 
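
Putting it together, creating and securing the credentials file could look like this (the keys below are hypothetical placeholders):

# cat > ~/.passwd-s3fs <<'EOF'
always-on-non-core:AKIAXXXXXXXXXXXXXXXX:XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
EOF
# chmod 600 ~/.passwd-s3fs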

Let's mount the S3 storage.

s3fs always-on-non-core /var/[our-shared-folder] -o url=http://[server]:[port]/ -o use_path_request_style -o context=system_u:object_r:openshift_rw_file_t:s0 -o umask=000 -o allow_other

Change the our-shared-folder part to the mount point we want to use, and the server and port parts to the S3 service endpoint. If we are using real S3, we omit the -o url part. You also might want to omit use_path_request_style to use the newer (virtual-hosted) API style; we only need use_path_request_style when using an S3-compatible storage.
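
If you want the mount to come back after a reboot, s3fs can also be driven from /etc/fstab. A sketch using the same options (I have not verified this exact line on RHEL 6, so treat it as a starting point):

s3fs#always-on-non-core /var/[our-shared-folder] fuse url=http://[server]:[port]/,use_path_request_style,context=system_u:object_r:openshift_rw_file_t:s0,umask=000,allow_other,_netdev 0 0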

Configure the applications

As root on each node, create a folder for the application.
# mkdir /var/[our-shared-folder]/[appid]

Create a .openshift/action_hooks/build file inside the application's git repository, with the u+x bit set.
Fill it with:
#! /bin/bash
ln -sf /var/[our-shared-folder]/[appid] $OPENSHIFT_REPO_DIR/[appfolder]

Change the appfolder part to the folder under the application's root directory where we want to store the attachments. Afterwards, we could create a file in that folder using PHP like:

$f = fopen("[appfolder]/file1.txt","wb");


Reference :
https://code.google.com/p/s3fs/issues/detail?id=170
http://tecadmin.net/mount-s3-bucket-centosrhel-ubuntu-using-s3fs/

Saturday, October 4, 2014

Debugging Ruby code - Mcollective server

In this post I record the steps that I took to debug some Ruby code. The code in question is the Ruby mcollective server installed as part of an OpenShift Origin Node. The bug is that the server consistently fails to respond to client queries in my configuration. I documented the steps taken even though I haven't nailed the bug yet.

First thing first

First we need to identify the entry point. These commands do the trick:
[root@broker ~]# service ruby193-mcollective status
mcollectived (pid  1069) is running...
[root@broker ~]# ps afxw | grep 1069
 1069 ?        Sl     0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
12428 pts/0    S+     0:00          \_ grep 1069

We found out that the service is:
  • running with pid 1069
  • running with the configuration file /opt/rh/ruby193/root/etc/mcollective/server.cfg
  • running from the source file /opt/rh/ruby193/root/usr/sbin/mcollectived

The most intrusive way yet the simplest

The simplest way is to insert 'puts' calls inside the code you want to debug. For objects, you want to call the inspect method.

But the code I am interested in is deep inside the call graph of mcollectived. I want to find out the details of the ActiveMQ subscription. Skipping hours of skimming the mcollective source (https://github.com/puppetlabs/marionette-collective/) and the OpenShift Origin mcollective server source (https://github.com/openshift/origin-server/tree/master/plugins/msg-node/mcollective), let's jump to the activemq.rb file:
[root@broker ~]# locate activemq.rb
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb

Let's hack on the code (if you're doing this for real, make a backup first):
[root@broker ~]# vi /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb

Add some puts calls here and there:
      # Subscribe to a topic or queue
      def subscribe(agent, type, collective)
        source = make_target(agent, type, collective)
        puts "XXXX subscribe to "
        puts agent
        puts type
        puts collective

And.. it doesn't work, because the service reassigns standard output to /dev/null. Ah. But not far from the make_target call there is a Log.debug call; let's imitate it:
      def subscribe(agent, type, collective)
        source = make_target(agent, type, collective)
        Log.debug("XXXX subscribe to #{agent} - #{type} - #{collective}")
        unless @subscriptions.include?(source[:id])
          Log.debug("Subscribing to #{source[:name]} with headers #{source[:headers].inspect.chomp}")

And we need to know where the log goes, so check the configuration file (or the /proc/[pid]/fd directory, if you want):

vi /opt/rh/ruby193/root/etc/mcollective/server.cfg

topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
libdir = /opt/rh/ruby193/root/usr/libexec/mcollective
logfile = /var/log/openshift/node/ruby193-mcollective.log
loglevel = debug
daemonize = 1
direct_addressing = 1
registerinterval = 30
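
Alternatively, the /proc trick mentioned above gives the same answer; a quick sketch using the pid file from the service's command line:

ls -l /proc/$(cat /opt/rh/ruby193/root/var/run/mcollectived.pid)/fd | grep -i log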

Restart the service :
service ruby193-mcollective restart

View the logs:

[root@broker ~]# cat /var/log/openshift/node/ruby193-mcollective.log | grep XXX
D, [2014-10-04T09:59:22.392472 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to discovery - broadcast - mcollective
D, [2014-10-04T09:59:26.049920 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to openshift - broadcast - mcollective
D, [2014-10-04T09:59:26.095865 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to rpcutil - broadcast - mcollective
D, [2014-10-04T09:59:26.191664 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - broadcast - mcollective
D, [2014-10-04T09:59:26.202263 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - directed - mcollective

There, I found what I came for: the parameters of the subscribe method calls.

The nonintrusive way, but not yet successful

Actually, we are not supposed to hack source code like that. Let's learn the real Ruby debugger.
Check the command line and then stop the service.
[root@broker ~]# ps auxw | grep mcoll
root     17552  0.5  4.3 378212 44520 ?        Sl   09:59   0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
root     19873  0.0  0.0 103240   852 pts/0    S+   10:08   0:00 grep mcoll
[root@broker ~]# service ruby193-mcollective stop
Shutting down mcollective:                                 [  OK  ]
[root@broker ~]# ruby -rdebug  /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
/usr/lib/ruby/1.8/tracer.rb:16: Tracer is not a class (TypeError)
        from /usr/lib/ruby/1.8/debug.rb:10:in `require'
        from /usr/lib/ruby/1.8/debug.rb:10

Oops. Something is wrong. I used the built-in Ruby, which is 1.8, not 1.9.3. Let's try again.

[root@broker ~]# scl enable ruby193 bash
[root@broker ~]# ruby -rdebug  /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
Debug.rb
Emacs support available.

/opt/rh/ruby193/root/usr/sbin/mcollectived:3:require 'mcollective'
(rdb:1)

Now we are in rdb, the Ruby debugger. What are the commands?
(rdb:1) help
Debugger help v.-0.002b
Commands
  b[reak] [file:|class:]<line|method>
  b[reak] [class.]<line|method>
                             set breakpoint to some position
  wat[ch] <expression>       set watchpoint to some expression
  cat[ch] (<exception>|off)  set catchpoint to an exception
  b[reak]                    list breakpoints
  cat[ch]                    show catchpoint
  del[ete][ nnn]             delete some or all breakpoints
  disp[lay] <expression>     add expression into display expression list
  undisp[lay][ nnn]          delete one particular or all display expressions
  c[ont]                     run until program ends or hit breakpoint
  s[tep][ nnn]               step (into methods) one line or till line nnn
  n[ext][ nnn]               go over one line or till line nnn
  w[here]                    display frames
  f[rame]                    alias for where
  l[ist][ (-|nn-mm)]         list program, - lists backwards
                             nn-mm lists given lines
  up[ nn]                    move to higher frame
  down[ nn]                  move to lower frame
  fin[ish]                   return to outer frame
  tr[ace] (on|off)           set trace mode of current thread
  tr[ace] (on|off) all       set trace mode of all threads
  q[uit]                     exit from debugger
  v[ar] g[lobal]             show global variables
  v[ar] l[ocal]              show local variables
  v[ar] i[nstance] <object>  show instance variables of object
  v[ar] c[onst] <object>     show constants of object
  m[ethod] i[nstance] <obj>  show methods of object
  m[ethod] <class|module>    show instance methods of class or module
  th[read] l[ist]            list all threads
  th[read] c[ur[rent]]       show current thread
  th[read] [sw[itch]] <nnn>  switch thread context to nnn
  th[read] stop <nnn>        stop thread nnn
  th[read] resume <nnn>      resume thread nnn
  p expression               evaluate expression and print its value
  h[elp]                     print this help
  <everything else>          evaluate

Let's check out where we are (w).
(rdb:1) w
--> #1 /opt/rh/ruby193/root/usr/sbin/mcollectived:3
(rdb:1)

OK, and list the source code (l):
(rdb:1) l
[-2, 7] in /opt/rh/ruby193/root/usr/sbin/mcollectived
   1  #!
   2
=> 3  require 'mcollective'
   4  require 'getoptlong'
   5
   6  opts = GetoptLong.new(
   7    [ '--help', '-h', GetoptLong::NO_ARGUMENT ],

Step to the next line (n):
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:4:require 'getoptlong'

The execution proceeds to the next line.
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:6:opts = GetoptLong.new(
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:12:if MCollective::Util.windows?
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/util.rb:1:module MCollective

I found it a little strange that the debugger steps into another source file.

(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:15:  configfile = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
(rdb:1) l
[10, 19] in /opt/rh/ruby193/root/usr/sbin/mcollectived
   10  )
   11
   12  if MCollective::Util.windows?
   13    configfile = File.join(MCollective::Util.windows_prefix, "etc", "server.cfg")
   14  else
=> 15    configfile = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
   16  end
   17  pid = ""
   18
   19  opts.each do |opt, arg|

But it quickly returns to the original source.

(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:17:pid = ""
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:19:opts.each do |opt, arg|
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:31:config = MCollective::Config.instance
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:1:module MCollective
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:33:config.loadconfig(configfile) unless config.configured
(rdb:1) n
warn 2014/10/04 10:16:16: config.rb:117:in `block in loadconfig' Use of deprecated 'topicprefix' option.  This option is ignored and should be removed from '/opt/rh/ruby193/root/etc/mcollective/server.cfg'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:16: `' (NilClass)
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
        from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `<main>'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:16:  remove_method :to_yaml rescue nil
(rdb:1)

This is a NilClass error, similar to a NullPointerException, but I could proceed further into the code by repeatedly stepping (n):

(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:17:  alias :to_yaml :psych_to_yaml
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:20:class Module
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:29: `' (NilClass)
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
        from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `<main>'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:29:  remove_method :yaml_as rescue nil
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:30:  alias :yaml_as :psych_yaml_as
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:33:if defined?(::IRB)
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych.rb:12:require 'psych/deprecated'
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/deprecated.rb:79: `' (NilClass)
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
        from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `<main>'
/opt/rh/ruby193/root/usr/share/ruby/psych/deprecated.rb:79:  undef :to_yaml_properties rescue nil

(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych.rb:94:module Psych
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:86:    engine = 'psych'
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:96:module Syck # :nodoc:
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:100:module Psych # :nodoc:
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:104:YAML::ENGINE.yamler = engine
(rdb:1) n
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:10:    class Yaml_facts
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/facts/base.rb:1:module MCollective
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:143:        PluginManager.loadclass("Mcollective::Connector::#{@connector}")
(rdb:1) l
[138, 147] in /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb
   138          if @logger_type == "syslog"
   139            raise "The sylog logger is not usable on the Windows platform" if Util.windows?
   140          end
   141
   142          PluginManager.loadclass("Mcollective::Facts::#{@factsource}_facts")
=> 143          PluginManager.loadclass("Mcollective::Connector::#{@connector}")
   144          PluginManager.loadclass("Mcollective::Security::#{@securityprovider}")
   145          PluginManager.loadclass("Mcollective::Registration::#{@registration}")
   146          PluginManager.loadclass("Mcollective::Audit::#{@rpcauditprovider}") if @rpcaudit
   147          PluginManager << {:type => "global_stats", :class => RunnerStats.new}

It seems that the context of those errors was the facts-loading process. The connector is of particular interest, because:

(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:144:        PluginManager.loadclass("Mcollective::Security::#{@securityprovider}")
(rdb:1) PluginManager["connector_plugin"]
#"unset", "activemq.pool.size"=>"1", "activemq.pool.1.host"=>"broker.openshift.local", "activemq.pool.1.port"=>"61613", "activemq.pool.1.user"=>"mcollective", "activemq.pool.1.password"=>"marionette", "yaml"=>"/opt/rh/ruby193/root/etc/mcollective/facts.yaml"}, @connector="Activemq", @securityprovider="Psk", @factsource="Yaml", @identity="broker.openshift.local", @registration="Agentlist", @registerinterval=30, @registration_collective=nil, @classesfile="/var/lib/puppet/state/classes.txt", @rpcaudit=false, @rpcauditprovider="", @rpcauthorization=false, @rpcauthprovider="", @configdir="/opt/rh/ruby193/root/etc/mcollective", @color=true, @configfile="/opt/rh/ruby193/root/etc/mcollective/server.cfg", @logger_type="file", @keeplogs=5, @max_log_size=2097152, @rpclimitmethod=:first, @libdir=["/opt/rh/ruby193/root/usr/libexec/mcollective"], @fact_cache_time=300, @loglevel="debug", @logfacility="user", @collectives=["mcollective"], @main_collective="mcollective", @ssl_cipher="aes-256-cbc", @direct_addressing=true, @direct_addressing_threshold=10, @default_discovery_method="mc", @default_discovery_options=[], @ttl=60, @mode=:client, @publish_timeout=2, @threaded=false, @logfile="/var/log/openshift/node/ruby193-mcollective.log", @daemonize=true>, @subscriptions=[], @msgpriority=0, @base64=false>

Yes, it loads the Activemq connector. Let's create a breakpoint in the subscribe method:
(rdb:1) b PluginManager["connector_plugin"].subscribe
Set breakpoint 1 at #.subscribe

Then continue (c)..

(rdb:1) c
[root@broker ~]#

Wait. It stops. Browsing the source code tells me that the code forks somewhere after that point, and the forked process seems to be detached from the debugger. So it is a dead end for now.
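
One idea for a next attempt, assuming the fork comes from mcollectived daemonizing itself: set daemonize = 0 in server.cfg so the process stays in the foreground under the debugger. A sketch, untested:

sed -i 's/^daemonize = 1/daemonize = 0/' /opt/rh/ruby193/root/etc/mcollective/server.cfg
scl enable ruby193 'ruby -rdebug /opt/rh/ruby193/root/usr/sbin/mcollectived --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg'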

Well, that's all for the record. I can't promise that there will be a more successful debugging session, but I surely hope there will be one.

How to move an EC2 Instance to another region

In this post I describe the process of moving an EC2 instance to another region.

The background

I have a server in one of the EC2 regions that is a bit pricier than the rest. It seemed that moving it to another region would save me some money. Well, it turns out that I made a few blunders that may have made the savings negligible.

The initial plan

I read that snapshots can be copied to other regions. So the original plan was to create snapshots of the existing volumes that back the instance (I have one instance with three EBS volumes), copy them to another region, and create a new instance in the new region.
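
For reference, the snapshot-copy part of that plan maps to an AWS CLI call along these lines (the snapshot ID and regions here are hypothetical):

aws ec2 copy-snapshot --region eu-west-1 --source-region us-east-1 --source-snapshot-id snap-0123abcd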

The mistake

My mistake was assuming that creating a new instance is a simple matter of selecting the platform (i386 or x86_64) and the root EBS volume. Actually, it is not. First, we create an AMI (Amazon Machine Image) from an EBS snapshot, not an EBS volume. Then we can launch a new instance based on the AMI. When creating a new AMI from a snapshot, we need to choose:

  • Architecture (i386 or x86_64)
  • Root device name - I knew this one
  • RAM disk ID 
  • Virtualization type - I chose paravirtual because that's what the original instance is
  • Kernel ID


The problem is, I could not find a Kernel ID in the new region that matches the Kernel ID in the original region. Choosing the defaults for the Kernel ID and RAM disk ID resulted in an instance that was unable to boot successfully.


The real deal

So, it turns out that I chose the wrong path. From the Instance, I could use Create Image; then, after the image was created, I could copy it to another region.

After copying the image, we could launch a new instance based on the image.

Summary

Now we understand that the most efficient way to move an instance to another region is to create an AMI from the instance, copy the AMI to the other region, and launch a new instance from it there.
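
In AWS CLI terms, the whole flow is roughly as follows (the instance and image IDs here are hypothetical):

aws ec2 create-image --region us-east-1 --instance-id i-0123abcd --name "my-server"
aws ec2 copy-image --region eu-west-1 --source-region us-east-1 --source-image-id ami-1111aaaa --name "my-server"
aws ec2 run-instances --region eu-west-1 --image-id ami-2222bbbb --instance-type t1.micro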



How to Peek inside your ActiveMQ Server

This post describes the steps a sysadmin can take to peek inside an ActiveMQ server. We assume root capability; otherwise we need a user who has access to the ActiveMQ configuration files.

Step 1. Determine the running ActiveMQ process

ps auxw | grep activemq

We get a Java process running ActiveMQ:

[root@broker ~]# ps auxw | grep activemq
activemq  1236  0.1  0.0  19124   696 ?        Sl   07:00   0:02 /usr/lib/activemq/linux/wrapper /etc/activemq/wrapper.conf wrapper.syslog.ident=ActiveMQ wrapper.pidfile=/var/run/activemq//ActiveMQ.pid wrapper.daemonize=TRUE wrapper.lockfile=/var/lock/subsys/ActiveMQ
activemq  1243  3.2 12.2 2016568 125264 ?      Sl   07:00   1:06 java -Dactivemq.home=/usr/share/activemq -Dactivemq.base=/usr/share/activemq -Djavax.net.ssl.keyStorePassword=password -Djavax.net.ssl.trustStorePassword=password -Djavax.net.ssl.keyStore=/usr/share/activemq/conf/broker.ks -Djavax.net.ssl.trustStore=/usr/share/activemq/conf/broker.ts -Dcom.sun.management.jmxremote -Dorg.apache.activemq.UseDedicatedTaskRunner=true -Djava.util.logging.config.file=logging.properties -Dactivemq.conf=/usr/share/activemq/conf -Dactivemq.data=/usr/share/activemq/data -Xmx1024m -Djava.library.path=/usr/share/activemq/bin/linux-x86-64/ -classpath /usr/share/activemq/bin/wrapper.jar:/usr/share/activemq/bin/activemq.jar -Dwrapper.key=zvZTrwPTV6sBMrMd -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=1236 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp org.apache.activemq.console.Main start
root     10249  0.0  0.0 103244   860 pts/0    S+   07:35   0:00 grep activemq

From the result above, we know that the configuration files are in /usr/share/activemq/conf.

Step 2. Determine whether the ActiveMQ console is enabled

vi /usr/share/activemq/conf/activemq.xml

Find the jetty.xml import and make sure that it is enabled; the import line should go from commented out to active.
before:
<!-- <import resource="jetty.xml"/> -->
after:
<import resource="jetty.xml"/>
Check jetty.xml too for the console's port number.
vi /usr/share/activemq/conf/jetty.xml
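
For example, a quick way to spot the port (8161 is the usual default):

grep -i port /usr/share/activemq/conf/jetty.xml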

Step 3. If we changed activemq.xml, restart the service


service activemq restart

Step 4. Obtain admin password

 vi /usr/share/activemq/conf/jetty-realm.properties

Right next to "admin:" is the admin's password.

Step 5. Finally, we could browse to localhost port 8161


If the server is not your localhost, use SSH tunneling to forward port 8161 to 127.0.0.1:8161. Otherwise, just open a browser and type http://localhost:8161/
Use the admin password we got in step 4. No, you must check your own admin password; I won't tell you mine.
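
A typical tunnel, assuming you can SSH into the server:

ssh -L 8161:127.0.0.1:8161 root@[server]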

http://localhost:8161/
Click on the 'Manage ActiveMQ broker'.

Click on Connections in the top menu.
Now we see one client connected to the ActiveMQ server using Stomp. Click on it.


The client, in this case an OpenShift Origin Node running in the same VM as the Broker, is registered as a listener for:

  • Queue mcollective.nodes
  • Topic mcollective.discovery.agent
  • Topic mcollective.mcollective.agent
  • Topic mcollective.rpcutil.agent
  • Topic mcollective.openshift.agent

Summary

In this post, I have shown how to enable the ActiveMQ web console in an ActiveMQ server configuration, and how to use the web console to examine a client connected to the server.

Friday, October 3, 2014

Verification of Node installation in Openshift Origin M4

The Openshift Origin Comprehensive Installation Guideline (http://openshift.github.io/documentation/oo_deployment_guide_comprehensive.html) states that there are several things that can be done to ensure the Node is ready for integration into the OpenShift cluster:

  • built-in script to check the node:
    • oo-accept-node
  • check that facter runs properly:
    • /etc/cron.minutely/openshift-facts
  • check that mcollective communication works:
    • on the broker, run: oo-mco ping
What I found is that this is not enough. For example, openshift-facts shows blanks even if there is an error in the facter functionality. So check facter directly with:
  • facter 
And oo-mco ping works fine even when there is something wrong with the RPC channel. I would suggest running these on the broker:
  • oo-mco facts kernel
  • oo-mco inventory

In one of our OpenShift Origin M4 clusters, I have these lines in /opt/rh/ruby193/root/etc/mcollective/server.cfg:

main_collective = mcollective
collectives = mcollective
direct_addressing = 1

When I changed direct_addressing to 0, the oo-mco facts command didn't work, and neither did oo-admin-ctl-district -c add-node -n -i

On the other cluster, I have these lines :

topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
direct_addressing = 0

And the nodes work, albeit with warnings about topicprefix.

Additional notes:
Facter errors in my VMs (which have eth1 as the only working network interface) were fixed by ensuring that /etc/openshift/node.conf contains these lines:
PUBLIC_NIC="eth1"
EXTERNAL_ETH_DEV="eth1"
INTERNAL_ETH_DEV="eth1"
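
After editing node.conf, a quick re-check along these lines should confirm the fix (facter on the node, the oo-mco command on the broker):

facter kernel ipaddress
oo-mco inventory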