Sunday, December 28, 2014

Openshift Origin - Gear process hangs and ps as root also hangs

Background

A few months ago we installed an OpenShift Origin cluster on a few of our servers. We had tried OpenShift M3 with good results. After upgrading to M4 and deploying several live apps there, we found gears that hang every week or two and that cannot be recovered by stopping and starting the gear. After the third time this got troublesome.

Symptoms

- The application URL doesn't respond
- The haproxy-status URL shows gears in red status before eventually hanging itself
- A full process listing done by root on the node hangs when the process in question is being printed

Investigation

Before we understood the root cause, restarting the VM was sometimes the only solution when the problem occurred. Strace-ing the hanging ps shows that ps stops when trying to read some memory.
A few instances of the problem 'healed' by themselves after one day.
In some instances, killing the process with the cmdline anomaly solved the problem.
The real eye-opener was the blog post Reading /proc/pid/cmdline could hang forever. It never occurred to me that OpenShift Origin disables the kernel's out-of-memory (OOM) killer for its gears, which, when in effect, suspends the task/process that triggers the out-of-memory condition for the cgroup. The keyword is cgroup memory control.

Reconstruction

First let's watch the cgroup memory status. Under /cgroup/memory we have folders; we are interested in the files named memory.oom_control, because these files have an additional line that shows whether an out-of-memory (OOM) condition is in effect.
Monitoring OOM condition from root console
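A quick way to keep an eye on the flag is a one-liner like this (a sketch; it scans every memory cgroup on the node, so adjust /cgroup/memory if your gear cgroups sit in a subdirectory):

watch -n1 'grep -r . /cgroup/memory --include=memory.oom_control'   # refresh the oom flags once per second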
Let's try to consume memory. I chose to use rhc ssh to connect to the gear and, from bash, execute a loop that consumes memory. I used a script I found in a StackOverflow post.
A="0123456789abcdef0123456789abcdef0123456789abcdef0123456789abcdef"
for power in $(seq 8); do
  A="${A}${A}"
done
for power in $(seq 8); do
  A="${A}${A}"
done
for power in $(seq 8); do
  A="${A}${A}"
done

Meanwhile, check for the OOM condition from the root console. When OOM occurred, the rhc ssh session hung and the under_oom flag went to 1.

under_oom flag is now 1
strace ps auxw, hangs in root console
The ps hangs when it tries to read /proc/pid/cmdline. Let's try checking the process list with a different command, making use of the OpenShift gear id whose cgroup has under_oom set to 1.

ps -u works but ps -u -f hangs
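Concretely, something like this (a sketch; the gear id doubles as the gear's unix user name, and the UUID below is the example gear from this post):

ps -u 542e715098988b7c23000009        # works: the default listing shows only the short command name
ps -u 542e715098988b7c23000009 -f     # hangs: the full format has to read /proc/<pid>/cmdline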

Sometimes ^C works, sometimes it doesn't. This is a dangerous condition. Let's re-enable the OOM killer so we can proceed.
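Re-enabling it is a single write to the gear's memory.oom_control file (a sketch, run as root from the gear's memory cgroup directory, as in the console capture below):

echo 0 > memory.oom_control    # 0 = oom_kill_disable off, so the kernel may kill the offending task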

/var/log/messages shows us the result of re-enabling the OOM killer:

Dec 28 23:39:11 node3 kernel: 3687037 total pagecache pages
Dec 28 23:39:11 node3 kernel: 9138 pages in swap cache
Dec 28 23:39:11 node3 kernel: Swap cache stats: add 549704, delete 540566, find 268368/295738
Dec 28 23:39:11 node3 kernel: Free swap  = 33397076kB
Dec 28 23:39:11 node3 kernel: Total swap = 33554428kB
Dec 28 23:39:12 node3 kernel: 4194288 pages RAM
Dec 28 23:39:12 node3 kernel: 111862 pages reserved
Dec 28 23:39:12 node3 kernel: 2181578 pages shared
Dec 28 23:39:12 node3 kernel: 1915782 pages non-shared
Dec 28 23:39:12 node3 kernel: [ pid ]   uid  tgid total_vm      rss cpu oom_adj oom_score_adj name
Dec 28 23:39:12 node3 kernel: [32264]  6248 32264    65906      998   5       0             0 logshifter
Dec 28 23:39:12 node3 kernel: [32265]  6248 32265     5571      436   9       0             0 haproxy
Dec 28 23:39:12 node3 kernel: [32266]  6248 32266     2834      257   8       0             0 bash
Dec 28 23:39:12 node3 kernel: [32267]  6248 32267    49522     1563   6       0             0 logshifter
Dec 28 23:39:12 node3 kernel: [32281]  6248 32281    11893     1993   1       0             0 ruby
Dec 28 23:39:12 node3 kernel: [32368]  6248 32368   150982     4796   4       0             0 httpd
Dec 28 23:39:12 node3 kernel: [32369]  6248 32369    65906     1102   6       0             0 logshifter
Dec 28 23:39:12 node3 kernel: [32378]  6248 32378     1018      117   5       0             0 tee
Dec 28 23:39:12 node3 kernel: [32379]  6248 32379     1018      113   7       0             0 tee
Dec 28 23:39:12 node3 kernel: [32380]  6248 32380   152662     4506   6       0             0 httpd
Dec 28 23:39:12 node3 kernel: [32381]  6248 32381   152662     4504   0       0             0 httpd
Dec 28 23:39:12 node3 kernel: [ 8297]     0  8297    27799     1691   1       0             0 sshd
Dec 28 23:39:12 node3 kernel: [ 8304]  6248  8304    27799     1021   0       0             0 sshd
Dec 28 23:39:12 node3 kernel: [ 8305]  6248  8305   174666   112273   8       0             0 bash
Dec 28 23:39:12 node3 kernel: [ 9207]  6248  9207   150982     1839   7       0             0 httpd
Dec 28 23:39:12 node3 kernel: Memory cgroup out of memory: Kill process 8305 (bash) score 917 or sacrifice child
Dec 28 23:39:12 node3 kernel: Killed process 8305, UID 6248, (bash) total-vm:698664kB, anon-rss:447584kB, file-rss:1508kB

Check the OOM condition:
node3 542e715098988b7c23000009 # cat memory.oom_control
oom_kill_disable 0
under_oom 0
node3 542e715098988b7c23000009 #

We can see that under_oom has returned to 0. After that, I tried ps auxw as root and it no longer hangs.

Conclusion

The hanging of the root console seems to be an unexpected side effect of memory cgroups: they are supposed to suspend the application inside the cgroup, but they also suspend any other task that tries to access the app's memory.
When this condition happens, the solution is either to re-enable the OOM killer (as demonstrated above) or to increase the memory limit. Both steps resume the suspended task and prevent other tasks from hanging. It turns out that this is exactly what the OpenShift watchman does: increase the memory limit by 10% and restart the gear. The logic is in the oom_plugin.
The reason this happened to us is that watchman is not running. That is strange, because the installation team precisely followed the OpenShift Comprehensive Deployment Guide. It seems that the guide is missing the steps to enable the watchman service.
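If watchman is indeed missing, enabling it should be a matter of something like this on each node (a sketch; openshift-watchman is the service name I would expect, so verify it against your installed packages):

service openshift-watchman start
chkconfig openshift-watchman on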






Saturday, December 27, 2014

First steps with Hadoop

Background

I need some improvement in one of the batch processes I run. It was built using PHP, parsing text files into a MySQL database. So for a change I tried to learn Hadoop and Hive.

Installation 1

Hadoop is a bit more complex to install than MySQL. Ok, so 'a bit' is an understatement. I tried to follow the Windows installation procedure from HadoopOnWindows. I downloaded the binary package instead of the source package, because I was not in the mood to wait for mvn to download an endless list of jars. Well, some errors prevented me from continuing down this path.

Installation 2

Virtual machines seem to be the way to go. Not wanting to spend too much time installing and configuring VMs, I installed Vagrant, a tool that downloads images and configures VMs automatically. VirtualBox is required as Vagrant's default provider, so I installed it too.
At first I tried to follow the blog post titled Installing a Hadoop Cluster in Three Commands, but somehow that didn't work either. So the steps below are copied from Gabriele Baldassarre's blog post, which supplies a working Vagrantfile and a few shell scripts:
  • git clone https://github.com/theclue/cdh5-vagrant
  • cd cdh5-vagrant
  • vi Vagrantfile
  • vagrant up
I needed to change the network setting a bit in the Vagrantfile because my server's external network is not DHCP, and bridging is out of the question. What I need is for Vagrant to set up a host-only network that is not internal to VirtualBox. 

Vagrantfile line 43:

Vagrant.configure(VAGRANTFILE_API_VERSION) do |config|
  config.vm.box = "centos65-x86_64-20140116"
  config.vm.box_url = "https://github.com/2creatives/vagrant-centos/releases/download/v6.4.2/centos64-x86_64-20140116.box"
  config.vm.define "cdh-master" do |master|
#    master.vm.network :public_network, :bridge => 'eth0'
    master.vm.network :private_network, ip: "192.168.0.10"
    master.vm.network :private_network, ip: "#{privateSubnet}.#{privateStartingIp}", :netmask => "255.255.255.0", virtualbox__intnet: "cdhnetwork"
    master.vm.hostname = "cdh-master"

So it doesn't matter that there are two private_network entries: the first gives the master a host-only IP reachable from the host, while the second is the internal cluster network.

The scripts set up 80 GB of secondary storage on the cdh-master node, so please be sure you have plenty of free space on the HDD.

After the downloads and configuration are complete, access the Hue interface on the host-only IP, http://192.168.0.10:8888, and create the first user; this user will become the administrator of this Hadoop system.

The problem 

After the installation completed, I found that all of the VMs were running at 100% CPU usage, attributed to the flume user. It turns out that the provisioning script copied the flume configuration file verbatim, which is configured to use a continuous sequence generator as the event source. Changing the event source to syslogtcp and restarting the flume-ng-agent service cures this condition.

UPDATE:
It seems that the provisioned VMs all have the default yarn.nodemanager.resource.memory-mb value, which is 8192 MB. For the 2048 MB VMs, I created this property in /etc/hadoop/conf/yarn-site.xml and set the value to 1600.

UPDATE 2:
Somehow there are misconfigured lines in the created VMs. In yarn-site.xml, I needed to change yarn.nodemanager.remote-app-log-dir from hdfs:///var/log/hadoop-yarn/apps to hdfs://cdh-master:8020/var/log/hadoop-yarn/apps. I also needed to change dfs.namenode.name.dir in /etc/hadoop/conf/hdfs-site.xml to file:///dfs/dn to prevent 'could only be replicated to 0 nodes' errors.

UPDATE 3:
After destroying and recreating all the VMs, it seems that dfs.datanode.data.dir also needs reconfiguring on each of the data nodes; make them all point to file:///dfs/dn.

Installation 3

In parallel, I also tried a lightweight kind of VM, called Docker. The first challenge is that my PC runs Windows, so I installed boot2docker first, enabled VT in the BIOS, and then tried this one-liner from the blog post Ambari provisioned Hadoop cluster on Docker:

curl -LOs j.mp/ambari-singlenode && . ambari-singlenode

It finished, but somehow the web UI shows everything as offline. It needs some debugging to get right, and at the moment I have so little knowledge of what is happening behind the scenes that I am postponing the debugging for later.
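When I get back to it, the first stop will probably be the container itself (a sketch; the container id is a placeholder):

docker ps                      # find the container the one-liner started
docker logs <container-id>     # read the Ambari server/agent output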




Saturday, December 20, 2014

Hex editing in Linux text console using Vim

I don't usually edit binary files. But when there is a need for binary editing, a capable tool is a must-have. In the past I used hexedit in the Linux text console. But yesterday I couldn't seem to find the correct package to install on one of our CentOS servers. To my surprise, Vim is perfectly capable of doing hex editing, if only you know the secret.

VIM binary mode

Did you know that Vim, our reliable text editor, has a binary mode option?

vim -b filename

If we don't enable binary mode, some EOL (end of line) characters will get converted to another form. Binary corruption is possible if Vim is still in text mode.

Convert to hex dump

Use this command to change the file into its hex dump:

 :%!xxd 

Edit the file


Edit the hex part of the file. The ASCII part on the right side will not get converted back to binary, so stick to the left and middle columns of the screen.

Convert back to binary

This command converts the hex dump back to binary:

 :%!xxd -r

Afterwards you can write the file back 

:%wq

Caveat: back up before you edit anything. Binary file editing is an error-prone procedure.
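If you prefer, the same round trip can be done with xxd alone, outside of Vim (a sketch; the file name is a placeholder):

cp blob.bin blob.bin.bak       # keep an untouched copy
xxd blob.bin > blob.hex        # dump to an editable hex listing
vi blob.hex                    # edit the hex columns only
xxd -r blob.hex > blob.bin     # rebuild the binary from the edited dump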

Encountering Zookeeper

I have tried several NoSQL databases, but have yet to see any in a production environment. Well, except for the MongoDB that was part of the OpenShift Origin cluster we installed in our data center. Last week's events made me interact with Apache Zookeeper, which is hidden inside three EMC VIPR controller nodes.

Basic Facts

Apache Zookeeper has the following characteristics:
- in-memory database system
- data is modeled as a tree, like a filesystem
- built using the Java programming language
- usually runs as a cluster of at least 3 hosts
- usually listens on port 2181

The Zookeeper cluster (called an ensemble) is supposed to be resilient to failure. As an in-memory database, it needs more memory than the entire data tree.

Any changes to the database are strictly ordered and coordinated between all nodes in the ensemble. At any time there must be a leader, and all other hosts become followers.

Checking a Zookeeper

Do a telnet to port 2181 and issue the 'ruok' command: type ruok, and a healthy Zookeeper will reply with 'imok' (see the sketch after this list). According to the Zookeeper Admin documentation, the four-letter commands recognized by Zookeeper versions below 3.3 are:
'stat' : print server statistics, a summary of the server and connected clients
'dump' : list outstanding sessions and ephemeral nodes, only works on the leader
'envi' : print details of the running environment
'srst' : reset server statistics
'ruok' : check that the server is running in a non-error state
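Here is the quick health check from a shell, using nc instead of telnet (a sketch; zk-host is a placeholder for one of the ensemble members):

echo ruok | nc zk-host 2181    # a healthy server answers: imok
echo stat | nc zk-host 2181    # shows the server's role (leader/follower), clients and latency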

Bugs

We were recently hit by ZOOKEEPER-1573, in which a Zookeeper is unable to load its database because an operation refers to a child of a data node that doesn't exist. The cause seems to be that Zookeeper snapshots are 'fuzzy': they are written while the tree is updating, so when the transaction log is replayed, some parts have already been applied and other parts have not. The fix seems to be either to upgrade the Zookeeper version so that such an operation is ignored, or to delete the problematic database and rely on another host's database to resynchronize.

Monday, October 20, 2014

Configuring Openshift Origin with S3-based persistent shared storage

This post describes the steps that I took to provide shared storage for an OpenShift Origin M4 installation. There were some difficulties that had to be solved by non-standard methods.

Requirement

When hosting applications on the OpenShift Origin platform, we are confronted with a bitter truth:
writing applications for cloud platforms requires us to avoid writing to local filesystems, and there is no support for storage shared between gears. But we still need to support multiple PHP applications that store their attachments in the local filesystem, with minimal code changes. So we need a way to quickly implement shared storage between gears of the same application. And maybe we could loosen the application isolation requirement just for the shared storage.

Basic Idea

The idea is to mount an S3 API-based storage on all nodes. Each gear could then refer to the application's folder inside the shared storage to store and retrieve file attachments. My implementation uses an EMC VIPR shared storage with an S3 API, which I assume is harder than using real Amazon S3 storage. I used the S3FS implementation from https://github.com/s3fs-fuse/s3fs-fuse to mount the S3 storage as folders.

Pitfalls

OpenShift gears are not allowed to write to arbitrary directories. The gears can't even peek into other gears' directories, which is restricted using SELinux Multi Category Security (MCS). Custom SELinux policies were implemented, complex enough that a run-of-the-mill admin will struggle to understand them. So mounting the S3 storage on the nodes is only half of the battle.
S3FS needs a newer version of Fuse than the one packaged with RHEL 6. And Fuse needs a bit of a patch to allow mounting the S3 storage using contexts other than fusefs_t.
Access control for a given directory is cached for a process's lifetime, so if a running httpd has been denied access, make sure it is restarted after we remount the S3 storage with a different context.
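When chasing the inevitable SELinux denials, these two commands are handy (a sketch; requires auditd to be running):

ausearch -m avc -ts recent            # recent SELinux denial records
ls -Zd /var/[our-shared-folder]       # verify the context the mount ended up with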

Step-by-step

First, make sure that your system is clean of the old fuse package, then download the latest fuse version from SourceForge and extract it.

# wget http://downloads.sourceforge.net/project/fuse/fuse-2.X/2.9.3/fuse-2.9.3.tar.gz
# tar xzf fuse-2.9.3.tar.gz
# cd fuse-2.9.3
# ./configure --prefix=/usr
# export PKG_CONFIG_PATH=/usr/lib/pkgconfig:/usr/lib64/pkgconfig/

We need to add the missing context option to lib/mount.c:
        FUSE_OPT_KEY("default_permissions",     KEY_KERN_OPT),
        FUSE_OPT_KEY("context=", KEY_KERN_OPT),
        FUSE_OPT_KEY("fscontext=",              KEY_KERN_OPT),
        FUSE_OPT_KEY("defcontext=",             KEY_KERN_OPT),
        FUSE_OPT_KEY("rootcontext=",            KEY_KERN_OPT),
        FUSE_OPT_KEY("max_read=",               KEY_KERN_OPT),
        FUSE_OPT_KEY("max_read=",               FUSE_OPT_KEY_KEEP),
        FUSE_OPT_KEY("user=",                   KEY_MTAB_OPT),

The inserted option lines (shown in bold in the original post) are what let the SELinux context mount options pass through. Make the change, save, then compile the whole thing.

# make
# make install
# ldconfig
# modprobe fuse
# pkg-config --modversion fuse

Now we can download the latest s3fs package.

# wget https://github.com/s3fs-fuse/s3fs-fuse/archive/master.zip
# unzip master.zip
# cd s3fs-fuse-master
# ./configure --prefix=/usr/local
# make
# make install

Put your AWS credentials or other access key in the .passwd-s3fs file. The syntax is:
accessKeyId:secretAccessKey
or
bucketName:accessKeyId:secretAccessKey
Ensure that ~/.passwd-s3fs is readable only by the user.

chmod 600 ~/.passwd-s3fs 

Let's mount the S3 storage.

s3fs always-on-non-core /var/[our-shared-folder] -o url=http://[server]:[port]/ -o use_path_request_style -o context=system_u:object_r:openshift_rw_file_t:s0 -o umask=000 -o allow_other

Change the our-shared-folder part to the mount point we want to use, and the server and port parts to the S3 service endpoint. If we are using real S3, we omit the -o url part. You might also want to omit use_path_request_style to use the newer (virtual-hosted) API style; we only need use_path_request_style when using an S3-compatible storage.

Configure the applications

As root in each node, create a folder for the application.
# mkdir /var/[our-shared-folder]/[appid]

Create a .openshift/action_hooks/build file inside the application's git repository, with the u+x bit set.
Fill it with:
#! /bin/bash
ln -sf /var/[our-shared-folder]/[appid] $OPENSHIFT_REPO_DIR/[appfolder]

Change the appfolder part to the folder where we want to store the attachments under the application's root directory. Afterwards we could create a file in the folder using PHP, like:

$f = fopen("/file1.txt","wb");


Reference :
https://code.google.com/p/s3fs/issues/detail?id=170
http://tecadmin.net/mount-s3-bucket-centosrhel-ubuntu-using-s3fs/

Saturday, October 4, 2014

Debugging Ruby code - Mcollective server

In this post I record the steps that I took to debug some Ruby code. The code in question is the Ruby mcollective server that is installed as part of an OpenShift Origin node. The bug is that the server consistently fails to respond to client queries in my configuration. I documented the steps taken even though I haven't nailed the bug yet.

First thing first

First we need to identify the entry point. These commands would do the trick:
[root@broker ~]# service ruby193-mcollective status
mcollectived (pid  1069) is running...
[root@broker ~]# ps afxw | grep 1069
 1069 ?        Sl     0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
12428 pts/0    S+     0:00          \_ grep 1069

We found out that the service is :
  • running with pid 1069
  • running with configuration file /opt/rh/ruby193/root/etc/mcollective/server.cfg
  • service's source code is at /opt/rh/ruby193/root/usr/sbin/mcollectived

The most intrusive way yet the simplest

The simplest way is to insert 'puts' calls inside the code you want to debug. For objects, you want to call the inspect method.

But the code I am interested in is deep inside the call graph of mcollectived. I want to find out the details of the activemq subscription. Skipping hours of skimming the mcollective source (https://github.com/puppetlabs/marionette-collective/) and the OpenShift Origin mcollective server source (https://github.com/openshift/origin-server/tree/master/plugins/msg-node/mcollective), let's jump to the activemq.rb file:
[root@broker ~]# locate activemq.rb
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb

Let's hack some code (if you're doing this for real, make a backup first):
[root@broker ~]# vi /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/connector/activemq.rb

Add some puts calls here and there:
      # Subscribe to a topic or queue
      def subscribe(agent, type, collective)
        source = make_target(agent, type, collective)
        puts "XXXX subscribe to "
        puts agent
        puts type
        puts collective

And... it doesn't work, because the service redirects standard output to /dev/null. Ah. But not far from the make_target call there is a Log.debug call; let's imitate it:
      def subscribe(agent, type, collective)
        source = make_target(agent, type, collective)
        Log.debug("XXXX subscribe to #{agent} - #{type} - #{collective}")
        unless @subscriptions.include?(source[:id])
          Log.debug("Subscribing to #{source[:name]} with headers #{source[:headers].inspect.chomp}")

And we need to know where the log goes. Check the configuration file (or the /proc/[pid]/fd directory, if you want):

vi /opt/rh/ruby193/root/etc/mcollective/server.cfg

topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
libdir = /opt/rh/ruby193/root/usr/libexec/mcollective
logfile = /var/log/openshift/node/ruby193-mcollective.log
loglevel = debug
daemonize = 1
direct_addressing = 1
registerinterval = 30

Restart the service :
service ruby193-mcollective restart

View the logs:
cat /var/log/openshift/node/ruby193-mcollective.log | grep XXX

[root@broker ~]# cat /var/log/openshift/node/ruby193-mcollective.log | grep XXX
D, [2014-10-04T09:59:22.392472 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to discovery - broadcast - mcollective
D, [2014-10-04T09:59:26.049920 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to openshift - broadcast - mcollective
D, [2014-10-04T09:59:26.095865 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to rpcutil - broadcast - mcollective
D, [2014-10-04T09:59:26.191664 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - broadcast - mcollective
D, [2014-10-04T09:59:26.202263 #17552] DEBUG -- : activemq.rb:371:in `subscribe' XXXX subscribe to mcollective - directed - mcollective

There, I found what I came for: the parameters of the subscribe method calls.

The nonintrusive way, but not yet successful

Actually, we are not supposed to hack source code like that. Let's learn the real Ruby debugger.
Check the command line and then stop the service.
[root@broker ~]# ps auxw | grep mcoll
root     17552  0.5  4.3 378212 44520 ?        Sl   09:59   0:03 ruby /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
root     19873  0.0  0.0 103240   852 pts/0    S+   10:08   0:00 grep mcoll
[root@broker ~]# service ruby193-mcollective stop
Shutting down mcollective:                                 [  OK  ]
[root@broker ~]# ruby -rdebug  /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
/usr/lib/ruby/1.8/tracer.rb:16: Tracer is not a class (TypeError)
        from /usr/lib/ruby/1.8/debug.rb:10:in `require'
        from /usr/lib/ruby/1.8/debug.rb:10

Oops. Something is wrong. I used the built-in Ruby, which is 1.8, not 1.9.3. Let's try again.

[root@broker ~]# scl enable ruby193 bash
[root@broker ~]# ruby -rdebug  /opt/rh/ruby193/root/usr/sbin/mcollectived --pid=/opt/rh/ruby193/root/var/run/mcollectived.pid --config=/opt/rh/ruby193/root/etc/mcollective/server.cfg
Debug.rb
Emacs support available.

/opt/rh/ruby193/root/usr/sbin/mcollectived:3:require 'mcollective'
(rdb:1)

Now we are in rdb, the Ruby debugger. What are the commands?
(rdb:1) help
Debugger help v.-0.002b
Commands
  b[reak] [file:|class:]
  b[reak] [class.]
                             set breakpoint to some position
  wat[ch]       set watchpoint to some expression
  cat[ch] (|off)  set catchpoint to an exception
  b[reak]                    list breakpoints
  cat[ch]                    show catchpoint
  del[ete][ nnn]             delete some or all breakpoints
  disp[lay]     add expression into display expression list
  undisp[lay][ nnn]          delete one particular or all display expressions
  c[ont]                     run until program ends or hit breakpoint
  s[tep][ nnn]               step (into methods) one line or till line nnn
  n[ext][ nnn]               go over one line or till line nnn
  w[here]                    display frames
  f[rame]                    alias for where
  l[ist][ (-|nn-mm)]         list program, - lists backwards
                             nn-mm lists given lines
  up[ nn]                    move to higher frame
  down[ nn]                  move to lower frame
  fin[ish]                   return to outer frame
  tr[ace] (on|off)           set trace mode of current thread
  tr[ace] (on|off) all       set trace mode of all threads
  q[uit]                     exit from debugger
  v[ar] g[lobal]             show global variables
  v[ar] l[ocal]              show local variables
  v[ar] i[nstance]  show instance variables of object
  v[ar] c[onst]     show constants of object
  m[ethod] i[nstance]  show methods of object
  m[ethod]    show instance methods of class or module
  th[read] l[ist]            list all threads
  th[read] c[ur[rent]]       show current thread
  th[read] [sw[itch]]  switch thread context to nnn
  th[read] stop        stop thread nnn
  th[read] resume      resume thread nnn
  p expression               evaluate expression and print its value
  h[elp]                     print this help
           evaluate

Let's check out where we are (w).
(rdb:1) w
--> #1 /opt/rh/ruby193/root/usr/sbin/mcollectived:3
(rdb:1)

Ok, and list the source code (l):
(rdb:1) l
[-2, 7] in /opt/rh/ruby193/root/usr/sbin/mcollectived
   1  #!
   2
=> 3  require 'mcollective'
   4  require 'getoptlong'
   5
   6  opts = GetoptLong.new(
   7    [ '--help', '-h', GetoptLong::NO_ARGUMENT ],

Step to next line (n) :
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:4:require 'getoptlong'

The execution proceeds to the next line.
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:6:opts = GetoptLong.new(
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:12:if MCollective::Util.windows?
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/util.rb:1:module MCollective

I found it a little strange that the debugger steps into another source file.

(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:15:  configfile = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
(rdb:1) l
[10, 19] in /opt/rh/ruby193/root/usr/sbin/mcollectived
   10  )
   11
   12  if MCollective::Util.windows?
   13    configfile = File.join(MCollective::Util.windows_prefix, "etc", "server.cfg")
   14  else
=> 15    configfile = "/opt/rh/ruby193/root/etc/mcollective/server.cfg"
   16  end
   17  pid = ""
   18
   19  opts.each do |opt, arg|

But it quickly returns to the original source.

(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:17:pid = ""
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:19:opts.each do |opt, arg|
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:31:config = MCollective::Config.instance
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:1:module MCollective
(rdb:1) n
/opt/rh/ruby193/root/usr/sbin/mcollectived:33:config.loadconfig(configfile) unless config.configured
(rdb:1) n
warn 2014/10/04 10:16:16: config.rb:117:in `block in loadconfig' Use of deprecated 'topicprefix' option.  This option is ignored and should be removed from '/opt/rh/ruby193/root/etc/mcollective/server.cfg'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:16: `' (NilClass)
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
        from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `
'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:16:  remove_method :to_yaml rescue nil
(rdb:1)

This is a NilClass error, similar to a NullPointerException, but I could proceed further into other code by repeatedly pressing n:

(rdb:1)
n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:17:  alias :to_yaml :psych_to_yaml
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:20:class Module
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:29: `' (NilClass)
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
        from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `
'
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:29:  remove_method :yaml_as rescue nil
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:30:  alias :yaml_as :psych_yaml_as
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/core_ext.rb:33:if defined?(::IRB)
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych.rb:12:require 'psych/deprecated'
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych/deprecated.rb:79: `' (NilClass)
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:3:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:2:in `'
        from /opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:1:in `'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `load'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/pluginmanager.rb:169:in `loadclass'
        from /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:142:in `loadconfig'
        from /opt/rh/ruby193/root/usr/sbin/mcollectived:33:in `
'
/opt/rh/ruby193/root/usr/share/ruby/psych/deprecated.rb:79:  undef :to_yaml_properties rescue nil

(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/psych.rb:94:module Psych
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:86:    engine = 'psych'
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:96:module Syck # :nodoc:
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:100:module Psych # :nodoc:
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/yaml.rb:104:YAML::ENGINE.yamler = engine
(rdb:1) n
/opt/rh/ruby193/root/usr/libexec/mcollective/mcollective/facts/yaml_facts.rb:10:    class Yaml_facts
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/facts/base.rb:1:module MCollective
(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:143:        PluginManager.loadclass("Mcollective::Connector::#{@connector}")
(rdb:1) l
[138, 147] in /opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb
   138          if @logger_type == "syslog"
   139            raise "The sylog logger is not usable on the Windows platform" if Util.windows?
   140          end
   141
   142          PluginManager.loadclass("Mcollective::Facts::#{@factsource}_facts")
=> 143          PluginManager.loadclass("Mcollective::Connector::#{@connector}")
   144          PluginManager.loadclass("Mcollective::Security::#{@securityprovider}")
   145          PluginManager.loadclass("Mcollective::Registration::#{@registration}")
   146          PluginManager.loadclass("Mcollective::Audit::#{@rpcauditprovider}") if @rpcaudit
   147          PluginManager << {:type => "global_stats", :class => RunnerStats.new}

It seems that the context of those errors is the facts-loading process. The connector is of particular interest, because:

(rdb:1) n
/opt/rh/ruby193/root/usr/share/ruby/mcollective/config.rb:144:        PluginManager.loadclass("Mcollective::Security::#{@securityprovider}")
(rdb:1) PluginManager["connector_plugin"]
#"unset", "activemq.pool.size"=>"1", "activemq.pool.1.host"=>"broker.openshift.local", "activemq.pool.1.port"=>"61613", "activemq.pool.1.user"=>"mcollective", "activemq.pool.1.password"=>"marionette", "yaml"=>"/opt/rh/ruby193/root/etc/mcollective/facts.yaml"}, @connector="Activemq", @securityprovider="Psk", @factsource="Yaml", @identity="broker.openshift.local", @registration="Agentlist", @registerinterval=30, @registration_collective=nil, @classesfile="/var/lib/puppet/state/classes.txt", @rpcaudit=false, @rpcauditprovider="", @rpcauthorization=false, @rpcauthprovider="", @configdir="/opt/rh/ruby193/root/etc/mcollective", @color=true, @configfile="/opt/rh/ruby193/root/etc/mcollective/server.cfg", @logger_type="file", @keeplogs=5, @max_log_size=2097152, @rpclimitmethod=:first, @libdir=["/opt/rh/ruby193/root/usr/libexec/mcollective"], @fact_cache_time=300, @loglevel="debug", @logfacility="user", @collectives=["mcollective"], @main_collective="mcollective", @ssl_cipher="aes-256-cbc", @direct_addressing=true, @direct_addressing_threshold=10, @default_discovery_method="mc", @default_discovery_options=[], @ttl=60, @mode=:client, @publish_timeout=2, @threaded=false, @logfile="/var/log/openshift/node/ruby193-mcollective.log", @daemonize=true>, @subscriptions=[], @msgpriority=0, @base64=false>

Yes, it loads the Activemq connector. Let's create a breakpoint in the subscribe method:
(rdb:1) b PluginManager["connector_plugin"].subscribe
Set breakpoint 1 at #.subscribe

Then continue (c)..

(rdb:1) c
[root@broker ~]#

Wait. It stops. Browsing some source code tells me that the code forks somewhere after that, and the forked code seems to be detached from the debugger. So it is a dead end for now.

Well, that's all for the record. I can't promise that there will be a more successful debugging session, but I surely hope there will be.

How to move an EC2 Instance to another region

In this post I describe the process of moving an EC2 instance to another region.

The background

I have a server in one of the EC2 regions that is a bit pricier than the rest. It seemed that moving it to another region would save me some bucks. Well, it turns out that I made a few blunders that may have made the savings negligible.

The initial plan

I read that snapshots can be copied to other regions. So the original plan was to create snapshots of the existing volumes that back the instance (I have one instance with three EBS volumes), copy these to another region, and create a new instance in the new region.

The mistake

My mistake was assuming that creating a new instance is a simple matter of selecting the platform (i386 or x86_64) and the root EBS volume. Actually, it is not. First we create an AMI (Amazon Machine Image) from an EBS snapshot, not an EBS volume. Then we can launch a new instance based on that AMI. As shown below, when creating a new AMI from a snapshot we need to choose:

  • Architecture (i386 or x86_64)
  • Root device name - I knew this one
  • RAM disk ID 
  • Virtualization type - I chose paravirtual because that's what the original instance is
  • Kernel ID


The problem is, I could not find a Kernel ID in the new region that matches the Kernel ID in the original region. Choosing the defaults for these two parameters resulted in an instance that was unable to boot successfully.


The real deal

So, it turns out that I chose the wrong path. From the instance, I could simply Create Image, and after the image is created, copy it to another region.






After copying the image, we could launch a new instance based on the image.



Summary

Now we understand that the most efficient way to copy an instance to another region is to create an AMI from the instance, copy it to the other region, and launch the AMI in the new region.
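For the record, the same three steps can be scripted with the AWS CLI (a sketch; ids, names and regions are placeholders):

aws ec2 create-image --region us-east-1 --instance-id i-xxxxxxxx --name "my-server"
aws ec2 copy-image --region eu-west-1 --source-region us-east-1 --source-image-id ami-xxxxxxxx --name "my-server"
aws ec2 run-instances --region eu-west-1 --image-id ami-yyyyyyyy --instance-type t2.micro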



How to Peek inside your ActiveMQ Server

This post describes steps that sysadmins can take to peek inside an ActiveMQ server. We assume root capability; otherwise we need a user that has access to the ActiveMQ configuration files.

Step 1. Determine running ActiveMQ process

ps auxw | grep activemq

We got a java process running ActiveMQ :

[root@broker ~]# ps auxw | grep activemq
activemq  1236  0.1  0.0  19124   696 ?        Sl   07:00   0:02 /usr/lib/activemq/linux/wrapper /etc/activemq/wrapper.conf wrapper.syslog.ident=ActiveMQ wrapper.pidfile=/var/run/activemq//ActiveMQ.pid wrapper.daemonize=TRUE wrapper.lockfile=/var/lock/subsys/ActiveMQ
activemq  1243  3.2 12.2 2016568 125264 ?      Sl   07:00   1:06 java -Dactivemq.home=/usr/share/activemq -Dactivemq.base=/usr/share/activemq -Djavax.net.ssl.keyStorePassword=password -Djavax.net.ssl.trustStorePassword=password -Djavax.net.ssl.keyStore=/usr/share/activemq/conf/broker.ks -Djavax.net.ssl.trustStore=/usr/share/activemq/conf/broker.ts -Dcom.sun.management.jmxremote -Dorg.apache.activemq.UseDedicatedTaskRunner=true -Djava.util.logging.config.file=logging.properties -Dactivemq.conf=/usr/share/activemq/conf -Dactivemq.data=/usr/share/activemq/data -Xmx1024m -Djava.library.path=/usr/share/activemq/bin/linux-x86-64/ -classpath /usr/share/activemq/bin/wrapper.jar:/usr/share/activemq/bin/activemq.jar -Dwrapper.key=zvZTrwPTV6sBMrMd -Dwrapper.port=32000 -Dwrapper.jvm.port.min=31000 -Dwrapper.jvm.port.max=31999 -Dwrapper.pid=1236 -Dwrapper.version=3.2.3 -Dwrapper.native_library=wrapper -Dwrapper.service=TRUE -Dwrapper.cpu.timeout=10 -Dwrapper.jvmid=1 org.tanukisoftware.wrapper.WrapperSimpleApp org.apache.activemq.console.Main start
root     10249  0.0  0.0 103244   860 pts/0    S+   07:35   0:00 grep activemq

From the result above, we know that the configuration file is in /usr/share/activemq/conf

Step 2. Determine whether ActiveMQ console are enabled

vi /usr/share/activemq/conf/activemq.xml

Find the part that imports jetty.xml (typically an <import resource="jetty.xml"/> element) and make sure it is enabled, i.e. not commented out.
Also check jetty.xml for the console's port number.
vi /usr/share/activemq/conf/jetty.xml

Step 3. If we had changed activemq.xml, restart it


service activemq restart

Step 4. Obtain admin password

 vi /usr/share/activemq/conf/jetty-realm.properties

Right next to "admin:" is the admin's password.

Step 5. Finally, we could browse to localhost port 8161


If the server is not your localhost, use SSH tunneling to forward port 8161 to 127.0.0.1:8161; otherwise, just open a browser and go to http://localhost:8161/.
Use the admin password we got in step 4. No, you must check your own admin password; I won't tell you mine.
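A local port forward for that looks like this (a sketch; the host name is a placeholder):

ssh -L 8161:127.0.0.1:8161 root@your-activemq-server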

http://localhost:8161/
Click on the 'Manage ActiveMQ broker'.

home
Click on the Connections on the top menu.
Now we see one client using Stomp connected to the ActiveMQ server. Click on it.


The client, in this case an OpenShift Origin node in the same VM as the broker, is registered as a listener for:

  • Queue mcollective.nodes
  • Topic mcollective.discovery.agent
  • Topic mcollective.mcollective.agent
  • Topic mcollective.rpcutil.agent
  • Topic mcollective.openshift.agent

Summary

In this post, I have shown how to enable the ActiveMQ web console in an ActiveMQ server configuration, and how to use it to examine a client connected to the server.

Friday, October 3, 2014

Verification of Node installation in Openshift Origin M4

The OpenShift Origin Comprehensive Installation Guide (http://openshift.github.io/documentation/oo_deployment_guide_comprehensive.html) states that there are several things that can be done to ensure a node is ready for integration into an OpenShift cluster:

  • built-in script to check the node : 
    • oo-accept-node
  • check that facter runs properly :
    • /etc/cron.minutely/openshift-facts
  • check that mcollective communication works :
    • in the broker, run : oo-mco ping 
What I found is that this is not enough. For example, openshift-facts shows blanks even when there is an error in the facter functionality. So check facter directly with:
  • facter
And oo-mco ping works fine even when there is something wrong with the RPC channel. I would suggest running these on the broker:
  • oo-mco facts kernel
  • oo-mco inventory

In one of our OpenShift Origin M4 clusters, I have these lines in /opt/rh/ruby193/root/etc/mcollective/server.cfg:

main_collective = mcollective
collectives = mcollective
direct_access = 1

When I changed direct_access to 0, the oo-mco facts command doesn't work, and neither does oo-admin-ctl-district -c add-node -n -i.

On the other cluster, I have these lines :

topicprefix = /topic/
main_collective = mcollective
collectives = mcollective
direct_access = 0

And the nodes work, albeit with warnings about topicprefix.

Additional notes :
Facter errors in my VMs (which have eth1 as the only working network interface) were fixed by ensuring that /etc/openshift/node.conf contains these lines:
PUBLIC_NIC="eth1"
EXTERNAL_ETH_DEV="eth1"
INTERNAL_ETH_DEV="eth1"