Posts

Showing posts from 2015

Setting Default Application in Openshift Nodes

Background: The default behavior of Openshift nodes is to redirect requests for unknown applications to host/getting_started.html, which usually causes an endless redirect loop. In some cases we might want to change this behavior, for example when we want a default page (Application not found) to show up, or when tools such as the Acunetix scanner incorrectly flag the redirect as a medium-severity vulnerability because the redirect uses the injected Host header. The Openshift Origin platform used for this article is Origin Release 4, with the nodes using the apache-mod-rewrite frontend plugin (rubygem-openshift-origin-frontend-apache-mod-rewrite-0.7.1.1-1.el6.noarch).

Mechanism: The default mechanism can be read in /etc/httpd/conf.d/000001_openshift_origin_node.conf : As we can see, routes are loaded from openshift_route.include. The file is full of route rules, but the interesting part is the RewriteMap clauses at the top of the file : Nodes and aliases are loaded f

Reclaiming Free Space in Centos 6 / RHEL 6 VirtualBox Disk Images

Background: Virtualization has its benefits, but also its shortfalls. For example, creating and deleting files in a virtual disk can make the disk image larger than the space actually in use. In this case, the disk usage of filesystem /dev/sdb1 in the VM is 45 GB, but the VirtualBox disk image (VDI) size is 57 GB, as shown below : The VDI (cdh-node1_secondary_hdd.vdi) is about 57 GB, yet the usage in /dev/sdb1 is 45 GB, so about 12 GB of space is wasted.

The cure: Reading various posts, the cure seems to consist of two steps : (1) run zerofree to fill unused space with zeros, then (2) run vboxmanage with the option --compact. References can be found in several blog posts (for Vmware : here , here ; for VirtualBox : here and here ). The caveat is that there is no zerofree rpm for RHEL 6. One of the blog posts hinted that we could create one by recompiling the zerofree source RPM (SRPM) for Centos 5.

Running zerofree: First, obtain the source rpm : Using RPM sear
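The two-step cure described in this excerpt can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names are hypothetical, zerofree is assumed to be installed in the guest and run against a filesystem that is unmounted or mounted read-only, and `VBoxManage modifyhd ... --compact` is assumed to be run on the host while the VM is powered off. The sketch only builds the command lines; it does not execute them.

```python
def wasted_gb(image_size_gb, used_gb):
    """Space held by the disk image beyond what the filesystem uses."""
    return image_size_gb - used_gb


def zerofree_command(device):
    """Guest-side step: fill unused blocks of the filesystem with zeros."""
    return ["zerofree", device]


def compact_command(vdi_path):
    """Host-side step: let VirtualBox drop the zeroed blocks from the image."""
    return ["VBoxManage", "modifyhd", vdi_path, "--compact"]


if __name__ == "__main__":
    # Figures and names taken from the excerpt above.
    print("wasted GB:", wasted_gb(57, 45))
    print(" ".join(zerofree_command("/dev/sdb1")))
    print(" ".join(compact_command("cdh-node1_secondary_hdd.vdi")))
```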

Enhancing MySQL Query Performance - July 2015

Background: This post documents the steps that I took in July 2015 to fix a MySQL-related performance problem.

Problem Identification: The symptom is that pages of one application have very long response times. The first thing we check is the CPU usage of the application server, which is < 25%, meaning that there is no issue with the application servers' CPU usage. We also need to check logical limits, in this case Apache's MaxClients, and compare it with the number of concurrent HTTP connections to the server. This is also < 25%. The second thing we check is the database server's CPU usage, and gotcha, it was > 70%. With the application server having almost no load, this means query execution in the database server was not optimal. Next we check the queries.

MySQL Processlist: To examine queries running in a MySQL database server, open the mysql command line client and check the processlist using : SHOW FULL PROCESSLIST \G This should be ru

The mystery of TCP segmentation offload bug

Several incidents share the generic description 'TCP segmentation offload bug' and affect multiple virtualization platforms. The workaround is the same in each case: disable the feature.

Case one
Virtualization Platform : KVM/QEMU
Symptom : Periodically, the guest loses network connectivity after heavy load. Restarting the guest network doesn't fix the problem; the guest is fine again after a reboot.
Reference : https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978/comments/134
Workaround : ethtool -K eth0 tx off sg off tso off ufo off gso off gro off lro off

Case two
Virtualization Platform : Xen
Symptom : DomU hangs after heavy network load (at about 10 MB/s).
Reference : https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978/comments/132
Workaround : disable offloading using ethtool : ethtool --offload <interface> gso off tso off sg off gro off

Case three
Virtualization Platform : VMWare
Symptom : 1. Page could not be displayed after VM migration t
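The workaround in the first two cases comes down to one ethtool call per network interface. A minimal Python sketch, assuming ethtool is installed and the script is run as root; the helper names `build_offload_command` and `disable_offloads` are hypothetical, and the feature list is copied from case one. Some kernels or drivers may refuse to change individual features, in which case ethtool exits with an error.

```python
import subprocess

# Offload features switched off in case one of the post.
CASE_ONE_FEATURES = ["tx", "sg", "tso", "ufo", "gso", "gro", "lro"]


def build_offload_command(interface, features=CASE_ONE_FEATURES):
    """Return the ethtool argument list that switches each feature off."""
    command = ["ethtool", "-K", interface]
    for feature in features:
        command += [feature, "off"]
    return command


def disable_offloads(interface, features=CASE_ONE_FEATURES):
    """Run ethtool (needs root); raises CalledProcessError on failure."""
    subprocess.run(build_offload_command(interface, features), check=True)


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_offload_command("eth0")))
```

Building the argument list separately from running it makes it easy to review the exact command before applying it to every interface of a guest.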

SAP System Copy Procedures

Background: My company uses SAP AG's Enterprise Resource Planning software. It is a complex system, but being developed by Germans has its advantages; for example, they have a lot of documentation in the form of SAP Notes and other material. One of the challenging tasks in using SAP's ERP is system migration and cloning. There are two purposes for migration : first, to move the system to other hardware; second, to get a cloned system to experiment on without compromising the original system. SAP's term for the migration activity is 'System Copy'.

Where to start: To find out the steps needed for a system copy, you could read this slideshare-hosted SAP document here (Best Practices : SAP System Landscape Copy). For another definitive starting point, check out SAP's wiki DOC-8324 System Copy and Migration . I noted that SAP has released Software Provisioning Manager, with which we could do a system copy; my previous system copy experiences have not involved

Cloud Storage Price

This post is a place where I note prices relating to cloud storage.

As a consumer
Amazon :
S3 (Simple Storage Service) : $0.03 /GB/month for first TB (ref : https://aws.amazon.com/s3/pricing/)
Glacier : $0.01 /GB/month
EC2 Elastic Block Storage : SSD $0.1 /GB/month, magnetic $0.05 /GB/month + $0.05 /million IO
Google : (ref : https://cloud.google.com/storage/pricing)
Standard Storage : $0.026 /GB/month
Nearline : $0.01 /GB/month
Microsoft sells : (ref : https://onedrive.live.com/about/en-us/plans/)
15 GB : free
100 GB : $1.99 /month (means about $0.02 /GB/month)
200 GB : $3.99 /month
1 TB : $7 /month (includes Office 365)

As a provider
SwiftStack controller : (http://itknowledgeexchange.techtarget.com/storage-soup/swiftstack-enters-software-defined-storage-race/) subscription : $10 /TB/month (means $0.01 /GB/month)
EMC VIPR : (http://silvertonconsulting.com/blog/2013/09/30/emc-vipr-virtues-vexations-but-no-virtualization/#sthash.CEFmpF86
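The "means $x /GB/month" figures in this list come from dividing the monthly plan price by the plan capacity. A small Python sketch of that arithmetic, assuming decimal units (1 TB = 1000 GB); the function name and the plan table are illustrative, with prices copied from the list above:

```python
GB_PER_TB = 1000  # assumption: decimal terabytes


def price_per_gb(monthly_price_usd, capacity_gb):
    """Monthly price divided by capacity, in USD per GB per month."""
    return monthly_price_usd / capacity_gb


# (label, monthly price in USD, capacity in GB), taken from the post.
PLANS = [
    ("Microsoft 100 GB", 1.99, 100),
    ("Microsoft 200 GB", 3.99, 200),
    ("Microsoft 1 TB", 7.00, 1 * GB_PER_TB),
    ("SwiftStack per TB", 10.00, 1 * GB_PER_TB),
]

if __name__ == "__main__":
    for label, price, capacity in PLANS:
        print(f"{label}: ${price_per_gb(price, capacity):.4f} /GB/month")
```

Normalizing every plan to the same unit makes consumer plans directly comparable with the per-GB prices quoted for S3 and Google.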

The PTR DNS record, Email, SSL and You

What is the relationship between the PTR DNS record, email, and SSL ? This post will answer that. The background for this post : I am unable to make my new SMTP server instance deliver mail to a certain company's mailboxes. The company uses Trend Micro's reputation list to block unwanted senders. I have problems connecting to the SMTP server using SSL from Android, and no problem using my laptop and iPhone. I connect to the server using a load balancer IP address.

The analysis: First, to determine why the emails are not being delivered, I used the 'Delivery Status' display that is available in the Cisco Ironport's Reports menu. It is quite useful for showing the reason why an email is not being delivered. Using the URL shown in the 550 rejection message, I found out that the IP address used by the email server is listed as having a 'Bad' reputation. What is the meaning of DUL ? Quoting from the Trend Micro website : Dynamic
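A PTR record maps an IP address back to a hostname, and it is looked up under a reversed form of the address. The Python sketch below shows both the name that is queried and a lookup through the system resolver. The function names are hypothetical, and the address used is from the documentation range, not the server discussed in the post.

```python
import ipaddress
import socket


def ptr_query_name(ip):
    """Return the DNS name under which the PTR record for an IP is stored."""
    return ipaddress.ip_address(ip).reverse_pointer


def reverse_lookup(ip):
    """Ask the system resolver for the PTR hostname, or None if none is found."""
    try:
        hostname, _aliases, _addresses = socket.gethostbyaddr(ip)
        return hostname
    except (socket.herror, socket.gaierror):
        return None


if __name__ == "__main__":
    # 192.0.2.10 is a documentation-only address used as a placeholder.
    print(ptr_query_name("192.0.2.10"))
```

Running `reverse_lookup` against a mail server's outbound IP address is a quick way to see which hostname, if any, a receiving server would find for it.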

Learn Rails on Windows, part 1

Background: Ruby on Rails is a very well known framework that has inspired many frameworks to follow the Rails way of organizing a web application. The frameworks I have used that are similar to Rails are : in PHP : CakePHP, CodeIgniter, Yii Framework; in Java : Spring Roo. Rails also seems to be used in Openshift Origin for the implementation of Origin's web console and broker API. That, combined with our need to run Openshift Origin in our private cloud, seems a good reason to learn Rails.

The tutorial: Currently I am following the Ruby on Rails tutorial by Michael Hartl ( link ). The book strongly encourages using a Linux-based environment for development, but I am used to working from my Windows machine, so I tried to do development from my Windows laptop.

Preparation: I installed the x64 version of the Ruby 2.0.0 installer from http://rubyinstaller.org/downloads/ for my Windows 8.1 system. I have had cases where the 32 bit installer doesn't work for 32-bit Windows

Processing CSV Files using Hive / Hadoop / HDFS

Background: When there is a need to process large CSV files, Apache Hive is a good option since it allows us to query these files directly. I will try to describe my recent experiences in using Apache Hive. In this case I need to group the rows and count the rows for each group. I will compare it to my existing systems using a MySQL database, one built using PHP and the other built using a combination of Pentaho and PHP.

Installation & Configuration: We have many components in the Hadoop-Hive-HDFS ecosystem : HDFS (Namenode service, Datanode services), MapReduce (ResourceManager service, NodeManager services), Hive, and ZooKeeper. Each component has its own configuration file (or files) and its own log files. For simplicity, in my opinion nothing beats the Apache-MySQL-PHP stack. Minus points for Hadoop-Hive-HDFS from a complexity standpoint. I think we need an additional management layer to be able to cope with the complexity, maybe like Cloudera Manager or Apache Ambari, whi
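The task described in this excerpt, grouping rows and counting the rows in each group, can be illustrated on a small scale in plain Python. This is not the Hive setup from the post, only a sketch of the same aggregation for a file that fits on one machine; the function name, column name, and sample data are hypothetical.

```python
import csv
import io
from collections import Counter


def count_rows_by_group(csv_file, group_column):
    """Count how many rows share each value of group_column."""
    reader = csv.DictReader(csv_file)
    return Counter(row[group_column] for row in reader)


# Hypothetical sample data standing in for a large CSV file.
SAMPLE = "city,amount\nJakarta,10\nBandung,5\nJakarta,7\n"

if __name__ == "__main__":
    counts = count_rows_by_group(io.StringIO(SAMPLE), "city")
    for group, total in counts.most_common():
        print(group, total)
```

The reader streams the file row by row, so memory use depends on the number of distinct groups rather than on the file size; a distributed engine becomes attractive once a single pass on one machine is too slow.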

Openshift Log Aggregation And Analysis using Splunk

Splunk is one of the popular tools we use to analyze log files. In this post I describe how to configure an openshift cluster to send all of the platform log files (mind that this excludes gear log files) to Splunk.

Configure Splunk to listen on TCP port: From the splunk web console home, choose 'Add Data', 'monitor', 'TCP/UDP', fill in port 10514 (TCP), click 'Next', select sourcetype Operating System - linux_messages_syslog.

Configure Rsyslog Forwarding: These steps should be done on every openshift node, openshift broker and console. As root, create an /etc/rsyslog.d/forward.conf file as follows (change splunkserver to your splunk server IP; @@ means TCP, whereas @ means UDP)
$WorkDirectory /var/lib/rsyslog # where to place spool files
$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g   # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on sh