Friday, November 1, 2013

Testing your web app with Selenium

Selenium is a web testing toolkit. This week I tried to use the Selenium tools to create a test scenario that automates the web browser in order to populate a test database. Like many open source tools, it has its rough edges.

My first problem was automating the jQuery UI datepicker. Recording with Selenium IDE gives a result that is not too reliable:


It isn't easy to change the month or day in this test script. Running the script failed with an error message saying that Selenium couldn't find the '1' link. The '1' is actually a clickable td, but the script assumed it was an HTML link (an a tag). After consulting other sources, I replaced the last row with click css=td.day:contains("1"), which sometimes doesn't work correctly either, because it occasionally chooses the 31 cell instead of the 1 cell. After a while I settled on a runScript command that invokes jQuery to set the datepicker value:

runScript $("#Request_kembali").val("2013-01-01");

The second problem is that the application pops up alert boxes that have dynamic ids (it uses the smoke.js library). So the id could be 1001.11 in one run and 2005.13 in another. The solution is to change the locator from id-based to css-based, using css=.dialog-buttons button.
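In the exported Java test, the same css-based locator is a one-liner (a sketch; it assumes the driver variable from the Java examples further below):

    import org.openqa.selenium.By;

    // match the dialog button by its CSS class instead of its dynamic id;
    // this matches any smoke.js dialog regardless of the generated id
    driver.findElement(By.cssSelector(".dialog-buttons button")).click();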

The third problem is that the script works fine when single-stepping in Selenium IDE, but executing the entire test case fails with an 'Element not found' error. The solution is to add a waitForElementPresent command before each click command.
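In the converted Java test, the equivalent fix is an explicit wait before each troublesome click (a sketch using the Selenium 2.x wait API; the element id btnSimpan is hypothetical):

    import org.openqa.selenium.By;
    import org.openqa.selenium.support.ui.ExpectedConditions;
    import org.openqa.selenium.support.ui.WebDriverWait;

    // wait up to 10 seconds for the element to be present, then click it,
    // mirroring the waitForElementPresent + click pair in Selenium IDE
    WebDriverWait wait = new WebDriverWait(driver, 10);
    wait.until(ExpectedConditions.presenceOfElementLocated(By.id("btnSimpan"))).click();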

The fourth problem is that Selenium has a difficult time when the app pops up a dialog box with many widgets, such as a data table and tabs. Because the pop-up dialog is a bit too complex, I cheated by creating a JavaScript function in the app that I invoke using the runScript command in Selenium IDE.

After the test case ran successfully under Selenium IDE, I tried to convert it into a Java-based test case. I used two approaches that work equally well:
A. Create an empty Java project using IntelliJ IDEA (I had endless trouble using Spring Tool Suite), and add two new libraries referring to the Selenium libs folder and the Selenium jar file.
B. Create an empty Maven quickstart archetype project, and add the JUnit and Selenium dependencies.

Both approaches have similar problems:
1. Open commands were converted to get method invocations with the wrong URL; the application folder name was concatenated twice. The solution is to fix the URL manually.
2. More often than not, the click command doesn't work: there is no error, but the browser seems to ignore the click. I used the latest Firefox (25.0) and Selenium Java driver (2.37); possible solutions are to change the font size to 100%, to downgrade the driver to 2.34, or to disable native events support (which is what I chose).

    // disabling native events makes WebDriver synthesize clicks in JavaScript,
    // working around clicks that Firefox 25 silently ignores
    FirefoxProfile profile = new FirefoxProfile();
    profile.setEnableNativeEvents(false);
    driver = new FirefoxDriver(profile);
3. The runScript command didn't get converted. The solution is to convert those commands manually:
        JavascriptExecutor js = (JavascriptExecutor)driver;
        js.executeScript("formClearPemeriksa();");
        // ERROR: Caught exception [ERROR: Unsupported command [runScript | formClearPemeriksa(); | ]]
4. The CSS :contains selector is not supported by WebDriver. There are some workarounds; I managed to replace the selector with other logic, avoiding the issue altogether.
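For example, the td.day:contains("1") locator from the datepicker problem could be rewritten as an XPath locator with an exact text match, which WebDriver does support, and which also removes the 31-versus-1 ambiguity:

    import org.openqa.selenium.By;

    // exact text match: selects the day cell whose text is exactly "1";
    // unlike :contains, this cannot accidentally match the "31" cell
    driver.findElement(By.xpath("//td[contains(@class,'day') and normalize-space(.)='1']")).click();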

But bandwidth issues made running the test on my laptop infeasible, so I tried to install the Selenium Java setup on RHEL/CentOS Linux. On 5.x releases I got stuck on an exception (cannot find shared object) when trying to launch the Firefox (v3.x) process remotely; it seems the built-in Firefox is too old for Selenium. On a 6.x release that I had upgraded to Firefox 17.0, no such issue occurred.
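For reference, the usual way to give Firefox a display on such a headless server is Xvfb; this sketch assumes the Maven project from approach B and the CentOS 6 package name:

    yum install xorg-x11-server-Xvfb
    Xvfb :99 -screen 0 1280x1024x24 &
    export DISPLAY=:99
    mvn test    # FirefoxDriver launches Firefox on the virtual display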


Sunday, October 6, 2013

Installing Saprfc on CentOS 6

Our PHP web servers are powered by Linux, either Red Hat Enterprise Linux or CentOS. Normally we use the 5.x version of CentOS or RHEL, but after a few years of running 5.x we decided that it is now time to upgrade to 6.x.
The first problem we encountered is that the SAPCAR utility shows an error loading shared libraries.

./SAPCAR_1-20002087.EXE RFC_45-10003377.SAR
./SAPCAR_1-20002087.EXE: error while loading shared libraries: libstdc++.so.6: cannot open shared object file: No such file or directory


Checking the EXE file (actually a Linux ELF binary with a misleading extension) with ldd, we find:
[root@myserver opt]# ldd SAPCAR_1-20002087.EXE
        linux-gate.so.1 =>  (0x00e81000)
        libdl.so.2 => /lib/libdl.so.2 (0x00d05000)
        librt.so.1 => /lib/librt.so.1 (0x00317000)
        libstdc++.so.6 => not found
        libm.so.6 => /lib/libm.so.6 (0x003b0000)
        libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x0083d000)
        libpthread.so.0 => /lib/libpthread.so.0 (0x005b7000)
        libc.so.6 => /lib/libc.so.6 (0x00694000)
        /lib/ld-linux.so.2 (0x00672000) 
It seems that libstdc++ is missing.
[root@myserver opt]# rpm -ql libstdc++
/usr/lib64/libstdc++.so.6
/usr/lib64/libstdc++.so.6.0.13
But it is not missing... what's wrong here? Look at the details: ldd resolves the libraries from the /lib folder, while our libstdc++ lives in /usr/lib64. The /lib folder is for 32-bit libraries, so SAPCAR actually requires the 32-bit version. Let's install the 32-bit version:
[root@myserver ~]# yum install libstdc++.i386
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * base: buaya.klas.or.id
 * extras: buaya.klas.or.id
 * updates: buaya.klas.or.id
Setting up Install Process
No package libstdc++.i386 available.
 
OK, maybe this is not the 386 era anymore :
[root@myserver ~]# yum install libstdc++.i686
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * base: buaya.klas.or.id
 * extras: buaya.klas.or.id
 * updates: buaya.klas.or.id
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package libstdc++.i686 0:4.4.7-3.el6 will be installed
--> Finished Dependency Resolution
Error: Protected multilib versions: libstdc++-4.4.7-3.el6.i686 != libstdc++-4.4.6-3.el6.x86_64
 You could try using --skip-broken to work around the problem
 You could try running: rpm -Va --nofiles --nodigest

Yum refuses to install a 32-bit library whose version number differs from that of the 64-bit library, so let's update both libraries together.
[root@myserver ~]# yum install libstdc++.i686 libstdc++.x86_64
Loaded plugins: fastestmirror, refresh-packagekit, security
Loading mirror speeds from cached hostfile
 * base: buaya.klas.or.id
 * extras: buaya.klas.or.id
 * updates: buaya.klas.or.id
Setting up Install Process
Resolving Dependencies
--> Running transaction check
---> Package libstdc++.x86_64 0:4.4.6-3.el6 will be updated
--> Processing Dependency: libstdc++ = 4.4.6-3.el6 for package: gcc-c++-4.4.6-3.el6.x86_64
--> Processing Dependency: libstdc++(x86-64) = 4.4.6-3.el6 for package: libstdc++-devel-4.4.6-3.el6.x86_64
---> Package libstdc++.i686 0:4.4.7-3.el6 will be installed
---> Package libstdc++.x86_64 0:4.4.7-3.el6 will be an update
--> Running transaction check
---> Package gcc-c++.x86_64 0:4.4.6-3.el6 will be updated
---> Package gcc-c++.x86_64 0:4.4.7-3.el6 will be an update
--> Processing Dependency: gcc = 4.4.7-3.el6 for package: gcc-c++-4.4.7-3.el6.x86_64
---> Package libstdc++-devel.x86_64 0:4.4.6-3.el6 will be updated
...
Installed:
  libstdc++.i686 0:4.4.7-3.el6

Updated:
  libstdc++.x86_64 0:4.4.7-3.el6

Dependency Updated:
  cpp.x86_64 0:4.4.7-3.el6              gcc.x86_64 0:4.4.7-3.el6
  gcc-c++.x86_64 0:4.4.7-3.el6          gcc-gfortran.x86_64 0:4.4.7-3.el6
  gcc-java.x86_64 0:4.4.7-3.el6         libgcc.i686 0:4.4.7-3.el6
  libgcc.x86_64 0:4.4.7-3.el6           libgcj.x86_64 0:4.4.7-3.el6
  libgcj-devel.x86_64 0:4.4.7-3.el6     libgfortran.x86_64 0:4.4.7-3.el6
  libgomp.x86_64 0:4.4.7-3.el6          libstdc++-devel.x86_64 0:4.4.7-3.el6

Complete!
 
And retry the SAP RFC SDK extraction:
[root@myserver opt]# ./SAPCAR_1-20002087.EXE -xvf RFC_45-10003377.SAR
SAPCAR: processing archive RFC_45-10003377.SAR (version 2.00)
x rfcsdk
x rfcsdk/bin
x rfcsdk/bin/genh
.
.
x rfcsdk/include/trfcserv.h
x rfcsdk/include/trfctest.h
SAPCAR: 46 file(s) extracted
[root@myserver opt]# mkdir /usr/sap
[root@myserver opt]# mv rfcsdk/ /usr/sap
 

Now download the saprfc extension from http://sourceforge.net/projects/saprfc/files/saprfc/ and extract it:
[root@myserver opt]# tar -xvzf saprfc-1.4.1.tar.gz
saprfc-1.4.1/
saprfc-1.4.1/sapclasses/
saprfc-1.4.1/sapclasses/examples/
saprfc-1.4.1/sapclasses/examples/example_client.php
saprfc-1.4.1/sapclasses/examples/example_connect1.php
.

.
saprfc-1.4.1/test_sso.php
saprfc-1.4.1/trfcserv.php


Run phpize to prepare the PHP extension for compilation.
[root@myserver saprfc-1.4.1]# phpize
Configuring for:
PHP Api Version:         20090626
Zend Module Api No:      20090626
Zend Extension Api No:   220090626


And now run the classic configure and make install:

[root@myserver saprfc-1.4.1]# ./configure
checking for grep that handles long lines and -e... /bin/grep
checking for egrep... /bin/grep -E
checking for a sed that does not truncate output... /bin/sed
checking for cc... cc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
.
.
checking whether stripping libraries is possible... yes
checking if libtool supports shared libraries... yes
checking whether to build shared libraries... yes
checking whether to build static libraries... no
configure: creating ./config.status
config.status: creating config.h
config.status: executing libtool commands

[root@myserver saprfc-1.4.1]# make install
/bin/sh /opt/saprfc-1.4.1/libtool --mode=compile cc  -I. -I/opt/saprfc-1.4.1 -DPHP_ATOM_INC -I/opt/saprfc-1.4.1/include -I/opt/saprfc-1.4.1/main -I/opt/saprfc-1.4.1 -I/usr/include/php -I/usr/include/php/main -I/usr/include/php/TSRM -I/usr/include/php/Zend -I/usr/include/php/ext -I/usr/include/php/ext/date/lib -I/usr/sap/rfcsdk/include  -DHAVE_CONFIG_H  -g -O2   -c /opt/saprfc-1.4.1/saprfc.c -o saprfc.lo
libtool: compile:  cc -I. -I/opt/saprfc-1.4.1 -DPHP_ATOM_INC -I/opt/saprfc-1.4.1/include -I/opt/saprfc-1.4.1/main -I/opt/saprfc-1.4.1 -I/usr/include/php -I/usr/include/php/main -I/usr/include/php/TSRM -I/usr/include/php/Zend -I/usr/include/php/ext -I/usr/include/php/ext/date/lib -I/usr/sap/rfcsdk/include -DHAVE_CONFIG_H -g -O2 -c /opt/saprfc-1.4.1/saprfc.c  -fPIC -DPIC -o .libs/saprfc.o
/opt/saprfc-1.4.1/saprfc.c: In function ‘zif_saprfc_open’:
/opt/saprfc-1.4.1/saprfc.c:481: warning: ‘zend_get_parameters_ex’ is deprecated (declared at /usr/include/php/Zend/zend_API.h:222)
/opt/saprfc-1.4.1/saprfc.c: In function ‘zif_saprfc_function_discover’:
/opt/saprfc-1.4.1/saprfc.c:551: warning: ‘zend_get_parameters_ex’ is deprecated (declared at /usr/include/php/Zend/zend_API.h:222)
/opt/saprfc-1.4.1/saprfc.c:555: warning: ‘zend_get_parameters_ex’ is deprecated (declared at /usr/include/php/Zend/zend_API.h:222)
.
.
Libraries have been installed in:
   /opt/saprfc-1.4.1/modules

If you ever happen to want to link against installed libraries
in a given directory, LIBDIR, you must either use libtool, and
specify the full pathname of the library, or use the `-LLIBDIR'
flag during linking and do at least one of the following:
   - add LIBDIR to the `LD_LIBRARY_PATH' environment variable
     during execution
   - add LIBDIR to the `LD_RUN_PATH' environment variable
     during linking
   - use the `-Wl,-rpath -Wl,LIBDIR' linker flag
   - have your system administrator add LIBDIR to `/etc/ld.so.conf'

See any operating system documentation about shared libraries for
more information, such as the ld(1) and ld.so(8) manual pages.
----------------------------------------------------------------------
Installing shared extensions:     /usr/lib64/php/modules/

Now let's enable the extension and restart httpd:
[root@myserver saprfc-1.4.1]# cd /etc/php.d
[root@myserver php.d]# cat > saprfc.ini
; Enable saprfc extension module
extension=saprfc.so
[root@myserver php.d]# service httpd restart


OK, we still have another problem, namely SELinux:
PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/lib64/php/modules/saprfc.so' - librfccm.so: failed to map segment from shared object: Permission denied in Unknown on line 0

We need to change several SELinux file attributes.
[root@myserver php.d]# cd /usr/lib64/php/modules
[root@myserver modules]# ls -Z
...
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       pdo.so
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       pdo_sqlite.so
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       phar.so
-rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   saprfc.so
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       sqlite3.so
[root@myserver modules]# restorecon -F saprfc.so
...
[root@myserver modules]# ls -Z
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       pdo_sqlite.so
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       phar.so
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       saprfc.so
-rwxr-xr-x. root root system_u:object_r:lib_t:s0       sqlite3.so

And we need to fix the linked libraries as well.
[root@myserver modules]# ldd saprfc.so
        linux-vdso.so.1 =>  (0x00007fffd0dde000)
        librfccm.so => /usr/sap/rfcsdk/lib/librfccm.so (0x00007f8008f27000)
        libc.so.6 => /lib64/libc.so.6 (0x00007f8008b6e000)
        libdl.so.2 => /lib64/libdl.so.2 (0x00007f800896a000)
        librt.so.1 => /lib64/librt.so.1 (0x00007f8008762000)
        libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f8008545000)
        libstdc++.so.5 => /usr/lib64/libstdc++.so.5 (0x00007f800826a000)
        libm.so.6 => /lib64/libm.so.6 (0x00007f8007fe6000)
        libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x00007f8007dcf000)
        /lib64/ld-linux-x86-64.so.2 (0x00000031e3a00000)
[root@myserver modules]# ls -Z  /usr/sap/rfcsdk/lib/
-rw-rw-r--. root root unconfined_u:object_r:usr_t:s0   librfc.a
-rwxrwxr-x. root root unconfined_u:object_r:usr_t:s0   librfccm.so

[root@myserver modules]# restorecon -v -F /usr/sap/rfcsdk/lib/
restorecon reset /usr/sap/rfcsdk/lib context unconfined_u:object_r:usr_t:s0->system_u:object_r:lib_t:s0
[root@myserver modules]# restorecon -v -F /usr/sap/rfcsdk/lib/*.so
restorecon reset /usr/sap/rfcsdk/lib/librfccm.so context unconfined_u:object_r:usr_t:s0->system_u:object_r:lib_t:s0

Restart HTTPD again.
[root@myserver modules]# service httpd restart

And all is well after that. Well, except that to be able to connect to other servers, we need to set an SELinux boolean to allow network connections from Apache:
[root@myserver ~]# getsebool -a | grep httpd | grep network
httpd_can_network_connect --> off
httpd_can_network_connect_cobbler --> off
httpd_can_network_connect_db --> off
httpd_can_network_memcache --> off
httpd_can_network_relay --> off
[root@myserver ~]# setsebool -P httpd_can_network_connect on


The last command took a while on my server, so be patient.
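A quick way to confirm that the extension actually loads is from the command line (assuming the CLI scans the same /etc/php.d):

[root@myserver ~]# php -m | grep saprfc
saprfc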

Bad ELF Interpreter

Another variation of this problem is a 'bad ELF interpreter' error:
rra-mobile ~ # ./SAPCAR_1-20002087.EXE
-bash: ./SAPCAR_1-20002087.EXE: /lib/ld-linux.so.2: bad ELF interpreter: No such file or directory
It turns out that this cryptic error message is caused by the missing 32-bit glibc, so the solution is simple:
rra-mobile ~ # yum install glibc.i686 glibc-common.i686

Silent AVC Denial

Another issue that might block the extension is SELinux denying access to the directory containing librfccm.so. The problem is that the denial is not logged in audit.log (presumably silenced by a dontaudit rule), making troubleshooting difficult if not impossible. The indication is this: after
    service httpd restart
we get this error message in /var/log/httpd/error_log:
[Sun Jan 26 07:30:45 2014] [notice] Digest: done
PHP Warning:  PHP Startup: Unable to load dynamic library '/usr/lib64/php/modules/saprfc.so' - librfccm.so: cannot open shared object file: No such file or directory in Unknown on line 0
But after
    setenforce 0
    service httpd restart
no errors occur.

I fixed this with:
   setenforce 1
   restorecon -vF /usr/sap/rfcsdk
   service httpd restart

And no errors now.
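To actually see such a silent denial the next time, temporarily disabling the dontaudit rules should make it appear in the audit log (these commands are described in the SELinux post below):

   semodule -DB
   service httpd restart
   sealert -a /var/log/audit/audit.log
   semodule -B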

Conclusion

Installing saprfc on CentOS 6 is not for the faint-hearted. The system admin must be well versed in SELinux as well.

Thursday, May 9, 2013

First OpenBravo Experiences

What is Openbravo

OpenBravo is open source ERP software, in the same market as Compiere and TinyERP. This post describes my experiences trying OpenBravo for the Sales Order and Purchase Order processes. The first thing I noticed is that it is built on the Java platform and uses a PostgreSQL database. There is a little inconvenience here because I have never really used PostgreSQL before.

 Creating a new company (Organization)

OpenBravo's concept is similar to SAP's multi-company capability: we can create more than one company (organization in OpenBravo's terms) in the ERP system. But creating a new company has some hidden challenges.
First, a newly created company, while it is incomplete, will be unavailable from some screens. This is a little confusing, but it is to be expected in a complex ERP system.
We can create a new company from the General Setup : Enterprise Model : Initial Organization Setup menu. Filling in the blanks and the checkboxes, we create the company in the system. Afterwards, there is a 'Set as ready' button in the top right corner of the Organization tab; clicking it ensures the company is ready to be used.

First Rule: At least one ancestor must be a legal company
This message was shown to me when trying to set a Generic company to ready. We should either change it to a Legal company or assign a parent that is a legal company; I chose the former.

Second Rule: A company or its parent must have calendar
The alert message refers to the fiscal calendar that must be configured before using the company. I created the fiscal calendar and periods using the Financial Management : Accounting : Setup menu. After the calendar and periods are created, the calendar must be attached to the company by enabling the organization's period control and then selecting the calendar.

Deleting an organization

When I inadvertently created a broken company (organization), I tried to delete it. OpenBravo complained that there were links referencing this entity, so I checked the 'Linked Items' section of the Organization. The linked items contained some rows of data that referenced the company; I deleted them all one by one. But the organization still could not be deleted.

Creating a Purchase Order

The scenario I'm trying is a purchase order where I buy a service from one of my vendors.

Creating a Business Partner

From the Master Data Management : Business Partner menu I tried to create a business partner. It failed because the business partner category was blank and there was no choice in the Business Partner Category drop-down menu.

Creating Business Partner Category

From the Master Data Management : Business Partner Setup : Business Partner Category menu, I tried to create a category. At first I used the direct-on-grid insert, but the resulting business partner category wasn't associated with my new organization, so the create form must be used instead.

Creating a Business Partner (part 2)

After the business partner category was created, the business partner could be created, referencing the category.

Creating the purchase order

The purchase order menu is in Procurement Management : Transactions : Purchase Order. In my first trial, the business partner created before was not available in the Business Partner drop-down menu. It turns out the Customer checkbox is automatically selected when creating a business partner, so I had to deselect it and enable the Vendor checkbox instead. But the Partner Address and Warehouse fields were still marked red, because there was no drop-down item available for them. They are to be created in the next blog post...

Wednesday, January 30, 2013

SE Linux Cure (mini mini post)

I think it's only natural for admins to avoid SELinux. Having SELinux enabled can make your simplest changes result in system failures. Or maybe even no change at all (none that you remember..).

The Cure

In the past, I depended upon a few commands that might shed some light on SELinux troubles.
The commands are:
  1. ls -Z : this parameter shows an additional column, the security context, owned by each file
  2. chcon : this command changes a file's context to the given context argument. For example, chcon -t mysqld_db_t mysql sets the security context of the mysql directory to mysqld_db_t
  3. restorecon : this command restores a file's context to the default
But a recent trouble opened my mind to the fact that more tools are needed. For example, on Ubuntu systems we might need to poke into the /etc/apparmor.d directory and edit the rule files there.
In a recent CentOS trouble, these commands came in handy:
  • yum install setroubleshoot - installs the sealert and semanage tools
  • sealert -a /var/log/audit/audit.log - dumps the audit log as readable messages
  • semanage fcontext - changes many file contexts at once according to a wildcard path expression (see the example after this list)
  • setenforce - it might help to temporarily allow the SELinux violation during troubleshooting: setenforce 0 to allow violations, setenforce 1 to disallow them again
  • semodule -DB - disable dontaudit clauses
  • semodule -B - re-enable dontaudit clauses
  • restorecon -vF filename_or_directory - reset the SELinux context of the file or directory to the default, normally enabling access to a file/directory that would otherwise be denied by SELinux
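For example, to make the rfcsdk context fix from the saprfc post above permanent, so that it survives a full filesystem relabel (a sketch; the path comes from that post):

  semanage fcontext -a -t lib_t "/usr/sap/rfcsdk/lib(/.*)?"
  restorecon -RvF /usr/sap/rfcsdk/lib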

Tuesday, January 29, 2013

MySQL Corrupt Tables and How to Avoid it

Once in a while MySQL tables become corrupted. This post is not about the repair process (you should see other posts for that; the most general advice is to run REPAIR TABLE table;).
In my humble opinion, a real-life production database must not have any corruption; it must have sufficient failsafe mechanisms to avoid such corruption.

Causes of Corruption

MyISAM tables can become corrupted by (see http://dev.mysql.com/doc/refman/5.1/en/corrupted-myisam-tables.html):
  • mysqld being killed in the middle of a write
  • an unexpected computer shutdown
  • hardware failures
  • running an external program (for example, myisamchk) while mysqld is running
  • a software bug in the MySQL/MyISAM code

Tips to mitigate data corruption

Do not use MyISAM for long-lived data; use InnoDB, which is less corruption-prone, and enable the file-per-table option.
Check your disk space and database availability periodically. On one occasion my MySQL data partition was full and I noticed that queries had stopped responding. I deleted some unused log files and MySQL returned to normal service without any corruption; somehow the engine pauses when it runs out of disk space and resumes after space is available.
Turn on binary logging. It helps in disaster forensics (such as when your table has zero rows and you need to find out which app or person is responsible).
Install a secondary MySQL instance as a slave server. If you only have virtual machines, make sure the slave is in another region or on another physical server.
Ensure MySQL's memory usage fits the available memory. This means no other application should be allowed to dynamically eat memory; an out-of-memory condition turns the Linux OS into a process killer, so its probability should be kept as close to zero as possible.
Back up periodically. Daily automated backups are perfect. Think about where the backups will be stored outside the server.
Several of these tips map directly to server configuration, as sketched below.
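A minimal my.cnf sketch combining several of these tips (the option names are valid for MySQL 5.1/5.5; the sizes are placeholders to adapt):

  [mysqld]
  # prefer InnoDB and keep one tablespace file per table
  default-storage-engine  = InnoDB
  innodb_file_per_table   = 1
  # binary logging, for forensics and for feeding a slave server
  log-bin                 = mysql-bin
  expire_logs_days        = 14
  # keep memory usage predictable
  innodb_buffer_pool_size = 1G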

Saturday, January 19, 2013

Learning Cloud Foundry - NATS

All Cloud Foundry components communicate with each other using the NATS publish-subscribe mechanism.
Let's see in detail what NATS is like.

Cloud Foundry Router as NATS client

During startup of the Cloud Foundry Router (router.rb), there is this call to start the NATS client:
NATS.start(:uri => config['mbus'])
It means the NATS client will be started with the uri parameter obtained from the configuration parameter 'mbus'.
config_path = ENV["CLOUD_FOUNDRY_CONFIG_PATH"] || File.join(File.dirname(__FILE__), '../config')
config_file = File.join(config_path, 'router.yml')
...
opts.on("-c", "--config [ARG]", "Configuration File") do |opt|
    config_file = opt
  end
...
We find that the configuration parameters are read from a YAML (YAML Ain't Markup Language) file whose location can come from multiple sources, in this order:
  1. specified on the command line with the -c or --config parameter
  2. the environment variable CLOUD_FOUNDRY_CONFIG_PATH (if it exists) concatenated with '/router.yml'
  3. ../config/router.yml relative to the location of the Ruby file router.rb (currently router/lib/router.rb)
Note that the code doesn't check whether the file exists at all; it only checks for the existence of the configuration switch or environment variable.
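For illustration, a minimal router.yml fragment supplying the NATS uri could look like this (the host and credentials are made up):

# router.yml (fragment)
mbus: nats://nats:password@192.168.1.10:4222/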

After starting the NATS client, the router publishes router start events :

@router_id = VCAP.secure_uuid
@hello_message = { :id => @router_id, :version => Router::VERSION }.to_json.freeze
 
# This will check on the state of the registered urls, do maintenance, etc..
Router.setup_sweepers
 
# Setup a start sweeper to make sure we have a consistent view of the world.
EM.next_tick do
  # Announce our existence
  NATS.publish('router.start', @hello_message)
  # Don't let the messages pile up if we are in a reconnecting state
  EM.add_periodic_timer(START_SWEEPER) do
    unless NATS.client.reconnecting?
      NATS.publish('router.start', @hello_message)
    end
  end
end

This ensures other components realize that there is a new router starting up (and that it might need a situation update).

NATS client module

The NATS client module starts an EventMachine-based network connection to the NATS server given in the uri parameter. But it also has defaults (client.rb):
(line 13@client.rb)
DEFAULT_PORT = 4222
DEFAULT_URI = "nats://localhost:#{DEFAULT_PORT}".freeze
 ...
.. (line 102@client.rb)
opts[:uri] ||= ENV['NATS_URI'] || DEFAULT_URI
...
We see that the uri parameter is obtained from (in this order):
  1. uri parameter when calling NATS.start
  2. NATS_URI environment variable
  3. DEFAULT_URI which is nats://localhost:4222
A publish call is implemented by sending a text line to the NATS server:

# Publish a message to a given subject, with optional reply subject and completion block
# @param [String] subject
# @param [Object, #to_s] msg
# @param [String] opt_reply
# @param [Block] blk, closure called when publish has been processed by the server.
def publish(subject, msg=EMPTY_MSG, opt_reply=nil, &blk)
  return unless subject
  msg = msg.to_s
  # Accounting
  @msgs_sent += 1
  @bytes_sent += msg.bytesize if msg

  send_command("PUB #{subject} #{opt_reply} #{msg.bytesize}#{CR_LF}#{msg}#{CR_LF}")
  queue_server_rt(&blk) if blk
end

The NATS server takes care of delivering the message to all the subscribers that are interested in its subject.
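For illustration, here is roughly what crosses the wire in the NATS text protocol (the subject is real, the payload and subscription id 1 are made up):

SUB router.start 1
PUB router.start 26
{"id":"abc","version":1.0}
MSG router.start 1 26
{"id":"abc","version":1.0}

The subscriber registers subscription 1 on the subject; when another client publishes a 26-byte payload on it, the server forwards the payload to every matching subscription as a MSG line.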

NATS Server


The NATS server is also implemented in Ruby. The source code shows us that the startup sequence is as follows:
  1. do setup using the given command line arguments
  2. start EventMachine on the given host and port parameters, using the NATSD::Connection module to serve connections
  3. if given an http_port parameter, start http monitoring on that port
The server class (which does the setup) is split into a few files: the core server (server.rb) and option handling (options.rb).
The startup options are first read from the command line (see the parser method here); then the server reads a configuration file (if given) for additional parameters. Parameters given on the command line are not overridden; only missing options are read from the YAML-formatted configuration file. The available parameters are:
  1. addr (listen network address)
  2. port
  3. http: net,port,user, password (for http monitoring port)
  4. authorization : user, password, auth_timeout
  5. ssl (will request ssl on connections)
  6. debug
  7. trace
  8. syslog (activate syslogging)
  9. pid_file
  10. log_file
  11. log_time
  12. ping : interval,max_outstanding
  13. max_control_line, max_payload, max_connections (setup limits)
The subscription list is kept in the server class, along with the route_to_subscribers method for sending messages to the registered parties.
The Connection module is the heart of the NATS server's operations. The NATS server can be configured to require authentication or to require SSL connections. The operations it recognizes are:
  1. PUB (publish), which would send a payload to registered parties
  2. SUB (subscribe), which would register the caller to a message subject.
  3. UNSUB (unsubscribe), which unregisters a subscription
  4. PING, which would make the server send a response message
  5. PONG - actually not an operation but a mechanism to ensure client health
  6. CONNECT, to reconfigure verbose and pedantic connection options
  7. INFO, which would make the server send an information json string
  8. CTRL-C/CTRL-D, both of which make the server close the connection.
The server sends ping messages periodically to each client according to Server.ping_interval. If the number of outstanding unanswered pings rises above the max_outstanding parameter, the server tells the client that it is unresponsive and closes the client connection.

Friday, January 11, 2013

Learning Cloud Foundry Router

Background

The Cloud Foundry PaaS platform serves web applications. The Cloud Foundry router component is different from a router in plain IT-talk: it is a Ruby-based software component that determines which backend droplet should serve each and every HTTP request addressed to one of the applications deployed on the platform.

Starting Point

In order to understand the mechanism of the router, we start from nginx, the HTTP server that serves as the primary gatekeeper. The nginx configuration is my primary concern, so let's look at the vcap source code that describes the nginx configuration on each and every Cloud Foundry server. In the vcap/dev_setup/cookbooks/nginx folder we find a templates directory that stores nginx configuration templates.
The templates are, I suppose, used by the dev_setup installation procedure.

Configuration templates

We have the cc-nginx.conf.erb Ruby template along with the router-nginx.conf.erb template and some other files. Let's look at the router configuration (router-nginx.conf.erb).

user root root;
worker_processes 1;

The first two lines say that there is only one worker process. This should not be much of a limitation, because nginx uses an event-driven architecture that can process multiple concurrent connections from a single process/thread. Still, it means that allocating two CPUs to the VM hosting the router component would not be effective, since only one CPU would run the router process.
 
error_log <%= node[:nginx][:log_home] %>/nginx_router_error.log debug;
pid /var/run/nginx_router.pid;

The next two lines describe error logging and the pid file. This is getting less interesting, so let's skip to the routing parts..
    location = /vcapuls {
      internal;
      # We should use rewrite_by_lua to scrub subrequest headers
      # as uls doesn't care those headers at all.
      # Given there are some exceptions to clear some headers,
      # we just leave them as is.
      proxy_pass http://unix:/tmp/router.sock:/;
    }
This defines an nginx-internal URL with the path '/vcapuls'. Another comment says 'upstream locator server', which matches the ULS acronym. So the ULS is a server that locates the upstream server, which is very much the purpose of the Router component.
        set $backend_addr ''; # Backend server address returned from uls for this request
        set $uls_req_tags ''; # Request tags returned from uls for this request to catalog statistics
        set $router_ip '';
        set $timestamp 0;
        set $trace '';
        set $sticky '';
        access_by_lua '
          local uls = require ("uls")
          uls.pre_process_subrequest(ngx, "<%= node[:router][:trace_key] %>")
          local req = uls.generate_uls_request(ngx)
          -- generate one subrequest to uls for querying
          local res = ngx.location.capture(
            "/vcapuls", { body = req }
          )
          uls.post_process_subrequest(ngx, res)
        ';
We find that for a generic URL, nginx consults the /vcapuls location, which was described above as an HTTP pass-through to the Unix socket /tmp/router.sock.
The Lua script inside the nginx configuration file is something new to me. It calls a Lua module named uls. After the Lua script runs, nginx executes an HTTP pass-through to backend_addr. No assignment to backend_addr is visible in the Lua script, so it must be set inside the ULS module.
        proxy_pass http://$backend_addr;
        # Handling response from backend servers
        header_filter_by_lua '
          local uls = require ("uls")
          uls.post_process_response(ngx)
        ';


ULS Module

The ULS module can be found in the vcap router component source code. Its description confirms what ULS stands for:
 -- Description:         Helper for nginx talking to uls(Upstream Locator Server)

 
Two functions in the ULS module are the most important in the routing mechanism: generate_uls_request and post_process_subrequest.

function generate_uls_request(ngx)
  local uls_req_spec = {}
  -- add host in request
  uls_req_spec[uls.ULS_HOST_QUERY] = ngx.var.http_host
  -- add sticky session in request
  local uls_sticky_session = retrieve_vcap_sticky_session(
          ngx.req.get_headers()[COOKIE_HEADER])
  if uls_sticky_session then
    uls_req_spec[ULS_STICKY_SESSION] = uls_sticky_session
    ngx.log(ngx.DEBUG, "req sticks to backend session:"..uls_sticky_session)
  end
  -- add status update in request
  local req_stats = uls.serialize_request_statistics()
  if req_stats then
    uls_req_spec[ULS_STATS_UPDATE] = req_stats
  end
  return cjson.encode(uls_req_spec)
end
function post_process_subrequest(ngx, res)
  if res.status ~= 200 then
    ngx.exit(ngx.HTTP_NOT_FOUND)
  end
  local msg = cjson.decode(res.body)
  ngx.var.backend_addr = msg[ULS_BACKEND_ADDR]
  ngx.var.uls_req_tags = msg[ULS_REQEST_TAGS]
  ngx.var.router_ip = msg[ULS_ROUTER_IP]
  ngx.var.sticky = msg[ULS_STICKY_SESSION]
  ngx.var.app_id = msg[ULS_APP_ID]
  ngx.log(ngx.DEBUG, "route "..ngx.var.http_host.." to "..ngx.var.backend_addr)
end

Reading the code, we understand that the Lua module encodes the requested server name (ngx.var.http_host, which I assume contains the DNS name of the deployed application) into a JSON structure. The post_process_subrequest function decodes the result of calling /vcapuls through the router's Unix socket into the backend_addr nginx variable. I don't really understand the implication of using an nginx variable for this instead of a return value; I hope there are no race conditions here when processing high workloads. The backend_addr variable is then used by the proxy_pass directive in the nginx configuration file to forward the request to the backend server.

ULS Server

The ULS server is written in Ruby using Sinatra (library or framework? I am a noob in Ruby). Let's jump to the source code.
get "/" do

The Ruby source code tells us that this section handles an HTTP GET request at the / URL.
    # Parse request body
    uls_req = JSON.parse(body, :symbolize_keys => true)
    raise ParserError if uls_req.nil? || !uls_req.is_a?(Hash)
    stats, url = uls_req[ULS_STATS_UPDATE], uls_req[ULS_HOST_QUERY]
    sticky = uls_req[ULS_STICKY_SESSION]
    if stats then
      update_uls_stats(stats)
    end

This part parses the request and updates the statistics if the stats flag is set. It seems that the ULS module keeps track of statistics and forwards them to the ULS server.
    if url then
      # Lookup a droplet
      unless droplets = Router.lookup_droplet(url)
        Router.log.debug "No droplet registered for #{url}"
        raise Sinatra::NotFound
      end

The ULS server tries to find which droplets are responsible for the requested URL.
      # Pick a droplet based on original backend addr or pick a droplet randomly
      if sticky
        droplet = droplets.find { |d| d[:session] == sticky }
        Router.log.debug "request's __VCAP_ID__ is stale" unless droplet
      end
      droplet ||= droplets[rand*droplets.size]
      Router.log.debug "Routing #{droplet[:url]} to #{droplet[:host]}:#{droplet[:port]}"
      # Update droplet stats
      update_droplet_stats(droplet)
From these droplets, one is chosen randomly, but the sticky session flag can be used to pin a session to a certain droplet. Reading once more through the ULS module, we find that the sticky session is stored on the client using the __VCAP_ID__ cookie.
But how does the router do the URL lookup? Let's see the Router class.

Router

The router listens on the NATS bus. Judging from the API, it is a messaging bus using a publish/subscribe model. We are most interested in the lookup_droplet method.
    def lookup_droplet(url)
      @droplets[url.downcase]
    end
    def register_droplet(url, host, port, tags, app_id, session=nil)
      return unless host && port
      url.downcase!
      tags ||= {}
      droplets = @droplets[url] || []
      # Skip the ones we already know about..
      droplets.each { |droplet|
        # If we already now about them just update the timestamp..
        if(droplet[:host] == host && droplet[:port] == port)
          droplet[:timestamp] = Time.now
          return
        end
      }
      tags.delete_if { |key, value| key.nil? || value.nil? }
      droplet = {
        :app => app_id,
        :session => session,
        :host => host,
        :port => port,
        :clients => Hash.new(0),
        :url => url,
        :timestamp => Time.now,
        :requests => 0,
        :tags => tags
      }
      add_tag_metrics(tags)
      droplets << droplet
      @droplets[url] = droplets
      VCAP::Component.varz[:urls] = @droplets.size
      VCAP::Component.varz[:droplets] += 1
      log.info "Registering #{url} at #{host}:#{port}"
      log.info "#{droplets.size} servers available for #{url}"
    end

The lookup is simple enough: the @droplets field contains an associative array keyed by lower-cased URL. The register_droplet function tells us that each droplet started by the cloud controller registers itself with the router, specifying the application id, URL, droplet host, and droplet port. The Router stores a mapping from each lower-cased URL to an array of droplets.

In this blog post, we've explored the source of Cloud Foundry's router component and its method of operation, following the flow from the nginx server, through the Lua module and the Ruby ULS server, to the droplet running our application. That's all for now..

Thursday, January 10, 2013

Learning Cloud Foundry PHP staging and deployment

Background

VMware Cloud Foundry is an open source PaaS (Platform as a Service). I see Cloud Foundry as a tool to simplify deploying multiple applications onto multiple servers. This post describes things I found by reading the source code of vcap (VMware Cloud Application Platform) at github here and here.
My point of interest is the PHP deployment capability that exists in vcap, which was contributed by the PHP Fog developers.
My previous exploration of Cloud Foundry resulted in the picture below, which describes my current understanding of the Cloud Foundry platform.


Starting point

The starting point of the PHP support is an interesting commit I had seen before, in which the PHP Fog team implements the PHP functionality. At first it took me about 10 minutes of browsing the vcap network graph (https://github.com/cloudfoundry/vcap/network); then I realized there is a distinct phpfog branch in the vcap git..
The interesting commit is titled 'Support for deploying PHP applications through a standard Apache configuration with built-in support for APC, memcache, mongo and redis', authored by 'cardmagic' about two years ago (see the commit on github here).

Staging

PHP applications are deployed using the vmc client. For now I'll just ignore the client part. The client communicates with the cloud controller, which in turn commands the DEA (Droplet Execution Agent) to deploy applications. The DEA executes the start_droplet function, which invokes the staging plugin associated with the application's runtime.
[maybe I will research the relation between start_droplet and the plugins further]

The PHP plugin (ref) prepares the application by executing this Ruby fragment:
 def stage_application
    Dir.chdir(destination_directory) do
      create_app_directories
      Apache.prepare(destination_directory)
      copy_source_files
      create_startup_script
    end
  end

I am no Ruby programmer, so I sure hope I read this correctly..
First, the plugin changes the current directory to the droplet instance directory. There it calls the create_app_directories method (which I guess creates some required directories), then the prepare method of the Apache class. Reading apache.rb, we learn that Apache::prepare copies apache.zip from the plugin directory and extracts it into the droplet instance directory. apache.zip contains the configuration directories of an Apache httpd server, with some modifications so it honors several environment variables that are injected in apache/envvars below. The generate_apache_conf script is also copied from the plugin resource directory.
I guess the copy_source_files method copies the application source code into the droplet instance directory.
After that, the startup script is created using the startup_script method:
 def startup_script
    vars = environment_hash
    generate_startup_script(vars) do
      <<PHPEOF
env > env.log
ruby resources/generate_apache_conf $VCAP_APP_PORT $HOME $VCAP_SERVICES
PHPEOF
    end
  end
This conveniently executes the generate_apache_conf script, which in turn creates some Apache configuration files and a shell script based on the application's parameters. The files are:
  1. apache/sites-available/default, which defines DocumentRoot, ErrorLog file, log format, and VCAP_SERVICES environment variable
  2. apache/envvars, which defines apache user, group, pid file, base directory
  3. apache/ports.conf, which define the port where apache listens
  4. apache/start.sh, which is the script that would start the apache server in the droplet directory with the created configuration files

 Running the App

Two methods in the PHP plugin tell us how the platform starts the application:
  # The Apache start script runs from the root of the staged application.
  def change_directory_for_start
    "cd apache"
  end
  def start_command
    "bash ./start.sh"
  end
So it starts the application by running apache/start.sh, which was created by the generate_apache_conf script above.

The Most Current Version 

I tried to look for the latest lib/vcap/staging/plugin/php/plugin.rb file. At first I found none, because it has been migrated from the vcap repository to the vcap-staging repository; refer here for the newer version.
I notice an improvement that allows us to define the amount of memory allocated to the PHP application, and also a stop command that invokes kill -9:

  def stop_command
    cmds = []
    cmds << "CHILDPIDS=$(pgrep -P ${1} -d ' ')"
    cmds << "kill -9 ${1}"
    cmds << "for CPID in ${CHILDPIDS};do"
    cmds << " kill -9 ${CPID}"
    cmds << "done"
    cmds.join("\n")
  end
  private
  def startup_script
    generate_startup_script do
      <<-PHPEOF
env > env.log
ruby resources/generate_apache_conf $VCAP_APP_PORT $HOME $VCAP_SERVICES #{application_memory}m
PHPEOF
    end
  end

The kill -9 thing is really handy, because on numerous occasions I have been forced to run such a command manually to stop a stuck/hung PHP process. The generate_apache_conf script is also enhanced to create an additional PHP configuration file (apache/php/memory.ini) which imposes a memory limit:
output_path = 'apache/php/memory.ini'
template = <<-ERB
memory_limit = <%= php_ram %>
ERB

That tells us that the memory limitation is for a single Apache/PHP process. Collective application memory usage can be estimated as n * (php_ram + x), where n is the number of Apache processes running and x is the memory used by Apache itself; for example, with php_ram = 128 MB, x around 20 MB, and all 150 clients busy, the worst case is roughly 150 * 148 MB, or about 22 GB. That makes me wonder about MaxClients in Apache's configuration (in apache.zip); here is the latest version:

    StartServers          5
    MinSpareServers       5
    MaxSpareServers      10
    MaxClients          150
    MaxRequestsPerChild   0

The configuration fragment above essentially says that the number of running Apache child processes can grow from 5 up to 150, with 5 to 10 spare processes kept around when idle.
 
There is also an additional line in the stage_application method to copy the PHP configuration files too:
system "cp -a #{File.join(resource_dir, "conf.d", "*")} apache/php"
This environment variable export in the generate_apache_conf script makes the apache/php directory a source of PHP configuration files:
export PHP_INI_SCAN_DIR=$APACHE_BASEDIR/php
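So adding one more PHP extension should be a matter of dropping another ini file into that directory; a hypothetical example:

; apache/php/memcache.ini
extension=memcache.so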

Finishing remarks

I hope reading this allows us to customize Cloud Foundry's PHP support as needed. I might need to add additional PHP extensions, which must be inserted into PHP's conf.d directory (shown above, copied from the resource directory). It might also be interesting to implement a method to change MaxClients from the cloud API.

Thursday, January 3, 2013

Interfacing Joget Workflow Engine with .NET and PHP

The Joget Workflow Engine has a JSON API that can be used to query and control workflow processes. This is useful if we need to implement a custom front-end UI for Joget workflows, especially if the development team has minimal or even zero Java development experience.

I have created two API test applications: one using PHP and the other written in C# on the .NET Framework.
The PHP version is written using the wonderful Yii Framework.
The C# version is a Windows Forms application, but the WorkflowService class could theoretically be used in an ASP.NET application.

Feel free to download and try ...
To start using these test applications, you must first configure Joget to allow login using a master login username and master password (click System Settings : General Settings, and find the System Administration Settings section). Fill in the master login and password (in my example, superuser and password00). These must match the credentials at the top of protected/components/YREST.php or WorkflowService.cs.

Then you should create a workflow or choose a sample workflow. Open the workflow page (2 Design Apps : [Application Name]) in Joget's Web Console application, and click the 'Show Additional Info' link.
Write down the Process Definition Id, but replace the '#' with a ':' character.
The process definition id can then be used to start a process in the tester application.

Fill in the process definition id on the Joget-start page, and fill in the userid with an existing Joget user id.
Afterwards, you can check that the workflow has been started in the Joget Web Console's Running Processes view.
To query the inbox for a certain user, fill in the userid on the Joget-Inbox page.
To control the workflow, fill in the activity id and userid on the Joget-Complete page. Up to 3 variables can be set from this page.
Of course, all of this can be done from the Joget Web Console's UI. But the purpose of these two test apps is to demonstrate how to access the Joget APIs from the PHP platform and from the .NET platform.