Sunday, November 21, 2010

Installing oci8 on CentOS

Today I finally have the inclination to resume one of my postponed to-dos: installing the Oracle Instant Client and the PHP oci8 extension on two production servers.

References:
http://shamuntoha.wordpress.com/2010/04/12/centos-oracle-php-database-connect-oci-instantclient-oci8/

http://ubuntuforums.org/showthread.php?t=92528

http://www.jagsiacs.co.uk/node/77

Overview
The basic steps are: download and extract the Basic and SDK Instant Client packages, then run pecl install oci8 to download and compile the oci8 extension. I encountered a few issues when installing the PHP oci8 extension:
  1. pear tried to connect directly to the PEAR server. The http_proxy must be set using the pear config-set command.
  2. Missing links. Because I downloaded zip files rather than RPMs, a few symlinks needed to link oci8 against the Instant Client were missing. The solution is to create them manually.
  3. Intermittent compilation problems. Roughly one in two compiles ended in an error, even with exactly the same arguments and environment. So if it fails, retry at least once; there is a good chance the second compile will run smoothly.
  4. SELinux issues. Some chcon commands are needed to allow the execution of the Instant Client libraries. See reference #3.
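For reference, the whole sequence looks roughly like this. It is only a sketch: the /opt/oracle extraction path and the wildcarded zip names are assumptions, so adjust them to whatever you actually downloaded.

unzip instantclient-basic-*.zip -d /opt/oracle    # assumed download names and target path
unzip instantclient-sdk-*.zip -d /opt/oracle
pecl install oci8
# when pecl asks for the ORACLE_HOME directory, answer with the
# instantclient keyword plus the directory the zips created, e.g.:
#   instantclient,/opt/oracle/instantclient_10_2
echo "extension=oci8.so" > /etc/php.d/oci8.ini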
Pear proxy configuration
pear config-set http_proxy http://10.x.x.x:8080

Missing links solution
Run these commands from the directory where the Oracle Instant Client libraries were extracted:
ln -s libclntsh.so.10.1 libclntsh.so
ln -s libocci.so.10.1 libocci.so


SELinux issue solution
I ran these commands to allow execution of the Oracle libraries from the PHP environment:
chcon system_u:object_r:textrel_shlib_t *.so
chcon system_u:object_r:textrel_shlib_t *.10.1

But these might not be enough; so far I only have phpinfo results to confirm that oci8 is indeed loaded by the PHP environment. Stay tuned for more info after we try this configuration.
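For a quick check from the shell, something like this should do; note that the CLI and Apache can read different ini files, so the phpinfo() page is still the definitive test:

service httpd restart
php -m | grep -i oci8    # the loaded-modules list should include oci8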

Wednesday, October 6, 2010

Solving time drift problem on Ubuntu VMWare Guest

I have an Ubuntu VMware guest and was having trouble with time drift.
After a bit of twiddling with NTP synchronization and still getting unacceptable drift,
I finally read a post at http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=471784 :


From:
Kazuhiro NISHIYAMA
To: 471784@bugs.debian.org
Subject: Please recommends open-vm-source
Date: Fri, 28 Mar 2008 21:13:22 +0900
I had a same problem, and resolved following commands:

* sudo aptitude purge open-vm-tools
* sudo reboot
* sudo aptitude install open-vm-source
* pager /usr/share/doc/open-vm-source/README.Debian
* sudo module-assistant prepare open-vm
* sudo module-assistant auto-install open-vm
* sudo aptitude install open-vm-tools
* sudo reboot

I did the steps above, and it seems to work perfectly.
Oh, and I didn't do the reboot steps or the pager one.

EDIT: It seems that this is not enough. I have an AMD Phenom X4 CPU, which has some anomalies with regard to CPU clock speed.
The problem is that the CentOS 4.6 host detected the CPU as running at 1100 MHz, while in reality it runs at 2200 MHz. So, imitating this post, I had to tell VMware Server the real maximum clock:

vi /etc/vmware/config

edit the file so it resembles these lines:

host.cpukHz = 2200000
host.noTSC = "TRUE"
ptsc.noTSC = "TRUE"



And make sure ntpd runs to handle the small drift that still occurs.
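A quick way to verify both fixes, assuming the stock ntp package is in place:

grep -i "cpu mhz" /proc/cpuinfo    # the guest should now report the real 2200 MHz clock
ntpq -p                            # the offset column should stay small once ntpd has synced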

Monday, August 16, 2010

Implementing Multimaster Replication

Today I tried to implement multimaster replication on two Red Hat Enterprise Linux 5 servers.
I studied the top three Google results for 'multimaster replication mysql':
http://onlamp.com/pub/a/onlamp/2006/04/20/advanced-mysql-replication.html
http://dev.mysql.com/doc/refman/5.0/en/replication-howto.html
http://capttofu.livejournal.com/1752.html

I found these three resources complementary. At first I had no idea about the replication process. I initially followed the instructions from the dev.mysql.com site to set up one-way replication, then read the LiveJournal blog for a better one-page summary of the process. From the ONLamp article I learned the syntax to stop replication and to order MySQL to skip one SQL statement when resuming replication.

I'm setting up replication with existing data in the database, so my steps are a little different from the three web resources above. And because I'm using an out-of-the-box RHEL setup, there is an issue with the firewall blocking port 3306.

Unblock the firewall


As root, edit /etc/sysconfig/iptables (use vi or nano or whatever):
> vi /etc/sysconfig/iptables

-A RH-Firewall-1-INPUT -m state --state ESTABLISHED,RELATED -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 22 -j ACCEPT
-A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 3306 -j ACCEPT
-A RH-Firewall-1-INPUT -j REJECT --reject-with icmp-host-prohibited
COMMIT


Insert the third line above (the one that opens TCP port 3306).
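The change only takes effect after the firewall rules are reloaded; on a stock RHEL 5 box that should be:

service iptables restart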

The other steps are similar to what you might read in those three articles:
  1. Add server-id, log-bin, sync_binlog, and innodb_flush_log_at_trx_commit to /etc/my.cnf on server A
    • log-bin=mysql-bin
    • server-id=1
    • sync_binlog=1
    • innodb_flush_log_at_trx_commit=1

  2. service mysqld restart (at A)
  3. create a repl user on A and grant it replication privileges (I omit the host part so I won't be debugging hostname problems; this is safe for my purpose because it is an internal server):
    mysql> CREATE USER 'repl'@'%' IDENTIFIED BY 'slavepass';
    mysql> GRANT REPLICATION SLAVE ON *.* TO 'repl'@'%';

  4. run mysql (client) in one terminal session : mysql> FLUSH TABLES WITH READ LOCK;
  5. in another terminal session, note the current binary log file name and position with mysql> SHOW MASTER STATUS;
  6. mysqldump --all-databases --lock-all-tables >dbdump.db
  7. back to first mysql session, unlock tables :
    mysql> UNLOCK TABLES;
  8. prepare server B's /etc/my.cnf with the full set of options:
    • log-bin=mysql-bin
    • server-id=2
    • sync_binlog=1
    • innodb_flush_log_at_trx_commit=1
    • auto_increment_increment=2
    • auto_increment_offset=2
    • log-slave-updates

    • master-host = ipofserverA
    • master-user = repl
    • master-password = passwordOfRepl

    • log-slow-queries=/var/log/mysql-slow.log
    • long_query_time=1
    • log-queries-not-using-indexes=1

  9. start server B's mysql with --skip-slave-start option :
    /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql --socket=/var/lib/mysql/mysql.sock --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --user=mysql --skip-slave-start
  10. import the database dump on B:
    mysql < dbdump.db
  11. change master on B to refer to server A:
    mysql> CHANGE MASTER TO
    -> MASTER_HOST='master_host_name',
    -> MASTER_USER='replication_user_name',
    -> MASTER_PASSWORD='replication_password',
    -> MASTER_LOG_FILE='recorded_log_file_name',
    -> MASTER_LOG_POS=recorded_log_position;
  12. mysql> START SLAVE;
  13. change A's my.cnf to resemble B's, substituting a different server-id and auto_increment_offset (see the sketch below this list)
  14. service mysqld restart (on A)
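For step 13, here is a sketch of what server A's my.cnf ends up looking like; the IP placeholder and password mirror the ones above and are just illustrations (and note that B needs an equivalent repl user granted REPLICATION SLAVE for A to connect to it):

log-bin=mysql-bin
server-id=1
sync_binlog=1
innodb_flush_log_at_trx_commit=1
auto_increment_increment=2
auto_increment_offset=1
log-slave-updates

master-host = ipofserverB
master-user = repl
master-password = passwordOfRepl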
After these steps, the system choked on the first replicated statement from server B. The solution was to stop the slave on A and run SET GLOBAL SQL_SLAVE_SKIP_COUNTER=1; before starting A's slave again.
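In full, that sequence on A is just the standard slave-control statements:

mysql> STOP SLAVE;
mysql> SET GLOBAL SQL_SLAVE_SKIP_COUNTER = 1;
mysql> START SLAVE;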

Saturday, August 14, 2010

Feedback Cycles - key to timely process delivery

I've recently found an analogy in some events that occurred lately. It's all about feedback. Back when I was doing my undergraduate study in Electronics Engineering, we learned that feedback is an important element of any control system. And now, it seems, it is still an important element if we want our processes to deliver in a timely fashion. Allow me to describe the two events where the analogy shows up.
First, some people were asked to upload some data. But they weren't the ones to upload it into the system; they only had to provide the data file to the uploader. That's where I fit in: uploading the data. Believing it wasn't in my best interest to upload things myself, I delegated the task. After the delegate finished, I notified some people that the task was done. Days passed. The software where the uploads are done has a report to validate the data, but somehow the report returned far too much invalid data. It took us several more days to find out that some of the uploaded data's columns were switched, and also that part of the upload logic had an incorrect mapping. I think things would go a lot faster if we shortened the feedback cycle - for instance, if the data provider were the one who uploads the data and also verifies it in the system. Then I began to think that maybe if I had been the one to upload it things would have been different, but.. maybe not. For sure, too many people in the process chain increase process latency. Feedback that arrives too late makes the process go awry.
Second, a few programmers coded some functionality I had given them in the form of a task list. But I didn't review their work, and the code was written with a lot of performance problems and some mismatches with the actual need. I think if we shortened the feedback cycle things could be better - for example, letting the programmers view the profiler log so they know how many unnecessary queries are being made to the database, or letting the user review each task as it is completed in the application, so the functionality mismatch stays minimal. Waiting for the user to actually need the functionality and complain about something missing is clearly suboptimal. I'm beginning to think that I must adopt Bill Atkinson's philosophy: it's better to have a few features done right than a lot of features done all wrong!
The analogy could go on and on - for example, when we found out that the reconfiguration of a server to accommodate hosting a second app had actually disabled the first app, about two months after the reconfiguration (!). One factor in this incident was that I had no logon for the first app, so I never bothered trying to log on to it. We eventually found the error - at the expense of two months of application downtime and the need to reinitiate the server reconfiguration procedure. There is a great deal of bureaucracy around every operational server here, partly to prevent this kind of thing from happening, but it also makes the recovery cycle far too long. And I'm also wondering why there is no automated script to test the application's logon functionality.
So, feedback in control systems behaves very much the same in business processes, and also in software engineering processes. Feedback cycles are an important element, no matter which context you see them in.

Wednesday, March 10, 2010

Packet Too Large

While importing (restoring) a relatively large MySQL backup file, this error occurred:
packet too large.
After googling for a while I found this page. So the cure is to increase max_allowed_packet in my.ini.
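For the record, the setting lives in the [mysqld] section; the 64M value here is just a figure I'd try first, not something dictated by the error:

[mysqld]
max_allowed_packet=64M

If the restore still fails, the client has a matching option too, e.g. mysql --max_allowed_packet=64M < dump.sql.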

Thursday, March 4, 2010

Don't ever put an online redo log and its mirror on one drive

Everything had been fine for the last year or so, even though the server crashed several times (maybe a bad power line..). Thanks to the RAID5 mechanism (which I must re-add with manual commands each time a drive gets kicked out of the array), no data was lost in any of the past crashes.
On 25 February 2010, my server crashed again. It seems I had overlooked the fact that I placed mirrlog and origlog on one partition. The mirrlogA directory contains a member of redo log group 1 and a member of redo log group 3, and the origlogA directory contains the other member of each of those groups. The file in origlogA is mirrored in mirrlogA, and the thing is, I had symlinked both directories to the same partition (different directories, of course). The better practice is to make those two directories (mirrlogA and origlogA) reside on different physical hard drives. If they reside on the same drive, the probability of both copies of an online redo log getting corrupted becomes significant. And that's exactly what happened: the file in origlogA gave I/O errors when I tried to read it, and the file in mirrlogA was.. corrupted.
OK, so my data wasn't lost.. it just took me many hours puzzling over what should be done to my Oracle database. First I tried to copy the second file over the first one (they should be identical, since they are members of the same group). And Oracle started.. churning happily.. but not for long. As soon as I activated the NetWeaver ABAP app server, errors churned in the alert log, and Oracle ground to a halt.

These two magic lines worked for me to solve this problem :
alter database recover until cancel;
alter database open resetlogs;

And now I have moved the mirrlog to one of the RAID5 partitions.
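For completeness, relocating a mirror member boils down to dropping it and re-adding it at the new location while its group is not current; a sketch with made-up file names:

SQL> ALTER DATABASE DROP LOGFILE MEMBER '/oracle/mirrlogA/log_g1_m2.dbf';
SQL> ALTER DATABASE ADD LOGFILE MEMBER '/raid5/mirrlogA/log_g1_m2.dbf' TO GROUP 1;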

Tuesday, February 16, 2010

Oracle 9.0.1.1 exp bug

I tried to run an intranet app from home, with a database connection from my laptop tunneled to Telkom HQ Data Center.
It seems the application makes too many queries and the data volume is a bit too much (OK, maybe I should profile the data volume and query counts...); it took me an hour of trial and error to realize that it is impossible to run the app over a tunneled connection without timing out.
OK, then I proceeded to think.. If this were my usual kind of application, what would I do? I would fire up MySQL Administrator, back up the externally located database, then restore it to my local MySQL database.
For an Oracle database, the tool alternatives were:
- use TOAD to move the database (which, I think, would be tedious, and I also have no TOAD installed on my system)
- use Oracle Data Pump, which we cannot use because the database is 9.0.1.1, where that technology did not exist yet, and also.. I have no experience with Data Pump at the moment
- migrate to MySQL and then migrate back to Oracle, because each RDBMS vendor provides tools to migrate from almost any database into _their_ RDBMS
- use exp/imp
Migrating back and forth risks inconsistent data types in the target database. So I chose exp/imp.
But.. exp/imp have constraints that must be satisfied:
  1. the version of the exp utility must match the source database
  2. the version of the imp utility must match the target database
  3. the users on the source database and target database must be of similar rank, that is, both SYSDBAs or both normal users
I found these facts out the hard way. Please feel free to comment if any of them are incorrect.
The exp utility installed on my laptop is 10.2.0.1.0, and the server is 9.0.1.1. So I must either (a) find an Oracle client installation at version 9.0.1.1, or (b) run exp on the server.
Running exp on the server, I got these:

EXP-00008: ORACLE error 6550 encountered
ORA-06550: line 1, column 13:
PLS-00201: identifier 'SYS.LT_EXPORT_PKG' must be declared
ORA-06550: line 1, column 7:
PL/SQL: Statement ignored
EXP-00083: The previous problem occurred when calling SYS.LT_EXPORT_PKG.schema_info_exp
. exporting statistics
Export terminated successfully with warnings.


I was puzzled for a moment, and after some googling I found:
GRANT EXECUTE ON sys.lt_export_pkg TO exp_full_database;
or
GRANT EXECUTE ON sys.lt_export_pkg TO PUBLIC;

And now I could export successfully with no strange warnings. I then downloaded the exported DMP file, imported it here using Oracle 10.2's imp, and continued debugging the app..
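For the record, the exp/imp invocations were along these lines; the schema name, password, and file names are made up for illustration:

exp appuser/secret owner=appuser file=appschema.dmp log=exp.log
imp appuser/secret fromuser=appuser touser=appuser file=appschema.dmp log=imp.log

The first runs on the 9.0.1.1 server, the second on my laptop against the local 10.2 database.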

About Oracle and MySQL
Some readers who know me well will wonder why most of my posts are about troubleshooting Oracle, while I am the kind of person who would use MySQL for everything if I could. I think it's not that I'm forced to work with more Oracle databases than I'd like; it's more that I seldom have problems with MySQL - exports/imports run beautifully using MySQL Administrator, and no corrupted data yet when using MySQL.
OK, just to be sure, let's back up my one and only production app running on MySQL :)

Monday, February 8, 2010

saprfc installation on CentOS 5


After hesitating for a while, I struggled to bring saprfc to life on a CentOS 5 installation.
OK, first, the people I'm helping had already installed the RFC SDK in /usr/sap/rfcsdk. I am using the RFC SDK extracted from RFC_45-20000055.SAR, which I downloaded from the SAP Service Marketplace support links (http://service.sap.com/). That SAR file is for the Linux i386 architecture; you might want RFC_45-10003377.SAR for Linux x86_64.
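For anyone starting from the SAR file itself, extraction is done with SAP's SAPCAR tool, roughly like this (the target directory is simply where my colleagues put the SDK, and the archive should unpack into an rfcsdk/ directory):

cd /usr/sap
SAPCAR -xvf /path/to/RFC_45-20000055.SAR    # the SAR location is wherever you saved the download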
Now I need to compile the saprfc extension. I used saprfc-1.4.1, downloaded from http://saprfc.sourceforge.net/. But there was no phpize in the path, so I had to install php-devel first:
> yum install php-devel
But because the proxy setup is not done (our server lives outside the DMZ), I must edit yum.conf first:
> vi /etc/yum.conf
and add one line :
> proxy=http://....
Then I'm ready to try php-devel again.
> yum install php-devel
OK, now to compile saprfc. I read the INSTALL file in a flash and tried these steps (part b in the INSTALL file):
> cd /home/saprfc-1.4.1
> phpize
What? No phpize? And I thought php-devel contains the phpize executable..
It turns out my yum install php-devel had failed. Retrying, now with php-xml and php-snmp (because yum complained that these were not compatible with the version of php-devel I was trying to install):
> yum install php-devel php-xml php-snmp
Hurray.. install complete.
Retry compiling saprfc:
> cd /home/saprfc-1.4.1
> phpize
[root@localhost saprfc-1.4.1]# phpize
Configuring for:
PHP Api Version: 20041225
Zend Module Api No: 20050922
Zend Extension Api No: 220051025
config.m4:30: warning: AC_CANONICAL_HOST invoked multiple times
config.m4:30: warning: AC_CANONICAL_HOST invoked multiple times
I don't know about AC_CANONICAL_HOST; let's ignore those warnings for now.
> ./configure
All seems OK. If there is an error about a missing gcc, you should
>yum install gcc
If not, just proceed to make..
> make install
It compiles and installs saprfc.so into /usr/lib/php/modules.
I took the liberty of writing this ini file, /etc/php.d/saprfc.ini:
; Enable saprfc extension module
extension=saprfc.so
Now restart Apache and pray..
> /etc/init.d/httpd restart

PHP Warning: PHP Startup: Unable to load dynamic library '/usr/lib/php/modules/librfccm.so' - /usr/lib/php/modules/librfccm.so: cannot open shared object file: No such file or directory in Unknown on line 0
PHP Warning: PHP Startup: Unable to load dynamic library '/usr/lib/php/modules/saprfc.so' - libstdc++-libc6.2-2.so.3: cannot open shared object file: Too many levels of symbolic links in Unknown on line 0
Wonderful. Things got complicated from this moment on.

I think I found the problem:
[root@localhost lib]# ldd /usr/lib/librfccm.so
linux-gate.so.1 => (0x0035e000)
libdl.so.2 => /lib/libdl.so.2 (0x004ae000)
librt.so.1 => /lib/librt.so.1 (0x006de000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00110000)
libc.so.6 => /lib/libc.so.6 (0x00127000)
libstdc++-libc6.2-2.so.3 => not found
libm.so.6 => /lib/libm.so.6 (0x0026a000)
/lib/ld-linux.so.2 (0x00896000)
[root@localhost lib]# locate libstdc++-libc6.2-2.so.3
/usr/lib/libstdc++-libc6.2-2.so.3
[root@localhost lib]# ls -l /usr/lib/libstdc++-libc6.2-2.so.3
lrwxrwxrwx 1 root root 18 Jan 26 10:47 /usr/lib/libstdc++-libc6.2-2.so.3 -> libstdc++.so.5.0.5
[root@localhost lib]# locate libstdc++.so.5.0.5
/usr/lib/libstdc++.so.5.0.5
[root@localhost lib]# ls -l /usrlib/libstdc++.so.5.0.5
ls: /usrlib/libstdc++.so.5.0.5: No such file or directory
[root@localhost lib]# ls -l /usr/lib/libstdc++.so.5.0.5
lrwxrwxrwx 1 root root 18 Jan 26 10:50 /usr/lib/libstdc++.so.5.0.5 -> libstdc++.so.5.0.5

Yes. An endless chain of links.
I tried to fix this by updating compat-libstdc++-33 and compat-libstdc++-296 (yes, I only know that the currently installed version of libstdc++ is libstdc++.so.6.x, and I have no idea which of the compat packages contains the 5.0.5 version):

yum update compat-libstdc++-33 compat-libstdc++-296
...

[root@localhost ~]# ldd /usr/lib/librfccm.so
linux-gate.so.1 => (0x00854000)
libdl.so.2 => /lib/libdl.so.2 (0x00110000)
librt.so.1 => /lib/librt.so.1 (0x00114000)
libpthread.so.0 => /lib/libpthread.so.0 (0x00811000)
libc.so.6 => /lib/libc.so.6 (0x00358000)
libstdc++-libc6.2-2.so.3 => /usr/lib/libstdc++-libc6.2-2.so.3 (0x0011d000)
libm.so.6 => /lib/libm.so.6 (0x0015f000)
/lib/ld-linux.so.2 (0x00896000)
libgcc_s.so.1 => /lib/libgcc_s.so.1 (0x00186000)

everything seems in order.

> /etc/init.d/httpd restart

and now... drum roll please..