This time I got (another) annoying RAID5 failure. The CentOS 4.7 server wouldn't boot because it was unable to start the RAID5 array. Yes, this is the second time I have stumbled upon this problem (see this Indonesian-written post). I burned a new CentOS 4.7 DVD (using a new REAL server's DVD writer, no less), booted from the DVD, typed linux rescue at the boot prompt, and tried to follow exactly the same steps I had done and written up in this blog, but without success: the system complained that the superblock doesn't match.
It seems I had forgotten the new RAID5 configuration on this server. I forgot that I had reinstalled this server with SAP ERP Netweaver, creating two software RAID5 arrays in the process, and of course with different partitions.
The partitions were sda3, sdb3, sdc1, and sdd2. The four partitions made up a 215-megablock (that's about 100 GB, I think) md1 device. Here's the chemistry:
- The kernel won't add a non-fresh member (sdd2) to the array; it kicks it out of the RAID assembly.
- The remaining RAID assembly of three partitions couldn't be started. The cause, which I found out only after forcing the array to run, is that the event counter in sda3 is not the same as in the others. But the kernel said nothing about this in the dmesg log. It just said 'unable to start degraded array ..'
- I forced the assembly to run. This must be done while the md device is stopped. So: mdadm -S /dev/md1, then mdadm -A --force --run /dev/md1 /dev/sda3 /dev/sdb3 /dev/sdc1 /dev/sdd2. It does run, writing error messages about sda3.
- But sdd2 was still kicked out of the array. I had to add it back manually: mdadm -a /dev/md1 /dev/sdd2
Now I am just waiting for the recovery to finish (the recovery status can be read in /proc/mdstat), so I can boot this system with confidence. I hope nothing else has gone wrong.
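While waiting, the recovery progress can be pulled out of /proc/mdstat with a few lines of Python. This is only a rough sketch: the sample text below is an illustrative mdstat layout I typed in as an assumption (not the real output from this server), and the function name recovery_percent is made up for this example.

```python
import re
from typing import Optional

# Illustrative /proc/mdstat content (an assumption, not captured from the server).
SAMPLE_MDSTAT = """\
Personalities : [raid5]
md1 : active raid5 sdd2[4] sda3[0] sdc1[2] sdb3[1]
      215045376 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
      [=>...................]  recovery =  8.5% (6120448/71681792) finish=45.2min speed=24150K/sec

unused devices: <none>
"""

def recovery_percent(mdstat_text: str, device: str = "md1") -> Optional[float]:
    """Return the recovery/resync percentage for one md device, or None if idle."""
    in_device = False
    for line in mdstat_text.splitlines():
        # A device section starts with an unindented "mdN :" header line.
        if re.match(r"^md\d+\s*:", line):
            in_device = line.split()[0] == device
        elif in_device:
            match = re.search(r"(?:recovery|resync)\s*=\s*([\d.]+)%", line)
            if match:
                return float(match.group(1))
    return None

if __name__ == "__main__":
    # On the real machine, read the live file instead of the sample:
    #   with open("/proc/mdstat") as f: text = f.read()
    print(recovery_percent(SAMPLE_MDSTAT, "md1"))
```

When the function returns None for md1, there is no recovery line left for that device, which is the point where I would dare to reboot.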
Saturday, November 14, 2009
Saturday, October 10, 2009
Java-based Domain Driven Frameworks
Yesterday I was browsing and exploring full-stack domain-driven application frameworks written for the Java platform. The Java platform still has a lot of potential, but compared with the Rails framework, my exploration yesterday tells me that we're still far from it.
Here is a brief summary of the frameworks I read about:
- JMatter - I already used this one a few months back. The UI is pretty decent, but the lack of a production-quality web-based view technology puts it far behind the others. It currently only has a production-grade Swing viewer, making it a two-tier application platform. Security-wise, two-tier platforms are not good: each client must have a database connection to the DB server, which could potentially be abused. It has no authorization code built in. Actions on an entity are written in the domain model, encapsulating the business logic.
- OpenXava - Web-based GUI with some AJAX parts. Has no authorization code built in. Separates actions from the domain model; the concept is that actions must be contained in controllers.
- Jspresso - Has multiple view (GUI) technologies implemented: ULC, Flex, and WingS. Flex is Adobe's RIA technology, WingS is an AJAX web-based view technology. I have no experience with ULC. The reference manual is somewhat incomplete. The domain model is not written in Java; it is written in Spring beans XML. I haven't figured out how to write business logic, but the entity relationship stuff seems to be complete. Has class-based authorization and dynamic authorization (the docs are very hazy about this one; it seems to be authorization based on the object's state, but I haven't found out how to implement authorization based on the object's owner).
- Nakedobjects - Currently only has a Swing/AWT viewer (sorry, I haven't had time to find out which is which) and an HTML (web-based) viewer. The domain model is written in Java and contains the business logic (similar to JMatter). Class-based authorization.
- Trails - Only has a Tapestry viewer, which generates a web-based GUI. The domain model is in Java. Seems to be the only one that implements both class-based authorization and association-based authorization.
Saturday, September 12, 2009
HTB traffic control burst calculation inaccuracy
My previous Kubuntu 7.04 system's Linux 2.6.20 kernel uses a 1000 Hz timer (refer to the HZ constant definition and to the fourth value of the /proc/net/psched kernel ABI, produced by the routine psched_show in sch_api.c). HTB's burst is calculated by the tc binary using the formula: bitrate / timer frequency + maximum transfer unit.
Ubuntu 8.04's Linux 2.6.24 kernel has an improved packet scheduler timer resolution. It now uses a high-resolution timer with nanosecond accuracy. Refer to psched_show in sch_api.c: the fourth value of /proc/net/psched now returns 1 G (10^9). The HTB burst calculation is now bitrate / 10^9 + mtu. This will always be a small value, barely larger than the MTU, and as a side effect the HTB QoS scheduler is no longer capable of delivering high bitrates accurately to stations.
My analysis, based on my limited knowledge of the packet scheduler: Linux's packet scheduler is driven by calls to the dequeue function. These might be triggered by the tx-complete interrupt or something else, so it is not timer-driven. But the packet scheduler routines (such as sch_htb) keep track of passing time using the value of the packet scheduler timer (previously the 1 kHz timer, and now the 1 GHz timer). HTB adds tokens to the token bucket based on the time passed between the previously recorded time of change in the class (cl->t_c) and the current time (q->now) (see sch_htb.c:660). Herein lies the problem: q->now is no longer really the current time, because the timer resolution is in nanoseconds and there are tens if not hundreds of instructions executed between the assignment of q->now (see htb_dequeue at sch_htb.c:894) and its use. And maybe some interrupt has occurred in between (I don't really know; dequeue is not called in the context of an NMI, is it?), and maybe one millisecond has passed. The inaccuracy in the amount of tokens added causes HTB to prevent packets from being sent in a timely manner: it thinks that the class is not eligible to send a packet because the remaining tokens are fewer than the packet size, when the token count is actually lower than what it is supposed to be.
This is what I think, for now. The cure seems to be changing the tc source code to assume a 1 kHz timer (which amounts to 1 ms accuracy) when tc finds out that a nanosecond timer is in use, so that the token buffer is large enough to absorb the problems caused by the timing inaccuracies described above.
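To put numbers on the formula, here is a small Python sketch of the default burst arithmetic. It rests on assumptions: the rate is taken in bytes per second, the MTU of 1600 bytes is just a value picked for the illustration, the sample /proc/net/psched line is typed in by hand (four hexadecimal fields), and the function names are invented for this example rather than taken from the tc source.

```python
def psched_timer_hz(psched_line: str) -> int:
    """Return the fourth field of a /proc/net/psched line (hexadecimal)."""
    fields = psched_line.split()
    return int(fields[3], 16)

def default_burst(rate_bytes_per_sec: int, timer_hz: int, mtu: int = 1600) -> int:
    """Burst as described above: rate / timer frequency + MTU (integer math)."""
    return rate_bytes_per_sec // timer_hz + mtu

if __name__ == "__main__":
    rate = 100_000_000 // 8  # 100 Mbit/s expressed in bytes per second

    # Old behaviour: 1000 Hz timer, so one tick's worth of data fits in the burst.
    print(default_burst(rate, 1000))

    # New behaviour: illustrative psched line whose fourth value is 0x3b9aca00 = 10^9.
    hz = psched_timer_hz("000003e8 00000040 000f4240 3b9aca00")
    print(hz, default_burst(rate, hz))

    # Proposed cure: pretend the timer is 1 kHz whenever a nanosecond timer is seen.
    print(default_burst(rate, 1000 if hz >= 10**9 else hz))
```

With the 1 kHz timer the burst covers 12500 bytes of traffic plus the MTU; with the 10^9 value the rate term rounds down to zero and only the MTU is left, which matches the "always a small value" observation.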
Sunday, September 6, 2009
Names in our code
This is just a summary of some rules of thumb for writing source code.
When coding, do these:
- use descriptive names
- please classify/categorize so that we have shorter source files, or fewer source files in each folder (distribute the files into category folders). In OOP, we should refactor into new classes when things get too crowded in one class, or even move classes into different packages. In PHP, we should refactor into new files and/or new folders.
Rules that were anti-patterns:
- don't use generic names like $query. When reading it, I don't have a clue what it stands for: a query on the users table, a query to delete a user, or what? OK, maybe it can be used if the scope is local, that is, if I can easily look up the meaning in the same function or small file.
- avoid long parameter lists. It is difficult to find out which parameter means what. We could use a value class in OOP languages, or an associative array in PHP, or even an object in PHP, to give the parameters meaning. When the function insert(name,address,groupid,status,isAdmin) is called, it becomes insert($name,$address,null,0,$isAdmin), and we must look elsewhere to find out what the 0 stands for. If we have $entity = array('name' => $name, 'address' => $address, 'groupid' => null, 'status' => 0, 'isAdmin' => $isAdmin); and call insert($entity); things are much easier to understand and easier to modify.
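The same long-parameter-list point can be sketched in Python with a small value class. Everything here (UserEntity, insert_positional, insert, the field names) is hypothetical and only mirrors the insert() example above; it is not code from any real project.

```python
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class UserEntity:
    """Hypothetical value class standing in for the associative array."""
    name: str
    address: str
    group_id: Optional[int] = None
    status: int = 0
    is_admin: bool = False

def insert_positional(name, address, group_id, status, is_admin) -> dict:
    """Long parameter list: the call site gives no hint what each value means."""
    return {"name": name, "address": address, "group_id": group_id,
            "status": status, "is_admin": is_admin}

def insert(entity: UserEntity) -> dict:
    """Single value-object parameter: every field is named where it is set."""
    return asdict(entity)

if __name__ == "__main__":
    # Hard to read: what do None and 0 stand for?
    print(insert_positional("Ann", "1 Main St", None, 0, True))
    # Easier to read and to extend with new fields later.
    print(insert(UserEntity(name="Ann", address="1 Main St", is_admin=True)))
```

Both calls build the same record; the difference is only in how much the reader has to look up elsewhere.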
Monday, August 17, 2009
Multiple PHP in a machine blues
I am compiling PHP from source along with a PHP extension (apc). The reason for not using the packaged version of PHP is that there is another critical application running on the server with its own PHP version, and we don't want to risk incompatibility issues by forcing that application to run with a different PHP version. Meanwhile, for another application I need PHP compiled with debug flags, with gd enabled, and with the apc extension. I prefer the debug flags to be enabled because we're testing this whole stack and don't want to be in the dark when errors crop up (the previous PHP-apache-oci8 triad sometimes issued segmentation faults, but we don't know the source of the errors).
The problem is that the apc Makefile won't use the debug flag set during the previous PHP compilation. It turns out that we must be very careful that:
- the phpize being used comes from the correct PHP compilation (I deleted scripts/phpize and scripts/php-config after changing PHP's configure parameters, then invoked make install on PHP)
- no other phpize or php-config gets run; in particular, set the PATH environment variable so that the directory containing the correct phpize/php-config is the first directory listed in PATH.
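A quick way to double-check the second point before building apc is to ask which phpize and php-config the PATH actually resolves to. The sketch below uses Python's shutil.which; the install directory /opt/php-debug/bin is a made-up example path, and the helper name resolves_to_dir is an assumption for illustration.

```python
import os
import shutil
from typing import Optional

def resolves_to_dir(tool: str, expected_bin_dir: str, path: Optional[str] = None) -> bool:
    """True if the first `tool` found on PATH lives in expected_bin_dir."""
    found = shutil.which(tool, path=path)  # path=None means use os.environ["PATH"]
    if found is None:
        return False
    found_dir = os.path.realpath(os.path.dirname(found))
    return found_dir == os.path.realpath(expected_bin_dir)

if __name__ == "__main__":
    # Hypothetical prefix of the debug PHP build; adjust to the real --prefix.
    wanted = "/opt/php-debug/bin"
    for tool in ("phpize", "php-config"):
        ok = resolves_to_dir(tool, wanted)
        print(tool, "OK" if ok else "WRONG or missing: " + str(shutil.which(tool)))
```

If either tool is reported as wrong, the apc build would pick up the flags of some other PHP installation.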
Munin, again
More problems came up when I had just started using Munin-Node to monitor several Windows 2003 servers:
- the memory values reported by Munin-Node are all wrong. The sum of app and unused bears no relation to the total physical memory installed in the servers. It turns out that the Windows API used returns 64-bit values, and a simple change to the format specifier fixes the problem.
- the external plugin doesn't work. It turns out that the external plugin must not print a trailing newline when invoked with the 'name' argument, and it must also print a line containing '.' after each 'config' and 'value' (or default) invocation.
- after several days of running, external plugins stopped working. I ran Sysinternals' Process Explorer on the server and found out that there is some chance that an external plugin invocation doesn't close the listener thread. After a few days the unclosed threads become too many and prevent the Munin-Node service from spawning new external plugins.
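The external plugin rules from the second point can be sketched as a tiny Python script. This is a minimal illustration of what I observed, not a reference implementation: the plugin name example_plugin, the field name example, and the constant value 42 are placeholders.

```python
import sys

def plugin_output(arg: str) -> str:
    """Build the text an external plugin should write for a given argument."""
    if arg == "name":
        # Must NOT end with a newline.
        return "example_plugin"
    if arg == "config":
        # Graph description, terminated by a line containing only '.'.
        return "graph_title Example graph\nexample.label example\n.\n"
    # 'value' or no argument (default): the readings, again terminated by '.'.
    return "example.value 42\n.\n"

if __name__ == "__main__":
    arg = sys.argv[1] if len(sys.argv) > 1 else ""
    # sys.stdout.write is used instead of print so no extra newline sneaks in.
    sys.stdout.write(plugin_output(arg))
```

Using sys.stdout.write rather than print is the important design choice here, since print would append exactly the trailing newline that breaks the 'name' invocation.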
Thursday, August 6, 2009
Munin-Node 1.5 with correct memory plugin
Munin is a solution for monitoring servers. It shows graphs of some important system parameters of a server, such as free disk space, CPU utilization, memory usage, network traffic, and even the HD temperature. Munin requires an agent, called Munin Node, installed on the server (you must be root or an administrator to do this). Of course, there is a specific Munin Node to be installed for each operating system. On Ubuntu systems, there are munin packages in Ubuntu's repository. Meanwhile, Jory Stone created Munin Node for Windows, and I need that because some of the servers at my workplace run Windows Server 2003.
I read the postings at Jory's Munin Node site, and it seems that the memory plugin must be patched in order to get visible memory graphs. I downloaded the source code from Jory's site, then followed the instructions to fix the memory plugin (uncommenting a few lines in MemoryMuninNodePlugin.cpp), and the next task was to compile the whole thing.
A simple thing, compiling, is not as simple as it seems.
First, the vcproj file is for VS 2008, and I'm using VS 2005. After editing the version part of the file, the project file could be opened. Next, netfw.h couldn't be found. A bit of googling back and forth turned up a netfw.h, and I downloaded it together with icftypes.h (look here). Then msi.lib turned out to be missing (I'm using VS 2005, which lacks that file). I was forced to download the Platform SDK, and I chose to install only the Windows Installer SDK (to minimize the download volume). Well, now I have a fresh munin-node.exe. But where's that upload button in Blogger... Oh, Blogger doesn't allow us to upload files.
So I uploaded the executable to my own site at Google.
UPDATE:
It seems that there is no easy deployment for executables compiled in VS 2005: we must either link to the C Runtime (CRT) static library, or link to the CRT DLL and use some mechanism to ensure the DLL gets installed through the Windows SxS (side-by-side) mechanism. The options include making the user install the VC Runtime redistributable or including the CRT merge module in an MSI-packaged installer.
I compiled it once more and this time packaged the executable in .MSI format (it seems that the vdproj format for VS 2008 is compatible with 2005). Jory set up the release version to be statically linked, so the only motivation for .MSI packaging is the ease of service installation.