Rants on NFS's Lack of File Handle Visibility to Sysadmins
NFS is a not-so-recent solution for sharing filesystems across Linux nodes. It has some capabilities that are currently indispensable for Linux clusters: it can lock files across nodes and allow either exclusive or non-exclusive access to the same file.

Fault Tolerance / Recovery

From the papers I have read on NFS, it should be able to recover when a host or server restarts. Unfortunately, on several occasions we found this to be not quite true: after a host serving NFS was restarted, we got stale handle errors on the clients. The workaround is to restart the NFS client and, if that still doesn't fix the situation, restart the NFS server. In our case we sometimes need to restart twice across the cluster (because a client hangs while running a program over NFS). Some might say that programs shouldn't be run over NFS (and only data files should), but we have deployed a SAP-documented cluster architecture that requires such use of NFS.

Locks

When a file is locked in NFS, a lock is created in the hos