Wednesday, June 24, 2015

The mystery of TCP segmentation offload bug

Several incidents share the generic description 'TCP segmentation offload bug', and they affect multiple virtualization platforms. The workaround is the same in every case: disable the feature.

Case one

Virtualization Platform : KVM/QEMU
Symptom : Periodically, the guest loses network connectivity after heavy load. Restarting the guest network doesn't fix the problem. The guest is fine again after a reboot.
Reference : https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978/comments/134
Workaround :  ethtool -K eth0 tx off sg off tso off ufo off gso off gro off lro off

Case two

Virtualization Platform : Xen
Symptom : DomU hangs under heavy network load (around 10 MB/s).
Reference : https://bugs.launchpad.net/ubuntu/+source/qemu-kvm/+bug/997978/comments/132
Workaround : disable offloading using ethtool (replace eth0 with the affected interface)
ethtool --offload eth0 gso off tso off sg off gro off

Case three

Virtualization Platform : VMware
Symptom :
1. Pages could not be displayed after a VM migration to the same ESX host. Traffic passed through a Cisco Nexus 1000V and an F5 before reaching the IIS VM.
2. In another incident, a Cisco Nexus 1000V sending a large TCP segment caused a Purple Screen of Death on the ESXi host.
Reference : https://supportforums.cisco.com/discussion/11883926/tcp-segmentation-offload-tso-and-vmxnet31000v-bug
Workaround : Turn off TSO in VM

Case four

Virtualization Platform : VMware
Symptom :
When enabling Traffic Shaping on a Distributed vSwitch (DVS), Linux virtual machines using the VMXNET3 driver experience network throughput degradation.
Reference : http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2030927
Workaround : Disable TSO and LRO in Guest VM


Conclusion

Disabling TSO and LRO might fix your virtualization network problem, whatever it may be.
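
As a minimal sketch of the workaround, the Python snippet below only builds the ethtool command line that turns a set of offload features off. The interface name eth0 and the helper name offload_off_command are assumptions for illustration; the feature names are the short names already used in the workarounds above. It prints the command rather than running it, since running it needs root inside the guest.

```python
import shlex

# Offload features from the conclusion above (ethtool -K short names).
DEFAULT_FEATURES = ("tso", "lro")


def offload_off_command(interface, features=DEFAULT_FEATURES):
    """Return the ethtool argument list that switches the given offloads off."""
    args = ["ethtool", "-K", interface]
    for feature in features:
        args += [feature, "off"]
    return args


if __name__ == "__main__":
    # Hypothetical interface name; print the command instead of executing it.
    print(shlex.join(offload_off_command("eth0")))
```

Settings changed with ethtool do not survive a reboot, so the command has to be reapplied at boot if the workaround helps.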

Thursday, June 18, 2015

SAP System Copy Procedures

Background

My company uses SAP AG's Enterprise Resource Planning software. It is a complex system, but being developed by Germans has its advantages: for example, there is a lot of documentation in the form of SAP Notes and other material. One of the challenging tasks in using SAP's ERP is system migration and cloning. There are two purposes for a migration: first, moving the system to other hardware; second, getting a cloned system to work on without compromising the original system. SAP's term for the migration activity is 'System Copy'.

Where to start

To find out the steps needed for a system copy, you could read this slideshare-hosted SAP document (Best Practices : SAP System Landscape Copy). For another definitive starting point, check out SAP's wiki DOC-8324 System Copy and Migration. I noted that SAP has released Software Provisioning Manager, with which we could do a system copy; my previous system copy experiences did not involve this software. Anyway, let's get down to the basics.

Homogeneous or Heterogeneous ?

SAP differentiates between homogeneous and heterogeneous system copy. It states that when the OS and the database both stay the same, the migration can be done using a homogeneous system copy. When either the database or the operating system changes, the migration must be done using a heterogeneous system copy, with several exceptions. For a homogeneous system copy, we could use either the database-independent copy procedures or the database-dependent copy procedures. The database-independent procedure is usually the slower of the two. For a heterogeneous system copy there is no choice: we can only use the database-independent procedure.
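
The rule above can be sketched as a small Python function. This is only an illustration of the basic rule as described here: the function names are made up, the OS and database names are plain strings, and the exceptions SAP lists are deliberately ignored.

```python
def copy_type(source_os, source_db, target_os, target_db):
    """Classify a system copy by comparing source and target OS and database."""
    same_os = source_os.lower() == target_os.lower()
    same_db = source_db.lower() == target_db.lower()
    # Homogeneous only when neither the OS nor the database changes.
    return "homogeneous" if same_os and same_db else "heterogeneous"


def allowed_procedures(kind):
    """Return the copy procedures available for the given copy type."""
    if kind == "homogeneous":
        return ["database-independent", "database-dependent"]
    return ["database-independent"]


if __name__ == "__main__":
    # Hypothetical example: same OS, but the database changes.
    kind = copy_type("Linux", "Oracle", "Linux", "DB2")
    print(kind, allowed_procedures(kind))
```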

Documentation and documentation

According to http://scn.sap.com/docs/DOC-48323, the first thing that needs to be done is to download the latest system copy guide from https://service.sap.com/sltoolset - Software Logistics Toolset 1.0 - Documentation - System Provisioning, which points us to either the NetWeaver 7.0 System Copy Guide or the NetWeaver 7.1 System Copy Guide. The old Best Practices document refers to the Best Practice update in SAP Note 885343. The newer System Copy Guide refers us to either SAP Note 1768158 for NW 7.0 or SAP Note 1738258 for NW 7.1, as well as SAP Note 885343 (SAP System Landscape Copy) and SAP Note 82478 (SAP system OS/DB migration).

I noted that the System Copy Guide doesn't contain the steps needed to ensure archived data can still be accessed; for those we need to refer to the SAP Library Administrator's Guide -> Technical Operations for SAP NetWeaver -> General Administration Tasks -> Data Archiving, and to the blog DOC-7856 SAP NetWeaver Application Lifecycle Management. The guide doesn't specifically mention general Oracle tuning documents (it does refer to R3load optimization for Oracle in SAP Note 936441), but the blog refers to SAP Note 1918774 (Performance issues when running a SAP Installation / System Copy), which really should be called "Oracle Performance issues when running a SAP Installation / System Copy".

Procedures

In general, the process of system copy consists of :
1. export the source system
2. transfer the exported data ('load') to the target server
3. perform the SAP installation procedures on the target system
4. perform the DB load on the target system

Test run

For a production system migration, it is highly recommended to do a test run, to measure the amount of downtime needed and to ensure potential problems (as SAP puts it, customer-specific problems) are detected and noted before we do the real system copy. The test run consists of :
1. export source database
2. move the export data to target system
3. import data in the target system.
This seems simple, but for large production systems it is not: terabytes of data can be challenging to move, and the export and import are about as challenging. For the real migration run, the production system downtime spans steps 1 to 3. The SAP installation could be done before the test run.
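
A rough way to turn test-run measurements into a downtime estimate is sketched below in Python. It assumes the three steps run strictly one after another and that each step handles the same data volume, which is a simplification (the export dump is often smaller than the database). All numbers are hypothetical placeholders, not measurements; substitute the throughput observed during your own test run.

```python
def phase_hours(data_gb, throughput_gb_per_hour):
    """Hours needed to push data_gb through one step at the given rate."""
    return data_gb / throughput_gb_per_hour


def downtime_hours(data_gb, export_rate, transfer_rate, import_rate):
    """Total downtime when export, transfer and import run sequentially."""
    rates = (export_rate, transfer_rate, import_rate)
    return sum(phase_hours(data_gb, rate) for rate in rates)


if __name__ == "__main__":
    # Hypothetical figures: 2000 GB of data, rates in GB per hour.
    total = downtime_hours(2000, export_rate=200, transfer_rate=400, import_rate=100)
    print(f"Estimated downtime: {total:.1f} hours")
```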

Conclusion

The SAP documentation contains many procedures to follow when doing a system copy. It is imperative that the relevant procedures are identified and gathered before actually performing the copy.

 



Sunday, June 7, 2015

Cloud Storage Price

This post is where I note prices related to cloud storage.

As a consumer

Amazon :
S3 (Simple Storage Service) : $0.03 /GB/month for first TB  (ref: https://aws.amazon.com/s3/pricing/)
Glacier : $0.01 /GB/month
EC2 Elastic Block Store (EBS) : SSD : $0.10 /GB/month, magnetic : $0.05 /GB/month + $0.05 /million I/O requests

Google : (ref : https://cloud.google.com/storage/pricing)
Standard Storage : $0.026 /GB/month, Nearline : $0.01 /GB/month

Microsoft OneDrive : (ref : https://onedrive.live.com/about/en-us/plans/)
15 GB : free
100 GB : $1.99 /month (means about $0.02 /GB/month)
200 GB : $3.99 /month
1 TB : $7 /month (includes Office 365)
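
To compare these flat-rate plans with the per-GB prices above, the monthly price can be divided by the plan capacity. The Python sketch below does that for the three paid plans; it assumes 1 TB = 1000 GB and that the whole quota is used, so the real cost per stored GB is higher for a partly filled plan.

```python
# (capacity in GB, price in USD per month) for the paid plans listed above.
PLANS = [(100, 1.99), (200, 3.99), (1000, 7.00)]


def price_per_gb(capacity_gb, monthly_usd):
    """Monthly price per GB, assuming the full capacity is used."""
    return monthly_usd / capacity_gb


if __name__ == "__main__":
    for capacity_gb, monthly_usd in PLANS:
        rate = price_per_gb(capacity_gb, monthly_usd)
        print(f"{capacity_gb} GB plan: ${rate:.4f} /GB/month")
```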

As a provider 

SwiftStack controller : (http://itknowledgeexchange.techtarget.com/storage-soup/swiftstack-enters-software-defined-storage-race/)
subscription : $10 / TB/month (means $0.01 /GB/month)

EMC ViPR :  (http://silvertonconsulting.com/blog/2013/09/30/emc-vipr-virtues-vexations-but-no-virtualization/#sthash.CEFmpF86.dpbs)
Controller base platform subscription : $0.01 /GB/month
with Object Data Service license subscription : $0.02 /GB/month

Hardware price: 
2012 ref :  http://www.slideshare.net/joearnold/7-steps-to-roll-out-a-private-open-stack-swift-cluster-joe-arnold-swiftstack-20120417
$42 520 for 105 TB .. $0.4 / GB CAPEX
2015 Backblaze storage pod ref : https://www.backblaze.com/blog/storage-pod-4-5-tweaking-a-proven-design/
$13 843 for 270 TB ..  $0.051 /GB CAPEX
Swiftstack ref : https://swiftstack.com/docs/admin/hardware.html (need to calculate for your HW prices)
HP Swift Ref : http://h20195.www2.hp.com/v2/getpdf.aspx/4AA5-2798ENW.pdf?ver=1.0
HP ProLiant DL 360p with 8TBx3 LFF SAS + 100 GB SSD : $12066 .. $0.5 /GB
with 4TBx3 LFF SAS + 100 GB SSD : $8466 .. $0.7 /GB
HP ProLiant DL 380p with 8TBx7 LFF SAS + 120 GB SSD : $20361 .. $0.36 /GB
HP ProLiant DL 180 with 2 drive cage kit, 2TBx23 SFF + 120 GB SSD : $30602 .. $0.66 /GB
HP ProLiant DL 360e with 6TBx13 LFF SAS + 120 GB SSD : $21678 .. $0.277 /GB

For HP drives, the SFF drives themselves cost about $0.5 /GB and the LFF drives about $0.2 /GB.

SuperMicro 6027R-TDARF with 7x4TB SATA + 120 GB SSD : $3953 .. $0.141 / GB (ref http://www.broadberry.com/superservers-supermicro-servers/sys-6027r-tdarf)
SuperMicro SSG 6037r-e1r16l with 15x4TB SATA + 120 GB SSD: $5765 .. $0.096 / GB
SuperMicro SSG 6047r-e1r24l with 23x4TB SATA + 120 GB SSD : $7284 .. $0.079 / GB
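
The CAPEX per GB figures in this list are the server price divided by raw data-drive capacity. The Python sketch below shows that arithmetic, assuming 1 TB = 1000 GB, counting only the data drives (the SSD is left out of the capacity), and ignoring replication or RAID overhead. The function name is made up for illustration.

```python
def capex_per_gb(price_usd, drive_count, drive_tb):
    """Purchase price per raw GB of data-drive capacity (1 TB = 1000 GB)."""
    raw_gb = drive_count * drive_tb * 1000
    return price_usd / raw_gb


if __name__ == "__main__":
    # Configurations taken from the list above: (label, price, drives, TB per drive).
    configs = [
        ("SuperMicro 6027R-TDARF", 3953, 7, 4),
        ("SuperMicro SSG 6037r-e1r16l", 5765, 15, 4),
        ("SuperMicro SSG 6047r-e1r24l", 7284, 23, 4),
    ]
    for label, price_usd, drive_count, drive_tb in configs:
        cost = capex_per_gb(price_usd, drive_count, drive_tb)
        print(f"{label}: ${cost:.3f} /GB")
```

With object storage that keeps several replicas, the cost per usable GB is the raw figure multiplied by the replica count.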