I’m in the middle of deploying an Oracle Data Warehouse solution and we decided to go a new direction with the storage. Normally, for a large Oracle cluster like this, we’d purchase a large HP or NetApp storage array and put a couple hundred disks in it. As you can imagine, this costs a lot of money, usually several hundred thousand USD. There are some limitations to this approach for data warehousing. Let me explain:
Sequential read performance is our most important criteria for the data warehouse storage. Even write performance is not very critical, because data will be loaded infrequently through an ETL process. There will only be a few users at a time on the system, but those users want their queries to run very fast. These are complex queries with JOINs that perform many full table scans. Random I/O performance is not that important, since this is not your typical transactional database or OLTP application.
From past benchmarking of a fully loaded HP EVA 8100, I’ve determined that it can do about 75K random I/Os per second, however, it can also do about 1.6GB/sec. sequential reads. The random performance is pretty good, but knowing we have a hard upper limit of 1.6GB/sec. is very limiting. If we need more performance from the data warehouse, we have to purchase a second storage array and stripe across both, costing even more money. We needed a more cost effective solution.
Enter the HP MSA 2000. These are “budget” storage arrays that are not nearly as powerful as an EVA, don’t hold nearly as many disks, and don’t have nearly as much random I/O performance. However, they do have pretty good sequential performance. Here are their performance characteristics:
MSA2 G2 Performance | MSA2300fc G2 | MSA2000sa G2 | MSA2000i G2 |
Protocol (host connect) | 4 Gb | 3 Gb | 1GbE Ethernet |
MSA2000 RAID 10 Performance Results | |||
Random Reads IOPs | 22,874 | 21,861 | 13,658 |
Random Writes IOPs | 15,008 | 14,491 | 14,399 |
Random Mix IOPs 60/40 read/write | 17,878 | 17,700 | 13,493 |
Sequential Reads MB | 1,238 | 1,050 | 274 |
Sequential Writes MBs | 532 | 532 | 266 |
MSA2000 RAID 5 Performance Results | |||
Random Reads IOPs | 22,044 | 21,000 | 12,335 |
Random Writes IOPs | 2,714 | 2,714 | 2,714 |
Random Mix IOPs 60/40 read/write | 5,325 | 4,926 | 5,325 |
Sequential Reads MBs | 1,238 | 1,050 | 274 |
Sequential Writes MBs | 725 | 617 | 266 |
MSA2000 RAID 6 Performance Results | |||
Random Reads IOPs | 21,975 | 21,000 | 12,292 |
Random Writes IOPs | 1,876 | 1,876 | 1,876 |
Random Mix IOPs 60/40 read/write | 4,293 | 3,865 | 3,851 |
Sequential Reads MBs | 1,238 | 1,050 | 274 |
Sequential Writes MBs | 772 | 729 | 266 |
Looking at the table above, the MSA 2000 we’re interested in is the MSA 2300fc G2, which can do 1238 MB/sec. sequential reads. At 1.24GB/second, this is getting very close to the EVA 8100’s 1.6GB/second. Now, consider that this array with a couple shelves of disks costs about 20% of the price of an EVA 8100.
For our data warehouse solution, we decided to purchase 4 of these smaller MSA 2300fc G2 arrays, and use Oracle ASM to stripe across them, hopefully achieving aggregate sequential throughput of approximately ~5GB/second (4x 1.24GB/second).
This solution seems to fit a data warehouse workload better than a single large storage array, because as performance and storage capacity needs grow, more MSA 2000s can simply be added, increasing performance and storage capacity as the application performance requirements grow.
No comments:
Post a Comment