Someone asked me out of band (thanks for asking, MattL) what I would do to quickly get a performance envelope of an iSCSI target.
In all the post-VMworld followup, I don't want to do a long dissertation here, but quick off-the-cuff comments may be useful. FYI, these are the quick set of tests we run during application validation testing to see whether we're on the mark before we start with app workloads. Read on for the details!
OK - everyone, just remember - the goal here is a quick envelope, not an exhaustive "how will this work for me" test. Fast and dirty is the name of the game. All the tools I'm describing can be used on DAS, iSCSI SANs, FC SANs, and FCoE SANs. They can also be used in VMs to test VMware (including VMs on NFS datastores). Before anyone asks, let me ask and answer: "Why didn't we mention VMmark?" The answer is that we do use VMmark - but for what it was designed for: tile-based workloads with multiple VMs (i.e. in the vPod testing). As an I/O workload it's fine, but it just wasn't designed for this. It's ideal for the Round 1 scenario where you're looking at server choice in the VMware context.
On to the task at hand.
Round 1: Use IOmeter up front
IOmeter is fast, dirty, and not real-world, but it can't be beat for quick envelope testing of I/O subsystems. This is about 1-2 days' worth of work to hit a couple of datapoints.
- Hit a couple of I/O sizes (8K, 64K, 256K - more if you have time). 8K is good because it represents E2K7, a lot of VMware workloads, and OLTP database I/O. 64K represents a database read prefetch or a coalesced VMware I/O (rare). 256K represents a backup workload.
- Hit a couple of different I/O mixes (100% read, 100% write, 50:50 read:write). The 100% read is a backup (so don't bother doing it at the 8K mark, and only do it sequential), and the 100% write is a restore (ditto).
- Hit a couple of different randomness points (at least 50% random and 100% sequential). The random represents a normal workload; the sequential represents backup/restore.
- So - NET:
- 8K, 50/50 r/w, 50/50 random (really - if you have time, do a couple at different I/O sizes, read/write mixes, and random/sequential mixes - but if you can only do one, this is the one to hit IMHO)
- 64K 100 r, 100 sequential
- 64K 100 w, 100 sequential
- 256K 100 r, 100 sequential
- 256K 100 w, 100 sequential
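The NET matrix above fits in a few lines of code. Here's a minimal sketch (the IOmeter runs themselves are configured in the GUI or an .icf file - this just enumerates the test points and does the standard arithmetic for turning a measured IOPS number into throughput):

```python
# Sketch: the five quick-envelope test points from the list above, plus
# the basic conversion from IOPS at a block size to MB/s.
TEST_MATRIX = [
    # (block size KB, % read, % random)
    (8,   50, 50),   # the one to run if you can only run one
    (64, 100,  0),   # sequential read  (DB prefetch)
    (64,   0,  0),   # sequential write
    (256, 100, 0),   # sequential read  (backup)
    (256,   0, 0),   # sequential write (restore)
]

def throughput_mbps(iops: float, block_kb: int) -> float:
    """Convert measured IOPS at a given block size into MB/s."""
    return iops * block_kb / 1024.0

for block_kb, read_pct, random_pct in TEST_MATRIX:
    print(f"{block_kb}K, {read_pct}% read, {random_pct}% random")

# e.g. 2,000 IOPS at 64K sequential is 125 MB/s
print(throughput_mbps(2000, 64))  # 125.0
```

The conversion matters when you compare the small-block random result (where IOPS is the number you care about) against the large-block sequential results (where MB/s is the number you care about).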
Round 2: Quick app workloads
These are all closer to the real world than IOmeter, since they use the engine (JET, and the SQL Server/Oracle DB engines) that generates the backend disk workload. But since they don't actually simulate the front-end workload that's applied against the app (and don't always have the full database engine themselves - ORION just uses parts of the database libraries), they aren't a "system test", but rather still a "unit test", and are still an approximation. BUT, if IOmeter is a first-order approximation, this is a second-order approximation. This represents about 4 days' work to run a couple of tests once you know the tools and have a reliable test harness (getting a reliable test harness at big scales can take weeks).
- Use the Exchange Jetstress tool. In fact, a smart move is to follow the Exchange Solution Reviewed Program (ESRP) guidelines here: http://technet.microsoft.com/en-us/exchange/bb412164.aspx
- Quick - check out the link above... I can't stop this plug; I'm proud of the work we've done - and it was (and continues to be) a TON of work. BTW, we didn't embark on this with an end goal of the chart below. I'm also not suggesting quantity of submissions and quality of solution are linked (i.e., just because a vendor has only one, they suck) - for that, you need to find the ones that reflect your requirements and compare. The reason there are so many is that we try to do this for a BROAD set of customer configs (note that the EMC submissions are all at different sizes - standalone, replicated, CCR, etc.). We have at least as many, done in VMware configs, that are not posted yet - sitting here internally, waiting for the Microsoft ESRP gang to respond to our request to start formally accepting them now that Exchange on VMware is supported in the SVVP.
- What's good about Jetstress as a test harness is that it's very random, with no caching benefit to speak of, and close to real (but not quite) - and if you follow the ESRP guidelines, you'll notice that it does both a steady state (what they call the "Performance Test") and a backup (the "Soft Recovery" and "Streaming Backup" test cases)
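One way to sanity-check a Jetstress steady-state run is the usual Exchange sizing arithmetic: target database IOPS is roughly mailbox count times a per-mailbox IOPS profile, plus some headroom. A toy sketch (the profile value and 20% headroom factor here are illustrative placeholders, not Microsoft's official figures - use the numbers from your own sizing exercise):

```python
# Illustrative sketch: compare Jetstress-achieved IOPS against a target
# derived from (mailboxes x per-mailbox IOPS x headroom). All constants
# below are hypothetical placeholders for a real sizing exercise.
def required_iops(mailboxes: int, iops_per_mailbox: float,
                  headroom: float = 1.2) -> float:
    """Target database IOPS: user load plus a headroom factor."""
    return mailboxes * iops_per_mailbox * headroom

def run_passes(achieved_iops: float, mailboxes: int,
               iops_per_mailbox: float) -> bool:
    """Did the Jetstress run meet the sizing target?"""
    return achieved_iops >= required_iops(mailboxes, iops_per_mailbox)

# 4,000 mailboxes at a (hypothetical) 0.5 IOPS/mailbox profile
print(required_iops(4000, 0.5))       # 2400.0
print(run_passes(2600, 4000, 0.5))    # True - the run met the target
```

The point isn't the specific constants - it's that a Jetstress number in isolation means nothing until you compare it against a target derived from your user population.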
- Use SQLIOSim - like Jetstress, it's not a real DB workload, but it's fast and dirty. Get details on SQLIOSim here: http://support.microsoft.com/kb/231619
- If you are passionate about Oracle vs. SQL Server - use ORION, which is the analogous tool from Oracle: http://www.oracle.com/technology/software/tech/orion/index.html
Ok, once that's all done, you have a pretty darn good idea of what you're looking at.
The next step is a system test, using a real workload. This gets MUCH harder, so it's beyond the scope of the original question, but I'm posting it for thoroughness.
Round 3: Your Real App workload
This gets heavy duty (it can be 3-4 man-months of testing), but it's a real system test, and it can accurately tell you how well the whole solution (not just the I/O subsystem) will work for you.
- Exchange: use LoadGen. Loadgen actually has Outlook clients log in to the Exchange server and do everything users do in Outlook. Making Loadgen work - and work well - is, IMHO, an art, and worth many, many posts. In our case, we literally have an Exchange Ranger (one of the few in the world) who does this on a regular basis, working with a large extended team, along with working on other stuff. Get Loadgen here: (32-bit: http://www.microsoft.com/downloads/details.aspx?familyid=DDEC1642-F6E3-4D66-A82F-8D3062C6FA98&displaylang=en; 64-bit: http://www.microsoft.com/downloads/details.aspx?familyid=0FDB6F14-1E42-4165-BB17-96C83916C3EC&displaylang=en)
- Application workloads on databases are TOUGH. The reason they are tough is that every workload, every application is different. Every application has wildly varying lifecycle stages (bulk loads, OLTP entry, random queries, large-scale sequential reads, transformation into hypercubes, etc.). That's why optimizations at the infrastructure layer (i.e. "let's get a faster disk!") can be dwarfed by the benefits you can get from having a real DB expert look at the application layer (which is why we have a Microsoft Practice - which is Microsoft's partner of the year!). Many, many times, a small change to the structure of a query, or a little planning in the database structure, will get order-of-magnitude improvements. OK - that said, what are your choices?
- Use a workload generator (at EMC, we use Quest Benchmark Factory a lot for this). What's good about these tools is that you can replay your exact transaction workload. Yes, you can do that natively (for example, in SQL Server 2005, using Profiler and the Database Engine Tuning Advisor), but in most cases these replay the transactions AS FAST AS THEY CAN, where what you're actually trying to do is play them back exactly as they occurred in the real world.
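The pacing point is worth making concrete. Here's a toy sketch of timestamp-preserving replay (the function names and trace format are my own, not any tool's API): each transaction is fired at its original offset from the start of the trace, rather than back-to-back as fast as possible.

```python
import time

def replay_with_original_pacing(trace, execute):
    """Replay a captured trace with its original inter-arrival timing.

    'trace' is a list of (seconds_since_trace_start, statement) pairs;
    'execute' is whatever actually runs a statement against the target.
    """
    start = time.monotonic()
    for offset, statement in trace:
        # Sleep until this statement's original offset from trace start,
        # instead of firing it immediately after the previous one.
        delay = offset - (time.monotonic() - start)
        if delay > 0:
            time.sleep(delay)
        execute(statement)

# Usage sketch with a tiny made-up trace
ran = []
trace = [(0.0, "SELECT 1"), (0.05, "SELECT 2"), (0.10, "SELECT 3")]
replay_with_original_pacing(trace, ran.append)
print(ran)  # ['SELECT 1', 'SELECT 2', 'SELECT 3']
```

An as-fast-as-possible replay tells you the peak the I/O subsystem can absorb; paced replay tells you whether it keeps up with the workload as it actually arrived - which is the question you're really asking.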
- Use a generic workload (SPC-1 and SPC-2 are examples, as are TPC-C and TPC-H). If IOmeter is a first-order approximation, and the quick app workloads are a second-order approximation, these are a third-order approximation, since you're using a particular TYPE of application workload (OLTP vs. DSS, for example). We use Quest Benchmark Factory in all our Validation Test Reports (see Powerlink, under "solutions->Microsoft->SQL Server" and "solutions->Oracle") with TPC-C and TPC-H. IMHO, if you're going to go to this length (and it is a lot of work) - go the extra yard and replay YOUR transaction workload.