One of the topics I’d like to discuss throughout this article is VDI and some of the common issues we might run into when it comes to storage, IOPS and image management. At the same time I’d like to point out some technologies we have at our disposal in addressing these issues and talk a bit more on IOPS and block vs. file level storage. Before we get into some of the bits and bytes involved, I’d like to first point out that I had to split this article into two parts to keep it informative and interesting to read. During part one I’ll primarily focus on VDI in general, describing its use and some of the common pitfalls we might encounter. Part two primarily focuses on real world solutions. Let’s get started.
Think of it as a (virtual) infrastructure (see the different vendors below) from which VMs based on client operating systems, hosted and managed from the data center, get published out to your users. Your users could be working on thin, zero or fat clients. It doesn’t matter, because all the ‘real’ processing takes place on the virtual machine in the data center. Compressed graphics and keystrokes are sent back and forth over the line, keeping network traffic small and fast. All major vendors offer their own VDI solution: Microsoft’s Remote Desktop Services, Citrix XenDesktop and VMware View are examples, all slightly different but VDI nonetheless. Have a look at the example below based on VMware View.
When VDI was introduced a few years ago, it started out as a very promising technology and to some extent it has lived up to its expectations. However, it didn’t flourish the way many had hoped and perhaps expected. This is partly because of the complexity involved with image management, especially with dedicated desktops, along with the storage requirements and the accompanying IOPS that go with them. As far as IOPS go, the same can be said for pooled desktops as well.
Pooled vs Dedicated desktops
Since it’s not ‘one size fits all,’ virtual machines can be configured and offered in different ways, with pooled and dedicated (also referred to as persistent) probably being the two most popular types. No matter which type is used, VDI VMs are provisioned and based on a so-called master, or golden image, so they all start out exactly the same. Different vendors use different techniques to make this happen. Without going into too much detail, I’ll explain the differences below:
Pooled: With pooled desktops all changes made to the underlying operating system (master/golden image) are discarded on log-off or reboot. This means that installed applications or updates, etc. will be gone once a user logs off or reboots. The VM will again be clean and put back in its pool waiting to be re-used. This also goes for all personalized settings, so a good user profile solution needs to be in place.
Dedicated/persistent: All changes made to the underlying operating system (master/golden image) are preserved when a user logs off or reboots. Different vendors use different techniques to accomplish this. However, this also means that a dedicated VM is bound to a particular user, as opposed to pooled desktops, where you can choose what to do: tie the VM to a particular user or put it back into the pool for reuse. These options also differ slightly per vendor. Not a bad thing per se, but worth a mention.
Besides VDI there are a few other things I’d like to address throughout this article since they’re all closely related. Of course I’ll explain IOPS in more detail as they relate to VDI and storage in general, but I also would like to make a note on block vs. file level-based storage. Block level storage is widely used within VDI deployments, and all sorts of other architectures as well. I think there’s a lot of confusion around the differences between block vs. file level-based storage. Let’s see if I can clear up some misconceptions.
Block level storage is based on raw storage volumes (blocks). Think of it this way: a bunch of physical hard drives, as part of your SAN solution, for example, get split up into two or more raw storage volumes/blocks (this can also be one big volume) using special software, which are remotely accessed through either Fibre Channel or iSCSI. The volumes then present themselves to the server’s operating system as if they were local disks. It doesn’t get more flexible than this.
Basically, each raw volume/block that gets created can be controlled as an individual hard drive. You can format it with the file system of your choice, NTFS or VMFS (VMware) for example, which is why it’s supported by almost all major applications. Although very flexible, block level solutions are also harder and more complicated to manage and implement. They are also more expensive than most file level storage solutions, which we’ll have a look at next.
File level based storage is all about simplicity and, in most cases, is implemented in the form of a Network Attached Storage (NAS) device. Think of file level storage as one big pile to store raw data files, nothing more: a central repository for your company’s files and folders, accessed using a common file level protocol like SMB/CIFS (Windows) or NFS (Linux and VMware). Just keep in mind that file level based storage isn’t designed to perform at high levels, meaning that if your data gets a lot of read and write requests and the load is substantial, you’re better off using block based storage instead.
Though it’s easy to set up and a lot cheaper as well, a NAS appliance has its own file system (often on a non-standard operating system) and handles all files and folders stored on the device as such. This is something to keep in mind when thinking about user access control and assigning permissions. Although most NAS and/or other file level storage devices support the existing authentication and security mechanisms already in place, it could happen that you run into one that doesn’t.
As you can see, both have pros and cons. A lot depends on the use case you’re presented with. Deploying tens or even hundreds of client OS based VMs as part of a VDI on file based storage just won’t work, which leaves us with the more expensive SAN solution, or am I wrong?! Let’s continue and find out.
IOPS in some more detail
I’ll start with a quote from Wikipedia: “IOPS (Input/Output Operations Per Second, pronounced as eye-ops) is a common performance measurement used to benchmark computer storage devices like hard disk drives (HDD), solid state drives (SSD), and storage area networks (SAN). As with any benchmark, IOPS numbers published by storage device manufacturers do not guarantee real-world application performance.”
Typically an I/O operation is either a read or a write, with a series of subcategories in between, like re-reads and re-writes, which can be done either randomly or sequentially, the two most common access patterns. Depending on what needs to be done, a single I/O can vary from several bytes to multiple KBs. So you see, it’s basically useless for a vendor to state that their storage solution will deliver a certain amount of IOPS without also stating the read vs. write percentage used during the benchmark, the size of the I/Os being processed, and the latency that comes with it (how long it takes for a single I/O request to get processed). As stated on this blog (make sure to check it out; take your time and read it thoroughly, as it’s one of the best storage related resources around), it’s nearly impossible to claim a certain amount of IOPS without also mentioning these additional parameters, since there is no standard way to measure IOPS. And there are many more factors that can influence the amount of IOPS generated besides the ones mentioned above.
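To make the relationship between these parameters a bit more concrete, here’s a small back-of-the-envelope sketch. All the numbers (I/O sizes, latencies, queue depths) are invented for illustration; they don’t come from any vendor benchmark:

```python
# Rough relationships between IOPS, I/O size, throughput and latency.
# All figures below are illustrative assumptions, not vendor data.

def throughput_mbps(iops, io_size_kb):
    """Throughput in MB/s for a given IOPS figure and I/O size."""
    return iops * io_size_kb / 1024.0

def iops_from_latency(queue_depth, latency_ms):
    """Little's Law: sustained IOPS = outstanding I/Os / latency per I/O."""
    return queue_depth / (latency_ms / 1000.0)

# The same '10,000 IOPS' claim means very different things
# depending on the I/O size used during the benchmark:
print(throughput_mbps(10_000, 4))    # 4 KB random I/O  -> ~39 MB/s
print(throughput_mbps(10_000, 64))   # 64 KB sequential -> ~625 MB/s

# And latency caps what a single queue can deliver:
print(iops_from_latency(queue_depth=1, latency_ms=5))   # -> 200 IOPS
print(iops_from_latency(queue_depth=32, latency_ms=5))  # -> 6400 IOPS
```

The point of the sketch: an IOPS number on its own tells you almost nothing; only together with the I/O size, access pattern and latency does it describe actual performance.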
Of course all this doesn’t matter if you don’t know your sweet spot: how many IOPS you need and, more importantly, what kind of IOPS. There are various file system benchmark tools available that can assist you in determining the amount and type of IOPS needed for your specific workload, and the Internet is full of blogs and articles describing how to determine the IOPS needed by certain applications. Just remember, don’t go with general estimations. Make sure you test and evaluate your environment thoroughly! Tooling can get you a long way.
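As a hedged example of what such an estimate might look like, the sketch below sizes a hypothetical desktop pool. The per-user IOPS figure and the 20/80 read/write split are assumptions you’d replace with your own measurements, and the RAID write penalties are the commonly quoted textbook values:

```python
# Back-of-the-envelope IOPS sizing for a VDI pool.
# The per-user IOPS and the read/write split are assumptions;
# measure your own workload instead of trusting these defaults.

RAID_WRITE_PENALTY = {0: 1, 1: 2, 5: 4, 6: 6, 10: 2}  # textbook values

def backend_iops(users, iops_per_user, write_ratio, raid_level):
    """Translate frontend (host-visible) IOPS into backend (disk) IOPS."""
    frontend = users * iops_per_user
    writes = frontend * write_ratio
    reads = frontend - writes
    return reads + writes * RAID_WRITE_PENALTY[raid_level]

# 200 desktops at a steady 10 IOPS each, 80% writes
# (VDI workloads are typically write-heavy):
print(backend_iops(200, 10, 0.8, raid_level=5))   # -> 6800.0 disk IOPS
print(backend_iops(200, 10, 0.8, raid_level=10))  # -> 3600.0 disk IOPS
```

Note how the same 2,000 frontend IOPS turn into very different backend numbers depending on the RAID level, which is exactly why the read/write ratio matters so much when sizing.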
I already mentioned pooled and dedicated/persistent desktops as part of VDI deployments. If we look at VDI designs today, most are primarily based on the pooled desktop design, and there are several reasons for this. I’ll use Citrix XenDesktop as the underlying VDI technology, the one I know best; although other vendors might use slightly different technologies, the storage (IOPS included) and image management issues we encounter when using dedicated/persistent desktops don’t just go away. When provisioning multiple desktops based on a master image using XenDesktop, pooled or dedicated, one technology it uses is differencing disks to store all delta writes made to the virtual machine. If your storage solution supports it, these disks can be thin provisioned; otherwise each will be allocated at the full size of the base (master) virtual machine mentioned before. Be aware that each VM provisioned by XenDesktop gets its own ID and differencing disk.
Knowing this, you can probably imagine where potential storage issues could come in. Imagine managing a few hundred pooled desktops, all in use at the same time. Assuming for now that your storage solution does support thin provisioning, each of these differencing disks can potentially grow as big as your underlying base (master) image. That’s a lot of storage. In practice this probably won’t happen much, and even if it did, it isn’t something to worry about: when a pooled desktop gets rebooted, all changes made to the VM (stored on the underlying differencing disk) are deleted (a nightly scheduled reboot perhaps?). This way you end up with a fresh and empty VM waiting to be (re)used.
Using the pooled model, and with the above in the back of your mind, you could consider under-committing your VMs as far as storage goes, since it’s highly unlikely your machines will grow beyond several GBs during the day. If you do, make sure to implement some kind of daily reboot schedule. Monitoring your VMs for a few days will tell you how far you can go as far as under-committing is concerned. This way you won’t have to allocate all of your storage right away. If thin provisioning isn’t an option you might want to reconsider your storage assignment, but since we don’t live in the stone age anymore we should be fine. Read on…
Now picture the same scenario, but this time with dedicated desktops. We start out the exact same way, only now when the VM gets rebooted the writes to the underlying differencing disk won’t be deleted. The user logs off or shuts down, comes back into the office the next day, the same VM gets fired up and the user logs on. They work throughout the day, again making changes to the underlying base (master) image (written to the differencing disk), perhaps installing new applications or updates, but nothing gets deleted. No matter how many times the VM is rebooted, the underlying differencing disk will keep expanding, taking up more free space until it’s full. It’s obvious that these dedicated VMs will consume, or need, a lot more allocated storage to begin with (and IOPS) than their pooled counterparts. This also means that no under-committing can occur. Size accordingly when using this solution.
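The difference in allocation can be sketched with some simple arithmetic. The daily growth and base image sizes below are invented for the example, not measured from a real deployment:

```python
# Storage needed for differencing disks, pooled vs dedicated.
# The daily growth and base image figures are illustrative assumptions.

def pooled_storage_gb(desktops, daily_growth_gb):
    """Pooled: differencing disks reset on reboot, so with a nightly
    reboot schedule you only ever carry one day's worth of growth."""
    return desktops * daily_growth_gb

def dedicated_storage_gb(desktops, base_image_gb):
    """Dedicated: each differencing disk can keep growing toward the
    full base image size, so plan for the worst case up front
    (no under-committing possible)."""
    return desktops * base_image_gb

# 200 desktops, ~2 GB of writes per day, 40 GB master image:
print(pooled_storage_gb(200, 2))      # -> 400 GB peak, reclaimed nightly
print(dedicated_storage_gb(200, 40))  # -> 8000 GB allocated up front
```

Even with made-up numbers, the order-of-magnitude gap explains why pooled designs dominate and why dedicated desktops need to be sized so carefully.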
If storage isn’t an issue (and often it isn’t) then management might be. In the end you’ll also need to manage these dedicated desktops on an individual basis, partly because with dedicated desktops it isn’t possible to update the underlying base (master) image without destroying the accompanying differencing disk. With pooled VMs we don’t have to worry about any persistent data: we can just update the base image, assign it to our pooled machines, reboot, and we’re ready to go. On top of that, it’s only a matter of time before each user starts installing their own applications, making each desktop just a bit different from the others. Of course there are all sorts of automation tools out there that can assist you with these kinds of tasks, but I’ll leave that up to you. Still, this solution offers some big advantages over ‘normal’ hardware based desktops.
Fortunately there are several vendors out there, Citrix included, offering us smart technologies to overcome most, if not all, of the issues discussed. In part two of this series I’ll discuss XenDesktop Personal vDisks, Citrix PVS, PernixData and Atlantis ILIO as potential life savers when designing, troubleshooting and/or upgrading your virtual desktop. Some of the concepts discussed can be hard to get your head around without going into too much technical detail; nevertheless I hope this has been somewhat informative, giving you at least a basic understanding of storage, IOPS and VDI in general, including some of the challenges that come with them. Stay tuned.
Reference material used: