
Virtualization: Storage

Storage

Storage virtualization abstracts the physical storage resources, and the hypervisor presents a logical representation of them to the virtual machines. In full virtualization, guest operating systems are unaware that they run in a virtualized environment; they believe they control the storage devices directly. In reality, each virtual machine sees only a fraction of the physical disk space, which it shares with other guests.

Storage solutions

Direct Attached Storage (DAS) is connected directly to the host machine through a cable and can be either an internal or an external hard disk drive. A Storage Area Network (SAN), by contrast, consists of storage devices interconnected through a network that servers and computers can connect to. The SAN switch is responsible for the connection to the shared storage pool. Network Attached Storage (NAS) is storage that is accessed over a network, for example through Wi-Fi or an Ethernet cable. To access files on a NAS server, the connecting device needs to support the file-sharing protocol the server uses, such as NFS or SMB; it does not need to use the same on-disk file system.

Storage optimization

Consolidation is often the root cause of storage-related performance issues: with many virtual machines on one host, the shared storage has to serve a large number of I/O requests.

RAID – Redundant Array of Inexpensive Disks

Storage can be optimized with RAID (Redundant Array of Inexpensive Disks) for higher availability. With disk mirroring, the exact same data is written to two or more disks, so a copy survives if the original disk fails. Mirroring therefore protects against data loss and interruptions, and it also allows reads to be served from several disks. With disk striping, a file is split into pieces (“stripes”) that are written to multiple disks; because the disks work concurrently, reading and writing the file takes less time.
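
To make the difference between striping and mirroring concrete, here is a minimal sketch that models disks as in-memory Python lists. The function names (stripe, unstripe, mirror, read_mirrored) and the 4-byte chunk size are assumptions chosen for illustration; a real RAID controller works on fixed-size blocks in hardware or in the kernel.

```python
from __future__ import annotations


def stripe(data: bytes, disk_count: int, chunk_size: int = 4) -> list[list[bytes]]:
    """Split data into chunks and deal them round-robin across the disks."""
    disks: list[list[bytes]] = [[] for _ in range(disk_count)]
    for offset in range(0, len(data), chunk_size):
        chunk_index = offset // chunk_size
        disks[chunk_index % disk_count].append(data[offset:offset + chunk_size])
    return disks


def unstripe(disks: list[list[bytes]]) -> bytes:
    """Reassemble striped data by reading the chunks back in round-robin order."""
    pieces = []
    longest = max(len(disk) for disk in disks)
    for row in range(longest):
        for disk in disks:
            if row < len(disk):
                pieces.append(disk[row])
    return b"".join(pieces)


def mirror(data: bytes, disk_count: int = 2) -> list[bytes | None]:
    """Write an identical copy of the data to every disk."""
    return [bytes(data) for _ in range(disk_count)]


def read_mirrored(disks: list[bytes | None]) -> bytes:
    """Return the data from the first disk that has not failed.

    A failed disk is represented by None in this sketch.
    """
    for copy in disks:
        if copy is not None:
            return copy
    raise IOError("all mirrors have failed")


if __name__ == "__main__":
    payload = b"ABCDEFGHIJ"
    striped = stripe(payload, disk_count=2)
    print(striped)                 # each disk holds only part of the file
    print(unstripe(striped))       # the original payload, reassembled

    mirrors = mirror(payload)
    mirrors[0] = None              # simulate failure of the first disk
    print(read_mirrored(mirrors))  # the data is still readable
```

Note that striping alone offers no redundancy: losing one disk in the striped layout loses part of every file, which is why striping and mirroring are often combined.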

Bandwidth

Apart from enough storage capacity, there must also be enough bandwidth to access the stored data. Storage migration can be automated to resolve issues with disk performance. When there is contention, policies that control network traffic by prioritizing specific machines or applications are helpful.
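
One possible prioritization policy is proportional sharing: under contention, each machine receives bandwidth in proportion to an assigned weight, and never more than it asks for. The sketch below is an illustration of that idea only, not the mechanism of any particular hypervisor; the function name allocate_bandwidth, the machine names, and the share values are all hypothetical.

```python
def allocate_bandwidth(capacity, demands, shares):
    """Divide capacity among machines in proportion to their shares.

    No machine receives more than its demand; bandwidth left over by
    satisfied machines is redistributed among the others.
    Assumes every share value is positive.
    """
    allocation = {vm: 0.0 for vm in demands}
    active = {vm for vm in demands if demands[vm] > 0}
    remaining = float(capacity)

    while active and remaining > 1e-9:
        total_shares = sum(shares[vm] for vm in active)
        satisfied = set()
        handed_out = 0.0
        for vm in active:
            offer = remaining * shares[vm] / total_shares
            need = demands[vm] - allocation[vm]
            grant = min(offer, need)
            allocation[vm] += grant
            handed_out += grant
            if grant >= need - 1e-9:
                satisfied.add(vm)
        remaining -= handed_out
        if not satisfied:
            break  # every machine still wants more: capacity is used up
        active -= satisfied

    return allocation


if __name__ == "__main__":
    demands = {"db": 80, "web": 50, "backup": 50}  # hypothetical demands in MB/s
    shares = {"db": 4, "web": 2, "backup": 1}      # hypothetical priorities
    print(allocate_bandwidth(100, demands, shares))
```

With 100 MB/s of capacity and 180 MB/s of total demand, the three machines receive roughly 57.1, 28.6, and 14.3 MB/s. If capacity exceeds total demand, every machine simply gets what it asked for.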

Tiered storage

With tiered storage, less critical applications can be placed on lower tiers with slower disks. High-performance applications can be placed on higher tiers, which combine faster disks such as SSDs with bandwidth prioritization and an appropriate RAID setup. Allocating appropriately means giving each application precisely what it needs to perform. Putting a less critical application on a higher tier would be a waste of resources, because the application does not need that performance to run.
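
The placement rule described above, giving an application what it needs and no more, can be sketched as picking the cheapest tier that still meets the requirement. The tier names, IOPS ceilings, and cost figures below are invented for illustration, and choose_tier is a hypothetical helper rather than part of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tier:
    name: str
    max_iops: int       # illustrative performance ceiling for the tier
    cost_per_gb: float  # illustrative relative cost


# Hypothetical tiers, fastest and most expensive first.
TIERS = [
    Tier("tier 1 (SSD)", 50_000, 0.20),
    Tier("tier 2 (fast HDD)", 2_000, 0.08),
    Tier("tier 3 (slow HDD)", 300, 0.03),
]


def choose_tier(required_iops, tiers=TIERS):
    """Pick the cheapest tier whose performance still meets the requirement."""
    candidates = [tier for tier in tiers if tier.max_iops >= required_iops]
    if not candidates:
        raise ValueError("no tier is fast enough for this application")
    return min(candidates, key=lambda tier: tier.cost_per_gb)


if __name__ == "__main__":
    for app, iops in [("archive", 150), ("web server", 1_000), ("database", 20_000)]:
        print(app, "->", choose_tier(iops).name)
```

An archive application that needs 150 IOPS lands on the slowest tier here, while the database is the only one that justifies the SSD tier.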

Thin provisioning & Data deduplication

Thin provisioning and data deduplication are methods for using storage efficiently. The opposite of thin provisioning is thick provisioning, in which the virtual machine immediately consumes all of its pre-allocated disk space, even if the space remains unused by the machine. With thin provisioning, the virtual machine consumes disk space only when it is needed. Data deduplication ensures that only one copy of identical data is stored: the original is flagged, and duplicates are replaced with pointers back to the original so that no space is used unnecessarily.
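
Both ideas can be sketched in a few lines. In the following illustration, ThinDisk allocates a block only on first write, and DedupStore keeps one copy of each distinct block and records pointers (here, SHA-256 digests) for everything else. Both class names, the 4-byte block size, and the use of a hash as the pointer are assumptions made for this sketch; real systems differ in block size and in how they detect duplicates.

```python
import hashlib


class ThinDisk:
    """A virtual disk that consumes space only for blocks that were written."""

    def __init__(self, provisioned_blocks):
        self.provisioned_blocks = provisioned_blocks  # size promised to the guest
        self.allocated = {}                           # block number -> data

    def write(self, block_no, data):
        if not 0 <= block_no < self.provisioned_blocks:
            raise IndexError("block outside the provisioned range")
        self.allocated[block_no] = data

    def used_blocks(self):
        return len(self.allocated)


class DedupStore:
    """Stores each distinct block once and refers to it by its digest."""

    def __init__(self, block_size=4):
        self.block_size = block_size
        self.blocks = {}  # digest -> the single stored copy of the block
        self.files = {}   # file name -> list of digests (the pointers)

    def write(self, name, data):
        pointers = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)  # keep only the first copy
            pointers.append(digest)
        self.files[name] = pointers

    def read(self, name):
        return b"".join(self.blocks[digest] for digest in self.files[name])

    def stored_bytes(self):
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    disk = ThinDisk(provisioned_blocks=1000)
    disk.write(0, b"boot")
    disk.write(7, b"data")
    print(disk.used_blocks(), "of", disk.provisioned_blocks, "blocks in use")

    store = DedupStore()
    store.write("a.txt", b"AAAABBBBAAAA")
    store.write("b.txt", b"AAAACCCC")
    print(store.stored_bytes(), "bytes stored for 20 bytes written")
```

A thick-provisioned disk would reserve all 1000 blocks up front; the thin disk above occupies two. The deduplicating store holds three distinct blocks (12 bytes) even though 20 bytes were written across the two files.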

Sources

Comer, D. 2021. The Cloud Computing Book
Portnoy, M. 2016. Virtualization Essentials. 2nd ed.
Shackleford, D. 2012. Virtualization Security