IO Virtualization and Convergence in Consolidating Data Centers
Written by Shreyas Shah   
Wednesday, 28 May 2008
With sustained attention from industry leaders and market forces, virtualization is now a reality in data centers, and IO virtualization is the next big step in this direction. Applied together with IO convergence technology, a unified fabric solves real customer issues: the architecture lowers TCO and OPEX, increases performance and reduces latency for applications running in a virtualized environment.

PCI Express is the de facto standard IO bus in servers, and IO virtualization (IOV) has been defined to run on top of it. The PCI-SIG has defined two IOV standards, SR-IOV and MR-IOV, which can be coupled with FCoE to run various data streams on a single cable. These virtualized, converged IO streams carry IPC, storage and networking protocols. MR-IOV coupled with FCoE cuts TCO by 70% and cuts management, power and cooling costs by more than 70%.

Fibre Channel over Ethernet (FCoE), hosted on 10 Gbps or 40 Gbps Ethernet, extends the reach of Fibre Channel networks, allowing them to connect virtually every data center server (CPU and memory) to a centralized pool of storage. FCoE integrates seamlessly with existing Fibre Channel equipment, protecting not only capital investment but also investments in storage management software, policies and administrators. This ability to work with existing infrastructure makes FCoE an evolutionary technology, one that data centers can deploy at the pace and to the extent that best serves their needs. FCoE allows Internet Protocol (IP) and Fibre Channel traffic to be carried together over FCoE-aware drivers, NICs and MR-IOV switches, so a single cabling infrastructure can serve rack-mounted or blade servers.
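One way to picture IP and Fibre Channel traffic sharing a single cable is that each Ethernet frame carries an EtherType that identifies which stream it belongs to. The following is a minimal sketch of such a classifier, not a driver or switch implementation. The EtherType constants are the IEEE-assigned values and are not taken from this article; the helper names classify_frame and make_frame are hypothetical.

```python
import struct
from typing import Optional

# IEEE-assigned EtherType values (assumption: not listed in the article itself).
ETHERTYPE_IPV4 = 0x0800
ETHERTYPE_VLAN = 0x8100  # 802.1Q tag, commonly present on converged links
ETHERTYPE_FCOE = 0x8906  # Fibre Channel over Ethernet
ETHERTYPE_FIP = 0x8914   # FCoE Initialization Protocol

TRAFFIC_CLASS = {
    ETHERTYPE_IPV4: "ip",
    ETHERTYPE_FCOE: "fcoe",
    ETHERTYPE_FIP: "fip",
}


def classify_frame(frame: bytes) -> str:
    """Return the traffic class of a raw Ethernet frame (no preamble, no FCS)."""
    if len(frame) < 14:
        raise ValueError("frame shorter than an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype == ETHERTYPE_VLAN:
        # Skip the 4-byte 802.1Q tag and read the inner EtherType.
        if len(frame) < 18:
            raise ValueError("truncated 802.1Q tag")
        (ethertype,) = struct.unpack_from("!H", frame, 16)
    return TRAFFIC_CLASS.get(ethertype, "other")


def make_frame(ethertype: int, payload: bytes = b"", vlan: Optional[int] = None) -> bytes:
    """Build a dummy frame with zeroed MAC addresses, for demonstration only."""
    header = bytes(12)  # destination MAC + source MAC
    if vlan is not None:
        header += struct.pack("!HH", ETHERTYPE_VLAN, vlan & 0x0FFF)
    return header + struct.pack("!H", ethertype) + payload


if __name__ == "__main__":
    for frame in (make_frame(ETHERTYPE_IPV4), make_frame(ETHERTYPE_FCOE, vlan=100)):
        print(classify_frame(frame))
```

A converged adapter or switch would hand "ip" frames to the network stack and "fcoe" frames to the Fibre Channel stack, which is what lets both share one physical link.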

This simplifies network topology while reducing cabling cost and complexity, eliminating I/O adapter cards in the rack and reducing power and cooling overhead, all while improving bandwidth by leveraging 10 Gbps Ethernet. As FCoE-enabled storage systems become available, data centers can implement a fully converged IO fabric reaching from servers to storage, using FCoE-aware core switches and MR-IOV switches in the access layer.

This architecture is the first step towards stateless computing within racks or blade servers. Together with FCoE convergence, it proves to be the best strategy for consolidating adaptive data centers in which a server is just CPU and memory, while IO devices are pulled out of the servers and connected to CPUs on the fly. It gives agile data centers the utmost flexibility: a resource manager connects to the infrastructure, spawns an application on the next available CPU and memory, interconnects that CPU and memory with IO devices, and can migrate applications on the fly to increase server utilization. Extending this concept from racks and blade servers to whole data centers, one can imagine cloud computing that connects many stateless computers through FCoE clouds.
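The resource manager behavior described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not a real product API: the class names ComputeNode and ResourceManager, the device names, and the first-free placement policy are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Optional


@dataclass
class ComputeNode:
    """A stateless server: CPU and memory only, no local IO adapters."""
    name: str
    app: Optional[str] = None  # application currently running, if any
    io_devices: List[str] = field(default_factory=list)


class ResourceManager:
    """Hypothetical manager that places apps and attaches shared IO devices."""

    def __init__(self, nodes: Iterable[ComputeNode], io_pool: Iterable[str]):
        self.nodes: Dict[str, ComputeNode] = {n.name: n for n in nodes}
        self.io_pool: List[str] = list(io_pool)  # shared devices not yet attached

    def spawn(self, app: str, io_needed: int) -> str:
        """Start app on the next free node and attach io_needed devices to it."""
        node = next((n for n in self.nodes.values() if n.app is None), None)
        if node is None or len(self.io_pool) < io_needed:
            raise RuntimeError("no free capacity")
        node.app = app
        node.io_devices = [self.io_pool.pop(0) for _ in range(io_needed)]
        return node.name

    def migrate(self, app: str, target: str) -> str:
        """Move app to target; its IO devices follow it through the fabric."""
        src = next((n for n in self.nodes.values() if n.app == app), None)
        dst = self.nodes[target]
        if src is None:
            raise RuntimeError("application not found")
        if dst.app is not None:
            raise RuntimeError("target node is busy")
        dst.app, dst.io_devices = src.app, src.io_devices
        src.app, src.io_devices = None, []
        return dst.name


if __name__ == "__main__":
    rm = ResourceManager(
        [ComputeNode("blade0"), ComputeNode("blade1")],
        ["nic0", "hba0", "nic1"],
    )
    print(rm.spawn("web", io_needed=2))
    print(rm.migrate("web", "blade1"))
```

The point of the sketch is that the IO devices are bookkeeping entries in a shared pool rather than cards inside a particular server, so migrating an application only changes which node the devices are mapped to.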

Introduction

Most of today's data centers are not flexible enough to adapt to ever-changing business demands. Business cycles are shortening and require agile data centers that respond to new business needs in minutes rather than weeks. Because data centers are rigid and provisioning takes weeks, some IT managers ordered systems they did not need, which increased the amount of server, storage and networking hardware. In addition, running one application per server created over-provisioned data centers and drove the utilization of servers and network hardware down into the teens, in percentage terms.
Over-provisioned, under-utilized data centers and hardware have created a huge problem, since management cost is proportional to the number of systems in the data center.

To create fluid data centers and increase the utilization of their systems, hypervisors, IO virtualization and the FCoE network convergence standard have been created. Together they raise utilization, reduce TCO and OPEX, and reduce the number of access switches and adapters.

Hypervisors were created to virtualize the CPU, memory and IO devices, increasing performance and driving utilization higher. These software layers have had a huge impact on server utilization. However, running multiple guest OSes on one server affects IO performance, and IO-heavy workloads have a large impact on the performance of the system. As dual- and quad-core CPUs make a splash in the market, the number of virtual machines per server will increase and will require larger and fatter pipes on servers. PCI Express is the standard that provides chip-to-chip connectivity in servers, replacing the older generation PCI and PCI-X buses. MR-IOV is one of the standards developed by the PCI-SIG to address these issues. FCoE is another standard; it addresses network convergence, moving from separate FC and Ethernet networks to FCoE running on enhanced Ethernet.

Some of the business benefits of MR-IOV combined with FCoE are as follows.

No interface adaptors per server
Virtual patch panel of MR-IOV switches
Reduction in data center cabling
No access layer switches
Power and cooling savings
Seamless integration with existing FC infrastructure
Seamless management integration

Current Three Tiered Data Center Architectures
Data centers are typically designed as three-tiered architectures: a web tier, a mid-application tier and a database tier. Each tier has its own server, networking and storage needs, and products are designed with pricing and feature sets that match an individual tier. Applications run one per server. This architecture is very rigid: applications cannot easily be migrated from one server to another. The number of applications that data centers have to manage keeps increasing, while fewer applications are being retired (one of the reasons is SOX), which creates application sprawl. Every server needs about ten connections and hence ten wires.

fig1.jpg

Figure 1 Traditional Three Tiered Data Center

With x86 being pushed into data centers from the web tier as well as the database tier, x86 servers are now used everywhere within data centers. The proprietary SPARC, HP-UX and IBM AIX platforms are used only in the parts of the data center where rewriting applications is a big challenge.

Issues with Current Data Center Architectures
Data center managers see many issues. Applications are tightly coupled to servers and cannot easily migrate from server to server, server-to-server connectivity is complex, and each server requires a large number of connections. This in turn drives TCO and OPEX higher while leaving server hardware under-utilized.

To summarize, the major issues are:

1. Server utilization
2. Application Migration capability
3. The need to drive server utilization higher and give applications flexibility
4. Number of connections per server
5. Six access switches

Today's servers require ten connections each. This is another headache for data center management, and the port connectivity cost per server is high. The ten connections are listed below.

1. Two connections for backup
2. Two connections for KVM
3. Two connections for networking data traffic
4. Two connections for Storage data traffic
5. Two connections for IPC traffic

Consolidation and Convergence in Data Centers: Next Gen Data Centers

Data centers are being consolidated and converged. This includes x86 becoming the primary platform and the ten connections per server being replaced with two. Hypervisors are a crucial component of next generation data center architectures, virtualizing the servers and driving utilization higher.

Reducing the number of connections per server from ten to two also reduces the number of access switches to two. Applications run on virtual machines, which allows retired or semi-retired applications to be consolidated on a small set of servers.
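The effect of going from ten connections to two can be made concrete with a little arithmetic. The sketch below uses the five connection types listed earlier in this article; the server count of 40 is an illustrative assumption, not a figure from the article, and the function name cabling_reduction is hypothetical.

```python
# Connections per server in the traditional design, as listed in this article.
LEGACY_CONNECTIONS = {
    "backup": 2,
    "kvm": 2,
    "network": 2,
    "storage": 2,
    "ipc": 2,
}

# Connections per server once IO is converged onto the unified fabric.
CONVERGED_CONNECTIONS = 2


def cabling_reduction(servers: int):
    """Return (legacy cables, converged cables, fraction of cables removed)."""
    legacy = servers * sum(LEGACY_CONNECTIONS.values())
    converged = servers * CONVERGED_CONNECTIONS
    return legacy, converged, 1 - converged / legacy


if __name__ == "__main__":
    # 40 servers is an assumed example size, chosen only for illustration.
    legacy, converged, saved = cabling_reduction(40)
    print(f"{legacy} cables -> {converged} cables ({saved:.0%} fewer)")
```

Because both counts scale linearly with the number of servers, the fraction of cables removed is the same for any deployment size.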

The following figure shows the next generation data center, with servers consisting only of CPU and memory, a unifying IO fabric, and IO devices feeding into core/director class switches. These switches will be consolidated into one switch, reducing the number of connections from ten to two.

fig2.jpg

Figure 2 Next Generation Data Center


MR-IOV: The Perfect Storm with Gen-3 and FCoE
MR-IOV (Multi-Root IO Virtualization) is the standard developed by the PCI-SIG to drive IO consolidation, virtualization and shared IO. MR-IOV together with FCoE will drive consolidation in data centers with 10x the bandwidth of the 1 GE available today. Latency through an MR-IOV switch is in the range of 150 to 200 ns, versus tens of µs for Gigabit Ethernet, 300 to 400 ns for 10 GE and 2 µs for a Fibre Channel switch. The PCI Express (IO bus) signaling rate has gone from 2.5 GT/s to 5.0 GT/s and is moving towards 8.0 GT/s, which with x8 links will provide roughly 64 Gbps in each direction within the next two years. Today, an x8 port provides 32 Gbps, which is enough to drive 25 Gbps or more in each direction.
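The x8 link figures quoted above follow from the per-lane signaling rate and the line encoding. The sketch below works through that arithmetic. The encoding overheads (8b/10b at 2.5 and 5.0 GT/s, 128b/130b at 8.0 GT/s) come from the PCI Express specification rather than from this article, and the function name link_bandwidth_gbps is hypothetical.

```python
# generation -> (signaling rate in GT/s per lane, payload bits, encoded bits)
# Encoding schemes are taken from the PCI Express specification (assumption:
# the article itself does not state them).
PCIE_GENERATIONS = {
    "Gen1": (2.5, 8, 10),
    "Gen2": (5.0, 8, 10),
    "Gen3": (8.0, 128, 130),
}


def link_bandwidth_gbps(gen: str, lanes: int = 8):
    """Return (raw, usable) bandwidth in Gbps per direction for a link."""
    rate, payload_bits, encoded_bits = PCIE_GENERATIONS[gen]
    raw = rate * lanes                          # bits on the wire
    usable = raw * payload_bits / encoded_bits  # after line-encoding overhead
    return raw, usable


if __name__ == "__main__":
    for gen in PCIE_GENERATIONS:
        raw, usable = link_bandwidth_gbps(gen)
        print(f"{gen} x8: raw {raw:.1f} Gbps, usable {usable:.1f} Gbps per direction")
```

Under these assumptions, the 32 Gbps quoted for today's x8 port matches a 5.0 GT/s link after 8b/10b encoding, and the 64 Gbps figure matches the raw rate of an x8 link at 8.0 GT/s. Protocol overhead from packet headers reduces the achievable payload rate further, which is consistent with the article's estimate of 25 Gbps or more.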

A Fibre Channel switch typically helps in aggregating server devices and does not help with server-to-server communication. MR-IOV adds server-to-server communication, making for a unified IO fabric that can be deployed in blade servers and rack-mounted servers within data centers.

In this architecture, rack-mounted or blade servers attach to two FCoE access switches that feed into core/director class switches. NFS devices may sit behind the core network cloud to connect to storage for file IO, while block IO is provided by the same combined networking and storage switches. Edge routers feed into the ISP network, and there will be multiple appliances in the network.

As shown below, a server can be physical or virtual, depending on whether hypervisors are used in the environment.

fig3.jpg

Figure 3 Next Generation Architectures


Conclusion

As shown in this paper, data centers are going through a transition. The PCI-SIG is addressing data center issues with MR-IOV, and ANSI T11 is standardizing FCoE to merge the various server connections into two.

MR-IOV and FCoE end points will solve the problems described earlier and provide major business benefits such as application migration, shared IO, IO virtualization and network convergence. This addresses the issues that IT and operations managers face today and will face in the future.

Future Work

A huge amount of work lies ahead to implement stateless computing and cloud computing architectures. The resource manager has to be agile and very responsive to changes in business policies, which are translated into infrastructure policies. QoS is another challenge to be solved in cloud computing.

Future work will include end-to-end QoS, where the priority of packets and messages entering the cloud is respected from the entry point all the way up the stack through the virtual machines. Resources will be assigned, and bandwidth and latency enforced, so that mission-critical applications can run alongside non-mission-critical applications within the same cloud.

 

About the Author:  Shreyas Shah is Chief Architect with PLX Technology (www.plxtech.com)
