Hoov's Musings (volume 7, number 4)

The Ripple Effect: Part 3

As discussed in my previous Musing (Volume 7, Number 3), it seems to me that the technology soon being brought to market by Precision I/O has the potential to create a reasonably sized Ripple Effect. 

As stated in the previous Musing:

If all of a sudden, for a small incremental cost, commodity servers got 5x more efficient in terms of I/O – if they could actually run at wire speed at a fraction of the latency of present Ethernet/IP/TCP systems and at the same time return almost all of the CPU cycles presently consumed servicing network I/O to the server for application processing - what would change in the industry and what are the resultant opportunities created?

To create a Ripple Effect, a significant system disturbance is required to shift energy balances and put things in motion. In our scenario, the disturber that initiates the Ripple Effect doesn’t have to be the Precision I/O technology. RDMA/TOE solutions could potentially create the same disturbance. But I like Precision I/O better as the disturber, and will continue to use it as the example in this Musing, because (a) it promises to be transparent to existing applications, OSs, and server hardware, and (b) it is general purpose, in that it can improve the performance of any server (or client, for that matter) engaged in significant I/O. That means it could be adopted broadly and rapidly, and could therefore create a substantial disturbance over a concentrated period of time.

RDMA/TOE, on the other hand, will probably take longer to get adopted, and in smaller volumes, because special hardware is needed, applications need to change, and not all applications and environments benefit. The disturbance would be drawn out and might not create enough energy to really change things much.

This example shows that, unlike the advent of PCs or ATM (which never really overcame the issue) but more like the introduction of browsers or switched Ethernet, a new idea with Ripple Effect potential doesn’t have to be disruptive in terms of deployment. The very power of Precision I/O, and the reason I think it will be deployed widely, is its transparency to other system elements in terms of interoperability. So in this case the Ripple Effect could be caused by a radical change in thinking about other aspects of the overall ecosystem, as the underlying assumptions about server capability change dramatically, rather than by the need to change a lot of things to ensure interoperability.

So let’s move now to the conjectural portion of this series of Musings.  What could the impact of Precision I/O technology be?  Who are the potential winners and losers, and where are the new opportunities?

In this Musing I’ll focus on the server and application Ripple Effects.  In my next Musing I’ll focus on the networking Ripple Effects. 

RNIC Adoption

In the beginning, there was TOE. But the world discovered that TOE (whether hardware- or software-centric) didn’t really add much value at Gigabit speeds, because getting data out of application space and into the networking stack (be it kernel or TOE) was the real system bottleneck. So then there was RDMA and so-called RNICs. I believe every TOE vendor is now an RNIC vendor (... and called it macaroni ...). But there is still controversy over the advantage of RNICs at 1 Gbps, and for which applications under which loads.

I believe that one impact of Precision I/O is that it cleanly and clearly resolves that issue. As a software solution, Precision I/O provides the perfect RNIC “avoidance strategy” for server connectivity from 400 Mbps to a few Gbps. Also, Precision I/O’s technology can go places that RNICs can’t. Precision I/O doesn’t have to wait for the iSCSI market to happen or for RDMA standards to converge. Precision I/O can find a home in the huge installed base of application and database servers, and possibly even web servers. Once the technology finds a home there, it will simply be assumed as part of the solution in other environments that (prior to Precision I/O) we used to think of as targets for RNICs.

While Precision I/O’s software can run on 10 Gig NICs to improve performance and server efficiency, actually driving 10 Gbps in and out of the server will require hardware acceleration on a special NIC. Thus at 10 Gbps, the playing field is more level. Those RNIC vendors who focused on 10 Gbps solutions may still find a small niche there. It depends a lot on the timing and characteristics of the Precision I/O 10 Gbps solution. But I’d rather dominate the 1 Gbps market and share the 10 Gbps market than vice versa.

But this doesn’t reflect a Ripple Effect.  It’s merely market segment dominance. 

Ripple Effect Winners

· Precision I/O

Ripple Effect Losers

· 1 Gbps RNIC vendors

Data Center Application Deployment Impacts

Deployment of Precision I/O technology, combined with the ongoing effects of Moore’s Law and the advent of 64-bit computing, should result in a big uplift in server scalability and speed by 2005. What Ripple Effects could that have on Data Center application and server deployment?

Server Count Reduction?

The most obvious question to ask is, if servers become more efficient, will we need fewer of them?  I think the answer is probably no.  In general an increase in the efficiency of a technology tends to lead to greater deployment of that technology. In this case, I think there are a lot of people who would deploy more servers to achieve desired application performance levels, but don’t today because of some combination of (a) management complexity of too many servers, (b) running into physical limits – space, power, cooling, cabling, (c) the available clustering techniques either won’t support more servers or provide diminishing returns, or (d) simple economics. 

If each server unit is more powerful, Data Centers will probably be built out to the same limits, but the overall application processing and I/O capacity will increase by 5x to 10x (taking into account the combined effect of Precision I/O, Moore’s Law, and 64-bit computing), improving overall performance and providing more headroom to handle widely varying application loads.

That’s an easier and more conventional way to improve performance than utility computing, and it will therefore probably slow the present saunter towards utility computing in the enterprise. Instead, since more powerful “reserve” servers will cost less, I predict that best practices for providing some degree of utility or flexi-computing will involve holding some servers in reserve and bringing them into play if the production server(s) fail or head towards overload. This is a lot simpler than trying to take computing resources away from one application in order to give them to another, and is therefore a more practical alternative.
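The reserve-server idea can be sketched in a few lines of Python. This is only an illustration of the policy described above, not anything Precision I/O or any vendor ships: the function names, the 80% overload threshold, and the load and capacity figures are all assumptions made up for the example.

```python
import math

def servers_needed(load, capacity_per_server, overload_threshold=0.8):
    """Smallest server count that keeps utilization at or below the threshold.

    The 0.8 default is an assumed figure, not one taken from the Musing.
    """
    usable_per_server = capacity_per_server * overload_threshold
    return max(1, math.ceil(load / usable_per_server))

def reserves_to_activate(load, healthy_production, reserve_pool,
                         capacity_per_server, overload_threshold=0.8):
    """How many reserve servers to bring into play right now.

    Covers both triggers from the text: production servers failing
    (healthy_production drops) and load heading towards overload.
    """
    needed = servers_needed(load, capacity_per_server, overload_threshold)
    shortfall = max(0, needed - healthy_production)
    # We can never activate more servers than the reserve pool holds.
    return min(shortfall, reserve_pool)

# Hypothetical numbers: each server handles 100 units of load.
print(reserves_to_activate(500, 10, 3, 100))  # normal day -> 0
print(reserves_to_activate(900, 10, 3, 100))  # load spike -> 2
print(reserves_to_activate(700, 8, 3, 100))   # two servers failed -> 1
```

Note that nothing here takes resources away from a running application; the only decision is whether to switch on idle hardware, which is why this approach is simpler than cross-application resource shuffling.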

Having more powerful server units, especially if the overhead of supporting “sideways communications” within the cluster is reduced, will also take the pressure off the development of expanded clustering techniques. There is a wide range of things called “clusters” and of technology approaches supporting them. But in general, the fewer the units in a cluster, the easier it is to implement and manage an effective clustering technique. For a cluster that requires synchronization among the cluster elements, clustering between four CPUs doesn’t seem to be too hard, between eight is harder but seems doable, and beyond that becomes a real challenge. But with higher performance servers, four, or maybe eight as a stretch goal, is adequate to cover the vast majority of application and database cluster requirements. (I’m not talking about big high performance computing clusters here; they’ll also get smaller with higher performance CPUs but will still be in the 100s of CPUs at the high end.) So we may be closer to achieving practical clustering capability than we previously thought.

Instead, as servers get more powerful, I think there will be more focus on “reverse clustering.” By this I mean support for multiple applications on one powerful server (as opposed to spreading application delivery across platforms), à la VMware for example. To ensure that such applications don’t stomp on one another, dynamic resource allocation and performance monitoring are required. That has the look and feel of utility computing, but within a server rather than between servers, which I think is both different and easier.

Lowered Advantage For Blade or Rack Servers?

With servers becoming more powerful, will there be a reduced desire for and uptake of Blade and Rack Servers?  Since I don’t think that the number of servers will decrease, I also don’t think there will be any lessening of the physical and management values of Blade and Rack Servers, which are all about making lots of servers easier to deploy and operate. 

Some of the details of Blade and Rack server design may need to change, however. In next month’s Musing I’m going to talk in more detail about the network-related Ripple Effects. But I’ll preview some of that here because it is relevant to this topic. Basically, you are going to need more network bandwidth per CPU because the server-to-network capability gap is going to be eliminated. That could mean that some Blade server vendors need to accelerate their I/O product line roll-out. The IBM BladeCenter, for instance, has 14 internal point-to-point GigE links, but its I/O cards only support 4 external GigE links (although with the Nortel/Alteon option for I/O cards you can use 2 I/O cards and load balancing to get something between 4 and 8 Gbps effective I/O). If the CPUs get more I/O efficient, that sort of over-subscription won’t fly any more. So IBM will need to offer 10 Gbps external connections (perhaps) sooner than it might otherwise have planned. Although I’ve used IBM as the example here, all the Blade server vendors will need to do something to ensure that I/O is not a bottleneck.
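To put rough numbers on that over-subscription, here is a back-of-the-envelope sketch in Python. The link counts come from the BladeCenter example above; the function name, the single 10 Gbps uplink case, and the assumption that every blade can actually fill its 1 Gbps internal link are mine, for illustration only.

```python
def oversubscription(internal_links, external_gbps, link_gbps=1.0):
    """Ratio of blade-facing bandwidth to external (uplink) bandwidth.

    A ratio of 1.0 means no over-subscription; higher means the blades
    can collectively offer more traffic than the chassis can carry out.
    """
    return (internal_links * link_gbps) / external_gbps

# One I/O card: 14 internal GigE links sharing 4 external GigE links.
print(oversubscription(14, 4))    # 3.5

# Two I/O cards with load balancing, best case of 8 Gbps effective.
print(oversubscription(14, 8))    # 1.75

# Hypothetical single 10 Gbps external connection.
print(oversubscription(14, 10))   # 1.4
```

As long as servers can only use a fraction of their link, a 3.5:1 ratio goes unnoticed. Once each blade can run at wire speed, the ratio becomes the real limit.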

Ripple Effect Winners

· Customers

· EMC/VMware and other “virtual computer” competitors

Ripple Effect Losers

· Cross-server resource management vendors

· Slow-moving Blade server vendors who lag in providing enough I/O to match the new I/O efficiency of the CPUs

Server Vendor Impacts

In general, I don’t think the Tier 1 server vendors are much impacted by the Precision I/O technology as long as it is generally available to all of them. The first reaction is to say, “gee, we’ll need fewer servers, therefore the server vendors’ revenues will plummet.” But as I explained above, I don’t think that will happen. In fact, the Tier 1 vendors might be aided by the introduction of the technology, because it washes out the value proposition of some of the Next-Gen server vendors. Those vendors have had to do a lot of difficult software and hardware development themselves to create more scalable and (often) function-specific servers (like NAS, streaming media servers, etc.), and are just starting to nip at the heels of the big guys.

A related effect is that network-oriented “software-on-a-stick” proxy appliances of various types will be able to work at Gbps or multi-Gbps speeds, extending the popularity of such product configurations beyond today’s DS-3/Fast Ethernet limits. Expensive purpose-built hardware/software solutions will then be relegated to aggregate throughput requirements of 10 Gbps and beyond.

Some server vendors could choose to put their heads in the sand and fight the adoption of Precision I/O, either because they are thinking short term and see unit volumes going down with more efficient servers, or because they have been funding their own programs to achieve similar results and don’t want to see the ROI of that eliminated. Big companies are highly susceptible to such thinking, the corporate equivalent of “steadily watching your own navel.” So it could happen for a while. But then someone like Dell, who absolutely loves to take advantage of other people’s R&D, will see the technology as a way to differentiate and will shame the more R&D-centric server vendors into following suit.

The disruption I could see happening in this space would occur if one server vendor appreciates the value (or threat) associated with the Precision I/O technology and decides to acquire it to differentiate its own offering from its competitors’. That could shift the balance of power for a while, as everybody will be working with the same CPUs, memories, hard drives, and operating systems, but one vendor will have some “special sauce” that makes the whole thing work better when networked.

Ripple Effect Winners

· Savvy server vendors

· Network-oriented appliance proxy vendors who have focused on software for features and scale

Ripple Effect Losers

· Stubborn server vendors

· “Next-Gen” server vendors

In my next Musing, I’ll talk about the potential Ripple Effects I see in the networking arena, which is where I think the real tsunami lies.  

Copyright ©1997-2004 Acuitive, Inc.  All Rights Reserved