Networks Europe Nov-Dec 2015 | Page 16

DATA CENTRES
Reliability Testing

Testing, Testing…

By: Giacomo Losio, Head of Technology, ProLabs

Introduction

Giacomo Losio discusses the conviction of a successful optical infrastructure

Reliability stems from a company’s culture, from its values and from the way products are conceived and designed; testing is the responsibility of everyone. Many companies can offer a working transceiver, but only a few offer reliable ones.

In the world of optical infrastructure equipment, the importance of reliability testing is sometimes understated. As a technology developer, few things cause as much delight as taking a new product all the way through a laborious design process and celebrating when it works exactly as you’d hoped. However, the inevitable and all-important question soon comes to mind: ‘but how long for?’

It goes without saying that in the optical transceiver space reliability and expected lifespan are absolute necessities. Data centre failures can be expensive in a number of different ways: the recurring damage to adjoining infrastructure components, the cost of delays caused by system downtime and the long-term damage to the provider’s reputation for trustworthiness.

For these reasons, the quality and reliability of data centre products have always been at the forefront of consumers’ minds. Although OEMs have long demanded large sums of money for their products, end-users have seen this expense as a way to insure against a catastrophic data centre failure, choosing to side with the big brands regardless of cost.

Historically, the lower-end data centre infrastructure market has been so saturated with underperforming providers that procurers have felt as though they were taking huge risks by purchasing parts from lesser-known brands. In effect, customers have tended to weigh up the costs of a data centre failure and decide that although OEMs are much more expensive, they are not as expensive as a product malfunction.
How Are Products Tested?

Reliability is the probability that a product will perform its intended function in a satisfactory manner, for a specified period of time, when operating under specified conditions. To be reliable, a product need not last forever; rather, it must be ‘predictable’: if it purports to last 20 years, it needs to work faultlessly for at least 20 years.

Reliability testing is based on benchmarks set by standards bodies such as Telcordia and the IEC, and even on military standards. Every device or subassembly used in the transceiver has to be qualified independently, and internal interconnects have to be verified with particular attention, since this is where mechanical stress often occurs and the device can fail. Tests are clearly defined and readily repeatable, with some running for as long as 2000-5000 hours (roughly 3-7 months).

Products must be tested in specific environments (QT tests), and in tests called ALT (accelerated life tests) and HALT (highly accelerated life tests). The provider is gauging whether or not the product can be released in the first place, and then pre-determining percentage failures in order to focus on continuous design improvement.

16 NETCOMMS europe Volume V Issue 6 2015 www.netcommseurope.com
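To make the definition of reliability above a little more concrete, here is a minimal Python sketch. It does two things: it converts the quoted test durations from hours into months, and it evaluates reliability as a probability of survival over a stated period. The article does not name a failure model; the constant-failure-rate (exponential) model used here is an assumption chosen for simplicity, and the failure rate of 100 FIT is a purely illustrative figure, not a measured or quoted value. The function names are hypothetical.

```python
import math

# Average month length in hours (8760 hours per year / 12).
HOURS_PER_MONTH = 730.0
HOURS_PER_YEAR = 8760.0


def hours_to_months(hours: float) -> float:
    """Convert a test duration in hours to average calendar months."""
    return hours / HOURS_PER_MONTH


def reliability(t_hours: float, failure_rate_per_hour: float) -> float:
    """Probability of surviving t_hours without failure.

    ASSUMPTION: constant failure rate, so R(t) = exp(-lambda * t).
    Real transceivers may follow other distributions.
    """
    return math.exp(-failure_rate_per_hour * t_hours)


if __name__ == "__main__":
    # Test durations quoted in the text.
    for test_hours in (2000, 5000):
        months = hours_to_months(test_hours)
        print(f"{test_hours} h is about {months:.1f} months")

    # Illustrative only: 100 FIT = 100 failures per 1e9 device-hours.
    assumed_rate = 100e-9
    lifetime = 20 * HOURS_PER_YEAR
    print(f"R(20 years) = {reliability(lifetime, assumed_rate):.3f}")
```

Under these assumptions the 2000-5000 hour range works out to a little under three to a little under seven months, consistent with the rough figure in the text. The reliability figure only shows how a claim such as "lasts 20 years" can be expressed as a probability.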