Redundant dedicated servers

This article introduces the techniques used to keep dedicated servers available and reachable even when part of the system fails.

When you have critical servers that must be available and working 24 hours a day, 365 days a year, you should try to minimize the failures that may affect the normal operation of the system. Failures will occur, but there are techniques and configurations that let you build redundant servers in which certain parts can fail without affecting operation.

A modern system needs many components to work, and the more components there are, the more likely it is that something will go wrong. Problems can occur in the server itself (disks, power supplies, network cards, etc.) and in the infrastructure the server depends on (network components, Internet access, the electrical system, etc.).

Below we discuss some of the techniques used to achieve redundancy. The degree of redundancy a system needs depends on its importance and on the money lost when a failure makes it unavailable. It is not worth investing in redundancy if the investment costs more than what you would lose in money, reputation and working hours if the system fails.
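The comparison between the cost of redundancy and the cost of downtime can be sketched as a small calculation. This is only an illustration: the function names and every figure below (failures per year, hours of downtime, loss per hour, yearly cost of the redundant setup) are hypothetical and not taken from any real system.

```python
def expected_annual_loss(failures_per_year, hours_down_per_failure, loss_per_hour):
    """Expected yearly loss from downtime: failures x duration x hourly loss."""
    return failures_per_year * hours_down_per_failure * loss_per_hour


def redundancy_pays_off(annual_redundancy_cost, loss_without, loss_with):
    """True when the yearly saving in expected loss exceeds the yearly cost."""
    return (loss_without - loss_with) > annual_redundancy_cost


# Hypothetical figures, chosen only to illustrate the comparison.
loss_single = expected_annual_loss(2, 8, 500)       # 2 failures/year, 8 h each
loss_redundant = expected_annual_loss(2, 0.5, 500)  # same failures, 30 min each

print("Expected loss without redundancy:", loss_single)
print("Expected loss with redundancy:   ", loss_redundant)
print("Worth 3000 per year?", redundancy_pays_off(3000, loss_single, loss_redundant))
```

A real estimate would also have to put a figure on reputation and lost working hours, which are much harder to quantify than direct revenue.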

The techniques and configurations discussed here are not unique to Linux servers. They can be applied to most other major operating systems and platforms. We focus on Linux because it is the main theme of "The corner of Linux."

Redundancy of components on the server

The components most commonly made redundant in a server are disks, network cards and power supplies. There are even servers with multiple CPUs that continue to work seamlessly when a CPU or memory module has failed.

Disks

Hard disks are the devices that store data. The most common failure in a server is the failure of a hard disk. If the server has only one disk and it fails, the server fails and the data on the disk can no longer be accessed. There are techniques that help us minimize this problem, so that the server keeps operating and no data is lost even when a hard disk fails. It is also usual to be able to replace the failed disk without shutting down the server (hot swap).

The most common technique is called RAID (redundant array of independent disks). With this technique we create a redundant set of disks that can help us both to increase the speed and performance of the storage system and to keep the system running even if one disk fails. There are software and hardware implementations of the different RAID configurations, the most common being RAID1, RAID5 and RAID10.
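To make the idea more concrete, here is a minimal sketch of two things: how much usable space each of the three common levels leaves from n equal disks, and how the XOR parity used by RAID5 lets a lost block be rebuilt from the surviving ones. The function names and the tiny four-byte "blocks" are illustrative assumptions. A real RAID implementation works on stripes and rotates parity across the disks, which this sketch does not model.

```python
from functools import reduce


def usable_disks(level, n):
    """How many disks' worth of space stays usable with n equal disks."""
    if level == "RAID1":       # every disk holds a full copy of the data
        if n < 2:
            raise ValueError("RAID1 needs at least 2 disks")
        return 1
    if level == "RAID5":       # one disk's worth of space is used for parity
        if n < 3:
            raise ValueError("RAID5 needs at least 3 disks")
        return n - 1
    if level == "RAID10":      # data striped across mirrored pairs
        if n < 4 or n % 2:
            raise ValueError("RAID10 needs an even number of disks, at least 4")
        return n // 2
    raise ValueError(f"unknown level: {level}")


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks and the parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second disk: XOR of the survivors and the parity
# gives back the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print("Rebuilt block matches the lost one:", rebuilt == data[1])
print("Usable disks with 4 disks in RAID5:", usable_disks("RAID5", 4))
```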

Network cards

The network card is the device that enables the server to communicate with the outside world. It is therefore very common for servers to have at least 2 network cards, to ensure that communication is not cut off if one of the cards fails.

Linux also has a technique called "bonding", by which we can use 2 or more network cards as if they were a single device, combining their capacity and gaining redundancy in case one of the cards fails.
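As a rough illustration of how bonding redundancy can be monitored, the sketch below parses a status text and reports which interface is active and whether the bond has lost its redundancy. The sample text is an assumption: it is modeled on the kind of status the Linux bonding driver exposes under /proc/net/bonding/, but the exact fields vary between kernel versions, and the interface names eth0 and eth1 are hypothetical.

```python
# Illustrative sample only; real status files contain more fields.
SAMPLE = """\
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth0
MII Status: up

Slave Interface: eth0
MII Status: up

Slave Interface: eth1
MII Status: down
"""


def parse_bond_status(text):
    """Return (active_interface, {interface: link_is_up}) from status text."""
    active = None
    current = None
    links = {}
    for raw in text.splitlines():
        key, sep, value = raw.partition(":")
        if not sep:
            continue  # blank line or line without a "key: value" pair
        key, value = key.strip(), value.strip()
        if key == "Currently Active Slave":
            active = value
        elif key == "Slave Interface":
            current = value
        elif key == "MII Status" and current is not None:
            # Only statuses that follow a "Slave Interface" line belong
            # to a member card; the first one describes the bond itself.
            links[current] = (value == "up")
    return active, links


def degraded(links):
    """The bond has lost redundancy when fewer than two links are up."""
    return sum(links.values()) < 2


active, links = parse_bond_status(SAMPLE)
print("Active interface:", active)
print("Link state:", links)
print("Redundancy lost:", degraded(links))
```

In this sample the bond still carries traffic through eth0, but a second failure would cut the server off, which is exactly the situation a monitoring check should flag.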

Power supplies

The power supply is responsible for providing electricity to the server. It is also common for servers to have 2 or more power supplies connected to separate electrical systems, to ensure supply if one power supply or one of the electrical systems fails. Failed power supplies can usually be replaced without shutting down the server (hot swap). Other system components such as routers, switches, disk cabinets, etc. usually use the same redundancy technique.

Redundancy in the electrical supply

Every electrical component, a server included, needs a constant supply of electricity to operate. Failures in the supply, even for very short periods, can have catastrophic consequences for our system. We need not only a constant supply but also a stable one, without abrupt surges and drops that may damage electronic components.

To achieve this you can use different components, depending on the degree of protection you want.

     * UPS (uninterruptible power supply): more or less advanced batteries connected between the server and the electricity supply. They guarantee a steady, stable supply for a period of time that depends on their capacity.
     * Generators: they usually run on diesel and are connected between the UPS and the electricity grid. They only come into operation when the supply is cut for more than a certain time. They can provide electricity indefinitely as long as they have fuel in the tank.
     * Independent supply lines: large data centers usually have at least 2 separate and independent connections to the electricity grid. The same idea is applied to connectivity: DomainGurus, for example, uses 10 different Tier 1 bandwidth providers, with multi-homed bandwidth that consists of a BGP blend of multiple Tier 1 providers.
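How long a UPS can hold the load, and whether that is enough to bridge the gap until a generator takes over, can be estimated with simple arithmetic. The sketch below is an approximation under stated assumptions: the battery capacity, inverter efficiency, load and generator start time are all hypothetical, and real batteries deliver less than their nominal capacity as they age or under heavy load.

```python
def ups_runtime_minutes(battery_wh, load_w, efficiency=0.9):
    """Approximate runtime: usable energy divided by the power drawn."""
    if load_w <= 0:
        raise ValueError("load must be positive")
    return battery_wh * efficiency / load_w * 60


def bridges_generator_start(runtime_min, generator_start_min, safety_factor=2):
    """True when the UPS lasts comfortably longer than the generator start-up."""
    return runtime_min >= generator_start_min * safety_factor


# Hypothetical values: 1500 Wh of battery feeding a 900 W load.
runtime = ups_runtime_minutes(battery_wh=1500, load_w=900)
print("Estimated runtime in minutes:", round(runtime))
print("Covers a 5 minute generator start:", bridges_generator_start(runtime, 5))
```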

If you want redundancy in the electrical system, it goes without saying that it is not only the servers that need double connections: routers, switches and every other component of the system that uses electricity should also have redundant, connected power supplies. As the saying goes, your system will only be as secure, stable and redundant as its weakest component. It would not be the first time, for example, that groups of servers in a data center with redundancy at every level were cut off because they were connected to a switch that failed for lack of a redundant power supply.
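The weakest-component effect can be shown with the usual availability arithmetic: components that must all work multiply their availabilities, while redundant components fail only when all of them fail together. The sketch assumes failures are independent, which is optimistic in practice, and the 0.99 figures are hypothetical.

```python
from math import prod


def series(*availabilities):
    """All components must work: multiply the availabilities."""
    return prod(availabilities)


def parallel(*availabilities):
    """At least one must work: 1 minus the chance that all fail together."""
    return 1 - prod(1 - a for a in availabilities)


# Hypothetical availabilities, assuming independent failures.
servers = parallel(0.99, 0.99)        # a redundant pair of servers
single_switch = 0.99                  # one switch with one power supply
redundant_switches = parallel(0.99, 0.99)

print("Redundant servers behind one switch: ", series(servers, single_switch))
print("Redundant servers, redundant switches:", series(servers, redundant_switches))
```

With a single switch in the path, the whole chain is slightly less available than the switch alone, however redundant the servers behind it are.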
Redundancy in network components

There is no point in having servers with duplicated components and a constant, stable and redundant power supply if one of the network components fails and we cannot access the server.

The most common components in a network are:

     * Router: a device that connects network segments or entire networks
     * Switch: a device that connects two or more network segments
     * Network card or NIC: an electronic device that allows a DTE (Data Terminal Equipment), such as a computer or printer, to access a network and share resources
     * Cables: used to interconnect the various components. There are many different types, the most common being twisted-pair and fiber-optic cable
     * Lines: connections to a wide area network or WAN (e.g. the Internet)

Any of these components can fail, leaving the system cut off. There are techniques to prevent this from happening. The usual approach is to configure the network so that at least 2 different paths exist between any two components A and B. The diagram below shows a scheme for configuring a network with dual redundancy from the server to the Internet. This way a router, a switch and a network card can all fail at the same time without losing connectivity. The same scheme could be expanded to have triple or quadruple redundancy of components.
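Whether a network really offers two independent paths can be checked by removing components one at a time and testing whether the server can still reach the Internet. The topology below is a hypothetical example of a dual-redundant layout (two network cards, two switches, two routers, with each switch connected to both routers), not a transcription of any particular diagram, and all component names are made up for the sketch.

```python
from collections import deque

# Hypothetical dual-redundant topology, one tuple per cable or link.
LINKS = [
    ("server", "nic1"), ("server", "nic2"),
    ("nic1", "switch1"), ("nic2", "switch2"),
    ("switch1", "router1"), ("switch1", "router2"),
    ("switch2", "router1"), ("switch2", "router2"),
    ("router1", "internet"), ("router2", "internet"),
]


def reachable(links, src, dst, failed=()):
    """Breadth-first search from src to dst, ignoring failed components."""
    failed = set(failed)
    neighbours = {}
    for a, b in links:
        if a in failed or b in failed:
            continue
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    seen, queue = {src}, deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


def single_points_of_failure(links, src, dst):
    """Components whose failure alone disconnects src from dst."""
    components = {n for link in links for n in link} - {src, dst}
    return sorted(c for c in components if not reachable(links, src, dst, [c]))


print("Single points of failure:",
      single_points_of_failure(LINKS, "server", "internet"))
print("Survives nic1 + switch1 + router1 failing:",
      reachable(LINKS, "server", "internet", ["nic1", "switch1", "router1"]))
print("Survives nic1 + switch2 failing:",
      reachable(LINKS, "server", "internet", ["nic1", "switch2"]))
```

In this layout no single failure cuts the server off, and one card, one switch and one router can fail together as long as they sit on the same path. Losing a card on one path and the switch on the other still disconnects the server, because each card is attached to only one switch.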

Redundant servers, load balancing

What happens if the electricity works and the network works, but our server fails in a way that none of its redundant components can prevent? There are several configurations with multiple servers that can help with this problem. These are called clusters. There are different types, but among the most widely used is the load-balancing cluster with fault tolerance. In this type of cluster it does not matter if one or more of the servers stop working and, in addition, if we need more resources to provide a service, we can add new servers to increase the processing capacity of the cluster.
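The behavior of a load-balancing cluster with fault tolerance can be sketched with a simple round-robin balancer that skips servers marked as down and accepts new servers at any time. This is a toy model under stated assumptions: the class, its method names and the server names are hypothetical, and a real balancer would also run health checks to decide when a server is down.

```python
class RoundRobinBalancer:
    """Hand out servers in turn, skipping the ones marked as down."""

    def __init__(self, servers):
        self.servers = list(servers)
        self.healthy = set(self.servers)
        self._next = 0

    def mark_down(self, server):
        self.healthy.discard(server)

    def mark_up(self, server):
        if server in self.servers:
            self.healthy.add(server)

    def add_server(self, server):
        """Grow the cluster to increase its processing capacity."""
        if server not in self.servers:
            self.servers.append(server)
            self.healthy.add(server)

    def pick(self):
        """Return the next healthy server, trying each server at most once."""
        for _ in range(len(self.servers)):
            server = self.servers[self._next % len(self.servers)]
            self._next += 1
            if server in self.healthy:
                return server
        raise RuntimeError("no healthy servers available")


balancer = RoundRobinBalancer(["web1", "web2", "web3"])
print([balancer.pick() for _ in range(3)])   # every server takes a turn

balancer.mark_down("web2")                   # web2 fails
print([balancer.pick() for _ in range(3)])   # requests go to the other two
```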

The most important components of such clusters are a single storage system shared by all the servers that provide a service, and a load-balancing device, which can be dedicated hardware or software running on an ordinary server. The most important project for Linux in this area is called Linux Virtual Server (LVS).

Below is a series of examples of how these clusters can be organized so that the failure of one server does not stop a service. When one or more servers in the cluster fail, its processing capacity is reduced, so it is important to always keep some unused capacity so that a failure does not greatly increase the response time.
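The amount of unused capacity to keep can be estimated by sizing the cluster for the peak load and then adding as many servers as failures you want to tolerate. The sketch below does that arithmetic. The function names and the figures (900 requests per second at peak, 400 per server) are hypothetical.

```python
import math


def servers_needed(peak_load, capacity_per_server, tolerated_failures=1):
    """Servers for the peak load plus spares for the failures to tolerate."""
    base = math.ceil(peak_load / capacity_per_server)
    return base + tolerated_failures


def utilisation(peak_load, capacity_per_server, servers_up):
    """Fraction of the remaining capacity used at peak (above 1 is overload)."""
    return peak_load / (capacity_per_server * servers_up)


# Hypothetical figures: 900 requests/s at peak, 400 requests/s per server.
total = servers_needed(900, 400)
print("Servers to deploy:", total)
print("Utilisation with all servers up:", utilisation(900, 400, total))
print("Utilisation after one failure:  ", utilisation(900, 400, total - 1))
print("Utilisation after two failures: ", utilisation(900, 400, total - 2))
```

With these figures the cluster absorbs one failure at 75% utilisation, while a second failure pushes it past 100% and response times would degrade.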

An example of a load-balancing cluster connected to a disk cabinet (disk array) to store the information. Typical use: file and web servers.

An example of a load-balancing cluster connected to a database to store the information. Typical use: websites.

An example of a load-balancing cluster for a mail system that provides IMAP and SMTP to its users.

That is all for this introduction to redundant dedicated servers. There is plenty of information on the Internet if you want to go deeper into the subject. What matters most is knowledge of systems and network administration and of how the different components of a system work. Experience and study of these subjects will help you build more stable and redundant servers.