Planet of Clusters


Blockchain infrastructure must have properties that ensure its reliability, security, fault tolerance, and reasonable maintenance costs. These properties should be inherent in both the physical and the virtual infrastructure. The physical infrastructure consists of physical servers, each with its own resources: processors, memory, and storage. The virtual infrastructure likewise provides processors, memory, and storage, but typically adds storage shared across all servers. If a problem occurs in one server of the physical infrastructure, one, or in some cases two or three, applications may be affected. But if a virtualization node fails in a virtual environment, where more than a dozen virtual machines may be running, all of those machines fail with it. Every service they host goes down, and restoring them takes many times more time and resources. Minimizing the consequences of such virtual-infrastructure failures is therefore a priority, and it is solved by building clusters. This process has its own specifics, and the clusters themselves fall into different categories depending on their purpose. Some companies manage to build clusters on their own, but most hand these processes over to a trusted provider.

Developing the topic of clusters, it should be noted that today they are used everywhere, and some version of Linux is usually installed on the nodes as the operating system. Because clusters are used to solve a wide variety of problems, they earned a second name: supercomputer. The term speaks for itself: a supercomputer is a computing system that demonstrates the highest performance among the systems that exist at the time of comparison. One interesting feature of supercomputers is that practically all of them have personal names; they are constantly improved, new equipment is added to them, and so on. For example, the latest and most advanced supercomputers are already tailored for neural-network workloads. Clusters can be used remotely over a network, but practice has shown that, despite the impressive comparison with supercomputers, ordinary users, as a rule, do not need them.

Almost always, clusters are located in large data centers, and companies can rent this capacity when needed. Naturally, such data centers have very strict requirements for security and reliability. In practice this means automatic fire-suppression equipment, temperature monitoring, and systems that notify the specialists servicing the infrastructure. Paradoxical as it may seem, data centers employ a very small staff to service a fairly large amount of equipment. Most often, the infrastructure is run by a group consisting of a chief system administrator, two assistant system administrators, and two technical specialists who maintain the computing hardware.

Modern data centers use SSDs as storage media. Their great advantage is that they are built on newer technology and provide a significant increase in the speed and volume of information processed. However, SSDs also have a weak side: lower reliability, since compared with their main competitors, HDD hard drives, their service life is almost three to four times shorter.

Parallel system options

There are two options for parallel systems. If you have a single very powerful server (an SMP system), a parallel multi-threaded program runs on it. The second option is a computing cluster (an MPP system), which uses the MPI library and runs many parallel processes. Today, more than 90% of the world's supercomputers are clusters, whose individual nodes are powerful machines in their own right and, if necessary, can also be used as separate computers.
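The two options above can be sketched in a few lines of Python (standing in for a real C/MPI program; the function names here are illustrative, not from any library). Threads share one address space, as on an SMP machine; separate processes each get their own memory and must pass messages explicitly, as MPI processes do across a cluster.

```python
# Sketch of the two parallel-system options: shared-memory threads (SMP)
# vs. message-passing processes (MPP/cluster). Illustrative names only.
import threading
import multiprocessing as mp

DATA = list(range(100_000))  # the work to parallelize: sum these numbers

# Option 1: SMP-style. Threads share one address space, so workers write
# results straight into a common list (the "shared RAM"). Note: CPython's
# GIL means this illustrates the memory model, not a real parallel speedup.
def smp_sum(n_threads=4):
    results = [0] * n_threads
    def worker(i):
        results[i] = sum(DATA[i::n_threads])  # strided chunk of shared data
    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(results)

# Option 2: MPP/cluster-style. Each worker is a separate process with its
# own memory; partial sums travel back over an explicit channel, the way
# MPI messages travel over the cluster network.
def _partial(i, n, conn):
    conn.send(sum(DATA[i::n]))
    conn.close()

def mpp_sum(n_procs=4):
    ctx = mp.get_context("fork")  # Unix-only start method, for simplicity
    pipes = [ctx.Pipe() for _ in range(n_procs)]
    procs = [ctx.Process(target=_partial, args=(i, n_procs, child))
             for i, (_, child) in enumerate(pipes)]
    for p in procs:
        p.start()
    total = sum(parent.recv() for parent, _ in pipes)
    for p in procs:
        p.join()
    return total
```

Both functions return the same total; the difference is only in how the workers see memory. In a real MPP system the processes would run on different nodes, and MPI would carry the messages over the interconnect instead of a local pipe.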

The technical difference between a cluster and a multi-core processor is that in the latter all cores work with a common operating memory, whereas in a cluster each node has its own hard drives and its own RAM. It should be noted that there are also diskless clusters: when they are switched on, the operating system loads and runs directly from RAM. This has its advantages, since it eliminates the problems associated with hard drives, but also a downside: without hard drives, not every program can be launched, so such clusters are quite specialized and can only be used for a certain range of local tasks. To summarize the above: with a powerful mainframe, a parallel program is written and threads interact quickly through shared RAM, but scalability is low; with a cluster, the bottleneck is data transfer over the local network between nodes, but scalability is very high.
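The trade-off in that last sentence can be put into rough numbers with a back-of-the-envelope model (my own illustration, not from the text): adding nodes divides the compute time, but the time spent moving data over the network does not shrink, so speedup flattens out as communication becomes the bottleneck.

```python
# Crude performance model (an illustration, not a real benchmark):
# total time on n nodes = compute time / n + time to move data over the
# network. The communication term does not shrink with n, so it becomes
# the bottleneck, exactly as the text describes.
def cluster_time(compute_s, n_nodes, bytes_moved, net_bps):
    return compute_s / n_nodes + bytes_moved / net_bps

def speedup(compute_s, n_nodes, bytes_moved, net_bps):
    return compute_s / cluster_time(compute_s, n_nodes, bytes_moved, net_bps)
```

With these assumed numbers, a 100-second job that exchanges 1 GB over a 1 GB/s network speeds up about 9x on 10 nodes, but only 50x on 100 nodes: the fixed one-second network cost increasingly dominates.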