
What Are the Major Hardware Components of Massively Parallel Processing?

It is essential to understand the hardware components of a massively parallel processing system before looking at the various architectures. Processing nodes are the basic building blocks of massively parallel processing. These nodes are simple, homogeneous processing cores with one or more central processing units (CPUs); each node can be visualized as a simple desktop PC. The nodes in a massively parallel processing system work in parallel on parts of a single computation problem. Even though each node processes its part independently, the nodes need to communicate regularly because they are solving a common problem. A low-latency, high-bandwidth connection is therefore required between the nodes. This connection is called a high-speed interconnect, or bus, and it might be an Ethernet connection, a fiber distributed data interface (FDDI), or a proprietary connection method.

In those massively parallel processing architectures where external memory or disk space is shared among the nodes, a distributed lock manager (DLM) coordinates the resource sharing. The distributed lock manager takes requests for resources from the various nodes and connects a node to a resource when that resource is available. In some systems, the distributed lock manager also ensures data consistency and the recovery of any failed node.
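To make the idea of nodes working on slices of one problem concrete, here is a minimal sketch that uses Python's multiprocessing module as a stand-in for a cluster of processing nodes. The chunking scheme and the sum_of_squares task are invented purely for illustration and are not part of any particular MPP product.

```python
# Minimal sketch: splitting one computation across independent workers,
# the way MPP nodes each handle a slice of a single problem.
from multiprocessing import Pool

def sum_of_squares(chunk):
    # Each "node" works only on its own slice of the data.
    return sum(x * x for x in chunk)

if __name__ == "__main__":
    data = list(range(1_000_000))
    num_nodes = 4  # stand-in for the number of processing nodes

    # Partition the problem into one slice per node.
    chunks = [data[i::num_nodes] for i in range(num_nodes)]

    with Pool(num_nodes) as pool:
        partial_results = pool.map(sum_of_squares, chunks)

    # Combining the partial results is the only communication step here;
    # in a real MPP system this traffic crosses the high-speed interconnect.
    print(sum(partial_results))
```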

Massively Parallel Processing Architecture(s)

Massively parallel processing architectures fall into two major groups, depending upon how the nodes share their resources.

Shared Disk Systems

Each processing node in a shared disk system has one or more central processing units (CPUs) and its own independent random-access memory (RAM). The nodes, however, share an external disk space for the storage of files, and they are connected to one another with a high-speed bus. The scalability of a shared disk system depends upon the bandwidth of the high-speed interconnect and on the hardware constraints of the distributed lock manager.

What Are the Advantages of Shared Disk Systems?

Because all the nodes share a single external database, the massively parallel processing system becomes highly available: no data is permanently lost even if one node is damaged. Shared disk systems are also more straightforward, as they do not have to use a distributed database, and it is easy to add new nodes to them.

What Are the Disadvantages of Shared Disk Systems?

Since the processing nodes share a common disk, coordinating access to the data is complex, and the system has to rely on a distributed lock manager. The lock-manager communication between nodes takes up some of the bandwidth of the high-speed interconnect. The shared disk itself also needs an operating system to control it.
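To illustrate the coordination problem, the following is a toy sketch of the request-and-grant pattern a distributed lock manager follows. The ToyLockManager class and its method names are invented for illustration only and do not correspond to any real DLM implementation.

```python
# Toy sketch of the request/grant pattern behind a distributed lock manager.
# Class and method names are hypothetical.
import threading

class ToyLockManager:
    """Grants exclusive access to named shared-disk resources."""

    def __init__(self):
        self._locks = {}            # resource name -> owning node
        self._guard = threading.Lock()

    def request(self, node, resource):
        # A node asks for a resource; the request is granted only if
        # no other node currently holds that resource.
        with self._guard:
            if resource not in self._locks:
                self._locks[resource] = node
                return True
            return False

    def release(self, node, resource):
        with self._guard:
            if self._locks.get(resource) == node:
                del self._locks[resource]

dlm = ToyLockManager()
print(dlm.request("node-1", "disk-block-42"))  # True: granted
print(dlm.request("node-2", "disk-block-42"))  # False: node-1 still holds it
dlm.release("node-1", "disk-block-42")
print(dlm.request("node-2", "disk-block-42"))  # True: now available
```

Every one of these request/release exchanges travels over the interconnect, which is why the lock manager eats into the bandwidth budget described above.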


Shared Nothing Systems

A more popular architecture for massively parallel processing systems is the "shared nothing" architecture. Here the processing nodes have independent random-access memory and independent disks that store the necessary files and databases. The data that needs to be processed is shared among the nodes using various techniques.

Replicated Database: In this method, each processing node owns a complete copy of the data, so the risk of data loss is low even if a few nodes fail. The trade-off is the overhead of the additional storage space.

Distributed Database: In this model, the database is partitioned into multiple slices, and each processing node owns a particular slice of the database and works on it. This method saves a lot of disk storage because there is no redundancy. However, it is more complex than a replicated database, and a lot of data moves between the nodes to complete the processing.
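As a rough illustration of the two data-sharing techniques, the sketch below hash-partitions rows across nodes for a distributed database and copies every row to every node for a replicated database. The node labels and helper functions are hypothetical, chosen only to show the difference in storage layout.

```python
# Rough illustration of the two shared-nothing data-sharing techniques.
# Node labels and function names are hypothetical.
from collections import defaultdict

NODES = ["node-0", "node-1", "node-2"]

def distribute(rows, key):
    """Distributed database: each node owns only its hash-assigned slice."""
    slices = defaultdict(list)
    for row in rows:
        node = NODES[hash(row[key]) % len(NODES)]
        slices[node].append(row)
    return slices

def replicate(rows):
    """Replicated database: every node holds a full copy of the data."""
    return {node: list(rows) for node in NODES}

rows = [{"id": i, "value": i * 10} for i in range(6)]

for node, owned in distribute(rows, "id").items():
    print(node, "owns", len(owned), "rows")   # slices, no redundancy

for node, owned in replicate(rows).items():
    print(node, "holds", len(owned), "rows")  # full copies on every node
```

The distributed layout saves disk space but means a query touching the whole table must move data between nodes, while the replicated layout avoids that traffic at the cost of storing every row several times.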
