"Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously."
This subfield focuses on the development of algorithms that can be executed in parallel by multiple processors, in order to improve computational efficiency.
Parallel computing: The study of developing software that can run simultaneously on multiple computer processors.
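Concurrency: The ability of multiple tasks to make progress during overlapping time periods. Concurrent tasks may run simultaneously on a multi-core processor or be interleaved on a single core by time-sharing.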
Parallel processing architectures: The hardware models of computers capable of parallel processing.
Shared memory parallel algorithms: Algorithms that operate on shared memory systems, enabling all processors to access the same memory space.
Distributed memory parallel algorithms: Algorithms that operate on distributed memory systems, in which processors do not share the same memory space.
Load balancing: A technique employed in parallel computing to evenly distribute work among processors to maximize efficiency.
Synchronization: The coordination of multiple processes or threads so that they access shared resources in a well-defined order, avoiding race conditions, deadlocks, and similar errors.
Partitioning: Breaking a computational problem into sub-problems that can be assigned to different processors and solved independently.
Communication: The exchange of data between processors that is necessary for many parallel algorithms.
Pipelining: A parallel computing technique in which a computation is split into a series of stages, each handled by a different processor, so that successive data items can occupy different stages at the same time.
MapReduce: A framework for processing large data sets in a distributed, parallelized manner.
GPU computing: The use of graphics processing units (GPUs) to parallelize general-purpose computing tasks.
Message Passing Interface (MPI): A popular standard for message passing used in distributed computing.
OpenMP: An API that enables the creation of shared-memory parallel programs in C, C++, and Fortran.
CUDA: A parallel computing platform developed by NVIDIA that enables the use of GPUs for general-purpose computing tasks.
Java Concurrency: The concurrency support in Java, including the java.util.concurrent package, which provides thread pools, concurrent collections, and synchronization utilities.
Parallel algorithms for sorting, searching, and matching: Parallel algorithms that sort, search, or match the elements of arrays and other data structures.
Parallel numerical computation: Parallel algorithms for numerical problems such as matrix multiplication and the fast Fourier transform (FFT).
Parallel graph algorithms: Parallel algorithms that operate on graphs.
Parallel algorithms for geometric problems: Parallel algorithms that operate on geometric objects.
Hadoop: A popular open-source software framework that supports the distributed processing of large data sets across clusters of computers.
Apache Spark: A distributed computing framework that provides an interface for programming clusters with implicit data parallelism and fault tolerance.
(Functional/Declarative) parallel programming: Programming language constructs that let the programmer express parallel computations at a high level, leaving scheduling and other implementation details to the compiler or runtime.
Task parallelism: A type of parallelism that involves breaking a larger problem down into smaller, independent tasks that can be computed in parallel.
Data parallelism: A type of parallelism where the same operation is performed on multiple data items in parallel.
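The partitioning and data parallelism entries above can be illustrated with a minimal Python sketch. It is not taken from the source: the names partition, sum_of_squares, and parallel_sum_of_squares are hypothetical, and a thread pool is used only so the example runs anywhere. In CPython, threads do not speed up CPU-bound work like this, so a process pool (concurrent.futures.ProcessPoolExecutor) would be the usual substitute when real speed-up is the goal.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, parts):
    """Split items into `parts` contiguous chunks whose sizes differ by at most one."""
    base, extra = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional element each.
        end = start + base + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def sum_of_squares(chunk):
    """The operation applied to every chunk (the data-parallel step)."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, workers=4):
    """Partition the input, process the chunks concurrently, then combine the results."""
    chunks = partition(items, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(sum_of_squares, chunks))
    return sum(partials)
```

Splitting into chunks of nearly equal size is also a simple form of static load balancing: each worker receives about the same amount of work, which is reasonable when every element costs roughly the same to process.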
Data Parallelism: In this type of parallel algorithm, the input data is divided into multiple subsets, and each subset is processed independently by parallel computing units simultaneously.
Task Parallelism: In this type of parallel algorithm, large-scale tasks are decomposed into smaller sub-tasks, and each sub-task is assigned to a separate parallel computing unit to be executed simultaneously.
Pipeline Parallelism: In this type of parallel algorithm, multiple stages of a computation are executed concurrently, and the output of one stage is passed on as input to the next stage.
Semi-Parallel Algorithms: In this type of parallel algorithm, parts of the algorithm are executed in parallel, while other parts are executed serially. This type of algorithm is useful when parallelizing certain parts of an algorithm does not offer any performance benefits.
Loop Parallelism: In this type of parallel algorithm, the iterations of a loop are executed concurrently by multiple parallel computing units, thus reducing the overall execution time of the loop.
Shared-memory Parallelism: In this type of parallel algorithm, every parallel unit has access to a common memory and can read from and write to it. This type of parallelism generally involves the use of threads.
Distributed-memory Parallelism: In this type of parallel algorithm, each parallel computing unit possesses its own private memory, and communicates with other units through message passing. This type of parallelism is commonly used in large-scale distributed computing applications, such as those encountered in cloud infrastructures.
Hybrid Parallelism: In this type of parallel algorithm, multiple parallel computing techniques are combined to achieve maximum performance. For example, a hybrid parallel algorithm may use both shared-memory and distributed-memory parallelism, or both data-parallel and task-parallel techniques.
GPU Parallelism: This type of parallelism involves the use of graphics processing units (GPUs) to accelerate computationally intensive tasks. GPUs contain large numbers of processing cores that can execute many threads simultaneously, making them well suited to data-parallel workloads.
MapReduce Algorithms: MapReduce is a popular programming model for processing large amounts of data. A computation is split into a map phase, which processes partitions of the input in parallel across a cluster of computers, and a reduce phase, which combines the intermediate results that share a key.
Load Balancing Algorithms: In this type of parallel algorithm, the computation is distributed among multiple parallel computing units in a way that balances the processing load across all units. This ensures that no single unit is overburdened, and helps to minimize overall execution time.
Sorting Algorithms: Parallel sorting algorithms make use of multiple parallel computing units to speed up the sorting of large amounts of data. By dividing the input into smaller subsets, sorting each subset in parallel, and merging the sorted results, the sort can be completed more quickly than with a serial algorithm.
Graph Algorithms: Parallel graph algorithms are used to process large-scale graphs, such as social networks or search engine indexes. These algorithms typically use task-parallelism to decompose a graph into smaller sub-problems, and data-parallelism to process each sub-problem in parallel.
Matrix Multiplication Algorithms: Parallel matrix multiplication algorithms are used to accelerate the multiplication of large matrices. The matrices are partitioned into smaller blocks, the block products are computed in parallel, and the partial results are summed to form the final matrix.
Monte Carlo Algorithms: Parallel Monte Carlo algorithms are used to simulate complex systems, such as financial portfolios, stock prices, or particle interactions. By distributing the simulation across multiple parallel computing units, the calculation can be completed more quickly, enabling faster analysis and decision-making.
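As an illustration of the Monte Carlo entry above, here is a small Python sketch that estimates pi by sampling random points in the unit square and counting those that fall inside the quarter circle. The function names count_hits and estimate_pi and the seeding scheme (one consecutive integer seed per worker) are assumptions made for this example, not something the source prescribes. Threads keep the sketch portable; for real speed-up in CPython the same structure would normally run on a process pool.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed, samples):
    """Count random points in the unit square that land inside the quarter circle."""
    rng = random.Random(seed)  # each worker owns its generator, so no state is shared
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(total_samples, workers=4, base_seed=0):
    """Distribute the samples evenly across workers and combine their hit counts."""
    per_worker = total_samples // workers
    if per_worker == 0:
        raise ValueError("total_samples must be at least the number of workers")
    seeds = [base_seed + i for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        hits = sum(pool.map(count_hits, seeds, [per_worker] * workers))
    # The quarter circle has area pi/4, so the hit ratio approximates pi/4.
    return 4.0 * hits / (per_worker * workers)
```

Because the workers never communicate until their counts are summed, this kind of simulation needs almost no synchronization, which is why Monte Carlo methods tend to parallelize well.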
"Large problems can often be divided into smaller ones, which can then be solved at the same time."
"There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism."
"Parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors, due to the physical constraints preventing frequency scaling."
"Parallel computing is closely related to concurrent computing as they are frequently used together, and often conflated, though the two are distinct."
"It is possible to have parallelism without concurrency, and concurrency without parallelism (such as multitasking by time-sharing on a single-core CPU)."
"In parallel computing, a computational task is typically broken down into several, often many, very similar sub-tasks that can be processed independently and whose results are combined afterwards, upon completion."
"Parallel computers can be roughly classified according to the level at which the hardware supports parallelism, with multi-core and multi-processor computers having multiple processing elements within a single machine, while clusters, MPPs, and grids use multiple computers to work on the same task."
"Explicitly parallel algorithms, particularly those that use concurrency, are more difficult to write than sequential ones, because concurrency introduces several new classes of potential software bugs."
"Communication and synchronization between the different subtasks are typically some of the greatest obstacles to getting optimal parallel program performance."
"A theoretical upper bound on the speed-up of a single program as a result of parallelization is given by Amdahl's law, which states that it is limited by the fraction of time for which the parallelization can be utilized."
"As power consumption (and consequently heat generation) by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture."
"Specialized parallel computer architectures are sometimes used alongside traditional processors, for accelerating specific tasks."
"Concurrency introduces several new classes of potential software bugs, of which race conditions are the most common."
"In some cases parallelism is transparent to the programmer, such as in bit-level or instruction-level parallelism."
"In contrast to parallel computing, in concurrent computing, the various processes often do not address related tasks."
"Typical in distributed computing, the separate tasks may have a varied nature and often require some inter-process communication during execution."
"As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture."
"Communication and synchronization between the different subtasks are typically some of the greatest obstacles to getting optimal parallel program performance."
"Explicitly parallel algorithms, particularly those that use concurrency, are more difficult to write than sequential ones, because concurrency introduces several new classes of potential software bugs."