This is a literature survey of Distributed Machine Learning. It covers the algorithmic choices and architectures that facilitate distributed training of ML models. It also presents MLlib, Apache Spark's ML library for distributed ML implementations.
Distributed Machine Learning is the idea of training machine learning models across multiple devices rather than on a single machine. This technique is particularly useful for models that generalise from very large amounts of data, such as SVMs and neural networks, because it allows them to be trained faster and more efficiently. However, distributing the data and the computational load brings new challenges, such as communication overhead, data inconsistency, and fault tolerance. Companies have been attempting to address these issues by developing new frameworks and by modifying distributed system technologies already in use. In this survey, we first examine the existing issues with distributed training in more detail and then review some of the ideas that have been proposed to deal with them.