Many everyday applications, such as medical data, clickstreams, internet records, and banking transactions, generate massive amounts of data in the form of streams at ever-increasing speed. In contrast to traditional static data, data streams have several inherent properties, including infinite length, concept drift, multiple labels, and concept evolution. Among data mining tasks, classification is one of the fundamental topics in data stream mining and has attracted growing attention across research communities. The Extreme Learning Machine (ELM) has drawn considerable interest in data classification owing to its high efficiency, universal approximation capability, generalization ability, and simplicity, which have inspired the development of many ELM-based algorithms and applications over the past decades. In this paper, we provide a comprehensive review of ELM theory and its variants for data stream classification, and categorize these algorithms from different perspectives. First, we briefly introduce the basic principles of ELM and its characteristics. Second, we give an overview of the ELM variants designed to address particular issues in data stream classification. Third, we present an overview of strategies for optimizing ELM that further improve its stability, accuracy, and generalization ability, and briefly introduce some practical applications of ELM in data stream classification. Finally, we conduct several groups of experiments to compare the performance of ELM-based models on the issues in focus. We also discuss the open issues and prospects of ELM models for stream classification that merit further study.