Many techniques have been proposed over the years to detect code clones in order to improve software maintenance, reusability, or security. However, language-agnostic approaches that offer flexible code granularity for near-miss clone detection at big-code scale are still lacking. Detecting near-miss clones across large source repositories with hundreds of MLOC (million lines of code) or more is challenging, mainly because the computing and memory resources required grow with the size of the source code; near-miss clone detection in particular is more difficult and needs more resources. In this paper, we present SNCD, a fast and scalable distributed clone detection approach. It overcomes the CPU and memory limitations of a single node through scalable distributed parallelization with MapReduce and HDFS. Furthermore, it is based on a partial index and optimized with a multi-threading strategy, which further improves efficiency. It detects not only Type-1 and Type-2 clones but also Type-3 clones, the most computationally expensive to detect, in large repositories. It works at both function and file granularity and supports many programming languages. Experimental results show that SNCD scales better than existing clone detection techniques as code size, measured in lines of code, increases, with recall and precision comparable to state-of-the-art approaches. On BigCloneBench and the Mutation Framework, two recent and widely used benchmarks, SNCD achieves high recall and precision, competitive with existing tools.