Abstract:
Big data approaches are widely employed today, and big data analysis is one of their pillars. As societies develop and technology advances, the volume of unorganized data grows and becomes increasingly difficult to manage. The term Big Data refers to collections of gigabytes to terabytes of complex, largely unstructured data, and such data frequently demands real-time analysis. We depend on data in our everyday lives, and much of what we generate and consume on social media is random, unstructured material; handling this data is the central challenge. Hadoop MapReduce can execute a job in a matter of minutes, but its latency depends on the size of the dataset and the number of nodes in the cluster. This latency makes it undesirable for data-intensive networks and for online transaction processing, and it is also unsuitable for iterative execution. The purpose of this study is to describe an approach that addresses these issues, offering real-time processing, simple operations, and the handling of enormous datasets across the network, together with built-in machine learning methods and performance up to 100 times faster through in-memory primitives.
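The characteristics cited above (in-memory primitives, bundled machine learning methods, and speedups of roughly 100x over MapReduce) match Apache Spark; assuming Spark and PySpark, the minimal sketch below illustrates the key idea behind that speedup: an iterative computation keeps its working set cached in memory and reuses it across iterations, instead of rereading it from disk on every pass as a chained MapReduce job would. The dataset and iteration count are hypothetical placeholders.

```python
# Minimal sketch (assumes Apache Spark via PySpark; dataset is hypothetical).
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("InMemoryIterationSketch").getOrCreate()
sc = spark.sparkContext

# Load (here: generate) the data once and cache it in cluster memory.
data = sc.parallelize(range(1_000_000)).cache()

total = 0
for _ in range(10):
    # Each iteration reuses the cached RDD; no per-iteration disk round trip.
    total += data.map(lambda x: x * 2).sum()

print(total)
spark.stop()
```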