Abstract: Social media and internet-connected sensors in embedded systems now produce huge volumes of data at very high velocity, which creates a big data problem. Such data cannot be processed and stored efficiently by traditional Relational Database Management Systems (RDBMS). For this reason, new software solutions are needed to manage big data in an efficient and scalable way. This study employs an approach that combines batch processing and stream processing so that the data set can be queried, including ad hoc queries with low latency, and so that large-scale machine learning algorithms can be run on it to recognize patterns of interest in the streaming data. The Hadoop ecosystem tool Hive can also be used to answer ad hoc queries, and User Defined Functions (UDFs), which are similar to SQL stored procedures, can be written in the Spark system. An interface with SerDes (serializers and deserializers) is employed to read the standard stream so that the data set can be queried directly. A new software solution, AllJoyn Lambda, is proposed in which AllJoyn is integrated into the lambda architecture, and a prototype implementation of the architecture using Apache Hadoop YARN and Apache Spark Streaming is presented. The study aims to load the high-velocity streaming data set into a database without losing any data from the streaming domain, to support ad hoc querying of the data set, and to provide a mechanism for fast data processing and analytics using large-scale machine learning. It also highlights the analysis of large-scale data set processing, the associated handling challenges, and a comprehensive systematic review. The study concludes that building a smart environment on a big data platform improves the results obtained for that environment.
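The combination of batch and stream processing described in the abstract can be illustrated with a minimal sketch. This is not the AllJoyn Lambda prototype: the class `LambdaSketch`, its methods, and the sensor identifiers are hypothetical names chosen for illustration, and in-memory Python counters stand in for Hadoop, Hive, and Spark Streaming. It only shows the general lambda architecture idea, in which a batch view and a real-time view are merged when an ad hoc query arrives.

```python
from collections import Counter


class LambdaSketch:
    """Toy lambda architecture (hypothetical names, in-memory only)."""

    def __init__(self):
        self.master = []                # append-only master data set
        self.batch_view = Counter()     # precomputed by the batch layer
        self.realtime_view = Counter()  # kept up to date by the speed layer

    def ingest(self, sensor_id, value):
        # Every event is appended to the master data set, so no event is
        # dropped, and is also folded into the low-latency real-time view.
        self.master.append((sensor_id, value))
        self.realtime_view[sensor_id] += value

    def run_batch(self):
        # Recompute the batch view from the full master data set, then
        # clear the real-time increments that the batch view now covers.
        self.batch_view = Counter()
        for sensor_id, value in self.master:
            self.batch_view[sensor_id] += value
        self.realtime_view = Counter()

    def query(self, sensor_id):
        # Serving layer: answer an ad hoc query by merging both views.
        return self.batch_view[sensor_id] + self.realtime_view[sensor_id]


arch = LambdaSketch()
arch.ingest("s1", 3)
arch.ingest("s2", 5)
arch.run_batch()
arch.ingest("s1", 4)  # arrives after the batch run; only the speed layer has it
total_s1 = arch.query("s1")
```

In this sketch a query issued between batch runs still reflects the latest event, because the serving step adds the real-time increment to the last batch result.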
E. Seshatheriand and T. Bhuvaneswari, 2016. An Efficient Distributed Data Processing Method for Smooth Environment. Journal of Engineering and Applied Sciences, 11: 1855-1858.