Index | Only 14 pages are available for public view |
Abstract Today, generated data is growing exponentially. Hence, there is a pressing need to analyze and refine the available big data in order to discover useful knowledge and hidden patterns in it. In this thesis, we investigate MapReduce as a data-intensive technology. MapReduce is a programming model introduced by Google's team for processing huge datasets in distributed systems. Cloud computing is a model that makes computing resources available as a service over the internet. This thesis focuses on document clustering as a data-intensive task |
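To make the MapReduce programming model mentioned in the abstract concrete, the following is a minimal single-process sketch, not the thesis's implementation. It counts term occurrences across a small document collection, a typical preprocessing step before document clustering. The function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) and the sample documents are assumptions made for illustration; a real deployment would run on a distributed framework such as Hadoop.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Map step: emit a (term, 1) pair for every whitespace-separated token."""
    for token in text.lower().split():
        yield token, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key, as the framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: sum the counts collected for one term."""
    return key, sum(values)


def run_job(documents):
    """Run map, shuffle, and reduce sequentially over a dict of documents."""
    pairs = [
        pair
        for doc_id, text in documents.items()
        for pair in map_phase(doc_id, text)
    ]
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample documents, used only for illustration.
docs = {
    "d1": "big data needs big clusters",
    "d2": "data clustering on clusters",
}
counts = run_job(docs)
```

In this sketch each stage runs in one process; in a distributed setting the map and reduce functions stay the same, while the framework handles partitioning the input, shuffling keys between machines, and recovering from failures.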