teamMA Posted November 1, 2013
Hello everyone! I have an assignment about configuring and comparing NFS and Hadoop. Both of these options for a distributed file system are new to me, and I would like to know if someone has previous knowledge of them. My main question is: which one, NFS or Hadoop, is the better choice, and which is more suitable? Thank you.
AtomicMaster Posted November 1, 2013
I draw a distinction between network and distributed systems. To me, the fact that something communicates over a network doesn't make it distributed: "distributed" means the data is given out in shares across several devices, which doesn't apply when a share can only be given in one chunk to one device. So to me NFS is not a distributed file system, merely a network one, and Hadoop is not even a file system, it's a software development framework. In fact these things are so different from each other that comparing the two is akin to comparing an octopus to an apple...
teamMA (Author) Posted November 1, 2013
Maybe my question was not clear... I am sorry, my native language is not English. My assignment states that we need to create a private cloud to store and process gigabytes of sensor data, which will require a suitable distributed file system, and we need to decide which option, NFS or Hadoop, is better for achieving that. They are two different technologies used for the same purpose.
AtomicMaster Posted November 1, 2013
Worry not, English is not my native language either. They are different technologies used with different intentions. The questions you have to ask are: What format is the sensor data in? How is it processed, by what, and is that processing Hadoop-ready? Is it gigabytes in total or gigabytes per second of data? What is the total eventual workload?
As I have already said, NFS is technically not a distributed file system. It simply allows you to share files (or devices, since everything in Linux is a file). It is not a distributed system: my file is located on one box, and the more boxes I add, the more shares I have to manage and the more I have to automate to limit the risk of losing everything.
Hadoop is a software framework for highly distributed systems, but not without its limitations (at least just by itself). Hadoop doesn't store files, Hadoop doesn't act as an easily queryable database, and you have to design your project with Hadoop in mind from the beginning: your software, your data, your processing algorithms, everything has to be catered to this system.
Neither of the solutions is a truly distributed (in the sense of "cloud") file system. NFS is a file system but is not distributed; Hadoop is distributed, but is not a file system...
Edited November 1, 2013 by AtomicMaster
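To make "design your processing with Hadoop in mind" a bit more concrete, here is a rough sketch of the map/shuffle/reduce shape that such processing usually has to be expressed in. It is plain Python run in a single process, not the Hadoop API, and the `sensor_id,timestamp,value` record format and all function names are assumptions made up for illustration.

```python
from collections import defaultdict


def map_record(line):
    """Map step: turn one raw text record into a (key, value) pair.

    Assumes a hypothetical CSV layout: sensor_id,timestamp,value
    """
    sensor_id, _timestamp, value = line.strip().split(",")
    return sensor_id, float(value)


def shuffle(pairs):
    """Shuffle step: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_mean(key, values):
    """Reduce step: collapse one key's values into a single result."""
    return key, sum(values) / len(values)


def run_job(lines):
    """Run map, shuffle, and reduce in order, all in one process."""
    pairs = (map_record(line) for line in lines)
    groups = shuffle(pairs)
    return dict(reduce_mean(key, values) for key, values in groups.items())


if __name__ == "__main__":
    sample = [
        "s1,2013-11-01T10:00:00,20.0",
        "s2,2013-11-01T10:00:00,18.0",
        "s1,2013-11-01T10:01:00,22.0",
    ]
    print(run_job(sample))
```

The point of the sketch is that each record is handled independently in the map step and results are combined per key in the reduce step. If the sensor processing cannot be phrased that way, it will probably need rethinking before it fits a framework like this.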