Monday 1 June 2015

First Day: Industrial Training - Big Data

The databases used in Big Data are typically based on the NoSQL approach.

HADOOP: It is an open-source Java framework which began as part of the Apache Nutch web-crawler project before being split out as a project of its own.

It has strong ties to the SMAC principles: Social, Mobile, Analytics, Cloud.

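HDFS: stands for Hadoop Distributed File System. Its NameNode keeps a table that maps each file to the blocks that make it up and to the machines storing those blocks, which is loosely similar in idea to a FAT (File Allocation Table).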

The distributed feature of HDFS refers to the fact that a file's data is split into blocks spread across many machines, all managed by the same software and presented to the user as a single file system.
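
To picture the splitting step, here is a minimal sketch in Python. It is only an illustration, not how HDFS is implemented: the function name split_into_blocks and the tiny 4-byte block size are assumptions made for the example (real HDFS blocks are many megabytes).

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Cut data into consecutive chunks of at most block_size bytes."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


# Hypothetical 10-byte "file" cut into 4-byte blocks.
data = b"abcdefghij"
blocks = split_into_blocks(data, 4)
print(blocks)  # [b'abcd', b'efgh', b'ij']
```

Joining the blocks back together in order gives the original file, which is why the file system only needs to remember the order of the blocks and where each one lives.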

The main purposes of Hadoop are the MapReduce processing framework and the ability to handle flat files.
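
The MapReduce idea can be sketched with the classic word count. This is plain Python running on one machine, not the Hadoop API; the function names map_phase, shuffle and reduce_phase are made up for the illustration. In Hadoop the same three steps would run in parallel across many nodes.

```python
from collections import defaultdict


def map_phase(line: str) -> list[tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs: list[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce: combine all values of one key into a single result."""
    return key, sum(values)


def word_count(lines: list[str]) -> dict[str, int]:
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data is big", "data is everywhere"])
print(counts)  # {'big': 2, 'data': 2, 'is': 2, 'everywhere': 1}
```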

Flat files contain data in a non-tabular format, e.g. JSON files.
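
As a small illustration of non-tabular data, the sketch below reads a file with one JSON object per line. The field names (user, clicks, country) are invented for the example. Note that the second record has a field the first one lacks, which would not fit a fixed table schema.

```python
import json

# Hypothetical file contents: one JSON object per line.
raw = (
    '{"user": "a", "clicks": 3}\n'
    '{"user": "b", "clicks": 5, "country": "IN"}\n'
)

# Parse each non-empty line into a Python dict.
records = [json.loads(line) for line in raw.splitlines() if line.strip()]

total_clicks = sum(record["clicks"] for record in records)
print(total_clicks)  # 8
```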

3Vs of Big Data: Volume (the amount of data), Velocity (the speed at which it arrives) and Variety (the range of formats).

Partial Failure Support: the property of keeping data available even when the data on some servers is lost.

Scalability: performance changes smoothly, rather than breaking down abruptly, as the load on the same algorithm or software increases.

In Hadoop, each block of data is stored three times by default, on different nodes. This is known as replication.
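
The sketch below shows why keeping three copies lets data survive the loss of some servers. It is a toy model under stated assumptions: the round-robin placement and the names place_replicas and available_blocks are made up for the example, and the real HDFS placement policy is different (it also takes racks into account).

```python
def place_replicas(block_ids: list[str], nodes: list[str],
                   replication: int = 3) -> dict[str, list[str]]:
    """Assign each block to `replication` distinct nodes, round-robin."""
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(i + r) % len(nodes)] for r in range(replication)]
        for i, block in enumerate(block_ids)
    }


def available_blocks(placement: dict[str, list[str]],
                     failed_nodes: set[str]) -> list[str]:
    """A block is still readable if at least one holder has not failed."""
    return [
        block for block, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    ]


nodes = ["n1", "n2", "n3", "n4", "n5"]
placement = place_replicas(["b0", "b1", "b2", "b3"], nodes)

# Two nodes fail, yet every block still has a surviving copy.
print(available_blocks(placement, {"n2", "n3"}))  # ['b0', 'b1', 'b2', 'b3']
```

With three copies, a block in this toy model is lost only when all three nodes holding it fail at the same time.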

DoS attack: Denial of Service attack. This attack sends so many requests to a server that genuine clients may not be served.
