Definition of Big data, History of Data Management,
Big data characteristics: Volume, Variety, Velocity, Veracity, Analytics,
Basic nomenclature, Analytics process model, Analytical model requirements, Types of data sources, Sampling, Types of data elements, Missing values, Standardizing data, Outlier detection and treatment, Categorization.
A brief history of Hadoop, The Hadoop ecosystem,
Hadoop releases, The building blocks of Hadoop: NameNode, DataNode, Secondary NameNode, JobTracker, TaskTracker, The Hadoop Distributed File System: The design of HDFS, HDFS concepts, Hadoop file systems.