- Unit 1: Introduction, Grasping the Fundamentals of Big Data, The Evolution of Data Management, Defining Big Data, Building a Successful Big Data Management Architecture, Beginning with capture, organize, integrate, analyze, and act, Setting the architectural foundation, Performance matters, Big Data Types, Defining Structured Data, Sources of big structured data, Role of relational databases in big data, Defining Unstructured Data, Sources of unstructured data, Integrating data types into a big data environment.
- Unit 2: Statistics: Population, Sample, Sampled data, Sample space, Random sample, Sampling distribution, Variable, Variation, Frequency, Random variable, Uniform random variable, Exponential random variable, Mean, Median, Range, Mode, Variance, Standard deviation, Correlation, Linear Correlation, Correlation and Causality, Regression, Linear Regression, Linear Regression with Nonlinear Substitution, Classification, Classification Criteria, Naive Bayes Classifier, Support Vector Machine.
- Unit 3: Introduction to Data Analytics, Drivers for analytics, Core components of analytical data architecture, Data warehouse architecture, Column-oriented database, Parallel vs. distributed processing, Shared-nothing data architecture and massively parallel processing, Elastic scalability, Data loading patterns, Data Analytics lifecycle: Discovery, Data Preparation, Model Planning, Model Building, Communicating results and findings, Methods: K-means clustering, Association rules.
- Unit 4: Data Science Tools: Cluster Architecture vs. Traditional Architecture, Hadoop, Hadoop vs. distributed databases, The building blocks of Hadoop, Hadoop data types, Hadoop software stack, Deployment of Hadoop in data center, Hadoop infrastructure, HDFS concepts, Blocks, Name nodes and Data nodes, Overview of HBase, Hive, Cassandra and Hypertable, Sqoop.
- Unit 5: Introduction to R, Data Manipulation and Statistical Analysis with R, Basics, Simple manipulations, Numbers and vectors, Input/Output, Arrays and Matrices, Loops and conditional execution, Functions, Data Structures, Data transformations, Strings and dates, Graphics.
1. Judith Hurwitz, Alan Nugent, Fern Halper, Marcia Kaufman, Big Data For Dummies, Wiley.
2. Thomas A. Runkler, Data Analytics: Models and Algorithms for Intelligent Data Analysis, Springer Vieweg.
3. Vignesh Prajapati, Big Data Analytics with R and Hadoop, Packt Publishing.