Category Archives: Hadoop

Getting started with Big data (Hadoop, Spark)!

A few questions are probably running through your mind right now: How and where do I start in the Big Data space? What are Big Data, Hadoop, Spark, Hive, HBase, etc.? What should I learn first? Which path should I take: data scientist, data engineer, or architect?

Default ports in Hadoop

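JobTracker: 50030
NameNode: 50070
Secondary NameNode: 50090
DataNode: 50010
TaskTracker: 50060
Spark UI: 4040

These are the defaults from the Hadoop 1.x/2.x era (JobTracker and TaskTracker exist only in Hadoop 1.x). Hadoop 3 moved several of them, for example the NameNode web UI to 9870.

Comment below if you find this blog useful.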
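To see which of the ports listed above are actually reachable on a node, a quick TCP probe can help. The following is a minimal sketch, not part of the original post: the names `DEFAULT_PORTS`, `is_port_open`, and `check_services` are hypothetical helpers, and `localhost` is a placeholder for whichever host you want to test. It only checks that something accepts a TCP connection on the port, not that the listener is really the named Hadoop or Spark service.

```python
import socket

# Default ports as listed in the post (classic Hadoop layout plus the Spark UI).
DEFAULT_PORTS = {
    "JobTracker": 50030,
    "NameNode": 50070,
    "Secondary NameNode": 50090,
    "DataNode": 50010,
    "TaskTracker": 50060,
    "Spark UI": 4040,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_services(host, ports=DEFAULT_PORTS):
    """Map each service name to whether its port accepts connections on host."""
    return {name: is_port_open(host, port) for name, port in ports.items()}


if __name__ == "__main__":
    # "localhost" is a placeholder; substitute the node you want to probe.
    for name, is_open in check_services("localhost").items():
        status = "open" if is_open else "closed"
        print(f"{name:<20} {DEFAULT_PORTS[name]:>6}  {status}")
```

If your cluster overrides the defaults in its configuration, pass your own mapping as the second argument to `check_services` instead of relying on `DEFAULT_PORTS`.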

Hadoop Acronyms

HDFS – Hadoop Distributed File System
CAP Theorem (or) Brewer’s theorem – Consistency, Availability, Partition tolerance
MR – MapReduce
NoSQL – Not Only SQL
NN – NameNode
HA – High Availability
DN – DataNode
JT – JobTracker
HAR – Hadoop Archives
HQL (or) HiveQL – Hive Query Language
YARN – Yet…