{"id":1932,"date":"2014-12-15T11:23:42","date_gmt":"2014-12-15T11:23:42","guid":{"rendered":"http:\/\/joapen.com\/blog\/?p=1932"},"modified":"2015-08-12T21:20:47","modified_gmt":"2015-08-12T21:20:47","slug":"hadoop-components","status":"publish","type":"post","link":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/","title":{"rendered":"Hadoop components"},"content":{"rendered":"<p>The Hadoop platform is composed of different components and tools.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Hadoop HDFS<\/span><\/strong>: A distributed file system that partitions large files across multiple machines for high-throughput access to data.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Hadoop YARN<\/span><\/strong>: A framework for job scheduling and cluster resource management.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Hadoop MapReduce<\/span><\/strong>: A programming framework for batch processing of large data sets distributed across multiple servers.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Hive<\/span><\/strong>: A data warehouse system for Hadoop that facilitates data summarization, ad-hoc queries, and the analysis of large data sets stored in Hadoop-compatible file systems. Hive provides a mechanism to project structure onto this data and query it using a SQL-like language called HiveQL. HiveQL queries are converted into MapReduce jobs. Hive was initially developed by Facebook.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">HBase<\/span><\/strong>: An open-source, distributed, column-oriented store modeled after Google&#8217;s Bigtable (which is proprietary to Google). HBase is written in Java.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Pig<\/span><\/strong>: A high-level data-flow language (commonly called &#8220;Pig Latin&#8221;) for expressing MapReduce programs; it is used for analyzing large data sets stored in HDFS. 
Pig was originally developed at Yahoo Research around 2006.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Mahout<\/span><\/strong>: A scalable machine learning and data mining library.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Oozie<\/span><\/strong>: A workflow scheduler system to manage Hadoop jobs (such as MapReduce and Pig jobs). Oozie is implemented as a Java web application that runs in a Java servlet container.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">Spark<\/span><\/strong>: A cluster computing framework whose purpose is to process large-scale data in memory. Spark&#8217;s in-memory primitives provide performance up to 100 times faster than MapReduce for certain applications.<\/p>\n<p><strong><span style=\"text-decoration: underline;\">ZooKeeper<\/span><\/strong>: A distributed configuration service, synchronization service, and naming registry for large distributed systems.<\/p>\n<p><a href=\"http:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/telco-ref-arch-2-0\/\" rel=\"attachment wp-att-1936\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-1936\" src=\"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png\" alt=\"Telco-Ref-Arch-2.0\" width=\"1347\" height=\"817\" srcset=\"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png 1347w, https:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0-300x181.png 300w, https:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0-1024x621.png 1024w, https:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0-494x300.png 494w\" sizes=\"auto, (max-width: 1347px) 100vw, 1347px\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>The Hadoop platform is composed of different components and tools. 
Hadoop HDFS: A distributed file system that partitions large files across multiple machines for high-throughput access to data. Hadoop YARN: A framework for job scheduling and cluster resource management. Hadoop MapReduce: A programming framework for batch processing of large data sets distributed across &#8230; <a title=\"Hadoop components\" class=\"read-more\" href=\"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/\" aria-label=\"Read more about Hadoop components\">Read more<\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[65,99],"tags":[],"class_list":["post-1932","post","type-post","status-publish","format-standard","hentry","category-hadoop","category-open-source"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Hadoop components -<\/title>\n<meta name=\"description\" content=\"Hadoop platform is composed by different components and tools. Hadoop HDFS: A distributed file system that partitions large files across multiple machines - joapen projects\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop components -\" \/>\n<meta property=\"og:description\" content=\"Hadoop platform is composed by different components and tools. 
Hadoop HDFS: A distributed file system that partitions large files across multiple machines - joapen projects\" \/>\n<meta property=\"og:url\" content=\"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/\" \/>\n<meta property=\"og:site_name\" content=\"joapen projects\" \/>\n<meta property=\"article:published_time\" content=\"2014-12-15T11:23:42+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2015-08-12T21:20:47+00:00\" \/>\n<meta property=\"og:image\" content=\"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png\" \/>\n<meta name=\"author\" content=\"joapen\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"joapen\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"1 minute\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\\\/\\\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#article\",\"isPartOf\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/\"},\"author\":{\"name\":\"joapen\",\"@id\":\"http:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\"},\"headline\":\"Hadoop 
components\",\"datePublished\":\"2014-12-15T11:23:42+00:00\",\"dateModified\":\"2015-08-12T21:20:47+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/\"},\"wordCount\":255,\"commentCount\":0,\"publisher\":{\"@id\":\"http:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\"},\"image\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2014\\\/12\\\/Telco-Ref-Arch-2.0.png\",\"articleSection\":[\"Hadoop\",\"Open Source\"],\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"CommentAction\",\"name\":\"Comment\",\"target\":[\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#respond\"]}]},{\"@type\":\"WebPage\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/\",\"name\":\"Hadoop components -\",\"isPartOf\":{\"@id\":\"http:\\\/\\\/joapen.com\\\/blog\\\/#website\"},\"primaryImageOfPage\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#primaryimage\"},\"image\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#primaryimage\"},\"thumbnailUrl\":\"http:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2014\\\/12\\\/Telco-Ref-Arch-2.0.png\",\"datePublished\":\"2014-12-15T11:23:42+00:00\",\"dateModified\":\"2015-08-12T21:20:47+00:00\",\"description\":\"Hadoop platform is composed by different components and tools. 
Hadoop HDFS: A distributed file system that partitions large files across multiple machines - joapen projects\",\"breadcrumb\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#breadcrumb\"},\"inLanguage\":\"en-US\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/\"]}]},{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#primaryimage\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2014\\\/12\\\/Telco-Ref-Arch-2.0.png\",\"contentUrl\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2014\\\/12\\\/Telco-Ref-Arch-2.0.png\",\"width\":1347,\"height\":817},{\"@type\":\"BreadcrumbList\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/2014\\\/12\\\/15\\\/hadoop-components\\\/#breadcrumb\",\"itemListElement\":[{\"@type\":\"ListItem\",\"position\":1,\"name\":\"Home\",\"item\":\"http:\\\/\\\/joapen.com\\\/blog\\\/\"},{\"@type\":\"ListItem\",\"position\":2,\"name\":\"Hadoop components\"}]},{\"@type\":\"WebSite\",\"@id\":\"http:\\\/\\\/joapen.com\\\/blog\\\/#website\",\"url\":\"http:\\\/\\\/joapen.com\\\/blog\\\/\",\"name\":\"joapen projects\",\"description\":\"Just a place to 
write\",\"publisher\":{\"@id\":\"http:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"http:\\\/\\\/joapen.com\\\/blog\\\/?s={search_term_string}\"},\"query-input\":{\"@type\":\"PropertyValueSpecification\",\"valueRequired\":true,\"valueName\":\"search_term_string\"}}],\"inLanguage\":\"en-US\"},{\"@type\":[\"Person\",\"Organization\"],\"@id\":\"http:\\\/\\\/joapen.com\\\/blog\\\/#\\\/schema\\\/person\\\/23919df2312175fe9c4609203595b217\",\"name\":\"joapen\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"en-US\",\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\",\"url\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\",\"contentUrl\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\",\"width\":400,\"height\":400,\"caption\":\"joapen\"},\"logo\":{\"@id\":\"https:\\\/\\\/joapen.com\\\/blog\\\/wp-content\\\/uploads\\\/2021\\\/04\\\/joapen-mini.jpeg\"},\"sameAs\":[\"http:\\\/\\\/www.joapen.com\"]}]}<\/script>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Hadoop components -","description":"Hadoop platform is composed by different components and tools. Hadoop HDFS: A distributed file system that partitions large files across multiple machines - joapen projects","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop components -","og_description":"Hadoop platform is composed by different components and tools. 
Hadoop HDFS: A distributed file system that partitions large files across multiple machines - joapen projects","og_url":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/","og_site_name":"joapen projects","article_published_time":"2014-12-15T11:23:42+00:00","article_modified_time":"2015-08-12T21:20:47+00:00","og_image":[{"url":"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png","type":"","width":"","height":""}],"author":"joapen","twitter_misc":{"Written by":"joapen","Est. reading time":"1 minute"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#article","isPartOf":{"@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/"},"author":{"name":"joapen","@id":"http:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217"},"headline":"Hadoop components","datePublished":"2014-12-15T11:23:42+00:00","dateModified":"2015-08-12T21:20:47+00:00","mainEntityOfPage":{"@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/"},"wordCount":255,"commentCount":0,"publisher":{"@id":"http:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217"},"image":{"@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#primaryimage"},"thumbnailUrl":"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png","articleSection":["Hadoop","Open Source"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/","url":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/","name":"Hadoop components 
-","isPartOf":{"@id":"http:\/\/joapen.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#primaryimage"},"image":{"@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#primaryimage"},"thumbnailUrl":"http:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png","datePublished":"2014-12-15T11:23:42+00:00","dateModified":"2015-08-12T21:20:47+00:00","description":"Hadoop platform is composed by different components and tools. Hadoop HDFS: A distributed file system that partitions large files across multiple machines - joapen projects","breadcrumb":{"@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#primaryimage","url":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png","contentUrl":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2014\/12\/Telco-Ref-Arch-2.0.png","width":1347,"height":817},{"@type":"BreadcrumbList","@id":"https:\/\/joapen.com\/blog\/2014\/12\/15\/hadoop-components\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"http:\/\/joapen.com\/blog\/"},{"@type":"ListItem","position":2,"name":"Hadoop components"}]},{"@type":"WebSite","@id":"http:\/\/joapen.com\/blog\/#website","url":"http:\/\/joapen.com\/blog\/","name":"joapen projects","description":"Just a place to 
write","publisher":{"@id":"http:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"http:\/\/joapen.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":["Person","Organization"],"@id":"http:\/\/joapen.com\/blog\/#\/schema\/person\/23919df2312175fe9c4609203595b217","name":"joapen","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg","url":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg","contentUrl":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg","width":400,"height":400,"caption":"joapen"},"logo":{"@id":"https:\/\/joapen.com\/blog\/wp-content\/uploads\/2021\/04\/joapen-mini.jpeg"},"sameAs":["http:\/\/www.joapen.com"]}]}},"_links":{"self":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts\/1932","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/comments?post=1932"}],"version-history":[{"count":3,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts\/1932\/revisions"}],"predecessor-version":[{"id":1937,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/posts\/1932\/revisions\/1937"}],"wp:attachment":[{"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/media?parent=1932"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/joapen.com\/blog\/wp-json\/wp\/v2\/categories?post=1932"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/joapen.com\
/blog\/wp-json\/wp\/v2\/tags?post=1932"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}