Recap of Hadoop News for September 2018
Hadoop is the cornerstone of the big data industry; however, the challenges
involved in maintaining a Hadoop network have led to the development and growth
of the Hadoop-as-a-Service (HaaS) market. Industry research reveals that the
global Hadoop-as-a-Service market is anticipated to reach $16.2 billion by
2020, growing at a compound annual growth rate of 70.8% from 2014 to 2020.
With market leaders like Microsoft and SAP expanding into end-user industries,
HaaS is likely to witness rapid growth in the next seven years. Organizations
like Commerzbank have already launched new platforms based on HaaS solutions,
which demonstrates that HaaS is a promising approach for building and managing
big data clusters. HaaS will encourage organizations to consider Hadoop as a
solution to various big data challenges.
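As a quick sanity check on those figures, compounding backwards from the $16.2 billion 2020 projection at the stated 70.8% annual rate implies a 2014 base market of roughly $0.65 billion (an illustration only; the research firm's actual base-year figure is not given in the article):

```python
# Back out the implied 2014 market size from the 2020 projection
# and the stated 70.8% CAGR over the six years 2014-2020.
projected_2020 = 16.2   # billions USD, from the article
cagr = 0.708            # 70.8% compound annual growth rate
years = 6               # 2014 -> 2020

implied_2014 = projected_2020 / (1 + cagr) ** years
print(f"Implied 2014 market size: ${implied_2014:.2f}B")
```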
Hortonworks unveils roadmap to make Hadoop cloud-native. ZDNet.com, September 10, 2018
Recognizing the importance of the cloud, Hortonworks is partnering with Red
Hat and IBM to transform Hadoop into a cloud-native platform. Today Hadoop can
run in the cloud, but it cannot exploit the capabilities of the cloud
architecture to the fullest. The idea of making Hadoop cloud-native is not a
mere matter of buzzword compliance; the goal is to make it more fleet-footed.
Currently, 25% of workloads from Hadoop incumbents MapR, Hortonworks, and
Cloudera are running in the cloud; however, by next year it is anticipated
that half of all new big data workloads will be deployed on the cloud.
Hortonworks is unveiling the Open Hybrid Architecture initiative to transform
Hadoop into a cloud-native platform, addressing containerization, Kubernetes
support, and a roadmap for separating compute from data.
LinkedIn open-sources a tool to run TensorFlow on Hadoop. InfoWorld.com, September 13, 2018
LinkedIn’s open-source project TonY (TensorFlow on YARN) aims at scaling and
managing deep learning jobs in TensorFlow using Hadoop’s YARN scheduler. TonY
uses YARN’s resource and task scheduling system to run TensorFlow jobs on a
Hadoop cluster. TonY can also schedule GPU-based TensorFlow jobs through
Hadoop, allocate memory separately for TensorFlow nodes, request different
types of resources (CPUs vs. GPUs), and ensure that job results are saved to
HDFS at regular intervals so that interrupted or crashed jobs can resume from
where they left off. LinkedIn claims that TonY adds no overhead to TensorFlow
jobs because it sits at the layer that orchestrates distributed TensorFlow and
does not interfere with the execution of the TensorFlow jobs themselves. TonY
is also used for visualizing, optimizing, and debugging TensorFlow apps.
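The checkpoint-and-resume behavior described above is a general pattern, not TonY-specific code. The sketch below illustrates it with plain Python under assumed names; a real TonY job would write its checkpoints to an HDFS path, for which a local JSON file stands in here:

```python
import json
import os

# Illustrative sketch (not TonY's actual API): save progress at regular
# intervals, and on restart resume from the last saved step.
CHECKPOINT = "checkpoint.json"  # a real job would use an HDFS path

def load_checkpoint():
    # Resume from the last saved step if a checkpoint exists.
    if os.path.exists(CHECKPOINT):
        with open(CHECKPOINT) as f:
            return json.load(f)["step"]
    return 0

def save_checkpoint(step):
    # Persist progress so an interrupted or crashed job can pick up
    # where it left off instead of starting over.
    with open(CHECKPOINT, "w") as f:
        json.dump({"step": step}, f)

def run_job(total_steps=10, checkpoint_every=3):
    start = load_checkpoint()
    for step in range(start, total_steps):
        # ... one training step would run here ...
        if (step + 1) % checkpoint_every == 0:
            save_checkpoint(step + 1)
    return start, total_steps
```

On a first run the job starts from step 0; if it is killed and rerun, it resumes from the most recent checkpoint rather than from scratch.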
Microsoft’s SQL Server gets built-in support for Spark and Hadoop. TechCrunch.com, September 24, 2018
Microsoft has announced the addition of new connectors that will allow
businesses to use SQL Server to query other databases like MongoDB, Oracle,
and Teradata. This turns SQL Server into a virtual integration layer in which
data never has to be replicated or moved into SQL Server itself. SQL Server
2019 will ship with built-in support for Hadoop and Spark, and will support
big data clusters through the Google-incubated Kubernetes container
orchestration system. Every big data cluster will include SQL Server, the
Hadoop file system, and Spark.
Big-data project aims to transform farming in world’s poorest countries. Nature.com, September 24, 2018
Big data is changing the way data is used in agriculture. The FAO, the Bill &
Melinda Gates Foundation, and national governments have launched a
US$500-million effort to help developing countries collect data on small-scale
farmers as part of the fight against hunger and the push for rural
development. Collecting accurate information about seed varieties, farmers’
technological capacity, and farmers’ incomes will help coalition members
understand how ongoing agricultural investments are making an impact. This
data will also enable governments to tailor policies to help farmers.
Mining equipment-maker uses BI on Hadoop to dig for data. TechTarget.com, September 26, 2018
Milwaukee-based mining equipment maker Komatsu Mining Corp. is looking to
churn more data in place and share BI analytics on that data both within and
outside the organization. To improve efficiency, Komatsu has combined several
big data tools, including Spark, Hadoop, Kafka, Kudu, and Impala from
Cloudera. It has also added on-cluster analytics software from BI-on-Hadoop
analytics toolmaker Arcadia Data. This big data platform has been assembled to
analyze sensor data collected by equipment in the field and track the wear and
tear on massive shovels and earth movers. The company foresees a future in
which the platform will use IoT application data for better predictive and
prescriptive equipment maintenance.

[Source] https://www.dezyre.com/article/recap-of-hadoop-news-for-september-2018/404
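At its simplest, the wear-monitoring analysis described above boils down to flagging machines whose sensor-reported wear crosses a service threshold. The toy sketch below is purely illustrative: the field names and threshold are invented, and the real pipeline streams sensor data through Kafka and analyzes it with Spark rather than plain Python:

```python
# Illustrative only: flag equipment for maintenance when reported wear
# exceeds a threshold. Field names and the 0.8 cutoff are assumptions,
# not Komatsu's actual schema or criteria.
WEAR_THRESHOLD = 0.8  # fraction of rated service life (assumed)

def flag_for_maintenance(readings):
    """Return IDs of machines whose reported wear exceeds the threshold."""
    return [r["machine_id"] for r in readings if r["wear"] > WEAR_THRESHOLD]

readings = [
    {"machine_id": "shovel-01", "wear": 0.62},
    {"machine_id": "shovel-02", "wear": 0.91},  # past threshold
    {"machine_id": "hauler-07", "wear": 0.85},  # past threshold
]
print(flag_for_maintenance(readings))  # -> ['shovel-02', 'hauler-07']
```

A predictive-maintenance system would extend this from a static threshold to forecasting wear trends over time, which is where the on-cluster analytics come in.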