
Hadoop Fundamentals for Data Scientists

January 2015 | .MP4, AVC, 3449 kbps, 1920x1080 | English, AAC, 128 kbps, 2 Ch | 6 hrs 5 mins | 7.95 GB
Instructors: Jenny Kim, Benjamin Bengfort


Hadoop's Architecture, Distributed Computing Framework, and Analytical Ecosystem

Get a practical introduction to Hadoop, the framework that made big data and large-scale analytics possible by combining distributed computing techniques with distributed storage. In this video tutorial, hosts Benjamin Bengfort and Jenny Kim discuss the core concepts behind distributed computing and big data, and then show you how to work with a Hadoop cluster and program analytical jobs. You’ll also learn how to use higher-level tools such as Hive and Spark.

Hadoop is a cluster computing technology that has many moving parts, including distributed systems administration, data engineering and warehousing methodologies, software engineering for distributed computing, and large-scale analytics. With this video, you’ll learn how to operationalize analytics over large datasets and rapidly deploy analytical jobs with a variety of toolsets.

Once you’ve completed this video, you’ll understand how different parts of Hadoop combine to form an entire data pipeline managed by teams of data engineers, data programmers, data researchers, and data business people.

Understand the Hadoop architecture and set up a pseudo-distributed development environment
Learn how to develop distributed computations with MapReduce and the Hadoop Distributed File System (HDFS)
Work with Hadoop via the command-line interface
Use the Hadoop Streaming utility to execute MapReduce jobs in Python
Explore data warehousing, higher-order data flows, and other projects in the Hadoop ecosystem
Learn how to use Hive to query and analyze relational data using Hadoop
Use summarization, filtering, and aggregation to move Big Data toward last-mile computation
Understand how analytical workflows, including iterative machine learning, feature analysis, and data modeling, work in a Big Data context
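To illustrate the Hadoop Streaming pattern the course covers, here is a minimal, self-contained sketch of a word-count job in Python. This is an illustration, not course material: in a real Streaming job, `mapper` and `reducer` would be separate scripts reading stdin and writing tab-separated key/value pairs to stdout, and the shuffle/sort phase would be handled by Hadoop; here it is simulated locally so the logic can run on its own.

```python
import itertools

def mapper(line):
    # Map phase: emit a (word, 1) pair for every whitespace-separated token.
    for word in line.split():
        yield word.lower(), 1

def reducer(word, counts):
    # Reduce phase: sum all counts the shuffle phase grouped under one key.
    return word, sum(counts)

def run_local(lines):
    # Simulate map -> shuffle/sort -> reduce on a list of input lines.
    pairs = sorted(kv for line in lines for kv in mapper(line))
    return dict(
        reducer(word, (count for _, count in group))
        for word, group in itertools.groupby(pairs, key=lambda kv: kv[0])
    )

counts = run_local(["Hadoop streams data", "data flows through Hadoop"])
# counts == {'data': 2, 'flows': 1, 'hadoop': 2, 'streams': 1}
```

On a real cluster, the equivalent job would be submitted with the hadoop-streaming jar, roughly `hadoop jar hadoop-streaming-*.jar -mapper mapper.py -reducer reducer.py -input <in> -output <out>` (paths and jar name vary by distribution).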

Benjamin Bengfort is a data scientist and programmer in Washington DC who prefers technology to politics but sees the value of data in every domain. Alongside his work teaching, writing, and developing large-scale analytics with a focus on statistical machine learning, he is finishing his PhD at the University of Maryland where he studies machine learning and artificial intelligence.

Jenny Kim, a software engineer in the San Francisco Bay Area, develops, teaches, and writes about big data analytics applications and specializes in large-scale, distributed computing infrastructures and machine-learning algorithms to support recommendations systems.

