Hadoop in Action

Title: Hadoop in Action
Author: Chuck Lam
Translator:
ISBN: 9781935182191
Publisher: Manning Publications
Publication date: 2010-12-22
Format: epub/mobi/azw3/pdf
Pages: 325
Douban rating: 8.2

About the book:

HIGHLIGHT
Hadoop in Action is an example-rich tutorial that shows developers how to implement data-intensive distributed computing using Hadoop and the MapReduce framework.

DESCRIPTION
Hadoop is an open source implementation of Google's MapReduce framework for scalable, distributed data processing. Hadoop in Action is for programmers, architects, and project managers who have to process large amounts of data offline. The book begins with several simple examples that illustrate the basic idea behind Hadoop. Later chapters explain the core framework components and demonstrate Hadoop in a variety of data analysis tasks. Throughout the book, readers will learn best practices and design patterns, and how to write meaningful programs in a MapReduce framework.

KEY POINTS
  • Explains distributed computing, MapReduce, and the Hadoop framework
  • Focuses on most-used features and rapid development solutions
  • Numerous hands-on examples to illustrate abstract ideas
  • Concise, developer-centric, In Action style
  • Multiple case studies demonstrate real-world Hadoop uses
  • Covers popular Hadoop extensions that ease development and extend functionality

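About the author:

Chuck Lam is currently building RollCall, a mobile social networking company that gives socially active individuals a social assistant. He was previously a senior tech lead at RockYou, where he developed social applications and a data processing infrastructure able to support hundreds of millions of users. Chuck became interested in big data while pursuing his PhD at Stanford University. His thesis, "Computational Data Acquisition," pioneered data acquisition methods for machine learning, drawing on ideas from fields such as open source software and online games.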

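Reader comments:

@ Marble Arch: Finished reading, time to get to work!
@ Linxh: Read the first few chapters to get an overview; come back to the rest when needed, alongside practice.
@ 龙吟: Very well written.
@ 蓉蓉向你问好: The early descriptions are all very thorough, but Java and Python are interleaved throughout the text, which feels inconsistent, and the last few chapters are less coherent.
@ masque: Fairly detailed. I like the In Action books.
@ jny: Real knowledge comes from practice!!! Recommended for beginners!!!
@ beren: Covers more practical material: how to use it, how to install it, how to run it. Very suitable for beginners.
@ 菊: Read the basics part; the key is still hands-on practice.
@ Leib: Four stars for the book, one star for Hadoop, a weak platform.
@ 童年在地图上: Skipped only the Pig chapter. An introductory Hadoop book: enough to get started, but no more than that. Also, the original reads much better than the translated edition; the translation of some terms is beyond criticism.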

Book contents

  • Another important job of the InputFormat is to divide the input data sources (e.g., input files) into fragments that make up the inputs to individual map tasks. These fragments are called "splits" and are encapsulated in instances of the InputSplit interface.
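    -- Quoted from chapter 2: Hadoop Input and Output
  • Key : value produced by each InputFormat:
    TextInputFormat: byte offset of the line in the file : the whole line
    KeyValueTextInputFormat: the text before the first "\t" : the rest of the line
    SequenceFileInputFormat: the file is binary, so the key and value are both specified by the user
    NLineInputFormat: same as TextInputFormat; the only difference is that each split holds N lines
    -- Quoted from chapter 2: Hadoop Input and Output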
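
The record semantics quoted above can be sketched in plain Python. This is only a simulation under stated assumptions, not Hadoop code: the helper names (`text_input_records`, `key_value_text_records`, `n_line_splits`) and the sample data are hypothetical, and the behavior for a line with no tab (whole line as key, empty value) is assumed to mirror what KeyValueTextInputFormat does.

```python
# Plain-Python simulation of the key/value semantics described above.
# These helpers are hypothetical; they are not part of the Hadoop API.

def text_input_records(data: bytes):
    """Mimic TextInputFormat: (byte offset of line start, line text)."""
    offset = 0
    for raw in data.splitlines(keepends=True):
        yield offset, raw.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw)  # the next key is the offset of the next line


def key_value_text_records(data: bytes, sep: str = "\t"):
    """Mimic KeyValueTextInputFormat: split each line at the first separator."""
    for _, line in text_input_records(data):
        key, _, value = line.partition(sep)
        # Assumption: with no separator, the whole line is the key
        # and the value is empty.
        yield key, value


def n_line_splits(data: bytes, n: int):
    """Mimic NLineInputFormat: group records into splits of at most n lines."""
    records = list(text_input_records(data))
    return [records[i:i + n] for i in range(0, len(records), n)]


sample = b"alice\t30\tNY\nbob\t25\ncarol\n"
text_records = list(text_input_records(sample))
kv_records = list(key_value_text_records(sample))
splits = n_line_splits(sample, 2)

print(text_records)  # keys are byte offsets: 0, 12, 19
print(kv_records)    # only the first tab separates key from value
print([len(s) for s in splits])  # two lines per split, remainder in the last
```

Each inner list returned by `n_line_splits` stands in for one split, that is, the input of one map task. In real Hadoop, splits of plain text files are normally sized by bytes rather than by line count, which is the point of difference NLineInputFormat introduces.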