
Programming Pig

Author: Alan Gates
Publisher: O'Reilly Media
Publication date: 2011-10
ISBN: 9781449302641
Category: Other

Description

This guide is an ideal learning tool and reference for Apache Pig, the programming language that helps you describe and run large data projects on Hadoop. With Pig, you can analyze data without having to create a full-fledged application - making it easy for you to experiment with new data sets. Programming Pig shows newcomers how to get started, and teaches intermediate users the benefits of using Pig Latin, the data flow language for building and maintaining pipelines for processing data. Advanced users learn how to build complex data processing pipelines with Pig's macros and modularity features, and discover how to build systems for complex data processing needs by embedding Pig Latin into scripting languages.

* Learn the advantages and disadvantages of using Pig instead of MapReduce
* Understand how Pig fits in with other Hadoop components, such as HDFS, Hive, MapReduce, and HBase
* Follow examples that explain built-in Pig Latin functions, and data operators such as join and group
* Use grunt, the shell that Pig provides for exploring and working with HDFS
* Get performance tuning tips for running Pig Latin scripts on Hadoop clusters in less time
* Extend Pig with powerful user defined functions written in Java or Python



Excerpts

To be mathematically precise, a Pig Latin script describes a directed acyclic graph (DAG), where the edges are data flows and the nodes are operators that process the data.
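The DAG description in this excerpt can be made concrete with a small sketch. The Python below is not Pig code and not taken from the book; it only models a hypothetical five-operator data flow as a graph and asks the standard library for one order in which the operators could run. The operator names, the relation names, and the Pig Latin shown in the comment are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical data flow, loosely modeled on a small Pig Latin script
# (illustrative only, not an example from the book):
#   users  = LOAD 'users' AS (id, name);
#   clicks = LOAD 'clicks' AS (user_id, url);
#   joined = JOIN users BY id, clicks BY user_id;
#   grpd   = GROUP joined BY users::id;
#   STORE grpd INTO 'out';
#
# Each key is an operator (a node). Its set lists the operators whose
# output it consumes (the incoming edges, i.e. the data flows).
dataflow = {
    "load_users": set(),
    "load_clicks": set(),
    "join": {"load_users", "load_clicks"},
    "group": {"join"},
    "store": {"group"},
}


def execution_order(graph):
    """Return one valid operator order; raises graphlib.CycleError on a cycle."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(dataflow)
print(order)
```

Because the graph is acyclic, every operator can be scheduled after all of its inputs. The two load nodes have no dependency on each other, which is the kind of independence that lets an engine run parts of a data flow in parallel.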

Pig provides an engine for executing data flows in parallel on Hadoop. It includes a language, Pig Latin ...

