Roger Hosto

Working with Hadoop Streaming

Posted on June 22, 2012 by webgeek

Hadoop streaming is a utility that comes with the Hadoop distribution. The utility allows you to create and run map/reduce jobs with any executable or script as the mapper and/or the reducer. For example:

shell> $HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/hadoop-streaming.jar  -input myInputDirs -output myOutputDir -mapper /bin/cat -reducer /bin/wc

If you are using the tar package from Apache Hadoop, you can find the streaming jar at $HADOOP_HOME/contrib/streaming/hadoop-streaming-xxx.jar
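
Because a streaming mapper and reducer simply read lines on stdin and write lines to stdout, a job can be roughly approximated on a local machine with an ordinary pipe before it is submitted to the cluster. The sketch below is only an illustration under that assumption: run_streaming_locally is a hypothetical helper name, not part of Hadoop, and sort stands in for the sort-by-key step that Hadoop performs between the map and reduce phases. It runs a single mapper and a single reducer, so it will not reproduce partitioning across several reducers.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): approximate a streaming job
# locally as  mapper | sort | reducer.
run_streaming_locally() {
    mapper=$1
    reducer=$2
    # $mapper and $reducer are left unquoted on purpose so that a command
    # with arguments, such as "wc -l", is split into words.
    # sort approximates the sort Hadoop applies to mapper output by key.
    $mapper | sort | $reducer
}

# Same mapper (cat) and reducer (wc) as in the hadoop command above,
# fed with two sample lines instead of files from myInputDirs.
printf 'hello world\ngood talk\n' | run_streaming_locally cat wc
```

If the pipe produces the output you expect, the same two commands can be passed to -mapper and -reducer on the hadoop command line.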

Category: Databases Administration
