Running a Hadoop job without using JobConf

I can't find a single example of submitting a Hadoop job that does not use the deprecated JobConf class. JobClient, which has not been deprecated, still only supports methods that take a JobConf parameter.

Can someone please point me to an example of Java code that submits a Hadoop map/reduce job using only the Configuration class (not JobConf), and using the mapreduce.lib.input packages instead of mapred.input?


2010-01-22 Greg Cottman

Hope this helps.

import java.io.File; 

import org.apache.commons.io.FileUtils; 
import org.apache.hadoop.conf.Configured; 
import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.io.LongWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.mapreduce.Job; 
import org.apache.hadoop.mapreduce.Mapper; 
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; 
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; 
import org.apache.hadoop.util.Tool; 
import org.apache.hadoop.util.ToolRunner; 

public class MapReduceExample extends Configured implements Tool { 

    static class MyMapper extends Mapper<LongWritable, Text, LongWritable, Text> {
     public MyMapper() {
     }

     // Identity map: counts each record and passes it through unchanged.
     @Override
     protected void map(LongWritable key, Text value, Context context)
       throws java.io.IOException, InterruptedException {
      context.getCounter("mygroup", "jeff").increment(1);
      context.write(key, value);
     }
    }

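    @Override
    public int run(String[] args) throws Exception {
     // Pass getConf() so that options parsed by ToolRunner (-D, -conf, ...) reach the job.
     Job job = new Job(getConf());
     job.setJarByClass(MapReduceExample.class);
     job.setMapperClass(MyMapper.class);
     FileInputFormat.setInputPaths(job, new Path(args[0]));
     FileOutputFormat.setOutputPath(job, new Path(args[1]));

     // Report failure to the caller instead of always returning 0.
     return job.waitForCompletion(true) ? 0 : 1;
    }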

    public static void main(String[] args) throws Exception {
     // Demo only: clear the previous output and use fixed local paths.
     FileUtils.deleteDirectory(new File("data/output"));
     args = new String[] { "data/input", "data/output" };
     System.exit(ToolRunner.run(new MapReduceExample(), args));
    }
}


2010-01-22 09:12:29 zjffdu

All three Job constructors are deprecated now. The correct way is: 'Job job = Job.getInstance(getConf());'

In which version? I am using v1.0.4 but cannot find Job.getInstance there.

I believe this tutorial illustrates how to get rid of the deprecated JobConf class, using Hadoop 0.20.1.


2010-01-23 20:20:31

Here is a good example with downloadable code: http://sonerbalkir.blogspot.com/2010/01/new-hadoop-api-020x.html It is also more than two years old, and there is still no official documentation that discusses the new API. Sad.


2012-04-20 17:15:45

In the previous API there were three ways to submit a job. One of them was to submit the job, get back a reference to the RunningJob, and read the job id from that RunningJob:

submitJob(JobConf) : only submits the job, then poll the returned handle to the RunningJob to query status and make scheduling decisions.

How can one use the new API to get a reference to a RunningJob and obtain its id, given that none of the new API's methods returns a reference to a RunningJob?

http://hadoop.apache.org/docs/current/api/org/apache/hadoop/mapreduce/Job.html

Thanks


2013-04-08 04:35:38 Yatin
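
Regarding the question above: in the new API the Job object itself can serve as the handle that RunningJob used to be. The following is only a minimal sketch, assuming Hadoop 2.x or later (for Job.getInstance); the class name SubmitAndPoll, the helper methods submit and waitUntilDone, and the one-second poll interval are made up for illustration. No mapper or reducer is set, so the default identity classes run.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobID;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class SubmitAndPoll {

    /** Submits the job without blocking and returns the id it was assigned. */
    static JobID submit(Job job)
            throws IOException, InterruptedException, ClassNotFoundException {
        job.submit();           // returns immediately; the Job is now the handle
        return job.getJobID();  // set once the job has been submitted
    }

    /** Polls the handle until the job finishes; returns true if it succeeded. */
    static boolean waitUntilDone(Job job, long pollMillis)
            throws IOException, InterruptedException {
        while (!job.isComplete()) {
            System.out.printf("map %.0f%% reduce %.0f%%%n",
                    job.mapProgress() * 100, job.reduceProgress() * 100);
            Thread.sleep(pollMillis);
        }
        return job.isSuccessful();
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical driver: args[0] is the input path, args[1] the output path.
        Job job = Job.getInstance(new Configuration(), "submit-and-poll");
        job.setJarByClass(SubmitAndPoll.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        JobID id = submit(job);
        System.out.println("Submitted " + id);
        System.exit(waitUntilDone(job, 1000L) ? 0 : 1);
    }
}
```

Between polls, the same Job object can be used to make scheduling decisions, for example by calling job.killJob().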

Try using Configuration and Job. Here is an example:

(Substitute your own Mapper, Combiner, and Reducer classes and any other configuration.)

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path; 
import org.apache.hadoop.io.IntWritable; 
import org.apache.hadoop.io.Text; 
import org.apache.hadoop.mapreduce.Job; 
import org.apache.hadoop.mapreduce.Mapper; 
import org.apache.hadoop.mapreduce.Reducer; 
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat; 
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat; 
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat; 
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; 

public class WordCount { 
    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        if (args.length != 2) {
            System.err.println("Usage: <in> <out>");
            System.exit(2);
        }
        Job job = Job.getInstance(conf, "Word Count");

        // set jar
        job.setJarByClass(WordCount.class);

        // set Mapper, Combiner, Reducer (TokenizerMapper and IntSumReducer must be supplied by you)
        job.setMapperClass(TokenizerMapper.class);
        job.setCombinerClass(IntSumReducer.class);
        job.setReducerClass(IntSumReducer.class);

        /* Optional, set a custom Partitioner:
         * job.setPartitionerClass(MyPartitioner.class);
         */

        // set map output and final output key/value types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // set input and output path
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));

        // by default, Hadoop uses TextInputFormat and TextOutputFormat;
        // any custom input or output format must extend the abstract InputFormat/OutputFormat classes
        job.setInputFormatClass(TextInputFormat.class);
        job.setOutputFormatClass(TextOutputFormat.class);

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}


2015-03-31 09:33:08 coderz
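
The driver above refers to TokenizerMapper and IntSumReducer without defining them. As a rough sketch of what they might look like, modeled on the classic WordCount example: the bodies below are an assumption, not the answerer's actual classes. They split each line on whitespace and sum the counts per word, and they are written as package-private top-level classes so the driver can refer to them by simple name.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

/** Emits (word, 1) for every whitespace-separated token of the input line. */
class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    protected void map(Object key, Text value, Context context)
            throws IOException, InterruptedException {
        StringTokenizer tokens = new StringTokenizer(value.toString());
        while (tokens.hasMoreTokens()) {
            word.set(tokens.nextToken());
            context.write(word, ONE);
        }
    }
}

/** Sums the counts for each word; usable as both combiner and reducer. */
class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Context context)
            throws IOException, InterruptedException {
        int sum = 0;
        for (IntWritable v : values) {
            sum += v.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}
```

Because addition is associative and commutative, the same class can safely serve as the combiner, which is why the driver passes IntSumReducer to both setCombinerClass and setReducerClass.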
