A Guide to Writing Mapper and Reducer Classes in MapReduce

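When you write a Java program that processes large datasets with MapReduce, you need a Mapper class to process the input records and a Reducer class to aggregate the results. The example below shows how to write both classes to count the documents whose title contains 'Engineering'.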

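Mapper class code example:

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;

    public class TitleCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            String line = value.toString();
            // Assumes the title and the content are separated by a tab
            String[] parts = line.split("\t");
            String title = parts[0]; // Assumes the title is the first field

            // Match titles that contain the keyword, as the task requires
            if (title.contains("Engineering")) {
                word.set("Engineering"); // Use the keyword as the output key
                context.write(word, ONE); // Emit the pair ("Engineering", 1)
            }
        }
    }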

Reducer class code example:

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;

    public class TitleCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int count = 0;
            for (IntWritable value : values) {
                count += value.get(); // Accumulate the count
            }
            context.write(key, new IntWritable(count)); // Emit the pair ("Engineering", count)
        }
    }

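Code walkthrough:

* Mapper class:
    * Receives input key-value pairs, where each value is one line of text.
    * Splits the line on the tab character and takes the first field as the title.
    * Checks whether the title contains 'Engineering'.
    * If it does, emits the key-value pair ('Engineering', 1).
* Reducer class:
    * Receives the pairs emitted by the Mapper, grouped by key: the key is 'Engineering' and each value is 1.
    * Sums all the values for the key 'Engineering', which gives the number of matching documents.
    * Emits the final result ('Engineering', count).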
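To make the data flow above concrete, here is a minimal sketch that imitates the map, shuffle, and reduce phases in plain Java, without Hadoop. The class `TitleCountSimulation`, its method names, and the sample lines are all hypothetical, and the sketch assumes tab-separated input with the title in the first field and a match on titles that contain the keyword. A real job distributes these phases across a cluster, so treat this only as an illustration of what each phase receives and emits.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-alone class; it imitates the data flow without using Hadoop.
class TitleCountSimulation {

    // Map phase: emit ("Engineering", 1) for each line whose title matches.
    static List<Map.Entry<String, Integer>> mapPhase(List<String> lines) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<>();
        for (String line : lines) {
            String title = line.split("\t")[0]; // title assumed to be the first field
            if (title.contains("Engineering")) {
                pairs.add(Map.entry("Engineering", 1));
            }
        }
        return pairs;
    }

    // Shuffle phase: group the emitted values by key, as the framework would.
    static Map<String, List<Integer>> shufflePhase(List<Map.Entry<String, Integer>> pairs) {
        Map<String, List<Integer>> grouped = new TreeMap<>();
        for (Map.Entry<String, Integer> pair : pairs) {
            grouped.computeIfAbsent(pair.getKey(), k -> new ArrayList<>()).add(pair.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the grouped values for each key.
    static Map<String, Integer> reducePhase(Map<String, List<Integer>> grouped) {
        Map<String, Integer> totals = new TreeMap<>();
        for (Map.Entry<String, List<Integer>> entry : grouped.entrySet()) {
            int count = 0;
            for (int value : entry.getValue()) {
                count += value;
            }
            totals.put(entry.getKey(), count);
        }
        return totals;
    }

    // Runs the three phases in order on the given input lines.
    static Map<String, Integer> run(List<String> lines) {
        return reducePhase(shufflePhase(mapPhase(lines)));
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "Software Engineering\tfirst document body",
                "Cooking Basics\tsecond document body",
                "Engineering Ethics\tthird document body");
        System.out.println(run(lines)); // prints {Engineering=2}
    }
}
```

Because every matching line is emitted under the same key, the reduce phase sees a single group, and its sum is the document count.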

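Please note:

* The code above is for reference only. In a real application, adapt it to your data format and requirements.
* For example, you may need to change the delimiter, as well as the logic that decides whether a title contains a given keyword.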
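As one possible way to make those two adjustments, the sketch below moves the delimiter and the keyword check into a small helper that a Mapper could call. The class `TitleMatcher` and its methods are hypothetical names, not part of Hadoop, and the case-insensitive comparison is an assumption about what a given dataset might need.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical helper; the class and method names are illustrative only.
class TitleMatcher {
    private final String delimiter;
    private final String keyword;

    TitleMatcher(String delimiter, String keyword) {
        this.delimiter = delimiter;
        this.keyword = keyword.toLowerCase(Locale.ROOT);
    }

    // Returns the first field of the line, treated here as the title.
    String extractTitle(String line) {
        // Pattern.quote makes the delimiter literal, so "|" or "." are safe to use.
        String[] parts = line.split(Pattern.quote(delimiter), 2);
        return parts[0].trim();
    }

    // True when the title contains the keyword, ignoring letter case.
    boolean matches(String line) {
        return extractTitle(line).toLowerCase(Locale.ROOT).contains(keyword);
    }

    public static void main(String[] args) {
        TitleMatcher matcher = new TitleMatcher("|", "engineering");
        System.out.println(matcher.matches("Civil ENGINEERING Handbook|body text")); // true
        System.out.println(matcher.matches("Cooking Basics|about engineering"));     // false
    }
}
```

Quoting the delimiter matters because `String.split` interprets its argument as a regular expression, so a bare `|` would otherwise split between every character.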

Hopefully this guide helps you better understand and write the Mapper and Reducer classes in a MapReduce program.