Sign using Adobe Reader
Go here
Lets talk Hadoop & Netezza
Wednesday, November 25, 2015
Thursday, October 29, 2015
Hive Properties
https://murshedsqlcat.wordpress.com/2014/04/18/useful-hive-settings/
https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties
Join two CSV data sets using MapReduce
http://www.codeproject.com/Articles/869383/Implementing-Join-in-Hadoop-Map-Reduce
Difference between the Writable and WritableComparable interfaces
org.apache.hadoop.io.Writable is a Java interface that every key and value type in the Hadoop MapReduce framework must implement. It declares two methods: write(DataOutput), which serializes the object's fields, and readFields(DataInput), which deserializes them. Implementations typically also provide a static read(DataInput) method that constructs a new instance, calls readFields(DataInput) on it, and returns the instance.
org.apache.hadoop.io.WritableComparable is a Java interface that extends both Writable and java.lang.Comparable. Any type that is to be used as a key in the Hadoop MapReduce framework should implement it, because the framework sorts keys between the map and reduce phases. WritableComparable objects are compared to each other, typically via Comparators.
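To make the distinction concrete, here is a minimal sketch in plain Java. So that it runs without Hadoop on the classpath, it declares two local stand-in interfaces that mirror the shape of the Hadoop ones; they are simplified assumptions, not the real org.apache.hadoop.io types. Count, WordKey, toBytes and keyFromBytes are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class WritableSketch {

    // Stand-ins mirroring the two Hadoop interfaces (assumption: simplified copies).
    interface Writable {
        void write(DataOutput out) throws IOException;
        void readFields(DataInput in) throws IOException;
    }

    interface WritableComparable<T> extends Writable, Comparable<T> { }

    // Hypothetical value type: it only needs to be serializable, so Writable is enough.
    static final class Count implements Writable {
        int count;
        Count() { }
        Count(int count) { this.count = count; }
        @Override public void write(DataOutput out) throws IOException { out.writeInt(count); }
        @Override public void readFields(DataInput in) throws IOException { count = in.readInt(); }
    }

    // Hypothetical key type: keys get sorted, so it must also define an ordering.
    static final class WordKey implements WritableComparable<WordKey> {
        String word = "";
        WordKey() { }
        WordKey(String word) { this.word = word; }
        @Override public void write(DataOutput out) throws IOException { out.writeUTF(word); }
        @Override public void readFields(DataInput in) throws IOException { word = in.readUTF(); }
        @Override public int compareTo(WordKey other) { return word.compareTo(other.word); }

        // The static read(DataInput) convention: construct, readFields, return.
        static WordKey read(DataInput in) throws IOException {
            WordKey key = new WordKey();
            key.readFields(in);
            return key;
        }
    }

    // Serializes any Writable into a byte array.
    static byte[] toBytes(Writable w) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            w.write(new DataOutputStream(buffer));
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Rebuilds a WordKey from bytes produced by toBytes.
    static WordKey keyFromBytes(byte[] bytes) {
        try {
            return WordKey.read(new DataInputStream(new ByteArrayInputStream(bytes)));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Round trip through write() and readFields().
        WordKey copy = keyFromBytes(toBytes(new WordKey("hive")));
        System.out.println(copy.word);

        // Only the key type can be sorted, because only it is Comparable.
        List<WordKey> keys = new ArrayList<>(Arrays.asList(new WordKey("pig"), new WordKey("hadoop"), copy));
        Collections.sort(keys);
        for (WordKey k : keys) {
            System.out.println(k.word);
        }
    }
}
```

In a real job the two classes would implement the Hadoop interfaces directly. A key class would normally also override equals() and hashCode(), since the default partitioner hashes keys to pick a reducer.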
Wednesday, October 28, 2015
In-mapper combining for word count
https://vangjee.wordpress.com/2012/03/07/the-in-mapper-combining-design-pattern-for-mapreduce-programming/
Find the top N most frequent words
1- Let the mapper run as usual, writing (word, 1) pairs for the reduce phase.
Reduce phase:
1- We override two methods: reduce() and cleanup().
2- In reduce(), we first compute the sum of all the values received from the mappers for this key, which is the number of occurrences of the word in the book; then we put the word and its count into a HashMap (countMap).
3- In cleanup(), which runs once after the last reduce() call, we sort the HashMap by value, in descending order of count, with sortByValues(countMap).
4- Still in cleanup(), we loop over the sorted key set and output the first 20 items.
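The reduce-side steps above can be sketched in plain Java without Hadoop. This is a rough simulation under stated assumptions, not the original post's source. The class name TopNWordsSketch and the n parameter are hypothetical, reduce() and cleanup() only imitate the roles of the Reducer methods of the same names, and ties are broken alphabetically to keep the output deterministic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class TopNWordsSketch {

    // Accumulates one total per word across all reduce() calls.
    private final Map<String, Integer> countMap = new HashMap<>();

    // Plays the role of reduce(): called once per word with the 1s emitted by the mappers.
    public void reduce(String word, Iterable<Integer> values) {
        int sum = 0;
        for (int value : values) {
            sum += value;
        }
        countMap.put(word, sum);
    }

    // Plays the role of cleanup(): called once at the end, sorts by count and keeps the first n.
    public List<Map.Entry<String, Integer>> cleanup(int n) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(countMap.entrySet());
        entries.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue()); // highest count first
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey()); // tie-break (assumption)
        });
        return new ArrayList<>(entries.subList(0, Math.min(n, entries.size())));
    }

    public static void main(String[] args) {
        TopNWordsSketch reducer = new TopNWordsSketch();
        reducer.reduce("hadoop", Arrays.asList(1, 1, 1));
        reducer.reduce("hive", Arrays.asList(1, 1));
        reducer.reduce("pig", Arrays.asList(1));
        for (Map.Entry<String, Integer> entry : reducer.cleanup(2)) {
            System.out.println(entry.getKey() + "\t" + entry.getValue());
        }
    }
}
```

Because the map lives inside one reducer instance, this approach yields a global top N only when the job runs with a single reducer. With several reducers, each one would output its own local top N.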
Finding the top 10 list from a set
http://blog.pivotal.io/pivotal/products/how-hadoop-mapreduce-can-transform-how-you-build-top-ten-lists