1) Describe how to implement the following queries in MapReduce:
SELECT [login to view URL], [login to view URL], [login to view URL], [login to view URL], [login to view URL]
FROM Employee as emp, Agent as a
WHERE [login to view URL] = [login to view URL] AND [login to view URL] = [login to view URL];
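The first query is a standard reduce-side equi-join. A minimal in-memory sketch of the pattern (the real column names are redacted in the posting, so agent_id, emp_name, and agent_name below are my own placeholder assumptions):

```python
# Reduce-side join sketch, Hadoop Streaming style, simulated in memory.
# Column names (agent_id, emp_name, agent_name) are hypothetical stand-ins
# for the redacted identifiers in the query.
from collections import defaultdict

def map_employee(rec):
    # Tag each Employee record with its source relation; emit the join key.
    yield rec["agent_id"], ("EMP", rec["emp_name"])

def map_agent(rec):
    yield rec["agent_id"], ("AGT", rec["agent_name"])

def shuffle(pairs):
    # Group map output by key, as the MapReduce framework does between phases.
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

def reduce_join(key, values):
    # Emit the cross-product of the two relations' records sharing this key.
    emps = [v for tag, v in values if tag == "EMP"]
    agts = [v for tag, v in values if tag == "AGT"]
    for e in emps:
        for a in agts:
            yield e, a

employees = [{"agent_id": 1, "emp_name": "Ann"},
             {"agent_id": 2, "emp_name": "Bob"}]
agents = [{"agent_id": 1, "agent_name": "Zed"}]

mapped = [kv for r in employees for kv in map_employee(r)]
mapped += [kv for r in agents for kv in map_agent(r)]
result = [out for k, vs in shuffle(mapped).items() for out in reduce_join(k, vs)]
# Only Ann has a matching agent, so result == [("Ann", "Zed")]
```

The key idea: both mappers emit the join attribute as the key, so matching records from the two tables meet at the same reducer, which pairs them up.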
SELECT lo_quantity, COUNT(lo_extendedprice)
FROM lineorder, dwdate
WHERE lo_orderdate = d_datekey
AND d_yearmonth = 'Feb1995'
AND lo_discount = 6
GROUP BY lo_quantity;
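The second query chains two MapReduce jobs: job 1 performs the reduce-side join with both WHERE filters pushed into the mappers, and job 2 groups the joined rows by lo_quantity and counts them. A sketch under those assumptions (COUNT(lo_extendedprice) is treated as counting joined rows, i.e. assuming no NULLs):

```python
# Two chained MapReduce jobs, simulated in memory.
from collections import defaultdict

def map_lineorder(row):
    # Push the lo_discount = 6 filter into the map phase.
    if row["lo_discount"] == 6:
        yield row["lo_orderdate"], ("LO", row["lo_quantity"])

def map_dwdate(row):
    # Push the d_yearmonth = 'Feb1995' filter into the map phase.
    if row["d_yearmonth"] == "Feb1995":
        yield row["d_datekey"], ("DD", None)

def reduce_join(key, values):
    # Each surviving lineorder row joins once per matching dwdate row.
    n_dates = sum(1 for tag, _ in values if tag == "DD")
    for tag, qty in values:
        if tag == "LO":
            for _ in range(n_dates):
                yield qty, 1

def reduce_count(qty, ones):
    # Job 2: GROUP BY lo_quantity, counting the joined rows.
    yield qty, sum(ones)

def shuffle(pairs):
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

lineorder = [{"lo_orderdate": 19950201, "lo_discount": 6, "lo_quantity": 10},
             {"lo_orderdate": 19950202, "lo_discount": 6, "lo_quantity": 10},
             {"lo_orderdate": 19950201, "lo_discount": 4, "lo_quantity": 20}]
dwdate = [{"d_datekey": 19950201, "d_yearmonth": "Feb1995"},
          {"d_datekey": 19950202, "d_yearmonth": "Feb1995"}]

job1 = [kv for r in lineorder for kv in map_lineorder(r)]
job1 += [kv for r in dwdate for kv in map_dwdate(r)]
joined = [kv for k, vs in shuffle(job1).items() for kv in reduce_join(k, vs)]
counts = dict(kv for k, vs in shuffle(joined).items() for kv in reduce_count(k, vs))
# counts == {10: 2}: two discount-6 rows survive the filters and join
```

Filtering in the mappers is the important optimization here: it shrinks the data shuffled to the join reducers.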
SELECT d_month, AVG(d_year)
FROM dwdate
GROUP BY d_month
ORDER BY AVG(d_year)
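The third query needs no join: one MapReduce job keys on d_month and averages d_year per group, and since the output is at most twelve rows, the ORDER BY can be applied to the small result (e.g. in a single final reducer or a trivial follow-up job). A sketch:

```python
# GROUP BY d_month with AVG(d_year), then ORDER BY the aggregate,
# simulated in memory.
from collections import defaultdict

def map_dwdate(row):
    # Key on d_month so all years for a month meet at one reducer.
    yield row["d_month"], row["d_year"]

def reduce_avg(month, years):
    yield month, sum(years) / len(years)

def shuffle(pairs):
    groups = defaultdict(list)
    for k, v in pairs:
        groups[k].append(v)
    return groups

dwdate = [{"d_month": "Jan", "d_year": 1994},
          {"d_month": "Jan", "d_year": 1996},
          {"d_month": "Feb", "d_year": 1994}]

mapped = [kv for r in dwdate for kv in map_dwdate(r)]
averages = [kv for k, vs in shuffle(mapped).items() for kv in reduce_avg(k, vs)]
ordered = sorted(averages, key=lambda kv: kv[1])   # ORDER BY AVG(d_year)
# ordered == [("Feb", 1994.0), ("Jan", 1995.0)]
```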
Consider a Hadoop job that processes an input data file of size equal to 179 disk blocks (179 distinct blocks, not counting HDFS replication). The mapper in this job requires 1 minute to read and fully process a single block of data. The reducer requires 1 second (not a minute) to produce an answer for one key's worth of values, and there are 3000 distinct keys in total (mappers generate many more key-value pairs, but the keys fall only in the 1-3000 range, for 3000 unique entries). Assume that each node has a reducer and that the keys are distributed evenly.
The total cost will consist of time to perform the Map phase plus the cost to perform the Reduce phase.
How long will it take to complete the job if you only had one Hadoop worker node? For simplicity, assume that only one mapper and only one reducer are created on every node.
30 Hadoop worker nodes?
60 Hadoop worker nodes?
100 Hadoop worker nodes?
Would changing the replication factor have any effect on your answers for a-d?
You can ignore the network transfer costs as well as the possibility of node failure.
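Under the stated assumptions (one mapper and one reducer per node, even key distribution, no network or failure costs), one simple cost model is: the map phase runs in waves of ceil(179 / N) blocks at 1 minute each, and the reduce phase takes ceil(3000 / N) keys at 1 second each. A sketch of that arithmetic:

```python
# Simple runtime model for the job described above; all figures come from
# the problem statement (179 blocks, 1 min/block, 3000 keys, 1 s/key).
import math

BLOCKS, MAP_SEC = 179, 60      # 179 blocks, 1 minute per block
KEYS, REDUCE_SEC = 3000, 1     # 3000 keys, 1 second per key

def job_seconds(nodes):
    map_time = math.ceil(BLOCKS / nodes) * MAP_SEC      # map waves
    reduce_time = math.ceil(KEYS / nodes) * REDUCE_SEC  # keys per reducer
    return map_time + reduce_time

for n in (1, 30, 60, 100):
    print(n, job_seconds(n))
# 1 node:    179*60 + 3000 = 13740 s (229 min)
# 30 nodes:  6*60 + 100    = 460 s
# 60 nodes:  3*60 + 50     = 230 s
# 100 nodes: 2*60 + 30     = 150 s
```

Under this model the replication factor does not change the answers: each distinct block is still processed exactly once, and network transfer costs are ignored by assumption.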
Suppose you have an 8-node cluster with replication factor of 3. Describe what MapReduce has to do after it determines that a node has crashed while a job is being processed. For simplicity, assume that the failed node is not replaced and your cluster is reduced to 7 nodes. Specifically:
What does HDFS (the storage layer/NameNode) have to do in response to node failure in this case? I.e., what is the guarantee that HDFS has to maintain?
What does MapReduce engine (the execution layer) have to do to respond to the node failure? Assume that there was a job in progress at the time of the crash (because MapReduce engine only needs to take action if a job was in progress).
Where does the Mapper store output key-value pairs before they are sent to Reducers?
Can Reducers begin processing before the Mapper phase is complete? Why or why not?
Repeat the RSA computation examples by
a) Select two (small) primes and generate a public-private key pair.
b) Compute a sample ciphertext using your public key
c) Decrypt your ciphertext from 4-b using the private key
d) Why can’t the encrypted message sent through this mechanism be larger than the value of n?
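One possible worked example for parts a-c, using tiny primes of my own choosing (p = 5, q = 11; these values are not from the assignment):

```python
# RSA with small primes; the specific numbers here are illustrative picks.
p, q = 5, 11
n = p * q                 # modulus n = 55
phi = (p - 1) * (q - 1)   # phi(n) = 40

e = 3                     # public exponent, valid since gcd(3, 40) = 1
d = pow(e, -1, phi)       # private exponent d = 27, since 3*27 = 81 ≡ 1 (mod 40)

m = 9                     # plaintext message, must satisfy 0 <= m < n
c = pow(m, e, n)          # encrypt: 9**3 mod 55 = 14
m2 = pow(c, d, n)         # decrypt: 14**27 mod 55 = 9, recovering m
assert m2 == m
```

For part d: all RSA arithmetic is done modulo n, so a message m >= n is indistinguishable from m mod n after encryption, and decryption can only recover the residue, not the original message.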
I will write mapreduce code for hadoop and spark
I will provide services like:
Hadoop cluster setup
Hadoop map-reduce programming
Apache spark cluster setup
Apache spark map-reduce programming
AWS EMR cluster setup & map-reduce programming