Computers and Technology
Computers and Technology, 13.12.2019 21:31, anggar20

Example:
Data set: a collection of text documents.
Problem: count the frequency of nouns that appear at least 100 times in the documents.
(i) Mapper function: tokenize each line into a set of terms (words), and filter out terms that are not nouns.
(ii) Mapper output: key is a noun, value is 1.
(iii) Reducer input: key is a noun, value is a list of 1's.
(iv) Reduce function: sums up the 1's for each key (noun).
(v) Reducer output: key is a noun, value is the frequency of the noun (filter out the nouns whose frequencies are below 100).

(a) Data set: Amazon book ratings data. Each line in the data file has 4 columns (reviewer id, book id, book genre, rating), where ratings are integer-valued, ranging from 1 to 4.
Problem: identify the highest-rated book, i.e., the book with the highest average rating, for each book genre. Note that each book can have more than one rating (e.g., by different reviewers).

(b) Data set: movie preference data. Each record in the data file contains the movie title and the list of users who liked the movie. For example:
    jaws user111 user134 user313 user5812
    star_wars user111 user313 user388 user4422
Problem: for each pair of users, count the number of movies they both liked. The output may exclude pairs of users who do not have any movies they both liked.

(c) Data set: maximum and minimum daily temperature readings for weather stations from around the world. Each line in the data files has 4 columns (station id, date, max temperature, min temperature).
Problem: find the station id and date of anomalous temperature readings in the data set. A temperature reading is anomalous if the minimum daily temperature exceeds the maximum temperature for the given day.

(d) Data set: Instagram friendship graph. Each record corresponds to an Instagram user, followed by a list of his/her friends. For example, the graph data may contain the following records:
    john123 mary456 tom312 lee222
    mary456 john123
    tom312 john123 lee222
    lee222 john123 tom312
The first line above states that mary456, tom312, and lee222 are friends of john123.
Problem: find pairs of Instagram users who are not friends with each other but who share one or more common friends. This is known as the "friend-of-a-friend" (FoF) problem. For example, mary456 and tom312 are both friends of john123, but they are not friends with each other. The Hadoop program should only output the pair (u, v) if u < v. In the previous example, the program should only output the pair (mary456, tom312) but not (tom312, mary456).

(e) Data set: cancer data. Each line in the data file corresponds to a patient with the following nominal-valued attributes: patientid, gender, marital status, smoker, weight class, and class, where the class attribute has value yes or no to indicate whether the patient has cancer. For example:
    12345, female, married, smoker, normal, yes
    13, male, single, nonsmoker, normal, no
    14423, male, married, smoker, overweight, yes
Problem: compute the Gini index for each of the following attributes: gender, marital status, smoker, and weight class, based on the distribution of their class values.
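
For the worked example at the top of the question (counting nouns), steps (i) to (v) can be sketched in plain Python roughly as follows. This is only a local simulation of the map, shuffle, and reduce phases, not an actual Hadoop job. The NOUNS set is a hypothetical stand-in for a real part-of-speech tagger, and the function names and the min_count parameter are made up for illustration.

```python
from collections import defaultdict

# Hypothetical stand-in for a real part-of-speech tagger (an assumption,
# not part of the original question).
NOUNS = {"cat", "dog", "house"}


def mapper(line):
    """Steps (i)-(ii): tokenize a line and emit (noun, 1) for each noun."""
    for term in line.lower().split():
        term = term.strip(".,;:!?\"'()")
        if term in NOUNS:
            yield term, 1


def shuffle(pairs):
    """Step (iii): group mapper output so each key maps to a list of 1's."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reducer(noun, ones, min_count):
    """Steps (iv)-(v): sum the 1's and drop nouns below the threshold."""
    total = sum(ones)
    if total >= min_count:
        yield noun, total


def count_nouns(lines, min_count=100):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    pairs = (pair for line in lines for pair in mapper(line))
    result = {}
    for noun, ones in shuffle(pairs).items():
        for key, total in reducer(noun, ones, min_count):
            result[key] = total
    return result
```

On a tiny input the threshold of 100 would filter everything out, so a call such as count_nouns(lines, min_count=2) is more useful for trying it out. In a real Hadoop job the shuffle step is done by the framework, and only the mapper and reducer would be written by hand.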


