Computers and Technology, 01.06.2020 00:57, jaylaa04

Question 1 (Index Construction):
Suppose you have joined a search engine development team to design a search algorithm based on both the Vector model and the Boolean model.
You have collected the following documents (unstructured) and plan to apply an index technique to convert them into an inverted index.

Doc 1: data science is field to use scientific method, process, algorithm, system to extract knowledge.

Doc 2: data mining is the process to discover pattern in large data to involve method at the database system.

Doc 3: information system is the study of network of hardware and software that people use to process data.

To answer the questions below, show your procedure in detail, step by step.
Remove all stop words and punctuation before creating the inverted index. Then complete the following steps:

Question 1.1:
Create a merged inverted list including the within-document frequencies for each term.
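
One way to approach this step is sketched below. The question does not say which stop-word list to use, so `STOP_WORDS` here is an assumed, minimal list chosen only for these three documents; a different list would change the resulting terms. The names `tokenize` and `build_index` are illustrative helpers, not part of the assignment.

```python
import re
from collections import Counter, defaultdict

DOCS = {
    1: "data science is field to use scientific method, process, algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data to involve method at the database system.",
    3: "information system is the study of network of hardware and software that people use to process data.",
}

# ASSUMPTION: the assignment gives no stop-word list; this one is hypothetical.
STOP_WORDS = {"is", "to", "the", "in", "at", "of", "and", "that"}


def tokenize(text):
    """Lowercase, drop punctuation, and remove stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def build_index(docs):
    """Return {term: {doc_id: within-document frequency}}, sorted by term."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, tf in Counter(tokenize(text)).items():
            index[term][doc_id] = tf
    return dict(sorted(index.items()))


index = build_index(DOCS)
for term, postings in index.items():
    # One line per term: the term followed by its (doc_id, tf) pairs.
    print(term, sorted(postings.items()))
```

Under this stop-word list, "data" occurs twice in Doc 2 and once in each of the other documents, so its merged entry is `[(1, 1), (2, 2), (3, 1)]`.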

Question 1.2:
Use the index created above to create a dictionary and the related posting file.
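
A minimal sketch of the split into a dictionary and a posting file follows. It assumes one common layout: the dictionary maps each term to its document frequency and an offset, and the posting file is a flat list of `(doc_id, tf)` pairs. The stop-word list and the helper names (`tokenize`, `split_index`, `lookup`) are assumptions for illustration.

```python
import re
from collections import Counter, defaultdict

DOCS = {
    1: "data science is field to use scientific method, process, algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data to involve method at the database system.",
    3: "information system is the study of network of hardware and software that people use to process data.",
}

# ASSUMPTION: hypothetical stop-word list; the assignment does not provide one.
STOP_WORDS = {"is", "to", "the", "in", "at", "of", "and", "that"}


def tokenize(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


# Merged inverted list: {term: {doc_id: tf}}.
index = defaultdict(dict)
for doc_id, text in DOCS.items():
    for term, tf in Counter(tokenize(text)).items():
        index[term][doc_id] = tf


def split_index(index):
    """Split the merged list into a dictionary and a flat posting file."""
    dictionary = {}  # term -> (document frequency, offset into posting file)
    postings = []    # flat list of (doc_id, tf), grouped by term
    for term in sorted(index):
        entries = sorted(index[term].items())
        dictionary[term] = (len(entries), len(postings))
        postings.extend(entries)
    return dictionary, postings


dictionary, postings = split_index(index)


def lookup(term):
    """Read a term's postings back out using the dictionary entry."""
    df, offset = dictionary[term]
    return postings[offset:offset + df]


print(dictionary["data"])  # (3, 1): three documents, postings start at offset 1
print(lookup("data"))      # [(1, 1), (2, 2), (3, 1)]
```

Keeping only a frequency and an offset in the dictionary keeps it small enough to hold in memory, while the longer posting lists can live in a separate file.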

Question 1.3:
Please design three Boolean queries (for example, web AND search) and list the relevant documents for each query. Each query must contain at least two keywords, and no keyword may appear in only one document.
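
As a sketch, Boolean queries can be evaluated with set operations on the posting lists. The three queries below are hypothetical examples, not the required answer; they use only "method", "use", and "system", each of which appears in at least two documents under the assumed stop-word list.

```python
import re
from collections import defaultdict

DOCS = {
    1: "data science is field to use scientific method, process, algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data to involve method at the database system.",
    3: "information system is the study of network of hardware and software that people use to process data.",
}

# ASSUMPTION: hypothetical stop-word list; the assignment does not provide one.
STOP_WORDS = {"is", "to", "the", "in", "at", "of", "and", "that"}


def tokenize(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


# For Boolean retrieval only document membership matters, so store sets of IDs.
postings = defaultdict(set)
for doc_id, text in DOCS.items():
    for term in tokenize(text):
        postings[term].add(doc_id)


def docs_for(term):
    """Return the set of documents containing the term (empty if unknown)."""
    return postings.get(term, set())


q1 = docs_for("method") & docs_for("use")     # method AND use
q2 = docs_for("method") | docs_for("use")     # method OR use
q3 = docs_for("system") - docs_for("method")  # system AND NOT method

print(sorted(q1))  # [1]
print(sorted(q2))  # [1, 2, 3]
print(sorted(q3))  # [3]
```

AND maps to set intersection, OR to union, and AND NOT to set difference, so each query returns an unranked set of matching documents.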

Question 1.4:
Please use the Vector model to query the inverted index, and compare the result with the Boolean model. (Hint: you can use cosine similarity and set a similarity threshold.)
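
The sketch below scores documents by cosine similarity, accumulating dot products term by term from the inverted index. It assumes raw term frequencies as weights (tf-idf is a common alternative), a hypothetical stop-word list, a hypothetical query "method use", and an arbitrary threshold of 0.3; none of these come from the assignment.

```python
import math
import re
from collections import Counter, defaultdict

DOCS = {
    1: "data science is field to use scientific method, process, algorithm, system to extract knowledge.",
    2: "data mining is the process to discover pattern in large data to involve method at the database system.",
    3: "information system is the study of network of hardware and software that people use to process data.",
}

# ASSUMPTION: hypothetical stop-word list; the assignment does not provide one.
STOP_WORDS = {"is", "to", "the", "in", "at", "of", "and", "that"}


def tokenize(text):
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


index = defaultdict(dict)  # term -> {doc_id: tf}
doc_norm = {}              # doc_id -> Euclidean length of its tf vector
for doc_id, text in DOCS.items():
    counts = Counter(tokenize(text))
    doc_norm[doc_id] = math.sqrt(sum(tf * tf for tf in counts.values()))
    for term, tf in counts.items():
        index[term][doc_id] = tf


def search(query, threshold):
    """Return (doc_id, cosine) pairs at or above threshold, best first."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(w * w for w in q.values()))
    if q_norm == 0:
        return []
    dots = defaultdict(float)
    for term, q_weight in q.items():
        # Only documents in this term's posting list can gain score.
        for doc_id, tf in index.get(term, {}).items():
            dots[doc_id] += q_weight * tf
    scored = [(d, dot / (q_norm * doc_norm[d])) for d, dot in dots.items()]
    return sorted((p for p in scored if p[1] >= threshold), key=lambda p: -p[1])


for doc_id, score in search("method use", 0.0):
    print(doc_id, round(score, 3))
print(search("method use", 0.3))  # only Doc 1 clears the assumed threshold
```

With these assumptions the ranking is Doc 1, then Doc 3, then Doc 2. Unlike the Boolean model, which returns either Doc 1 alone (AND) or all three documents unordered (OR), the vector model orders the documents by degree of match, and the threshold decides where to cut the list.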
