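Introduction: This article is the summary of a master's thesis in geotechnical engineering, divided into six parts. Students who are unsure how to write the summary section of a thesis can look at the overall layout and structure of this piece; we hope it is helpful.
1. Elsevier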
Nowadays, without the support of an information retrieval (IR) system, academic searching is like looking for a needle in a haystack. As Manning et al. (2009) explain, IR is the process of finding unstructured material within a large collection in order to satisfy an information need, here in the academic field. In general, users prefer Google to professional websites and databases. However, because IR makes information needs easier to meet, professional databases should not be ignored in academic search. Elsevier is a world-leading website and also a large database: it not only publishes articles, journals, books and scientific literature, particularly in medicine, but also provides digital information solutions for customers in the science and health fields. In this report, Elsevier is chosen as the example for an IR analysis.
2. Method
In this section, two different types of search task are carried out. The first task is a known-item search, which means that we search for a particular item that we know exists but whose location we do not know. In practice, we adopt a specific journal, “Materials Science and Engineering: A”, as the example for evaluating the Elsevier IR system. We entered both correctly spelled and misspelled queries to test whether spelling reminders and suggestions appear in the results. Misspelled terms such as “matarials” or “materials scence” can be used as test keywords. The second task is a search for introductory material: we look for documents relevant to “definition of nanomaterials” through the “find an article” section in order to evaluate the Elsevier IR system. In addition, another purpose of this study is to reach professional introductory material quickly by using the refining tools, such as “titles”, “author name” and “journal or book titles”, to narrow down the relevant material in the database.
3. Result
Firstly, when the target item was entered correctly, the exact target was presented together with the names of some relevant journals, books and webpages. We also tested the spelling correction and suggestion function with a simple typing error such as “materials scence”. In this case Elsevier reported “No results were found in journals or books” on the left part of the screen. Evidently, the spelling correction function in Elsevier is not very advanced. However, some webpages matching the keyword “materials” were presented on the right. It can therefore be inferred that nothing would be found even on the right side if a query containing multiple errors, such as “matarials sicence”, were entered. Moreover, the website showed no reminder about the possible misspelling.
In contrast, when the correct query “materials science” was entered, the Elsevier system presented three lists, of journals, books and webpages, ordered by their degree of relevance to the query. However, no further directory grouping the results by orientation was offered, so it is difficult for users to judge the relevance between the query and the first results (Ryen W. White et al., 2006).
As for the search for introductory articles, many options, including “Year” and “Publication title”, were available from the start to refine the search results. For instance, 3912 results were found when “definition of nanomaterials” was entered, and the refine filter showed how these results were distributed across the different classifications: year, publication title, topic and so on. The results could then be reduced to 14 by choosing the topic “nanotechnology” and the year “2015”, which is more convenient for users. When one article is chosen, recommended articles, citing articles and related book content are listed on the right side of the page to offer more options that readers may be interested in; this can be regarded as the reformulation process of query expansion.
4. Analysis
As a global database, Elsevier helps many users search for what they need effectively and accurately. However, there is still room for improvement in the service. Take the known-item search as an example: the target results can be located within a short time when the exact item is entered, but users get nothing when even a tiny error is typed. According to previous research, more than 20% of queries are misspelled for various reasons (S. Cucerzan, 2004). As a result, automatic correction of misspelled queries is an urgent and necessary feature for search engines to develop and improve, so that users' information needs can be met (Hargittai, 2006).
In general, the hierarchical faceted metadata model that Elsevier applies to query refinement is well designed. It includes several metadata fields, such as author name, journal or book title, volume, issue and even page, which make searching more precise. At the same time, a second level of refine filters, including “year”, “publication title” and “topic”, classifies the results in another way and broadens the users’ range of selection, which can improve search efficiency (Ricardo & Berthier, 1999). What is more, the query expansion function is well designed: when one article or book is chosen under a query, the three categories of expanded information mentioned above are listed. As a result, the retrieval performance of a reformulated query can be effectively improved (Olga Vechtomova, 2006). However, this query expansion function does not appear before the list of articles or books is displayed. Thus, the expanded queries relate more to the article chosen from the list than to the query that was entered. This is acceptable if the listed articles or books are highly consistent with the user's query. But what happens if the results are only loosely relevant to the query? Users may then be misled, which wastes their time.
Moreover, another potential drawback concerns the query reformulation process after a failed query. When the word “definition” in the query “definition of nanomaterials” was replaced with “introduction”, the results contained only articles or books matching the keyword “nanomaterials”, which indicated that it may have been an improper query. However, no related queries or relevant topics were presented in the search results to support reformulation. In my opinion, this is a distinct disadvantage for an IR system, since without a reformulation process the system can hardly be called user-centered, which may make it less attractive to users. At the same time, the lack of reformulation support can also limit users' exploration through query expansion (Ruthven, 2008).
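To make the idea of automatic correction concrete, the following is a minimal sketch of one common approach: each query word is compared against a list of known terms using Levenshtein edit distance, and the closest term is suggested. The vocabulary, the function names and the distance threshold are assumptions made for illustration; nothing here describes how Elsevier or any other search engine actually implements spelling correction.

```python
from typing import List, Optional

# Hypothetical vocabulary; a real system would derive it from its index or query logs.
VOCABULARY = ["materials", "science", "engineering", "nanomaterials"]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b, using a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def suggest(query: str, vocabulary: List[str], max_distance: int = 2) -> Optional[str]:
    """Return a corrected query, or None when no correction applies."""
    normalized = query.lower().split()
    corrected = []
    for word in normalized:
        if word in vocabulary:
            corrected.append(word)
            continue
        # Pick the vocabulary term closest to the unknown word.
        best = min(vocabulary, key=lambda term: edit_distance(word, term))
        corrected.append(best if edit_distance(word, best) <= max_distance else word)
    return " ".join(corrected) if corrected != normalized else None
```

With this sketch, both test queries used in this report, “materials scence” and “matarials sicence”, would be mapped to “materials science”, which the interface could then show as a “Did you mean” hint.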
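One simple way to offer related queries after a weak or failed search is to compare the new query with queries that succeeded in the past and to suggest those that share the most terms. The sketch below ranks past queries by Jaccard overlap of their words. The query log and function names are hypothetical, and this is only an illustration of the idea, not a description of any existing system.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of terms the two sets have in common (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def related_queries(query: str, query_log: List[str], top_k: int = 3) -> List[str]:
    """Suggest past queries that partially overlap with the given query."""
    query_terms = set(query.lower().split())
    scored = []
    for past in query_log:
        score = jaccard(query_terms, set(past.lower().split()))
        if 0.0 < score < 1.0:  # skip unrelated queries and exact repeats
            scored.append((score, past))
    # Highest overlap first; ties are broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [past for _, past in scored[:top_k]]


# Hypothetical log of earlier successful queries.
QUERY_LOG = [
    "definition of nanomaterials",
    "nanomaterials synthesis",
    "introduction to polymers",
    "materials science",
]
```

Under these assumptions, `related_queries("introduction of nanomaterials", QUERY_LOG)` would place “definition of nanomaterials” first, which is the kind of hint that was missing in the test described above.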
5. Recommendation
As mentioned at the beginning of the Analysis section, a distinct drawback of the Elsevier system is the lack of misspelling correction. No automatic correction or hint about an incorrect query is shown when a misspelling occurs, which may lead to the loss of potential users after several failed attempts. Consequently, I recommend that Elsevier pay more attention to automatic spelling correction technology, as well as to a reminder function for possibly wrong queries. By comparison, this function is well implemented in the Google Scholar search engine, which has a large number of users and offers a good user experience (Duan, 2011).
Another recommendation concerns navigation in the hierarchical faceted metadata model. For IR system designers, how to support hierarchical categories in a way that helps users has always been an issue. As described above, users can filter the results by refining the sub-categories. However, when two sub-categories are selected, the number of results returned is only the sum of the individual counts of the two sub-categories; there are no other options, such as showing only the results that belong to both sub-categories, which leaves users with more work to find their targets. Therefore, it may be a good idea to tag books and journals with several key labels, like the “key words” in articles, which would make it convenient for users to combine different topics (M. Hearst, 2006).
As to the reformulation process of the Elsevier system, there are several drawbacks to be improved. Firstly, the query expansion function is tied only to the results rather than to the queries themselves. In other words, the relationships that the IR system builds are based merely on the existing resources and not on the potential results that users are searching for or interested in (Soo Young Rieh et al., 2006). Thus, this kind of experience will not be widely accepted by many users, especially professional ones with specific purposes. Consequently, the part of the reformulation process that links queries and resources through query expansion should be improved to meet users' requirements. In addition, another problem with the reformulation process exists, as analyzed above. The system cannot present related queries or topics when an unsuccessful query is entered at the start, which makes it hard for users to proceed with their search, just like the lack of misspelling correction. At this point, Google has to be mentioned again as a positive model. The combination of misspelling correction with related query recommendations performs very well in the Google Scholar search engine, which achieves a very good application of the reformulation process, especially query expansion. For the designers of the Elsevier system, misspelling correction technology together with eye-tracking technology (adopt three-line paragraph) may be helpful for optimizing these functions (Cutrell and Guan, 2007). All in all, Elsevier is a qualified information retrieval system with many merits that help its users, apart from some technological drawbacks that should be addressed to make it more efficient and effective in the future.
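The difference between the two ways of combining sub-categories can be sketched in a few lines. In the example below, the "any" mode keeps every record that carries at least one selected topic, while the "all" mode keeps only records that carry every selected topic. The record structure, the sample data and the function name are hypothetical and serve only to illustrate the recommendation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Record:
    title: str
    topics: FrozenSet[str]  # key labels attached to a journal, book or article


def filter_by_topics(records: Iterable[Record],
                     selected: Iterable[str],
                     mode: str = "any") -> List[Record]:
    """Filter records by topic labels.

    mode="any": the record has at least one selected topic (union).
    mode="all": the record has every selected topic (intersection).
    """
    wanted = set(selected)
    if mode == "any":
        return [r for r in records if r.topics & wanted]
    if mode == "all":
        return [r for r in records if wanted <= r.topics]
    raise ValueError("mode must be 'any' or 'all'")


# Hypothetical sample data.
RECORDS = [
    Record("Polymer nanocomposites", frozenset({"nanotechnology", "polymers"})),
    Record("Nanoparticle synthesis", frozenset({"nanotechnology"})),
    Record("Polymer rheology", frozenset({"polymers"})),
    Record("Ceramic sintering", frozenset({"ceramics"})),
]
```

Selecting “nanotechnology” and “polymers” in "any" mode returns three of the four sample records, whereas "all" mode returns only the one record that carries both labels, which is the narrower option the recommendation asks for.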
References:
CD Manning et al. (2009). Information Retrieval System. Cambridge University Press. Available online: http://www.langtoninfo.co.uk
Cucerzan and Brill. (2004). Spelling correction as an iterative process that exploits the collective knowledge of web users. Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP'04), pages 293–300.
Duan, H. (2011). Online spelling correction for query completion. Proceedings of the 20th International Conference on World Wide Web, pages 117–126.
Hargittai. (2006). Hurdles to information seeking: Spelling and typographical mistakes during users' online behavior. Journal of the Association for Information Systems, 7(1): 52–67.
M. Hearst. (2006). Design recommendations for hierarchical faceted search interfaces. ACM SIGIR Workshop on Faceted Search. Available online: http://flamenco.sims.berkeley.edu
Olga Vechtomova. (2006). A study of the effect of term proximity on query expansion. Journal of Information Science, 32(4): 324–333.
Ricardo & Berthier. (1999). Modern Information Retrieval. ACM Press. Available online: ftp://mail.im.tku.edu.tw
Ruthven. (2008). Interactive information retrieval. Annual Review of Information Science and Technology, 42(1): 43–91.
Ryen W. White et al. (2006). An implicit feedback approach for interactive information retrieval. Information Processing & Management, 42(1): 166–190.
Soo Young Rieh et al. (2006). Analysis of multiple query reformulations on the web: The interactive information retrieval context. Information Processing & Management, 42(3): 751–768.