Finding and Evaluating Patterns in Web Repository Using Database Technology and Data Mining Algorithms

dc.contributor.advisor Püskülcü, Halis
dc.contributor.author Özakar, Belgin
dc.contributor.other 03.04. Department of Computer Engineering
dc.contributor.other 03. Faculty of Engineering
dc.contributor.other 01. Izmir Institute of Technology
dc.date.accessioned 2014-07-22T13:51:28Z
dc.date.available 2014-07-22T13:51:28Z
dc.date.issued 2002
dc.description Thesis (Master)--Izmir Institute of Technology, Computer Engineering, Izmir, 2002 en_US
dc.description Includes bibliographical references en_US
dc.description Text in English; Abstract: Turkish and English en_US
dc.description vii, 63 leaves en_US
dc.description.abstract Web mining is an active research topic that combines two active research areas: data mining and the World Wide Web. Web mining research draws on several research communities, such as databases, statistics, artificial intelligence, and visualization. Although there is some confusion about the scope of Web mining, the most widely recognized approach divides it into three areas: Web content mining, Web structure mining, and Web usage mining. Web content mining focuses on discovering and retrieving useful information from Web contents, data, and documents, while Web structure mining focuses on modeling the underlying link structures of the Web. The distinction between these two categories is not always clear. Web usage mining is a relatively independent, though not isolated, category in which research continues in four directions: General Web Usage Mining, Site Modification, System Improvement, and Personalization. General Web Usage Mining systems aim to discover general trends and patterns in log files by adapting data mining techniques. The objective of Site Modification systems is to improve the design of a web site by suggesting modifications to its content and structure. Research on System Improvement focuses on using web usage mining to improve web traffic. Finally, Personalization systems aim to understand individual trends in order to personalize web sites. The subject of this thesis, the IYTE Web Usage Mining (WUM) System, is an example of system development in the field of General Web Usage Mining with a database approach, in which the flexible query capability of SQL (Structured Query Language) was explored. Data mining and database techniques were applied to the access, error, and user logs of the web server of Izmir Institute of Technology.
The main objective was to create a site improvement tool for the web administrator that reports the distribution of hits received by the web server by time stamp, user, service, and URL type, and at the same time reveals the nature of the errors generated by the web server. All data cleaning and transaction identification processes were handled by software routines coded in Java. Clean transactions were imported into the IYTE Web Usage Mining (IYTE WUM) relational database. The flexible features of SQL were used to apply the Apriori algorithm to discover the most frequently visited pairs of URLs, in addition to extracting general knowledge from the data. en_US
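The idea of running the pair-finding step of Apriori inside a relational database can be illustrated with a small sketch. This is not the thesis's code or schema: the table name `visits`, its columns, the function name `frequent_url_pairs`, and the sample URLs are all hypothetical, and SQLite (through Python's standard `sqlite3` module) stands in for whatever database the IYTE WUM system used. The query first keeps only URLs that meet the minimum support on their own (the Apriori pruning step), then self-joins the transactions to count how many sessions contain each candidate pair.

```python
import sqlite3


def frequent_url_pairs(rows, min_support):
    """Return (url_a, url_b, support) for URL pairs seen together in at
    least `min_support` sessions. `rows` is an iterable of
    (session_id, url) tuples; the schema is an assumption for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE visits (session_id INTEGER, url TEXT)")
    conn.executemany("INSERT INTO visits VALUES (?, ?)", rows)
    query = """
        WITH distinct_visits AS (
            -- count each URL at most once per session
            SELECT DISTINCT session_id, url FROM visits
        ),
        frequent_urls AS (
            -- Apriori pass 1: URLs that are frequent on their own
            SELECT url FROM distinct_visits
            GROUP BY url
            HAVING COUNT(*) >= :min_support
        )
        -- Apriori pass 2: candidate pairs built only from frequent URLs
        SELECT a.url, b.url, COUNT(*) AS support
        FROM distinct_visits AS a
        JOIN distinct_visits AS b
          ON a.session_id = b.session_id AND a.url < b.url
        WHERE a.url IN (SELECT url FROM frequent_urls)
          AND b.url IN (SELECT url FROM frequent_urls)
        GROUP BY a.url, b.url
        HAVING COUNT(*) >= :min_support
        ORDER BY support DESC, a.url, b.url
    """
    result = conn.execute(query, {"min_support": min_support}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    # Hypothetical cleaned transactions: (session_id, url)
    sample = [
        (1, "/index.html"), (1, "/library/"), (1, "/news/"),
        (2, "/index.html"), (2, "/library/"),
        (3, "/index.html"), (3, "/news/"),
        (4, "/library/"), (4, "/index.html"),
    ]
    for url_a, url_b, support in frequent_url_pairs(sample, min_support=2):
        print(url_a, url_b, support)
```

The condition `a.url < b.url` keeps each unordered pair once and excludes a URL paired with itself; the `frequent_urls` filter reflects the Apriori property that a pair can only be frequent if both of its members are.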
dc.identifier.uri https://hdl.handle.net/11147/3403
dc.language.iso en en_US
dc.publisher Izmir Institute of Technology en_US
dc.rights info:eu-repo/semantics/openAccess en_US
dc.subject.lcc QA76.9.D343 .O99 2002 en
dc.subject.lcsh Data mining en
dc.subject.lcsh Web usage mining en
dc.title Finding and Evaluating Patterns in Web Repository Using Database Technology and Data Mining Algorithms en_US
dc.type Master Thesis en_US
dspace.entity.type Publication
gdc.author.institutional Özakar, Belgin
gdc.author.institutional Püskülcü, Halis
gdc.author.institutional Ergenç Bostanoğlu, Belgin
gdc.coar.access open access
gdc.coar.type text::thesis::master thesis
gdc.description.department Thesis (Master)--İzmir Institute of Technology, Computer Engineering en_US
gdc.description.publicationcategory Tez en_US
gdc.description.scopusquality N/A
gdc.description.wosquality N/A
relation.isAuthorOfPublication f3844554-c555-4f40-8a31-c2b1f5f2d3e6
relation.isAuthorOfPublication 3b51d444-157d-4dff-a209-e28543a80dcd
relation.isAuthorOfPublication.latestForDiscovery f3844554-c555-4f40-8a31-c2b1f5f2d3e6
relation.isOrgUnitOfPublication 9af2b05f-28ac-4014-8abe-a4dfe192da5e
relation.isOrgUnitOfPublication 9af2b05f-28ac-4004-8abe-a4dfe192da5e
relation.isOrgUnitOfPublication 9af2b05f-28ac-4003-8abe-a4dfe192da5e
relation.isOrgUnitOfPublication.latestForDiscovery 9af2b05f-28ac-4014-8abe-a4dfe192da5e

Files

Original bundle

Name: T000130.pdf
Size: 358.71 KB
Format: Adobe Portable Document Format
Description: MasterThesis

License bundle

Name: license.txt
Size: 1.71 KB
Format: Item-specific license agreed upon to submission
Description: