Friday, January 10, 2025

International Journal on Web Service Computing (IJWSC)

ISSN: 0976 - 9811 (Online); 2230 - 7702 (Print)

Webpage URL: https://airccse.org/journal/jwsc/ijwsc.html

Myanmar Web Pages Crawler

Su Mon Khine and Yadana Thein, University of Computer Studies, Yangon

Abstract

Nowadays, web pages on the Web are written in many languages, and web crawlers are an important component of search engines. Language-specific crawlers traverse and collect relevant web pages by following the successive URLs of a web page. Very little research has addressed crawling for Myanmar-language web sites. Most language-specific crawlers are based on n-gram character sequences, which require training documents. The proposed crawler differs from those: it searches and retrieves Myanmar web pages for a Myanmar-language search engine. It detects Myanmar characters and uses a rule-based syllable threshold to judge the relevance of pages. According to experimental results, the proposed crawler performs better, achieves satisfactory accuracy, and reduces the storage space required by the search engine, since it crawls only the relevant documents for Myanmar web sites.

Keywords

Language specific crawler, Myanmar Language, rule-based syllable segmentation
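
The relevance test described in the abstract (detect Myanmar characters, then compare a syllable count against a threshold) can be illustrated with a rough sketch. This is not the authors' implementation: the function names, the threshold value of 3, and the simplified syllable rule (count each consonant in U+1000 to U+1021 unless it follows a virama or is directly followed by an asat) are all assumptions made for illustration. The paper's actual segmentation rules are more complete than this, and the sketch miscounts finals that carry a tone mark before the asat.

```python
# Rough sketch only; names, rule set, and threshold are hypothetical.
MYANMAR_START, MYANMAR_END = 0x1000, 0x109F  # Unicode "Myanmar" block
CONSONANT_START, CONSONANT_END = 0x1000, 0x1021
ASAT = "\u103A"    # marks a syllable-final consonant
VIRAMA = "\u1039"  # marks a stacked (subjoined) consonant


def is_myanmar_char(ch: str) -> bool:
    """True if the character lies in the Unicode Myanmar block."""
    return MYANMAR_START <= ord(ch) <= MYANMAR_END


def is_consonant(ch: str) -> bool:
    """True if the character is a Myanmar base consonant."""
    return CONSONANT_START <= ord(ch) <= CONSONANT_END


def count_syllables(text: str) -> int:
    """Approximate the syllable count with a simplified rule:
    a consonant starts a syllable unless it is stacked (preceded by
    a virama) or is a final (directly followed by an asat)."""
    count = 0
    for i, ch in enumerate(text):
        if not is_consonant(ch):
            continue
        prev_ch = text[i - 1] if i > 0 else ""
        next_ch = text[i + 1] if i + 1 < len(text) else ""
        if prev_ch == VIRAMA or next_ch == ASAT:
            continue
        count += 1
    return count


def is_relevant(text: str, syllable_threshold: int = 3) -> bool:
    """Judge a page relevant when it has enough Myanmar syllables.
    The default threshold is an assumed value, not from the paper."""
    return count_syllables(text) >= syllable_threshold


if __name__ == "__main__":
    sample = "\u1019\u103c\u1014\u103a\u1019\u102c"  # the word "Myanmar"
    print(count_syllables(sample))
    print(is_relevant(sample, syllable_threshold=2))
    print(is_relevant("Hello world"))
```

In a crawler, a check like `is_relevant` would be applied to the extracted text of each fetched page, and only pages that pass would be stored and have their outgoing links queued.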

Original Source URL: https://airccse.org/journal/jwsc/papers/6115ijwsc01.pdf

Volume URL: https://airccse.org/journal/jwsc/current2015.html
