
Automatic Extraction of Web Page Sections

In this study, we propose a solution for extracting information from web pages and identifying their different parts. Classifying web pages first gives better insight into the parts they contain and leads to more precise extraction of those parts.

 

In this research, we first classify web pages into well-known classes, including blogs, news, social media and forums. Then, for each class, we design and train a specific classifier to extract its web parts. For instance, news agency web pages consist of a title, content, publication date, keywords and so on, whereas web pages in social networks contain posts with comments, number of likes, number of shares and so on. This difference motivated us to classify web pages before extracting their parts.
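The two-stage idea described above can be illustrated with a minimal Python sketch. It is not the classifier used in this project: the keyword cues, the function names `classify_page` and `expected_sections`, and the simple counting rule are all assumptions made for illustration, standing in for trained models. Only the news and social network classes are shown, because those are the two whose parts are listed above.

```python
from typing import Dict, List, Tuple

# Hypothetical keyword cues per page class (an assumption for illustration;
# a trained page-type classifier would replace this lookup).
CLASS_CUES: Dict[str, Tuple[str, ...]] = {
    "news": ("published", "reporter", "keywords"),
    "social": ("likes", "shares", "comment"),
}

# Parts that each class-specific extractor is expected to find,
# taken from the examples given in the text.
SECTION_FIELDS: Dict[str, List[str]] = {
    "news": ["title", "content", "published date", "keywords"],
    "social": ["post", "comments", "number of likes", "number of shares"],
}


def classify_page(text: str) -> str:
    """Stage 1: choose the class whose cues occur most often in the page text."""
    lowered = text.lower()
    scores = {
        label: sum(lowered.count(cue) for cue in cues)
        for label, cues in CLASS_CUES.items()
    }
    # On a tie, max() keeps the first class in dictionary order.
    return max(scores, key=scores.get)


def expected_sections(text: str) -> List[str]:
    """Stage 2 dispatch: return the parts the class-specific extractor targets."""
    return SECTION_FIELDS[classify_page(text)]


if __name__ == "__main__":
    page = "12 likes, 3 shares, 4 comments"
    print(classify_page(page), expected_sections(page))
```

The point of the design is that the second stage only has to look for the parts that make sense for the detected class, instead of searching every page for every possible part.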

Provider

Mehdi Yadollahi
email: mehdiyadollahi68 [at] gmail.com
 

Supervisor

Masoud Asadpour
email: asadpour [AT] ut.ac.ir
