The Data Extraction Using Distributed Crawler Inside Multi-Agent System
Karel Tomala, Jan Plucar, Patrik Dubec, Lukas Rapant, Miroslav Voznak
DOI: 10.15598/aeee.v11i6.867
Abstract
This paper discusses the use of web crawler technology. We built an application for data extraction based on a standard web crawler. The application was designed primarily to extract data from the social network Twitter using keywords. First, we created a standard crawler that went through a predefined list of URLs and downloaded the page content of each URL in turn. The page content was then parsed, and the important text and metadata were stored in a database. Recently, the application was converted into a multi-agent system. The system was developed in C#, a language commonly used for building web applications and sites. The data obtained were evaluated graphically. The system was created within the Indect project at the VSB-Technical University of Ostrava.
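
The crawl loop described in the abstract (walk a predefined URL list, download each page, parse it, store text and metadata in a database) can be illustrated with a minimal sketch. It is written in Python for brevity, not in the C# used by the authors, and it is not their implementation. The names `TextExtractor`, `fetch`, and `crawl`, the `pages` table layout, and the simple substring keyword match are all assumptions made for illustration. The sketch also fetches plain HTML pages and does not model access to Twitter.

```python
import sqlite3
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collects the page title and visible text, skipping script/style content."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip = 0          # > 0 while inside <script> or <style>
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        text = data.strip()
        if not text or self._skip:
            return
        if self._in_title:
            self.title = text
        else:
            self.chunks.append(text)


def fetch(url):
    """Download one page and decode it as UTF-8 (hypothetical default fetcher)."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(urls, keywords, conn, fetch_page=fetch):
    """Visit each URL in order; store pages whose text mentions any keyword.

    Returns the number of pages stored. The table layout is an assumption.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS pages "
        "(url TEXT PRIMARY KEY, title TEXT, body TEXT)"
    )
    stored = 0
    for url in urls:
        try:
            html = fetch_page(url)
        except OSError:
            continue  # skip pages that cannot be downloaded
        parser = TextExtractor()
        parser.feed(html)
        body = " ".join(parser.chunks)
        if any(k.lower() in body.lower() for k in keywords):
            conn.execute(
                "INSERT OR REPLACE INTO pages VALUES (?, ?, ?)",
                (url, parser.title, body),
            )
            stored += 1
    conn.commit()
    return stored
```

Passing the fetcher as a parameter keeps the loop testable without network access. For example, `crawl(url_list, ["crawler"], sqlite3.connect("crawl.db"))` would use the default downloader, while a dictionary lookup can stand in for it in tests.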