Web crawler features
Our web crawlers can provide any capabilities you may need.
The most commonly required web crawler capabilities are listed below:
- full automation of a site visitor’s actions (including automatic login to a target site, filling in and submitting HTML forms, etc.);
- specifying regular expressions to find and parse the needed data in HTML pages;
- using sophisticated methods to filter and search for the required information;
- running many worker threads to complete the work in the shortest time;
- caching downloaded web pages for reuse on later runs, saving time and bandwidth;
- using Google, Yahoo, MSN and other search engines to find target sites by keyword;
- using a JavaScript engine to follow dynamically constructed HTML links on websites;
- desktop or web graphical interface;
- storing output data in your preferred format: database, CSV file, Excel, XML file or any other you need;
- producing notifications (email, sound, SMS) in predetermined cases;
- HTTP, HTTPS, FTP and FTPS support;
- web browser interface, so that you can watch the work session and intervene manually if needed;
- HTTP or SOCKS proxy support;
- restoring the previous work session if it was interrupted, so that the app can resume from the point where it stopped.
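
To make three of the capabilities above more concrete (regular-expression parsing, worker threads, and page caching), here is a minimal Python sketch. It is an illustration only, not the implementation of any actual product: the class name `CachingCrawler`, the pattern `LINK_RE`, and the stand-in downloader `fake_fetch` are all hypothetical names introduced for this example, and the downloader is faked so the sketch runs without network access.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, List

# Hypothetical pattern: captures the href value of each anchor tag.
LINK_RE = re.compile(r'<a\s[^>]*href="([^"]+)"', re.IGNORECASE)


class CachingCrawler:
    """Toy crawler: downloads pages in parallel, caches them, parses them by regex."""

    def __init__(self, fetch: Callable[[str], str], workers: int = 4) -> None:
        self._fetch = fetch              # injected downloader (an assumption of this sketch)
        self._workers = workers          # number of worker threads
        self._cache: Dict[str, str] = {}  # url -> downloaded HTML
        self.downloads = 0               # how many pages were actually fetched

    def crawl(self, urls: Iterable[str],
              pattern: re.Pattern = LINK_RE) -> Dict[str, List[str]]:
        unique = list(dict.fromkeys(urls))                 # drop duplicates, keep order
        missing = [u for u in unique if u not in self._cache]
        # Only pages that are not cached yet are downloaded, using worker threads.
        with ThreadPoolExecutor(max_workers=self._workers) as pool:
            for url, html in zip(missing, pool.map(self._fetch, missing)):
                self._cache[url] = html                    # cache is filled in the main thread
                self.downloads += 1
        # Apply the regular expression to every requested page.
        return {u: pattern.findall(self._cache[u]) for u in unique}


def fake_fetch(url: str) -> str:
    # Stand-in for a real HTTP download so the sketch runs offline.
    return (f'<html><a href="{url}/about">About</a> '
            f'<a class="x" href="{url}/contact">Contact</a></html>')


crawler = CachingCrawler(fake_fetch, workers=2)
result = crawler.crawl(["http://example.com", "http://example.org", "http://example.com"])
print(result["http://example.com"])
print(crawler.downloads)
```

In a real crawler the injected `fetch` function would perform the HTTP request (possibly through a proxy), and the cache would typically be written to disk so that later runs can reuse it.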