Web crawler features

Our web crawlers can provide whatever capabilities your project needs.

The most commonly requested web crawler capabilities are listed below:
  • full automation of a site visitor's actions (including automatic login to a target site, filling in and submitting HTML forms, etc.);
  • specifying regular expressions to find and parse the needed data in HTML pages;
  • using sophisticated methods to filter and search for the required information;
  • running many worker threads to finish the job in the shortest time;
  • caching downloaded web pages so they can be reused on later runs, saving time and bandwidth;
  • using Google, Yahoo, MSN and other search engines to find target sites by keywords;
  • using a JavaScript engine to follow dynamically constructed HTML links on web sites;
  • desktop or web graphical interface;
  • storing output data in your preferred format: database, CSV file, Excel, XML file, or any other format you need;
  • sending notifications (email, sound, SMS) in predetermined cases;
  • HTTP, HTTPS, FTP, FTPS support;
  • web browser interface, so that you can watch the work session and intervene manually if needed;
  • HTTP or SOCKS proxy support;
  • restoring the previous work session if it was interrupted, so that the app can resume from the point where it stopped.
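
To make a few of the items above more concrete, here is a minimal Python sketch of three of them working together: regular-expression extraction, page caching, and resuming an interrupted session. It is an illustration only, not our production code. The names PRICE_RE, PageCache, Checkpoint and crawl are hypothetical, the price markup is an assumed example, and the fetch function is supplied by the caller so that the sketch itself makes no network calls.

```python
import hashlib
import re
from pathlib import Path

# Assumed example markup: <span class="price">$12.50</span>
PRICE_RE = re.compile(r'<span class="price">\s*\$([0-9]+(?:\.[0-9]{2})?)\s*</span>')


def extract_prices(html):
    """Return every price matched by PRICE_RE as a float."""
    return [float(value) for value in PRICE_RE.findall(html)]


class PageCache:
    """Stores each downloaded page on disk, keyed by a hash of its URL."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url):
        name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
        return self.directory / name

    def get(self, url, fetch):
        """Return the cached page if present; otherwise fetch and store it."""
        path = self._path(url)
        if path.exists():
            return path.read_text(encoding="utf-8")
        html = fetch(url)
        path.write_text(html, encoding="utf-8")
        return html


class Checkpoint:
    """Records finished URLs, one per line, so a later run can skip them."""

    def __init__(self, path):
        self.path = Path(path)

    def load(self):
        if not self.path.exists():
            return set()
        return set(self.path.read_text(encoding="utf-8").splitlines())

    def mark_done(self, url):
        with self.path.open("a", encoding="utf-8") as handle:
            handle.write(url + "\n")


def crawl(urls, fetch, cache, checkpoint):
    """Process only the URLs not already recorded in the checkpoint."""
    done = checkpoint.load()
    results = {}
    for url in urls:
        if url in done:
            continue  # finished in an earlier session
        html = cache.get(url, fetch)
        results[url] = extract_prices(html)
        checkpoint.mark_done(url)
    return results
```

In this sketch the checkpoint is written after each page, so a run that stops halfway loses at most the page it was working on, and the cache means a page downloaded once is not requested again. A real crawler would also need cache expiry and error handling, which are left out here.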

