Boryi Network Information Inc.
Software Center, Suite 112
662 Lu Valley Ave.
High-Tech Development Dist.
Changsha, Hunan 410205
People's Republic of China
US: 1-610-343-1116
China: (+86) 731-8899-2840
Web-bot and Webcrawler Development Services

We develop web-bots and webcrawlers that fetch data from the web or other sources. The data they retrieve may be publicly available or not. We do not develop malicious software intended for spam or for infringing anybody's rights.

Web-bot or Webcrawler

A web-bot or webcrawler is a program that crawls through web sites and collects the needed information from them. What information can it collect? In a word, anything you want: product descriptions, prices, links, addresses, pictures, etc. The collected information is then stored in the required database or file.
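
As a rough illustration of the collect-and-store idea, here is a minimal Java sketch that pulls product names and prices out of a page with a regular expression and formats them as CSV rows. The markup, the class name ProductExtractor and the method toCsvRows are assumptions made up for this example; they do not describe our production code, and a real page would need its own pattern.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example: the HTML layout matched here is an assumption.
class ProductExtractor {
    // Expects <span class="name">NAME</span> followed by
    // <span class="price">PRICE</span>, with an optional leading "$".
    private static final Pattern PRODUCT = Pattern.compile(
        "<span class=\"name\">([^<]+)</span>\\s*"
        + "<span class=\"price\">\\$?([0-9]+(?:\\.[0-9]{2})?)</span>");

    // Returns one CSV row (quoted name, numeric price) per product found.
    static List<String> toCsvRows(String html) {
        List<String> rows = new ArrayList<>();
        Matcher m = PRODUCT.matcher(html);
        while (m.find()) {
            rows.add(quote(m.group(1).trim()) + "," + m.group(2));
        }
        return rows;
    }

    // Wraps a field in double quotes and doubles any embedded quotes,
    // which is the usual CSV escaping convention.
    static String quote(String field) {
        return "\"" + field.replace("\"", "\"\"") + "\"";
    }

    public static void main(String[] args) {
        String html =
            "<div><span class=\"name\">Widget</span> "
            + "<span class=\"price\">$9.99</span></div>"
            + "<div><span class=\"name\">Gadget</span>"
            + "<span class=\"price\">12</span></div>";
        toCsvRows(html).forEach(System.out::println);
    }
}
```

In practice the rows would be written to the database or file the client asks for; printing them keeps the sketch self-contained.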

Features of our web-bots or webcrawlers:

  • full automation of a web site visitor’s actions (including automatic browsing, signing up for new accounts, logging in with different user accounts, filling in and submitting forms, etc.)
  • regular expressions to retrieve the needed data from web pages
  • an XML parser to extract the needed data from web services
  • sophisticated algorithms and methods to filter and search for the information of interest
  • multi-threading to increase performance
  • retrieving web pages in compressed formats, e.g. gzip
  • caching downloaded web pages to save time and bandwidth
  • using an open source JavaScript engine, such as V8 from Google or Rhino from Mozilla, to process dynamically constructed web pages
  • storing output data in your preferred format: database, CSV file, Excel, XML file or any other you need
  • sending email notifications in predetermined cases
  • HTTP, HTTPS, FTP and FTPS support
  • HTTP or SOCKS proxy support to crawl web sites anonymously
  • restoring the previous crawling session if it was broken, so that the web-bot or webcrawler can resume its work from the point where it was interrupted or crashed
  • automatic quality assurance to verify the data harvested from the web sites
  • a web browser interface so that you can watch the work session and intervene manually if needed
  • a web-based graphical user interface to manage and monitor the web-bots or webcrawlers
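
To make the session-restore feature in the list above more concrete, here is one possible way to checkpoint a crawl in Java. The class CrawlSession, its methods and the "V"/"P" line format are all hypothetical, chosen only for this sketch; they are not taken from our framework. A URL stays in the pending queue until it has been fully processed, so a crash in the middle of a page means that page is retried after a restart.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of a resumable crawl state (names are assumptions).
class CrawlSession {
    private final Deque<String> pending = new ArrayDeque<>();
    private final Set<String> visited = new LinkedHashSet<>();

    // Queues a URL unless it is already visited or already queued.
    void enqueue(String url) {
        if (!visited.contains(url) && !pending.contains(url)) {
            pending.addLast(url);
        }
    }

    // Returns the next URL to crawl without removing it, or null if none.
    String peek() {
        return pending.peekFirst();
    }

    // Marks a URL as done only after its page was processed successfully.
    void complete(String url) {
        if (pending.remove(url)) {
            visited.add(url);
        }
    }

    int pendingCount() { return pending.size(); }
    int visitedCount() { return visited.size(); }

    // Writes the state to a temporary file first, then moves it into place,
    // so the previous checkpoint survives until the new one is complete.
    void save(Path file) throws IOException {
        List<String> lines = new ArrayList<>();
        for (String u : visited) lines.add("V " + u);
        for (String u : pending) lines.add("P " + u);
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    // Rebuilds a session from a checkpoint; a missing file gives an empty one.
    static CrawlSession load(Path file) throws IOException {
        CrawlSession s = new CrawlSession();
        if (!Files.exists(file)) return s;
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            if (line.startsWith("V ")) s.visited.add(line.substring(2));
            else if (line.startsWith("P ")) s.pending.addLast(line.substring(2));
        }
        return s;
    }
}
```

A crawler loop built on this sketch would call peek(), fetch and process the page, call complete(), and call save() every so often; on startup it would call load() and simply continue with whatever is still pending.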

Framework and Network System for Web-bot or Webcrawler

We built a Java-based framework for developing web-bots and webcrawlers more efficiently, and a network system to host, monitor and manage them efficiently and cheaply. The network system can be scaled up as the number of web-bots or webcrawlers increases. Users can specify schedules, export crawled data, and view daily reports for each web-bot or webcrawler.

See our demo site. Use "admin" to log in as an administrator, "developer" to log in as a developer, "user" to log in as a user / client, and "guest" to log in as a guest. No password is required.

If you would like further details on this or on having your own web-bots or webcrawlers, please contact us via our contact page. Put our state-of-the-art technology to work for you.