DataMassif
HTML ArticleExtractor

Collects articles from web pages: title, content with and without HTML markup

HTML ArticleExtractor collects articles from web pages.

Data collected

  • Article title
  • HTML string of processed article content
  • Text content of the article (all HTML removed)
  • Article length in characters
  • Article description or short excerpt from the content
  • Author metadata
  • Website name
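
The fields above can be pictured as a single record per article. The following is a minimal sketch in Python and not the scraper's actual output format: the names `ArticleRecord`, `strip_html`, and `build_record` are hypothetical, the excerpt is assumed to be the first 160 characters of the text, and the length is assumed to be measured on the plain-text content.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Tags after which a space is inserted so adjacent blocks do not run together.
BLOCK_TAGS = {"p", "div", "br", "li", "h1", "h2", "h3", "h4", "h5", "h6"}


class _TextCollector(HTMLParser):
    """Accumulates text nodes, skipping <script> and <style> bodies."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self.parts.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def strip_html(html: str) -> str:
    """Return the text of an HTML fragment with whitespace collapsed."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return " ".join("".join(collector.parts).split())


@dataclass
class ArticleRecord:
    """Hypothetical record mirroring the listed fields."""
    title: str
    content_html: str   # processed article content, markup kept
    text_content: str   # same content with all HTML removed
    length: int         # assumption: characters of text_content
    excerpt: str        # assumption: leading slice of text_content
    author: str
    site_name: str


def build_record(title, content_html, author="", site_name="", excerpt_chars=160):
    """Derive the text, length, and excerpt fields from the HTML content."""
    text = strip_html(content_html)
    return ArticleRecord(
        title=title,
        content_html=content_html,
        text_content=text,
        length=len(text),
        excerpt=text[:excerpt_chars],
        author=author,
        site_name=site_name,
    )
```

Keeping both `content_html` and `text_content` in one record lets downstream code choose between re-rendering the article and running plain-text analysis without fetching the page again.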

Use Cases

  • Collecting ready-made articles from any website

Similar scrapers

Other tools in the "Content & Backlink Scrapers" category.

HTML EmailExtractor

Scraping email addresses from website pages

HTML LinkExtractor

Scrapes external and internal links from the specified site; internal links can be followed up to a selected depth level.
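
As an illustration of depth-limited link collection, here is a minimal Python sketch. It is not HTML LinkExtractor's implementation: the function `collect_links`, its `fetch` callable (any function that returns a page's HTML for a URL), and the `max_depth` parameter are hypothetical names, and "internal" is assumed to mean "same host as the start URL".

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class _HrefCollector(HTMLParser):
    """Collects the href value of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def collect_links(start_url, fetch, max_depth=1):
    """Breadth-first walk over internal links, at most max_depth hops deep.

    Returns (internal, external) sets of absolute URLs. Pages at depth
    max_depth are still parsed, but their links are not followed.
    """
    site = urlparse(start_url).netloc
    internal, external = set(), set()
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        parser = _HrefCollector()
        parser.feed(fetch(url))
        for href in parser.hrefs:
            # Resolve relative links and drop the #fragment part.
            link = urldefrag(urljoin(url, href)).url
            parts = urlparse(link)
            if parts.scheme not in ("http", "https"):
                continue  # skip mailto:, javascript:, and similar
            if parts.netloc != site:
                external.add(link)
                continue
            internal.add(link)
            if depth < max_depth and link not in seen:
                seen.add(link)
                queue.append((link, depth + 1))
    return internal, external
```

Passing `fetch` in as a parameter keeps the traversal logic separate from HTTP handling, so the same function can be exercised against an in-memory dictionary of pages or wired to a real HTTP client.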

HTML TextExtractor

Text block scraper that collects content from arbitrary websites

Net HTTP

Downloads the specified page, supports multi-page scraping