Channel: Options for article scraping from many different websites - Stack Overflow

Options for article scraping from many different websites


I need to add webpage scraping functionality to a single page application.

I need to retrieve useful content from many different blogs and services. By useful content, I mean articles, texts and links to videos in order to embed them on my pages.

This tool seems to offer what I need: http://www.diffbot.com/

Using it, I can simply input an article's URL and this service will retrieve all data that I need from that single page.
However, I do not need to handle 250,000 requests a month, which would cost $300 each month; I need a solution that handles about 5,000 requests a month, with the possibility of scaling later.

I've found a lot of scraping solutions through Google, but most of them scrape custom content periodically from a small number of websites, which is not what I need. I also have no experience in this area, so I would appreciate advice on what to use for this purpose. I work primarily with JavaScript.

In addition, is it at all possible to allow pages to be scraped by the client's browser, rather than server-side?

I am developing the SPA with ReactJS and the Flux architecture. The server is NodeJS + Express, and the database is Backendless.
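
To make the goal concrete, here is a minimal sketch of the kind of extraction I have in mind, written as plain JavaScript that could run on the NodeJS server once the page's HTML has been fetched. The function name `extractArticle` and the shape of its result are hypothetical, not taken from Diffbot or any library, and the regex-based matching is an assumption that only holds for simple, well-formed pages; a real solution would presumably use a proper HTML parser and a readability-style heuristic.

```javascript
// Hypothetical helper (not part of any library): pulls a title, paragraph
// text, and video/embed URLs out of an HTML string.
// Assumption: the markup is simple enough for regular expressions.
function extractArticle(html) {
  // Remove tags and collapse runs of whitespace into single spaces.
  const stripTags = (s) =>
    s.replace(/<[^>]*>/g, "").replace(/\s+/g, " ").trim();

  // Title: contents of the first <title> element, if there is one.
  const titleMatch = html.match(/<title[^>]*>([\s\S]*?)<\/title>/i);
  const title = titleMatch ? stripTags(titleMatch[1]) : null;

  // Text: every non-empty <p> element, with inline tags removed.
  const paragraphs = [];
  for (const m of html.matchAll(/<p\b[^>]*>([\s\S]*?)<\/p>/gi)) {
    const text = stripTags(m[1]);
    if (text) paragraphs.push(text);
  }

  // Videos: the src attribute of <iframe>, <video>, and <source> elements.
  const videos = [];
  const srcPattern =
    /<(?:iframe|video|source)\b[^>]*\bsrc\s*=\s*["']([^"']+)["']/gi;
  for (const m of html.matchAll(srcPattern)) {
    videos.push(m[1]);
  }

  return { title, paragraphs, videos };
}
```

The function works on a string only, so the same code could in principle run in the browser; the open question there is obtaining the HTML at all, since cross-origin requests from a page are normally blocked unless the target site allows them.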

