I need to add webpage scraping functionality to a single page application.
I need to retrieve useful content from many different blogs and services. By useful content, I mean articles, texts and links to videos in order to embed them on my pages.
This tool seems to offer what I need: http://www.diffbot.com/
Using it, I can simply input an article's URL and this service will retrieve all data that I need from that single page.
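To make that concrete, here is a minimal sketch of the kind of call I have in mind. The endpoint shape (a `v3/article` path taking `token` and `url` query parameters) is my assumption about how such a service is addressed, and `buildArticleRequestUrl` is a hypothetical helper name, so please check both against the actual documentation.

```typescript
// Assumed endpoint of the article-extraction service; verify against its docs.
const ARTICLE_ENDPOINT = "https://api.diffbot.com/v3/article";

// Hypothetical helper: builds the request URL for one article.
// Both values are percent-encoded so that a query string inside the
// article URL does not leak into the service's own query string.
function buildArticleRequestUrl(token: string, articleUrl: string): string {
  const encodedToken = encodeURIComponent(token);
  const encodedUrl = encodeURIComponent(articleUrl);
  return `${ARTICLE_ENDPOINT}?token=${encodedToken}&url=${encodedUrl}`;
}
```

I would presumably make this request from the Express server rather than from the React client, so that the API token is not exposed in the browser.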
However, I do not need to handle 250,000 requests per month, which would cost $300 a month; I need a solution that handles about 5,000 requests per month, with the possibility of scaling later.
I've found a lot of scraping solutions through Google, but they mostly offer solutions which scrape custom content periodically from a small number of websites - this is not what I need. Also, I do not have experience in this area, so I would like you to advise me on what I should use for this purpose. I am primarily dealing with JavaScript.
In addition, is it at all possible to allow pages to be scraped by the client's browser, rather than server-side?
I am developing the SPA with ReactJS and the Flux architecture. The server is NodeJS + Express, and the database is Backendless.