toFeed aims to provide syndication feeds for websites that don’t offer them.

Introduction

At its core, toFeed scrapes websites, converts the gathered data into a syndication feed format such as RSS or Atom, and exposes the generated feeds to news aggregators through a web service. It grew out of my desire to see news from the sites I visit regularly as soon as it is published, and to filter it according to my own preferences.
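
The scrape-and-convert step can be illustrated with a minimal sketch. It is not toFeed's actual code: it uses only the Python standard library (`html.parser` and `xml.etree.ElementTree`) in place of BeautifulSoup and Jinja2, and the sample HTML, the `headline` class, and the names `HeadlineParser`, `scrape` and `to_rss` are all hypothetical.

```python
from html.parser import HTMLParser
import xml.etree.ElementTree as ET

# Hypothetical page content; a real adapter would download this from a site.
SAMPLE_HTML = """
<html><body>
  <a class="headline" href="https://example.com/a">First story</a>
  <a class="other" href="https://example.com/x">Ignore me</a>
  <a class="headline" href="https://example.com/b">Second story</a>
</body></html>
"""


class HeadlineParser(HTMLParser):
    """Collect (title, link) pairs from <a class="headline"> elements."""

    def __init__(self):
        super().__init__()
        self.items = []
        self._href = None  # set while inside a matching <a> element
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("class") == "headline":
            self._href = attrs.get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.items.append(("".join(self._text).strip(), self._href))
            self._href = None


def scrape(html):
    """Return the list of (title, link) pairs found in the page."""
    parser = HeadlineParser()
    parser.feed(html)
    return parser.items


def to_rss(title, link, items):
    """Serialize the scraped items as a minimal RSS 2.0 document."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = "Generated feed for " + title
    for item_title, item_link in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "link").text = item_link
    return ET.tostring(rss, encoding="unicode")


items = scrape(SAMPLE_HTML)
feed_xml = to_rss("Example site", "https://example.com/", items)
print(feed_xml)
```

Filtering according to your own preferences would fit between the two steps, for example by dropping entries from `items` before passing them to `to_rss`.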

toFeed relies on third-party modules such as BeautifulSoup, Jinja2 and Flask to scrape websites and to generate and expose the feeds. That doesn’t mean you are limited to these modules: writing your own adapter is easy, and you are free to use whatever modules you like to do so. I chose BeautifulSoup over lxml.html primarily to avoid binary dependencies, which would make the package less portable and harder for end users to install. I am also simply more familiar and comfortable with BeautifulSoup.

Usage

You can run toFeed either locally on your own PC or remotely on a server. Using virtualenv is recommended in either case.

If you plan to run toFeed locally, simply execute the main module. The toFeed service should then start on localhost and expose the routes to your adapters from there.
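
To give a rough idea of what exposing adapter routes on localhost means, here is a minimal sketch. It uses the standard library's `wsgiref` instead of Flask so that it runs without extra installs, and it does not reflect toFeed's real routes or adapter interface: the `/example` path, the port, and the names `example_adapter`, `ROUTES`, `app` and `serve` are assumptions made for illustration.

```python
from wsgiref.simple_server import make_server


def example_adapter():
    """Hypothetical adapter: returns a ready-made RSS document as a string."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<rss version="2.0"><channel>'
        "<title>Example</title>"
        "<link>https://example.com/</link>"
        "<description>Demo feed</description>"
        "</channel></rss>"
    )


# Map URL paths to adapter callables (one route per adapter).
ROUTES = {"/example": example_adapter}


def app(environ, start_response):
    """WSGI application that serves the feed registered for the request path."""
    adapter = ROUTES.get(environ.get("PATH_INFO", ""))
    if adapter is None:
        start_response(
            "404 Not Found", [("Content-Type", "text/plain; charset=utf-8")]
        )
        return [b"No such feed"]
    body = adapter().encode("utf-8")
    start_response(
        "200 OK",
        [
            ("Content-Type", "application/rss+xml; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Start a blocking development server; stop it with Ctrl+C."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` would start the server, after which a news aggregator could subscribe to `http://127.0.0.1:8000/example`.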

Alternatively, if you want to set up an external toFeed instance, I recommend Heroku, which allows you to do so at no cost. Simply follow their “Getting Started with Python on Heroku” guide from the “Declare process types with Procfile” section onwards.
