The first choice is ALWAYS to use an API that the site itself provides. The second choice is to use the API of an existing data aggregator. Only after that, and only if it is allowed, scrape the site content using any of a number of tools, including those listed below.
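One possible way to make the "is scraping allowed?" question concrete is to check the site's robots.txt before fetching anything. The sketch below uses only the Python standard library. The helper name `may_scrape` and the sample rules are hypothetical, and robots.txt is only one signal: a site's terms of service may impose further limits that code cannot check.

```python
from urllib.robotparser import RobotFileParser


def may_scrape(robots_txt, url, user_agent="*"):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical rules for illustration; a real check would download
# the robots.txt file from the target site first.
EXAMPLE_ROBOTS = """\
User-agent: *
Disallow: /private/
"""

print(may_scrape(EXAMPLE_ROBOTS, "https://example.com/articles/1"))
print(may_scrape(EXAMPLE_ROBOTS, "https://example.com/private/data"))
```

Against a live site, `RobotFileParser.set_url()` followed by `read()` would fetch the real file instead of parsing an inline string.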
Caffeine Lab is a Python-first shop, so there are a number of tools that we can readily adopt.
BeautifulSoup - Screen Scraping since 2004!
Mechanize - The first-line tool for stateful programmatic web browsing.
Selenium - a powerful automation tool for web browsers that is great for testing, but also for simplifying boring web-based administration tasks.
Nightmare - a high-level browser automation library for Node.js (not Python) that is primarily used for UI testing and, of course, crawling.
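As a rough illustration of the simplest tool in the list, the sketch below uses BeautifulSoup to pull link targets out of a page. It assumes the `beautifulsoup4` package is installed; the sample HTML and the helper name `extract_links` are made up for the example, and the HTML is parsed from an inline string rather than fetched from a real site.

```python
from typing import List

from bs4 import BeautifulSoup

# Hypothetical page content; in practice this would come from an HTTP response.
SAMPLE_HTML = """
<html><body>
  <h1>Example page</h1>
  <a href="/first">First</a>
  <a href="/second">Second</a>
  <a>No href here</a>
</body></html>
"""


def extract_links(html: str) -> List[str]:
    """Return the href value of every anchor tag that has one."""
    # "html.parser" is Python's built-in parser, so no extra dependency is needed.
    soup = BeautifulSoup(html, "html.parser")
    return [a["href"] for a in soup.find_all("a", href=True)]


print(extract_links(SAMPLE_HTML))
```

Pages that need a login, form submission, or JavaScript execution are where the stateful and browser-driving tools in the list would come in instead.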