I recently had to write a script that takes a link to an article and returns a title and a brief excerpt or description of that article. Ideally, the excerpt is the first few sentences of the article body. The first thing I struggled with was something I thought would be trivial: fetching the contents of the webpage. Once we have the contents of the page, we load everything into BeautifulSoup and check that we actually got some valid HTML. cleanSoup is just a helper function that filters out HTML I’m not interested in or that could mess up my results.
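
To make the flow concrete, here is a minimal sketch of those steps. It is not the actual script: `fetch_html`, `clean_soup`, `title_and_excerpt`, and the `UNWANTED_TAGS` list are names and choices I made up for illustration, and `clean_soup` is only a guess at the kind of filtering a helper like cleanSoup might do. It assumes the standard library's `urllib.request` for fetching and the `html.parser` backend for BeautifulSoup.

```python
import urllib.request

from bs4 import BeautifulSoup

# Assumption: tags treated as noise when looking for an excerpt.
UNWANTED_TAGS = ["script", "style", "noscript", "iframe", "nav", "footer", "form"]


def fetch_html(url, timeout=10):
    """Download a page and return its body as text (hypothetical helper)."""
    # Some sites reject requests without a browser-like User-Agent header.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def clean_soup(soup):
    """Strip unwanted tags in place; an illustrative stand-in for cleanSoup."""
    for tag in soup.find_all(UNWANTED_TAGS):
        tag.extract()  # detach the tag and everything inside it from the tree
    return soup


def title_and_excerpt(html, max_chars=200):
    """Return (title, excerpt) taken from the <title> and first non-empty <p>."""
    soup = clean_soup(BeautifulSoup(html, "html.parser"))
    if soup.find() is None:
        # The parser found no tags at all, so this is not usable HTML.
        raise ValueError("no HTML tags found")
    title = soup.title.get_text(strip=True) if soup.title else ""
    paragraphs = (p.get_text(" ", strip=True) for p in soup.find_all("p"))
    excerpt = next((text for text in paragraphs if text), "")
    return title, excerpt[:max_chars]
```

Calling `title_and_excerpt(fetch_html(url))` ties the two halves together. Taking the first non-empty `<p>` is a crude approximation of "the first few sentences of the body", and real pages usually need more filtering than this.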