One of the challenges posed by a paywall is its impact on SEO. Since content is restricted to subscribers, Google can't spider your content. One answer is cloaking, which is used with paid content now that paywalls are becoming popular with website owners, but it can itself have a significant negative impact on SEO. While plenty of sites, some high-profile, employ this technique without penalty from Google, others have, over the years, reported being dropped completely from Google.
A paper on Declarative Information Extraction in a Probabilistic Database System. It's about (1) automatically converting free text into structured data, (2) using a state-of-the-art machine learning technique (Conditional Random Fields), which is (3) coded up in a few lines of SQL that integrate with the rest of your query processing. It represents a convergence where free text, relational data, and statistical models all come together in an elegant and very practical way.
Regular expressions are a powerful tool for parsing and validating strings. Combining regular expressions with the simplicity of jQuery selectors can produce fast and useful string parsers. This post will show you a couple of really useful parsers that you can use in various environments.
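To give a flavor of the kind of parser meant here, the following is a minimal sketch, not one of the post's own parsers. The pattern `US_PHONE`, the `PhoneParts` type, and the `parsePhone` function are all hypothetical names chosen for illustration, and the pattern is deliberately loose (it assumes a ten-digit, US-style number and does not check that parentheses are balanced).

```typescript
// Illustrative assumption: a loose pattern for ten-digit, US-style phone numbers.
// It accepts optional parentheses around the area code and an optional
// space, dot, or hyphen between groups.
const US_PHONE = /^\(?(\d{3})\)?[\s.-]?(\d{3})[\s.-]?(\d{4})$/;

// Hypothetical result shape for the parsed pieces.
interface PhoneParts {
  area: string;
  exchange: string;
  line: string;
}

// Returns the three digit groups, or null when the input does not match.
function parsePhone(input: string): PhoneParts | null {
  const match = US_PHONE.exec(input.trim());
  if (match === null) {
    return null;
  }
  return { area: match[1], exchange: match[2], line: match[3] };
}
```

On a page, the input string would typically be read through a jQuery selector, for example `$("#phone").val()`, where `#phone` is an assumed element id, and then passed to `parsePhone` for validation.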