links for 2009-05-28
By Josh Young
-
One of the challenges posed by a paywall is its impact on SEO. Since content is restricted to subscribers, Google can't spider it. One answer is cloaking, which serves the full content to Google's crawler while visitors see the paywall. It has been used with paid content as paywalls become popular with website owners, but it can have a significant negative impact on SEO. While plenty of sites, some high-profile, employ this technique without penalty from Google, others have reported over the years being dropped completely from Google.
-
A paper on Declarative Information Extraction in a Probabilistic Database System. It's about (1) automatically converting free text into structured data, (2) using a state-of-the-art machine learning technique (Conditional Random Fields), which is (3) coded up in a few lines of SQL that integrates with the rest of your query processing. It represents a convergence where free text, relational data, and statistical models all come together in an elegant and very practical way.
-
Regular expressions are a powerful tool for parsing and validating strings. Combining regular expressions with the simplicity of jQuery selectors can produce some fast and useful string parsers. This post will show you a couple of really useful parsers that you can use in various environments.
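To make the parsing-and-validating idea concrete, here is a minimal sketch in Python rather than jQuery, so it runs on its own. The patterns and the names `is_valid_email` and `parse_pairs` are made up for illustration and are not taken from the linked post. The email pattern is deliberately simple and will reject some valid addresses.

```python
import re

# Hypothetical patterns for illustration only; not from the linked post.
# A deliberately loose email check: local part, "@", domain, dot, TLD.
EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

# Matches "key=value" pairs, where the value runs up to the next semicolon.
PAIR_RE = re.compile(r"(\w+)=([^;]*)")


def is_valid_email(text):
    """Validate: return True if the whole string looks like an email address."""
    return EMAIL_RE.match(text) is not None


def parse_pairs(text):
    """Parse: turn 'a=1; b=2' into {'a': '1', 'b': '2'}."""
    return {key: value.strip() for key, value in PAIR_RE.findall(text)}


if __name__ == "__main__":
    print(is_valid_email("josh@example.com"))
    print(parse_pairs("width=100; height=200"))
```

The same split applies in any environment: an anchored pattern (`^...$`) for validation, and an unanchored pattern with capture groups for pulling fields out of a longer string.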
This entry was posted on May 29, 2009 at 2:03 am and is filed under Uncategorized.