Dec 18, 2010

Blogger reading difficulty



Google's newly-offered advanced search feature tells you the reading level (basic, intermediate, or advanced) of search results. I believe the intended use is to allow you to tailor your search results to your research level, but you can also use it to tell you the reading level of an entire site (as I did for this blog in the image above). Just go to advanced search, and under reading level, select "annotate results with reading levels".

[Note to developers: It would be super cool if there were an app that looked at your browsing history and output the reading level for each domain (and personalities and such).]

I looked at the results for some of my favorite blogs (plus some other popular news sites for context) and the results are below. (For links to the blogs, see the sidebar or the post I linked to in the previous sentence.)



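For this next chart, I created a "difficulty score" for each site: simply B%*1 + I%*2 + A%*3, where B%, I%, and A% are the percentages of the site's results rated basic, intermediate, and advanced. (So a score of 100 means all basic, and 300 means all advanced.)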
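
For anyone who wants to reproduce the score, here is a minimal Python sketch of that formula. The function name and the sample percentages are made up for illustration; they are not the figures behind the chart.

```python
def difficulty_score(basic_pct, intermediate_pct, advanced_pct):
    """Weighted score: basic counts 1, intermediate 2, advanced 3."""
    return basic_pct * 1 + intermediate_pct * 2 + advanced_pct * 3


# Hypothetical site: 50% basic, 40% intermediate, 10% advanced.
print(difficulty_score(50, 40, 10))  # 50*1 + 40*2 + 10*3 = 160
```

Since the three percentages add up to 100, every site lands somewhere between 100 and 300.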


Thoughts/surprises/observations:

-- Someone should do this for scientific journals, too. (Here is a site that lists the top 10 academic journals by category.)

-- Well done, Google, on all these exciting new research tools. See also the Body Browser and the Books Ngram Viewer. Google is still not at Wikipedia level in my book, but I can see why others think so.

-- I see people around the Web treating higher reading difficulties as somehow better. I think they mistake reading difficulty for intelligence. They are definitely not the same. I'd even bet the correlation is pretty weak. Scott Adams has the most intelligent blog I know, and it's easy reading. Advanced words are often necessary to articulate difficult concepts, but difficult concepts do not necessarily imply intelligence (nor even correlate with intelligence).

-- I wonder how the algorithm works. I am sure it is much more sophisticated than this, but I wonder if it is some measure of the proportion of uncommon words. Depending on how it works, "advanced" may or may not be synonymous with "difficult" as I am treating it.

-- As I would expect, the funnier blogs are lower on the reading difficulty scale. It's damn hard (impossible?) to be funny if people cannot easily understand you.

-- The results are clearly imperfect. (ABC News is more difficult than the New Yorker?? Harrison's and Brett's blogs are very readable -- Google probably just does not know them well. And while Robert tackles advanced concepts on his blog, he articulates them with eminent clarity.)

-- I was somewhat surprised to see that Wehr in the World is relatively low on the difficulty scale. It does not offend me. If I could mimic only one blog, it would be Scott Adams'. As on his blog, I want my words to be easily recognizable but the concepts surprising and enlightening. (Of course, I am not close to his level, but a boy can try.)
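
To make my guess about uncommon words concrete, here is a toy Python sketch of what "proportion of uncommon words" could look like. This is purely my speculation and not Google's algorithm; the function name and the tiny common-word list are invented for illustration, and a real list would have thousands of entries.

```python
import re

# Hypothetical, deliberately tiny list of "common" words.
COMMON_WORDS = {"the", "a", "is", "it", "to", "and", "of", "in",
                "cat", "sat", "on", "mat"}


def uncommon_word_share(text, common_words=COMMON_WORDS):
    """Return the fraction of words in text that are not in common_words."""
    # Lowercase the text and pull out runs of letters/apostrophes as words.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    uncommon = [w for w in words if w not in common_words]
    return len(uncommon) / len(words)


print(uncommon_word_share("The cat sat on the mat"))  # 0 of 6 words are uncommon
print(uncommon_word_share("The cat is ubiquitous"))   # 1 of 4 words is uncommon
```

A measure like this would rate a page as "advanced" for its vocabulary alone, which is exactly why advanced and difficult might not be the same thing.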