Posts about measurement

My content, my readers, my numbers, damnit

Hey, My Yahoo, Google Reader, Pluck, Newsgator Enterprise and other RSS readers: Hand over my numbers. You are taking my RSS feed and caching it to serve more efficiently, which would be fine if only you told me how many times you are doing that. But you’re not.

Brad Feld is much more polite than I am about this. He complains that My Yahoo just stopped reporting how many subscribers a feed has there, that Google Reader never reported it at all, and that many others, including those I list above, don’t report subscribers either, even though there is an easy and automated way to do so.

That’s theft. If you took a song and cached it and fed it out to lots of people these days without reporting back to the owner, you’d get sued or slapped in jail.

Well, all I ask in return for caching my feed is that you report back the number of subscribers. Not much to ask. And not doing that is tantamount to theft.
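For what it’s worth, the “easy and automated way” is generally understood to be the convention some aggregators already follow: announcing their subscriber count in the User-Agent of the requests that fetch the feed (e.g. “…; 1234 subscribers”). Here is a minimal sketch of what a publisher could do with that, assuming an Apache-style combined access log; the log path, feed path, and User-Agent examples are placeholders, not a claim about how any particular reader behaves.

```python
import re
from collections import defaultdict

# Aggregators that poll a feed centrally often advertise how many people they
# poll it for, e.g. "Bloglines/3.1 (http://www.bloglines.com; 1234 subscribers)".
SUBSCRIBER_RE = re.compile(r"(\d+)\s+subscribers?", re.IGNORECASE)

def subscriber_counts(log_path, feed_path="/index.xml"):
    """Tally the latest reported subscriber count per aggregator from an access log."""
    counts = defaultdict(int)
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for line in log:
            if feed_path not in line:
                continue
            # In the Apache combined format, the User-Agent is the last quoted field.
            fields = line.rsplit('"', 2)
            if len(fields) < 3:
                continue
            user_agent = fields[-2]
            match = SUBSCRIBER_RE.search(user_agent)
            if match:
                agent = user_agent.split("(")[0].strip() or user_agent
                counts[agent] = max(counts[agent], int(match.group(1)))
    return dict(counts)

if __name__ == "__main__":
    for agent, subs in sorted(subscriber_counts("access.log").items()):
        print(f"{agent}: {subs} subscribers")
```

That is all the reporting amounts to: one honest number in a request header. The complaint here is about the aggregators that fetch the feed once for everyone and say nothing about how many readers they fetch it for.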

Why do I care? Because I have an ego. Because I want to see how much RSS I serve and learn about it. Because I want to see how efficient my advertising is. And just because. Damnit.

RSS is becoming an ever-more-important transport mechanism, but without metrics, some will refuse to be transported by it. My Yahoo and Google Reader are making hay by including RSS in their new products. They should practice good citizenship and share the data those feeds generate with their creators.

I can’t go to the Syndicate conference this time, because it’s in California, but if I were there, I’d wear a T-shirt and carry a picket sign to all the players listed here and in my Feedburner report:

FREE MY FEED SUBSCRIBERS. HAND OVER MY NUMBERS.

: LATER: I should add that I’m not against caching; it saves on my server load. But I do want to maintain a relationship with the readers who subscribed to my blatherings, and the barest way to do that is to get statistics. I’m also not crazy about services changing feeds without my permission; some cut my full-text feed back to just headlines. Do newsstands refuse to tell you how many copies of your publication they sell? Do they cut out pages and give you only covers? No. Online distributors should operate by similar rules of the road.

: UPDATE: Jeremy Zawodny, of Yahoo, reports in the comments that the Yahoo counts will be back; it’s a bug to be fixed. Bravo. Now how about you, Google?

: LATER: See a followup post on a fundamental principle, above.

: LATER CONFIRMATION: I also just heard from a Yahoo exec who confirms that, indeed, something got broken in an upgrade and that they will feed back stats on feeds. Once again, thanks, Yahoo.

: LATER STILL: (Repeating this from the post above): Matt Cutts of Google says in the comments here that he will mention this to the guys at Google Reader and believes there’s no reason not to build it into a next version of that new product. Bravo again.

The alleged ad scarcity

Gary Stein at Jupiter argues that the ad scarcity is real. I argued that it is a false scarcity created out of laziness or ignorance on the part of media buyers. Stein says:

Large brands seeking to make a significant impact will always seek high-profile placement.

Is that laziness? Not really. It is ego. Even if no one ever bought a Coke because they saw the neon sign in Times Square, it is still important enough to Coke’s brand and culture to demonstrate that they are big enough and strong enough to put that message in that spot. So, as long as there are new-model-car releases, brand-launches and opening-weekends, there will be a reason to pay tons of cash for the home page of Yahoo! Scarcity will exist for that, and it will keep prices high.

But the problem comes in if you say “well, Yahoo’s home page is gone, so there goes the campaign.” Online advertising is all about technology innovation, and the twin turbines of targeting and optimization will both increase the revenues for publishers as well as the effectiveness for advertisers.

The shortage? It’s just gonna lead to more technology innovation. Any network/serving technology/targeting system that can (really) do optimization is in a good place right now. They’re the ones who will benefit the most from a shortage.

Yes but….

The importance of any home page on any portal will decrease as more and more traffic is generated directly through searches and links. The problem remains that advertisers want to re-create TV: they want one-stop-shopping mass buys; they want the upfront. But they shouldn’t want that, because it only creates a scarcity that drives up prices. That’s why I say it’s laziness: if they really did their homework, they’d figure out that they could get better targeting, better reach, more efficiency, and more effectiveness by engineering ad hoc networks tailored to their needs, and they’d benefit greatly. But that takes work: not just the effort of putting together one flight but research and education and experimentation.

One agency and advertiser will figure this out and then the others will catch up for the same reason Gary cites: ego.

Another damned list

Feedster comes out with a list of 500 blogs. Jason Calacanis kvells (could it be because he sees his blogs on it?).

But this misses the point (again). Making a universal top n00 list, however it is made, continues to engage in old-media, big-media, mass-market think: the guys on top win.

No, in this new world of choice and control at the edges, it’s the niches, and those who can pull them together, who win. And it’s those who can demonstrate influence and engagement who will win — as soon as somebody figures out how to demonstrate it.

Besides, a universal top n00 list is even a bad execution of big-media think. When Ad Age gives you lists of magazine revenue, it separates women’s and entertainment and business publications; in big-media, those pass as niches and they are far more valuable comparisons. When talking about newspapers, you don’t lump in metro papers with town papers with trade papers; it’s a meaningless lump.

When somebody can tell me who the queen of the knitting bloggers is, then I’ll listen…. and so will knitting advertisers.

The numbers

Two people I respect tweaked me — one in a post, one in email (both with a business interest in the matter) — for what I wrote about the comScore kerfuffle (or efforts to start one). I think they misread me, but that probably means I miswrote it, so let me restate to be clear:

First, I am delighted that we have the comScore research. I think it is incredibly important in the growth of blogs as a business — for those who want it to be a business — for it states their metrics in terms that advertisers will respect and understand… and buy. Remember that I pushed very hard for such research to be undertaken at Bloggercon II.

Second, the things in the results that don’t sound sensible can probably be corrected with improved methodology, and I hope that comScore is smart enough to call upon the expertise of this open world to help them improve that methodology. Fred asks Jason to hand over his server logs to help them, and he asks for time, and he’s right:

The issue with panel research is you need time to develop the statistical algorithms that weight the panel data correctly before you scale the numbers. And you need a very good dictionary of the domains you track. These take time to nail down. Clearly Comscore hasn’t nailed it yet, but they never said they had.

But third, even with the best study, I stand by my point that panel research — as valuable and necessary as it is for advertising to be able to judge an even playing field of properties — is at the end of the day still bullshit. Fred argues:

Jeff is so wrong about this that I find myself shaking my head on this one.

First, the problem with panel research of old is small sample sizes. The panels that have been used for decades in old media are almost always less than 100,000 people primarily because it is cost prohibitive to collect data from a larger panel. Clearly that is way too small for accurate measurement.

But Comscore invented the concept of a “megapanel” in 1999 and is currently measuring over 2 million unique Internet users. That’s the beauty of the Internet: you can measure online at a scale unimaginable offline. At that kind of scale, panel research is not only accurate, it’s amazingly accurate. As far back as 2001, Comscore was able to predict a missed quarter for Amazon.com. And the panel size and the technology have improved significantly since then.

Well, I spent years in publishing and then on the internet seeing how wrong panel research could be. People magazine allegedly had eight readers per copy according to panel research. On the face of it, that’s absurd. But it determined the readership and thus the cost per thousand and thus the ad rates for People. And that paid my salary. But it was bullshit. And everybody knew it.

Online, I saw these big panel studies rate some of my smaller sites as huge and some of my biggest sites as nonexistent. And the reason for that, clearly, was that the sample didn’t have people clicking mice in Ratbutt, Alabama. The problem is that when you measure small things — and blogs, individual blogs, are still very small — the reliability of a panel sample falls off sharply, and the impact of one reader, or one missing reader, can be amplified to ludicrous extremes.
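To put rough numbers on that last point, here is a back-of-the-envelope sketch, assuming an idealized simple random panel (real panels are recruited and weighted, which tends to make small-site estimates worse, not better). The reach figures are made up purely for illustration.

```python
import math

def panel_error(true_reach, panel_size):
    """Expected panelists and relative standard error for a simple random panel."""
    expected_hits = true_reach * panel_size
    # Binomial standard error of the estimated reach, expressed relative to the reach.
    relative_error = math.sqrt(true_reach * (1 - true_reach) / panel_size) / true_reach
    return expected_hits, relative_error

# A big portal, a mid-sized site, and a single blog (illustrative reach figures).
for label, reach in [("portal", 0.10), ("mid-sized site", 0.001), ("single blog", 0.00001)]:
    hits, err = panel_error(reach, panel_size=2_000_000)
    print(f"{label}: ~{hits:,.0f} panelists, about ±{err:.0%} relative error")
```

On a two-million-person panel, a portal is measured by a couple hundred thousand panelists and the sampling error is a rounding error; a blog reaching one person in a hundred thousand is measured by about twenty panelists, so a handful of oddball households, or a hole in the panel the size of Ratbutt, Alabama, swings the number wildly.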

My fourth point is that, yes, I agree with Fred that Jason should take a deep breath and get over it, and that others who nitpicked the study should take it in the context of all such studies: they are all bullshit, so untie those knotted knickers and move on.

For, yes, this is an important study. It shows that blogs are big and growing. It shows the relative size of their audiences against other media properties and makes them real. It starts to give advertisers (and us) a feel for the unmedium — our interests, our habits (e.g., political blogs get more traffic from smaller audiences), our demographics, and all that.

So what I’m really saying — though, clearly, I said it badly — is that the details and the nitpicking and the fact that it’s all bullshit don’t matter, and so it’s not worth fighting over, especially if it can be improved. This is what advertisers need. This is what will make them buy.

My fifth and final point is that we would be foolish to stop at this kind of syndicated research as the basis of building blogs as a business, for — just like Google AdSense — it devalues and misjudges us. We have greater value in our relationships and influence and we need to find ways to measure that. No, that is not a substitute for the basic audience metrics advertisers demand. But it will prove our higher value and we can then get paid for that higher value and that is a good thing.

Numbers don’t joke

Jason Calacanis is having proper conniptions over the comScore marketing study on blogs released this week. Fred Wilson and Heather Green have their moments of doubt as well. Krucoff, says Jason, nails it and quizzes Rick Bruner, who helped on the study itself, as Rick tries to answer questions about the methodology on his blog. I leave it to you to follow the links above to the specifics.

My first reaction is that all this shows how messed up panel research is. This is the method used by Nielsen et al. to measure TV and radio and print readership — affecting billions of ad dollars — and it is and always has been relative bullshit. That’s why advertisers buy it, though — because it is relative, because they can compare this magazine to that magazine on the same sheet. But it’s all based on a small and only allegedly representative sample of people. It’s meaningless. When I worked on magazines that allegedly had eight readers per copy — damned dog-eared, they were — we benefited from this relative bullshit. But when I came online, we could measure the bull ourselves: we could compare our server and cookie stats with what the panel research told us, and we could tell when they didn’t have any panel members in entire states. Panel research is a novel in numbers.

My second reaction is, however, that blogs need some sort of numbers advertisers will buy. I thought it was a good idea to try to get that research and, as I read that link, I see that this is partly my fault: At the blogging business session I emceed at Bloggercon II, I emphasized the need to feed advertisers their metrics. So it’s a damned shame that this research is raising such eyebrows.

My third reaction is that we should be creating our own meaningful metrics. Bloggers who care about making a business of blogging (and let’s remember: that’s only some of us) should be agreeing on cookies and also on new means of measurement and new things to measure.
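As a gesture at what that could mean in practice: census-style measurement is the publisher’s own server counting its own readers, typically with a first-party cookie, instead of a third party projecting from a panel. This is only a toy sketch of the idea; the cookie name, port, and in-memory counter are illustrative, and a real system would persist its data and account for cleared cookies, shared machines, and RSS readers who never load a page.

```python
import uuid
from http.cookies import SimpleCookie
from http.server import BaseHTTPRequestHandler, HTTPServer

COOKIE_NAME = "reader_id"   # illustrative first-party cookie name
seen_readers = set()        # toy in-memory tally of unique reader IDs

class MeasuredHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        cookie = SimpleCookie(self.headers.get("Cookie", ""))
        if COOKIE_NAME in cookie:
            reader_id = cookie[COOKIE_NAME].value      # returning reader
        else:
            reader_id = uuid.uuid4().hex               # first visit: mint an anonymous ID
        seen_readers.add(reader_id)

        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        # Re-set the cookie so the same reader keeps the same ID for a year.
        self.send_header("Set-Cookie",
                         f"{COOKIE_NAME}={reader_id}; Path=/; Max-Age=31536000")
        self.end_headers()
        self.wfile.write(f"Unique readers so far: {len(seen_readers)}\n".encode())

if __name__ == "__main__":
    HTTPServer(("localhost", 8000), MeasuredHandler).serve_forever()
```

Agreeing on that much is the easy, boring baseline.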

This isn’t as simple and stupid as an a-list or a panel or page-views or eyeballs. This is a much richer thing, this unmedium of ours, and it needs much smarter measurement. See Mary Hodder’s napkin notes for just some of the means of measuring blogs’ popularity and appeal; I can think of many more.

Advertisers are screaming for proof of “engagement” these days and while silly, inky magazines are trying to “engage” with flashing ads on paper, we flash without trying. We engage or die. We live by relationships and trust — more fave ad words. We have influence — yet another fave word. We need to measure and report all that.

Instead, we’re futzing and fussing and fuming over the few numbers we have and giving advertisers another excuse to ignore us. Arrgh.

comScore should reveal much more about its study so that bloggers can poke at it and so the crowd — the wise crowd — can help improve the methodology and ferret out what makes sense and what doesn’t — and let’s remember that these numbers do show that blogs are a thing that should not be ignored. If something looks odd, explain it or explain why you can’t.

And we should start finding new ways to measure our real value — and that’s not about continuing to chase the big numbers that are so old-media and it’s not about continuing to value relative bullshit. We need to find the numbers that count.

: Oh, and while we’re at it, can somebody point me to exactly what the oft-quoted and bragged-about Alexa numbers are really based on? Does their data still come from the toolbar? Do you know a single soul who actually uses that toolbar? How big is their sample? How representative?

Every one of these services that now tout numbers should be transparent about methodology and sources and scale. It’s late, so I’m not going to go looking now. But if you can point me to such disclosure at Blogpulse, Technorati, Bloglines, et al, please leave a link in the comments and let’s start by finding the best of breed in transparency.