Posts about data

‘Decomputerize?’ Over My Dead Laptop!

This week, I wrote a dystopia of the dystopians, an extrapolation of current wishes among the anti-tech among us about dangers and regulation of technology, data, and the net. I tried to be detailed and in that I feared I may have gone too far. But now The Guardian shows me I wasn’t nearly dystopian enough, for a columnist there has beaten me to hell.

“To decarbonize we must decomputerize: why we need a Luddite revolution,” declares the headline over Ben Tarnoff’s screed.

He essentially makes the argument that computers use a lot of energy; consumption of energy is killing the planet; ergo we should destroy the computers to save the planet.

But that is a smokescreen for his true argument against his real devil, data. And that frightens me. For to argue against data overall — its creation, its gathering, its analysis, its use — is to argue against information and knowledge. Tarnoff isn’t just trying to reverse the Digital Revolution and the Industrial Revolution. He’s trying to roll back the fucking Enlightenment.

That he is doing this in the pages of The Guardian, a paper I admire and love (and have worked and written for) saddens me doubly, for this is a news organization that once explored the opportunities — and risks — of technology with open eyes and curiosity in its reporting and with daring in its own strategy. Now its writers cry doom at every turn:

Digitization is a climate disaster: if corporations and governments succeed in making vastly more of our world into data, there will be less of a world left for us to live in.

It’s all digitization’s fault. That is textbook moral panic. To call on Ashley Crossman’s definition: “A moral panic is a widespread fear, most often an irrational one, that someone or something is a threat to the values, safety, and interests of a community or society at large. Typically, a moral panic is perpetuated by news media, fuelled by politicians, and often results in the passage of new laws or policies that target the source of the panic. In this way, moral panic can foster increased social control.”

The Bogeyman, in Tarnoff’s nightmare, is machine learning, for it creates an endless hunger for data to learn from. He acknowledges that computer scientists are working to run more of their machines off renewable energy rather than fossil fuel — see today’s announcement by Jeff Bezos. But again, computers consuming electricity isn’t Tarnoff’s real target.

But it’s clear that confronting the climate crisis will require something more radical than just making data greener. That’s why we should put another tactic on the table: making less data. We should reject the assumption that our built environment must become one big computer. We should erect barriers against the spread of “smartness” into all of the spaces of our lives.

To decarbonize, we need to decomputerize.

This proposal will no doubt be met with charges of Luddism. Good: Luddism is a label to embrace. The Luddites were heroic figures and acute technological thinkers.

Tarnoff admires the Luddites because they didn’t care about improvement in the future but fought to hold off that future because of their present complaints. They smashed looms. He wants to “destroy machinery hurtful to the common good.” He wants to smash computers. He wants to control and curtail data. He wants to reduce information.

No. Controlling information — call it data or call it knowledge — is never the solution, not in a free and enlightened society (especially not at the call of a journalist). If regulate you must, then regulate information’s use: You are free to know that I am 65 years old but you are not free to discriminate against me on the basis of that knowledge. Don’t outlaw facial recognition for police — as Bernie Sanders now proposes — instead, police how they use it. Don’t turn “machine learning” into a scare word and forbid it when it can save lives; instead, be specific, bringing real evidence of the harms you anticipate, before cutting off the benefits. On this particular topic, I recommend Benedict Evans’ wise piece comparing today’s issues with facial recognition to those we had with databases at their introduction.

Here is where Tarnoff ends. Am I the only one who sees the irony in the greatest progressive newspaper of the English-speaking world coming out against progress?

The zero-carbon commonwealth of the future must empower people to decide not just how technologies are built and implemented, but whether they’re built and implemented. Progress is an abstraction that has done a lot of damage over the centuries. Luddism urges us to consider: progress towards what and progress for whom? Sometimes a technology shouldn’t exist. Sometimes the best thing to do with a machine is to break it.

Save us from the doomsayers.

Geeks Bearing Gifts: Curation & Data

I’ve posted two really short chapters from Geeks Bearing Gifts today on Medium: one on curation, one on data. Then I’ll take a break for the holiday and come back with a bigger chapter on rethinking what mobile really means for news.

A snippet from the chapter on curation (relevant to current discussions about Google and news in Europe):

As early as 2009, Google Executive Chairman Eric Schmidt responded that Google News was sending one billion clicks a month — Google as a whole three billion a month — to publishers. “That is 100,000 opportunities a minute to win loyal readers and generate revenue — for free,” he wrote. Right. Curation — being curated — is a means of discovery and distribution for content. In an ecosystem of abundant content and no end of competitors for a reader’s attention, publishers should want to be curated so that readers may find their content. Later, in a discussion of the link economy and copyright, I will explore the business implications of valuing not only the creation of content but also the creation of an audience for it — sometimes, through curation.

And here’s a snippet from the chapter on data:

Data is a critical new opportunity for news organizations. What journalists have to ask — as with the flow of news — is how they add value to data by helping to gather it (with effort, clout, tools, and the ability to convene a community), analyze it (by calling upon or hiring experts who bring context and questions or by writing algorithms), and present it (contributing, most importantly, context and explanation). . . . 

Data needs to become a mindset and a skill set in news organizations. Journalists should receive training to become literate in the opportunities and requirements of using data. Journalists also have to work with specialists who can analyze, interpret, and present data, and who can create tools allowing both reporters and the public to work with it. From a business perspective, data should be seen as an asset worth investing in, one that can yield news and new engagement often at a low cost. Data is/are a step past the article.

Read the rest of each chapter here and here. If you can’t wait for the rest, then you can buy the book here. The perfect gift for the journowonk on your list.

Data are news

Tom Loosemore, blogging MP Tom Watson — and others, including the Guardian — have been fighting to get more public data made public in the UK. Now Watson and Loosemore have launched a $40k prize to mash up this data and come out with lots of lemonade. Here’s Paul Bradshaw on the movement. Here are some — as a Brit tweet said — stonking good ideas already.

: LATER: This tweet by Charles Arthur of the Guardian — “wtf? No downloadable school league tables?” — made me realize that newspapers are also foolish not to make their data mashuppable. If we put out all our sports data as tables that could be downloaded and mashed up, people would build no end of great stuff on top of us. That’s thinking like a platform. WWGD?

The fruits of change

The lesson of the Thomson-Reuters merger is the value of change. Thomson was a newspaper company and in the ’90s started shifting, getting rid of papers and getting into data and finding great success and growth there. Reuters was a newspaper service company and it made the shift into not only data but also, thanks to the wisdom of its current chief Tom Glocer, into direct-to-consumer news. Both specialized highly, in financial data in their cases. Compare and contrast them with Knight Ridder, which doubled down on broad, generalized print products, and Tribune Company, which diversified from print, though not on a specialized track but in more generalized electronic media (TV and radio). Recognizing the value of specialized data as news and acting early — with strong headstarts, of course — was a successful strategy. Can existing newspaper companies start to think of themselves as data providers and enablers, but in different spheres (e.g., hyperlocal, listings)? Is there time?

The data fight

The issues in the fight over telephone companies releasing data to the NSA aren’t so simple as they are being reported and spun under the dark cloud of privacy violation.

From what we know, data was released to the NSA so it could be analyzed to find patterns and thus to find anomalies that might lead to suspect communication and suspects, in turn. In other words, you can’t tell what’s abnormal until you define normal and we define normal.

If, in fact, it is aggregate data they are using to discover those exceptions, then we need to ask a new question that isn’t really being addressed in the networked world: Who owns the wisdom of the crowd? If the people own it, then one could argue that the government, acting as the people, may seek and use that data unless we, the people, forbid it through law. There is, of course, a proper debate about whether the law does allow it. There is also a proper debate over whether this is a necessary and prudent weapon in finding terrorists (and whether that is being done effectively). Indeed, a Washington Post poll says that 63 percent of Americans consider this an “acceptable way for the federal government to investigate terrorism.” And didn’t we protest that our government did not do a good enough job analyzing data and intelligence to prevent 9/11? If someone had been analyzing patterns of enrollment in flight schools — hmm, why are an abnormally high number of Saudis suddenly learning how to fly passenger jets? — then could we have stopped them? A further question is whether we have a right to know that all this is going on or whether that public knowledge cripples this investigation and our safety. Finally, it is not clear that releasing aggregate data necessarily violates individuals’ privacy. My point is that this isn’t as simple as raising the tattered-from-overuse privacy flag. Neither is this as simple as raising the also tattered war-on-terrorism flag.

This is about a new asset that is created in the networked world — the aggregate knowledge generated by our aggregate behavior — and who has a right to that.

This is certainly not new, only more efficient. Insurance companies have long used our health and mortality data in aggregate to set rates. Marketers use our aggregate data to adjust products and ad campaigns. Google uses our aggregate data to improve its search engine. So Google owns, analyzes, and exploits the data we create through our actions. In the case of the kiddie porn investigation, Google tried to refuse to hand over random aggregate data about our searches to the government; other search engines complied. The same thing occurred in the NSA case; some phone companies complied and Qwest did not.

The bottom line is that there isn’t yet a bottom line: The law and ethics around aggregate data are not clear.

See also this New York Daily News editorial:

Well, here we go again with the horrified screams from the crowd that’s inclined to believe the big bad government is peeping through every keyhole and recording every streetcorner chat about whether or not it looks like rain.

Revelations that the National Security Agency has been collecting a database of every telephone call in America – numbers dialed, that is, not conversations parsed – happen to come as British probers report that July’s London transit bombings might have been prevented if only security forces had been aware that one of the bombers regularly called Pakistan in the days before the blasts.

No, it’s no crime to call Pakistan. But when the call is part of a pattern that suggests a security risk, this is worth red-flagging and perhaps eavesdropping on – with a warrant and court supervision, as all right up to the commander in chief agree would be necessary.

Anyway, the idea that phone companies have been turning over raw logs to the NSA somehow doesn’t strike us as all that revelatory. Of course they have been, and they have been doing it legally. If the purpose is synthesizing data, then certainly the NSA would be keeping a database from which to synthesize. And where did you think the NSA was going to go to collect log data? …

: See also this Washington Post story on the privacy buggabuzzword:

“I wish I could say I was bothered by it but I’m not,” said Jacques Domenge, a 28-year-old Potomac man who visited a Cingular Wireless store in Rockville yesterday to replace a stolen phone.

“If it’s only done to protect people and find patterns that help the government find terrorists — I don’t think it will work, by the way, but let’s say it will — then I am all for it,” he said, adding that he had no problems with Cingular — or any other phone company — turning over records.

According to a Washington Post-ABC News poll released yesterday, 63 percent of Americans said they found the NSA program to be an acceptable way to investigate terrorism, including 44 percent who strongly endorsed the effort. Another 35 percent said the program was unacceptable, including 24 percent who strongly objected to it.

“The value of fighting terrorism, in a lot of our research, seems to be more important to the public than what they perceive as violations of their privacy — so far,” said Frank Newport, editor in chief of the Gallup Poll and vice president of the Gallup Organization in Princeton, N.J.

Newport said views of the NSA program — which was disclosed on Thursday by USA Today — should be viewed in the broader context of Americans grappling with more and more of their personal data being collected and analyzed by businesses. “When we ask what’s the most important problem facing the country, we don’t see any signs that privacy is beginning to percolate up,” he said.