Mom & Dad, I really don’t know how you had three of these little monsters, much less one.
Reblogging something I originally wrote over here.
So, I had a fun interaction on social media over the weekend — my friend Adam tweeted a link to an Anil Dash essay about what is “public,” and what that concept should mean. And I shot my mouth off (to my probably close to two dozen real human followers) in response, calling the piece “flimsy” and comparing it unfavorably to one of my favorite books, The Quantum Thief.
I’ve been re-reading TQT over the last several days, as a prelude to reading the final book in the series which was just released a week ago, and I think its discussion of privacy, public-key encryption, access-control, and identity — through the devices of gevulot, exomemory, and the political life of the Martian city called “The Oubliette” — is perhaps second-to-none. At this point I would say that “science fiction is where we work out the ethics of modern technology” is an assertion whose truth is so obvious…
Anyway, I received an immediate object lesson in the public nature of social media: “I refute it thus.” No sooner had I tweeted my opinion than Mr. Dash himself showed up, asking what in his essay I thought didn’t contain “nuance.” Fair enough question! And it was so nicely asked that I tried to take the time and explain, in a series of tweets, what I thought my problems were. And even though I’m positive that anyone with half-a-million followers is so inundated with unsolicited comment that he’s got to be virtually numb to the extended sensation of social media by now, he (very nicely) read what I had to say and asked me at the end to maybe write it up somewhere.
So here we are. I’ll lay out my objections one-by-one, from smallest to largest, so they’re written in one place. If anyone reads this and wants to complain, feel free to come to Twitter and yell at me there.
Minor Complaint #1: The Cloud
Dash’s mention of the cloud,
But we can’t let society’s norms be defined by which features are least expensive for storing on a database server in the cloud.
is pretty weak. I mean, I know everyone’s having fun knocking “the cloud” these days, but there’s nothing in here that’s inherent to “commodity computing and storage” (except insofar as we’re trying to say that “the cloud” lowers the activation energy for startups which are more likely to violate people’s sense of privacy; more on that in a second), and nothing he says couldn’t be applied equally to the “server/client” architectures of years past.
Minor Complaint #2: The Public
I think there are at least two senses of the word “public”: one that applies to actions (are they public acts, or private ones?) and one that is more like a mass noun (“the public,” like hoi polloi). I think Dash is skipping lightly over the difference between the two. A “public defender” isn’t the same thing as a “public AA meeting in the park,” and very often the two senses are at cross-purposes. “The public” may have an interest in “investigative journalism,” even if “people-calling-themselves-journalists doxxing anonymous people online” might be a violation of some public/private divide. This is kind of a weak objection, but I want to register it anyway.
Moderate Complaint #3: Trolls & Stalkers
I think Dash’s division between “media/technology industries” who benefit from weaker definitions of privacy and “the public” who each individually benefit from stronger notions of privacy is incomplete. Any modern discussion of privacy has to discuss trolls and stalkers, the sorts of people who exploit wide definitions of “privacy” in order to violate the same boundaries around other people.
Moderate Complaint #4: Tech-implementations of “privacy”
Programmers and engineers who create software with controls for privacy have moved in recent years to an on/off model where content is either viewable to the entire world or only to a list of people whom a user identifies as “friends”. Obviously, reducing public status to a binary consideration is convenient for a medium where everything must ultimately be represented in binary code. But we can’t let society’s norms be defined by which features are least expensive for storing on a database server in the cloud.
I honestly don’t feel as if “expense” is necessarily the only quantity that’s being minimized here. To start with, as Dash knows too (of course), tech types have a long history and experience with implementing and using different models of permission, privacy, publication, sharing, and trust. Programmers have experimented with innumerable ACLs, authentication and authorization systems (“authn” and “authz”), trust models, PKI, and rule-based systems for grouping users, setting permissions, auditing, and sharing.
But it’s hard to deploy these approaches in systems that are used by The Public writ large. Just take a look at the way that people complain about the complex settings for sharing and visibility in Facebook, or the intense amount of derision that was leveled at Google+’s use of “circles” (which is, honestly, about the simplest projection of the kind of users-and-groups system you’ve found in every Unix system around the world for the last thirty years). Even in The Quantum Thief, Rajaniemi suggests that managing the system of sharing (called “gevulot,” an elaborate system of “co-memories” and quantum encryption) is so complicated that Martians have evolved specialized organs to handle it for them, like co-processors for privacy.
So yeah, building powerful-yet-intuitive systems for sharing and privacy into our modern web-apps would be expensive, but partly because it’s hard and no one’s ever done it before.
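For anyone who hasn’t stared at a users-and-groups model before, here is roughly what sits underneath a “circles”-style feature. This is only a minimal sketch in Python: the names (`Post`, `can_view`, the group names, the special “public” group) are hypothetical and made up for illustration, not taken from Google+, Facebook, or any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    author: str
    # Names of the author's groups allowed to view this post.
    # The hypothetical group name "public" means "everyone".
    visible_to: set = field(default_factory=set)


def can_view(user, post, memberships):
    """Return True if `user` may see `post`.

    `memberships` maps an author to {group name: set of member users}.
    """
    if user == post.author or "public" in post.visible_to:
        return True
    groups = memberships.get(post.author, {})
    return any(user in groups.get(g, set()) for g in post.visible_to)


# Illustrative data only.
memberships = {"alice": {"family": {"bob"}, "coworkers": {"carol"}}}
post = Post(author="alice", visible_to={"family"})

print(can_view("bob", post, memberships))    # bob is in alice's "family" group
print(can_view("carol", post, memberships))  # carol is only in "coworkers"
```

The check itself is a few lines. The hard part, as far as I can tell, is everything around it: getting people to maintain those group lists and to understand what each setting will actually do.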
Final Complaint #5: Is Privacy Really a Function of the Data Itself?
This brings me to my final, and (I believe) most important problem with Dash’s piece. He writes, throughout, as if the question of “what is public?” is one which attaches to the kind of data itself. Is a tweet public? Is a GPS trace public? Do we, The Public, need to lobby our lawmakers to make these things, which tech and media companies currently collect because they are (by default) viewed as public information, into private data?
But I think these questions are asked under an assumption which, while most people make it, isn’t actually true at all: that whether something should be private or not is a function solely of what it is.
One of the things that genomics (where I spend part of my waking life) has been figuring out is that privacy isn’t just a question about what data you have (“is your genetic data private?”) but also a question about how much of it you’ve aggregated.
One of my favorite papers from last year was from the Erlich lab at the Whitehead Institute here in Cambridge,
Gymrek et al. “Identifying Personal Genomes by Surname Inference.”
What Dr. Erlich (who you should all go follow on Twitter) and his students did was build a database of genetic information from Y chromosomes (which are, like surnames in most western countries, handed down from father-to-son) that had been deposited in public genealogical web apps. Using this background database, they could infer, from anonymous genomic information alone, the surname of a person who wasn’t in the original genealogical database. Depending on who you asked in the genetics community, this was either “blindingly obvious” or “too dangerous to publish.”
But the point is, you can’t simply point at a person’s genome and say, “this should be private.” Rather, how private something could/should be can depend crucially on how much other data you’ve already aggregated. As Jeff Leek said a few weeks ago, it’s possible that “Privacy [is a] function of sample size.” To some extent, this is my problem with a lot of discussions about “big data.”
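To make the aggregation point concrete, here is a toy sketch in Python. The records are invented for illustration (they are not real data, and this is not the method from the Gymrek paper). It just counts how many people share the same combination of attributes: each field looks harmless on its own, but the crowd you can hide in shrinks as fields are combined.

```python
from collections import Counter

# Invented toy records: (zip code, birth year, sex). Not real data.
records = [
    ("02139", 1980, "F"),
    ("02139", 1980, "M"),
    ("02139", 1975, "F"),
    ("02139", 1975, "M"),
    ("02142", 1980, "F"),
    ("02142", 1980, "F"),
    ("02142", 1975, "M"),
    ("02142", 1975, "F"),
]


def smallest_group(records, n_fields):
    """Size of the smallest set of records sharing their first `n_fields` values."""
    counts = Counter(r[:n_fields] for r in records)
    return min(counts.values())


# Aggregate one more field at each step and watch the smallest group shrink.
for n in range(1, 4):
    print(n, "field(s):", smallest_group(records, n))
```

In this toy set, once all three fields are combined, at least one person is the only one with their combination, even though no single field gave them away.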
Dash is probably right in saying that this kind of thing isn’t very common among consumer tech companies, but I’d argue that that’s largely because our understanding of “ethics” is usually tied to our concepts of personal identity themselves. The reason you see fields like genetics working this out first is that we have a slightly better understanding of the identity of your genome than we do of (say) a GPS trace. On the other hand, when Latanya Sweeney is your “Chief Technologist” at the FTC, it’s hard to argue that it’s not on someone’s radar in the general commercial arena.
I really like a lot of Anil Dash’s writing. And it’s always a good thing to have a conversation like the one he’s starting (or at least, continuing). I just hope we can move to a place, as a community, where our discussions of “nuance about public data” can be more … nuanced.
Yes, I had seen that, and I had a conversation with J about it already. I think NPR’s authors are scare-mongering to say that “measles and whooping cough made a comeback in the US and Europe” because of “changes in views about vaccine safety.”
This is emphatically false in the case of pertussis, which had a resurgence over the last 2 decades because no one realized how short-lived the immunity from pertussis vaccination was. Vaccination strategies for whooping cough were predicated on the idea that immunity was permanent, but it was discovered that the effect of immunization for pertussis wears off after 10 to 20 years. Since that realization, it is routine for people to get a tetanus booster containing the acellular pertussis vaccine if they haven’t had one before.
In the case of measles, it’s not that measles is spontaneously cropping up in un-vaccinated children. Take a look at the CDC’s news briefing on measles outbreaks in 2013. Measles doesn’t exist in the US, and as that briefing says (and also the two they link to about the NYC and NC outbreaks in 2013), all the cases the CDC documented came from abroad. It sounds like about half were brought by US residents and half were brought by non-US residents, and in each instance, the virus spread briefly among personal contacts within mostly family circles and then was extinguished and didn’t infect anyone outside the circle.
As far as vaccination rates needed for herd immunity, do you remember Ira Schwartz from the Naval Lab who studied SIR and SEIR disease models? He gave a talk at the physics department once and I remember him saying that the vaccination rates needed to extinguish epidemics are vastly different depending on the parameters of your disease (transmissibility of the virus, shedding time of the virus, duration of immunity, etc.). The only reference I can dig up that seems related is this one, but I imagine there are others.
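To give a feel for how much the parameters matter, here is a minimal sketch in Python using the simplest textbook SIR result. It is not anything from the talk I’m remembering. Under the assumptions of homogeneous mixing and a perfectly effective vaccine with lifelong immunity, the fraction of the population you need to vaccinate is 1 - 1/R0, where R0 is the basic reproduction number. The R0 values below are illustrative placeholders, not estimates for any particular disease.

```python
def critical_vaccination_fraction(r0):
    """Fraction to vaccinate so an outbreak dies out in a simple SIR model.

    Assumes homogeneous mixing and a perfect vaccine with lifelong immunity.
    If r0 <= 1 the disease cannot sustain an epidemic, so no vaccination is needed.
    """
    if r0 <= 1:
        return 0.0
    return 1 - 1 / r0


# Illustrative R0 values only (assumptions, not measured quantities).
for r0 in (1.5, 4, 15):
    frac = critical_vaccination_fraction(r0)
    print(f"R0 = {r0}: vaccinate at least {frac:.0%}")
```

Even this crude version swings from about a third of the population to well over ninety percent as R0 changes. Waning immunity, which this sketch leaves out entirely, would presumably push the requirement higher still.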
In particular there’s this little section,
It could be said that they are still people who consider a bookshelf as a mere storage place for already-read books and do not think of the library as a working tool. But there is more to it than that. I believe that, confronted by a vast array of books, anyone will be seized by the anguish of learning, and will inevitably lapse into asking the question that expresses his torment and his remorse.
But, as they say, read the whole thing.
C, I thought you might be interested in this, a blog about preventing infections in hospitals.
Since a government shutdown is looming, and there are people out there all over the internet bashing the ACA this weekend in an attempt to provide political cover for the GOP on this issue, I’d like to address a couple of misguided myths about Obamacare. This is not a “debunking” because I’m not offering evidence:
- Obamacare will ration care so that people will have to wait weeks for appointments.
Our current market-based, private insurance system already creates these kinds of barriers to specialist care. When I go to schedule Neurology, Orthopedics, Dermatology, or Mental Health referrals for my patients, it typically takes months (not weeks) to schedule new appointments. The only way to make appointments sooner is if you already have been seen by the specialist within 3 years (and are not a “new” patient).
- Obamacare will make healthcare a bureaucratic nightmare.
If you’ve been to the doctor or a hospital in the last 10 years, you know it is already a bureaucratic nightmare. This is a result of private insurance companies’ rules and regulations, not government interference.
- Obamacare will increase my health insurance premiums.
Any changes to your health insurance premiums in the next couple of years are overwhelmingly likely to be a result of your employer shifting more of the share of your annual insurance premiums to you. It was not that long ago that most employers were covering 80%, 90%, or 100% of their employees’ premiums, but no longer. This cost-shifting from employers to employees has been going on for several years now (e.g. the State Health Plan), but expect the trend to continue and the employee’s share to grow and grow and grow. And given how much health insurance costs, all it takes is a small 10% decrease in the employer’s share to equal a $1500 increase in your annual premiums…
- Obamacare will cause a death-spiral of sick people coming into the insurance exchanges.
Bans on pre-existing condition exclusions may bring some small number of sicker people back into the insurance pool, but this will be swamped by the large influx of healthy individuals entering the insurance pool.
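For the premium point above, the arithmetic is easy to check. This is a minimal sketch in Python under two assumptions of mine: that the total annual premium is $15,000 (the figure implied by the $1500 example, not one stated above), and that a “10% decrease” means the employer’s share drops by ten percentage points, here from 80% to 70%.

```python
def employee_cost(total_premium, employer_pct):
    """Dollars per year the employee pays when the employer covers `employer_pct` percent."""
    return total_premium * (100 - employer_pct) // 100


total = 15_000  # assumed total annual premium, in dollars
before = employee_cost(total, 80)  # employer covers 80%
after = employee_cost(total, 70)   # employer covers 70%

print(after - before)  # 1500
```

A more expensive plan, or a bigger shift in the employer’s share, scales that number up proportionally.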