Stealing, Copying, and the 4,000,000-Document Non-Theft

Recently, a young man named Aaron Swartz was arrested after an incident at MIT. Apparently, he had been using MIT’s network to download millions of documents from JSTOR, an online archive of scientific journal articles and the like. (Access to JSTOR is purchased by universities such as MIT and Harvard, where Swartz is an employee.)

What can this incident — and reaction to it — tell us about copyright in the U.S.? In this article, I’ll paint this incident as a symptom of larger copyright issues in our society. I’m not going to necessarily argue whether copyright is good or bad, but I will argue that public perception of copyright has been twisted away from all common-sense definitions — twisted by design. By the end, I hope to change the way you think about “intellectual property”, or at least cause you to consider how your views are shaped by the terms used in the debate.

Let’s start with Swartz.

The Immediate Response

Millions of files downloaded?  Most of them copyrighted files?  What a monstrous act of theft!

These would be your thoughts if you read the headlines in the immediate aftermath of the incident. The New York Times said “Internet Activist Charged in M.I.T. Data Theft” and “Open-Access Advocate is Arrested for Huge Download”. The Boston Herald said, “Cambridge Man Accused of Stealing 4 Million Documents in Huge MIT Hack”. Similar headlines appeared in the Huffington Post and the Register. One site even put up the headline “Millions of papers stolen from M.I.T. archives”.

There’s a gigantic problem with these headlines that has to do with their use of that ugly word, “theft”, and its cousins. But we’ll get to that later.

First, there’s a different and even bigger problem with these headlines, and it is this: They’re false. Factually incorrect.

The indictment is available as a pdf here. The top of the first page lists the charges against Swartz. Copyright violation isn’t one of them. It’s not on there. Nor do any of the charges mention “theft”. Every single newspaper headline I’ve just listed is a lie. Each of them got the story completely wrong.

How did this happen, and why?

Backing Up: The Events In Question

JSTOR is an archive of scientific journal articles that are freely available to members of many institutions, especially universities. As a Harvard employee, Swartz clearly had legal free access to the files. (The fact that he accessed them while on MIT’s campus does not seem to change this.)

However, (it is alleged that) he took his privileges too far. On repeated occasions, he used MIT’s network to download very large numbers of files. Despite being blocked and told to stop, he found ways around the blocks. (He was also apparently downloading so much data so quickly that it threatened to take down JSTOR’s servers.) Thus, the criminal charges filed against Swartz are mainly related to computer fraud.

The Copyright Angle

Now, the Internet and copyright violation are deeply intertwined in the public consciousness of the United States.  Powerful interests in industry — namely, the RIAA (music) and MPAA (movies) — have gotten the attention of powerful interests in the government. This has become evident not only from the United States’ own laws and enforcement, but from, for example, recent revelations (via Wikileaks) on the U.S. pressuring countries such as New Zealand and Canada to pass draconian and/or unpopular copyright laws.

So it’s not surprising, in a case like this, to see copyright issues come to the forefront — in the eyes of the media, and in the eyes of the prosecutors.

But yeah, ok, it turns out that copyright was a non-issue; Swartz did in fact have legal access to these scientific articles. The only complaint is the fraud: after being asked to stop, he hid his identity and continued. But even if it wasn’t technically a copyright issue, it’s good that JSTOR was protected from all this “theft”, right? I mean, surely it’s important to give JSTOR legal recourse to defend itself from “pirates”. So I wonder, how is the law protecting JSTOR? What charges is JSTOR pressing, and what legal actions is it taking?

The Injured Party Gets Theirs

Well … actually, JSTOR has dropped all charges and has no interest in pursuing legal action.

Wait, what?

Yes. You can read their statement here, on their website. I quote:

It was the government’s decision whether to prosecute, not JSTOR’s. As noted previously, our interest was in securing the content. Once this was achieved, we had no interest in this becoming an ongoing legal matter.

Interesting. JSTOR doesn’t seem to mind much at all. Which makes sense — after all, no harm done, right?

Right, more or less. So why is Aaron Swartz now facing charges carrying up to 35 years in prison? And for what?  Here are the things he (allegedly) did:

  • Downloaded documents he was legally allowed to download.
  • Violated a website’s Terms and Conditions — the website (JSTOR) has dropped all legal action.
  • Used so-called “hacking” techniques to avoid detection — things like changing his computer’s IP address or MAC address — along with basically playing laptop hide-and-seek with MIT police.

Now, maybe you live in a place without much crime. Maybe your city doesn’t have murders or rapes; maybe your car has never been broken into; maybe you weren’t into real estate in ’08 and you’re not into banks right now. But I’m thinking that the U.S. government might have better things to do with its resources than focus them on an apparently victimless “crime”, and the media might have more important stories to cover than a copyright violation that never happened.

For the record, I certainly don’t condone computer fraud. I also don’t condone “theft”, or “stealing”. In my opinion, these things are wrong. They’re also against the law (as is copyright violation), and I don’t think people should do them.

But I’d like to ask a bigger question. If the legal system isn’t protecting JSTOR here, then who is it protecting?

I think this case has already shown that there’s a huge sensitivity in the U.S. to issues relating to copyright. This incident, which has very little to do with copyright law, prompted massive headlines of “Theft! Stealing!” and so on. Is this reasonable? Why are so many people so defensive of “intellectual property”? How did we get to the point where a scholar downloading free journal articles prompts headlines of “Massive Theft”? Why does the textual description in Swartz’s indictment use the word “steal” several times, even though none of the actual charges are related to theft (or even to copyright)?

We must go deeper.

The Tools Of A Tyrant

How strangely will the Tools of a Tyrant pervert the plain Meaning of Words!

Samuel Adams.

He who wants to persuade should put his trust not in the right argument, but in the right word. The power of sound has always been greater than the power of sense.

Joseph Conrad.

It’s time we talked about the difference between “information” and “property”, and the difference between “copy” and “steal”. I’ll show that the copyright industry has achieved two triumphs of deception simply by conflating those pairs of terms. Regardless of how strong you think copyright law should be, this is not about copyright law itself — this is about how far our perceptions have been twisted from what copyright law really is.

Copyright Industry Triumph #1: The Term “Intellectual Property”

I present to you Exhibit A. Glance again at Conrad’s quote above, and think about the phrase “intellectual”  “property”.

As my philosophy professor might like to say, whoever came up with that term should be shot. Every time someone utters that phrase, they are using the sound of a familiar word, “property”, to overpower our sense of what copyright and patents actually are.

Let’s begin with the tangible. Property means, well, possessions: things that can only belong to one person or group at once. My desk lamp is my property; if you come and take it, I no longer have it. The branch of law governing these things is appropriately titled Property law, or possession law.

On the other hand, information can belong to many people at once. If you have an idea, you can share it with me, and then we both have the idea; and furthermore, you can never take that idea back from me. However, you can make laws that limit how I can use it: these are patent and copyright laws.

Notice a subtle distinction here, thanks to the age of computers. Say I have my money sitting in a digital bank account, and you steal my debit card and transfer money out. The money was represented digitally, in zeros and ones, but we can clearly see that the money was still property. Once you took it, I no longer had it. Property belongs to one and one alone.

On the other hand, imagine that while you are sitting on a park bench reading a book, I am secretly looking over your shoulder and writing down every word. Say I then take these notes to Kinkos and have them bound up to look exactly like your book. In a sense, I now “have” your book. But what I obtained from you was not property — you still have your book. The essential act here was the dissemination of information. You still have your information; I now have mine. If I read the book out loud to a friend, I have copied the information yet another time. Information is shared among everyone who sees it.

Now we can see the great duplicity of the phrase “intellectual property”. It takes an idea we are all used to, something intuitive and familiar (property, or possessions), and uses it to label a concept which is fundamentally opposite. A possession can only ever belong to one owner at a time; information belongs to the whole world. The phrase “intellectual property” gives the entirely false impression that information acts like a possession. In fact, the two could not be more different.

Understanding so-called “Intellectual Property” — What Is Copyright Law?

Congress shall have Power […] To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries.

The Constitution.

Copyright law IS NOT property law. Property law is about owning possessions.

Copyright (and patent) law is about controlling access to and use of information.

For example, say I own a patent on a mechanical desk lamp. Do I “possess” the idea of this desk lamp? No. You (or anybody else) are perfectly free to know everything about it. You can read all the schematics of how it works online at the U.S. Patent Office. What my patent gives me is control over how you use that information. You are not allowed to use it to create copies of the desk lamp without my permission.

I restate:

It is literally impossible to “own” ideas or information. It is possible only to control the use of those ideas or that information.

(Interlude — it’s not automatically clear that we should control people’s use of information. As the Constitution states, however, the idea behind copyright/patent law is that giving people exclusive control over the ideas they create, for a brief period of time, gives people a monetary incentive to create new ideas.)

One reason the copyright industry has gotten away with confusing property and information is that, often, copyrighted information is encoded in some physical piece of property: a CD, a book, etc. But the ideas are separate. The crime being committed in copyright violation has nothing to do with information being “taken”. (Sometimes you can’t help but access information — like when you hear it on the radio, or see it on your TV screen. You haven’t “taken” anything — information cannot be taken, just spread.)

In violating copyright, the crime committed has to do with the information being used in a manner disallowed by the copyright holder. If you have paid a certain royalty, you are granted certain uses of a music track — for example, you can listen to it on your own computer. But you are not permitted to copy it for a friend. If you haven’t paid the royalty, certain uses are still allowed — for example, you’re allowed to listen to the song on the radio. But your legal uses are more restricted.

Ok. The good news is that the preceding section was the long part: coming to grips with the difference between information and property. To understand the distinction, we have to realize that it is both a conceptual distinction (information can be shared among many people, while property belongs only to one) and a legal distinction (property is owned exclusively by a single person or group, while copyright law merely grants temporary control over who gets to use a piece of information, and how).

Once we come to this realization, the copyright industry’s second tour de force — the second perversion of the plain meanings of Words — is very simple. You can go ahead and begin the drumroll.

Copyright Industry Triumph #2: The Redefinition Of “Theft”

“Theft” and “Stealing” are precise terms. They have exact meanings both legally and in English: see definitions [1] [2] [3] [4] [5] [6]. All of these definitions make one thing very clear: When something is stolen, property (a possession) has changed hands. Something that used to belong to one person now belongs to another; the first person no longer has it. (Sidenote: stealing can also mean plagiarism, which has nothing to do with copyright, but merely means attaching one’s name to the work of another.)

“Copying” is also a precise term. It is a duplication of something that already exists, without affecting the original.

And so: Now that we have taken a moment to define our terms, it is absurdly simple, this great duplicity which has been perpetrated right under our noses: We can see quite clearly that

Information can only be copied, never stolen. Information can be used without permission, but it can never be taken from anyone.

This is the fundamental truth that was simply ignored by all those newspaper articles proclaiming “Data Theft”, ignored even in part by the prosecutors. This is the simple fact obscured by the tyrannical word “theft”, so often applied by the media and the government in a place where it is simply inapplicable.

I merely repeat a definition: copyright laws give a party temporary control over how others are allowed to use information. Merely by understanding the definitions of what we’re talking about, we cannot escape the conclusion:

Copyright laws have nothing to do with “property” and nothing to do with “theft”.

Okay, But–

At this point, it is easy to get cold feet. We may accept these definitions logically, but find it hard to really “feel” them. What about, for instance, the RIAA’s assertions that every “pirated” song is a song that’s not being purchased? Sure, you’re not stealing any property from anyone when you download a song — but if the artist ends up with less money than they would have had, isn’t it kind of like you stole from them?

People tend to beat around the bush when responding to this point. They like to point out, for instance, that “music pirates” actually tend to be good consumers. But this doesn’t respond to the moral point. The point made here is that violating copyright is basically the same thing, morally, as stealing.

In fact, what makes the argument difficult to respond to is that it treats ideas as possessions. We need to readjust our thinking. OK: Suppose you own a musical instruments shop. You sell instruments to people, or you charge people an hourly rate to come in and play your instruments. Now suppose someone stole your drum set: you’d be very angry with her! But now suppose she just snuck in the back and played your drums for a while without you noticing, and left without paying for her half-hour of fun. There were no other customers in the store waiting to play, so she did not take anything from you. And yet, you’re not exactly happy with her either.

This is the crime of copyright violation — the crime of use without permission. If we feel any other way about it, it is because we have been conditioned to feel that way by material objects — by books and CDs. The seeming connection between information and possession has fooled our intuition.

If it’s still difficult to grasp the intuitive difference between copyright violation and stealing, I ask you to consider this fact:

All information is inherently gifted to the world.

That sounds flowery, but I mean it pretty much literally. When an idea or piece of information is created, it leaves its creator behind forever. It has to spread out into the world and it cannot be taken back. For example: A song can never be unsung. If a song is sung just once, to just one person, then they can always keep that song in their head. They can sing it to someone else, and they to someone else again. The song does not belong to some particular creator; it belongs to the whole world. This is not philosophy. This is the way information works. Contrast that with a handbag, which is first owned exclusively by its creator, then sold exclusively to a second person (and then dumped exclusively in a closet corner).

Now, we can impose artificial conditions on the use of information, such as that song mentioned above. If the song is copyrighted, it still can’t ever be taken from the first listener’s mind, but the copyright-holder can legally prevent that person from singing it. Copyright infringement is nothing more nor less than singing a forbidden song. You may have seen a movie on TV seven different times; but to play it for yourself on Blu-ray, you must legally pay a royalty fee. You already “have” the movie in some sense, but you must pay a fee to “use” it.


  • Information is not property. Information can never be owned, it can only be spread from person to person.
  • Information cannot be stolen. It can only be copied.
  • Copyright law is about granting permission to use information.

Okay, But So What? Copy/Steal, Information/Property. Is This All Just Semantics?

It’s time to return, finally, to the story of Aaron Swartz. Here we have an example of a person accused of massive-scale theft, both by the media and, implicitly, by the government in its indictment. But the fundamental act here was copying of information: the spread of ideas. No possessions changed hands. Nobody is any poorer for the encounter. To top it off, in this respect no law, not even copyright law, was violated.

And yet, we have the following statement from the United States attorney for Massachusetts (source: NY Times):

“Stealing is stealing, whether you use a computer command or a crowbar, and whether you take documents, data or dollars,” the United States attorney for Massachusetts, Carmen M. Ortiz, said last week in a statement about the case. “It is equally harmful to the victim whether you sell what you have stolen or give it away.”

If our own government officials are so brainwashed that they do not know the difference between theft and copyright infringement, how will our society have a chance of taking a balanced approach to these issues?

The responses to this incident, I argue, are symptomatic of a copyright industry, a government heavily lobbied by them, and a media completely taken in by them, all of whom are rabidly obsessed with the oxymoronic idea of so-called “intellectual property”. I repeat — quote unquote “intellectual property” is an oxymoron, a paradox, a logical impossibility.

These semantics are important because the copyright industry has taken morally charged terms — “property” and “theft” — and attached them to concepts which simply have nothing to do with either. This has allowed them to turn our emotions against us. I believe that humans innately find the spreading of information good/useful and the taking of property wrong/evil. The copyright industry has fooled us by taking a term for the latter and attaching it to the former. I can’t put it more bluntly than that.

I’m not saying copyright law is useless. But I’m suggesting that we often don’t understand what’s really happening here, especially when the debate is framed in fundamentally misleading terms. The effect? A society in which the RIAA sues students for millions of dollars. We have a legal system in which a jury awarded the RIAA $1.5 million from a working mother of four: $62,500 for each song she shared. Imagine if you gave someone a loaf of bread and got fined $60,000 for it. Now imagine telling them an Aesop’s fable and being hauled into court. Imagine singing someone a song and being called a thief. (You can argue whether copyright law itself is good or bad, but this issue lies outside that debate. It’s about a society that misunderstands its own laws.)

If it is a crime to read over someone’s shoulder, so be it. But for our own sakes, let’s keep a sense of perspective!

This is a society where the RIAA sought $1.65 trillion from a website for violating copyright laws. That’s enough money to feed every man, woman, and child in the U.S. for a year. Oh, how I wish you could do that with songs. Funny story about that suit, though — the site did not violate any copyright laws of the country in which it was located, Russia, and the RIAA eventually dropped the issue. The funny part: Russia apparently shut down the site anyway because the U.S. threatened to block their entrance into the World Trade Organization otherwise.

Why did our government value its relationship with the RIAA more than its relationship with Russia? I don’t know, but it surely has to do with an obsession over copyright laws and the struggle to own information: to possess that which admits no possessor.

Ladies and gentlemen, if you are still not convinced that the misuse of these terms is a vital problem in American government, I ask that you consider just one more reference: The Protect IP Act, a bill currently being proposed in the U.S. Senate. Or, to cite its full title:

The Preventing Real Online Threats to Economic Creativity and Theft of Intellectual Property Act

Our government is trying to title a law as preventing “intellectual property” (something that does not exist) from being “stolen” (something that cannot be done).

What more can I possibly say?

In summary, I submit to you three claims:

  • Current understanding and interpretation of so-called “intellectual property” is deeply warped: morally, emotionally, and in implementation/interpretation.
  • This has caused a huge imbalance in how copyright/patent law is perceived by the citizens, media and government of the U.S.
  • This warping is in large part worsened by the Tyranny of Words: mis-definition, which is (of course) encouraged by the copyright industry.

And therefore, I restate my central points (which are mainly just dictionary definitions):

  • So-called “intellectual property” is not property, it is information.
  • Information cannot be stolen, it can only be copied.
  • Copyright and patent laws do not grant ownership of an idea (information), but rather temporary control over the use of that information.
  • Using information without permission is morally separate from stealing property.

Finally, I submit to you a humble request in two parts:

  • Do not accept or tolerate the use of the word “theft” when referring to a use of information without permission. When you see it, consider that either someone is attempting to manipulate your emotions and obscure what’s actually happening, or that person has fallen victim to manipulation without realizing it.
  • When you are tempted to think of information as a possession, recall that information by its nature “belongs” to everyone who has seen it. Copyright law is about placing limits on what a person is allowed to do with that information.

This article is in the public domain. I place no copyright restriction upon anything here. Instead, I place a moral restriction: follow your conscience.

