Words, Ideas, and Things: 2012

Thursday, December 27, 2012

What Is The Semantic Web?

The Semantic Web is—or is hoped to be—the next revolution in the way the Internet is used, just as the World Wide Web was a revolution in the way the Internet was used. To get some perspective, we need to look back at history.

Before the Internet, computers existed as standalone machines, possibly with multiple monitor/keyboard terminals spread around a building. For long distance connections, wired circuits (think modems) had to be brought up and then maintained throughout a session. Local networks existed, but each network vendor had its own incompatible system. There wasn't a standard way of communicating across networks.

The Internet began as a U.S. Department of Defense project to connect research universities. By the end of 1969, networks at four universities were connected to each other. In 1983, the communication standard of this inter-network ("between"-network) was changed to the TCP/IP protocol suite, which is still the basis of Internet communication today. With an IP address (e.g. 203.0.113.100) and a port (e.g. 25), a computer in California can connect to the email program on a computer in Germany and leave a message for a user there. Or, slightly more user friendly, a kid growing up in rural North Dakota could use the telnet application to connect to a domain name (genesis.cs.chalmers.se) along with a port (3011) to play a text adventure game running on a university server in Sweden. (They've changed the address a bit since I was in high school.)

There was useful and fun stuff going on before the World Wide Web, but it was hard to discover new resources. It's hard to believe now, but it was common in those days to learn about Internet sites by reading about them in books. Paper books! Sure, there were Gopher servers with manually-maintained hierarchical categories of Internet resources, but these directories didn't keep up very well and the resources didn't usually link to each other.

The World Wide Web began as an internal project at CERN, the particle physics research center on the border of Switzerland and France. Researchers needed a better way to organize their information in a busy environment with lots of job turnover, so Tim Berners-Lee proposed a solution for CERN intentionally designed to work on a global scale as well. He wrote:

"a 'web' of notes with links (like references) between them is far more useful than a fixed hierarchical system. When describing a complex system, many people resort to diagrams with circles and arrows. Circles and arrows leave one free to describe the interrelationships between things in a way that tables, for example, do not. The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything." (source)

He was serious about the "anything" part, but we'll get back to that. As implemented, the circles came to represent documents and the arrows became references to other documents. Web pages linking to other web pages! The notion of document interlinking had been around for decades, but the World Wide Web turned the idea into practical, worldwide reality.

Linked documents sounds a little boring, but programmers have found ways to make web "documents" very interactive. Many other Internet applications have migrated into the web browser. Gopher was replaced by Yahoo (before Yahoo became a tabloid). Home users are more likely to use web mail than a standalone mail client. Twitter and Facebook have largely replaced IRC and other instant messaging clients. Web services (and web mail, unfortunately) are used to transfer files instead of FTP. It's a good thing that applications like Skype and BitTorrent exist, or people might forget there's a difference between the Internet and the World Wide Web!

What's next?

Many great things happened after we started linking documents; what if we try linking finer-grained pieces of data in usable ways? That's the idea behind the Semantic Web.

Think of it this way: the World Wide Web allowed organizations and individuals to put their relatively static documents "out there" for the world to see. But what about database generated content like library catalogs, or online store pricing, or current weather conditions? Web crawlers might be able to retrieve and usefully interpret some of this data, but that usually requires special per-site programming that breaks if the API or web formatting changes.

Getting Across Town, The Semantic Way

Here is an example of a web-published bus route:

http://lincoln.ne.gov/city/pworks/startran/routemap/weekday/route41.htm

An experienced bus rider can read this page and figure out how to plan a trip. A computer program would need help understanding how to parse all of this visually-structured data into precisely labeled information that it can reason about. Quick, what time does the last southbound bus leave "North Walmart" on Thursdays? It's not a trivial process to give that answer, even after we visually interpret the numbers as times in columns that correspond to bus stop locations on the map below. An even harder question might be: "I'm at arbitrary location X and want to reach location Y; what bus route gives me the shortest total walking distance?" In this case, a human on the right website might still have to manually look through all bus route pages, narrow it down to a couple of likely shortest routes, then spend more time comparing the tradeoff between walking farther to the first bus stop or walking farther from the last bus stop.
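To make "precisely labeled information" a bit more concrete, here is a minimal Python sketch. Everything in it is made up for illustration: the field names, the "Downtown" stop, and all of the times are hypothetical and are not taken from the StarTran page. The point is only that once each value carries its own label, the "last southbound bus" question becomes a short filter instead of a visual puzzle.

```python
from datetime import time

# Hypothetical timetable rows. Each value is labeled explicitly instead of
# being implied by its column position in a visual table. Times are invented.
departures = [
    {"stop": "North Walmart", "direction": "southbound", "days": "weekday", "departs": time(6, 15)},
    {"stop": "North Walmart", "direction": "southbound", "days": "weekday", "departs": time(18, 45)},
    {"stop": "North Walmart", "direction": "northbound", "days": "weekday", "departs": time(19, 5)},
    {"stop": "Downtown", "direction": "southbound", "days": "weekday", "departs": time(19, 30)},
]

def last_departure(rows, stop, direction, days):
    """Return the latest departure time matching all three labels, or None."""
    times = [
        row["departs"] for row in rows
        if row["stop"] == stop and row["direction"] == direction and row["days"] == days
    ]
    return max(times, default=None)

print(last_departure(departures, "North Walmart", "southbound", "weekday"))
```

The harder walking-distance question would need street map data too, which is exactly the kind of combination the next paragraphs are about.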

What would be really neat is a way for bus services and street map services to publish their data on the web in a computer-friendly form that allows third party web apps to combine all of this information and calculate answers to such questions. Even better: a universal format so mash-ups from unexpected combinations of data sources are easier to make. I'm thinking of a music app that checks your GPS position and your destination so it can create a playlist that ends within thirty seconds before your final stop. Or an emergency flight plan app that cross references ticket pricing options with weather predictions. Or a recipe web site that lets you mark missing ingredients and shows their pricing from the five closest stores. Or a personalized book recommendation site that filters by currently available titles in local public libraries. Or imagine searching the web for information on a brand-name drug, and the top results use the drug's generic name without mentioning the brand-name.

Many of these things are possible without semantic web technology; they just require more work to set up and don't tend to be very reusable. For example, Google Transit can help with bus route planning, if a city has formatted their data specifically for this Google web app and joined the transit partner program. But what if a new business wants to reuse this information in a creative way? What if Google cancels the Transit service? It would be preferable to have an open standard for open data.

Linked Data

What's the plan, then? Open existing relational databases to the public? Not exactly. The World Wide Web Consortium is pushing for another database model that's a more natural fit for the web: a graph-style data model. From the Wikipedia article:

"Compared with relational databases, graph databases are often faster for associative data sets, and map more directly to the structure of object-oriented applications. They can scale more naturally to large data sets as they do not typically require expensive join operations. As they depend less on a rigid schema, they are more suitable to manage ad-hoc and changing data with evolving schemas. Conversely, relational databases are typically faster at performing the same operation on large numbers of data elements."

In other words, graph databases are less efficient but more flexible (see also The Death of the Relational Database). For people who aren't math majors or computer programmers, "graph database" may sound like "graphical database." But what's meant is graph theory: a bunch of nodes and connections between nodes, usually visualized as circles and lines. A directed graph adds direction to those lines, so you get circles and arrows. Recall what Tim Berners-Lee wrote in his original proposal for the World Wide Web: "The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything." The World Wide Web is made of connections like this:

(http://en.wikipedia.org/wiki/Cat) --links to--> (http://www.catpert.com/)

Each URL (Uniform Resource Locator) is a circle and web links are the arrows. If you can imagine all URLs and all arrows between them as a gigantic diagram, you're visualizing the World Wide Web as one big directed graph.

Now imagine that the circles can stand for anything, not just web documents. Imagine that the arrows can stand for any relationship, not just navigation links.

(rain gauge #2,388) --detected rain depth--> (3 cm)
(rain gauge #2,388) --time since last emptied--> (60 min)
(rain gauge #2,388) --location--> (Millennium Stadium)
(Cardiff) --contains--> (Millennium Stadium)

A web app that has access to this information can now answer the question, "How much has it rained in Cardiff in the last hour?" "An average of 3 cm, as reported by 1 rain gauge." Or with more gauges it might be, "An average of 2.95 cm, as reported by 15 rain gauges." These (something) --related somehow--> (something) snippets of information, called triples, can combine together into complex graphs of data. And, like web pages, this can happen across servers. The rain depth information could be on one server that only knows the gauge is in Millennium Stadium, while another server knows that Millennium Stadium is in Cardiff. In fact, it makes sense to reference a separate server with lots of geographical knowledge rather than trying to maintain geographical info on a specialized rain gauge server. If the geography server is updated, the rain server automatically and instantly benefits! This is an example of the synergy that can happen with linked data.
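Here is a rough Python sketch of that rain gauge reasoning. It is not how a real linked data app is built; the two lists merely stand in for two separate servers, the function name is my own invention, and the sketch skips the "last hour" check to stay short.

```python
# Each triple is (subject, predicate, object), mirroring the arrows above.
rain_server = [
    ("rain gauge #2,388", "detected rain depth", "3 cm"),
    ("rain gauge #2,388", "time since last emptied", "60 min"),
    ("rain gauge #2,388", "location", "Millennium Stadium"),
]
geo_server = [
    ("Cardiff", "contains", "Millennium Stadium"),
]

def average_rain_cm(triples, city):
    """Average the depth over gauges located in places the city contains.

    Returns (average, gauge_count), or (None, 0) if no gauge can be linked
    to the city.
    """
    places = {o for s, p, o in triples if s == city and p == "contains"}
    gauges = {s for s, p, o in triples if p == "location" and o in places}
    depths = [
        float(o.split()[0]) for s, p, o in triples
        if s in gauges and p == "detected rain depth"
    ]
    if not depths:
        return None, 0
    return sum(depths) / len(depths), len(depths)

# Concatenating the lists stands in for following links across servers.
avg, count = average_rain_cm(rain_server + geo_server, "Cardiff")
print(f"An average of {avg:g} cm, as reported by {count} rain gauge(s).")
```

Notice that the rain server alone can't answer the question: without the geography triple, nothing connects the gauge to Cardiff.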

Wait, Where Are These Factoids?

Regular web links are in web pages and point to other web pages; we're used to that by now. But where are these triples located? They can be embedded into web page code in the form of RDFa. Graph databases called triplestores can also be put on the Internet and directly queried, much as a SQL database could be if it weren't hidden behind an intermediary website. In either case, typical Internet users won't "see" the Semantic Web directly as they see the World Wide Web's documents and links. The Semantic Web exists as a programming-oriented sibling or add-on to the World Wide Web, not as a replacement. Applications use the Semantic Web to enhance traditional web services.
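Real triplestores are normally queried with SPARQL, a dedicated query language. The Python below is only a toy imitation of the underlying idea, which is pattern matching over triples with some positions left open. The `match` function and the in-memory `store` list are hypothetical stand-ins, not any actual triplestore API.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o) for s, p, o in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]

store = [
    ("rain gauge #2,388", "detected rain depth", "3 cm"),
    ("rain gauge #2,388", "location", "Millennium Stadium"),
    ("Cardiff", "contains", "Millennium Stadium"),
]

# "What do we know about rain gauge #2,388?"
print(match(store, subject="rain gauge #2,388"))

# "What is related to Millennium Stadium, and how?"
print(match(store, obj="Millennium Stadium"))
```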

What Makes the Semantic Web "Semantic"?

In philosophy, linguistics, and computer science, semantics has to do with meaning in contrast to syntax (which has to do with structure or format). Remember ad-libs?

The [adjective] outlaw [transitive past tense verb] a [common noun].

So long as these blanks are filled in with the specified parts of speech, the resulting sentence will be syntactically correct; it will have the right format for an English sentence. For example:

The lonely outlaw whistled a tune.
The law-abiding outlaw drank a mortgage.

The second sentence may have proper syntax, but it's nonsense. Because of their meaning, certain words and phrases don't go well together, at least not in a literal sense. Something else to consider:

This isn't a dog, it's a doberman pinscher.

Again, nothing wrong with the syntax, but a doberman pinscher is a type of dog. Another case:

There were witch trials in Salem.

The truth of this sentence depends (in part) on which Salem is meant. It's a true claim when referring to Salem, Massachusetts. It's false for Salem, Iowa...and many other Salems. In standalone databases, ambiguities and mis-matched concepts like these aren't much of a problem. A database created for a certain purpose in a certain context has implicit restrictions on the meaning of its data. A Massachusetts newspaper database and an Iowa newspaper database are going to mean something different by just plain "Salem." What happens if we try to publish all of these databases on the web and expect the data to mesh well together? Chaos, unintentional humor, and a general lack of usefulness!

For this reason, the Semantic Web has to be about more than just publishing everyone's data as (subject) --predicate--> (object) triples. Here's a flawed set of triples:

(witch trials) --took place in--> (Salem)
(Tom) --born in--> (Salem)

Was Tom born in the same city that the witch trials took place in? We can't tell because we don't know if the two "Salem"s are the same, or which "Tom" is meant. To solve this problem, URIs (Uniform Resource Identifiers) are used, roughly like this:

(http://dbpedia.org/resource/Category:Salem_witch_trials)
--(http://sw.opencyc.org/2008/06/10/concept/en/eventOccursAt)-->
(http://dbpedia.org/resource/Salem,_Massachusetts)

(http://dbpedia.org/page/Thomas_Poulter)
--(http://dbpedia.org/ontology/birthPlace)-->
(http://dbpedia.org/resource/Salem,_Iowa)

In this case, the "Tom" in question was born in a different Salem. If the URIs had matched up, it would have been possible to draw a new conclusion along the lines of (Tom) --born where occurred--> (Salem witch trials). Why call these URIs rather than URLs? Because they don't necessarily correspond to a visitable web page, although it's considered best practice to make such a page available when possible. A URI can identify a resource (or a concept!) without necessarily providing a location.
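A small Python sketch of that comparison, for the curious. The first two triples reuse the URIs shown above; the `example.org` person is a hypothetical extra I added so there is something that does match. Because URIs are just strings, "are these the same Salem?" turns into a plain equality check.

```python
OCCURS_AT = "http://sw.opencyc.org/2008/06/10/concept/en/eventOccursAt"
BIRTH_PLACE = "http://dbpedia.org/ontology/birthPlace"
SALEM_MA = "http://dbpedia.org/resource/Salem,_Massachusetts"
SALEM_IA = "http://dbpedia.org/resource/Salem,_Iowa"

triples = [
    ("http://dbpedia.org/resource/Category:Salem_witch_trials", OCCURS_AT, SALEM_MA),
    ("http://dbpedia.org/page/Thomas_Poulter", BIRTH_PLACE, SALEM_IA),
]

def shared_objects(triples, predicate_a, predicate_b):
    """Find (subject_a, subject_b, object) where both triples point at the same URI."""
    return [
        (s1, s2, o1)
        for s1, p1, o1 in triples if p1 == predicate_a
        for s2, p2, o2 in triples if p2 == predicate_b and o1 == o2
    ]

# Different Salems, so nothing can be concluded about Tom.
print(shared_objects(triples, OCCURS_AT, BIRTH_PLACE))

# Hypothetical person (made-up URI) born in the Massachusetts Salem.
extended = triples + [("http://example.org/person/Hypothetical_Tom", BIRTH_PLACE, SALEM_MA)]
print(shared_objects(extended, OCCURS_AT, BIRTH_PLACE))
```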

Did you notice that the URIs above come from both dbpedia.org and opencyc.org? There isn't a single, authorized web domain for the URIs used in linked data. Different organizations can contribute to the pool of URIs. What if two organizations use different URIs for the same thing? There's a triple for that!

(http://dbpedia.org/resource/Salem,_Massachusetts)
--(http://www.w3.org/2002/07/owl#sameAs)-->
(http://sw.cyc.com/concept/Mx4rvViiFpwpEbGdrcN5Y29ycA)
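One way a program might act on a sameAs triple is to pick a single representative URI for each group of equivalent URIs and translate everything to that representative before comparing. The sketch below is my own simplified approach (a bare-bones union-find), not a description of how any particular reasoner works; `canonical_map` is a hypothetical helper name.

```python
SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def canonical_map(triples):
    """Map every URI that appears in a sameAs triple to one representative per group."""
    parent = {}

    def find(uri):
        # Walk up the chain until reaching a URI that represents itself.
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            uri = parent[uri]
        return uri

    for s, p, o in triples:
        if p == SAME_AS:
            root_s, root_o = find(s), find(o)
            if root_s != root_o:
                parent[root_o] = root_s  # merge the two groups
    return {uri: find(uri) for uri in list(parent)}

triples = [
    ("http://dbpedia.org/resource/Salem,_Massachusetts",
     SAME_AS,
     "http://sw.cyc.com/concept/Mx4rvViiFpwpEbGdrcN5Y29ycA"),
]

canon = canonical_map(triples)
for uri, representative in canon.items():
    print(uri, "->", representative)
```

After this step, data that mentions the Cyc URI and data that mentions the DBpedia URI can be treated as talking about the same Salem.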

What about mismatches between URIs for "doberman pinscher" and "dog"? As you might guess by now, a predicate (i.e. middle URI) can be used to say that a doberman is a type of dog. Then, hopefully, any computer program trying to decide if a given specimen is a dog won't stop at finding out that it's a "doberman pinscher"; it will check to see if doberman pinschers are dogs.
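As a sketch of that check: RDF Schema's subClassOf is one real predicate of this "type of" kind. The `example.org` class URIs below are hypothetical placeholders, and the function is a simple illustration of following the chain, not a full reasoner.

```python
SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"

# Hypothetical class URIs, for illustration only.
DOBERMAN = "http://example.org/DobermanPinscher"
DOG = "http://example.org/Dog"
MAMMAL = "http://example.org/Mammal"

triples = [
    (DOBERMAN, SUBCLASS_OF, DOG),
    (DOG, SUBCLASS_OF, MAMMAL),
]

def is_kind_of(triples, thing, target):
    """Follow subClassOf arrows from thing and report whether target is reachable."""
    seen = set()
    stack = [thing]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen:
            continue  # guards against loops in the data
        seen.add(current)
        stack.extend(o for s, p, o in triples if s == current and p == SUBCLASS_OF)
    return False

print(is_kind_of(triples, DOBERMAN, DOG))
print(is_kind_of(triples, DOBERMAN, MAMMAL))
print(is_kind_of(triples, DOG, DOBERMAN))
```

The arrows only run one way: every doberman is a dog, but the program has no grounds to conclude that every dog is a doberman.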

To answer the original question, what makes the Semantic Web "semantic"? All of this background work done by ontologists to separate and combine concepts and to specify the relationships among them. The Semantic Web isn't just about breaking data out of individual databases, but about publishing data in terms of these shared vocabularies and relationship schemes. For data to be useful (and reusable) in a giant, global database, the information that was implicit in the context and structure of local databases has to become explicit. The triple format does this for structure. Ontology work does this for meaning.

When Will "Semantic Web" Be a Household Name?

It probably won't ever be a term everyone knows. The semantic revolution is happening behind the scenes among scientific, business, and cultural heritage groups. If things go well, the Semantic Web will increasingly influence the average person's experience with traditional web sites and services. Even if today's technical implementation of the Semantic Web remains niche, I have no doubt that some of its motivating ideas will reappear in future technologies.

Related Reading

W3C Semantic Web (Standards Information)
Semantic University (Introductory lessons)
Linked Data: Evolving the Web into a Global Data Space (Online book)
Semantic Data Integration on Biomedical Data Using Semantic Web Technologies (Book Chapter)
Semantic Web Challenge (Yearly App Awards)
LinkedData.org (Guide to projects and resources)

Sunday, December 16, 2012

Lingo: Authority Control

http://www.flickr.com/photos/kamikazestoat/425526222

Some Delicious tags:

These are user-submitted tags to help other users find webpages on a given topic.

Suppose I just found some interesting Legend of Zelda alt art and want to link it on Delicious. Which tag do I use? legendofzelda is popular, but so is zelda. If I want everyone to see my link, I had better use both! Maybe this is good enough, but since there will still be people browsing through the other tags listed above, should I use all of them? How do I know I've even found them all? What if someone starts using the tag zeldaseries next week?

Hey, maybe someone should clean up this mess by designating an official tag for the Legend of Zelda video game series. We could call this the authorized tag. Here is a great three-part plan:

1. Decide on authorized tags for every distinct topic on Delicious.
2. Make sure that all current and future Delicious links use the authorized tags.
3. Enjoy finding all links related to a topic under one tag (and nothing unrelated)!

In Library Science lingo, steps one and two are called authority work: the behind-the-scenes work that needs to be done to have neatly organized access points to resources. Access points can be titles, names, or topics.

The Legend of Zelda: The Wind Waker (Video Game) -- a title
Miyamoto, Shigeru, 1952- -- a name
Sailing -- a topic

or a little older:

Dracula (Novel) -- a title
Stoker, Bram, 1847-1912 -- a name
Vampires -- a topic

A close synonym to authority work is authority control. I prefer to think of authority control as the goal of authority work. In other words, we do authority work to achieve a state of authority control (as in step three above). But it's more common to combine the concepts:

"Authority control is the process of bringing together all of the forms of name that apply to a single name; all the variant titles that apply to a single work; and relating all the synonyms, related terms, broader terms, and narrower terms that apply to a single subject heading." — Arlene Taylor, The Organization of Information (3rd edition), p. 44

It's not the most intuitive terminology. "Access point control" or "name deduplication" or "not having a pile of inconsistent labels" would all be better.

A Professionals Only Club?

Delicious is not likely to change its tagging system. Authority control has great benefits, but it takes a lot of extra time and effort. Delicious is fantastic for what it offers: quick-and-easy bookmark tagging and decent (if flawed) bookmark discovery.

Does this mean authority control is only in reach for professional librarians? Nope! I can think of a major Web 2.0 site that lets users participate in a kind of authority work: Wikipedia.

http://commons.wikimedia.org/wiki/File:Pommes-1.jpg

Quick! What are these called:

...pommes, chips, French fries? ...pommes frites, slap chips, Belgian fries?

Imagine separate Wikipedia articles for these variations and many more. Not desirable, to say the least. Wikipedia handles this situation by letting users decide on a single article title (e.g. French Fries) and creating redirects for alternate titles.
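In code terms, a redirect table is about the simplest form of authority control there is: a lookup from variant forms to the authorized form. Here is a minimal Python sketch. The table is hand-written for this example and the `authorized` helper is hypothetical; it is not how Wikipedia actually stores redirects.

```python
# Hypothetical redirect table: variant form (lowercased) -> authorized title.
REDIRECTS = {
    "french fries": "French Fries",
    "pommes": "French Fries",
    "chips": "French Fries",
    "pommes frites": "French Fries",
    "slap chips": "French Fries",
    "belgian fries": "French Fries",
}

def authorized(term):
    """Return the authorized title for a variant, or None if it is unknown."""
    key = " ".join(term.lower().split())  # ignore case and stray spaces
    return REDIRECTS.get(key)

print(authorized("Pommes  Frites"))
print(authorized("slap chips"))
print(authorized("tater tots"))
```

The lookup is trivial. The authority work is in building and maintaining the table, which is the part that needs all those volunteers.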

Why does this work for Wikipedia but not for Delicious? Primarily because of the number of volunteer editors willing to do this kind of behind-the-scenes work for articles. Trying to keep Delicious links organized would be much more maddening with much less payoff.

Controlled Vocabulary Resources

Not every library or website needs to come up with its own authorized titles, names, or subjects. Here are some (more or less) publicly available lists that can at least serve as a starting point:

Library of Congress Subject Headings. A very broad and inclusive set of subject terms. Academic libraries tend to re-use these for their collections. Example: Ships. Smaller libraries often use the Sears List of Subject Headings instead.

Library of Congress Name Authority File. Example: Rice, Anne, 1941-. Also see Getty's Union List of Artist Names. Example: Mondrian, Piet (Dutch painter, 1872-1944).

Library of Congress' Thesaurus for Graphic Materials. Check the three "Browse By" links on the left. Example: Nitrate negatives. Getty's Art & Architecture Thesaurus. Example: Googie.

Individuals might prefer to use vocabularies like these rather than come up with their own blog tags, image tags, or music tags. You can look beyond the library and archives scene too. If I had a music review blog, I would probably use AllMusic's genre name hierarchy. Example: Americana. Right now this doesn't do a lot of good on one blog, but the growth of Semantic Web technologies may mean better use of authorized vocabulary on the public web in the future. Or the SEO leeches might just mess that up too. Either way, you can always visit your library and take advantage of the authority control someone worked so hard to set up there!

Friday, December 7, 2012

Nicomachean Ethics (Pt. 2)

[Series introduction and table of contents here.]

Book I, Chapter 4

In my comments on Chapter 2, I described Aristotle's "grand goal" as the political art. That wasn't quite right. What he was saying back then and reiterates here in Chapter 4 is that the highest of goods is the same as whatever the political art's goal is. He sees politics as the most encompassing activity in human life, so its goal would be the most encompassing goal. And what is the goal of the political art? Happiness.

All human activities are subordinate to politics and politics is aimed at happiness. Got it. Aristotle doesn't feel the need to argue for the answer of "happiness" because he takes it as universally accepted by both "the many" and "the refined." (Yes, he's just a tad elitist.) He does note that "the many" give a variety of explanations for what constitutes happiness, e.g. health, wealth, pleasure, etc.

"Certain others, in addition, used to suppose that the good is something else, by itself, apart from these many good things, which is also the cause of their all being good."

"Certain others" being Plato and friends, obviously. It's interesting how Aristotle puts some distance between himself and this view. Before he elaborates, however, he goes off on another tangent about arguing from principles vs. arguing to principles. Why does he do this? I think it's because he wants to excuse himself from starting with Plato's principles. He actually names Plato as someone who understood these two different directions of argument. He's tip-toeing around his audience's reverence for his own former teacher. Aristotle is firmly on the side of arguing to principles, which might sound bad until you realize he's trying to be more of a scientist than an ideologue; he wants to use induction to discover what the true principles are from "things known to us" rather than "things known simply."

"Perhaps it is necessary for us, at least, to begin from the things known to us."

See, he's not being arrogant by going his own way from Plato. He's being extra humble.

Book I, Chapter 5

There are three "especially prominent" ways of life:

The life of enjoyment. This is what "the many" choose to pursue, though some rulers do as well. Aristotle calls this "the life of fattened cattle." These people think happiness and pleasure are the same.

The political life. The "refined and active" live the political life by pursuing honor...or maybe virtue. Aristotle considers the possibility that honor is more of a reaction people have when they encounter a person with virtue, which would make virtue the primary goal. He's not quite happy with this result, however, since there are many cases where the exercise of virtue and happiness seem at odds.

"For it seems to be possible for someone to possess virtue even while asleep or while being inactive throughout life and, in addition to these, while suffering badly and undergoing the greatest misfortune. But no one would deem happy somebody living in this way, unless he were defending a thesis."

Funny! But I have to wonder if Aristotle is being overly dismissive of the possibility of being fulfilled and happy despite great suffering, because a person is so overwhelmingly interested in what they're accomplishing.

The contemplative life. A footnote here says that Aristotle doesn't get around to explaining the contemplative life until Book X, Chapters 6-8. I've already seen how easily distracted he is, but this has to be some kind of record! Is "sophistication" a Greek word meaning "disorganized"?

Book I, Chapter 6

Aristotle argues that good can't be a Platonic form (see the "Certain others..." block quote above) because, roughly:

For something to have a Platonic form, its expressions must pertain to a "common idea."
Good can pertain to both what something is and its relations to other things.
What something is is an essential property.
How something relates to other things is an accidental property.
A common idea can't be both essential and accidental.
Therefore good can't be a Platonic form.

He goes on to list other difficulties in understanding good as a single idea. But then he admits that maybe we can divide instances of good into "things good in themselves" and things that "are advantageous" so we can consider whether the multiplicities of good might only be a problem for the latter category (what philosophers today call "instrumental good"). Perhaps there is a single idea common to all things good in themselves. For example, what if the idea of good itself is the only thing that is good in itself? Aristotle calls this "pointless."

In order to avoid pointlessness, it must be the case that all instances of things that are good in themselves outwardly manifest good in a common way, "just as the definition of whiteness is the same in the case of snow and in that of white lead." Aristotle believes that "honor, prudence, and pleasure" are good in themselves because people pursue these things for their own sake (even if they also pursue them in an instrumental sense). He doesn't see how the good of honor and the good of pleasure, for example, manifest in a common way, so good can't be a Platonic form even if we set aside instrumental goodness.

Now Aristotle has a problem. Why the heck do we call all of these disparate things "good" if they don't share a common idea?

"For they are not like things that share the same name by chance. Is it by dint of their stemming from one thing or because they all contribute to one thing? Or is it more that they are such by analogy?"

He doesn't have a ready answer. Instead, he points back at the Platonists and accuses them of having problems explaining how totally abstract forms and concrete human action interact with each other. Reminds me of physicalists in philosophy of mind who defend themselves by pointing out issues with Cartesian dualism.

I wonder what Aristotle would have made of Paul Ziff's book, Semantic Analysis. It seems to me that Ziff answered the question by discovering that things are never good in themselves and it's the other category that can fold neatly into a single idea.

Quotes from: Bartlett, R.C. & Collins, S.D. (2011). Aristotle's nicomachean ethics: A new translation. Chicago: The University of Chicago Press.

Sunday, November 11, 2012

Nicomachean Ethics (Pt. 1)

Time for a good old-fashioned blogmentary! In this series, I'm going all the way back to ancient Greek moral philosophy. Most of my previous readings in ethics have been more-or-less contemporary, with a side of Hume, Kant, and Mill. While I'm not a fan of confusing philosophy with history of philosophy, this Aristotle fellow keeps popping up in current, actively-defended philosophy. He's resilient! I decided it's high time to get acquainted with Aristotle's ethics beyond the popular quotes I've encountered elsewhere.

So you understand where I'm coming from, I have a very goal-oriented view of morality. Descriptively, morality arises from deeply-held human values. Normatively, moral truth arises from a fitting application of decisions or policies to the way the world works. This means I have a decidedly practical rather than mystical view of morality. In the not-so-helpful language of metaethics, "cognitivism," "success theory," "anti-realism," and "hybrid expressivism" should put you in the right neighborhood.

I will be using Robert C. Bartlett and Susan D. Collins' new (2011) translation, as pictured above. They pursued formal equivalence—as opposed to dynamic equivalence—to provide readers with a less filtered experience of Aristotle's wording. Think NASB instead of NIV or CEV, if you're familiar with Bible translations (and their acronyms!). I have no set plan on how much to write per original text or even if I'll comment on the whole thing. So long as I find the material interesting and worth discussing, I will. Finally, I encourage you to pick up a paperback copy for yourself. The Kindle edition has a typo in the first sentence and takes away from the excellent footnotes on nearly every page.

Series Links

Pt. 1 — Introduction and Book I, Chapters 1-3.
Pt. 2 — Book I, Chapters 4-6.
[On hold indefinitely. Aristotle is so very repetitive.]

Book I, Chapter 1

"Every art and every inquiry, and similarly every action as well as choice, is held to aim at some good. Hence people have nobly declared that the good is that at which all things aim."

Quite an opening line. The first sentence calls out for elaboration. Given an art, inquiry, action, or choice, what is the good being targeted? The second sentence is, intriguingly, hedged. Aristotle isn't flat-out saying all things aim at "the good." He's putting a common view on the table and expressing some sympathy for the people who take that view. It's one thing to say all things aim at "some good"; another to say all things aim at the same good. Even if they do, is this common good so abstract that we can only call it "the good"?

Aristotle immediately raises a difficulty with this noble declaration: how can all things aim at the same good when there are different types of things aimed at? As he puts it, "there appears to be a certain difference among the ends." Some ends are direct. The end of shipbuilding is the production of a ship. Other ends are indirect. The end of building warships isn't just the production of a warship, but of winning a war.

When one end is pursued as a means to a more encompassing end, Aristotle calls the encompassing end "naturally better" and "more choice-worthy." I'm less sure. Take bread-making, for example. The immediate end is the production of a loaf of bread. A further end is to alleviate hunger. Does this necessarily mean the work of alleviating hunger is better than the action of baking bread? Bread isn't the only way to take care of hunger; opening a can of beans could do the job. A person might value bread-making in itself, over and above its use as a hunger banisher. In other words, bread-making might have both instrumental and final value. (Or instrumental and intrinsic value, if you're not hip to Korsgaard.)

I'm wary about pushing all value for one activity into its encompassing activity because it can lead pretty quickly to single-value ethics such as Mill's grand goal of aggregate happiness or Rand's grand goal of extending one's own lifespan. While we may value such broad ends and engage in many activities that promote them, I think it's a mistake—an error in judging human psychology—to empty all other values into such pools. The error is especially clear in Ayn Rand's case: we need to live to experience life, but what makes our lives worth living is more than just the time spent.

Book I, Chapter 2

"If, therefore, there is some end of our actions that we wish for on account of itself, the rest being things we wish for on account of this end, and if we do not choose all things on account of something else—for in this way the process will go on infinitely such that the longing involved is empty and pointless—clearly this would be the good, that is, the best."

Freshman programmers who don't understand the need for a base case in recursive functions should be ashamed of themselves. The ancient Greeks knew this stuff! (They also put your middle school geometry skills to shame.) Anyway, I still think Aristotle is wrong to ignore the possibility of multiple ends in the "on account of itself" category. But since he thunders on past that, what is his grand goal? ...the political art. Huh? I didn't see that coming, but it does make sense of this edition's beautiful cover art.

Aristotle lists activities such as economics, warfare, and rhetoric which can all be understood as supporting politics. Today we might say that all things are done for the good of society.

"[T]he good of the individual by himself is certainly desirable enough, but that of a nation and of cities is nobler and more divine."

Why not say that the good of nations and cities is subordinate to the good it produces for individuals? It will be interesting to see how Aristotle handles situations where what's good for the state is very bad for some individuals. Or when what's good for individuals is irrelevant to what's good or bad for society.

Book I, Chapter 3

This chapter argues for approaching political science in a rough—rather than an unduly precise—manner.

"The noble things and the just things, which the political art examines, admit of much dispute and variability, such that they are held to exist by law alone and not by nature. And even the good things admit of some such variability on account of the harm that befalls many people as a result of them: it has happened that some have been destroyed on account of their wealth, others on account of their courage."

Oh what a relief! He admits there are problems when civic good or other virtues are pushed to the extremes without considering their effects. Maybe he was familiar with Greek tragedies? This should have prompted some reflection on his part. If your great all-encompassing good can have bad effects, isn't this a flashing clue that you have the wrong fundamental good...or at least not the only fundamental good?

After some snappy characterizations of mathematicians and youngsters, Aristotle praises an attitude of patience when learning. He says his teachings are pointless for people who just follow their passions unreflectively, but of great benefit to people who "fashion their longings in accord with reason and act accordingly." This makes me ask myself, "When was the last time I allowed learning to shape my actions, and not just to justify them?" Honestly, not long ago, considering I participated in the political art just this week and made a different choice than I did four years ago.

Saturday, October 13, 2012

Fantastic Fiction's Fading Heritage

"It will be a terrible waste if the stories from the pulp era vanish because of this issue." (Science Fiction and Fantasy Writers of America, Inc. [SFWA], 2005, p. 9)

Because of the way copyright law is set up in the United States, it can be difficult or impossible to locate copyright owners for protected works going all the way back to the 1920s. Without a way to ask permission to reprint these "orphan works," they tend to fade out of culture and sometimes out of physical existence. Science fiction and fantasy literature grew into their modern forms in the 20s through 50s, but many of these genre-developing works are unpublishable orphans. No one is reading them or receiving royalties from their sale.

This paper will look at how copyright law created the so-called "orphan works problem" and how the Science Fiction and Fantasy Writers of America responded to the U.S. Copyright Office's call for comments on the situation.

Peer Pressure

In 1886, most of the major European powers signed an international copyright agreement in Berne, Switzerland. The Berne Convention for the Protection of Literary and Artistic Works (or simply the "Berne Convention") required its members to respect the rights of other member nations' authors as if they were domestic authors:

"Authors shall enjoy, in respect of works for which they are protected under this Convention, in countries of the Union other than the country of origin, the rights which their respective laws do now or may hereafter grant to their nationals, as well as the rights specially granted by this Convention." (Berne Convention for the Protection of Literary and Artistic Works [Berne Convention], 1979, art. 5)

The Convention disallowed any sort of requirement that authors register their works or stamp them with an official declaration before being protected by copyright:

"The enjoyment and the exercise of these rights shall not be subject to any formality; such enjoyment and such exercise shall be independent of the existence of protection in the country of origin of the work." (Berne Convention, 1979, art. 5)

A little over 100 years later, the U.S. finally signed on when Congress passed the Berne Convention Implementation Act of 1988. Why wait so long? One major issue was the "no formalities" clause quoted above. U.S. copyright term was also far shorter than the Convention's minimum of 50 years after the death of the author (Berne Convention, 1979, art. 7). In 1886, U.S. copyright worked like this (Peters, 1850, p. 436-439):

28 years of copyright, from the time the title of the work was properly registered.
Plus a 14 year extension, if re-registered within six months of the original expiration date.
So long as the correct notices are given in the book and in a newspaper...
...and a copy is put on deposit with the government.

Immediate adoption of the Berne Convention would have been an abrupt change in both duration and scope of copyright protection. In the meantime, the U.S. did sign the Buenos Aires Convention of 1910, which provided mutual copyright protection in much of North, Central, and South America and did allow formalities. To accommodate the U.S. (and other nations refusing the Berne Convention), a compromise was created in the form of the 1954 Universal Copyright Convention, which was widely accepted by the United States, Latin America, and Berne Convention members. By the 1980s, U.S. copyright worked like this:

Protection for the life of the author, plus 50 years after death.
Registration "is not a condition of copyright protection." (Copyright Act of 1976, Sec. 408, 1976)
Registration may still be required before suing infringers.

It was no longer a big leap to achieve conformity with Berne Convention standards. In 1989, the United States officially joined the Berne Convention.

The Trouble With "No Formalities"

For most of American history, copyright formalities put a substantial burden on authors, with several opportunities to slip up and lose protection:

"Given the complexity of these formalities, the cost of compliance was not trivial, and the consequences of noncompliance were severe. Failure to comply would result in copyright failing to arise (registration), being unenforceable (notice, deposit), or being subject to early termination, with entry of the work into the public domain (renewal)." (Sprigman, 2004, p. 493)

To a certain extent, the Berne Convention's push to remove formalities made sense as a way to more reliably protect authors' rights. It also fit with a popular European view that copyright is a kind of moral right which comes into existence the moment a work is put into a fixed form. Legal copyright would therefore serve to recognize and enforce a pre-existing moral copyright. Contrast this with the U.S. Constitution's utilitarian (goal-oriented), positive (created by law) characterization of copyright: "To promote the Progress of Science and useful Arts, by securing for limited Times to Authors and Inventors the exclusive Right to their respective Writings and Discoveries" (art. I, § 8, cl. 8). This allowed the U.S. government much more leeway in crafting law to promote these specified public goods. Requiring registration was a way to ensure that some official information was recorded about each copyrighted work; requiring renewal was a way to ensure neglected works would enter the public domain more quickly...or at least that the official information would be updated. The details of compliance were arguably too burdensome, but the removal of formalities has led to other problems.

Despite continued growth in writing and publishing, now-voluntary copyright registration has leveled off (Sprigman, 2004, p. 496):

And now-voluntary renewals are on their way to extinction (Sprigman, 2004, p. 498):

This means a smaller and smaller proportion of the kinds of works that were traditionally registered are being registered. And of these, an even smaller proportion are being renewed. By comparison:

Old Way

Many works never under copyright because their creators did not consider them worth the trouble of registering.
Registration records exist for copyrighted works.
Renewal records exist for works under extended copyright.

New Way

All works under automatic copyright, including poems in notebooks, blog posts, personal song recordings, dance routine descriptions, etc.
Registration records might not exist for copyrighted works.
Renewal records probably don't exist for works under extended copyright.

What's the problem with this? The chance of relatively recent works becoming "orphaned" has greatly increased. A work is orphaned when locating its copyright owner becomes prohibitively difficult or outright impossible. Publishers can't reprint it. Creators can't seek permission to use it or adapt it into new works. And, of course, authors and their heirs miss out on potential income. When authors cannot be located, everyone loses.

Amazing Stories and Weird Tales

In a sense, there are two orphan works problems. The removal of formality requirements in the late 1970s, as preparation for joining the Berne Convention, has caused a problem with contacting the owners of unregistered or unrenewed works. But there was already a problem with official information falling out of date. A novel published in 1923 and renewed in 1950 is still under copyright until at least 2018. The name of the person who renewed it 62 years ago might not be sufficient to discover who owns the copyright in 2012.
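The arithmetic behind those dates can be sketched in a few lines of Python. This is only an illustration: it assumes the 95-year total term for renewed works of this era (a 28-year first term plus a 67-year renewal term), and the function and constant names are made up for the example.

```python
# Assumption: a work published in this era and properly renewed gets a
# 95-year total term (a 28-year first term plus a 67-year renewal term).
FIRST_TERM_YEARS = 28
RENEWAL_TERM_YEARS = 67


def last_year_of_copyright(publication_year: int) -> int:
    """Return the final calendar year a renewed work stays under copyright."""
    return publication_year + FIRST_TERM_YEARS + RENEWAL_TERM_YEARS


def years_since_renewal(renewal_year: int, current_year: int) -> int:
    """Return how old the renewal record is in the given year."""
    return current_year - renewal_year


print(last_year_of_copyright(1923))     # 2018
print(years_since_renewal(1950, 2012))  # 62
```

The second number is the one that causes the trouble: the only official record is six decades old for the whole tail end of the term.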

Think of these as the "no official records" and the "outdated official records" orphan works problems. One area of literature strongly affected by these problems is modern fantastic fiction, here defined as the science fiction and fantasy genres. A little history:

Science fiction and fantasy both got their start in the age of universal public domain (i.e. before 1923). Jules Verne, H.G. Wells, and Edgar Rice Burroughs were especially effective pioneers of science fiction from the 1860s through the 1910s. Fantasy fiction goes back to folklore, but it began its transformation into modern fantasy from the 1850s through the 1910s in the works of George MacDonald, Lewis Carroll, and L. Frank Baum.

Interest in these genres greatly expanded in the 1920s with the rise of pulp magazines offering monthly short stories on the cheap. Weird Tales began publishing fantasy and horror stories in March 1923. Amazing Stories began its run of science fiction stories in April 1926. Other pulp magazines hopped on the bandwagon and public interest in these genres continued to grow, spurred on by the publication of now-classic novels like Brave New World (1932), The Hobbit (1937), The Sword in the Stone (1938), Foundation (1942), 1984 (1949), and The Lion, the Witch and the Wardrobe (1950). These novels and certain pulp stories like those of H.P. Lovecraft have been nearly continuously republished, but copyright owners for many lesser-known works published from the 1920s to the 1990s are difficult or impossible to locate today.

"There are scores of dead writers whose work is gone and forgotten because there is no one able to take responsibility for the rights. I bought a story from the estate of Richard McKenna a few years ago. The woman from whom I acquired the rights was his aged sister-in-law or someone like that. If that woman doesn't pass the rights on to someone else and let anyone know about it, Richard McKenna's work will not be reprinted for what, another 30 years? Do you really think anyone will remember who he is then? They barely remember him now.

Gerald Kersh is another example. I spent two years trying to track down rights to no avail. Someone who is a Kersh aficionado tried for two years before me. I finally was able to publish a couple of short stories by him via quasi legal means that protect my company from litigation. Kersh was a terrific writer and his stories deserve to be read.

That's why there is a problem." (SFWA, 2005, p. 9) [with minor corrections]

Pulp stories in the 20s and 30s. McKenna and Kersh in the 60s. The "outdated official records" problem is smudging out the fine lines of fantastic fiction's development, leaving only the thickest strokes. This would have been a problem even without the lifting of copyright formalities. Today, the "no official records" policy is compounding the issue:

"Since works are given copyright protection the moment they are written, there is no ready way to find authors to seek their permission to republish material, and the penalties for infringement are high, there is a lot of material that cannot be republished because the authors are essentially unlocatable. That is, the cost to locate them, if they can even be located, is often too high to justify the use of the work. Factoring in the 95 years / Life+70 years duration of copyright, a large amount of work is likely to be unrepublishable for over a hundred years and possibly lost altogether." (SFWA, 2005, p. 1)

In 2056 — the same distance into the future as the publication of Gerald Kersh's Nightshade & Damnations in the past — an editor may want to include a short story from 2012 and have even less hope than the publisher quoted above because the story was never officially registered.

Fantastic Fixes

On January 26, 2005, the U.S. Copyright Office put a notice in the Federal Register, asking for "written comments from all interested parties" on the topic of orphan works.

"The issue is whether orphan works are being needlessly removed from public access and their dissemination inhibited. If no one claims the copyright in a work, it appears likely that the public benefit of having access to the work would outweigh whatever copyright interest there might be." (Orphan Works, 2005)

The Copyright Office received over 700 initial responses from individuals and organizations! One of the "interested parties" was the Science Fiction and Fantasy Writers of America. The SFWA (as it's abbreviated) put out its own call for comments. Some of the resulting anecdotes are cited above. After lively internal debate, SFWA's formed-for-the-occasion Orphan Copyright Committee agreed on a set of seven proposals "felt to comprise a feasible solution to the problem and a dramatic improvement over the current situation" (SFWA, 2005, p. 2).

These proposals can be roughly organized into three themes: modernizing and simplifying the registration process (#1, #3, #5, #6), creating a legal path to using orphan works (#2, #3, #4), and issuing guidance on "succession of copyright interests" (#7). To simplify even further, the proposals seek to make orphaning less likely to occur, and to open the remaining orphan works for responsible use.

SFWA's main recommendation for improving registration is the establishment of an Author Information Directory. This would be an online database that offers free or nearly free account setup for authors to enter information about their works and keep their contact information up to date. Authors could be encouraged to include at least the first 100 words of their works and would have the option to add notarized forms or digital signatures to verify their identity. From the point of view of authors, the directory would serve the dual function of providing more opportunities for royalties and of eliminating the chance of their works being used under the new rules for orphan works.

What new rules? After conducting a search according to guidelines drawn up by the Copyright Office, followed by a multi-month posting of public notice, publishers could pay into an escrow fund at a common rate for similar works. Such works could then be published for a limited time without fear of lawsuit. Authors who later come forward would simply be able to claim the funds already set aside for this purpose. Publishers who don't follow these guidelines would be fully at risk of current legal remedies for copyright violation.

Congressional (In)action

After taking comments from SFWA and hundreds of other groups, the U.S. Copyright Office issued a Report on Orphan Works to summarize concerns and give its own proposed solutions. The Copyright Office rejected calls for any kind of new database, worried that it would be too "burdensome" at this time, but recommended revisiting the question in ten years. Also rejected were the calls for specific search guidelines (libraries and archives opposed it), an escrow system (too complex), or a public notice requirement (publishers were against it). The Copyright Office did recommend legislative changes to limit legal remedies to "reasonable compensation" when copyright infringers are able to prove they had conducted a thorough search.

"The term 'reasonable compensation' is intended to represent the amount the user would have paid to the owner had they engaged in negotiations before the infringing use commenced." (U.S. Copyright Office, 2006, p. 116)

This compensation would not apply to non-commercial users, who would only be required to cease infringement activities immediately (U.S. Copyright Office, 2006, p. 13). The report ended with recommended legislative language.

From 2006 to 2008, several bills made their way through the House and Senate, based on the Copyright Office's report. The most successful bill was the Shawn Bentley Orphan Works Act of 2008, which passed unanimously in the Senate. A similar bill, the Orphan Works Act of 2008, stalled out in the House.

The Senate bill echoed the Copyright Office's recommendations about limiting legal remedies to "reasonable compensation," and waiving even this compensation if the infringement was (1) non-commercial, (2) "primarily educational, religious, or charitable in nature," and (3) stopped on receipt of a valid claim of infringement. Also following recommendations, the bill required evidence of a "qualifying search" before infringing, plus clear attribution while infringing. The Senate bill added a requirement that a new symbol for orphan works be created and used to label such publications (S. 2913 § 2).

The House bill's most controversial difference was the requirement of a "notice of use archive": a database where users of orphan works must document the work they are using, what steps they took to locate the copyright owner, how the work will be used, and contact information for the user (H.R. 5889 § 2). Prominent library groups opposed this requirement on the grounds that it would be too burdensome on large organizations wanting to use many orphan works (Adler, 2008). Some artists opposed the archive because they believed it would be too friendly to large organizations wanting to use many orphan works! There appears to have been a significant amount of misinformation going around in artistic communities at the time (Huttler, 2008).

The whole issue has been effectively shelved by Congress since 2008.

Attack of the Powerpoints

In April 2012, the Berkeley School of Law held an orphan works symposium. Among the ideas floated during these talks was Jennifer Urban's suggestion that existing Fair Use law might be applicable to orphan works (2012). One of the four factors of Fair Use analysis concerns the "nature" of the copyrighted work, but what this means, exactly, is not spelled out in federal law. Urban cited cases where availability played some role in Fair Use decisions and argued for expanding this line of thinking to explicitly cover orphan works.

Lydia Loren advocated a change in metaphor: rather than continue using the term "orphan works," labeling them as "hostage works" would emphasize the way these are "works that are held hostage by the complexity of our copyright system. By its duration, by its lack of formalities, and then of course, coupled with the absentee owner" (2012, 2 min). Under this metaphor, users might be seen as hostage-liberators rather than orphan-exploiters. Loren also showed a troubling graph from a talk by Paul Heald (2012, 12 min 45 sec):

The main lesson to draw from this graph is that books in the public domain from before 1923 are still very popular. Same goes for recent books under copyright. It's that dip from the 20s through the end of the century that shows a severe under-representation of what was written in those decades. New works do have novelty going for them; public domain works tend to have low prices going for them, thanks to both the lack of royalties and competition. So while a moderate dip is only to be expected for older, copyrighted works, it's very likely that the orphan works problem has aggravated the situation.

Notice where the bulk of science fiction and fantasy's genre development occurred on the graph above. For fantastic fiction and all the other fading stories created in that gap, orphan works legislation would open exciting new opportunities for rediscovery and appreciation.

My Two Cents

This paper has focused on written works, but copyright law also applies to music, dance, visual arts, architecture, etc. Creators in these areas aren't necessarily going to be well-served by orphan works legislation that focuses on texts. Today's technology is completely up to the task of storing and matching text, but still very much in development for finding re-used melodies, dance steps, or even photographic remixes. It might be smart to push for text-specific orphan works legislation first, as a kind of pilot program. When the creative world doesn't come to an end and information technology has improved, other types of content could be added.

The biggest flaw in orphan works legislation hasn't been the legislation itself, but misunderstandings, misrepresentations, and outright scaremongering. What's needed are multiple promotional campaigns by libraries and artists' groups (like SFWA). Specific examples of unrepublishable works would be most effective because they would raise awareness and increase interest in what the public is missing. What if a copyright owner appears because of these campaigns? There would be an opportunity to show the benefits of reconnecting owners with interested publishers! If the owner allows it, the book could even be marketed as a "rescued orphan." Everyone wins.

It's important to keep in mind that no orphan works legislation is going to be perfect; it just needs to meet the realistic goal of being a strong improvement over the current situation. Laws can always be amended later to more perfectly reflect contemporary values and technology. It just takes that first daring step to try something new.

References

Adler, P. S. (May 1, 2008). RE: S. 2913 [letter to Senators Leahy and Hatch on behalf of the Library Copyright Alliance]. Retrieved from http://www.sla.org/pdfs/publicpolicy/LCA050108DarkArchive.pdf

Berne Convention for the Protection of Literary and Artistic Works (1979, revised from 1886). Retrieved from http://www.wipo.int/treaties/en/ip/berne/trtdocs_wo001.html

Copyright Act of 1976, Pub. L. No. 94-553. 90 Stat. 2541 (1976). Retrieved from http://en.wikisource.org/wiki/Copyright_Act_of_1976#.C2.A7_408._Copyright_registration_in_general

Heald, P. (March 16, 2012). Do bad things happen when works fall into the public domain: The market for audiobooks. [Seminar video]. Retrieved from http://www.youtube.com/watch?feature=player_detailpage&v=-DpfZcftI00#t=765s

Huttler, A. (April 28, 2008). Orphan Works Act of 2008. [Web log post]. Retrieved from http://www.fracturedatlas.org/site/blog/2008/04/28/orphan-works-act-of-2008/

Loren, L. (April 12, 2012). Abandoning the orphans: An open access approach to hostage works. [Audio presentation]. Retrieved from http://media.law.berkeley.edu/qtmedia/BCLT/bclt_20120412-symposium/day1/Loren.m4a

Orphan Works, 70 Fed. Reg. 3739 (2005). Retrieved from http://www.copyright.gov/fedreg/2005/70fr3739.html

Orphan Works Act of 2008, H.R. 5889, 110th Cong. (2008). Retrieved from http://thomas.loc.gov/cgi-bin/bdquery/z?d110:h.r.05889:

Peters, R. (1850). The Public Statutes at Large of the United States of America, From the Organization of the Government in 1789, to March 3, 1845 (Vol. 4). Boston: Charles C. Little and James Brown.

Science Fiction and Fantasy Writers of America, Inc. (March 23, 2005). RE: Orphan Works Study (70 FR 3739). Retrieved from http://www.copyright.gov/orphan/comments/OW0607-SFFWA.pdf

Shawn Bentley Orphan Works Act of 2008, S. 2913, 110th Cong. (2008). Retrieved from http://thomas.loc.gov/cgi-bin/bdquery/z?d110:s.02913:

Sprigman, C.J. (2004). Reform(aliz)ing copyright. Stanford Law Review, 57. p. 485-568. Retrieved from http://papers.ssrn.com/sol3/papers.cfm?abstract_id=578502

Urban, J. (April 12, 2012). Orphan works and mass digitization: Obstacles and opportunities. [PDF presentation]. Retrieved from http://www.law.berkeley.edu/files/Urban.pdf

U.S. Copyright Office. (2006). Report on Orphan Works. Retrieved from http://www.copyright.gov/orphan/orphan-report.pdf

Friday, September 21, 2012

Searchers and Finders

"[Finders] visualize there is something to be found, whereas searchers seem to wait and see whether something is to be found. I am convinced that having a firm belief that a relevant document exists makes it much more likely to find it."

— Evert Nijhof, Searching? Or actually trying to find something? – The comforts of searching versus the challenges of finding.

In more than a year of browsing new Library and Information Science papers on LISTA, there's no question that Nijhof's paper on information seeking styles has influenced my thinking the most. And it came from World Patent Information of all places!

Fox hunting mole, by Flickr user EricMagnuson.

Two Useful Archetypes

After observing the way many information professionals conduct novelty (or 'patentability') searches, Nijhof came to see two major clusters of techniques and attitudes:

Searchers...

focus on methodical search procedures (the journey)
tend to start broad and then narrow
accept customer requests at face value
respond to failure by giving evidence of procedure following

Finders...

focus on the objective (the destination)
tend to start narrow and then broaden
may question whether customer requests are customer needs
take failure personally and analyze the reason for failure

These aren't meant to be strict categories where a given person is either all-searcher or all-finder. That's why I'm using the term "archetype." On the other hand, individuals often do lean heavily one way or the other in Nijhof's experience.

It may sound like finders are great and searchers are mediocre under this scheme. Sort of true, but Nijhof is careful to point out that being a pure finder is a problem too. There comes a point in an extended search when it should become evident that finder techniques aren't striking oil. This is when a searcher's methodical techniques begin to shine. You don't want to be the hotshot finder who overlooks the equivalent of a checklist item.

Maybe an analogy will help. Suppose your ten-year-old wanders off in a shopping mall. How would you go about looking for her? You could start searching each store in order, or you could think of the most likely places she would go and check those first. Chances are, you'll find your kid quicker with the second option. But what if you don't? Should you keep checking the 8th, 9th, and 10th most likely stores? No. Now it's time to get methodical (possibly with the help of others).
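For the programmers in the audience, the mall strategy can be sketched as a two-phase search. This is a loose illustration, not anything taken from Nijhof's paper: the function name, the `max_guesses` cutoff, and the store names are all invented for the example.

```python
from typing import Callable, Iterable, Optional


def find_then_search(
    likely_places: Iterable[str],
    all_places: Iterable[str],
    is_here: Callable[[str], bool],
    max_guesses: int = 3,
) -> Optional[str]:
    """Try a few best guesses first (finder), then sweep everything (searcher)."""
    checked = set()

    # Finder phase: check only the most likely places, and give up early.
    for place in list(likely_places)[:max_guesses]:
        checked.add(place)
        if is_here(place):
            return place

    # Searcher phase: go through every place in order, skipping repeats.
    for place in all_places:
        if place in checked:
            continue
        if is_here(place):
            return place

    return None


mall = ["bakery", "bookstore", "candy shop", "pet store", "shoe store", "toy store"]
guesses = ["toy store", "candy shop", "pet store", "bookstore"]

print(find_then_search(guesses, mall, lambda store: store == "candy shop"))
print(find_then_search(guesses, mall, lambda store: store == "shoe store"))
```

The first call succeeds during the guessing phase; the second only succeeds because the function stops guessing after three tries and falls back to the methodical sweep.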

The Virtues of a Precise Start

Since this is a major detour in Nijhof's article, let's look more closely at a few of the reasons he advocates starting out with narrow rather than broad searches.

Noise: Narrow searches start with a much higher signal-to-noise ratio than broad searches. Sure, you get fewer hits, but you can spend more time thoroughly checking each hit for its own sake and for additional search vocabulary.

Knowing the Landscape: When searching broadly, it can be hard to get a sense of what's available underneath broad terms. If I'm looking for medical information on certain kinds of dogs, it helps to start by looking at the level of detail in the database on one breed of dog. Otherwise a broad search could be using a completely inappropriate set of terminology for the available sources.

Default Mindset: Starting out broad puts a searcher in a "discard unless..." mindset rather than a finder's "assume relevant unless..." mindset. The searcher approach is supposed to help avoid missing relevant hits, but training yourself to say "nope, nope, nope..." right away might actually cause you to miss an important document among the pile of irrelevancy.

Concept Goggles

I hope no one takes the terminology "searcher" vs. "finder" too seriously. I've noticed a tendency in business and academia to take two common language synonyms, use them to refer to two interesting concepts, then act like everyone else is misusing these terms if they aren't using them in the same quirky fashion.

What's important is that you think about these two clusters of search behavior as useful concepts. Now when you sit down to start a search or watch someone else start a search, you won't be able to stop from thinking about the choice of broad or narrow terminology. If you tend toward the searcher archetype, you might consider leaving your comfort zone and trying some if-this-works-I'm-done-already narrow terms. If you tend toward the finder archetype, you might remember to switch approaches when a search isn't going well, perhaps by looking up all of the citations in resources you've already found that weren't quite right.

Come to think of it, isn't the whole premise of the TV show Bones about pairing up a methodical searcher with an intuitive finder and showing how they complement each other? Oh no, the goggles won't come off!

References

Nijhof, E. (2011). Searching? Or actually trying to find something? – The comforts of searching versus the challenges of finding. World Patent Information, 33(4), p. 360-363.

Saturday, August 25, 2012

Sex, Violence, and the First Amendment

The U.S. Supreme Court has ruled that states may pass laws restricting the sale of sexual materials to minors, but may not pass similar laws for violent materials. The difference lies in the Court's traditions regarding obscenity as an exception to First Amendment free speech rights.

Short version: obscenity has to do with sex, not violence.

Protected and Unprotected Speech

The First Amendment does not list exceptions for "the freedom of speech." Nevertheless, the Supreme Court has set aside certain kinds of speech as "unprotected" by the First Amendment. When speech is unprotected, state governments are effectively able to restrict it however they see fit. One major category of unprotected speech is obscenity. Here is the key authoritative text, now known as the Miller Test:

"[O]bscene material is unprotected by the First Amendment. 'The First and Fourteenth Amendments have never been treated as absolutes.' We acknowledge, however, the inherent dangers of undertaking to regulate any form of expression. State statutes designed to regulate obscene materials must be carefully limited. As a result, we now confine the permissible scope of such regulation to works which depict or describe sexual conduct. That conduct must be specifically defined by the applicable state law, as written or authoritatively construed. A state offense must also be limited to works which, taken as a whole, appeal to the prurient interest in sex, which portray sexual conduct in a patently offensive way, and which, taken as a whole, do not have serious literary, artistic, political, or scientific value."

— Miller v. California, 413 U.S. 15 (1973), edited for readability

Notice how obscenity is limited to "works which depict or describe sexual conduct." By definition, violence without sexual conduct can't be classified as legally obscene.

Variable Obscenity

In 1965, the owner of a Long Island lunch and periodicals business sold porn magazines to a 16-year-old boy. New York had a law with wording similar to an earlier version of the Miller Test, with the addition of "for minors," "to minors," etc. The vendor was charged with violating this law, and the case eventually made its way to the Supreme Court.

Can something be protected, non-obscene speech for adults and yet be obscene, unprotected speech for minors? The Court decided: yes, it can!

"We do not regard New York's regulation in defining obscenity on the basis of its appeal to minors under 17 as involving an invasion of such minors' constitutionally protected freedoms. Rather [the New York law] simply adjusts the definition of obscenity 'to social realities by permitting the appeal of this type of material to be assessed in term of the sexual interests' of such minors. That the State has power to make that adjustment seems clear, for we have recognized that even where there is an invasion of protected freedoms 'the power of the state to control the conduct of children reaches beyond the scope of its authority over adults.'"

— Ginsberg v. New York, 390 U.S. 629 (1968), edited for readability

This is a BIG DEAL. The Court is saying that New York can classify material that's not obscene for adults as obscene for minors because, in general, states can vary the definition of an unprotected speech category where minors are concerned.

Gov. Schwarzenegger vs. Kratos

In 2005, California passed a bill prohibiting the sale or rental of violent video games to minors, where:

"'Violent video game' means a video game in which the range of options available to a player includes killing, maiming, dismembering, or sexually assaulting an image of a human being, if those acts are depicted in the game in a manner that does either of the following:

(A) Comes within all of the following descriptions:

(i) A reasonable person, considering the game as a whole, would find appeals to a deviant or morbid interest of minors.
(ii) It is patently offensive to prevailing standards in the community as to what is suitable for minors.
(iii) It causes the game, as a whole, to lack serious literary, artistic, political, or scientific value for minors.
(B) Enables the player to virtually inflict serious injury upon images of human beings or characters with substantially human characteristics in a manner which is especially heinous, cruel, or depraved in that it involves torture or serious physical abuse to the victim."

— California AB-1179, edited for readability

Section (A) should look familiar. It's similar to the Miller Test adjusted for minors, with two major differences. Part (i) drops the sexual requirement so that it can be applied to violence. Part (iii) completely inverts the serious value check. In the Miller Test, the presence of serious value overrides the other two parts and makes a work non-obscene no matter how offensive it is to a community. In the California law, the presence of offensive elements voids any value in the work. Section (B) puts a ban on additional games, just in case section (A) didn't throw a wide enough net. Altogether, this makes three likely grounds for questioning the law's constitutionality:

1. Dropping the sexual requirement.
2. Inverting the value check.
3. Banning games that fall outside the Miller-esque framework.
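For readers who think in code, the logical shape of the statutory definition quoted above can be sketched in a few lines of Python. This is only an illustration of how the quoted text nests its conditions, not legal analysis; the function and parameter names are hypothetical, and each boolean stands in for a finding that a real court or jury would have to make.

```python
def is_violent_video_game(
    has_listed_acts: bool,                 # killing, maiming, dismembering, or sexually assaulting
    appeals_to_deviant_interest: bool,     # (A)(i)
    patently_offensive_for_minors: bool,   # (A)(ii)
    lacks_serious_value_for_minors: bool,  # (A)(iii)
    especially_heinous: bool,              # (B)
) -> bool:
    # Section (A) requires all three of its parts together.
    section_a = (
        appeals_to_deviant_interest
        and patently_offensive_for_minors
        and lacks_serious_value_for_minors
    )
    # The listed acts must be available, and then either (A) or (B) suffices.
    return has_listed_acts and (section_a or especially_heinous)


# A game that fails section (A) entirely can still be covered through section (B).
print(is_violent_video_game(True, False, False, False, True))  # True
```

The "or" on the last line of the function is what lets section (B) reach games that section (A) would not.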

Interestingly, the Supreme Court slapped down the law for the first and most basic reason: attempting to regulate violent content rather than sexual content.

The Limits of Control

In Ginsberg, the Court had decided that the obscenity exception for free speech rights could come in an adult version and a minor version. California's video game law raised another question:

Can there be free speech exceptions that only come in a minor version?

There isn't a free speech exception when it comes to violent content for adults, so (1) a brand new exception would be required and (2) it would only apply to minors. First, the Court pointed at precedent against introducing new free speech exceptions for adults:

"Last Term, in Stevens, we held that new categories of unprotected speech may not be added to the list by a legislature that concludes certain speech is too harmful to be tolerated. Stevens concerned a federal statute purporting to criminalize the creation, sale, or possession of certain depictions of animal cruelty. [...] We held that statute to be an impermissible content-based restriction on speech. There was no American tradition of forbidding the depiction of animal cruelty—though States have long had laws against committing it.

The Government argued in Stevens that lack of a historical warrant did not matter; that it could create new categories of unprotected speech by applying a 'simple balancing test' that weighs the value of a particular category of speech against its social costs and then punishes that category of speech if it fails the test. [...] We emphatically rejected that 'startling and dangerous' proposition."

— Brown v. Entertainment Merchants Association, 564 U.S. 786 (2011)

Violence may not be a valid free speech exception for adults, but can't it be an exception that only applies to minors?

"[The California Act] does not adjust the boundaries of an existing category of unprotected speech to ensure that a definition designed for adults is not uncritically applied to children. California does not argue that it is empowered to prohibit selling offensively violent works to adults —and it is wise not to, since that is but a hair’s breadth from the argument rejected in Stevens. Instead, it wishes to create a wholly new category of content-based regulation that is permissible only for speech directed at children.

That is unprecedented and mistaken. '[M]inors are entitled to a significant measure of First Amendment protection, and only in relatively narrow and well-defined circumstances may government bar public dissemination of protected materials to them.' Erznoznik v. Jacksonville [...]. No doubt a State possesses legitimate power to protect children from harm [...], but that does not include a free-floating power to restrict the ideas to which children may be exposed. 'Speech that is neither obscene as to youths nor subject to some other legitimate proscription cannot be suppressed solely to protect the young from ideas or images that a legislative body thinks unsuitable for them.' Erznoznik"

— Brown v. EMA, 564 U.S. 786 (2011)

In other words, minors are only subject to the same basic free speech exceptions as adults, though these exceptions may be applied differently to minors. There is no basic free speech exception that has to do with depictions of violence; therefore, violent video games are constitutionally protected speech for Americans of all ages.

This applies to books too, if anyone is still reading those things. I do recommend reading the whole majority opinion in Brown v. EMA. It makes excellent points about moral panics, censorship, and violence in children's literature.

Tuesday, August 14, 2012

Quote of the Day: Mill on Intellectual Freedom

"He who knows only his own side of the case, knows little of that. His reasons may be good, and no one may have been able to refute them. But if he is equally unable to refute the reasons on the opposite side; if he does not so much as know what they are, he has no ground for preferring either opinion. The rational position for him would be suspension of judgment, and unless he contents himself with that, he is either led by authority, or adopts, like the generality of the world, the side to which he feels most inclination. Nor is it enough that he should hear the arguments of adversaries from his own teachers, presented as they state them, and accompanied by what they offer as refutations. This is not the way to do justice to the arguments, or bring them into real contact with his own mind. He must be able to hear them from persons who actually believe them; who defend them in earnest, and do their very utmost for them. He must know them in their most plausible and persuasive form; he must feel the whole force of the difficulty which the true view of the subject has to encounter and dispose of, else he will never really possess himself of the portion of truth which meets and removes that difficulty. Ninety-nine in a hundred of what are called educated men are in this condition, even of those who can argue fluently for their opinions. Their conclusion may be true, but it might be false for anything they know: they have never thrown themselves into the mental position of those who think differently from them, and considered what such persons may have to say; and consequently they do not, in any proper sense of the word, know the doctrine which they themselves profess."

— John Stuart Mill, On Liberty

And even more relevant to yesterday's quote:

"[The Catholic Church] makes a broad separation between those who can be permitted to receive its doctrines on conviction, and those who must accept them on trust. Neither, indeed, are allowed any choice as to what they will accept; but the clergy, such at least as can be fully confided in, may admissibly and meritoriously make themselves acquainted with the arguments of opponents, in order to answer them, and may, therefore, read heretical books; the laity, not unless by special permission, hard to be obtained. This discipline recognizes a knowledge of the enemy's case as beneficial to the teachers, but finds means, consistent with this, of denying it to the rest of the world: thus giving to the élite more mental culture, though not more mental freedom, than it allows to the mass. By this device it succeeds in obtaining the kind of mental superiority which its purposes require; for though culture without freedom never made a large and liberal mind, it can make a clever nisi prius advocate of a cause. But in countries professing Protestantism, this resource is denied; since Protestants hold, at least in theory, that the responsibility for the choice of a religion must be borne by each for himself, and cannot be thrown off upon teachers."

— John Stuart Mill, On Liberty

Yes, "at least in theory." The same applies to voters who trust political teachers to tell them all they need to know about other views, without exposing themselves directly.

Monday, August 13, 2012

Quote of the Day: Craig on Intellectual Freedom

"Be on guard for Satan’s deceptions. Never lose sight of the fact that you are involved in a spiritual warfare and that there is an enemy of your soul who hates you intensely, whose goal is your destruction, and who will stop at nothing to destroy you. Which leads me to ask: why are you reading those infidel websites anyway, when you know how destructive they are to your faith? These sites are literally pornographic (evil writing) and so ought in general to be shunned. Sure, somebody has to read them and refute them; but why does it have to be you? Let somebody else, who can handle it, do it. Remember: Doubt is not just a matter of academic debate or disinterested intellectual discussion; it involves a battle for your very soul, and if Satan can use doubt to immobilize you or destroy you, then he will."

— William Lane Craig, "Q&A #29: Faith and Doubt" from http://www.reasonablefaith.org/faith-and-doubt

Saturday, August 11, 2012

Pope v. Illinois — Serious Value According to Whom?

Obscenity is an exception to First Amendment free speech protection. This doesn't mean obscenity is automatically illegal; it means states can choose to restrict it. For example, the following is a misdemeanor in Nebraska:

"It shall be unlawful for a person knowingly to (a) print, copy, manufacture, prepare, produce, or reproduce obscene material for the purpose of sale or distribution, (b) publish, circulate, sell, rent, lend, transport in interstate commerce, distribute, or exhibit any obscene material, (c) have in his or her possession with intent to sell, rent, lend, transport, or distribute any obscene material, or (d) promote any obscene material or performance."
— Nebraska Revised Statute 28-813

Where "obscene" is defined as meaning:

"(a) that an average person applying contemporary community standards would find that the work, material, conduct, or live performance taken as a whole predominantly appeals to the prurient interest or a shameful or morbid interest in nudity, sex, or excretion,
(b) the work, material, conduct, or live performance depicts or describes in a patently offensive way sexual conduct specifically set out in sections 28-807 to 28-829, and
(c) the work, conduct, material, or live performance taken as a whole lacks serious literary, artistic, political, or scientific value;"
— Nebraska Revised Statute 28-807 (emphasis added)

This language is taken from the Supreme Court's Miller Test for obscenity. What's interesting about the Miller Test is that all three parts must hold to classify material as "obscene." A photograph could, for example, be judged by "contemporary community standards" to appeal to sexual interest as a whole, it could depict state-defined sexual conduct in a "patently offensive way," yet if it contains "serious literary, artistic, political, or scientific value" it would not be legally obscene.
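The "all three parts must hold" structure is a plain logical AND, which a short Python sketch can make concrete. The class and field names here are hypothetical, and each boolean stands in for a finding a jury would actually have to reach; this shows the shape of the test, not how any court applies it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Findings:
    """Hypothetical yes/no findings about one work."""
    appeals_to_prurient_interest: bool  # part (a), judged by community standards
    patently_offensive_depiction: bool  # part (b), state-defined sexual conduct
    lacks_serious_value: bool           # part (c), literary, artistic, political, or scientific


def is_legally_obscene(f: Findings) -> bool:
    # All three parts must hold; failing any one makes the work non-obscene.
    return (
        f.appeals_to_prurient_interest
        and f.patently_offensive_depiction
        and f.lacks_serious_value
    )


# The photograph from the example: parts (a) and (b) hold, but it has serious value.
photograph = Findings(True, True, lacks_serious_value=False)
print(is_legally_obscene(photograph))  # False
```

Flipping any single field to False gives the same result, which is why the "serious value" part can rescue a work on its own.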

Pope v. Illinois was a 1987 Supreme Court case which looked at the "serious value" test. Specifically:

"whether, in a prosecution for the sale of allegedly obscene materials, the jury may be instructed to apply community standards in deciding the value question."

Why is this important? Suppose there's a novel with sexual elements that most people would find offensive. In Oregon, it's legal because people in Oregon still see literary value in it. In Alabama, it's obscenity and you can go to jail for selling the book because people in Alabama don't see literary value in it. I'm picturing Burt Reynolds hauling banned books across state lines.

Case Background

Rockford, Illinois in 1983. Police arrest two part-time clerks in an adult bookstore for selling porn mags. The clerks are convicted of selling obscenity. (This makes me wonder what officials thought adult bookstores normally sold.)

The clerks were tried separately. In both trials, the jury was instructed to decide the question of value according to how "ordinary adults in the whole State of Illinois" would view these magazines. Both clerks lost their appeals in the Appellate Court. The Illinois Supreme Court declined to hear the cases, but the U.S. Supreme Court took up the issue.

In a majority decision, the Supreme Court decided:

"Just as the ideas a work represents need not obtain majority approval to merit protection, neither, insofar as the First Amendment is concerned, does the value of the work vary from community to community based on the degree of local acceptance it has won. The proper inquiry is not whether an ordinary member of any given community would find serious literary, artistic, political, or scientific value in allegedly obscene material, but whether a reasonable person would find such value in the material, taken as a whole." — Opinion of the Court (emphasis added)

The Supreme Court also decided to send the case back to the Appellate Court to determine whether convictions based on constitutionally faulty jury instructions would be upheld. (See the text of the case for a lively debate about "harmless error.") I had trouble finding out the ultimate fate of the clerks.

Communities and Reasonable Persons

When reading Pope, I kept wondering, "How is 'community' defined? The state, the city, or what?" So I went back and skimmed through Miller v. California. In that case, it's made clear that the "forum community" is meant, i.e. for a California state law the forum community would be the whole state of California. Presumably for a city ordinance, it would be that whole city.

If a state-wide community doesn't get to decide whether a work has "serious literary, artistic, political, or scientific value," who does decide? One answer might be: the entire community of the United States of America. This would make decisions more consistent across state lines, but we could have situations where the people of Oregon see value in a work that Americans as a whole might not esteem. And there may be works of great artistic value to a broadly scattered fanbase that aren't esteemed by any one geographic community as a whole.

Happily, the Court rejected "community standards" outright when it comes to determining value (see the bold text I quoted above). Unhappily, the replacement standard of what "a reasonable person" would find valuable isn't very helpful. Ask a jury to decide whether a reasonable person would find value in a "patently offensive" film and — I suspect — they would take themselves to epitomize reasonable people, note that they themselves don't see value in it, and answer accordingly.

What is Beauty?

In a concurring opinion, Justice Scalia questions the entire notion of legally judging artistic value:

"I must note, however, that in my view it is quite impossible to come to an objective assessment of (at least) literary or artistic value, there being many accomplished people who have found literature in Dada, and art in the replication of a soup can. Since ratiocination has little to do with esthetics, the fabled 'reasonable man' is of little help in the inquiry, and would have to be replaced with, perhaps, the 'man of tolerably good taste' - a description that betrays the lack of an ascertainable standard. If evenhanded and accurate decision making is not always impossible under such a regime, it is at least impossible in the cases that matter. I think we would be better advised to adopt as a legal maxim what has long been the wisdom of mankind: De gustibus non est disputandum. Just as there is no use arguing about taste, there is no use litigating about it. For the law courts to decide 'What is Beauty' is a novelty even by today's standards."

I'm not used to wholeheartedly agreeing with Scalia! Another approach with similarly broad results comes from Justice Stevens' dissent. Referring again to the bold text I quoted above, Stevens writes:

"The problem with this formulation is that it assumes that all reasonable persons would resolve the value inquiry in the same way. In fact, there are many cases in which some reasonable people would find that specific sexually oriented materials have serious artistic, political, literary, or scientific value, while other reasonable people would conclude that they have no such value. The Court's formulation does not tell the jury how to decide such cases.

In my judgment, communicative material of this sort is entitled to the protection of the First Amendment if some reasonable persons could consider it as having serious literary, artistic, political, or scientific value."

You can guess by now that both Scalia and Stevens are questioning the utility of having obscenity laws at all, at least as far as consenting adults are concerned. Stevens goes even further and argues that such laws are unconstitutional because the difference (in this case) between legal pornography and illegal obscenity is not something the clerks could have been expected to know before being charged and convicted:

"Under ordinary circumstances, ignorance of the law is no excuse for committing a crime. But that principle presupposes a penal statute that adequately puts citizens on notice of what is illegal. The Constitution cannot tolerate schemes that criminalize categories of speech that the Court has conceded to be so vague and uncertain that they cannot 'be defined legislatively.' [...] If a legislature cannot define the crime, Richard Pope and Michael Morrison should not be expected to. Criminal prosecution under these circumstances 'may be as much of a trap for the innocent as the ancient laws of Caligula.'"

(You can't see it, but I'm applauding here.)

Finally, Stevens points out that mere possession of obscenity is legal and he characterizes laws against selling or distributing obscenity as an "insult" to a citizenry that has the "right to read and possess material which it may not legally obtain."

I agree and consider obscenity laws — "absent some connection to minors, or obtrusive display to unconsenting adults" — to be outdated relics of a less tolerant age.

Text of Pope v. Illinois, 481 U.S. 497
Text of Miller v. California, 413 U.S. 15