December 29, 2003
RDF intro, part 5
Why is RDF worth our time? My short answer is “because RDF is loose, but not too loose.” RDF has enough order to do useful things, but doesn’t require us to rewire the world first. If tech were tunes, RDF would be a jazz trio, not a Bach fugue. It lets players who hardly know each other improvise, yet it holds things together beautifully: the quintessential jam.

In this way, RDF is much like the current Web. RDF shares many of the benefits that made the first Web a success. As Dave Beckett said, “RDF allows loose collaboration with little pre-coordination, links can break, anyone can link.” The world and the web are just too much of a mess to make neat, which is why the first Web flourished: people could stitch it together a bit at a time without first clearing things with their head librarian. But the first Web has problems: it’s too loose. If RDF is a jazz trio, the first Web is more like street noise … a crazy cacophony of unrelated content.

So how does RDF achieve this balance of form and freedom? First, it pins things down with URIs. In RDF, resources and predicates and ontologies all use well-defined, world-referenceable names. This grounds the model, giving us a shared vocabulary to work with. Even better, we’re not restricted by the structure; our vocabularies can continually evolve without breaking the system.

Second, RDF statements are piecemeal in nature. You and I can describe the same resource in complementary ways and never know each other. Along comes a search bot and combines our statements, giving us a larger understanding of the resource than either of us had. This “semantic interleaving” gives RDF much of its freedom. Combine this with RDF’s flexible vocabularies, and we’ve got a description scheme that’s just “webby” enough to work. It’s loose, but not too loose.
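To make the “semantic interleaving” idea a little more tangible, here is a minimal Python sketch, not a real RDF library. It assumes a triple can be modeled as a plain tuple and a set of statements as a Python set. The `qed` namespace URI, the `#tim` URI, and the title value are made up for illustration; only the Dublin Core namespace is a real one.

```python
# Each statement is a (resource, predicate, object) tuple.
DC = "http://purl.org/dc/elements/1.1/"   # Dublin Core namespace
QED = "http://example.org/qed#"           # hypothetical namespace (assumption)

# Two authors describe the same resource without ever coordinating.
yours = {
    ("http://zombo.com", DC + "subject", "everything"),
    ("http://zombo.com", DC + "title", "Zombo"),          # invented value
}
mine = {
    ("http://zombo.com", DC + "subject", "everything"),   # same fact, stated twice
    ("http://example.org/people#tim", QED + "favoriteSite", "http://zombo.com"),
}


def merge(*graphs):
    """Combine any number of statement sets; identical statements collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def about(graph, resource):
    """Every statement that mentions the resource as subject or as object."""
    return {t for t in graph if t[0] == resource or t[2] == resource}


# The "search bot" step: pool the statements, then ask about one resource.
combined = merge(yours, mine)
for s, p, o in sorted(about(combined, "http://zombo.com")):
    print(s, p, o)
```

Because the names are global URIs, the merge needs no schema negotiation: a plain set union is enough, and the duplicated statement simply disappears.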
I’ll talk more about these benefits in future “yarns.” There’s only so much I can say without real-world examples, so I’ll underline these points further as I explore the products and ontologies that are built with RDF. Posted in RDF Intro, semantic web | Comments Off
December 28, 2003
RDF intro, part 4
Last time I talked about “triples”, which are the elementary nuggets in RDF. What’s a triple? Have a look:

<rdf:Description rdf:about='http://bigfractaltangle.com' dc:title='Big Fractal Tangle' />

This triple is saying, “The resource ‘http://bigfractaltangle.com’ has the title ‘Big Fractal Tangle.’” It’s a single fact, expressed as an RDF statement, or triple. The three parts that make it a triple are:
- the resource: http://bigfractaltangle.com
- the predicate: dc:title
- the object: Big Fractal Tangle
Every RDF statement has a resource, a predicate, and an object. To see it another way, we can take the Grammar Rock approach: each sentence has a subject and a predicate. The subject is the resource we’re talking about. The predicate is the thing we’re saying about it.

Before things get murky, it’s important to understand that what the RDF folks are calling a “predicate” is actually a “predicate name”. Seems like everyone drops the “name” part, even though in grammar (and logic), a predicate is both the name (dc:title) and its object (Big Fractal Tangle) together. Just to make things more confusing, the name/value combo is usually called a property in RDF. So in RDF-land, a predicate is a name and a property is a pair.

Once more, an RDF triple (statement) is composed of a resource (subject), a predicate (property name), and an object (property value). Technically, each piece is actually a URI reference to the thingy in question, with the exception of the object, which can also be a “literal” (right-here, right-now value).

Here are two more triples:

<rdf:Description rdf:about='http://zombo.com' dc:subject='everything' />

<rdf:Description rdf:about="#tim">
  <qed:favoriteSite rdf:resource='http://zombo.com' />
</rdf:Description>

The first says that the zombo.com resource has a subject of ‘everything’ (a literal). The second says that my favorite website is the resource zombo.com. Take note of the two ways I’m writing the properties. The first is contained in the Description tag. The second is a tag in its own right, between the start and end Description tags. Why have both? Well, some properties, such as the second one, can’t be written “inline”, since you can’t have XML attributes (rdf:resource) inside XML attributes (qed:favoriteSite). Other than that, some people just prefer one way over the other.

Many of you XML fans may still be wondering, “Can’t I do all this in XML? Why RDF? Why triples? Why Zombo?” Well, yesterday I hinted at the “pick-up-sticks” nature of RDF.
Tomorrow I’ll rest my case by describing in detail the wholesome goodness of incomplete interconnectedness.
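For the curious, the two ways of writing properties in this post can be read back into the same kind of triple. The sketch below uses only Python’s standard `xml.etree.ElementTree`; it is not a full RDF/XML parser and handles just the two forms shown here. The `rdf:RDF` wrapper is added so the prefixes are declared, and the `qed` namespace URI is a placeholder I made up, since the post never gives one.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# The two descriptions from the post. The qed namespace URI is an assumption.
DOC = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:dc="http://purl.org/dc/elements/1.1/"
         xmlns:qed="http://example.org/qed#">
  <rdf:Description rdf:about="http://zombo.com" dc:subject="everything" />
  <rdf:Description rdf:about="#tim">
    <qed:favoriteSite rdf:resource="http://zombo.com" />
  </rdf:Description>
</rdf:RDF>
"""


def uri(clark_name):
    """Turn ElementTree's '{namespace}local' form into namespace + local."""
    return clark_name[1:].replace("}", "", 1)


def triples(xml_text):
    """Yield (resource, predicate, object) for simple rdf:Description elements."""
    rdf_prefix = "{%s}" % RDF_NS
    root = ET.fromstring(xml_text)
    for desc in root.iter(rdf_prefix + "Description"):
        subject = desc.get(rdf_prefix + "about")
        # Inline form: each namespaced, non-rdf attribute is a property
        # whose object is a literal.
        for name, value in desc.attrib.items():
            if name.startswith("{") and not name.startswith(rdf_prefix):
                yield (subject, uri(name), value)
        # Element form: each child tag is a property; its object is the
        # resource it points at, or else its text content as a literal.
        for child in desc:
            target = child.get(rdf_prefix + "resource")
            obj = target if target is not None else (child.text or "").strip()
            yield (subject, uri(child.tag), obj)


for triple in triples(DOC):
    print(triple)
```

Note that the prefixes vanish in the output: each predicate comes back as its full URI, which is what makes a triple readable outside the file it came from.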
December 27, 2003
RDF intro, part 3
Now that you’ve seen some actual RDF, we can take a step back to put things in context. RDF is essentially a data model: a way of describing data, or in this case, metadata. There are plenty of data models out there. The ones we’re most familiar with are connected to programming languages and their functional flavors. Algol, Pascal, and C have their records and procedures. Prolog has statements and rules. Smalltalk, C++, and Java have objects and methods. SQL has tables and statements.

At their core, each of these systems describes data in the same way: atomic data nuggets are grouped into bundles. Whatever the terminology, we store data as clumps of stuff. Records contain fields. Tables contain cells.

Central to the task is how we name things. As programmers, we give meaningful names to both the bundles and the nuggets. “That’s the ‘employee’ record,” we say, “which contains the ‘city’ field.” The computer doesn’t really care if we call it the “dreznibble” record and the “doodat” field. All the machine needs is for every piece in the puzzle to have a unique name. Meaningful names merely make the job easier for humans. The semantics is a secondary by-product.

RDF has the same goal as these other systems, only now the nuggets are called “triples.” You might think the bundles in this situation are “resources”, but they’re not. Resources are a step removed. RDF triples are about resources. The resources can be entirely independent of RDF.

So what are the bundles in RDF? Well, if you’re being formal, I guess you could call them “graphs”, but that’d be missing the point. You see, in RDF, there are no bundles, at least not in the well-defined sense we’ve come to expect from records and objects and tables. In RDF, triples can come and go as they please; they don’t have to stick with the tour group. The RDF spec says, “To facilitate operation at Internet scale, RDF is an open-world framework that allows anyone to make statements about any resource. In general, it is not assumed that complete information about any resource is available.”

In other words, there is no bundle. At most we’re talking pick-up-sticks, which for non-Prolog folks is a pretty loose twist. Even better, this “free love” approach manages to stay consistent and efficient. That’s what makes RDF different.
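The contrast between a bundle and a pile of pick-up-sticks can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the `Employee` record borrows the ‘employee’ and ‘city’ names from above, while the `example.org` URIs and the values are invented.

```python
from dataclasses import dataclass


@dataclass
class Employee:
    """A classic bundle: every field must be supplied up front."""
    name: str
    city: str

# Employee(name="Ada") raises TypeError, because the record insists on a city.


# The triple version is a growing set of independent statements.
# All URIs below are hypothetical.
ada = "http://example.org/staff#ada"
name = "http://example.org/terms#name"
city = "http://example.org/terms#city"

statements = {(ada, name, "Ada")}


def values(graph, resource, predicate):
    """Return the objects known so far. An empty list means nobody has
    said anything yet, not that no value exists (the open-world reading)."""
    return sorted(o for s, p, o in graph if s == resource and p == predicate)


print(values(statements, ada, city))   # nothing is known about the city yet

# Later, someone else contributes a statement about the same resource.
statements.add((ada, city, "London"))
print(values(statements, ada, city))
```

The record fails when a field is missing; the statement set just answers with what it has, which is the behavior the spec quote describes.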
December 26, 2003
RDF intro, part 2
In my last post, we learned that RDF is all about describing resources, and that resources are referenced with URIs. So how does it describe them? Let’s start by looking at the source of this very page (View / Page Source). There are two snippets of RDF embedded in the HTML. The first helps tools like Movable Type create what are called “trackbacks”, links to this post from other people’s posts. Here’s the RDF:

<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description
  rdf:about="http://bigfractaltangle.com/archive/2003/12/26.jsp"
  trackback:ping="http://immuexa.com/cgi-bin/mtype/mt-tb.cgi/80"
  dc:title="RDF intro, part 2"
  dc:identifier="http://bigfractaltangle.com/archive/2003/12/26.jsp"
  dc:subject="Tangle Yarns"
  dc:description="In my last post, we learned..."
  dc:creator="timothy"
  dc:date="2003-12-26T21:49:46-08:00" />
</rdf:RDF>
-->

First, notice that the entire snippet is surrounded by HTML comments. This practice hides the code from older browsers. It also helps the page validate as official HTML. Next, look at the surrounding <rdf:RDF> tags, which define the snippet as RDF. They also define the namespaces we’re using.

What’s a namespace? Well, a big problem with computers is that people often use the same terms for slightly different things. When I say “subject,” I may mean “email subject.” When you say it, you may mean “category.” If we tried to mix our data together, your subjects would clash with my subjects, and it’d be a big mess. Namespaces let people say, “These words have my meaning.” They’re kind of like our own private dictionaries.
We indicate a word is from a particular namespace by putting a special prefix in front, such as “dc:subject.” This lets the computer know we’re talking about a “subject” as understood in the “dc” namespace (or dictionary), as opposed to “mail:subject”, which uses the “mail” namespace. A big part of RDF is defining these separate vocabularies so we can put labels on things without stepping on each other’s toes.

To indicate that you want to use a particular namespace, use an “xmlns” attribute with the prefix attached (such as xmlns:dc), followed by an equals sign, followed by the URI of the official namespace definition, or “schema”. In this snippet, we’re using three namespaces: 1) the RDF namespace, rdf, 2) the trackback namespace, trackback, and 3) the Dublin Core namespace, dc.

So we’ve got three namespaces, but what does the damned thing do? Well, RDF is the Resource Description Framework, so it must be describing a resource. Which resource? For a clue, look at the <rdf:Description> tag. What’s it “about”? Ah ha… there’s one of them URI things (http://bigfractaltangle.com/archive/2003/12/26.jsp). Looks like we’re describing this very page by specifying several of its properties: its creator (dc:creator), its date (dc:date), its title (dc:title), etc. Movable Type uses this information to reference this page from other websites.

So, RDF describes a resource in the <rdf:Description> tag. The resource is identified in the “about” attribute. The remaining attributes in the tag make up the actual description, which is essentially a collection of properties about the resource. Properties are specified using namespace-qualified names, such as “dc:title”. As you would expect with XML, properties can also be written out as tags instead of attributes.
This snippet says the same thing:

<!--
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
  xmlns:trackback="http://madskills.com/public/xml/rss/module/trackback/"
  xmlns:dc="http://purl.org/dc/elements/1.1/">
<rdf:Description rdf:about="http://bigfractaltangle.com/archive/2003/12/26.jsp" >
  <trackback:ping> http://.../mt-tb.cgi/80 </trackback:ping>
  <dc:title> RDF intro, part 2 </dc:title>
  <dc:identifier> http://.../2003/12/26.jsp </dc:identifier>
  <dc:subject> Tangle Yarns </dc:subject>
  <dc:description> In my last post, we learned... </dc:description>
  <dc:creator> timothy </dc:creator>
  <dc:date> 2003-12-26T21:49:46-08:00 </dc:date>
</rdf:Description>
</rdf:RDF>
-->

“Wait a minute,” you may say. “All this fuss is about specifying properties? Can’t we do this with a dozen other ‘description frameworks’, like say SQL or UML or JavaBeans?” Yep. Nothing’s new under the sun. From one perspective, RDF’s simply another way to describe your average “record” in just about every computer language you’ve ever heard of. Properties and values, applied to some thing, which this time we’re calling a resource.

So why bother with RDF? For that, tune in next time…
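To see why prefixes keep vocabularies from colliding, it helps to expand a qualified name by hand. The Python sketch below is only an illustration: the three namespace URIs are copied from the xmlns declarations in the snippet, while the `mail` namespace URI and the `expand` helper are hypothetical, since this post never defines a mail vocabulary.

```python
# Prefix -> namespace URI, copied from the xmlns declarations in the snippet.
NAMESPACES = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "trackback": "http://madskills.com/public/xml/rss/module/trackback/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def expand(qname, namespaces=NAMESPACES):
    """Expand 'prefix:local' into the full URI that the prefix stands for."""
    prefix, local = qname.split(":", 1)
    if prefix not in namespaces:
        raise KeyError("undeclared namespace prefix: " + prefix)
    return namespaces[prefix] + local


print(expand("dc:subject"))

# "mail" is not declared in the snippet, so a made-up mail vocabulary has
# to be added before mail:subject can be expanded (assumption).
with_mail = dict(NAMESPACES, mail="http://example.org/mail#")
print(expand("mail:subject", with_mail))

# Same local word, different dictionaries, so the full names never clash.
print(expand("dc:subject") != expand("mail:subject", with_mail))
```

The prefix itself is just local shorthand. Two documents could bind different prefixes to the Dublin Core URI and still be talking about the same “subject.”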
December 25, 2003
RDF intro, part 1
My last week of postings has been mostly sauce with no meat, which means it’s high time I quit with conjecture and start talking turkey. For those new to the Semantic Web, I’m sure you’re saying, “How do I use this stuff?” I know how you feel. When I’m learning a new technology, I’m usually relieved when the writer stops talking around things and finally addresses the topic directly. So for my first “tangle yarn,” I’ll tackle RDF, the Resource Description Framework, since it’s the technological foundation for the whole magilla.

There’s a lot written about RDF and friends. I’m hardly an expert, but I do have one thing going for me: I can still remember my initial confusion. Even after scanning a half-dozen articles and reading the first few chapters of Practical RDF, I still didn’t understand what all the fuss was about. I knew there was this thing called RDF, but I didn’t know what it was good for. In particular, I didn’t know why we needed RDF when we already had XML, UML, and LMNOP. So here’s my own quick intro for the confused.

As its name implies, RDF is a standard way to describe resources. Well, what’s a resource, exactly? Just about anything. Certainly the website Merriam-Webster is a resource, as is Zombo.com. Besides websites, resources are pretty much anything you can point a finger at: articles, images, news feeds, emails, DVDs, dehydrators, doorknobs, etc. If you can give it a name, it’s a resource. Then why call it a resource? Why not record, or item, or object, or thingy? That part I don’t know, but we’re calling them resources anyway, no matter how cool Thingy Description Framework would have been.

To name a resource, RDF uses the URI, which stands for Uniform Resource Identifier. Aside from sounding vaguely Jamaican in origin, URI seems suspiciously like URL, a geek term even journalists had to learn in the mid-nineties. URIs and URLs are nearly the same thing. They both Uniformly do something with Resources. URIs identify them. URLs locate them.
Just to keep things fun, there’s also the URN, which names them. URI? URL? URN? You are joking! Nope, these puppies are the new noun.

Here’s how to keep it straight: URLs tell computers how to get to something “out there”, like http://zombo.com/index.html. They tell the “how” (http), “where” (zombo.com), and “what” (index.html). Contrast this with URNs, which are merely names (urn:example.org:truth:1). URNs don’t correspond to real things on the Internet. They’re just names.

Now here’s the key: URIs are both; they’re a generalization of both URLs and URNs. URLs are the subset of URIs that have location. URNs are the subset of URIs that don’t have location. Whether it’s by naming it or locating it, a URI uniformly identifies something.

So now we know what resources are, and how to name them. Next comes the “describing” part, which I’ll describe tomorrow.
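If you want to poke at the “how”, “where”, and “what” yourself, Python’s standard `urllib.parse.urlsplit` will pull a URI apart. This is a rough sketch under my own assumptions: the `describe` and `has_location` helpers are hypothetical names, and treating “has a network location” as the test for a URL is a simplification that fits the two examples from this post, not a rule from any spec.

```python
from urllib.parse import urlsplit


def describe(uri):
    """Split a URI into the 'how', 'where', and 'what' pieces."""
    parts = urlsplit(uri)
    return {
        "how": parts.scheme,     # access scheme, such as http
        "where": parts.netloc,   # host to contact; empty if there is none
        "what": parts.path,      # the thing asked for, or the bare name
    }


def has_location(uri):
    """Rough heuristic: a URI with a network location reads as a URL."""
    return bool(urlsplit(uri).netloc)


for uri in ("http://zombo.com/index.html", "urn:example.org:truth:1"):
    print(uri, describe(uri), has_location(uri))
```

Both strings parse as URIs with a scheme, but only the first one comes with somewhere to go.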
"Big Fractal Tangle" is a phrase used by Tim Berners-Lee at ISWC 2003
to describe his vision of the Semantic Web (used with permission).
"Tidepool" and "Storymill" are trademarks of Immuexa Corporation. Website design copyright © 2003-2004 by Immuexa.