4


Let's say I am a large publisher of information. Why should I participate in the Semantic Web? Why should I contribute to the Linked Data cloud?

I looked at what the W3C's Semantic Web Education and Outreach group had to say. They published the “Business Case” [http://www.w3.org/2001/sw/sweo/public/BusinessCase/Slides.pdf] but it mainly talks about better and easier data integration. This seems like a potential benefit to large enterprises but is mainly of interest behind the firewall.

I also saw a deck on SemWeb adoption [as a new user I can't post a second URL but you can see it here: www.slideshare.net/guest262aaa/semantic-web-adoption] which says on the second-to-last slide (aimed at CEOs): “It can be hard to articulate the potential benefits, so find someone with a problem that can be solved with the Semantic Web and make that person a partner”.

Convince me (and others) to join the Linked Data cloud. Sell me on the benefits!

Regards,

Stuart


6 Answers

4

With the growing number of industries using Linked Data to surface their information, and some impressive names among the early adopters already (BBC, O'Reilly Media, and some UK-based public-sector services to list a few), we are beginning to see the technology take hold practically.

You don't get a network effect until there is a strong, critical mass of users. But that doesn't mean there are no benefits right now. O'Reilly found that by using Linked Data for their own publications—in house even—they were able to organise, track and build upon their own internal catalogue in ways they didn't think possible beforehand: http://www.talis.com/nodalities/pdf/nodalities_issue6.pdf (first story).

The BBC is using it as part of their /programmes data. Partly, they're using this to seed their own developer community with ways to reuse the data about their programmes, but they're also finding it useful for their own internal architecture.

So, the benefits of looking at data as a bunch of connections are there to be had within organisations already. O'Reilly is a publisher using it mostly internally, but you can bet you'll be seeing their name next to Linked Data early on when adoption rates increase.

So, to answer your question about seeing network effects happening: influential early adopters are using Linked Data now, and some of them are publishers. It takes early adopters to seed the critical mass, which then attracts new adopters for the benefits of the network itself rather than just the underlying benefits of linking data. I wonder whether you're asking whether the Semantic Web community should already be a rolling snowball, growing as adopters see the benefits of not being left outside?

3

OK I'll have a go at convincing you. What do you want as a publisher? Presumably you want as many people as possible to read what you are publishing, and to do that you need to make it interesting or useful, preferably both. Suppose you are the publisher of "Gardener's Monthly" and, as well as all your great articles, you've accumulated over time a heap of useful data about which plants grow well in different climates or soil types, how big they get, what colour the flowers are, what diseases they are susceptible to, etc. (I don't know much about gardening, so my example will probably not be very realistic!)

This enables you to provide all kinds of useful stuff that your readers want to know. Maybe Fred thinks he fancies planting a shrub in the corner of his garden, but he doesn't know what kind to get. He's got a small garden so he wants one that won't get bigger than 3 feet tall and he likes purple flowers. You have all the information he needs, so how do you provide it to him, without him having to read through all the back issues one by one?

Of course Fred isn't going to crank up curl and start dereferencing your URIs. Someone has to take this data and present it in a human-friendly way. Maybe that's you as the owner of this great shrubbery database, maybe it's a plant retailer who decides to aggregate data from multiple sites. But someone can build some kind of browsing or search interface that helps Fred find his answer. Once he's narrowed his choice down to a rhododendron or an azalea, he can follow the links (rdfs:seeAlso or whatever) back to regular articles on your site to find out more detailed information (even if ShrubSearch is operated by someone else).
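To make Fred's search a bit more concrete, here is a minimal sketch in plain Python of what a ShrubSearch-style lookup might do. Everything in it is made up for illustration: the example.org URIs, the property names (maxHeightFeet, flowerColour, seeAlso) and the plant data are assumptions, and a real service would more likely query an RDF store than a Python list.

```python
# Hypothetical plant data as (subject, predicate, object) triples.
# All URIs, property names and values are invented for this sketch.
TRIPLES = [
    ("http://example.org/plant/azalea", "maxHeightFeet", 3),
    ("http://example.org/plant/azalea", "flowerColour", "purple"),
    ("http://example.org/plant/azalea", "seeAlso",
     "http://example.org/articles/growing-azaleas"),
    ("http://example.org/plant/dwarf-rhododendron", "maxHeightFeet", 2),
    ("http://example.org/plant/dwarf-rhododendron", "flowerColour", "purple"),
    ("http://example.org/plant/dwarf-rhododendron", "seeAlso",
     "http://example.org/articles/rhododendron-care"),
    ("http://example.org/plant/lilac", "maxHeightFeet", 12),
    ("http://example.org/plant/lilac", "flowerColour", "purple"),
]


def values(subject, predicate):
    """Return every object recorded for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def find_shrubs(max_height_feet, colour):
    """Return (plant URI, article links) pairs matching Fred's constraints."""
    matches = []
    for subject in sorted({s for s, _, _ in TRIPLES}):
        heights = values(subject, "maxHeightFeet")
        small_enough = bool(heights) and all(h <= max_height_feet for h in heights)
        if small_enough and colour in values(subject, "flowerColour"):
            # The seeAlso links lead back to the publisher's own articles.
            matches.append((subject, values(subject, "seeAlso")))
    return matches


for plant, articles in find_shrubs(3, "purple"):
    print(plant, "->", articles)
```

The point of the sketch is the last step: whoever runs the search, the results still carry links back to the publisher's articles.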

So you get to be known as the best source of data in your domain, you get lots of readers, you make millions in advertising from the seed and fertiliser companies. Job done.

There are examples of companies making this work in practice with 'regular' data, for example IMDB. If I want to know something about a movie or an actor, I usually go straight to imdb.com rather than to Google or Wikipedia. The advantage of doing this with linked data is that you don't necessarily have to build the user interface to the data yourself to get the benefits (though it might be a good idea to do that): your data might be used in all sorts of unexpected ways, and by linking appropriately and 'owning' the URIs for key things in your field, you still draw in readers. And you can merge your own data with other people's: you can pull in data from the National Plant Disease Research Centre (made up) or whatever to provide a better service to your readers.
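As a rough illustration of that merging step, the sketch below treats each dataset as a set of (subject, predicate, object) triples, so combining them is just a set union as long as both sides use the same URI for the same plant. The URIs, property names and disease data are all hypothetical, in keeping with the made-up research centre.

```python
# Two hypothetical datasets that happen to share a URI for "azalea".
OWN_DATA = {
    ("http://example.org/plant/azalea", "flowerColour", "purple"),
}
DISEASE_DATA = {
    ("http://example.org/plant/azalea", "susceptibleTo", "petal blight"),
    ("http://example.org/plant/rose", "susceptibleTo", "black spot"),
}


def merge(*graphs):
    """Combine any number of triple sets; duplicate triples collapse."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """List every (predicate, object) pair known about one subject."""
    return sorted((p, o) for s, p, o in graph if s == subject)


combined = merge(OWN_DATA, DISEASE_DATA)
print(describe(combined, "http://example.org/plant/azalea"))
```

No mapping table or join logic is needed here because the shared URI does the work, which is why 'owning' well-known URIs matters.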

There are lots of things that publishers can do on the web that they couldn't do on paper - the successful publishers of the future will be the ones that recognise and exploit the new possibilities.

Let me know if that has helped persuade you!

Bill

3

I don't know what kind of publisher you are so here are a few possibilities for you to consider:

If you're a publisher of valuable online content then you should be publishing linked data about your content with as liberal a licence as possible (ideally public domain). That will maximise the chances of people reusing your linked data and pointing directly to your content from all kinds of different applications. This is one advantage the New York Times is going to see as it adopts more Linked Data publishing.

Or you might be a publisher of scientific journals. In that case you will want to publish the underlying data of the articles so it can be cross-referenced and re-analysed by the scientific community. In an ideal world that would be open, but it could just as well be a premium service offered by the publisher.

If you are a commercial book publisher then you'll want to publish linked data describing your catalog. Again you should aim to make this as freely reusable as possible - every reuse of your metadata generates more possible sales. The same goes for music publishers - get your catalog out there and being used in every application that consumes playlists and music metadata.

3

“is there anybody at all doing these kinds of things?”

As the responses show, two answers are the New York Times and O'Reilly Publishing. Nature is another good example.

The real value of semantic web technology to publishers is in managing the metadata. Better management of metadata leads to easier access to data both for the publisher's staff (to more easily find related content to aggregate into new specialized publications, to more easily track re-use rights, etc.) and for their customers (to, well, pay money for). That's why the Times makes their taxonomies public: so that when you look up a concept, or add metadata showing that your article in your publication is related to a particular concept in their taxonomy, people can more easily find Times articles on the same concept, and then maybe click on the ads next to those articles, etc.

As publishers aggregate content from different sources and each piece of content has different metadata fields attached, they shouldn't throw away metadata fields that only show up in a subset of the content, but a traditional relational database that makes you define all the fields you will use before you enter any data doesn't provide this flexibility. RDF-based data storage does.
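A small sketch of that flexibility, using a toy in-memory triple store written in plain Python rather than a real RDF library. The TripleStore class, the article URIs and the field names are all hypothetical; the only point is that a record can arrive with fields nobody declared in advance.

```python
from collections import defaultdict


class TripleStore:
    """Tiny in-memory stand-in for an RDF store (illustrative only)."""

    def __init__(self):
        self._triples = set()

    def add_record(self, subject, fields):
        """Store each field as its own triple; no schema is declared first."""
        for predicate, value in fields.items():
            self._triples.add((subject, predicate, value))

    def describe(self, subject):
        """Return {predicate: [values]} for one subject."""
        description = defaultdict(list)
        for s, p, o in sorted(self._triples):
            if s == subject:
                description[p].append(o)
        return dict(description)

    def predicates(self):
        """Every metadata field seen so far, across all sources."""
        return {p for _, p, _ in self._triples}


store = TripleStore()
# Content from one source carries a title and an author...
store.add_record("http://example.org/article/1",
                 {"title": "Pruning roses", "author": "A. Writer"})
# ...while another source adds fields the first never mentioned.
store.add_record("http://example.org/article/2",
                 {"title": "Soil basics",
                  "reuseRights": "print only",
                  "embargoUntil": "2010-01-01"})

print(sorted(store.predicates()))
print(store.describe("http://example.org/article/2"))
```

In a relational table, reuseRights and embargoUntil would each need a column (mostly empty) or a schema change before the second record could be loaded; here they are just more triples.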

1

All the answers follow a certain pattern here: "one might", "one could", etc.

So we can say that publishing linked data gives people the technical ability to do all these kinds of interesting things with the information on your website. But that alone is not reason enough for a large publisher to adopt this technology. We all know about the network effects, and the amazing things that might happen in the future with your data, but if we were to consider RIGHT NOW, in the short term, is there anybody at all doing these kinds of things?

I think that the sad reality is that Semantic Web awareness and tools are still way behind what's needed for us to start seeing these network effects happening.

0

Putting your data out there will help people create applications around it. These applications give a richer view of the data, one that you had no capacity, or sometimes no idea, to exploit yourself. The good thing is that it is the Web, which means links. In plain business terms, these links will create more traffic around your brand and your data. It will help you be more visible.

You can check what people like Open Library are doing. There is also the fact that libraries are using Semantic Web technologies more and more for their catalogs, and so will need data from the publishers.
