DBpedia has some interesting data in it, but the modeling is often not as good as it could be because of how the dataset is produced. If I take some data from DBpedia, remodel it to my own liking with my own vocabularies, mint new URIs for the concepts, perhaps finesse and process the values, add in some more links and so on, what rights can I exercise or waive over my derivative dataset?

http://wiki.dbpedia.org/Imprint says: DBpedia 3.4 data is licensed under the terms of the Creative Commons Attribution-ShareAlike 3.0 license and the GNU Free Documentation License.

Can I put my derivative work in the public domain, or do I have to issue it under the same dual licenses?

DBpedia itself does not link to a license, or explicitly give attribution to Wikipedia (or anyone else), in every derivative document it serves up. Should it? Or is it sufficient just to have a statement somewhere on the website?

What if I make a copy of someone's CC Attribution licensed dataset publicly available at a SPARQL endpoint? Should the query results have attribution and licensing statements embedded in them somehow?

1 Answer

This is a grey area.

There is some debate as to whether Creative Commons licenses are appropriate for data. If you are interested in these issues, I'd recommend looking at the Science Commons Open Access Data Protocol: http://sciencecommons.org/projects/publishing/open-access-data-protocol/

Bear in mind that copyright laws are different in different parts of the world.

I am not a lawyer, but my understanding is that common facts are not copyrightable, whereas arrangements of facts are (at least that is the case in Australia). One could argue that the facts you have gleaned from DBpedia are common facts. If you republish them using an entirely different data model, and your arrangement of those facts is not derived in any way from DBpedia's arrangement, then you could put your dataset in the public domain. This would only apply if the data you have extracted from DBpedia is not considered 'creative' content (like a photograph or an article), which is copyrightable.

I suspect that the safest thing to do would be to choose to use the data under the terms of one of the licenses and publish your derived dataset with that license.

In the Netherlands there is also copyright on collections, and extracting from such a collection by any non-manual means is not allowed. Having the data retyped manually is allowed, however. – Egon Willighagen Nov 23 at 8:07
Details in Dutch (sorry): nrc.nl/dossiers/auteursrecht/… – Egon Willighagen Nov 23 at 8:10
That's interesting. I wonder how, in a court case, they would prove that you had not done it manually (assuming it wasn't an impossible amount of typing). What if you hired, or somehow crowd-sourced typists? – kwijibo Nov 23 at 10:16
@kwijibo I remember a court case in the Netherlands about a second provider of the Dutch telephone book (early '90s). A second party had sent a bunch of phone books to China to get a typed copy. They then used the copy to print more copies (or CDs for call centers). The number of typos led the judge to conclude that it was likely that all the data had been typed (together with other evidence such as contracts with the Chinese party, work slips, etc.). – Jerven Nov 30 at 9:08
