2

Can you advise me concerning the wisdom or otherwise of storing entire documents as the objects of triples?

I'm building a CMS-like application and need to store thousands of XHTML-formatted documents somewhere on the server. I'm using AllegroGraph to create and store semantic metadata about the documents, but what about the documents themselves? Are there any disadvantages to storing them as xsd:string values in the triple store? Is it considered better practice to store the documents elsewhere (as flat files in a directory, say) and reference them from triples?

2 Answers

7

Advantages to storing your files as files in the file system:

  • Easier development and debugging
  • Easier data management, e.g., for backups
  • Probably faster, because it keeps the “big blobs” out of the triple store (though speed may not be a problem at your scale)
  • Avoids problems with round-tripping of Unicode characters, etc.

Advantages to storing your files as literals in the triple store:

  • Uniform access to data and metadata
  • Easier to keep data and metadata in sync
  • Versioning is more easily implemented

Personally I'd keep them as files.
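
If you do keep them as files, the main thing you give up is the easy sync between data and metadata. One way to soften that, sketched here in Python, is to record a content hash alongside the metadata and compare it later. The directory layout, the `.xhtml` suffix, and the function names `store_document` and `is_in_sync` are all assumptions for illustration, not anything your store provides.

```python
import hashlib
from pathlib import Path


def store_document(root: Path, doc_id: str, xhtml: str) -> str:
    """Write the document as UTF-8 bytes and return its SHA-256 hex digest.

    The digest is meant to be stored with the document's metadata
    (for example as a literal in the triple store).
    """
    root.mkdir(parents=True, exist_ok=True)
    data = xhtml.encode("utf-8")  # explicit encoding, so Unicode survives the trip
    (root / f"{doc_id}.xhtml").write_bytes(data)
    return hashlib.sha256(data).hexdigest()


def is_in_sync(root: Path, doc_id: str, recorded_digest: str) -> bool:
    """Return True if the file on disk still matches the recorded digest."""
    data = (root / f"{doc_id}.xhtml").read_bytes()
    return hashlib.sha256(data).hexdigest() == recorded_digest
```

Writing bytes with an explicit encoding also sidesteps most of the Unicode round-tripping worries, since the file holds exactly what you encoded.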

2

While there may be advantages to storing your documents in your triple store (e.g. only one data store to manage and one interface to code against), I'd argue that it's simpler and more flexible to store your files in the file system, perhaps making them accessible over HTTP, with a link to each document stored as a triple in your store.
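
As a rough sketch of what that link could look like, the Python below builds one N-Triples statement pointing from a resource to its document's URL. The base URL, the subject URI, and the `content` predicate are hypothetical placeholders under `example.org`; you would substitute your own server address and whichever vocabulary term suits your model.

```python
from urllib.parse import quote

# Hypothetical values for illustration only.
BASE_URL = "http://example.org/docs/"
CONTENT_PREDICATE = "http://example.org/vocab#content"


def document_triple(subject_uri: str, filename: str) -> str:
    """Return an N-Triples line linking a resource to its document URL."""
    doc_url = BASE_URL + quote(filename)  # percent-encode spaces and the like
    return f"<{subject_uri}> <{CONTENT_PREDICATE}> <{doc_url}> ."


print(document_triple("http://example.org/id/article-42", "article 42.xhtml"))
# prints: <http://example.org/id/article-42> <http://example.org/vocab#content> <http://example.org/docs/article%2042.xhtml> .
```

The object here is a URI rather than an xsd:string literal, so the store only ever holds a short reference and the XHTML itself stays on disk.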

As your application develops, you'll likely want to add features that relate specifically to the documents (e.g. editing, full-text indexing, validation, etc.), and this will be easier to do if you're not constrained by having to integrate all of that with a triple store.
