Content Modeling with XML
Posted in Content Management, PHP on May 20th, 2008 by admin
The Content Management System I'm currently working on has some known and deliberate limitations. It was designed a couple of years back for immediate ease of use by clients with simple needs, and so is more of an Adobe Contribute replacement than a full-blown enterprise-ready überapp. There are some dynamic/shared elements such as 'Common Content' (think footer/sidebar content) and Collections (think FAQs and lists of things that need to be managed separately but appear on multiple pages). The navigation also builds itself from the site structure. It works well and I get good feedback from the clients and developers using it.
However, there normally comes a point when the client wants to do something a bit different. Let's say that rather than maintaining a simple list of FAQs with Collections, they want to create a knowledge base of articles. Up to this point clients have had to re-purpose a fixed list of properties for each 'Collection Item'. The fields were: publish date, title, summary, thumbnail and link. Thanks to a bit of ingenuity on the part of our developers, these properties were sufficiently generic to be useful in a range of scenarios, from a randomly rotating banner to an events list to a simple photo gallery with albums. I was surprised how flexible something so statically defined could be. That said, the fields weren't all used all the time, and since the labels were generic, it was sometimes hard for the client to work out which form fields affected the final output on the website. In short, we needed something more scalable and user-friendly. We needed 'custom' Collections where the form fields could be configured for each type of content the client required. We needed a way to model content.
It so happened that the way I'd designed the database earlier had already abstracted away the notion of a schema for Collection Items, in favour of storing name/value pairs to facilitate versioning. In fact, the static properties discussed above (title, summary etc.) were only enforced as hard-coded items in PHP. So far so good, but I needed a way of mapping some sort of structure onto the data on its way into the database as well as on its way out.
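To make the name/value idea concrete, here is a minimal sketch in Python with an in-memory SQLite database. The real CMS is written in PHP and its actual tables aren't shown in this post, so the `item_values` table, its columns and the `save_version`/`load_item` helpers are all hypothetical names used purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per property value, per item, per version.
conn.execute("""
    CREATE TABLE item_values (
        item_id INTEGER NOT NULL,
        version INTEGER NOT NULL,
        name    TEXT    NOT NULL,
        value   TEXT,
        PRIMARY KEY (item_id, version, name)
    )
""")


def save_version(conn, item_id, values):
    """Store a dict of property values as a new version and return its number."""
    row = conn.execute(
        "SELECT COALESCE(MAX(version), 0) FROM item_values WHERE item_id = ?",
        (item_id,),
    ).fetchone()
    version = row[0] + 1
    conn.executemany(
        "INSERT INTO item_values (item_id, version, name, value) VALUES (?, ?, ?, ?)",
        [(item_id, version, name, value) for name, value in values.items()],
    )
    return version


def load_item(conn, item_id, version=None):
    """Rebuild one item as a dict; defaults to the latest stored version."""
    if version is None:
        version = conn.execute(
            "SELECT MAX(version) FROM item_values WHERE item_id = ?", (item_id,)
        ).fetchone()[0]
    rows = conn.execute(
        "SELECT name, value FROM item_values WHERE item_id = ? AND version = ?",
        (item_id, version),
    ).fetchall()
    return dict(rows)


save_version(conn, 1, {"title": "Draft title", "summary": "First attempt"})
save_version(conn, 1, {"title": "Final title", "summary": "Second attempt"})
latest = load_item(conn, 1)
first = load_item(conn, 1, version=1)
```

Because nothing in the table knows which names are allowed, adding a new kind of field needs no database change, which is exactly why some external description of the structure becomes necessary.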
XML was the answer. For each 'Collection' of content added to the site, a collection type is used to provide a 'content schema' for the data. The basic idea is to have a single XML file that carries just enough information to build a form, validate the user input (and filter it if necessary) and then store the data entered.
A sample collection type is shown below:
<?xml version="1.0"?>
<collection-types>
  <!-- DEFAULT -->
  <collection-type guid="00000000-0000-0000-0000-000000000000">
    <name><![CDATA[Default Collection Type]]></name>
    <description><![CDATA[GCMS v1.0 Collection Type]]></description>
    <settings items-sortable="true" grid-columns="link" grid-search="title|summary" />
    <properties>
      <property id="title" type="TextField" label="Title" required="true" />
      <property id="summary" type="RichTextArea" label="Summary">
        <default-value><![CDATA[blah blah]]></default-value>
      </property>
      <property id="thumbnail" type="AssetChooser" label="Image" />
      <property id="link" type="LinkChooser" label="Link" />
    </properties>
  </collection-type>
</collection-types>
As you can see, the XML schema is fairly simple. Each collection type gets a node with a unique identifier by way of the 'guid' attribute. This GUID is stored in the database for each Collection, along with the values of the name and description nodes. At runtime the GUID is used in an XPath query to retrieve the schema from this document.
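A minimal sketch of that lookup follows. The CMS itself is PHP; this uses Python's standard `xml.etree.ElementTree` purely for illustration, and the `find_collection_type` helper is a hypothetical name, not part of the CMS. The XML is a trimmed copy of the sample, with labels and settings left out.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample collection types document.
COLLECTION_TYPES_XML = """<?xml version="1.0"?>
<collection-types>
  <collection-type guid="00000000-0000-0000-0000-000000000000">
    <name><![CDATA[Default Collection Type]]></name>
    <description><![CDATA[GCMS v1.0 Collection Type]]></description>
    <properties>
      <property id="title" type="TextField" required="true" />
    </properties>
  </collection-type>
</collection-types>
"""


def find_collection_type(root, guid):
    """Return the <collection-type> element with this guid, or None."""
    # ElementTree supports a small XPath subset, including [@attr='value'].
    # The guid is assumed to come from the database, not from user input,
    # so it is interpolated without escaping here.
    return root.find(f"./collection-type[@guid='{guid}']")


root = ET.fromstring(COLLECTION_TYPES_XML)
schema = find_collection_type(root, "00000000-0000-0000-0000-000000000000")
type_name = schema.findtext("name")
property_ids = [prop.get("id") for prop in schema.findall("properties/property")]
```

Since the whole schema lives in one small file, the lookup is a single query against a parsed document rather than a series of database joins.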
The property nodes define the actual form fields that appear when the user adds or edits Collection Items. Each property has a type, which determines the interface element used when the form is rendered. Other attributes on each property node control how the data is validated and filtered (for example, converting the entered text to uppercase or making hyperlinks clickable). Further attributes specify default values, whether a value must match another field, whether it must be unique, and so on.
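As a rough illustration of how the property nodes can drive validation, the sketch below uses only the two rules visible in the sample: the `required` attribute and the `default-value` child node. The `apply_schema` function and its error message are hypothetical, and the filter attributes are left out because their names aren't shown in the sample.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the <properties> node from the sample (labels omitted).
PROPERTIES_XML = """
<properties>
  <property id="title" type="TextField" required="true" />
  <property id="summary" type="RichTextArea">
    <default-value><![CDATA[blah blah]]></default-value>
  </property>
  <property id="link" type="LinkChooser" />
</properties>
"""


def apply_schema(properties, submitted):
    """Return (cleaned_values, errors) for one submitted form."""
    cleaned, errors = {}, {}
    for prop in properties.findall("property"):
        prop_id = prop.get("id")
        value = (submitted.get(prop_id) or "").strip()
        if not value:
            # Fall back to the <default-value> child if the property has one.
            value = prop.findtext("default-value", default="")
        if not value and prop.get("required") == "true":
            errors[prop_id] = "This field is required."
        cleaned[prop_id] = value
    return cleaned, errors


properties = ET.fromstring(PROPERTIES_XML)
cleaned, errors = apply_schema(properties, {"title": "  ", "link": "/faq"})
```

Here the blank title is rejected, the missing summary picks up its default, and the link passes through unchanged. Fields that aren't declared in the schema are simply ignored.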
The settings node in the XML document is used when the Collection records are viewed in the Content Management System. The attribute values are used to determine which properties are displayed in a datagrid and which can be searched.
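Here is a small sketch of reading those settings, again in Python for illustration only. It assumes that the grid attributes hold pipe-separated lists of property ids, which is what `grid-search="title|summary"` in the sample suggests; `read_grid_settings` and `search_items` are hypothetical helpers.

```python
import xml.etree.ElementTree as ET

SETTINGS_XML = (
    '<settings items-sortable="true" grid-columns="link" '
    'grid-search="title|summary" />'
)


def read_grid_settings(settings):
    """Turn the <settings> attributes into plain Python values."""
    def split_list(attr):
        # Assumption: multiple property ids are separated by "|".
        return [part for part in settings.get(attr, "").split("|") if part]

    return {
        "sortable": settings.get("items-sortable") == "true",
        "columns": split_list("grid-columns"),
        "search": split_list("grid-search"),
    }


def search_items(items, term, search_fields):
    """Case-insensitive substring search over the configured fields only."""
    needle = term.lower()
    return [
        item for item in items
        if any(needle in (item.get(field) or "").lower() for field in search_fields)
    ]


grid = read_grid_settings(ET.fromstring(SETTINGS_XML))
items = [
    {"title": "Opening hours", "summary": "When we are open", "link": "/hours"},
    {"title": "Returns", "summary": "How to send an item back", "link": "/returns"},
]
matches = search_items(items, "open", grid["search"])
```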
So the XML defines a form to be rendered, but it does more than that. Because we have a record of the structure of the stored content, we can use it to publish content from the database into HTML (by means of templating), into XML (by using the 'id' attribute of each property node as the tag name) and even into an SQLite database. Since SQLite supports only a few simple data types, I was able to map the property types onto them (taking their validation rules into account) and create an SQLite database for each Collection on the fly. This proved to be an ideal solution for reasons I've mentioned before.
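The SQLite step might look something like the sketch below. The mapping in `SQLITE_TYPES` is my own guess for the four property types in the sample (all stored as text), and turning `required="true"` into `NOT NULL` is just one way validation rules could be reflected; the table name and helper names are hypothetical too.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Trimmed copy of the <properties> node from the sample (labels omitted).
PROPERTIES_XML = """
<properties>
  <property id="title" type="TextField" required="true" />
  <property id="summary" type="RichTextArea" />
  <property id="thumbnail" type="AssetChooser" />
  <property id="link" type="LinkChooser" />
</properties>
"""

# Hypothetical mapping from property type to SQLite column type.
SQLITE_TYPES = {
    "TextField": "TEXT",
    "RichTextArea": "TEXT",
    "AssetChooser": "TEXT",
    "LinkChooser": "TEXT",
}


def quote_ident(name):
    """Quote an identifier for SQLite, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_sql(table, properties):
    """Build a CREATE TABLE statement from the property nodes."""
    columns = []
    for prop in properties.findall("property"):
        column_type = SQLITE_TYPES.get(prop.get("type"), "TEXT")
        column = quote_ident(prop.get("id")) + " " + column_type
        if prop.get("required") == "true":
            column += " NOT NULL"
        columns.append(column)
    return "CREATE TABLE " + quote_ident(table) + " (" + ", ".join(columns) + ")"


properties = ET.fromstring(PROPERTIES_XML)
ddl = create_table_sql("collection_items", properties)

conn = sqlite3.connect(":memory:")
conn.execute(ddl)
conn.execute(
    'INSERT INTO "collection_items" ("title", "summary") VALUES (?, ?)',
    ("First FAQ", "blah blah"),
)
rows = conn.execute('SELECT "title", "thumbnail" FROM "collection_items"').fetchall()
```

The property ids become column names directly, which is the same trick as using them for tag names when publishing to XML.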
So modeling simple content with XML is a very useful way to make your Content Management System flexible, and because the content model is not stored recursively in your database, getting at the schema is simple and quick. If you're planning a CMS then I'd strongly recommend this approach.