Thoughts on WYSISMUC
Exploring the concept of WYSISMUC, an interface pattern enabling on-the-fly structured data management inside a CMS.
Almost any information-oriented web page is dependent upon a core, central block of content. Because content is entered via a single large textarea in the CMS, this “central block” has two looming problems:
- It’s a blob. The content is neither accessible to nor reusable by the CMS. All of the content is locked in a single opaque, shapeless glumpf of something from which machines can’t derive meaning.
- It has crap markup. The HTML is saddled with bloated, near-meaningless markup (or farkup) courtesy of an antiquated WYSIWYG interface whose purpose in life is visual preview at the expense of any and all semantics.
Blobs and crap. Not the best way to start a day.
What Is the WYSISMUC Interface?
WYSISMUC (“what you see is structurally marked up content” or “whizzysmuck” I guess) is a concept that might help lead us out of this bog of unstructured stench. From Rick’s post:
In truth, there is a simple way to … allow those writing the material itself to work in a pseudo-WYSIWYG way, while maintaining a solid semantic structure underneath. After all, if you can take highly structured, chunked content and transform it for presentation as a single semantic block, why not reverse engineer the process?
The basic premise:
Inside a CMS, an author’s interface would have the standard textarea for that “central block” content input, but instead of an Old Country Buffet of formatting icons, it would offer a highly confined set of choices to enable basic structure. (Think interfaces like Basecamp, Drupal 8, or Medium.)
More importantly, a second set of controls would enable rich, structured data to be applied by selecting blocks of text and clicking a key button — a pattern not unlike filling out fields for an anchor tag. Rick’s post cited a person’s bio as an example, and I’ll build off that:
What if we could do the same with a business, or, in this case, an event?
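To make the pattern a little more concrete, here is a rough sketch of what the CMS might store when an author selects some text and tags it as an event. Everything in it is hypothetical: the `Annotation` and `Document` names, the field names, and the sample data are my own placeholders, not something from Rick’s post or any real CMS.

```python
from dataclasses import dataclass, field


@dataclass
class Annotation:
    """A structured record attached to a span of the author's text."""
    start: int   # offset of the first selected character
    end: int     # offset just past the last selected character
    kind: str    # hypothetical schema name, e.g. "event"
    fields: dict = field(default_factory=dict)


@dataclass
class Document:
    body: str
    annotations: list = field(default_factory=list)

    def annotate(self, selected: str, kind: str, **fields) -> Annotation:
        """Tag the first occurrence of `selected` in the body."""
        start = self.body.find(selected)
        if start == -1:
            raise ValueError(f"{selected!r} does not appear in the body")
        note = Annotation(start, start + len(selected), kind, fields)
        self.annotations.append(note)
        return note


# Made-up sample content and event details.
doc = Document("Join us at the Spring Content Summit next month.")
event = doc.annotate(
    "Spring Content Summit",
    "event",
    name="Spring Content Summit",
    start_date="2025-04-12",
    location="Minneapolis, MN",
)
```

The body stays plain prose, and the structured record rides alongside it, pointing at the selected span. That separation is one possible design; a real editor would also need to keep offsets in sync as the author edits.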
Any website could use these general examples. It’s even more fun to dream up structured components relevant to the subject of the website:
- A music site that references artist and album data.
- An academic site that references other scholarly articles.
- An extreme travel site that references locations by GPS coordinates.
- A knowledgebase that cross-references related documents.
(Further rabbit-hole thinking: imagine the same concept applied through the magic of external APIs. Instead of adding all of your own albums and artists, why not pull data directly from a source like discogs.com?)
What Are the Technical Gains?
The WYSISMUC interface promotes two important things: content structure and content relationships.
First, the modular, structured data provides the raw material for producing smart HTML. Think about the effort for marking up an address using a microdata schema …
This detailed markup converts human information (a place) into machine information that can be indexed (an organization, their map-able location, and associated contact information). This is a critical translative layer for things like Google Maps or recipe searches. See also RDFa and microformats.
To ask authors to produce this level of markup by hand would invite bloody mutiny. But when we capture structured sets of data right from the core author experience, implementing a dollop of template logic to dynamically produce the above block would be trivial.
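As a minimal sketch of that “dollop of template logic,” assuming the CMS hands the template a plain dictionary of captured fields, something like the following could emit schema.org microdata for an organization and its address. The function name, the dictionary keys, and the sample business are all made up for illustration; the `itemprop` names come from the schema.org Organization and PostalAddress types.

```python
from html import escape


def render_organization(org: dict) -> str:
    """Render one captured record as schema.org microdata."""

    def prop(name: str, key: str) -> str:
        # Escape author-entered text so it cannot break the markup.
        return f'<span itemprop="{name}">{escape(org[key])}</span>'

    return "\n".join([
        '<div itemscope itemtype="https://schema.org/Organization">',
        "  " + prop("name", "name"),
        '  <div itemprop="address" itemscope '
        'itemtype="https://schema.org/PostalAddress">',
        "    " + prop("streetAddress", "street"),
        "    " + prop("addressLocality", "city"),
        "    " + prop("addressRegion", "region"),
        "    " + prop("postalCode", "postal_code"),
        "  </div>",
        "  " + prop("telephone", "phone"),
        "</div>",
    ])


# Made-up sample record, standing in for what the CMS captured.
cafe = {
    "name": "Example Coffee & Co.",
    "street": "123 Main Street",
    "city": "Springfield",
    "region": "IL",
    "postal_code": "62701",
    "phone": "555-0100",
}
markup = render_organization(cafe)
print(markup)
```

The author only ever fills in six fields; the nesting, the `itemscope` attributes, and the escaping are the template’s problem.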
More importantly, we’re creating relationships within the CMS itself. This is the magic. Islands of content inside a CMS are as useless as unmanaged HTML files on a server, so building programmatic ties creates a neural network that blows open Cirque du Soleil-like possibilities of content use — hopefully driving toward a richer, more customized, more engaging user experience.
Some examples off the top of my head:
- Embed “more info” fly-out windows or pop-up cells next to individual or business names that reveal the information captured in the CMS.
- Collate stand-alone event calendars based on events referenced by authors.
- List all of the articles that reference a certain album.
- Link all books to Amazon with a referral ID embedded.
- Produce an ad hoc RSS feed for any content referencing a certain person.
- Collect GPS coordinates and build a dynamic Google Map of referenced locations.
- Provide more targeted advertising opportunities.
- Aggregate names mentioned in an article, and place them at the end of the article as a “People Mentioned” section. (Not unlike the “Crunchbase” at the end of TechCrunch articles.)
- Cross-reference analytics and internal search logs to show if users are consuming supplemental material in knowledgebase articles.
- Create site maps localized for topics, orienting around content relationships versus top-down hierarchies.
- Publish richer APIs.
And these are all ideas just on the website. We’ve not even scratched the surface of publishing to channels like email and social. Just thinking about the possibilities of content remixing, re-use, recoflabulation is enough to get me drawing boxes and lines for days.
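A couple of the website ideas above (listing every article that references an album, or building a “People Mentioned” section) fall out almost for free once references are stored as data. Here is a small sketch, assuming each article carries a list of `(kind, name)` references; that shape, the function names, and the article titles are hypothetical.

```python
from collections import defaultdict

# Made-up articles; each carries the references its author tagged.
articles = [
    {
        "title": "Modal Jazz Primer",
        "refs": [
            ("album", "Kind of Blue"),
            ("person", "Miles Davis"),
            ("person", "John Coltrane"),
        ],
    },
    {
        "title": "Five Records for Rainy Days",
        "refs": [("album", "Kind of Blue"), ("person", "Miles Davis")],
    },
    {
        "title": "Touring on a Budget",
        "refs": [("person", "John Coltrane")],
    },
]


def build_index(articles: list) -> dict:
    """Map each (kind, name) reference to the titles that mention it."""
    index = defaultdict(list)
    for article in articles:
        for ref in article["refs"]:
            index[ref].append(article["title"])
    return index


def people_mentioned(article: dict) -> list:
    """Return the sorted, de-duplicated people referenced in one article."""
    return sorted({name for kind, name in article["refs"] if kind == "person"})


index = build_index(articles)
album_articles = index[("album", "Kind of Blue")]
footer_names = people_mentioned(articles[0])
```

Here `album_articles` holds the two titles that reference the album, and `footer_names` is the list for the first article’s “People Mentioned” block. No text parsing is involved at any point.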
Traditional database-driven CMSs will have separate screens for different kinds of data: one screen to add a new author, another screen to enter a new blog post, another screen to manage events. Anyone who manages content on a site understands the tedious activity of traversing back and forth between these screens.
WYSISMUC eliminates much of that by exposing relevant schemas in the writing context. Workflow headaches are reduced, and ancillary datasets grow organically.
Structured content gives code-wranglers hooks to collect, parse and re-assemble the information in a web page. Access to smarter data encourages smarter layouts, and removes clunky workarounds like regex pattern matching.
Governance is both simplified and complicated. There’s the training speed bump, of course, and then enforcement of WYSISMUC best practices: making sure the right stuff is captured. The big upside is the mitigation of blobitis, and giving meaning to discrete elements. This also enables product and page designers to better articulate user experiences when they know exactly what kind of data is available.
What Are the Gotchas?
- As always, author adoption is the crux of any feature’s success. Additional fields that appear would have to be carefully considered. The “work” must be minimal. The experience must be intuitive.
- Additional training and documentation.
- Data-type definitions must be tightly defined. For instance, what are the rules around a “location” field in the event example above? Error-checking must be strict but helpful.
- Do large sites need dedicated metadata editors to add, enforce, and train on this type of content?
- And of course development overhead to flesh out the logic engines, and then more development to actually apply it to the site itself.
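On the “strict but helpful” point, here is one way the rules for a “location” field might look. This is only a sketch under my own assumptions: a location is a dictionary with a required `name` and optional numeric `lat` and `lon` values. None of these rules come from an existing CMS.

```python
def validate_location(value: dict) -> list:
    """Return human-readable problems; an empty list means the value is valid.

    Assumes lat and lon, when present, are already numbers.
    """
    problems = []

    name = str(value.get("name", "")).strip()
    if not name:
        problems.append("Location needs a name, such as a venue or city.")

    lat, lon = value.get("lat"), value.get("lon")
    if (lat is None) != (lon is None):
        # Exactly one coordinate was supplied.
        problems.append("Provide both latitude and longitude, or neither.")
    elif lat is not None:
        if not -90 <= lat <= 90:
            problems.append("Latitude must be between -90 and 90.")
        if not -180 <= lon <= 180:
            problems.append("Longitude must be between -180 and 180.")

    return problems


ok = validate_location({"name": "City Hall", "lat": 44.97, "lon": -93.26})
bad = validate_location({"name": " ", "lat": 120, "lon": 0})
```

Returning every problem at once, in plain language, is the “helpful” half: the author sees what to fix in a single pass instead of hitting one error at a time.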
Is This Available Now?
There has been some thinking on this previously. The SemTag prototype was an effort to select text and apply richer tagging. Certain content management systems like SDL Tridion have “component” linking that approximates some of the functionality without much elegance. The WYSIWYM community has talked about the importance of structured content over candy-coated visual presentation for years.
Rick’s post extended the idea to non-standard formatting objects, which is critically important. We’re moving the interface away from a strict “one-button-per-HTML-tag” model to a hybrid of essential HTML controls and a set of “one-button-per-relationship”.