Data Schema

A data schema (pronounced skēmə) is a representation of data, such as a model or diagram, that does not expose the underlying (i.e. messy) details. A visual representation of a schema omits most details not relevant to the information it is intended to convey, and may add elements that aid comprehension. The most common schemas are database schemas, HTML, XML and structured data markup.
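To make the database case concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table and column names are hypothetical, chosen only for illustration. It shows that a schema describes the shape of the data (names, types, required fields) without touching any of the stored rows.

```python
import sqlite3

# Hypothetical table and column names, used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " email TEXT"
    ")"
)


def describe_table(connection, table):
    """Return (column name, declared type, required?) for each column."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row from PRAGMA table_info is: cid, name, type, notnull, dflt_value, pk
    return [
        (name, col_type, bool(notnull))
        for _cid, name, col_type, notnull, _default, _pk in rows
    ]


columns = describe_table(conn, "customer")
for name, col_type, required in columns:
    print(f"{name}: {col_type}{' (required)' if required else ''}")
```

Anyone reading the output learns what a customer record looks like without needing to know how or where the rows are stored.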

So what?

Data schema is the plumbing in a technology stack, and accessing data in a tech stack is an abstract action. It works best with a simple, elegant schema that everyone who accesses the data can use with minimal disruption. Visualize a data schema sitting atop layers of complex data stores. It's your data decoder ring, or perhaps Harold's Purple Crayon. The upshot? You needn’t contend with many complicated details in order to access data. You can simply click and go forth. It can be that simple.

Is it complicated?

Once again, the physical part of technology is the easiest part in the development of software solutions. Accessing and manipulating data outside of its original data store(s) is a relatively new science. Data is somewhat standardized but still varies widely, so the majority of data requires substantial ‘tweaking’ before it is considered useful. This activity is the biggest time consumer of any data effort. Unfortunately, too many data solution project managers under-budget for it. The effort has a bittersweet name in industry: data ‘cleansing’. (Note: it is best managed through a dedicated effort. Refer to Data Citizens – Data Curator for a detailed role and justification.)
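As a rough illustration of what cleansing involves, the sketch below checks raw records against a small schema and reports what could not be repaired. The field names, the conversion rules and the sample records are all assumptions made up for this example, not a standard or a recommended rule set.

```python
# Hypothetical schema: field name -> function that converts a raw value.
SCHEMA = {
    "name": lambda v: str(v).strip().title(),
    "age": lambda v: int(str(v).strip()),
    "email": lambda v: str(v).strip().lower(),
}


def cleanse(record, schema=SCHEMA):
    """Return (clean_record, problems) for one raw record."""
    clean, problems = {}, []
    for field, convert in schema.items():
        # Treat absent, empty and None values as missing.
        if field not in record or record[field] in ("", None):
            problems.append(f"missing {field}")
            continue
        try:
            clean[field] = convert(record[field])
        except (TypeError, ValueError):
            problems.append(f"bad {field}: {record[field]!r}")
    return clean, problems


# Made-up sample records with typical inconsistencies.
raw = [
    {"name": "  ada lovelace ", "age": " 36", "email": "ADA@Example.org"},
    {"name": "Grace Hopper", "age": "unknown", "email": ""},
]
results = [cleanse(r) for r in raw]
for clean, problems in results:
    print(clean, problems)
```

Even this toy version hints at why the effort is easy to under-budget: every field needs its own rule, and every rule has exceptions that someone must review.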

Next, utilizing a schema requires an understanding of the business domain along with the science of how the schema operates. Although a dedicated role will aid in this effort, it takes an exceptional Data Curator to make an abstract solution simple enough for the majority of laymen to use successfully. Finally, a schema is either custom built, or extensions to an existing schema are developed when the available library of schemas is not sufficient in itself (usually the case when supporting the particulars of a business).

Is it worth it?

Data schemas serve as road maps. For instance, a database schema is essentially its data dictionary. Imagine accessing data from a database without a dictionary: nearly impossible. As for exchanging data, there is no need to reinvent the data mapping wheel between source and target. In the case of data exchange through hyperlinks or application navigation, HTML and XML use tags to link like data objects. For example, think of Friend of a Friend (FOAF), a linked data vocabulary for describing people and their connections, similar in spirit to a LinkedIn network. Here linked open data is used to create a virtual network of people (i.e. friends) of your friends.

And at web scale, involving enormous volumes of structured data (think Big Data), you can utilize the power of a collaborative, community-based initiative known as Schema.org. Founded by Google, Microsoft, Yahoo and Yandex, Schema.org vocabularies are developed by an open community process, using the public-schemaorg@w3.org mailing list and GitHub. A shared vocabulary makes it easier for webmasters and developers to decide on a schema and get the maximum benefit for their efforts. It is in this spirit that the founders, together with the larger community, have come together to provide a shared collection of schemas. Schema.org vocabulary can be used with many different encodings, including RDFa, Microdata and JSON-LD. These vocabularies cover entities, relationships between entities and actions, and can easily be extended through a well-documented extension model. Over 10 million sites use Schema.org to mark up their web pages and email messages.
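To show what Schema.org markup can look like in practice, here is a minimal sketch that builds a JSON-LD description of a person and wraps it in the script tag used to embed JSON-LD in a web page. The function name person_jsonld and the sample names are hypothetical; Person, name, jobTitle and knows are terms from the Schema.org vocabulary.

```python
import json


def person_jsonld(name, job_title, knows=()):
    """Build a Schema.org Person description as a JSON-LD dictionary."""
    doc = {
        "@context": "https://schema.org",
        "@type": "Person",
        "name": name,
        "jobTitle": job_title,
    }
    if knows:
        # "knows" links this person to other people, friend-of-a-friend style.
        doc["knows"] = [{"@type": "Person", "name": n} for n in knows]
    return doc


# Made-up sample data, for illustration only.
doc = person_jsonld("Jane Doe", "Data Curator", knows=["John Smith"])

# JSON-LD is typically embedded in a page inside a script tag of this type.
markup = (
    '<script type="application/ld+json">\n'
    + json.dumps(doc, indent=2)
    + "\n</script>"
)
print(markup)
```

Because the property names come from a shared vocabulary, any consumer that understands Schema.org can interpret this block without a custom mapping between source and target.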

Click here if you would like more information on data schema services.
