Exchanging data has never been easier, thanks to technological advances in data munging and wrangling, improved data storage, bigger pipes and advanced scheduling. Less than 5 years ago, it took a lot of manual intervention to exchange data (note: ‘data exchange’ is the current industry term for all data movement activity). There was always a myriad of issues and errors, because managing data is similar to managing air. You need it and you do everything with it but, like air, it is often dirty and riddled with toxins, many of which go undetected for too long and require extensive clean-up. But we really have advanced in this area, and so we exchange data more than ever before.
And therein lies the issue.
Data exchange does not necessarily move data as in, ‘once it was here and now it is there’. Rather, data leaves its footprint, a trace or a variation of itself, and proliferates throughout its lifespan. Therefore, if you are not keeping an account of your data exchanges, you have no idea how far your data has proliferated.
Many organizations have a formalized method to share data (internally and/or externally) using a service-oriented architecture (SOA). Advancements in data movement include the adoption of Extensible Markup Language (XML), a globally adopted standard format into which almost any data can be serialized for exchange.
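To make the XML point concrete, here is a minimal sketch in Python of putting a flat record into XML and reading it back. It uses only the standard library; the function names `record_to_xml` and `xml_to_record` and the sample fields are made up for illustration, and a real exchange would normally follow an agreed schema.

```python
import xml.etree.ElementTree as ET


def record_to_xml(record: dict, root_tag: str = "record") -> str:
    """Serialize a flat dict into an XML string (hypothetical helper)."""
    root = ET.Element(root_tag)
    for key, value in record.items():
        child = ET.SubElement(root, key)
        child.text = str(value)  # XML carries text, so values become strings
    return ET.tostring(root, encoding="unicode")


def xml_to_record(xml_text: str) -> dict:
    """Parse the XML produced above back into a dict of strings."""
    root = ET.fromstring(xml_text)
    return {child.tag: child.text for child in root}


if __name__ == "__main__":
    xml_text = record_to_xml({"id": 7, "city": "Austin"})
    print(xml_text)
    print(xml_to_record(xml_text))
```

Note that the round trip returns every value as a string. That is one small way a ‘variation’ of the data creeps in during an exchange.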
Additionally, industry has seen the widespread adoption of open application programming interfaces (APIs), which standardize the sharing of application functionality and the exchange of data for a wide range of needs. I bet you use these APIs daily: Google Maps, Twitter and YouTube. But those are what I call data ‘one offs’. When you are serious about exchanging data, the ProgrammableWeb is the best software solution (not to mention for software development overall, but that’s another story).
We like data provenance
Data exchange has become a popular industry-wide activity (including in the U.S. federal government), allowing the self-servicing of data. It is a great advancement in information collection. However, it is very costly if you lack controls around your data exchanges and need to prove how information was derived. Imagine not having any proof of how an analytic value was produced. Painful and potentially ruinous. One solution that brings data provenance to this dilemma is Spider8M, a data utility that keeps a cataloged record of data exchanges.
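To show what a cataloged record of data exchanges could look like, here is a minimal sketch in Python. It is not how Spider8M works; `ExchangeRecord`, `ExchangeCatalog` and the dataset names are all hypothetical. It only illustrates the idea of logging each exchange so you can later say where copies went and what a value was derived from.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ExchangeRecord:
    """One logged exchange of a dataset (hypothetical structure)."""
    dataset: str
    source: str
    destination: str
    checksum: str          # SHA-256 of the payload that was exchanged
    derived_from: tuple    # names of the upstream datasets, if any
    timestamp: str         # UTC time the exchange was logged


class ExchangeCatalog:
    """In-memory ledger of data exchanges (illustration only)."""

    def __init__(self):
        self._records = []

    def log(self, dataset, source, destination, payload: bytes, derived_from=()):
        record = ExchangeRecord(
            dataset=dataset,
            source=source,
            destination=destination,
            checksum=hashlib.sha256(payload).hexdigest(),
            derived_from=tuple(derived_from),
            timestamp=datetime.now(timezone.utc).isoformat(),
        )
        self._records.append(record)
        return record

    def copies_of(self, dataset):
        """Every destination that has received this dataset."""
        return [r.destination for r in self._records if r.dataset == dataset]

    def lineage(self, dataset):
        """Every upstream dataset this one was derived from, directly or not."""
        seen, stack = [], [dataset]
        while stack:
            current = stack.pop()
            for r in self._records:
                if r.dataset == current:
                    for parent in r.derived_from:
                        if parent not in seen:
                            seen.append(parent)
                            stack.append(parent)
        return seen


if __name__ == "__main__":
    catalog = ExchangeCatalog()
    catalog.log("sales_raw", "crm", "warehouse", b"a,b\n1,2\n")
    catalog.log("sales_raw", "warehouse", "analytics", b"a,b\n1,2\n")
    catalog.log("sales_summary", "analytics", "dashboard", b"total=3",
                derived_from=["sales_raw"])
    print(catalog.copies_of("sales_raw"))
    print(catalog.lineage("sales_summary"))
```

With a ledger like this, the question of how an analytic value was produced becomes a lookup rather than an investigation. A production catalog would also need durable storage and access controls, which this sketch leaves out.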