Metadata, Metadata, Metadata...

Since the year dot, people have been talking about metadata. Talend even went to the extent of trying to make a product out of it (part of MDM is metadata management). It comes in many, many forms, and these days it seems to be a very broad term.

There is something interesting, though. Metadata has a power that is unparalleled and almost unimaginable. What am I on about? Well, let's take a look at metadata injection in PDI.

I’m pretty sure I have blogged about this before, but I certainly remember the first time Matt Casters presented metadata injection at the Pentaho Community Meetup event. At the time I’d been using PDI for quite a while, but couldn’t quite see how this feature could be useful except in very specific use cases. Move forward 3 years (4?) and now I’m building amazing rule-based frameworks that not only reconfigure their own metadata but also link themselves together in optimal formations too.
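For anyone who hasn’t met metadata injection before, here is a rough sketch of the idea in plain Python. To be clear, this is not PDI’s API: the function names (`read_rows`, `run_template`) and the shape of the metadata dictionaries are made up purely for illustration. The point is just that the transformation logic is written once as a template, and the field layout is pushed into it at runtime.

```python
import csv
import io

# Hypothetical helper: parse delimited text into a list of dict rows.
def read_rows(text, delimiter):
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))

# The "template": the logic is fixed, but it knows nothing about
# field names or types until metadata is injected.
def run_template(rows, metadata):
    converters = {"int": int, "float": float, "str": str}
    return [
        {f["target"]: converters[f["type"]](row[f["source"]])
         for f in metadata["fields"]}
        for row in rows
    ]

# Injected metadata (assumed shape). In a real framework this would
# come from a rules table or config file, not be hard-coded.
customers_meta = {
    "delimiter": ";",
    "fields": [
        {"source": "id", "target": "customer_id", "type": "int"},
        {"source": "name", "target": "customer_name", "type": "str"},
    ],
}
orders_meta = {
    "delimiter": ",",
    "fields": [
        {"source": "order_no", "target": "order_id", "type": "int"},
        {"source": "total", "target": "amount", "type": "float"},
    ],
}

# Same template, two completely different inputs.
customers = run_template(
    read_rows("id;name\n1;Ann\n2;Bob\n", customers_meta["delimiter"]),
    customers_meta,
)
orders = run_template(
    read_rows("order_no,total\n100,9.50\n", orders_meta["delimiter"]),
    orders_meta,
)
```

One template, any number of inputs, and adding a new feed means adding a row of metadata rather than building another transformation.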

So Matt had the vision, and it was shown to be good. In the latest version of PDI, yet more steps support metadata injection, and there is ongoing work to bring support to others. And it hasn’t stopped there: further work is going on in the background to extend the ability to manage transformation metadata. But what’s interesting is that this feature almost seems to have a life of its own. It has allowed people to build things never even envisaged, and that is being embraced by Pentaho. People noticed that giving this power to users means they do wonderful things, so why not give them more power!

I’m not the only massive fan of this feature. At the London user group, Nelson, Diethard and others regularly wax lyrical about it. But what triggered this post? Well, two things. Firstly, I was demonstrating the system mentioned above to codek 2.0, the new guy replacing me. He totally understood what we were doing and by the end was pretty much astonished. And then came the golden comment: “But how would you do this in X?” (Product names censored to protect the guilty.) Point made: you can’t. (Oh, and this was the 2nd person who was amazed at what we’d managed to do with PDI. The other was a senior data architect. Nice.)

(A slight tangent here, but another situation where other ETL tools failed and PDI succeeded is data vault. An attempt was made to migrate the data vault framework from PDI to another tool, and in the end it simply couldn’t be done.)

But there is another real-world example of metadata on the web, beyond the Pentaho world, which is perhaps less obvious. Has anyone come across these “Stories” which are auto-generated by Google+ when you go on a “trip”? They are basically an auto-generated presentation. What makes them so good, though, is the underlying metadata which Google adds. Typically a lot of this is location-based data, but other bits and bobs sneak in too. Combine that with some whizzy transitions and extremely easy commenting and editing, and you have a pretty neat feature. And why is it so good? METADATA!

By the way, if you want to know more, then come to the next London user group on 2nd Dec. Details here: http://www.meetup.com/Pentaho-London-User-Group/events/178634722/