So, while recently researching lambda architectures I came across these links, and I thought some were worth sharing here:
This document has a great slide, which shows how you keep the data stores separate, but merge at the serving layer:
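To make the "separate stores, merged at the serving layer" idea concrete, here is a minimal sketch in Python. It is not taken from the slide: the page-view counters, the `batch_view` / `speed_view` names and the `query` function are all made up for illustration, and a real serving layer would sit on top of actual data stores rather than in-memory dictionaries.

```python
from collections import Counter

# Hypothetical batch view: counts precomputed by the (slow) batch layer
# from the full master dataset, as of the last batch run.
batch_view = Counter({"home": 100, "about": 20})

# Hypothetical speed view: counts for events that arrived after the
# last batch run, kept in a separate store by the speed layer.
speed_view = Counter({"home": 3, "pricing": 1})


def query(page: str) -> int:
    """Serving-layer query: the two views are only merged at read time."""
    # Counter returns 0 for missing keys, so a page known to only one
    # layer (or to neither) is handled without special cases.
    return batch_view[page] + speed_view[page]
```

The point is that neither layer writes into the other's store; when a new batch run completes, the batch view is swapped out and the speed view can drop whatever the batch run now covers.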
Just to keep things interesting, there is a subtly different view here: (from a LinkedIn guy)
That solution is not dissimilar to this document here:
An important comment about the fundamental principle of immutable data in lambda:
(Don’t worry, the page is nothing about Talend itself – A common marketing trick that tech companies seem to be using a lot these days – talk about cool tech, just to get yourself linked to. Oh damnit, I just did that. Damn!)
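For anyone who wants the immutability principle in code form, here is a rough sketch of what it tends to mean in practice: raw events are only ever appended, and any "current state" is a view derived by replaying them. The `Event` class, the `append_event` helper and the balance example are my own illustrative assumptions, not something from the linked page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: an event cannot be changed once created
class Event:
    user: str
    amount: int


def append_event(log: list, event: Event) -> None:
    """The only write operation: add a new fact to the end of the log."""
    log.append(event)


def build_balance_view(log: list) -> dict:
    """Derive the current balances by replaying every event from scratch."""
    balances = {}
    for event in log:
        balances[event.user] = balances.get(event.user, 0) + event.amount
    return balances


# A mistake is corrected by appending a compensating event,
# never by editing or deleting an earlier one.
master_log = []
append_event(master_log, Event("alice", 10))
append_event(master_log, Event("alice", -3))
```

Because the log is never rewritten, a buggy view can always be thrown away and rebuilt from the same events, which is the property the batch layer depends on.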
Then there’s the outsider – Kudu. Kudu seems to be going back to mutability. BUT Kudu is far from being suitable for production use, and it has a horrible deployment architecture.
Finally, Inquidia (Big Data Pentaho partners in the States) have a page on it, with a good summary of the options, their latency implications and so on. This can be found here: