The Log - What every software engineer should know about real-time data's unifying abstraction.
The Log: What every software engineer should know about real-time data’s unifying abstraction.
- Thoughts:
- What’s the difference between a log/stream and a messaging/notification service? Possibly that the former is for continuous data flow, whereas the latter is for one-time notifications between systems.
- Data silos in a company hold the data back; once connected, they can unlock new insights and functionality. Logs seem to be a good way to connect them.
- An interesting idea is that a log lets you feed the same data into both Hadoop-style batch systems and real-time systems (such as search indexing or monitoring). (It may sound simple but can be complicated. For example, on a previous team, I tried aggregating a stream of affiliate order events and it wasn’t easy.)
- In the following image, what happens if you realize one of the upstream systems was producing bad data for, say, a few hours or days? How do you backfill data downstream and what do you do about the bad data inside the log?
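To make the thoughts above a bit more concrete, here is a minimal sketch of the idea as I understand it: an append-only log where each consumer keeps its own offset, so a "real-time" reader and a "batch" reader can consume the same records independently, and a reader can rewind to reprocess a range (one possible way to approach the backfill question). `Log`, `Consumer`, `poll`, and `rewind` are hypothetical names for illustration, not the API of any real system, and the order events are made-up sample data.

```python
from dataclasses import dataclass, field
from typing import Any, List, Optional


@dataclass
class Log:
    """Hypothetical in-memory append-only log; records are never modified."""
    records: List[Any] = field(default_factory=list)

    def append(self, record: Any) -> int:
        self.records.append(record)
        return len(self.records) - 1  # offset of the record just written

    def read(self, offset: int, max_records: Optional[int] = None) -> List[Any]:
        end = None if max_records is None else offset + max_records
        return self.records[offset:end]


@dataclass
class Consumer:
    """Each consumer tracks its own offset, so readers do not affect each other."""
    log: Log
    offset: int = 0

    def poll(self, max_records: Optional[int] = None) -> List[Any]:
        batch = self.log.read(self.offset, max_records)
        self.offset += len(batch)
        return batch

    def rewind(self, offset: int) -> None:
        # Reprocessing (e.g. a backfill) is just moving the offset back.
        self.offset = offset


log = Log()
for amount in (10, 25, 5, 40):  # made-up order events
    log.append({"type": "order", "amount": amount})

realtime = Consumer(log)  # reads one record at a time
batch = Consumer(log)     # reads everything available in one go

first = realtime.poll(max_records=1)
total = sum(r["amount"] for r in batch.poll())

# Suppose records from offset 2 onward need reprocessing downstream.
batch.rewind(2)
replayed = batch.poll()
```

A messaging or notification service would typically drop a message once it is delivered; here nothing is removed, which is what makes the rewind possible. This sketch says nothing about what to do with the bad records that remain in the log itself, so that question stays open.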