The tide is changing for analytics architectures. Traditional approaches, from the data warehouse to the data lake, implicitly assume that all relevant data can be stored in a single, centralized repository. But this approach is slow and expensive, and sometimes not even feasible, because some data sources are too big to be replicated, and data is often too distributed to make a “full centralization” strategy successful.
Have a question?