Data Lakes May Help With Big Data

Big data is one of the most commonly used terms in the industry today, yet it remains poorly defined. There is broad agreement that data is growing rapidly in volume and is being put to more uses more often, thanks to the many tools now available for working with it.

Indeed, it’s become something of a mantra that data is a major asset of many a business.

This is even more compelling given the growth of the Internet of Things, where virtually every physical asset a business owns or operates is potentially connected to the internet and a source of reams of new data. The potential value of this data is not lost on company management and stakeholders.

This gives rise to several questions: Does the data need to be standardized? How is it stored? Which tools are best suited to using it?

Data lakes are one of the currently favoured approaches to these questions.

Traditionally, data has been stored in data warehouses, which require a degree of standardization and structure. That requirement limits their usefulness for big data, which comes in all kinds of forms (structured, semi-structured, unstructured) from many sources (just about anything imaginable).

Data lakes accommodate just about any form of data and are tied into analytical tools, such as Hadoop, that can handle such data.
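To make the contrast concrete, here is a minimal toy sketch of the data-lake idea in plain Python: files of any format land in raw form, organized only by source, and a schema is applied only when a consumer reads the data. The directory layout, function name, and sample data are all illustrative assumptions, not any real product's API.

```python
import csv
import json
from pathlib import Path

def ingest(lake_root, source, name, payload):
    """Store raw bytes as-is under <lake>/<source>/ (no upfront schema)."""
    target = Path(lake_root) / source / name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(payload)
    return target

lake = Path("lake_demo")

# Structured data: CSV from a sales system.
ingest(lake, "sales", "orders.csv", b"order_id,amount\n1,9.99\n2,24.50\n")

# Semi-structured data: JSON events from an IoT sensor.
events = [{"sensor": "t-01", "temp_c": 21.4}, {"sensor": "t-01", "temp_c": 21.9}]
ingest(lake, "iot", "events.json", json.dumps(events).encode())

# Unstructured data: a free-text maintenance note.
ingest(lake, "logs", "note.txt", b"Pump inspected; no issues found.")

# Schema-on-read: structure is imposed only by the consumer that needs it.
with open(lake / "sales" / "orders.csv", newline="") as f:
    total = sum(float(row["amount"]) for row in csv.DictReader(f))
print(total)  # 34.49
```

A warehouse, by contrast, would force all three feeds through a predefined schema before they could be stored at all; here each source keeps its native form, and tools like Hadoop play the role of the reader that imposes structure at query time.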

Like big data itself, the term data lake is loosely defined, but the two fit together well. Data lakes are the latest addition to the field of big data and Hadoop-style tools, and together they form a critical new foundation for modern data management.
