Overview#A data lake is a method of storing data in a system that allows data in varying schemas and structural forms to be kept side by side, usually as object blobs or files.
The term "data lake" has no definitive meaning and is a buzzword.
The concept of a data lake is to have a single store for all of an organization's data, ranging from raw data (an exact copy of the source data store) to transformed data used for tasks such as reporting, visualization, data security analytics, and machine learning.
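The raw-to-transformed range described above can be sketched on a local file system. This is a minimal illustration, not a reference design: the `raw/` and `transformed/` directory names, the function names, and the sample `region,amount` CSV are all assumptions made for the example, and a real data lake would more likely sit on an object store.

```python
import csv
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def sha256(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def ingest_raw(source: Path, lake: Path) -> Path:
    """Copy a source file byte for byte into the (hypothetical) raw zone."""
    raw_dir = lake / "raw"
    raw_dir.mkdir(parents=True, exist_ok=True)
    target = raw_dir / source.name
    shutil.copyfile(source, target)
    # Raw data is meant to be an exact copy, so verify the checksum.
    if sha256(source) != sha256(target):
        raise IOError(f"raw copy of {source} does not match the source")
    return target


def transform_to_report(raw_file: Path, lake: Path) -> Path:
    """Derive a small per-region total from a raw CSV file.

    The column names 'region' and 'amount' are assumptions of this sketch.
    """
    out_dir = lake / "transformed"
    out_dir.mkdir(parents=True, exist_ok=True)
    totals = {}
    with raw_file.open(newline="") as handle:
        for row in csv.DictReader(handle):
            totals[row["region"]] = totals.get(row["region"], 0) + int(row["amount"])
    target = out_dir / (raw_file.stem + "_totals.json")
    target.write_text(json.dumps(totals, sort_keys=True))
    return target


def demo() -> dict:
    """Run ingest and transform in a temporary directory and return the report."""
    with tempfile.TemporaryDirectory() as tmp:
        tmp_path = Path(tmp)
        source = tmp_path / "sales.csv"
        source.write_text("region,amount\nnorth,10\nsouth,5\nnorth,7\n")
        lake = tmp_path / "lake"
        raw_copy = ingest_raw(source, lake)
        report = transform_to_report(raw_copy, lake)
        return json.loads(report.read_text())


if __name__ == "__main__":
    print(demo())
```

Keeping the raw copy untouched means the transformed output can always be rebuilt from it if the transformation logic changes.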
A data lake can include structured data from relational databases (rows and columns), semi-structured data (CSV, logs, XML, JSON), unstructured data (emails, documents, PDFs), and even binary data (images, audio, video), creating a centralized data store that accommodates all forms of data.
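As a rough illustration of the categories above, the sketch below tags an object key with one of the four kinds of data. The extension table, the `classify` function, and the `from_relational_db` flag are hypothetical names chosen for this example. File extensions are only a heuristic, and relational extracts are assumed to be flagged by the ingestion job rather than detected from the file name.

```python
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to the categories named in the text.
CATEGORY_BY_EXTENSION = {
    ".csv": "semi-structured",
    ".log": "semi-structured",
    ".xml": "semi-structured",
    ".json": "semi-structured",
    ".eml": "unstructured",
    ".docx": "unstructured",
    ".pdf": "unstructured",
    ".png": "binary",
    ".jpg": "binary",
    ".wav": "binary",
    ".mp4": "binary",
}


def classify(key: str, from_relational_db: bool = False) -> str:
    """Return the data category for an object key.

    Data extracted from a relational database is treated as structured,
    whatever file format the extract happens to use.
    """
    if from_relational_db:
        return "structured"
    suffix = PurePosixPath(key).suffix.lower()
    return CATEGORY_BY_EXTENSION.get(suffix, "unknown")


if __name__ == "__main__":
    for key in ("exports/orders.csv", "mail/inbox/0001.eml", "media/intro.MP4"):
        print(key, "->", classify(key))
```

Recording a category like this as metadata at ingestion time is one way to keep mixed content discoverable within a single store.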
Generally, a data lake has less structure than a data warehouse, but there is no formally defined difference.
More Information#There might be more information on this subject in one of the following:
- AWS Simple Storage Service
- Actionable Intelligence
- Apache Hadoop
- Data Swamp
- Google Cloud Storage
- Web Blog_blogentry_290816_1