For enterprises that place a high value on data-driven business decisions, data lake analytics are emerging as a powerful complement to conventional data warehouse analytics. A data lake is a large and diverse reservoir of current and historical corporate data stored across a cluster of commodity servers, built on the Hadoop distributed computing platform or alternatives such as Amazon S3. Data lake analytics are more flexible and open-ended than traditional, highly structured enterprise data warehouse analytics, and they traverse far more data.
To enable data lake analytics, IT teams need to find ways to migrate data from a variety of source systems into the Hadoop data lake, keep the lake current with the latest data, and maintain the agility to add new source feeds and thereby address new business opportunities and initiatives.
Attunity Replicate is a market-leading enterprise data integration solution that enables IT teams to meet these key prerequisites of data lake analytics.
While the primary aim of data lake analytics is to better understand a business's market, customers, products, and core operations, there is another valuable type of data lake analytics: analytics aimed at monitoring and optimizing the data lake itself. For this purpose, Attunity offers a proven solution in the form of Attunity Visibility. Attunity Visibility provides operations staff with real-time insight into data lake/Hadoop performance and usage patterns, both at a high level and with a granular breakdown of storage and compute resource utilization per group, per user, and per application. Equipped with this information, operations teams can meet chargeback or showback requirements and forecast future data lake capacity needs.