Data Lake Analytics

For enterprises that place a high value on data-driven business decisions, data lake analytics are emerging as a powerful complement to conventional data warehouse analytics. A data lake is a large and diverse reservoir of current and historical corporate data, typically stored across a cluster of commodity servers running the Hadoop distributed computing platform or in an alternative such as Amazon S3 cloud object storage. Compared with traditional, highly structured enterprise data warehouse analytics, data lake analytics are more flexible and open-ended, and they traverse far more data.

To enable data lake analytics, IT teams need to find ways to migrate data from a variety of source systems into the Hadoop data lake, keep the lake current with the latest data, and maintain the agility to add new source feeds and thereby address new business opportunities and initiatives.

Enabling Data Lake Analytics with Qlik Replicate (formerly Attunity Replicate)

Qlik Replicate (formerly Attunity Replicate) is a market-leading enterprise data integration solution that enables IT teams to meet key prerequisites of data lake analytics:

  • Ingest heterogeneous source data. To fully capitalize on the power and versatility of data lake analytics, an organization's Hadoop application developers and analysts need ready access to data from a variety of sources. Qlik Replicate provides a unified big data ingestion solution that lets you load a Hadoop data lake with data from all major source platforms, including relational databases, data warehouses, SAP applications, mainframes, and files.
  • Keep the lake fresh. Data lake analytics deliver the most value when they are based on data that keeps pace with the latest developments in your business and market. Qlik (Attunity) enterprise change data capture (CDC) technology enables real-time data replication to keep your data lake in sync with source systems and provide analysts with the freshest possible data.
  • Maintain agility. The Qlik (Attunity) data integration tool empowers data managers and analysts to configure, execute, and monitor new data ingestion processes through an intuitive GUI, without any manual coding. Freed from dependence on programming staff, analysts can move quickly to pull new data types into the data lake and explore new ideas and opportunities.
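
To make the idea of change data capture concrete, here is a minimal Python sketch of how a stream of captured changes could be applied to a copy of a table so that the copy stays in sync with its source. It illustrates the general CDC concept only and is not Qlik Replicate's API or implementation; the ChangeEvent type, the apply_changes function, and the sample rows are all hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change from a source system (hypothetical shape)."""
    op: str                               # "insert", "update", or "delete"
    key: Any                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # full row image; None for deletes


def apply_changes(
    target: Dict[Any, Dict[str, Any]],
    events: Iterable[ChangeEvent],
) -> Dict[Any, Dict[str, Any]]:
    """Apply change events, in order, to an in-memory copy of a table."""
    for event in events:
        if event.op in ("insert", "update"):
            if event.row is None:
                raise ValueError(f"{event.op} event for key {event.key!r} has no row")
            # Inserts and updates both become an upsert of the latest row image.
            target[event.key] = dict(event.row)
        elif event.op == "delete":
            # Deleting a key that is already absent is treated as a no-op.
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return target


# The lake's copy of a small customer table, keyed by customer ID.
lake_table = {1: {"customer": "Acme", "status": "trial"}}

changes = [
    ChangeEvent("update", 1, {"customer": "Acme", "status": "paid"}),
    ChangeEvent("insert", 2, {"customer": "Globex", "status": "trial"}),
    ChangeEvent("delete", 2),
]

apply_changes(lake_table, changes)
print(lake_table)  # {1: {'customer': 'Acme', 'status': 'paid'}}
```

Because only the changes travel from source to target, the lake can be refreshed continuously without reloading whole tables, which is the property the "keep the lake fresh" point relies on.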

Data Lake Analytics for Operations Staff

While the primary aim of data lake analytics is to better understand a business's market, customers, products, and core operations, a second type of data lake analytics is also valuable: analytics aimed at monitoring and optimizing the data lake itself. For this purpose, Qlik (Attunity) offers a proven solution in the form of Qlik Visibility (formerly Attunity Visibility). It provides operations staff with real-time insight into data lake/Hadoop performance and usage patterns, both at a high level and with a granular breakdown of storage and compute resource utilization per group, per user, and per application. Equipped with this information, operations teams can meet chargeback or showback requirements and forecast future data lake capacity needs.
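
As a rough illustration of the per-group, per-user, and per-application breakdown described above, the following Python sketch totals storage and compute usage along one chosen dimension, which is the kind of rollup a chargeback or showback report needs. It is not based on Qlik Visibility's data model or API; the record layout, the summarize_usage function, and the sample figures are assumptions made up for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Hypothetical usage record: (group, user, application, storage_gb, cpu_hours)
UsageRecord = Tuple[str, str, str, float, float]

# Position of each reporting dimension within a usage record.
DIMENSIONS = {"group": 0, "user": 1, "application": 2}


def summarize_usage(
    records: Iterable[UsageRecord],
    dimension: str,
) -> Dict[str, Dict[str, float]]:
    """Total storage and compute usage per value of the chosen dimension."""
    if dimension not in DIMENSIONS:
        raise ValueError(f"dimension must be one of {sorted(DIMENSIONS)}")
    index = DIMENSIONS[dimension]
    totals: Dict[str, Dict[str, float]] = defaultdict(
        lambda: {"storage_gb": 0.0, "cpu_hours": 0.0}
    )
    for record in records:
        bucket = totals[record[index]]
        bucket["storage_gb"] += record[3]
        bucket["cpu_hours"] += record[4]
    return dict(totals)


# Made-up sample data for illustration only.
usage = [
    ("marketing", "ana", "churn-model", 120.0, 8.0),
    ("marketing", "ben", "campaign-report", 30.0, 2.5),
    ("finance", "chen", "forecasting", 200.0, 16.0),
]

by_group = summarize_usage(usage, "group")
print(by_group["marketing"])  # {'storage_gb': 150.0, 'cpu_hours': 10.5}
```

Calling summarize_usage with "user" or "application" instead of "group" produces the same totals sliced along that dimension.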


Five Principles for Effectively Managing Your Data Lake Pipeline