Data Preparation

Introduction to Data Preparation

Data preparation – cleaning, structuring and enriching raw data to prepare it for analytics – has become increasingly difficult in recent years. As data volumes and data sources continue to grow exponentially, the tasks involved in data prep can be an enormous burden on IT teams, creating bottlenecks for business intelligence products. Lacking an effective data transformation tool, data administrators may find they must spend 80% of their time on data preparation, leaving only 20% of their time for extracting value from data through analytics.

Finding the right data preparation solution, however, can be a challenge. Traditional data integration tools tend to be slow, requiring manual coding that can quickly drain developer resources. These tools also lack the scale required to manage data preparation for data lakes, streaming and cloud platforms. And many data preparation solutions are unable to handle the burgeoning number of data sources from Big Data, the IoT and real-time data streams.

Qlik Replicate®: An Easy-to-Use Solution for Data Preparation

Qlik Replicate is a software solution that provides automated, universal and real-time data integration to ensure faster and more thorough data preparation. By enabling IT teams to move data at high speed from sources to targets, Qlik Replicate helps to establish efficient data pipelines, with integration across all major data lakes, databases, data warehouses, streaming systems and mainframe systems, both in the cloud and on premises.

To support data preparation and data modernization, Qlik Replicate provides:

  • An easy-to-use graphical interface that significantly simplifies ingestion and replication for data preparation.

  • Streamlined and agentless configuration, using a zero-footprint architecture that eliminates the need for intrusive agents, triggers or timestamps on sources and targets, helping to improve source production performance.

  • Intelligent management and control, enabling administrators to design, execute and monitor thousands of tasks across distributed data and cloud environments.

  • Advanced change data capture (CDC) technology that serves the needs of transactional, streaming and batch architectures, and that integrates easily with a broad range of ETL technologies.

  • Automated end-to-end replication of data.

  • Universal stream generation, enabling databases to publish events to major streaming services such as Apache Kafka, Amazon Kinesis and Azure Event Hubs.
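To make the change data capture idea above more concrete, here is a minimal, generic sketch of how captured changes can be replayed against a target so that it stays in sync with a source. It is an illustration only, not Qlik Replicate's API or event format: the `ChangeEvent` class, its field names and the `apply_changes` function are all hypothetical, and the "target table" is just an in-memory dictionary keyed by primary key.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ChangeEvent:
    """One captured change. Hypothetical shape, not a Qlik Replicate format."""
    op: str                               # "insert", "update" or "delete"
    key: int                              # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # new column values (None for deletes)


def apply_changes(
    target: Dict[int, Dict[str, Any]],
    events: Iterable[ChangeEvent],
) -> Dict[int, Dict[str, Any]]:
    """Replay captured changes, in order, against an in-memory target table."""
    for event in events:
        if event.op == "insert":
            target[event.key] = dict(event.row or {})
        elif event.op == "update":
            # Merge only the changed columns; create the row if it is missing.
            target.setdefault(event.key, {}).update(event.row or {})
        elif event.op == "delete":
            target.pop(event.key, None)
        else:
            raise ValueError(f"unknown operation: {event.op!r}")
    return target


# Assumed sample data: four changes captured from a source table.
replica = apply_changes(
    {},
    [
        ChangeEvent("insert", 1, {"name": "Ada", "city": "London"}),
        ChangeEvent("insert", 2, {"name": "Grace", "city": "New York"}),
        ChangeEvent("update", 1, {"city": "Cambridge"}),
        ChangeEvent("delete", 2),
    ],
)
print(replica)  # {1: {'name': 'Ada', 'city': 'Cambridge'}}
```

Because only the changes travel from source to target, the target can be kept current without reloading the full table. In a streaming setup, the same sequence of events would typically be published to a topic rather than applied directly.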


Data Preparation for All Major Sources and Targets

Qlik Replicate supports the industry’s largest variety of sources and targets, enabling IT teams to load, ingest, migrate, distribute, synchronize and consolidate data on premises or in cloud environments. These include:

  • All major RDBMS – Qlik Replicate is an ideal Oracle, DB2, Sybase, MySQL and SQL migration tool.

  • Cloud platforms, including AWS, Azure and Google Cloud.

  • Hadoop distributions such as Cloudera (Hortonworks) and MapR.

  • Data warehouses, including Exadata, Teradata, IBM Netezza, Vertica and others.

  • Streaming platforms such as Apache Kafka and Confluent Platform.

  • Applications, including SAP.

  • Legacy solutions, including IMS/DB, DB2 z/OS, RMS and VSAM.
