Hadoop has rapidly become the preferred enterprise platform for big data analytics. For many businesses, however, Hadoop adoption is impeded because in-house development and operations staff, accustomed to conventional databases and data warehouses, lack the necessary Hadoop skills. For such organizations, the right Hadoop data integration tool can help by automating Hadoop data ingestion flows and shielding operators from some of the underlying technical complexities. The most established and well-proven Hadoop data integration tool is Attunity Replicate, the enterprise data integration technology of choice for more than 2000 businesses across a range of data-driven industries.
One of the appeals of a Hadoop data warehouse – or as it's sometimes called, a Hadoop data lake – is the ability of the Hadoop distributed parallel processing platform to analyze virtually any type of structured or unstructured data. By serving as a single unified Hadoop data integration tool that allows you to easily move data into Hadoop from any type of source system, Attunity Replicate enables your organization to fully exploit Hadoop's versatility and power.
With Attunity you can move data into any major Hadoop distribution from a wide range of on-premises or cloud-based source systems.
For each supported source type, Attunity's Hadoop data integration tool empowers users to configure and execute source-to-Hadoop data migration jobs through a drag-and-drop GUI, without needing deep technical knowledge of the underlying programming interfaces. Along with saving time and money by reducing reliance on senior programming staff, this highly automated solution for managing Hadoop and big data also positions your organization to take a more agile approach to big data analytics, making it fast and easy to add new source data feeds to a Hadoop ecosystem in response to new business demands or opportunities.
Purpose-built for big data management, Attunity's Hadoop data ingestion tool supports high-performance bulk loading into Hadoop, on demand or on a schedule. With a modular architecture that requires no agents on the source or destination system, Attunity easily scales to accommodate the bulk data migration needs of the largest enterprises.
For real-time integration between source database systems and Hadoop, or between a conventional data warehouse and Hadoop, Attunity supports log-based change data capture (CDC) that continuously refreshes your Hadoop data lake without hindering the performance of the source systems. Attunity also supports Kafka-to-Hadoop data flows that stream message-encoded CDC data through Kafka into Hadoop's HBase or other big data stores such as Cassandra, Couchbase, or MongoDB.
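To make the CDC flow concrete, here is a minimal Python sketch of how a consumer might decode message-encoded change events and apply them to a downstream key-value store. The JSON message format and field names here are hypothetical, chosen for illustration only; they are not Attunity's actual wire format, and a plain dictionary stands in for a real target such as HBase or Cassandra.

```python
import json

def apply_cdc_event(store, message):
    """Apply one JSON-encoded CDC message (hypothetical format) to an
    in-memory key-value store standing in for HBase, Cassandra, etc."""
    event = json.loads(message)
    op, key = event["op"], event["key"]
    if op in ("insert", "update"):
        store[key] = event["row"]   # upsert the latest row image
    elif op == "delete":
        store.pop(key, None)        # remove the row if present
    return store

# Simulated stream of CDC messages, as might be consumed from a Kafka topic.
stream = [
    '{"op": "insert", "key": "42", "row": {"name": "Ada", "dept": "ENG"}}',
    '{"op": "update", "key": "42", "row": {"name": "Ada", "dept": "OPS"}}',
    '{"op": "delete", "key": "42"}',
]

target = {}
for msg in stream:
    apply_cdc_event(target, msg)
# After insert, update, and delete of the same key, the target is empty again.
```

In a production pipeline the loop body would be driven by a Kafka consumer and the writes would go to the actual target store; the point here is only the insert/update/delete apply semantics that keep the data lake in sync with the source log.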
Once the data is ingested and landed in Hadoop, IT often still struggles to create usable analytics data stores. Traditional methods require Hadoop-savvy ETL programmers to manually code the various steps – including data transformation, the creation of Hive SQL structures, and reconciliation of data insertions, updates and deletions to avoid locking and disrupting users. The administrative burden of ensuring data is accurate and consistent can delay and even kill analytics projects.
Attunity Compose for Hive automates the creation, loading and transformation of enterprise data into Hadoop Hive structures. Our solution fully automates the pipeline of BI-ready data into Hive, enabling you to automatically create both Operational Data Stores (ODS) and Historical Data Stores (HDS). And we leverage the latest innovations in Hadoop such as the new ACID Merge SQL capabilities, available today in Apache Hive (part of the Hortonworks 2.6 distribution), to automatically and efficiently process data insertions, updates and deletions.
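The difference between an ODS (current state only) and an HDS (full change history) can be illustrated with a small Python sketch that folds one change feed into both. The structures below are hypothetical simplifications for illustration, not Compose's actual output; in Hive, the ODS upsert/delete step corresponds to what an ACID MERGE statement performs in SQL.

```python
def build_ods_and_hds(changes):
    """Fold a change feed into an ODS (latest state per key) and an
    HDS (append-only record of every change).
    Each change is a tuple: (seq, op, key, row)."""
    ods, hds = {}, []
    for seq, op, key, row in changes:
        # HDS keeps every change, so history can be queried later.
        hds.append({"seq": seq, "op": op, "key": key, "row": row})
        # ODS reflects only the current state of each key.
        if op in ("insert", "update"):
            ods[key] = row
        elif op == "delete":
            ods.pop(key, None)
    return ods, hds

changes = [
    (1, "insert", "k1", {"qty": 10}),
    (2, "update", "k1", {"qty": 7}),
    (3, "insert", "k2", {"qty": 3}),
    (4, "delete", "k1", None),
]
ods, hds = build_ods_and_hds(changes)
# ods holds only k2's current row; hds retains all four change records.
```

The design point this sketch captures is why both stores are useful: analysts querying current state hit the compact ODS, while auditing and trend analysis draw on the complete HDS history.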
Attunity Replicate integrates with Attunity Compose for Hive to simplify and accelerate data ingestion, data landing, SQL schema creation, data transformation, and ODS and HDS creation and updates.