
CITO Research

Advancing the craft of technology leadership

Buyer’s Guide to Big Data Integration



Introduction

Challenges of Big Data Integration: New and Old

From Hub and Spoke to Data Supply Chain

What You Need for Big Data Integration

Connect, Transport, and Transform

Integration and Canonical Forms

Data Exploration

Analytics Support

Preferred Technology Architecture

The Rewards of Getting Big Data Integration Right

Introduction

The arrival of new data types in amazing volumes, the phenomenon known as big data, continues to cause CIOs and business leaders to rethink their technology portfolios. Few companies build their own infrastructure. Most will buy it. But what should they buy? And how can they put the pieces together into a coherent whole?

The first challenge of big data is that it requires new technology. On the other hand, the arrival of big data has not rendered all other types of data and technology obsolete. Hadoop, NoSQL databases, analytic databases, and data warehouses live side by side. Analysts don’t care where the data comes from: they will crunch it from any source.

The second challenge is data integration. How can the new technology for processing big data use all of the data and technology now in place? How can existing technology and data be improved by adding big data? How can new forms of analytics and applications use both the old and the new?

CITO Research believes that CIOs and business leaders can speed progress by focusing on the task of integrating the new world of big data with the existing world of business intelligence (BI). This buyer’s guide describes how to think about purchasing technology for big data integration.

Challenges of Big Data Integration: New and Old

Aficionados of big data will be familiar with the ways that big data is different from previous generations of data. It’s often described in terms of the 3 V’s (volume, variety, and velocity), a handy construct for thinking about big data introduced by Gartner analyst Doug Laney.

The challenge is to find a repository that can handle huge volumes of data. A related problem is analyzing streams of data that come from machines, servers, mobile devices, and sensors from the Internet of Things (IoT). The Hadoop ecosystem has emerged to handle the volume, velocity, and variety of data.

Another challenge is dealing with the fact that big data often requires new techniques for exploration and analysis. Big data is typically unstructured or semi-structured. In addition, raw text documents and video are often included as data types. Machine learning, text and video analytics, and many other techniques applied to the data in Hadoop, NoSQL databases, and analytics databases help messy data become meaningful.


Once you have met these challenges, the tasks related to using big data start to feel very much like those involved in processing existing data (see “Challenges Big Data and Existing Data Share”).


While big data may change many things about the way BI is done, it will not make BI obsolete.

This means that the right path to big data integration is likely to come through existing data integration solutions that have been adapted to incorporate big data.

In addition, there is a difference between performing proofs of concept and operationalizing big data. A big data integration technology should not only enable a science experiment but should also support the beginning, middle, and end of the journey toward making full use of big data in conjunction with existing applications and systems for BI.

From Hub and Spoke to Data Supply Chain

The blending of big data and existing BI brings about a large conceptual change. The data warehouse is no longer the center of the universe. Many special purpose repositories can support applications or new forms of analysis. In addition, data will increasingly come from outside the company through APIs. Instead of a hub and spoke paradigm with the data warehouse at the center, the data processing infrastructure more often resembles a distributed supply chain.

Big data is the primary driver of this new paradigm, and big data integration provides the plumbing to make it work. CIOs and business leaders who want to move fast to exploit big data and existing BI should focus on acquiring capabilities that form the backbone of a dynamic and responsive data supply chain.

What You Need for Big Data Integration

To make the right choice about assembling a system for big data integration, consider what you will need. Most organizations will need the following capabilities.

Connect, Transport, and Transform

Accessing, moving, and transforming data have been at the heart of several generations of data integration technology. While many of these capabilities exist in the current generation of integration technology, big data adds new requirements.

The ability to handle complex data onboarding at scale from many data sources is critical. Today, enterprises face the challenge of ingesting hundreds of data sources in many different formats, some of which change over time. Additional data sources are added regularly. Solutions should leverage templates and automation to reduce the manual work of creating jobs and transformations to onboard data into Hadoop. The data onboarding process should be flexible, regular, reliable, and automated as much as is feasible.

Access to data through Hadoop, NoSQL databases, and analytic databases must be supported, as well as connectivity to a variety of data formats, including JSON, XML, various log formats, emerging IoT standards, and so on. The ability to define or discover a schema is crucial.
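To make the idea of schema discovery concrete, here is a minimal sketch in Python. It assumes the input arrives as newline-delimited JSON records. The function name `discover_schema` and the sample records are hypothetical and do not come from any particular integration product.

```python
import json
from collections import defaultdict

def discover_schema(lines):
    """Infer field names and the value types observed for each field."""
    schema = defaultdict(set)
    for line in lines:
        record = json.loads(line)
        for field, value in record.items():
            schema[field].add(type(value).__name__)
    # Sort the type names so the result is deterministic.
    return {field: sorted(types) for field, types in schema.items()}

# Hypothetical sample records, not taken from any real feed.
sample = [
    '{"device_id": "a1", "temp": 21.5}',
    '{"device_id": "b2", "temp": null, "status": "ok"}',
]
schema = discover_schema(sample)
```

A field that shows more than one observed type, or that appears in only some records, is a signal that the source format is loose or changing, which is exactly the situation a discovery step is meant to surface.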


Modern data integration technology must be deployable both in the cloud and on premises.

Synchronization of data between repositories is required as the data supply chain becomes more complex. The transport mechanisms of data integration technology need to be more sophisticated to handle the traffic. The insights from big data analysis must be delivered to applications to support more detailed, high-resolution models of reality.

The ability to transform big data is crucial. Tools should make designing and implementing transformations as easy as possible. Analysts need to blend and distill data from many sources to perform an analysis. Most of this work takes place in the data integration layer.

Transformations must be reusable and sharable.

Additionally, it matters where and how data transformation is executed. Transformation should be able to tap into the full power of Hadoop. Data integration tools must be able to run natively on the Hadoop cluster to make use of its processing power and scalability.

Transformation represents the bulk of the work involved in data integration, and with hundreds of data sources in play, transformation should take advantage of Hadoop’s ability to perform processing in parallel across the cluster.
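The pattern behind parallel transformation can be sketched in a few lines of Python. This is an illustration only: it uses threads on a single machine as a stand-in for cluster nodes, it does not call any Hadoop API, and the names `map_partition` and `run_job` as well as the sample log lines are assumptions made for the example.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def map_partition(lines):
    """Map step: count log levels within one partition of the input."""
    return Counter(line.split(" ", 1)[0] for line in lines if line)

def run_job(partitions, workers=4):
    """Apply the map step to each partition concurrently, then merge."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial_counts = list(pool.map(map_partition, partitions))
    total = Counter()
    for counts in partial_counts:
        total.update(counts)  # reduce step: sum the per-partition counts
    return total

# Hypothetical input, already split into two partitions.
partitions = [
    ["ERROR disk full", "INFO started"],
    ["INFO stopped", "ERROR timeout", "WARN slow"],
]
totals = run_job(partitions)
```

Because each partition is processed independently and the partial results are merged at the end, the same transformation logic can in principle be spread across as many workers as there are partitions.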

Because some jobs are more critical than others, it’s important to have the ability to leverage YARN on Hadoop in order to efficiently allocate cluster resources to data integration jobs and transformations, with the ability to spin resources up and down as needed. Processing terabytes of big data on a daily basis will not yield the required performance unless the full power of modern Hadoop is leveraged.

Tools should also eliminate human productivity bottlenecks. Jobs should not have to be manually coded in Pig scripts or Java; data integration tools should allow analysts to visually design and run native MapReduce jobs without the need for coding.

Big data integration also requires processing real-time streams of data from messaging systems, enterprise service buses, server log files, APIs, and other streaming data sources.

Integration and Canonical Forms

How does big data change things?

Here’s what won’t happen: your data and applications will not all be rebuilt around big data technology as their main repository, and the data in the BI systems and data warehouses you’ve built will not become instantly useless.


Here’s another thing that won’t happen: big data alone will not answer every important business question.

What does this mean? Simply that much of the time the right answer comes from blending big data with master data and transactional data stored in data warehouses.

In order to make the most of big data, it is vital to be able to combine it with existing data.

This sort of data integration is crucial at all levels of analysis, from cleaning data to creating special purpose repositories to supporting advanced visualizations. It is therefore vital that data integration technology combine both big data and existing forms of data, most often stored in SQL repositories.

In other words, the key is to choose technology that speaks both the native language of big data sources like Hadoop, NoSQL databases, and analytic databases and traditional SQL. Don’t make big data a silo by creating a separate infrastructure, team, and skill set.

Combining big data with existing data demands creating canonical forms of various kinds of information. A customer master record that offers a 360-degree view has long been a goal of BI systems. In the era of big data, customer records can be supplemented with social media activity, mobile app data, website usage, and so on.
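As a rough illustration of blending, the Python sketch below merges master data with event data into one canonical record per customer. The field names (`customer_id`, `name`, `channel`), the function `build_customer_view`, and the sample rows are all hypothetical. A real canonical form would be defined and governed by the organization.

```python
from collections import defaultdict

def build_customer_view(master_rows, events):
    """Merge master data with event data into one record per customer."""
    # Count events per customer and per channel.
    activity = defaultdict(lambda: defaultdict(int))
    for event in events:
        activity[event["customer_id"]][event["channel"]] += 1

    view = {}
    for row in master_rows:
        cid = row["customer_id"]
        view[cid] = {
            "customer_id": cid,
            "name": row["name"],
            # Customers with no events get an empty activity summary.
            "activity": dict(activity.get(cid, {})),
        }
    return view

# Hypothetical master data (for example, from a SQL warehouse).
master = [
    {"customer_id": 1, "name": "Acme"},
    {"customer_id": 2, "name": "Globex"},
]
# Hypothetical event data (for example, from web and mobile logs).
events = [
    {"customer_id": 1, "channel": "web"},
    {"customer_id": 1, "channel": "web"},
    {"customer_id": 1, "channel": "mobile"},
    {"customer_id": 3, "channel": "social"},
]
view = build_customer_view(master, events)
```

In this sketch the master data decides which customers exist, so the event for the unknown customer 3 is left out of the view. Whether to drop, quarantine, or flag such unmatched records is a design choice the canonical definition should make explicit.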

It is also important to manage canonical definitions of data in a lifecycle to create a shared understanding of data across the organization and so that changes to the standard forms of data can be controlled.

When evaluating big data integration technology, be sure that big data and existing data can be easily integrated and stored in canonical form.

Data Exploration

When companies make use of data, it is vital that everyone (analysts, end users, developers, and anyone else who is interested) is able to explore the data and ask questions. A hands-on way to examine and play with the data is required at all levels of the system.

It doesn’t matter whether the data resides in a Hadoop cluster, a NoSQL database, a special purpose repository, an in-memory analytics environment, or an application. The best results will come when anyone with a question can bang away and see if the data can answer it.


Big data integration technology should support exploration at all levels of the data supply chain with automatic schema discovery and visualization.

For big data, this usually means that some sort of exploratory environment will be used in conjunction with the repositories, which typically only allow data access through writing programs or using complicated query mechanisms.

But when big data is combined with other data, exploration must also be supported.

One of the biggest challenges in creating exploratory environments that work in conjunction with big data is that much of the time the data is not structured into rows and tables. Each record may have many different parts. Several records may form a group that represents an object. The time each record was created could play a major role in the grouping.
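One way to picture time-based grouping is the Python sketch below, which collects each source’s records into groups and starts a new group whenever the source has been idle for longer than a threshold. The record fields (`source`, `ts`, `event`), the 30-minute default, and the function name `group_by_gap` are assumptions chosen for illustration.

```python
def group_by_gap(records, max_gap_seconds=1800):
    """Group each source's records, splitting on idle gaps."""
    sessions = []
    last_seen = {}  # source -> (timestamp of last record, its group)
    for rec in sorted(records, key=lambda r: r["ts"]):
        prev = last_seen.get(rec["source"])
        if prev is not None and rec["ts"] - prev[0] <= max_gap_seconds:
            session = prev[1]  # continue the source's current group
        else:
            session = []       # idle too long, or first record seen
            sessions.append(session)
        session.append(rec)
        last_seen[rec["source"]] = (rec["ts"], session)
    return sessions

# Hypothetical records with timestamps in seconds.
records = [
    {"source": "u1", "ts": 0, "event": "login"},
    {"source": "u1", "ts": 600, "event": "view"},
    {"source": "u2", "ts": 700, "event": "login"},
    {"source": "u1", "ts": 5000, "event": "view"},
]
sessions = group_by_gap(records)
```

Here the first two records from `u1` form one group, the record from `u2` forms a second, and the late record from `u1` starts a third because it arrives well after the idle threshold.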

Big data integration technology must support fast exploration of data with a flexible structure by creating schemas on the fly that attempt to identify fields and patterns.

To support analytics needs, many organizations use a hybrid architecture in which an analytic database supports interactive visualizations while Hadoop is leveraged for extreme scale processing and refinement of diverse data.
