Buyer’s Guide to Big Data Integration


The rationale for a hybrid architecture is that analytics solutions that run directly on Hadoop are still evolving and do not necessarily support the full breadth of production use cases. In particular, many SQL-like tools on Hadoop perform well in certain use cases, but don’t always deliver the highly interactive analysis performance that the market is used to with relational data sources.

A solution approach that combines the best of Hadoop (extreme-scale processing and refinement of diverse data) with the best of analytic databases (speed-of-thought analysis on large volumes of relational data) often makes more sense.

In such a solution approach, it’s important to be able to deliver data sets and analytics to the business on demand. Automating data modeling processes helps, as does using parameterized data integration workflows that can adapt to the ever-changing business questions analysts are asking. The goal is to create a process or framework once and avoid repeated requests that result in manual work and lengthen time to decision.
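As a rough illustration of a parameterized workflow, the sketch below defines one filter-and-aggregate process and reruns it with different parameters instead of rebuilding it for each new question. Everything in it is hypothetical and chosen only for illustration: the names `WorkflowParams` and `run_workflow`, the field names, and the sample rows do not come from any particular product.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

Row = Dict[str, object]


@dataclass(frozen=True)
class WorkflowParams:
    """Hypothetical parameters an analyst might change per request."""
    region: str        # keep only rows from this region
    min_amount: float  # drop rows below this amount
    group_by: str      # field used to group the totals


def run_workflow(rows: Iterable[Row], params: WorkflowParams) -> Dict[str, float]:
    """Filter rows by the parameters, then total 'amount' per group."""
    totals: Dict[str, float] = {}
    for row in rows:
        if row["region"] != params.region:
            continue
        if row["amount"] < params.min_amount:
            continue
        key = row[params.group_by]
        totals[key] = totals.get(key, 0.0) + row["amount"]
    return totals


# Made-up sample data standing in for a refined data set.
SAMPLE_ROWS = [
    {"region": "EU", "product": "A", "channel": "web", "amount": 120.0},
    {"region": "EU", "product": "B", "channel": "store", "amount": 40.0},
    {"region": "EU", "product": "A", "channel": "store", "amount": 75.0},
    {"region": "US", "product": "A", "channel": "web", "amount": 300.0},
]

# Two different business questions, one workflow definition.
by_product = run_workflow(SAMPLE_ROWS, WorkflowParams("EU", 50.0, "product"))
by_channel = run_workflow(SAMPLE_ROWS, WorkflowParams("EU", 0.0, "channel"))
```

The workflow is written once, and a new question becomes a new `WorkflowParams` value rather than a new request for manual work.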

Analytics Support

It is well known among data analysts in any domain that as much as 80 percent of the work to get an answer or to create an analytical application is done up front to clean and prepare the data. Data integration technology has long been the workhorse of analysts who seek to accelerate the process of cleaning and massaging data.

In the realm of big data, this means that all of the capabilities mentioned so far must be present: easy-to-use mechanisms for defining transformations, the ability to capture and reuse transformations, the ability to create and manage canonical data stores, and the ability to execute queries. All of this needs to be available for big data repositories as well as for repositories that combine all forms of data.
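One way to picture "capture and reuse" is a named list of small cleaning steps that is defined once and then applied to more than one source. The sketch below assumes rows are plain dictionaries of strings. The step names, the `CLEAN_CUSTOMER` list, the alias table, and the sample rows are all hypothetical and not taken from any specific tool.

```python
from typing import Callable, Dict, Iterable, Iterator, List

Row = Dict[str, str]
Step = Callable[[Row], Row]


def strip_whitespace(row: Row) -> Row:
    """Trim leading and trailing whitespace from every value."""
    return {key: value.strip() for key, value in row.items()}


def normalize_country(row: Row) -> Row:
    """Map a few known spellings to a code; upper-case anything else."""
    aliases = {"usa": "US", "united states": "US", "u.s.": "US"}
    country = row.get("country", "")
    return {**row, "country": aliases.get(country.lower(), country.upper())}


# The captured transformation: an ordered, reusable list of steps.
CLEAN_CUSTOMER: List[Step] = [strip_whitespace, normalize_country]


def apply_steps(rows: Iterable[Row], steps: List[Step]) -> Iterator[Row]:
    """Run every step, in order, over every row."""
    for row in rows:
        for step in steps:
            row = step(row)
        yield row


# Two made-up sources that share the same cleaning needs.
web_log_rows = [{"name": " Ada ", "country": "usa"}]
crm_rows = [
    {"name": "Grace", "country": " United States "},
    {"name": "Linus", "country": "fi"},
]

clean_web = list(apply_steps(web_log_rows, CLEAN_CUSTOMER))
clean_crm = list(apply_steps(crm_rows, CLEAN_CUSTOMER))
```

Because the steps live in one place, a fix to `normalize_country` reaches every source that uses `CLEAN_CUSTOMER`.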

By supporting analysts in cleaning and distilling data using machine learning and sharing the results, the process of answering questions, building apps, and supporting visualizations is accelerated.

But analysts will face other problems unique to big data. As we pointed out earlier, big data is often dirty and noisy. Machine learning is needed to ferret out the signal. But machine learning techniques are often difficult to use.

The best big data integration technology will offer a guided experience in which machine learning makes suggestions and analysts then steer it in the right direction.

This guided approach is required because so many machine learning and advanced analytical techniques are available for many different types of data. The machine learning techniques used to create predictive models from streaming data are far different from those used for categorizing unstructured text.

Once an analyst has created a useful, clean data set, the value of the work can be amplified by allowing sharing and reuse. Right now, environments to support sharing and collaboration are emerging. Some environments support architected blending of big data at the source to enable easier use and optimal storage of big data. Big data integration technology should support such environments.


Preferred Technology Architecture

The ideal system for big data integration is different at every company. The most data-intensive firms will likely need every capability mentioned. Most companies will need quite a few of them, and more as time goes on.

The best way to provision the capabilities for big data integration is to acquire as few systems as possible that have the needed features. Most of the capabilities mentioned are stronger when they are built to work together.

The ideal big data integration technology should simplify complexity, be future proof through abstractions, and invite as many people and systems as possible to make use of data.

A fact of life in the world of data analysis is that everything is going to change. The best technology will insulate you as much as possible from changes. It should be the vendor’s responsibility to create easy to use, powerful abstractions and maintain them going forward.

The fact that big data technologies are evolving should not be your problem. Neither should the inevitable shakeout that will occur as various forms of technology and vendors fade away. Does this represent a form of lock-in? Of course, but in the end, it is better to be married to a higher level of abstraction than a lower level one.

Open source is and has been leading the way in big data innovation. A large part of the innovation in Hadoop and other big data ecosystem components has come via open source projects, not proprietary or closed approaches. Open source leads to a virtuous cycle of greater technology adoption and community-driven improvements. As such, it is key to look for data integration tools that embrace open source innovation and align with its capabilities. At the same time, open technologies tend to be more flexible and extensible than proprietary products. In an immature big data integration and analytics landscape where no one vendor can provide a complete out-of-the-box solution to meet all anticipated needs, support for the flexibility provided by open standards, open APIs, and well-developed SDKs is paramount.

By choosing technology that supports visual data modeling, it is possible to avoid a skill bottleneck. Programming knowledge should not be required for transforming, modeling, and blending data sources. Simplified environments allow more people to interact with data directly and in turn accelerate progress.

One key financial factor in choosing the right technology is the license model. Depending on how your software is deployed and the internal skill set for supporting software, there can be vast differences in the cost to acquire various capabilities. It is important to understand the benefits and drawbacks of traditional licenses, open source software licenses, and various hybrid offerings.

Select solutions based on real-world use cases, not hypothetical applications. Look for integration vendors that have helped customers achieve success specific to big data use cases, most importantly with Hadoop or NoSQL data. While the majority of vendors claim to work with big data, the reality is that many are new to the market or are older vendors that have had success with traditional use cases, but not big data use cases. Another thing to look for is deep services offerings and expertise. Solving major business problems with big data requires best practice architectures, proven project plans, hands-on training, and expert support.

Finally, the best systems for big data integration are built to be embedded into business processes and workflows. It should be possible to point the simplified forms of transformation at big data sources or at SQL repositories and to use them inside MapReduce or applications. Data integration tools should enable transformed big data to be accessed through familiar BI tools and used to feed web pages, mobile apps, or enterprise applications.

The Rewards of Getting Big Data Integration Right

Data does no good unless it is presented to a human who can somehow benefit from it or unless it is used in an automated system that a human designed. The point of big data integration is to make it as easy as possible to access, understand, and make use of data.

The rewards of getting big data integration right are the benefits that come from greatly expanded and timely use of all available data. Reducing delays, eliminating skill bottlenecks, and getting fresh data to analysts and applications means that an organization can move faster and more effectively.


By purchasing components and systems that are part of a coherent vision while at the same time leveraging ongoing open source innovation, it is possible to minimize cost and avoid compromising on needed capabilities.

The questions we started with should now be easier to answer:

What to buy? As few systems as possible that provide the capabilities you need now and in the future in a way that is easy to use and future proof.

What is the coherent whole? A vision of big data integration that incorporates existing forms and sources of data into a new system that supports all phases of a responsive, dynamic data supply chain.

Solving Big Data Integration Challenges With Pentaho

Pentaho’s big data integration and analytics platform provides broad connectivity to any type or source of data, with native support for Hadoop, NoSQL, and analytic databases.

Pentaho’s complete visual big data integration tools eliminate coding in SQL or writing MapReduce Java functions, and empower you to architect big data blends at the source for more complete and accurate analytics. Learn more at www.pentaho.com.

This paper was created by CITO Research and sponsored by Pentaho.

CITO Research

CITO Research is a source of news, analysis, research and knowledge for CIOs, CTOs and other IT and business professionals. CITO Research engages in a dialogue with its audience to capture technology trends that are harvested, analyzed and communicated in a sophisticated way to help practitioners solve difficult business problems.

Visit us at http://www.citoresearch.com


