This section collects ideas on what we think are the best ways to integrate Hadoop into existing architectures.


How to integrate existing data with Hadoop?

  • Make the import of data into the DFS transparent to existing applications
  • Use periodic batch imports
  • Save data to the DFS asynchronously
  • Take data from XML, CSV and databases
  • Pentaho: desktop ETL application that uses Hive and can be scheduled with OS jobs (cron). The full version includes importing from SAP repositories
  • Use the Hadoop Java API to upload files to the file system asynchronously
  • Use the Hadoop Distributed File System’s shell commands to import data asynchronously
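The last option above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: it assumes the `hadoop` command is on the PATH and uses the `hadoop fs -put` shell command. The function names, the thread pool size and the `runner` parameter are hypothetical and exist only to keep the example testable.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_put_command(local_path, hdfs_dir):
    """Return the argv that copies one local file into an HDFS directory."""
    return ["hadoop", "fs", "-put", local_path, hdfs_dir]


def upload_async(local_paths, hdfs_dir, runner=subprocess.call, max_workers=4):
    """Start one upload per file on a thread pool and return {path: Future}.

    Each Future resolves to whatever `runner` returns, which is the exit
    code of the command when the default subprocess.call is used.
    `runner` is a hypothetical hook so the sketch can run without a cluster.
    """
    executor = ThreadPoolExecutor(max_workers=max_workers)
    futures = {
        path: executor.submit(runner, build_put_command(path, hdfs_dir))
        for path in local_paths
    }
    # Return immediately; uploads already submitted keep running in the background.
    executor.shutdown(wait=False)
    return futures
```

The caller keeps working after `upload_async` returns and checks `future.result()` later; a non-zero exit code means that upload failed and can be retried in the next batch.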

Integration with DB
  • SQL database, then Sqoop, then Hive: Sqoop imports the relational tables so they can be queried from Hive
http://www.cloudera.com/blog/2011/06/biodiversity-indexing-migration-from-mysql-to-hadoop/
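As a rough sketch of the SQL to Hive path, the helper below only assembles a `sqoop import` command line with the `--hive-import` option. The JDBC URL, table name and user name are placeholders, and the helper itself is hypothetical; it assumes Sqoop 1 and a JDBC driver for the source database are installed.

```python
import shlex


def build_sqoop_hive_import(jdbc_url, table, username, num_mappers=1):
    """Return the argv for importing one relational table into Hive."""
    return [
        "sqoop", "import",
        "--connect", jdbc_url,        # JDBC URL of the source database
        "--table", table,             # table to import
        "--username", username,
        "-P",                         # prompt for the password on the console
        "--hive-import",              # load the imported data into a Hive table
        "--num-mappers", str(num_mappers),
    ]


def as_shell_line(argv):
    """Render the argv as one shell-quoted line, e.g. for a cron script."""
    return " ".join(shlex.quote(arg) for arg in argv)
```

For example, `as_shell_line(build_sqoop_hive_import("jdbc:mysql://dbhost/sales", "orders", "etl"))` yields a line that can be pasted into a scheduled job.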

Log files processing
  • Mount HDFS through FUSE, so that log files can be copied into it with ordinary file operations
https://ccp.cloudera.com/display/CDHDOC/Mountable+HDFS
http://wiki.apache.org/hadoop/MountableHDFS
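Once HDFS is mounted, shipping logs becomes a plain file copy. The sketch below assumes a mount point already exists (for example a hypothetical `/mnt/hdfs/logs`); the function name and the `*.log` pattern are illustrative. It writes each file once, sequentially, and does not try to append to files that are already there.

```python
import shutil
from pathlib import Path


def ship_logs(log_dir, mount_dir, pattern="*.log"):
    """Copy matching log files into the mounted directory.

    Returns the names of the files that were copied, in sorted order.
    """
    target = Path(mount_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for log_file in sorted(Path(log_dir).glob(pattern)):
        # copyfile copies contents only, without permission bits or metadata.
        shutil.copyfile(log_file, target / log_file.name)
        copied.append(log_file.name)
    return copied
```

A periodic job could call `ship_logs("/var/log/myapp", "/mnt/hdfs/logs")` after log rotation, so that only closed files are copied.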


Visualization

  • Build an open source tool
  • Flash-free tools: http://raphaeljs.com/ (it does not include export to PDF)
  • Incorporate graphics into Globant’s Rocketui. Rocketui includes widgets built on Yahoo UI, Prototype and jQuery
  • Google Visualization: there is a tool that returns a static image of a chart, like a screenshot, when a URL is called
  • Adobe Labs project to export Flash to HTML5
  • Fusion charts http://www.fusioncharts.com/
  • Info Build Toolkit
  • GWT: data models with canvas. GWT Exporter can be used so that the code can be referenced from a JS application

Visualization Tools Comparison

http://sixrevisions.com/javascript/20-fresh-javascript-data-visualization-libraries/

How to generate a dashboard once the output is ready?

  • Hive with an intermediate database: load the results into a MySQL DB and extract your reports from it
  • Export results to CSV
  • Fusion Tables
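The first two options above can be sketched together. The example assumes the job output is the tab-separated text that the Hive command line prints for a query. It uses Python's built-in `sqlite3` purely as a stand-in for the MySQL database, and the `page_views` table, its columns and the function names are hypothetical.

```python
import csv
import io
import sqlite3


def parse_hive_rows(text):
    """Split tab-separated query output into a list of rows."""
    return [line.split("\t") for line in text.splitlines() if line]


def rows_to_csv(rows, header):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)
    return buffer.getvalue()


def load_into_db(conn, rows):
    """Insert (page, views) rows into the intermediate reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views (page TEXT, views INTEGER)"
    )
    conn.executemany(
        "INSERT INTO page_views VALUES (?, ?)",
        [(page, int(views)) for page, views in rows],
    )
    conn.commit()
```

The dashboard then reads from the small intermediate table or the CSV file instead of querying Hive directly, which keeps report latency independent of the cluster.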

Pentaho functionality
Navigation@globant


Hadoop@Facebook



Hadoop@Twitter

Cassandra

http://www.datastax.com/dev/tutorials
http://www.datastax.com/docs/0.8/introduction/index#getting-started-with-cassandra
http://www.datastax.com/dev