Friday, November 2, 2012

Cloudera's Impala

Cloudera is the best-known Hadoop vendor around. Last week Cloudera announced its latest offering, Project Impala.

Project Impala is a parallel, real-time query engine that runs directly on the Hadoop Distributed File System (HDFS) or on HBase, the tabular layer over HDFS that makes it look somewhat like a relational database.

Impala does not go through Hadoop MapReduce. It uses a SQL-like syntax, including SELECT, JOIN, and aggregate functions, and lets you query data stored in HDFS or Apache HBase in real time.
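To give a feel for the kind of query this covers, here is a minimal sketch of a SELECT with a JOIN and aggregate functions. It runs against an in-memory SQLite database purely as a stand-in, since it needs no cluster. It is not Impala, and Impala's dialect may differ in details. The `customers` and `orders` tables, their columns, and the `run_demo` helper are all made up for illustration.

```python
import sqlite3

# Hypothetical query: a SELECT with a JOIN and two aggregate functions.
# Table and column names are invented for this example.
QUERY = """
SELECT c.region, COUNT(*) AS orders, SUM(o.amount) AS total
FROM orders o
JOIN customers c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY c.region
"""


def run_demo():
    """Load a few sample rows into SQLite (a stand-in, not Impala) and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, region TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
        INSERT INTO customers VALUES (1, 'east'), (2, 'west');
        INSERT INTO orders VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
        """
    )
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in run_demo():
        print(row)
```

The point of the sketch is only the shape of the query: the same style of SQL text is what you would hand to a SQL-on-Hadoop engine instead of writing a MapReduce job.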

Here is a list of articles and opinions on what this will mean for Hadoop users:

1 comment:

  1. Big Data is best for financial firms to divide everything into smaller tasks, which are then distributed through many different servers.
