On March 18, 2014, Cloudera, a Hadoop distribution software company based in Palo Alto, CA, held what Neuralytix believes to be the first industry analyst day ever held by any Hadoop company. Much to the credit of the organizers, the event was as professional and informational as any analyst event organized by much larger companies. Cloudera brought together partners and end-user customers at the event, including senior executives from Informatica and Tableau.


Cloudera shared some relative company performance figures. It has doubled customer counts since 2010. The portion of revenues attributed to software sales has steadily increased, with the balance made up of services and training. This measure is a critical and positive one. While other Hadoop distributions rely on services and training for the majority of their income, only software is able to provide a potential renewable annuity.

Senior Cloudera executives were very certain about the sustainability of the company's annual subscription-based software model. They asserted that by making the software an annual subscription, Cloudera needs to continually demonstrate value to its customers to ensure that they renew each year. Neuralytix believes that this model brings a market equilibrium between supply and demand given the fast-changing landscape of the nascent Big Data market. Cloudera argues that if customers do not feel they are getting enough value, they can simply stop renewing the software subscription and continue down the “free” open-source route if they choose, or perhaps change to another distribution based on Apache Hadoop.

Neuralytix believes that the latter is very unlikely to happen, given some of the projects that integrate with Hadoop that Cloudera has developed and primarily maintains. These include Impala, which is open-source and maintained by Cloudera.

Neuralytix believes that Cloudera has created a winning combination for Big Data customers between its fair-market revenue model, the projects it has developed and maintains, and its Enterprise Data Hub (EDH) vision.

The messages Cloudera wanted industry influencers to understand were clear:

  • Cloudera’s EDH is designed to integrate legacy, current and future approaches to data management and analysis;
  • Cloudera has the most extensive partner ecosystem of any Hadoop company;
  • Despite being a for-profit company, with a defined view towards an IPO, Cloudera is committed to developing Hadoop for the open-source community (Cloudera increased its spend on research and development by over 50% year over year in 2013, and has trained over 20,000 students in three years); and
  • Cloudera is evolving Hadoop to run “any type of workload”, including transactions.

Also attending the event was Doug Cutting, a co-founder of Cloudera and creator of Hadoop. During the event, Neuralytix asked Mr. Cutting whether he felt Hadoop is appropriate for transactional processing, a question often posed by Neuralytix clients. He ended his explanation by noting that the concept of “real-time” is subjective: while one enterprise may consider real-time to be at sub-nanosecond scale, others may be quite satisfied with improving output times to sub-second scale. To that end, he concluded that “it is hard to imagine [one day where] any type of workload that you won’t be able to run on Hadoop, including transactions.”

This discussion is a critical one, since Neuralytix believes this is where the next Hadoop battle will take place. On one side will be Cloudera with Spark and Spark Streaming, with its native integration with Hadoop; on the other will be Cloudera’s nemesis, Hortonworks, with Storm (at least, it will compete against Storm, given that Storm on YARN is still preliminary).

In the six years since Cloudera’s founding, Hadoop has come a long way. Nevertheless, there is still a long road ahead. While very large enterprises currently represent the marquee customers for Cloudera and its competitors, the Hadoop market is still a very fragmented marketplace, with highly intricate dependencies. Cloudera itself has 240 software ISV partners representing a spectrum of companies from start-ups to highly established public software companies, many of which may also offer partnerships and solutions that compete with those offered with Cloudera.


The Hadoop (and by extension, the Big Data) market is not measured by revenue, but rather by perception. Revenue does not properly capture the acceptance of Hadoop, given the large population of users who use open-source Hadoop.

In the view of Neuralytix, the market has a very positive perception of Cloudera. We see a rough parallel between Red Hat during the rise of Linux a decade ago and Cloudera today with respect to Hadoop. That positive perception represents a number of critical success factors:

  • Acceptance of the technology (Hadoop);
  • Recognition of Cloudera as a market leader;
  • Recognition of Cloudera as an enterprise player; and
  • Acceptance of Cloudera as a viable, sustainable growth business.

These four criteria make Cloudera a standout among Hadoop vendors. More importantly, they make Cloudera a go-to vendor for all projects related to analytics. To repeat Neuralytix’s axiom regarding Big Data: “if you’re not doing it, your competitors are.” This means that all large enterprises are considering (and optimally, implementing) some form of Big Data project. It logically follows that many enterprises have Cloudera top of mind.

Without doubt, Cloudera must take advantage of its current market position. However, despite this advantage, Cloudera is not without challenges.

The biggest challenge facing Cloudera, and in fact all Hadoop vendors, is the speed of adoption of Hadoop overall. Clearly, some large enterprises and Web 2.0 companies have leveraged Big Data and Hadoop for some spectacular results. However, this does not mean that Hadoop is being adopted rapidly. Neuralytix research suggests that most enterprises have yet to begin exploring or implementing Big Data solutions.

For most enterprises, it is not so much about the maturity or reliability of technology; instead, most enterprises may have internal political issues that inhibit them from exploiting the value of data and information. For those enterprises that have taken a step towards implementing Big Data solutions, many have started with rogue technology organizations (often within marketing departments) and demonstrated through proofs of concept that Big Data solutions can yield impressive outcomes.

Other enterprises that have already taken deliberate and active steps towards using Big Data solutions, many of them based on Hadoop, are organizations that have a history of using business intelligence and data analytics to support business decisions – these include insurance companies, financial institutions and certain parts of the retail sector.

Luckily, the business challenge for Cloudera is likely to be a veritable treasure trove of opportunity. The best way for Cloudera to position itself to take advantage of these opportunities will be to continue down the educational and partnering paths on which it is already engaged. Additionally, it needs to develop more vertical solutions to persuade Big Data laggards that better data management leads to big returns.
