Certified distributions maintain compatibility with the open source Apache Spark distribution and thus support the growing ecosystem of Spark applications
“One of Databricks’ goals is to ensure users have a fantastic experience. Our belief is that having the community work together to maintain compatibility and therefore facilitate a vibrant application ecosystem is crucial to this vision,” said
In keeping with the open source nature of Spark, the certification process is fully transparent, lightweight, and 100% free - a mirror image of the “Certified on Spark” process for Spark applications. Vendors fill out a short questionnaire and then simply execute a set of open-source tests - developed and maintained by the community and used to test each release of Apache Spark - against their build of Spark to demonstrate compatibility.
“Certification shouldn’t be used as a tool for lock-in: Certified Spark Distributions are not required to ship all the bits of Apache Spark, or be open source, or prevented from innovating significantly within and around Spark,” said
As part of the certification program launch, five vendors have completed the certification process: DataStax, Hortonworks, IBM, Oracle, and Pivotal - industry leaders that have recognized and embraced the power of Spark when integrated with their respective platforms. Each of these vendors put their distributions through the certification process, which included a host of integration tests to ensure full compatibility with the latest Apache Spark release.
“One of the big risks faced by open source projects is fragmentation among distributors. Fragmentation is bad for both users and application developers, and ultimately for the growth of the project,” said
Vendors interested in certifying their Spark distribution should visit www.databricks.com and select "Apply for Certification." Enterprise users can also visit the Databricks site regularly to see the latest set of certified distributions and applications, and read “spotlight” blog articles that provide deep-dives on the Spark ecosystem by newly certified vendors.
All the inaugural members will be on hand at the upcoming Spark Summit from
"DataStax is strongly committed to making Cassandra and Spark the best combination for today's online applications," said
“We support the fact that Apache Spark project provides enterprises with an additional processing engine in Hadoop to execute in-memory algorithms for advanced analytics,” said
"Pivotal's open source credentials are quite extensive - Apache-compatible Hadoop, MADLib, RabbitMQ, CloudFoundry - and now we've added Spark to that set," said
Databricks was founded by the creators of Apache Spark, who have been working for the past six years on cutting-edge systems to analyze and process Big Data. They believe that Big Data is a tremendous opportunity that is still largely untapped, and are actively working to revolutionize what enterprises can do with it. Databricks is venture-backed by Andreessen Horowitz. For more information, visit http://www.databricks.com.