Microsoft plunged deeper into the open source milieu last week, as it expanded support for non-Microsoft software in its Azure cloud database lineup.
Among a host of developer-oriented updates discussed at the Microsoft Connect(); 2017 conference were new ties to the Apache Spark processing engine and Apache Cassandra, one of the top NoSQL databases. The company also added the MariaDB database to open source relational database services available on Azure that already include MySQL and PostgreSQL.
Taken together, the moves are part of an ongoing effort to fill in Microsoft's cloud data management portfolio on the Azure platform, and to keep up with cloud computing market leader Amazon Web Services (AWS).
A database named MariaDB
Azure cloud database inclusion of MariaDB shows Microsoft's "deep commitment to supporting data stores that might not necessarily be from Microsoft," said consultant Ike Ellis, a Microsoft MVP and a partner at independent development house Crafting Bytes in San Diego, Calif.
Such support is important because MariaDB has gained attention in recent years, very much as an alternative to MySQL, which was the original poster child for open source relational databases.
MariaDB is a fork of MySQL, with development overseen primarily by Michael "Monty" Widenius, the MySQL creator who was vocally critical of Oracle's stewardship of MySQL once it became a part of that company's database lineup. In recent years, under the direction of Widenius and others, MariaDB has added thread pooling, parallel replication and various query optimizations. Widenius appeared via video at the Connect(); event, which took place in New York and was streamed online, to welcome Microsoft into the MariaDB fold.
Microsoft said it was readying a controlled beta of Azure Database for MariaDB. The company also said it was joining the MariaDB Foundation, the group that formally directs the database's development.
"MariaDB has a lot of traction," Ellis said. "Microsoft jumping into MariaDB is going to help its traction even more."
Cassandra on the cloud
While MariaDB support expands SQL-style data development for Azure, newly announced Cassandra support broadens the NoSQL part of the Azure portfolio, which already included a Gremlin graph database API and a MongoDB API.
Unlike MongoDB, which is document-oriented, Apache Cassandra is a wide-column store.
Like MongoDB, Cassandra has found considerable use in web and cloud data operations that must quickly shuttle fast-arriving data for processing.
Now in preview, Microsoft's Cassandra API works with Azure Cosmos DB. This is a Swiss army knife-style database -- sometimes described as a multimodel database -- that the company spawned earlier this year from an offering known as DocumentDB. The Cassandra update fills in an important part of the Azure cloud database picture, according to Ellis.
"With the Cassandra API, Microsoft has hit everything you would want to hit in NoSQL stores," he said.
Microsoft's latest Spark move sees it working with Databricks, the startup formed by members of the original team that conceived the Spark data processing framework at University of California, Berkeley computer science labs.
These new Spark services stand as an alternative to Apache Spark software already offered as part of Microsoft's HDInsight product line, which was created together with Hadoop distribution provider Hortonworks.
Known as Azure Databricks, the new services were jointly developed by Databricks and Microsoft and are being offered by Microsoft as a "first-party Azure service," according to Ali Ghodsi, CEO of San Francisco-based Databricks. Central to the offering is native integration with Azure SQL Data Warehouse, Azure Storage, Azure Cosmos DB and Power BI, he said.
Azure Databricks joins a host of recent cloud-based services appearing across a variety of clouds, mostly intended to simplify self-service big data analytics and machine learning over both structured and unstructured data.
Ghodsi said Databricks' Spark software has found use at credit card companies doing real-time fraud analytics, at life sciences firms combining large data sets, and in IoT and other applications.
Taking machine learning mainstream
The Microsoft-Databricks deal described at Connect(); is part of a continuing effort to broaden Azure's use for machine learning and analytics. Earlier, at its Microsoft Ignite 2017 event, the company showed an Azure Machine Learning Workbench, an Azure Machine Learning Experimentation Service and an Azure Machine Learning Model Management service.
Observers generally cede overall cloud leadership to AWS, but cloud-based machine learning has become a more competitive area. It is a place where Microsoft may have passed Amazon, according to David Chappell, principal at Chappell and Associates in San Francisco, Calif.
"AWS has a simple environment that is for use by developers. But it is so simple that it is quite constrained," he said. "It gives you few options."
The audience for Microsoft's Azure machine learning efforts, Chappell maintained, will be broader. It spans developers, data scientists and others. "Microsoft is really trying to take machine learning mainstream," he said.
Economics in the cloud
Microsoft's broadened open source support is led by this year's launch of SQL Server on Linux. But that is only part of Microsoft's newfound open source fervor.
"Some people are skeptical of Microsoft and its commitment to open source, that it is like lip service," Chappell said. "What they don't always understand is that cloud computing and its business models change the economics of open source software.
"In the cloud world, you aren't selling software; you are selling services," Chappell continued. "Whether it is open source or not, whether it is MariaDB, MySQL or SQL Server -- that doesn't matter, because you are charging customers based on usage of services."
Azure data services updates are not necessarily based on any newfound altruism or open source evangelism, Chappell cautioned. It's just, he said, the way things are done in the cloud.