Cloud Provider Popularity on Twitter and Stack Overflow

August 26, 2022
This Python analysis uses data from Twitter and Stack Overflow to measure the popularity of cloud providers.

by Ana Pires Fernandes

Analysing Publicly Available Data Can Help Inform Strategic Tech Choices In The Cloud

According to recent research, firms that excel in digital transformation are 26% more profitable, generate 9% more revenue from their physical assets, and achieve 12% higher market valuations than other large firms in their industries. The backbone of a holistic end-to-end digital transformation is moving systems to the cloud. Tech-savvy businesses cited easier deployment (49%), scalability (48%), faster implementation speed (44%), automatic updates (37%) and real-time visibility (34%) as the main benefits of choosing a cloud service that both represents and advances the state of the art.

Keeping up with the latest advances in cloud technology is imperative for businesses, but it can be a lengthy process that requires expertise and data analysis. Relying on any single metric to compare technologies is insufficient, and performing this analysis at scale and in an automated manner is challenging. For this reason, this blog post examines the use of Python to measure the popularity of cloud providers by multiple metrics, using the public APIs of two diverse data sources: Twitter and Stack Overflow (a website where developers post technical questions). The data was processed in Python with the Pandas package and visualised using Plotly. While the focus is on cloud providers, this process can scale to monitor trends in any technology.

Cloud Providers on Stack Overflow

Figure 1 shows the popularity of each cloud provider on Stack Overflow, measured by the number of questions carrying a tag relevant to that provider. All tags relevant to a cloud provider were considered. As may be expected, Amazon Web Services, Microsoft Azure and Google Cloud are by far the most discussed on Stack Overflow, while Heroku and IBM have a notable presence but are some way behind.

Stack Overflow Question Tag Count

Fig 1. Count of Stack Overflow questions tagged by all tags relevant to a cloud provider (logarithmic scale). Source: Stack Overflow API
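
The tag counting described above could be sketched roughly as follows. This is a minimal illustration, not the code behind Figure 1: it assumes the Stack Exchange API `tags/{tags}/info` endpoint, and the provider-to-tag mapping and the helper names `fetch_tag_counts` and `total_by_provider` are hypothetical, since the post does not list the tags that were used.

```python
import gzip
import json
import urllib.parse
import urllib.request

# Stack Exchange API endpoint that returns the question count for named tags.
API_URL = "https://api.stackexchange.com/2.3/tags/{tags}/info?site=stackoverflow"

# Hypothetical mapping: the tags actually used in the analysis are not given.
PROVIDER_TAGS = {
    "Amazon Web Services": ["amazon-web-services", "amazon-s3", "aws-lambda"],
    "Heroku": ["heroku"],
}


def fetch_tag_counts(tags):
    """Return {tag_name: question_count} for a small batch of tags (network call)."""
    joined = urllib.parse.quote(";".join(tags), safe=";")
    with urllib.request.urlopen(API_URL.format(tags=joined)) as response:
        body = response.read()
    # The API compresses its responses; gzip data starts with these two bytes.
    if body[:2] == b"\x1f\x8b":
        body = gzip.decompress(body)
    payload = json.loads(body)
    return {item["name"]: item["count"] for item in payload["items"]}


def total_by_provider(provider_tags, tag_counts):
    """Sum the per-tag counts for each provider; unknown tags count as zero."""
    return {
        provider: sum(tag_counts.get(tag, 0) for tag in tags)
        for provider, tags in provider_tags.items()
    }
```

A typical use would be to call `fetch_tag_counts` once per provider and pass the merged result to `total_by_provider`. Note that summing per-tag counts counts a question more than once if it carries several of a provider's tags, so the totals are an upper bound on the number of distinct questions.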

Figure 2 shows the popularity of each cloud provider in Stack Overflow's annual developer survey, which had 56,553 participants from around the world. The results in Figure 1 are corroborated by Figure 2 for the four most popular cloud providers. Interestingly, IBM Cloud is the fifth most tagged in questions but ranks last in the developers’ preference survey.

Stack Overflow Developer Survey

Fig 2. Developers from all levels of experience pick their most preferred cloud provider in the Stack Overflow survey. Alibaba Cloud and OpenStack were not part of the survey. Source: Stack Overflow, 2022 Developers Survey.
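
One way to make the comparison between the two rankings explicit is to rank providers by each metric and look at the difference. The sketch below is an assumption about how this could be done, not the method used for the figures; the function names are hypothetical and the numbers are placeholders, not the published data.

```python
def rank_providers(metric):
    """Map each provider to its rank (1 = highest value) for one metric."""
    ordered = sorted(metric, key=metric.get, reverse=True)
    return {provider: position for position, provider in enumerate(ordered, start=1)}


def rank_gaps(metric_a, metric_b):
    """Rank in metric_b minus rank in metric_a, for providers present in both."""
    shared = metric_a.keys() & metric_b.keys()
    ranks_a = rank_providers({p: metric_a[p] for p in shared})
    ranks_b = rank_providers({p: metric_b[p] for p in shared})
    return {p: ranks_b[p] - ranks_a[p] for p in sorted(shared)}


# Placeholder values for illustration only.
tag_counts = {"provider_a": 120_000, "provider_b": 90_000,
              "provider_c": 4_000, "provider_d": 1_500}
survey_share = {"provider_a": 50.0, "provider_b": 30.0,
                "provider_c": 2.0, "provider_d": 9.0}

gaps = rank_gaps(tag_counts, survey_share)
```

A gap of zero means the two metrics agree on a provider's position. A positive gap means the provider ranks lower in the survey than in tag counts, which is the pattern the text describes for IBM Cloud.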

Cloud Providers on Twitter

Figure 3 and Figure 4 show cloud providers' Twitter popularity using two metrics: the number of followers of their main Twitter account, and the average tweet engagement, defined as the average number of retweets, replies, quotes and likes across their last 250 Twitter posts. Once again, Amazon Web Services has an overwhelming lead in both metrics. As expected, average engagement per tweet closely follows the number of followers each cloud provider has. However, there are a few exceptions. Cloudflare ranks 8th by number of followers, yet it amasses an impressive level of engagement per tweet, ranking 2nd by that metric, while VMware has comparatively low tweet engagement for its number of followers.

In general, the smaller cloud providers by Stack Overflow metrics are also the smaller providers by Twitter metrics, as evidenced by OVHcloud, Oracle Cloud, Linode and CWCS Managed Hosting. Notable exceptions here are Alibaba Cloud and Heroku. Alibaba Cloud has comparatively high average tweet engagement and ranks 4th in followers. Heroku, despite being popular amongst developers as measured by Stack Overflow metrics, does not have a comparably large number of Twitter followers, although it does better by average tweet engagement.

Cloud Provider Follower Count on Twitter

Fig 3. Number of Twitter followers each cloud provider has on its verified official main Twitter account (logarithmic scale). Source: Twitter API

Cloud Provider Average Tweet Engagement

Fig 4. Average engagement per tweet is the average combined count of likes, retweets, replies and quotes across the most recent 250 tweets from each cloud provider’s verified official main account. (Note: the value for CWCS Managed Hosting is an average engagement per tweet of 0.14, displayed using the m suffix for milli, 1e-3.) Source: Twitter API.
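
The engagement metric defined above can be sketched as a small function. This is a minimal illustration under two assumptions: tweets arrive newest first as dictionaries with a `public_metrics` field, as in Twitter API v2 tweet objects, and the four counts are summed per tweet before averaging. The name `average_engagement` is hypothetical.

```python
from statistics import mean

# The four counts that make up engagement, as named in Twitter API v2.
METRIC_KEYS = ("retweet_count", "reply_count", "quote_count", "like_count")


def average_engagement(tweets, limit=250):
    """Average combined engagement over the first `limit` tweets (newest first)."""
    recent = tweets[:limit]
    if not recent:
        return 0.0
    return mean(
        sum(tweet["public_metrics"].get(key, 0) for key in METRIC_KEYS)
        for tweet in recent
    )
```

Accounts that post rarely or receive little interaction produce values below 1, which is consistent with the 0.14 reported for CWCS Managed Hosting.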

Combining Cloud Providers' Twitter and Stack Overflow Metrics

Figure 5 combines all Stack Overflow and Twitter metrics in a single chart for comparison. Apart from IBM Cloud’s unusually high tweet engagement, VMware's unusually low tweet engagement, and Heroku's unusually low number of Twitter followers relative to their performance on Stack Overflow, the data appears consistent across cloud providers and data sources. The leading cloud providers are Amazon Web Services, Microsoft Azure, and Google Cloud Platform, with Amazon Web Services leading by a wide margin on all metrics. The smallest providers as measured by all metrics are OVHcloud, Linode and Oracle Cloud. There is a clear pattern of darker colours (indicating fewer Twitter followers) and smaller bubbles (indicating lower engagement per tweet) in the bottom left of Figure 5. Similarly, there is a pattern of lighter colours (indicating more Twitter followers) and bigger bubbles (indicating higher engagement per tweet) in the top right.

Twitter Presence vs Stack Overflow Presence

Fig 5. Developers' cloud provider preference (percentage) plotted against Stack Overflow question tag count (logarithmic scale). The colour gradient represents the number of followers each cloud provider has on its official verified account, and the bubble size represents average engagement per tweet (CWCS Managed Hosting is not shown to preserve the image scale, and OpenStack and Alibaba Cloud are absent due to missing data). Source: Twitter API and Stack Overflow API.
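
A chart like Figure 5 could be assembled along the following lines. This is a sketch under stated assumptions rather than the code that produced the figure: the column names and the helpers `merge_metrics` and `build_bubble_chart` are hypothetical, and the plotting step assumes Pandas and Plotly Express are installed.

```python
def merge_metrics(**metrics):
    """Combine {provider: value} mappings into rows.

    Only providers present in every mapping are kept, which mirrors how
    providers with missing data are left out of the combined chart.
    """
    shared = set.intersection(*(set(values) for values in metrics.values()))
    return [
        {"provider": p, **{name: values[p] for name, values in metrics.items()}}
        for p in sorted(shared)
    ]


def build_bubble_chart(rows):
    """Return a Plotly bubble chart from rows produced by merge_metrics."""
    # Imported here so merge_metrics can be used without the plotting packages.
    import pandas as pd
    import plotly.express as px

    frame = pd.DataFrame(rows)
    return px.scatter(
        frame,
        x="tag_count",        # Stack Overflow question tag count
        y="survey_share",     # developer survey preference, in percent
        size="engagement",    # average engagement per tweet
        color="followers",    # Twitter follower count
        hover_name="provider",
        log_x=True,           # logarithmic scale for tag counts
    )
```

Calling `merge_metrics(tag_count=..., survey_share=..., followers=..., engagement=...)` with one mapping per metric yields rows that `build_bubble_chart` can plot, and the returned figure can be displayed with its `show()` method.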

This analysis has shown that there are benefits to examining data from multiple sources when measuring the popularity of technology, and that each data source tells its own story and suffers from its own biases. By automating the collection of data from multiple sources with Python, it is possible to obtain versatile results that offer a coherent and comprehensive analysis of cloud provider popularity. This methodology can be extended to a wide range of technologies.

Ana Pires Fernandes is studying for a BA in Politics, Philosophy and Economics at the University of Manchester and is a Q-Step Data Analyst Intern at Opsmorph.

Contact Opsmorph

Whether you're a startup looking to accelerate development of your cloud platform, a research project looking to introduce machine learning, or an organisation of any size looking to make better use of your data, contact Opsmorph for a free consultation to see how we can help.

info@opsmorph.com