Open Source in Python and JavaScript 2024

Part I: The Evolution of Ecosystems

October 14th, 2024
Erik Nogueira Kückelheim

Python and JavaScript are two of the most used programming languages. They consistently rank among the top choices for developers worldwide.

What's the secret of their success?

I'd argue, and many will agree, that their strong ecosystems are a huge part of the reason. Developers have access to countless third-party tools, libraries, and frameworks, enabling them to push the boundaries of what's possible with Python and JavaScript every day.

The vast majority of these projects are open-source. That means they are freely available for everyone to view, modify, and distribute. As a result, they depend on a robust community of developers who collaboratively build and maintain the codebase.

But what's the current state of these communities? How large are they? Which areas are gaining the most traction? Which ecosystem has the stronger foundation, and which is thriving more?

This article is the first in a series that aims to answer these questions. Here, we will focus on the evolution of the open-source communities behind Python and JavaScript. We will explore trends of interest over time and the growth rate of both ecosystems in the past.

The base of this series is an analysis of over 36,000 open-source repositories hosted on GitHub.

GitHub

So, why is GitHub a good starting point to analyze the communities surrounding Python and JavaScript?

First, GitHub hosts the codebase of the vast majority of open-source projects. It's also essentially a social networking site for developers. It enables collaboration on code, allows users to track changes, and facilitates sharing work with others. Most communication in open-source occurs on GitHub, making it a central hub for these communities.

Yes, there are similar platforms out there. But none of them play a significant role when it comes to open-source in Python and JavaScript.

TypeScript

When talking about the JavaScript ecosystem, we cannot leave out TypeScript. TypeScript is a statically typed superset of JavaScript developed by Microsoft. It was first released in 2012.

TypeScript is built on top of JavaScript. It adds features like static typing and type inference. This is similar to Python's type hints, introduced in Python 3.5, which allow for optional type checking with tools like mypy.
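To make the comparison concrete, here is a minimal sketch of Python's optional type hints. The function name `total_stars` and the numbers are made up for illustration; the point is only that annotations are checked by external tools such as mypy, not by the interpreter.

```python
def total_stars(star_counts: list[int]) -> int:
    """Sum the star counts of several repositories."""
    return sum(star_counts)


# Correct usage: the argument matches the annotation.
print(total_stars([250, 1200, 31]))  # prints 1481

# A static checker such as mypy would report the call below as an error,
# because a str is passed where list[int] is expected. Python itself does
# not enforce annotations at runtime, so the mistake would only surface
# when sum() tries to add an int and a str.
# total_stars("250")
```

As with TypeScript, the type information serves development tooling; the program that runs is unaffected by it.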

There's ongoing debate about whether TypeScript should be considered a separate entity. However, it's clear that TypeScript is inherently linked to JavaScript. Importantly, TypeScript doesn't have its own runtime. It's used only during development. In the build process, it's compiled into plain JavaScript, which can then be executed in any JavaScript runtime. Moreover, many open-source JavaScript repositories have transitioned to TypeScript over time.

For this analysis, I consider the JavaScript and TypeScript ecosystems as one. Excluding TypeScript from JavaScript's open-source community would significantly distort the results of this study.

Data

The analysis includes all GitHub repositories that meet the following criteria:

  • Language: The code of the repository has to be mainly written in Python or JavaScript/TypeScript.
  • Open Source License: The repository has to be licensed under a common open-source license[1].
  • Star Count: The repository has to have a minimum of 250 stars.

By applying these criteria, I aimed to get a complete picture of the communities behind JavaScript and Python.
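The three criteria can be read as a simple filter. The sketch below is not the code used for the study; the `Repo` record, its field names, and the use of SPDX-style license identifiers are assumptions made for illustration. The license set mirrors the licenses listed in footnote [1].

```python
from dataclasses import dataclass

# Assumption: licenses are recorded as SPDX-style identifiers that
# correspond to the licenses listed in footnote [1].
ALLOWED_LICENSES = {
    "MPL-2.0", "MIT", "LGPL-2.1", "GPL-2.0", "GPL-3.0", "EPL-2.0",
    "CC0-1.0", "BSD-3-Clause", "BSD-2-Clause", "AGPL-3.0", "Apache-2.0",
}
ALLOWED_LANGUAGES = {"Python", "JavaScript", "TypeScript"}
MIN_STARS = 250


@dataclass
class Repo:
    """Hypothetical record for one repository (not a GitHub API type)."""
    name: str
    language: str    # main language of the repository
    license_id: str  # SPDX-style license identifier
    stars: int


def is_included(repo: Repo) -> bool:
    """Return True if the repository meets all three inclusion criteria."""
    return (
        repo.language in ALLOWED_LANGUAGES
        and repo.license_id in ALLOWED_LICENSES
        and repo.stars >= MIN_STARS
    )


# Made-up examples, not repositories from the dataset.
print(is_included(Repo("example-a", "Python", "MIT", 250)))             # True
print(is_included(Repo("example-b", "TypeScript", "Apache-2.0", 249)))  # False
```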

Who Takes the Lead in Raw Numbers?

Let's have a look at some raw numbers:


Total Number of Repositories

Bar chart showing the total number of open-source GitHub repositories written in Python or JavaScript

JavaScript leads the way with a total of 21,636 open-source repositories on GitHub that have at least 250 stars. Python, while in second place, still impresses with 15,296 repositories meeting the same criteria.

Both languages offer developers an abundance of openly available code that they can reuse or contribute to.

Now, let's consider the factor of time. All these repositories are, to some extent, established within the open-source community. But how did we get here? When were these repositories created? Are there any noticeable patterns in repository creation over time? And is the community still growing, or has it stagnated?

To answer these questions, we'll compare repository creation over time:


Cumulative Repository Creation Over Time (Python vs. JavaScript)

Line plot showing the cumulative number of open-source GitHub repositories written in Python or JavaScript between 2008 and 2024.

The graph above shows the total number of repositories created for both Python and JavaScript between GitHub's early days in 2008 and today.

Both communities have experienced steady growth since the beginning. And they keep growing.

However, JavaScript took the lead around 2010, and Python has never managed to catch up.
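A cumulative curve like this is just a running total of yearly counts. The following sketch shows one way to derive both from a list of creation years. The function names and the toy input are hypothetical and are not taken from the study's code or data.

```python
from collections import Counter
from itertools import accumulate


def yearly_counts(creation_years, first_year, last_year):
    """Count repositories per creation year, including years with none."""
    counts = Counter(creation_years)
    return {year: counts.get(year, 0)
            for year in range(first_year, last_year + 1)}


def cumulative_counts(per_year):
    """Turn yearly counts into a running total, ordered by year."""
    years = sorted(per_year)
    totals = accumulate(per_year[year] for year in years)
    return dict(zip(years, totals))


# Made-up toy input: one creation year per repository.
toy_years = [2008, 2009, 2009, 2011, 2011, 2011]

per_year = yearly_counts(toy_years, 2008, 2011)
print(per_year)                     # {2008: 1, 2009: 2, 2010: 0, 2011: 3}
print(cumulative_counts(per_year))  # {2008: 1, 2009: 3, 2010: 3, 2011: 6}
```

Because the cumulative value never decreases, a slowdown in new repositories shows up as a flatter curve rather than a falling one.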

This next graph illustrates the annual count of new repositories created for Python and JavaScript:


Number of Repositories Created per Year (Python vs. JavaScript)

Line plot highlighting the number of open-source GitHub repositories written in Python or JavaScript created per year between 2008 and 2024.

The figure highlights fluctuations year over year, uncovering some interesting trends.

Between 2015 and 2018, JavaScript experienced growth rates exceeding 2,000 new repositories per year. It reached its peak in 2017 with more than 2,500 new repositories. However, this number has seen a significant decline in recent years.

This decline doesn't indicate that JavaScript's ecosystem is shrinking. In fact, it continues to grow. However, the growth rate has more than halved since 2017.

Python's open-source community follows a similar trend, though slightly delayed and at a lower rate. It reached its peak in 2019, two years after JavaScript, with more than 1,700 new repositories.

Like JavaScript, Python experienced a drop in growth recently (except in 2023). This decline doesn't have to be a bad sign. It may indicate that both communities have reached a state of stability and maturity. Quantity does not equal quality.

What's striking though is Python's surge in 2023. It marks the only year Python's growth rate has surpassed that of JavaScript.

Which Community is Doing What?

To better understand these patterns, I will analyze the dominant themes and technologies within the open-source communities of Python and JavaScript. What are the most prevalent topics in JavaScript repositories? How do they compare to those in Python? How have specific topics evolved within each language? Are there emerging trends indicating shifts in developer interest?

Repository administrators on GitHub can tag their projects with topics to indicate specializations, concepts, or technologies. I have collected these tags from all the open-source repositories in this study.

To get a first understanding of the primary areas of interest, let's have a look at the following word cloud.


Dominant Topics in JavaScript Repositories

Word cloud highlighting dominant topics in JavaScript's open-source community

Each word in this cloud represents a topic tagged in JavaScript repositories. The larger the word, the more frequently it has been tagged.
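The word sizes come down to a frequency count over topic tags. Here is a minimal sketch of such a count, assuming each repository is represented by a list of its topic strings. The function name and the toy data are hypothetical.

```python
from collections import Counter


def topic_frequencies(repo_topics):
    """Count in how many repositories each topic appears."""
    counts = Counter()
    for topics in repo_topics:
        # Use a set so a topic counts at most once per repository.
        counts.update({topic.lower() for topic in topics})
    return counts


# Made-up toy data: one list of topics per repository.
toy_topics = [
    ["react", "javascript"],
    ["react", "nodejs"],
    ["vue"],
]

frequencies = topic_frequencies(toy_topics)
print(frequencies.most_common(1))  # [('react', 2)]
```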

At first glance, JavaScript's ecosystem appears very diverse. There's a strong emphasis on frontend technologies. Popular JavaScript frameworks like React and Vue play a major role. Other frameworks, such as Electron and React Native, demonstrate that JavaScript is used to build user interfaces across different platforms.

The prominence of Node.js highlights that JavaScript has long grown beyond its primary use in the browser. It now powers server-side applications, allowing developers to create full-stack solutions with a single language.

Overall, the community's primary focus is web development, on both the frontend and the backend. However, if you look past the dominant forces, you will also find topics like cli, llm, chatgpt, and ethereum, which point to newer areas of JavaScript use.

So, how does it look in the Python world?


Dominant Topics in Python Repositories

Word cloud highlighting dominant topics in Python's open-source community

Terms like machine learning, deep learning, tensorflow, openai, and pytorch dominate the landscape, indicating a strong interest in artificial intelligence and data-driven applications.

Additionally, topics like django, docker, linux, and flask show that Python doesn't just run scientific computations on personal computers. Python is executing tasks on servers, making its capabilities accessible to users around the globe.

At this point, I'd like to give kudos to the organizers of Hacktoberfest. They've managed to impressively hack their way into this dataset. Their influence within the open-source communities of JavaScript and Python is striking. Many repositories are tagged with hacktoberfest because it's October, and Hacktoberfest 2024 is kicking off. Open-source maintainers want to show they're open to contributions, so they add the tag this time of year.

The word clouds above provide an interesting snapshot of the entire ecosystem. However, they may include topics that are no longer relevant and GitHub projects that are long forgotten.

To gain deeper insights, we'll analyze topic trends over four-year periods. Each graph shows how frequently specific topics have been tagged in repositories created during a particular timeframe. Only the 10 most tagged topics are highlighted for each period.
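The per-period view can be sketched as follows: assign each repository to a four-year period by its creation year, count topics within each period, and keep the most frequent ones. This is an illustration under assumptions, not the study's code. The input format (pairs of creation year and topic list) and the function names are hypothetical; the periods match those shown in the charts.

```python
from collections import Counter, defaultdict

# The four-year periods used in the charts.
PERIODS = [(2008, 2011), (2012, 2015), (2016, 2019), (2020, 2023)]


def period_of(year):
    """Return the (start, end) period containing the year, or None."""
    for start, end in PERIODS:
        if start <= year <= end:
            return (start, end)
    return None


def top_topics_by_period(repos, n=10):
    """Return the n most frequently tagged topics for each period.

    repos: iterable of (creation_year, topics) pairs.
    """
    counters = defaultdict(Counter)
    for year, topics in repos:
        period = period_of(year)
        if period is None:
            continue  # created outside the analyzed periods
        counters[period].update(set(topics))
    return {
        period: [topic for topic, _ in counters[period].most_common(n)]
        for period in PERIODS
        if period in counters
    }


# Made-up toy data, not repositories from the dataset.
toy_repos = [
    (2009, ["jquery", "html5"]),
    (2010, ["jquery"]),
    (2013, ["react"]),
    (2024, ["vite"]),
]
result = top_topics_by_period(toy_repos, n=1)
print(result)  # {(2008, 2011): ['jquery'], (2012, 2015): ['react']}
```

Note that a repository is assigned to a period by its creation date, while its topics reflect whatever tags it carries today.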

JavaScript

Let's first examine the JavaScript open-source community. What do the figures reveal?


Number of JavaScript Repositories Created by Topic over Time

Pie chart showing the top 10 topics in JavaScript's open-source community from 2008 to 2011
Pie chart showing the top 10 topics in JavaScript's open-source community from 2012 to 2015
Pie chart showing the top 10 topics in JavaScript's open-source community from 2016 to 2019
Pie chart showing the top 10 topics in JavaScript's open-source community from 2020 to 2023

In GitHub's early days, the JavaScript landscape, especially in the frontend, was quite different from today. The community didn't rely on large UI frameworks. Instead, developers focused on integrating JavaScript with the core functionalities of the browser using pure HTML and CSS.

Libraries like jQuery played a crucial role in this integration. With the release of HTML5, the latest HTML standard, JavaScript further explored and extended its capabilities.

What's striking is the speed with which Node.js captured the attention of open-source developers. Node.js was released in mid-2009, and it seems the new JavaScript runtime was something the community had been waiting for. The language was ready to move beyond traditional client-side web development.

Something that might be confusing is the appearance of TypeScript as a topic between 2008 and 2011. TypeScript was only released by Microsoft in 2012. But there's a simple explanation. As mentioned before, many open-source projects transitioned from JavaScript to TypeScript over the years. Consequently, these GitHub repositories were often tagged with the typescript topic in retrospect.

From 2012 to 2015, JavaScript's community experienced a major transition. Modern JavaScript frameworks like React and Angular changed the way web applications are built. While core technologies remained important, React in particular revolutionized the community.

Moreover, Node.js not only enabled new types of applications, but also changed the way developers create them. Developer experience became a key focus. Programmers started to organize their code into modules and used tools like Webpack to bundle it afterwards.

In the following period, from 2016 to 2019, this transition solidified. React took the lead and became the dominant factor in the community's development. Vue appeared as a popular alternative to React, and Electron and React Native expanded JavaScript-based user interfaces across various platforms.

In recent years, the overall landscape has remained stable. New tools like Tailwind CSS and Vite reflect an even greater focus on developer experience. Notably, ChatGPT has opened the door to a new field of application.

Python

How has the Python community shifted its focus over time?


Number of Python Repositories Created by Topic over Time

Pie chart showing the top 10 topics in Python's open-source community from 2008 to 2011
Pie chart showing the top 10 topics in Python's open-source community from 2012 to 2015
Pie chart showing the top 10 topics in Python's open-source community from 2016 to 2019
Pie chart showing the top 10 topics in Python's open-source community from 2020 to 2023

Between 2008 and 2011, Python's open-source community on GitHub was still relatively small. The main development activity centered around Django, the web framework.

In the next period, from 2012 to 2015, Django remained the primary area of interest. Development related to Linux began to catch up, highlighting Python's importance in Linux system development, server architecture, and automation. With the emergence of Docker, Python became a key player in the DevOps movement.

Given Python's role in server-side development, it's no surprise that security became a major concern during this time.

Last but not least, this was the first period in which machine learning entered the community's main interests, and Flask, Django's lightweight alternative, gained traction.

The period between 2016 and 2019 marks a major transition. Topics around machine learning captured widespread attention. The world watched with excitement as neural networks achieved remarkable successes in image recognition, natural language processing, and speech recognition.

The Python community played a crucial role in advancing this technology. Libraries like PyTorch, TensorFlow, and Keras empowered developers and researchers to build sophisticated models with ease, making breakthroughs in artificial intelligence more accessible.

During this time, the demand for AI and machine learning skills surged, fueling Python's growth beyond traditional software development. Of the earlier leading topics, only Django retained a spot among the top 10.

The latest period, from 2020 to 2023, was dominated by the breakthrough of Large Language Models. The relatively new transformer architecture (introduced in 2017) led to significant improvements in training efficiency and paved the way for the historic success of models like GPT.

The dominant tool for development in these areas was PyTorch.

Final Notes

The open-source communities of Python and JavaScript have grown tremendously over time, each evolving in unique ways. JavaScript's rise has been driven by the explosion of frontend frameworks like React and Vue and by the server-side revolution sparked by Node.js. Meanwhile, Python's growth has been bolstered by the increasing dominance of machine learning, data science, and artificial intelligence libraries.

Both ecosystems have matured, and although their growth rates have slowed in recent years, they remain dynamic and influential.

Technological advancements in both languages allow developers to apply them in almost any context. There's hardly anything you can't build with JavaScript or Python. Yet, their ecosystems are built around distinct areas. JavaScript continues to dominate web development, while Python is the go-to language for scientific computing and AI, with strong ties to server architecture.

This distinction even applies to this blog! The analysis was conducted using Python, while the website is primarily built with JavaScript on both the frontend and backend.

This article highlighted the size and focus of Python's and JavaScript's open-source communities. Both have accumulated an abundance of openly available code and keep contributing to their ecosystems. However, quantity does not necessarily equate to quality! In the next part of this series, we'll focus on the contributors of open-source projects.

Are Python and JavaScript projects actively maintained? How connected is the community? And is it a small core or a diverse and versatile community of developers who push forward technological advancements? We'll see in the next part of this series. Stay tuned!

Footnotes

  • [1] Mozilla Public License 2.0, MIT, GNU Lesser General Public License 2.1, The GNU General Public License 2.0/3.0, Eclipse Public License 2.0, Creative Commons 0-1.0, The 3-Clause BSD License, The 2-Clause BSD License, GNU Affero General Public License 3.0, Apache License 2.0