Are developers passionate about learning new programming languages?

Max Wang
5 min read · Nov 7, 2020

An exploratory study of how often developers learn a new programming language

Stock Photo — Innovation

Introduction

This is an exploratory data analysis of the 2020 Stack Overflow Developer Survey, the largest and most comprehensive survey of software developers (and anyone else who codes!) on Earth, covering all sorts of information such as programming languages, jobs, code style and more. The goal is to give future developers something to think about when deciding what would work best for them.

There are six sections in the survey. Compared to 2019, the 2020 survey adds a new question, “How frequently do you learn a new language or framework?”, listed first in the section “Technology and Tech Culture”.

In his article “Programming languages and innovation”, Dominique Boucher (2017) writes: “It is always worthwhile to learn other programming paradigms and languages”. This post discusses a few questions along those lines. Data analysis and machine learning (logistic regression and AdaBoost classification) are used to gain some insight into the survey data.

Overview

Responses from 47,487 respondents who have jobs (“Employed full-time”, “Employed part-time”, or “Independent contractor, freelancer, or self-employed”) are used for the analysis. Among them, 2% chose to learn a new language or framework ‘Once a decade’, 22.4% ‘Once every few years’, and 33.7% ‘Once a year’. The 30.1% who chose ‘Every few months’ are the developers most passionate about learning new languages or frameworks.

Questions to be discussed

1. Do age, gender and education level influence how often developers learn a new language?

2. Do developers who learn new languages or frameworks more often earn a higher salary?

3. Which countries have a higher rate of learning new languages?

4. Do respondents who learn new languages or frameworks more often also work more overtime?

5. Does the rate of learning new languages differ with the size of the organization?

6. Which major factors can be used to predict whether a developer is passionate about new programming languages or frameworks?

Q: Does age influence how often developers learn a new language?

A: Yes. The frequency is mapped to a numerical score from 1 (‘Once a decade’) to 4 (‘Every few months’). The data show that younger developers are more eager to learn a new language or framework: developers who chose ‘Once a decade’ have an average age of 39, while those who chose ‘Every few months’ have an average age of 29.
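The mapping from answer to score could look like the sketch below. It is a minimal illustration, not the code from the original analysis: the column names `NEWLearn` and `Age`, the helper name `mean_age_by_frequency`, and the assignment of 1, 2 and 3 to the less frequent answers are assumptions, and the sample rows are made up.

```python
import pandas as pd

# Assumed ordinal encoding: 1 = least frequent, 4 = most frequent.
FREQ_TO_SCORE = {
    "Once a decade": 1,
    "Once every few years": 2,
    "Once a year": 3,
    "Every few months": 4,
}


def mean_age_by_frequency(df, freq_col="NEWLearn", age_col="Age"):
    """Return the mean age for each learning-frequency score."""
    # Answers missing from FREQ_TO_SCORE (including NaN) map to NaN.
    scored = df.assign(learn_score=df[freq_col].map(FREQ_TO_SCORE))
    # Drop rows where either the score or the age is missing.
    scored = scored.dropna(subset=["learn_score", age_col])
    return scored.groupby("learn_score")[age_col].mean()


# Tiny made-up sample, for illustration only (not survey data).
sample = pd.DataFrame({
    "NEWLearn": ["Once a decade", "Every few months", "Every few months",
                 "Once a year", None],
    "Age": [40, 28, 30, 33, 25],
})
result = mean_age_by_frequency(sample)
print(result)
```

The same score column can then be averaged by any other grouping, such as gender or education level.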

Q: Is there any difference between men and women with regard to learning a new language?

A: No. According to the data, there is only a slight difference (1%) between men and women in learning a new language or framework.

Q: Do developers with higher education learn new languages more often?

A: No. Developers whose highest education is primary/elementary school have the highest score (3.22) for learning a new language or framework, while developers with a doctoral degree, the highest education level, have the lowest (2.68).

Q: Do developers who learn new languages or frameworks more often earn a higher salary?

A: No. Developers who learn a new language ‘Every few months’ earn 91,463 USD per year on average, about 21% less than developers who learn ‘Once every few years’, who earn 115,451 USD on average (put the other way, the latter group earns about 26% more).
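The size of the gap between the two averages depends on which one is used as the base. The short sketch below works through both directions using the two figures quoted above; the helper names are made up for illustration.

```python
def percent_lower(value, reference):
    """How much lower `value` is than `reference`, in percent of `reference`."""
    return (reference - value) / reference * 100


def percent_higher(value, reference):
    """How much higher `value` is than `reference`, in percent of `reference`."""
    return (value - reference) / reference * 100


every_few_months = 91_463       # average USD per year, from the text
once_every_few_years = 115_451  # average USD per year, from the text

lower = round(percent_lower(every_few_months, once_every_few_years), 1)
higher = round(percent_higher(once_every_few_years, every_few_months), 1)
print(lower)   # 20.8
print(higher)  # 26.2
```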

Q: Which countries have a higher rate of learning new languages?

A: Countries with fewer than 200 respondents were removed. The top 5 countries are Pakistan (3.46), Sri Lanka (3.43), Bangladesh (3.40), Iran and India. Other countries with high rates include Indonesia, Nigeria, Argentina, Philippines, Viet Nam, China and Brazil.

Respondents from countries such as Sweden (2.86), Netherlands (2.86), Czech Republic, Belgium and Russian Federation learn a new language or framework less frequently.
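The country ranking can be sketched as a filter followed by a group mean. This is only an illustration under assumptions: the column names `Country` and `learn_score` and the helper `country_learning_rate` are hypothetical, and the sample uses a threshold of 2 instead of 200 so that it fits in a few rows.

```python
import pandas as pd


def country_learning_rate(df, min_respondents=200,
                          country_col="Country", score_col="learn_score"):
    """Mean learning score per country, keeping only countries with at
    least `min_respondents` scored respondents, sorted from high to low."""
    counts = df.groupby(country_col)[score_col].count()
    keep = counts[counts >= min_respondents].index
    filtered = df[df[country_col].isin(keep)]
    means = filtered.groupby(country_col)[score_col].mean()
    return means.sort_values(ascending=False)


# Made-up rows, for illustration only (not survey data).
sample = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "B", "C"],
    "learn_score": [4, 3, 2, 3, 2, 4],
})
ranking = country_learning_rate(sample, min_respondents=2)
print(ranking)
```

Country C has a single respondent in the sample, so it is dropped before the means are computed, even though its score is high.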

Q: Do respondents who learn new languages or frameworks more often also work more overtime?

A: Yes. The higher the frequency of learning a new language (3.10), the more overtime developers work, except for the group of developers who chose ‘Never have overtime’.

Q: Does the rate of learning new languages differ with the size of the organization?

A: Developers who are freelancers or sole proprietors have the highest rate of learning a new language. The rate gradually decreases as company size increases. However, companies of the largest size have a higher rate than the other four.

Q: Are there any significant findings in the correlations among the numerical data?

A: The frequency of learning a new language is positively correlated with overtime. Compensation and job satisfaction increase with age, education level and organization size. Younger developers in smaller companies work more overtime; compensation does not increase with overtime, which instead affects job satisfaction.
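Correlations like these can be read from a pairwise correlation matrix. The sketch below assumes the answers have already been encoded as numbers; the column names and values are made up for illustration, and Pearson correlation (the pandas default) is an assumption, since the post does not say which method was used.

```python
import pandas as pd

# Made-up, already-encoded values, for illustration only (not survey data).
sample = pd.DataFrame({
    "learn_score": [1, 2, 3, 4, 4],
    "overtime_score": [1, 2, 2, 3, 4],
    "Age": [45, 38, 33, 29, 27],
})

# Pairwise Pearson correlation between all numeric columns.
corr = sample.corr()
print(corr.round(2))
```

For ordinal scores such as these, `sample.corr(method="spearman")` is a common alternative that only relies on rank order.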

Q: Which major factors can be used to predict whether a developer is passionate about new programming languages or frameworks?

A: Respondents who chose “Every few months” are grouped as developers passionate about learning new languages and frameworks. The other respondents form the opposite group. Two classification algorithms (LogisticRegression with different cutoffs, and AdaBoostClassifier with randomized search over the hyperparameters) were applied in this study. The respondents were filtered to mitigate the impact of outliers: for years of coding (including and excluding education), the answers ‘More than 50 years’ and ‘Less than 1 year’ were removed, age was limited to 18 to 65, and similar filtering was done for Converted Compensation and Work Week Hours. The coefficients of the features are illustrated in the following table.

Result

Model accuracy on the testing data set is about 72%. Among the most important factors in the model are: being from Brazil, visiting Stack Overflow multiple times per day, rating the practice of DevOps as extremely important to scaling software development, and thinking that one’s company has a good onboarding process.
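A two-model setup of the kind described above can be sketched with scikit-learn as follows. This is not the code behind the reported 72%: the data is synthetic (generated with `make_classification`, with roughly 30% positives to mimic the size of the “Every few months” group), and the cutoffs and hyperparameter ranges are assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in for the encoded survey features (NOT the survey data).
X, y = make_classification(n_samples=600, n_features=10, n_informative=4,
                           weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Logistic regression: fit once, then compare several probability cutoffs.
logit = LogisticRegression(max_iter=1000).fit(X_train, y_train)
proba = logit.predict_proba(X_test)[:, 1]
cutoff_accuracy = {
    cutoff: accuracy_score(y_test, (proba >= cutoff).astype(int))
    for cutoff in (0.3, 0.4, 0.5, 0.6)
}

# AdaBoost: randomized search over a small, hypothetical hyperparameter space.
search = RandomizedSearchCV(
    AdaBoostClassifier(random_state=0),
    param_distributions={
        "n_estimators": [25, 50, 100],
        "learning_rate": [0.1, 0.5, 1.0],
    },
    n_iter=4, cv=3, scoring="accuracy", random_state=0,
)
search.fit(X_train, y_train)
ada_accuracy = accuracy_score(y_test, search.predict(X_test))

print(cutoff_accuracy)
print(search.best_params_, ada_accuracy)
```

After fitting, `logit.coef_` holds one coefficient per feature, which is the kind of value a coefficient table would list.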

Future work

Based on this exploratory study, future work can be done to improve the model. The validity and reliability of the questions can be checked further. Respondents in different countries and age groups can be studied further. And potentially, better models can be used to increase prediction accuracy, precision and recall.
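Accuracy, precision and recall all depend on the probability cutoff used to turn a model score into a yes/no prediction, so they are best compared together. The sketch below computes the three scores in plain Python; the labels and probabilities are made up, and the helper name `classification_scores` is hypothetical.

```python
def classification_scores(y_true, proba, cutoff):
    """Accuracy, precision and recall for the positive class (label 1)
    when probabilities at or above `cutoff` are predicted as positive."""
    pred = [1 if p >= cutoff else 0 for p in proba]
    pairs = list(zip(y_true, pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
proba = [0.9, 0.6, 0.45, 0.2, 0.35, 0.7, 0.1, 0.55]

at_050 = classification_scores(y_true, proba, cutoff=0.5)
at_040 = classification_scores(y_true, proba, cutoff=0.4)
print(at_050)
print(at_040)
```

In this toy sample, lowering the cutoff from 0.5 to 0.4 catches one more passionate developer, which raises recall without adding false positives. On real data a lower cutoff usually trades precision for recall.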

For the detailed analysis and code, see my GitHub.
