Sunday, 9 November 2008

The TIOBE Programming Index

I’m sure a lot of you (that is, if anyone is reading this) are familiar with the TIOBE Programming Index. Basically, it’s an index which claims to gauge the popularity of programming languages according to how many times the phrase “<language> programming” (for Delphi, that would be “Delphi programming”) appears in various search engines. Some in the Delphi community are very excited to see that Delphi is at #8, snapping at the heels of C#. Take a look at Jim McKeeth’s post on the subject here. Now I’m a big fan of Jim’s blog and podcasts, but I can’t get as excited as he is about this.

First of all, does it seem realistic to you that Delphi is only 0.02% behind C#? I’d love it to be true, but all other evidence points to C# being much more popular than Delphi. Wrongly so in my opinion, but true nevertheless. I’m not saying that the TIOBE index is fixed or anything, but it must be fatally flawed somewhere.

So I thought, what could we use as an index of programming language popularity? Someone mentioned job listings, which are probably a very good indication, but jobs vary geographically, and job listings are usually specific to a country or region. So, for example, looking at a US job listing site would not give you the true state of a particular language in the world. There has to be something more relevant.

Then I thought, how about StackOverflow.com? For those not familiar with StackOverflow.com, have a look at my last post. If we could count the number of questions for a given language, wouldn’t we get a fair idea of the current popularity of that language? If people are asking questions about a language, it’s because that language is being used. It has to be a better indication of current use than how many times a search engine can find some particular words.

So here it is. I’ve taken the top 20 languages from the TIOBE Index, and extracted the number of questions for that particular language from StackOverflow.com by looking at the corresponding tags for that language. I’ve had to use several tags for some of them. For example for Delphi, I took Delphi, Delphi2009, Delphi7 and Delphi-programming.  I may have missed a few tags, but I think I managed to get all the main ones. I also took only the top 20 and not all 50 languages in the list, as it seemed like too much hard work.

Stack Overflow Position  TIOBE Position  Language        Stack Overflow Rating
1                        7               C#              26.381%
2                        1               Java            15.933%
3                        3               C++             12.126%
4                        10              Javascript      9.239%
5                        5               PHP             8.560%
6                        6               Python          7.988%
7                        11              Ruby            4.887%
8                        2               C               4.764%
9                        4               (Visual) Basic  4.438%
10                       9               Perl            1.893%
11                       8               Delphi          1.770%
12                       18              Actionscript    1.449%
13                       13              PL/SQL          0.262%
14                       20              Lua             0.166%
15                       17              COBOL           0.064%
16                       14              SAS             0.027%
17                       15              ABAP            0.027%
18                       16              Pascal          0.027%
19                       12              D               0.000%
20                       19              Logo            0.000%
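
For anyone who wants to repeat the tally, here is a minimal Python sketch of the calculation described above: add up the question counts for every tag that belongs to a language, then express each total as a percentage of all the questions counted. Only the Delphi tag group comes from this post. The other tag groups, every question count, and the function names are made-up placeholders for illustration, not the figures behind the table.

```python
# Tags counted for each language. Only the Delphi group is taken from the
# post; the other groups are illustrative assumptions.
LANGUAGE_TAGS = {
    "Delphi": ["delphi", "delphi2009", "delphi7", "delphi-programming"],
    "C#": ["c#"],
    "Java": ["java"],
}

# Hypothetical question counts per tag (placeholders, not real data).
TAG_QUESTION_COUNTS = {
    "delphi": 40,
    "delphi2009": 5,
    "delphi7": 3,
    "delphi-programming": 2,
    "c#": 600,
    "java": 350,
}


def language_shares(language_tags, tag_counts):
    """Return {language: percentage of all counted questions}."""
    totals = {
        language: sum(tag_counts.get(tag, 0) for tag in tags)
        for language, tags in language_tags.items()
    }
    grand_total = sum(totals.values())
    if grand_total == 0:
        # Nothing was counted, so every share is zero.
        return {language: 0.0 for language in totals}
    return {
        language: 100.0 * count / grand_total
        for language, count in totals.items()
    }


def ranked(shares):
    """Sort (language, share) pairs from the largest share to the smallest."""
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    shares = language_shares(LANGUAGE_TAGS, TAG_QUESTION_COUNTS)
    for position, (language, share) in enumerate(ranked(shares), start=1):
        print(f"{position} {language} {share:.3f}%")
```

One caveat with this approach: a question carrying two tags from the same group (say, both delphi and delphi7) is counted twice, so the totals are an upper bound unless the questions are de-duplicated first.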

Why has C# gone from 4% to 26%? Well, StackOverflow.com may just be a magnet for .NET developers. Actually, I think it was seeded from a .NET questions and answers forum which used to live on the Joel On Software page. Even so, I still think C# should be closer to the top than the TIOBE index gives it credit for.

Why is Delphi so strong on the TIOBE index, and not so strong on StackOverflow.com? Perhaps the Delphi community just hasn’t found out about StackOverflow.com yet. Perhaps Delphi is easier to use, and you simply don’t have the same number of questions needing answers. Maybe other Delphi resources, be they web sites or books or whatever, are so good that we don’t need StackOverflow.com as much as say Java developers do. Or it could simply be the fact that there are more C# and Java developers out there than there are Delphi developers.

I think it is a whole lot more complicated than this, but the Stack Overflow Index just seems a whole lot closer to what I would have guessed the list to look like than the TIOBE index does. It could be improved. If we could create some algorithm that takes into account answers, or perhaps even votes, as well as questions, we may get a better picture of the current state of each language.
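
As a rough idea of what such an algorithm might look like, here is a small Python sketch that combines questions, answers and votes into one weighted score per language and then turns the scores into percentages. The weights, the activity numbers, and the names TagActivity, activity_score and weighted_shares are all assumptions made up for illustration; nothing here has been tuned or checked against real StackOverflow.com data.

```python
from dataclasses import dataclass


@dataclass
class TagActivity:
    """Activity counted for one language (all values are hypothetical)."""
    questions: int
    answers: int
    votes: int


# Assumed weights: a question counts fully, an answer counts for half,
# and a vote counts for a tenth. These are guesses, not tuned values.
QUESTION_WEIGHT = 1.0
ANSWER_WEIGHT = 0.5
VOTE_WEIGHT = 0.1


def activity_score(activity):
    """Combine questions, answers and votes into a single weighted score."""
    return (
        QUESTION_WEIGHT * activity.questions
        + ANSWER_WEIGHT * activity.answers
        + VOTE_WEIGHT * activity.votes
    )


def weighted_shares(activity_by_language):
    """Return {language: percentage of the total weighted score}."""
    scores = {
        language: activity_score(activity)
        for language, activity in activity_by_language.items()
    }
    total = sum(scores.values())
    if total == 0:
        return {language: 0.0 for language in scores}
    return {language: 100.0 * score / total for language, score in scores.items()}


if __name__ == "__main__":
    # Placeholder numbers, chosen only to make the arithmetic easy to follow.
    sample = {
        "Delphi": TagActivity(questions=10, answers=20, votes=100),
        "C#": TagActivity(questions=50, answers=80, votes=300),
    }
    for language, share in weighted_shares(sample).items():
        print(f"{language} {share:.1f}%")
```

How much an answer or a vote should count relative to a question is exactly the open question, so the three weights are kept as named constants that are easy to change.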

What do you think?

3 comments:

McKeeth said...

I think you hit on a few of the reasons that StackOverflow has more C# questions than Delphi. I've found most Delphi developers REALLY prefer using forums.codegear.com for their Q & A. A lot of them are old school and just prefer NNTP.

One of the reasons that Delphi does so well on the TPCI (TIOBE Programming Community Index) is it has a number of years head start on C#, Java, Perl, etc. Delphi had thriving online communities way before those languages were even conceived (heck, back when Anders was still working at Borland!)

I think the ideal index would include the TPCI technique and then weigh in Stack Overflow questions, CodeGear Forum questions, and questions from other specific support forums with an extra weight attached to them.

Glad you have enjoyed the blog and podcast. Thanks for the link.

Babnik said...

Jim, it's all well and good for Delphi developers to post queries on the CodeGear forums, but all that does is make us more insular. We need to be on the front page of sites like Stack Overflow so Java developers, C# developers, whoever, know Delphi is still around, and that you can do some amazing stuff with it. How many C# developers know as much about Delphi Prism as we do? I'm not saying they'll convert anytime soon, but the first step is that they know it exists. I tend to use Stack Overflow as a learning tool, so I'll occasionally click on a Ruby or Python question. What we need is occasional clicks on Delphi questions from those who don't know it exists. They won't go to the CodeGear forums, that's for sure.

Peter Helsen said...

Just made a program to track popularity myself, and one thing that was very striking to me was the rather low position of C#.

I have tried Google, Bing, Yahoo, Altavista and Cuil in my algorithm, but Bing and Cuil have proved to be not trustworthy for this purpose.

C# has 'fallen' mainly because of Yahoo, I think; just try Perl and Python in it, for example.

You can find my list (100 languages) at http://phelsen.wordpress.com/2009/11/27/webolarity/