Computer Languages Characters Frequency
All Languages Combined
The frequency of punctuation characters used in programming languages.
- Total number of files processed: 13,841
- Total number of punctuation characters counted: 14,084,715
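A count of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the numbers above: the name `count_punctuation` is hypothetical, and it assumes "punctuation" means the ASCII characters in Python's `string.punctuation`.

```python
import string
from collections import Counter

# Assumption: "punctuation" means the ASCII characters in string.punctuation.
PUNCTUATION = frozenset(string.punctuation)

def count_punctuation(text):
    """Return a Counter mapping each punctuation character to its count in text."""
    return Counter(ch for ch in text if ch in PUNCTUATION)

# A small made-up snippet of C-like source code to count.
sample = "if (x == 1) { return a[0]; }"
counts = count_punctuation(sample)
total = sum(counts.values())

# Print each character with its count and its share of all punctuation.
for char, n in counts.most_common():
    print(f"{char} {n} {100 * n / total:.1f}%")
```

Letters, digits, and whitespace are ignored, so the percentages are relative to punctuation only.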
Percentage of languages:
- 19.8% C
- 18.5% Python
- 13.5% PHP
- 12.9% Ruby
- 9.8% Perl
- 9.4% C++
- 7.3% Java
- 5.2% Bash
- 3.2% JavaScript
- 0.3% CSS
- 0.1% Emacs Lisp
JavaScript
Sample syntax JS: Raining Hearts
Java
Sample syntax Complex Numbers in Java
Golang
Using other source code gives similar results.
Sample syntax Golang: Script to Find Replace Multi-Pairs of Regex in a Directory
C++
C
PHP
Sample syntax PHP: Send Mail with Attachment
Python
Sample syntax Python: Find Replace Regex in Dir
Ruby
Sample syntax Ruby Tutorial
Perl
Sample syntax Perl: Find Replace String Pairs in Directory
Bash
CSS
Sample syntax Atomic CSS
Wolfram language, Mathematica
Sample syntax Geometric Inversion, 2D Grid, Polygon
Emacs Lisp
Source is dired.el in Emacs 29.
Haskell
About Source Input
After this study, I realized that the size of the input does not matter much. It is not necessary to gather thousands of source code files; 20 to 50 files from a generic project are sufficient.
For certain languages, different projects do favor certain characters, but the difference is not significant overall.
For example, for Python, picking 20 files from the standard library is good enough. There is no need to go out of your way to get source code from different projects.
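The small-sample approach for Python might look like the sketch below. It is only an illustration under stated assumptions: the helper names `count_punctuation_in_files` and `sample_stdlib_files` are hypothetical, "punctuation" is taken to be `string.punctuation`, and the standard library is assumed to be installed as plain `.py` files in the directory reported by `sysconfig`.

```python
import os
import random
import string
import sysconfig
from collections import Counter

# Assumption: "punctuation" means the ASCII characters in string.punctuation.
PUNCTUATION = frozenset(string.punctuation)

def count_punctuation_in_files(paths):
    """Sum punctuation character counts over all files in paths."""
    counts = Counter()
    for path in paths:
        # errors="replace" keeps going if a file is not valid UTF-8.
        with open(path, encoding="utf-8", errors="replace") as f:
            counts.update(ch for ch in f.read() if ch in PUNCTUATION)
    return counts

def sample_stdlib_files(n=20, seed=0):
    """Pick up to n top-level .py files from the standard library directory."""
    stdlib_dir = sysconfig.get_paths()["stdlib"]
    if not os.path.isdir(stdlib_dir):
        return []  # e.g. a zipped or embedded Python without loose .py files
    files = sorted(
        os.path.join(stdlib_dir, name)
        for name in os.listdir(stdlib_dir)
        if name.endswith(".py")
    )
    # A fixed seed makes the sample repeatable between runs.
    random.Random(seed).shuffle(files)
    return files[:n]

if __name__ == "__main__":
    counts = count_punctuation_in_files(sample_stdlib_files(20))
    total = sum(counts.values())
    # Show the ten most frequent punctuation characters and their share.
    for char, n in counts.most_common(10):
        print(f"{char} {n} {100 * n / total:.1f}%")
```

Running it again with a different `seed`, or with a larger `n`, is one way to check whether the ranking of the most frequent characters stays roughly the same.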