Programming Languages: Pick Your Poison (C++ vs Python vs Java)
When choosing a programming language, you’re usually weighing several factors: performance, portability, supporting libraries, and ease of use. Let’s examine our choices against each of them:
On performance, C++ is generally considered, alongside C, to be among the most performant languages. This is somewhat of a misconception. While C++ can outperform other languages, that performance does not come by default. Those inexperienced with memory management and good programming practices will often write less performant code in C++ than they would in a more managed language such as Java or C#. That being said, because C++ compiles down to native machine code, it is possible to write code in a way that optimizes the low-level machine instructions. That degree of control is not available in the other languages being discussed.
Python is the least performant language on this list. This is, to over-simplify, due to the interpreted nature of Python. When you execute Python code, it is first run through a compiler that converts your code into Python-specific bytecode. The Python interpreter then reads this bytecode and executes it, one instruction at a time, on the target platform (Dhruvil Karani, 2020). The compilation step will often cause a delay in initial execution, while the interpretation step comes at a somewhat significant (2X-10X slower) performance cost compared to traditionally compiled languages.
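The two stages described above can be observed from inside Python itself. The following is a minimal sketch, assuming the standard CPython interpreter; the function `add_one` and the variable names are made up for illustration. It compiles a line of source into a code object, runs it, and then lists the bytecode instruction names for a small function. The exact instruction names differ between Python versions, so none are shown here.

```python
import dis


def add_one(x):
    """Tiny hypothetical function whose bytecode we want to inspect."""
    return x + 1


# Stage 1: compile source text into a code object that holds bytecode.
code = compile("result = 2 + 3", "<example>", "exec")

# Stage 2: the interpreter executes that bytecode.
namespace = {}
exec(code, namespace)
print("result:", namespace["result"])

# The raw bytecode is just bytes stored on the code object.
print("bytecode length:", len(code.co_code))

# dis decodes the bytecode of a function into readable instruction names.
instructions = [ins.opname for ins in dis.get_instructions(add_one)]
print("instructions for add_one:", instructions)
```

Every one of those instructions is dispatched by the interpreter at runtime, which is where much of the overhead relative to compiled languages comes from.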
Performance is the area where Java is, as much as I hate to admit it, mischaracterized. In its infancy, Java suffered from many of the same issues we just discussed with Python, with the main difference being that compilation to bytecode is a separate step before execution. Over the years, however, many of those performance issues have been addressed and optimized away. One example is the JIT (Just In Time) compiler, which analyzes application hotspots and compiles frequently executed Java bytecode into native code at runtime, gaining native performance in key areas as the application runs (Oaks, 2014). This is the basis for the claim that the JVM needs to “warm up” before its performance can be properly measured.
C++ ranks high on portability. There’s a C++ compiler for pretty much every platform (List of compilers, 2020). C++ code, however, is not portable by default. Most of the standard library is supported by most compilers, but specific support can and does vary by CPU architecture and platform. Additionally, the C++ standard does not provide any cross-platform GUI libraries. One advantage of going with C++ is that for any platform you support, you get native binaries that require no additional software on the system.
Python, being an interpreted language, is highly portable in that the same Python code should produce identical output on any platform it runs on. The caveat, however, is that it requires a Python interpreter to be available on the machine. Additionally (and anecdotally), some of the supporting tools (pip, in this example) will not immediately be compatible with every platform. For example, the numpy package and the packages that rely on it are not (as of a couple of months ago) installable with pip on the new Mac M1 machines. While not an issue with Python specifically, this example shows a weakness in its portability. Another disadvantage of Python is that it is much more difficult to produce distributable binaries that don’t require a Python interpreter to be pre-installed on the system. I suspect many distributed applications that run Python in the background ship a thin C++ layer that embeds the interpreter, such as the one we created in this class.
Java is generally considered the gold standard in terms of portability due to its Java Virtual Machine (JVM). All Java code and libraries are guaranteed to be portable to every platform on which a JVM can be found (Hartman, 2021). Due to this guarantee, the language’s adoption, and the large amounts of money poured into the Java project, a JVM is available for pretty much every platform. Java code is compiled down into Java bytecode, which requires a JVM on the target machine to run. The benefit is that the same bytecode can be run, without changes, on any JVM.
Supporting Libraries and Ease of Use
The ease of use of a language is closely tied to its supporting libraries and the ease with which they’re used. To begin, C++ is a language with a decently steep learning curve. While it is easier to get a Hello World application running in C++ than in Java, it takes considerably more effort to accomplish anything “substantial” than it does in the more managed languages on this list. Memory management needs to be handled explicitly and consciously. In addition, integrating outside libraries is more complicated due to the low-level nature of the language. The build process often requires pulling in multiple libraries, configuring and compiling them with a tool such as CMake, and handling a plethora of other concerns you don’t need to deal with in many other languages. Package management is not defined in the standard either (though some of this pain may ease with the introduction and support of modules in C++20). On the other hand, C++ integrates pretty seamlessly with the wealth of C libraries available, which gives it access to a very long history of software development.
Python is often considered the easiest language to work in, not only for its natural-language-like syntax but also for its robust built-in feature set and ecosystem of supporting libraries. Often used as a scripting language to perform quick and simple tasks, it is frequently combined with native libraries to offset the performance cost of the interpreter. Memory is managed for you, and package management is, essentially, part of the standard Python distribution.
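As a small illustration of offsetting interpreter cost with native code, the sketch below compares a hand-written Python loop against the built-in `sum`, which CPython implements in C. The helper `python_loop_sum` and the input size are arbitrary choices for this example. Timings depend heavily on the machine and Python version, so treat them as a rough indication only.

```python
import timeit


def python_loop_sum(values):
    """Add up the values one bytecode-interpreted iteration at a time."""
    total = 0
    for value in values:
        total += value
    return total


values = list(range(100_000))

# Both approaches must agree on the answer before timing means anything.
loop_result = python_loop_sum(values)
builtin_result = sum(values)

# Time each approach over the same number of repetitions.
loop_seconds = timeit.timeit(lambda: python_loop_sum(values), number=20)
builtin_seconds = timeit.timeit(lambda: sum(values), number=20)

print(f"pure Python loop: {loop_seconds:.4f} s")
print(f"built-in sum:     {builtin_seconds:.4f} s")
```

Libraries such as numpy apply the same idea on a larger scale by keeping the inner loops in compiled code.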
Java is a middle ground between C++ and Python. Many people find Java more difficult to get started with due to the large amount of boilerplate needed for a simple Hello World application. While the syntax is very C-like, building substantial applications comes with less cognitive overhead, since memory management is handled for you and there are fairly strict rules on what is and is not allowed. Package management can be painful as well; however, Maven is fairly simple to use and has emerged as the de facto standard, making it relatively easy to figure out given the number of resources available.
All that being said, I have naturally tended towards using C++ in most of my applications, as I work on game engines for a living, specifically using the OpenGL, Metal, and Vulkan APIs. These APIs (other than Metal) are provided as C libraries or have a thin C++ abstraction on top of a C library.
Game engines, and games in general, use more memory than most standard applications, on both the GPU and the CPU. They also have fairly strict performance requirements. Players generally expect games to run at a constant 60 frames per second, or as high as 120 frames per second for a smooth virtual reality experience. This requires carefully managing when data is loaded into or unloaded from memory, and it also requires that the underlying machine code does as little as possible. Manual memory management is not possible in most languages, and any language with an interpretation layer will not allow you to micromanage specific machine instructions (this is likely an untrue blanket statement, but it holds as a general rule).
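To put those frame rates in perspective, here is a quick back-of-the-envelope calculation of how much time each frame gets. The helper names `frame_budget_ms` and `fits_in_budget` are invented for this sketch, and the 10 ms workload is just an example figure.

```python
def frame_budget_ms(frames_per_second):
    """Return the time available for one frame, in milliseconds."""
    return 1000.0 / frames_per_second


def fits_in_budget(work_ms, frames_per_second):
    """Check whether a frame's total work fits inside its time budget."""
    return work_ms <= frame_budget_ms(frames_per_second)


for fps in (60, 120):
    print(f"{fps} fps leaves {frame_budget_ms(fps):.2f} ms per frame")

# A frame that takes 10 ms of work is fine at 60 fps but too slow at 120 fps.
print(fits_in_budget(10.0, 60))
print(fits_in_budget(10.0, 120))
```

Everything the game does in a frame (input, simulation, rendering, and any memory loading) has to fit inside that window, which is why an unexpected pause of even a few milliseconds is noticeable.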
For the actual gameplay systems implementation, many engines use a higher-level language such as C# (Unity), Blueprints (Unreal), GDScript (Godot), or Lua (Roblox and World of Warcraft) and provide a simple interface to the underlying optimized code. If you have asked yourself what the point of embedding Python into your application is, it’s largely for these kinds of setups: you provide a simplified, safe API in a higher-level language for interacting with your system while hiding away all the complicated pieces that can break if used wrong.
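The shape of such a setup can be sketched in a few lines. Everything here is hypothetical: `_Engine` stands in for the optimized core (which in a real engine would be native code), and `ScriptAPI` is the small, validated surface handed to gameplay scripts. This only illustrates the layering; it is not a real sandbox and does not reflect any particular engine's API.

```python
class _Engine:
    """Stand-in for the optimized core that scripts should not touch directly."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.player_pos = [0, 0]


class ScriptAPI:
    """Simplified interface exposed to gameplay scripts."""

    def __init__(self, engine):
        self._engine = engine

    def move_player(self, dx, dy):
        # Validate input so a script cannot push bad data into the core.
        if not all(isinstance(v, int) for v in (dx, dy)):
            raise ValueError("dx and dy must be integers")
        # Clamp the new position to the world bounds.
        x = min(max(self._engine.player_pos[0] + dx, 0), self._engine.width - 1)
        y = min(max(self._engine.player_pos[1] + dy, 0), self._engine.height - 1)
        self._engine.player_pos = [x, y]

    def player_position(self):
        # Return a copy so scripts cannot mutate engine state through it.
        return tuple(self._engine.player_pos)


def gameplay_script(api):
    """Example script that only ever sees the ScriptAPI object."""
    api.move_player(3, 2)
    api.move_player(100, 0)  # far out of bounds; clamped to the world edge


engine = _Engine(width=10, height=10)
api = ScriptAPI(engine)
gameplay_script(api)
print(api.player_position())  # (9, 2)
```

The script author never deals with bounds checking or internal state; mistakes surface as a clear error or a clamped value rather than a corrupted engine.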
Programming languages are a fun topic that a lot of people get really passionate about, sometimes in non-productive ways. The key is to understand that every language was created for a purpose, and most, looking at you, Brainf*ck (Brainfuck, 2021), have use cases where an argument can be made for choosing them.
Thank you for coming to my TED talk.
List of compilers. (2020, January 9). Wikipedia. https://en.wikipedia.org/wiki/List_of_compilers
Hartman, J. (2021, December 11). Java Virtual Machine (JVM) & its Architecture. Guru99. https://www.guru99.com/java-virtual-machine-jvm.html
Oaks, S. (2014). Chapter 4: Working with the JIT Compiler. In Java Performance: The Definitive Guide. O’Reilly Media. https://www.oreilly.com/library/view/java-performance-the/9781449363512/ch04.html
Dhruvil Karani. (2020, January 9). How does Python work? Medium; Towards Data Science. https://towardsdatascience.com/how-does-python-work-6f21fd197888
Brainfuck. (2021, October 5). Wikipedia. https://en.wikipedia.org/wiki/Brainfuck