Understanding Bytecode and Java Virtual Machines

While you don’t need to know the details of Bytecode and JVMs to qualify as a Java programmer, such knowledge is still very beneficial. The first step to mastering a language is understanding its basic concepts and inner workings. Furthermore, this knowledge matters at a higher level, especially when you’re dealing with optimization, security and distribution of your Java programs.

What is Java Bytecode?

The key to Java’s security and the major reason for its famed portability across platforms is Bytecode. Unlike compilers for languages such as C and C++, which produce native machine code, Java’s compiler does not produce code that the processor can execute directly. Instead it produces a compact set of instructions designed to be executed by the Java Virtual Machine (JVM). This set of instructions is called Java Bytecode. You can think of the JVM as an interpreter for Bytecode.
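To make this a little more tangible: the compiler stores the Bytecode of each class in a .class file, and every such file begins with the four-byte magic number 0xCAFEBABE. The sketch below reads those four bytes back from a compiled class. The names ClassFileHeader and readMagic are made up for this illustration, and the sketch assumes the class file can be found as a resource next to the loaded class, which is the usual case but not guaranteed in every environment.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

public class ClassFileHeader {

    // Returns the first four bytes of the compiled class file for the given class.
    static int readMagic(Class<?> type) {
        // Assumption: the .class file is reachable as a resource under its package path.
        String resource = "/" + type.getName().replace('.', '/') + ".class";
        try (InputStream in = type.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            // Class files store multi-byte values in big-endian order,
            // which is also what DataInputStream.readInt() expects.
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Inspect the Bytecode file that the compiler produced for this very class.
        System.out.printf("magic = 0x%08X%n", readMagic(ClassFileHeader.class));
    }
}
```

When the class file is found, the program should report the same magic number on every platform, since the file format does not depend on the operating system or processor.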

Understanding Bytecode

To stay stable across platforms, a language needs a standard. In Java, Bytecode is the standard that is upheld across all platforms: every Java Runtime Environment (JRE) understands the same Bytecode. The only requirement for running a Java program is that a JRE is set up on the machine. Once it is set up, any Java program can run on it.

One may ask: given the significant differences between platforms and systems, how is it possible that the same Bytecode works everywhere? After all, each platform requires its own implementation of the JRE. And therein lies the secret. The JRE acts as an intermediary between the Bytecode and the operating system. Each JRE maps the Java API to its host operating system effectively and efficiently, creating a platform for Bytecode to run on.
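A small sketch of what this mapping looks like from the program’s side: the code below asks the runtime to describe itself through standard system properties. The class name WhereAmI and the method describe are invented for this example. The same compiled class should run unchanged on any platform with a JRE, and only the reported values differ.

```java
public class WhereAmI {

    // Builds a one-line description of the JVM and the host it runs on.
    static String describe() {
        // These are standard system properties that every Java runtime provides.
        String vm = System.getProperty("java.vm.name");
        String version = System.getProperty("java.version");
        String os = System.getProperty("os.name");
        String arch = System.getProperty("os.arch");
        return vm + " " + version + " on " + os + " (" + arch + ")";
    }

    public static void main(String[] args) {
        // The Bytecode is identical everywhere; the answer depends on the host JRE.
        System.out.println(describe());
    }
}
```

The program never touches the operating system directly. It only calls the Java API, and the JRE installed on each machine supplies the platform-specific answer.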

Normally, compiled programs are faster than interpreted ones, so one may think that since Java Bytecode is interpreted, it will be pretty slow. For several reasons, this isn’t the case. First, the Bytecode format is designed to be compact and cheap to decode. Second, JIT compilers translate frequently executed Bytecode into native code; they are discussed in their own section later in this article.

Bonus Fact: There are processors designed specifically for Java (ARM’s Jazelle extension is one example) that can execute Java Bytecode in hardware, without interpreting it in software first.

Inside look at Bytecode

This section is going to briefly discuss the internal structure of Bytecode. It’s enough to give you an idea of what Bytecode really looks like. If you’re interested in learning more you can look it up online.

The great thing about Bytecode is that once you learn its syntax, you can use it anywhere, because it is an intermediate representation of the code and not the machine code that the CPU executes. And unlike machine code, which is notoriously difficult to learn, Bytecode is significantly easier. Another plus is that Oracle publishes complete documentation of it in the Java Virtual Machine Specification.

The Bytecode stream is a sequence of instructions for the Java Virtual Machine. Each instruction consists of a one-byte opcode followed by zero or more operands, in the following format.

opcode operand(optional) operand(optional) ......

If you’re familiar with Assembly language, you’ll notice the similarities.
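As a rough illustration of the opcode-plus-operands format, the sketch below decodes a short, hand-written byte sequence. It is not a real disassembler: TinyDisassembler and its methods are hypothetical names, and only six opcodes are handled. The opcode values themselves (0x05 for iconst_2, 0x3C for istore_1, 0x1B for iload_1, 0x11 for sipush, 0xA2 for if_icmpge, 0xB1 for return) are taken from the Java Virtual Machine Specification.

```java
import java.util.ArrayList;
import java.util.List;

public class TinyDisassembler {

    // Reads a signed, big-endian 16-bit operand starting at the given index.
    static short readShort(byte[] code, int index) {
        return (short) (((code[index] & 0xFF) << 8) | (code[index + 1] & 0xFF));
    }

    // Decodes the handful of opcodes this sketch knows about, one line per instruction.
    static List<String> disassemble(byte[] code) {
        List<String> lines = new ArrayList<>();
        int pc = 0; // byte offset of the current instruction
        while (pc < code.length) {
            int opcode = code[pc] & 0xFF; // the opcode is always a single byte
            switch (opcode) {
                case 0x05: lines.add(pc + ": iconst_2"); pc += 1; break; // no operands
                case 0x3C: lines.add(pc + ": istore_1"); pc += 1; break; // no operands
                case 0x1B: lines.add(pc + ": iload_1");  pc += 1; break; // no operands
                case 0xB1: lines.add(pc + ": return");   pc += 1; break; // no operands
                case 0x11: // sipush: one 16-bit constant as operand
                    lines.add(pc + ": sipush " + readShort(code, pc + 1));
                    pc += 3;
                    break;
                case 0xA2: // if_icmpge: 16-bit branch offset, relative to this instruction
                    lines.add(pc + ": if_icmpge " + (pc + readShort(code, pc + 1)));
                    pc += 3;
                    break;
                default:
                    throw new IllegalArgumentException(
                            "unsupported opcode 0x" + Integer.toHexString(opcode) + " at offset " + pc);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        byte[] code = {0x05, 0x3C, 0x1B, 0x11, 0x03, (byte) 0xE8, (byte) 0xA2, 0x00, 0x26};
        for (String line : disassemble(code)) {
            System.out.println(line);
        }
    }
}
```

Each printed line starts with the byte offset of the instruction, which is why the offsets jump from 3 to 6 after sipush: that instruction takes up three bytes, one for the opcode and two for its operand.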

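Below is a Bytecode example, representing Java code. It’s OK if you don’t understand it. Just get a general idea of what the Bytecode syntax looks like. The number before each colon is the byte offset of the instruction, and jump instructions such as goto refer to these offsets. The constant pool operands of getstatic and invokevirtual are not shown in this listing. You can then compare it to the Java syntax to see the contrast, and even try to work out how the two correlate.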

0:   iconst_2
1:   istore_1
2:   iload_1
3:   sipush  1000
6:   if_icmpge       44
9:   iconst_2
10:  istore_2
11:  iload_2
12:  iload_1
13:  if_icmpge       31
16:  iload_1
17:  iload_2
18:  irem
19:  ifne    25
22:  goto    38
25:  iinc    2, 1
28:  goto    11
31:  getstatic       
34:  iload_1
35:  invokevirtual   
38:  iinc    1, 1
41:  goto    2
44:  return

The Java equivalent of the above Bytecode:

// Prints every prime number below 1000.
outer:
for (int i = 2; i < 1000; i++) {
    for (int j = 2; j < i; j++) {
        if (i % j == 0)
            continue outer; // i has a divisor, so move on to the next i
    }
    System.out.println(i);
}

Bonus Fact: The javap tool that ships with the JDK prints a Bytecode version of your compiled code; for example, running javap -c on a class name disassembles that class. It’s useful if you’re learning to read Bytecode.

Security in Java

While we’re discussing the JRE and JVM, we’ll also discuss the measures Java takes to protect the system it’s running on. For those who don’t know what a virtual machine is, it’s basically a software emulation of a computer: programs run on the emulated machine rather than directly on the real hardware and operating system. This creates a controlled environment where code can run with far less risk to the actual operating system.

In our case, the JVM creates a restricted execution environment, called the sandbox, in which the Java program is executed. This prevents unauthorized access to certain parts of your system. You may not think much of this if it’s your own Java program you’re running, but if it’s a malicious program written by a hacker that you’ve downloaded unknowingly, the JVM will be your first line of defense.

JIT Compilers

Just-in-Time (JIT) compilers rely on the concept of compiling on the fly. They are an essential component of the Java Runtime Environment that improves the performance of Java applications at run time.

JIT compilers use the Bytecode to generate a natively compiled version of parts of the program. Remember, compiling a program translates it into machine code, the computer’s native language. This is in contrast to the Java interpreter, which executes Bytecode instruction by instruction. JIT compilers work together with the JVM to convert selected Bytecode sequences (typically the most frequently executed ones, not all of it) into native code, which the hardware can execute directly. This, plus the optimizations the JIT compiler performs while compiling, improves the overall performance of the program.

Another advantage of the JIT compiler is that the machine code is generated on the exact machine on which it will be executed. This means that the JIT has the best possible information available to it to produce optimized code. If you pre-compile Bytecode into machine code ahead of time, the compiler knows only the build machine (the machine on which it was built) or a generic target, so it cannot optimize for the specific machine(s) the program eventually runs on.
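One informal way to watch this happen is to time the same hot method over several rounds. The sketch below does that; JitWarmup and sumOfSquares are names invented for this example, and it is an experiment rather than a proper benchmark. On a JVM with a JIT compiler, later rounds often run faster than the first ones once the method has been compiled, but the actual timings depend on the JVM, its settings and the machine, so no particular numbers should be expected.

```java
public class JitWarmup {

    // A small, frequently called method: a typical candidate for JIT compilation.
    static long sumOfSquares(int n) {
        long sum = 0;
        for (int i = 1; i <= n; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    public static void main(String[] args) {
        long checksum = 0; // used below so the work cannot simply be discarded
        for (int round = 1; round <= 10; round++) {
            long start = System.nanoTime();
            for (int call = 0; call < 20_000; call++) {
                checksum += sumOfSquares(1_000);
            }
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println("round " + round + ": " + micros + " us");
        }
        System.out.println("checksum = " + checksum);
    }
}
```

On the HotSpot JVM you can compare against a run started with the -Xint option, which switches the JIT compiler off and interprets all Bytecode. The -XX:+PrintCompilation option logs which methods get compiled.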

This marks the end of the Java Bytecode article. Any suggestions or contributions are more than welcome. Questions regarding the article can be asked in the comments section below.

