Today let us understand the compilation process in Java, that is, a preface to the Java Virtual Machine and its architecture.
Preface To Java Virtual Machine And Architecture
In this post we are going to learn about the JVM architecture in a simpler way.
Often we come across this question in an interview: can you tell me something about the JVM architecture, or how the JVM works?
We sometimes fail to answer in an interview, because we haven't prepared.
Now we are going to cover: what is the JVM? How does the JVM work? And what are the internal components of the JVM?
Let’s get started,
Whenever we write a Java program we use the Java compiler to compile it and get a .class (dot class) file, or bytecode, as the output.
The JVM's responsibility is to load that class file and produce the output. That's it. This is a simple layman's definition.
Now our application consists of .class (dot class) files, the intermediate output generated by the Java compiler.
They are loaded into the first major component of the JVM, which is the class loader subsystem.
The class loader subsystem is responsible for loading your bytecode and treating the bytecode as instructions.
It also takes care of the other classes your code depends on, for example collection classes, system classes and the various other classes the JVM provides.
So the class files are loaded by the class loader subsystem, but there are three internal phases inside it: loading, linking and initialization.
Now loading is the phase where your class files get loaded. Loading involves three different kinds of loaders.
The first one is the bootstrap loader, the second one is the extension class loader and the third one is the application class loader.
We are going to look at each of them one by one. Let's start,
Bootstrap Loader:
The bootstrap loader is responsible for loading the internal Java class files, which live in a file called rt.jar (up to Java 8). You often encounter the rt.jar file in your Java directory; it consists of all the important classes and important packages required by Java.
All the primary packages and all the primary classes are there in the rt.jar file.
Extension Class Loader:
The extension class loader is responsible for loading the extension classes needed by the JVM for further processing.
These live in the lib/ext directory, so lib/ext contains all the extension classes, which are loaded after the bootstrap loader has run. Now after the extension classes have been loaded successfully, the third loader comes into play, the application class loader, which loads classes from the application classpath.
Application Class Loader:
Application classpaths are specified with the -cp (minus cp) parameter. This loader loads your application classes, which are supplied at compile time or at runtime.
You can explicitly supply the classpath with the -cp parameter, or you can set it in the CLASSPATH environment variable as well.
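The three loaders above form a parent chain that we can actually observe from code. Here is a minimal sketch; ClassLoaderDemo is my own class name, and the exact printed loader names vary by Java version:

```java
public class ClassLoaderDemo {
    public static void main(String[] args) {
        // Core classes such as String are loaded by the bootstrap loader,
        // which is implemented natively, so getClassLoader() returns null.
        System.out.println(String.class.getClassLoader()); // null

        // Our own class is loaded by the application class loader.
        ClassLoader appLoader = ClassLoaderDemo.class.getClassLoader();
        System.out.println(appLoader);

        // Its parent is the extension loader (Java 8), whose own parent
        // is the bootstrap loader, again represented as null.
        System.out.println(appLoader.getParent());
        System.out.println(appLoader.getParent().getParent()); // null
    }
}
```

Running this shows the delegation chain: application loader, then extension loader, then bootstrap (null).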
So the loading phase is over. Now let's get to the second phase, called linking.
Linking is the phase where most of the work is done. Linking involves three sub-processes: verifying, preparing and resolving.
The verify phase is where your Java bytecode is checked: it verifies whether the bytecode conforms to the JVM specification or not.
If there is a problem while verifying the Java bytecode, the JVM throws a java.lang.VerifyError during this phase.
Now the second phase is the prepare phase. In the prepare phase all the static variables of the class are initialized to their default values.
For example if I have a variable such as
public static boolean something = true;
then in the prepare phase the something variable, being a static boolean, will be initialized to the default value of the boolean type, which is false, not true.
Because the prepare phase initializes variables with default values, not their assigned values. Now after the prepare phase we have the resolve phase.
Now the resolve phase's job is to resolve the other associated classes that are referenced by the main class. For example, all the symbolic references to other classes are resolved during the resolve phase.
Say you have a class called "Bike". Now "Bike" can contain a field of another class type, called Owner. During the resolve phase the Bike class will go and check whether there is a definition for Owner or not.
If there is a definition, there is no problem and resolution continues as needed. But if the JVM fails to find the Owner class, it throws an error, java.lang.NoClassDefFoundError.
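A minimal sketch of that "Bike" example; the class and field names here (Owner, owner, name) are my own, just for illustration:

```java
// Bike holds a reference to Owner. During the resolve phase the JVM checks
// that a definition of Owner exists; if Owner.class were missing at runtime
// (say, deleted after compilation), the use below would fail with
// java.lang.NoClassDefFoundError.
class Owner {
    String name = "Alex"; // hypothetical field, just for illustration
}

public class Bike {
    // A reference-type field: set to its default value (null) in the
    // prepare phase, resolved against the Owner definition when used.
    static Owner owner;

    public static void main(String[] args) {
        owner = new Owner(); // uses the resolved Owner class
        System.out.println(owner.name);
    }
}
```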
After the resolve phase the third phase comes into play, which is initialization. Initialization is the phase where your static initializers run first, and then all the variables that were given default values in the prepare phase are assigned their actual values.
For example, in the prepare phase we assigned the something variable the default value false; in the initialization phase it is assigned its actual value.
As discussed, public static boolean something was initialized to false in the prepare phase. Now, in the initialization phase, it is set to true, which is the actual value.
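The prepare-versus-initialization behaviour can be sketched like this; InitDemo is my own class name:

```java
public class InitDemo {
    // Prepare phase: 'something' gets the boolean default value, false.
    // Initialization phase: the assignment below runs and it becomes true.
    public static boolean something = true;

    // Static initializer blocks also run during the initialization phase,
    // after the field assignments that precede them in the source.
    static {
        System.out.println("initialization phase: something = " + something);
    }

    public static void main(String[] args) {
        // By the time any code can observe the field, it holds true.
        System.out.println(something); // true
    }
}
```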
So that's how the class loader subsystem actually works, explained in a much simpler way.
There is another major component, called the "Runtime Data Areas", which comprises all the memory areas that are going to be utilized by the JVM itself.
Now let's encounter the method area first.
The method area is a runtime data area whose responsibility is to hold your class data, or the metadata corresponding to a class.
All the data that corresponds to the class gets stored in the method area. Up to Java 7 the method area (PermGen) is 64 megabytes by default, but it can be tuned, using a flag I am going to discuss in our next post.
It can be tuned with the -XX:MaxPermSize flag. For example, suppose your server loads thousands and thousands of classes, or millions of classes.
Now there is a probability that you will get a java.lang.OutOfMemoryError. This is due to exceeding the size of this memory area.
So you can tune it with the -XX:MaxPermSize flag. When we talk about Java 8, PermGen has been replaced by Metaspace.
What the Java 8 developers did is remove the need for any external developer input to tune the method area space.
They introduced Metaspace. Metaspace is responsible for automatically allocating memory and handling expansion as well as shrinking.
So that's all about the method area.
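On a Java 8+ HotSpot JVM you can even watch Metaspace from code via the standard management beans. A small sketch; MetaspaceDemo is my own class name, and the pool name "Metaspace" is HotSpot-specific:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;

public class MetaspaceDemo {
    public static void main(String[] args) {
        // On HotSpot for Java 8+, one of the JVM memory pools is named
        // "Metaspace"; its usage grows as classes are loaded.
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().contains("Metaspace")) {
                System.out.println(pool.getName() + ": "
                        + pool.getUsage().getUsed() + " bytes used");
            }
        }
        // Metaspace grows automatically, but a ceiling can still be set
        // with the -XX:MaxMetaspaceSize flag.
    }
}
```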
The heap is the most heavily used memory area in the JVM. The heap stores all the object data. For example, whenever you instantiate a new object, say
MyClass obj = new MyClass();
all the objects that are created will live in the heap memory only.
All the properties, all the characteristics and all the attributes of the object get stored in the heap. You can store arrays there too, since arrays are also objects.
So all the objects get stored in the heap. By default the maximum heap size is one fourth of the physical memory.
But this can also be tuned: -Xms for the starting (minimum) size and -Xmx for the maximum size. So the heap can be tuned with these parameters.
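A small sketch of what lives on the heap; HeapDemo and the variable names are my own:

```java
public class HeapDemo {
    public static void main(String[] args) {
        int[] numbers = new int[] {1, 2, 3};          // array object, on the heap
        StringBuilder text = new StringBuilder("hi"); // ordinary object, on the heap

        // Only the references 'numbers' and 'text' live in the current
        // stack frame; the objects themselves are heap-allocated.
        System.out.println(numbers.length + text.length()); // 5

        // The current maximum heap size (the value -Xmx controls) can be
        // inspected at runtime:
        System.out.println(Runtime.getRuntime().maxMemory() + " bytes max heap");
    }
}
```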
Now here comes the picture of the Java stacks.
The Java stack contains stack frames, one per method invocation. The task of the Java stack is to hold the frames of the methods being invoked, in last-in, first-out (LIFO) order.
We often call methods one by one. So for example one thread is calling method one, method one is calling method two, and method two is calling method three.
So method one's frame is pushed first, then method two's frame comes into the picture and then method three's frame comes into the picture.
Frames are popped as the methods return, and we don't need to worry about that because they are going to return anyway.
Sometimes it happens that the Java stack keeps being stacked up again and again. What actually happens is that you write a program with a recursive algorithm whose termination is not handled by the developer.
So stack frames get added again and again and again, because of the developer's fault. This causes a StackOverflowError.
This can also be tuned: there is a flag called -Xss which is used to change the size of each thread's Java stack.
But basically you don't need to worry, because the frames are automatically cleaned up by the JVM itself as methods return.
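The runaway-recursion case can be sketched like this; StackDemo and the depth counter are my own names:

```java
public class StackDemo {
    static int depth = 0;

    static void recurse() {
        depth++;   // count frames as they are pushed
        recurse(); // no base case: a new frame is pushed on every call
    }

    public static void main(String[] args) {
        try {
            recurse();
        } catch (StackOverflowError e) {
            // The thread's Java stack was exhausted.
            System.out.println("stack overflowed after " + depth + " frames");
        }
        // A larger per-thread stack (e.g. -Xss2m) would allow more frames
        // before the overflow.
    }
}
```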
Now here comes the fourth picture, the PC registers.
PC registers are program counter registers, which point to the next instruction to be executed.
There is one PC register per thread: suppose there are three threads, then thread one's register tracks the next instruction for thread one, thread two's register tracks the next instruction for thread two, and so on. Basically each one is a pointer to the next instruction to be executed by its thread.
Here comes the fifth important area, which is the native method stack.
Native Method Stack:
The native method stack works in parallel with the Java stack. Native method stacks are operating-system dependent.
All the native, operating-system-dependent code is loaded through the native method stack. For example, on Windows there are libraries called DLLs.
There are a lot of DLLs available if you are using Windows. A DLL is responsible for holding the code corresponding to the operating-system dependency.
If you are using Windows, it is a .dll (dot dll) file. If you are using Linux or UNIX, there is a probability that you will find files with a .so (dot so, shared object) extension instead.
All of this belongs to the native method stack.
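We can verify from code that some everyday JDK methods are native, meaning their bodies live in such OS libraries and their calls run on the native method stack. A small sketch; NativeDemo is my own class name:

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

public class NativeDemo {
    public static void main(String[] args) throws Exception {
        // System.currentTimeMillis() is declared 'native' in the JDK:
        // its body is compiled OS-specific code (inside a .dll / .so),
        // and calls to it go through the native method stack.
        Method m = System.class.getMethod("currentTimeMillis");
        System.out.println("native? " + Modifier.isNative(m.getModifiers())); // true
    }
}
```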
Here comes the third most important component, the execution block, which is called the execution engine.
The execution engine is responsible for executing the bytecode instructions. It comprises various subsystems: the interpreter, the garbage collector, the just-in-time (JIT) compiler and the HotSpot profiler.
The interpreter interprets the bytecode instructions line by line and executes them, calling out through the native method interface (JNI) whenever native code is involved.
Now here comes the picture of the just-in-time compiler. Whenever the execution engine encounters the same kind of instructions being executed again and again, it compiles that piece of code.
It compiles the part of the code which is repeated over and over again, so that it can improve performance later. For example, suppose an instruction sequence XYZ is encountered again and again and again.
What the JIT does is compile XYZ to native code automatically. Now whenever XYZ is encountered in a later instruction, the JVM saves that interpretation time, and thus the performance improves.
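A typical JIT candidate looks like this; JitDemo and square are my own names, and running it with the real HotSpot flag -XX:+PrintCompilation lets you watch the method being compiled:

```java
public class JitDemo {
    static long square(long x) {
        return x * x; // tiny, frequently executed: ideal for the JIT
    }

    public static void main(String[] args) {
        long sum = 0;
        // The interpreter executes the first calls; once the call count
        // crosses HotSpot's threshold, square runs as compiled native code.
        for (long i = 0; i < 1_000_000; i++) {
            sum += square(i);
        }
        System.out.println(sum);
    }
}
```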
The HotSpot profiler keeps an eye on the running bytecode and helps the JIT compiler by gathering statistics: it identifies the instructions which are repeated over and over again.
That’s all about the JVM architecture. I hope you understood the whole JVM architecture.