The Primitive Data Types in Java
With discussions on overflow, underflow, two's complement, Java default data types, and casting (widening and narrowing casting).
What are data types?
When programmers create a variable to store data, they are usually required to specify the type of data the variable can store: a whole number, a decimal number, a letter or a bunch of letters. This is the data type of the variable, and it tells the compiler or interpreter what kind of data the variable can hold, the operations that can be performed on the data and the limits of the data. Think of a variable as a container, and its data type as the label on the container indicating what the container can hold. A programming language is said to be strongly typed if it forces values to be used only in ways that are allowed for their type; otherwise an error is raised. Java is an example of a strongly typed language. A programming language is said to be weakly typed if it allows values to be used as if they were of another type, often by converting them silently. JavaScript is a commonly cited example of a weakly typed language. Python, although dynamically typed, is generally considered strongly typed.
What are primitive data types?
When writing complex programs, a programmer usually needs to create custom data types to hold the unique kind of data the program requires. To make this possible, programming languages usually come out of the box with basic data types that programmers can use in building their custom data types. These basic data types are called primitive data types, and they can be seen as the building blocks for other custom data types. Primitive data types have lower and upper limits, and each is named with a reserved keyword.
What are the primitive data types in Java?
Java ships with 8 primitive data types, namely byte, short, int, long, float, double, char and boolean. Among the 8 data types, 6 are numeric data types - byte, short, int, long, float and double. It is important to note that primitive data types have limits both in memory size and in the range of values they can hold.
Some concepts before we discuss the primitive data types
Before discussing each primitive type, let’s talk about three concepts - overflow, underflow and two’s complement.
Overflow
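Overflow, in Java, happens when an expression or a cast produces a value greater than the upper limit of a primitive data type. For the integer types, Java does not raise an error at run time; the value wraps around to the lower end of the range.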
Underflow
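Underflow, as the term is used in this article, happens when an expression or a cast produces a value less than the lower limit of a data type. For the integer types, the value wraps around to the upper end of the range.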
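As a hedged illustration of the wrap-around described above, the sketch below adds past both limits of an int. The class and method names are made up for this example; Math.addExact is a standard library method that throws an ArithmeticException instead of wrapping.

```java
class OverflowDemo {
    // Plain int addition wraps around silently when the result leaves the int range.
    static int wrappingAdd(int a, int b) {
        return a + b;
    }

    // Math.addExact throws ArithmeticException rather than wrapping,
    // which makes it possible to detect the problem.
    static boolean additionOverflows(int a, int b) {
        try {
            Math.addExact(a, b);
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // One past the maximum wraps to the minimum.
        System.out.println(wrappingAdd(Integer.MAX_VALUE, 1));  // -2147483648
        // One below the minimum wraps to the maximum.
        System.out.println(wrappingAdd(Integer.MIN_VALUE, -1)); // 2147483647
        System.out.println(additionOverflows(Integer.MAX_VALUE, 1)); // true
    }
}
```

When silent wrap-around would be a bug, the exact-arithmetic methods in Math are one way to surface it early.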
Two's complement
Two’s complement is the method computers use to represent signed integers (positive or negative) in a fixed number of bits, where the most significant bit (that is, the left-most bit) signifies the sign. The sign is negative if the left-most bit is 1, and non-negative if it’s 0. The two’s complement of a binary number is obtained by inverting the bits and adding 1. For example, in 8 bits the number 5 is 00000101. Inverting the bits (changing 1 to 0 and 0 to 1) gives 11111010, and adding 1 gives 11111011. Read as a two’s complement number, 11111011 is -5, whereas read as an unsigned number it would be 251. This way the computer is able to represent both positive and negative integers.
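The invert-and-add-one rule can be checked directly in Java. The following is a minimal sketch; the class and helper names are assumptions made for illustration, while the ~ operator and Integer.toBinaryString are standard Java.

```java
class TwosComplementDemo {
    // Negate a number the two's complement way: invert every bit, then add 1.
    static int negate(int value) {
        return ~value + 1;
    }

    // Show the lowest 8 bits of a value as a zero-padded binary string.
    static String toBits8(int value) {
        String bits = Integer.toBinaryString(value & 0xFF);
        return String.format("%8s", bits).replace(' ', '0');
    }

    public static void main(String[] args) {
        System.out.println(toBits8(5));   // 00000101
        System.out.println(toBits8(-5));  // 11111011
        System.out.println(negate(5));    // -5
    }
}
```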
Explanation of each primitive data type
byte
The byte is an 8-bit (hence the name byte) signed two’s complement integer. It labels a variable to hold whole numbers within the range -128 to 127 (inclusive). It is used to save memory when using a large array of numbers. Once one decides that the integers an array will hold are within the range -128 to 127, it makes sense to save memory by storing those integers as bytes. This is because if one uses int or long, the computer will allocate extra memory that won’t be used.
Assigning an integer literal greater than 127 or less than -128 directly to a byte is a compile error. The concepts of overflow and underflow come into play when the result of an expression greater than 127 or less than -128 is cast to a byte.
byte myByteResult = (byte) (Byte.MAX_VALUE + 5);
In the example above, Byte.MAX_VALUE is used to get the highest (maximum) value of a byte, which is 127. Adding 5, it becomes 132. Java treats the resulting 132 as an int (Java’s default in cases like this), so we use (byte) to cast the int 132 to a byte (more on casting and defaults later). Because the variable myByteResult is a byte, the 132 is forced to overflow to -124 (132 - 2^8).
short
The short is a 16-bit signed two’s complement integer. It indicates that a variable will contain whole numbers within the range -32,768 to 32,767 (inclusive). Like the byte, it can be used to save memory when using large arrays. The concepts of overflow and underflow also apply.
short myShortResult = (short) (Short.MIN_VALUE - 5);
In the example above, Short.MIN_VALUE is used to get the lowest (minimum) value of a short, which is -32,768. Subtracting 5, it becomes -32,773. As mentioned above, Java treats the resulting -32,773 as an int, so we use (short) to cast it to a short. It is, therefore, forced to underflow to 32,763 (-32,773 + 2^16) because -32,773 is less than the minimum value a short can hold.
int
The int is the most common numeric primitive data type in Java. It’s a 32-bit signed two’s complement integer. It indicates that a variable can contain whole numbers from a minimum of -2^31 (-2,147,483,648) to a maximum of 2^31 - 1 (2,147,483,647).
long
The long is a 64-bit signed two’s complement integer. It indicates that a variable can contain whole numbers from a minimum of -2^63 to a maximum of 2^63 - 1.
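The ranges quoted for byte, short and int all follow one pattern: an n-bit signed two’s complement integer runs from -2^(n-1) to 2^(n-1) - 1. The sketch below recomputes those limits and compares them with the constants on Java’s wrapper classes. The class and method names are assumptions for illustration, and the helpers are only meant for the widths 8, 16 and 32.

```java
class IntegerRangeDemo {
    // Smallest value of a signed two's complement integer with the given bit width.
    static long minForBits(int bits) {
        return -(1L << (bits - 1));
    }

    // Largest value of a signed two's complement integer with the given bit width.
    static long maxForBits(int bits) {
        return (1L << (bits - 1)) - 1;
    }

    public static void main(String[] args) {
        for (int bits : new int[] {8, 16, 32}) {
            System.out.println(bits + "-bit: " + minForBits(bits) + " to " + maxForBits(bits));
        }
        // The long limits are read from the wrapper class directly.
        System.out.println("64-bit: " + Long.MIN_VALUE + " to " + Long.MAX_VALUE);
    }
}
```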
When one needs to store numbers with a fractional part, one can use either float or double.
float
The float is a single-precision 32-bit floating point number. It indicates that a variable can hold numbers with a decimal point. It is precise to about 6 to 7 significant decimal digits in total, counting the digits on both sides of the decimal point.
double
The double is a double-precision 64-bit floating point number. It indicates that a variable can hold numbers with a decimal point. It is precise to about 15 to 16 significant decimal digits in total, counting the digits on both sides of the decimal point.
Although both float and double are suitable when fractional values are needed, which byte, short, int and long cannot represent, they are not suitable when exact precision is required, as in currency calculations. Java provides a class for such calculations - BigDecimal.
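To see why binary floating point is a poor fit for money, the following minimal sketch compares a double sum with the same sum done in BigDecimal. The class and method names are illustrative assumptions; BigDecimal and its String constructor are part of java.math.

```java
import java.math.BigDecimal;

class DecimalPrecisionDemo {
    // 0.1 and 0.2 have no exact binary representation, so the double sum is slightly off.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings keeps the decimal digits exactly.
    static BigDecimal exactSum() {
        return new BigDecimal("0.1").add(new BigDecimal("0.2"));
    }

    public static void main(String[] args) {
        System.out.println(0.1 + 0.2);          // 0.30000000000000004
        System.out.println(doubleSumIsExact()); // false
        System.out.println(exactSum());         // 0.3
    }
}
```

Constructing BigDecimal from a String rather than from a double matters here, since a double argument would carry its rounding error into the BigDecimal.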
char
The char data type is a single 16-bit Unicode character. It indicates that a variable can only hold a single character value. For example: char userLastInput = 'a'; Here 'a' is a single character, but it can also be replaced by its Unicode escape, which is \u0061, so the above assignment statement can be rewritten as char userLastInput = '\u0061'; You can get the Unicode representation of characters from here - unicode-table.com/en. It is important to note that a char variable cannot hold more than a single character, and one must wrap the character in straight single quotes, like this: 'a' or '\u0061'.
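A short sketch of the two equivalent ways of writing the same char, plus the fact that a char has a numeric code underneath. The class and method names are assumptions made for this example.

```java
class CharDemo {
    // The same character written as a Unicode escape.
    static char fromEscape() {
        return '\u0061';
    }

    // A char has a numeric code, so adding 1 and casting back gives the next character.
    static char nextChar(char c) {
        return (char) (c + 1);
    }

    public static void main(String[] args) {
        char userLastInput = 'a';
        System.out.println(userLastInput == fromEscape()); // true
        System.out.println((int) userLastInput);           // 97
        System.out.println(nextChar(userLastInput));       // b
    }
}
```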
boolean
The boolean data type is used when a variable can only be assigned either of two values - true or false. A variable declared as a boolean can only be true or false and nothing else. It represents one bit of information, although the amount of memory it actually occupies is not precisely defined.
boolean isAbove18 = false;
The above statement is an example.
The default of primitive data types in Java
Java uses some of its primitive data types as defaults for literals and expressions. For example, a whole-number literal is by default an int. This becomes visible when one tries to assign a whole number that is greater than the maximum value of an int to a variable of data type long, even if the whole number is within the range of long. The maximum value of an int is 2147483647 (this can also be written as 2_147_483_647 to make it more readable), and it can be obtained programmatically with the statement int maxInteger = Integer.MAX_VALUE; When one tries to assign to a long the value 2_234_483_647 (which is out of the range for an int) with the statement long myLongValue = 2_234_483_647; one gets the error - integer number too large. This is because Java treats the literal 2_234_483_647 as an int (which is its default whole number data type). In order to clear this error, one has to append the suffix “L” to indicate that the literal should be treated as a long.
long myLongValue = 2_234_483_647L;
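The default data type for decimal literals is the double. This is why the statement float myFloatNumber = 4.77; produces an error such as Required type: float Found: double. The exact wording depends on the compiler or IDE; javac, for instance, reports a possible lossy conversion from double to float.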
float myFloatNumber = 4.77f;
To clear the error, one must append the suffix “F” or “f” to the literal, as in the above statement.
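The suffixes affect arithmetic as well as assignment. In the hedged sketch below (class and method names are illustrative), the same addition is done once with two int operands and once with a long literal, and only the first wraps around.

```java
class LiteralDefaultsDemo {
    // Both operands are int, so the sum is computed as an int and wraps around
    // before it is widened to the long return type.
    static long intArithmetic() {
        return Integer.MAX_VALUE + 1;
    }

    // The L suffix makes the literal a long, so the sum is computed as a long.
    static long longArithmetic() {
        return Integer.MAX_VALUE + 1L;
    }

    public static void main(String[] args) {
        long myLongValue = 2_234_483_647L; // compiles only with the L suffix
        float myFloatNumber = 4.77f;       // compiles only with the f (or F) suffix
        System.out.println(myLongValue);
        System.out.println(myFloatNumber);
        System.out.println(intArithmetic());  // -2147483648
        System.out.println(longArithmetic()); // 2147483648
    }
}
```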
What is casting?
Casting in Java is the process of converting a value from one data type to another, provided the two types are convertible. For instance, all the numeric primitive data types in Java are convertible to one another. That is, it is possible to convert from type byte to short, short to int, int to long, long to double, etc. There are three types of casting we will look at now:
- Widening Casting
- Narrowing Casting
- Explicit casting for expressions
Widening Casting
This is the type of casting done when one converts from a smaller data type (smaller in terms of size and range) to a larger data type. For example, from byte to int (that is, one is widening the byte size to that of an int), from int to long, etc. Java does this type of casting for us automatically.
short myShortValue = 12;
long myLongValue = myShortValue;
No compile error is thrown in the above statements because Java is able to handle the casting for us automatically.
Narrowing Casting
This is the type of casting that is involved when one casts from a larger data type to a smaller data type. For example, from int to short (that is, one is narrowing the int size to that of a short), or from long to int, or from double to long (yes, double counts as larger than long: both occupy 64 bits, but double covers a far wider range of values). Java does not do this type of casting automatically for us. This is because there is the possibility of an unintended data loss (or precision loss) when casting from a larger data type to a smaller one.
int myIntValue = 23;
byte myByteValue = (byte) myIntValue;
Without manually casting with (byte), we get a compile error - Required type: byte Provided: int.
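The data loss mentioned above is easy to observe. The following minimal sketch (class and method names are assumptions for illustration) narrows a value that fits, a value that does not fit, and a value with a fractional part.

```java
class NarrowingDemo {
    // Keeps only the lowest 8 bits of the int.
    static byte toByte(int value) {
        return (byte) value;
    }

    // Discards the fractional part of the double.
    static int toInt(double value) {
        return (int) value;
    }

    public static void main(String[] args) {
        System.out.println(toByte(23));   // 23, fits in a byte so nothing is lost
        System.out.println(toByte(300));  // 44, because 300 - 256 = 44
        System.out.println(toInt(3.99));  // 3, truncated rather than rounded
    }
}
```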
Explicit Casting for expressions
In a Java expression involving operands of different numeric data types, the result of the expression takes the larger data type. However, if the larger data type is smaller than int, the result defaults to int.
short myShortValue = 5;
byte myByteValue = 2;
int resultVar = myShortValue / myByteValue;
There is no compile error in the above snippet of code because the operands are of type short and byte, the larger data type being short, which is smaller than int; so the result of the expression defaults to int, and because the type of the variable resultVar is int, no error is thrown.
long myLongValue = 100;
double myDoubleValue = 2.5;
int resultVar = myLongValue / myDoubleValue;
A compile error is thrown above. This is because the data types of the operands are long and double. The result of the expression on the right is double, as it is the larger data type. The error is thrown because the variable resultVar was defined as an int. The error will be Required Type: int Provided: double. In order to correct the error, the statement should be rewritten as:
double resultVar = myLongValue / myDoubleValue;
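Or we can explicitly cast the result of the whole expression like so. The parentheses around the division matter: a cast applies only to the operand immediately after it, so writing (int) myLongValue / myDoubleValue would cast myLongValue alone and the division would still produce a double.
int resultVar = (int) (myLongValue / myDoubleValue);
However, it is important to know that in the second option there is the possibility of data loss, because the fractional part of the double is discarded.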
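Putting the two cases from this section side by side, here is a small runnable sketch. The class and method names are made up for illustration; the variable values mirror the snippets above.

```java
class ExpressionCastDemo {
    // short / byte: both operands are promoted to int, so the result is an int.
    static int divideSmallTypes(short a, byte b) {
        return a / b;
    }

    // long / double: the result is a double, so it must be cast to fit an int.
    // The parentheses make the cast apply to the whole quotient.
    static int divideAndTruncate(long a, double b) {
        return (int) (a / b);
    }

    public static void main(String[] args) {
        short myShortValue = 5;
        byte myByteValue = 2;
        System.out.println(divideSmallTypes(myShortValue, myByteValue)); // 2 (integer division)
        System.out.println(divideAndTruncate(100, 2.5)); // 40
        System.out.println(divideAndTruncate(7, 2.0));   // 3, the .5 is discarded
    }
}
```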