Buffers API Fundamentals

By Amanda Waite

Table Of Contents:

Setting Up Your System for the Code Samples
Buffers - A Primitive Technology?
Buffers vs. Arrays
What's Inside a Buffer?
Wrapping Existing Arrays
Accessing the Internal Array
Wrapping Strings
Writing Data and the Position Field
The Limit Field and the flip() Method
Reading Data and the remaining() and hasRemaining() Methods
The bulk get() Methods
The mark field and reset() Methods
The rewind() and clear() Methods
More on Wrapping Arrays
Some of the Rules
Exceptions
Absolute put()'s and get()'s
duplicate(), slice() and asReadOnlyBuffer()
Conclusion
For More Information

The following Technical Article may contain actual software programs in source code form. This source code is made available for developers to use as needed, pursuant to the terms and conditions of this license.

Overview

In this article we will be looking at the new APIs and functionality that were introduced in the java.nio (New I/O) package of the Java[tm] 2 Platform, Standard Edition (J2SE[tm]) version 1.4. The New I/O package came about as a result of Java Specification Request 51 (JSR-51), which was raised as part of the Java Community Process. The key features of JSR-51 that have been implemented across the new APIs of the New I/O package include highly scalable network I/O, high performance buffered binary and character I/O and high performance file I/O.

For the most part I/O operations are the function of the Channels API, but much of the performance increase for I/O operations comes from the introduction of the Buffers API. Buffers were designed to be used as the data store for all I/O operations involving Channels, and special kinds of Buffers called Direct Buffers allow us to transfer data at much faster rates than were possible with the original I/O classes. In addition to this, Channels can be used in conjunction with Selectors in order to perform multiplexed Non-Blocking I/O. This means that it is possible to design a server that doesn't need to create a new thread to handle each new connection. In fact, one thread can now handle thousands of connections!

Finally we have Charsets (java.nio.charset) and Regular Expressions (java.util.regex). Charsets allow us to create named mappings between sequences of 16-bit Unicode characters and byte sequences, while regular expressions allow us to search character sequences, such as Strings, for specific patterns of characters. In addition to these, the New I/O APIs introduce many new features that can be used to increase performance and improve programmability when performing I/O operations on Files.
Setting Up Your System for the Code Samples

This article includes several code samples that are designed to reinforce the technical information presented in the text and figures. In order to run these code samples you will need to ensure that your system is set up correctly and is running a compatible version of the Java platform.
Buffers - A Primitive Technology?

The first of the New I/O APIs that we will be looking at is the Buffers API. We will be examining in detail how and where we would use Buffers as well as how they work behind the scenes. The Buffers API can be used on its own, but for the most part it will be used in conjunction with the other New I/O APIs. For this article we are going to try to keep things simple by looking at Buffers on their own. In a future article we will be looking at how Buffers are used when performing I/O operations using the Channels API.
Buffers vs. Arrays

A Buffer is a container for a fixed amount of data of a specific primitive type (e.g. float, int or short) and in this respect a Buffer is very similar to an array of primitives. In fact, the comparison with arrays is an important one as it does much to help us understand not only what Buffers can do but also how they work.

For as long as we've had computing, programmers have been using arrays. We use them all of the time without really even thinking about it. We use them because they do one thing really well - storing data. Whenever we use an array we'll probably be writing lots of supporting code to help us manipulate the data that it holds, either while it's actually inside the array or when we are moving it in and out of the array. Circumstances will often arise where we have a partially filled array that we would like to add data to elsewhere in our code, or where the usable data starts at a point other than the beginning of the array or does not fill the array completely. To deal with this it is often necessary to maintain variables that indicate the offset and the length of the data that is stored in the array. This doesn't scale well, particularly if we need to pass the array to other sections of code that also need to be able to find the usable data inside it. For this reason programmers have often developed their own classes or structs that wrap an array and include fields for offset and length, as well as offering other features specific to the purpose at hand.

This is where the Buffers API comes in. Buffers are also really good at storing data, but unlike arrays they provide us with a number of tools that we can use for manipulating and keeping track of the data that they store. These tools include fields that allow us to find out where the usable data is within the Buffer, along with methods that quickly allow us to change the state of the Buffer so that we can perform new operations upon it. The biggest single advantage that Buffers give us over arrays is the ability to do high performance I/O operations using Channels and what are known as Direct Buffers. We will be looking at both of these features in a future article.

The Buffers API gives us a Buffer class for each primitive type except for boolean. All of these classes extend the abstract class java.nio.Buffer.
java.nio.Buffer defines all of the methods that do not directly change the data that the Buffer contains, while all of the methods that do change the contents of the Buffer are declared in the Buffer classes themselves. These Buffer classes are in fact abstract classes, and the methods used to change the contents of the Buffer are also abstract. Each of the Buffer classes has several subclasses in which these methods are implemented. When we create a Buffer we actually get back an instance of one of these subclasses. To keep things simple, for the most part we will talk about using instances of the Buffer classes and not their subclasses.

Because the Buffer classes are abstract, we cannot create an instance of a Buffer using the 'new' keyword. Instead, we call the static allocate() method of the Buffer class that we want an instance of. allocate() takes as its input argument the capacity of the Buffer that we want to create. The capacity of the Buffer is the number of primitive elements of its specific type that it will hold and is specified as an integer. Following is an example:

FloatBuffer floatBuf = FloatBuffer.allocate(24);

This will "allocate" a new FloatBuffer that will hold (or has a capacity of) 24 elements of type float. Once you have an instance of a Buffer, you can find out its capacity by calling its capacity() method. The important thing to note is that once a Buffer has been instantiated, its capacity can never be changed. The reasons for this will become clearer as we look at how Buffers work.
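The points above about allocate(), capacity() and the hidden subclasses can be checked with a few lines of code. The following is only a minimal sketch; the class name AllocateDemo and its helper methods are made up for this illustration and are not part of the Buffers API.

```java
import java.nio.FloatBuffer;

// Hypothetical demo class, not part of java.nio.
final class AllocateDemo {
    // Returns the capacity reported by a freshly allocated FloatBuffer.
    static int capacityOf(int requested) {
        FloatBuffer floatBuf = FloatBuffer.allocate(requested);
        return floatBuf.capacity();
    }

    // True when allocate() handed back an instance of a subclass
    // rather than of the abstract FloatBuffer class itself.
    static boolean isSubclassInstance() {
        FloatBuffer floatBuf = FloatBuffer.allocate(24);
        return floatBuf.getClass() != FloatBuffer.class;
    }

    public static void main(String[] args) {
        System.out.println("capacity: " + capacityOf(24));
        System.out.println("instance of a subclass: " + isSubclassInstance());
    }
}
```

The name of the concrete subclass is an implementation detail, which is why the sketch only checks that it differs from FloatBuffer.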
What's Inside a Buffer?

When using the allocate() method to get an instance of a Buffer, the Buffer that we get back is effectively a wrapper around an array of primitives. The type of the primitives that the array holds is determined by the type of Buffer class (FloatBuffer has a float array, IntBuffer has an int array, etc.). The array's size is determined by the Buffer's capacity. In the previous example, floatBuf would contain (or more accurately reference) a 24 element float array. The array is a private instance variable (field) of the Buffer, as is the capacity. The following figure (Fig. 1) illustrates this. The outside box represents the FloatBuffer that we created with the allocate() method and the inside box represents the array that it references. Two of the FloatBuffer's instance variables are also shown - the capacity and the float array fb.
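Fig. 1

We can see this in a simplified view of the allocate() method of the FloatBuffer class. Because allocate() is static it cannot assign to instance fields itself; the two assignments shown in the comments stand for what happens when the new Buffer instance is constructed:

public static FloatBuffer allocate(int capacity) {
    // In effect, inside the newly constructed instance:
    //     this.capacity = capacity;
    //     fb = new float[capacity];
    .....
}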
Because the array is an instance variable of the Buffer, when it's created all of its elements will be initialized with the default value for its type. You never have to initialize a Buffer's elements, regardless of where you are using the Buffer in your code (its scope). Two other important fields of the new Buffer instance are the position field and the limit field. We'll be using these fields when we start looking at how to get data in and out of a Buffer.
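As a quick check of the initial state described above, the sketch below inspects a newly allocated Buffer. FreshBufferDemo and its methods are hypothetical names used only for this example; it reads elements with the absolute get(int) method, which is covered later in the article.

```java
import java.nio.FloatBuffer;

// Hypothetical demo class, not part of java.nio.
final class FreshBufferDemo {
    // Returns {position, limit, capacity} of a newly allocated buffer.
    static int[] initialState(int capacity) {
        FloatBuffer buf = FloatBuffer.allocate(capacity);
        return new int[] { buf.position(), buf.limit(), buf.capacity() };
    }

    // True when every element of a new buffer holds the default value 0.0f.
    static boolean allZero(int capacity) {
        FloatBuffer buf = FloatBuffer.allocate(capacity);
        for (int i = 0; i < buf.capacity(); i++) {
            if (buf.get(i) != 0.0f) {   // absolute get: does not move the position
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        int[] state = initialState(24);
        System.out.println("position=" + state[0]
                + " limit=" + state[1] + " capacity=" + state[2]);
        System.out.println("all elements zero: " + allZero(24));
    }
}
```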
Wrapping Existing Arrays

As we have seen, a Buffer can simply be a wrapper around an array. When we create an instance of a Buffer using the allocate() method a new array will be created for us. But it is also possible for us to create a Buffer that is a wrapper around an existing array. We do this using the static wrap() method of the Buffer classes. Let's say that we have an array of integers called primes that we would like to access using an IntBuffer. To achieve this we can do the following:
int[] primes = {2, 3, 5, 7, 11, 13, 17, 19, 23, 29};
IntBuffer intBuf = IntBuffer.wrap(primes);
We have now effectively wrapped primes inside the new Buffer instance intBuf. Remember that primes is really just a reference to an array and this array is actually an object that exists in the Java heap. When intBuf was created it was given its own reference to the array referenced by primes. What this means is that both primes and intBuf reference the same array. If we change the contents of primes we effectively change the contents of intBuf, and if we change the contents of intBuf we change the contents of primes. Because primes is just a reference variable it can be reassigned or even set to null without affecting intBuf. Each of the Buffer classes has a wrap() method that takes as an argument an array of the same type as that of the Buffer class.
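The sharing between the array and the Buffer can be demonstrated in a few lines. This is a sketch only; WrapSharingDemo is a made-up class name, and it uses the absolute get(int) and put(int, int) methods that are described later in the article.

```java
import java.nio.IntBuffer;

// Hypothetical demo class, not part of java.nio.
final class WrapSharingDemo {
    // Returns {value read through the buffer, value read through the array}.
    static int[] sharedChanges() {
        int[] primes = {2, 3, 5, 7, 11};
        IntBuffer intBuf = IntBuffer.wrap(primes);

        primes[0] = 42;                 // a change made through the array...
        int viaBuffer = intBuf.get(0);  // ...is visible through the buffer

        intBuf.put(1, 99);              // a change made through the buffer...
        int viaArray = primes[1];       // ...is visible through the array

        return new int[] { viaBuffer, viaArray };
    }

    // Setting the array variable to null does not disturb the buffer,
    // because the buffer holds its own reference to the array.
    static int afterReassignment() {
        int[] primes = {2, 3, 5};
        IntBuffer intBuf = IntBuffer.wrap(primes);
        primes = null;
        return intBuf.get(2);
    }

    public static void main(String[] args) {
        int[] seen = sharedChanges();
        System.out.println(seen[0] + " " + seen[1]);  // prints "42 99"
        System.out.println(afterReassignment());       // prints "5"
    }
}
```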
Accessing the Internal Array

So far we have looked at how to get an instance of a Buffer using the static allocate() and wrap() methods. Whenever we get an instance of a Buffer using either of these method calls, the Buffer will be a wrapper around a Java array. However, the ByteBuffer class provides an alternative method named allocateDirect() which, like allocate(), takes the capacity as an argument. With allocateDirect() the Buffer instance that we get back will not be a wrapper around a Java array. Instead, the storage area for the data will be in memory outside of the Java heap where it can be accessed directly by native system calls. This is what is known as a Direct Buffer; we will look into these in more detail when we examine ByteBuffers in a future article.

You need to know this in order to understand why the Buffer classes provide us with the hasArray() and array() methods. hasArray() returns true if the Buffer is a wrapper around a Java array, while the array() method allows us to obtain a reference to that array. The two methods will normally be used together; we will call hasArray() before attempting to obtain a reference to the array.
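A minimal sketch of the hasArray()/array() idiom follows. The class BackingArrayDemo and the helper backingArrayOrNull() are invented for this example; the sketch prints what hasArray() reports for a Direct Buffer rather than assuming the answer.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of java.nio.
final class BackingArrayDemo {
    // Returns the backing array, or null when the buffer does not expose one.
    // Calling hasArray() first avoids the exception array() would throw.
    static byte[] backingArrayOrNull(ByteBuffer buf) {
        return buf.hasArray() ? buf.array() : null;
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(8);
        ByteBuffer direct = ByteBuffer.allocateDirect(8);

        System.out.println("heap buffer has array:   " + heap.hasArray());
        System.out.println("direct buffer has array: " + direct.hasArray());
        System.out.println("direct buffer isDirect:  " + direct.isDirect());

        byte[] array = backingArrayOrNull(heap);
        System.out.println("heap array length: " + (array == null ? -1 : array.length));
    }
}
```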
Wrapping Strings

The J2SE platform v1.4 introduces a new interface, java.lang.CharSequence. The thought behind this is that with the introduction of the CharBuffer we now have (at least) three classes that can represent a readable sequence of characters. They are java.lang.String, java.lang.StringBuffer and java.nio.CharBuffer. In Java 2 SDK v1.4 each of these three classes implements the CharSequence interface. This means that in many cases we will be able to use them interchangeably. Such a case is the overloaded wrap() method of the CharBuffer class that takes a CharSequence as an argument. This allows us to create a new CharBuffer instance that wraps either a String, a StringBuffer or even another CharBuffer.

As mentioned earlier, a CharSequence is a readable sequence of characters and as such there are no methods in the CharSequence interface that allow us to modify the character sequence. Because of this, whenever we wrap a CharSequence the CharBuffer that is returned is marked as read-only. Any attempt to change the contents of a read-only Buffer will result in a java.nio.ReadOnlyBufferException. It's possible to test a Buffer to see if it is read-only using its isReadOnly() method. Also, a CharBuffer created in this way does not have an array and therefore the hasArray() method will return false. At first, a read-only CharBuffer that wraps a CharSequence doesn't seem particularly useful, but in a future article we will see how we can use this kind of Buffer when doing character based I/O operations using the Channels API.
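The read-only behaviour of a wrapped String can be sketched as follows. WrapStringDemo and rejectsWrites() are hypothetical names; note that the helper really does write to the Buffer when the Buffer is writable.

```java
import java.nio.CharBuffer;
import java.nio.ReadOnlyBufferException;

// Hypothetical demo class, not part of java.nio.
final class WrapStringDemo {
    // True when a write is rejected with ReadOnlyBufferException.
    // Side effect: a writable buffer gets 'J' stored at index 0.
    static boolean rejectsWrites(CharBuffer buf) {
        try {
            buf.put(0, 'J');
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        CharBuffer wrapped = CharBuffer.wrap("Hello World");
        System.out.println("read-only:      " + wrapped.isReadOnly());
        System.out.println("has array:      " + wrapped.hasArray());
        System.out.println("rejects writes: " + rejectsWrites(wrapped));
    }
}
```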
Writing Data and the Position Field

In order to get data in and out of a Buffer, each of the Buffer classes has a number of put() and get() methods. These methods can be either relative or absolute. As the names suggest, the put() methods put or write data into the Buffer and the get() methods get or read data out of the Buffer. To begin with we are going to look at the relative put() and get() methods. When a relative put() or get() method is called it will start writing or reading data from the current position. The position is an integer field of the Buffer instance and it determines the next element that will be read or written when we call a relative put() or get() method. When a Buffer is first created the position will point to the first element in the Buffer (element zero). In the following figure (Fig. 2) we create a new CharBuffer with 24 elements.

CharBuffer cBuf = CharBuffer.allocate(24);
Fig. 2

After its creation this Buffer contains no data, so let's add some to it using a relative put() method. The simplest form of put() takes a single primitive value as an argument. Of course, the value used must be of a type that is acceptable for the Buffer that we are trying to add it to. For example, the put() method of the IntBuffer class will accept an int or any primitive that can be automatically promoted to an int such as a short, a byte or a char. The rules for promotion (or widening conversion) are fairly complex and beyond the scope of this article. For more details on this see the Java Language Specification. To continue our CharBuffer example, let's add a single character to the newly created Buffer. To do this we'll use the put(char value) method to write the character 'H' to cBuf (see Fig. 3).
cBuf.put('H');
Fig. 3

The figure shows that the 'H' was written at the element pointed to by the value of the position before the method call, and also that as a result of the put() method, the position has now changed. The position now points to the element after the one that we have just written. Another form of the put() method is one that takes an array of elements as an argument. This is referred to as a bulk put. The next figure (Fig. 4) demonstrates how we would use a bulk put to write an array of chars to our CharBuffer.
char[] cArray = {'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd'};
cBuf.put(cArray);
Fig. 4

The example shows that the data from the array was written immediately after the data that we had previously written, and that the position has once again been moved to the end of the new data. The CharBuffer class is special in that it also has a bulk put() method that takes a String as an argument. We could therefore have simplified the previous example by changing the call as follows:
cBuf.put("ello World");
An alternative form of the bulk put() method is available that takes an array as an argument along with an offset and a length. As with the other bulk put methods, this method will copy data from an array into the Buffer. The offset is used to indicate a point in the array from which we want to start copying the data, while the length indicates how many elements we want to copy.
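As an illustration of the offset and length form of the bulk put, the sketch below copies only the middle of an array into a Buffer. BulkPutDemo and putRange() are names made up for this example.

```java
import java.nio.CharBuffer;

// Hypothetical demo class, not part of java.nio.
final class BulkPutDemo {
    // Copies 'length' chars from 'source', starting at index 'offset',
    // into a new 24-element buffer and returns that buffer.
    static CharBuffer putRange(char[] source, int offset, int length) {
        CharBuffer buf = CharBuffer.allocate(24);
        buf.put(source, offset, length);
        return buf;
    }

    public static void main(String[] args) {
        // Only the five characters 'H','e','l','l','o' are wanted.
        char[] source = {'x', 'x', 'H', 'e', 'l', 'l', 'o', 'x'};
        CharBuffer buf = putRange(source, 2, 5);

        System.out.println("position after put: " + buf.position()); // prints 5
        System.out.println("first element: " + buf.get(0));          // prints H
    }
}
```

The position advances by the number of elements copied (the length), not by the size of the source array.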
The Limit Field and the flip() Method

Now that we have actually written data to the Buffer, albeit in a very limited example, we will at some stage want to read that data. To do this we are going to use a relative get() method. Since a relative get() will read data from the current position, we are going to have to return the position to the beginning of the data. Another problem here is that the data that we have written to the Buffer doesn't actually fill the Buffer. If we were to read the entire contents of the Buffer we would not only get our data but we would also get several unwanted characters (in this case '\0' characters) from the unused area of the Buffer.

This is where the flip() method and the limit field come into their own. The idea of the flip() method is to flip us from a state where we are writing data to a Buffer over to a state where we can read that same data from the Buffer. The limit is another field of the Buffer instance and it is used to mark the end of the usable data in the Buffer. When we first create a Buffer instance, the limit will be the same as the capacity, but it can be set either manually using the limit(int newLimit) method or by using the flip() method. When we call flip() on a Buffer, the limit is set to the current value of the position, which in our ongoing example we know is at the end of the data that we added using the put() method; the position is then returned to zero. The position now marks the beginning of the data while the limit marks the end of the data. Let's call flip() on the CharBuffer that we added the Hello World characters to earlier (see Fig. 5).

cBuf.flip();
Fig. 5

As can be seen in the diagram, the limit is now where the position was before the call to flip(), and the position has been returned to zero. The Buffer is now ready for us to retrieve its data.
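The effect of flip() on the two markers can be checked with the ongoing Hello World example. This is a sketch under the assumption that nothing else touches the Buffer; FlipDemo, flipState() and manualFlip() are hypothetical names. manualFlip() shows the same marker changes made by hand with limit(int) and position(int).

```java
import java.nio.CharBuffer;

// Hypothetical demo class, not part of java.nio.
final class FlipDemo {
    // Returns {position before, limit before, position after, limit after}.
    static int[] flipState() {
        CharBuffer cBuf = CharBuffer.allocate(24);
        cBuf.put('H');
        cBuf.put("ello World");          // 11 characters written in total
        int posBefore = cBuf.position();
        int limBefore = cBuf.limit();
        cBuf.flip();
        return new int[] { posBefore, limBefore, cBuf.position(), cBuf.limit() };
    }

    // The marker changes that flip() makes, done manually.
    static void manualFlip(CharBuffer buf) {
        buf.limit(buf.position());  // the limit moves to the end of the data
        buf.position(0);            // the position returns to the start
    }

    public static void main(String[] args) {
        int[] s = flipState();
        System.out.println("before flip: position=" + s[0] + " limit=" + s[1]); // 11, 24
        System.out.println("after flip:  position=" + s[2] + " limit=" + s[3]); // 0, 11
    }
}
```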
Reading Data and the remaining() and hasRemaining() Methods

Now we are ready to get the data from the Buffer and we can do this using a relative get() method. The simplest form of the relative get() method takes no arguments and will return the value of the single element at the current position. It will then move the position on to the next element. We can repeat this operation again and again until the position reaches the limit, at which point we will be at the end of the data that we added earlier. At any stage we can find out how many elements remain between the position and the limit using the remaining() method of the Buffer class. We can also perform a simple boolean check to see if there is still data to be read using the hasRemaining() method, which returns true if the position is less than the limit. The following code snippet shows how we can use the hasRemaining() method with a while loop in order to read the data from the Buffer one element at a time.
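CharBuffer cBuf = CharBuffer.allocate(24);
cBuf.put('H');
cBuf.put("ello World");
// Switch from writing to reading: limit = position, position = 0
cBuf.flip();
// Read one element at a time until the position reaches the limit
while (cBuf.hasRemaining()) {
    System.out.println(cBuf.get());
}

Example 1: Adding data to a Buffer and then retrieving each element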
The bulk get() Methods

Another form of the get() method takes an array as an argument and is known as a bulk get(). A bulk get() will attempt to read a contiguous sequence of elements from the Buffer and write them into the array that was passed as an argument. The operation will start from the current position and will attempt to fill the array. After the operation is completed, the position will be at the end of the data that has just been read. If the array is bigger than the available data then no data is written to the array and a java.nio.BufferUnderflowException will be thrown.

If we want to read all of the available data from the Buffer into an array, we first need to create an array of the correct size to hold that data. As was seen earlier we can use the remaining() method to determine how many elements of data are available in the Buffer, and therefore determine the required size of the array. Once we have the array we can call a bulk get() on the Buffer. For example, in the following code snippet we create the same Buffer as in our ongoing example, and then read all of the data out of the Buffer into the array 'outArray':
CharBuffer cBuf = CharBuffer.allocate(24);
cBuf.put('H');
cBuf.put("ello World");
cBuf.flip();
char[] outArray;
if (cBuf.hasRemaining()) {
    // Size the array to exactly the amount of readable data
    outArray = new char[cBuf.remaining()];
    cBuf.get(outArray);
}
Example 2: Retrieving all of the data

An alternative form of the bulk get() method is available that takes an array as an argument along with an offset and a length. As with the other bulk get methods, this method will copy data from the Buffer into the array. The offset is used to indicate the point in the array at which we want to start writing the data, while the length indicates how many elements we want to copy.
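The offset and length form of the bulk get can be sketched in the same way as the bulk put. BulkGetDemo, helloWorld() and readInto() are names invented for this illustration.

```java
import java.nio.CharBuffer;

// Hypothetical demo class, not part of java.nio.
final class BulkGetDemo {
    // Builds the ongoing example buffer, flipped and ready for reading.
    static CharBuffer helloWorld() {
        CharBuffer cBuf = CharBuffer.allocate(24);
        cBuf.put("Hello World");
        cBuf.flip();
        return cBuf;
    }

    // Reads 'length' chars from the buffer into a new array of
    // 'arrayLength' elements, starting at index 'offset' of the array.
    static char[] readInto(CharBuffer buf, int arrayLength, int offset, int length) {
        char[] out = new char[arrayLength];
        buf.get(out, offset, length);
        return out;
    }

    public static void main(String[] args) {
        CharBuffer cBuf = helloWorld();
        char[] out = readInto(cBuf, 8, 2, 5);

        System.out.println(new String(out, 2, 5));            // prints Hello
        System.out.println("position:  " + cBuf.position());  // prints 5
        System.out.println("remaining: " + cBuf.remaining()); // prints 6
    }
}
```

Elements of the array outside the range offset to offset + length are left untouched.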
The mark field and reset() Methods

The mark is another integer field of the Buffer class. It is used to mark a point in the Buffer that we may want to return to. In order to set the mark we call the mark() method; this simply sets the mark at the current position. After reading or writing data we can return the position to the mark by calling the reset() method. When a Buffer instance is first created the mark will not be set. In fact, the mark need never be set during the lifetime of a Buffer. If the mark is set it must have a value that is less than or equal to the position. If the position is ever set to a value lower than the mark, the mark is cleared. A point that may be worth noting is that clearing the mark is done by setting its value to -1.
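A short sketch of mark() and reset() in use follows. MarkResetDemo and its methods are hypothetical. The second method relies on a detail not covered above: calling reset() when no mark has been set throws a java.nio.InvalidMarkException.

```java
import java.nio.CharBuffer;
import java.nio.InvalidMarkException;

// Hypothetical demo class, not part of java.nio.
final class MarkResetDemo {
    // Reads two chars, sets the mark, reads three more, resets to the
    // mark and reads one char again. Returns everything that was read.
    static String readWithMark() {
        CharBuffer cBuf = CharBuffer.wrap("Hello World".toCharArray());
        StringBuilder seen = new StringBuilder();

        seen.append(cBuf.get()).append(cBuf.get());   // "He", position is now 2
        cBuf.mark();                                   // mark = 2
        seen.append(cBuf.get()).append(cBuf.get()).append(cBuf.get()); // "llo", position 5
        cBuf.reset();                                  // position back to 2
        seen.append(cBuf.get());                       // reads 'l' a second time

        return seen.toString();
    }

    // True when reset() on a buffer with no mark is rejected.
    static boolean resetWithoutMarkFails() {
        CharBuffer cBuf = CharBuffer.allocate(4);
        try {
            cBuf.reset();
            return false;
        } catch (InvalidMarkException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(readWithMark());           // prints Hellol
        System.out.println(resetWithoutMarkFails());  // prints true
    }
}
```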
The rewind() and clear() Methods

The rewind() method is used to put the Buffer back into a state where it can be read again. It simply returns the position to zero and clears the mark if it has been set. The limit is not affected.

The clear() method is used to set all of the markers back to their defaults. This is particularly useful if we want to reuse the Buffer for some new data. When we call clear(), the position is set to zero, the limit is set to the capacity and the mark is cleared. The data elements of the Buffer are not cleared as this would be an unnecessary overhead.
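The difference between rewind() and clear() shows up in the limit. The sketch below records the markers at each stage; RewindClearDemo and its methods are made-up names for this example.

```java
import java.nio.CharBuffer;

// Hypothetical demo class, not part of java.nio.
final class RewindClearDemo {
    // Returns {position, limit} after draining the buffer,
    // after rewind() and after clear(), in that order.
    static int[][] states() {
        CharBuffer cBuf = CharBuffer.allocate(24);
        cBuf.put("Hello World");
        cBuf.flip();
        while (cBuf.hasRemaining()) {
            cBuf.get();                                      // read everything
        }
        int[] drained = { cBuf.position(), cBuf.limit() };   // {11, 11}

        cBuf.rewind();                                       // limit is kept
        int[] rewound = { cBuf.position(), cBuf.limit() };   // {0, 11}

        cBuf.clear();                                        // limit = capacity
        int[] cleared = { cBuf.position(), cBuf.limit() };   // {0, 24}

        return new int[][] { drained, rewound, cleared };
    }

    // clear() only resets the markers; the old data is still in place.
    static char firstCharAfterClear() {
        CharBuffer cBuf = CharBuffer.allocate(24);
        cBuf.put("Hello World");
        cBuf.clear();
        return cBuf.get(0);
    }

    public static void main(String[] args) {
        for (int[] s : states()) {
            System.out.println("position=" + s[0] + " limit=" + s[1]);
        }
        System.out.println(firstCharAfterClear());  // prints H
    }
}
```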
More on Wrapping Arrays

All of the Buffer classes have an overloaded version of the wrap() method that takes additional input arguments for offset and length. For example, the ByteBuffer class has a wrap() method with the following signature:

public static ByteBuffer wrap(byte[] array, int offset, int length)

When we use this method to wrap an array, the Buffer that we get back will have a capacity that is the same as the length of the array, but its position will be set to offset and the limit will be set to offset + length.
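To make the marker values concrete, the following sketch wraps a range of a byte array and reports the resulting state. WrapRangeDemo and wrappedState() are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical demo class, not part of java.nio.
final class WrapRangeDemo {
    // Returns {capacity, position, limit, remaining} of a buffer that
    // wraps a new array of 'arrayLength' bytes with the given range.
    static int[] wrappedState(int arrayLength, int offset, int length) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[arrayLength], offset, length);
        return new int[] { buf.capacity(), buf.position(), buf.limit(), buf.remaining() };
    }

    public static void main(String[] args) {
        int[] s = wrappedState(10, 2, 5);
        // prints capacity=10 position=2 limit=7 remaining=5
        System.out.println("capacity=" + s[0] + " position=" + s[1]
                + " limit=" + s[2] + " remaining=" + s[3]);
    }
}
```

The whole array is still wrapped; the offset and length only choose the initial position and limit.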
Some of the Rules

So far we have come across several marker fields in the Buffer. There are certain rules that apply to their values.

The limit must always be less than or equal to the capacity
The position must always be less than or equal to the limit
The mark, if it is set, must always be less than or equal to the position

The following is therefore true:

0 <= mark <= position <= limit <= capacity
Any attempt to break these rules will result in a java.lang.IllegalArgumentException being thrown.
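Two of these rules can be exercised directly through the limit(int) and position(int) methods. This is a minimal sketch; MarkerRulesDemo and its methods are names chosen for the example.

```java
import java.nio.CharBuffer;

// Hypothetical demo class, not part of java.nio.
final class MarkerRulesDemo {
    // True when a limit greater than the capacity is rejected.
    static boolean limitBeyondCapacityRejected() {
        CharBuffer buf = CharBuffer.allocate(24);
        try {
            buf.limit(25);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    // True when a position greater than the limit is rejected.
    static boolean positionBeyondLimitRejected() {
        CharBuffer buf = CharBuffer.allocate(24);
        buf.limit(10);
        try {
            buf.position(11);
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(limitBeyondCapacityRejected());  // prints true
        System.out.println(positionBeyondLimitRejected());  // prints true
    }
}
```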
Exceptions

There will be times when we try to put too much data into a Buffer or we try to get too much data out of a Buffer. When this happens we are going to see an Exception being thrown. This will typically occur when using bulk put()'s or get()'s with arrays (or with other Buffers) that are too large for the operation to complete, or when we simply forget where we are in the Buffer and try to get or put an element that is greater than or equal to the limit. An attempt to get too much data from a Buffer will result in a java.nio.BufferUnderflowException, and an attempt to put too much data into a Buffer will result in a java.nio.BufferOverflowException.
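Both exceptions are easy to provoke with a small Buffer. The sketch below also checks that a failed bulk operation transfers no data; BufferExceptionsDemo and its methods are hypothetical names.

```java
import java.nio.BufferOverflowException;
import java.nio.BufferUnderflowException;
import java.nio.CharBuffer;

// Hypothetical demo class, not part of java.nio.
final class BufferExceptionsDemo {
    // Puts five chars into a four-element buffer. True when the put is
    // rejected and nothing was written (the position is still zero).
    static boolean overflows() {
        CharBuffer small = CharBuffer.allocate(4);
        try {
            small.put("Hello");
            return false;
        } catch (BufferOverflowException e) {
            return small.position() == 0;
        }
    }

    // Asks for five chars when only two are available. True when the get
    // is rejected and nothing was read (two chars still remain).
    static boolean underflows() {
        CharBuffer buf = CharBuffer.allocate(4);
        buf.put("Hi");
        buf.flip();
        char[] tooBig = new char[5];
        try {
            buf.get(tooBig);
            return false;
        } catch (BufferUnderflowException e) {
            return buf.remaining() == 2;
        }
    }

    public static void main(String[] args) {
        System.out.println("overflow detected:  " + overflows());
        System.out.println("underflow detected: " + underflows());
    }
}
```

Both exceptions are unchecked, so code that cannot rule them out would normally compare remaining() with the amount of data before calling a bulk method.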
Absolute put()'s and get()'s

So far we have only talked about relative put() and get() methods. Each of the Buffer classes also provides an absolute put() and an absolute get() method. Each of these methods takes an index as an argument, which is the index of the element to be read or written. As an example, here are the signatures of these methods as defined in the DoubleBuffer class:

public DoubleBuffer put(int index, double d)
public double get(int index)

Some important things to note here are that the absolute methods do not respect the position. Therefore, it is possible to put or get data using the index of an element that is before the position. Also, a call to an absolute put() or get() method will not result in a change in the position. This is not the case with the limit. Any attempt to use an index with an absolute put() or get() method that is not less than the limit will result in a java.lang.IndexOutOfBoundsException being thrown.
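The behaviour of the absolute methods can be sketched as follows, using the DoubleBuffer signatures shown above. AbsoluteAccessDemo and its methods are names made up for the example.

```java
import java.nio.DoubleBuffer;

// Hypothetical demo class, not part of java.nio.
final class AbsoluteAccessDemo {
    // Returns {value at index 0, position} after an absolute put and get.
    static double[] overwriteFirst() {
        DoubleBuffer buf = DoubleBuffer.allocate(4);
        buf.put(1.0).put(2.0);       // relative puts: position is now 2
        buf.put(0, 9.5);             // absolute put at an index before the position
        double first = buf.get(0);   // absolute get: position is not moved
        return new double[] { first, buf.position() };
    }

    // True when an index equal to the limit is rejected.
    static boolean indexAtLimitRejected() {
        DoubleBuffer buf = DoubleBuffer.allocate(4);
        buf.limit(3);
        try {
            buf.get(3);              // valid indexes are 0 to limit - 1
            return false;
        } catch (IndexOutOfBoundsException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        double[] result = overwriteFirst();
        System.out.println("value=" + result[0] + " position=" + (int) result[1]);
        System.out.println("index at limit rejected: " + indexAtLimitRejected());
    }
}
```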
duplicate(), slice() and asReadOnlyBuffer()

Several other methods are provided by the Buffer classes that allow us to obtain different views of the same data.

The duplicate() method creates a new Buffer instance of the same type as the Buffer that it was called on. The duplicate Buffer will reference the same data as the original, and because of this any changes made to the duplicate will be visible in the original and vice versa. Also upon creation, the position, limit and mark of the duplicate Buffer will be the same as those of the original. However, aside from their initial values, the position, limit and mark of the two Buffers are completely independent.

The slice() method also creates a new Buffer instance that references the same data as the original Buffer. This Buffer instance will be a view of the data between the position and the limit of the original Buffer. Its capacity is determined by the value returned by the remaining() method of the original Buffer. Also, its position will be zero and its limit will be the same as the capacity. As with the Buffer instance returned by the duplicate() method, the position, limit and mark of the new Buffer will be independent of the original. An interesting point here is that when slice() is called, the position of the original Buffer may not have been zero. If this is the case the new Buffer will be given an offset that is its starting point in the data of the original Buffer.

asReadOnlyBuffer() is much the same as the duplicate() method with the exception that the new Buffer instance that it returns will be marked as read-only. Any attempt to modify the data using the new read-only Buffer instance will result in a java.nio.ReadOnlyBufferException.
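The three kinds of view can be compared side by side. The following is a sketch only; BufferViewsDemo, original() and writeRejected() are hypothetical names, and the starting position of 2 is an arbitrary choice for the example.

```java
import java.nio.IntBuffer;
import java.nio.ReadOnlyBufferException;

// Hypothetical demo class, not part of java.nio.
final class BufferViewsDemo {
    // A five-element buffer with its position moved to index 2.
    static IntBuffer original() {
        IntBuffer orig = IntBuffer.wrap(new int[] {10, 20, 30, 40, 50});
        orig.position(2);
        return orig;
    }

    // True when an absolute put on the buffer is rejected as read-only.
    static boolean writeRejected(IntBuffer buf) {
        try {
            buf.put(0, -1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        IntBuffer orig = original();

        IntBuffer dup = orig.duplicate();
        dup.put(0, 99);      // shared data: the original sees 99 at index 0
        dup.position(4);     // independent markers: the original stays at 2
        System.out.println(orig.get(0) + " " + orig.position());  // prints "99 2"

        IntBuffer slice = orig.slice();
        // The slice covers original indexes 2 to 4, so its index 0 holds 30.
        System.out.println(slice.capacity() + " " + slice.get(0)); // prints "3 30"

        IntBuffer readOnly = orig.asReadOnlyBuffer();
        System.out.println(writeRejected(readOnly));               // prints true
    }
}
```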
Conclusion

In this article we saw how instances of the Buffer classes can be constructed using the allocate() and wrap() methods. We also looked at how Buffers are constructed internally (with the exception of Direct Buffers), and learned about the internal instance variables such as mark, position and limit that are used to help the programmer find their way around the data inside a Buffer. In addition, we looked at how data can be written into a Buffer and how data can be read from a Buffer using the various get() and put() methods. Through the code samples we have seen some practical uses of the Buffer classes, although we will not be using them to their full potential until we examine the Channels API. Then we will be looking in detail at the ByteBuffer class and how an instance of a ByteBuffer can be used in conjunction with a Channel to perform I/O operations involving all different types of data.
For More Information