The Java I/O packages discussed in Chapter
9: Java provide an extensive set of input/output tools.
The stream architecture allows for a powerful and uniform approach
to input/output regardless of whether the communications are
with a disk file, a network connection, a serial line, or any
other source or destination. However, there are some aspects
of the basic I/O tools that were found to be lacking, especially
for cases where large numbers of streams are handled as, for
example, with a busy web server.
The NIO packages were added in the Java 1.4 edition to address
this scalability problem and other shortcomings. We will give
a brief overview here of some of the tools available in the
NIO packages, many of which are useful even for routine application
programs.
Stream Shortcomings
Some of the practical problems with Java stream classes involve:
- I/O blocking - Blocking refers to the
situation where a read or write operation on a stream does
not return until it has finished. This might be a problem
if, for example, the data communication from a source is extremely
slow or intermittently delayed for long periods. The resources
of a thread are then being wasted during periods while it
waits for data transmission to resume.
- Asynchronous I/O - Threads servicing an I/O connection
cannot be easily and cleanly stopped or interrupted (see Chapter
8: Java) as you might want to do if the I/O operation is blocked.
- Scaling - With the above two problems, it is difficult
to efficiently serve a large number of I/O connections, as in
a web server dealing with thousands of users, with a finite number,
i.e. a pool, of threads. When one I/O connection is blocked, the
thread controlling it should be able to move to an active I/O
connection.
- Memory Performance - Data transfers involve copying data
into the JVM's own buffers, which can be slow for large data sets
as compared to handling the data in native OS memory.
- Bytes-to/from-Primitives - In Chapter
9: Java we discuss how Java's strong data typing does not
normally allow a sequence of bytes to be arbitrarily treated as
primitive type data as can be done in, say, C/C++. In that section
we show how to use the ByteArrayInputStream
and ByteArrayOutputStream
classes to access a given section of bytes in an array as a particular
type. However, this can be a bit clumsy and faster, more convenient
techniques would be useful.
- Character Encoding/Decoding - As discussed in Chapter
9: Java : Character Codes, Java uses Unicode internally to
represent all text characters. For external communications, however,
it can encode/decode characters into other code schemes such as
UTF-8. Before Java 1.4, though, there were few ways to access
and control the Java encoding/decoding tools directly.
The tools in the NIO packages address these problems. We survey
these tools in the following sections.
Channels
The NIO system added the concept of a channel,
which represents an open connection to a file, socket,
or other I/O source or destination. A channel basically
just indicates additional capabilities for a connection
beyond what the basic I/O classes allow.
The Channel interface (java.nio.channels.Channel)
itself only contains two methods: isOpen(),
to indicate whether a channel is opened or closed, and close(),
which, not surprisingly, closes a channel. Classes that
implement the Channel interface
will bring other capabilities specific to their particular
I/O job such as allowing for a non-blocking connection.
For Java 1.4 the stream classes FileInputStream,
FileOutputStream, and RandomAccessFile
and the classes DatagramSocket,
ServerSocket, and Socket
each had a getChannel() method
added. It returns an instance of a class that implements
the Channel interface and represents
the corresponding connection. (For the socket classes, getChannel()
returns null unless the socket was itself created via a channel.)
There is also the Pipe
class, which is abstract; an instance is obtained with the
static Pipe.open() method. It provides the sink()
and source() methods, which return
instances of Pipe.SinkChannel and
Pipe.SourceChannel, respectively. A pipe
is a one-way connection between two threads. (See Chapter
9: Supplements: Pipe Streams.)
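As a rough illustration of the Pipe class, the following sketch sends a short message through a pipe and reads it back. It assumes Java 7 or later (try-with-resources, StandardCharsets); the class name PipeSketch and the method roundTrip() are hypothetical. For brevity both ends run in the same thread, which is only reasonable for a message small enough to fit in the pipe's internal buffer. Normally the sink and the source would be used by different threads.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.charset.StandardCharsets;

class PipeSketch {
    // Writes a short message into the pipe's sink and reads it back from the source.
    static String roundTrip(String message) {
        try {
            Pipe pipe = Pipe.open();
            try (Pipe.SinkChannel sink = pipe.sink();
                 Pipe.SourceChannel source = pipe.source()) {
                ByteBuffer out = ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8));
                while (out.hasRemaining()) {
                    sink.write(out);
                }
                ByteBuffer in = ByteBuffer.allocate(out.capacity());
                while (in.hasRemaining() && source.read(in) != -1) {
                    // keep reading until the whole message has arrived
                }
                in.flip();
                return StandardCharsets.UTF_8.decode(in).toString();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello pipe"));
    }
}
```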
This code, for example,
FileInputStream file_input = new FileInputStream (fFileInputNameString);
FileChannel file_input_chan = file_input.getChannel ();
FileOutputStream file_output = new FileOutputStream (fFileOutputNameString);
FileChannel file_output_chan = file_output.getChannel ();
shows how to obtain an instance of FileChannel
(java.nio.channels.FileChannel)
for a file via a FileInputStream
object and for a FileOutputStream
object. Before we discuss what a FileChannel
object is good for, we will discuss NIO buffers, which
are used to handle the data in channels.
Buffers
The java.nio
package itself mostly contains a set of buffer classes that
are useful for the channel communications. The base class
Buffer
is abstract and has the following abstract subclasses for
the primitive types:
- ByteBuffer
- CharBuffer
- DoubleBuffer
- FloatBuffer
- IntBuffer
- LongBuffer
- ShortBuffer
(Concrete subclasses of these will be created according
to the JVM implementation.) An instance of one of these
buffers contains a linear array of data elements of the
particular type. The buffer classes provide a number of
useful tools to handle the data.
A buffer is described by the following settings:
- capacity is the number of elements
in the buffer.
- limit is the index of the first element that should
not be read or written. This can be set to be less than
the capacity.
- position is the index of the next
element to be read or written and it can be set directly.
- mark is the index to which a buffer's position will be
set by an invocation of the reset()
method.
- readonly - a read-only buffer allows get operations
but rejects put operations. The isReadOnly()
method will indicate that state.
The values of these must obey the following:
0 <= mark <= position <= limit <= capacity
Some of the tools to handle a buffer include the following:
- get() and put()
methods obtain and write data, respectively, at specific locations
in the array. These operations can be absolute, i.e. at a
specified element index, or relative to the current position.
- position(int newposition) methods
set the index for the next element to be read or written, while
position() returns the current position.
- mark() sets the current mark in the
buffer and subsequent reset() invocations
will move the position to the mark.
- flip() sets the limit to the current
position and then sets the position to zero. This is convenient,
for example, after one has read or put a set of data into the
buffer. You invoke flip and are all set up to get or write that
data.
- rewind() sets the position to zero
and discards the mark setting.
- clear() sets the position to zero and the limit
to the capacity, and discards the mark. The data is not erased,
but the buffer is ready to be filled again via put or read methods.
- remaining() returns the number of
elements between the current position and the limit, while hasRemaining()
indicates whether this value is nonzero.
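The following minimal sketch walks an IntBuffer through several of the operations listed above so that the effect on position and limit can be checked. The class name BufferStateSketch and the method demo() are hypothetical, and the values are arbitrary.

```java
import java.nio.IntBuffer;
import java.util.Arrays;

class BufferStateSketch {
    // Returns {first, third, positionAfterReset, remaining, positionAfterClear, limitAfterClear}.
    static int[] demo() {
        IntBuffer buf = IntBuffer.allocate(8);   // capacity 8, position 0, limit 8
        buf.put(10).put(20).put(30);             // relative puts advance the position to 3
        buf.flip();                              // limit = 3, position = 0
        int first = buf.get();                   // relative get: reads 10, position becomes 1
        buf.mark();                              // remember position 1
        buf.get();                               // reads 20, position becomes 2
        buf.reset();                             // back to the mark: position 1
        int third = buf.get(2);                  // absolute get: reads 30, position unchanged
        int positionAfterReset = buf.position();
        int remaining = buf.remaining();         // limit - position
        buf.clear();                             // position = 0, limit = capacity
        return new int[] { first, third, positionAfterReset, remaining,
                           buf.position(), buf.limit() };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(demo()));
    }
}
```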
Instances of these buffer classes can be created from an existing
array (of the corresponding type) using a wrap()
method or via an allocate method. For example, to create a buffer
for 2000 double type values,
DoubleBuffer double_buff = DoubleBuffer.allocate (2000);
The ByteBuffer also offers an alternative
called direct allocation in which the buffer is actually
located in a native OS memory buffer rather than in the JVM heap.
This can provide faster I/O, since the OS can fill or drain the
buffer without an intermediate copy, though the allocation operation
itself is somewhat slower than the allocate()
method.
ByteBuffer byte_buff = ByteBuffer.allocateDirect (2000);
There is no allocateDirect() method for
the other buffers but note that view buffers (see next
page) of the other primitive types can be made of a directly
allocated ByteBuffer.
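As a small sketch of the creation methods just described, the code below wraps an existing array and compares a heap buffer with a direct one. The class and method names are hypothetical. It illustrates that wrap() does not copy the array: a change made through the buffer shows up in the array.

```java
import java.nio.ByteBuffer;
import java.nio.DoubleBuffer;

class AllocationSketch {
    // wrap() does not copy: the buffer and the array share the same storage.
    static double wrapSharesArray() {
        double[] values = { 1.0, 2.0, 3.0 };
        DoubleBuffer wrapped = DoubleBuffer.wrap(values);
        wrapped.put(1, 42.0);        // absolute put through the buffer
        return values[1];            // the change is visible in the array
    }

    // Returns {heap.isDirect(), direct.isDirect(), heap.hasArray()}.
    static boolean[] directFlags() {
        ByteBuffer heap = ByteBuffer.allocate(2000);
        ByteBuffer direct = ByteBuffer.allocateDirect(2000);
        return new boolean[] { heap.isDirect(), direct.isDirect(), heap.hasArray() };
    }

    public static void main(String[] args) {
        System.out.println(wrapSharesArray());
        boolean[] flags = directFlags();
        System.out.println(flags[0] + " " + flags[1] + " " + flags[2]);
    }
}
```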
In Chapter 9:
Java: Converting Primitive Type Data to Bytes and Back, we discussed
how to use the byte array input and output streams to treat different
sections of a byte array as different primitive type data. In a
Chapter 9: Tech:
Histogram I/O example, we show how to use this technique in
a program. An alternative is provided by the NIO buffers. A section
of a ByteBuffer can be viewed as
one of the primitive type buffers. We will illustrate this in an
example on the next page.
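In the meantime, here is a minimal sketch of the idea. It assumes an invented byte layout of one int followed by two doubles; the class name ViewBufferSketch and the values are hypothetical. The asDoubleBuffer() view starts at the ByteBuffer's current position, so the int is read first and the rest of the bytes are then viewed as doubles.

```java
import java.nio.ByteBuffer;
import java.nio.DoubleBuffer;

class ViewBufferSketch {
    // Layout assumed for illustration: one int (4 bytes) followed by two doubles (16 bytes).
    static String roundTrip() {
        ByteBuffer bytes = ByteBuffer.allocate(4 + 2 * 8);
        bytes.putInt(7).putDouble(1.5).putDouble(-2.25);
        bytes.flip();

        int count = bytes.getInt();                      // position is now 4
        DoubleBuffer doubles = bytes.asDoubleBuffer();   // view starts at the current position
        return count + " " + doubles.get(0) + " " + doubles.get(1)
                + " " + doubles.remaining();
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```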
FileChannel
OK, now that we have introduced channels and buffers, we
can discuss a particular type of channel class called FileChannel.
An instance of FileChannel can
be obtained via the getChannel()
method for the FileInputStream,
FileOutputStream, and RandomAccessFile
classes as shown in the above example
code.
As an example application of FileChannel,
we can copy data from the input file to the output file
in several ways. First we create a buffer with the allocate
method
ByteBuffer buffer = ByteBuffer.allocate (32000);
and then we read from the input file and write to the output
file:
file_input_chan.read (buffer);   // read up to 32000 bytes into the buffer
buffer.flip ();                  // set limit to the number of bytes read and position to 0
file_output_chan.write (buffer); // write those bytes to the output file
In both the read and write operations, the methods attempt
to go to the limit of the buffer. So the FileChannel
works like a random access file
but operations are carried out with a ByteBuffer
rather than a byte array.
To read another 32000 bytes from the input file and copy
them to the output file:
buffer.clear ();                   // reset position to 0 and limit to capacity
file_input_chan.position (32000);  // set position for start of read
file_input_chan.read (buffer);     // read up to 32000 bytes into the buffer
buffer.flip ();                    // set limit to the number of bytes read and position to 0
file_output_chan.position (32000); // set position for start of write
file_output_chan.write (buffer);   // write those bytes to the output file
Another approach is to loop over the read operation and
test if the end of the input file has been reached:
while (file_input_chan.read (buffer) > 0) { // read returns the number of bytes read, or -1 at end of file
  buffer.flip ();
  file_output_chan.write (buffer);
  buffer.clear ();
}
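For reference, a self-contained sketch of this loop might look as follows. It assumes Java 7 or later for try-with-resources and java.nio.file.Files; the class name ChannelCopySketch, the selfCheck() helper and the temporary file names are hypothetical. The inner loop guards against a write that does not drain the buffer in one call.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class ChannelCopySketch {
    // Copies src to dst through one reusable buffer and returns the number of bytes copied.
    static long copy(String src, String dst) throws IOException {
        try (FileChannel in = new FileInputStream(src).getChannel();
             FileChannel out = new FileOutputStream(dst).getChannel()) {
            ByteBuffer buffer = ByteBuffer.allocate(32000);
            long total = 0;
            while (in.read(buffer) != -1) {      // -1 signals end of file
                buffer.flip();                   // switch from filling to draining
                while (buffer.hasRemaining()) {
                    total += out.write(buffer);
                }
                buffer.clear();                  // ready to be filled again
            }
            return total;
        }
    }

    // Copies a temporary file of 100000 bytes and compares the result with the input.
    static boolean selfCheck() {
        try {
            Path src = Files.createTempFile("nio-src", ".bin");
            Path dst = Files.createTempFile("nio-dst", ".bin");
            byte[] data = new byte[100000];
            for (int i = 0; i < data.length; i++) {
                data[i] = (byte) i;
            }
            Files.write(src, data);
            long copied = copy(src.toString(), dst.toString());
            boolean same = copied == data.length
                    && Arrays.equals(data, Files.readAllBytes(dst));
            Files.delete(src);
            Files.delete(dst);
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selfCheck());
    }
}
```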
There are several other FileChannel
read/write methods, including those that allow transfers to/from
more than one buffer. In a gathering type of operation,
data is written to the channel with data gathered from several
buffers. Conversely, a scattering operation reads
data from the channel into several buffers.
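A gathering write followed by a scattering read can be sketched as below. The record layout (a 4-byte header followed by a 16-byte body), the class name ScatterGatherSketch and the temporary file are assumptions made for illustration; the code also assumes Java 7 or later for try-with-resources.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;

class ScatterGatherSketch {
    static String roundTrip() {
        File file = null;
        try {
            file = File.createTempFile("nio-gather", ".bin");
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
                 FileChannel channel = raf.getChannel()) {
                ByteBuffer header = ByteBuffer.allocate(4);
                header.putInt(2);
                header.flip();
                ByteBuffer body = ByteBuffer.allocate(16);
                body.putDouble(1.5).putDouble(2.5);
                body.flip();

                // Gathering write: the buffers are drained in array order.
                ByteBuffer[] parts = { header, body };
                while (header.hasRemaining() || body.hasRemaining()) {
                    channel.write(parts);
                }

                // Scattering read: the buffers are filled in array order.
                channel.position(0);
                ByteBuffer headerIn = ByteBuffer.allocate(4);
                ByteBuffer bodyIn = ByteBuffer.allocate(16);
                ByteBuffer[] partsIn = { headerIn, bodyIn };
                while (bodyIn.hasRemaining() && channel.read(partsIn) != -1) {
                    // keep reading until both buffers are full or the file ends
                }
                headerIn.flip();
                bodyIn.flip();
                return headerIn.getInt() + " " + bodyIn.getDouble() + " " + bodyIn.getDouble();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                file.delete();
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip());
    }
}
```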
Finally, the transferTo() method
does the work of copying data from one file to another
without having to bother at all with buffers. It takes the
count as a long, so no cast is needed, and returns the number
of bytes actually transferred.
file_input_chan.transferTo (0, file_input_chan.size (), file_output_chan);
After read/write operations are finished, you should close
the channels
file_input_chan.close();
file_output_chan.close();
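Putting the transfer and the closing together, one possible sketch is shown below. It assumes Java 7 or later, where try-with-resources closes both channels even if an exception is thrown; the class name TransferSketch and the selfCheck() helper are hypothetical. Since transferTo() may move fewer bytes than requested, the call is repeated until the whole file has been transferred.

```java
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

class TransferSketch {
    // Copies src to dst with transferTo() and returns the number of bytes copied.
    static long copy(String src, String dst) throws IOException {
        try (FileChannel in = new FileInputStream(src).getChannel();
             FileChannel out = new FileOutputStream(dst).getChannel()) {
            long size = in.size();
            long done = 0;
            while (done < size) {
                done += in.transferTo(done, size - done, out);
            }
            return done;
        }   // both channels are closed here
    }

    // Copies a small temporary file and compares the result with the input.
    static boolean selfCheck() {
        try {
            Path src = Files.createTempFile("nio-src", ".bin");
            Path dst = Files.createTempFile("nio-dst", ".bin");
            byte[] data = new byte[50000];
            Arrays.fill(data, (byte) 7);
            Files.write(src, data);
            long copied = copy(src.toString(), dst.toString());
            boolean same = copied == data.length
                    && Arrays.equals(data, Files.readAllBytes(dst));
            Files.delete(src);
            Files.delete(dst);
            return same;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(selfCheck());
    }
}
```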
Mapped Files
A convenient option with FileChannel
is the ability to treat an entire file as if it has been
completely read into a single ByteBuffer.
You can then access any position or section of the file
data with the usual ByteBuffer
methods. This is done internally by mapping the file into
memory. The magic of page swapping in the OS will make it
appear as if the whole file is in memory, even when the file
is larger than the physical memory available. (A single mapping
is limited to Integer.MAX_VALUE bytes, i.e. about 2 GB.)
A mapped buffer is created with the map()
method in FileChannel as shown
in this example:
FileInputStream file_input = new FileInputStream (fFileInputNameString);
FileChannel file_input_chan = file_input.getChannel ();
MappedByteBuffer mapped_buffer =
  file_input_chan.map (FileChannel.MapMode.READ_ONLY, 0, file_input_chan.size ());
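The mapped_buffer can be used
just like a regular ByteBuffer
to access the data elements. Here we set the mapping mode to
READ_ONLY but this can also be
READ_WRITE, provided the channel was opened for both reading
and writing (e.g. via a RandomAccessFile in "rw" mode).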
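As a small self-contained sketch of mapping, the code below writes four bytes to a temporary file, maps the file read-only and sums the bytes through the mapped buffer. The class name MappedFileSketch, the file contents and the temporary file are made up for illustration, and Java 7 or later is assumed for try-with-resources.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

class MappedFileSketch {
    // Maps a small temporary file and sums its bytes via the mapped buffer.
    static int sumOfBytes() {
        File file = null;
        try {
            file = File.createTempFile("nio-map", ".bin");
            try (RandomAccessFile raf = new RandomAccessFile(file, "rw");
                 FileChannel channel = raf.getChannel()) {
                raf.write(new byte[] { 1, 2, 3, 4 });
                MappedByteBuffer mapped =
                        channel.map(FileChannel.MapMode.READ_ONLY, 0, channel.size());
                int sum = 0;
                while (mapped.hasRemaining()) {
                    sum += mapped.get();     // relative get, as with any ByteBuffer
                }
                return sum;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            if (file != null) {
                file.delete();   // best effort; may fail on some platforms while mapped
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(sumOfBytes());
    }
}
```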
See the program Sum.java
at Sun Developer for a short example of using a mapped buffer
to speed up a computation of data in a file.
More examples of FileChannel
are given in File
Channels - JDC Tech Tips - May.7.2002. In particular,
one example shows how to read a file that is in little-endian
format. Java's I/O classes, and a ByteBuffer by default, use
big-endian format for multi-byte primitive values. This
means that the most significant byte comes first in a transfer,
i.e. it is stored at the lowest memory address.
Many hardware platforms, such as Intel's, use
little-endian, in which the least significant byte
comes first.
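The byte order used by a ByteBuffer can be switched with its order() method. The following minimal sketch, with a hypothetical class name and an arbitrary four-byte sample, reads the same bytes as an int in both orders.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.Arrays;

class ByteOrderSketch {
    // Interprets the same four bytes as a big-endian and as a little-endian int.
    static int[] decodeBothWays() {
        byte[] raw = { 0x01, 0x00, 0x00, 0x00 };
        ByteBuffer buf = ByteBuffer.wrap(raw);   // big-endian by default
        int big = buf.getInt(0);                 // 0x01000000
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int little = buf.getInt(0);              // 0x00000001
        return new int[] { big, little };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(decodeBothWays()));
    }
}
```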
Charset
As mentioned above, NIO provides the programmer
with much greater access to character encoding tools. Often
communications with external sources and destinations require
conversion between the Unicode that Java uses internally
and a different encoding used by the external agent. The java.nio.charset
package contains coder/decoder classes for which you can
explicitly specify the particular encoding to use when
encoding or decoding.
For example, we could follow the previous code snippet
with this one:
// UTF-8 is a common encoding built from 8-bit units
Charset charset = Charset.forName ("UTF-8");
CharsetDecoder decoder = charset.newDecoder ();
CharBuffer charBuffer = decoder.decode (mapped_buffer);
The first two code lines create a decoder for UTF-8 encoded
text. The third uses it to decode text from the buffer (i.e.
the mapped file) and send the output into a CharBuffer.
See the java.nio.charset.Charset
class specification for more about its character decoding
and encoding capabilities.
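The encoder works in the opposite direction. As a rough sketch, with a hypothetical class name and sample text, the code below encodes a string to UTF-8 bytes, counts them, and decodes them back.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

class CharsetSketch {
    // Encodes text to UTF-8, then decodes it again; returns "<byte count>:<decoded text>".
    static String roundTrip(String text) {
        try {
            Charset charset = Charset.forName("UTF-8");
            ByteBuffer encoded = charset.newEncoder().encode(CharBuffer.wrap(text));
            int byteCount = encoded.remaining();   // buffer is returned ready for reading
            CharBuffer decoded = charset.newDecoder().decode(encoded);
            return byteCount + ":" + decoded.toString();
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The accented character takes two bytes in UTF-8.
        System.out.println(roundTrip("caf\u00e9"));
    }
}
```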
Selectable I/O
We do not discuss ports, sockets and other aspects of network
I/O until Chapter 13 in Part II of JavaTech. In chapters 14 and
15 we show how to create a basic web server that connects
with client programs (e.g. browsers) to send hypertext files
or other types of data. We will leave the details till those
chapters and simply give a brief overview of the NIO tools
available to build servers more efficiently, especially
when dealing with lots of clients.
In Part II we will see that a server obtains a new socket
object to connect with each new client. The client also
has its own socket object to talk with the server. Typically,
the server creates a threaded process to serve the
client, i.e. to deal with the communications with the client
via the particular socket assigned to it. While reading from
or writing data to the client, the socket will block, i.e.
not return until the operation is finished.
If at any given time there are lots of clients, perhaps
many thousands for a busy website, there will be many of
the server threads that are blocked while waiting for their
operations to finish. Assigning a unique thread to each
socket can waste resources if many of the sockets are blocked
but not actually reading or writing anything and instead
just waiting for responses from their clients.
The NIO system provides a way to deal with socket communications
much more efficiently. It allows for a program to create
a fixed-size pool of threads. A thread from the
pool can be taken out and assigned to deal with a client
when needed and then returned to the pool after it finishes
dealing with the client. This becomes quite effective with
NIO since it makes it possible to create non-blocking I/O
over sockets and to poll sockets to see if they need to
be serviced.
The primary classes for this task are:
The Selector acts like a listener
that polls SelectableChannels
for their status with regard to opening or accepting a socket
connection and with regard to reading or writing to a connection.
The Selector object, which is
obtained via the static method Selector.open(),
determines the status of a channel not directly from the
channel but via a SelectionKey
object that indicates the status via a set of ready flags.
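To give a flavor of how these classes fit together, here is a minimal single-threaded sketch, not a real server. It assumes Java 7 or later and an available loopback interface; the class name SelectorSketch, the method receiveOnce() and the throwaway client that connects, sends one message and closes are all invented for illustration. The selector reports first that the server channel is ready to accept and then that the accepted channel is ready to be read.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;

class SelectorSketch {
    // Accepts one loopback connection and reads one short message from it.
    static String receiveOnce(String message) {
        try (Selector selector = Selector.open();
             ServerSocketChannel server = ServerSocketChannel.open()) {
            InetAddress loopback = InetAddress.getLoopbackAddress();
            server.socket().bind(new InetSocketAddress(loopback, 0));  // any free port
            server.configureBlocking(false);
            server.register(selector, SelectionKey.OP_ACCEPT);

            // Hypothetical client: connect, send the message, and close.
            int port = server.socket().getLocalPort();
            try (SocketChannel client = SocketChannel.open(new InetSocketAddress(loopback, port))) {
                client.write(ByteBuffer.wrap(message.getBytes(StandardCharsets.UTF_8)));
            }

            ByteBuffer received = ByteBuffer.allocate(256);
            boolean done = false;
            while (!done) {
                selector.select();   // blocks until at least one registered channel is ready
                Iterator<SelectionKey> keys = selector.selectedKeys().iterator();
                while (keys.hasNext()) {
                    SelectionKey key = keys.next();
                    keys.remove();   // the selected-key set is not cleared automatically
                    if (key.isAcceptable()) {
                        SocketChannel peer = server.accept();
                        peer.configureBlocking(false);
                        peer.register(selector, SelectionKey.OP_READ);
                    } else if (key.isReadable()) {
                        SocketChannel peer = (SocketChannel) key.channel();
                        // Stop at end of stream or when the buffer is full.
                        if (peer.read(received) == -1 || !received.hasRemaining()) {
                            peer.close();
                            done = true;
                        }
                    }
                }
            }
            received.flip();
            return StandardCharsets.UTF_8.decode(received).toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(receiveOnce("hello selector"));
    }
}
```

A real server would keep the loop running, handle many peers at once, and hand ready channels to a pool of worker threads as described above.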
We won't go into further details here but you can find
several examples in these tutorials: Working
with SocketChannels - Core Java Tech Tips - Sept.9.2003.
More details are provided in this article: Non-Blocking
Socket I/O in JDK 1.4,
by Tim Burns, Owl Mountain, Dec.2001. Two example programs
with blocking connections and two examples with non-blocking
are given here: NIO
Examples - Sun Developer. The book Learning
Java... discusses an example server program that
uses these selectable, non-blocking channel techniques.
Most recent update: Nov. 7, 2007