An Iterator for InputStreams in Java

Java IO streams are OK, but in my opinion they're a bit of a hassle to use. I wondered recently how effective an iterator-based approach to reading a stream might be, so I wrote an iterator for the job. After all, it only took about ten minutes, and it might be useful, if only to understand why it shouldn't be used!

The nature of a stream is to flow - there's not always flow control to move backwards and forwards through the data. Whether this feature is available or not is reported by the 'markSupported()' method of the InputStream class.

If you consider the Iterator interface, you need 'hasNext()' and 'next()' to be supported.  In order to determine if there is more data to be read, you have to either know the length of the stream, or you have to try to read the data.  For generic streams, then, you need to be able to mark() your position, read some data, and then reset(). 
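
To make the mark / read / reset idea concrete, here's a minimal sketch of a "peek" helper. The class and method names (StreamPeek, hasMoreData) are made up for this illustration, and I'm assuming Java 8 or later for UncheckedIOException.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper, for illustration only.
final class StreamPeek {

    /** Returns true if at least one more byte can be read, without consuming it. */
    static boolean hasMoreData(InputStream in) {
        if (!in.markSupported()) {
            throw new IllegalArgumentException("stream does not support mark/reset");
        }
        try {
            in.mark(1);                      // remember the current position
            boolean more = in.read() != -1;  // try to read a single byte
            in.reset();                      // rewind so that byte is not lost
            return more;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        InputStream in = new ByteArrayInputStream(new byte[] { 42 });
        System.out.println(hasMoreData(in)); // true
        System.out.println(in.read());       // 42, because peeking did not consume it
        System.out.println(hasMoreData(in)); // false
    }
}
```

The read limit passed to mark() only needs to be 1 here, because a single byte is read before resetting.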

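So, for many streams, it's not possible to use the Iterator interface directly, but for streams that support marking, such as a ByteArrayInputStream built from a string, or a file stream wrapped in a BufferedInputStream (a plain FileInputStream doesn't support marking), it's quite useful and reduces code bulk. The amount of data you actually need to read in order to return 'true' for 'hasNext()' is only one byte, so the overhead isn't that great. In total, the number of extra one-byte reads is roughly stream length / chunk size + 1: one per chunk, plus a final one that detects the end of the stream.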
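
If a stream doesn't support marking, wrapping it in a BufferedInputStream is one way to get mark / reset support. Here's a rough sketch; the class and method names (MarkableStreams, ensureMarkable) are my own invention for the example, and the try-with-resources block assumes Java 7 or later.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, for illustration only.
final class MarkableStreams {

    /** Returns the stream itself if it supports mark/reset, otherwise a buffered wrapper. */
    static InputStream ensureMarkable(InputStream in) {
        return in.markSupported() ? in : new BufferedInputStream(in);
    }

    public static void main(String[] args) throws IOException {
        InputStream bytes = new ByteArrayInputStream("abc".getBytes());
        System.out.println(bytes.markSupported()); // true

        File tmp = File.createTempFile("chunks", ".txt");
        tmp.deleteOnExit();
        try (InputStream file = new FileInputStream(tmp)) {
            System.out.println(file.markSupported());                 // false
            System.out.println(ensureMarkable(file).markSupported()); // true
        }
    }
}
```

The wrapped stream can then be handed to anything that relies on mark() and reset(), at the cost of an extra buffer.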

Example of use

You can use the iterator like this:

    // Needs java.io.ByteArrayInputStream, java.util.Iterator and
    // com.ecreate.util.ByteInputStreamIterator to be imported.
    ByteArrayInputStream bis = new ByteArrayInputStream(
            "The quick brown fox jumps over the lazy dog".getBytes () );

    // Here, we're setting a small buffer size so we can see the
    // iterator work on a small example.  10 byte chunk size.
    Iterator<byte[]> chunker = new ByteInputStreamIterator( bis, 10 );

    while ( chunker.hasNext () )
    {
        byte[] chunk = chunker.next ();
        System.out.println ( new String ( chunk ) );
    }

This should output the following (every chunk except the last is exactly 10 bytes, so some lines begin or end with a space):

The quick 
brown fox 
jumps over
 the lazy 
dog

The code

    package com.ecreate.util;

    import java.io.IOException;
    import java.io.InputStream;
    import java.util.Arrays;
    import java.util.Iterator;
    import java.util.NoSuchElementException;

    /**
     * Iterates over an InputStream in chunks of at most maxChunkSize bytes.
     * This is not particularly efficient, because hasNext() has to read a byte
     * and then rewind the stream, but it is convenient for known cases where
     * the data size will be small.
     *
     * @author msear
     */
    public class ByteInputStreamIterator implements Iterator<byte[]>
    {
        private final int maxChunkSize;
        private final InputStream is;
        private int currentOffset; // total number of bytes returned so far

        public ByteInputStreamIterator( InputStream is, int maxChunkSize )
        {
            if ( !is.markSupported () )
            {
                throw new IllegalArgumentException( "Given InputStream doesn't support marking / resetting" );
            }
            if ( maxChunkSize <= 0 )
            {
                throw new IllegalArgumentException( "maxChunkSize must be positive" );
            }
            this.is = is;
            this.maxChunkSize = maxChunkSize;
            this.currentOffset = 0;
        }

        /**
         * Peeks at the stream: marks the position, tries to read one byte,
         * then resets so that the byte is not consumed.
         */
        public boolean hasNext ()
        {
            try
            {
                is.mark ( 1 );
                boolean more = is.read () != -1;
                is.reset ();
                return more;
            }
            catch ( IOException e )
            {
                throw new RuntimeException ( "Can't peek at stream", e );
            }
        }

        /**
         * Returns the next chunk, which holds between 1 and maxChunkSize bytes.
         */
        public byte[] next ()
        {
            byte[] buf = new byte[maxChunkSize];
            int bytesRead;

            try
            {
                bytesRead = is.read ( buf );
            }
            catch ( IOException e )
            {
                throw new RuntimeException ( "Can't read", e );
            }
            if ( bytesRead == -1 )
            {
                throw new NoSuchElementException ( "End of stream reached" );
            }
            currentOffset += bytesRead;
            // Trim the buffer to the number of bytes actually read.
            return Arrays.copyOf ( buf, bytesRead );
        }

        /**
         * Skips up to maxChunkSize bytes of the stream.  Note that this does
         * not follow the usual Iterator.remove() contract: it discards
         * upcoming data rather than the chunk most recently returned.
         */
        public void remove ()
        {
            try
            {
                is.skip ( maxChunkSize );
            }
            catch ( IOException e )
            {
                throw new RuntimeException ( "Can't remove", e );
            }
        }
    }
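
For comparison, here's a sketch of a different technique that avoids mark / reset altogether: read the next chunk ahead of time and hand it out on request. This isn't the class above, just an alternative under my own assumptions; the name LookAheadChunkIterator is made up, and it needs Java 8 or later (for UncheckedIOException and the default remove() on Iterator).

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical alternative, for illustration only.
final class LookAheadChunkIterator implements Iterator<byte[]> {
    private final InputStream in;
    private final int maxChunkSize;
    private byte[] nextChunk; // chunk read ahead of time, or null at end of stream

    LookAheadChunkIterator(InputStream in, int maxChunkSize) {
        if (maxChunkSize <= 0) {
            throw new IllegalArgumentException("maxChunkSize must be positive");
        }
        this.in = in;
        this.maxChunkSize = maxChunkSize;
        this.nextChunk = readChunk(); // prime the look-ahead
    }

    // Reads up to maxChunkSize bytes; returns null once the stream is exhausted.
    private byte[] readChunk() {
        try {
            byte[] buf = new byte[maxChunkSize];
            int n = in.read(buf);
            return n == -1 ? null : Arrays.copyOf(buf, n);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    @Override
    public boolean hasNext() {
        return nextChunk != null;
    }

    @Override
    public byte[] next() {
        if (nextChunk == null) {
            throw new NoSuchElementException("end of stream reached");
        }
        byte[] current = nextChunk;
        nextChunk = readChunk(); // read ahead for the following call
        return current;
    }

    public static void main(String[] args) {
        InputStream in = new ByteArrayInputStream(
                "The quick brown fox jumps over the lazy dog".getBytes());
        Iterator<byte[]> chunker = new LookAheadChunkIterator(in, 10);
        while (chunker.hasNext()) {
            System.out.println(new String(chunker.next()));
        }
    }
}
```

The trade-off is that the constructor and each call to next() read one chunk further than the caller has asked for, so the stream's position is always ahead of what has been returned.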

Conclusion

Well, I'm not sure that I'll ever use the code, but I guess others will have the same frustrations about Java's really simple iterator techniques not being available for streams, so maybe it'll answer a few questions!


© eCreate Web Services Limited, 2008