bit::CompressedArray Class Reference

Compressed array of sorted values.

#include <CompressedArray.hh>

Public Member Functions

 CompressedArray ()
 CompressedArray (const Array &array)
 CompressedArray (const CompressedArray &array)
 CompressedArray (u64 num_elems, unsigned int bits_per_elem)
 Create an array with initial size.
 ~CompressedArray ()
CompressedArray & operator= (const CompressedArray &array)
bool is_compressed () const
 Return true if compressed.
u64 num_elems () const
 Return the number of elements stored in the compressed array.
unsigned int shift () const
 Return the number of bits used in the compression.
u64 compressed_size () const
 Return the number of bytes required to store the possibly recursively compressed array.
void resize (u64 num_elems)
 Change the number of elements in the array.
void set_width (int bits_per_elem)
 Change the width of the elements to given width.
void set (u64 elem, u32 value)
 Set the value of an element.
void set_grow (u64 elem, u32 value)
 Set the value of an element growing the buffer if necessary.
void set_grow_widen (u64 elem, u32 value)
 Set the value of an element growing and widening the buffer if necessary.
u32 get (u64 elem) const
 Return value of an element.
void compress (unsigned int shift)
 Compress the array.
void optimal_compress ()
 Compress the array once with the optimal compression width.
void recursive_optimal_compress ()
 Compress the array and the resulting arrays recursively with the optimal compression width.
void uncompress ()
 Uncompress the array.
u64 last_leq (u32 value)
 Find the largest index whose value is equal to or smaller than the specified value.
std::string debug_str (int indent=0) const
 Return a debug string displaying the contents of the compressed array.
void write (FILE *file) const
 Write the array to a file.
void read (FILE *file)
 Read the array from a file.

Static Public Member Functions

template<class A>
static A inverse (const A &array)
template<class A>
static A shift_array (const A &array, int k)
template<class A>
static A mask (const A &array, unsigned int k)

Static Public Attributes

static const u64 constant_size_cost = 5
 Constant cost for storing shift value and pointer to inverse array to avoid compressing with small gain.

Private Attributes

Array m_array
 The masked part of the compressed array.
CompressedArray * m_inv_array
 The inverse part of the compressed array.
unsigned int m_shift
 The amount of shift used in array compression.


Detailed Description

Compressed array of sorted values.

After the array is compressed, all write operations throw bit::invalid_call.

Warning:
Compressed arrays can store at most (2^32 - 1) elements because indices are themselves stored in arrays, and array values cannot be wider than 32 bits.
Based on the following scientific publication: B. Raj and E. W. D. Whittaker. Lossless compression of language model structure and word identifiers. Proceedings of ICASSP 2003, pp. 388-391.
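
The idea behind the scheme can be pictured as splitting each sorted value into its low shift bits and an index over the remaining high bits. The following is a minimal, self-contained sketch of that idea and not the library's implementation: the names SplitArray, split_sorted and lookup are hypothetical, and the assumption that the masked part holds the low bits while the inverse part maps each high-bit value to the first index where it occurs is inferred from the member descriptions, not confirmed by them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the compressed form; not the bit:: library layout.
struct SplitArray {
    unsigned shift;                    // number of low bits kept per element
    std::vector<std::uint32_t> low;    // low[i] = value[i] & ((1 << shift) - 1)
    std::vector<std::uint32_t> first;  // first[h] = first index i with (value[i] >> shift) >= h
};

// Split a sorted array into its low bits plus an index over the high bits.
SplitArray split_sorted(const std::vector<std::uint32_t>& sorted, unsigned shift) {
    assert(shift < 32);
    SplitArray out;
    out.shift = shift;
    const std::uint32_t low_mask = (std::uint32_t(1) << shift) - 1;
    for (std::size_t i = 0; i < sorted.size(); ++i) {
        assert(i == 0 || sorted[i - 1] <= sorted[i]);  // input must be sorted
        out.low.push_back(sorted[i] & low_mask);
        const std::uint32_t high = sorted[i] >> shift;
        // Every high value up to `high` that has no entry yet starts at index i.
        while (out.first.size() <= high)
            out.first.push_back(static_cast<std::uint32_t>(i));
    }
    return out;
}

// Recover element i: find the last h with first[h] <= i, then join the halves.
std::uint32_t lookup(const SplitArray& a, std::size_t i) {
    assert(i < a.low.size());
    std::size_t lo = 0, hi = a.first.size();  // first[lo] <= i, and i < first[hi] or hi is the end
    while (hi - lo > 1) {
        const std::size_t mid = lo + (hi - lo) / 2;
        if (a.first[mid] <= i) lo = mid; else hi = mid;
    }
    return (static_cast<std::uint32_t>(lo) << a.shift) | a.low[i];
}
```

Because first is itself sorted, the same split could in principle be applied to it again, which is presumably what recursive compression refers to. The sketch allocates one first entry per possible high-bit value, so it is only practical when the range of high bits is small.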


Constructor & Destructor Documentation

bit::CompressedArray::CompressedArray ( ) [inline]
 

bit::CompressedArray::CompressedArray (const Array &array) [inline]
 

bit::CompressedArray::CompressedArray (const CompressedArray &array) [inline]
 

bit::CompressedArray::CompressedArray (u64 num_elems, unsigned int bits_per_elem) [inline]
 

Create an array with initial size.

The initial elements are guaranteed to be zero.

Parameters:
bits_per_elem = bits_per_element (0-32 bits)
num_elems = initial number of elements in the array
Exceptions:
bit::invalid_argument one of the bit arguments is invalid
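
How the library lays out its elements in memory is not described here. As a hedged illustration of what a fixed bits_per_elem width with zero-initialized elements can look like, the sketch below packs values into 64-bit words. The class name PackedArraySketch and its internals are hypothetical, and the standard exceptions stand in for bit::invalid_argument and bit::out_of_range.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical fixed-width packed array; a stand-in for illustration only.
class PackedArraySketch {
public:
    PackedArraySketch(std::uint64_t num_elems, unsigned bits_per_elem)
        : m_num_elems(num_elems), m_bits(bits_per_elem) {
        if (bits_per_elem > 32)
            throw std::invalid_argument("bits_per_elem must be 0-32");
        // Zero-filled storage; one spare word keeps the straddling case simple.
        m_words.resize((num_elems * bits_per_elem + 63) / 64 + 1);
    }

    void set(std::uint64_t elem, std::uint32_t value) {
        if (elem >= m_num_elems) throw std::out_of_range("index outside the array");
        if (m_bits == 0) throw std::invalid_argument("cannot write to a zero-width array");
        if (m_bits < 32 && (value >> m_bits) != 0)
            throw std::invalid_argument("value wider than bits_per_elem");
        const std::uint64_t bit = elem * m_bits;
        const std::uint64_t word = bit / 64;
        const unsigned offset = static_cast<unsigned>(bit % 64);
        const std::uint64_t mask = (std::uint64_t(1) << m_bits) - 1;
        m_words[word] = (m_words[word] & ~(mask << offset)) | (std::uint64_t(value) << offset);
        if (offset + m_bits > 64) {              // element straddles two words
            const unsigned spill = 64 - offset;  // bits stored in the first word
            m_words[word + 1] =
                (m_words[word + 1] & ~(mask >> spill)) | (std::uint64_t(value) >> spill);
        }
    }

    std::uint32_t get(std::uint64_t elem) const {
        if (elem >= m_num_elems) throw std::out_of_range("index outside the array");
        if (m_bits == 0) return 0;  // zero-width elements can only be zero
        const std::uint64_t bit = elem * m_bits;
        const std::uint64_t word = bit / 64;
        const unsigned offset = static_cast<unsigned>(bit % 64);
        const std::uint64_t mask = (std::uint64_t(1) << m_bits) - 1;
        std::uint64_t v = m_words[word] >> offset;
        if (offset + m_bits > 64) v |= m_words[word + 1] << (64 - offset);
        return static_cast<std::uint32_t>(v & mask);
    }

private:
    std::uint64_t m_num_elems;
    unsigned m_bits;
    std::vector<std::uint64_t> m_words;
};
```

With 13-bit elements, for example, element 4 starts at bit 52 and spills one bit into the next word, which is the case the second store in set() handles.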

bit::CompressedArray::~CompressedArray ( ) [inline]
 


Member Function Documentation

void bit::CompressedArray::compress (unsigned int shift) [inline]
 

Compress the array.

Note that if shift equals the original array width, the array is left in its original state.

Parameters:
shift = the number of bits to shift the array in compression
Exceptions:
bit::invalid_call if compressed already
bit::invalid_argument if shift is equal to or larger than bits_per_elem
bit::out_of_range if trying to compress an array with more than (2^32 - 1) elements.

u64 bit::CompressedArray::compressed_size ( ) const [inline]
 

Return the number of bytes required to store the possibly recursively compressed array.

std::string bit::CompressedArray::debug_str (int indent = 0) const [inline]
 

Return a debug string displaying the contents of the compressed array.

Parameters:
indent = the amount of whitespace indentation

u32 bit::CompressedArray::get (u64 elem) const [inline]
 

Return value of an element.

Parameters:
elem = the index of the element
Returns:
the value
Exceptions:
bit::out_of_range accessing outside the array

template<class A>
static A bit::CompressedArray::inverse (const A &array) [inline, static]
 

bool bit::CompressedArray::is_compressed ( ) const [inline]
 

Return true if compressed.

u64 bit::CompressedArray::last_leq (u32 value) [inline]


Find the largest index whose value is equal to or smaller than the specified value.

This function is a specialized version of the general bit::last_leq() as described in Raj and Whittaker (2003).

Parameters:
value = the candidate value
Returns:
the largest index containing an equal or smaller value, or max_u64 if the array has no such index
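
The contract above (largest index with an equal or smaller value, or a sentinel when there is none) can be illustrated with a plain binary search over a sorted vector. This is a sketch of the contract only, not the specialized version from the paper; the function name last_leq_sketch is hypothetical, and treating max_u64 as the largest 64-bit unsigned value is an assumption.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Largest index i with sorted[i] <= value, found by binary search.
// Returns the maximum u64 when every element is larger than value
// or when the vector is empty.
std::uint64_t last_leq_sketch(const std::vector<std::uint32_t>& sorted,
                              std::uint32_t value) {
    const std::uint64_t none = std::numeric_limits<std::uint64_t>::max();
    std::uint64_t lo = 0, hi = sorted.size();  // candidates lie in [lo, hi)
    std::uint64_t best = none;
    while (lo < hi) {
        const std::uint64_t mid = lo + (hi - lo) / 2;
        if (sorted[mid] <= value) {
            best = mid;    // mid qualifies; look for a later one
            lo = mid + 1;
        } else {
            hi = mid;      // mid is too large; look earlier
        }
    }
    return best;
}
```

When the value occurs several times, the search keeps moving right, so the index of the last occurrence is returned.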

template<class A>
static A bit::CompressedArray::mask (const A &array, unsigned int k) [inline, static]
 

u64 bit::CompressedArray::num_elems ( ) const [inline]
 

Return the number of elements stored in the compressed array.

CompressedArray& bit::CompressedArray::operator= (const CompressedArray &array) [inline]
 

void bit::CompressedArray::optimal_compress ( ) [inline]
 

Compress the array once with the optimal compression width.

Exceptions:
bit::invalid_call if compressed already
bit::out_of_range if trying to compress an array with more than (2^32 - 1) elements.

void bit::CompressedArray::read (FILE *file) [inline]


Read the array from a file.

Bug:
The array can be left in a corrupted state if the read fails with an exception.
Parameters:
file = file stream to read from
Exceptions:
bit::io_error if read fails
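
One common way for calling code to sidestep a failure mode like the one noted under Bug is to read into a temporary object and commit it only once every read has succeeded. The sketch below shows that pattern on a plain vector. The functions write_values and read_values and the on-disk layout (a 64-bit count followed by raw 32-bit values) are hypothetical, and std::runtime_error stands in for bit::io_error.

```cpp
#include <cstdint>
#include <cstdio>
#include <stdexcept>
#include <vector>

// Write a count followed by the raw values; throw on a short write.
void write_values(std::FILE* file, const std::vector<std::uint32_t>& values) {
    const std::uint64_t n = values.size();
    if (std::fwrite(&n, sizeof n, 1, file) != 1)
        throw std::runtime_error("write failed");
    if (n != 0 && std::fwrite(values.data(), sizeof(std::uint32_t), n, file) != n)
        throw std::runtime_error("write failed");
}

// Read into a temporary and swap at the end, so `values` is left untouched
// if any read fails. A corrupt count can still trigger a large allocation.
void read_values(std::FILE* file, std::vector<std::uint32_t>& values) {
    std::uint64_t n = 0;
    if (std::fread(&n, sizeof n, 1, file) != 1)
        throw std::runtime_error("read failed");
    std::vector<std::uint32_t> tmp(n);
    if (n != 0 && std::fread(tmp.data(), sizeof(std::uint32_t), n, file) != n)
        throw std::runtime_error("read failed");
    values.swap(tmp);  // commit only after every read succeeded
}
```

With the class documented here, the equivalent would be to call read() on a scratch CompressedArray and assign it to the real one afterwards, at the cost of holding two copies briefly.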

void bit::CompressedArray::recursive_optimal_compress ( ) [inline]
 

Compress the array and the resulting arrays recursively with the optimal compression width.

Exceptions:
bit::invalid_call if compressed already
bit::out_of_range if trying to compress an array with more than (2^32 - 1) elements.

void bit::CompressedArray::resize (u64 num_elems) [inline]
 

Change the number of elements in the array.

The values of the possible new elements are guaranteed to be zero ONLY if the array has never been resized smaller.

Parameters:
num_elems = the new number of elements
Exceptions:
bit::invalid_call if called after compression

void bit::CompressedArray::set (u64 elem, u32 value) [inline]
 

Set the value of an element.

Parameters:
elem = the index of the element to set
value = the value to set
Exceptions:
bit::invalid_argument value wider than the bit-width of the array or trying to write to zero-width array
bit::out_of_range accessing outside the array
bit::invalid_call if called after compression

void bit::CompressedArray::set_grow (u64 elem, u32 value) [inline]
 

Set the value of an element growing the buffer if necessary.

If the current capacity is not enough, the capacity is doubled (set to one from zero), and if that is not enough, the capacity is set to (elem+1).

Warning:
Reserving capacity and resizing the array happen before the actual value is written to the buffer. Thus, the size and capacity of the array may change even if the write throws bit::invalid_argument because the value is too wide.
Parameters:
elem = the index of the element to set
value = the value to set
Exceptions:
bit::invalid_argument value wider than the bit-width of the array
bit::invalid_call if called after compression
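
The capacity rule described above can be written out as a small function. This is a sketch of the stated rule only: the name grown_capacity is hypothetical, overflow of very large capacities is ignored, and the library's actual bookkeeping may differ.

```cpp
#include <cstdint>

// Capacity required so that index `elem` is writable, following the rule:
// keep the capacity if it suffices, otherwise double it (zero becomes one),
// and if that is still not enough, use elem + 1.
std::uint64_t grown_capacity(std::uint64_t capacity, std::uint64_t elem) {
    if (elem < capacity) return capacity;  // already enough room
    const std::uint64_t doubled = (capacity == 0) ? 1 : capacity * 2;
    if (elem < doubled) return doubled;    // doubling suffices
    return elem + 1;                       // jump straight to the needed size
}
```

For example, writing index 4 into an array of capacity 4 doubles the capacity to 8, while writing index 8 into the same array sets the capacity to 9.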

void bit::CompressedArray::set_grow_widen (u64 elem, u32 value) [inline]
 

Set the value of an element growing and widening the buffer if necessary.

See set_grow() for details on how capacity is handled and for warnings about exceptions.

Parameters:
elem = the index of the element to set
value = the value to set
Exceptions:
bit::invalid_call if called after compression

void bit::CompressedArray::set_width (int bits_per_elem) [inline]
 

Change the width of the elements to given width.

All elements must fit in the given width or bit::invalid_argument will be thrown and the array is guaranteed to be left in the original state.

Parameters:
bits_per_elem = the number of bits to use per element (1-32 bits)
Exceptions:
bit::invalid_argument invalid bits_per_elem
bit::invalid_call if called after compression
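
The requirement that all elements fit in the new width can be checked before anything is modified, which is one way to obtain the "left in the original state" guarantee. The sketch below shows such a check on a plain vector; the helper names bits_needed and all_fit are hypothetical and say nothing about how the library itself performs the test.

```cpp
#include <cstdint>
#include <vector>

// Smallest width in bits (0-32) that can hold `value`; 0 only for value 0.
unsigned bits_needed(std::uint32_t value) {
    unsigned bits = 0;
    while (value != 0) {
        ++bits;
        value >>= 1;
    }
    return bits;
}

// True if every value fits in bits_per_elem bits. Running this check first
// means a failed narrowing never has to touch the stored data.
bool all_fit(const std::vector<std::uint32_t>& values, unsigned bits_per_elem) {
    for (std::uint32_t v : values)
        if (bits_needed(v) > bits_per_elem) return false;
    return true;
}
```

A caller that wants the tightest width could take the maximum of bits_needed() over all elements and pass that to set_width().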

unsigned int bit::CompressedArray::shift ( ) const [inline]


Return the number of bits used in the compression.

Exceptions:
bit::invalid_call if uncompressed

template<class A>
static A bit::CompressedArray::shift_array (const A &array, int k) [inline, static]
 

void bit::CompressedArray::uncompress ( ) [inline]


Uncompress the array.

It is safe to call this on an uncompressed array.

void bit::CompressedArray::write (FILE *file) const [inline]


Write the array to a file.

Parameters:
file = file stream to write to
Exceptions:
bit::io_error if write fails


Member Data Documentation

const u64 bit::CompressedArray::constant_size_cost = 5 [static]
 

Constant cost for storing shift value and pointer to inverse array to avoid compressing with small gain.

This is computed as 1 byte for shift and 4 bytes for the pointer regardless of the machine architecture.
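
To illustrate how a fixed 5-byte overhead can rule out compression with a small gain, the sketch below compares an estimated compressed size against the plain size. Only the 5-byte constant comes from the text above. The size estimate (shift bits per element for the masked part, plus one index-wide entry per possible high-bit value for the inverse part) is an assumption made for the example, and every function name is hypothetical.

```cpp
#include <cstdint>

// 1 byte for the shift plus 4 bytes for the pointer, as stated above.
const std::uint64_t kConstantSizeCost = 5;

std::uint64_t bytes_for_bits(std::uint64_t bits) { return (bits + 7) / 8; }

// Smallest number of bits that can represent v (0 for v == 0).
unsigned bits_needed(std::uint64_t v) {
    unsigned bits = 0;
    while (v != 0) { ++bits; v >>= 1; }
    return bits;
}

// Assumed estimate of the size after one compression step that keeps
// `shift` low bits per element (shift < 32).
std::uint64_t estimated_compressed_bytes(std::uint64_t num_elems,
                                         std::uint32_t max_value,
                                         unsigned shift) {
    const std::uint64_t inverse_entries =
        (static_cast<std::uint64_t>(max_value) >> shift) + 1;
    const unsigned index_bits = bits_needed(num_elems);  // width of a stored index
    return bytes_for_bits(num_elems * shift) +
           bytes_for_bits(inverse_entries * index_bits) +
           kConstantSizeCost;
}

// Compress only if the estimate, overhead included, beats the plain array.
bool worth_compressing(std::uint64_t num_elems, unsigned width,
                       std::uint32_t max_value, unsigned shift) {
    return estimated_compressed_bytes(num_elems, max_value, shift) <
           bytes_for_bits(num_elems * width);
}
```

Under these assumptions, 1000 sorted 16-bit values split at shift 8 come to an estimated 1325 bytes against 2000 bytes uncompressed, whereas a 4-element 8-bit array is never worth splitting because the fixed overhead alone exceeds its 4 bytes.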

Array bit::CompressedArray::m_array [private]
 

The masked part of the compressed array.

CompressedArray* bit::CompressedArray::m_inv_array [private]
 

The inverse part of the compressed array.

If NULL, then the array is uncompressed, m_array contains the array, and m_shift is not defined.

unsigned int bit::CompressedArray::m_shift [private]
 

The amount of shift used in array compression.


The documentation for this class was generated from the following file:
Generated on Mon Jan 8 15:51:04 2007 for bit by  doxygen 1.4.6