Pajdeg  0.2.2
Pajdeg

The generic scanner used to read symbols from a stream or arbitrary buffer.

Files

file  PDScanner.h
 

Typedefs

typedef struct PDScanner * PDScannerRef
 
typedef void(* PDScannerBufFunc) (void *info, PDScannerRef scanner, char **buf, PDInteger *size, PDInteger req)
 
typedef void(* PDScannerPopFunc) (PDScannerRef scanner)
 

Creating / deleting scanners

PDScannerRef PDScannerCreateWithState (PDStateRef state)
 
PDScannerRef PDScannerCreateWithStateAndPopFunc (PDStateRef state, PDScannerPopFunc popFunc)
 
void PDScannerAttachFixedSizeBuffer (PDScannerRef scanner, char *buf, PDInteger len)
 

Using

pd_stack PDScannerGenerateStackFromFixedBuffer (PDStateRef state, char *buf, PDInteger len)
 
PDBool PDScannerPopString (PDScannerRef scanner, char **value)
 
PDBool PDScannerPopStack (PDScannerRef scanner, pd_stack *value)
 
PDBool PDScannerPopUnknown (PDScannerRef scanner, char **value)
 
PDBool PDScannerEndOfStream (PDScannerRef scanner)
 
void PDScannerAssertString (PDScannerRef scanner, char *value)
 
void PDScannerAssertStackType (PDScannerRef scanner)
 
void PDScannerAssertComplex (PDScannerRef scanner, const char *identifier)
 

Raw streams

void PDScannerSkip (PDScannerRef scanner, PDSize bytes)
 
PDInteger PDScannerPassSymbolCharacterType (PDScannerRef scanner, PDInteger symbolCharType)
 
PDBool PDScannerAttachFilter (PDScannerRef scanner, PDStreamFilterRef filter)
 
void PDScannerDetachFilter (PDScannerRef scanner)
 
PDInteger PDScannerReadStream (PDScannerRef scanner, PDInteger bytes, char *dest, PDInteger capacity)
 
PDInteger PDScannerReadStreamNext (PDScannerRef scanner, char *dest, PDInteger capacity)
 

Adjusting scanner / source

void PDScannerPushContext (PDScannerRef scanner, void *ctxInfo, PDScannerBufFunc ctxBufFunc)
 
void PDScannerPopContext (PDScannerRef scanner)
 
void PDScannerSetLoopCap (PDInteger cap)
 
void PDScannerPopSymbol (PDScannerRef scanner)
 
void PDScannerPopSymbolRev (PDScannerRef scanner)
 

Aligning, resetting, trimming scanner

void PDScannerAlign (PDScannerRef scanner, PDOffset offset)
 
void PDScannerTrim (PDScannerRef scanner, PDOffset bytes)
 
void PDScannerReset (PDScannerRef scanner)
 

Debugging

void PDScannerPrintStateTrace (PDScannerRef scanner)
 

Detailed Description

The generic scanner used to read symbols from a stream or arbitrary buffer.

The Pajdeg scanner takes a PDStateRef state and optionally a PDScannerPopFunc and allows the interpretation of symbols as defined by the state and its sub-states.

The most commonly used functions of the scanner are PDScannerPopString and PDScannerPopStack. The former attempts to retrieve a string from the input stream, the latter a pd_stack. If the next value scanned is not of the requested type, the function keeps the value around and returns false. It is not uncommon to attempt to pop a stack and, upon failure, to pop a string and behave accordingly.
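
The stack-then-string fallback described above can be sketched as follows. This is a self-contained illustration only: MockScanner, mock_pop_stack, mock_pop_string and classify_next are hypothetical stand-ins, not Pajdeg types or functions, and the payload is a plain string rather than a pd_stack. The point is the contract: a failed pop leaves the pending value in place so that a different pop can still claim it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in types: NOT the Pajdeg definitions, only enough to show the pattern. */
typedef enum { RESULT_NONE, RESULT_STRING, RESULT_STACK } ResultKind;

typedef struct {
    ResultKind  kind;   /* kind of the value waiting to be popped */
    const char *text;   /* payload (a plain string for both kinds in this mock) */
} MockScanner;

/* Hands out the value only if a stack is pending; otherwise the pending
   value is left untouched and 0 is returned. */
static int mock_pop_stack(MockScanner *s, const char **value)
{
    assert(s != NULL && value != NULL);
    if (s->kind != RESULT_STACK) return 0;
    *value = s->text;
    s->kind = RESULT_NONE;
    return 1;
}

/* Same contract as above, for strings. */
static int mock_pop_string(MockScanner *s, const char **value)
{
    assert(s != NULL && value != NULL);
    if (s->kind != RESULT_STRING) return 0;
    *value = s->text;
    s->kind = RESULT_NONE;
    return 1;
}

/* Try a stack first; on failure fall back to a string. */
static const char *classify_next(MockScanner *s)
{
    const char *v = NULL;
    if (mock_pop_stack(s, &v))  return "stack";
    if (mock_pop_string(s, &v)) return "string";
    return "none";
}
```

With the real scanner, the same shape would presumably use PDScannerPopStack followed by PDScannerPopString, with the popped values freed as noted in their parameter documentation.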

Typedef Documentation

typedef void(* PDScannerBufFunc) (void *info, PDScannerRef scanner, char **buf, PDInteger *size, PDInteger req)

Called whenever the scanner needs more data.

The scanner's buffer function must keep the existing buffer intact: the content in the range 0..*size (as passed in on the call) must remain unchanged and at the same position relative to *buf. The req argument may be set if the scanner has an idea of how much data it needs, but it is most often 0.
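
A buffer function honouring this contract might look like the sketch below. The typedefs for PDInteger and PDScannerRef are stand-ins so the example compiles on its own (the real ones come from the Pajdeg headers), and MemorySource, memory_buf_func and the default chunk size of 8 are hypothetical. It also assumes that *buf is either NULL or heap-allocated, so that realloc may be used; Pajdeg's own buffering may work differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in declarations so the sketch compiles without the Pajdeg headers. */
typedef long PDInteger;
typedef struct PDScanner *PDScannerRef;

/* Hypothetical info object: an in-memory source plus a read cursor. */
typedef struct {
    const char *data;
    PDInteger   length;
    PDInteger   cursor;
} MemorySource;

/* Has the PDScannerBufFunc shape. Appends up to `req` bytes (or a default
   chunk when req is 0) after the *size bytes already in the buffer. */
static void memory_buf_func(void *info, PDScannerRef scanner,
                            char **buf, PDInteger *size, PDInteger req)
{
    MemorySource *src = info;
    PDInteger want = req > 0 ? req : 8;     /* 8 is an arbitrary chunk size */
    PDInteger left;
    char *grown;

    (void)scanner;                          /* unused in this sketch */
    assert(src != NULL && buf != NULL && size != NULL);

    left = src->length - src->cursor;
    if (want > left) want = left;
    if (want <= 0) return;                  /* nothing more to supply */

    /* realloc may move the block, but bytes 0..*size keep the same offsets
       relative to the new base, which is what the contract asks for. */
    grown = realloc(*buf, (size_t)(*size + want));
    if (grown == NULL) return;              /* leave *buf and *size untouched */

    memcpy(grown + *size, src->data + src->cursor, (size_t)want);
    src->cursor += want;
    *buf = grown;
    *size += want;
}
```

Appending after the existing bytes, rather than overwriting from offset 0, is what keeps the already-scanned range valid for the caller.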

typedef void(* PDScannerPopFunc) (PDScannerRef scanner)

Pop function signature.

The pop function is used to lex symbols out of the buffer, potentially requesting more data via the supplied PDScannerBufFunc.

It defaults to a simple read-forward lexer but can be swapped for special-purpose readers on demand. It is internally swapped for the reverse lexer when the initial "startxref" value is located.

typedef struct PDScanner* PDScannerRef

A scanner.

Function Documentation

void PDScannerAlign ( PDScannerRef  scanner,
PDOffset  offset 
)

Align the buffer and its pointers by the given offset (often negative).

Parameters
scanner  The scanner.
offset  The offset.
void PDScannerAssertComplex ( PDScannerRef  scanner,
const char *  identifier 
)

Require that the next result is a complex of the given type; otherwise an assertion is raised.

Parameters
scanner  The scanner.
identifier  The identifier.
void PDScannerAssertStackType ( PDScannerRef  scanner)

Require that the next result is a stack (the stack is discarded); otherwise an assertion is raised.

Parameters
scanner  The scanner.
void PDScannerAssertString ( PDScannerRef  scanner,
char *  value 
)

Require that the next result is a string and that it is equal to the given value; otherwise an assertion is raised.

Parameters
scanner  The scanner.
value  Expected string.
PDBool PDScannerAttachFilter ( PDScannerRef  scanner,
PDStreamFilterRef  filter 
)

Attach a stream filter to the scanner.

Stream filters are used to e.g. compress/decompress or encrypt/decrypt binary content.

Parameters
scanner  The scanner.
filter  The filter.
void PDScannerAttachFixedSizeBuffer ( PDScannerRef  scanner,
char *  buf,
PDInteger  len 
)

Attach a fixed-size buffer to the scanner. The scanner will refuse to use the buffering function, if one is present.

Parameters
scanner  The scanner.
buf  The buffer.
len  The length of the buffer.
PDScannerRef PDScannerCreateWithState ( PDStateRef  state)

Create a scanner using the default pop function.

Parameters
state  The root state to use in the scanner.
PDScannerRef PDScannerCreateWithStateAndPopFunc ( PDStateRef  state,
PDScannerPopFunc  popFunc 
)

Create a scanner using the provided pop function.

Parameters
state  The root state to use in the scanner.
popFunc  The pop function to use.
void PDScannerDetachFilter ( PDScannerRef  scanner)

Detach attached stream filter from the scanner.

Parameters
scanner  The scanner.
PDBool PDScannerEndOfStream ( PDScannerRef  scanner)

Determine whether the scanner has reached the end of the stream.

Parameters
scanner  Scanner object.
Returns
Boolean value indicating whether the stream hit the end or not.
pd_stack PDScannerGenerateStackFromFixedBuffer ( PDStateRef  state,
char *  buf,
PDInteger  len 
)

Set up a temporary scanner with the given root state, and process buf, returning a pd_stack entry after tearing down the scanner again.

Parameters
state  The root state.
buf  Buffer.
len  Length of buffer in bytes.
Returns
pd_stack entry or NULL on failure
PDInteger PDScannerPassSymbolCharacterType ( PDScannerRef  scanner,
PDInteger  symbolCharType 
)

Skip until the given symbol character type is encountered, stopping after the symbol character type.

Note
Buffer growth is never done in this method, which means if the scanner's buffer is only partially complete, it may stop prematurely.
Parameters
scanner  The scanner.
symbolCharType  The symbol character type.
Returns
The number of bytes skipped.
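
The skipping behaviour can be illustrated with a self-contained sketch. Everything here is an assumption made for illustration: the character classes follow the whitespace and delimiter sets of the PDF specification rather than Pajdeg's own symbol tables, char_type and pass_char_type are hypothetical names, and "stopping after" is read as "just past the first matching byte". As in the note above, no attempt is made to fetch more data, so the function simply stops at the end of what is available.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical character classes; Pajdeg defines its own set. */
enum { CHAR_REGULAR = 0, CHAR_WHITESPACE = 1, CHAR_DELIMITER = 2 };

/* Classify a byte using the PDF whitespace and delimiter characters. */
static int char_type(char c)
{
    switch (c) {
    case ' ': case '\t': case '\r': case '\n': case '\f': case '\0':
        return CHAR_WHITESPACE;
    case '(': case ')': case '<': case '>': case '[': case ']':
    case '{': case '}': case '/': case '%':
        return CHAR_DELIMITER;
    default:
        return CHAR_REGULAR;
    }
}

/* Advance just past the first byte of class `type`. Returns the number of
   bytes consumed, which equals len when the class never appears in the
   data currently available (the "premature stop" case). */
static long pass_char_type(const char *buf, long len, int type)
{
    long i = 0;
    assert(buf != NULL && len >= 0);
    while (i < len) {
        if (char_type(buf[i++]) == type) break;
    }
    return i;
}
```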
void PDScannerPopContext ( PDScannerRef  scanner)

Pop the scanner's current context.

Parameters
scanner  Scanner.
PDBool PDScannerPopStack ( PDScannerRef  scanner,
pd_stack *  value 
)

Pop the next stack.

Parameters
scanner  The scanner.
value  Pointer to stack ref. Must be freed.
Returns
true if the next value was a stack.
PDBool PDScannerPopString ( PDScannerRef  scanner,
char **  value 
)

Pop the next string.

Parameters
scanner  The scanner.
value  Pointer to string variable. Must be freed.
Returns
true if the next value was a string.
void PDScannerPopSymbol ( PDScannerRef  scanner)

Pop a symbol as normal, via forward reading of buffer.

Parameters
scanner  The scanner.
See also
PDScannerCreateWithStateAndPopFunc
void PDScannerPopSymbolRev ( PDScannerRef  scanner)

Pop a symbol in reverse, by iterating backward through the buffer.

Parameters
scanner  The scanner.
See also
PDScannerCreateWithStateAndPopFunc
PDBool PDScannerPopUnknown ( PDScannerRef  scanner,
char **  value 
)

Pop the next value, which the scanner was not able to recognize.

Parameters
scanner  The scanner.
value  Pointer to string variable. Must be freed.
Returns
true if the next value was an unrecognizable string.
void PDScannerPrintStateTrace ( PDScannerRef  scanner)

Print a trace of states to stdout for the scanner.

Parameters
scanner  The scanner.
void PDScannerPushContext ( PDScannerRef  scanner,
void *  ctxInfo,
PDScannerBufFunc  ctxBufFunc 
)

Push a new buffer context onto a scanner, keeping the old one on a stack.

Parameters
scanner  Scanner.
ctxInfo  Info object for the buffer function.
ctxBufFunc  Buffer function.
PDInteger PDScannerReadStream ( PDScannerRef  scanner,
PDInteger  bytes,
char *  dest,
PDInteger  capacity 
)

Read part or all of the stream at the current position via the attached filter.

Advances both the scanner and the stream (unlike PDScannerSkip, which only advances the scanner).

Note
If a decompression filter is attached, capacity limitations may prevent all data from being read at once; the remainder is then read through calls to PDScannerReadStreamNext().
See also
PDScannerAttachFilter
PDScannerReadStreamNext
Parameters
scanner  The scanner.
bytes  The number of raw bytes to read.
dest  The destination buffer.
capacity  The capacity of the destination buffer.
Returns
The number of bytes stored in dest. If this value is equal to capacity, there may be more data available via PDScannerReadStreamNext.
PDInteger PDScannerReadStreamNext ( PDScannerRef  scanner,
char *  dest,
PDInteger  capacity 
)

Continue reading stream data via attached filter.

See also
PDScannerAttachFilter
PDScannerReadStream
Parameters
scanner  The scanner.
dest  The destination buffer.
capacity  The capacity of the destination buffer.
Returns
The number of bytes stored in dest.
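
The read-then-continue pattern implied by PDScannerReadStream and PDScannerReadStreamNext can be sketched as a loop that keeps going for as long as the destination buffer comes back full. The code below is a stand-alone illustration: MockStream, mock_read, mock_read_next and drain are hypothetical stand-ins for a scanner with a filter attached, the PDInteger typedef is a placeholder for the real one, and the 4-byte chunk is deliberately tiny to force several rounds.

```c
#include <assert.h>
#include <string.h>

typedef long PDInteger;   /* stand-in for the Pajdeg typedef */

/* Hypothetical stand-in for a scanner with a filter attached: holds the
   already decoded output and a cursor into it. */
typedef struct {
    const char *decoded;
    PDInteger   length;
    PDInteger   cursor;
} MockStream;

/* Continue reading: copy at most `capacity` bytes, return bytes stored. */
static PDInteger mock_read_next(MockStream *s, char *dest, PDInteger capacity)
{
    PDInteger n = s->length - s->cursor;
    if (n > capacity) n = capacity;
    memcpy(dest, s->decoded + s->cursor, (size_t)n);
    s->cursor += n;
    return n;
}

/* Start reading from the beginning of the stream. */
static PDInteger mock_read(MockStream *s, char *dest, PDInteger capacity)
{
    s->cursor = 0;
    return mock_read_next(s, dest, capacity);
}

/* Drain the stream into `out` (at most out_cap bytes): one initial read,
   then further reads while the chunk buffer came back full. */
static PDInteger drain(MockStream *s, char *out, PDInteger out_cap)
{
    char chunk[4];
    const PDInteger cap = (PDInteger)sizeof chunk;
    PDInteger total = 0;
    PDInteger n;

    assert(s != NULL && out != NULL && out_cap >= 0);
    n = mock_read(s, chunk, cap);
    for (;;) {
        if (total + n > out_cap) n = out_cap - total;  /* clamp to output */
        memcpy(out + total, chunk, (size_t)n);
        total += n;
        if (n < cap) break;        /* short read: nothing more to fetch */
        n = mock_read_next(s, chunk, cap);
    }
    return total;
}
```

A full buffer only means that more data may be available, so the loop has to tolerate a final read that returns 0, as happens when the stream length is an exact multiple of the chunk size.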
void PDScannerReset ( PDScannerRef  scanner)

Reset scanner buffer including size, offset, trail, etc., as well as discarding symbols and results.

Parameters
scanner  The scanner.
void PDScannerSetLoopCap ( PDInteger  cap)

Set a cap on the number of loops scanners make before considering a pop a failure.

This is used when reading a PDF for the first time, to avoid scanning backwards through the entire file looking for the startxref entry.

The loop cap is reset after every successful pop.

Parameters
cap  The cap.
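
The idea behind a loop cap can be shown with a small, self-contained backward search. This is not the Pajdeg implementation: find_last_capped is a hypothetical helper, it searches a plain byte buffer rather than driving a scanner, and it counts loops per call instead of resetting a shared counter after each successful pop.

```c
#include <assert.h>
#include <string.h>

/* Walk backward from the end of buf looking for `needle`, giving up after
   `cap` iterations. Returns the offset of the match, or -1 if the needle
   is absent or the cap is reached first. */
static long find_last_capped(const char *buf, long len,
                             const char *needle, long cap)
{
    long nlen, loops = 0, pos;

    assert(buf != NULL && needle != NULL);
    nlen = (long)strlen(needle);
    for (pos = len - nlen; pos >= 0; pos--) {
        if (loops++ >= cap) return -1;   /* cap reached: treat as failure */
        if (memcmp(buf + pos, needle, (size_t)nlen) == 0) return pos;
    }
    return -1;
}
```

With a cap in place, a file that lacks a startxref entry near its end is reported as a failure after a bounded number of steps instead of being scanned in full.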
void PDScannerSkip ( PDScannerRef  scanner,
PDSize  bytes 
)

Skip over a chunk of data internally, usually a PDF stream.

Note
The stream is not advanced; only the scanner's internal buffer offset is.
Parameters
scanner  The scanner.
bytes  The number of bytes to skip.
void PDScannerTrim ( PDScannerRef  scanner,
PDOffset  bytes 
)

Trim bytes off the head of the buffer (only used if the buffer is a non-allocated pointer into a heap).

Parameters
scanner  The scanner.
bytes  Bytes to trim off.