The generic scanner used to read symbols from a stream or arbitrary buffer.
The Pajdeg scanner takes a PDStateRef state and optionally a PDScannerPopFunc and allows the interpretation of symbols as defined by the state and its sub-states.
The primary public functions of the scanner are PDScannerPopString and PDScannerPopStack. The former attempts to retrieve a string from the input stream, the latter a pd_stack. If the next value scanned is not of the requested type, the function keeps the value around and returns false. A common pattern is to attempt to pop a stack and, on failure, to pop a string and act accordingly.
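The try-a-stack-then-a-string pattern can be sketched with a small mock. Everything in this block (mock_scanner, token, mock_pop, mock_pop_either) is a hypothetical stand-in written for illustration and is not the Pajdeg API; it only mirrors the documented behavior that a mismatched pop returns false and leaves the value queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; these are NOT Pajdeg types. */
typedef enum { TOK_STRING, TOK_STACK } tok_kind;

typedef struct {
    tok_kind kind;
    const char *text;   /* payload, for illustration only */
} token;

typedef struct {
    const token *tokens;
    size_t count;
    size_t pos;         /* index of the next unconsumed token */
} mock_scanner;

/* Pop the next token only if it has the requested kind. On a mismatch
   the token stays queued and false is returned. */
static bool mock_pop(mock_scanner *s, tok_kind want, const char **out)
{
    assert(s != NULL && out != NULL);
    if (s->pos >= s->count || s->tokens[s->pos].kind != want) return false;
    *out = s->tokens[s->pos++].text;
    return true;
}

/* Try a stack first, then fall back to a string.
   Returns 'K' for a stack, 'S' for a string, 0 when nothing is left. */
static char mock_pop_either(mock_scanner *s, const char **out)
{
    if (mock_pop(s, TOK_STACK, out)) return 'K';
    if (mock_pop(s, TOK_STRING, out)) return 'S';
    return 0;
}
```

Because a failed pop consumes nothing, the fallback call sees the same value the first call rejected.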
Called whenever the scanner needs more data.
The scanner's buffer function must leave the existing buffer content intact: the content in the range 0..*size (as of the call) must remain unchanged and in the same position relative to *buf. req may be set if the scanner has an idea of how much data it needs, but it is most often 0.
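A minimal sketch of a buffer function that honors this contract follows. The signature, the mock_source type, and the default chunk size of 8 bytes are assumptions made for this example, not the actual PDScannerBufFunc declaration; the point is only that existing bytes keep their offsets while new data is appended.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical in-memory data source; not part of Pajdeg. */
typedef struct {
    const char *data;
    size_t len;
    size_t pos;   /* next unread byte */
} mock_source;

/* Grow *buf, append more input, and update *size. Bytes in 0..*size
   (as of the call) keep their offset from *buf. req is a hint; 0 means
   "no hint", in which case an assumed default chunk of 8 bytes is used. */
static void mock_buf_func(void *info, char **buf, size_t *size, size_t req)
{
    mock_source *src = info;
    assert(src != NULL && buf != NULL && size != NULL);

    size_t want = req > 0 ? req : 8;
    size_t left = src->len - src->pos;
    if (want > left) want = left;
    if (want == 0) return;          /* end of input: buffer untouched */

    /* realloc preserves the existing content at the same offsets. */
    char *grown = realloc(*buf, *size + want);
    if (grown == NULL) return;      /* old buffer is still valid */

    memcpy(grown + *size, src->data + src->pos, want);
    src->pos += want;
    *buf = grown;
    *size += want;
}
```

The caller owns the buffer and releases it with free() once scanning is done.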
Pop function signature.
The pop function is used to lex symbols out of the buffer, potentially requesting more data via the supplied PDScannerBufFunc.
It defaults to a simple read-forward lexer but can be swapped for special-purpose readers on demand. It is internally swapped for the reverse lexer when the initial "startxref" value is located.
Align the buffer, along with its pointers, by the given offset (often negative).
- Parameters
-
scanner | The scanner. |
offset | The offset. |
void PDScannerAssertComplex(PDScannerRef scanner, const char *identifier)
Require that the next result is a complex of the given type, or throw assertion.
- Parameters
-
scanner | The scanner. |
identifier | The identifier. |
Require that the next result is a stack (the stack is discarded), or throw assertion.
- Parameters
-
void PDScannerAssertString(PDScannerRef scanner, char *value)
Require that the next result is a string, and that it is equal to the given value, or throw assertion.
- Parameters
-
scanner | The scanner. |
value | Expected string. |
Attach a stream filter to the scanner.
Stream filters are used to e.g. compress/decompress or encrypt/decrypt binary content.
- Parameters
-
scanner | The scanner. |
filter | The filter. |
Attach a fixed-size buffer to the scanner. While a fixed-size buffer is attached, the scanner will refuse to use the buffering function, even if one is present.
- Parameters
-
scanner | The scanner. |
buf | The buffer. |
len | The length of the buffer. |
Create a scanner using the default pop function.
- Parameters
-
state | The root state to use in the scanner. |
Create a scanner using the provided pop function.
- Parameters
-
state | The root state to use in the scanner. |
popFunc | The pop function to use. |
Detach attached stream filter from the scanner.
- Parameters
-
Determine whether the scanner has reached the end of the stream.
- Parameters
-
- Returns
- Boolean value indicating whether the stream hit the end or not
Set up a temporary scanner with the given root state, and process buf, returning a pd_stack entry after tearing down the scanner again.
- Parameters
-
state | The root state |
buf | Buffer |
len | Length of buffer in bytes |
- Returns
- pd_stack entry or NULL on failure
Skip until the given symbol character type is encountered, stopping after the symbol character type.
- Note
- Buffer growth is never done in this method, which means if the scanner's buffer is only partially complete, it may stop prematurely.
- Parameters
-
scanner | The scanner. |
symbolCharType | The symbol character type. |
- Returns
- The number of bytes skipped.
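The skip behavior and the note about premature stops can be illustrated with a standalone sketch. The char_type enum, classify, and skip_until are hypothetical names introduced here, not Pajdeg identifiers. The character classes follow the PDF white-space and delimiter sets, and the sketch assumes the returned count includes the terminating character.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical character classes; not the Pajdeg symbol types. */
typedef enum { CH_REGULAR, CH_WHITESPACE, CH_DELIMITER } char_type;

static char_type classify(unsigned char c)
{
    /* PDF white-space: NUL, TAB, LF, FF, CR, SPACE. */
    if (c == 0 || c == '\t' || c == '\n' || c == '\f' || c == '\r' || c == ' ')
        return CH_WHITESPACE;
    /* c is non-zero here, so strchr cannot match the terminator. */
    if (strchr("()<>[]{}/%", c) != NULL) return CH_DELIMITER;
    return CH_REGULAR;
}

/* Advance *offset until a byte of the wanted type has been consumed or
   the buffer ends. The buffer is never grown, so a partially filled
   buffer makes this stop early. Returns the number of bytes skipped. */
static size_t skip_until(const char *buf, size_t len, size_t *offset,
                         char_type want)
{
    assert(buf != NULL && offset != NULL && *offset <= len);
    size_t start = *offset;
    while (*offset < len) {
        char_type t = classify((unsigned char)buf[(*offset)++]);
        if (t == want) break;   /* stop after the matching byte */
    }
    return *offset - start;
}
```

A caller that needs to know whether the wanted type was actually found would have to inspect the last consumed byte, since reaching the end of the buffer returns a count in the same way.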
Pop the scanner's current context.
- Parameters
-
Pop the next stack.
- Parameters
-
scanner | The scanner |
value | Pointer to stack ref. Must be freed |
- Returns
- true if the next value was a stack
Pop the next string.
- Parameters
-
scanner | The scanner |
value | Pointer to string variable. Must be freed |
- Returns
- true if the next value was a string
Pop the next value, which the scanner was not able to recognize.
- Parameters
-
scanner | The scanner |
value | Pointer to string variable. Must be freed |
- Returns
- true if the next value was an unrecognizable string
Print a trace of states to stdout for the scanner.
- Parameters
-
Push a new buffer context onto a scanner, keeping the old one on a stack.
- Parameters
-
scanner | Scanner |
ctxInfo | Info object for the buffer function |
ctxBufFunc | Buffer function |
Read parts or entire stream at current position via attached filter.
Iterates both the scanner and the stream (unlike PDScannerSkip, which only iterates the scanner).
- Note
- If a decompression filter is attached, not all data may be read at once due to capacity limitations; further calls to PDScannerReadStreamNext() may then be required.
- See also
- PDScannerAttachFilter
- PDScannerReadStreamNext
- Parameters
-
scanner | The scanner. |
bytes | The number of raw bytes to read. |
dest | The destination buffer. |
capacity | The capacity of the destination buffer. |
- Returns
- The number of bytes stored in dest. If this value is equal to capacity, there may be more data available via PDScannerReadStreamNext.
Continue reading stream data via attached filter.
- See also
- PDScannerAttachFilter
- PDScannerReadStream
- Parameters
-
scanner | The scanner. |
dest | The destination buffer. |
capacity | The capacity of the destination buffer. |
- Returns
- The number of bytes stored in dest.
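The read-then-continue protocol (a first read, followed by further reads while the returned count equals the capacity) can be sketched as below. The mock_stream type and the mock_read_first, mock_read_next, and drain_stream functions are hypothetical stand-ins, not the Pajdeg functions, and the "filter" here is an identity copy so that the example stays self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stream state; stands in for a scanner with a filter. */
typedef struct {
    const char *pending;  /* filtered bytes not yet handed out */
    size_t remaining;
} mock_stream;

/* Continue reading: hand out at most capacity bytes. */
static size_t mock_read_next(mock_stream *st, char *dest, size_t capacity)
{
    assert(st != NULL && dest != NULL);
    size_t n = st->remaining < capacity ? st->remaining : capacity;
    memcpy(dest, st->pending, n);
    st->pending += n;
    st->remaining -= n;
    return n;
}

/* Begin reading `bytes` raw bytes (identity filter assumed). */
static size_t mock_read_first(mock_stream *st, const char *raw, size_t bytes,
                              char *dest, size_t capacity)
{
    st->pending = raw;
    st->remaining = bytes;
    return mock_read_next(st, dest, capacity);
}

/* Drain a whole stream: a result equal to the capacity means more data
   may be available, so keep calling the "next" function.
   `out` must have room for at least `bytes` bytes. */
static size_t drain_stream(const char *raw, size_t bytes, char *out,
                           size_t chunk_cap)
{
    char chunk[16];
    assert(chunk_cap > 0 && chunk_cap <= sizeof chunk);
    mock_stream st;
    size_t total = 0;
    size_t got = mock_read_first(&st, raw, bytes, chunk, chunk_cap);
    memcpy(out + total, chunk, got);
    total += got;
    while (got == chunk_cap) {
        got = mock_read_next(&st, chunk, chunk_cap);
        memcpy(out + total, chunk, got);
        total += got;
    }
    return total;
}
```

When the data length is an exact multiple of the capacity, the loop makes one extra call that returns 0, which is how the end of the data is detected in this sketch.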
Reset scanner buffer including size, offset, trail, etc., as well as discarding symbols and results.
- Parameters
-
Set a cap on the number of loops the scanner makes before considering a pop a failure.
This is used when reading a PDF for the first time, to avoid scanning backwards through the entire file while looking for the startxref entry.
The loop cap is reset after every successful pop.
- Parameters
-
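The idea behind a loop cap can be shown with a bounded backward search. This is a sketch under assumptions: find_backwards is a hypothetical helper, not how the scanner's reverse lexer is implemented, and the cap applies per call here rather than being reset after each successful pop.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Search backwards from the end of buf for needle, giving up after
   loop_cap comparison steps. On success, stores the match offset in
   *found_at and returns true; otherwise leaves *found_at untouched. */
static bool find_backwards(const char *buf, size_t len, const char *needle,
                           size_t loop_cap, size_t *found_at)
{
    assert(buf != NULL && needle != NULL && found_at != NULL);
    size_t nlen = strlen(needle);
    if (nlen == 0 || nlen > len) return false;

    size_t pos = len - nlen;          /* last position where needle fits */
    for (size_t loops = 0; loops < loop_cap; loops++) {
        if (memcmp(buf + pos, needle, nlen) == 0) {
            *found_at = pos;
            return true;
        }
        if (pos == 0) break;          /* reached the start of the buffer */
        pos--;
    }
    return false;                     /* cap hit or buffer exhausted */
}
```

With a small cap, a file whose tail lacks the marker fails quickly instead of triggering a scan of the whole buffer.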
Skip over a chunk of data internally, usually a PDF stream.
- Note
- The stream is not iterated, only the scanner's internal buffer offset is.
- Parameters
-
scanner | The scanner. |
bytes | The number of bytes to skip. |
Trim bytes off the head of the buffer (only used if the buffer is a non-allocated pointer into a heap).
- Parameters
-
scanner | The scanner. |
bytes | Bytes to trim off. |