Pajdeg  0.2.2
Pajdeg

A content stream, i.e. a stream containing Adobe commands, such as for writing text and similar.

Files

file  PDContentStream.h
 
file  PDContentStreamPrinter.h
 
file  PDContentStreamTextExtractor.h
 

Data Structures

struct  PDContentStream
 
struct  PDContentStreamOperation
 

Typedefs

typedef struct PDContentStreamOperation * PDContentStreamOperationRef
 
typedef struct PDContentStream * PDContentStreamRef
 
typedef PDOperatorState(* PDContentOperatorFunc) (PDContentStreamRef cs, void *userInfo, PDArrayRef args, pd_stack inState, pd_stack *outState)
 

Enumerations

enum  PDOperatorState { PDOperatorStateIndependent = 0, PDOperatorStatePush = 1, PDOperatorStatePop = 2, PDOperatorStateSeek = 3 }
 

Functions

PDContentStreamRef PDContentStreamCreate (void)
 Set up a content stream for parsing one or multiple streams in PDObjects.
 
void PDContentStreamAttachOperator (PDContentStreamRef cs, const char *opname, PDContentOperatorFunc op, void *userInfo)
 
void PDContentStreamAttachDeallocator (PDContentStreamRef cs, PDDeallocator deallocator, void *userInfo)
 
void PDContentStreamAttachResetter (PDContentStreamRef cs, PDDeallocator resetter, void *userInfo)
 
void PDContentStreamAttachOperatorPairs (PDContentStreamRef cs, void *userInfo, const void **pairs)
 
PDSplayTreeRef PDContentStreamGetOperatorTree (PDContentStreamRef cs)
 
void PDContentStreamSetOperatorTree (PDContentStreamRef cs, PDSplayTreeRef operatorTree)
 
void PDContentStreamInheritContentStream (PDContentStreamRef dest, PDContentStreamRef source)
 
void PDContentStreamExecute (PDContentStreamRef cs, PDObjectRef ob)
 
void PDContentStreamReset (PDContentStreamRef cs)
 
const pd_stack PDContentStreamGetOperators (PDContentStreamRef cs)
 
PDContentStreamRef PDContentStreamCreateTextSearch (PDObjectRef object, const char *searchString, PDTextSearchOperatorFunc callback)
 Create a content stream configured to perform text search.
 

Detailed Description

A content stream, i.e. a stream of PDF operators (Adobe commands), such as those for writing text.

Content streams may have a wide variety of purposes. One such purpose is the drawing of content on the page. The content stream module contains support for working with the state machine to process the content in a variety of ways.

The mode of operation goes as follows:

At this point, no known example exists where the above complexity (in reference to arguments) is necessary. Instead, the following approximation is done:

The current operators stack is maintained exactly as defined.


Data Structure Documentation

struct PDContentStream

Content stream internal structure

The content stream is a simple wrapper around an object, with additional support for PDF operators (for example, the ones used to draw content on a page).

Data Fields

PDSplayTreeRef opertree
 operator tree
 
PDArrayRef args
 pending operator arguments
 
pd_stack opers
 current operators stack
 
pd_stack deallocators
 deallocator (func, userInfo) pairs – called when content stream object is about to be destroyed
 
pd_stack resetters
 resetter (func, userInfo) pairs – called at the end of every execution call; these adhere to the PDDeallocator signature
 
const char * lastOperator
 the last operator that was encountered
 
struct PDContentStreamOperation

A PDF content stream operation.

Content stream operations are simple pairs of operator names and their corresponding state stack, if any.

Data Fields

char * name
 name of operator
 
pd_stack state
 state of operator; usually preserved values from its args
 

Typedef Documentation

typedef PDOperatorState(* PDContentOperatorFunc) (PDContentStreamRef cs, void *userInfo, PDArrayRef args, pd_stack inState, pd_stack *outState)

Function signature for content operators.

Parameters
cs  Content stream reference
userInfo  User info value passed to the operator when called
args  Arguments as PDArray
inState  The top-most entry in the operation stack, if any
outState  Pointer to the resulting state after this operation; only applies to operators that return PDOperatorStatePush
Returns
State of operator (i.e. whether it pushes, pops, or does neither)

typedef struct PDContentStream * PDContentStreamRef

A PDF content stream.

Enumeration Type Documentation

enum PDOperatorState

Content operator result type.

Content operators are responsible for maintaining the operator state of their designated functions. For example, the 'q' and 'BT' operators need to return PDOperatorStatePush and the 'Q' and 'ET' operators need to return PDOperatorStatePop. The default behavior for all unattached operators is PDOperatorStateIndependent, i.e. the operator does not modify the current operators stack.

Enumerator
PDOperatorStateIndependent 

this operator neither pushes onto nor pops the stack

PDOperatorStatePush 

this operator pushes onto the stack

PDOperatorStatePop 

this operator pops the stack

PDOperatorStateSeek 

this operator performs a seek operation on the stream; the latest argument in the arg stack is a PDNumber indicating the number of bytes
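
The push and pop rules above can be illustrated with a small standalone model. This is only a sketch and not Pajdeg's implementation: every `Toy`/`toy_` name is hypothetical, the stack is a fixed-size array rather than a pd_stack, operator arguments and per-operator state are left out, and PDOperatorStateSeek is not modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the documented enumerator values; the Toy* names are hypothetical. */
typedef enum {
    ToyStateIndependent = 0,
    ToyStatePush = 1,
    ToyStatePop = 2
} ToyOperatorState;

#define TOY_MAX_DEPTH 16

/* A fixed-size stand-in for the "current operators stack". */
typedef struct {
    const char *names[TOY_MAX_DEPTH];
    size_t depth;
} ToyOperatorStack;

/* What an attached callback for `name` would be expected to return. */
static ToyOperatorState toy_state_for(const char *name)
{
    if (strcmp(name, "q") == 0 || strcmp(name, "BT") == 0) return ToyStatePush;
    if (strcmp(name, "Q") == 0 || strcmp(name, "ET") == 0) return ToyStatePop;
    return ToyStateIndependent; /* default for every other operator */
}

/* Apply one operator to the stack according to the state it reports. */
static void toy_apply(ToyOperatorStack *stack, const char *name)
{
    switch (toy_state_for(name)) {
    case ToyStatePush:
        assert(stack->depth < TOY_MAX_DEPTH);
        stack->names[stack->depth++] = name;
        break;
    case ToyStatePop:
        assert(stack->depth > 0);
        stack->depth--;
        break;
    case ToyStateIndependent:
        break; /* the stack is left untouched */
    }
}

/* Run a sequence of operator names; return the deepest nesting reached. */
static size_t toy_run(ToyOperatorStack *stack, const char *const *ops, size_t count)
{
    size_t deepest = stack->depth;
    for (size_t i = 0; i < count; i++) {
        toy_apply(stack, ops[i]);
        if (stack->depth > deepest) deepest = stack->depth;
    }
    return deepest;
}
```

Feeding the sequence q, BT, Tf, Tj, ET, Q through `toy_run` nests two levels deep and leaves the stack empty again, which is the balance that operators returning PDOperatorStatePush and PDOperatorStatePop are expected to maintain.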

Function Documentation

void PDContentStreamAttachDeallocator ( PDContentStreamRef  cs,
PDDeallocator  deallocator,
void *  userInfo 
)

Attach a deallocator to a content stream. A content stream calls every deallocator attached to it once, just before the content stream is destroyed. Deallocators are used to clean up user info objects that were allocated while setting up operators.

Parameters
cs  The content stream
deallocator  The deallocator callback. It will be handed userInfo as argument
userInfo  The argument passed to deallocator

void PDContentStreamAttachOperator ( PDContentStreamRef  cs,
const char *  opname,
PDContentOperatorFunc  op,
void *  userInfo 
)

Attach an operator function to a given operator (replacing the current operator, if any).

Parameters
cs  The content stream
opname  The operator (e.g. "BT")
op  The callback, which abides by the PDContentOperatorFunc signature
userInfo  User info value passed to the operator when called

void PDContentStreamAttachOperatorPairs ( PDContentStreamRef  cs,
void *  userInfo,
const void **  pairs 
)

Attach a variable number of operator function pairs (opname, func, ...), each sharing the given user info object.

Pairs are provided using the PDDef() macro. The following code

PDContentStreamAttachOperatorPairs(cs, ui, PDDef(
    "q", myGfxStatePush,
    "Q", myGfxStatePop,
    "BT", myBeginTextFunc,
    "ET", myEndTextFunc
));

is equivalent to

PDContentStreamAttachOperator(cs, "q", myGfxStatePush, ui);
PDContentStreamAttachOperator(cs, "Q", myGfxStatePop, ui);
PDContentStreamAttachOperator(cs, "BT", myBeginTextFunc, ui);
PDContentStreamAttachOperator(cs, "ET", myEndTextFunc, ui);

Parameters
cs  The content stream
userInfo  The shared user info object
pairs  Pairs of operator name + operator callback
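
The equivalence described above (attaching pairs is the same as attaching each operator in turn, and attaching replaces any existing callback) can be sketched with a standalone model. All `Toy`/`toy_` names are hypothetical. The sketch assumes a flat array in place of the operator tree, a simplified callback type, and a typed, NULL-terminated array of structs in place of the `const void **` list that PDDef() produces, so that it stays within ISO C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback type; the real PDContentOperatorFunc takes more arguments. */
typedef int (*ToyOperatorFunc)(void *userInfo);

typedef struct {
    const char *opname;
    ToyOperatorFunc func;
    void *userInfo;
} ToyOperatorEntry;

#define TOY_MAX_OPERATORS 32

/* A flat table standing in for the operator tree. */
typedef struct {
    ToyOperatorEntry entries[TOY_MAX_OPERATORS];
    size_t count;
} ToyOperatorTable;

/* Attach func to opname, replacing any callback already attached to it. */
static void toy_attach_operator(ToyOperatorTable *table, const char *opname,
                                ToyOperatorFunc func, void *userInfo)
{
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].opname, opname) == 0) {
            table->entries[i].func = func;
            table->entries[i].userInfo = userInfo;
            return;
        }
    }
    assert(table->count < TOY_MAX_OPERATORS);
    table->entries[table->count].opname = opname;
    table->entries[table->count].func = func;
    table->entries[table->count].userInfo = userInfo;
    table->count++;
}

/* One (opname, func) pair; a NULL opname terminates the list. */
typedef struct {
    const char *opname;
    ToyOperatorFunc func;
} ToyOperatorPair;

/* Attach every pair in turn, all sharing the same userInfo. */
static void toy_attach_operator_pairs(ToyOperatorTable *table, void *userInfo,
                                      const ToyOperatorPair *pairs)
{
    for (; pairs->opname != NULL; pairs++) {
        toy_attach_operator(table, pairs->opname, pairs->func, userInfo);
    }
}

/* Look up the entry attached to opname, or NULL if none is attached. */
static const ToyOperatorEntry *toy_find_operator(const ToyOperatorTable *table,
                                                 const char *opname)
{
    for (size_t i = 0; i < table->count; i++) {
        if (strcmp(table->entries[i].opname, opname) == 0) return &table->entries[i];
    }
    return NULL;
}

/* Two sample callbacks; the return values mimic push (1) and pop (2). */
static int toy_push(void *userInfo) { (void)userInfo; return 1; }
static int toy_pop(void *userInfo) { (void)userInfo; return 2; }
```

Calling `toy_attach_operator_pairs` with the pairs ("q", toy_push) and ("Q", toy_pop) leaves two entries in the table that share one user info pointer; attaching "q" again afterwards replaces the callback without adding a third entry.
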
void PDContentStreamAttachResetter ( PDContentStreamRef  cs,
PDDeallocator  resetter,
void *  userInfo 
)

Attach a resetter to a content stream. A content stream calls every resetter attached to it at the end of every call to PDContentStreamExecute.

Parameters
cs  The content stream
resetter  The resetter callback
userInfo  The argument passed to the resetter
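
The different lifetimes of resetters and deallocators can be contrasted in a standalone model. This is a sketch under stated assumptions, not Pajdeg's code: all `Toy`/`toy_` names are hypothetical, `ToyCallback` stands in for PDDeallocator and is assumed to receive only the user info pointer, and the actual stream parsing is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PDDeallocator: a callback handed its userInfo. */
typedef void (*ToyCallback)(void *userInfo);

typedef struct {
    ToyCallback func;
    void *userInfo;
} ToyHook;

#define TOY_MAX_HOOKS 8

/* Hypothetical content stream that only tracks its (func, userInfo) pairs. */
typedef struct {
    ToyHook resetters[TOY_MAX_HOOKS];
    size_t resetterCount;
    ToyHook deallocators[TOY_MAX_HOOKS];
    size_t deallocatorCount;
} ToyStream;

static void toy_attach_resetter(ToyStream *cs, ToyCallback func, void *userInfo)
{
    assert(cs->resetterCount < TOY_MAX_HOOKS);
    cs->resetters[cs->resetterCount].func = func;
    cs->resetters[cs->resetterCount].userInfo = userInfo;
    cs->resetterCount++;
}

static void toy_attach_deallocator(ToyStream *cs, ToyCallback func, void *userInfo)
{
    assert(cs->deallocatorCount < TOY_MAX_HOOKS);
    cs->deallocators[cs->deallocatorCount].func = func;
    cs->deallocators[cs->deallocatorCount].userInfo = userInfo;
    cs->deallocatorCount++;
}

/* Stand-in for executing one object: parsing is omitted, resetters run at the end. */
static void toy_execute(ToyStream *cs)
{
    for (size_t i = 0; i < cs->resetterCount; i++) {
        cs->resetters[i].func(cs->resetters[i].userInfo);
    }
}

/* Stand-in for destruction: every deallocator runs exactly once. */
static void toy_destroy(ToyStream *cs)
{
    for (size_t i = 0; i < cs->deallocatorCount; i++) {
        cs->deallocators[i].func(cs->deallocators[i].userInfo);
    }
    cs->deallocatorCount = 0;
    cs->resetterCount = 0;
}

/* Example user info: per-execution scratch data plus call counters. */
typedef struct {
    int matchesThisRun;
    int resetCalls;
    int freed;
} ToySearchInfo;

/* Resetter: clears the per-execution scratch data. */
static void toy_reset_info(void *userInfo)
{
    ToySearchInfo *info = userInfo;
    info->matchesThisRun = 0;
    info->resetCalls++;
}

/* Deallocator: here it only records that cleanup happened. */
static void toy_free_info(void *userInfo)
{
    ToySearchInfo *info = userInfo;
    info->freed++;
}
```

After two calls to `toy_execute` the resetter has run twice and the deallocator not at all; the deallocator runs once, when `toy_destroy` is called.
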
PDContentStreamRef PDContentStreamCreate ( void  )

Set up a content stream for parsing one or multiple streams in PDObjects.

Returns
The content stream object

PDContentStreamRef PDContentStreamCreateTextSearch ( PDObjectRef  object,
const char *  searchString,
PDTextSearchOperatorFunc  callback 
)

Create a content stream configured to perform text search.

Note
When searching across multiple streams, it is more efficient to create one text search content stream and then use PDContentStreamGetOperatorTree and PDContentStreamSetOperatorTree to configure the additional content streams than to create a separate text search content stream for each one.
Parameters
object  Object in which search should be performed
searchString  String to search for
callback  Callback for matches
Returns
A pre-configured content stream

void PDContentStreamExecute ( PDContentStreamRef  cs,
PDObjectRef  ob 
)

Execute the content stream, i.e. parse the stream and call the operators as appropriate.

Parameters
cs  The content stream
ob  The object whose stream should be executed

const pd_stack PDContentStreamGetOperators ( PDContentStreamRef  cs)

Get the current operator stack from the content stream.

The operator stack is a stack of PDContentStreamOperationRef objects; the values in the object can be obtained using ob->name and ob->state.

See also
PDContentStreamOperation
Parameters
cs  The content stream
Returns
Operator stack
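
Reading the name of each entry on the operator stack can be sketched with a standalone model. All `Toy`/`toy_` names are hypothetical: `ToyOperation` mirrors the name and state fields of PDContentStreamOperation, while the `below` link is an assumption made only for this sketch and does not describe how pd_stack is actually laid out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mirror of PDContentStreamOperation, plus a link to the entry below it. */
typedef struct ToyOperation {
    const char *name;                 /* name of operator */
    void *state;                      /* state of operator, if any */
    const struct ToyOperation *below; /* next entry down the stack, or NULL */
} ToyOperation;

/* Write the operator names from the top of the stack downward, separated
   by " < ". Returns the number of entries written; stops early if the
   buffer is too small, leaving the names written so far. */
static size_t toy_describe(const ToyOperation *top, char *out, size_t outSize)
{
    size_t visited = 0;
    size_t used = 0;
    assert(outSize > 0);
    out[0] = '\0';
    for (const ToyOperation *op = top; op != NULL; op = op->below) {
        int n = snprintf(out + used, outSize - used, "%s%s",
                         visited ? " < " : "", op->name);
        if (n < 0 || (size_t)n >= outSize - used) {
            out[used] = '\0'; /* drop the partially written entry */
            break;
        }
        used += (size_t)n;
        visited++;
    }
    return visited;
}
```

For a stack with "BT" on top of "q", `toy_describe` visits two entries, the innermost operator first.
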
PDSplayTreeRef PDContentStreamGetOperatorTree ( PDContentStreamRef  cs)

Get the operator tree for the content stream. The operator tree is the representation of the operators in effect in the content stream. The returned tree is mutable and is not a copy: changes made to it directly, or made through the content stream, affect the original content stream.

Parameters
cs  The content stream
Returns
The operator tree

void PDContentStreamInheritContentStream ( PDContentStreamRef  dest,
PDContentStreamRef  source 
)

Inherit a content stream, copying its resetters and operator tree into the destination. Now that content streams carry deallocators and resetters, this is the recommended way to "clone" a content stream. Deallocators are deliberately not copied: the user info they clean up still belongs to the source, so the source (master) content stream must remain alive until all child streams have finished.

Parameters
dest  Destination content stream (must be a clean content stream without, in particular, any resetters)
source  Source content stream, whose values should be cloned in dest

void PDContentStreamReset ( PDContentStreamRef  cs)

Reset the content stream, calling attached resetters. This should be done when the stream has finished executing all objects it is meant to execute.

Parameters
cs  The content stream

void PDContentStreamSetOperatorTree ( PDContentStreamRef  cs,
PDSplayTreeRef  operatorTree 
)

Replace the content stream's operator tree with the new tree (which must not be NULL). The content stream uses the new tree directly rather than a copy, so changes made through the content stream modify the tree, and changes made to the tree affect the content stream.

Parameters
cs  The content stream
operatorTree  The new operator tree
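
The sharing behavior of the operator tree getter and setter can be illustrated with a standalone model. This is a sketch only: all `Toy`/`toy_` names are hypothetical, the tree is reduced to a counter of attached operators, and ownership (retaining and releasing the tree) is not modeled at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the operator tree: it only counts attached operators. */
typedef struct {
    size_t operatorCount;
} ToyTree;

/* Hypothetical stand-in for a content stream holding a pointer to its tree. */
typedef struct {
    ToyTree *tree;
} ToyStream;

/* Return the stream's own tree, not a copy of it. */
static ToyTree *toy_get_tree(const ToyStream *cs)
{
    return cs->tree;
}

/* Make the stream use `tree` directly; NULL is not allowed. */
static void toy_set_tree(ToyStream *cs, ToyTree *tree)
{
    assert(tree != NULL);
    cs->tree = tree;
}

/* Attaching an operator through a stream modifies whatever tree it points at. */
static void toy_attach(ToyStream *cs)
{
    cs->tree->operatorCount++;
}
```

If stream `b` is given the tree obtained from stream `a`, an operator attached through `b` is visible through `a` as well, because both streams point at the same tree. This is what makes configuring one stream and handing its tree to others cheap, and also why a change made through any of them affects all of them.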