Pajdeg
0.2.2
From samples/add-metadata.c:
We now want to add metadata to an existing PDF. If the PDF has metadata already, we explode, but that's fine, we'll deal with that soon. The first new thing we have to do is declare a mutator task function above main.
It takes four arguments: the pipe, its owning task, the object it's supposed to do its magic on, and an info object that we can use to store info in. We'll get to this function in a bit.
The next thing we have to do is create a new object. This is actually done straight off without using tasks. It may be a little confusing at first, but a mutator mutates (changes) something; it never creates anything.
In any case, to create objects, we need to introduce a new friend, the parser. In our main():
The object has been given a unique object ID and is set up, ready to be stuffed into the output PDF as soon as we hit execute.
We want to tweak it first, of course. Here, we're setting the metadata to our own string. The three flags at the end tell the object whether it should set the Length property using our provided length, whether the passed buffer needs to be freed after the object is finished using it, and whether the passed content is encrypted or not. If you strdup()'d a string and passed it in, you would give true as the second of these flags, unless you planned to free() it yourself once you knew the object was done with it.
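The "free the buffer when done" flag is an ownership hand-off. As a rough illustration only, the following sketch models that idea with a toy stream holder. Every name in it (ToyStream, toy_stream_set, toy_stream_finish) is hypothetical and not part of Pajdeg; it just shows why a heap-allocated buffer can be handed over with the flag set, while a string literal or a buffer you intend to free yourself should not be.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object holding stream data; NOT a Pajdeg type. */
typedef struct {
    char  *buffer;       /* the stream content */
    size_t length;       /* number of bytes in the content */
    bool   owns_buffer;  /* true if the holder must free() the buffer */
} ToyStream;

/* Store the buffer; take_ownership mirrors the "free it for me" flag. */
static void toy_stream_set(ToyStream *s, char *buffer, size_t length,
                           bool take_ownership)
{
    assert(s != NULL);
    s->buffer = buffer;
    s->length = length;
    s->owns_buffer = take_ownership;
}

/* Called when the holder is done with the content.
   Returns true if the buffer was freed here. */
static bool toy_stream_finish(ToyStream *s)
{
    bool freed = false;
    assert(s != NULL);
    if (s->owns_buffer) {
        free(s->buffer);
        freed = true;
    }
    s->buffer = NULL;
    s->length = 0;
    s->owns_buffer = false;
    return freed;
}
```

With ownership handed over, the caller must not touch or free the buffer again; without it, the caller keeps that responsibility.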
Adding a metadata object is fine and all, but it won't do any good unless we point the PDF's root object at the new metadata object. That's what our task from before is for.
We're creating a mutator task for the "root object" property type (i.e. the root object of the PDF), and we're passing our addMetadata function to it, and finally we're setting the info object to the meta object we made earlier.
Under the hood, this sets up a filter task for the root object's object ID, and attaches a mutator task to that filter task. The filter task will be pinged every time an object passes through the pipe and if the filter encounters the object whose ID matches the root object of the PDF, it will trigger its mutator task and hand it the object in question. That mutator task is our addMetadata function.
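To make the filter-then-mutator idea concrete, here is a small self-contained model of it. This is only a sketch of the dispatch pattern described above, not Pajdeg's implementation: ToyObject, ToyFilter, toy_pipe_execute and toy_mark are all made-up names, and the real library works on a stream of parsed PDF objects rather than an array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT Pajdeg types. */
typedef struct {
    int id;       /* object ID */
    int touched;  /* set by the mutator when it runs on this object */
} ToyObject;

typedef int (*ToyMutator)(ToyObject *object, void *info);

typedef struct {
    int        target_id;  /* object ID the filter is watching for */
    ToyMutator mutate;     /* mutator attached to the filter */
    void      *info;       /* user data handed to the mutator */
} ToyFilter;

/* Pass every object by the filter; the mutator fires only on an ID match.
   Returns the number of times the mutator was called. */
static int toy_pipe_execute(ToyObject *objects, size_t count,
                            const ToyFilter *filter)
{
    int fired = 0;
    assert(filter != NULL && filter->mutate != NULL);
    for (size_t i = 0; i < count; i++) {
        if (objects[i].id == filter->target_id) {
            filter->mutate(&objects[i], filter->info);
            fired++;
        }
    }
    return fired;
}

/* Example mutator: copy the int that info points at onto the matched object. */
static int toy_mark(ToyObject *object, void *info)
{
    object->touched = *(int *)info;
    return 0;
}
```

Note that the mutator only runs inside toy_pipe_execute(), so whatever info points at has to stay alive until execution has finished.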
Which we will get to very soon. Before we do, though, there are a few things left: adding our task to the pipe, executing the pipe, and some clean-up.
Caution: releasing the meta object before calling PDPipeExecute() will cause a crash, because addMetadata uses it and addMetadata is not called until PDPipeExecute() has been called.
The last part is the actual task callback.
We could blindly change the root object's Metadata key to point to our new object. We could, but it would be very bad. We would leave a potentially huge abandoned object in the resulting PDF. Even worse, a PDF would end up with one metadata object for every time it had gone through our pipe, since we would be adding a new one on each pass.
Here, we are using a PDDictionary for the first time. It's simply a key/value pair container, used to represent dictionaries in PDFs. We can get the dictionary associated with a PDObject using PDObjectGetDictionary(). There is a corresponding PDObjectGetArray() for array type objects, and so on.
In any case, if PDDictionaryGet() returns a non-NULL value for the "Metadata" key, we explode. With that out of the way, setting the metadata is fairly straightforward.
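The "look up the key first, and only set it if it is absent" guard can be sketched with a toy dictionary. This is an illustration of the pattern only: ToyDict, toy_dict_get and toy_dict_set_if_absent are hypothetical names, and the sketch reports failure through a return value instead of exploding.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_DICT_CAP 8

/* Hypothetical key/value container; NOT Pajdeg's dictionary type. */
typedef struct {
    const char *key;
    const char *value;
} ToyEntry;

typedef struct {
    ToyEntry entries[TOY_DICT_CAP];
    size_t   count;
} ToyDict;

/* Return the value stored for key, or NULL if the key is absent. */
static const char *toy_dict_get(const ToyDict *d, const char *key)
{
    for (size_t i = 0; i < d->count; i++) {
        if (strcmp(d->entries[i].key, key) == 0) {
            return d->entries[i].value;
        }
    }
    return NULL;
}

/* Insert key/value only if key is not present yet.
   Returns 1 on insert, 0 if the key exists or the dictionary is full. */
static int toy_dict_set_if_absent(ToyDict *d, const char *key,
                                  const char *value)
{
    assert(d != NULL && key != NULL && value != NULL);
    if (toy_dict_get(d, key) != NULL || d->count == TOY_DICT_CAP) {
        return 0;
    }
    d->entries[d->count].key = key;
    d->entries[d->count].value = value;
    d->count++;
    return 1;
}
```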
Our meta object is the info, passed to the task:
We put this into the dictionary as the Metadata value:
Note that while meta is a PDObject, by setting a PDDictionary entry's value to a PDObject, it will ultimately end up being a PDReference value. In other words, objects will translate into "<object id> <generation number> R" in a PDDictionary, when written to a PDF.
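For a feel of what that reference looks like on the way out, the sketch below formats a dictionary entry the way it would appear in a PDF. The helper names are hypothetical and the object ID used in any example is arbitrary; Pajdeg does this internally, and this is not its code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write "<object id> <generation number> R" into out.
   Returns the number of characters that the full text needs
   (as snprintf does), excluding the terminating NUL. */
static int toy_format_reference(char *out, size_t cap,
                                int object_id, int generation)
{
    assert(out != NULL && cap > 0);
    return snprintf(out, cap, "%d %d R", object_id, generation);
}

/* Write a dictionary entry such as "/Metadata 12 0 R" into out. */
static int toy_format_entry(char *out, size_t cap, const char *key,
                            int object_id, int generation)
{
    assert(out != NULL && cap > 0 && key != NULL);
    return snprintf(out, cap, "/%s %d %d R", key, object_id, generation);
}
```

So an object with ID 12 and generation 0, stored under the Metadata key, would be written as "/Metadata 12 0 R".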
Finally we return PDTaskDone to signal that we're finished:
Put together, this is what it all looks like:
You can check out a dissection of a diff resulting from a tiny PDF when piped using this program on the Add metadata diff example page.
In the next part we'll be conditionally replacing or inserting metadata depending on whether it exists or not. There are a couple of ways of doing this, such as always deleting the current metadata object and putting in a new one, but we're going to replace or insert.
Next up: Replacing metadata