Gotcha when editing nodes programmatically

Now this one is pretty annoying too.

I'm writing a module that imports data into nodes. Sometimes this data has already been imported, and I only want to update existing rows with changes, if any.

After pulling in a row of data from the import file, I see if it already exists as a node in my database. If it does, I grab the node ID and load the existing node:

$node = node_load($existing_nid);

I then do some magical stuff and save the node using the standard:

node_validate($node);
// check for form errors, etc...
$node = node_submit($node);
node_save($node);

The problem is that after doing this, the $node->created timestamp is always equal to time(). I don't want to reset the created timestamp for existing nodes; it makes no sense.

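After poking around in node.module, I find that node_submit() pulls in the string version of the timestamp as $node->date. It then converts it back to a Unix timestamp before assigning the value back to $node->created. A node fresh from node_load() has no $node->date set, so there is nothing to convert and the original created time is lost.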
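To make the failure mode concrete, here is a rough stand-in for that conversion, written in plain PHP so it runs without Drupal. The function sketch_node_submit() is hypothetical and only models the behaviour I observed; it is not the actual node.module code.

```php
<?php
// Hypothetical stand-in for the date handling observed in node_submit().
// This is NOT the real Drupal code, only a model of the behaviour.
function sketch_node_submit($node) {
  if (!empty($node->date)) {
    // A form submission supplies a date string; convert it back.
    $node->created = strtotime($node->date);
  }
  else {
    // No date string (e.g. a node straight from node_load()):
    // the old created value is dropped and "now" is used instead.
    $node->created = time();
  }
  return $node;
}

// A "loaded" node with an old created timestamp and no ->date property.
$node = new stdClass();
$node->created = 1000000000; // 2001-09-09 01:46:40 UTC

$node = sketch_node_submit($node);

// TRUE: the original created timestamp has been overwritten.
$created_was_reset = ($node->created != 1000000000);
```

The only point of the sketch is that the else branch is the one a programmatic update hits unless $node->date is filled in first.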

The fix is to set this property before submitting the node:

$node->date = format_date($node->created, 'custom', 'Y-m-d H:i:s O');
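As a quick sanity check on why that format string works, the round trip can be tried in plain PHP. This is only a sketch: date() stands in for format_date() and strtotime() for the conversion done on submit, and the time zone is an arbitrary one chosen for the example.

```php
<?php
// Arbitrary non-UTC zone, picked only to show the offset matters.
date_default_timezone_set('America/New_York');

$created = 1000000000; // the stored Unix timestamp

// Same format string as the fix above; 'O' appends the UTC offset.
$as_string = date('Y-m-d H:i:s O', $created);

// Because the offset is part of the string, parsing it yields the
// original timestamp regardless of the server's time zone.
$round_trip = strtotime($as_string);

$preserved = ($round_trip === $created);
```

Dropping the O from the format would make the result depend on whichever time zone strtotime() assumes, which is why I keep it in.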

And that's just plain ugly and a real time-wasting gotcha.

I understand that the form needs to contain a string and that the submission process requires a Unix timestamp, but it is this kind of thing that makes it extremely clumsy to work with nodes programmatically using a pseudo form submission process.

It would be quite nice if a future version of Drupal provided developers with a separate API for managing nodes programmatically, so that a lot of this form ugliness is hidden from the programmer and all the appropriate callbacks are run when a node is managed outside an actual web form. It's very easy to skip the node_validate() and node_submit() steps. I'm also not sure if proper permission checks are done when handling nodes this way.