Identifying Importer Posts

Given a site with 100 posts, where an unspecified number of posts are manually written, and the rest are created using the WordPress importer, how would I programmatically identify the posts imported without having access to remote sites or the original import file?

E.g. was this post created by the Importer tool?


Three things I could imagine:

  • Check the post_modified value. Maybe import creates a definitive timestamp that you could use. You’ll still have to save the import date somewhere so you can check against it.
  • I do some post importing via a stream/HTTP response (this is not the native importer). During the import, I assign my SysBot plugin user to every post whose author does not exist in the installation. This lets me use author archives, filter the admin post list table, etc., which makes those posts very convenient to check. It also lets reviewed posts leave the “imported but not touched yet” queue, since the author can be changed. You might also be able to do this with the native importer by hooking the 'import_allow_create_users' filter with a callback of __return_false.
  • Hook into the WP importer and attach a meta value during import. There’s a hook named 'import_post_meta' that fires for every imported post meta key.
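
A minimal sketch of the last two ideas, assuming the native WordPress Importer plugin is the one running and that it fires 'import_post_meta' with the post ID, meta key and meta value. The marker key `_imported_marker` is a made-up name for illustration, not something the importer sets itself. Note that a post without any meta in the import file would never trigger this hook, so treat it as a starting point only.

```php
<?php
// Stop the importer from offering to create new users, so that imported
// posts end up assigned to an existing (e.g. dedicated "bot") user.
add_filter( 'import_allow_create_users', '__return_false' );

// Tag every post for which the importer writes at least one meta entry.
// '_imported_marker' is a hypothetical key chosen for this example.
add_action( 'import_post_meta', function ( $post_id, $key, $value ) {
    if ( '' === get_post_meta( $post_id, '_imported_marker', true ) ) {
        update_post_meta( $post_id, '_imported_marker', current_time( 'mysql' ) );
    }
}, 10, 3 );
```

Posts tagged this way could later be listed with a plain meta query on that key.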

If you’re hooking into the native importer plugin, you’d need to start at the 'import_start' hook. The last hook in the process (where you might want to check whether everything went fine) is the 'import_end' hook.
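
As a rough sketch of using those two hooks to "save the import date somewhere": the option names below are invented for this example, and whether any post date column actually reflects the time of the import is something you would have to verify on your own data.

```php
<?php
// Record when an import run begins. 'my_import_started' is a hypothetical option name.
add_action( 'import_start', function () {
    update_option( 'my_import_started', current_time( 'mysql' ) );
} );

// Record when the run finishes, so the pair describes the import window.
// 'my_import_finished' is likewise a made-up name.
add_action( 'import_end', function () {
    update_option( 'my_import_finished', current_time( 'mysql' ) );
} );
```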


I just came across a class I didn’t even know existed (in core, not in the WP Importer plugin): WP_Importer. This class has a method named get_imported_posts().

So in theory you could do the following:

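// WP_Importer lives in wp-admin and is not loaded on the front end.
require_once ABSPATH . 'wp-admin/includes/class-wp-importer.php';

$importer = new WP_Importer();
$importer_name = '???';
$bid = get_current_blog_id();
$imported_posts = $importer->get_imported_posts( $importer_name, $bid );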

The meta key that gets searched for is built from those two values, the importer name and the blog ID:

$meta_key = $importer_name . '_' . $bid . '_permalink';

So there seems to be a unique trace that one can follow.
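
If you’d rather go through the regular query API than through WP_Importer, something like the following might work. It is only a sketch: `find_posts_by_import_key()` is a helper name invented here, and it assumes your importer really wrote a meta key of the shape shown above, which is not confirmed for every importer.

```php
<?php
/**
 * Hypothetical helper: return the IDs of posts that carry the
 * "{importer}_{blog id}_permalink" meta key described above.
 */
function find_posts_by_import_key( $importer_name, $blog_id ) {
    $meta_key = $importer_name . '_' . $blog_id . '_permalink';

    return get_posts( array(
        'post_type'      => 'any',
        'post_status'    => 'any',
        'posts_per_page' => -1,
        'fields'         => 'ids',     // only IDs are needed here
        'meta_key'       => $meta_key, // matches posts where this key exists
    ) );
}
```

You would call it with the importer name once you know it, plus get_current_blog_id().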

Update #2

An example query you could run would be:

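global $wpdb;

// "_" is a single-character wildcard in LIKE, so escape it via esc_like().
$like = '%' . $wpdb->esc_like( '_permalink' ) . '%';
var_dump( $wpdb->get_results( $wpdb->prepare( "
    SELECT post_id, meta_key, meta_value
    FROM {$wpdb->postmeta}
    WHERE meta_key LIKE %s
", $like ) ) );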

This should bring up all matching rows and make it easier to determine the “importer_name”.

Edit #3

There’s get_importers(), which you can dump to see which importers are registered (via register_importer()). This should help identify the actual importer name.
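
A small sketch of such a dump, assuming it runs in an admin context. As far as I know, many importer plugins only register themselves when the import screen is being loaded, so the list may well be empty elsewhere.

```php
<?php
// get_importers() is defined in an admin include, so load it explicitly.
require_once ABSPATH . 'wp-admin/includes/import.php';

// Each entry is keyed by the importer ID; the value holds name,
// description and callback in that order. Cast in case nothing is registered.
foreach ( (array) get_importers() as $id => $data ) {
    printf( "%s: %s\n", $id, $data[0] );
}
```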

While I am not aware of import doing anything explicit to mark the posts, the possible indicators are:

  • guid field, if posts weren’t imported from same domain (or some other differentiation)
  • ID field, which is suggested by import_id during import and can result in a possibly detectable block of IDs that are not sequential with the natively created ones
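
The guid idea could be checked roughly like this. It is a sketch under the assumption that imported posts kept a guid pointing at the original site and that this site’s own posts use the current home URL host. Both can be false, for instance after a domain change.

```php
<?php
global $wpdb;

// Host of the current site, used as the "native" reference.
$home_host = wp_parse_url( home_url(), PHP_URL_HOST );

$rows = $wpdb->get_results(
    "SELECT ID, guid FROM {$wpdb->posts} WHERE post_type = 'post'"
);

// Collect posts whose guid points at a different host.
$foreign_ids = array();
foreach ( $rows as $row ) {
    $guid_host = wp_parse_url( $row->guid, PHP_URL_HOST );
    if ( $guid_host && $guid_host !== $home_host ) {
        $foreign_ids[] = (int) $row->ID;
    }
}

var_dump( $foreign_ids );
```

Sorting the same result by ID would also show whether the suspected imports form a contiguous block.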