I have a lot of content on my site that's marked up with either the blockquote element or the q element. Almost every one of those elements has a cite attribute, and each cite attribute's value is the URL of the quoted source. (I use a small filter module that doesn't actually display the element in browsers, since browsers display them differently.) I'd like to have a database of all the URLs so that I can run some views on it:

    • most frequently quoted domain

    • most frequently quoted full URL

    • a table of all the quotes with links to the nodes
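As a rough sketch of what those three views would compute, here is a small Python example. It assumes the index is just a list of (nid, URL) pairs; the sample rows, the function names, and the /node/<nid> link pattern are all made up for illustration and are not taken from any existing module.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical sample rows: (nid, cited URL) pairs as they might be stored.
rows = [
    (1, "http://example.com/a"),
    (2, "http://example.com/a"),
    (3, "http://example.com/b"),
    (3, "http://example.org/x"),
]

def top_domains(rows, n=10):
    """Most frequently quoted domains, counted by the host part of each URL."""
    counts = Counter(urlparse(url).netloc.lower() for _, url in rows)
    return counts.most_common(n)

def top_urls(rows, n=10):
    """Most frequently quoted full URLs."""
    return Counter(url for _, url in rows).most_common(n)

def quotes_table(rows):
    """One row per quote: the cited URL and a link to the quoting node.

    The /node/<nid> path is an assumption about how nodes are addressed.
    """
    return [(url, f"/node/{nid}") for nid, url in rows]
```

Counting by host rather than by full URL is what separates the first view from the second; whether "www.example.com" and "example.com" should count as the same domain is a decision left open here.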

Database schema


On node save: index the quotes in the individual node. Save nid and URL to the database.
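The node-save step could look something like the following Python sketch, which uses only the standard library. The table name quote_index, its two columns, and the function names are assumptions for illustration; the only things taken from the notes above are that the nid and the URL get saved, and that the URL comes from the cite attribute of blockquote and q elements.

```python
import sqlite3
from html.parser import HTMLParser

class CiteCollector(HTMLParser):
    """Collects cite attribute values from blockquote and q elements."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in ("blockquote", "q"):
            cite = dict(attrs).get("cite")
            if cite:  # skip elements with a missing or empty cite
                self.urls.append(cite)

def extract_cites(html):
    """Return every cite URL found in the given HTML, in document order."""
    parser = CiteCollector()
    parser.feed(html)
    parser.close()
    return parser.urls

def index_node(conn, nid, body):
    """Replace the stored quotes for one node with those found in its body."""
    # Hypothetical schema: one row per quote, holding only nid and URL.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS quote_index "
        "(nid INTEGER NOT NULL, url TEXT NOT NULL)"
    )
    # Delete first so that re-saving a node does not duplicate its rows.
    conn.execute("DELETE FROM quote_index WHERE nid = ?", (nid,))
    conn.executemany(
        "INSERT INTO quote_index (nid, url) VALUES (?, ?)",
        [(nid, url) for url in extract_cites(body)],
    )
    conn.commit()
```

Deleting a node's rows before inserting keeps the index in step with edits: a quote removed from the node disappears from the index on the next save.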

On cron: process a fixed number (x) of nodes per run in the same way, so that existing content gets indexed gradually.
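One way to do the cron pass is to remember the highest nid indexed so far and pick up from there on the next run. The Python sketch below shows only that bookkeeping; process_batch, its parameters, and the idea of storing a last-nid cursor are assumptions, not part of any existing API.

```python
def process_batch(all_nids, last_nid, batch_size, index_one):
    """Index up to batch_size nodes whose nid is greater than last_nid.

    index_one is a caller-supplied function that indexes a single node.
    Returns the new cursor (the highest nid processed), or last_nid
    unchanged when nothing is left to do.
    """
    pending = sorted(n for n in all_nids if n > last_nid)[:batch_size]
    for nid in pending:
        index_one(nid)
    return pending[-1] if pending else last_nid
```

Each cron run would call this once with the stored cursor and then save the returned value for the next run.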

Integrate with Search