Cite
A lot of the content on my site is marked up with either the blockquote element or the q element. Almost every one of those elements has a cite attribute, and each cite attribute's value is the URL of the quoted source. (I use a small filter module that doesn't actually display the element in browsers, since browsers display them differently.) I'd like to have a database of all these URLs so that I can build some views on it:
most frequently quoted domain
most frequently quoted full URL
a table of all the quotes with links to the nodes
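As a rough illustration of what those three views would compute, here is a minimal Python sketch. It assumes the index boils down to (nid, URL) pairs; the sample rows, the function names, and the node/<nid> path format are assumptions made for the example, not part of any existing module.

```python
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical sample data: (nid, cited URL) pairs as the index might store them.
rows = [
    (1, "http://example.com/a"),
    (2, "http://example.com/a"),
    (2, "http://example.org/b"),
    (3, "http://example.com/c"),
]


def top_domains(rows, n=1):
    """Return the n most frequently quoted domains with their counts."""
    domains = (urlsplit(url).netloc.lower() for _, url in rows)
    return Counter(domains).most_common(n)


def top_urls(rows, n=1):
    """Return the n most frequently quoted full URLs with their counts."""
    return Counter(url for _, url in rows).most_common(n)


def quote_table(rows):
    """Return (url, node path) pairs, one per quote, ordered by nid.

    The "node/<nid>" path format is an assumption for this sketch.
    """
    return [(url, f"node/{nid}") for nid, url in sorted(rows)]


print(top_domains(rows))  # [('example.com', 3)]
print(top_urls(rows))     # [('http://example.com/a', 2)]
```

In a real site these would more likely be database queries or Views configurations; the sketch only pins down what each view counts.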
Database schema
Hooks
On node save: index the quotes in that individual node. Save the nid and each cited URL to the database.
On cron: process a batch of x nodes in the same way.
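The two steps above could be sketched roughly as follows. This is a Python illustration rather than actual hook code: the cite table with nid and url columns, the function names, and the use of SQLite are all assumptions chosen to keep the example self-contained.

```python
import sqlite3
from html.parser import HTMLParser


class CiteCollector(HTMLParser):
    """Collect the cite attribute of every blockquote and q element."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag in ("blockquote", "q"):
            cite = dict(attrs).get("cite")
            if cite:  # skip elements with a missing or empty cite attribute
                self.urls.append(cite)


def extract_cites(html):
    """Return the cited URLs found in one node's HTML body."""
    parser = CiteCollector()
    parser.feed(html)
    parser.close()
    return parser.urls


def create_schema(conn):
    # Assumed minimal schema: one row per quote, keyed by node id.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS cite (nid INTEGER NOT NULL, url TEXT NOT NULL)"
    )


def index_node(conn, nid, html):
    """'On node save': replace the stored quotes for a single node."""
    conn.execute("DELETE FROM cite WHERE nid = ?", (nid,))
    conn.executemany(
        "INSERT INTO cite (nid, url) VALUES (?, ?)",
        [(nid, url) for url in extract_cites(html)],
    )
    conn.commit()


def index_batch(conn, nodes, limit):
    """'On cron': index at most `limit` (nid, html) pairs; return the count."""
    batch = nodes[:limit]
    for nid, html in batch:
        index_node(conn, nid, html)
    return len(batch)
```

Deleting a node's old rows before inserting keeps the table in step with edits that remove or change a quote. A real cron pass would also need to remember which nodes it has already processed, which this sketch leaves out.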
Integrate with Search