Database Maintenance Of Your Blog

0 comments suggest edit

Lately I have been spending a little bit of time performing maintenance tasks on my blog’s SQL Server. I noticed that OdeToCode’s Scott Allen was in the same mood.

In looking to free up some database space, he took the somewhat drastic step of deleting all referrals and urls before a certain date. Since he doesn’t care about this data, it isn’t really all that drastic. But it makes a data packrat like me shudder. I wanted to free up some space as well so I created an approach that frees up a lot of space, but keeps the data I care about.

Around 90% to 99% of my referrals are from web searches and online blog readers. As a matter of fact, nearly all of these are from Google. Since those referrals are not that important to me as WebHost4Life also tracks that data, I wrote a script to delete them for Subtext. Note that the following SQL script is pretty aggressive, so use at your own risk. You might even think of some search strings that I missed.

DELETE FROM subtext_Referrals

WHERE UrlID IN

(

  SELECT UrlID

  FROM subtext_URLs

  WHERE Url LIKE ‘http://google.%’

    OR Url LIKE ‘http://%.yahoo.%’

    OR Url LIKE ‘http://yahoo.%’

    OR Url LIKE ‘%/Search/%’

    OR Url LIKE ‘%/Search?%’

    OR Url LIKE ‘http://search.%’

    OR Url LIKE ‘http://bloglines.%’

)

 

DELETE FROM subtext_URLs

WHERE Url LIKE ‘http://google.%’

  OR Url LIKE ‘http://%.yahoo.%’

  OR Url LIKE ‘http://yahoo.%’

  OR Url LIKE ‘%/Search/%’

  OR Url LIKE ‘%/Search?%’

  OR Url LIKE ‘http://search.%’

  OR Url LIKE ‘http://bloglines.%’

I then ran the same commands that Scott did after reading his post.

DBCC DBREINDEX(subtext_URLs)

DBCC DBREINDEX(subtext_Referrals)

DBCC SHRINKDATABASE(SubtextData)

In order to run those commands on .TEXT, just replace the “subtext_” prefix with “blog_” and you are set.

Now I haven’t tested this, but I imagine the corresponding script for Community Server would be the following based on its published schema.

DELETE FROM cs_Referrals

WHERE UrlID IN

(

    SELECT UrlID FROM cs_Urls

    WHERE Url LIKE ‘http://google.%’

      OR Url LIKE ‘http://%.yahoo.%’

      OR Url LIKE ‘http://yahoo.%’

      OR UrlLIKE ‘%/Search/%’

      OR Url LIKE ‘%/Search?%’

      OR Url LIKE ‘http://search.%’

      OR Url LIKE ‘http://bloglines.%’

)

 

DELETE FROM cs_Urls

WHERE Url LIKE ‘http://google.%’

  OR Url LIKE ‘http://%.yahoo.%’

  OR Url LIKE ‘http://yahoo.%’

  OR UrlLIKE ‘%/Search/%’

  OR UrlLIKE ‘%/Search?%’

  OR UrlLIKE ‘http://search.%’

  OR UrlLIKE ‘http://bloglines.%’

Found a typo or error? Suggest an edit! If accepted, your contribution is listed automatically here.

Comments

avatar

5 responses

  1. Avatar for jayson knight
    jayson knight February 6th, 2006

    Roger that on the CS scripts...those should work. Regardless, the bulk of the shrinkage (all seinfeldisms aside) will be done by by the dbcc shrinkdatabase command; I seriously doubt either the *_urls or *_referrals tables commit that much to the db size as the column sizes are pretty small, unless you literally have hundreds of thousands of rows.



    It's also worth mentioning that (IIRC) dbcc reindex is logged, so you most definitely should issue a dbcc shrinkdatabase (or truncate the log file by some other means) after reindexing any table.

  2. Avatar for Keyvan Nayyeri
    Keyvan Nayyeri February 6th, 2006

    As users don't want referrals for long time and SubText users are personal, I think that it's better if you add new section for this purpose and let users to clear their referrals.

  3. Avatar for Haacked
    Haacked February 6th, 2006

    Jayson, you'd be surprised at how big my subtext_URLs table got. It was 70,000K and shrunk down to 7,000K. It is already back up to 9,000K.



    The culprit is the URL column which is an nvarchar 255. I get so many Google searches that the sucker fills up.



    Keyvan, I think we'll probably want to do two things. Provide the ability to clean up referrals as part of the admin tool and add a plug-in to filter incoming referrals.

  4. Avatar for jayson knight
    jayson knight February 7th, 2006

    Sweet jesus Phil, that's a lot of Google traffic. Back in the .Text days I used to get a shedload of Google traffic as well; ever since moving to CS things have really died down. I think I might have figured out why and am actually going to post about it soon for some feedback.

  5. Avatar for Keyvan Nayyeri
    Keyvan Nayyeri July 1st, 2006


    In Part 1 I discussed about some tips to disable database storage for files to reduce the database...