by Steve Marx via Steve Marx's blog on 11/26/2008 2:49:23 AM
As promised, today I added a worker role to asynchronously process comments and attempt to detect spam, and I invite you to test it out! See the bottom of this post for details.
Here’s a flow diagram I drew on my whiteboard:
The steps are:
Note that after step (3), the synchronous portion is done, so the website remains responsive. (No need to wait for the spam check, which I consider potentially slow, despite it being quite speedy in practice.) The IsSpam property defaults to false, so the comment shows up right away, providing immediate feedback that comment submission succeeded.
The big advantage of this architecture is loose coupling. Because the spam check is asynchronous, the blog itself can continue to function without it. That means that if TypePad AntiSpam has downtime (or my worker role has a bug in it), normal use of my blog won’t be disrupted. It also means that if I later plug in a more sophisticated (and slower) analysis, I don’t have to worry about my comment form responding slowly or my front-end getting bogged down.
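To make the decoupling concrete, here is a rough in-memory sketch of the flow. It is not the blog's actual code: Comment, CommentPipeline, and the spamCheck delegate are hypothetical stand-ins for the table, the queue, and the TypePad AntiSpam call, and everything runs in a single process just to show the ordering of events.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;

// Hypothetical in-memory stand-ins; none of these types come from the blog's code.
public class Comment
{
    public string PartitionKey { get; set; } = "";
    public string RowKey { get; set; } = "";
    public string Author { get; set; } = "";
    public bool IsSpam { get; set; } // defaults to false, so new comments are visible
}

public class CommentPipeline
{
    private readonly Dictionary<string, Comment> table = new Dictionary<string, Comment>();
    private readonly ConcurrentQueue<string> queue = new ConcurrentQueue<string>();
    private readonly Func<Comment, bool> spamCheck;

    public CommentPipeline(Func<Comment, bool> spamCheck)
    {
        this.spamCheck = spamCheck;
    }

    // Synchronous part (web role): store the comment, enqueue its key, return.
    public void Submit(Comment comment)
    {
        var key = comment.PartitionKey + "/" + comment.RowKey;
        table[key] = comment;
        queue.Enqueue(key);
    }

    // Asynchronous part (worker role): process one queued key, if any.
    // Returns false when the queue is empty.
    public bool ProcessNext()
    {
        if (!queue.TryDequeue(out var key)) return false;
        if (table.TryGetValue(key, out var comment) && spamCheck(comment))
        {
            comment.IsSpam = true;
        }
        return true;
    }

    // A comment is shown unless the worker has flagged it.
    public bool IsVisible(string partitionKey, string rowKey)
    {
        return table.TryGetValue(partitionKey + "/" + rowKey, out var c) && !c.IsSpam;
    }
}
```

If ProcessNext is never called (the worker is down), Submit still succeeds and the comment stays visible. That is the property the paragraph above relies on.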
I can also scale the roles differently. I’m using two instances of the web role right now, but there’s no need for more than one worker role, since the incoming rate on comments is less than 50 comments an hour.
In my last post, I described the changes I made to my data model and the blog code. One additional change is a one-liner to enqueue work when a comment has been stored:
    QueueStorage.Create(StorageAccountInfo.GetDefaultQueueStorageAccountFromConfiguration())
        .GetQueue("commentqueue")
        .PutMessage(new Message(string.Format("{0}/{1}", comment.PartitionKey, comment.RowKey)));
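The message body is just the comment's partition key and row key joined by a slash. As a small sketch, the encoding and decoding could be pulled into one place so the web role and the worker agree on the format. CommentMessage and its methods are hypothetical names, not part of the blog's code or the storage library, and the sketch assumes neither key contains a '/' (Azure table storage does not allow that character in key values).

```csharp
using System;

// Hypothetical helper; the blog's code builds and splits this string inline.
public static class CommentMessage
{
    // Builds the "partition/row" body that gets placed on the queue.
    public static string Encode(string partitionKey, string rowKey)
    {
        return string.Format("{0}/{1}", partitionKey, rowKey);
    }

    // Returns false instead of throwing when the body is not "partition/row".
    public static bool TryDecode(string body, out string partitionKey, out string rowKey)
    {
        partitionKey = string.Empty;
        rowKey = string.Empty;
        if (string.IsNullOrEmpty(body)) return false;

        var split = body.Split('/');
        if (split.Length != 2 || split[0].Length == 0 || split[1].Length == 0) return false;

        partitionKey = split[0];
        rowKey = split[1];
        return true;
    }
}
```

With a helper like this, a malformed message makes TryDecode return false, so the consumer can delete it and move on rather than hit an index-out-of-range exception.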
The worker role code is quite simple. To talk to TypePad AntiSpam, I used the Akismet .Net 2.0 API project on Codeplex, with a minor change to point to TypePad AntiSpam instead. This is nearly all of the code from the worker (omitting the function which converts my comment object to an AkismetComment):
    public override void Start()
    {
        var q = QueueStorage.Create(StorageAccountInfo.GetDefaultQueueStorageAccountFromConfiguration())
            .GetQueue("commentqueue");
        var akismet = new Akismet("<KEY DELETED>", "http://blog.smarx.com", "blog.smarx/2");
        if (!akismet.VerifyKey())
        {
            throw new ArgumentException("Invalid key.");
        }
        var svc = new BlogDataServiceContext();
        while (true)
        {
            var msg = q.GetMessage();
            if (msg != null)
            {
                var split = msg.ContentAsString().Split('/');
                var partitionkey = split[0];
                var rowkey = split[1];
                var comment = (from c in svc.BlogCommentTable
                               where c.PartitionKey == partitionkey && c.RowKey == rowkey
                               select c).FirstOrDefault();
                if (comment != null)
                {
                    var akismetComment = GetAkismetComment(comment);
                    if (akismetComment == null)
                    {
                        // comment is for a non-existent blog entry
                        comment.IsSpam = true;
                        svc.UpdateObject(comment);
                    }
                    else if (akismet.CommentCheck(akismetComment))
                    {
                        comment.IsSpam = true;
                        svc.UpdateObject(comment);
                    }
                    svc.SaveChanges();
                }
                q.DeleteMessage(msg);
            }
            else
            {
                Thread.Sleep(1000);
            }
        }
    }
You can test this out yourself. Post a comment with the author “viagra-test-123” and watch it disappear within a couple seconds. (This string is hard-coded in Akismet and TypePad AntiSpam to be a spam indicator.)
Original Post: Windows Azure Worker Role to Deal with Spam