MyBlogLog Crawlers & Bots
Over the last several weeks, we have noticed a dramatic increase in the number of bots crawling MyBlogLog.com, collecting data on our members, and trying to position avatars at the top of Reader Rolls.
In a few cases, the volume of requests from these bots has had the effect of a Denial of Service (DoS) attack -- not only on MyBlogLog.com but also on our member sites. This cannot continue. If you are crawling this site, please stop.
To address this problem in the short term, we are implementing new server monitoring and will ban anyone requesting more than 1,000 pages an hour from the server. We will also be updating our Terms of Service to make scraping/gaming of site content against the rules. Beyond that, it's for the lawyers to resolve, and nobody wants that.
That said, we have a bunch of new ways to get MyBlogLog data coming in the next several weeks. While we can't talk about them now, please stay tuned to this blog.
Thanks,
Todd
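Editor's note: the post does not say how the new monitoring works. As a rough illustration only, a per-reader hourly limit like the one described could be sketched with a sliding window of request timestamps. The class name, the reader identifier, and the choice to keep counting requests that were refused are all assumptions, not MyBlogLog's actual implementation.

```python
from collections import defaultdict, deque

WINDOW_SECONDS = 3600  # one hour
MAX_REQUESTS = 1000    # threshold quoted in the post


class HourlyLimiter:
    """Hypothetical sliding-window counter, one window per reader."""

    def __init__(self, max_requests=MAX_REQUESTS, window=WINDOW_SECONDS):
        self.max_requests = max_requests
        self.window = window
        # reader id -> timestamps (in seconds) of that reader's recent requests
        self.hits = defaultdict(deque)

    def allow(self, reader_id, now):
        """Record a request at time `now`; return False once the reader is over the limit."""
        timestamps = self.hits[reader_id]
        # Forget requests that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        # Refused requests are recorded too, so a bot that keeps hammering stays refused.
        timestamps.append(now)
        return len(timestamps) <= self.max_requests
```

A caller would invoke `allow(reader_id, time.time())` on every page request and ban or throttle the reader when it returns False.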

Great news, Todd. I have noticed increased bot traffic on my sites. Wonder if it is related to this issue.
Glad you guys are on top of it.
Posted by: Josh Lane | July 05, 2007 at 03:43 PM
Truly too bad about the bots ... hope it all gets sorted out without having to involve lawyer-types :(
Posted by: WebUrbanist | July 05, 2007 at 07:42 PM
Hahaa. These bots/crawlers remind me of earlier today, when I looked at my StatCounter to see where the small amount of traffic to my blog was coming from. It was mostly from my MyBlogLog page.
Posted by: Tsewang Rinzin | July 05, 2007 at 09:25 PM
good luck with it - it's an ongoing battle, what with proxies and spoofers and the various tools these people employ to get past such stuff :(
Posted by: rob | July 06, 2007 at 12:28 AM
I hope somebody in-house is tracking MBL's struggle to create a networking / sharing hub that isn't swamped by bots, spammers and the SEO crowd. I don't know if you guys will prevail, but I think the twists and turns would make a wonderful and revealing article or book someday.
Posted by: Wendell | July 06, 2007 at 04:30 AM
Hey hardworking MyBlogLog Team, You are not alone in this headache! Such has also hit myspace: "if you got locked for being phished in the last hour, it's a bug! we're working on fixing it right now. sorry about that! the mechanism to detect if your account has been phished has been very powerful for stopping phishing, but it went a little haywire just now. things should be back to normal soon!" Meanwhile; for the MyBlogLog Team and all annoyed by this - here are a few banana aspirins for everyone!
Posted by: ndpthepoetress | July 06, 2007 at 10:12 AM
Interesting development. Since I'm involved only with MyBlogLog and one other 'community' site, Fuelmyblog, I was starting to get suspicious about why I've had a radical increase in spam to my site mailbox. I've had it going for over three years, and NEVER had the amount of spam that's been hitting me over the last 8 days.
Still think 1000 hits an hour is a bit high. Are those hits on members' blog sites that pull the widget, MyBlogLog page views, or a combination of both?
Posted by: JohnC | July 06, 2007 at 04:45 PM
John -- the 1,000 hits an hour is per reader, not per site. We have lots of sites pulling hundreds of thousands of page views a day and that's all good. We're referring to users who keep reloading pages and hitting every blog in the system in order to show up at the top of the reader roll.
Hope that clarifies!
Posted by: Eric Marcoullier | July 06, 2007 at 05:29 PM
Sorry, should have been more specific.
Are the hits mentioned counted when a person visits a site/blog that includes the MBL widget, when a person visits an MBL member/community page, or are both types of visits considered hits?
Separate issue... on the gamers, why not keep a count of the number of blogs with an MBL widget that an MBL member hits? If they hit 10 such blogs a minute, in any two minutes out of five, don't allow their browser to pull the MBL widget for the next hour.
Posted by: JohnC | July 06, 2007 at 11:29 PM
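Editor's note: JohnC's suggestion above could be sketched roughly as follows. This is one reading of it, under stated assumptions: visits are bucketed by clock minute, "10 blogs a minute" means 10 distinct widget-bearing blogs, and the class and method names are hypothetical.

```python
from collections import defaultdict

BLOGS_PER_MINUTE = 10  # distinct widget blogs that make a minute "busy"
BUSY_MINUTES = 2       # busy minutes needed to trigger a block
LOOKBACK_MINUTES = 5   # size of the window in which busy minutes are counted
BLOCK_SECONDS = 3600   # withhold the widget for the next hour


class WidgetGate:
    """Hypothetical gate deciding whether to serve the widget to a member."""

    def __init__(self):
        # member -> clock minute -> set of blogs visited in that minute
        self.per_minute = defaultdict(lambda: defaultdict(set))
        self.blocked_until = {}

    def visit(self, member, blog, now):
        """Record a visit at time `now` (seconds); return True if the widget should load."""
        if now < self.blocked_until.get(member, 0):
            return False
        minute = int(now // 60)
        minutes = self.per_minute[member]
        minutes[minute].add(blog)
        # Keep only the last LOOKBACK_MINUTES clock minutes.
        for old in [m for m in minutes if m <= minute - LOOKBACK_MINUTES]:
            del minutes[old]
        busy = sum(1 for blogs in minutes.values() if len(blogs) >= BLOGS_PER_MINUTE)
        if busy >= BUSY_MINUTES:
            self.blocked_until[member] = now + BLOCK_SECONDS
            return False
        return True
```

Counting distinct blogs rather than raw hits means a reader who reloads one blog repeatedly is not penalized by this particular rule; that case would need a separate check.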
I'm really new to this. What on earth is bot traffic?
Posted by: Jill Alexander | July 07, 2007 at 07:42 AM
I read about the increased number of bots all over the place. The last thing I read was about how Yahoo is sending out hundreds of spiders now, and apparently ticking off a bunch of forum owners because of the increased load it has put on their servers, etc. The Invision Power Board admin team seems to be launching some sort of campaign against Yahoo over it.
Might want to jump over there and see what they know about the subject.
Posted by: paintchip | July 08, 2007 at 04:43 PM
I know a little about the bot traffic, but the numbers are impressive. Wow, how fine it would be if these clicks turned into responses to our posts, into a living conversation. It is so hard to stay in silence and talk only to myself.
Posted by: Tomas | July 10, 2007 at 03:16 AM
Here's the problem, follow the math:
Start
MBL(Issue)=spider/unknown
If unknown=Yahoo
then
MBL(Issue)=spider/Yahoo
and
MBL(Issue)-(spider/Yahoo)=0
else
Issue=(spider/unknown)/MBL
End
If it was Yahoo, it's nothin'. If it wasn't Yahoo, then the Issue lies in MBL dividing the unknown spider into pieces.
Divide and conquer.
Posted by: JohnC | July 10, 2007 at 04:37 PM
Yes, I've noticed after embedding the MBL gadget in my new blog that it added 2 more seconds of load time to my site, which is A LOT! It's good you guys are catching it and striking back quickly. Keep it up!
Posted by: Early rising 101 | July 10, 2007 at 10:15 PM
Okay. I will pretend that I understand all this. Basically the problem is that bots that don't cause any problems for big sites can cause havoc with smaller sites that have fewer resources to handle the load. Seems like Yahoo and the like could ease off a little.
Posted by: MisterSteve | July 11, 2007 at 02:42 PM
Y'all are amazing. Knocking 'em out faster than Raid!!! LOL Keep up the great work :-) KUDOS.
Posted by: Goddess | July 11, 2007 at 04:25 PM
Boy I wish I knew what you were talking about...
Are you telling me that somebody has let a bunch of robots loose in the MyBlogLog offices???
Posted by: Rob Beland | July 12, 2007 at 12:57 PM
Rob, that's hilarious! LOL
Posted by: Lance | July 12, 2007 at 06:22 PM
Back on Tuesday, I wrote a blog post about the scraping of MyBlogLog that I've been doing. You can read a little bit about it on my blog, Orient Lodge:
http://www.orient-lodge.com/node/2365
Unlike the gaming that is a big issue and concern, my scraping is aimed at producing graphs of the interactions in MyBlogLog. I hope some of you have looked at my graphs and thought about their implications.
As I've noted in my blog posts, hopefully a good API will make that sort of scraping unnecessary. However, until an API is available, I would urge MyBlogLog not to completely ban scraping.
That said, the way I would approach the scraping and gaming is simply to put a limit on how often you get added as a 'Recent Reader'.
If someone spends less than ten or fifteen seconds looking at my site, then I don't consider them to have recently read my site.
So when someone visits a site, they shouldn't get added as a recent reader if they visited a different site within the past 10 or 15 seconds.
Posted by: Aldon Hynes | July 13, 2007 at 05:58 AM
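Editor's note: Aldon's rule above (no 'Recent Reader' credit if the same reader was on a different site within the past 10 or 15 seconds) could be sketched as below. It is a minimal illustration, assuming a 15-second gap and hypothetical names throughout; it says nothing about how MyBlogLog actually builds its reader rolls.

```python
MIN_GAP_SECONDS = 15  # assumed; the comment suggests 10 or 15 seconds


class RecentReaders:
    """Hypothetical reader-roll tracker that ignores rapid site-hopping."""

    def __init__(self, min_gap=MIN_GAP_SECONDS):
        self.min_gap = min_gap
        self.last_visit = {}  # reader -> (site, timestamp of the latest visit)
        self.rolls = {}       # site -> readers, most recent first

    def visit(self, reader, site, now):
        """Record a visit at time `now` (seconds); return True if the reader was added to the roll."""
        previous = self.last_visit.get(reader)
        self.last_visit[reader] = (site, now)
        if previous is not None:
            prev_site, prev_time = previous
            # Hopping from another site too quickly earns no Recent Reader credit.
            if prev_site != site and now - prev_time < self.min_gap:
                return False
        roll = self.rolls.setdefault(site, [])
        if reader in roll:
            roll.remove(reader)
        roll.insert(0, reader)
        return True
```

Because every visit updates the reader's last-seen time, a bot that hops between sites every few seconds is credited only for its first stop, while a person who lingers on each site is credited everywhere.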