
All content © NFGworld, unless otherwise noted, except for stuff we stole. Contact the editor-in-chief : baldbutsuave@thissitesdomain, especially if you are an attractive young female willing to do nude photography modelling. All rights reversed. 299

Author name (Administrator) #1
Member since May 2011 · 2485 posts · Location: Brisbane
Group memberships: Administrators, Members
Subject: I think I've just been scraped!
Now THIS is annoying.  From my server logs:

96.20.236.206 - [time] "GET /mb/search;nodef=1;DateFrom=1211851750;ResultView=1;Sort=2;title=3;?unb507ses HTTP/1.1" 200 14198 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"

96.20.236.206 - [time] "GET /mb/search;nodef=1;DateFrom=1212024551;ResultView=1;Sort=2;title=1;?unb507ses HTTP/1.1" 200 11053 "" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) Gecko/20060909 Firefox/1.5.0.7"

96.20.236.206 - [time] "GET /mb/search;nodef=1;DateFrom=1212024351;ResultView=1;Sort=2;title=1;?unb507ses HTTP/1.1" 200 11053 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"

96.20.236.206 - [time] "GET /mb/search;nodef=1;DateFrom=1211851558;ResultView=1;Sort=2;title=3;?unb507ses HTTP/1.1" 200 14198 "" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; InfoPath.1)"

Hundreds and thousands of these all in a row, from the same IP.  Why they're coming from 4 different applications I cannot say, but it's annoying!  People doing shit to my poor poor servers...  =(

The two things that annoy me most:
  • It's rapid-fire: these 2400+ requests came with barely a 2-second break between 'em, on average
  • They reported no fewer than four different clients (user-agent strings) to do it:
    • No client reported
    • .NET CLR 1.1.4322
    • .NET CLR 1.1.4322; InfoPath.1
    • Gecko/20060909 Firefox/1.5.0.7
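
For anyone who wants to tally this sort of thing themselves, here is a rough sketch in Python. It assumes the abbreviated log layout quoted above (IP, a bracketed time, request, status, size, referrer, user agent); the `summarize` helper and the regex are illustrative only, and a real server log would need the pattern adjusted to its own format.

```python
import re
from collections import Counter, defaultdict

# Matches the abbreviated layout quoted in this post:
#   IP - [time] "REQUEST" STATUS BYTES "REFERRER" "USER-AGENT"
# This pattern is an assumption; adapt it to your server's real log format.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) - \[(?P<time>[^\]]*)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"$'
)

def summarize(lines):
    """Count requests per IP and collect the distinct user agents each IP sent."""
    hits = Counter()
    agents = defaultdict(set)
    for line in lines:
        m = LOG_LINE.match(line.strip())
        if m is None:
            continue  # skip lines that do not fit the assumed layout
        hits[m["ip"]] += 1
        agents[m["ip"]].add(m["agent"])
    return hits, agents

# Two of the log lines quoted above, used as sample input.
sample = [
    '96.20.236.206 - [time] "GET /mb/search;nodef=1;DateFrom=1211851750;'
    'ResultView=1;Sort=2;title=3;?unb507ses HTTP/1.1" 200 14198 "" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322)"',
    '96.20.236.206 - [time] "GET /mb/search;nodef=1;DateFrom=1212024551;'
    'ResultView=1;Sort=2;title=1;?unb507ses HTTP/1.1" 200 11053 "" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.7) '
    'Gecko/20060909 Firefox/1.5.0.7"',
]

hits, agents = summarize(sample)
for ip, count in hits.most_common():
    print(ip, "requests:", count, "distinct user agents:", len(agents[ip]))
```

One address showing up with several unrelated user agents in quick succession is a decent hint that the strings are being faked or rotated, though it could also be several machines behind one shared connection.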

Is someone scraping me?  It seems like it...  BUT WHY?
BLEARGH
Author name #2
Member since Nov 2007 · 121 posts
Group memberships: Citizens, Denizens, Members
Wonder if this means that in among all the Wordsworth and Shakespeare interspersed with filthy words, we'll also see some NFG wisdom next time I search for filth.
Any way to protect yourself from this, or is it basically impossible if a browser can see your page?
"...either stop and think or fuck right off" (TheOutrider)
Author name (Administrator) #3
Member since May 2011 · 2485 posts · Location: Brisbane
Group memberships: Administrators, Members
There's basically nothing to do.  I could put up some sort of complicated system to analyze traffic, but it's way more trouble than it's worth at this time.
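
For the curious, the simplest version of "analyze traffic" is a sliding-window counter per IP. The sketch below is only that, a sketch: the `RateWatcher` class, the thresholds, and the timestamps are all made up for illustration, and nothing like it is running on this board.

```python
from collections import defaultdict, deque

class RateWatcher:
    """Flag an IP once it exceeds max_hits requests within `window` seconds.

    Hypothetical helper for illustration; the thresholds are arbitrary.
    """

    def __init__(self, max_hits=30, window=60.0):
        self.max_hits = max_hits
        self.window = window
        self._seen = defaultdict(deque)  # ip -> timestamps of recent requests

    def record(self, ip, timestamp):
        """Record one request; return True if this IP is now over the limit."""
        recent = self._seen[ip]
        recent.append(timestamp)
        # Drop requests that have fallen out of the window.
        while recent and timestamp - recent[0] > self.window:
            recent.popleft()
        return len(recent) > self.max_hits

watcher = RateWatcher(max_hits=20, window=60.0)

# Made-up traffic: one request every 2 seconds from a single address,
# roughly the pace described in the first post.
flags = [watcher.record("96.20.236.206", i * 2.0) for i in range(40)]
first_flagged = flags.index(True)
print("first flagged request index:", first_flagged)
```

What to do once an address is flagged (block it, slow it down, just log it) is a separate decision, and a limit that is too tight will also catch legitimate crawlers.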

I could block crawlers with a robots.txt file, but then I'd lose the massive benefits that Google visitors bring (they want my sexy j-toys).
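
A narrower option would be a robots.txt that only disallows the search URLs and leaves the rest of the board crawlable. The rules below are a guess at what that might look like, not something deployed here, and Python's standard `urllib.robotparser` is used just to check how a rule-following crawler would read them. A scraper that fakes its user agent probably ignores robots.txt altogether, so this mostly helps with well-behaved bots.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep every crawler out of the search pages only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /mb/search
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# How a crawler that honors robots.txt would treat two paths on the board.
search_ok = parser.can_fetch("Googlebot", "/mb/search;nodef=1")
board_ok = parser.can_fetch("Googlebot", "/mb/")

print("search pages allowed:", search_ok)
print("board index allowed:", board_ok)
```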
BLEARGH