[pLog-svn] Process size

Ayalon ayalon at blog.nl
Thu Dec 14 08:42:06 GMT 2006


Found it!

I searched the logs for the Yahoo bot. It looks like Yahoo is crawling a lot
of pages that don't exist.

for example blogname.mydomain.com/categoryname/year/month/date

Normal Yahoo requests go through without problems. But this page doesn't
exist in my environment, and requesting it seems to cause some kind of loop
in the script, which eventually uses a lot of memory in PHP, a lot of
buffering in Apache, etc. I looked in the PHP and Apache forums and bug
reports but couldn't find anything. Is there something wrong with the way
non-existent URLs are handled? In my logs, normal requests are handled
correctly. Non-existent pages, especially ones within subdirectories, are
causing these problems.

If you need more info, I'm happy to provide it. devel probably had the same
problem last weekend...
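
The log search described above can be sketched roughly as follows. This is
only an illustration, assuming the Apache "combined" log format and a crawler
whose User-Agent contains "Slurp" (the name Yahoo's crawler uses). The
function name, the sample lines, the IP addresses and the status codes are
made up for the example and are not taken from the real logs.

```python
import re
from collections import Counter

# Apache "combined" log format (an assumption; adjust to the real LogFormat).
LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def crawler_paths(lines, agent_marker="Slurp"):
    """Count (path, status) pairs requested by agents containing agent_marker."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and agent_marker.lower() in m.group("agent").lower():
            counts[(m.group("path"), m.group("status"))] += 1
    return counts


# Invented sample lines, for illustration only.
SAMPLE = [
    '192.0.2.10 - - [12/Dec/2006:00:01:40 +0100] '
    '"GET /categoryname/2006/12/14 HTTP/1.0" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '192.0.2.10 - - [12/Dec/2006:00:01:41 +0100] '
    '"GET /categoryname/2006/12/14 HTTP/1.0" 200 5120 "-" '
    '"Mozilla/5.0 (compatible; Yahoo! Slurp)"',
    '192.0.2.77 - - [12/Dec/2006:00:02:03 +0100] '
    '"GET /index.php HTTP/1.1" 200 20480 "-" '
    '"Mozilla/5.0 (Windows; U) Firefox/2.0"',
]

# List the crawler's most frequently requested paths first.
for (path, status), hits in crawler_paths(SAMPLE).most_common():
    print(hits, status, path)
```

To run it against a real log, pass an open file instead of SAMPLE, e.g.
crawler_paths(open("access.log")), where the file name is a placeholder.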


----- Original Message ----- 
From: "Ayalon" <ayalon at blog.nl>
To: <plog-svn at devel.lifetype.net>
Sent: Thursday, December 14, 2006 8:32 AM
Subject: RE: [pLog-svn] Process size


> Hi Oscar,
>
> Thanks for your email. But anyway, we're not allowing Apache to grow so
> big; it's PHP with a memory limit of 32 MB. If I try to run it with a lower
> value, the pages with comments (100+ comments) are not shown. So do I have
> another choice?
>
> Jon: I'll try to get the logs to show you what Yahoo is hitting.
> Jon2: Can you see anything on devel?
>
> -----Original Message-----
> From: plog-svn-bounces at devel.lifetype.net
> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Oscar Renalias
> Sent: Thursday, 14 December 2006 8:28
> To: plog-svn at devel.lifetype.net
> Subject: Re: [pLog-svn] Process size
>
> In addition to all I said before, why do you allow up to 32 MB for each
> Apache process? I think that's too much; nowadays 8-12 MB should be a more
> reasonable figure. If there are memory leaks somewhere in PHP (not in our
> code, remember that there's no way to explicitly deallocate an object in
> PHP code as far as I know), 32 MB isn't exactly going to help...
>
> On 12/14/06, Oscar Renalias <oscar at renalias.net> wrote:
>> No, we're not checking anything. From Lifetype's point of view, we
>> don't really care about who is making the request.
>>
>> Could it be that the Yahoo bot is performing searches? You should be
>> able to see what kind of requests the crawler is making by
>> cross-referencing the timestamps you posted below from Apache's error
>> log with the data you've got in the access log. The Apache access log
>> should contain the exact request; please find it and post it here,
>> otherwise we're just guessing.
>>
>> On 14 Dec 2006, at 00:15, Ayalon wrote:
>>
>> > Nope, there's nobody who has that many posts on the main page, as I
>> > configure everybody's blog.
>> >
>> > It's not an issue with all bots, only with the Yahoo bot. When
>> > searching the net I found more similar problems. Is no check done
>> > on who is coming to the site, or something like that?
>> >
>> > -----Original Message-----
>> > From: plog-svn-bounces at devel.lifetype.net
>> > [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Oscar Renalias
>> > Sent: Wednesday, 13 December 2006 23:10
>> > To: plog-svn at devel.lifetype.net
>> > Subject: Re: [pLog-svn] Process size
>> >
>> > I guess it's the same issue we had in devel.lifetype.net.
>> >
>> > One thing you should check is whether you've got any user who has
>> > configured his/her blog to display something like 80 or 100 posts on
>> > the front page, as that can cause performance problems. There's
>> > already a fix for that in LT 1.2, but in the meantime you will have
>> > to keep an eye on it on your own.
>> > Otherwise, a crawler performs exactly the same operations as a user
>> > with a browser would.
>> >
>> > On 13 Dec 2006, at 20:48, Ayalon wrote:
>> >
>> >> OK, I found the problem, and it's really strange but true:
>> >>
>> >> The Yahoo bot (Inktomi bot) is hitting my site, and then Cache_Lite
>> >> (Lite.php) is for some reason using too much memory:
>> >>
>> >> [Tue Dec 12 00:01:44 2006] [error] [client 74.6.85.156] PHP Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 84 bytes) in /data/www/www.blog.nl/class/cache/Cache_Lite/Lite.php on line 352
>> >>
>> >> [Tue Dec 12 00:05:27 2006] [error] [client 74.6.86.205] PHP Fatal error: Allowed memory size of 33554432 bytes exhausted (tried to allocate 93 bytes) in /data/www/www.blog.nl/class/cache/Cache_Lite/Lite.php on line 352
>> >>
>> >>
>> >> When a normal user hits the site, nothing happens at all. When the
>> >> bot hits my site, this happens with a lot of the requests. The
>> >> Apache process also grows so big that in the end it uses so much
>> >> memory that it starts to swap.
>> >>
>> >> What can cause this problem? I have now blocked the Yahoo bot via
>> >> htaccess, and the problem is gone. It started to happen after the
>> >> upgrade to the new LifeType platform. Are there any checks for who
>> >> is coming in? Do you need more data?
>> >>
>> >> Anyway, it's strange and interesting. Does anybody have an idea?
>> >>
>> >>
>> >>
>> >>
>> >> -----Original Message-----
>> >> From: plog-svn-bounces at devel.lifetype.net
>> >> [mailto:plog-svn-bounces at devel.lifetype.net] On Behalf Of Jon Daley
>> >> Sent: Wednesday, 13 December 2006 17:06
>> >> To: plog-svn at devel.lifetype.net
>> >> Subject: RE: [pLog-svn] Process size
>> >>
>> >> On Wed, 13 Dec 2006, Ayalon wrote:
>> >>> Where can I find the post of the rewrite?
>> >> http://forums.lifetype.net/viewtopic.php?p=23240&highlight=htaccess+rewrite+error
>> >>
>> >>> Anyway, my provider is telling me that it looks like LifeType has
>> >>> memory leaks in various parts of the script, starting with caching.
>> >>> Is that possible?
>> >>      It is certainly possible.  I would expect to see more memory
>> >> usage in my setup if that were the case.
>> >>
>> >>> Is it possible to disable everything related to Cache_Lite? Just
>> >>> to test some things...
>> >>      Line 39 of class/cache/cachemanager.class.php.  Change
>> >> $cacheEnable to false.  I think that should do the trick.
>> >> _______________________________________________
>> >> pLog-svn mailing list
>> >> pLog-svn at devel.lifetype.net
>> >> http://devel.lifetype.net/mailman/listinfo/plog-svn



