[pLog-svn] Process size

Jon Daley plogworld at jon.limedaley.com
Thu Dec 14 14:53:29 GMT 2006


Please post your .htaccess.  It must contain more than just the ForceType 
and ErrorDocument directives.

On Thu, 14 Dec 2006, Jon Daley wrote:

> 	In the logs you posted earlier, you can see all the accesses to 
> error.php.  Why is that?  Because you are using URLs that don't exist and 
> have to be caught by the ErrorDocument and redirected to error.php.
> 	Your blog still works now, right? - maybe you just never had yahoo 
> grabbing bad URLs before.  Maybe LifeType is parsing the bad URL differently 
> than it did before.  Since it is a bad URL, I don't know how one could define 
> parsing it better - maybe it used to return a failure, and now it is going to 
> whatever you have defined for a non-existent blog, etc. I am not sure.
> 	Did you read the wiki article I sent?
>
> 	Though, I find it strange that your pages are not returning 404 
> errors if that is really what is happening.
>
> 	Is there stuff in your apache error log?
>
> On Thu, 14 Dec 2006, Ayalon wrote:
>
>> Well, what I did, as you can see in the previous screenshots, was remove
>> /blog/{blogname} from all the links. That's it.
>> 
>> My .htaccess contains only the ForceType part and the error.php part. No
>> rewrite rules, nothing.
>> 
>> Why do those URLs have to do a redirect? I don't understand; with
>> LifeType 1.0.6 and earlier it was working perfectly. (I've been using
>> LifeType for years ;))
>> 
>> ----- Original Message ----- From: "Jon Daley" 
>> <plogworld at jon.limedaley.com>
>> To: "LifeType SVN" <plog-svn at devel.lifetype.net>
>> Sent: Thursday, December 14, 2006 3:29 PM
>> Subject: Re: [pLog-svn] Process size
>> 
>> 
>> Ah - you don't have /category/{category} in your URLs.
>> So
>> http://amsterdam.blog.nl/toerist_in_eigen_stad/2006/12/12/economie-amsterdam-staat-er-goed-voor
>> has to do a redirect.  What is in your .htaccess?  I am not seeing a 404
>> returned, which means you either wrote some custom rules, or the
>> modrewrite stuff is being matched.
>> 
>> You have read this, right?
>> http://wiki.lifetype.net/index.php/Custom_URLs#Why_removing_.2Fblog.2F_from_the_URLs_is_evil
>> 
>> I bet if you put the URLs back to the defaults, your problem will go away.
>> 
>> We should probably write a custom URL validator or something to check when
>> people write bad URLs, though I am not sure how it is parsing it when you
>> have the extra slash.
>> 
>> 
>> On Thu, 14 Dec 2006, Ayalon wrote:
>> 
>>> OK, yours is fixed, now my problem.
>>> 
>>> A non-existent URL in the format category/year/month/day/ is causing
>>> memory errors.
>>> 
>>> 
>>> ----- Original Message ----- From: "Jon Daley" 
>>> <plogworld at jon.limedaley.com>
>>> To: <plog-svn at devel.lifetype.net>
>>> Sent: Thursday, December 14, 2006 3:10 PM
>>> Subject: Re: [pLog-svn] Process size
>>> 
>>> 
>>> The slashes are now enabled, that is just a configuration choice,
>>> and I hadn't ever noticed before.
>>> 
>>> I changed them to:
>>> category_link_format
>>> /category/{catname}/?$
>>> 
>>> archive_link_format
>>> /archives/{year}/?{month}/?({day}/?)?
>>> 
>>> actually, the archive format didn't work like that, because it started
>>> generating URLs like:
>>> http://jon.limedaley.com/plog/archives/2006/01//
>>> which caused it to only show one day's worth of posts.
>>> So, my archive_link_format is:
>>> /archives/{year}/?{month}/?{day}?$
>>> 
>>> slash removed to avoid the double slash problem, and the $ added to
>>> distinguish from my permalink entry:
>>> /archives/{year}/{month}/{day}/{postname}$
>>> 
>>> Probably it was matching with a null postname or something.
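
A minimal Python sketch of why the trailing $ matters for the formats above. It assumes, purely for illustration, that each {placeholder} expands to a simple regex group; the PLACEHOLDERS table, the compile_format helper and the sample paths are hypothetical and are not LifeType's actual URL parser.

```python
import re

# Hypothetical expansions for the placeholders; LifeType's real ones may differ.
PLACEHOLDERS = {
    "{year}": r"(?P<year>\d{4})",
    "{month}": r"(?P<month>\d{2})",
    "{day}": r"(?P<day>\d{2})",
    "{postname}": r"(?P<postname>[^/]+)",
}


def compile_format(link_format):
    """Turn a link format string into a compiled regex (illustrative only)."""
    pattern = link_format
    for placeholder, regex in PLACEHOLDERS.items():
        pattern = pattern.replace(placeholder, regex)
    return re.compile(pattern)


# The archive format without and with the trailing $, plus the permalink format.
unanchored = compile_format("/archives/{year}/?{month}/?{day}?")
anchored = compile_format("/archives/{year}/?{month}/?{day}?$")
permalink = compile_format("/archives/{year}/{month}/{day}/{postname}$")

# Made-up sample paths: one permalink and one monthly archive.
post_url = "/archives/2006/12/14/process-size"
month_url = "/archives/2006/12/"

# Without $, the archive pattern matches the front of a permalink URL.
print(unanchored.match(post_url) is not None)   # True
# With $, the archive pattern no longer claims the permalink URL.
print(anchored.match(post_url) is not None)     # False
# The permalink pattern picks up the post name.
print(permalink.match(post_url).group("postname"))  # process-size
# The monthly archive still matches the anchored archive pattern only.
print(anchored.match(month_url) is not None)    # True
print(permalink.match(month_url) is not None)   # False
```

Under these assumptions, the unanchored archive format matches the front of a permalink URL, so the two rules cannot be told apart; adding $ makes each URL match exactly one of them.
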
>>> 
>>> In all of that, though, I never had any memory problems.  I believe I have
>>> it currently set to 16MB.  This is enough to show lots of comments on one
>>> page, but not to show thousands, and small enough to not kill the server
>>> if lots of pages are viewed simultaneously.
>>> 
>>> 
>>> On Thu, 14 Dec 2006, Ayalon wrote:
>>>> Subdomains are enabled; the rest is in the screenshot.
>>>> 
>>>> 
>>>>
>>>> 
>>>> -----------------------------------------------------------------------------------------------------------------------------------
>>>> Jon:
>>>> 
>>>> Also, in your config there are minor issues related to URLs:
>>>> 
>>>> http://jon.limedaley.com/plog/category/pregnancy shows the category
>>>> http://jon.limedaley.com/plog/category/pregnancy/ shows the main page
>>>> 
>>>> In my case the Yahoo bot is for some reason crawling:
>>>> 
>>>> http://amsterdam.blog.nl/amsterdam/2006/12/07 (not a correct URL; shows
>>>> the main page)
>>>> http://amsterdam.blog.nl/amsterdam/2006/12/07/ (not a correct URL; gets
>>>> a memory error)
>>>> 
>>>> Obviously the Yahoo crawler is doing incorrect things, but of course that
>>>> shouldn't cause such serious errors...
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ----- Original Message ----- From: "Jon Daley"
>>>> <plogworld at jon.limedaley.com>
>>>> To: <plog-svn at devel.lifetype.net>
>>>> Sent: Thursday, December 14, 2006 1:50 PM
>>>> Subject: Re: [pLog-svn] Process size
>>>> 
>>>> 
>>>> What are your custom URL settings set to?  I can't duplicate this
>>>> on my blog.
>>>> 
>>>> On Thu, 14 Dec 2006, Ayalon wrote:
>>>> 
>>>>> First, I'm running FreeBSD 6.0 with PHP 5.1.6 and Apache 2.2.2.
>>>>> 
>>>>> As for the exact URL, let me give an example.
>>>>> 
>>>>> I'm running a subdomain config with the categories directly behind it,
>>>>> 
>>>>> so: subdomain.domain.com/category
>>>>> 
>>>>> This of course exists; if I type a category behind it that doesn't
>>>>> exist, it comes up with an error.
>>>>> 
>>>>> Now I call a category, but in the style of an archive page:
>>>>> 
>>>>> subdomain.domain.com/category/year/month/date This works fine.
>>>>> 
>>>>> But Yahoo is requesting it with a / at the end of the URL:
>>>>> 
>>>>> subdomain.domain.com/category/year/month/date/ The thing is that the URL
>>>>> works without it, but if you put a / at the end it stops working and
>>>>> causes memory errors etc.
>>>>> 
>>>>> For example:
>>>>> 
>>>>> http://amsterdam.blog.nl/amsterdam/2006/12/07 works fine
>>>>> 
>>>>> http://amsterdam.blog.nl/amsterdam/2006/12/07/ gets errors in the log and
>>>>> out-of-memory issues; please don't click it :)
>>>>> 
>>>>> Does anybody have an idea how to fix this?
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> ----- Original Message ----- From: "Oscar Renalias" <oscar at renalias.net>
>>>>> To: <plog-svn at devel.lifetype.net>
>>>>> Sent: Thursday, December 14, 2006 9:51 AM
>>>>> Subject: Re: [pLog-svn] Process size
>>>>> 
>>>>> 
>>>>>> Can you provide the exact URLs? Or even better, the exact line(s) from
>>>>>> the logs.
>>>>>> 
>>>>>> How about your version of PHP? Are you running PHP 4 or PHP 5?
>>>>>> 
>>>>>> On 12/14/06, Ayalon <ayalon at blog.nl> wrote:
>>>>>>> Found it!
>>>>>>> 
>>>>>>> I searched the logs for the Yahoo bot. It looks like Yahoo is crawling
>>>>>>> a lot of pages that don't exist,
>>>>>>> 
>>>>>>> for example blogname.mydomain.com/categoryname/year/month/date
>>>>>>> 
>>>>>>> A normal Yahoo request goes through fine, but this page doesn't exist
>>>>>>> in my environment, and requesting it causes some kind of loop in the
>>>>>>> script that eventually uses a lot of memory in PHP, a lot of buffering
>>>>>>> in Apache, etc. I looked in the PHP and Apache forums and bug reports,
>>>>>>> but can't find anything. Is there something wrong in the way
>>>>>>> non-existent URLs are handled? I see in my logs that normal requests
>>>>>>> are handled correctly. Pages, especially those within subdirectories,
>>>>>>> are causing these problems.
>>>>>>> 
>>>>>>> If you need more info, I'm happy to provide it. Devel probably also had
>>>>>>> this problem last weekend...
>>>>>>> 
>>>>>>> 
>>>>>>> ----- Original Message -----
>>>>>>> From: "Ayalon" <ayalon at blog.nl>
>>>>>>> To: <plog-svn at devel.lifetype.net>
>>>>>>> Sent: Thursday, December 14, 2006 8:32 AM
>>>>>>> Subject: RE: [pLog-svn] Process size
>>>>>>> 
>>>>>>> 
>>>>>>> > Hi Oscar,
>>>>>>> >
>>>>>>> > Thanks for your email. But anyway, we're not allowing Apache to grow
>>>>>>> > so big; it's PHP with a memory limit of 32 MB. If I try to run it
>>>>>>> > with a lower value, the pages with comments are not shown (100+
>>>>>>> > comments). So do I have another choice?
>>>>>>> >
>>>>>>> > Jon: I'll try to get the logs to show you what Yahoo is hitting.
>>>>>>> > Jon2: Can you see anything on devel?
>>>>>>> >
>>>>>>> > -----Original Message-----
>>>>>>> > From: plog-svn-bounces at devel.lifetype.net
>>>>>>> > [mailto:plog-svn-bounces at devel.lifetype.net] On behalf of Oscar Renalias
>>>>>>> > Sent: Thursday, 14 December 2006 8:28
>>>>>>> > To: plog-svn at devel.lifetype.net
>>>>>>> > Subject: Re: [pLog-svn] Process size
>>>>>>> >
>>>>>>> > In addition to all I said before, why do you allow up to 32 MB for
>>>>>>> > each Apache process? I think that's too much; nowadays 8-12 MB should
>>>>>>> > be a more reasonable figure. If there are memory leaks somewhere in
>>>>>>> > PHP (not in our code; remember that there's no way to explicitly
>>>>>>> > deallocate an object in PHP code as far as I know), 32 MB isn't
>>>>>>> > exactly going to help...
>>>>>>> >
>>>>>>> > On 12/14/06, Oscar Renalias <oscar at renalias.net> wrote:
>>>>>>> >> No, we're not checking anything. From Lifetype's point of view, we
>>>>>>> >> don't really care about who is making the request.
>>>>>>> >>
>>>>>>> >> Could it be that the Yahoo bot is performing searches? You should
>>>>>>> >> be able to see what kind of requests the crawler is making by
>>>>>>> >> cross-referencing the timestamps you posted below from Apache's
>>>>>>> >> error log file with the data you've got in the access log. The
>>>>>>> >> Apache access log should contain the exact request; please find it
>>>>>>> >> and post it here, otherwise we're just guessing.
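
The cross-referencing suggested above could be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the error-log line is the one posted earlier in the thread (rejoined onto one line), the access-log lines are invented samples in the common Apache log layout, both logs are assumed to use the same local time, and the requests_near_error helper is hypothetical.

```python
import re
from datetime import datetime, timedelta

# Apache error log: [Tue Dec 12 00:01:44 2006] [error] [client 1.2.3.4] ...
ERROR_RE = re.compile(r"^\[(?P<ts>[^\]]+)\] \[error\] \[client (?P<ip>[\d.]+)\]")
# Apache access log: 1.2.3.4 - - [12/Dec/2006:00:01:43 +0100] "GET / HTTP/1.0" ...
ACCESS_RE = re.compile(
    r'^(?P<ip>[\d.]+) \S+ \S+ \[(?P<ts>[^ \]]+) [^\]]*\] "(?P<request>[^"]*)"'
)


def parse_error(line):
    """Return (client_ip, timestamp) for an error-log line, or None."""
    m = ERROR_RE.match(line)
    if m is None:
        return None
    return m.group("ip"), datetime.strptime(m.group("ts"), "%a %b %d %H:%M:%S %Y")


def parse_access(line):
    """Return (client_ip, timestamp, request) for an access-log line, or None."""
    m = ACCESS_RE.match(line)
    if m is None:
        return None
    # The UTC offset is ignored: both logs are assumed to be in local time.
    ts = datetime.strptime(m.group("ts"), "%d/%b/%Y:%H:%M:%S")
    return m.group("ip"), ts, m.group("request")


def requests_near_error(error_line, access_lines, window_seconds=5):
    """List requests from the same client within a few seconds of the error."""
    parsed = parse_error(error_line)
    if parsed is None:
        return []
    ip, error_time = parsed
    window = timedelta(seconds=window_seconds)
    hits = []
    for line in access_lines:
        entry = parse_access(line)
        if entry is None:
            continue
        access_ip, access_time, request = entry
        if access_ip == ip and abs(access_time - error_time) <= window:
            hits.append(request)
    return hits


# Error line from the thread, rejoined onto one line.
error_line = (
    "[Tue Dec 12 00:01:44 2006] [error] [client 74.6.85.156] "
    "PHP Fatal error: Allowed memory size of 33554432 bytes exhausted"
)

# Invented access-log lines, for illustration only.
access_lines = [
    '74.6.85.156 - - [12/Dec/2006:00:01:43 +0100] "GET /amsterdam/2006/12/07/ HTTP/1.0" 200 512',
    '10.0.0.7 - - [12/Dec/2006:00:01:44 +0100] "GET /amsterdam HTTP/1.1" 200 2048',
]

print(requests_near_error(error_line, access_lines))
```

Matching on client IP plus a small time window is a design choice of this sketch; a busy crawler may have several requests inside the window, so the result is a list of candidates rather than a single culprit.
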
>>>>>>> >>
>>>>>>> >> On 14 Dec 2006, at 00:15, Ayalon wrote:
>>>>>>> >>
>>>>>>> >> > Nope, there's nobody who has that many posts on the main page,
>>>>>>> >> > as I configure everybody's blog.
>>>>>>> >> >
>>>>>>> >> > It's not an issue with all bots, it's only with the Yahoo bot.
>>>>>>> >> > When searching the net I found more similar problems. Is no check
>>>>>>> >> > done on who is coming to the site or something?
>>>>>>> >> >
>>>>>>> >> > -----Original Message-----
>>>>>>> >> > From: plog-svn-bounces at devel.lifetype.net
>>>>>>> >> > [mailto:plog-svn-bounces at devel.lifetype.net] On behalf of Oscar Renalias
>>>>>>> >> > Sent: Wednesday, 13 December 2006 23:10
>>>>>>> >> > To: plog-svn at devel.lifetype.net
>>>>>>> >> > Subject: Re: [pLog-svn] Process size
>>>>>>> >> >
>>>>>>> >> > I guess it's the same issue we had in devel.lifetype.net.
>>>>>>> >> >
>>>>>>> >> > One thing you should check is whether you've got any user who has
>>>>>>> >> > configured his/her blog to display something like 80 or 100 posts
>>>>>>> >> > on the front page, as that can cause performance problems. There's
>>>>>>> >> > already a fix for that in LT 1.2, but in the meantime you will
>>>>>>> >> > have to keep an eye on it on your own.
>>>>>>> >> > Otherwise a crawler performs the exact same operations as a user
>>>>>>> >> > via a browser would do.
>>>>>>> >> >
>>>>>>> >> > On 13 Dec 2006, at 20:48, Ayalon wrote:
>>>>>>> >> >
>>>>>>> >> >> OK, I found the problem, and I tell you it's really strange but
>>>>>>> >> >> true:
>>>>>>> >> >>
>>>>>>> >> >> The Yahoo bot (Inktomi bot) is hitting my site and then
>>>>>>> >> >> Cache_lite.php is for some reason using too much memory:
>>>>>>> >> >>
>>>>>>> >> >> [Tue Dec 12 00:01:44 2006] [error] [client 74.6.85.156] PHP Fatal error:
>>>>>>> >> >> Allowed memory size of 33554432 bytes exhausted (tried to allocate 84 bytes)
>>>>>>> >> >> in /data/www/www.blog.nl/class/cache/Cache_Lite/Lite.php on line 352
>>>>>>> >> >>
>>>>>>> >> >> [Tue Dec 12 00:05:27 2006] [error] [client 74.6.86.205] PHP Fatal error:
>>>>>>> >> >> Allowed memory size of 33554432 bytes exhausted (tried to allocate 93 bytes)
>>>>>>> >> >> in /data/www/www.blog.nl/class/cache/Cache_Lite/Lite.php on line 352
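
As a quick worked check of the numbers in the log lines above, here is a small Python sketch. The parse_memory_error helper is hypothetical; it only reads the byte counts out of the message text and converts the limit to MiB.

```python
import re

# Matches the byte counts in PHP's "Allowed memory size ... exhausted" message.
FATAL_RE = re.compile(
    r"Allowed memory size of (?P<limit>\d+) bytes exhausted "
    r"\(tried to allocate (?P<requested>\d+) bytes\)"
)


def parse_memory_error(message):
    """Return (limit_in_mib, requested_bytes), or None if the text differs."""
    m = FATAL_RE.search(message)
    if m is None:
        return None
    limit_mib = int(m.group("limit")) / (1024 * 1024)
    return limit_mib, int(m.group("requested"))


# Message text taken from the first log entry above, rejoined onto one line.
message = (
    "PHP Fatal error: Allowed memory size of 33554432 bytes exhausted "
    "(tried to allocate 84 bytes)"
)

print(parse_memory_error(message))  # (32.0, 84)
```

So 33554432 bytes is exactly the 32 MB limit, and the failing allocation was only 84 bytes. That suggests the script had already used up nearly the whole limit before reaching Lite.php line 352, so that line may simply be where the memory ran out rather than where it was consumed.
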
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >> When a normal user is hitting the site, nothing happens at all.
>>>>>>> >> >> When the bot is hitting my site, this happens with a lot of the
>>>>>>> >> >> requests. Also, the Apache process grows so big that in the end
>>>>>>> >> >> it uses so much memory that it starts to use swap.
>>>>>>> >> >>
>>>>>>> >> >> What can cause this problem? I have now blocked the Yahoo bot via
>>>>>>> >> >> .htaccess and the problem is gone. It started to happen after
>>>>>>> >> >> the upgrade to the new LifeType platform. Are there some checks
>>>>>>> >> >> for who is coming in? Do you need more data?
>>>>>>> >> >>
>>>>>>> >> >> Anyway, it's strange and interesting. Does anybody have an idea?
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> >> >> -----Original Message-----
>>>>>>> >> >> From: plog-svn-bounces at devel.lifetype.net
>>>>>>> >> >> [mailto:plog-svn-bounces at devel.lifetype.net] On behalf of Jon Daley
>>>>>>> >> >> Sent: Wednesday, 13 December 2006 17:06
>>>>>>> >> >> To: plog-svn at devel.lifetype.net
>>>>>>> >> >> Subject: RE: [pLog-svn] Process size
>>>>>>> >> >>
>>>>>>> >> >> On Wed, 13 Dec 2006, Ayalon wrote:
>>>>>>> >> >>> Where can I find the post about the rewrite?
>>>>>>> >> >>
>>>>>>> >> >> http://forums.lifetype.net/viewtopic.php?p=23240&highlight=htaccess+rewrite+error
>>>>>>> >> >>
>>>>>>> >> >>> Anyway, my provider is telling me that it looks like LifeType has
>>>>>>> >> >>> memory leaks in various aspects of the script, starting with
>>>>>>> >> >>> caching. Is that possible?
>>>>>>> >> >>      It is certainly possible.  I would expect to see more memory
>>>>>>> >> >> usage in my setup if that were the case.
>>>>>>> >> >>
>>>>>>> >> >>> Is it possible to disable everything related to Cache Lite? Just
>>>>>>> >> >>> to test some things...
>>>>>>> >> >>      Line 39 of class/cache/cachemanager.class.php.  Change
>>>>>>> >> >> $cacheEnable to false.  I think that should do the trick.
>>>>>>> >> >> _______________________________________________
>>>>>>> >> >> pLog-svn mailing list
>>>>>>> >> >> pLog-svn at devel.lifetype.net
>>>>>>> >> >> http://devel.lifetype.net/mailman/listinfo/plog-svn
>>>>>>> >> >>
>>>>>>> >> >>
>>>>>>> 
>>>>>>> 
>>>>>>> ______________________________________________________________________
>>>>>>> This email has been scanned by the MessageLabs Email Security System.
>>>>>>> For more information please visit http://www.messagelabs.com/email
>>>>>>> ______________________________________________________________________
>>>> 
>>>> -- 
>>>> Jon Daley
>>>> http://jon.limedaley.com/
>>>> 
>>>> The difference between genius and stupidity
>>>> is that genius has its limits.
>>>> -- Anonymous
>>>> 
>>> 
>>> -- 
>>> Jon Daley
>>> http://jon.limedaley.com/
>>> 
>>> Music: A safe kind of high
>>> -- Jimi Hendrix
>> 
>> -- 
>> Jon Daley
>> http://jon.limedaley.com/
>> 
>> If everything is coming your way then you're in the wrong lane.
>> -- Anonymous
>
> -- 
> Jon Daley
> http://jon.limedaley.com/
>
> The reason houses are made of brick instead
> of aluminum foil is the thermal diffusivity.
> -- Professor Shaeffer

-- 
Jon Daley
http://jon.limedaley.com/

Man invented language to satisfy his deep need to complain.
-- Lily Tomlin

