Bill Wesson wrote:
> I run a nightly backup at 2AM of our Fedora Core 1 server using Mondo.
> About once every two weeks the system will freeze up. It won’t process
> email, present web pages, and I can’t log into it using SSH. The
> console screen is blank. We have to cold start server.
>
> The easiest suggestion is to stop using Mondo, but I like the fast
> recovery Mondo offers.
>
> So we’re trying to figure out how to diagnose this problem and where
> to start. Memory, Hard Drive (is new), IDE controller, etc?
>
> Does anyone have any ideas where to start?
>
> Thanks,
>
> --Bill Wesson
>
> This is my script to run Mondo daily at 2AM:
>
> mkdir -p /home/mondo/`date +%A`
> mondoarchive -Oi -d /home/mondo/`date +%A` -E "/home/mondo"
>
> Log files don’t provide any clues.
>
> CRON log for Sunday-2AM
>
> Oct 24 01:50:00 payson CROND[24793]: (root) CMD (/usr/local/bin/weblogs)
> Oct 24 02:00:00 payson CROND[25052]: (root) CMD (run-parts /etc/cron.daily-2am)
> Oct 24 02:00:00 payson CROND[25056]: (root) CMD (nice --adjustment=15 /usr/local/sbin/update_site_summary_cache)
> Oct 24 02:00:00 payson CROND[25054]: (root) CMD (/usr/local/bin/weblogs)
> Oct 24 02:01:00 payson CROND[25961]: (root) CMD (run-parts /etc/cron.hourly)
> Oct 24 02:10:00 payson CROND[7866]: (root) CMD (/usr/local/bin/weblogs)
>
> CRON log for Monday-2AM
>
> Oct 25 01:50:00 payson CROND[29959]: (root) CMD (/usr/local/bin/weblogs)
> Oct 25 02:00:00 payson CROND[30059]: (root) CMD (run-parts /etc/cron.daily-2am)
> Oct 25 02:00:00 payson CROND[30063]: (root) CMD (nice --adjustment=15 /usr/local/sbin/update_site_summary_cache)
> Oct 25 02:00:00 payson CROND[30061]: (root) CMD (/usr/local/bin/weblogs)
> Oct 25 02:01:00 payson CROND[30953]: (root) CMD (run-parts /etc/cron.hourly)
> Oct 25 07:28:59 payson crond[2982]: (CRON) STARTUP (fork ok)
>
> MESSAGES log for Monday-2AM
>
> Oct 25 01:50:08 payson logger: weblogs: (29959) done.
> Oct 25 02:00:00 payson logger: weblogs: (30061) starting.
> Oct 25 02:00:01 payson autofs: automount shutdown succeeded
> Oct 25 02:00:09 payson logger: weblogs: (30061) done.
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 68.157.222.155#53
> Oct 25 02:04:03 payson named[1736]: lame server resolving 'dogg.com' (in 'dogg.com'?): 207.65.0.25#53
> Oct 25 07:27:44 payson syslogd 1.4.1: restart.
>
> 1AM - Monday
>
>              total       used       free     shared    buffers     cached
> Mem:        773208     713100      60108          0      66540     522432
> -/+ buffers/cache:     124128     649080
> Swap:      2048248      90048    1958200
>
> 2AM - Monday
>
>              total       used       free     shared    buffers     cached
> Mem:        773208     768768       4440          0     102908     491276
> -/+ buffers/cache:     174584     598624
> Swap:      2048248      90048    1958200
>
> Thanks,
>
> Bill Wesson, Network Administrator
> *Vision Engraving Systems*
> http://www.visionengravers.com
> 17621 N. Black Canyon Hwy
> Phoenix, AZ 85023
> 602-439-0600
>

I've seen that sort of thing happen when a machine goes I/O or CPU bound.
I would suggest actually logging in via ssh and having a few xterms going
with top, free, etc so that you can see what is going on. Are things
actively running when you're backing it up?
Contention for the same resource can cause issues like this. Look for
memory leaks (processes that grow and grow over time) too.

JD

--
JD Austin
Twin Geckos Technology Services LLC
email: jd@twingeckos.com
http://www.twingeckos.com
phone/fax: 480.344.2640
---------------------------------------------------
PLUG-discuss mailing list - PLUG-discuss@lists.plug.phoenix.az.us
To subscribe, unsubscribe, or to change your mail settings:
http://lists.PLUG.phoenix.az.us/mailman/listinfo/plug-discuss
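
Since the freeze happens unattended around 2AM, one way to follow the "watch top and free" suggestion is to log the same information to a file, so the last entries before the lockup are still there after the cold start. The following is only a rough sketch, not something from the thread: the function names `snapshot` and `app_used_kb` and the log path are made up for illustration, and `app_used_kb` assumes the older `free` layout shown in Bill's post (total, used, free, shared, buffers, cached).

```shell
#!/bin/sh
# Hypothetical helpers for catching the state of the box just before a freeze.

# Read `free` output on stdin and print the memory actually held by
# applications, in kB: used - buffers - cached.
# Assumes the old column layout from the post (buffers in $6, cached in $7).
app_used_kb() {
    awk '/^Mem:/ { print $3 - $6 - $7 }'
}

# Append one timestamped snapshot (load, memory, biggest processes by RSS)
# to the log file given as $1, then ask the kernel to flush it to disk.
snapshot() {
    log=$1
    {
        echo "=== $(date) ==="
        uptime
        free
        # Header line plus the ten largest processes by resident memory.
        ps -eo pid,rss,vsz,pcpu,comm --sort=-rss | head -n 11
    } >> "$log" 2>&1
    sync
}

# Example call (the log path is an assumption, pick your own):
# snapshot /var/log/freeze-watch.log
```

Called once a minute from cron during the backup window, `snapshot` would leave a trail showing whether load, swap, or a single growing process climbs before the hang. As a sanity check on the numbers already posted, feeding the 2AM Monday `Mem:` line to `app_used_kb` gives 174584, which matches the "-/+ buffers/cache" figure: most of the "used" memory at that point is cache, and swap use is unchanged from 1AM.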