[CentOS] 4gb seg fixup

  • Widow
  • Registered: July 2003
  • Last online: 25-08 10:41
I have a server, CentOS I believe, that runs a small web server. A few weeks ago the SQL database behind the site had already stopped working. After it was started manually it ran again, but now I am running into problems with this server again. When I connect a monitor to the server, I see this scrolling by constantly:

Image location: http://www.xs4all.nl/~floort/seg_fixup.JPG

When I google "4gb seg fixup" I find a lot about Xen and VMs, but this is an ordinary web server that, as far as I know, does nothing with virtualization. I suspect that the hard disks in this server are starting to die; see the log file below:

Jun 17 09:22:07 extweb kernel: tg3: eth0: Link is up at 100 Mbps, full duplex.
Jun 17 09:22:07 extweb kernel: tg3: eth0: Flow control is off for TX and off for RX.
Jun 17 09:22:07 extweb kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Jun 17 09:22:11 extweb kernel: Bridge firewalling registered
Jun 17 09:22:11 extweb dnsmasq[2490]: started, version 2.39 cachesize 150
Jun 17 09:22:11 extweb dnsmasq[2490]: compile time options: IPv6 GNU-getopt no-ISC-leasefile no-DBus no-I18N TFTP
Jun 17 09:22:11 extweb dnsmasq[2490]: DHCP, IP range 192.168.122.2 -- 192.168.122.254, lease time 1h
Jun 17 09:22:11 extweb dnsmasq[2490]: reading /etc/resolv.conf
Jun 17 09:22:11 extweb dnsmasq[2490]: using nameserver [ip adres]#53
Jun 17 09:22:11 extweb dnsmasq[2490]: using nameserver [ip adres]#53
Jun 17 09:22:11 extweb dnsmasq[2490]: read /etc/hosts - 2 addresses
Jun 17 09:22:16 extweb xenstored: Checking store ...
Jun 17 09:22:16 extweb xenstored: Checking store complete.
Jun 17 09:22:16 extweb xenstored: Checking store ...
Jun 17 09:22:16 extweb xenstored: Checking store complete.
Jun 17 09:22:17 extweb kernel: device vif0.0 entered promiscuous mode
Jun 17 09:22:17 extweb kernel: xenbr0: port 1(vif0.0) entering learning state
Jun 17 09:22:17 extweb kernel: xenbr0: topology change detected, propagating
Jun 17 09:22:17 extweb kernel: xenbr0: port 1(vif0.0) entering forwarding state
Jun 17 09:22:17 extweb kernel: ADDRCONF(NETDEV_UP): peth0: link is not ready
Jun 17 09:22:19 extweb kernel: tg3: peth0: Link is up at 100 Mbps, full duplex.
Jun 17 09:22:19 extweb kernel: tg3: peth0: Flow control is off for TX and off for RX.
Jun 17 09:22:19 extweb kernel: ADDRCONF(NETDEV_CHANGE): peth0: link becomes ready
Jun 17 09:22:19 extweb kernel: device peth0 entered promiscuous mode
Jun 17 09:22:19 extweb kernel: xenbr0: port 2(peth0) entering learning state
Jun 17 09:22:19 extweb kernel: xenbr0: topology change detected, propagating
Jun 17 09:22:19 extweb kernel: xenbr0: port 2(peth0) entering forwarding state
Jun 17 09:22:22 extweb smartd[3121]: smartd version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Jun 17 09:22:22 extweb smartd[3121]: Home page is http://smartmontools.sourceforge.net/
Jun 17 09:22:22 extweb smartd[3121]: Opened configuration file /etc/smartd.conf
Jun 17 09:22:22 extweb smartd[3121]: Configuration file /etc/smartd.conf parsed.
Jun 17 09:22:22 extweb smartd[3121]: Device: /dev/sda, opened
Jun 17 09:22:22 extweb smartd[3121]: Device: /dev/sda, found in smartd database.
Jun 17 09:22:22 extweb smartd[3121]: Device: /dev/sda, is SMART capable. Adding to "monitor" list.
Jun 17 09:22:22 extweb smartd[3121]: Device: /dev/sdb, opened
Jun 17 09:22:22 extweb smartd[3121]: Device: /dev/sdb, found in smartd database.
Jun 17 09:22:23 extweb smartd[3121]: Device: /dev/sdb, is SMART capable. Adding to "monitor" list.
Jun 17 09:22:23 extweb smartd[3121]: Device: /dev/sdc, opened
Jun 17 09:22:23 extweb smartd[3121]: Device: /dev/sdc, found in smartd database.
Jun 17 09:22:23 extweb smartd[3121]: Device: /dev/sdc, is SMART capable. Adding to "monitor" list.
Jun 17 09:22:23 extweb smartd[3121]: Monitoring 3 ATA and 0 SCSI devices
Jun 17 09:22:23 extweb smartd[3121]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Jun 17 09:22:23 extweb smartd[3121]: Sending warning via mail to root ...
Jun 17 09:22:24 extweb smartd[3121]: Warning via mail to root: successful
Jun 17 09:22:24 extweb smartd[3121]: Device: /dev/sda, 1 Offline uncorrectable sectors
Jun 17 09:22:24 extweb smartd[3121]: Sending warning via mail to root ...
Jun 17 09:22:24 extweb smartd[3121]: Warning via mail to root: successful
Jun 17 09:22:25 extweb smartd[3134]: smartd has fork()ed into background mode. New PID=3134.
Jun 17 09:22:26 extweb gdm[3152]: (null): cannot open shared object file: No such file or directory
Jun 17 09:22:27 extweb gdm[3232]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jun 17 09:22:31 extweb gdm[3248]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jun 17 09:22:35 extweb gdm[3267]: gdm_slave_xioerror_handler: Fatal X error - Restarting :0
Jun 17 09:22:35 extweb gdm[3152]: deal_with_x_crashes: Running the XKeepsCrashing script
Jun 17 09:23:16 extweb setroubleshoot: SELinux is preventing /usr/sbin/httpd (httpd_t) "execmem" access to <Unknown> (httpd_t). For complete SELinux messages. run sealert -l 213c748a-8093-433a-b8e6-f16825a6d545
Jun 17 09:23:16 extweb setroubleshoot: SELinux is preventing /usr/sbin/httpd from changing the access protection of memory on the heap. For complete SELinux messages. run sealert -l e60b3c2d-6429-44d9-8a3c-e18047e6e538
Jun 17 09:52:25 extweb smartd[3134]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Jun 17 09:52:25 extweb smartd[3134]: Device: /dev/sda, 1 Offline uncorrectable sectors
Jun 17 10:22:26 extweb smartd[3134]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Jun 17 10:22:26 extweb smartd[3134]: Device: /dev/sda, 1 Offline uncorrectable sectors
Jun 17 10:52:25 extweb smartd[3134]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Jun 17 10:52:25 extweb smartd[3134]: Device: /dev/sda, 1 Offline uncorrectable sectors
Jun 17 11:22:25 extweb smartd[3134]: Device: /dev/sda, 1 Currently unreadable (pending) sectors
Jun 17 11:22:25 extweb smartd[3134]: Device: /dev/sda, 1 Offline uncorrectable sectors

Jun 17 11:48:54 extweb setroubleshoot: SELinux is preventing /usr/sbin/httpd from changing the access protection of memory on the heap. For complete SELinux messages. run sealert -l e60b3c2d-6429-44d9-8a3c-e18047e6e538

However, when I run smartctl I do not see anything that would point to broken HDDs:


code:
[timo@extweb ~]$ sudo smartctl --all -d ata /dev/sda
smartctl version 5.36 [i686-redhat-linux-gnu] Copyright (C) 2002-6 Bruce Allen
Home page is http://smartmontools.sourceforge.net/

=== START OF INFORMATION SECTION ===
Model Family:     Western Digital Caviar SE (Serial ATA) family
Device Model:     WDC WD800JD-75LSA0
Serial Number:    WD-WMAM98959911
Firmware Version: 09.01D09
User Capacity:    80,000,000,000 bytes
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   7
ATA Standard is:  Exact ATA specification draft version not indicated
Local Time is:    Wed Jun 16 15:03:19 2010 CEST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x82) Offline data collection activity
                    was completed without error.
                    Auto Offline Data Collection: Enabled.
Self-test execution status:      (   0) The previous self-test routine completed
                    without error or no self-test has ever 
                    been run.
Total time to complete Offline 
data collection:        (2460) seconds.
Offline data collection
capabilities:           (0x7b) SMART execute Offline immediate.
                    Auto Offline data collection on/off support.
                    Suspend Offline collection upon new
                    command.
                    Offline surface scan supported.
                    Self-test supported.
                    Conveyance Self-test supported.
                    Selective Self-test supported.
SMART capabilities:            (0x0003) Saves SMART data before entering
                    power-saving mode.
                    Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                    General Purpose Logging supported.
Short self-test routine 
recommended polling time:   (   2) minutes.
Extended self-test routine
recommended polling time:   (  33) minutes.
Conveyance self-test routine
recommended polling time:   (   5) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0003   167   164   021    Pre-fail  Always       -       2633
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       38
  5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       5
  7 Seek_Error_Rate         0x000f   200   200   051    Pre-fail  Always       -       0
  9 Power_On_Hours          0x0032   052   052   000    Old_age   Always       -       35657
 10 Spin_Retry_Count        0x0013   100   253   051    Pre-fail  Always       -       0
 11 Calibration_Retry_Count 0x0012   100   253   051    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       38
190 Unknown_Attribute       0x0022   067   049   045    Old_age   Always       -       33
194 Temperature_Celsius     0x0022   110   092   000    Old_age   Always       -       33
196 Reallocated_Event_Count 0x0032   199   199   000    Old_age   Always       -       1
197 Current_Pending_Sector  0x0012   200   200   000    Old_age   Always       -       1
198 Offline_Uncorrectable   0x0010   200   200   000    Old_age   Offline      -       1
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0009   200   199   051    Pre-fail  Offline      -       0

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%         0         -
# 2  Short offline       Completed without error       00%         0         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.


What else can I try, without stopping the web service, to check whether this server is suffering from dying HDDs? The server is a Dell PowerEdge 800 that, off the top of my head, is 4 years old.

Nothing is as permanent as a temporary solution.


  • Rainmaker
  • Registered: August 2000
  • Last online: 14-07-2024

RHCDS

code:
 5 Reallocated_Sector_Ct   0x0033   199   199   140    Pre-fail  Always       -       5


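So 5 sectors have been reallocated (that is the RAW_VALUE column; 199 is the normalized value, which is still above the threshold of 140). Pre-fail in the TYPE column only marks the category of an attribute, not a failure state, but the same disk also reports 1 pending and 1 offline-uncorrectable sector. If this machine has no RAID set, I would play it safe and replace the disk.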
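
To keep an eye on the raw counts rather than the normalized values, something like the sketch below could be used. It assumes `smartctl -A` prints the attribute table in the same layout as the output quoted earlier in the thread; the function name `smart_bad_sectors` is made up for this example.

```shell
# Print NAME=RAW_VALUE for the SMART attributes that track bad sectors:
#   5 Reallocated_Sector_Ct, 197 Current_Pending_Sector, 198 Offline_Uncorrectable.
# Reads smartctl attribute output on stdin. The NF check skips shorter lines
# (for example the selective self-test table) that also start with a number.
smart_bad_sectors() {
    awk 'NF >= 10 && ($1 == 5 || $1 == 197 || $1 == 198) { print $2 "=" $NF }'
}

# Usage (device and "-d ata" as used in the thread):
#   sudo smartctl -A -d ata /dev/sda | smart_bad_sectors
```

If these numbers keep climbing between runs, that is a stronger signal than a single snapshot.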

As you can see in the message, there are 2 processes producing this vague message: getlog and getstatus.

Have a look in your ps listing to see what these processes belong to (for example by looking at the ppid). You can also attach strace to such a process, which will probably give you a fair idea of what those processes are doing.
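
A rough sketch of that lookup, assuming the procps `ps` that ships with CentOS. The helper name `parent_of` is hypothetical, and `<pid>` stands for whatever PID the first command reports.

```shell
# Print the parent PID of the given process.
parent_of() {
    ps -o ppid= -p "$1" | tr -d ' '
}

# Usage:
#   ps -o pid,ppid,args -C getlog,getstatus     # find the two processes by name
#   ps -o pid,args -p "$(parent_of <pid>)"      # see which program started them
#   sudo strace -f -p <pid>                     # attach and watch the system calls
```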

We are pentium of borg. Division is futile. You will be approximated.


Deleted user

I also see an SELinux vs. httpd message in that piece of log.

Maybe you can disable SELinux for a moment to see whether that is a cause.
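
Before switching anything off, it may help to read what SELinux actually blocked. The sketch below pulls the `sealert -l <id>` commands out of the log so they can be run one by one. It assumes the messages end up in /var/log/messages, and the function name `sealert_commands` is invented for this example.

```shell
# Print each distinct "sealert -l <id>" command that setroubleshoot logged.
# Reads the files given as arguments, or stdin when none are given.
sealert_commands() {
    grep -oh 'sealert -l [0-9a-f-]*' "$@" | sort -u
}

# Usage:
#   sudo cat /var/log/messages | sealert_commands
#   getenforce            # show the current SELinux mode
#   sudo setenforce 0     # permissive until the next reboot, to test the theory
```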

  • The-Hi_End
  • Registered: October 2005
  • Last online: 05-10 22:03
The message you are getting really is Xen-related.


The following lines clearly show that Xen is installed:

code:
Jun 17 09:22:17 extweb kernel: device vif0.0 entered promiscuous mode
Jun 17 09:22:17 extweb kernel: xenbr0: port 1(vif0.0) entering learning state
Jun 17 09:22:17 extweb kernel: xenbr0: topology change detected, propagating
Jun 17 09:22:17 extweb kernel: xenbr0: port 1(vif0.0) entering forwarding state
Jun 17 09:22:17 extweb kernel: ADDRCONF(NETDEV_UP): peth0: link is not ready
Jun 17 09:22:19 extweb kernel: tg3: peth0: Link is up at 100 Mbps, full duplex.
Jun 17 09:22:19 extweb kernel: tg3: peth0: Flow control is off for TX and off for RX.
Jun 17 09:22:19 extweb kernel: ADDRCONF(NETDEV_CHANGE): peth0: link becomes ready
Jun 17 09:22:19 extweb kernel: device peth0 entered promiscuous mode
Jun 17 09:22:19 extweb kernel: xenbr0: port 2(peth0) entering learning state
Jun 17 09:22:19 extweb kernel: xenbr0: topology change detected, propagating
Jun 17 09:22:19 extweb kernel: xenbr0: port 2(peth0) entering forwarding state
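
Besides the log, the running kernel itself can be checked. This is only a sketch: it assumes the Xen kernel carries "xen" in its release string, as the CentOS 5 kernel-xen packages do, and the function name `is_xen_release` is made up here.

```shell
# Succeed if the given kernel release string looks like a Xen kernel.
# Assumption: the release string contains "xen" (e.g. "...el5xen").
is_xen_release() {
    case "$1" in
        *xen*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Usage:
#   is_xen_release "$(uname -r)" && echo "booted into a Xen kernel"
#   ls /proc/xen              # normally present when running under Xen
#   rpm -q xen kernel-xen     # see which Xen packages are installed
```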


To get rid of these messages, you would have to enter the following line on the command line:
code:
echo 'hwcap 0 nosegneg' > /etc/ld.so.conf.d/libc6-xen.conf
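
Two practical notes on that line, as a hedged sketch. The redirection needs a root shell, because `sudo echo ... > file` still opens the file as the calling user. And `ldconfig` has to run afterwards so the dynamic linker cache picks up the new setting; processes that are already running only see it after a restart. The wrapper name `write_nosegneg_conf` and its optional directory argument are hypothetical, added so the step can be tried against a scratch directory first.

```shell
# Write the nosegneg hint into an ld.so.conf.d directory.
# $1 is an optional target directory (hypothetical convenience for dry runs);
# it defaults to the real location named in the thread.
write_nosegneg_conf() {
    conf_dir="${1:-/etc/ld.so.conf.d}"
    printf '%s\n' 'hwcap 0 nosegneg' > "$conf_dir/libc6-xen.conf"
}

# Usage, as root:
#   write_nosegneg_conf && ldconfig
```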


The source of this line is: http://lists.xensource.co...ers/2006-11/msg00056.html