Routine Maintenance
=== Adaptec controllers ===

Here's some sample output:

<pre>
mail /usr/local/www/scripts# sh /root/verify.sh
---------------------------------------------------------------------------------------------
Adaptec SCSI RAID Controller Command Line Interface
Copyright 1998-2002 Adaptec, Inc. All rights reserved
---------------------------------------------------------------------------------------------

CLI > open aac0
Executing: open "aac0"

AAC0> container list /f
Executing: container list /full=TRUE
Num          Total  Oth Chunk          Scsi   Partition                                      Creation        System
Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size   State   RO Lk Task    Done%  Ent Date   Time     Files
----- ------ ------ --- ------ ------- ------ ------------- ------- -- -- ------- ------ --- ------ -------- ------
 0    Mirror 33.9GB            Open    0:01:0 64.0KB:33.9GB Normal                         0 071002 05:39:32
 /dev/aacd0           mirror0          0:00:0 64.0KB:33.9GB Normal                         1 071002 05:39:32

 1    Mirror 33.9GB            Open    0:02:0 64.0KB:33.9GB Normal                         0 071002 05:39:50
 /dev/aacd1           mirror1          0:03:0 64.0KB:33.9GB Normal                         1 071002 05:39:50

AAC0> disk list /f
Executing: disk list /full=TRUE

B:ID:L Device Type    Removable media Vendor-ID Product-ID       Rev   Blocks    Bytes/Block Usage            Shared Rate
------ -------------- --------------- --------- ---------------- ----- --------- ----------- ---------------- ------ ----
0:00:0 Disk           N               FUJITSU   MAJ3364MC        3702  71390320  512         Initialized      NO     160
0:01:0 Disk           N               FUJITSU   MAJ3364MC        3702  71390320  512         Initialized      NO     160
0:02:0 Disk           N               FUJITSU   MAJ3364MC        3702  71390320  512         Initialized      NO     160
0:03:0 Disk           N               FUJITSU   MAJ3364MC        3702  71390320  512         Initialized      NO     160

AAC0> disk show smart
Executing: disk show smart

       Smart   Method of        Enable
       Capable Informational    Exception Performance Error
B:ID:L Device  Exceptions(MRIE) Control   Enabled     Count
------ ------- ---------------- --------- ----------- ------
0:00:0 Y       6                Y         N           0
0:01:0 Y       6                Y         N           0
0:02:0 Y       6                Y         N           0
0:03:0 Y       6                Y         N           0
0:06:0 N

AAC0> task list
Executing: task list

Controller Tasks

TaskId Function Done%   Container State Specific1 Specific2
------ -------- ------- --------- ----- --------- ---------

No tasks currently running on controller

AAC0> dia sh hi
Executing: diagnostic show history
No switches specified, defaulting to "/current".

*** HISTORY BUFFER FROM CURRENT CONTROLLER RUN ***

[00]: GetDiskLogEntry: container - 1, entry return 0
[01]: Container 1 started SCRUB task
[02]: Starting Mirror:1 scrub
[03]: Master disk: 2, start sector: 128, sector count = 71286784
[04]: Slave disk: 3, start sector: 128, sector count = 71286784
[05]: UpdateDiskLogIndex - Set - container 0, index 1
[06]: GetDiskLogEntry: container - 0, entry return 1
[07]: Container 0 started SCRUB task
[08]: Starting Mirror:0 scrub
[09]: Master disk: 1, start sector: 128, sector count = 71286784
[10]: Slave disk: 0, start sector: 128, sector count = 71286784
[11]: Mirror Scrub Container:1 ErrorsFound:0
[12]: Clear disk log: sector - 80, driveno 2
[13]: Clear disk log: sector - 80, driveno 3
[14]: Container 1 completed SCRUB task:
[15]: Mirror Scrub Container:0 ErrorsFound:0
[16]: Clear disk log: sector - 81, driveno 1
[17]: Clear disk log: sector - 81, driveno 0
[18]: Container 0 completed SCRUB task:
[19]: UpdateDiskLogIndex - Set - container 0, index 0
[20]: GetDiskLogEntry: container - 0, entry return 0
[21]: Container 0 started SCRUB task
[22]: Starting Mirror:0 scrub
[23]: Master disk: 1, start sector: 128, sector count = 71286784
[24]: Slave disk: 0, start sector: 128, sector count = 71286784
[25]: UpdateDiskLogIndex - Set - container 1, index 1
[26]: GetDiskLogEntry: container - 1, entry return 1
[27]: Container 1 started SCRUB task
[28]: Starting Mirror:1 scrub
[29]: Master disk: 2, start sector: 128, sector count = 71286784
[30]: Slave disk: 3, start sector: 128, sector count = 71286784
[31]: Mirror Scrub Container:1 ErrorsFound:0
[32]: Clear disk log: sector - 81, driveno 2
[33]: Clear disk log: sector - 81, driveno 3
[34]: Container 1 completed SCRUB task:
[35]: Mirror Scrub Container:0 ErrorsFound:0
[36]: Clear disk log: sector - 80, driveno 1
[37]: Clear disk log: sector - 80, driveno 0
[38]: Container 0 completed SCRUB task:
[39]: UpdateDiskLogIndex - Set - container 0, index 0
[40]: GetDiskLogEntry: container - 0, entry return 0
[41]: Container 0 started SCRUB task
[42]: Starting Mirror:0 scrub
[43]: Master disk: 1, start sector: 128, sector count = 71286784
[44]: Slave disk: 0, start sector: 128, sector count = 71286784
[45]: UpdateDiskLogIndex - Set - container 1, index 1
[46]: GetDiskLogEntry: container - 1, entry return 1
[47]: Container 1 started SCRUB task
[48]: Starting Mirror:1 scrub
[49]: Master disk: 2, start sector: 128, sector count = 71286784
[50]: Slave disk: 3, start sector: 128, sector count = 71286784
[51]: Mirror Scrub Container:1 ErrorsFound:0
[52]: Clear disk log: sector - 81, driveno 2
[53]: Clear disk log: sector - 81, driveno 3
[54]: Container 1 completed SCRUB task:
[55]: Mirror Scrub Container:0 ErrorsFound:0
[56]: Clear disk log: sector - 80, driveno 1
[57]: Clear disk log: sector - 80, driveno 0
[58]: Container 0 completed SCRUB task:
[59]: UpdateDiskLogIndex - Set - container 0, index 0
[60]: GetDiskLogEntry: container - 0, entry return 0
[61]: Container 0 started SCRUB task
[62]: Starting Mirror:0 scrub
[63]: Master disk: 1, start sector: 128, sector count = 71286784
[64]: Slave disk: 0, start sector: 128, sector count = 71286784
[65]: UpdateDiskLogIndex - Set - container 1, index 1
[66]: GetDiskLogEntry: container - 1, entry return 1
[67]: Container 1 started SCRUB task
[68]: Starting Mirror:1 scrub
[69]: Master disk: 2, start sector: 128, sector count = 71286784
[70]: Slave disk: 3, start sector: 128, sector count = 71286784
[71]: Mirror Scrub Container:1 ErrorsFound:0
[72]: Clear disk log: sector - 81, driveno 2
[73]: Clear disk log: sector - 81, driveno 3
[74]: Container 1 completed SCRUB task:
[75]: Mirror Scrub Container:0 ErrorsFound:0
[76]: Clear disk log: sector - 80, driveno 1
[77]: Clear disk log: sector - 80, driveno 0
[78]: Container 0 completed SCRUB task:
[79]: UpdateDiskLogIndex - Set - container 0, index 0
[80]: GetDiskLogEntry: container - 0, entry return 0
[81]: Container 0 started SCRUB task
[82]: Starting Mirror:0 scrub
[83]: Master disk: 1, start sector: 128, sector count = 71286784
[84]: Slave disk: 0, start sector: 128, sector count = 71286784
[85]: UpdateDiskLogIndex - Set - container 1, index 1
[86]: GetDiskLogEntry: container - 1, entry return 1
[87]: Container 1 started SCRUB task
[88]: Starting Mirror:1 scrub
[89]: Master disk: 2, start sector: 128, sector count = 71286784
[90]: Slave disk: 3, start sector: 128, sector count = 71286784
[91]: Mirror Scrub Container:1 ErrorsFound:0
[92]: Clear disk log: sector - 81, driveno 2
[93]: Clear disk log: sector - 81, driveno 3
[94]: Container 1 completed SCRUB task:
[95]: Mirror Scrub Container:0 ErrorsFound:0
[96]: Clear disk log: sector - 80, driveno 1
[97]: Clear disk log: sector - 80, driveno 0
[98]: Container 0 completed SCRUB task:
[99]: ========================
History Output Complete.

AAC0>
AAC0> exit
Executing: exit
press enter when ready to run verify <INS>
---------------------------------------------------------------------------------------------
Adaptec SCSI RAID Controller Command Line Interface
Copyright 1998-2002 Adaptec, Inc. All rights reserved
---------------------------------------------------------------------------------------------

CLI > open aac0
Executing: open "aac0"

AAC0> contai scr 0
Executing: container scrub 0

AAC0> contai scr 1
Executing: container scrub 1

AAC0> exit
Executing: exit
when done run:
aaccli open aac0 dia sh hi c
Nov 1 10:32:46 mail /kernel: aac0: **Monitor** Container 0 started SCRUB task
Nov 1 10:32:47 mail /kernel: aac0: **Monitor** Container 1 started SCRUB task
</pre>

Here's an analysis of what we're seeing and what we're looking for:

<pre>
AAC0> container list /f
Executing: container list /full=TRUE
Num          Total  Oth Chunk          Scsi   Partition                                      Creation        System
Label Type   Size   Ctr Size   Usage   B:ID:L Offset:Size   State   RO Lk Task    Done%  Ent Date   Time     Files
----- ------ ------ --- ------ ------- ------ ------------- ------- -- -- ------- ------ --- ------ -------- ------
 0    Mirror 33.9GB            Open    0:01:0 64.0KB:33.9GB Normal                         0 071002 05:39:32
 /dev/aacd0           mirror0          0:00:0 64.0KB:33.9GB Normal                         1 071002 05:39:32

 1    Mirror 33.9GB            Open    0:02:0 64.0KB:33.9GB Normal                         0 071002 05:39:50
 /dev/aacd1           mirror1          0:03:0 64.0KB:33.9GB Normal                         1 071002 05:39:50
</pre>

This shows the health of the arrays. You're looking for ''Normal'' under the State column and the absence of a <tt>!</tt> in the Partition Offset:Size column. Sometimes you'll see <tt>64.0KB!33.9GB</tt> there, which indicates a problem.

<pre>
AAC0> disk show smart
Executing: disk show smart

       Smart   Method of        Enable
       Capable Informational    Exception Performance Error
B:ID:L Device  Exceptions(MRIE) Control   Enabled     Count
------ ------- ---------------- --------- ----------- ------
0:00:0 Y       6                Y         N           0
0:01:0 Y       6                Y         N           0
0:02:0 Y       6                Y         N           0
0:03:0 Y       6                Y         N           0
0:06:0 N
</pre>

This shows the SMART report. Look for nonzero values in the Error Count column.
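If you save the <tt>container list /f</tt> output to a file, the check described above (''Normal'' state, no <tt>!</tt> in the Offset:Size field) can be roughly automated. The following is only a sketch: the function name and the regular expression are our own assumptions, based solely on the sample output on this page, and have not been checked against other controller firmware.

```python
import re

# Assumption: the partition field looks like "64.0KB:33.9GB" (healthy) or
# "64.0KB!33.9GB" (problem) and is followed by the State word, as in the
# sample output on this page.
PARTITION_RE = re.compile(
    r"(\d+(?:\.\d+)?[KMGT]B)([:!])(\d+(?:\.\d+)?[KMGT]B)\s+(\S+)"
)


def find_container_problems(listing: str) -> list:
    """Return a list of human-readable problems found in the listing text."""
    problems = []
    for match in PARTITION_RE.finditer(listing):
        offset, separator, size, state = match.groups()
        field = f"{offset}{separator}{size}"
        if separator == "!":
            problems.append(f"{field}: '!' in the Offset:Size field")
        if state != "Normal":
            problems.append(f"{field}: state is {state}, expected Normal")
    return problems


if __name__ == "__main__":
    sample = " 0 Mirror 33.9GB Open 0:01:0 64.0KB:33.9GB Normal 0 071002 05:39:32"
    print(find_container_problems(sample))
```

An empty list means every partition line matched by the pattern looked healthy. It does not replace reading the output yourself.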
<pre>
AAC0> task list
Executing: task list

Controller Tasks

TaskId Function Done%   Container State Specific1 Specific2
------ -------- ------- --------- ----- --------- ---------

No tasks currently running on controller
</pre>

Look for the absence of running tasks. It would be a bad sign to see a rebuild or verify running when you didn't initiate it.

With the history output, you're looking for any anomalies or events since the last time a verify was run. If you see a drive with lots of problems, you may want to take backups before allowing the verify to run, since it could replicate errors onto the good drive.

After the history output, the script prompts you to press enter to run the verify. If you're happy with all the output you're seeing (the mirror is healthy and the history looks good), it's safe to proceed. Otherwise, ^C to exit.

After you hit enter, it starts the verify and begins tailing the messages log file (so you can easily see when the verify is complete). Here's what that'll look like:

<pre>
Nov 1 14:38:08 mail /kernel: aac0: **Monitor** Container 1 completed SCRUB task:
Nov 1 14:46:45 mail /kernel: aac0: **Monitor** Container 0 completed SCRUB task:
</pre>

So, putting it all together, after hitting enter to start the verify, you'll see:

<pre>
---------------------------------------------------------------------------------------------
Adaptec SCSI RAID Controller Command Line Interface
Copyright 1998-2002 Adaptec, Inc. All rights reserved
---------------------------------------------------------------------------------------------

CLI > open aac0
Executing: open "aac0"

AAC0> contai scr 0
Executing: container scrub 0

AAC0> contai scr 1
Executing: container scrub 1

AAC0> exit
Executing: exit
when done run:
aaccli open aac0 dia sh hi c
Nov 1 10:32:46 mail /kernel: aac0: **Monitor** Container 0 started SCRUB task
Nov 1 10:32:47 mail /kernel: aac0: **Monitor** Container 1 started SCRUB task
</pre>

If the server has multiple logical drives, the scrubs (verifies) run in parallel. When they are complete, exit the tail of the log file (^C) and run:

<pre>
aaccli open aac0 dia sh hi c
</pre>

This shows the diagnostic history. You're looking for the results of the most recent scrub:

<pre>
[100]: Mirror Scrub Container:1 ErrorsFound:0
[101]: Clear disk log: sector - 81, driveno 2
[102]: Clear disk log: sector - 81, driveno 3
[103]: Container 1 completed SCRUB task:
[104]: Mirror Scrub Container:0 ErrorsFound:0
[105]: Clear disk log: sector - 80, driveno 1
[106]: Clear disk log: sector - 80, driveno 0
[107]: Container 0 completed SCRUB task:
</pre>

^C to exit the RAID CLI.

If you see:

<pre>
[104]: Mirror Scrub Container:0 ErrorsFound:5
</pre>

you'll want to rerun the verify on that container until it shows 0, or perhaps replace the drive. You should be able to see from the output which drive had the problem.

Depending on the size of the drive and how busy it is, the verify can take anywhere from an hour to the better part of a day.

You will notice that the diagnostic history is not shown on our modern Adaptec cards (i.e. any Adaptec card not in a Dell 2450). The reason is that the history is never cleared, so there's simply too much data to show and it just crashes the CLI. So don't bother trying to see it. This does make it hard to tell whether problems are occurring, so you just need to watch the scrub to see that it reaches 100%.
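On controllers where the diagnostic history can be displayed, picking out the most recent scrub result per container can be sketched as below. This assumes the history text has been saved or pasted into a string, that the entry format matches the samples on this page, and that later entries are more recent. The function names are hypothetical.

```python
import re

# Assumption: scrub results appear exactly as in the history samples above,
# e.g. "[104]: Mirror Scrub Container:0 ErrorsFound:5".
SCRUB_RESULT_RE = re.compile(r"Mirror Scrub Container:(\d+)\s+ErrorsFound:(\d+)")


def latest_scrub_errors(history: str) -> dict:
    """Map container number to the ErrorsFound value of its last listed scrub."""
    latest = {}
    for match in SCRUB_RESULT_RE.finditer(history):
        container, errors = match.groups()
        # Later entries overwrite earlier ones, so the last one listed wins.
        latest[int(container)] = int(errors)
    return latest


def containers_needing_rerun(history: str) -> list:
    """Containers whose most recent scrub reported one or more errors."""
    return sorted(c for c, errors in latest_scrub_errors(history).items() if errors > 0)


if __name__ == "__main__":
    sample = (
        "[100]: Mirror Scrub Container:1 ErrorsFound:0\n"
        "[104]: Mirror Scrub Container:0 ErrorsFound:5\n"
    )
    print(containers_needing_rerun(sample))
```

Any container returned by <tt>containers_needing_rerun</tt> is a candidate for another verify, as described above.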
You will also notice that on some servers there's no tail of messages. Again, this is because no data about the completion of the scrub is logged there. The thing to do here is to go into the CLI and keep running <tt>task list</tt> to monitor scrub progress.

See [[RAIC_CLI#Adaptec|Adaptec RAID CLI Reference]] for more details on how to use the CLI.
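The idea of repeatedly checking the task list until the scrub has finished can be sketched as a small polling loop. How the task list text is obtained is deliberately left to the caller: <tt>fetch_task_list</tt> is a hypothetical callable that you would have to supply, since this page does not document a non-interactive way to run the CLI. The idle message is taken from the sample output above.

```python
import time
from typing import Callable

# Taken from the sample "task list" output on this page.
IDLE_MARKER = "No tasks currently running on controller"


def controller_is_idle(task_list_output: str) -> bool:
    """True if the task list text reports that nothing is running."""
    return IDLE_MARKER in task_list_output


def wait_for_scrub(
    fetch_task_list: Callable[[], str],  # hypothetical: returns task list text
    poll_seconds: float = 300.0,
    max_polls: int = 500,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the controller is idle. Return False if max_polls runs out."""
    for _ in range(max_polls):
        if controller_is_idle(fetch_task_list()):
            return True
        sleep(poll_seconds)
    return False


if __name__ == "__main__":
    # Demonstration with canned text instead of a real controller.
    print(wait_for_scrub(lambda: IDLE_MARKER, poll_seconds=0.0))
```

The default poll interval and limit are arbitrary. Given that a verify can take most of a day, a long interval is probably fine.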