
A Sybase crash case

December 2, 2015 · Sybase

This morning the front-end application became unreachable. The Sybase log was full of logical page errors:

kernel  sddone: Input/output error
server bufwritedes: write error detected - spid=7, ppage=668672, bvirtpg=50642944, dbid=2
server Checkpoint process detected hardware error writing logical page '668672', virtual page '50642944' for dbid '2', cache 'default data cache'. It will sleep until write completes successfully.
kernel sddone: write error on virtual disk 3 block 311296:


server Error: 1251, Severity: 26, State: 1 
server An in-use preallocated semaphore cursor was encountered. 
server Error while undoing log row in database 'tempdb'. Rid pageid = 0x0; row num = 0x0. 
server WARNING: Pss found with open sdes. pspid 8, psuid 1708999780, pcurdb 66, system table entry 1, sdesp 0x0x000001, objid 8 
server WARNING: Pss found with open sdes. pspid 8, psuid 1708999780, pcurdb 66, system table entry 1, sdesp 0x0x000003, objid 8 
server Error: 6103, Severity: 17, State: 1 
server Unable to do cleanup for the killed process; received Msg 3300. 
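When the errorlog is this noisy, it can help to count the write errors per database before reading any further. The following is a minimal sketch, assuming the `bufwritedes` lines always follow the format seen in the excerpt above; the function name `summarize_write_errors` and the sample list are illustrative, not part of any Sybase tool.

```python
import re
from collections import Counter

# Matches the bufwritedes line format seen in the errorlog excerpt.
# Assumption: other ASE versions may word this line differently.
WRITE_ERR = re.compile(
    r"bufwritedes: write error detected - spid=(\d+), ppage=(\d+), "
    r"bvirtpg=(\d+), dbid=(\d+)"
)


def summarize_write_errors(lines):
    """Return a dict mapping dbid -> number of bufwritedes write errors."""
    counts = Counter()
    for line in lines:
        match = WRITE_ERR.search(line)
        if match:
            counts[int(match.group(4))] += 1
    return dict(counts)


# Two lines copied from the log excerpt; only the second one matches.
sample = [
    "kernel  sddone: Input/output error",
    "server bufwritedes: write error detected - spid=7, ppage=668672, "
    "bvirtpg=50642944, dbid=2",
]
print(summarize_write_errors(sample))
```

In this incident every write error pointed at dbid 2 (tempdb), which already hints at a problem below the database layer rather than in one user database.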

Error 1251

Severity

26

Message text

An in-use preallocated semaphore cursor was encountered. 

Explanation

A semaphore is the position at the head of a queue of locks, and is used by the server’s Lock Manager to ensure that tasks obtain valid and compatible locks. Error 1251 is raised when a task attempts to access a semaphore from a position in the queue that is already in use.

1251 errors are due to an Adaptive Server locking synchronization problem and result in a stack trace.

Some situations that can raise a 1251 error:

  • When you drop a column using alter table on an all-pages-locked (APL) table with a clustered index; and the server is configured for parallel processing;
  • If you create or alter a database in a device on AIX and the specified size is too large;
  • If a query in chained transaction mode invokes a Java method using iJDBC;
  • If there are stranded data extents in the log segment of a dedicated log.

Action

Call Sybase Technical Support for assistance; you may be able to upgrade to an Adaptive Server Enterprise version where the problem is resolved.

Additional information

If the error was raised while creating or altering a database on AIX, note that the space actually available on a device is slightly smaller than the device size; specifying a slightly smaller size when you create or alter the database may help resolve the problem.

If the error was raised while executing alter table drop column, disable parallel processing with sp_configure "max scan parallel degree" as a workaround.

If the error was raised invoking a Java method, check your query. Commands like begin tran, commit, rollback, set chained on/off are not allowed in nested SQL. See Java in Adaptive Server Enterprise for details.

Before calling Sybase Technical Support, have the information available that is listed in “Reporting errors”, including the query that raised the error.

Versions in which this error is raised

All versions

 

The system log showed:

cciss: cmd f6500000 has CHECK CONDITION sense key = 0x3
Buffer I/O error on device cciss/c0d0p3, logical block 156817996
lost page write due to I/O error on cciss/c0d0p3
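Kernel messages like these can be grouped by device to see which partition is failing. A minimal sketch, assuming the `Buffer I/O error on device ..., logical block ...` wording shown above; `failed_blocks_by_device` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches the kernel message format shown in the system log excerpt.
BUF_IO_ERR = re.compile(
    r"Buffer I/O error on device ([^\s,]+), logical block (\d+)"
)


def failed_blocks_by_device(lines):
    """Return a dict mapping device name -> list of failed logical blocks."""
    blocks = defaultdict(list)
    for line in lines:
        match = BUF_IO_ERR.search(line)
        if match:
            blocks[match.group(1)].append(int(match.group(2)))
    return dict(blocks)


# Lines copied from the system log excerpt.
sample = [
    "cciss: cmd f6500000 has CHECK CONDITION sense key = 0x3",
    "Buffer I/O error on device cciss/c0d0p3, logical block 156817996",
    "lost page write due to I/O error on cciss/c0d0p3",
]
print(failed_blocks_by_device(sample))
```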

However, the disk status showed nothing abnormal.

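By this point isql logins to Sybase were hanging as well, and the server could not be shut down cleanly; the only option was to kill the Sybase process.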

After restarting Sybase and logging in, I checked the database status. It was a complete mess:

1> select name,dbid,status from sysdatabases
2> go
name                           dbid   status
------------------------------ ------ ------
zzzzz1                             10    588
zzzzzz2                            11    588
zzzzzmaster                         9    584
xxxxx1                              6     76
xxxxxx2                             7     76
xxxxx3                              8     76
xxxxxmaster                         5     64
master                              1      0
model                               3      0
sybsystemdb                     31513      0
sybsystemprocs                  31514    584
tempdb                              2    588
tempdb1                             4     76
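The status column in sysdatabases is a bitmask, so the raw numbers are easier to read once decoded. The sketch below assumes the bit meanings as I recall them from the ASE reference manual's sysdatabases entry; check the table for your ASE version before relying on it. `STATUS_BITS` and `decode_status` are illustrative names.

```python
# Assumed bit meanings for sysdatabases.status, recalled from the ASE
# reference manual; verify against the documentation for your version.
STATUS_BITS = {
    0x01: "upgrade started",
    0x02: "upgrade successful",
    0x04: "select into/bulkcopy",
    0x08: "trunc log on chkpt",
    0x10: "no chkpt on recovery",
    0x20: "created for load / crashed while loading",
    0x40: "recovery started",
    0x100: "database suspect",
    0x200: "ddl in tran",
    0x400: "read only",
    0x800: "dbo use only",
    0x1000: "single user",
    0x2000: "allow nulls by default",
}


def decode_status(status):
    """Return the names of all assumed status bits set in `status`."""
    return [name for bit, name in sorted(STATUS_BITS.items()) if status & bit]


# Decode the distinct status values from the isql output.
for value in (588, 584, 76, 64, 0):
    print(value, decode_status(value))
```

Under these assumed meanings, 588 breaks down as 512 + 64 + 8 + 4, and the suspect bit (256) would be the one to look for first after a crash like this.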

Today's backup was still intact, so I copied it off the server via FTP.

I ran fsck to check the file system; it was very slow.

After a reboot the server did not answer ping for a long time. A look in the server room showed it was running fsck on another file system.

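At this point a red light came on for one of the server's hard disks, and on the next reboot the disk turned out to have failed. The server is 6 to 7 years old, so its reliability has dropped considerably.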


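Once fsck had finished, the server booted normally and Sybase started without trouble. The problem turned out to have nothing to do with Sybase; all sorts of strange issues come up.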

And here I was getting ready to repair Sybase...
