We are running into a situation on UNIX and Linux systems where a file system and/or a disk goes into read-only mode.
At this point, most system and application functionality stops because the system can no longer write to files. Monitoring for this situation is difficult. Once a system gets into R/O mode, we usually cannot log in to check anything, such as log output, to detect the R/O condition.
Does anyone have any ideas for a solution?
Are there any clues in the SNMP polling data that might indicate the problem is occurring?
When the problem is active, is it normally an application failure, caused by the inability to write to the disk, that reveals the problem's existence?
As far as I can tell from the data in the CAPC Dashboard, we keep collecting data. So I don’t know if anything in CAPC will really help indicate this problem, but I thought I would ask.
FYI, I sent this same question to the UIM Community.
To your second question: yes, one way we find out is that the application people call because their app is not functioning, since it can no longer write to the disk(s). Another indication is that when you try to ssh to the system, it will not let you in. Again, I believe this is because the system cannot write to various log files.
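
For what it's worth, here is a minimal sketch of a local check in Python. It assumes a Linux host where /proc/mounts is readable; other UNIX systems keep the mount table elsewhere. The function name read_only_mounts and the list of ignored file system types are my own placeholders, not anything from CAPC or UIM. It only flags mounts whose option list contains ro. Since the box may not be able to write locally when this happens, the result would have to be sent off-host somehow, which is not shown here.

```python
# Sketch only: list mount points that are currently mounted read-only.
# Assumes the Linux /proc/mounts format: device mountpoint fstype options dump pass

# Hypothetical default: pseudo or intentionally read-only file system types to skip.
IGNORED_FSTYPES = ("proc", "sysfs", "tmpfs", "devtmpfs", "squashfs", "iso9660")


def read_only_mounts(mounts_text, ignore_fstypes=IGNORED_FSTYPES):
    """Return the mount points whose mount options include 'ro'."""
    read_only = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point, fstype, options = fields[1], fields[2], fields[3]
        if fstype in ignore_fstypes:
            continue
        # Compare whole options so 'errors=remount-ro' is not mistaken for 'ro'.
        if "ro" in options.split(","):
            read_only.append(mount_point)
    return read_only


if __name__ == "__main__":
    with open("/proc/mounts") as mounts_file:
        for mount_point in read_only_mounts(mounts_file.read()):
            print("READ-ONLY:", mount_point)
```

Run from something already resident in memory (an agent or a long-running script), this at least avoids needing an interactive login at the moment the file system flips.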