*Nix hosts with NFS connectivity issues
Applications that read and write data under a directory path encounter problems if that path is missing or not functioning correctly. Directories mounted from a Network File System (NFS) file share can fail for a variety of reasons, so you want to monitor them.
Data required
Procedure
Run the following search. You can optimize it by specifying an index and adjusting the time range.
source="/var/log/messages" nfs ("not responding" OR "still trying") | rex "server (?<nfs_host>\S+)(\s*(?<message>.*))?" | table _time host nfs_host message
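As a sketch of the optimization mentioned above, the same search can be scoped to an index and a time range. The index name `os` and the 24-hour window are assumptions for illustration; substitute the index that holds your *nix syslog data and a window that suits your monitoring schedule.

```spl
index=os earliest=-24h source="/var/log/messages" nfs ("not responding" OR "still trying") ```index=os and earliest=-24h are assumed values```
| rex "server (?<nfs_host>\S+)(\s*(?<message>.*))?"
| table _time host nfs_host message
```

Restricting the index and time range up front limits the events Splunk has to scan before the `rex` extraction runs.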
Search explanation
The table provides an explanation of what each part of this search achieves. You can adjust this query based on the specifics of your environment.
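Splunk Search | Explanation |
---|---|
`source="/var/log/messages"` | Search the global system messages log file. |
`nfs ("not responding" OR "still trying")` | Get NFS log messages that include the text "not responding" or "still trying". |
`\|rex "server (?<nfs_host>\S+)(\s*(?<message>.*))?"` | Use regex to capture the NFS server name in the nfs_host field and the rest of the line in the message field. |
`\|table _time host nfs_host message` | Display the results in a table with columns in the order shown. |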
Next steps
Sample results for this search are shown in the table below. Use the results of this procedure to detect any machines in your environment where an NFS mount can't be reached.
_time | client | NFS server | message |
---|---|---|---|
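One way to go from individual events to a list of affected machines is to summarize by client and NFS server. The sketch below reuses the field extraction from the procedure; `last_seen` is an illustrative field name, not something the original search produces.

```spl
source="/var/log/messages" nfs ("not responding" OR "still trying")
| rex "server (?<nfs_host>\S+)(\s*(?<message>.*))?"
| stats count latest(_time) AS last_seen BY host nfs_host ```last_seen is an illustrative field name```
| convert ctime(last_seen)
```

Each row then represents one client and NFS server pair, with the number of matching messages and the time of the most recent one.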
NFS shares can be mounted in two ways, soft or hard. How a client behaves when NFS errors occur, and the messages it logs, depend on the mount type. Generally, a soft mount times out and returns an error to the application, while a hard mount keeps retrying. Hard mounts are usually preferred because they protect data more robustly, but processes on the client hang until the server responds. If the clients only read from the NFS mount, as a web server does when serving static content, a soft mount might be preferable.
Errors on the NFS server can range from dependent processes not running to the server being too busy. When troubleshooting, also check the network to make sure that the client can reach the NFS server. Then, run a search similar to the one given here, but filter for the NFS server and look for error states in the raw logs.
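A server-side follow-up search might look like the sketch below. The host value `nfs01` and the keyword list are placeholders chosen for illustration; use the nfs_host value returned by the earlier search and whatever error terms your NFS server actually logs.

```spl
source="/var/log/messages" host=nfs01 nfs* (error OR fail* OR timeout) ```host and keywords are placeholder assumptions```
| table _time host _raw
```

Keeping `_raw` in the table lets you read the full log lines when looking for error states.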
Finally, you might be interested in other processes associated with the Maintaining *nix systems use case.