NFS tuning over 100 Gbps Infiniband
Posted: 2019/08/09 05:46:32
I'm currently running CentOS 7.6.1810 on my four-node micro compute cluster, and each node has a Mellanox ConnectX-4 100 Gbps 4x EDR InfiniBand adapter.
I've installed the 'Infiniband Support' software group from the installation media, OpenSM is running, and IPs have been assigned, so everything is looking good.
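For what it's worth, this is roughly how I've been sanity-checking the links (assuming ibstat and ibv_devinfo from the infiniband-diags and libibverbs-utils packages are installed):
Code: Select all
# Confirm each HCA port is Active and linked at 4X EDR (~100 Gbps)
ibstat

# Show the port state, active MTU, link width and speed
ibv_devinfo -v | grep -E "state|active_mtu|active_width|active_speed"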
Here is how the host is set up:
Code: Select all
$ cat /etc/exports
/home/user/cluster *(rw,sync,no_root_squash,no_all_squash,no_subtree_check)
Here is how the client is set up:
Code: Select all
$ cat /etc/fstab
...
aes1:/home/user/cluster /home/user/cluster nfs defaults 0 0
$ cat /proc/mounts
...
aes1:/home/user/cluster /home/user/cluster nfs4 rw,relatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,clientaddr=10.0.1.2,local_lock=none,addr=10.0.1.1 0 0
The NFS share physically sits on an Intel 545 Series 1 TB SATA 6 Gbps SSD, but I'm only able to get around 300 MB/s max.
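Before blaming NFS itself, my plan is to time the SSD and the IPoIB link separately so I know which one is responsible for that roughly 300 MB/s ceiling; something like this is what I had in mind (the test file name is made up, and iperf3 comes from EPEL, so substitute whatever tools you prefer):
Code: Select all
# On the host: sequential write speed of the SSD itself, bypassing the page cache
# (ddtest.bin is just a throwaway test file)
dd if=/dev/zero of=/home/user/cluster/ddtest.bin bs=1M count=8192 oflag=direct conv=fsync

# Raw TCP throughput over IPoIB, independent of NFS
# (start "iperf3 -s" on the host at 10.0.1.1 first, then run this on a client)
iperf3 -c 10.0.1.1 -t 30 -P 4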
The IB NIC is currently running in datagram mode. From my other post, I now know how to switch it to connected mode (so that I can take the MTU up from 2044 to 4096, 9216, or something like that; I'll have to play around with it), but I haven't done that yet because the system is currently busy finishing up an analysis for me.
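Once the analysis finishes, I believe the switch would look roughly like this, assuming the IPoIB interface is ib0 (and from what I've read, connected mode lets the MTU go all the way up to 65520, not just 4096/9216):
Code: Select all
# Temporary change: switch IPoIB to connected mode and raise the MTU
echo connected > /sys/class/net/ib0/mode
ip link set dev ib0 mtu 65520

# Persistent version: add these to /etc/sysconfig/network-scripts/ifcfg-ib0
#   CONNECTED_MODE=yes
#   MTU=65520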
Besides that, I was wondering whether there are other tuning parameters I can set (e.g. TCP_PAYLOAD_SIZE?) in order to improve NFS transfer performance.
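On the plain TCP side, the only knobs I'm aware of are the usual kernel socket-buffer sysctls rather than anything NFS-specific; something along these lines is what I was thinking of trying (the file name and the values are just my guesses for a high-bandwidth link, not recommendations):
Code: Select all
# /etc/sysctl.d/90-nfs-ib.conf (hypothetical file name)
# Bigger socket buffers so a single TCP stream can fill a fast, low-latency pipe
net.core.rmem_max = 67108864
net.core.wmem_max = 67108864
net.ipv4.tcp_rmem = 4096 87380 67108864
net.ipv4.tcp_wmem = 4096 65536 67108864

# Apply with: sysctl -p /etc/sysctl.d/90-nfs-ib.conf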
Once the system finishes its current analysis, I'll re-run dd to generate some data for everybody to review; in the meantime, I'd appreciate any general recommendations on common, high-level NFS tuning parameters I could employ.
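When I do re-run dd, I'll run it from a client against the NFS mount, and I'll probably also try raising the server-side nfsd thread count while I'm at it; roughly this (the file name and thread count are just examples):
Code: Select all
# On a client: sequential write through the NFS mount (nfs-ddtest.bin is a throwaway file)
dd if=/dev/zero of=/home/user/cluster/nfs-ddtest.bin bs=1M count=8192 conv=fsync

# On the host: raise the nfsd thread count from the CentOS 7 default of 8 by
# setting RPCNFSDCOUNT=16 in /etc/sysconfig/nfs, then:
systemctl restart nfs-server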
I did try looking up TCP_PAYLOAD_SIZE again, for example, but I couldn't find it in any of the NFS tuning guides online, so I wasn't sure what else I could do.
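The other thing I keep seeing mentioned for InfiniBand specifically is NFS over RDMA instead of IPoIB + TCP; if I'm reading the docs correctly, it would look roughly like this, but I haven't tried it yet, so please treat it as a sketch:
Code: Select all
# On the host, with the NFS server already running: load the server-side RDMA
# transport and tell nfsd to listen on the standard NFS/RDMA port
modprobe svcrdma
echo "rdma 20049" > /proc/fs/nfsd/portlist

# On the client: load the client-side transport and mount with the RDMA protocol
modprobe xprtrdma
mount -o rdma,port=20049 aes1:/home/user/cluster /home/user/cluster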
I did read that some people recommend asynchronous operation, but I don't think I can do that: the NFS share is used to centrally store the results from my analyses, so synchronous writes give me some "peace of mind", despite the fact that the system is connected to a 3 kW UPS.
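For completeness, if I ever did decide to trade that safety for speed, I believe the only change would be swapping sync for async in the export (not something I plan to do for results data, since a server crash could lose writes the clients think have already landed):
Code: Select all
# /etc/exports with async -- faster writes, but acknowledged data can be lost on a crash
/home/user/cluster *(rw,async,no_root_squash,no_all_squash,no_subtree_check)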
Thank you.