Kelly,
 
Maybe I am missing something as to why this is a requirement. Is a ring configuration using RSTP a hard requirement? If it is, I don't have an answer that I think would help. I know RSTP allows for fast convergence after a failure; I just haven't come across a case where that benefit outweighed the complexity at scale. We tried to test RSTP when I was a cluster administrator at a university (802.1w must be better than 802.1D, right?) because a professor insisted that the performance of distributed operations would be better. This was a 4-rack cluster of ~70 nodes. The performance tanked. After a lot of trial and error, we settled on an architecture using STP, which I am attaching a drawing of.

If redundancy and ease of operation are what you want - I would use redundant switches and use Linux to create a bonded interface in an active-passive (active-backup) state. You will have to use a non-LACP bond (teaming), as LACP does not work across independent switches. Your switches' backplanes and uplinks will be the only bottlenecks in the network. Most enterprise switch manufacturers build a backplane that can handle the traffic of all the ports combined at theoretical max.
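
As a rough sketch of the bonded interface described above, the small Python helper below only prints the iproute2 commands for an active-backup bond; it does not run them. The interface names (bond0, eth0, eth1) and the 100 ms link-monitor interval are assumptions for illustration, not values from the setup described here, so adjust them to your hardware and distro.

```python
def active_backup_bond_commands(bond, nics, miimon_ms=100):
    """Build iproute2 commands for an active-backup (non-LACP) bond.

    bond, nics and miimon_ms are illustrative placeholders.
    """
    # Create the bond device in active-backup mode with MII link monitoring.
    cmds = [f"ip link add {bond} type bond mode active-backup miimon {miimon_ms}"]
    for nic in nics:
        # A NIC is taken down before it is enslaved to the bond.
        cmds.append(f"ip link set {nic} down")
        cmds.append(f"ip link set {nic} master {bond}")
    # Bring the bond itself up last.
    cmds.append(f"ip link set {bond} up")
    return cmds


# Hypothetical names: eth0 cabled to switch A, eth1 cabled to switch B.
for cmd in active_backup_bond_commands("bond0", ["eth0", "eth1"]):
    print(cmd)
```

Only one NIC carries traffic at a time in this mode, which is why it works with two switches that know nothing about each other.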

Connect the 2 switches to each other with a 2 or 4 port LACP bond or, if you use switches that have proprietary stacking cables, use the stacking cable. Also have an LACP bond to the upstream switching.

Hopefully the attached drawing will help.

I have run clusters of over 2500 nodes with a nearly identical configuration. We used 4x 10Gb per node, 2 LACP bonds per node into 48 port switches. Those switches had 6x 40Gb uplinks that were split in LACP to 2 top of rack switches. Top of rack switches had 100Gb uplinks to core. At the core were multiple internal networks as well as multiple WAN connections.
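
To put a number on the uplink bottleneck for a switch like the ones above, here is a quick back-of-the-envelope calculation. It assumes all 48 ports are node-facing at 10Gb and that all six 40Gb uplinks carry traffic at once; both are simplifications for illustration, and the function name is made up.

```python
def oversubscription_ratio(ports, port_gbps, uplinks, uplink_gbps):
    """Ratio of total node-facing bandwidth to total uplink bandwidth."""
    downlink_gbps = ports * port_gbps    # 48 * 10 = 480 Gb/s toward the nodes
    uplink_total = uplinks * uplink_gbps  # 6 * 40 = 240 Gb/s toward top of rack
    return downlink_gbps / uplink_total


ratio = oversubscription_ratio(ports=48, port_gbps=10, uplinks=6, uplink_gbps=40)
print(f"{ratio:.1f}:1")  # prints 2.0:1
```

The same arithmetic applies to 1Gb switches with smaller uplinks; only the inputs change.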

My point in talking about the size and speed is not to brag (well, kinda - don't we all like cool toys), but to point out that this architecture will work with 1Gb switches and machines of 6 nodes all the way to thousands of nodes with bigger uplinks. You can scale the switching as your hardware changes and scales. The architecture remains the same.

If you are only using 100 nodes, you have less complication. As for plug-and-play-like behavior, as long as you don't MAC-lock the switchports - the switches won't care what you plug into them as long as the NICs are properly configured.

Hope this helps. If I have missed something - I hope someone else finds this useful.

Mac

kelly stephenson wrote on 10/4/19 3:34 PM:
Looking for some networking advice from the group.

The system I have has several devices connected in a ring configuration using one Ethernet port IN and one Ethernet port OUT.  The system uses RSTP for loop-free operation.  The idea is simplicity for installation: you just unplug and plug in a new device in the ring, plus you gain redundancy; if one Ethernet cable breaks you still have another one.  This works, but my client has never had more than a half dozen devices on the network yet.
When I say devices just imagine very large machines.  The number of devices could be as many as 100 in the ring or network.  Everything I've researched on RSTP says over 8 devices and it's not effective/efficient, so I'm researching other Ethernet failover/failsafe/redundant solutions.
So, the local network configuration needs to scale up to 100 devices, have redundancy, and low latency for M2M control.  Any thoughts?  

Thanks
Kelly

