Thanks! Interesting article regarding the spatial perception feature. I did a bit of research as well, and a few others (not many) have reported this issue. Another possibility is that the traffic is time synchronization for the multi-room audio feature, especially since UDP port 55444 seems to be used for "time sync".
It's not just the packet size that matters, it's the rate as well (packets per second or pps). A large number of small packets can place more load on a network than a smaller number of large packets with a higher total data rate. This is why networking equipment specifications are given not just in Mbps, but also in pps. In the streaming audio/video example, the packet size would be close to the 1500-byte maximum.
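To put rough numbers on the pps point, here is a quick back-of-the-envelope sketch. The bitrates and packet sizes are illustrative assumptions, not measurements of the Echo traffic, and the function name is made up for this example. It also ignores Ethernet framing overhead.

```python
def packets_per_second(bitrate_bps: float, packet_bytes: int) -> float:
    """Packets per second needed to carry a given bitrate at a fixed packet size."""
    return bitrate_bps / (packet_bytes * 8)


# Hypothetical stream: 10 Mbps of video in near-maximum 1500-byte packets.
large = packets_per_second(10_000_000, 1500)

# Hypothetical chatter: only 1 Mbps, but in small 64-byte packets.
small = packets_per_second(1_000_000, 64)

print(f"10 Mbps @ 1500 B: {large:.0f} pps")
print(f" 1 Mbps @   64 B: {small:.0f} pps")
```

With these assumed figures, the small-packet flow carries a tenth of the data but generates more than twice as many packets, and each packet costs the router or switch a forwarding decision regardless of its size.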
In this particular case, my equipment can handle the load, but it's an unnecessary inefficiency. As someone with an engineering background, I hate inefficiency. I therefore moved all of my Echo devices onto my IoT WLAN, which has layer-2 isolation between clients and sits on its own VLAN, separate from my main LAN. This stopped the extraneous traffic, and the Echo devices continue to work fine - I see no difference in functionality. However, I don't use the multi-room audio feature, and my Echo devices are far enough apart that determining which one should answer an "Alexa" query (via spatial perception) is not an issue. Works for me.
Thanks,
cinergi