Opened 2 years ago

Closed 2 years ago

#9801 closed defect (fixed)

LookSee meeting file transfer very slow, 20 seconds

Reported by: Tom Goddard
Owned by: Tom Goddard
Priority: moderate
Milestone:
Component: VR
Version:
Keywords:
Cc:
Blocked By:
Blocking:
Notify when closed:
Platform: all
Project: ChimeraX

Description

In a LookSee 2- or 3-person meeting at SLAC with Mike Schmid and JJ on September 23, 2023, when one person opened a 900K-triangle colored tomogram model, the others did not see it until 20-30 seconds later. This was using either an iPhone hotspot for wifi or an 802.11n router.

The file size was probably ~20 MBytes, and the bandwidth on the router a few feet away was reported as 166 Mbit/s, so the transfer should have taken about a second.
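As a sanity check on that estimate (using the ~20 MB file size and the reported 166 Mbit/s link rate from above):

```python
# Back-of-envelope transfer time: file size / link bandwidth.
file_bytes = 20e6        # ~20 MB gltf scene (estimate from this ticket)
link_bps = 166e6         # 166 Mbit/s reported by the 802.11n router
ideal_seconds = file_bytes * 8 / link_bps
print(f"{ideal_seconds:.2f} s")  # ~1 second, versus the 20-30 s observed
```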

I suspect the problem is that async IO is not sending blocks of data frequently enough because the Unity frame update loop is interleaved with the socket writing. I saw the same thing with async socket reading: it only seemed to do one read per Unity frame update.

Might want to switch to using threads for the socket writes and reads, since async IO has been a pain. I think the GLTFast library only offers an async model load, and my earlier investigation showed it is treacherous to try to run async routines synchronously.
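The thread-based approach could look something like this (a Python sketch for illustration only; the real client is C# in Unity, and the uint32 little-endian length prefix is an assumption about the message framing). A dedicated reader thread blocks on the socket and hands complete messages to the frame loop through a queue, so reads are not throttled to one per frame update:

```python
import queue
import socket
import threading

def _read_exact(sock, n):
    """Loop until exactly n bytes arrive (recv may return short reads)."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            return None          # connection closed
        buf += chunk
    return bytes(buf)

def reader_thread(sock, inbox):
    """Block on the socket off the main thread; enqueue complete messages."""
    while True:
        header = _read_exact(sock, 4)          # assumed uint32 length prefix
        if header is None:
            break
        body = _read_exact(sock, int.from_bytes(header, "little"))
        if body is None:
            break
        inbox.put(body)

# The frame update loop then polls the queue without blocking:
#   while not inbox.empty(): handle(inbox.get_nowait())
```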

When a model is opened by the person who started the meeting, the participants see that person's wands disappear, I believe because no wand position updates are received while the file is being transferred. The person who opened the model continues to see the other participants' wands update, and then sees their wands freeze for a second at the moment the model opens for the participants. So it looks like the gltf file is simply transferring slowly, then opens quickly once it has all arrived.

Change History (7)

comment:1 by Tom Goddard, 2 years ago

I tried an 885,000-triangle surface (EMDB 16432, level 0.0034, step 2) colored by height, producing a test.glb file of 23,561,688 bytes, or about 190 Mbits. It transferred and displayed in a 2-person meeting in about 3 seconds. I tested this by clicking file open with the meeting host hand controller while wearing the participant headset. Both headsets were Quest 2, connected to RBVIVR wifi from my office cubicle (15 feet from the router) on wifi 6, with transmit/receive speed reported as 573 Mbps in the Quest wifi settings for both headsets. A speed test at fast.com showed upload and download speeds of about 320 Mbps. Opening test.glb from a file on the host headset took a small fraction of a second (< 0.3 seconds). Given the wifi speeds I would have expected the file to transfer and display in under 1 second, so it is not clear why it took 3 seconds. But this is much faster than the 20-30 second times we observed at SLAC (with slower 802.11n and iPhone hotspot wifi).

comment:2 by Tom Goddard, 2 years ago

The C# reading code allocates an array for the full size of the message and fills it as data is read, so there is no concatenation to slow down the read.

One possibility is that the async read is not getting CPU time very often because Unity is not querying the task often enough, maybe only once per rendering update loop. The code currently reads while data is available and then does a stream.ReadSync() call. I could test that by always doing a synchronous stream.Read() until the whole message is read. The same problem could be slowing the writing to the socket using WriteAsync(), which can be changed to Write() as a test.

With synchronous read and write it took the same amount of time. Printing the synchronous read time gives 1.5 seconds for 31 MBytes with a synchronous write. That is not too far off the expected time and could be due to stream buffering that is not optimal for large messages. With async read and write the read time is reported as 1.7 seconds, so only about 15% slower than synchronous. Decoding the message and opening the model took 0.9 seconds after the message read, and encoding the message took 0.8 seconds (not including serializing by concatenating length, message type, and JSON). So these timings account for the ~3 second transfer time. Each of these timings was taken on the second load, so file data and module initializations were already done.

The slightly slower than expected socket data transfer may be because both headsets are on the same RBVIVR wifi and the simultaneous data stream for both headsets lowers the wifi bandwidth.

Is the large JSON message encoding and decoding also slow? Yes. JsonUtility took 0.6 seconds to convert the JSON to a Unity OpenModelMessage instance, and 0.7 seconds to convert the OpenModelMessage back to JSON. Decoding the JSON message bytes from UTF-8 took 0.04 seconds. That 1.3 seconds should be close to 0 if more efficient code were used.

Last edited 2 years ago by Tom Goddard

comment:3 by Tom Goddard, 2 years ago

I should time the behavior with the slow 802.11n router. The very slow times at SLAC may have been in 3-person meetings, where essentially 4 copies of the data are transmitted between the wifi router and the headsets (two from the opener to the router, and one each to the 2 participants). If wifi bandwidth is the limiting factor, 802.11n is about 4x slower than 802.11ax, and each headset-wifi transmission gets only 1/4 of the shared bandwidth, then the time could be 16 times longer than what one would expect for a single transfer at 802.11ax speeds.
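The arithmetic behind the 16x estimate (illustrative numbers from the paragraph above):

```python
copies = 4          # 2 uploads opener->router + 2 downloads router->participants
wifi_slowdown = 4   # rough 802.11n vs 802.11ax throughput ratio
# Each of the 4 simultaneous streams gets ~1/copies of a channel that is
# already wifi_slowdown times slower, so relative to one 802.11ax transfer:
relative_time = wifi_slowdown * copies
print(relative_time)  # 16
```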

comment:4 by Tom Goddard, 2 years ago

I think it is at least worth changing the message format so the large gltf data is not converted to and from JSON, since that is taking half the time on fast wifi 6. I think the best solution is to add an optional binary data block in the LookSee messages after the JSON data. This will involve adding a uint32 length before the JSON block so we know how long the JSON part is. This should reduce the large scene transfer from ~3 seconds to ~1.5 seconds. It will only improve the slow-wifi scenario by ~25%, from not having to base64-encode the gltf binary data in JSON.
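A sketch of that message layout (Python for illustration; the function and field names here are mine, not LookSee's, and little-endian byte order is an assumption). The uint32 JSON length lets the receiver split the JSON header from the trailing binary gltf block:

```python
import json
import struct

def encode_message(fields, binary=b""):
    """Layout: [uint32 json_length][json bytes][optional binary block]."""
    payload = json.dumps(fields).encode("utf-8")
    return struct.pack("<I", len(payload)) + payload + binary

def decode_message(data):
    """Split a received message back into its JSON fields and binary block."""
    (json_len,) = struct.unpack_from("<I", data, 0)
    fields = json.loads(data[4:4 + json_len].decode("utf-8"))
    binary = data[4 + json_len:]
    return fields, binary

# Round-trip example with a hypothetical open-model message:
msg = encode_message({"type": "open_model", "name": "test.glb"}, b"glTF-bytes")
fields, blob = decode_message(msg)
```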

To improve the slow-wifi transfer speed probably requires reducing the gltf data size, possibly by reducing vertices to 16-bit and normals to 8-bit, and somehow compressing the triangle data (Draco?). I'm not sure any of that is worthwhile; we may just require modern wifi equipment. A simpler remediation might be to show participants a progress indicator when receiving a new model is expected to take more than a few seconds.

comment:5 by Tom Goddard, 2 years ago

I tried a 2-person meeting transferring my standard 900K-triangle EMDB map with an 802.11n D-Link DIR-655 router (circa 2010). The model displayed after 10 seconds using two Quest 2 headsets. Trying the same test with 3 headsets, starting with no models open, the time for the model to open for the last participant who joined (a Quest Pro) was 28 seconds. Trying the 3-person test with RBVIVR (wifi 6) it took 6 seconds to display the model in the two participant headsets; both seemed to take the same amount of time. This compares to the earlier 2-person test that took 3 seconds.

Last edited 2 years ago by Tom Goddard

comment:6 by Tom Goddard, 2 years ago

I updated the LookSee model gltf transfer to send the bytes directly instead of as base64-encoded text within the JSON message. This increased the speed of a 2-person transfer of a 900K-triangle (23 MByte) model from 3.5 seconds to 2 seconds on wifi 6.
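The size savings from dropping base64 is easy to quantify: base64 encodes every 3 bytes as 4 characters, so sending raw bytes shrinks the gltf block on the wire by 25%:

```python
import base64

raw = bytes(range(256)) * 1000       # 256 KB of sample binary data
encoded = base64.b64encode(raw)
print(len(encoded) / len(raw))       # ~1.333: the 4/3 base64 expansion
print(1 - len(raw) / len(encoded))   # ~0.25: savings from sending raw bytes
```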

On old wifi routers, especially with 3 or more participants, transfer will still be very slow.

A next step that would help speed up transfers is to compress the gltf data with C# GZipStream. gzip on a 22 MB colored map gltf file reduced it to 11 MB; bzip2 reduced it to the same size, and LZ4 only reduced it to 16 MB. gzip will probably be fast enough, but this needs testing in C#.
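A quick way to gauge the gzip time/ratio tradeoff before porting to C# GZipStream (Python's gzip module for illustration; the payload here is synthetic stand-in data, not a real gltf file):

```python
import gzip
import os
import time

# Stand-in payload: 1 MB of data with repeated structure. Real gltf vertex
# data compressed about 2x in the tests described above.
payload = os.urandom(1024) * 1024

t0 = time.perf_counter()
compressed = gzip.compress(payload, compresslevel=1)  # level 1 favors speed
elapsed = time.perf_counter() - t0
print(len(payload), "->", len(compressed), f"in {elapsed:.3f} s")
```

Lower compression levels trade ratio for speed, which may matter if compression time would otherwise eat the transfer-time savings.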

comment:7 by Tom Goddard, 2 years ago

Resolution: fixed
Status: assigned → closed

Using GZipStream took 2.3 seconds to compress 22 MBytes, which would slow down a wifi 6 transfer. More details are in GitHub issue #5 (https://github.com/RBVI/LookSee/issues/5).
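Using the measured numbers, the wifi 6 tradeoff works out roughly as follows, which is why compression was not adopted:

```python
transfer_uncompressed = 2.0          # seconds measured for the 23 MB model on wifi 6
compress_time = 2.3                  # GZipStream on 22 MBytes
transfer_compressed = transfer_uncompressed / 2   # gzip roughly halves the payload
total_with_gzip = compress_time + transfer_compressed
print(total_with_gzip)               # about 3.3 s: slower than sending uncompressed
```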

Another direction for improving the slow transfers would be to show a progress message to the participants so they don't think the meeting is broken.
