Synchronising devices with each other remains a persistent challenge in cloud gaming and in networking more broadly. In cloud gaming, a central source streams video, audio, and haptic feedback to various devices, including a player’s screen and controller. These devices often use different networks, which are not synchronised, resulting in a delay between the visual and auditory experiences. For instance, a player may see an event on the screen and hear it through their controller as much as half a second later.
In light of this, MIT and software company researchers have developed a new method for aligning streams sent to two devices. They introduce an inaudible white noise pattern into the game audio sent from the cloud server, then listen for this pattern in the audio recorded by the player’s controller. Their system is called Ekho.
Ekho uses the mismatch between these noise patterns to continuously measure and correct the delay between streams. The researchers demonstrated that Ekho is highly reliable in practical cloud gaming scenarios. It keeps the streams synchronised to within ten milliseconds most of the time. In contrast, alternative synchronisation approaches consistently incurred delays exceeding 50 milliseconds.
Furthermore, while Ekho was developed explicitly for cloud gaming, this method has broader applications for aligning media streams sent to various devices, such as in training scenarios involving multiple augmented or virtual reality headsets.
Numerous methods aim to synchronise clocks through a ping-pong messaging system. In this process, a device initiates a ping message to the server, which then responds with a pong message. The device measures the round trip time for the transmission and divides it by two to estimate the network delay.
However, the network route is often asymmetric, so the message can take longer to reach the server than the response takes to return. Consequently, this approach is unreliable and can introduce errors of several hundred milliseconds. Typically, humans can detect interstream delay when it surpasses ten milliseconds.
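The weakness of the ping-pong estimate can be illustrated with a little arithmetic. The sketch below is only an illustration: the function names and the delay values are hypothetical and do not come from the researchers' work. It shows that halving the round trip time is exact when both directions take equally long and wrong by half the difference between them otherwise.

```python
def estimate_one_way_delay(rtt_ms: float) -> float:
    """Ping-pong estimate: assume both directions take equally long."""
    return rtt_ms / 2.0


def estimation_error(uplink_ms: float, downlink_ms: float) -> float:
    """Error of the RTT/2 estimate for the device-to-server (uplink) delay."""
    rtt_ms = uplink_ms + downlink_ms
    return estimate_one_way_delay(rtt_ms) - uplink_ms


# Symmetric route (hypothetical values): the estimate is exact.
print(estimation_error(40.0, 40.0))   # 0.0

# Asymmetric route (hypothetical values): the estimate is off by
# half the difference between the two directions.
print(estimation_error(300.0, 40.0))  # -130.0
```

Because the device only ever observes the total round trip time, it has no way of telling these two cases apart, which is why the error cannot be corrected from ping-pong messages alone.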
To synchronise streams in cloud gaming, the team instead focused on game audio. In cloud gaming, the controller’s microphone captures the ambient audio within the room, including the game audio emanating from the screen’s speakers, and transmits it to the server.
Unfortunately, relying on this recorded audio alone is unreliable because of background noise. To address this issue, they developed Ekho. This solution inserts nearly imperceptible sequences of white noise, known as pseudo-noise, into the game audio before it is streamed to the player’s screen, and uses these sequences for synchronisation.
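The general idea of hiding a pseudo-noise marker in audio and finding it again in a noisy recording can be sketched with a simple cross-correlation. This is a toy illustration under stated assumptions, not Ekho's actual algorithm: the marker length, gain, noise levels, and all function names are hypothetical, and the "recording" is simulated rather than captured by a microphone.

```python
import random


def make_marker(length, seed):
    """Pseudo-noise marker: a reproducible sequence of +1/-1 values."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_marker(audio, marker, offset, gain):
    """Add a quiet copy of the marker to the audio, starting at `offset`."""
    out = list(audio)
    for i, chip in enumerate(marker):
        out[offset + i] += gain * chip
    return out


def find_marker(recording, marker):
    """Return the lag at which the recording correlates best with the marker."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(len(recording) - len(marker) + 1):
        score = sum(recording[lag + i] * chip for i, chip in enumerate(marker))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Simulated game audio (hypothetical stand-in: uniform random samples).
rng = random.Random(1)
game_audio = [rng.uniform(-0.5, 0.5) for _ in range(2000)]

# Hide a low-amplitude marker at a known position in the outgoing audio.
marker = make_marker(1024, seed=42)
marker_offset = 200
sent = embed_marker(game_audio, marker, marker_offset, gain=0.1)

# Simulated controller recording: the sent audio arrives `true_delay`
# samples late, with background noise added on top.
true_delay = 345
recording = [0.0] * true_delay + sent
recording = [sample + rng.gauss(0.0, 0.1) for sample in recording]

# The marker's position in the recording, minus where it was inserted,
# gives the delay estimate in samples.
estimated_delay = find_marker(recording, marker) - marker_offset
print(estimated_delay, true_delay)
```

The marker is much quieter than the audio around it, yet the correlation still peaks at the right lag because the game audio and the background noise are unrelated to the marker and largely cancel out when summed over its full length.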
The Ekho-Estimator module incorporates pseudo-noise sequences into game audio and, upon receiving the recorded game audio from the controller, detects these markers to align the streams accurately. This precise alignment data is then sent to the Ekho-Compensator module, which adjusts the server’s game audio by omitting a few milliseconds of sound or adding a few milliseconds of silence to synchronise the streams.
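The compensation step described above, omitting a little sound or adding a little silence, can be sketched as follows. This is a minimal illustration, not the Ekho-Compensator itself: the function names, the sign convention, and the sample rate are assumptions made for the example.

```python
def ms_to_samples(delay_ms, sample_rate_hz):
    """Convert a delay in milliseconds to a whole number of audio samples."""
    return round(delay_ms * sample_rate_hz / 1000)


def compensate(audio, audio_lag_ms, sample_rate_hz):
    """Shift an audio buffer to cancel a measured interstream delay.

    Assumed convention (hypothetical): a positive `audio_lag_ms` means the
    game audio is running behind the other stream, so sound is omitted to
    catch up; a negative value means it is running ahead, so silence is
    inserted to hold it back.
    """
    shift = ms_to_samples(audio_lag_ms, sample_rate_hz)
    if shift > 0:
        return list(audio[shift:])              # omit the first `shift` samples
    return [0.0] * (-shift) + list(audio)       # prepend silence


# At an assumed 48 kHz sample rate, 5 ms corresponds to 240 samples.
print(ms_to_samples(5, 48000))                  # 240

# Tiny buffers at 1 kHz (1 sample per millisecond) to make the effect visible.
print(compensate([1.0, 2.0, 3.0, 4.0, 5.0], 2, 1000))   # [3.0, 4.0, 5.0]
print(compensate([1.0, 2.0, 3.0], -2, 1000))            # [0.0, 0.0, 1.0, 2.0, 3.0]
```

Either adjustment changes the audio by only a few milliseconds at a time, which is consistent with the small corrections the article describes.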
In actual cloud streaming sessions, Ekho outperformed other synchronisation methods, keeping interstream delay below ten milliseconds almost 87% of the time, even when dealing with poor microphone quality or background noise in the recording. None of the other methods the team tested kept the delay below 50 milliseconds.
Encouraged by these outcomes, the researchers aim to assess Ekho’s performance in more intricate scenarios, such as aligning five controllers with a single-screen device.
Furthermore, because Ekho was initially designed for cloud gaming and has a limited range, future work may focus on improving Ekho’s ability to synchronise devices located at significant distances from one another, such as in a large space like a concert hall.