Waters on the Hardware: Dejitter it’s Remarkable Switch X

04-05-2025 | By Dean Waters | Issue 138

Dean Waters enjoying a fine drink at Munich 2024 (photograph and image processing by David W. Robinson)

As lovers of clean, pure, wonderful audio, we spend much time and effort on equipment, power, and cabling, among other things. When it comes to networking, we tend to not give it much thought other than "does it work?" After all, a network is "just" a network, right?

In our modern methods of listening to content, we rely more and more on networking than ever. For many of us, we don't shuffle physical media around like we used to. Now it's often considered more of a trip down nostalgia avenue when we pull a record out of a sleeve, or even put a CD (or SACD) into a physical player. (And we all know nostalgia just ain't what it used to be....)

We've somewhat fallen victim to convenience above all else when we consume media. I'm not saying this is good or bad. Just different. With this "new method" of living the streaming lifestyle, be it movies, TV, audio, and now more than ever high-resolution audio, we rely on the networking backbone of our systems like never before. And yes, when it comes to networking, we just buy some consumer grade switches, some Cat 5, 5e, or 6 cable, plug it in and listen. Even if we spend countless hours evaluating the perfect transport, the perfect DAC, phono stage, preamp, speakers, etc., we often don't give Ethernet networking a second thought. "Bits is bits," right? (And where have we heard that before?!)

To be honest, I was a member of this camp, or what one could call "network neglect." I have a smart home full of smart lights, outdoor and doorbell cameras, smart thermostats, smart door locks, even a smart garage door opener. And to use the term 'smart' means, more than anything, it's on the network. The same network as my audio setup and my movie room. Having a long professional background in networks and data centers, you'd be right in imagining my home network has switches, access points, and routers galore. Yeah, I certainly wear the geek spinning propeller beanie when it comes to connectivity. Dorothy may have cried out "Lions and tigers and bears, Oh My!" At my place, it's "Routers and switches and gateways, Oh My!"

Dean Waters and Bill Parish discussing networking and the Dejitter it Switch X. Vancouver, WA, 2025 (portrait by David W Robinson)

But there's a problem…several actually. Turns out I was wrong about the importance of high-end audio-grade networking. Which is very different than data-center grade networking, and on a completely different planet than typical home-grade networking. Simply putting in "better" networking equipment isn't likely to solve the unique challenges we face when it comes to consuming high-end (as in high-definition) media. Let's look at the challenges we face.

Networks are noisy. Noisy all the time.

If you have a network switch in your home (or business) network, ever notice how the 'activity' lights are always blinking and twinkling? Even when nothing's going on. In the middle of the night. When no one's home. It seems like there's some phantom constant activity going on. Well, you're right. There is. We can give thanks to "plug and play" services for this, among other things.

Pretty much every device on your network is constantly sending out service-discovery frames, whether Service Location Protocol (SLP) or its more common cousins such as mDNS and SSDP, advertising its presence and available services to other devices on the network. Ever wonder how a newly installed computer on a network can just "see" all the printers available? Or how the smart app on your phone can detect all the smart devices on your network without you having to tell the app where they are (i.e., what IP addresses are being used)? This is SLP-style discovery in action. Each smart device (all of them!) on your network sends out frames (the term 'packets' is often used) saying "I'm a light bulb (or whatever) and here's my network address." Thermostats, printers, PCs...and now doorbells, dishwashers, security cameras...just about everything does this. Depending on the device, it could produce several frames a minute, or dozens of frames per minute per device. In other words, noise! Chatter, if you prefer.
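To put a rough number on that chatter, here is a back-of-the-envelope sketch. The device count (40) and the per-device rates (3 and 24 frames per minute) are illustrative assumptions, not measurements from my network.

```python
# Rough estimate of background discovery chatter on a home network.
# The device count and per-device rates are illustrative assumptions.

def discovery_frames_per_day(devices: int, frames_per_minute: float) -> int:
    """Total announcement frames sent per day by all devices combined."""
    minutes_per_day = 24 * 60
    return round(devices * frames_per_minute * minutes_per_day)

quiet = discovery_frames_per_day(devices=40, frames_per_minute=3)
busy = discovery_frames_per_day(devices=40, frames_per_minute=24)
print(f"quiet devices:  {quiet:,} frames/day")
print(f"chatty devices: {busy:,} frames/day")
```

Even with the quieter assumption, that works out to well over a hundred thousand frames a day that have nothing to do with music.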

Back in the olden days we used network hubs which would take each frame coming in on a port and would copy it to every other port. When multiple devices would send out frames at the same time, the frames would collide and kill each other. Transmitting devices would 'back off' for a random amount of time and then retransmit, hoping the frames would get through. This is why Ethernet networks are referred to as "Carrier Sense Multiple Access with Collision Detection" (CSMA/CD) networks. They can sense when it's "safe" to send a frame and detect when a collision occurs. Devices on Ethernet networks can't prevent collisions; they can only detect when they happen and then re-transmit (many times if needed) until the frame gets through. Ethernet network segments had to be small, with only a few "hubs." Effective performance (throughput) would drastically fall off as more devices were added. This is why Token Ring networks were often used in enterprise environments prior to the introduction of network switching.
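The retry rule classic Ethernet uses is known as truncated binary exponential backoff: after each collision, the sender waits a randomly chosen number of "slot times," and the range it picks from doubles with every failed attempt. A minimal sketch follows; the function name and the fixed seed are mine, and the slot time shown is the 10 Mbps figure.

```python
import random

SLOT_TIME_US = 51.2  # slot time for 10 Mbps Ethernet (512 bit times)

def backoff_slots(collision_count: int, rng: random.Random) -> int:
    """Pick a random wait, in slot times, after the nth consecutive collision."""
    if collision_count >= 16:
        # Classic Ethernet gives up on the frame after 16 failed attempts
        raise RuntimeError("frame dropped after 16 failed attempts")
    k = min(collision_count, 10)       # the range stops growing after 10 collisions
    return rng.randrange(0, 2 ** k)    # uniform over 0 .. 2^k - 1

rng = random.Random(138)  # fixed seed so the run is repeatable
for n in (1, 2, 3, 10):
    slots = backoff_slots(n, rng)
    print(f"collision {n}: wait {slots} slots ({slots * SLOT_TIME_US:.1f} us)")
```

The more often a frame collides, the longer the sender is likely to wait, which is exactly why a busy hub-based network slowed to a crawl.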

Nowadays we use Ethernet switches which help (a lot). Switches are themselves smart devices that can 'discover' the desired destination of an inbound frame and can deliver it to the desired destination port on the switch. Thus a device plugged into port 1 can talk to a device on port 5 while at the same time a device on port 3 can talk to a device on port 2, etc. In this case, the frames will never collide. This marked a massive step forward in LAN network communications.
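For the curious, the 'discovery' trick can be sketched in a few lines. This is a toy model, not any vendor's firmware: the class name and the made-up MAC addresses are assumptions for illustration only.

```python
class LearningSwitch:
    """Toy model of how a switch learns which MAC address lives on which port."""

    def __init__(self, port_count: int):
        self.ports = list(range(1, port_count + 1))
        self.mac_table: dict[str, int] = {}   # MAC address -> port

    def receive(self, in_port: int, src_mac: str, dst_mac: str) -> list[int]:
        """Return the list of ports the frame is sent out of."""
        self.mac_table[src_mac] = in_port          # learn where the sender is
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown destination: flood to every port except the one it came in on
            return [p for p in self.ports if p != in_port]
        if out_port == in_port:
            return []                              # destination is on the same port
        return [out_port]

switch = LearningSwitch(port_count=8)
first = switch.receive(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:05")  # port 5 not yet known
reply = switch.receive(5, "aa:aa:aa:aa:aa:05", "aa:aa:aa:aa:aa:01")  # the reply teaches port 5
again = switch.receive(1, "aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:05")
print(first, reply, again)
```

The first frame has to be flooded because the switch hasn't yet seen the destination; once the reply comes back, traffic between the two devices touches only their two ports.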

And yet, as monumental as that advance was (and still is), it didn't solve everything. For starters, in many environments a majority of the traffic from all the 'client' devices is heading to one (or a few) "server" devices. 300 users all want to "talk" to the same email, file, print (etc.) server. Rather than let the frames collide at the server port on the switch, the switch will 'back off' some of the traffic into a buffer and then let it flow out sequentially to the server port even if the incoming client ports are all taking in frames simultaneously. Certainly a neat trick, but it introduces delay. In the enterprise, as well as the typical home network, this buffering and the occasional collision go largely unnoticed because the delays are so small. When saving large files, you might notice it takes a fraction longer on a busy network, but only if you're paying really close attention.
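How small is "small"? A quick worked example, assuming full-size 1500-byte frames on a 1 Gbps server port and ignoring preamble and inter-frame gaps (both assumptions are mine, chosen to keep the arithmetic simple):

```python
def serialization_delay_us(frame_bytes: int, link_bps: float) -> float:
    """Time to clock one frame onto the wire, in microseconds."""
    return frame_bytes * 8 / link_bps * 1e6

def worst_case_wait_us(frames_ahead: int, frame_bytes: int, link_bps: float) -> float:
    """Wait for a frame that arrives behind `frames_ahead` others in the buffer."""
    return frames_ahead * serialization_delay_us(frame_bytes, link_bps)

one_frame = serialization_delay_us(1500, 1e9)
last_in_line = worst_case_wait_us(299, 1500, 1e9)
print(f"one 1500-byte frame at 1 Gbps: {one_frame:.1f} us")
print(f"300th client in the queue waits: {last_in_line / 1000:.2f} ms")
```

A few milliseconds is nothing when you're saving a spreadsheet. The catch is that the wait varies from frame to frame depending on what else is in the buffer.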

Switches (good ones anyway) even create their own chatter in the form of (IEEE 802.1D) Spanning Tree frames. Switches look for switch-loops in this way. A switch loop is when a switch gets plugged into itself (even if it's through another switching device). This will kill a network, so switches send out special frames to see if those frames then come back to the switch on another port. Rather than get into a switching loop, the switch will block the offending port. The point here is just about everything on the network generates unwanted chatter as it goes about its tasks…which affects our audio streams….
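The idea of finding the one cable that closes a loop can be sketched as follows. To be clear, this is not the real Spanning Tree algorithm (which elects a root switch and weighs path costs); it's a simplified stand-in that only shows which link would have to be blocked. The switch names are hypothetical.

```python
def links_to_block(links: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the links that would close a loop and so must be blocked."""
    parent: dict[str, str] = {}

    def find(node: str) -> str:
        # Follow parent pointers to the representative of this node's group
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]   # shorten the chain as we go
            node = parent[node]
        return node

    blocked = []
    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            blocked.append((a, b))     # both ends already connected: this link makes a loop
        else:
            parent[root_a] = root_b    # join the two groups
    return blocked

cables = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
print(links_to_block(cables))
```

Here switches A, B, and C are already chained together, so the extra cable from C back to A is the one that gets shut off.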

As wonderful as switching is, there is yet another issue: Broadcasts. Think back to all those SLP frames mentioned earlier. If switches are only moving frames to the correct destination ports, how does a device plugged into any port recognize all the other devices on the network, whether they are plugged into the same switch or not?

The answer is broadcasting. A broadcast frame is a frame that has a destination set to all devices. In other words, a typical point-to-point connection oriented frame will have both a destination address (where it's going) and a source address (where it's coming from). Broadcast frames use a special destination address that represents all devices on the network. The network switch sees these broadcast frames and will (by design) copy those frames to every port on the switch as well as any up (or down) stream switches which will also copy the frames to every port. And so on through every switch on the network. These frames aren't typically needed by most devices that receive them (which is every device…), so they are discarded. Before an end-device can discard the broadcast frame, it must first "open it up," so to speak, examine it, determine that there is nothing on the end node that needs it, and then discard it. This takes processing cycles, bandwidth, and time.
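That "special destination address" is simply all ones: ff:ff:ff:ff:ff:ff. As a minimal sketch, here is how the first 14 bytes of an Ethernet frame can be picked apart to spot a broadcast. The frame is hand-built for illustration, and the source address is made up.

```python
import struct

BROADCAST_MAC = b"\xff" * 6

def parse_ethernet_header(frame: bytes) -> tuple[bytes, bytes, int]:
    """Split the first 14 bytes into destination MAC, source MAC, and EtherType."""
    dst, src, ethertype = struct.unpack("!6s6sH", frame[:14])
    return dst, src, ethertype

def is_broadcast(frame: bytes) -> bool:
    """True when the destination address means 'every device on the network'."""
    dst, _, _ = parse_ethernet_header(frame)
    return dst == BROADCAST_MAC

def fmt(mac: bytes) -> str:
    return ":".join(f"{b:02x}" for b in mac)

# Hand-built header of an ARP request (EtherType 0x0806) with a zeroed payload
frame = BROADCAST_MAC + bytes.fromhex("aabbccddee01") + struct.pack("!H", 0x0806) + b"\x00" * 28
dst, src, ethertype = parse_ethernet_header(frame)
print(fmt(dst), fmt(src), hex(ethertype), is_broadcast(frame))
```

Every device that receives such a frame has to do at least this much work before deciding the frame isn't for it.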

All of which are bad news for high-resolution audio. Too many devices sending out too many broadcast frames and broadcast storms (yes, that's an actual term) take place, bringing the network down to its knees. To solve this, we use routers and (more simply) NAT gateways. More on this later….

In digital audio, we mainly deal with two formats: PCM (Pulse Code Modulation) and PDM (Pulse Density Modulation, used by DSD), also known as "bitstream." With PCM, we're dealing with audio that is broken up into discrete numerical values that represent the audio waveform at certain time-based intervals (the "sampling rate"). Standard CDs, for example, use a sampling rate of 44.1kHz, organized into 16-bit data words. Those time-slice samples are wrapped up into network frames and are sent across the network, along with all the other traffic and broadcasts. Physical CDs are PCM. As are WAV, FLAC, MP3, and a host of other lossless and lossy formats. For PDM (DSD), on the other hand, we take many more samples per second than we do with PCM (DSD64 on SACDs samples a single bit at 2.8MHz...the slang is "Single DSD"; DSD128, "Double DSD," samples the stream at 5.6MHz; DSD256, or "Quad DSD," samples at 11.2MHz), and represent that as a continuous stream of bits that indicate what relative direction a waveform is taking (rather than an absolute value) over time. Again, SACD and DSD are the realm of PDM technology. As the sampling rates go up, the more sensitive to time delays and noise of any kind the data-stream becomes.
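The numbers behind those rates are easy to work out, since each DSD rate is a whole multiple of the CD sampling rate. A short sketch, assuming two-channel (stereo) material and counting raw audio bits only, with no network or file-format overhead:

```python
CD_SAMPLE_RATE_HZ = 44_100

def pcm_bitrate_bps(sample_rate_hz: int, bits_per_sample: int, channels: int = 2) -> int:
    """Raw PCM bit rate: samples per second x bits per sample x channels."""
    return sample_rate_hz * bits_per_sample * channels

def dsd_sample_rate_hz(multiple: int) -> int:
    """DSD rates are whole multiples of the CD rate (DSD64 = 64 x 44.1 kHz)."""
    return multiple * CD_SAMPLE_RATE_HZ

print(f"CD PCM: {pcm_bitrate_bps(CD_SAMPLE_RATE_HZ, 16) / 1e6:.4f} Mbps")
for multiple in (64, 128, 256):
    rate = dsd_sample_rate_hz(multiple)
    print(f"DSD{multiple}: {rate / 1e6:.4f} MHz per channel, {rate * 2 / 1e6:.4f} Mbps stereo")
```

So the "2.8" in Single DSD is really 2.8224 MHz, and a stereo DSD256 stream needs roughly sixteen times the raw bit rate of a CD.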

In addition, there are few things more analog than a digital signal (!!). "What?!" I hear you cry. When we examine exactly what indicates a "bit" in digital parlance, we are deciding between a zero and a one. Bits. Binary. This or that. Nothing in between. When we translate that into moving data across a wire, we're talking about voltage changes that happen incredibly fast. We're asking the "wire" (and everything connecting to it) to go from zero voltage to plus five volts. And to do it instantly with no ramp-up or ramp-down time. Oh, and do it millions of times per second. And keep track of all the other devices that are doing the same thing at the same time. It's a wonder it works at all!

The thing is, voltages don't go from nothing to something without something going on in between. There is a ramp-up and ramp-down of the voltages, and those transitions are affected by the RF and electrical interference that the switches themselves generate, among other things. In summary, there are a lot of things to address in networking in the audio realm.

Noise is the great hidden challenge in networked high-end audio.

Now let's look at a core product from Dejitter it, the Switch X, and how it addresses the challenges brought up here.

The Switch X is an 8-port Ethernet switch specifically designed for the enhancement of audio based networking. Granted, as an Ethernet switch, it will handle all Ethernet traffic, but the Switch X goes much further than that.

There are two issues that will be addressed here. The first is noise abatement. The Switch X is created by taking a stock MikroTik Cloud Router Switch and replacing/upgrading the capacitors and other components inside that "leak" both energy and noise. This is done to clean up the voltage transitions on the wire (from the zero volts to +5 volts mentioned earlier). This reduces noise and harmonic artifacts on the wire and connections. Remember that anything that isn't a pure sine-wave creates harmonics. Without this noise reduction, electrical noise is constantly generated as the voltages on the wire ramp up and down. These changes in the Switch X from Dejitter it create a more perfect, seamless transition and result in "cleaner" voltages and voltage changes on the wire. Jitter is reduced and down-stream devices (possibly even other switches) will not misinterpret the binary 0 and 1. This eliminates frame retransmissions due to voltage and timing errors between devices. This is critical in the highly time-sensitive and jitter-sensitive high-resolution audio streams.
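The remark about harmonics can be made concrete with the textbook case of an ideal square wave, whose Fourier series contains only odd harmonics with amplitude 4A/(πn). This is a sketch of that idealized math only; it says nothing about the actual signals measured on a Switch X, and real edges with a finite rise time roll off faster than this.

```python
import math

def square_wave_harmonic(n: int, amplitude: float = 1.0) -> float:
    """Amplitude of the nth harmonic of an ideal square wave swinging +/- amplitude."""
    if n % 2 == 0:
        return 0.0                      # even harmonics cancel out
    return 4 * amplitude / (math.pi * n)

for n in range(1, 10):
    print(f"harmonic {n}: {square_wave_harmonic(n):.4f}")
```

The takeaway is that sharp-edged signals spread energy far above their fundamental frequency, which is why the shape of those voltage transitions matters.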

The second big design implementation of the Switch X is network segregation. Earlier we described the noisy environments our networks live in. We need to get rid of all the excess traffic on our network that isn't directly related to the task of music reproduction. For a moment, think of your home (or office) network, specifically the point where your home network connects to the carrier that you use (whether it be coax, DSL, fiber, or satellite). There's something that happens that prevents your neighbors' broadcast frames from getting into your own network. That something is a router, specifically a Dynamic Network Address Translation (NAT) gateway. A router is a device that connects multiple disparate networks together. We refer to the Internet as THE Internet when it's actually made up of millions of separate networks that can connect to each other in a controlled and secure manner. Routers allow this to happen. This is why you call the device (technically a combined modem and gateway) that your carrier 'loans' you when you connect to their service an Internet router. It may have both WiFi and several switch ports, but the device itself is still a router. (For the super nerdy people like me, switches operate at Layer-2 (data-link) of the International Organization for Standardization's (ISO) Open Systems Interconnection (OSI) model and deal with MAC addresses. Routers act at Layer-3 (network) and deal with IP addresses. But I digress…)

From an architecture standpoint, the Switch X functions in a similar manner to your Internet router: It segregates traffic from separate networks and allows network-to-network communication.

There are two networks in the Switch X default configuration. In Switch X parlance, we refer to the networks as "clean" and "dirty." Those two networks, both inside the Switch X, are separated from each other by a NAT router/firewall. Okay, so what does this really mean? First, the Switch X has to set up and create a clean environment for your audio devices and data streams. This network is logically isolated from the dirty network which connects to an existing network, like a home network. All the chatter and broadcasting that happens on the dirty network stays there and does not pass onto the clean network due to the built-in router on the device. (Broadcast frames do not pass through routers.) Thus, the Switch X needs to perform the same type of service(s) of a typical (layer-3) Internet router. Namely, it needs to assign IP (network) addresses to all of the devices that are plugged into the clean network, while at the same time accepting a network address on the dirty network from an existing (Internet) router.
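To picture the two networks, here is a sketch of an address plan using Python's standard ipaddress module. The subnets, gateway address, and lease range are hypothetical values I picked for illustration; the article doesn't state the Switch X's actual defaults.

```python
import ipaddress

# Hypothetical address plan; the real Switch X defaults are not given in this article.
dirty_net = ipaddress.ip_network("192.168.1.0/24")   # existing home network
clean_net = ipaddress.ip_network("10.10.10.0/24")    # audio-only network behind the router

assert not clean_net.overlaps(dirty_net), "the two networks must use different ranges"

def dhcp_pool(network, first: int, last: int):
    """Addresses the router hands out on the clean side (host numbers first..last)."""
    return [network.network_address + n for n in range(first, last + 1)]

gateway = clean_net.network_address + 1
pool = dhcp_pool(clean_net, 100, 109)
print("clean gateway:", gateway)
print("first lease:", pool[0], "last lease:", pool[-1])
```

The important part is the overlap check: the router can only keep the two sides apart if they live in different address ranges.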

The Switch X keeps the two networks separate and allows traffic to flow from the clean network to the dirty network so that devices on the clean network can communicate with other devices on the home (dirty) network as well as the Internet. Devices on the dirty network cannot initiate connections with devices on the clean network, much the same way that devices on the Internet cannot initiate connections on your home network. This is done via the NAT inside the Switch X. Devices on the clean network can have bi-directional communications with devices on other networks so long as the device(s) on the clean network initiates the communication/session.
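The "only the clean side may start the conversation" rule is what a stateful NAT does. A toy sketch follows; the class, addresses, and port numbers are all made up for illustration, and a real NAT also tracks the remote address and protocol of each session.

```python
class NatGateway:
    """Toy stateful NAT: only traffic that answers a clean-side request gets in."""

    def __init__(self, public_ip: str):
        self.public_ip = public_ip
        self.next_port = 40000
        # public port -> (clean-side IP, clean-side port)
        self.sessions: dict[int, tuple[str, int]] = {}

    def outbound(self, clean_ip: str, clean_port: int) -> tuple[str, int]:
        """A clean-side device opens a connection; record it and rewrite the source."""
        public_port = self.next_port
        self.next_port += 1
        self.sessions[public_port] = (clean_ip, clean_port)
        return self.public_ip, public_port

    def inbound(self, public_port: int):
        """Return the clean-side destination, or None if nobody asked for this traffic."""
        return self.sessions.get(public_port)

nat = NatGateway(public_ip="192.168.1.50")           # address on the dirty side
source = nat.outbound("10.10.10.100", 51000)         # streamer starts a session
print("seen on dirty network as:", source)
print("reply delivered to:", nat.inbound(source[1]))
print("unsolicited traffic:", nat.inbound(12345))
```

Replies to a session the streamer opened find their way back in. Anything else arriving from the dirty side has no entry in the table and goes nowhere.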

The net-net of all this is the audio (clean) network can now operate in a manner that is both noise abated and isolated from the rest of the network's broadcasts, chatter, and noise. Your ears will tell you the difference when you first implement the Switch X into a configuration. For my ears, the difference was immediate and substantial. The character of the music opened up and started to breathe like I'd not heard before from the system. Less effort, less harshness, better staging, better depth in the low end, all the clarity in the mids and highs. These are the things that were brought to my attention as soon as I placed the Switch X in service. If you already have a system in place that you're very familiar with, one that you've carefully curated over many years of selecting the perfect gear for your setup, I'd recommend putting in a Switch X and you'll immediately discover that there's been a new level of realization in the setup. If you're just building a setup, a Switch X would be a great baseline to build upon. It really allows the networking side of your audio to produce all the greatness that's always been there but was squashed by 'normal' data-driven networking.

The Switch X is based on the MikroTik CRS309. This device is both a (layer-2) switch and a (layer-3) router. The eight physical ports accept SFP+ transceiver modules (GBICs, in older parlance), up to 10Gbps per port. You can match the speed of each port to the speed of the device(s) being connected by installing the appropriate module in each port. Fiber-optic connections (both single-mode fiber and multi-mode fiber) are supported, using the appropriate module.

The Switch X can be completely customized via the MikroTik command line interface (CLI) and its scripts, similar in concept to Cisco's IOS command structure. You can do things like set static routes, re-configure DHCP settings and ranges, reassign both clean and dirty networks, set static DNS servers, etc.

Dejitter it Switch X with the included linear power supply (image courtesy of Dejitter it)

The power supply for the Switch X is a custom implemented linear power supply that delivers the precise amount of current in real time needed to properly power the unit. No more, no less. This is important as additional power above what would be required turns into noise. The power connector is a 4-pin connector. All the pins are used. Of particular importance is the 4th pin, which connects directly to a sensing circuit on the motherboard of the Switch X and monitors the necessary power draw for the unit. The result of this precise delivery is that no power goes to waste in the form of RF and/or EMI artifacts that would negatively affect the cleanliness and purity of downstream devices connected to the Switch X.

Front view of the Switch X Linear Power Supply…

…while here's the rear view of its special connectors

I'll close here with what matters the most: The musical listening experience. 

I can say that the addition of the Switch X to my environment was a significant upgrade to the "traditional" networking gear I had been using previously. I didn't think the changes would be as dramatic as they were. And yet, here we are. I've spoken to many others who have had similar (and surprising) experiences with the Switch X. What I've learned the most from this experience is that networking is an essential part of modern Internet/LAN audiophile setups. We need to pay as much attention to the networking components of our system as we do to the amps, speakers, cables, DACs, phono stages, you name it. If a network touches the signal at any point along the audio path, then it simply cannot be ignored. Without audio-grade networking we're limiting the value and potential of the rest of our setups. Set one up and the difference will be clear (in oh, so many ways).

Next article: Dejitter it WiFi X, the perfect companion to the Switch X to handle wireless streaming needs. I was planning to include that product review in this article but then decided that there's enough material to make a separate (hopefully shorter!) stand-alone article!

Stay tuned!

Switch X

Retail: $3500

Specifications

  • 8-Port SFP+ Layer-2/Layer-3 switch/router
  • Up to 10 Gbps per port
  • 81 Gbps non-blocking backplane throughput
  • MikroTik RouterOS

Dejitter it

https://dejitterit.com/

For additional information, contact Bill Parish @ GTT Audio.

908-850-3092

[email protected]

Equipment list used for this review: (borrowed items are in BOLD)

  • Dejitter it Switch X and Dejitter it WiFi X
  • Mola Mola Makua preamplifier w/integrated Tambaqui DAC and Phono stage
  • Vivid Audio Kaya 90 loudspeakers
  • Cardas Clear Beyond power cables (NEMA 5-15P to C19)
  • Kubala-Sosna Realization Series speaker cables
  • RSX Technologies Benchmark Series Interconnect cables
  • Sony UBP-X800M2 CD/SACD transport
  • Asustor FS6712X SSD NAS
  • Windows™ PC w/Audirvāna – DSD/PCM streaming server
  • PS Audio PowerPlant 15 power regenerator