Using the structure of commands to control jukeboxes to understand basic TCP packets

First, the fun part. You’re at your local bar or eatery and the music sucks, but you don’t have any cash and you left your phone at home. Luckily, you do have your trusty radio frequency transmitter with you and the ability to mess with sent transmissions!

Most remote controls (like the one for your TV) use infrared (IR) light to send commands. IR relies on literal line of sight, so if you can’t see the thing you’re trying to control, you can’t tell it what you want it to do. Surprisingly at first, but making more sense the more I think about it, most popular Jukeboxes use radio frequency (RF) commands instead, which can go around corners and through drunk people. What this means is that a regular IR learning remote won’t be able to pick up and learn commands for your neighborhood eatery’s music box. What it also means is that if you have an RF transmitter like the HackRF One (made by Great Scott Gadgets and sold by Hak5, among others), you can have some fun.

Since commands for the box are transmitted over the air, the system needs a way of knowing that a signal is meant for it. This happens in 2 phases:

Every command starts with a preamble, a fixed set of characters that says “hey, this is for a jukebox!”

Then you need the pin for the specific jukebox you are trying to command. Let's get into that.


There are essentially 2 major jukebox manufacturers in the world at the moment. The following is true for at least one of them and untested on the other. The pin for every Jukebox is three digits. If unchanged, the default is 000. The Jukebox package, available with some Google-fu, allows for a mode called “scan”. When selected, what scan does is:

Whatever command you pick (I’d use something benign like ‘volume down’), every transmission starts with the preamble, followed by a pin. The scan starts with the pin 000 and keeps going with 001, then 002, then 003 and so on.

What you do is watch the screen. When you see the volume change, the pin that was just sent is the pin for this box.
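
To get a feel for how small that search space is, here is a minimal sketch of the scan loop in Python. It does not transmit anything. The `box_reacts` callback is a hypothetical stand-in for "watch the screen and see if the volume changed", and the pretend box pin of 042 is made up for the example.

```python
from typing import Callable, Iterator, Optional, Tuple


def scan_pins(digits: int = 3) -> Iterator[str]:
    """Yield every candidate pin in scan order: 000, 001, 002, ..."""
    for value in range(10 ** digits):
        yield str(value).zfill(digits)


def scan(box_reacts: Callable[[str, str], bool],
         command: str = "volume down") -> Tuple[Optional[str], int]:
    """Try each pin in order until the callback reports a reaction.

    Returns (pin, attempts), or (None, attempts) if nothing reacted.
    """
    attempts = 0
    for pin in scan_pins():
        attempts += 1
        if box_reacts(pin, command):
            return pin, attempts
    return None, attempts


# Hypothetical box whose pin is 042 (standing in for watching the screen).
found_pin, attempts = scan(lambda pin, command: pin == "042")
print(found_pin, attempts)
```

With three digits there are only 1,000 possible pins, so even the worst case is a short wait.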

So, now that you have the pin, you have the structure of the packets that get sent to the Jukebox, which is: preamble > pin > command.
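
The three-part layout (preamble, then pin, then command) can be sketched in a few lines of Python. This is only an illustration of the idea: the `JBOX` preamble, the `>` separator and the text encoding are all assumptions made up for the example, not the real over-the-air format of any manufacturer.

```python
from dataclasses import dataclass
from typing import Optional

PREAMBLE = "JBOX"  # hypothetical marker, not a real vendor value


@dataclass
class JukeboxFrame:
    pin: str
    command: str

    def encode(self) -> str:
        """Lay the frame out as preamble > pin > command."""
        if not (len(self.pin) == 3 and self.pin.isascii() and self.pin.isdigit()):
            raise ValueError("pin must be exactly three digits")
        return f"{PREAMBLE}>{self.pin}>{self.command}"

    @classmethod
    def decode(cls, raw: str, my_pin: str) -> Optional["JukeboxFrame"]:
        """Return the frame if it is addressed to this box, else None."""
        parts = raw.split(">", 2)
        if len(parts) != 3:
            return None
        preamble, pin, command = parts
        if preamble != PREAMBLE:   # phase 1: is this for a jukebox at all?
            return None
        if pin != my_pin:          # phase 2: is it for this jukebox?
            return None
        return cls(pin, command)


frame = JukeboxFrame("042", "volume down")
wire = frame.encode()
print(wire)
print(JukeboxFrame.decode(wire, my_pin="042"))
print(JukeboxFrame.decode(wire, my_pin="000"))
```

The receiver side mirrors the two phases described earlier: anything without the preamble is ignored, and anything with the wrong pin is ignored too.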


You now have free rein to control the Jukebox as you see fit. If you’re lucky, the establishment you frequent has added features like additional credits and skipping garbage music to the floating keys (the nondescript buttons that start with ‘F’ or ‘P’).

TCP Packet Structure

Ok, so now we understand how packet structure allows commands to be sent over the air to a Jukebox. How does that help us understand something like how TCP packets work?

TCP stands for Transmission Control Protocol and is one of the main ways that data is transmitted over a network. Where the example Jukebox packet only had 3 parts (preamble > pin > command), TCP packets (strictly speaking, TCP calls them segments) are similarly structured but carry much more information. On top of that, the Jukebox commands were one-way, whereas TCP is a back-and-forth communication between two hosts.

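Here is the structure of a TCP packet: a header made up of a source port, a destination port, a sequence number, an acknowledgement number, a header length (data offset), a set of flags (SYN, ACK, FIN and friends), a window size, a checksum, an urgent pointer and optional options, followed by the data being carried.

As you can see, it’s a bit more involved than the format that we used for the Jukebox, but it is made of the same kind of building blocks.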
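
To make that layout concrete, here is a minimal Python sketch that packs and unpacks the fixed 20-byte TCP header with the standard `struct` module. The field order and sizes follow the TCP specification. The function names and example port numbers are my own, options are left out, and the checksum is left at zero (an assumption for brevity; a real checksum also covers parts of the IP header and the data).

```python
import struct

# Fixed part of the TCP header, network byte order, 20 bytes in total:
# src port, dst port, sequence, acknowledgement,
# data offset + flags, window, checksum, urgent pointer
TCP_HEADER = struct.Struct("!HHIIHHHH")

FIN, SYN, RST, PSH, ACK = 0x01, 0x02, 0x04, 0x08, 0x10


def build_tcp_header(src_port: int, dst_port: int, seq: int, ack: int,
                     flags: int, window: int = 65535) -> bytes:
    data_offset = 5                       # header length in 32-bit words (no options)
    offset_and_flags = (data_offset << 12) | flags
    checksum = 0                          # placeholder, not a valid checksum
    urgent_pointer = 0
    return TCP_HEADER.pack(src_port, dst_port, seq, ack,
                           offset_and_flags, window, checksum, urgent_pointer)


def parse_tcp_header(raw: bytes) -> dict:
    (src_port, dst_port, seq, ack,
     offset_and_flags, window, checksum, urgent_pointer) = TCP_HEADER.unpack(raw[:20])
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "seq": seq,
        "ack": ack,
        "data_offset": offset_and_flags >> 12,
        "flags": offset_and_flags & 0xFF,
        "window": window,
        "checksum": checksum,
        "urgent_pointer": urgent_pointer,
    }


header = build_tcp_header(src_port=50000, dst_port=445, seq=1000, ack=0, flags=SYN)
fields = parse_tcp_header(header)
print(len(header), fields)
```

Compared with the Jukebox frame, the ports play a role loosely similar to the pin (they say who the packet is for), and the flags say what kind of message it is.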

The Handshake

We’ve got packet structure down on a basic level and can wrap our heads around how to send in one direction, like a remote control. And we’ve gone over how the TCP packet structure, although more involved, is similar in that you just have to make sure the packets are organized in the right sequence and with the right info for the other device to understand. Let's talk now about two-way communication and how a device connects to another device on a network.

The handshake between two devices is built from two ingredients: a synchronization (SYN) request and an acknowledgement (ACK), in both directions. Both the SYN and the ACK have to follow the packet structure discussed earlier for the other side to accept them and send the proper reply.

Here is an example of the most basic version, usually called the three-way handshake because it takes three packets:

Let's say you’re trying to send a file from your laptop to a shared drive on your home network. That communication between your laptop and the drive would start like this:

  • Your laptop would send a structured SYN packet that says “hey shared drive, I’d like to open a connection”
  • The shared drive would see that SYN packet (if it’s formed properly) and respond with a combined synchronization and acknowledgement packet (SYN/ACK)
  • Your laptop would respond with a final acknowledgement packet (ACK)
  • Now communication is open between the two devices via a TCP connection
  • From this point on, the application’s data (the file itself) travels over that connection, still by packet transfer
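
The steps above can be sketched as a tiny simulation that tracks the sequence and acknowledgement numbers each side sends. It is a simplified model, not real networking code: the starting numbers are picked by hand here (real stacks choose them unpredictably), and the dictionary layout and names are my own.

```python
def three_way_handshake(client_isn: int, server_isn: int) -> list:
    """Return the three handshake packets as simple dictionaries.

    A SYN uses up one sequence number, so each side acknowledges
    the other side's initial sequence number plus one.
    """
    wrap = 2 ** 32  # sequence numbers are 32-bit and wrap around
    syn = {"from": "laptop", "flags": "SYN",
           "seq": client_isn, "ack": None}
    syn_ack = {"from": "drive", "flags": "SYN/ACK",
               "seq": server_isn, "ack": (client_isn + 1) % wrap}
    ack = {"from": "laptop", "flags": "ACK",
           "seq": (client_isn + 1) % wrap, "ack": (server_isn + 1) % wrap}
    return [syn, syn_ack, ack]


steps = three_way_handshake(client_isn=1000, server_isn=5000)
for packet in steps:
    print(packet)
```

The detail to notice is that the drive has to remember the laptop's number between the second and third packet. That bit of remembered state is what the next section picks on.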

Ok, so now we know how digital things talk to each other over TCP. How can we have some fun with this?

SYN Flooding

Knowledge of packet structure becomes important when you’re trying to mess with how things work on a network. The SYN > ACK exchange that starts a connection is unauthenticated, and by default most network devices are built to accept this kind of connection request from anyone. So, if you build packets with the right structure but the wrong info, you can really mess with what is happening on the network device.

Using promiscuous network analysis tools, an attacker is able to sniff traffic out of the air and see who is trying to connect to what network devices. Packets are visible and, at the SYN > ACK level, have very little protection. Knowing the proper packet structure, an attacker could build SYN packets with fake IP addresses and fake hardware addresses and spam every possible port on the network device. The device answers each SYN with a SYN/ACK and holds a half-open connection in memory while it waits for a final ACK that never comes. Enough of these will either fill the available memory or get in the way of any legitimate devices initiating connections. By kicking off a properly arranged SYN flood attack, a malicious actor could basically make the targeted device unreachable and stop anyone from being able to connect to it.
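
From the defender's side, that pattern is fairly easy to spot: lots of SYNs aimed at one device with no final ACK to complete them. Here is a rough Python sketch of that check, run over a hand-made list of packet records rather than a real capture. The tuple layout, the addresses and the flood size of 50 are all made up for illustration.

```python
from collections import Counter

SYN, ACK = 0x02, 0x10


def half_open_counts(packets) -> Counter:
    """Count SYNs per target that were never completed by a final ACK.

    `packets` is an iterable of (src_ip, src_port, dst_ip, dst_port, flags)
    tuples in capture order.
    """
    pending = set()
    for src_ip, src_port, dst_ip, dst_port, flags in packets:
        conn = (src_ip, src_port, dst_ip, dst_port)
        if flags & SYN and not flags & ACK:
            pending.add(conn)          # client asked to open a connection
        elif flags & ACK and not flags & SYN:
            pending.discard(conn)      # client finished the handshake
        # SYN/ACK replies travel the other way and are ignored here
    return Counter((dst_ip, dst_port) for _, _, dst_ip, dst_port in pending)


# One legitimate handshake from the laptop to the shared drive...
capture = [
    ("192.168.1.10", 50000, "192.168.1.20", 445, SYN),
    ("192.168.1.20", 445, "192.168.1.10", 50000, SYN | ACK),
    ("192.168.1.10", 50000, "192.168.1.20", 445, ACK),
]
# ...followed by 50 SYNs from made-up source addresses that never reply.
capture += [(f"10.0.0.{i}", 40000 + i, "192.168.1.20", 445, SYN)
            for i in range(1, 51)]

suspects = half_open_counts(capture)
print(suspects)
```

A pile of unanswered SYNs against a single address and port, like the 50 here, is the same signature you would look for by eye in a packet capture.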

There you have it. A basic packet structure to talk to Jukeboxes, which leads to a more complicated view of TCP packets, which gets you to how knowing what a packet is supposed to look like allows an attacker to mess with ‘over the air’ network traffic. You also now understand packet structure well enough to use something like Wireshark to identify SYN flooding or poorly formed packets! (or at least you're on your way to getting there)