One of our specialties at Two Six Technologies is physical adversarial attacks. In a forthcoming paper, I present work on using everyday items to cause computer vision models to operate incorrectly. In this post, I walk you through work that did not make it into the paper; two dead ends I hit on the way to the solution. I’ll also provide an analysis of why they did not work.
Idea 1: Duct Tape
With seven colors of duct tape, I set out to make adversarial patterns. My goal was to use as little tape as possible while hiding the person from the object detector.
First, I wrote a script that, given an image with a person, maps 2‑inch‑by‑2‑inch squares onto the body’s keypoints. Each square follows the person, so the locations are consistent across images. I built an optimization loop that picked the most “important” squares, simulated tape placed at those locations, and repeated until the model stopped detecting the person. After running the optimization, I built the design in the real world and took test photos. Here was the result:

Note: the fact that the simulated image was based on a blurry picture did not change the result.
Three Problems:
- Orientation: The simulation always shows the tape facing straight at the camera, but once it is attached to a body, the tape conforms to the contours.
- Shadows & lighting: The simulation never accounted for shadows, so the simulated tape looks different from the tape without direct light on it.
- Practicality: Taping 70 or so tiny squares onto an outfit is pretty tedious.
Idea 2: Cardboard and face masks covered in duct tape
My thought was that since cardboard and face masks are bigger, they would be less tedious to attach than small pieces of tape. Additionally, I thought that since they were rigid, it would be easier for them to face the camera. I designed a similar simulation and optimization process as with the duct tape. After running the optimization, I once again made the design and took some photos. Here was the result:

It turns out to be rather difficult to get cardboard squares to stay on you and face the camera. Despite my attempt to mimic the simulation, the flaws in the angles and lighting ruin the attack.
“If we knew what it is we were doing, it would not be called research. Would it?” – Albert Einstein
What we learned
Both attempts failed because the simulation did not capture the messy reality of how objects wrap, bend, and cast shadows on a human body. For the final paper, I took a completely different route.
I’ll post a link to the paper here as soon as it’s public.



