Fishing Cash with Kinect

by Dr. Mike

October 12 2016

Why This Article?

One day I started to look back and think about what made our two versions of the Summer Fishing Game so successful in Osaka, Japan, and why Kinect was the perfect technology platform for creating real marketing value for a shopping mall.

I think we gained some experience worth sharing. In this article, I’ll do my best to describe the key factors for building successful Kinect apps for places like shopping malls, and I include a few tips and tricks for designers and developers of similar installations.

The Basic Idea

In 2015 I got a request from Igrek Design, a design house based in Osaka, to do all the programming for a Kinect-enabled Augmented Reality (AR) game that was going to be set up in a shopping mall for a summer sale event running through the whole of August.

Fantastic! Until then I had built mainly business-to-business apps and research tools on Kinect, so entering the consumer space was something new and interesting to me. There was also the challenge of building a robust Kinect app remotely, from my base in Malaysia for a venue in Japan, without hardware that exactly matched the target venue.

Our Summer Fishing Game was to enable the player to:

Enter the game
Catch some virtual fish
Get a discount coupon to scan with a smartphone
Use the coupon to get a discount in nearby shops in the mall

You’ll get the point from the video made by Igrek Design:

The Kinect 2 sensor was mounted on top of a person-sized, vertically oriented screen with a resolution of 1920x1080 px, at a height of about 2.3 meters. This put each player typically at least 3 meters from the sensor, whereas Kinect’s maximum distance is 4 meters.

The Marketing Impact

We were fortunate with the location of our installation: right in front of a convenience store surrounded by lots of other shops, so every single person passing by noticed our game. A perfect spot! The combination of well-designed posters and an exciting-looking Kinect installation caught the eye really well.

The posters also helped us restrict the player area, so that passers-by would not enter the game accidentally.

The game did not attract only young people familiar with this sort of technology.

The use of a discount coupon as the prize was quite interesting from the beginning. In the 2015 version we simply let the player take a photo of the coupon, which was labelled with a timestamp.

In the 2016 version the player gets a selfie with all the AR props (implemented with just a screen capture), which is sent to a website. The game then shows a QR code that the player can scan with a smartphone to land on a website where they can choose between a 100, 500 or 1000 yen coupon for different shops and download the photo. The most valuable coupon was available only after reaching a certain score... something no player ever managed on the first try. Almost everyone tried twice, since the game has a well-designed learning curve and the prize follows the score.

In 2015 we did not count the number of games played, only the actual coupons redeemed. This was about 10-15 each day, worth about 200 000 yen in total. Not bad! In 2016 our game produced exactly 7345 coupons, meaning about 200 every day! This was a total surprise to us and definitely made us think about the real possibilities of this kind of marketing. It also means the total value of all the coupons issued is somewhere around 3-4 million yen. (We cannot, however, disclose the exact number and total value of the coupons that were actually redeemed ... that’s classified.) Anyway, we can conclude that this sort of pure advertisement game can make real money circulate around the neighborhood.

Technical Considerations for Game Design

1) Skeletal tracking distractors

For the 2015 version, we spent days and days testing the gameplay and adjusting the parameters that regulated the fish capture. This was because Kinect cannot perfectly track the specifics of every player's hand movement from 3 meters away all the time, especially when the hand movement is fast.

There were some hilarious moments, such as when the AR glove appeared on a lady's handbag right below her actual hand. Yet the game worked as intended! That was not luck, but the result of countless hours of testing.

Whatever is designed for this kind of shopping mall environment has to be robust enough to handle a whole variety of props and disturbances that, in a normal Xbox setting at home, would belong on your troubleshooting list:

Handbags or plastic bags full of newly bought stuff
Hats, gloves, and baggy clothes
Fluffy hairstyles (hah!)

All of this will be there and things should still just work. I went through the game several times wearing different clothes (especially shirts), holding bags and even pillows, and tried to find the practical limits of which props would ruin the gameplay. Eventually, I settled on two simple rules for recognizing a bimanual catch gesture:

A distance of at least 25 cm from the torso to either hand on the Z axis, with both joints fully visible and tracked
An object collision between one (not both!) of the gloves and the fish

That gave us a wide enough space for recognizing the gesture, yet false positives were minimal since standing normally would not trigger it. The most important improvement was the latter rule. The obvious choice would be to require both gloves to collide with the fish object, but this is where a lot of testing is needed. My first implementation took that approach and failed even my own tests, because hand tracking varies too much in rapid movements: the tracking state can drop to Not Tracked, and the trajectory can vary a lot when the hands are close together at the end of the catching movement.

On the other hand, without any limit on the distance from the body to the hands, catching the fish would become far too easy. A player could just hold a hand in the middle of the fish’s jumping trajectory, which would kill all the fun. Worse, bystanders would see this as the winning strategy, which would ruin the entire experience and the marketing impact with it! The difficulty level must be set exactly right even for a simple game like this.
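The two gesture rules can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the installation: the Joint and Fish types, the circular hit areas, the glove radius, and the reading that the colliding hand itself must be the one held out towards the sensor are all hypothetical. Only the 25 cm threshold and the "one glove is enough" rule come from the description above.

```python
from dataclasses import dataclass

MIN_REACH_M = 0.25  # from the article: at least 25 cm between torso and hand on Z


@dataclass
class Joint:
    x: float       # meters, horizontal position in sensor space
    y: float       # meters, vertical position in sensor space
    z: float       # meters, distance from the sensor
    tracked: bool  # True only when the joint is reported as fully tracked


@dataclass
class Fish:
    x: float
    y: float
    radius: float  # hypothetical circular hit area, in meters


def hand_is_reaching(torso: Joint, hand: Joint) -> bool:
    """Rule 1: both joints tracked, hand held out towards the sensor."""
    if not (torso.tracked and hand.tracked):
        return False
    # Assumption: "distance on the Z axis" means the hand is closer to
    # the sensor than the torso by at least the threshold.
    return torso.z - hand.z >= MIN_REACH_M


def glove_hits_fish(hand: Joint, fish: Fish, glove_radius: float = 0.10) -> bool:
    """Rule 2: circle-vs-circle overlap test in the X/Y plane."""
    dx, dy = hand.x - fish.x, hand.y - fish.y
    reach = glove_radius + fish.radius
    return dx * dx + dy * dy <= reach * reach


def is_catch(torso: Joint, left: Joint, right: Joint, fish: Fish) -> bool:
    """A catch needs only ONE reaching glove to touch the fish."""
    return any(
        hand_is_reaching(torso, hand) and glove_hits_fish(hand, fish)
        for hand in (left, right)
    )
```

Because an untracked hand simply fails the first rule, a momentary tracking dropout on one hand does not block a catch made with the other, which is the robustness the one-glove rule is meant to buy.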

The third and final implementation was actually simpler than the previous two, yet more robust. Usually I try to accommodate different body sizes in all distance calculations, but the third implementation seemed to work well with kids too, so I could keep the gesture rules simple. Once a good balance between the parameters was found, we decided to let the rest go, and we realized that many of the real-world disturbances can actually serve as a randomness factor in the game. Why not! As long as every player could catch at least a couple of fish the first time, trying the game a second time was almost mandatory, since we managed to induce the “I almost got it!” feeling. Brilliant!

Even when a person’s hair and rapid hand movement around the head caused skeletal tracking to confuse the hand with the hair, it did not seem to have much negative impact on the experience. As soon as Kinect resumed tracking the person normally, a second or two later, the game went on as usual.

Funny tracking issues did not seem to bother anyone in the photo taking either. Actually, it started to feel like some players deliberately struck funny poses to mess things up. I suppose it is a natural tendency in human behavior.

In the 2016 version, the fish just passed by at different heights around the player. In this case, a simple object collision between either glove and the fish was quite enough. The fun part, and the factor we spent the most time testing, was the number and speed of the fish needed to make the game non-trivial. Even if the player stayed in a static pose to simply block the fish’s path, the maximum score could not be reached. This was a rather easy implementation and quick enough to test comprehensively.

2) Start pose

The advantages of having a start pose to identify the active player are still a bit fuzzy in my mind. Since Kinect does not require the kind of calibration pose typical of other similar technologies, a start pose is not needed for any technical reason. However, a simple T-pose makes it very clear to a first-time player when the game starts. In the 2015 version, we used the T-pose with a 3 second countdown, partially at the customer’s request. This worked just fine, and the visuals made it obvious to anyone how to start the game. But it costs those 3 seconds, plus the transition to the next state, in a situation where the game could be launched straight away.
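A T-pose start with a countdown could be sketched along these lines. This is a minimal illustration, not the original implementation: the Point type, the two tolerance values, and the choice to restart the countdown whenever the pose is broken are assumptions made for the example. Only the 3 second duration comes from the description above.

```python
from dataclasses import dataclass
from typing import Optional

COUNTDOWN_S = 3.0          # from the article: 3 second countdown
HEIGHT_TOLERANCE_M = 0.15  # hypothetical: hands roughly at shoulder height
MIN_ARM_SPREAD_M = 0.45    # hypothetical: hands well out to the sides


@dataclass
class Point:
    x: float  # meters, horizontal position in sensor space
    y: float  # meters, vertical position in sensor space


def is_t_pose(l_shoulder: Point, r_shoulder: Point,
              l_hand: Point, r_hand: Point) -> bool:
    """True when both hands are level with and far out from the shoulders."""
    pairs = ((l_shoulder, l_hand), (r_shoulder, r_hand))
    level = all(abs(h.y - s.y) <= HEIGHT_TOLERANCE_M for s, h in pairs)
    spread = all(abs(h.x - s.x) >= MIN_ARM_SPREAD_M for s, h in pairs)
    return level and spread


class StartPoseCountdown:
    """Reports True once the pose has been held for the full duration."""

    def __init__(self, duration_s: float = COUNTDOWN_S) -> None:
        self.duration_s = duration_s
        self._started_at: Optional[float] = None

    def update(self, pose_held: bool, now_s: float) -> bool:
        if not pose_held:
            # Assumption: breaking the pose restarts the countdown.
            self._started_at = None
            return False
        if self._started_at is None:
            self._started_at = now_s
        return now_s - self._started_at >= self.duration_s
```

Calling update once per frame with the current time keeps the countdown independent of the frame rate, and the elapsed time is also what an on-screen countdown would display.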

In 2016 we didn’t have a start pose: we just showed the AR glove and diving gear overlays and let Kinect do the rest. Both approaches worked well from the technical and the player’s perspective! The only thing to mention here is that if no start pose is used, the game needs to react and display something immediately once the user is recognized. AR props with an appropriate sound effect work just as well as a pre-defined start pose.

After seeing both options work well, I lean towards not using a start pose. The game works fine without it when the physical game area is restricted appropriately. In a completely open space, using a start pose makes more sense ... which brings us to the next topic.

3) Game area

In a public space like this, restricting the game area on the floor is a must. Passers-by must not be let inside the game area, although sometimes it is inevitable. In one such case only the person's head was seen by the sensor. It did not, however, cause skeletal tracking to pick him up as a user. No harm done.

While the area is easy to define computationally in terms of the distance from the sensor (here, 3 meters), we went through at least 3 test iterations to find the correct horizontal limit. What needs to be tested is the scenario of 2-3 friends playing the game and standing close to each other. The safe limit is about an arm's length on both sides of the player. We ended up simply limiting the player's base position to between -60 cm and +60 cm on the X axis of Kinect's coordinate system (to make this theoretically perfect, the limit would need to be fitted to the player's height instead of being a fixed number of centimeters). A clear marker on the ground makes staying in the game idiot-proof. Well, almost!
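A game-area filter along these lines might look as follows. It is a sketch under assumptions: the Body type, the depth tolerance around the 3 meter play distance, and the rule of preferring the most central person when several are inside the area are hypothetical. Only the 3 meter distance and the -60 cm to +60 cm horizontal limit come from the description above.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MAX_OFFSET_X_M = 0.60    # from the article: -60 cm to +60 cm on the X axis
PLAY_DISTANCE_M = 3.0    # from the article: about 3 meters from the sensor
DEPTH_TOLERANCE_M = 0.5  # hypothetical slack around the play distance


@dataclass
class Body:
    body_id: int
    x: float  # meters, base position left/right of the sensor axis
    z: float  # meters, distance from the sensor


def in_game_area(body: Body) -> bool:
    """True when the body's base position is inside the marked area."""
    within_x = abs(body.x) <= MAX_OFFSET_X_M
    within_z = abs(body.z - PLAY_DISTANCE_M) <= DEPTH_TOLERANCE_M
    return within_x and within_z


def pick_player(bodies: Iterable[Body]) -> Optional[Body]:
    """Choose the active player among all tracked bodies, if any qualify."""
    candidates = [b for b in bodies if in_game_area(b)]
    # Assumption: with friends standing close together, the person
    # nearest the center line is treated as the player.
    return min(candidates, key=lambda b: abs(b.x), default=None)
```

To fit the limit to the player's height, as suggested above, MAX_OFFSET_X_M could be replaced by a value derived from the tracked body instead of a constant.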

4) Human factors

Since it was obvious to anyone that this installation was there just for fun, many people used it for exactly that.

For a function that resembles taking a selfie, we must assume people will do all sorts of funny things, alone or as a group. The more the merrier! Things didn't work at all as we intended, but that didn't seem to stop anyone from having fun, especially when you get to keep the photo!

And sometimes the unthinkable happens even in the player's mind. The fish painted on the carpet is not for eating!

Conclusion

After explaining all this, I’d like to highlight that designing and implementing even simple game concepts such as our two versions of the Summer Fishing Game requires quite a bit of detailing, testing, re-development, and re-testing to get right. Game design, duration of the gameplay, the engagement method for starting the game, gesture recognition, the physical space, and the prize all have to come together very nicely. When done right, it can really pay off in the form of a successful advertisement game capable of circulating serious amounts of money.

In case you happen to be interested in this exact or similar concept, don’t hesitate to make contact via LinkedIn. Go fishing cash with Kinect!

Special thanks to Igrek Design.

Created by MGWalton