Category: Research

A new Kinect (version 3)

Microsoft’s Kinect sensor can recognize human bodies and track the movement of body joints in 3D space. In 2019 an entirely new version called Azure Kinect DK was released by Microsoft. It is the third major version of the Kinect.

Originally, the Kinect was released in 2010 (version 1, Xbox) and in 2013 (version 2, Xbox One), but production was discontinued in 2017. However, Kinect technology was integrated for gesture control in the HoloLens (2016). While the Kinect failed to become a mainstream gaming controller, it was widely used for research and prototyping in the area of human-computer interaction.

The camera looks quite different from its earlier cousins.

In early 2022 we acquired the new Azure Kinect for the Interaction Engineering course at a cost of around 750 € here in Germany.

Setting up the Kinect

The camera has two cables: a power supply and a USB connection to a PC. You have to download and install two software packages:

  • Azure Kinect SDK
  • Azure Kinect Body Tracking SDK

It feels a bit archaic because you need to run executables in the console. For instance, it is recommended that you perform a firmware update on the sensor. For this, go into the directory of the Azure Kinect SDK and call “AzureKinectFirmwareTool.exe -Update <path to firmware>”. The firmware is in another directory of this package.

As a next step you go into the Azure Kinect Body Tracking SDK directory where you can start the 3D viewer. Again, this has one parameter so you cannot just click it in the file explorer. Type “k4abt_simple_3d_viewer.exe CPU” or “k4abt_simple_3d_viewer.exe CUDA” to start the viewer (in the /tools directory).
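Since the viewer needs its mode argument, a small launcher script can save some typing. The following is only a minimal sketch: the executable name and the two modes come from the text above, while `tools_dir`, `build_viewer_command` and `launch_viewer` are hypothetical names chosen for illustration, and the actual installation path depends on your system.

```python
import subprocess
from pathlib import Path

VIEWER_EXE = "k4abt_simple_3d_viewer.exe"  # name as given in the text above
MODES = ("CPU", "CUDA")                    # the two modes mentioned above


def build_viewer_command(tools_dir, mode="CPU"):
    """Return the argument list for starting the 3D viewer.

    tools_dir is the tools directory of your Body Tracking SDK
    installation (an assumption: adjust it to your own system).
    """
    mode = mode.upper()
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}, got {mode!r}")
    return [str(Path(tools_dir) / VIEWER_EXE), mode]


def launch_viewer(tools_dir, mode="CPU"):
    """Start the viewer and wait until its window is closed."""
    subprocess.run(build_viewer_command(tools_dir, mode), check=True)
```

Calling `launch_viewer(tools_dir, "CUDA")` with your own tools directory then starts the viewer in CUDA mode without opening a console first.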

This is what you see (with the CPU version this is very slow).

Differences between Kinect versions

The new Kinect obviously improves on various aspects of the older ones. The two most relevant aspects are the field of view (how wide the camera's viewing angle is) and the number of skeleton joints that are reconstructed.

| Feature | Kinect 1 | Kinect 2 | Kinect 3 |
|---|---|---|---|
| Camera resolution | 640×480 | 1920×1080 | 3840×2160 |
| Depth camera | 320×240 | 512×424 | 640×576 (narrow), 512×512 (wide) |
| Field of view | H: 57°, V: 43° | H: 70°, V: 60° | H: 75°, V: 65° (narrow); H: 120°, V: 120° (wide) |
| Skeleton joints | 20 | 26 | 32 |
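To get a feeling for what the field-of-view numbers mean in practice, one can estimate how much space the camera covers at a given distance. The sketch below assumes an idealised pinhole camera with a rectangular view (it ignores lens distortion and the real shape of the depth image), so the results are rough estimates only. The 2 m distance is an arbitrary example value.

```python
import math

# Horizontal/vertical field of view in degrees, taken from the table above.
FIELD_OF_VIEW = {
    "Kinect 1": (57, 43),
    "Kinect 2": (70, 60),
    "Kinect 3 (narrow)": (75, 65),
    "Kinect 3 (wide)": (120, 120),
}


def visible_extent(fov_degrees, distance_m):
    """Extent in metres covered by an opening angle at a given distance.

    Idealised pinhole model: extent = 2 * d * tan(fov / 2).
    """
    return 2.0 * distance_m * math.tan(math.radians(fov_degrees) / 2.0)


for name, (h_fov, v_fov) in FIELD_OF_VIEW.items():
    width = visible_extent(h_fov, 2.0)
    height = visible_extent(v_fov, 2.0)
    print(f"{name}: about {width:.2f} m wide, {height:.2f} m high at 2 m")
```

Under this simplified model, the wide mode of the new Kinect covers roughly three times the width of the first Kinect at the same distance, which matters when several people or full-body movements need to fit into the picture.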

There is an open-access publication dedicated to the comparison between the three Kinects:

Michal Tölgyessy, Martin Dekan, Ľuboš Chovanec and Peter Hubinský (2021) Evaluation of the Azure Kinect and Its Comparison to Kinect V1 and Kinect V2. In: Sensors 21 (2).

Skeleton

Here is a schematic view of the joints that are recognized. In practice, it turns out that one has to pay special attention to the robustness of the signal for the hands, the feet and also the head orientation.

(Source: https://docs.microsoft.com/bs-latn-ba/azure/kinect-dk/body-joints)
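One common way to deal with jittery joints such as hands and feet is to smooth the positions over time and to ignore samples that the tracker itself rates as unreliable. The following is a minimal sketch of that idea using an exponential moving average. It is not part of any Kinect SDK: the class `JointSmoother`, the 0..1 confidence score and the threshold are assumptions made for illustration, and you would need to map them to whatever your tracking library actually delivers.

```python
from dataclasses import dataclass, field


@dataclass
class JointSmoother:
    """Exponential moving average over 3D joint positions (illustrative only)."""

    alpha: float = 0.5           # 1.0 = no smoothing, smaller = smoother but more lag
    min_confidence: float = 0.5  # hypothetical threshold on a 0..1 confidence score
    _state: dict = field(default_factory=dict, repr=False)

    def update(self, joint_name, position, confidence=1.0):
        """Return the smoothed (x, y, z) position for one joint."""
        previous = self._state.get(joint_name)
        if previous is None:
            # First sample for this joint: accept it as the starting point.
            smoothed = tuple(position)
        elif confidence < self.min_confidence:
            # Unreliable sample: hold the last smoothed value.
            smoothed = previous
        else:
            # Blend the new sample with the previous smoothed value.
            smoothed = tuple(
                self.alpha * new + (1.0 - self.alpha) * old
                for new, old in zip(position, previous)
            )
        self._state[joint_name] = smoothed
        return smoothed
```

In use, you would create one `JointSmoother` and call `update` once per joint and frame, for example `smoother.update("HAND_RIGHT", (x, y, z), confidence)`. A smaller `alpha` gives a calmer signal at the price of more lag, so the value has to be tuned for each interaction technique.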

To integrate the Kinect with a JavaScript program, e.g. using p5js, I recommend looking at the Kinectron project.

Links

Microsoft’s Azure Kinect product page

Azure Kinect documentation page

Course chapter on the Kinect (Interaction Engineering) in German

Wikipedia on the Kinect (very informative)

For developers

Kinectron (JavaScript, including p5js)

Azure Kinect DK Code Samples Repository

Azure Kinect Library for Node / Electron (JavaScript)

Azure Kinect for Python (Python 3)

Student projects: Interaction Engineering (2018/19)

Check out last semester's Interaction Engineering team projects at interaction.hs-augsburg.de/projects.

15 students from all over the world with different backgrounds (computing, design, business, …) successfully completed the course and submitted a finished prototype. This year a number of projects dealt with reachability on mobile devices but we also saw gestural, touch and gaze interaction, one virtual reality project and interaction with a musical instrument.

Congratulations to all students for their excellent outcomes!

On our project page you can browse all projects. Each project comes with a short video, a written report and a set of slides.

Student projects: Interaction Engineering (2017/18)

Another round of fascinating interaction engineering projects has been completed. In this interdisciplinary course (computer science and design, Bachelor and Master students), we think up potential future human-computer interaction techniques based on current research publications.

This year we had 14 completed projects by 27 students, a new record after last year's 12 projects. Projects include interaction by gesture, full body, eye gaze, face, tangible object, HoloLens and trampoline! We even had a Lego robot.

Check out all projects (video, report, slides) under

http://interaction.hs-augsburg.de/projects

Creative Coding

It’s fascinating to see how many coding platform projects are dedicated to facilitating programming specifically for artists.

The following video presents three such projects. It features Processing (a Java derivative), Cinder (a C++ based framework) and OpenFrameworks (also C++). All of them are free and open source.

Let’s use this opportunity to post two examples of “creative coding”, both dealing with transformations of the human body. The first one is a video called “unnamed soundsculpture”, a work by onformative.

They used Kinects to record a dancer and then transformed the result with particle systems. The making-of video is at least as interesting as the final result:

The second example is “Future Self”, a light sculpture that works with sensor input about the position and pose of the observer.

Gaze Interaction

While the current focus in HCI is sensor-based interaction (à la Kinect), recent developments could foster interaction with the eyes. The Danish company EyeTribe (formerly Senseye) is building a very nice tracking system with $2.3 million in support from the Danish government. Partners include the IT University of Copenhagen, DTU Informatics, LEGO and Serious Games Interactive.

EyeTribe plans to release an SDK for app development next year.

Microsoft’s Vision of HCI

Although Microsoft has a reputation for building suboptimal user interfaces, its research department actually has several world-class interaction design researchers (like Buxton, Hinckley, Wilson, Benko). There is no big human-computer interaction conference, be it CHI, SIGGRAPH, UIST or ITS, without several papers and keynote speakers from Microsoft Research. Recently, Microsoft has released several videos about the future of human-computer interaction, and these videos actually bring together many quite recent research findings, which are adopted almost one-to-one.

Here’s another one:

Some of the research concepts you see in the videos are:

  • Proxemic interaction (cf. Saul Greenberg, Till Ballendat et al.)
  • See-through displays
  • Multitouch and animation (cf. Takeo Igarashi)
  • Telepresence
  • Back-of-the-device interaction (e.g. Baudisch)
  • In-air gesture control
  • Interaction with and between multiple devices
  • Tangible Interaction (cf. Hiroshi Ishii et al.)

Windows 8 Critique by UI Expert Nielsen

Jakob Nielsen is a well-known and highly regarded expert in the world of interface/interaction design and human-computer interaction in general. He wrote a critique of Windows 8 shortly after its release, which caused a lot of controversy on the net (try Google with “Nielsen Windows 8”). Nielsen heavily criticizes the way Windows 8 tries to fuse desktop and mobile UI.

What’s interesting is that Nielsen conducted empirical user studies with 12 experienced PC users. The three findings that I find most relevant are these:

  • The double desktop (one traditional, one with big touchable tiles) is confusing since one has to switch between two worlds that work very differently (inconsistency).
  • The flat Metro style, while visually pleasing, makes it hard to distinguish regular text from clickable links.
  • Some of the new gestures, e.g. those that require the user to swipe from outside the touchpad into it, are highly error-prone.

I recently got my own Windows 8 laptop and could experience some of these concerns “live”. Even now, I find it difficult to know whether I’m in the Metro world or in the traditional desktop world because ALT+TAB switches between all applications (of both worlds). Gesture interaction is a pain. Of course, Microsoft has the problem that it is trying to introduce new interaction techniques for a huge range of actual hardware devices. That may be one reason why the resulting experience does not feel as optimized as in Apple products.

Nielsen’s own summary is this:

Hidden features, reduced discoverability, cognitive overhead from dual environments, and reduced power from a single-window UI and low information density. Too bad.

If you want a balanced picture, read some of the counterarguments on the net. I have not linked to any because I haven’t found anything substantial yet.

Facial Expression Replication in Realtime

The FaceShift software uses the depth image of the Kinect to replicate the speaker’s facial expression on an avatar’s face in real time. Awesome. Look at the video to see how fast the approach is: there is hardly any visible latency between the original motion and the avatar, and even really subtle facial motions are transferred to the virtual character. The software was developed by researchers from EPFL Lausanne, a research center with an excellent reputation, especially in the area of computer graphics. The software is envisioned for video conferencing and online gaming contexts, where it would allow a virtual face-to-face situation while speaking.

Copyright © 2025 Michael Kipp's Blog
