iPi Soft Evaluation
This article is a work in progress!
Commercial software programs produced by iPi Soft enable the use of multiple depth sensors for human motion capture applications. This article evaluates the use of this software with multiple Microsoft Kinect sensors in a tutorial format. Readers who want to get up and running quickly with a similar platform may save roughly the ten hours invested in the activities outlined here.
Overview
The software under evaluation includes:
- iPi Recorder - Free capture software
- iPi Mocap Studio - Commercially licensed motion analysis software (free trials are available)
- iPi Biomech Addon - Commercially licensed addon for Mocap Studio
The primary utility of this software combination is in using Mocap Studio to estimate the pose of a human motion model from data captured by multiple sensors. The Recorder application is necessary to record data in a (proprietary) format that Mocap Studio accepts as input. Also, since more than one computer may be used to record sensor data simultaneously, the Recorder software is free to install on many machines. Finally, the Biomech Addon is not necessary for motion capture, but it is useful for those who wish to export the results for further analysis in an environment such as Matlab.
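Since the Biomech Addon exports Matlab (*.mat) files, the results can also be inspected outside Matlab. The sketch below is an assumption, not iPi documentation: the variable names inside a real export are not listed here, so it round-trips a stand-in .mat file with SciPy to show the general approach.

```python
# Stand-in example only: the actual variable names in an iPi Biomech
# export are not documented here, so we fabricate a small .mat file.
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

joint_angles = np.zeros((100, 3))                   # fake tracking data
path = os.path.join(tempfile.mkdtemp(), "take01.mat")
savemat(path, {"joint_angles": joint_angles})

data = loadmat(path)                                # dict of variables
print([k for k in data if not k.startswith("__")])  # -> ['joint_angles']
```

In a real workflow, `loadmat` would be pointed at the file written by the Biomech Addon, and the keys printed above would reveal what the export actually contains.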
Workflow
The high-level workflow for a full cycle of data collection begins with preliminary setup, proceeds through capture and processing for both camera calibration and action (human motion) data, and ends with exporting the results to a Matlab-compatible format. Here is a listing of the main processes involved in the accompanying flowchart.
- Preliminary Setup
- Position the cameras, connect them to PCs, and configure camera settings in Recorder
- Calibration Capture
- Use Recorder to capture calibration data (wanding with a light marker)
- Calibration Processing
- Use Mocap Studio to process the calibration data
- Action Capture
- Use Recorder to capture action data (human motion)
- Action Processing
- Use Mocap Studio to process data from action capture and the calibration processing output to generate human motion tracking results
- Biomech Export
- Use Biomech Addon (from within Mocap Studio) to export the human motion tracking results in Matlab (*.mat) format for further analysis
Capture
Data collection is performed with iPi Soft Recorder running on Windows based host PCs that are connected to the Kinect sensors and a common LAN.
- Hardware setup
- Connect Kinects to their driving PCs (one Kinect v2 per Win8+ PC, or two Kinect v1 per Win7+ PC)
- Connect the PCs to a common LAN (e.g. a WiFi connection)
- Launch iPi Recorder on each PC, select the Kinect hardware, and click record video
- Background subtraction
- Capture static background for 10s on each instance of iPi Recorder
- Configure network synchronization
- Select enter slave mode on each desired slave PC
- Connect to slaves on the single master PC
- Capture scene
- Click start to begin recording (optionally add a delay)
- Refer to calibration or action procedures below for more details
- Click stop to end recording, then proceed directly with the following step
- Download and merge slave data manually
- If you have external media, manually merging the slave data saves time
- Click close to cancel the automatic network download of slave data, click yes if prompted
- Manually transfer (e.g. copy and paste) the master data (*.master) and the slave data (*.slave0, *.slave1, ...) to the processing PC.
- In iPi Recorder, go to Home >> Merge Video, select all data files (master and slaves), then follow prompts to merge videos.
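The manual transfer step above can be sketched in Python. The paths and take name below are hypothetical stand-ins, and the sketch simulates the external media in a temporary directory so that it is self-contained:

```python
import shutil
import tempfile
from pathlib import Path

# Hypothetical stand-ins: "usb_drive" plays the external media and
# "processing_pc" the working folder on the processing machine.
root = Path(tempfile.mkdtemp())
src = root / "usb_drive"
dst = root / "processing_pc"
src.mkdir()
dst.mkdir()

# Fake recorded files for one take (names follow the *.master / *.slaveN
# pattern mentioned above).
for name in ("take01.master", "take01.slave0", "take01.slave1"):
    (src / name).touch()

# Copy the master file and every slave file into one folder; then, in
# iPi Recorder, use Home >> Merge Video and select all of them.
for f in sorted(src.glob("take01.*")):
    shutil.copy2(f, dst)
print(sorted(p.name for p in dst.iterdir()))
# -> ['take01.master', 'take01.slave0', 'take01.slave1']
```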
Processing
The processing of both calibration and action scene data is performed with the iPi Mocap Studio (Pro) software.
Requirements
Sensors
Computers
Procedure
Preliminary Setup
Calibration Capture
The purpose of a calibration scene is to capture known geometry, visible in every depth image, so that the transformations relating the camera frames can be computed. The known geometries expected by iPi Mocap Studio are either a reference plane or a light marker (e.g. a bare flashlight bulb). Since the former is supported for at most two cameras, it is advisable to use the light-marker method exclusively.
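iPi's calibration solver is proprietary, but the underlying idea can be illustrated. Given the light marker's 3D positions as seen from two camera frames, the rigid transform relating the frames can be recovered by least squares, e.g. with the Kabsch algorithm. The sketch below uses synthetic data and is not iPi's actual implementation:

```python
import numpy as np

def rigid_transform(A, B):
    """Find R, t minimizing ||(A @ R.T + t) - B|| for matched 3D points."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)                 # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))    # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cb - R @ ca
    return R, t

# Synthetic marker path: the same sweep seen from two camera frames.
rng = np.random.default_rng(0)
A = rng.normal(size=(100, 3))                 # positions in camera 1's frame
theta = np.deg2rad(30)
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -1.0, 2.0])
B = A @ R_true.T + t_true                     # positions in camera 2's frame

R, t = rigid_transform(A, B)
print(np.allclose(R, R_true), np.allclose(t, t_true))  # -> True True
```

This also makes clear why occlusions hurt: every frame in which the marker is hidden from some camera removes a matched point pair from that camera's solve.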
- Prepare light marker (before recording)
- Remove the cap and lens from a flashlight such as a Mini Mag-Light
- Record "wanding" of workspace volume
- Sweep spirals about two feet in diameter around a vertical axis, with roughly four revolutions per six feet of height
- Sweep these spirals in quadrants across the area of the horizontal plane bounding the desired workspace volume
- Repeat this procedure until at least 30s of footage has been recorded
- Important notes
- Be sure to maintain line-of-sight between the light and every camera; occluded frames are wasted data points
- DO NOT hold the flashlight cap and lens in your unused hand while recording - these parts may register as false light markers during calibration processing
- I have tried this both with slightly dim, uneven lighting and with very bright, even fluorescent lighting - the bright lighting produced better (usable) results
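The wanding geometry described above can be sketched numerically. The dimensions follow the text; the sample count is an arbitrary choice:

```python
import numpy as np

# Helix matching the description: ~2 ft diameter about a vertical axis,
# ~4 revolutions per 6 ft of height. Sample count is arbitrary.
diameter_ft, height_ft, revolutions = 2.0, 6.0, 4
t = np.linspace(0.0, 1.0, 500)
theta = 2.0 * np.pi * revolutions * t
x = (diameter_ft / 2.0) * np.cos(theta)
y = (diameter_ft / 2.0) * np.sin(theta)
z = height_ft * t
print(z[-1], theta[-1] / (2.0 * np.pi))  # -> 6.0 4.0
```

Plotting (x, y, z) shows the target sweep; repeating it once per quadrant of the workspace footprint covers the full capture volume.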
Calibration Processing
Action Capture
The recording of an action scene is quite simple. Every take must begin with a T-pose, and every part of the actor's body should remain in view of all cameras for the duration of the scene.
Note that the cameras must remain in exactly the same positions as during calibration throughout all captures.
Action Processing
Testing Notes
(4) Kinect v1
- Setup
- Multiple Kinect v1 sensors may be driven by the same host PC running Windows 7 (possibly earlier) or newer. The limit on the number of Kinect v1 sensors per computer is the number of hardware USB controllers in the system; it is recommended to provide one USB controller per Kinect v1. This recommendation was not tested here, to save time. In these tests, two Kinect v1 sensors were driven by each of two PCs, one slave and one master.
(3) Kinect v2
- Setup
- Each Kinect v2 must be driven by a dedicated PC running Windows 8 or 10. This is because the Kinect SDK only supports one device per system and requires Windows 8 or newer OS. I used the processing desktop as one PC and two ASUS laptops to drive the other two sensors remotely.
- Operation
- This procedure is roughly the same as for the Kinect v1, except that the v2 has no computer controls (i.e. configurable properties) for adjusting elevation. This is not a problem, since the larger FOV of the v2 is quite forgiving when adjusting sensor elevation manually.
- The captured files were for a 22s calibration scene, and each of three files was 130-170MB. The approximate download time via 802.11n was 6 minutes.
- Calibration
- Adjusted the ROI to clip off the end of the scene, where I walked out of the calibration volume to stop recording. The resulting ROI was 488 frames (16 s). The calibration took only about a minute, but the quality was not good enough: visual inspection of the resulting camera transforms revealed a failed calibration. The poor metrics were 147 good frames and 87% occlusions. I noticed that while making spirals with the light marker, the arm holding the light was occluding it from the camera behind me.
- In a second attempt, a much larger ROI (800 frames) resulted in another failed calibration. Very few (roughly 150) good frames were found.
- In a third attempt, the cameras were rotated slightly to move the glare from the other cameras' IR emitters away from the calibration volume. A much longer clip was recorded, in which a four-quadrant helix wanding was performed twice and then the volume boundary was swept in a spiral (an increase in total volume swept compared to previous attempts). This produced three data files of 470-723 MB each (1.78 GB after merging) with a duration of 97 s. To give the calibration the best chance of success, the ROI was clipped to a maximum of 2700 frames. This still did not work. It turned out that I was holding the flashlight's lens cap in my unused hand, and I think it was registering as false light markers in many frames.
- Attempt four finally completed, although the resulting accuracy of roughly 8 cm would normally be deemed a failure. The cameras seemed to register correct positions, so I moved forward. New changes made this time included turning the room lights on (possibly no impact; I am not sure) and increasing the duration of background capture to 10 s.
- Action
- The test action recorded was a sequence of three squats. The ROI was trimmed front and back to a duration of 18.5 s. I proceeded with the same processing technique used previously:
- Track forward course (tracking resolution: low) - about 5 minutes
- Refine forward fine (tracking resolution: high) - 14:30 (1.5fps)
- Jitter removal (default settings) - 3 minutes
- Problems
- When downloading slave data from the two ASUS laptops, the download failed after roughly six minutes (near completion). To work around this issue, start by manually transferring the data files from the slaves to the master. Then, in the Recorder program, invoke HOME >> MERGE VIDEOS, select the slave files and the master file, and follow the prompts. You should then see the merged video file in the same directory as the master input file.
- Don't hold reflective objects (e.g. the flashlight lens cap) while capturing a calibration scene!
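The slave-download time reported above (two files of roughly 130-170 MB each, about six minutes over 802.11n) can be sanity-checked with a quick back-of-envelope calculation; the midpoint file size is an assumption:

```python
# Back-of-envelope check of the effective download throughput, assuming
# the two ~150 MB slave files dominated the six-minute transfer.
size_mb = 150        # midpoint of the 130-170 MB per-file range
n_slaves = 2         # two ASUS slave laptops
time_s = 6 * 60      # observed ~6 minute download

throughput_mbit_s = n_slaves * size_mb * 8 / time_s
print(f"{throughput_mbit_s:.1f} Mbit/s")  # -> 6.7 Mbit/s
```

That effective rate is far below 802.11n's nominal throughput, so manually transferring the files (as in the workaround above) is a substantial time saver.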