Sphero API Tutorial

From Mark T. Wiki

This page offers my personal description of the Sphero API used to encode commands and general communication between Sphero and a host computer over a physical Bluetooth connection. This low-level (binary) API is documented by Sphero's developers at their online quick reference<ref name="sphero-api-web"> Sphero Docs API Quick Reference; Source: http://sdk.sphero.com/api-reference/api-quick-reference/</ref> as well as in a PDF document hosted on GitHub in Sphero API Developer Resources documentation <ref name="sphero-api-github"> Sphero Developer Resources (Orbotix) on GitHub; Source: https://github.com/orbotix/DeveloperResources/tree/master/docs</ref>.

Making sense of the Sphero API is most valuable for those who wish to implement an interface to the device on a new platform that isn't officially supported. For example, the MATLAB Interface, authored by Yi Jui Lee, forms the major basis for my interest.

Subsequent work that sprang from Yi Jui's efforts on the Sphero MATLAB Interface includes my rewrite and extension of that project, the Sphero API Matlab SDK. Code for this project can be found on Matlab File Exchange (Sphero API Matlab SDK) and was announced at the Sphero Developer Community Forums.

Sphero Basics

When we strip away the applications layer and the marketing tools representing Sphero as a consumer product, we're left with a lifeless (hopefully) collection of hardware components that requires some keen technical interactions to perform meaningful actions. This perspective on Sphero offers the sort of clean slate that is a good starting point for diving head-first into serial communications and control of embedded systems. Much of the approach used here is applicable to communicating with other robots on a low level.

As Sphero is described in its documentation,

... he's electronically a collection of raw inputs and outputs.

Understanding the functionality of the raw inputs and outputs and the ways in which they're connected is not a requirement for success in communicating with Sphero, but does offer insight regarding the purpose and capabilities of Sphero's communication protocol.

The outputs that are most obvious when watching Sphero in action are,

  • Light color and brightness change the appearance of Sphero
  • Motors

Whereas the inputs might be slightly more obscure,

  • Accelerometer and gyroscope help Sphero feel motion
  • Batteries provide energy

With only these inputs and outputs, Sphero is capable of using his microcontroller brain to obtain a pre-determined motion behavior such as staying still or moving at constant speed by:

  1. Sensing motion with accelerometer and gyroscope
  2. Thinking about how to obtain the desired behavior
  3. Actuating the motors in efforts to accomplish the task

This process describes the universal "sense, think, act" paradigm for robotic and automated systems as well as the fundamental way that Sphero works internally. The only missing link at this point is the ability for Sphero to know what procedure the user would like for it to perform.

The final item that completes the previous lists of inputs and outputs is the capability for Sphero to communicate to and from its users. This ability for Sphero to communicate is the fundamental focus of the remainder of this page, but it's critically important to recognize that Sphero is an autonomous system that performs actions under its own control.

At any point in time, Sphero is autonomously (and continuously) running a state machine that was designed by its developers at Orbotix and programmed into its firmware. Sphero's state machine will allow the device to perform various prescribed actions depending on specific conditions having been met. The factors which influence Sphero's behavior include (in order of decreasing assumed priority):

  1. Prescribed algorithmic behavior (i.e. programmed in the firmware)
    • Counting
    • Timing
    • Computation
    • External signals (interruption or sensory data)
  2. Sensory feedback (i.e. information about the physical environment)
    • IMU data (accelerometer and gyroscope sensors)
    • Battery voltage level
  3. Communications data (Sphero API)
    • The binary data stream coming through the Bluetooth radio

The lowest priority input to Sphero ought to be the communications and control data coming through its Bluetooth radio link. The speed of this data stream, the transmission reliability, and the reliability of users properly structuring commands are all reasons why Sphero is better off giving all other inputs precedence. Ultimately, the communication and control data input merely offers a way to suggest to Sphero what actions its state machine should consider performing next.

Sphero API

The first question some readers may ask is "What's an API?" Well, for starters, API is an acronym that stands for "Application Programming Interface." In the realm of computer programming, descriptions of this concept may vary depending on your perspective, so we'll adopt our own in the context of Sphero (and embedded systems, generally).

In the previous section, we described Sphero as an autonomous system that continually performs predetermined actions based upon its internal status and that of its inputs. Since all of Sphero's behavior is necessarily prescribed by the firmware written by the developers at Orbotix, special attention was directed toward enabling Sphero to do things that users want it to do.

We can think of Sphero's main program that controls its autonomous behavior as the robot's main application. In addition to writing the main application firmware that defines all of the functionality that Sphero can perform, the developers chose a group of functions from the main application which should be exposed to users. Then, they had to decide exactly what information must be transmitted, and how this information is structured, in order for Sphero to communicate with users. Finally, the main application is built with the ability to talk and listen in the language of this interface.

Generally, an application programming interface to a main program allows other programs to interact with it without knowing exactly how the main program works. The way this is done is summarized in the following list:

  1. Define the content of the commands and messages that must be communicated (similar to words)
  2. Define the way in which the information is communicated (similar to grammar, sentence structure, etc.)
  3. Ensure that the main application speaks this language perfectly
  4. Thoroughly document and distribute the specifications of this language including the functionality it offers and its limitations
  5. Implement the language on a satellite program to interact with the main program using the API

The Sphero API defines the ways in which binary data must be structured and transmitted over Sphero's Bluetooth connection in order to communicate with the device. Learning and understanding this API requires comprehension of at least four main concepts:

  1. Packet structure
    • What types of sentences does Sphero speak? How do we create them and how do we use them?
  2. Data representation
    • What words does Sphero know? What letters do we use? How do we write them?
  3. Using packets
    • How do we compose and decompose packets to send and receive them (example-based)?
  4. State management
    • How do we speak and listen to sentences in the right order and at the right time?

In the remainder of this article, we'll attempt to answer the questions posed in this list. First, we'll start by looking at the three types of packets defined by the Sphero API and what words are used to form them. Then, we'll discuss the ways in which the words are formed with some background information on how to read and write the letters of the binary alphabet. Finally, we'll discuss some of the requirements for communicating with Sphero over time. Examples of physical control and feedback will motivate the discussion along the way.

Packet Structure

The Sphero API defines three distinct types of packets that are used to communicate with Sphero. These packets are sequential lists of data that are used to communicate information for a particular purpose. There is one type of packet used to send data from the client to Sphero that we'll refer to as a command packet (CMD). Additionally, there are two different types of packets that Sphero can use to send data back to the client. We'll refer to these as a command response (RSP) and an asynchronous message (MSG).

The defining qualities of these packets are the meaning of the data elements that compose them and the order in which the data appears in the packet. In the following subsections we'll take our first look at the structure of these packets. The first thing you'll see is a listing of the ordered data fields specified by the packet definition such as:

[ FIELD_1 | FIELD_2 | ... | LAST_FIELD ]

Where the names denote semantic names we will use to refer to the fields, and the left-to-right ordering is the same as that which must be used when sending a packet over Sphero's Bluetooth connection.

Then, after introducing the packet signature, we'll take a look at some more details about the data fields with the following formatting convention:

FIELD_NAME (description of field)
Data type of this field (more on this in the next section)
Meaning of this field (Why is it here? What's it used for?)
Permissible values for field data (or where the values are specified)

At this point, we may be "putting the cart ahead of the horse" by skipping over prerequisite information regarding data types, bit-field encoding, and byte-packing of multi-byte data fields. Since I'm assuming that readers may not know this information already, I've chosen to take a top-down approach by introducing conceptual information on packet structure before really drilling down to the implementation details in the following section. For now, you can graze over these finer details while taking notice of the packet signatures, data field descriptions and meanings. When you move on to the next section, you may find yourself coming back here for a second reading to piece things together.

CMD (command packet)

[ SOP1 | SOP2 | DID | CID | SEQ | DLEN | DATA | CHK ]

Command packets are sent from the client to Sphero. They instruct Sphero to perform some operation denoted by DID and CID according to additional parameters passed in DATA, with per-command behavior options passed in SOP2. One such option encoded in SOP2 specifies that Sphero must respond to the command in a synchronous fashion by sending back a RSP packet. Every command contains a user-specified sequence number, SEQ (more on this with response packets), and must provide a checksum, CHK, to help Sphero guard against interpreting corrupted packets.

SOP1 (start-of-packet 1)
uint8
Specifies the beginning of a packet
This value is always FFh
SOP2 (start-of-packet 2)
uint8
Bit-encoded selection of per-command options
Bits 7-4 are always set
Bits 3-2 are assumed to be reserved for future use and should be set
Bit 1 commands Sphero to restart its command timeout counter when set
Bit 0 commands Sphero to respond to the command with a RSP packet when set
DID (device ID)
uint8
Specifies which "virtual device" the command belongs to
This value is documented for a particular command
CID (command ID)
uint8
Specifies the command to perform
This value is documented for a particular command
SEQ (sequence number)
uint8
Used to identify RSP packets with a particular CMD packet
This field can contain any value
DLEN (data length)
uint8
Specifies the number of bytes in the remainder of the packet
This value is computed from the number of bytes in DATA and CHK
This value is at least 01h
This value is documented for a particular command
DATA (optional array of data)
array of byte-packed data
Contains the data (command arguments) for the command
The structure of this field is documented for a particular command
This field is not always required, and can be of variable length
CHK (checksum)
uint8
Contains a checksum that is used to invalidate packets if they are corrupted through transmission
Computed from previous bytes excluding SOP1 and SOP2
Computed as the bit-wise complement of the modulo 256 of the sum of the preceding bytes (beginning with DID)
Sphero will not act on commands if this value is computed incorrectly. Rather, a RSP packet will indicate the checksum failure if the command was otherwise structured properly.
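
The field listing above can be sketched in code as follows. This is a minimal illustration rather than an official implementation: the function name build_cmd and its keyword arguments are my own invention, and the field values in the usage line (DID 00h, CID 01h, SEQ 37h) are simply example inputs.

```python
def build_cmd(did, cid, seq, data=b"", answer=True, reset_timeout=True):
    """Assemble a CMD packet as bytes (hypothetical helper, not an official API)."""
    # SOP2: bits 7-2 are set, bit 1 = reset timeout, bit 0 = request a RSP
    sop2 = 0xFC | (int(reset_timeout) << 1) | int(answer)
    dlen = len(data) + 1  # number of DATA bytes plus the CHK byte
    if dlen > 0xFF:
        raise ValueError("DATA is too long for a one-byte DLEN")
    body = bytes([did, cid, seq, dlen]) + bytes(data)
    chk = ~sum(body) & 0xFF  # bitwise complement of (sum modulo 256)
    return bytes([0xFF, sop2]) + body + bytes([chk])


# Example inputs only: DID 00h, CID 01h, arbitrary SEQ 37h, no DATA
ping = build_cmd(did=0x00, cid=0x01, seq=0x37)
print(ping.hex(" "))  # ff ff 00 01 37 01 c6
```

Note that the checksum is computed over everything from DID through the last DATA byte, so SOP1 and SOP2 never contribute to it.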

RSP (command response packet)

[ SOP1 | SOP2 | MRSP | SEQ | DLEN | DATA | CHK ]

Command response packets are sent from Sphero to the client if and only if a previous CMD packet was sent from the client to Sphero with its answer bit set in SOP2. These packets can be identified when read by the client by checking the value of SOP2. The remaining fields can then be read in synchronously and interpreted to determine the status of Sphero's ability to successfully complete the command using MRSP, and read in data sent back by Sphero (if applicable) in DATA.

When reading these packets, a couple more characteristics help make sense of the data being received by the client. First, the checksum should be re-calculated on the packet and compared to the CHK that was sent by Sphero. If these don't match, the packet should be ignored. Second, the way to determine which CMD packet this RSP packet is responding to is to compare the SEQ field to the sequence number sent with the previous command. Since Sphero echoes this field in the RSP packet when responding to a CMD packet, the values should be identical. If the sequence numbers don't match, it's possible that the client missed a RSP packet.

SOP1 (start-of-packet 1)
uint8
Specifies the beginning of a packet
This value is always FFh
SOP2 (start-of-packet 2)
uint8
Specifies the packet type as RSP
This value is always FFh
MRSP (message response)
uint8
Specifies the status of Sphero in responding to the CMD packet
This field contains 00h to indicate success
The value of this field specifies an error message if greater than 00h
SEQ (sequence number)
uint8
Used to identify this RSP packet with a particular CMD packet
This field can contain any value that has been issued in the SEQ field of a previous CMD packet
DLEN (data length)
uint8
Specifies the number of bytes in the remainder of the packet
This value is computed from the number of bytes in DATA and CHK
This value is at least 01h
This value is documented for a particular command
DATA (optional array of data)
array of byte-packed data
Contains the data (command outputs) for the response
The structure of this field is documented for a particular command
This field is not always required, and can be of variable length
CHK (checksum)
uint8
Contains a checksum that is used to invalidate packets if they are corrupted through transmission
Computed from previous bytes excluding SOP1 and SOP2
Computed as the bit-wise complement of the modulo-256 of the sum of the preceding bytes (beginning with MRSP)
Responses should be ignored if this value doesn't match the computed checksum


MSG (asynchronous message packet)

[ SOP1 | SOP2 | ID_CODE | DLEN | DATA | CHK ]

The asynchronous message packet is sent from Sphero to the client at any time! These packets can be identified when read by the client by checking the value of SOP2, and they contain structured data in DATA that is decoded based upon the type of message being sent as specified by the message identifier code, ID_CODE. Various CMD packets configure Sphero to generate asynchronous messages, either upon the occurrence of events or periodically after some time duration. Because of the asynchronous nature of MSG packets, the client must always be ready to read and parse either RSP or MSG packets. It should store response data locally and, optionally, take action automatically when a MSG is received, all without interfering with the synchronicity of the CMD-RSP packet flow.

As with all packets, you'll also notice that the checksum field is here again. Make sure to compare this field to a checksum computed on the packet and ignore the message if the values don't match.

SOP1 (start-of-packet 1)
uint8
Specifies the beginning of a packet
This value is always FFh
SOP2 (start-of-packet 2)
uint8
Specifies the packet type as MSG
This value is always FEh
ID_CODE (message identifier code)
uint8
Specifies the type of this MSG packet
This value is documented for asynchronous message types
DLEN (data length)
uint16
Specifies the number of bytes in the remainder of the packet
This value is computed from the number of bytes in DATA and CHK
This value is at least 0001h
This value is documented for a particular asynchronous message type
DATA (optional array of data)
array of byte-packed data
Contains the data (information) for the asynchronous message
The structure of this field is documented for a particular asynchronous message type
CHK (checksum)
uint8
Contains a checksum that is used to invalidate packets if they are corrupted through transmission
Computed from previous bytes excluding SOP1 and SOP2
Computed as the bit-wise complement of the modulo-256 of the sum of the preceding bytes (beginning with ID_CODE)
Messages should be ignored if this value doesn't match the computed checksum
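
Since RSP and MSG packets differ only after SOP2, a client can tell them apart with a single branch. The sketch below assumes a complete packet is already sitting at the start of a byte buffer; the function name parse_packet and the returned dictionary layout are hypothetical, and the MSG bytes in the usage lines are synthetic values made up purely for illustration.

```python
def parse_packet(buf):
    """Parse one RSP or MSG packet from the start of buf (hypothetical helper)."""
    if len(buf) < 5 or buf[0] != 0xFF:
        raise ValueError("not a complete packet header")
    if buf[1] == 0xFF:    # RSP: MRSP, SEQ, then a one-byte DLEN
        fields = {"kind": "RSP", "mrsp": buf[2], "seq": buf[3]}
        dlen = buf[4]
    elif buf[1] == 0xFE:  # MSG: ID_CODE, then a two-byte DLEN sent MSB first
        fields = {"kind": "MSG", "id_code": buf[2]}
        dlen = (buf[3] << 8) | buf[4]
    else:
        raise ValueError("unknown SOP2 value")
    end = 5 + dlen        # both headers happen to be five bytes long
    if dlen < 1 or len(buf) < end:
        raise ValueError("packet is truncated")
    # CHK covers every byte after SOP2 up to (not including) CHK itself
    if (~sum(buf[2:end - 1]) & 0xFF) != buf[end - 1]:
        raise ValueError("checksum mismatch, packet should be ignored")
    fields["data"] = bytes(buf[5:end - 1])
    return fields


rsp = parse_packet(bytes.fromhex("ffff003701c7"))
print(rsp["kind"], rsp["mrsp"], hex(rsp["seq"]))  # RSP 0 0x37

# Synthetic MSG: the ID_CODE 03h and the DATA bytes AAh, BBh are made up
msg_body = bytes([0x03, 0x00, 0x03, 0xAA, 0xBB])
msg = parse_packet(b"\xff\xfe" + msg_body + bytes([~sum(msg_body) & 0xFF]))
print(msg["kind"], msg["id_code"], msg["data"].hex())  # MSG 3 aabb
```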

Data Representation

As with all computers, data is encoded in binary format at the lowest level in Sphero. Since most people typically think about numbers in terms of their decimal representation, it's important to take a moment to describe these representations of numbers. First, we'll take a look at the generalized mathematical representation through use of number systems, and the benefits of each. Then, we'll specify some specific formats, or data types, in which binary and logical values are encoded in Sphero's packet structure. The content of this section loosely covers the basics of understanding the description and documentation of Sphero's numerical language.

Number Systems

Here we distinguish between the number, or physical quantity, of interest and the numerals, or characters/symbols, used to encode number information. Particular numbers are inherently invariant until the point when their values must be communicated, for example, by people or automated systems. Most people typically think of numbers in terms of groups of ten using the decimal, or base ten, number system. Alternatively, conventional computers use logical states, akin to the notion of a switch being turned on or off, to represent numbers. Thus the binary, or base two, number system is used at the hardware level to communicate values. Various other concepts for encoding of information are also useful for certain contexts, and all of these are described in the following.

Logical Bitfield
The characters 0 and 1 are used to denote a sequence of true and false values.
Typically used by computers to physically communicate and store logical information. Oftentimes a logical bitfield is stored as numerical data (in binary format) whose individual bits should be interpreted as separate true or false values.
Binary (base two)
The characters 0 and 1 are used to denote a number. Binary numerals are denoted with the suffix b.
Typically used by computers to physically communicate and store numbers.
Example: ten equals 1010b, nineteen equals 10011b
Decimal (base ten)
The characters 0-9 are used to denote a number. Decimal numerals are denoted either with or without the suffix d.
Typically used by people to think about and communicate numbers.
Example: ten equals 10d or 10, nineteen equals 19d or 19
Hexadecimal (base sixteen)
The characters 0-9, A-F are used to denote a number. Hexadecimal numerals are denoted with the suffix h.
Typically used by humans to document numbers used in computers. Hexadecimal numerals are more compact than decimal representation and the base sixteen aligns better with computers' binary system as well as their basic unit of storage, a byte. A byte, or eight bits, requires at most two characters to represent using hexadecimal.
Example: ten equals Ah, nineteen equals 13h
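
Most programming languages can read and write all of these notations directly. As a small sketch in Python (which uses the prefixes 0b and 0x rather than the suffixes b and h used on this page):

```python
ten = 0b1010        # binary literal
nineteen = 0b10011

print(ten, nineteen)               # 10 19
print(hex(ten), hex(nineteen))     # 0xa 0x13
print(format(nineteen, "08b"))     # 00010011 (padded to one full byte)
print(int("FF", 16), int("11111111", 2))  # 255 255
```

The last line shows why hexadecimal is convenient: the largest value of a byte takes two characters in hexadecimal and eight in binary.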

Data Types

Full specification of a computer's representation of data requires more than just the use of the binary number system. In addition to communicating data in terms of the states 1 and 0, computers must also understand exactly how many bits (characters) of information to use for a particular piece of data. Furthermore, some subtle details regarding the way in which negative numbers are represented in binary also affect the result of a binary encoding. In this document, the written notation for these details is based on the integer types of the C programming language.

The general syntax used here to denote the sign and number of bits that should be assumed in an integer's binary representation follows the format:

<u?>int<N>

Where the character u is optional and denotes that the integer, int, is unsigned. The character N denotes, in decimal notation, the number of bits used to store the data. For example, uint8 is an eight bit integer that does not encode negative numbers, and int32 is a thirty-two bit integer that does encode negative numbers.

Since this notation is conventional in the C programming language (and many others), we will not elaborate on most of the finer details including negative number encoding. It's very likely that your programming environment supports a similar convention so that you can rely upon high-level functionality to handle the bits for you. In the rest of this section, we'll address some key points regarding the handling of logical bitfield encoded data, integer data smaller than eight bits, and the transmission of data elements larger than eight bits.

Logical Bitfield
Requires bit indexing to access individual bits of an integer data type that is conventionally unsigned.
The data type used for logical bitfield encoding is typically uint8. However, in some cases only part of a data element is used to encode logical data, and in others, you may encounter larger logical bitfields. For instance, the MASK and MASK2 elements of a SET_DATA_STREAMING CMD are of type uint32.
The syntax of bit indexing varies between programming environments, but the indices in an N bit integer are typically denoted,
[ bit N-1 | bit N-2 | ... | bit 1 | bit 0 ]
Packed Nibble
Two nibbles, groupings of four bits, are used to encode numbers in a single byte such as a uint8 type.
The characters A-H of a uint8 type are decomposed into high nibble, low nibble, as,
[ ABCDEFGH ] = [ ABCD | EFGH ] = [ high nibble | low nibble ]
Multiple Byte Transmissions
Multiple bytes are transmitted most significant byte (MSB) first sequentially through the least significant byte (LSB).
Since Sphero transmits data one byte at a time, it's crucial to know which order the bytes are sent in when sending or receiving a data element larger than eight bits.
In the following examples for sixteen and thirty-two bit types, the bytes A-D are transmitted in alphabetical order.
uint32 or int32: [ byte 3 = MSB | byte 2 | byte 1 | byte 0 = LSB ] = [ A | B | C | D ]
uint16 or int16: [ byte 1 = MSB | byte 0 = LSB ] = [ A | B ]
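
The three ideas above (bit indexing, packed nibbles, and MSB-first byte ordering) can be sketched with a few lines of Python. The particular values are arbitrary examples of my own choosing; the standard struct module is used here only as one convenient way to do the byte packing.

```python
import struct

# Logical bitfield: read individual bits of a uint8 by shifting and masking
flags = 0xFD                      # 11111101b
bit0 = (flags >> 0) & 1           # 1
bit1 = (flags >> 1) & 1           # 0

# Packed nibble: split one byte into its high and low four-bit halves
packed = 0xA7
high, low = packed >> 4, packed & 0x0F   # 0xA and 0x7

# Multiple byte transmission: ">" selects big-endian, i.e. MSB first
heading_bytes = struct.pack(">H", 270)   # uint16 270 = 010Eh
(signed_value,) = struct.unpack(">i", bytes([0xFF, 0xFF, 0xFF, 0xC5]))

print(bit0, bit1, hex(high), hex(low))   # 1 0 0xa 0x7
print(heading_bytes.hex(" "))            # 01 0e
print(signed_value)                      # -59
```

The format characters H and i stand for uint16 and int32 respectively, so the signed conversion (two's complement) is handled by the library rather than by hand.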

Using Packets

After reading about the structure of Sphero packets conceptually, brushing up on Sphero's representation of data, and then possibly rereading these sections, we're ready to move forward and discuss the basic ideas behind working with all packet types: constructing packets used to send commands, receiving command responses (optionally), and receiving asynchronous notification messages.

In this section, refer to the Sphero API documentation on sphero.com or the document Sphero_API_1.50.pdf in the Orbotix DeveloperResources repository on GitHub <ref name="sphero-api-web" /> <ref name="sphero-api-github" />.

Common Characteristics

Some aspects of CMD, RSP, and MSG packets are common. For instance, the CMD and RSP packets share the sequence number data field, and all packets involve the computation of a checksum byte.

The sequence number, SEQ, for commands is used in applications to identify correspondence between command responses and the commands which solicited the response. It's typical to manage the value of SEQ in an application by incrementing its value with every CMD packet sent. However, in these examples, we'll use an arbitrary magic number, 37h, to disambiguate the SEQ byte.

The checksum byte, CHK, is used to detect data transmission failures. The basic idea is that the transmitting machine computes a mathematical function on the packet data and appends the result to the packet. Then, when the receiver reads the incoming packet, the checksum is computed again and compared with the value sent by the transmitter. If these numbers do not match, then the packet data has been corrupted in transmission and should be ignored.

The first step in computing CHK for CMD, RSP, and MSG packets is to add up the values of all bytes beginning with the first byte after SOP2 (i.e. DID, MRSP, or ID_CODE, respectively) and ending with the last byte before CHK (i.e. DLEN or the last byte of DATA). We'll call this summation SUM in the following procedure used to compute CHK.

  1. Compute SUM on the appropriate bytes of the packet as described above
    • Let's assume that SUM = 561d
  2. Compute the modulo 256 of SUM
    • Modulo 256 of 561 equals 49
  3. Perform the bitwise complement on this result
    • Convert to binary: 49d = 00110001b
    • Perform bitwise complement: ~(00110001b) = 11001110b
    • Convert back to decimal or hexadecimal: 11001110b = 206d = CEh
  4. The checksum is, CHK = CEh
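
The procedure above translates almost line for line into code. This is a minimal sketch, assuming the caller passes in only the bytes that belong in SUM (everything after SOP2 and before CHK); the function name checksum and the three example bytes, chosen only because they add up to 561, are my own.

```python
def checksum(body):
    """Compute CHK over the bytes between SOP2 and CHK (hypothetical helper)."""
    total = sum(body)          # step 1: SUM
    remainder = total % 256    # step 2: modulo 256
    return remainder ^ 0xFF    # step 3: bitwise complement within eight bits


example = bytes([0xFF, 0xFF, 0x33])   # 255 + 255 + 51 = 561
print(sum(example), hex(checksum(example)))  # 561 0xce
```

The exclusive-or with FFh is used instead of Python's ~ operator because ~ on an unbounded integer yields a negative number; flipping exactly eight bits keeps the result in the range of a uint8.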

CMD Packet Encoding

There are many commands that can be chosen for this tutorial on constructing CMD packets. A representative subset of the commands we may use includes Ping, SetRGBLEDOutput, Roll and ReadLocator. Recall the CMD packet structure, [ SOP1 | SOP2 | DID | CID | SEQ | DLEN | DATA | CHK ].

Before we start to construct specific CMD packets, we'll first take a look at some aspects of the CMD packet that are similar for all commands.

The SOP2 byte contains logical bitfield options in bit positions one and zero. These bits reset the command timeout timer (reset_timeout_flag) and request a command response (answer_flag), respectively. The bit format of the SOP2 byte is,

[ 1 | 1 | 1 | 1 | 1 | 1 | reset_timeout_flag | answer_flag ]

By default, it's good practice to issue commands synchronously (with the answer_flag bit set) and reset the command timeout counter. The resulting default SOP2 byte for a CMD packet is thus,

SOP2 = 11111111b = FFh = 255

Alternative variations of this byte may be,

SOP2 = 11111110b = FEh = 254 (reset timeout, do not issue a response)
SOP2 = 11111101b = FDh = 253 (do not reset timeout, issue a response)
SOP2 = 11111100b = FCh = 252 (do not reset timeout nor issue a response)

Choosing not to receive a response to the command may allow for streaming commands at a faster rate, but does not allow verification that Sphero received the command.
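
As a small sketch of this bitfield, the four SOP2 variations can be produced by starting from FCh (bits 7-2 set) and setting the two option bits as needed. The constant and function names here are hypothetical.

```python
ANSWER_FLAG = 0x01          # bit 0: request a RSP packet
RESET_TIMEOUT_FLAG = 0x02   # bit 1: restart the command timeout counter


def make_sop2(answer=True, reset_timeout=True):
    """Build the SOP2 byte of a CMD packet (hypothetical helper)."""
    sop2 = 0xFC             # bits 7-2 are always set
    if answer:
        sop2 |= ANSWER_FLAG
    if reset_timeout:
        sop2 |= RESET_TIMEOUT_FLAG
    return sop2


print(hex(make_sop2()))                     # 0xff
print(hex(make_sop2(answer=False)))         # 0xfe
print(hex(make_sop2(reset_timeout=False)))  # 0xfd
```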

Here are some examples of the packet byte strings (and useful notes) for basic commands.

Ping
Simple command used to verify communication with Sphero
DID = 00h, CID = 01h, DLEN = 01h, DATA = <no data>
packet = [ FFh | FFh | 00h | 01h | 37h | 01h | C6h ]
The SOP2 byte should always have the answer_flag bit set since it doesn't make sense to test communications without requesting a response.
SetRGBLEDOutput
Sets the color of Sphero's RGB LED. Input parameters in DATA include, RED, GREEN, BLUE values to specify color channel intensity (0-255), and FLAG to select the persistence of this change (0-1)
DID = 02h, CID = 20h, DLEN = 05h, DATA = [ RED | GREEN | BLUE | FLAG ]
packet = [ FFh | FFh | 02h | 20h | 37h | 05h | FFh | 00h | 00h | 00h | A2h ] (red)
packet = [ FFh | FFh | 02h | 20h | 37h | 05h | 00h | FFh | 00h | 00h | A2h ] (green)
packet = [ FFh | FFh | 02h | 20h | 37h | 05h | 00h | 00h | FFh | 00h | A2h ] (blue)
Roll
Makes Sphero move at a specified speed in a given direction. Input parameters in DATA specify how fast Sphero should move with a fractional SPEED (0-255), the global direction or HEADING in degrees (0-359), and the motion mode or STATE flag (0-2) with stop = 0, normal = 1, and fast = 2.
Here we'll look at using a HEADING greater than 255 degrees to show byte ordering for the uint16 heading as well as a stop command.
DID = 02h, CID = 30h, DLEN = 05h, DATA = [ SPEED | HEADING_MSB | HEADING_LSB | STATE ]
packet = [ FFh | FFh | 02h | 30h | 37h | 05h | 80h | 01h | 0Eh | 01h | 01h ] (half speed, to the left, normal mode)
packet = [ FFh | FFh | 02h | 30h | 37h | 05h | 00h | 00h | 00h | 00h | 91h ] (stop)
ReadLocator
Requests a response including Sphero's current position and velocity estimate
DID = 02h, CID = 15h, DLEN = 01h, DATA = <no data>
packet = [ FFh | FFh | 02h | 15h | 37h | 01h | B0h ]
The SOP2 byte should always have the answer_flag bit set since it doesn't make sense to request response data without requesting a response.
Note that some firmware versions may not support this command!
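
To show how the byte-packed DATA field of a command comes together, here is a sketch that assembles the Roll packets listed above. The function name roll_packet and its range check are my own choices rather than part of any official SDK; the format string ">BHB" packs SPEED as a uint8, HEADING as a uint16 with the MSB first, and STATE as a uint8.

```python
import struct


def roll_packet(speed, heading, state, seq=0x37):
    """Assemble a Roll CMD packet (hypothetical helper, SOP2 fixed at FFh)."""
    if not 0 <= heading <= 359:
        raise ValueError("heading must be in the range 0-359 degrees")
    data = struct.pack(">BHB", speed, heading, state)
    body = bytes([0x02, 0x30, seq, len(data) + 1]) + data  # DID, CID, SEQ, DLEN
    chk = ~sum(body) & 0xFF                                # checksum over body
    return bytes([0xFF, 0xFF]) + body + bytes([chk])


print(roll_packet(0x80, 270, 1).hex(" "))  # ff ff 02 30 37 05 80 01 0e 01 01
print(roll_packet(0x00, 0, 0).hex(" "))    # ff ff 02 30 37 05 00 00 00 00 91
```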

RSP Packet Decoding

Reading in a RSP packet is a relatively straightforward process. As bytes become available to a client application, the beginning of a RSP packet is identified by the expected sequence of bytes SOP1 and SOP2, [ FFh | FFh ]. Then, the remaining bytes of the packet are read, with the total number of bytes inferred from DLEN. Once a prospective RSP packet has been read, the checksum should be verified in order to detect either a corrupted or misaligned packet. Assuming that everything is still okay at this point, the packet is ready to be interpreted by making sense of the message response, MRSP; using the sequence number, SEQ, to associate the RSP with its soliciting CMD; and, optionally, decoding the accompanying DATA.

Some example RSP packets that correspond with the CMD packets introduced previously are described in the remainder of this section. In all of these examples, we consider the desired MRSP, zero, which indicates a successful command execution.

Ping
This is a simple response with no data. It simply carries a status code, MRSP, associated with the CMD corresponding to SEQ.
MRSP = 00h (success), DATA = <no data>
packet = [ FFh | FFh | 00h | 37h | 01h | C7h ]
SetRGBLEDOutput
This is a simple response similar to Ping.
Roll
This is a simple response similar to Ping.
ReadLocator
This response has associated data. The interpretation of incoming data is performed after parsing the simple response fields.
MRSP = 00h (success), DATA = [ X_POS (int16) | Y_POS (int16) | X_VEL (int16) | Y_VEL (int16) | SOG (uint16) ] 
packet = [ FFh | FFh | 00h | 37h | 0Bh | 00h | 09h | FFh | C5h | FFh | 8Fh | FFh | 67h | 00h | BEh | 3Eh ]
X_POS = 0009h =    9cm   (x component of position)
Y_POS = FFC5h = - 59cm   (y component of position)
X_VEL = FF8Fh = -113cm/s (x component of velocity)
Y_VEL = FF67h = -153cm/s (y component of velocity)
SOG   = 00BEh =  190cm/s (speed over ground or magnitude of velocity)
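
The ReadLocator decoding above can be reproduced with a short sketch. The function name decode_read_locator and the returned dictionary keys are hypothetical; the format string ">hhhhH" reads four int16 values followed by one uint16, all MSB first.

```python
import struct


def decode_read_locator(raw):
    """Validate and decode a ReadLocator RSP packet (hypothetical helper)."""
    sop1, sop2, mrsp, seq, dlen = raw[:5]
    if (sop1, sop2) != (0xFF, 0xFF):
        raise ValueError("not a RSP packet")
    end = 5 + dlen
    data, chk = raw[5:end - 1], raw[end - 1]
    if (~sum(raw[2:end - 1]) & 0xFF) != chk:
        raise ValueError("checksum mismatch, packet should be ignored")
    if mrsp != 0x00:
        raise ValueError("command failed with MRSP = %02Xh" % mrsp)
    x_pos, y_pos, x_vel, y_vel, sog = struct.unpack(">hhhhH", data)
    return {"seq": seq, "x_pos": x_pos, "y_pos": y_pos,
            "x_vel": x_vel, "y_vel": y_vel, "sog": sog}


raw = bytes.fromhex("FFFF00370B0009FFC5FF8FFF6700BE3E")
print(decode_read_locator(raw))
# {'seq': 55, 'x_pos': 9, 'y_pos': -59, 'x_vel': -113, 'y_vel': -153, 'sog': 190}
```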

MSG Packet Decoding

Unlike CMD and RSP packets, it may seem like there's a bit more complexity in MSG packet formats. One reason for this is that they're asynchronous, so we can never know exactly when a MSG packet will arrive; more on this later in State Management. The more immediate reason relates to the general nature of data that is encoded in these messages. First, as you'll notice from the sixteen bit DLEN field, there may be much more data in MSG packets! Also, since some of these messages correspond with rich structured data, the interpretation of the data field is sometimes more involved than decoding data fields with simple pre-determined fields. This is the case for one such useful MSG packet, DataStreamingMessage, which will be the focus of this section. In addition to packet format and structure, the DATA array is also formatted according to variable parameters.

Most, if not all (I'm really not sure), MSG packets are solicited by device state behavior or configuration information that is modified by the use of various commands. In the case of DataStreamingMessage, the behavior by which Sphero issues these MSG packets (i.e. packet rate, total number of packets) is specified by the SetDataStreaming CMD. Additionally, this command also dictates the length and meaning of the data encoded in the MSG DATA array.

Various physical data sources in Sphero may be set up to stream back to a client application by invoking the SetDataStreaming command. One important required command parameter is a logical bitfield named MASK. This sensor mask variable is a thirty-two-bit value wherein each bit corresponds with a certain data source. Knowledge of the MASK passed when invoking SetDataStreaming is critical to interpreting the DataStreamingMessage MSG packets, since only those sources selected by set bits will populate the MSG DATA array.

Although the SetDataStreaming command exposes parameters to configure the streaming of multiple data frames per packet, we'll assume that there is only one frame of sensor data per packet for now. Interpreting the DATA array depends greatly on the mapping of data sources to MASK bit position.

---------------------------+----------------------------------------------------
 DATA SOURCE               | MASK BIT POSITION
---------------------------+----------------------------------------------------
BYTE 3                     |
MASK_ACCEL_X_RAW ......... | 80 00 00 00h = 10000000 00000000 00000000 00000000b
MASK_ACCEL_Y_RAW ......... | 40 00 00 00h = 01000000 00000000 00000000 00000000b
MASK_ACCEL_Z_RAW ......... | 20 00 00 00h = 00100000 00000000 00000000 00000000b
MASK_GYRO_X_RAW .......... | 10 00 00 00h = 00010000 00000000 00000000 00000000b
MASK_GYRO_Y_RAW .......... | 08 00 00 00h = 00001000 00000000 00000000 00000000b
MASK_GYRO_Z_RAW .......... | 04 00 00 00h = 00000100 00000000 00000000 00000000b
BYTE 2                     |
MASK_MOTOR_RT_EMF_RAW .... | 00 40 00 00h = 00000000 01000000 00000000 00000000b
MASK_MOTOR_LT_EMF_RAW .... | 00 20 00 00h = 00000000 00100000 00000000 00000000b
MASK_MOTOR_LT_PWM_RAW .... | 00 10 00 00h = 00000000 00010000 00000000 00000000b
MASK_MOTOR_RT_PWM_RAW .... | 00 08 00 00h = 00000000 00001000 00000000 00000000b
MASK_IMU_PITCH_FILT ...... | 00 04 00 00h = 00000000 00000100 00000000 00000000b
MASK_IMU_ROLL_FILT ....... | 00 02 00 00h = 00000000 00000010 00000000 00000000b
MASK_IMU_YAW_FILT ........ | 00 01 00 00h = 00000000 00000001 00000000 00000000b
BYTE 1                     |
MASK_ACCEL_X_FILT ........ | 00 00 80 00h = 00000000 00000000 10000000 00000000b
MASK_ACCEL_Y_FILT ........ | 00 00 40 00h = 00000000 00000000 01000000 00000000b
MASK_ACCEL_Z_FILT ........ | 00 00 20 00h = 00000000 00000000 00100000 00000000b
MASK_GYRO_X_FILT ......... | 00 00 10 00h = 00000000 00000000 00010000 00000000b
MASK_GYRO_Y_FILT ......... | 00 00 08 00h = 00000000 00000000 00001000 00000000b
MASK_GYRO_Z_FILT ......... | 00 00 04 00h = 00000000 00000000 00000100 00000000b
BYTE 0                     |
MASK_MOTOR_RT_EMF_FILT ... | 00 00 00 40h = 00000000 00000000 00000000 01000000b
MASK_MOTOR_LT_EMF_FILT ... | 00 00 00 20h = 00000000 00000000 00000000 00100000b
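
Because each source occupies its own bit, a MASK that selects several sources is simply the bitwise OR of the individual values. A minimal sketch in Python, with a handful of constants copied from the table above (the variable names are illustrative only):

```python
# Bit values copied from the table above.
MASK_ACCEL_X_RAW = 0x80000000
MASK_ACCEL_Y_RAW = 0x40000000
MASK_ACCEL_Z_RAW = 0x20000000
MASK_ACCEL_X_FILT = 0x00008000
MASK_ACCEL_Y_FILT = 0x00004000
MASK_ACCEL_Z_FILT = 0x00002000

# Select all raw and filtered accelerometer sources.
mask = (MASK_ACCEL_X_RAW | MASK_ACCEL_Y_RAW | MASK_ACCEL_Z_RAW |
        MASK_ACCEL_X_FILT | MASK_ACCEL_Y_FILT | MASK_ACCEL_Z_FILT)

# The number of set bits tells us how many sources each frame will carry.
source_count = bin(mask).count("1")
print(f"{mask:08X}h = {mask:032b}b, {source_count} sources")
```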

Given a DATA array and its corresponding MASK, the process for decoding its bytes into usable information is as follows.

  1. Iterate through the bits of MASK, beginning with the most significant bit
  2. If the bit is set, shift out two bytes from DATA
    • These bytes are the data for the corresponding data source

Since all data sources are encoded with sixteen bits, the length of the DATA array should be exactly double the number of bits set in MASK unless there are multiple data frames transmitted in a single packet. If this is the case, repeat this procedure for successive data frames until all elements are shifted out of DATA.
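
The procedure above, including the multi-frame case, can be sketched in Python as follows. The function decode_frames and the NAMES lookup are hypothetical helpers, only a subset of the table is named, and every value is assumed to be a signed, big-endian sixteen-bit integer. The input bytes are made up for illustration and do not come from a real device.

```python
import struct

# Bit position -> source name for a subset of the table above (bit 31 is the MSB).
NAMES = {
    31: "ACCEL_X_RAW", 30: "ACCEL_Y_RAW", 29: "ACCEL_Z_RAW",
    18: "IMU_PITCH_FILT", 17: "IMU_ROLL_FILT", 16: "IMU_YAW_FILT",
    15: "ACCEL_X_FILT", 14: "ACCEL_Y_FILT", 13: "ACCEL_Z_FILT",
}

def decode_frames(mask: int, data: bytes):
    """Return one dict per frame, mapping source name to its 16-bit value."""
    # Step 1: walk the MASK bits from most to least significant.
    set_bits = [bit for bit in range(31, -1, -1) if (mask >> bit) & 1]
    frame_len = 2 * len(set_bits)
    if frame_len == 0 or len(data) % frame_len != 0:
        raise ValueError("DATA length does not match MASK")
    frames = []
    for start in range(0, len(data), frame_len):
        frame, offset = {}, start
        for bit in set_bits:
            # Step 2: each set bit consumes the next two bytes of DATA.
            (value,) = struct.unpack_from(">h", data, offset)
            frame[NAMES.get(bit, f"BIT_{bit}")] = value
            offset += 2
        frames.append(frame)
    return frames

# Made-up example: pitch and roll selected, two frames in one packet.
mask = 0x00060000
data = bytes([0x00, 0x05, 0xFF, 0xFB, 0x00, 0x06, 0xFF, 0xFA])
frames = decode_frames(mask, data)
print(frames)
```

Deriving the frame length from the number of set bits lets the same function handle one frame or many without being told the frame count.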

DataStreamingMessage
In this example, we'll specify that all of the raw and filtered accelerometer data sources should stream with one frame per packet.
MASK = 11100000000000001110000000000000b = E0 00 E0 00h
ID_CODE = 03h, DLEN = 000Dh = 13 = (6 data sources)*(two bytes per data source)*(1 frame per packet) + (1 checksum byte)
packet = [ FFh | FEh | 03h | 00h | 0Dh | 00h | 00h | 00h | 0Ah | 00h | FBh | FFh | DFh | 00h | 76h | 10h | 24h | 62h ]
Data results are decoded in hardware units (i.e. before conversion to physical units) below
ACCEL_X_RAW  = 0000h =     0
ACCEL_Y_RAW  = 000Ah =    10
ACCEL_Z_RAW  = 00FBh =   251
ACCEL_X_FILT = FFDFh =   -33
ACCEL_Y_FILT = 0076h =   118
ACCEL_Z_FILT = 1024h =  4132
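
As a cross-check, the example packet can be framed and decoded with a few lines of Python. The helper parse_msg is hypothetical; it assumes the MSG layout shown in the packet above (SOP2 of FEh, a sixteen-bit big-endian DLEN that counts the checksum byte, and the same inverted-sum checksum used by RSP packets).

```python
import struct

def parse_msg(packet: bytes):
    """Split a MSG packet into (id_code, data); raise ValueError if malformed."""
    if len(packet) < 6 or packet[0] != 0xFF or packet[1] != 0xFE:
        raise ValueError("not a MSG packet")
    id_code = packet[2]
    (dlen,) = struct.unpack_from(">H", packet, 3)  # sixteen-bit DLEN
    if len(packet) != 5 + dlen:
        raise ValueError("length does not match DLEN")
    # CHK: inverted modulo-256 sum of the bytes from ID_CODE through DATA.
    if (~sum(packet[2:-1]) & 0xFF) != packet[-1]:
        raise ValueError("checksum mismatch")
    return id_code, packet[5:-1]

packet = bytes([0xFF, 0xFE, 0x03, 0x00, 0x0D, 0x00, 0x00, 0x00, 0x0A,
                0x00, 0xFB, 0xFF, 0xDF, 0x00, 0x76, 0x10, 0x24, 0x62])
id_code, data = parse_msg(packet)

# Six sources selected by MASK, one frame: six signed big-endian int16 values.
values = struct.unpack(">6h", data)
print(id_code, len(data), values)
```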

State Management

Now that we understand the language that Sphero speaks, the last step to implementing the Sphero API is learning how to successfully hold a conversation. In other words, just sending commands and receiving responses one at a time is not enough to manage all communications with the device. Managing Sphero's communication state within a client application is imperative in order to maintain consistent representations of device parameters and to be prepared to interpret any type of data that may be received from the device at a given instant in time. A sensible way to approach these considerations is in terms of the attributes of Synchronous Command and Control and Asynchronous Messages before finally looking at The Big Picture.

Synchronous Command and Control

Many interactions with Sphero result directly from issuing a command. These may involve actions that are performed (changing color or moving around), modification of device parameters (changing heading offset or turning off stabilization), or even changes to the device's future communication state. The intent behind most command scenarios suggests that successful command processing ought to be confirmed and that the resulting device state should be recorded within a client application.

By default, a client application should issue commands with the answer_flag bit set in SOP2 in order to confirm the success of commands. This gives rise to the notion that an issued command (CMD) should be directly followed by a blocking attempt to read the command response (RSP). This is a simple place to start implementing communications in a client application. We can express the communication control flow in pseudocode as,

procedure: SendCommand(CMD)
  VALID_RSP = FALSE
  write CMD
  while VALID_RSP is FALSE
    receive INCOMING_DATA
    if INCOMING_DATA is a valid RSP
      RSP = INCOMING_DATA
      VALID_RSP = TRUE
    end if
  end while
  return RSP
end procedure
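
A rough Python rendering of this procedure is sketched below. It assumes only that the connection object offers blocking read(n) and write(data) methods, as a serial port object typically does. FakePort, read_exact and read_rsp are hypothetical names introduced for illustration, and FakePort merely replays a canned reply so the sketch runs without hardware. The Ping CMD bytes are built from DID = 00h, CID = 01h and SEQ = 37h.

```python
import io

class FakePort:
    """Hypothetical stand-in for a Bluetooth serial port; replays a canned reply."""
    def __init__(self, reply: bytes):
        self.sent = b""
        self._reply = io.BytesIO(reply)
    def write(self, data: bytes):
        self.sent += data
    def read(self, n: int) -> bytes:
        return self._reply.read(n)

def read_exact(port, n: int) -> bytes:
    """Keep reading until exactly n bytes have arrived."""
    buf = b""
    while len(buf) < n:
        chunk = port.read(n - len(buf))
        if not chunk:
            raise EOFError("connection closed or timed out")
        buf += chunk
    return buf

def read_rsp(port):
    """Read one packet; return (mrsp, seq, data), or None if it is not a valid RSP."""
    header = read_exact(port, 5)            # SOP1, SOP2, MRSP, SEQ, DLEN
    if header[:2] != b"\xff\xff" or header[4] == 0:
        return None
    body = read_exact(port, header[4])      # DATA followed by CHK
    if (~sum(header[2:] + body[:-1]) & 0xFF) != body[-1]:
        return None
    return header[2], header[3], body[:-1]

def send_command(port, cmd: bytes, seq: int):
    """Write CMD, then block until a valid RSP with a matching SEQ arrives."""
    port.write(cmd)
    while True:
        rsp = read_rsp(port)
        if rsp is not None and rsp[1] == seq:
            return rsp

ping_cmd = bytes([0xFF, 0xFF, 0x00, 0x01, 0x37, 0x01, 0xC6])
port = FakePort(bytes([0xFF, 0xFF, 0x00, 0x37, 0x01, 0xC7]))
rsp = send_command(port, ping_cmd, seq=0x37)
print(rsp)
```

Note that this sketch makes no attempt to resynchronize on the byte stream after an invalid header; a more careful reader would scan for the next SOP1 byte instead of dropping five bytes at a time.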

A naive implementation of a client application considering only synchronous behavior may simply wait for a valid response immediately after issuing a command. Since synchronous commands should be processed one at a time in sequence, this is a good approach for maintaining consistency. However, a generalization of the receive data functionality must be considered to enable success in communication involving asynchronous messages.

Asynchronous Messages

The main challenge to overcome when implementing a communication server that supports asynchronous messages is that asynchronous MSG packets may be received from Sphero at any time. One may consider supporting this behavior by incorporating the above pseudocode for the SendCommand procedure as follows,

while TRUE
  if SENDING_COMMAND
    SendCommand(CMD)
  else
    receive INCOMING_DATA
    if INCOMING_DATA is a valid MSG
      MSG = INCOMING_DATA
      handle MSG
    end if
  end if
end while

But there's a very important problem with this implementation. Since MSG packets can come at any time, it's possible to miss a MSG packet (and possibly complicate reading a RSP) after entering the SendCommand procedure. If, for instance, a MSG packet arrives after entering SendCommand but before the RSP packet is sent back from Sphero, then the loop inside SendCommand, which only accepts a valid RSP, will discard the MSG packet. We need to adjust our approach to accommodate receiving incoming data for both RSP and MSG packets all the time and at the same time.

Our main program loop may then take a form such as,

while TRUE
  receive INCOMING_DATA
  if INCOMING_DATA is a valid RSP
    RSP = INCOMING_DATA
    notify VALID_RSP signal
  else if INCOMING_DATA is a valid MSG
    MSG = INCOMING_DATA
    handle MSG
  end if
end while

Now we've solved the problem of missing incoming data, but we have ignored the SendCommand procedure altogether while introducing a new concept of signaling the receipt of a valid response. Somehow, we must interrupt the main program to send a command, and instead of blocking the program indefinitely while waiting to receive the RSP packet, we must use a signaling method to detect the receipt of a valid response. These are common concepts in multithreaded applications, where typical solutions involve the use of mutual exclusion locks and shared data. A modified version of SendCommand may look like,

procedure: SendCommand(CMD)
  VALID_RSP = FALSE
  write CMD
  wait for VALID_RSP signal
  return RSP
end procedure
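
One way to realize the receive loop and the signaling version of SendCommand together is sketched below in Python, using a background thread for the loop and a thread-safe queue in place of a bare VALID_RSP signal. This is only one possible arrangement, not a reference implementation: SpheroLink and FakePort are hypothetical names, the port is assumed to offer blocking read(n) and write(data) methods, and checksum validation is left out for brevity.

```python
import io
import queue
import threading

class FakePort:
    """Hypothetical stand-in for a serial port; replays a canned byte stream."""
    def __init__(self, incoming: bytes):
        self.sent = b""
        self._incoming = io.BytesIO(incoming)
    def write(self, data: bytes):
        self.sent += data
    def read(self, n: int) -> bytes:
        return self._incoming.read(n)

class SpheroLink:
    def __init__(self, port, on_msg):
        self._port = port
        self._on_msg = on_msg                 # called for every MSG packet
        self._rsp_queue = queue.Queue()       # carries RSPs to send_command
        self._send_lock = threading.Lock()    # one outstanding CMD at a time
        threading.Thread(target=self._receive_loop, daemon=True).start()

    def _read_exact(self, n: int) -> bytes:
        buf = b""
        while len(buf) < n:
            chunk = self._port.read(n - len(buf))
            if not chunk:
                raise EOFError("connection closed")
            buf += chunk
        return buf

    def _receive_loop(self):
        try:
            while True:
                header = self._read_exact(5)
                if header[:2] == b"\xff\xff":      # RSP: MRSP, SEQ, 8-bit DLEN
                    body = self._read_exact(header[4])
                    self._rsp_queue.put((header[2], header[3], body[:-1]))
                elif header[:2] == b"\xff\xfe":    # MSG: ID_CODE, 16-bit DLEN
                    dlen = int.from_bytes(header[3:5], "big")
                    body = self._read_exact(dlen)
                    self._on_msg(header[2], body[:-1])
        except EOFError:
            pass                                   # stream ended; stop the loop

    def send_command(self, cmd: bytes, timeout: float = 1.0):
        with self._send_lock:
            self._port.write(cmd)
            # Wait for the receive loop to hand over the RSP.
            return self._rsp_queue.get(timeout=timeout)

# Canned stream: the DataStreamingMessage example followed by a Ping response.
msg = bytes.fromhex("FFFE03000D0000000A00FBFFDF0076102462")
ping_rsp = bytes.fromhex("FFFF003701C7")
msgs = []
link = SpheroLink(FakePort(msg + ping_rsp),
                  lambda id_code, data: msgs.append((id_code, data)))
rsp = link.send_command(bytes.fromhex("FFFF00013701C6"))
print(rsp, msgs)
```

The lock keeps a second caller from writing a new CMD before the first RSP has been claimed, and the timeout on the queue keeps a lost response from blocking the caller forever.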

References

<references />