Pelion Device Management Client error recovery mechanism

Connectivity Error Handling

Device Management Client handles error recovery on behalf of applications, providing seamless connectivity and recovery from temporary network outages or disruptions to Device Management services. Connectivity between a client and Device Management encompasses network connectivity, CoAP-level connectivity and client-service-level connectivity (including, for example, handling client certificate expiry or renewal).

The logic handling reconnection to Device Management:

  • Establishes a secure network connection.
  • Registers to Device Management.
  • Resends CoAP messages. More information about resending is in the CoAP specification.

This section explains what kind of connectivity errors an application may receive, what they mean and how the Device Management Client handles them.

Note: Some errors may need to be handled by the user or the application.

Reconnection attempt intervals

Device Management Client tries to establish a new connection to the server at increasing intervals:

  1. The client picks a random initial reconnection time between 2 and 10 seconds (to prevent multiple clients trying to connect to Device Management at the same time after a possible service break).
  2. It tries reconnection after this initial time.
  3. If the connection fails, the client returns an appropriate error to the application.
  4. The client continues retrying the connection to Device Management with an increased reconnection time. Every failed reconnection attempt doubles the reconnection time, until it reaches one week. For example, if the client picks an initial reconnection time of 5 seconds, it tries to reconnect after 5, 10, 20, 40, 80... seconds, up to one week.
  5. The reconnection time does not increase above one week; the client will attempt to reconnect once a week until it reconnects or the device stops operating.

Every successful reconnection resets the reconnection time; if there is another failure, the reconnection attempts will begin with the original reconnection time.
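The schedule described above can be sketched as follows. This is an illustration only, not the client's actual implementation: the function names are hypothetical, and treating the 2 to 10 second range as inclusive is an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <random>
#include <vector>

// One week in seconds: the cap on the reconnection time.
constexpr std::uint64_t kMaxReconnectDelayS = 7ULL * 24 * 60 * 60;

// Pick the random initial reconnection time (assumed inclusive range 2..10 s).
inline std::uint64_t pick_initial_delay_s(std::mt19937 &rng) {
    std::uniform_int_distribution<std::uint64_t> dist(2, 10);
    return dist(rng);
}

// Double the delay after a failed attempt, never exceeding one week.
inline std::uint64_t next_reconnect_delay_s(std::uint64_t current_s) {
    return std::min(current_s * 2, kMaxReconnectDelayS);
}

// Delays used for the first `attempts` consecutive failed attempts.
inline std::vector<std::uint64_t> reconnect_schedule(std::uint64_t initial_s,
                                                     std::size_t attempts) {
    std::vector<std::uint64_t> delays;
    std::uint64_t delay = initial_s;
    for (std::size_t i = 0; i < attempts; ++i) {
        delays.push_back(delay);
        delay = next_reconnect_delay_s(delay);
    }
    return delays;
}
```

Calling reconnect_schedule(5, 5) yields 5, 10, 20, 40 and 80 seconds, matching the example in step 4. After a successful reconnection, the application of this sketch would start again from the initial delay.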

Error codes

Failed connection attempts can return different error codes to a client application. The following list explains the error codes and proposes possible fixes. The actual enumerations for these error codes are located in mbed-cloud-client/mbed-cloud-client/MbedCloudClient.h.

ConnectBootstrapFailed

Device Management Client failed to successfully bootstrap to Device Management and cannot retrieve credentials for the Device Management service. This normally happens when you are using a developer certificate, and have already created 100 devices with that certificate (you may also receive the error message "Account device quota reached" in your client application).

To fix this issue, delete some devices through Device Management Portal. Device Management Client continues to retry to bootstrap, and will connect as soon as there is room for new devices.

ConnectInvalidParameters

The application passed one or more invalid parameters at registration. Normally, this error occurs when the application provides an invalid Device Management URL (the accepted CoAP format is coaps://<URL>:5684), device name or account ID (for example, a parameter longer than 64 characters).

ConnectNotRegistered

The application tried to call close() without the client being in the registered state.

ConnectTimeout

Device Management is not responding to the client's registration attempts. This normally happens when the client cannot finish a successful registration within three minutes and there are no network issues during that time.

ConnectNetworkError

There is a network level issue between the client and Device Management server causing a connection break. The client returns this error when trying to register, or if it loses connection while already registered. It falls back to the reconnection logic and attempts to recover from the lost connection by re-registering itself.

ConnectResponseParseFailed

The application received a malformed CoAP message from the server, which it failed to parse. This can happen if a third-party server uses a mismatched CoAP library implementation; it should not happen with Device Management services.

ConnectMemoryConnectFail

The client failed to store the Device Management device credentials it received during bootstrap. This can happen if the client cannot create a CoAP message due to low memory.

ConnectNotAllowed

The application tried to call an API that the client cannot handle at that stage. For example, the application tried to call keep_alive() before setup().

ConnectSecureConnectionFailed

There was a (D)TLS level failure during the registration phase. This can happen because of an expired device certificate, in which case the client falls back to the bootstrap phase to fetch updated certificates.

ConnectDnsResolvingFailed

The client cannot resolve the DNS query for the Device Management server URL addresses. It continues to retry until it resolves the DNS, then continues the connection process.

ConnectorFailedToStoreCredentials

The client cannot store the device credentials in the secure storage. Check the memory card (in Mbed OS). If it is corrupted, please format it.

ConnectorFailedToReadCredentials

The client cannot read the device credentials from the secure storage. Check the memory card (in Mbed OS). If it is corrupted, please format it.

ConnectorInvalidCredentials

The client failed to get the proper bootstrap credentials from the secure storage. Try to factory reset the secure storage, then try the operation again.
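As a rough guide to acting on these codes, the sketch below sorts them into two groups. It assumes a stand-in enumeration: only the names come from this page, while the real definitions (and their numeric values) live in MbedCloudClient.h. The Recovery type and suggested_recovery() are hypothetical. Only the three errors this page explicitly describes as self-recovering are marked ClientRecovers; everything else is conservatively marked NeedsAttention.

```cpp
// Stand-in for the enumeration in MbedCloudClient.h. The names are taken
// from this page; the order and values here are NOT the real ones.
enum class ConnectError {
    ConnectBootstrapFailed,
    ConnectInvalidParameters,
    ConnectNotRegistered,
    ConnectTimeout,
    ConnectNetworkError,
    ConnectResponseParseFailed,
    ConnectMemoryConnectFail,
    ConnectNotAllowed,
    ConnectSecureConnectionFailed,
    ConnectDnsResolvingFailed,
    ConnectorFailedToStoreCredentials,
    ConnectorFailedToReadCredentials,
    ConnectorInvalidCredentials
};

// Hypothetical classification used only in this sketch.
enum class Recovery { ClientRecovers, NeedsAttention };

inline Recovery suggested_recovery(ConnectError error) {
    switch (error) {
        // Described above as handled by the client's own retry or fallback logic.
        case ConnectError::ConnectNetworkError:
        case ConnectError::ConnectSecureConnectionFailed:
        case ConnectError::ConnectDnsResolvingFailed:
            return Recovery::ClientRecovers;
        // Everything else: assume the application or user should investigate.
        default:
            return Recovery::NeedsAttention;
    }
}
```

An application's error callback could use such a function to decide whether to log the error and wait, or to alert the user.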

The client returns an Invalid Parameter error

Sometimes, your client application might return MbedCloudClient::ConnectInvalidParameters while registering with Device Management.

In factory mode, this can happen because you're using an incorrect URI format to access the bootstrap service or LwM2M server:

  • For bootstrap, the URI format is: coaps://<mbed-bootstrap-server-url>:5684?aid=<your-account-id>.
  • For LwM2M, the URI format is: coaps://<mbed-LWM2M-server-url>:5684?aid=<your-account-id>.

Reflash these values with the factory configurator client (FCC), and run your application again.
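Before reflashing, a quick sanity check of the URI string can catch the most common formatting mistakes. The helper below is hypothetical and not part of Device Management Client or FCC; it assumes the standard coaps:// scheme with port 5684 and an aid query parameter, and checks nothing beyond that shape.

```cpp
#include <cstddef>
#include <string>

// Hypothetical pre-flight check: returns true if `uri` has the shape
// coaps://<host>:5684?aid=<account-id> with a non-empty host and account ID.
inline bool looks_like_factory_uri(const std::string &uri) {
    const std::string scheme = "coaps://";
    const std::string port_and_query = ":5684?aid=";

    // The URI must start with the secure CoAP scheme.
    if (uri.compare(0, scheme.size(), scheme) != 0) {
        return false;
    }
    // The port and account ID parameter must follow a non-empty host.
    const std::size_t pos = uri.find(port_and_query, scheme.size());
    if (pos == std::string::npos || pos == scheme.size()) {
        return false;
    }
    // Something must come after "aid=".
    return pos + port_and_query.size() < uri.size();
}
```

For example, a string that uses backslashes after the scheme, or the plain coap scheme, fails this check.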

The client prints RTX error (Mbed OS only)

If you see:

  • RTX error code 0x00000001 .. in your console, it means your application has run out of stack memory.

    Device Management Client handles its asynchronous operation through a separate thread. That thread has been assigned 8 kB of its own stack space, but for some applications, this might not be enough. You can increase the stack from your application's mbed_app.json file, by modifying the stack size value from 8192 to some higher value:

    "nanostack-hal.event_loop_thread_stack_size": <8192>,
    

    Remember to check your hardware configuration - it must have enough memory to handle a bigger stack size.

  • If you compiled your application as a debug version, it will require more flash memory than a release version - typically 1.5 to 2 times more. For debugging purposes, you may need to select hardware that is less constrained than your normal deployment devices.

Request frequency issues

Device Management provides sufficient capacity to handle REST API requests. When the frequency of requests exceeds a threshold set to protect the system, Device Management returns an error with HTTP status code 429. If you receive this error, pause request execution for 60 seconds. You can then resume normal work.
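One way to apply this advice in a REST client is sketched below. The helper and its parameters are hypothetical: `send` stands for whatever function issues one request and returns the HTTP status code, and `pause` stands for a blocking sleep. Injecting both keeps the sketch independent of any particular HTTP library.

```cpp
#include <chrono>
#include <functional>

// Hypothetical helper: repeat a request while the service answers 429,
// pausing 60 seconds between attempts. Returns the last status code seen,
// or 0 if max_attempts is not positive.
inline int send_with_rate_limit_pause(
        const std::function<int()> &send,
        const std::function<void(std::chrono::seconds)> &pause,
        int max_attempts) {
    int status = 0;
    for (int attempt = 0; attempt < max_attempts; ++attempt) {
        status = send();
        if (status != 429) {
            break;  // Success or a different error: stop retrying here.
        }
        if (attempt + 1 < max_attempts) {
            pause(std::chrono::seconds(60));  // Back off as recommended.
        }
    }
    return status;
}
```

In production code, `pause` could wrap std::this_thread::sleep_for; in tests, it can simply record the requested duration.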

Firmware Update Error Handling

During the lifetime of the device, a number of errors relating to firmware update can occur. They fall into three groups:

  • Errors internal to the Cloud client, which are printed only if the debugging log is turned on.
  • Cloud client errors that are printed on the serial console even when debugging is turned off.
  • Errors that are reported to the Cloud using the 5/0/5 UpdateResult LwM2M resource.

The following errors are reported to the Cloud.

WarningCertificateNotFound

UpdateResult: 6: Unsupported package type.

An update certificate is missing. The update certificate needs to be injected using the factory provisioning flow or, in case of developer mode, included in update_default_resources.c via manifest-tool.

ErrorWriteToStorage

UpdateResult: 2: Not enough storage space for the new firmware package.

Something went wrong when writing the firmware to storage on the device, or there is not enough storage for the new firmware candidate.

ErrorInvalidHash

UpdateResult: 9: Unsupported protocol.

The hash of the downloaded firmware does not match the hash supplied in the manifest. It could be that the manifest was created with a wrong URL or a wrong firmware image. It is also possible that there was a network error or a storage error which corrupted the hash supplied in the manifest or the firmware candidate.

WarningIdentityNotFound

UpdateResult: 6: Unsupported package type.

The Device Identity, which consists of Device, Class and Vendor IDs, cannot be retrieved. The device identity files can be injected using the factory provisioning flow or, in case of developer mode, included in update_default_resources.c via manifest-tool.

WarningClassMismatch

UpdateResult: 6: Unsupported package type.

The Device Class does not match the one specified in the manifest, so the Update client rejects the firmware update. Refer to device identifiers documentation for more information.

WarningVendorMismatch

UpdateResult: 6: Unsupported package type.

The Device Vendor ID does not match the one specified in the manifest, so the Update client rejects the firmware update. Refer to device identifiers documentation for more information.

WarningDeviceMismatch

UpdateResult: 6: Unsupported package type.

The Device ID does not match the one specified in the manifest, so the Update client rejects the firmware update. Refer to device identifiers documentation for more information.

WarningCertificateInvalid

UpdateResult: 6: Unsupported package type.

The update certificate on the device is not valid. This may be because:

  • The certificate has expired.
  • The data isn't a certificate.
  • The certificate has a signature mismatch.
  • The certificate is encoded incorrectly, so it is not bare DER.

WarningSignatureInvalid

UpdateResult: 6: Unsupported package type.

The signature on the manifest is not valid.

WarningURINotFound

UpdateResult: 7: Invalid URI.

The device cannot reach the firmware URI specified in the manifest.

WarningRollbackProtection

UpdateResult: 6: Unsupported package type.

The firmware candidate is of an older version (the firmware version is the timestamp when the manifest is created) than the current active image. The rollback protection feature ensures that the update is rejected.

WarningUnknown

UpdateResult: 6: Unsupported package type.

All other errors.
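For tooling that inspects the 5/0/5 resource, the mapping listed above can be captured in a small lookup table. This is a convenience sketch: update_result_for() is a hypothetical helper, and the values are simply copied from this page rather than taken from the client source.

```cpp
#include <map>
#include <string>

// Returns the UpdateResult value this page lists for an error name,
// or -1 if the name does not appear on this page.
inline int update_result_for(const std::string &error_name) {
    static const std::map<std::string, int> table = {
        {"WarningCertificateNotFound", 6},
        {"ErrorWriteToStorage", 2},
        {"ErrorInvalidHash", 9},
        {"WarningIdentityNotFound", 6},
        {"WarningClassMismatch", 6},
        {"WarningVendorMismatch", 6},
        {"WarningDeviceMismatch", 6},
        {"WarningCertificateInvalid", 6},
        {"WarningSignatureInvalid", 6},
        {"WarningURINotFound", 7},
        {"WarningRollbackProtection", 6},
        {"WarningUnknown", 6},
    };
    const auto it = table.find(error_name);
    return it == table.end() ? -1 : it->second;
}
```

Because most errors share UpdateResult 6, the reverse lookup is ambiguous; the device logs are needed to tell those cases apart.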