Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FW 3.1e -> 3.21OPDM: unexpected 804E answer from zigate #370

Open
tcharp38 opened this issue Jun 20, 2021 · 10 comments
Open

FW 3.1e -> 3.21OPDM: unexpected 804E answer from zigate #370

tcharp38 opened this issue Jun 20, 2021 · 10 comments

Comments

@tcharp38
Copy link

During LQI collection, starting by zigate interrogation, I got unexpected answer.

Request to zigate: {"topic":"CmdAbeille1/0000/Management_LQI_request","payload":"address=0000&StartIndex=00"}
First answer tells there are only 3 entries in the table

[2021-06-19 00:00:06] Abeille1, Type=804E/Management LQI response, SQN=CB, Status=00, NTableEntries=03, NTableListCount=02, StartIndex=00, SrcAddr=0000
[2021-06-19 00:00:06] NAddr=5AC7, NExtPANId=8FDBA9CA044F2AFA, NExtAddr=00158D00047D3A86, NDepth=01, NLQI=B5, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice
[2021-06-19 00:00:06] NAddr=1AEA, NExtPANId=8FDBA9CA044F2AFA, NExtAddr=5C0272FFFEA0E396, NDepth=01, NLQI=AA, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice

then a request for startindex 02: {"topic":"CmdAbeille1/0000/Management_LQI_request","payload":"address=0000&StartIndex=02"}

but the next answer is now telling there are 0B entries in the table but does not return anything.

[2021-06-19 00:00:06] Abeille1, Type=804E/Management LQI response, SQN=10, Status=00, NTableEntries=0B, NTableListCount=00, StartIndex=03, SrcAddr=0000

then again zigate tells there are only 3 entries and returns proper startindex 02.

[2021-06-19 00:00:06] Abeille1, Type=804E/Management LQI response, SQN=CC, Status=00, NTableEntries=03, NTableListCount=01, StartIndex=02, SrcAddr=0000
[2021-06-19 00:00:06] NAddr=E93B, NExtPANId=8FDBA9CA044F2AFA, NExtAddr=00158D000465AF13, NDepth=01, NLQI=A8, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice

Any idea ?
I've not seen this before 3.1e.

@fairecasoimeme
Copy link
Owner

Hi there is a difference between NTablesEntries et NTableListCount. this is not the same thing.

NTablesEntries is the total number of Neighbour table entries on the remote node
NTableListCount is the number of Neighbour table entries reported in this response (through pNetworkTableList)

With the firmware code this is the NTableListCount which is used :

here is the code :

case ZPS_ZDP_MGMT_LQI_RSP_CLUSTER_ID:
                    {
                        uint8    u8Values;
                        uint8    u8Bytes;
                        ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uZdpData.sMgmtLqiRsp.u8Status,                    u16Length );
                        ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uZdpData.sMgmtLqiRsp.u8NeighborTableEntries,      u16Length );
                        ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uZdpData.sMgmtLqiRsp.u8NeighborTableListCount,    u16Length );
                        ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uZdpData.sMgmtLqiRsp.u8StartIndex,                u16Length );
                        if( sApsZdpEvent.uZdpData.sMgmtLqiRsp.u8Status == ZPS_E_SUCCESS )
                        {
                            for ( u8Values = 0; u8Values < sApsZdpEvent.uZdpData.sMgmtLqiRsp.u8NeighborTableListCount; u8Values++ )
                            {
                                ZNC_BUF_U16_UPD  ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uLists.asNtList[u8Values].u16NwkAddr,            u16Length );
                                ZNC_BUF_U64_UPD  ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uLists.asNtList[u8Values].u64ExtPanId,           u16Length );
                                ZNC_BUF_U64_UPD  ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uLists.asNtList[u8Values].u64ExtendedAddress,    u16Length );
                                ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uLists.asNtList[u8Values].u8Depth,               u16Length );
                                ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , sApsZdpEvent.uLists.asNtList[u8Values].u8LinkQuality,         u16Length );

                                u8Bytes  =  sApsZdpEvent.uLists.asNtList[u8Values].uAncAttrs.u2DeviceType;
                                u8Bytes |=  ( sApsZdpEvent.uLists.asNtList[u8Values].uAncAttrs.u2PermitJoining << 2 );
                                u8Bytes |=  ( sApsZdpEvent.uLists.asNtList[u8Values].uAncAttrs.u2Relationship << 4 );
                                u8Bytes |=  ( sApsZdpEvent.uLists.asNtList[u8Values].uAncAttrs.u2RxOnWhenIdle << 6 );
                                ZNC_BUF_U8_UPD   ( &au8LinkTxBuffer [u16Length] , u8Bytes,    u16Length );
                            }
                        }
                        ZNC_BUF_U16_UPD  ( &au8LinkTxBuffer [u16Length],  psStackEvent->uEvent.sApsDataIndEvent.uSrcAddress.u16Addr,    u16Length );
                        vSL_WriteMessage ( E_SL_MSG_MANAGEMENT_LQI_RESPONSE,
                                           u16Length,
                                           au8LinkTxBuffer,
                                           u8LinkQuality );
                    }
                    break;

@tcharp38
Copy link
Author

Hi
We are in line.

NTableEntries is supposed to be the total of entries in the table. So no reason to get 03 then 0B then 03.
This is the point.

I will try to do same test with 31d, just in case.

@tcharp38
Copy link
Author

I can see this issue from other users as well: zigate is sometimes responding badly:

  • I've got 0x23 table entries
  • I've got 0x0B table entries
  • I've got 0x23 table entries

Request to addr 0000, startIndex=18
[2021-08-24 11:36:43] Abeille1, Type=804E/Management LQI response, SQN=00, Status=00, NTableEntries=23, NTableListCount=02, StartIndex=18, SrcAddr=0000
[2021-08-24 11:36:43] NAddr=DB96, NExtPANId=A77D1F6C993597B4, NExtAddr=00158D0003F46CBE, NDepth=00, NLQI=7E, NBitMap=69 => RxONWhenIdle/Sibling/?/Router
[2021-08-24 11:36:43] NAddr=9052, NExtPANId=A77D1F6C993597B4, NExtAddr=00158D00044B1C3A, NDepth=00, NLQI=48, NBitMap=69 => RxONWhenIdle/Sibling/?/Router
Request to addr 0000, startIndex=1A
[2021-08-24 11:36:43] Abeille1, Type=804E/Management LQI response, SQN=18, Status=00, NTableEntries=0B, NTableListCount=00, StartIndex=23, SrcAddr=0000
[2021-08-24 11:36:44] Abeille1, Type=8000/Status, Status=00/Success, SQN=01, PacketType=004E
[2021-08-24 11:36:44] Abeille1, Type=804E/Management LQI response, SQN=01, Status=00, NTableEntries=23, NTableListCount=02, StartIndex=1A, SrcAddr=0000
[2021-08-24 11:36:44] NAddr=17F8, NExtPANId=A77D1F6C993597B4, NExtAddr=00158D00041477B6, NDepth=00, NLQI=80, NBitMap=69 => RxONWhenIdle/Sibling/?/Router
[2021-08-24 11:36:44] NAddr=9212, NExtPANId=A77D1F6C993597B4, NExtAddr=00158D000403FF07, NDepth=00, NLQI=30, NBitMap=69 => RxONWhenIdle/Sibling/?/Router
Request to addr 0000, startIndex=1C

Worse than this, this looks like following LQI management answers from Zigate are just... crap.

[2021-08-24 11:36:44] Abeille1, Type=804E/Management LQI response, SQN=1C, Status=00, NTableEntries=23, NTableListCount=0B, StartIndex=20, SrcAddr=0000
[2021-08-24 11:36:44] NAddr=0002, NExtPANId=A77D1F6C993597B4, NExtAddr=00158D00040017E0, NDepth=6C, NLQI=1F, NBitMap=75 => RxONWhenIdle/?/PermitJoinOFF/Router
[2021-08-24 11:36:44] Eq addr 0002 is unknown.
[2021-08-24 11:36:44] NAddr=A77D, NExtPANId=040016D4993597B4, NExtAddr=00158D00023ABF00, NDepth=5C, NLQI=04, NBitMap=0B => RxOFFWhenIdle/Parent/?/?
[2021-08-24 11:36:44] Eq addr A77D is unknown.
[2021-08-24 11:36:44] NAddr=8D00, NExtPANId=0400015004007F60, NExtAddr=A77D1F6C0000D400, NDepth=4D, NLQI=79, NBitMap=51 => RxONWhenIdle/Child/PermitJoinON/Router
[2021-08-24 11:36:44] Eq addr 8D00 is unknown.
[2021-08-24 11:36:44] NAddr=1C00, NExtPANId=0AB0980000000000, NExtAddr=80311C000400036C, NDepth=0B, NLQI=23, NBitMap=20 => RxOFFWhenIdle/Sibling/PermitJoinON/Coordinator
[2021-08-24 11:36:44] Eq addr 1C00 is unknown.
[2021-08-24 11:36:44] NAddr=1F75, NExtPANId=7D1F6C993597B400, NExtAddr=158D00040017E06C, NDepth=7D, NLQI=A7, NBitMap=40 => RxONWhenIdle/Parent/PermitJoinON/Coordinator
[2021-08-24 11:36:44] Eq addr 1F75 is unknown.
[2021-08-24 11:36:44] NAddr=0004, NExtPANId=993597B400158D00, NExtAddr=023ABF005C040B8D, NDepth=01, NLQI=00, NBitMap=50 => RxONWhenIdle/Child/PermitJoinON/Coordinator
[2021-08-24 11:36:44] Eq addr 0004 is unknown.
[2021-08-24 11:36:44] NAddr=9800, NExtPANId=60A77D1F6C0000D4, NExtAddr=004D79511C000AB0, NDepth=00, NLQI=00, NBitMap=00 => RxOFFWhenIdle/Parent/PermitJoinON/Coordinator
[2021-08-24 11:36:44] Eq addr 9800 is unknown.
[2021-08-24 11:36:44] NAddr=97B4, NExtPANId=1C000400036C0B23, NExtAddr=201F757D1F6C9935, NDepth=15, NLQI=00, NBitMap=C1 => ?/Parent/PermitJoinON/Router
[2021-08-24 11:36:44] Eq addr 97B4 is unknown.
[2021-08-24 11:36:44] NAddr=023A, NExtPANId=17E06C7DA7400004, NExtAddr=993597B400158D00, NDepth=00, NLQI=BF, NBitMap=D0 => ?/Child/PermitJoinON/Coordinator
[2021-08-24 11:36:44] Eq addr 023A is unknown.
[2021-08-24 11:36:44] NAddr=511C, NExtPANId=010050980060A77D, NExtAddr=1F6C0000D4004D79, NDepth=0A, NLQI=00, NBitMap=30 => RxOFFWhenIdle/?/PermitJoinON/Coordinator
[2021-08-24 11:36:44] Eq addr 511C is unknown.
[2021-08-24 11:36:44] NAddr=9935, NExtPANId=97B41C000400036C, NExtAddr=0B23201F757D1F6C, NDepth=00, NLQI=15, NBitMap=49 => RxONWhenIdle/Parent/?/Router
[2021-08-24 11:36:44] Eq addr 9935 is unknown.

Nobody in zigate team faced such behavior ?

@pipiche38
Copy link

I'm not sure to understand what is your problem .
on my end i never notice any issues on decoding the Neighbourg table from ZiGate.

@tcharp38
Copy link
Author

The problem is that

  • zigate may return stupid message during that interrogation with 3.1e (never saw that before). Saying the table is 23 then 0B then 23 again is... no sense isn't it ?
  • worse that this at some point, returned message is just bulshit as you can see all these unknown devices

I don't see why it would be a pb of "decoding".
But ok so far I can't prove it is zigate. I will try to reproduce something on my side and sniff the network in //.

@pipiche38
Copy link

pipiche38 commented Aug 25, 2021 via email

@tcharp38
Copy link
Author

Ok clear.
Let me try to capture a real falling case.

@pipiche38
Copy link

Now, how can you sniff as you mentioned the issues is when getting the Neighbourg table of the ZiGate ?

@tcharp38
Copy link
Author

You are perfectly right. I missed this point but I can now reproduce on my simple test bench.
As a summary, 804E answer might be bulshit when interrogating zigate (not transmitted over the air)

  • pb appears in 3.1e only (seems too), default FW (not the optimized one)
  • CRC & msg size seem valid => Table seems corrupted inside zigate before sending 804E
  • message size is big (488 chars => 244 bytes)
  • Zigate may say tableEntries=07 but tableListCount=0B => No sense !!!
  • NOT specific to me. Seen in logs from other users

See attached analysis file
analysis.txt

@tcharp38
Copy link
Author

Just to refresh that I still see such unexepected behavior with 3.21-OPDM, PiZigate v1

I get the first answer from Zigate telling 6 entries total but 2 only listed here

[2022-01-19 13:02:14] Abeille1, Type=804E/Management LQI response, SQN=68, Status=00, NTableEntries=06, NTableListCount=02, StartIndex=00, SrcAddr=0000
[2022-01-19 13:02:14] NAddr=8052, NExtPANId=C99F5D26E97D77A6, NExtAddr=00124B002242C5C5, NDepth=01, NLQI=D8, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice
[2022-01-19 13:02:14] NAddr=EE9B, NExtPANId=C99F5D26E97D77A6, NExtAddr=00158D000478E64E, NDepth=01, NLQI=CE, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice

then a non sense answer for without new request with "0B" as total and none reported

[2022-01-19 13:02:14] Abeille1, Type=804E/Management LQI response, SQN=10, Status=00, NTableEntries=0B, NTableListCount=00, StartIndex=06, SrcAddr=0000

then a good answer for next request (startIndex=02)

[2022-01-19 13:02:14] Abeille1, Type=804E/Management LQI response, SQN=69, Status=00, NTableEntries=06, NTableListCount=02, StartIndex=02, SrcAddr=0000
[2022-01-19 13:02:14] NAddr=8217, NExtPANId=C99F5D26E97D77A6, NExtAddr=00158D000464C9C2, NDepth=01, NLQI=F8, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice
[2022-01-19 13:02:14] NAddr=0098, NExtPANId=C99F5D26E97D77A6, NExtAddr=04CF8CDF3C77164B, NDepth=01, NLQI=F6, NBitMap=1A => RxOFFWhenIdle/Child/?/EndDevice

@tcharp38 tcharp38 changed the title FW3.1e: unexpected 804E answer from zigate FW 3.1e -> 3.21OPDM: unexpected 804E answer from zigate Jan 19, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants